Download the current and legacy versions of the BenMAP program. Download configuration and aggregation/pooling/valuation files to estimate benefits. BenMAP-CE is free and open source software, and the source code is available upon request.
Schroedinger’s code: Source code availability and transparency in astrophysics
NASA Astrophysics Data System (ADS)
Ryan, PW; Allen, Alice; Teuben, Peter
2018-01-01
Astronomers use software for their research, but how many of the codes they use are available as source code? We examined a sample of 166 papers from 2015 for clearly identified software use, then searched for source code for the software packages mentioned in these research papers. We categorized the software to indicate whether source code is available for download and whether there are restrictions to accessing it, and if source code was not available, whether some other form of the software, such as a binary, was. Over 40% of the source code for the software used in our sample was not available for download. As URLs have often been used as proxy citations for software, we also extracted URLs from one journal’s 2015 research articles, removed those from certain long-term, reliable domains, and tested the remainder to determine what percentage of these URLs were still accessible in September and October 2017.
NASA Astrophysics Data System (ADS)
Allen, Alice; Teuben, Peter J.; Ryan, P. Wesley
2018-05-01
We examined software usage in a sample set of astrophysics research articles published in 2015 and searched for the source codes for the software mentioned in these research papers. We categorized the software to indicate whether the source code is available for download and whether there are restrictions to accessing it, and if the source code is not available, whether some other form of the software, such as a binary, is. We also extracted hyperlinks from one journal’s 2015 research articles, as links in articles can serve as an acknowledgment of software use and lead to the data used in the research, and tested them to determine which of these URLs are still accessible. For our sample of 715 software instances in the 166 articles we examined, we were able to categorize 418 records according to whether source code was available and found that 285 unique codes were used, 58% of which offered the source code for download. Of the 2558 hyperlinks extracted from 1669 research articles, at best, 90% of them were available over our testing period.
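The link-accessibility test described above is straightforward to reproduce. Below is a minimal sketch, assuming a plain list of extracted URLs; the sample URLs, timeout, and User-Agent string are illustrative assumptions, not the authors' actual setup.

```python
import urllib.request
import urllib.error

def check_url(url, timeout=15):
    """Return an HTTP status code, or an error string, for one URL."""
    # Some servers reject HEAD; a real survey might fall back to GET.
    req = urllib.request.Request(url, method="HEAD",
                                 headers={"User-Agent": "link-survey-sketch/0.1"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status                    # e.g. 200
    except urllib.error.HTTPError as e:
        return e.code                             # e.g. 404
    except OSError as e:
        return f"unreachable ({e})"               # DNS failure, timeout, ...

urls = ["http://ascl.net/", "http://example.invalid/gone"]   # illustrative sample
for url in urls:
    print(check_url(url), url)
```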
A Cooperative Downloading Method for VANET Using Distributed Fountain Code.
Liu, Jianhang; Zhang, Wenbin; Wang, Qi; Li, Shibao; Chen, Haihua; Cui, Xuerong; Sun, Yi
2016-10-12
Cooperative downloading is one of the effective methods to improve the amount of downloaded data in vehicular ad hoc networking (VANET). However, the poor channel quality and short encounter time bring about a high packet loss rate, which decreases transmission efficiency and fails to satisfy the requirement of high quality of service (QoS) for some applications. Digital fountain code (DFC) can be utilized in the field of wireless communication to increase transmission efficiency. For cooperative forwarding, however, processing delay from frequent coding and decoding as well as the single feedback mechanism using DFC cannot adapt to the environment of VANET. In this paper, a cooperative downloading method for VANET using concatenated DFC is proposed to solve the problems above. The source vehicle and cooperative vehicles encode the raw data using a hierarchical fountain code before sending it to the client directly or indirectly. Although some packets may be lost, the client can recover the raw data so long as it receives enough encoded packets. The method avoids data retransmission due to packet loss. Furthermore, the concatenated feedback mechanism in the method reduces the transmission delay effectively. Simulation results indicate the benefits of the proposed scheme in terms of the amount of downloaded data and the data receiving rate.
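To make the fountain-code idea concrete, here is a toy sketch: each encoded packet XORs a random subset of source blocks, and the receiver recovers the file by peeling once enough packets arrive, with no retransmission of specific lost packets. The degree distribution and single-byte blocks are simplifying assumptions; the paper's concatenated/hierarchical construction and feedback mechanism are not modeled.

```python
import random

def encode(blocks, n_packets, rng):
    """Emit packets, each the XOR of a random subset of source blocks."""
    packets = []
    for _ in range(n_packets):
        degree = rng.choice([1, 2, 2, 3, 4])              # toy degree distribution
        idxs = rng.sample(range(len(blocks)), degree)
        payload = 0
        for i in idxs:
            payload ^= blocks[i]
        packets.append((frozenset(idxs), payload))
    return packets

def decode(packets, n_blocks):
    """Peeling decoder: repeatedly solve degree-1 packets and substitute back."""
    recovered = {}
    pending = [[set(idxs), payload] for idxs, payload in packets]
    progress = True
    while progress and len(recovered) < n_blocks:
        progress = False
        for pkt in pending:
            idxs, payload = pkt
            for i in [i for i in idxs if i in recovered]:
                idxs.discard(i)                           # substitute known blocks
                payload ^= recovered[i]
            pkt[1] = payload
            if len(idxs) == 1:                            # degree-1 packet: solved
                (i,) = idxs
                if i not in recovered:
                    recovered[i] = payload
                    progress = True
    return [recovered.get(i) for i in range(n_blocks)]

rng = random.Random(42)
blocks = [rng.randrange(256) for _ in range(8)]           # 8 one-byte source blocks
packets = encode(blocks, 20, rng)
survived = rng.sample(packets, 14)                        # ~30% of packets lost en route
print(decode(survived, len(blocks)) == blocks)            # True once enough survive
```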
SEGY to ASCII: Conversion and Plotting Program
Goldman, Mark R.
1999-01-01
This report documents a computer program to convert standard 4-byte, IBM floating point SEGY files to ASCII xyz format. The program then optionally plots the seismic data using the GMT plotting package. The material for this publication is contained in a standard tar file (of99-126.tar) that is uncompressed and 726 K in size. It can be downloaded to any Unix machine. Move the tar file to the directory you wish to use it in, then type 'tar xvf of99-126.tar'. The archive files (and diskette) contain a NOTE file, a README file, a version-history file, source code, a makefile for easy compilation, and an ASCII version of the documentation. The archive files (and diskette) also contain example test files, including a typical SEGY file along with the resulting ASCII xyz and postscript files. Compiling the source code into an executable requires a C++ compiler. The program has been successfully compiled using Gnu's g++ version 2.8.1; use of other compilers may require modifications to the existing source code. The g++ compiler is a free, high-quality C++ compiler and may be downloaded from the ftp site: ftp://ftp.gnu.org/gnu Plotting the seismic data requires the GMT plotting package, which may be downloaded from the web site: http://www.soest.hawaii.edu/gmt/
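The nonstandard number format is the crux of such a converter: SEG-Y rev 0 stores samples as IBM System/360 32-bit floats, which are base-16 with a 7-bit excess-64 exponent rather than IEEE 754. Below is a minimal Python sketch of the per-sample conversion, not the program's actual code; a hypothetical reader would apply it to each 4-byte big-endian sample word.

```python
import struct

def ibm32_to_float(word: bytes) -> float:
    """Convert one 4-byte big-endian IBM single-precision float to a Python float."""
    (u,) = struct.unpack(">I", word)
    sign = -1.0 if u & 0x80000000 else 1.0
    exponent = (u >> 24) & 0x7F              # base-16 exponent, excess-64
    fraction = (u & 0x00FFFFFF) / float(1 << 24)
    return sign * fraction * 16.0 ** (exponent - 64)

# 0x42640000 encodes +100.0: fraction 0x640000/2^24 = 0.390625, 16^(66-64) = 256
print(ibm32_to_float(bytes.fromhex("42640000")))   # 100.0
```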
Standard Hydrologic Exchange Format (SHEF) is a documented set of rules for coding data in a form suitable for both visual and automated interpretation. Available resources include the current SHEF specification, instructions on how to install and use the software, and a source code download (.gz).
Practices in Code Discoverability: Astrophysics Source Code Library
NASA Astrophysics Data System (ADS)
Allen, A.; Teuben, P.; Nemiroff, R. J.; Shamir, L.
2012-09-01
Here we describe the Astrophysics Source Code Library (ASCL), which takes an active approach to sharing astrophysics source code. ASCL's editor seeks out both new and old peer-reviewed papers that describe methods or experiments that involve the development or use of source code, and adds entries for the found codes to the library. This approach ensures that source codes are added without requiring authors to actively submit them, resulting in a comprehensive listing that covers a significant number of the astrophysics source codes used in peer-reviewed studies. The ASCL now contains over 340 codes and continues to grow; in 2011 it added an average of 19 codes per month. An advisory committee has been established to provide input and guide the development and expansion of the new site, and a marketing plan has been developed and is being executed. All ASCL source codes have been used to generate results published in or submitted to a refereed journal and are freely available either via a download site or from an identified source. This paper provides the history and description of the ASCL. It lists the requirements for including codes, examines the advantages of the ASCL, and outlines some of its future plans.
Astrophysics Source Code Library: Incite to Cite!
NASA Astrophysics Data System (ADS)
DuPrie, K.; Allen, A.; Berriman, B.; Hanisch, R. J.; Mink, J.; Nemiroff, R. J.; Shamir, L.; Shortridge, K.; Taylor, M. B.; Teuben, P.; Wallen, J. F.
2014-05-01
The Astrophysics Source Code Library (ASCL, http://ascl.net/) is an on-line registry of over 700 source codes that are of interest to astrophysicists, with more being added regularly. The ASCL actively seeks out codes as well as accepting submissions from the code authors, and all entries are citable and indexed by ADS. All codes have been used to generate results published in or submitted to a refereed journal and are available either via a download site or from an identified source. In addition to being the largest directory of scientist-written astrophysics programs available, the ASCL is also an active participant in the reproducible research movement, with presentations at various conferences, numerous blog posts and a journal article. This poster provides a description of the ASCL and the changes that we are starting to see in the astrophysics community as a result of the work we are doing.
M2Lite: An Open-source, Light-weight, Pluggable and Fast Proteome Discoverer MSF to mzIdentML Tool.
Aiyetan, Paul; Zhang, Bai; Chen, Lily; Zhang, Zhen; Zhang, Hui
2014-04-28
Proteome Discoverer is one of many tools used for protein database searching and peptide-to-spectrum assignment in mass spectrometry-based proteomics. However, the inadequacy of conversion tools makes it challenging to compare and integrate its results with those of other analytical tools. Here we present M2Lite, an open-source, light-weight, easily pluggable and fast conversion tool. M2Lite converts Proteome Discoverer derived MSF files to the proteomics community-defined standard, the mzIdentML file format. M2Lite's source code is available as open-source at https://bitbucket.org/paiyetan/m2lite/src and its compiled binaries and documentation can be freely downloaded at https://bitbucket.org/paiyetan/m2lite/downloads.
GIS Data Downloads | USDA Plant Hardiness Zone Map
The Astrophysics Source Code Library: An Update
NASA Astrophysics Data System (ADS)
Allen, Alice; Nemiroff, R. J.; Shamir, L.; Teuben, P. J.
2012-01-01
The Astrophysics Source Code Library (ASCL), founded in 1999, takes an active approach to sharing astrophysical source code. ASCL's editor seeks out both new and old peer-reviewed papers that describe methods or experiments that involve the development or use of source code, and adds entries for the found codes to the library. This approach ensures that source codes are added without requiring authors to actively submit them, resulting in a comprehensive listing that covers a significant number of the astrophysics source codes used in peer-reviewed studies. The ASCL moved to a new location in 2010; it now contains over 300 codes and continues to grow. In 2011, the ASCL (http://asterisk.apod.com/viewforum.php?f=35) added an average of 19 new codes per month; we encourage scientists to submit their codes for inclusion. An advisory committee has been established to provide input and guide the development and expansion of its new site, and a marketing plan has been developed and is being executed. All ASCL source codes have been used to generate results published in or submitted to a refereed journal and are freely available either via a download site or from an identified source. This presentation covers the history of the ASCL, examines its current state and benefits, describes the means of and requirements for including codes, and outlines future plans.
Site Map | USDA Plant Hardiness Zone Map
DLRS: gene tree evolution in light of a species tree.
Sjöstrand, Joel; Sennblad, Bengt; Arvestad, Lars; Lagergren, Jens
2012-11-15
PrIME-DLRS (or colloquially: 'Delirious') is a phylogenetic software tool to simultaneously infer and reconcile a gene tree given a species tree. It accounts for duplication and loss events, a relaxed molecular clock and is intended for the study of homologous gene families, for example in a comparative genomics setting involving multiple species. PrIME-DLRS uses a Bayesian MCMC framework, where the input is a known species tree with divergence times and a multiple sequence alignment, and the output is a posterior distribution over gene trees and model parameters. PrIME-DLRS is available for Java SE 6+ under the New BSD License, and JAR files and source code can be downloaded from http://code.google.com/p/jprime/. There is also a slightly older C++ version available as a binary package for Ubuntu, with download instructions at http://prime.sbc.su.se. The C++ source code is available upon request. joel.sjostrand@scilifelab.se or jens.lagergren@scilifelab.se. PrIME-DLRS is based on a sound probabilistic model (Åkerborg et al., 2009) and has been thoroughly validated on synthetic and biological datasets (Supplementary Material online).
Forde, Arnell S.; Flocks, James G.; Wiese, Dana S.; Fredericks, Jake J.
2016-03-29
The archived trace data are in standard SEG Y rev. 0 format (Barry and others, 1975); the first 3,200 bytes of the card image header are in American Standard Code for Information Interchange (ASCII) format instead of Extended Binary Coded Decimal Interchange Code (EBCDIC) format. The SEG Y files are available on the DVD version of this report or online, downloadable via the USGS Coastal and Marine Geoscience Data System (http://cmgds.marine.usgs.gov). The data are also available for viewing using GeoMapApp (http://www.geomapapp.org) and Virtual Ocean (http://www.virtualocean.org) multi-platform open source software. The Web version of this archive does not contain the SEG Y trace files. To obtain the complete DVD archive, contact USGS Information Services at 1-888-ASK-USGS or infoservices@usgs.gov. The SEG Y files may be downloaded and processed with commercial or public domain software such as Seismic Unix (SU) (Cohen and Stockwell, 2010). See the How To Download SEG Y Data page for download instructions. The printable profiles are provided as Graphics Interchange Format (GIF) images, processed and gained using SU software, and can be viewed from the Profiles page or by using the links located on the trackline maps; refer to the Software page for links to example SU processing scripts.
Using the Astrophysics Source Code Library
NASA Astrophysics Data System (ADS)
Allen, Alice; Teuben, P. J.; Berriman, G. B.; DuPrie, K.; Hanisch, R. J.; Mink, J. D.; Nemiroff, R. J.; Shamir, L.; Wallin, J. F.
2013-01-01
The Astrophysics Source Code Library (ASCL) is a free on-line registry of source codes that are of interest to astrophysicists; with over 500 codes, it is the largest collection of scientist-written astrophysics programs in existence. All ASCL source codes have been used to generate results published in or submitted to a refereed journal and are available either via a download site or from an identified source. An advisory committee formed in 2011 provides input and guides the development and expansion of the ASCL, and since January 2012, all accepted ASCL entries are indexed by ADS. Though software is increasingly important for the advancement of science in astrophysics, these methods are still often hidden from view or difficult to find. The ASCL (ascl.net/) seeks to improve the transparency and reproducibility of research by making these vital methods discoverable, and to provide recognition and incentive to those who write and release programs useful for astrophysics research. This poster provides a description of the ASCL, an update on recent additions, and the changes in the astrophysics community we are starting to see because of the ASCL.
Joyce, Brendan; Lee, Danny; Rubio, Alex; Ogurtsov, Aleksey; Alves, Gelio; Yu, Yi-Kuo
2018-03-15
RAId is a software package that has been actively developed for the past 10 years for computationally and visually analyzing MS/MS data. Founded on rigorous statistical methods, RAId's core program computes accurate E-values for peptides and proteins identified during database searches. Our main goal here is to make this robust tool readily accessible to the proteomics community through a graphical user interface (GUI). We have constructed a graphical user interface to facilitate the use of RAId on users' local machines. Written in Java, RAId_GUI not only makes RAId easy to run but also provides tools for data/spectra visualization, MS-product analysis, molecular isotopic distribution analysis, and graphing the retrieval versus the proportion of false discoveries. The results viewer displays the analysis results and allows users to download them. Both the knowledge-integrated organismal databases and the code package (containing source code, the graphical user interface, and a user manual) are available for download at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads/raid.html.
Espino, Jeremy U; Wagner, M; Szczepaniak, C; Tsui, F C; Su, H; Olszewski, R; Liu, Z; Chapman, W; Zeng, X; Ma, L; Lu, Z; Dara, J
2004-09-24
Computer-based outbreak and disease surveillance requires high-quality software that is well-supported and affordable. Developing software in an open-source framework, which entails free distribution and use of software and continuous, community-based software development, can produce software with such characteristics, and can do so rapidly. The objective of the Real-Time Outbreak and Disease Surveillance (RODS) Open Source Project is to accelerate the deployment of computer-based outbreak and disease surveillance systems by writing software and catalyzing the formation of a community of users, developers, consultants, and scientists who support its use. The University of Pittsburgh seeded the Open Source Project by releasing the RODS software under the GNU General Public License. An infrastructure was created, consisting of a website, mailing lists for developers and users, designated software developers, and shared code-development tools. These resources are intended to encourage growth of the Open Source Project community. Progress is measured by assessing website usage, number of software downloads, number of inquiries, number of system deployments, and number of new features or modules added to the code base. During September--November 2003, users generated 5,370 page views of the project website, 59 software downloads, 20 inquiries, one new deployment, and addition of four features. Thus far, health departments and companies have been more interested in using the software as is than in customizing or developing new features. The RODS laboratory anticipates that after initial installation has been completed, health departments and companies will begin to customize the software and contribute their enhancements to the public code base.
Implementation of the Regulatory Authority Information System in Egypt
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carson, S.D.; Schetnan, R.; Hasan, A.
2006-07-01
As part of the implementation of a bar-code-based system to track radioactive sealed sources (RSS) in Egypt, the Regulatory Authority Information System Personal Digital Assistant (RAIS PDA) Application was developed to extend the functionality of the International Atomic Energy Agency's (IAEA's) RAIS database by allowing users to download RSS data from the database to a portable PDA equipped with a bar-code scanner. [1, 4] The system allows users in the field to verify radioactive sealed source data, gather radioactive sealed source audit information, and upload that data to the RAIS database. This paper describes the development of the RAIS PDA Application, its features, and how it will be implemented in Egypt.
Genetically improved BarraCUDA.
Langdon, W B; Lam, Brian Yee Hong
2017-01-01
BarraCUDA is an open source C program which uses the BWA algorithm in parallel with nVidia CUDA to align short next generation DNA sequences against a reference genome. Recently its source code was optimised using "Genetic Improvement". The genetically improved (GI) code is up to three times faster on short paired end reads from The 1000 Genomes Project and 60% more accurate on a short BioPlanet.com GCAT alignment benchmark. GPGPU BarraCUDA running on a single K80 Tesla GPU can align short paired end nextGen sequences up to ten times faster than bwa on a 12 core server. The speed up was such that the GI version was adopted and has been regularly downloaded from SourceForge for more than 12 months.
Prior-Based Quantization Bin Matching for Cloud Storage of JPEG Images.
Liu, Xianming; Cheung, Gene; Lin, Chia-Wen; Zhao, Debin; Gao, Wen
2018-07-01
Millions of user-generated images are uploaded to social media sites like Facebook daily, which translates to a large storage cost. However, there exists an asymmetry in upload and download data: only a fraction of the uploaded images are subsequently retrieved for viewing. In this paper, we propose a cloud storage system that reduces the storage cost of all uploaded JPEG photos, at the expense of a controlled increase in computation, mainly during download of the requested image subset. Specifically, the system first selectively re-encodes code blocks of uploaded JPEG images using coarser quantization parameters for smaller storage sizes. Then, during download, the system exploits known signal priors (a sparsity prior and a graph-signal smoothness prior) for reverse mapping to recover the original fine quantization bin indices, with either a deterministic guarantee (lossless mode) or a statistical guarantee (near-lossless mode). For fast reverse mapping, we use small dictionaries and sparse graphs that are tailored for specific clusters of similar blocks, which are classified via a tree-structured vector quantizer. During image upload, cluster indices identifying the appropriate dictionaries and graphs for the re-quantized blocks are encoded as side information using a differential distributed source coding scheme to facilitate reverse mapping during image download. Experimental results show that our system can reap significant storage savings (up to 12.05%) at roughly the same image PSNR (within 0.18 dB).
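As a toy illustration of the bin-matching idea (not the paper's full system), the sketch below re-quantizes fine-step coefficient bins with a coarser storage step and then enumerates the small set of fine bins consistent with each coarse bin; the signal priors described above are what let the real system pick the correct candidate. Step sizes and bin values are invented.

```python
import numpy as np

def requantize(fine_bins, q, Q):
    """Map fine-step bin indices to coarse-step bin indices (round-to-nearest)."""
    return np.rint(fine_bins * q / Q).astype(int)

def candidate_fine_bins(coarse_bin, q, Q):
    """All fine bins whose dequantized value falls inside the coarse bin."""
    lo, hi = (coarse_bin - 0.5) * Q, (coarse_bin + 0.5) * Q
    return [b for b in range(int(np.floor(lo / q)) - 1, int(np.ceil(hi / q)) + 2)
            if lo <= b * q < hi]

fine = np.array([7, -3, 12, 0])       # fine bin indices at quantization step q
q, Q = 4, 10                          # coarser storage step Q > q
coarse = requantize(fine, q, Q)
print(coarse)                                      # [ 3 -1  5  0]
print(candidate_fine_bins(int(coarse[0]), q, Q))   # fine bins consistent with bin 3: [7, 8]
```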
The Astrophysics Source Code Library: Supporting software publication and citation
NASA Astrophysics Data System (ADS)
Allen, Alice; Teuben, Peter
2018-01-01
The Astrophysics Source Code Library (ASCL, ascl.net), established in 1999, is a free online registry for source codes used in research that has appeared in, or been submitted to, peer-reviewed publications. The ASCL is indexed by the SAO/NASA Astrophysics Data System (ADS) and Web of Science and is citable by using the unique ascl ID assigned to each code. In addition to registering codes, the ASCL can house archive files for download and assign them DOIs. The ASCL advocates for software citation on par with article citation, participates in multidisciplinary events such as Force11, OpenCon, and the annual Workshop on Sustainable Software for Science, works with journal publishers, and organizes Special Sessions and Birds of a Feather meetings at national and international conferences such as Astronomical Data Analysis Software and Systems (ADASS), European Week of Astronomy and Space Science, and AAS meetings. In this presentation, I will discuss some of the challenges of gathering credit for publishing software and ideas and efforts from other disciplines that may be useful to astronomy.
Variation in Drug Prices at Pharmacies: Are Prices Higher in Poorer Areas?
Gellad, Walid F; Choudhry, Niteesh K; Friedberg, Mark W; Brookhart, M Alan; Haas, Jennifer S; Shrank, William H
2009-01-01
Objective: To determine whether retail prices for prescription drugs are higher in poorer areas. Data Sources: The MyFloridarx.com website, which provides retail prescription prices at Florida pharmacies, and median ZIP code income from the 2000 Census. Study Design: We compared mean pharmacy prices for each of the four study drugs across ZIP code income groups. Pharmacies were classified as either chain pharmacies or independent pharmacies. Data Collection: Prices were downloaded in November 2006. Principal Findings: Across the four study drugs, mean prices were highest in the poorest ZIP codes: 9 percent above the statewide average. Independent pharmacies in the poorest ZIP codes charged the highest mean prices. Conclusions: Retail prescription prices appear to be higher in poorer ZIP codes of Florida. PMID:19178584
Open source clustering software.
de Hoon, M J L; Imoto, S; Nolan, J; Miyano, S
2004-06-12
We have implemented k-means clustering, hierarchical clustering and self-organizing maps in a single multipurpose open-source library of C routines, callable from other C and C++ programs. Using this library, we have created an improved version of Michael Eisen's well-known Cluster program for Windows, Mac OS X and Linux/Unix. In addition, we generated a Python and a Perl interface to the C Clustering Library, thereby combining the flexibility of a scripting language with the speed of C. The C Clustering Library and the corresponding Python C extension module Pycluster were released under the Python License, while the Perl module Algorithm::Cluster was released under the Artistic License. The GUI code Cluster 3.0 for Windows, Macintosh and Linux/Unix, as well as the corresponding command-line program, were released under the same license as the original Cluster code. The complete source code is available at http://bonsai.ims.u-tokyo.ac.jp/mdehoon/software/cluster. Alternatively, Algorithm::Cluster can be downloaded from CPAN, while Pycluster is also available as part of the Biopython distribution.
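For readers who want the core algorithm without installing the library, here is a minimal k-means sketch in NumPy. It mirrors the assign-then-update iteration that the C Clustering Library implements, but it is not that library's API; the array shapes, seed, and convergence test are assumptions for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain k-means: assign points to nearest centroid, then update centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Euclidean distance of every point to every centroid.
        labels = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2).argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):          # converged: centroids stopped moving
            break
        centroids = new
    return labels, centroids

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])
labels, centroids = kmeans(X, k=2)
print(centroids.round(2))                        # two centroids near (0, 0) and (3, 3)
```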
InterProScan 5: genome-scale protein function classification
Jones, Philip; Binns, David; Chang, Hsin-Yu; Fraser, Matthew; Li, Weizhong; McAnulla, Craig; McWilliam, Hamish; Maslen, John; Mitchell, Alex; Nuka, Gift; Pesseat, Sebastien; Quinn, Antony F.; Sangrador-Vegas, Amaia; Scheremetjew, Maxim; Yong, Siew-Yit; Lopez, Rodrigo; Hunter, Sarah
2014-01-01
Motivation: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs of the software and the complete reimplementation of the software framework, resulting in a flexible and stable system that is able to use both multiprocessor machines and/or conventional clusters to achieve scalable distributed data analysis. InterProScan is freely available for download from the EMBL-EBI FTP site and the open source code is hosted at Google Code. Availability and implementation: InterProScan is distributed via FTP at ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/ and the source code is available from http://code.google.com/p/interproscan/. Contact: http://www.ebi.ac.uk/support or interhelp@ebi.ac.uk or mitchell@ebi.ac.uk PMID:24451626
Development and Implementation of Dynamic Scripts to Execute Cycled WRF/GSI Forecasts
NASA Technical Reports Server (NTRS)
Zavodsky, Bradley; Srikishen, Jayanthi; Berndt, Emily; Li, Quanli; Watson, Leela
2014-01-01
Automating the coupling of data assimilation (DA) and modeling systems is a unique challenge in the numerical weather prediction (NWP) research community. In recent years, the Development Testbed Center (DTC) has released well-documented tools such as the Weather Research and Forecasting (WRF) model and the Gridpoint Statistical Interpolation (GSI) DA system that can be easily downloaded, installed, and run by researchers on their local systems. However, developing a coupled system in which the various preprocessing, DA, model, and postprocessing capabilities are all integrated can be labor-intensive if one has little experience with any of these individual systems. Additionally, operational modeling entities generally have specific coupling methodologies that can take time to understand and develop code to implement properly. To better enable collaborating researchers to perform modeling and DA experiments with GSI, the Short-term Prediction Research and Transition (SPoRT) Center has developed a set of Perl scripts that couple GSI and WRF in a cycling methodology consistent with the use of real-time, regional observation data from the National Centers for Environmental Prediction (NCEP)/Environmental Modeling Center (EMC). Because Perl is open source, the code can be easily downloaded and executed regardless of the user's native shell environment. This paper will provide a description of this open-source code and descriptions of a number of the use cases that have been performed by SPoRT collaborators using the scripts on different computing systems.
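A dry-run sketch of such a cycling workflow is shown below, in Python rather than Perl for brevity. The executable names, file naming, and 6-hour cycle interval are placeholders, not the SPoRT scripts' actual interface; the point is the loop structure, in which each GSI analysis initializes a WRF forecast whose output becomes the next cycle's background.

```python
from datetime import datetime, timedelta

def run(cmd):
    # Dry run for illustration; swap for subprocess.run(cmd, shell=True, check=True).
    print("would run:", cmd)

start = datetime(2014, 1, 1, 0)
end = datetime(2014, 1, 2, 0)
step = timedelta(hours=6)                       # assumed cycling interval

cycle = start
while cycle <= end:
    t = cycle.strftime("%Y%m%d%H")
    # Hypothetical command lines: analyze the background with observations...
    run(f"gsi.x --background wrfout_{t} --obs prepbufr_{t} --analysis wrfanl_{t}")
    # ...then advance the model; its output is the next cycle's background.
    run(f"wrf.exe --init wrfanl_{t} --fcst-hours {int(step.total_seconds() // 3600)}")
    cycle += step
```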
Cytoscape.js: a graph theory library for visualisation and analysis.
Franz, Max; Lopes, Christian T; Huck, Gerardo; Dong, Yue; Sumer, Onur; Bader, Gary D
2016-01-15
Cytoscape.js is an open-source JavaScript-based graph library. Its most common use case is as a visualization software component, so it can be used to render interactive graphs in a web browser. It also can be used in a headless manner, useful for graph operations on a server, such as Node.js. Cytoscape.js is implemented in JavaScript. Documentation, downloads and source code are available at http://js.cytoscape.org. gary.bader@utoronto.ca. © The Author 2015. Published by Oxford University Press.
LSDCat: Detection and cataloguing of emission-line sources in integral-field spectroscopy datacubes
NASA Astrophysics Data System (ADS)
Herenz, Edmund Christian; Wisotzki, Lutz
2017-06-01
We present a robust, efficient, and user-friendly algorithm for detecting faint emission-line sources in large integral-field spectroscopic datacubes together with the public release of the software package Line Source Detection and Cataloguing (LSDCat). LSDCat uses a three-dimensional matched filter approach, combined with thresholding in signal-to-noise, to build a catalogue of individual line detections. In a second pass, the detected lines are grouped into distinct objects, and positions, spatial extents, and fluxes of the detected lines are determined. LSDCat requires only a small number of input parameters, and we provide guidelines for choosing appropriate values. The software is coded in Python and capable of processing very large datacubes in a short time. We verify the implementation with a source insertion and recovery experiment utilising a real datacube taken with the MUSE instrument at the ESO Very Large Telescope. The LSDCat software is available for download at http://muse-vlt.eu/science/tools and via the Astrophysics Source Code Library at http://ascl.net/1612.002
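The detection step can be illustrated with a short sketch: cross-correlate a noisy cube with a 3-D Gaussian template and threshold the result in signal-to-noise. The synthetic cube, template widths, uniform-noise assumption, and crude empirical noise normalisation below are illustrative simplifications of LSDCat's actual matched filter and noise propagation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)
cube = rng.normal(0.0, 1.0, size=(40, 50, 50))   # (wavelength, y, x), unit noise
cube[20, 25, 25] += 80.0                         # implant a compact emission line

# Cross-correlation with a 3-D Gaussian template (spectral, spatial, spatial widths).
filtered = gaussian_filter(cube, sigma=(2.0, 1.5, 1.5))
sn = filtered / np.std(filtered)                 # crude empirical S/N normalisation
detections = np.argwhere(sn > 8.0)               # threshold in signal-to-noise
print(detections)                                # voxels clustered near (20, 25, 25)
```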
Specht, Michael; Kuhlgert, Sebastian; Fufezan, Christian; Hippler, Michael
2011-04-15
We present Proteomatic, an operating system independent and user-friendly platform that enables the construction and execution of MS/MS data evaluation pipelines using free and commercial software. Required external programs, such as those for peptide identification, are downloaded automatically in the case of free software. Due to a strict separation of functionality and presentation, and support for multiple scripting languages, new processing steps can be added easily. Proteomatic is implemented in C++/Qt; scripts are implemented in Ruby, Python and PHP. All source code is released under the LGPL. Source code and installers for Windows, Mac OS X, and Linux are freely available at http://www.proteomatic.org. Contact: michael.specht@uni-muenster.de. Supplementary data are available at Bioinformatics online.
Academic Software Downloads from Google Code: Useful Usage Indicators?
ERIC Educational Resources Information Center
Thelwall, Mike; Kousha, Kayvan
2016-01-01
Introduction: Computer scientists and other researchers often make their programs freely available online. If this software makes a valuable contribution inside or outside of academia then its creators may want to demonstrate this with a suitable indicator, such as download counts. Methods: Download counts, citation counts, labels and licenses…
MRMPlus: an open source quality control and assessment tool for SRM/MRM assay development.
Aiyetan, Paul; Thomas, Stefani N; Zhang, Zhen; Zhang, Hui
2015-12-12
Selected and multiple reaction monitoring involves monitoring a multiplexed assay of proteotypic peptides and associated transitions in mass spectrometry runs. To establish peptides and associated transitions as stable, quantifiable, and reproducible representatives of proteins of interest, experimental and analytical validation is required. However, inadequate and disparate analytical tools and validation methods predispose assay performance measures to errors and inconsistencies. Implemented as a freely available, open-source tool in the platform-independent Java programming language, MRMPlus computes analytical measures as recommended recently by the Clinical Proteomics Tumor Analysis Consortium Assay Development Working Group for "Tier 2" assays, that is, non-clinical assays sufficient to measure changes due to both biological and experimental perturbations. Computed measures include limit of detection, lower limit of quantification, linearity, carry-over, partial validation of specificity, and upper limit of quantification. MRMPlus streamlines the assay development analytical workflow and therefore minimizes error predisposition. MRMPlus may also be used for performance estimation for targeted assays not described by the Assay Development Working Group. MRMPlus' source code and compiled binaries can be freely downloaded from https://bitbucket.org/paiyetan/mrmplusgui and https://bitbucket.org/paiyetan/mrmplusgui/downloads respectively.
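As a hedged illustration of two of the measures named above, the sketch below applies common textbook definitions (blank-based LOD/LLOQ and a calibration-curve linearity check); MRMPlus follows the CPTAC Assay Development Working Group recommendations, whose exact formulas may differ. All numbers are invented.

```python
import numpy as np

blank = np.array([102., 95., 110., 99., 104.])            # blank-matrix peak areas
spike_conc = np.array([1., 2., 5., 10., 20.])             # spiked amount (fmol)
spike_resp = np.array([210., 395., 980., 2010., 3950.])   # measured responses

slope, intercept = np.polyfit(spike_conc, spike_resp, 1)  # calibration curve
lod = 3.3 * blank.std(ddof=1) / slope                     # limit of detection
lloq = 10.0 * blank.std(ddof=1) / slope                   # lower limit of quantification
r = np.corrcoef(spike_conc, spike_resp)[0, 1]             # linearity check
print(f"LOD ~ {lod:.3f} fmol, LLOQ ~ {lloq:.3f} fmol, R^2 = {r**2:.4f}")
```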
PyCorrFit-generic data evaluation for fluorescence correlation spectroscopy.
Müller, Paul; Schwille, Petra; Weidemann, Thomas
2014-09-01
We present a graphical user interface (PyCorrFit) for the fitting of theoretical model functions to experimental data obtained by fluorescence correlation spectroscopy (FCS). The program supports many data file formats and features a set of tools specialized in FCS data evaluation. The Python source code is freely available for download from the PyCorrFit web page at http://pycorrfit.craban.de. We offer binaries for Ubuntu Linux, Mac OS X and Microsoft Windows. © The Author 2014. Published by Oxford University Press.
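The core fitting task can be sketched in a few lines: fit a standard 3-D free-diffusion FCS model G(tau) = (1/N)(1 + tau/tauD)^(-1)(1 + tau/(S^2 tauD))^(-1/2) to an autocorrelation curve with SciPy. The synthetic data and fixed aspect ratio S below are assumptions for illustration; PyCorrFit itself supports many more model functions and file formats.

```python
import numpy as np
from scipy.optimize import curve_fit

S = 5.0                                         # assumed axial/lateral aspect ratio

def g3d(tau, N, tauD):
    """3-D free-diffusion FCS autocorrelation model."""
    return (1.0 / N) / ((1 + tau / tauD) * np.sqrt(1 + tau / (S**2 * tauD)))

tau = np.logspace(-6, 0, 100)                   # lag times in seconds
rng = np.random.default_rng(3)
data = g3d(tau, N=8.0, tauD=1e-3) + rng.normal(0, 2e-4, tau.size)  # synthetic curve

(N_fit, tauD_fit), _ = curve_fit(g3d, tau, data, p0=(5.0, 5e-4))
print(f"N = {N_fit:.2f}, tauD = {tauD_fit:.2e} s")   # recovers ~8 and ~1e-3
```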
Saint: a lightweight integration environment for model annotation.
Lister, Allyson L; Pocock, Matthew; Taschuk, Morgan; Wipat, Anil
2009-11-15
Saint is a web application which provides a lightweight annotation integration environment for quantitative biological models. The system enables modellers to rapidly mark up models with biological information derived from a range of data sources. Saint is freely available for use on the web at http://www.cisban.ac.uk/saint. The web application is implemented in Google Web Toolkit and Tomcat, with all major browsers supported. The Java source code is freely available for download at http://saint-annotate.sourceforge.net. The Saint web server requires an installation of libSBML and has been tested on Linux (32-bit Ubuntu 8.10 and 9.04).
JBioWH: an open-source Java framework for bioinformatics data integration
Vera, Roberto; Perez-Riverol, Yasset; Perez, Sonia; Ligeti, Balázs; Kertész-Farkas, Attila; Pongor, Sándor
2013-01-01
The Java BioWareHouse (JBioWH) project is an open-source platform-independent programming framework that allows a user to build his/her own integrated database from the most popular data sources. JBioWH can be used for intensive querying of multiple data sources and the creation of streamlined task-specific data sets on local PCs. JBioWH is based on a MySQL relational database scheme and includes JAVA API parser functions for retrieving data from 20 public databases (e.g. NCBI, KEGG, etc.). It also includes a client desktop application for (non-programmer) users to query data. In addition, JBioWH can be tailored for use in specific circumstances, including the handling of massive queries for high-throughput analyses or CPU intensive calculations. The framework is provided with complete documentation and application examples and it can be downloaded from the Project Web site at http://code.google.com/p/jbiowh. A MySQL server is available for demonstration purposes at hydrax.icgeb.trieste.it:3307. Database URL: http://code.google.com/p/jbiowh PMID:23846595
Recent Updates to the System Advisor Model (SAM)
DOE Office of Scientific and Technical Information (OSTI.GOV)
DiOrio, Nicholas A
The System Advisor Model (SAM) is a mature suite of techno-economic models for many renewable energy technologies that can be downloaded for free as a desktop application or software development kit. SAM is used for system-level modeling, including generating performance projections. Recent additions covered here include the release of the code as an open source project on GitHub, the ability to download data directly into SAM from the National Solar Radiation Database (NSRDB), and updates to a user-interface macro that assists with PV system sizing. A brief update on SAM's battery model and its integration with the detailed photovoltaic model will also be discussed. Finally, an outline of planned work for the next year will be presented, including the addition of a bifacial model, support for multiple MPPT inputs for detailed inverter modeling, and the addition of a model for inverter thermal behavior.
Microcomputer software development facilities
NASA Technical Reports Server (NTRS)
Gorman, J. S.; Mathiasen, C.
1980-01-01
A more efficient and cost-effective method for developing microcomputer software is to utilize a host computer with high-speed peripheral support. Application programs such as cross assemblers, loaders, and simulators are implemented in the host computer for each of the microcomputers for which software development is a requirement. The host computer is configured to operate in a time-share mode for multiple users. The remote terminals, printers, and downloading capabilities provided are based on user requirements. With this configuration a user, either local or remote, can use the host computer for microcomputer software development. Once the software is developed (through the code and modular debug stage) it can be downloaded to the development system or emulator in a test area where hardware/software integration functions can proceed. The microcomputer software program sources reside in the host computer and can be edited, assembled, loaded, and then downloaded as required until the software development project has been completed.
Springate, David A.; Kontopantelis, Evangelos; Ashcroft, Darren M.; Olier, Ivan; Parisi, Rosa; Chamapiwa, Edmore; Reeves, David
2014-01-01
Lists of clinical codes are the foundation for research undertaken using electronic medical records (EMRs). If clinical code lists are not available, reviewers are unable to determine the validity of research, full study replication is impossible, researchers are unable to make effective comparisons between studies, and the construction of new code lists is subject to much duplication of effort. Despite this, the publication of clinical codes is rarely if ever a requirement for obtaining grants, validating protocols, or publishing research. In a representative sample of 450 EMR primary research articles indexed on PubMed, we found that only 19 (5.1%) were accompanied by a full set of published clinical codes and 32 (8.6%) stated that code lists were available on request. To help address these problems, we have built an online repository where researchers using EMRs can upload and download lists of clinical codes. The repository will enable clinical researchers to better validate EMR studies, build on previous code lists and compare disease definitions across studies. It will also assist health informaticians in replicating database studies, tracking changes in disease definitions or clinical coding practice through time and sharing clinical code information across platforms and data sources as research objects. PMID:24941260
Installing a Local Copy of the Reactome Web Site and Knowledgebase
McKay, Sheldon J; Weiser, Joel
2015-01-01
The Reactome project builds, maintains, and publishes a knowledgebase of biological pathways. The information in the knowledgebase is gathered from the experts in the field, peer reviewed, and edited by Reactome editorial staff and then published to the Reactome Web site, http://www.reactome.org (see UNIT 8.7; Croft et al., 2013). The Reactome software is open source and builds on top of other open-source or freely available software. Reactome data and code can be freely downloaded in its entirety and the Web site installed locally. This allows for more flexible interrogation of the data and also makes it possible to add one’s own information to the knowledgebase. PMID:26087747
MX: A beamline control system toolkit
NASA Astrophysics Data System (ADS)
Lavender, William M.
2000-06-01
The development of experimental and beamline control systems for two Collaborative Access Teams at the Advanced Photon Source has resulted in the creation of a portable data acquisition and control toolkit called MX. MX consists of a set of servers, application programs and libraries that enable the creation of command line and graphical user interface applications that may be easily retargeted to new and different kinds of motor and device controllers. The source code for MX is written in ANSI C and Tcl/Tk with interprocess communication via TCP/IP. MX is available for several versions of Unix, Windows 95/98/NT and DOS. It may be downloaded from the web site http://www.imca.aps.anl.gov/mx/.
Simulated single molecule microscopy with SMeagol.
Lindén, Martin; Ćurić, Vladimir; Boucharin, Alexis; Fange, David; Elf, Johan
2016-08-01
SMeagol is a software tool to simulate highly realistic microscopy data based on spatial systems biology models, in order to facilitate development, validation and optimization of advanced analysis methods for live cell single molecule microscopy data. SMeagol runs on Matlab R2014 and later, and uses compiled binaries in C for reaction-diffusion simulations. Documentation, source code and binaries for Mac OS, Windows and Ubuntu Linux can be downloaded from http://smeagol.sourceforge.net. Contact: johan.elf@icm.uu.se. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Interactive web-based identification and visualization of transcript shared sequences.
Azhir, Alaleh; Merino, Louis-Henri; Nauen, David W
2018-05-12
We have developed TraC (Transcript Consensus), a web-based tool for detecting and visualizing shared sequences among two or more mRNA transcripts such as splice variants. Results including exon-exon boundaries are returned in a highly intuitive, data-rich, interactive plot that permits users to explore the similarities and differences of multiple transcript sequences. The online tool (http://labs.pathology.jhu.edu/nauen/trac/) is free to use. The source code is freely available for download (https://github.com/nauenlab/TraC). Copyright © 2018 Elsevier Inc. All rights reserved.
Aiyetan, Paul; Zhang, Bai; Zhang, Zhen; Zhang, Hui
2014-01-01
Mass spectrometry based glycoproteomics has become a major means of identifying and characterizing previously N-linked glycan attached loci (glycosites). In the bottom-up approach, several factors, including but not limited to sample preparation, mass spectrometry analyses, and protein sequence database searches, result in previously N-linked peptide spectrum matches (PSMs) of varying lengths. Given that multiple PSMs can map to a glycosite, we reason that identified PSMs are varying-length peptide species of a unique set of glycosites. Because the associated spectra of these PSMs are typically summed separately, true glycosite-associated spectral counts are lost or complicated. These varying-length peptide species also complicate protein inference, as smaller peptide sequences are more likely to map to more proteins than larger peptides or the actual glycosite sequences. Here, we present XGlycScan. XGlycScan maps varying-length peptide species to glycosites to facilitate an accurate quantification of glycosite-associated spectral counts. We observed that this reduced the variability in reported identifications across mass spectrometry technical replicates of our sample dataset. We also observed that mapping identified peptides to glycosites provided an assessment of search-engine identification. Inherently, XGlycScan's reported glycosites reduce the complexity of protein inference. We implemented XGlycScan in the platform-independent Java programming language and have made it available as open source. XGlycScan's source code is freely available at https://bitbucket.org/paiyetan/xglycscan/src and its compiled binaries and documentation can be freely downloaded at https://bitbucket.org/paiyetan/xglycscan/downloads, respectively. The graphical user interface version can also be found at https://bitbucket.org/paiyetan/xglycscangui/src and https://bitbucket.org/paiyetan/xglycscangui/downloads.
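The mapping step described above reduces, in essence, to collapsing peptide species onto the glycosite sequence that contains them and summing spectra per site. Below is a toy Python sketch of that aggregation with invented sequences and counts; XGlycScan's actual matching logic is in its Java source.

```python
from collections import defaultdict

glycosites = {"siteA": "LNDSRK", "siteB": "VGNKTEWL"}    # site id -> site sequence
psms = [("NDSR", 3), ("LNDSRK", 5), ("GNKTE", 2), ("NKTEW", 4)]  # (peptide, spectra)

site_counts = defaultdict(int)
for peptide, count in psms:
    for site, seq in glycosites.items():
        if peptide in seq:                 # peptide species contained in the site
            site_counts[site] += count     # sum spectral counts at the glycosite level
            break

print(dict(site_counts))                   # {'siteA': 8, 'siteB': 6}
```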
HippDB: a database of readily targeted helical protein-protein interactions.
Bergey, Christina M; Watkins, Andrew M; Arora, Paramjit S
2013-11-01
HippDB catalogs every protein-protein interaction whose structure is available in the Protein Data Bank and which exhibits one or more helices at the interface. The Web site accepts queries on variables such as helix length and sequence, and it provides computational alanine scanning and change in solvent-accessible surface area values for every interfacial residue. HippDB is intended to serve as a starting point for structure-based small molecule and peptidomimetic drug development. HippDB is freely available on the web at http://www.nyu.edu/projects/arora/hippdb. The Web site is implemented in PHP, MySQL and Apache. Source code freely available for download at http://code.google.com/p/helidb, implemented in Perl and supported on Linux. arora@nyu.edu.
Mod3DMT and EMTF: Free Software for MT Data Processing and Inversion
NASA Astrophysics Data System (ADS)
Egbert, G. D.; Kelbert, A.; Meqbel, N. M.
2017-12-01
"ModEM" was developed at Oregon State University as a modular system for inversion of electromagnetic (EM) geophysical data (Egbert and Kelbert, 2012; Kelbert et al., 2014). Although designed for more general (frequency domain) EM applications, and originally intended as a testbed for exploring inversion search and regularization strategies, our own initial uses of ModEM were for 3-D imaging of the deep crust and upper mantle at large scales. Since 2013 we have offered a version of the source code suitable for 3D magnetotelluric (MT) inversion on an "as is, user beware" basis for free for non-commercial applications. This version, which we refer to as Mod3DMT, has since been widely used by the international MT community. Over 250 users have registered to download the source code, and at least 50 MT studies in the refereed literature, covering locations around the globe at a range of spatial scales, cite use of ModEM for 3D inversion. For over 30 years I have also made MT processing software available for free use. In this presentation, I will discuss my experience with these freely available (but perhaps not truly open-source) computer codes. Although users are allowed to make modifications to the codes (on conditions that they provide a copy of the modified version) only a handful of users have tried to make any modification, and only rarely are modifications even reported, much less provided back to the developers.
Multiple Primary and Histology Coding Rules - SEER
Download the coding manual and training resources for cases diagnosed from 2007 to 2017. Sites included are lung, breast, colon, melanoma of the skin, head and neck, kidney, renal pelvis/ureter/bladder, benign brain, and malignant brain.
Unipro UGENE: a unified bioinformatics toolkit.
Okonechnikov, Konstantin; Golosova, Olga; Fursov, Mikhail
2012-04-15
Unipro UGENE is a multiplatform open-source software with the main goal of assisting molecular biologists without much expertise in bioinformatics to manage, analyze and visualize their data. UGENE integrates widely used bioinformatics tools within a common user interface. The toolkit supports multiple biological data formats and allows the retrieval of data from remote data sources. It provides visualization modules for biological objects such as annotated genome sequences, Next Generation Sequencing (NGS) assembly data, multiple sequence alignments, phylogenetic trees and 3D structures. Most of the integrated algorithms are tuned for maximum performance by the usage of multithreading and special processor instructions. UGENE includes a visual environment for creating reusable workflows that can be launched on local resources or in a High Performance Computing (HPC) environment. UGENE is written in C++ using the Qt framework. The built-in plugin system and structured UGENE API make it possible to extend the toolkit with new functionality. UGENE binaries are freely available for MS Windows, Linux and Mac OS X at http://ugene.unipro.ru/download.html. UGENE code is licensed under the GPLv2; the information about the code licensing and copyright of integrated tools can be found in the LICENSE.3rd_party file provided with the source bundle.
catsHTM: A Tool for Fast Accessing and Cross-matching Large Astronomical Catalogs
NASA Astrophysics Data System (ADS)
Soumagnac, Maayane T.; Ofek, Eran O.
2018-07-01
Fast access to large catalogs is required for some astronomical applications. Here we introduce the catsHTM tool, consisting of several large catalogs reformatted into an HDF5-based file format, which can be downloaded and used locally. To allow fast access, the catalogs are partitioned into hierarchical triangular meshes and stored in HDF5 files. Several tools are provided to perform efficient cone searches at resolutions spanning from a few arcseconds to degrees, within a few milliseconds. The first released version includes the following catalogs (in alphabetical order): 2MASS, 2MASS extended sources, AKARI, APASS, Cosmos, DECaLS/DR5, FIRST, GAIA/DR1, GAIA/DR2, GALEX/DR6Plus7, HSC/v2, IPHAS/DR2, NED redshifts, NVSS, Pan-STARRS1/DR1, PTF photometric catalog, ROSAT faint source, SDSS sources, SDSS/DR14 spectroscopy, SkyMapper, Spitzer/SAGE, Spitzer/IRAC galactic center, UCAC4, UKIDSS/DR10, VST/ATLAS/DR3, VST/KiDS/DR3, WISE and XMM. We provide Python code that allows users to perform cone searches, as well as MATLAB code for performing cone searches, catalog cross-matching, and general searches, and for loading and creating these catalogs.
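The basic operation such a tool accelerates is the cone search. The sketch below implements the geometry naively (haversine angular separation over in-memory arrays); the mock catalog is an assumption, and catsHTM's HDF5/HTM partitioning is what makes the same query fast on catalogs of billions of rows.

```python
import numpy as np

def cone_search(ra, dec, ra0, dec0, radius_arcsec):
    """Return indices of (ra, dec) [deg] within radius_arcsec of (ra0, dec0)."""
    ra, dec, ra0, dec0 = map(np.radians, (ra, dec, ra0, dec0))
    # Haversine angular separation on the celestial sphere.
    d = 2 * np.arcsin(np.sqrt(np.sin((dec - dec0) / 2) ** 2
                              + np.cos(dec) * np.cos(dec0)
                              * np.sin((ra - ra0) / 2) ** 2))
    return np.where(np.degrees(d) * 3600.0 <= radius_arcsec)[0]

rng = np.random.default_rng(7)
ra = rng.uniform(149.9, 150.1, 10000)    # mock catalog around RA 150 deg
dec = rng.uniform(2.1, 2.3, 10000)       # ... and Dec 2.2 deg
print(cone_search(ra, dec, 150.0, 2.2, radius_arcsec=10.0))
```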
MultiElec: A MATLAB Based Application for MEA Data Analysis.
Georgiadis, Vassilis; Stephanou, Anastasis; Townsend, Paul A; Jackson, Thomas R
2015-01-01
We present MultiElec, an open source MATLAB based application for data analysis of microelectrode array (MEA) recordings. MultiElec displays an extremely user-friendly graphic user interface (GUI) that allows the simultaneous display and analysis of voltage traces for 60 electrodes and includes functions for activation-time determination, the production of activation-time heat maps with activation time and isoline display. Furthermore, local conduction velocities are semi-automatically calculated along with their corresponding vector plots. MultiElec allows ad hoc signal suppression, enabling the user to easily and efficiently handle signal artefacts and for incomplete data sets to be analysed. Voltage traces and heat maps can be simply exported for figure production and presentation. In addition, our platform is able to produce 3D videos of signal progression over all 60 electrodes. Functions are controlled entirely by a single GUI with no need for command line input or any understanding of MATLAB code. MultiElec is open source under the terms of the GNU General Public License as published by the Free Software Foundation, version 3. Both the program and source code are available to download from http://www.cancer.manchester.ac.uk/MultiElec/.
Chaste: An Open Source C++ Library for Computational Physiology and Biology
Mirams, Gary R.; Arthurs, Christopher J.; Bernabeu, Miguel O.; Bordas, Rafel; Cooper, Jonathan; Corrias, Alberto; Davit, Yohan; Dunn, Sara-Jane; Fletcher, Alexander G.; Harvey, Daniel G.; Marsh, Megan E.; Osborne, James M.; Pathmanathan, Pras; Pitt-Francis, Joe; Southern, James; Zemzemi, Nejib; Gavaghan, David J.
2013-01-01
Chaste — Cancer, Heart And Soft Tissue Environment — is an open source C++ library for the computational simulation of mathematical models developed for physiology and biology. Code development has been driven by two initial applications: cardiac electrophysiology and cancer development. A large number of cardiac electrophysiology studies have been enabled and performed, including high-performance computational investigations of defibrillation on realistic human cardiac geometries. New models for the initiation and growth of tumours have been developed. In particular, cell-based simulations have provided novel insight into the role of stem cells in the colorectal crypt. Chaste is constantly evolving and is now being applied to a far wider range of problems. The code provides modules for handling common scientific computing components, such as meshes and solvers for ordinary and partial differential equations (ODEs/PDEs). Re-use of these components avoids the need for researchers to ‘re-invent the wheel’ with each new project, accelerating the rate of progress in new applications. Chaste is developed using industrially-derived techniques, in particular test-driven development, to ensure code quality, re-use and reliability. In this article we provide examples that illustrate the types of problems Chaste can be used to solve, which can be run on a desktop computer. We highlight some scientific studies that have used or are using Chaste, and the insights they have provided. The source code, both for specific releases and the development version, is available to download under an open source Berkeley Software Distribution (BSD) licence at http://www.cs.ox.ac.uk/chaste, together with details of a mailing list and links to documentation and tutorials. PMID:23516352
NASA Astrophysics Data System (ADS)
Ames, D. P.
2013-12-01
As has been seen in other informatics fields, well-documented and appropriately licensed open source software tools have the potential to significantly increase both opportunities and motivation for inter-institutional science and technology collaboration. The CUAHSI HIS (and related HydroShare) projects have aimed to foster such activities in hydrology, resulting in the development of many useful community software components, including the HydroDesktop software application. HydroDesktop is an open source, GIS-based, scriptable software application for discovering data on the CUAHSI Hydrologic Information System and related resources. It includes a well-defined plugin architecture and interface that allow third-party developers to create extensions and add new functionality without recompiling the full source code. HydroDesktop is built in the C# programming language and uses the open source DotSpatial GIS engine for spatial data management. Capabilities include data search, discovery, download, visualization, and export. An extension that integrates the R programming language with HydroDesktop provides scripting and data automation capabilities, and an OpenMI plugin provides the ability to link models. Current revisions and updates to HydroDesktop include migration of core business logic to cross-platform, scriptable Python code modules that can be executed on any operating system or linked into other software front-end applications.
D-GENIES: dot plot large genomes in an interactive, efficient and simple way.
Cabanettes, Floréal; Klopp, Christophe
2018-01-01
Dot plots are widely used to quickly compare sequence sets. They provide a synthetic similarity overview, highlighting repetitions, breaks and inversions. Different tools have been developed to easily generate genomic alignment dot plots, but they are often limited by input sequence size. D-GENIES is a standalone and web application that performs large genome alignments using the minimap2 software package and generates interactive dot plots. It enables users to sort query sequences along the reference, zoom into the plot, and download several image, alignment or sequence files. D-GENIES is an easy-to-install, open-source software package (GPL) developed in Python and JavaScript. The source code is available at https://github.com/genotoul-bioinfo/dgenies and it can be tested at http://dgenies.toulouse.inra.fr/.
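As a rough illustration of the plot D-GENIES automates, the following sketch reads a minimap2 PAF alignment file and draws each match as a diagonal segment. The file name and the single-sequence-pair simplification are assumptions for the example; D-GENIES' own pipeline and interactive rendering are more involved.

```python
# Minimal sketch: draw a dot plot from a minimap2 PAF file, assuming
# minimap2 was run as: minimap2 ref.fa query.fa > aln.paf
# (the file name aln.paf is hypothetical).
import matplotlib.pyplot as plt

segments = []
with open("aln.paf") as paf:
    for line in paf:
        f = line.rstrip("\n").split("\t")
        qstart, qend = int(f[2]), int(f[3])   # query interval
        tstart, tend = int(f[7]), int(f[8])   # target interval
        if f[4] == "-":                       # reverse-strand matches get
            tstart, tend = tend, tstart       # a negative slope
        segments.append(((qstart, qend), (tstart, tend)))

for (qs, qe), (ts, te) in segments:
    plt.plot([qs, qe], [ts, te], linewidth=0.5)
plt.xlabel("query position (bp)")
plt.ylabel("reference position (bp)")
plt.savefig("dotplot.png", dpi=150)
```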
A Web-based open-source database for the distribution of hyperspectral signatures
NASA Astrophysics Data System (ADS)
Ferwerda, J. G.; Jones, S. D.; Du, Pei-Jun
2006-10-01
With the coming of age of field spectroscopy as a non-destructive means to collect information on the physiology of vegetation, there is a need for storage of signatures and, more importantly, their metadata. Without proper organisation of the metadata, the signatures themselves are of limited use. To facilitate re-distribution of data, a database for the storage and distribution of hyperspectral signatures and their metadata was designed. The database was built using open-source software and can be used by the hyperspectral community to share their data. Data are uploaded through a simple web-based interface. The database recognizes major file formats by ASD, GER and International Spectronics. The database source code is available for download through the hyperspectral.info web domain, and we invite suggestions for additions and modifications to the database through the online forums on the same website.
DOE Office of Scientific and Technical Information (OSTI.GOV)
2013-08-01
This is a Node.js command-line utility for scraping XML metadata from CSW and WFS services, downloading linkage data from CSW and WFS, pinging hosts and data linkages and recording the returned status codes, writing ping status to CSV files, and uploading data to Amazon S3.
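The original utility is Node.js; as a language-neutral illustration of its ping-and-log step, this Python sketch requests each URL and records the HTTP status (or a failure marker) in a CSV file. The URL list and file names are hypothetical.

```python
# Minimal sketch (not the OSTI utility, which is Node.js): check a list of
# URLs and record their HTTP status codes in a CSV file.
import csv
import urllib.request
import urllib.error

urls = ["https://example.org/data.xml"]  # hypothetical linkage URLs

with open("ping_status.csv", "w", newline="") as out:
    writer = csv.writer(out)
    writer.writerow(["url", "status"])
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                status = resp.status
        except urllib.error.HTTPError as err:
            status = err.code              # server answered with an error code
        except (urllib.error.URLError, OSError):
            status = "unreachable"         # DNS failure, timeout, etc.
        writer.writerow([url, status])
```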
YNOGK: A New Public Code for Calculating Null Geodesics in the Kerr Spacetime
NASA Astrophysics Data System (ADS)
Yang, Xiaolin; Wang, Jiancheng
2013-07-01
Following the work of Dexter & Agol, we present a new public code for the fast calculation of null geodesics in the Kerr spacetime. Using Weierstrass's and Jacobi's elliptic functions, we express all coordinates and affine parameters as analytical and numerical functions of a parameter p, an integral value along the geodesic. This is the main difference between our code and previous similar ones. The advantage of this treatment is that the information about the turning points does not need to be specified in advance by the user, and many applications, such as imaging, the calculation of line profiles, and the observer-emitter problem, become root-finding problems. All elliptic integrals are computed by Carlson's elliptic integral method, as in Dexter & Agol, which guarantees the fast computational speed of our code. The formulae to compute the constants of motion given by Cunningham & Bardeen have been extended, allowing one to readily handle situations in which the emitter or the observer has an arbitrary distance from, and motion state with respect to, the central compact object. The code has been validated extensively through applications to toy problems from the literature. The source FORTRAN code is freely available for download on our Web site http://www1.ynao.ac.cn/~yangxl/yxl.html.
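To make the Carlson ingredient concrete: SciPy (1.8 and later) exposes Carlson's symmetric form R_F, from which the familiar elliptic integrals are recovered, e.g. K(m) = R_F(0, 1-m, 1). This is only a sketch of the numerical building block, not of YNOGK itself, which is Fortran.

```python
# Carlson's symmetric elliptic integral R_F, the building block behind
# fast geodesic integrals (as in Dexter & Agol). Requires SciPy >= 1.8.
from scipy.special import elliprf
import numpy as np

# Complete elliptic integral of the first kind via the Carlson form:
# K(m) = R_F(0, 1 - m, 1)
m = 0.5
K = elliprf(0.0, 1.0 - m, 1.0)
print(np.isclose(K, 1.8540746773))  # matches scipy.special.ellipk(0.5)
```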
Open source software to control Bioflo bioreactors.
Burdge, David A; Libourel, Igor G L
2014-01-01
Bioreactors are designed to support highly controlled environments for the growth of tissues, cell cultures or microbial cultures. A variety of bioreactors are commercially available, often including sophisticated software to enhance the functionality of the bioreactor. However, experiments that the bioreactor hardware can support, but that were not envisioned during the software design, cannot be performed without developing custom software. In addition, support for third-party or custom-designed auxiliary hardware is often sparse or absent. This work presents flexible open source freeware for the control of bioreactors of the Bioflo product family. The functionality of the software includes setpoint control, data logging, and protocol execution. Auxiliary hardware can be easily integrated and controlled through an integrated plugin interface without altering the existing software. Simple experimental protocols can be entered as a CSV scripting file, and a Python-based protocol execution model is included for more demanding conditional experimental control. The software was designed to be a more flexible, free and open source alternative to the commercially available solution. The source code and various auxiliary hardware plugins are publicly available for download from https://github.com/LibourelLab/BiofloSoftware. In addition to the source code, the software was compiled and packaged as a self-installing file for 32- and 64-bit Windows operating systems. The compiled software can control a Bioflo system without requiring the installation of LabVIEW.
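A minimal sketch of the CSV-protocol idea follows, assuming a hypothetical three-column layout (seconds, loop, value); the actual BiofloSoftware scripting format and vessel communication layer are not reproduced here.

```python
# Minimal sketch of a CSV-driven protocol runner (hypothetical file layout,
# not the BiofloSoftware format): each row sets one controller setpoint at
# a given time offset.
import csv
import time

def set_setpoint(loop: str, value: float) -> None:
    # Placeholder for the vessel communication layer (serial commands, etc.)
    print(f"set {loop} -> {value}")

with open("protocol.csv") as f:
    start = time.monotonic()
    for row in csv.DictReader(f):            # columns: seconds,loop,value
        delay = float(row["seconds"]) - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)                # wait until the step is due
        set_setpoint(row["loop"], float(row["value"]))
```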
AMPA: an automated web server for prediction of protein antimicrobial regions.
Torrent, Marc; Di Tommaso, Paolo; Pulido, David; Nogués, M Victòria; Notredame, Cedric; Boix, Ester; Andreu, David
2012-01-01
AMPA is a web application for assessing the antimicrobial domains of proteins, with a focus on the design of new antimicrobial drugs. The application provides fast discovery of antimicrobial patterns in proteins that can be used to develop new peptide-based drugs against pathogens. Results are shown in a user-friendly graphical interface and can be downloaded as raw data for later examination. AMPA is freely available on the web at http://tcoffee.crg.cat/apps/ampa. The source code is also available on the web. marc.torrent@upf.edu; david.andreu@upf.edu. Supplementary data are available at Bioinformatics online.
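AMPA's underlying approach is a sliding-window scan of residue propensities. The sketch below shows that pattern with an invented propensity table and threshold, not AMPA's published antimicrobial index.

```python
# Minimal sketch of a sliding-window propensity scan of the kind AMPA
# performs; the propensity values below are invented placeholders, not
# the AMPA scale.
propensity = {"A": 0.3, "K": -0.2, "R": -0.3, "L": 0.1, "G": 0.2}  # hypothetical

def window_scores(seq: str, width: int = 7):
    half = width // 2
    for i in range(half, len(seq) - half):
        window = seq[i - half : i + half + 1]
        yield i, sum(propensity.get(aa, 0.0) for aa in window) / width

for pos, score in window_scores("KRLAKGLAKRLAGKA"):   # hypothetical sequence
    if score < 0:     # low value = putative antimicrobial stretch
        print(pos, round(score, 3))
```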
Gobe: an interactive, web-based tool for comparative genomic visualization.
Pedersen, Brent S; Tang, Haibao; Freeling, Michael
2011-04-01
Gobe is a web-based tool for viewing comparative genomic data. It supports viewing multiple genomic regions simultaneously. Its simple text format and Flash-based rendering make it an interactive, exploratory research tool. Gobe can be used without installation through our web service, or downloaded and customized with stylesheets and JavaScript callback functions. Gobe is a Flash application that runs in all modern web browsers. The full source code, including that for the online web application, is available under the MIT license at http://github.com/brentp/gobe. Sample applications are hosted at http://try-gobe.appspot.com/ and http://synteny.cnr.berkeley.edu/gobe-app/.
jSquid: a Java applet for graphical on-line network exploration.
Klammer, Martin; Roopra, Sanjit; Sonnhammer, Erik L L
2008-06-15
jSquid is a graph visualization tool for exploring graphs from protein-protein interaction or functional coupling networks. The tool was designed for the FunCoup web site, but can be used for any similar network-exploration purpose. The program offers various visualization and graph-manipulation techniques to increase its utility for the user. jSquid is available for direct use and download at http://jSquid.sbc.su.se, including source code under the GPLv3 license and input examples. It requires Java version 5 or higher to run properly. erik.sonnhammer@sbc.su.se. Supplementary data are available at Bioinformatics online.
Jackson, M E; Gnadt, J W
1999-03-01
The object-oriented graphical programming language LabVIEW was used to implement the numerical solution to a computational model of saccade generation in primates. The computational model simulates the activity and connectivity of anatomical structures known to be involved in saccadic eye movements. The LabVIEW program provides a graphical user interface to the model that makes it easy to observe and modify the behavior of each element of the model. Essential elements of the source code of the LabVIEW program are presented and explained. A copy of the model is available for download from the internet.
Coronal Magnetism and the FORWARD SolarSoft IDL Package
NASA Astrophysics Data System (ADS)
Gibson, S. E.
2014-12-01
The FORWARD suite of SolarSoft IDL codes is a community resource for model-data comparison, with a particular emphasis on analyzing coronal magnetic fields. FORWARD may be used both to synthesize a broad range of coronal observables and to access and compare to existing data. FORWARD works with numerical model datacubes, interfaces with the web-served Predictive Science Inc. MAS simulation datacubes and the SolarSoft IDL Potential Field Source Surface (PFSS) package, and also includes several analytic models (more can be added). It connects to the Virtual Solar Observatory and other web-served observations to download data in a format directly comparable to model predictions. It utilizes the CHIANTI database in modeling UV/EUV lines, and links to the CLE polarimetry synthesis code for forbidden coronal lines. FORWARD enables "forward-fitting" of specific observations, and helps to build intuition into how the physical properties of coronal magnetic structures translate to observable properties.
MzJava: An open source library for mass spectrometry data processing.
Horlacher, Oliver; Nikitin, Frederic; Alocci, Davide; Mariethoz, Julien; Müller, Markus; Lisacek, Frederique
2015-11-03
Mass spectrometry (MS) is a widely used and evolving technique for the high-throughput identification of molecules in biological samples. The need for sharing and reuse of code among bioinformaticians working with MS data prompted the design and implementation of MzJava, an open-source Java Application Programming Interface (API) for MS-related data processing. MzJava provides data structures and algorithms for representing and processing mass spectra and their associated biological molecules, such as metabolites, glycans and peptides. MzJava includes functionality to perform mass calculation, peak processing (e.g. centroiding, filtering, transforming), spectrum alignment and clustering, protein digestion, fragmentation of peptides and glycans, as well as scoring functions for spectrum-spectrum and peptide/glycan-spectrum matches. For data import and export, MzJava implements readers and writers for commonly used data formats. For many classes, support for the Hadoop MapReduce (hadoop.apache.org) and Apache Spark (spark.apache.org) cluster-computing frameworks was implemented. The library has been developed applying best practices of software engineering. To ensure that MzJava contains code that is correct and easy to use, the library's API was carefully designed and thoroughly tested. MzJava is an open-source project distributed under the AGPL v3.0 licence. MzJava requires Java 1.7 or higher. Binaries, source code and documentation can be downloaded from http://mzjava.expasy.org and https://bitbucket.org/sib-pig/mzjava. This article is part of a Special Issue entitled: Computational Proteomics.
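As an illustration of one processing step named above, here is a minimal profile-to-centroid peak picker: local maxima are replaced by intensity-weighted centroids. MzJava itself is a Java API; this Python analogue only mirrors the concept, not MzJava's classes.

```python
# Minimal sketch of profile-to-centroid peak picking (illustrative only,
# not the MzJava implementation): find local maxima and compute an
# intensity-weighted centroid over a 3-point window.
def centroid(mz, intensity):
    peaks = []
    for i in range(1, len(mz) - 1):
        if intensity[i] >= intensity[i - 1] and intensity[i] > intensity[i + 1]:
            lo, hi = i - 1, i + 1
            total = sum(intensity[lo:hi + 1])
            center = sum(m * y for m, y in
                         zip(mz[lo:hi + 1], intensity[lo:hi + 1])) / total
            peaks.append((center, total))
    return peaks

print(centroid([100.00, 100.01, 100.02, 100.03], [0.0, 5.0, 3.0, 0.0]))
```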
Seismic Analysis Code (SAC): Development, porting, and maintenance within a legacy code base
NASA Astrophysics Data System (ADS)
Savage, B.; Snoke, J. A.
2017-12-01
The Seismic Analysis Code (SAC) is the result of the toil of many developers over an almost 40-year history. Initially a Fortran-based code, it has undergone major transitions in underlying bit size, from 16 to 32 in the 1980s and from 32 to 64 in 2009, as well as a change in language from Fortran to C in the late 1990s. Maintenance of SAC, the program and its associated libraries, has tracked changes in hardware and operating systems, including the advent of Linux in the early 1990s, the emergence and demise of Sun/Solaris, variants of OSX processors (PowerPC and x86), and Windows (Cygwin). Traces of these systems are still visible in the source code and associated comments. A major concern while improving and maintaining a routinely used legacy code is the fear of introducing bugs or inadvertently removing favorite features of long-time users. Prior to 2004, SAC was maintained and distributed by LLNL (Lawrence Livermore National Lab). In that year, the license was transferred from LLNL to IRIS (Incorporated Research Institutions for Seismology), but the license is not open source. Nevertheless, there have been thousands of downloads a year of the package, either source code or binaries for specific systems. Starting in 2004, the co-authors have maintained the SAC package for IRIS. In our updates, we fixed bugs, incorporated newly introduced seismic analysis procedures (such as EVALRESP), added new, accessible features (plotting and parsing), and improved the documentation (now in HTML and PDF formats). Moreover, we have added modern software engineering practices to the development of SAC, including the use of recent source control systems, high-level tests, and scripted, virtualized environments for rapid testing and building. Finally, a "sac-help" listserv (administered by IRIS) was set up for SAC-related issues and is the primary avenue for users seeking advice and reporting bugs. Attempts are always made to respond to issues and bugs in a timely fashion. For the past thirty-plus years, SAC files have contained a fixed-length header. Time- and distance-related values are stored in single precision, which has become a problem with the increase in desired precision for data compared to thirty years ago. A future goal is to address this precision problem, but in a backward-compatible manner. We would also like to transition SAC to a more open source license.
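The single-precision header issue is easy to see at the byte level. Assuming the classic 632-byte SAC header layout (words 0-69 are floats and words 70-109 are integers, with DELTA at word 0 and NPTS at word 79) and a little-endian file, a sketch:

```python
# Minimal sketch: read two fields from a classic binary SAC header.
# Layout assumptions (stated above): DELTA = float word 0, NPTS = int
# word 79, little-endian; the file name is hypothetical.
import struct

with open("example.sac", "rb") as f:
    header = f.read(632)                              # fixed-length header

delta = struct.unpack_from("<f", header, 0 * 4)[0]    # sample interval, single precision
npts = struct.unpack_from("<i", header, 79 * 4)[0]    # number of data points
print(delta, npts)
```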
User's Guide for RESRAD-OFFSITE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gnanapragasam, E.; Yu, C.
2015-04-01
The RESRAD-OFFSITE code can be used to model the radiological dose or risk to an offsite receptor. This User's Guide for RESRAD-OFFSITE Version 3.1 is an update of the User's Guide for RESRAD-OFFSITE Version 2 contained in Appendix A of the User's Manual for RESRAD-OFFSITE Version 2 (ANL/EVS/TM/07-1, DOE/HS-0005, NUREG/CR-6937). This user's guide presents the basic information necessary to use Version 3.1 of the code. It also points to the help file and other documents that provide more detailed information about the inputs, the input forms and the features/tools in the code; two of the features (overriding the source term and computing area factors) are discussed in the appendices to this guide. Section 2 describes how to download and install the code and then verify the installation of the code. Section 3 shows ways to navigate through the input screens to simulate various exposure scenarios and to view the results in graphics and text reports. Section 4 has screen shots of each input form in the code and provides basic information about each parameter to increase the user's understanding of the code. Section 5 outlines the contents of all the text reports and the graphical output. It also describes the commands in the two output viewers. Section 6 deals with the probabilistic and sensitivity analysis tools available in the code. Section 7 details the various ways of obtaining help in the code.
BASiNET-BiologicAl Sequences NETwork: a case study on coding and non-coding RNAs identification.
Ito, Eric Augusto; Katahira, Isaque; Vicente, Fábio Fernandes da Rocha; Pereira, Luiz Filipe Protasio; Lopes, Fabrício Martins
2018-06-05
With the emergence of Next Generation Sequencing (NGS) technologies, a large volume of sequence data, in particular from de novo sequencing, has been produced rapidly and at relatively low cost. In this context, computational tools are increasingly important to assist in the identification of relevant information to understand the functioning of organisms. This work introduces BASiNET, an alignment-free tool for classifying biological sequences based on feature extraction from complex network measurements. The method initially transforms the sequences and represents them as complex networks. It then extracts topological measures and constructs a feature vector that is used to classify the sequences. The method was evaluated on the classification of coding and non-coding RNAs of 13 species and compared to the CNCI, PLEK and CPC2 methods. BASiNET outperformed all compared methods on all adopted organisms and datasets. BASiNET classified sequences in all organisms with high accuracy and low standard deviation, showing that the method is robust and not biased by organism. The proposed methodology is implemented in open source in the R language and freely available for download at https://cran.r-project.org/package=BASiNET.
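A minimal sketch of the sequence-to-network idea using networkx: consecutive k-mers become linked nodes, and simple topological measures form a feature vector. The k-mer construction and the two measures chosen here are illustrative assumptions, not BASiNET's exact R implementation.

```python
# Minimal sketch: map a sequence to a k-mer adjacency network and extract
# simple topological features (illustrative; BASiNET uses its own network
# construction and a larger measure set).
import networkx as nx

def sequence_network(seq: str, k: int = 3) -> nx.Graph:
    g = nx.Graph()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    g.add_edges_from(zip(kmers, kmers[1:]))   # consecutive k-mers share an edge
    return g

def features(seq: str):
    g = sequence_network(seq)
    degs = [d for _, d in g.degree()]
    return [sum(degs) / len(degs), nx.average_clustering(g)]

print(features("ATGGCGTACGTTAGCATGGCGT"))     # feature vector for a classifier
```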
The IntAct molecular interaction database in 2012
Kerrien, Samuel; Aranda, Bruno; Breuza, Lionel; Bridge, Alan; Broackes-Carter, Fiona; Chen, Carol; Duesbury, Margaret; Dumousseau, Marine; Feuermann, Marc; Hinz, Ursula; Jandrasits, Christine; Jimenez, Rafael C.; Khadake, Jyoti; Mahadevan, Usha; Masson, Patrick; Pedruzzi, Ivo; Pfeiffenberger, Eric; Porras, Pablo; Raghunath, Arathi; Roechert, Bernd; Orchard, Sandra; Hermjakob, Henning
2012-01-01
IntAct is an open-source, open data molecular interaction database populated by data either curated from the literature or from direct data depositions. Two levels of curation are now available within the database, with both IMEx-level annotation and less detailed MIMIx-compatible entries currently supported. As of September 2011, IntAct contains approximately 275 000 curated binary interaction evidences from over 5000 publications. The IntAct website has been improved to enhance the search process and in particular the graphical display of the results. New data download formats are also available, which will facilitate the inclusion of IntAct's data in the Semantic Web. IntAct is an active contributor to the IMEx consortium (http://www.imexconsortium.org). IntAct source code and data are freely available at http://www.ebi.ac.uk/intact. PMID:22121220
SeisCode: A seismological software repository for discovery and collaboration
NASA Astrophysics Data System (ADS)
Trabant, C.; Reyes, C. G.; Clark, A.; Karstens, R.
2012-12-01
SeisCode is a community repository for software used in seismological and related fields. The repository is intended to increase discoverability of such software and to provide a long-term home for software projects. Other places exist where seismological software may be found, but none meet the requirements necessary for an always-current, easy-to-search, well-documented, and citable resource for projects. Organizations such as IRIS, ORFEUS, and the USGS have websites with lists of available or contributed seismological software. Since the authors themselves often do not maintain these lists, the documentation often consists of a sentence or paragraph, and the available software may be outdated. Repositories such as GoogleCode and SourceForge, which are directly maintained by the authors, provide version control and issue tracking but do not provide a unified way of locating geophysical software scattered in and among countless unrelated projects. Additionally, projects are hosted at language-specific sites such as Mathworks and PyPI, in FTP directories, and in websites strewn across the Web. Search engines are only partially effective discovery tools, as the desired software is often hidden deep within the results. SeisCode provides software authors a place to present their software, codes, scripts, tutorials, and examples to the seismological community. Authors can choose their own level of involvement. At one end of the spectrum, the author might simply create a web page that points to an existing site. At the other extreme, an author may choose to leverage the many tools provided by SeisCode, such as a source code management tool with integrated issue tracking, forums, news feeds, downloads, wikis, and more. For software development projects with multiple authors, SeisCode can also be used as a central site for collaboration. SeisCode provides the community with an easy way to discover software, while providing authors a way to build a community around their software packages. IRIS invites the seismological community to browse and to submit projects to https://seiscode.iris.washington.edu/
NASA Astrophysics Data System (ADS)
Jarboe, N.; Minnett, R.; Koppers, A.; Constable, C.; Tauxe, L.; Jonestrask, L.
2017-12-01
The Magnetics Information Consortium (MagIC) supports an online database for the paleo, geo, and rock magnetic communities (https://earthref.org/MagIC). Researchers can upload data into the archive and download data as selected with a sophisticated search system. MagIC has completed the transition from an Oracle-backed, Perl-based, server-oriented website to an ElasticSearch-backed, Meteor-based thick-client website technology stack. Using JavaScript on both the server and the client enables increased code reuse and allows easy offloading of many computational operations to the client for faster response. On-the-fly data validation, column header suggestion, and online spreadsheet editing are some new features available with the new system. The 3.0 data model, method codes, and vocabulary lists can be browsed via the MagIC website and more easily updated. Source code for MagIC is publicly available on GitHub (https://github.com/earthref/MagIC). The MagIC file format is natively compatible with the PmagPy (https://github.com/PmagPy/PmagPy) paleomagnetic analysis software. MagIC files can now be downloaded from the database and viewed and interpreted in the PmagPy GUI-based tool, pmag_gui. Changes or interpretations of the data can then be saved by pmag_gui in the MagIC 3.0 data format and easily uploaded to the MagIC database. The rate of new contributions to the database has been increasing, with many labs contributing measurement-level data for the first time in the last year. Over a dozen file format conversion scripts are available for translating non-MagIC measurement data files into the MagIC format for easy uploading. We will continue to work with more labs until the whole community has a manageable workflow for contributing their measurement-level data. MagIC will continue to provide a global repository for archiving and retrieving paleomagnetic and rock magnetic data and, with the new system in place, be able to respond more quickly to the community's requests for changes and improvements.
Development of an IHE MRRT-compliant open-source web-based reporting platform.
Pinto Dos Santos, Daniel; Klos, G; Kloeckner, R; Oberle, R; Dueber, C; Mildenberger, P
2017-01-01
To develop a platform that uses structured reporting templates according to the IHE Management of Radiology Report Templates (MRRT) profile, and to implement this platform into clinical routine. The reporting platform uses standard web technologies (HTML/JavaScript and PHP/MySQL) only. Several freely available external libraries were used to simplify the programming. The platform runs on a standard web server, connects with the radiology information system (RIS) and PACS, and is easily accessible via a standard web browser. A prototype platform that allows structured reporting to be easily incorporated into the clinical routine was developed and successfully tested. To date, 797 reports have been generated using IHE MRRT-compliant templates (many of them downloaded from the RSNA's radreport.org website). Reports are stored in a MySQL database and are easily accessible for further analyses. Development of an IHE MRRT-compliant platform for structured reporting is feasible using only standard web technologies. All source code will be made available upon request under a free license, and the participation of other institutions in further development is welcome. • A platform for structured reporting using IHE MRRT-compliant templates is presented. • Incorporating structured reporting into clinical routine is feasible. • Full source code will be provided upon request under a free license.
JAMI: a Java library for molecular interactions and data interoperability.
Sivade Dumousseau, M; Koch, M; Shrivastava, A; Alonso-López, D; De Las Rivas, J; Del-Toro, N; Combe, C W; Meldal, B H M; Heimbach, J; Rappsilber, J; Sullivan, J; Yehudi, Y; Orchard, S
2018-04-11
A number of different molecular interaction data download formats now exist, designed to allow access to these valuable data by diverse user groups. These formats include the PSI-XML and MITAB standard interchange formats developed by the Molecular Interaction workgroup of the HUPO-PSI, in addition to other, use-specific downloads produced by other resources. The onus is currently on the user to ensure that a piece of software is capable of reading and writing all necessary versions of each format. This problem may grow as data providers strive to meet ever more sophisticated user demands and data types. A collaboration between EMBL-EBI and the University of Cambridge has produced JAMI, a single library to unify standard molecular interaction data formats such as PSI-MI XML and PSI-MITAB. The JAMI free, open-source library enables the development of molecular interaction computational tools and pipelines without the need to produce different versions of software to read different versions of the data formats. Software and tools developed on top of the JAMI framework are able to integrate and support both PSI-MI XML and PSI-MITAB. The use of JAMI avoids the requirement to chain conversions between formats in order to reach a desired output format, and prevents code and unit-test duplication as the code becomes more modular. JAMI's model interfaces are abstracted from the underlying format, hiding the complexity and requirements of each data format from developers using JAMI as a library.
Design of Provider-Provisioned Website Protection Scheme against Malware Distribution
NASA Astrophysics Data System (ADS)
Yagi, Takeshi; Tanimoto, Naoto; Hariu, Takeo; Itoh, Mitsutaka
Vulnerabilities in web applications expose computer networks to security threats, and many websites are used by attackers as hopping sites to attack other websites and user terminals. These incidents prevent service providers from constructing secure networking environments. To protect websites from attacks exploiting vulnerabilities in web applications, service providers use web application firewalls (WAFs). WAFs filter accesses from attackers by using signatures, which are generated based on the exploit codes of previous attacks. However, WAFs cannot filter unknown attacks because the signatures cannot reflect new types of attacks. In service provider environments, the number of exploit codes has recently increased rapidly because of the spread of vulnerable web applications that have been developed through cloud computing. Thus, generating signatures for all exploit codes is difficult. To solve these problems, our proposed scheme detects and filters malware downloads that are sent from websites which have already received exploit codes. In addition, to collect information for detecting malware downloads, web honeypots, which automatically extract the communication records of exploit codes, are used. According to the results of experiments using a prototype, our scheme can filter attacks automatically so that service providers can provide secure and cost-effective network environments.
Sridhar, Vishnu B; Tian, Peifang; Dale, Anders M; Devor, Anna; Saisan, Payam A
2014-01-01
We present a database client software, Neurovascular Network Explorer 1.0 (NNE 1.0), that uses a MATLAB®-based Graphical User Interface (GUI) for interaction with a database of 2-photon single-vessel diameter measurements from our previous publication (Tian et al., 2010). These data are of particular interest for modeling the hemodynamic response. NNE 1.0 is downloaded by the user and then runs either as a MATLAB script or as a standalone program on a Windows platform. The GUI allows browsing the database according to parameters specified by the user, simple manipulation and visualization of the retrieved records (such as averaging and peak normalization), and export of the results. Further, we provide the NNE 1.0 source code. With this source code, the user can build a database of their own experimental results, given the appropriate data structure and naming conventions, and thus share their data in a user-friendly format with other investigators. NNE 1.0 provides an example of a seamless and low-cost solution for sharing of experimental data by a regular-size neuroscience laboratory and may serve as a general template, facilitating dissemination of biological results and accelerating data-driven modeling approaches.
An automated workflow for parallel processing of large multiview SPIM recordings
Schmied, Christopher; Steinbach, Peter; Pietzsch, Tobias; Preibisch, Stephan; Tomancak, Pavel
2016-01-01
Summary: Selective Plane Illumination Microscopy (SPIM) makes it possible to image developing organisms in 3D at unprecedented temporal resolution over long periods of time. The resulting massive amounts of raw image data require extensive processing, typically performed interactively via dedicated graphical user interface (GUI) applications. The consecutive processing steps can be easily automated, and the individual time points can be processed independently, which lends itself to trivial parallelization on a high-performance computing (HPC) cluster. Here, we introduce an automated workflow for processing large multiview, multichannel, multiillumination time-lapse SPIM data on a single workstation or in parallel on an HPC cluster. The pipeline relies on snakemake to resolve dependencies among consecutive processing steps and can be easily adapted to any cluster environment, processing SPIM data in a fraction of the time required to collect it. Availability and implementation: The code is distributed free and open source under the MIT license http://opensource.org/licenses/MIT. The source code can be downloaded from GitHub: https://github.com/mpicbg-scicomp/snakemake-workflows. Documentation can be found here: http://fiji.sc/Automated_workflow_for_parallel_Multiview_Reconstruction. Contact: schmied@mpi-cbg.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26628585
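For readers new to snakemake, here is a toy Snakefile (snakemake's Python-based DSL) showing the dependency-driven pattern the pipeline exploits: each time point is an independent job, so a scheduler can fan them out across cluster nodes. Rule names, paths and the stand-in shell command are hypothetical, not taken from the published workflow.

```python
# Toy Snakefile (hypothetical rules, not the published SPIM pipeline):
# snakemake infers that 'register' must run once per time point before
# 'all' is satisfied, and runs those jobs in parallel where possible.
TIMEPOINTS = range(3)

rule all:
    input:
        expand("registered/tp{t}.tif", t=TIMEPOINTS)

rule register:
    input:
        "raw/tp{t}.tif"
    output:
        "registered/tp{t}.tif"
    shell:
        "cp {input} {output}"   # stand-in for the real registration step
```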
eQuilibrator--the biochemical thermodynamics calculator.
Flamholz, Avi; Noor, Elad; Bar-Even, Arren; Milo, Ron
2012-01-01
The laws of thermodynamics constrain the action of biochemical systems. However, thermodynamic data on biochemical compounds can be difficult to find, and performing calculations with them manually is cumbersome. Even simple thermodynamic questions like 'how much Gibbs energy is released by ATP hydrolysis at pH 5?' are complicated excessively by the search for accurate data. To address this problem, eQuilibrator couples a comprehensive and accurate database of thermodynamic properties of biochemical compounds and reactions with a simple and powerful online search and calculation interface. The web interface to eQuilibrator (http://equilibrator.weizmann.ac.il) enables easy calculation of Gibbs energies of compounds and reactions given arbitrary pH, ionic strength and metabolite concentrations. The eQuilibrator code is open source and all thermodynamic source data are freely downloadable in standard formats. Here we describe the database characteristics and implementation and demonstrate its use.
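The kind of calculation eQuilibrator automates follows directly from the transformed reaction energy, ΔrG' = ΔrG'° + RT ln Q. In the sketch below, the -30.5 kJ/mol standard value for ATP hydrolysis at pH 7 is a textbook figure and the concentrations are invented, so the numbers illustrate the formula rather than reproduce eQuilibrator's output (which also adjusts for pH and ionic strength).

```python
# Worked example of the calculation eQuilibrator automates:
# Delta_r G' = Delta_r G'^o + RT ln Q. The standard value below is a
# textbook figure for pH 7, not taken from eQuilibrator's database.
import math

R = 8.314e-3          # kJ / (mol K)
T = 298.15            # K
dG0_prime = -30.5     # kJ/mol, ATP + H2O -> ADP + Pi at pH 7 (textbook value)

atp, adp, pi = 1e-3, 1e-4, 1e-3   # molar concentrations (hypothetical)
Q = (adp * pi) / atp
print(dG0_prime + R * T * math.log(Q))   # about -53 kJ/mol
```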
Microsoft Biology Initiative: .NET Bioinformatics Platform and Tools
Diaz Acosta, B.
2011-01-01
The Microsoft Biology Initiative (MBI) is an effort in Microsoft Research to bring new technology and tools to the area of bioinformatics and biology. This initiative comprises two primary components, the Microsoft Biology Foundation (MBF) and the Microsoft Biology Tools (MBT). MBF is a language-neutral bioinformatics toolkit built as an extension to the Microsoft .NET Framework, initially aimed at the area of genomics research. Currently, it implements a range of parsers for common bioinformatics file formats; a range of algorithms for manipulating DNA, RNA, and protein sequences; and a set of connectors to biological web services such as NCBI BLAST. MBF is available under an open source license, and executables, source code, demo applications, documentation and training materials are freely downloadable from http://research.microsoft.com/bio. MBT is a collection of tools that enable biology and bioinformatics researchers to be more productive in making scientific discoveries.
MetaQuant: a tool for the automatic quantification of GC/MS-based metabolome data.
Bunk, Boyke; Kucklick, Martin; Jonas, Rochus; Münch, Richard; Schobert, Max; Jahn, Dieter; Hiller, Karsten
2006-12-01
MetaQuant is a Java-based program for the automatic and accurate quantification of GC/MS-based metabolome data. In contrast to other programs, MetaQuant is able to quantify hundreds of substances simultaneously with minimal manual intervention. The integration of a self-acting calibration function allows fast, parallel calibration for several metabolites simultaneously. Finally, MetaQuant is able to import GC/MS data in the common NetCDF format and to export the results of the quantification into Systems Biology Markup Language (SBML), Comma Separated Values (CSV) or Microsoft Excel (XLS) format. MetaQuant is written in Java and is available under an open source license. Precompiled packages for installation on Windows or Linux operating systems are freely available for download. The source code as well as the installation packages are available at http://bioinformatics.org/metaquant.
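The calibration step reduces to fitting a response curve per metabolite. Below is a minimal least-squares sketch with invented standards; MetaQuant is Java, and its self-acting calibration is more elaborate than this illustration.

```python
# Minimal sketch of the calibration idea: fit a linear response curve from
# standards, then back-calculate an unknown (all values hypothetical).
import numpy as np

conc = np.array([1.0, 5.0, 10.0, 50.0])          # standard concentrations (uM)
area = np.array([110.0, 530.0, 1020.0, 5100.0])  # measured peak areas

slope, intercept = np.polyfit(conc, area, 1)     # least-squares calibration line
unknown_area = 2500.0
print((unknown_area - intercept) / slope)        # back-calculated concentration
```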
Duley, Aaron R; Janelle, Christopher M; Coombes, Stephen A
2004-11-01
The cardiovascular system has been extensively measured in a variety of research and clinical domains. Despite technological and methodological advances in cardiovascular science, the analysis and evaluation of phasic changes in heart rate persists as a way to assess numerous psychological concomitants. Some researchers, however, have pointed to constraints on data analysis when evaluating cardiac activity indexed by heart rate or heart period. Thus, an off-line application toolkit for heart rate analysis is presented. The program, written with National Instruments' LabVIEW, incorporates a variety of tools for off-line extraction and analysis of heart rate data. Current methods and issues concerning heart rate analysis are highlighted, and we discuss how the toolkit provides a flexible environment to ameliorate common problems that typically lead to trial rejection. Source code for this program may be downloaded from the Psychonomic Society Web archive at www.psychonomic.org/archive/.
Donkin, Christopher; Brown, Scott D; Heathcote, Andrew
2009-02-01
Psychological experiments often collect choice responses using buttonpresses. However, spoken responses are useful in many cases; for example, when working with special clinical populations, when a paradigm demands vocalization, or when accurate response time measurements are desired. In these cases, spoken responses are typically collected using a voice key, which usually involves manual coding by experimenters in a tedious and error-prone manner. We describe ChoiceKey, an open-source speech recognition package for MATLAB. It can be optimized by training for small response sets and different speakers. We show ChoiceKey to be reliable with minimal training for most participants in experiments with two different responses. Problems presented by individual differences and occasional atypical responses are examined, and extensions to larger response sets are explored. The ChoiceKey source files and instructions may be downloaded as supplemental materials for this article from brm.psychonomic-journals.org/content/supplemental.
Horvath, Keith J; Alemu, Dawit; Danh, Thu; Baker, Jason V; Carrico, Adam W
2016-04-15
The use of stimulant drugs among men who have sex with men (MSM) with human immunodeficiency virus (HIV) is associated with decreased odds of antiretroviral therapy (ART) adherence and elevated risk of forward HIV transmission. Advancing tailored and innovative mobile phone-based ART adherence app interventions for stimulant-using HIV-positive MSM requires greater understanding of their needs and preferences in this emerging area. The purpose of this study is to (1) assess reasons that stimulant-using HIV-positive MSM download and sustain their use of mobile phone apps in general, and (2) obtain feedback on features and functions that these men prefer in a mobile phone app to optimize their ART adherence. Focus groups were conducted with stimulant-using HIV-positive MSM (24-57 years of age; mostly non-Hispanic white; 42% once a week or more frequent stimulant drug use) in San Francisco and Minneapolis. Our aim was to explore the mobile phone app features and functions that they considered when deciding to download and sustain their use of general apps over time, as well as specific features and functions that they would like to see incorporated into an ART adherence mobile app. Focus groups were audiorecorded and transcribed verbatim. Thematic analysis was applied to transcripts using line-by-line open coding and organizing codes into meaningful themes. Men reported that they currently had a variety of health and wellness, social media and networking, gaming and entertainment, and utility apps on their mobile phones. Downloading apps to their mobile phones was influenced by the cost of the app, recommendations by a trusted source, and the time it takes to download. In addition, downloading and sustained use of apps was more likely to occur when men had control over most features of the app and apps were perceived to be useful, engaging, secure, and credible. Participants suggested that ART adherence mobile phone apps include social networking features, connections to local resources and their medical chart, and breaking HIV news and updates. Although some men expressed concerns about daily self-monitoring of HIV medication doses, many appreciated receiving a summary of their medication adherence over time and suggested that feedback about missed doses be delivered in an encouraging and humorous manner. In this study, we were able to recruit a relatively high proportion (42%) of HIV-positive MSM reporting weekly or more stimulant use. These results suggest critical design elements that may need to be considered during development of ART adherence-related mobile phone apps for this, and possibly other, high-risk groups. In particular, finding the optimal balance of security, engagement, usefulness, control capabilities, and credibility will be critical to sustained used of HIV treatment apps.
MOCAT: a metagenomics assembly and gene prediction toolkit.
Kultima, Jens Roat; Sunagawa, Shinichi; Li, Junhua; Chen, Weineng; Chen, Hua; Mende, Daniel R; Arumugam, Manimozhiyan; Pan, Qi; Liu, Binghang; Qin, Junjie; Wang, Jun; Bork, Peer
2012-01-01
MOCAT is a highly configurable, modular pipeline for fast, standardized processing of single or paired-end sequencing data generated by the Illumina platform. The pipeline uses state-of-the-art programs to quality control, map, and assemble reads from metagenomic samples sequenced at a depth of several billion base pairs, and predict protein-coding genes on assembled metagenomes. Mapping against reference databases allows for read extraction or removal, as well as abundance calculations. Relevant statistics for each processing step can be summarized into multi-sheet Excel documents and queryable SQL databases. MOCAT runs on UNIX machines and integrates seamlessly with the SGE and PBS queuing systems, commonly used to process large datasets. The open source code and modular architecture allow users to modify or exchange the programs that are utilized in the various processing steps. Individual processing steps and parameters were benchmarked and tested on artificial, real, and simulated metagenomes resulting in an improvement of selected quality metrics. MOCAT can be freely downloaded at http://www.bork.embl.de/mocat/.
NHDS: The New Hampshire Dispersion Relation Solver
NASA Astrophysics Data System (ADS)
Verscharen, Daniel; Chandran, Benjamin D. G.
2018-04-01
NHDS is the New Hampshire Dispersion Relation Solver. This article describes the numerics of the solver and its capabilities. The code is available for download on https://github.com/danielver02/NHDS.
Colorectal Cancer Risk Assessment Tool
The tool's website (last updated 11/12/2014) provides the risk calculator, background on the tool and colorectal cancer risk factors, and SAS and Gauss code for download, alongside resources on prevention, genetics, screening tests, and cancer risk prediction.
The HYPE Open Source Community
NASA Astrophysics Data System (ADS)
Strömbäck, L.; Pers, C.; Isberg, K.; Nyström, K.; Arheimer, B.
2013-12-01
The Hydrological Predictions for the Environment (HYPE) model is a dynamic, semi-distributed, process-based, integrated catchment model. It uses well-known hydrological and nutrient transport concepts and can be applied for both small- and large-scale assessments of water resources and status. In the model, the landscape is divided into classes according to soil type, vegetation and altitude. The soil representation is stratified and can be divided into up to three layers. Water and substances are routed through the same flow paths and storages (snow, soil, groundwater, streams, rivers, lakes), considering turn-over and transformation on the way towards the sea. HYPE has been successfully used in many hydrological applications at SMHI. For Europe, we currently have three different models: the S-HYPE model for Sweden, the BALT-HYPE model for the Baltic Sea, and the E-HYPE model for the whole of Europe. These models simulate hydrological conditions and nutrients for their respective areas and are used for characterization, forecasts, and scenario analyses. Model data can be downloaded from hypeweb.smhi.se. In addition, we provide models for the Arctic region, the Arab (Middle East and Northern Africa) region, India, the Niger River basin, and the La Plata Basin. This demonstrates the applicability of the HYPE model for large-scale modeling in different regions of the world. An important goal of our work is to make our data and tools available as open data and services. To this end, we created the HYPE Open Source Community (OSC), which makes the source code of HYPE available to anyone interested in its further development. The HYPE OSC (hype.sourceforge.net) is an open source initiative by SMHI under the GNU Lesser General Public License, intended to strengthen international collaboration in hydrological modeling and hydrological data production. The hypothesis is that more brains and more testing will result in better models and better code. The code is transparent and can be changed and learnt from, and new versions of the main code are delivered frequently. HYPE OSC is open to everyone interested in hydrology, hydrological modeling and code development, e.g. scientists, authorities, and consultancies. By joining the HYPE OSC you get access to a state-of-the-art operational hydrological model. The HYPE source code is designed to efficiently handle large-scale modeling for forecast, hindcast and climate applications, and it is under constant development to improve its hydrological processes, efficiency and readability. At the beginning of 2013 we released a version with new and better modularization based on hydrological processes, which makes the code easier for a new user to understand and develop further. An important challenge in this process is to produce code that is easy for anyone to understand and work with, while maintaining the properties that make the code efficient enough for large-scale applications. Input from the HYPE Open Source Community is an important source of future improvements to the HYPE model. Therefore, by joining the community you become an active part of the development, get access to the latest features and can influence future versions of the model.
NASA Astrophysics Data System (ADS)
Boettcher, M. A.; Butt, B. M.; Klinkner, S.
2016-10-01
A major concern of a university satellite mission is downloading the payload and telemetry data from the satellite. While ground station antennas are in general easy to procure with limited effort, the receiving unit is most certainly not. The flexible and low-cost software-defined radio (SDR) transceiver "BladeRF" is used to receive the QPSK-modulated and CCSDS-compliant coded data of a satellite in the HAM radio S-band. The control software is based on the open source program GNU Radio, which is also used to perform CCSDS post-processing of the binary bit stream. The test results show a good performance of the receiving system.
Music 4C, a multi-voiced synthesis program with instruments defined in C
NASA Astrophysics Data System (ADS)
Beauchamp, James W.
2003-04-01
Music 4C is a program which runs under Unix (including Linux) and provides a means for the synthesis of arbitrary signals as defined by C code. The program is a loose translation of an earlier program, Music 4BF [H. S. Howe, Jr., Electronic Music Synthesis (Norton, 1975)]. A set of instrument definitions is driven by a numerical score which consists of a series of ``events.'' Each event gives an instrument name, start time and duration, and a number of parameters (e.g., pitch) which describe the event. Each instrument definition consists of event parameters, performance variables, initializations, and synthesis algorithm code. Thus, the synthetic signal, no matter how complex, is precisely defined. Moreover, the resulting sounds can be overlaid in any arbitrary pattern. The program serves as a mixer of algorithmically produced sounds or recorded sounds taken from sample files or synthesized from spectrum files. A score file can be entered by hand, generated from a program, translated from a MIDI file, or generated from an alphanumeric score using an auxiliary program, Notepro. Output sample files are in wav, snd, or aiff format. The program is provided as C source code for download.
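The score-driven model described above is easy to caricature: each event names a start time, duration and parameters, and the rendered notes are overlaid into one signal. Below is a small Python sketch with a hypothetical sine "instrument"; Music 4C's actual score and instrument formats differ.

```python
# Minimal sketch of score-driven synthesis (not Music 4C's formats):
# each event = (start, dur, freq, amp); notes are summed into one buffer.
import math, wave, struct

RATE = 44100
score = [(0.0, 0.5, 440.0, 0.4), (0.25, 0.5, 660.0, 0.3)]  # hypothetical score

end = max(s + d for s, d, _, _ in score)
buf = [0.0] * int(end * RATE)
for start, dur, freq, amp in score:
    for n in range(int(dur * RATE)):          # overlay each note into the mix
        t = n / RATE
        buf[int(start * RATE) + n] += amp * math.sin(2 * math.pi * freq * t)

with wave.open("out.wav", "wb") as w:         # write a 16-bit mono wav file
    w.setnchannels(1); w.setsampwidth(2); w.setframerate(RATE)
    w.writeframes(b"".join(struct.pack("<h", int(max(-1.0, min(1.0, x)) * 32767))
                           for x in buf))
```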
Software Security Knowledge: CWE. Knowing What Could Make Software Vulnerable to Attack
2011-05-01
Among the weakness entries discussed are CWE-642 (External Control of Critical State Data), CWE-73 (External Control of File Name or Path), CWE-426 (Untrusted Search Path), CWE-94 (Failure to Control Generation of Code, a.k.a. 'Code Injection'), CWE-494 (Download of Code Without Integrity Check), and CWE-404 (Improper Resource Shutdown or Release).
Data and Tools | Concentrating Solar Power | NREL
Data and tools are available for download, including the Solar Power tower Integrated Layout and Optimization Tool (SolarPILOT™), which combines the rapid layout and optimization capability of the analytical DELSOL3 program with the accuracy and flexibility of ray-tracing simulation.
PD5: a general purpose library for primer design software.
Riley, Michael C; Aubrey, Wayne; Young, Michael; Clare, Amanda
2013-01-01
Complex PCR applications for large genome-scale projects require fast, reliable and often highly sophisticated primer design software applications. Presently, such applications use pipelining methods to utilise many third-party applications, which involves file parsing, interfacing and data conversion that is slow and prone to error. A fully integrated suite of software tools for primer design would considerably improve the development time, the processing speed, and the reliability of bespoke primer design software applications. The PD5 software library is an open-source collection of classes and utilities, providing a complete collection of software building blocks for primer design and analysis. It is written in object-oriented C++ with an emphasis on classes suitable for efficient and rapid development of bespoke primer design programs. The modular design of the software library simplifies the development of specific applications and also integration with existing third-party software where necessary. We demonstrate several applications created using this software library that have already proved to be effective, but we view the project as a dynamic environment for building primer design software, and it is open for future development by the bioinformatics community. Therefore, the PD5 software library is published under the terms of the GNU General Public License, which guarantees access to source code and allows redistribution and modification. The PD5 software library is downloadable from Google Code, and the accompanying Wiki includes instructions and examples: http://code.google.com/p/primer-design.
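Two of the elementary building blocks such a library provides are GC content and a quick melting-temperature estimate, sketched below using the Wallace rule; PD5 itself is C++, and none of its class names are used here.

```python
# Minimal sketch of two primer-design building blocks (illustrative Python
# analogue; PD5 is a C++ library with its own, richer implementations).
def gc_content(primer: str) -> float:
    p = primer.upper()
    return 100.0 * sum(p.count(b) for b in "GC") / len(p)

def wallace_tm(primer: str) -> float:
    # Wallace rule: Tm = 2(A+T) + 4(G+C), a rough estimate for short primers.
    p = primer.upper()
    at = sum(p.count(b) for b in "AT")
    gc = sum(p.count(b) for b in "GC")
    return 2.0 * at + 4.0 * gc

primer = "ATGCGTACGTTAGC"   # hypothetical primer
print(gc_content(primer), wallace_tm(primer))
```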
ProteoWizard: open source software for rapid proteomics tools development.
Kessner, Darren; Chambers, Matt; Burke, Robert; Agus, David; Mallick, Parag
2008-11-01
The ProteoWizard software project provides a modular and extensible set of open-source, cross-platform tools and libraries. The tools perform proteomics data analyses; the libraries enable rapid tool creation by providing a robust, pluggable development framework that simplifies and unifies data file access, and performs standard proteomics and LCMS dataset computations. The library contains readers and writers of the mzML data format, which has been written using modern C++ techniques and design principles and supports a variety of platforms with native compilers. The software has been specifically released under the Apache v2 license to ensure it can be used in both academic and commercial projects. In addition to the library, we also introduce a rapidly growing set of companion tools whose implementation helps to illustrate the simplicity of developing applications on top of the ProteoWizard library. Cross-platform software that compiles using native compilers (i.e. GCC on Linux, MSVC on Windows and Xcode on OSX) is available for download free of charge at http://proteowizard.sourceforge.net. This website also provides code examples and documentation. It is our hope that the ProteoWizard project will become a standard platform for proteomics development; consequently, code use, contribution and further development are strongly encouraged.
American Academy of Pediatric Dentistry
... 500 Welcome Bonus for AAPD Members! Pediatric Dentist Toolkit Now Available! AAPD Coding and Insurance Manual 2018 Updates Looking to Find a Job? Looking to Fill a Position? Try the New AAPD Career Center Download the AAPD Reference Manual App Today! ...
NASA Astrophysics Data System (ADS)
Roach, Colin; Carlsson, Johan; Cary, John R.; Alexander, David A.
2002-11-01
The National Transport Code Collaboration (NTCC) has developed an array of software, including a data client/server. The data server, which is written in C++, serves local data (in the ITER Profile Database format) as well as remote data (by accessing one or several MDS+ servers). The client, a web-invocable Java applet, provides a uniform, intuitive, user-friendly, graphical interface to the data server. The uniformity of the interface relieves the user from the trouble of mastering the differences between different data formats and lets him/her focus on the essentials: plotting and viewing the data. The user runs the client by visiting a web page using any Java-capable Web browser. The client is automatically downloaded and run by the browser. A reference to the data server is then retrieved via the standard Web protocol (HTTP). The communication between the client and the server is then handled by the mature, industry-standard CORBA middleware. CORBA has bindings for all common languages and many high-quality implementations are available (both Open Source and commercial). The NTCC data server has been installed at the ITPA International Multi-tokamak Confinement Profile Database, which is hosted by the UKAEA at Culham Science Centre. The installation of the data server is protected by an Internet firewall. To make it accessible to clients outside the firewall, some modifications of the server were required. The working version of the ITPA confinement profile database is not open to the public. Authentication of legitimate users is done utilizing built-in Java security features to demand a password to download the client. We present an overview of the NTCC data client/server and some details of how the CORBA firewall-traversal issues were resolved and how the user authentication is implemented.
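To make the client/server pattern concrete, here is a toy analogue using Python's standard xmlrpc modules; the NTCC system itself uses a Java applet client and a C++ CORBA server, so everything below (names, port, payload) is an invented stand-in.

```python
# Toy stand-in for the pattern above: a server exposing a data-access
# call, and a client that fetches a profile through a uniform interface.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def serve_profile(name):
    # Pretend database lookup: return a (time, value) series for "name".
    return [(t, 0.1 * t) for t in range(5)]

server = SimpleXMLRPCServer(("localhost", 8642), logRequests=False)
server.register_function(serve_profile)
threading.Thread(target=server.serve_forever, daemon=True).start()

client = ServerProxy("http://localhost:8642")
print(client.serve_profile("ne_profile"))  # uniform call, format-agnostic
```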
DOE Office of Scientific and Technical Information (OSTI.GOV)
Iversen, C.M.; Powell, A.S.; McCormack, M.L.
The second version of the Fine-Root Ecology Database is available for download! Download the full FRED 2.0 data set, user guidance document, map, and list of data sources here. Prior to downloading the data, please read and follow the Data Use Guidelines, and it's worth checking out some tips for using FRED before you begin your analyses. Also, see here for a continually updated list of corrections to FRED 2.0.
Reactome Pengine: A web-logic API to the homo sapiens reactome.
Neaves, Samuel R; Tsoka, Sophia; Millard, Louise A C
2018-03-30
Existing ways of accessing data from the Reactome database are limited. Either a researcher is restricted to particular queries defined by a web application programming interface (API), or they have to download the whole database. Reactome Pengine is a web service providing a logic-programming-based API to the human reactome. This gives researchers greater flexibility in data access than existing APIs, as users can send their own small programs (alongside queries) to Reactome Pengine. The server and an example notebook can be found at https://apps.nms.kcl.ac.uk/reactome-pengine. Source code is available at https://github.com/samwalrus/reactome-pengine and a Docker image is available at https://hub.docker.com/r/samneaves/rp4/. samuel.neaves@kcl.ac.uk. Supplementary data are available at Bioinformatics online.
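A hedged sketch of sending a logic query over HTTP: Reactome Pengine is built on SWI-Prolog pengines, and the endpoint path, JSON payload, and predicate name below follow that convention only as an assumption, not as a documented API.

```python
# Assumed pengine-style HTTP query; the URL path and the predicate are
# illustrative guesses, not confirmed parts of the service.
import json
import urllib.request

URL = "https://apps.nms.kcl.ac.uk/reactome-pengine/pengine/create"  # assumed path
payload = {"ask": "ridReaction_ridPathway(R, P)",  # hypothetical predicate
           "format": "json"}

req = urllib.request.Request(
    URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))  # first answer(s) to the logic query
```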
The GeoClaw software for depth-averaged flows with adaptive refinement
Berger, M.J.; George, D.L.; LeVeque, R.J.; Mandli, Kyle T.
2011-01-01
Many geophysical flow or wave propagation problems can be modeled with two-dimensional depth-averaged equations, of which the shallow water equations are the simplest example. We describe the GeoClaw software that has been designed to solve problems of this nature, consisting of open source Fortran programs together with Python tools for the user interface and flow visualization. This software uses high-resolution shock-capturing finite volume methods on logically rectangular grids, including latitude-longitude grids on the sphere. Dry states are handled automatically to model inundation. The code incorporates adaptive mesh refinement to allow the efficient solution of large-scale geophysical problems. Examples are given illustrating its use for modeling tsunamis and dam-break flooding problems. Documentation and download information are available at www.clawpack.org/geoclaw. © 2011.
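For reference, the one-dimensional form of the shallow water equations mentioned above, with depth h, velocity u, gravity g, and bottom topography b:

```latex
\begin{align}
  h_t + (hu)_x &= 0, \\
  (hu)_t + \left(hu^2 + \tfrac{1}{2} g h^2\right)_x &= -\, g h\, b_x .
\end{align}
```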
GAMBIT: the global and modular beyond-the-standard-model inference tool
NASA Astrophysics Data System (ADS)
Athron, Peter; Balazs, Csaba; Bringmann, Torsten; Buckley, Andy; Chrząszcz, Marcin; Conrad, Jan; Cornell, Jonathan M.; Dal, Lars A.; Dickinson, Hugh; Edsjö, Joakim; Farmer, Ben; Gonzalo, Tomás E.; Jackson, Paul; Krislock, Abram; Kvellestad, Anders; Lundberg, Johan; McKay, James; Mahmoudi, Farvah; Martinez, Gregory D.; Putze, Antje; Raklev, Are; Ripken, Joachim; Rogan, Christopher; Saavedra, Aldo; Savage, Christopher; Scott, Pat; Seo, Seon-Hee; Serra, Nicola; Weniger, Christoph; White, Martin; Wild, Sebastian
2017-11-01
We describe the open-source global fitting package GAMBIT: the Global And Modular Beyond-the-Standard-Model Inference Tool. GAMBIT combines extensive calculations of observables and likelihoods in particle and astroparticle physics with a hierarchical model database, advanced tools for automatically building analyses of essentially any model, a flexible and powerful system for interfacing to external codes, a suite of different statistical methods and parameter scanning algorithms, and a host of other utilities designed to make scans faster, safer and more easily extendible than in the past. Here we give a detailed description of the framework, its design and motivation, and the current models and other specific components presently implemented in GAMBIT. Accompanying papers deal with individual modules and present first GAMBIT results. GAMBIT can be downloaded from gambit.hepforge.org.
Alemu, Dawit; Danh, Thu; Baker, Jason V; Carrico, Adam W
2016-01-01
Background The use of stimulant drugs among men who have sex with men (MSM) with human immunodeficiency virus (HIV) is associated with decreased odds of antiretroviral therapy (ART) adherence and elevated risk of forward HIV transmission. Advancing tailored and innovative mobile phone–based ART adherence app interventions for stimulant-using HIV-positive MSM requires greater understanding of their needs and preferences in this emerging area. Objective The purpose of this study is to (1) assess reasons that stimulant-using HIV-positive MSM download and sustain their use of mobile phone apps in general, and (2) obtain feedback on features and functions that these men prefer in a mobile phone app to optimize their ART adherence. Methods Focus groups were conducted with stimulant-using HIV-positive MSM (24-57 years of age; mostly non-Hispanic white; 42% once a week or more frequent stimulant drug use) in San Francisco and Minneapolis. Our aim was to explore the mobile phone app features and functions that they considered when deciding to download and sustain their use of general apps over time, as well as specific features and functions that they would like to see incorporated into an ART adherence mobile app. Focus groups were audio-recorded and transcribed verbatim. Thematic analysis was applied to transcripts using line-by-line open coding and organizing codes into meaningful themes. Results Men reported that they currently had a variety of health and wellness, social media and networking, gaming and entertainment, and utility apps on their mobile phones. Downloading apps to their mobile phones was influenced by the cost of the app, recommendations by a trusted source, and the time it takes to download. In addition, downloading and sustained use of apps was more likely to occur when men had control over most features of the app and apps were perceived to be useful, engaging, secure, and credible. Participants suggested that ART adherence mobile phone apps include social networking features, connections to local resources and their medical chart, and breaking HIV news and updates. Although some men expressed concerns about daily self-monitoring of HIV medication doses, many appreciated receiving a summary of their medication adherence over time and suggested that feedback about missed doses be delivered in an encouraging and humorous manner. Conclusions In this study, we were able to recruit a relatively high proportion (42%) of HIV-positive MSM reporting weekly or more stimulant use. These results suggest critical design elements that may need to be considered during development of ART adherence-related mobile phone apps for this, and possibly other, high-risk groups. In particular, finding the optimal balance of security, engagement, usefulness, control capabilities, and credibility will be critical to sustained use of HIV treatment apps. PMID:27084049
Niklasson, Markus; Ahlner, Alexandra; Andresen, Cecilia; Marsh, Joseph A; Lundström, Patrik
2015-01-01
The process of resonance assignment is fundamental to most NMR studies of protein structure and dynamics. Unfortunately, the manual assignment of residues is tedious and time-consuming, and can represent a significant bottleneck for further characterization. Furthermore, while automated approaches have been developed, they are often limited in their accuracy, particularly for larger proteins. Here, we address this by introducing the software COMPASS, which, by combining automated resonance assignment with manual intervention, is able to achieve accuracy approaching that from manual assignments at greatly accelerated speeds. Moreover, by including the option to compensate for isotope shift effects in deuterated proteins, COMPASS is far more accurate for larger proteins than existing automated methods. COMPASS is an open-source project licensed under GNU General Public License and is available for download from http://www.liu.se/forskning/foass/tidigare-foass/patrik-lundstrom/software?l=en. Source code and binaries for Linux, Mac OS X and Microsoft Windows are available.
An open source Java web application to build self-contained Web GIS sites
NASA Astrophysics Data System (ADS)
Zavala Romero, O.; Ahmed, A.; Chassignet, E.; Zavala-Hidalgo, J.
2014-12-01
This work describes OWGIS, an open source Java web application that creates Web GIS sites by automatically writing HTML and JavaScript code. OWGIS is configured by XML files that define which layers (geographic datasets) will be displayed on the websites. This project uses several Open Geospatial Consortium standards to request data from typical map servers, such as GeoServer, and is also able to request data from ncWMS servers. The latter allows for the displaying of 4D data stored using the NetCDF file format (widely used for storing environmental model datasets). Some of the features available on the sites built with OWGIS are: multiple languages, animations, vertical profiles and vertical transects, color palettes, color ranges, and the ability to download data. OWGIS's main users are scientists, such as oceanographers or climate scientists, who store their data in NetCDF files and want to analyze, visualize, share, or compare their data using a website.
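A sketch of the kind of standard OGC WMS GetMap request such a site issues to a map server like GeoServer; the host and layer name are hypothetical, while the query parameters are standard WMS 1.3.0.

```python
# Build a WMS 1.3.0 GetMap URL; note that for EPSG:4326 in WMS 1.3.0 the
# BBOX axis order is latitude first (miny,minx,maxy,maxx).
from urllib.parse import urlencode

def getmap_url(base, layer, bbox, size=(800, 600)):
    params = {
        "SERVICE": "WMS", "VERSION": "1.3.0", "REQUEST": "GetMap",
        "LAYERS": layer, "CRS": "EPSG:4326",
        "BBOX": ",".join(map(str, bbox)),
        "WIDTH": size[0], "HEIGHT": size[1],
        "FORMAT": "image/png", "TRANSPARENT": "TRUE",
    }
    return f"{base}?{urlencode(params)}"

print(getmap_url("https://example.org/geoserver/wms",  # hypothetical server
                 "ocean:sst", (-90, -180, 90, 180)))
```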
Utah FORGE Gravity Data Shapefile
Joe Moore
2016-03-13
This is a zipped GIS-compatible shapefile of gravity data points used in the Milford, Utah FORGE project as of March 21st, 2016. The shapefile is native to ArcGIS, but can be used with many GIS software packages. Additionally, there is a .dbf (dBase) file that contains the dataset, which can be read with Microsoft Excel. The data were downloaded from PACES (Pan American Center for Earth and Environmental Studies), hosted by the University of Texas at El Paso (http://research.utep.edu/Default.aspx?alias=research.utep.edu/paces). Explanation of fields:
Source: data source code, if available
LatNAD83: latitude in NAD83 [decimal degrees]
LonNAD83: longitude in NAD83 [decimal degrees]
zWGS84: elevation in WGS84 (ellipsoidal) [m]
OBSless976: observed gravity minus 976000 mGal
IZTC: inner zone terrain correction [mGal]
OZTC: outer zone terrain correction [mGal]
FA: Free Air anomaly value [mGal]
CBGA: Complete Bouguer gravity anomaly value [mGal]
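For readers of the FA and CBGA columns, the standard textbook relations are as follows (hedged to the usual constants; h in metres, ρ in g/cm³, TC the terrain correction, so here TC = IZTC + OZTC):

```latex
\begin{align}
  \mathrm{FA}   &= g_{\mathrm{obs}} - \gamma_0 + 0.3086\,h
                  && \text{[mGal] (free-air anomaly)} \\
  \mathrm{CBGA} &= \mathrm{FA} - 0.04193\,\rho\,h + \mathrm{TC}
                  && \text{[mGal] (complete Bouguer anomaly)}
\end{align}
```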
A CellML simulation compiler and code generator using ODE solving schemes
2012-01-01
Models written in description languages such as CellML are becoming a popular solution to the handling of complex cellular physiological models in biological function simulations. However, in order to fully simulate a model, boundary conditions and ordinary differential equation (ODE) solving schemes have to be combined with it. Though boundary conditions can be described in CellML, it is difficult to explicitly specify ODE solving schemes using existing tools. In this study, we define an ODE solving scheme description language based on XML and propose a code generation system for biological function simulations. In the proposed system, biological simulation programs using various ODE solving schemes can be easily generated. We designed a two-stage approach where the system generates the equation set associating the physiological model variable values at a certain time t with values at t + Δt in the first stage. The second stage generates the simulation code for the model. This approach enables the flexible construction of code generation modules that can support complex sets of formulas. We evaluate the relationship between models and their calculation accuracies by simulating complex biological models using various ODE solving schemes. Using the FHN model simulation, results showed good qualitative and quantitative correspondence with the theoretical predictions. Results for the Luo-Rudy 1991 model showed that only first order precision was achieved. In addition, running the generated code in parallel on a GPU made it possible to speed up the calculation time by a factor of 50. The CellML Compiler source code is available for download at http://sourceforge.net/projects/cellmlcompiler. PMID:23083065
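A minimal illustration of what a generated simulation ultimately executes: the FitzHugh-Nagumo (FHN) model stepped with an explicit forward Euler scheme; parameter values are conventional demonstration values, not taken from this paper.

```python
# FitzHugh-Nagumo stepped with forward Euler, a first-order ODE scheme.
def fhn_euler(T=200.0, dt=0.01, a=0.7, b=0.8, eps=0.08, I=0.5):
    v, w = -1.0, 1.0
    trace, t = [], 0.0
    while t < T:
        dv = v - v**3 / 3.0 - w + I        # fast (voltage-like) variable
        dw = eps * (v + a - b * w)         # slow (recovery) variable
        v, w = v + dt * dv, w + dt * dw    # first-order accurate in dt
        trace.append((t, v))
        t += dt
    return trace

print(fhn_euler()[-1])  # final (t, v) sample
```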
SPIDERMAN: an open-source code to model phase curves and secondary eclipses
NASA Astrophysics Data System (ADS)
Louden, Tom; Kreidberg, Laura
2018-06-01
We present SPIDERMAN (Secondary eclipse and Phase curve Integrator for 2D tempERature MAppiNg), a fast code for calculating exoplanet phase curves and secondary eclipses with arbitrary surface brightness distributions in two dimensions. Using a geometrical algorithm, the code computes exactly the area of the sections of the planet's disc that are occulted by the star. The code is written in C with a user-friendly Python interface, and is optimized to run quickly, with no loss in numerical precision. Approximately 1000 models can be generated per second in typical use, making Markov Chain Monte Carlo analyses practicable. The modular nature of the code allows easy comparison of the effect of multiple different brightness distributions for the data set. As a test case, we apply the code to archival data on the phase curve of WASP-43b using a physically motivated analytical model for the two-dimensional brightness map. The model provides a good fit to the data; however, it overpredicts the temperature of the nightside. We speculate that this could be due to the presence of clouds on the nightside of the planet, or additional reflected light from the dayside. When testing a simple cloud model, we find that the best-fitting model has a geometric albedo of 0.32 ± 0.02 and does not require a hot nightside. We also test for variation of the map parameters as a function of wavelength and find no statistically significant correlations. SPIDERMAN is available for download at https://github.com/tomlouden/spiderman.
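The geometric kernel of such a calculation is the exact overlap area of two discs; below is the standard lens formula in Python, shown as an illustration rather than SPIDERMAN's actual source.

```python
# Exact area of overlap of two discs of radii r1 and r2 whose centres
# are separated by d (standard circle-circle "lens" formula).
import math

def disc_overlap_area(r1, r2, d):
    if d >= r1 + r2:                 # no overlap
        return 0.0
    if d <= abs(r1 - r2):            # smaller disc fully inside
        return math.pi * min(r1, r2) ** 2
    a1 = r1**2 * math.acos((d**2 + r1**2 - r2**2) / (2 * d * r1))
    a2 = r2**2 * math.acos((d**2 + r2**2 - r1**2) / (2 * d * r2))
    k = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                        * (d - r1 + r2) * (d + r1 + r2))
    return a1 + a2 - k

print(disc_overlap_area(1.0, 0.1, 1.05))  # partial eclipse ingress
```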
Pathogen metadata platform: software for accessing and analyzing pathogen strain information.
Chang, Wenling E; Peterson, Matthew W; Garay, Christopher D; Korves, Tonia
2016-09-15
Pathogen metadata includes information about where and when a pathogen was collected and the type of environment it came from. Along with genomic nucleotide sequence data, this metadata is growing rapidly and becoming a valuable resource not only for research but for biosurveillance and public health. However, current freely available tools for analyzing this data are geared towards bioinformaticians and/or do not provide summaries and visualizations needed to readily interpret results. We designed a platform to easily access and summarize data about pathogen samples. The software includes a PostgreSQL database that captures metadata useful for disease outbreak investigations, and scripts for downloading and parsing data from NCBI BioSample and BioProject into the database. The software provides a user interface to query metadata and obtain standardized results in an exportable, tab-delimited format. To visually summarize results, the user interface provides a 2D histogram for user-selected metadata types and mapping of geolocated entries. The software is built on the LabKey data platform, an open-source data management platform, which enables developers to add functionalities. We demonstrate the use of the software in querying for a pathogen serovar and for genome sequence identifiers. This software enables users to create a local database for pathogen metadata, populate it with data from NCBI, easily query the data, and obtain visual summaries. Some of the components, such as the database, are modular and can be incorporated into other data platforms. The source code is freely available for download at https://github.com/wchangmitre/bioattribution .
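A toy version of the kind of metadata query the platform supports, written against sqlite3 so the sketch is self-contained; the real system uses PostgreSQL on LabKey, and the table and column names here are invented.

```python
# Hypothetical schema: query pathogen samples by serovar and export
# tab-delimited-style rows, mirroring the workflow described above.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE biosample
               (accession TEXT, organism TEXT, serovar TEXT,
                collected TEXT, location TEXT)""")
con.executemany("INSERT INTO biosample VALUES (?,?,?,?,?)", [
    ("SAMN001", "Salmonella enterica", "Typhimurium", "2014-03-01", "USA"),
    ("SAMN002", "Salmonella enterica", "Enteritidis", "2015-07-12", "UK"),
])
rows = con.execute(
    "SELECT accession, collected, location FROM biosample WHERE serovar = ?",
    ("Typhimurium",),
).fetchall()
print(rows)  # a tab-delimited export would simply join these fields
```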
Evaluating bacterial gene-finding HMM structures as probabilistic logic programs.
Mørk, Søren; Holmes, Ian
2012-03-01
Probabilistic logic programming offers a powerful way to describe and evaluate structured statistical models. To investigate the practicality of probabilistic logic programming for structure learning in bioinformatics, we undertook a simplified bacterial gene-finding benchmark in PRISM, a probabilistic dialect of Prolog. We evaluate Hidden Markov Model structures for bacterial protein-coding gene potential, including a simple null model structure, three structures based on existing bacterial gene finders and two novel model structures. We test standard versions as well as ADPH length modeling and three-state versions of the five model structures. The models are all represented as probabilistic logic programs and evaluated using the PRISM machine learning system in terms of statistical information criteria and gene-finding prediction accuracy, in two bacterial genomes. Neither of our implementations of the two currently most used model structures is best performing in terms of statistical information criteria or prediction performance, suggesting that better-fitting models might be achievable. The source code of all PRISM models, data and additional scripts are freely available for download at: http://github.com/somork/codonhmm. Supplementary data are available at Bioinformatics online.
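The evaluation step underlying any such structure comparison is scoring a sequence under an HMM; here is the forward algorithm in log space for a two-state toy model with invented parameters (the actual PRISM programs are Prolog, not Python).

```python
# Forward algorithm for a toy coding/noncoding HMM; all probabilities
# are invented for illustration.
import math

states = ("coding", "noncoding")
start = {"coding": 0.5, "noncoding": 0.5}
trans = {"coding": {"coding": 0.9, "noncoding": 0.1},
         "noncoding": {"coding": 0.1, "noncoding": 0.9}}
emit = {"coding": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
        "noncoding": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}

def _logsum(xs):
    xs = list(xs)
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_likelihood(seq):
    f = {s: math.log(start[s] * emit[s][seq[0]]) for s in states}
    for x in seq[1:]:
        f = {s: math.log(emit[s][x]) + _logsum(
                 f[r] + math.log(trans[r][s]) for r in states)
             for s in states}
    return _logsum(f.values())

print(log_likelihood("ATGGCGTGCA"))
```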
Jannovar: a java library for exome annotation.
Jäger, Marten; Wang, Kai; Bauer, Sebastian; Smedley, Damian; Krawitz, Peter; Robinson, Peter N
2014-05-01
Transcript-based annotation and pedigree analysis are two basic steps in the computational analysis of whole-exome sequencing experiments in genetic diagnostics and disease-gene discovery projects. Here, we present Jannovar, a stand-alone Java application as well as a Java library designed to be used in larger software frameworks for exome and genome analysis. Jannovar uses an interval tree to identify all transcripts affected by a given variant, and provides Human Genome Variation Society-compliant annotations both for variants affecting coding sequences and splice junctions as well as untranslated regions and noncoding RNA transcripts. Jannovar can also perform family-based pedigree analysis with Variant Call Format (VCF) files with data from members of a family segregating a Mendelian disorder. Using a desktop computer, Jannovar requires a few seconds to annotate a typical VCF file with exome data. Jannovar is freely available under the BSD2 license. Source code as well as the Java application and library file can be downloaded from http://compbio.charite.de (with tutorial) and https://github.com/charite/jannovar. © 2014 WILEY PERIODICALS, INC.
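The core lookup, all transcripts overlapping a variant, shown with a deliberately naive linear scan; Jannovar itself uses an interval tree for fast queries, and the coordinates below are invented.

```python
# Naive stand-in for an interval-tree query: which transcripts overlap a
# variant? Half-open intervals [start, end).
transcripts = [
    ("TX1", 1000, 5000),
    ("TX2", 4200, 9800),
    ("TX3", 12000, 15000),
]

def overlapping(pos_start, pos_end, txs):
    """A variant overlaps a transcript iff neither lies wholly before
    the other."""
    return [name for name, s, e in txs if s < pos_end and pos_start < e]

print(overlapping(4500, 4501, transcripts))  # ['TX1', 'TX2']
```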
PuffinPlot: A versatile, user-friendly program for paleomagnetic analysis
NASA Astrophysics Data System (ADS)
Lurcock, P. C.; Wilson, G. S.
2012-06-01
PuffinPlot is a user-friendly desktop application for analysis of paleomagnetic data, offering a unique combination of features. It runs on several operating systems, including Windows, Mac OS X, and Linux; supports both discrete and long core data; and facilitates analysis of very weakly magnetic samples. As well as interactive graphical operation, PuffinPlot offers batch analysis for large volumes of data, and a Python scripting interface for programmatic control of its features. Available data displays include demagnetization/intensity, Zijderveld, equal-area (for sample, site, and suite level demagnetization data, and for magnetic susceptibility anisotropy data), a demagnetization data table, and a natural remanent magnetization intensity histogram. Analysis types include principal component analysis, Fisherian statistics, and great-circle path intersections. The results of calculations can be exported as CSV (comma-separated value) files; graphs can be printed, and can also be saved as publication-quality vector files in SVG or PDF format. PuffinPlot is free, and the program, user manual, and fully documented source code may be downloaded from http://code.google.com/p/puffinplot/.
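Principal component analysis of a stepwise demagnetization path is the standard way to extract a characteristic direction; below is a generic numpy sketch (including a Kirschvink-style maximum angular deviation), not PuffinPlot's code.

```python
# Best-fit line through a demagnetization path via SVD; toy data.
import numpy as np

steps = np.array([  # (x, y, z) remanence at successive demag steps
    [1.00, 0.50, 0.80],
    [0.80, 0.41, 0.63],
    [0.61, 0.29, 0.49],
    [0.40, 0.20, 0.32],
    [0.21, 0.10, 0.17],
])
centered = steps - steps.mean(axis=0)
_, s, vt = np.linalg.svd(centered, full_matrices=False)
direction = vt[0]                      # principal component direction
mad = np.degrees(np.arctan(np.sqrt(s[1]**2 + s[2]**2) / s[0]))
print(direction, f"MAD = {mad:.1f} deg")  # maximum angular deviation
```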
Hand, Rosa K; Kenne, Deric; Wolfram, Taylor M; Abram, Jenica K; Fleming, Michael
2016-11-15
Given the high penetration of social media use, social media has been proposed as a method for the dissemination of information to health professionals and patients. This study explored the potential for social media dissemination of the Academy of Nutrition and Dietetics Evidence-Based Nutrition Practice Guideline (EBNPG) for Heart Failure (HF). The objectives were to (1) describe the existing social media content on HF, including message content, source, and target audience, and (2) describe the attitude of physicians and registered dietitian nutritionists (RDNs) who care for outpatient HF patients toward the use of social media as a method to obtain information for themselves and to share this information with patients. The methods were divided into 2 parts. Part 1 involved conducting a content analysis of tweets related to HF, which were downloaded from Twitonomy and assigned codes for message content (19 codes), source (9 codes), and target audience (9 codes); code frequency was described. A comparison in the popularity of tweets (those marked as favorites or retweeted) based on applied codes was made using t tests. Part 2 involved conducting phone interviews with RDNs and physicians to describe health professionals' attitude toward the use of social media to communicate general health information and information specifically related to the HF EBNPG. Interviews were transcribed and coded; exemplar quotes representing frequent themes are presented. The sample included 294 original tweets with the hashtag "#heartfailure." The most frequent message content codes were "HF awareness" (166/294, 56.5%) and "patient support" (97/294, 33.0%). The most frequent source codes were "professional, government, patient advocacy organization, or charity" (112/277, 40.4%) and "patient or family" (105/277, 37.9%). The most frequent target audience codes were "unable to identify" (111/277, 40.1%) and "other" (55/277, 19.9%). Significant differences were found in the popularity of tweets with (mean 1, SD 1.3 favorites) or without (mean 0.7, SD 1.3 favorites), the content code being "HF research" (P=.049). Tweets with the source code "professional, government, patient advocacy organizations, or charities" were significantly more likely to be marked as a favorite and retweeted than those without this source code (mean 1.2, SD 1.4 vs mean 0.8, SD 1.2, P=.03) and (mean 1.5, SD 1.8 vs mean 0.9, SD 2.0, P=.03). Interview participants believed that social media was a useful way to gather professional information. They did not believe that social media was useful for communicating with patients due to privacy concerns and the fact that the information had to be kept general rather than be tailored for a specific patient and the belief that their patients did not use social media or technology. Existing Twitter content related to HF comes from a combination of patients and evidence-based organizations; however, there is little nutrition content. That gap may present an opportunity for EBNPG dissemination. Health professionals use social media to gather information for themselves but are skeptical of its value when communicating with patients, particularly due to privacy concerns and misconceptions about the characteristics of social media users. ©Rosa K Hand, Deric Kenne, Taylor M Wolfram, Jenica K Abram, Michael Fleming. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 15.11.2016.
Condor-COPASI: high-throughput computing for biochemical networks
2012-01-01
Background Mathematical modelling has become a standard technique to improve our understanding of complex biological systems. As models become larger and more complex, simulations and analyses require increasing amounts of computational power. Clusters of computers in a high-throughput computing environment can help to provide the resources required for computationally expensive model analysis. However, exploiting such a system can be difficult for users without the necessary expertise. Results We present Condor-COPASI, a server-based software tool that integrates COPASI, a biological pathway simulation tool, with Condor, a high-throughput computing environment. Condor-COPASI provides a web-based interface, which makes it extremely easy for a user to run a number of model simulation and analysis tasks in parallel. Tasks are transparently split into smaller parts, and submitted for execution on a Condor pool. Result output is presented to the user in a number of formats, including tables and interactive graphical displays. Conclusions Condor-COPASI can effectively use a Condor high-throughput computing environment to provide significant gains in performance for a number of model simulation and analysis tasks. Condor-COPASI is free, open source software, released under the Artistic License 2.0, and is suitable for use by any institution with access to a Condor pool. Source code is freely available for download at http://code.google.com/p/condor-copasi/, along with full instructions on deployment and usage. PMID:22834945
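The splitting idea in miniature: a parameter scan divided into chunks that could each become one pool job; the numbers are invented and no actual Condor submission happens here.

```python
# Split a parameter scan into roughly equal chunks, one per pool job.
def chunk(values, n_jobs):
    k, r = divmod(len(values), n_jobs)
    out, i = [], 0
    for j in range(n_jobs):
        size = k + (1 if j < r else 0)
        out.append(values[i:i + size])
        i += size
    return out

scan_points = [0.01 * i for i in range(1000)]  # e.g. a rate-constant sweep
jobs = chunk(scan_points, 8)
print([len(j) for j in jobs])  # [125, 125, 125, 125, 125, 125, 125, 125]
```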
Uhlirova, Hana; Tian, Peifang; Kılıç, Kıvılcım; Thunemann, Martin; Sridhar, Vishnu B; Chmelik, Radim; Bartsch, Hauke; Dale, Anders M; Devor, Anna; Saisan, Payam A
2018-05-04
The importance of sharing experimental data in neuroscience grows with the amount and complexity of data acquired and the various techniques used to obtain and process these data. However, the majority of experimental data, especially from individual studies of regular-sized laboratories, never reach the wider research community. A graphical user interface (GUI) engine called Neurovascular Network Explorer 2.0 (NNE 2.0) has been created as a tool for simple and low-cost sharing and exploring of vascular imaging data. NNE 2.0 interacts with a database containing optogenetically-evoked dilation/constriction time-courses of individual vessels measured in mouse somatosensory cortex in vivo by 2-photon microscopy. NNE 2.0 enables selection and display of the time-courses based on different criteria (subject, branching order, cortical depth, vessel diameter, arteriolar tree) as well as simple mathematical manipulation (e.g. averaging, peak-normalization) and data export. It supports visualization of the vascular network in 3D and enables localization of the individual functional vessel diameter measurements within vascular trees. NNE 2.0, its source code, and the corresponding database are freely downloadable from the UCSD Neurovascular Imaging Laboratory website. The source code can be utilized by the users to explore the associated database or as a template for databasing and sharing their own experimental results provided the appropriate format.
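The two manipulations named above, averaging and peak-normalization, for a set of vessel-diameter time courses; a generic numpy sketch with toy data, not NNE 2.0 source.

```python
# Average across vessels and normalize each time course by its peak.
import numpy as np

timecourses = np.array([   # % diameter change, one row per vessel
    [0.0, 1.2, 3.5, 2.0, 0.4],
    [0.0, 0.8, 2.9, 1.7, 0.3],
])
peak_normalized = timecourses / np.abs(timecourses).max(axis=1, keepdims=True)
mean_response = timecourses.mean(axis=0)
print(peak_normalized.round(2), mean_response)
```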
[The QR code in society, economy and medicine--fields of application, options and chances].
Flaig, Benno; Parzeller, Markus
2011-01-01
2D codes like the QR Code ("Quick Response") are becoming more and more common in society and medicine. The application spectrum and benefits in medicine and other fields are described. 2D codes can be created free of charge on any computer with internet access without any previous knowledge. The codes can be easily used in publications, presentations, on business cards and posters. Editors choose between contact details, text or a hyperlink as information behind the code. At expert conferences, linkage by QR Code allows the audience to download presentations and posters quickly. The documents obtained can then be saved, printed, processed etc. Fast access to stored data in the internet makes it possible to integrate additional and explanatory multilingual videos into medical posters. In this context, a combination of different technologies (printed handout, QR Code and screen) may be reasonable.
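Creating such a code takes a few lines; the sketch below uses the third-party qrcode package (pip install qrcode[pil]) and a placeholder URL.

```python
# Generate a print-ready QR code image for a poster or business card.
import qrcode

img = qrcode.make("https://example.org/poster-video")  # link behind the code
img.save("poster_qr.png")
```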
Boros, L G; Lepow, C; Ruland, F; Starbuck, V; Jones, S; Flancbaum, L; Townsend, M C
1992-07-01
A powerful method of processing MEDLINE and CINAHL source data uploaded to the IBM 3090 mainframe computer through an IBM/PC is described. Data are first downloaded from the CD-ROM's PC devices to floppy disks. These disks then are uploaded to the mainframe computer through an IBM/PC equipped with WordPerfect text editor and computer network connection (SONNGATE). Before downloading, keywords specifying the information to be accessed are typed at the FIND prompt of the CD-ROM station. The resulting abstracts are downloaded into a file called DOWNLOAD.DOC. The floppy disks containing the information are simply carried to an IBM/PC which has a terminal emulation (TELNET) connection to the university-wide computer network (SONNET) at the Ohio State University Academic Computing Services (OSU ACS). WordPerfect (5.1) processes and saves the text into DOS format. Using the File Transfer Protocol (FTP, 130,000 bytes/s) of SONNET, the entire text containing the information obtained through the MEDLINE and CINAHL search is transferred to the remote mainframe computer for further processing. At this point, abstracts in the specified area are ready for immediate access and multiple retrieval by any PC having network switch or dial-in connection after the USER ID, PASSWORD and ACCOUNT NUMBER are specified by the user. The system provides the user an on-line, very powerful and quick method of searching for words specifying: diseases, agents, experimental methods, animals, authors, and journals in the research area downloaded. The user can also copy the TItles, AUthors and SOurce with optional parts of abstracts into manuscripts being edited. This arrangement serves the special demands of a research laboratory by handling MEDLINE and CINAHL source data resulting after a search is performed with keywords specified for ongoing projects. Since the Ohio State University has a centrally funded mainframe system, the data upload, storage and mainframe operations are free.
The PLUTO code for astrophysical gasdynamics.
NASA Astrophysics Data System (ADS)
Mignone, A.
Present numerical codes appeal to a consolidated theory based on finite difference and Godunov-type schemes. In this context we have developed a versatile numerical code, PLUTO, suitable for the solution of high Mach number flows in 1, 2 and 3 spatial dimensions and different systems of coordinates. Different hydrodynamic modules and algorithms may be independently selected to properly describe Newtonian, relativistic, MHD, or relativistic MHD fluids. The modular structure exploits a general framework for integrating a system of conservation laws, built on modern Godunov-type shock-capturing schemes. The code is freely distributed under the GNU public license and it is available for download to the astrophysical community at the URL http://plutocode.to.astro.it.
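The schematic form of the conservation laws such a framework integrates, with conserved state U, flux tensor F, and source terms S:

```latex
\begin{equation}
  \frac{\partial U}{\partial t} + \nabla \cdot \mathbf{F}(U) = S(U).
\end{equation}
```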
A longitudinal dataset of five years of public activity in the Scratch online community.
Hill, Benjamin Mako; Monroy-Hernández, Andrés
2017-01-31
Scratch is a programming environment and an online community where young people can create, share, learn, and communicate. In collaboration with the Scratch Team at MIT, we created a longitudinal dataset of public activity in the Scratch online community during its first five years (2007-2012). The dataset comprises 32 tables with information on more than 1 million Scratch users, nearly 2 million Scratch projects, more than 10 million comments, more than 30 million visits to Scratch projects, and more. To help researchers understand this dataset, and to establish the validity of the data, we also include the source code of every version of the software that operated the website, as well as the software used to generate this dataset. We believe this is the largest and most comprehensive downloadable dataset of youth programming artifacts and communication.
Python for large-scale electrophysiology.
Spacek, Martin; Blanche, Tim; Swindale, Nicholas
2008-01-01
Electrophysiology is increasingly moving towards highly parallel recording techniques which generate large data sets. We record extracellularly in vivo in cat and rat visual cortex with 54-channel silicon polytrodes, under time-locked visual stimulation, from localized neuronal populations within a cortical column. To help deal with the complexity of generating and analysing these data, we used the Python programming language to develop three software projects: one for temporally precise visual stimulus generation ("dimstim"); one for electrophysiological waveform visualization and spike sorting ("spyke"); and one for spike train and stimulus analysis ("neuropy"). All three are open source and available for download (http://swindale.ecc.ubc.ca/code). The requirements and solutions for these projects differed greatly, yet we found Python to be well suited for all three. Here we present our software as a showcase of the extensive capabilities of Python in neuroscience.
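As an illustration of the numpy idiom such tools build on, here is a first step of spike sorting, threshold-crossing detection on one synthetic channel; this is not code from dimstim, spyke, or neuropy.

```python
# Detect negative-going threshold crossings in a synthetic trace, using
# the common robust noise estimate sigma ~ median(|x|)/0.6745.
import numpy as np

rng = np.random.default_rng(0)
trace = rng.normal(0.0, 1.0, 30_000)        # 1 s at 30 kHz, noise only
trace[[5_000, 12_345, 27_000]] -= 12.0      # three fake spikes

thresh = -5.0 * np.median(np.abs(trace)) / 0.6745
crossings = np.flatnonzero((trace[1:] < thresh) & (trace[:-1] >= thresh)) + 1
print(crossings)  # sample indices of detected spike onsets
```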
NGL Viewer: Web-based molecular graphics for large complexes.
Rose, Alexander S; Bradley, Anthony R; Valasatava, Yana; Duarte, Jose M; Prlic, Andreas; Rose, Peter W
2018-05-29
The interactive visualization of very large macromolecular complexes on the web is becoming a challenging problem as experimental techniques advance at an unprecedented rate and deliver structures of increasing size. We have tackled this problem by developing highly memory-efficient and scalable extensions for the NGL WebGL-based molecular viewer and by using MMTF, a binary and compressed Macromolecular Transmission Format. These enable NGL to download and render molecular complexes with millions of atoms interactively on desktop computers and smartphones alike, making it a tool of choice for web-based molecular visualization in research and education. The source code is freely available under the MIT license at github.com/arose/ngl and distributed on NPM (npmjs.com/package/ngl). MMTF-JavaScript encoders and decoders are available at github.com/rcsb/mmtf-javascript. asr.moin@gmail.com.
NASA Astrophysics Data System (ADS)
Bieryla, Allyson; Diaz Merced, Wanda; Davis, Daniel
2018-06-01
In astronomy, the relationship between color and temperature is an important concept. This concept can be demonstrated in a laboratory or seen at the telescope when observing stars. A blind/visually-impaired (B/VI) person would not be able to engage in the same observational demonstrations that are typically done to explain this concept. We've developed a tool for B/VI students to participate in these types of observational activities. Using an Arduino-compatible microcontroller with an RGB light sensor, we are able to convert filtered light into sound. The device will produce different timbres for different wavelengths of light, which can then be used to distinguish the temperature of an object. The device is handheld, easy to program and inexpensive to reproduce (< $50). It is also fitted to mount on a telescope for observing. The design schematic and code will be open source and available for download.
Automatic morphological classification of galaxy images
Shamir, Lior
2009-01-01
We describe an image analysis supervised learning algorithm that can automatically classify galaxy images. The algorithm is first trained using manually classified images of elliptical, spiral, and edge-on galaxies. A large set of image features is extracted from each image, and the most informative features are selected using Fisher scores. Test images can then be classified using a simple Weighted Nearest Neighbor rule such that the Fisher scores are used as the feature weights. Experimental results show that galaxy images from Galaxy Zoo can be classified automatically as spiral, elliptical, or edge-on galaxies with an accuracy of ~90% compared to classifications carried out by the author. Full compilable source code of the algorithm is available for free download, and its general-purpose nature makes it suitable for other uses that involve automatic image analysis of celestial objects. PMID:20161594
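The classification rule described above, nearest neighbour with Fisher scores as feature weights, fits in a few lines of numpy; the features and scores below are invented toy values.

```python
# Weighted nearest-neighbour classification with Fisher scores as weights.
import numpy as np

train_X = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]])
train_y = np.array(["spiral", "spiral", "elliptical", "elliptical"])
fisher_w = np.array([2.0, 0.5])     # informative features weigh more

def classify(x):
    d = np.sqrt((((train_X - x) ** 2) * fisher_w).sum(axis=1))
    return train_y[np.argmin(d)]

print(classify(np.array([0.85, 0.15])))  # -> spiral
```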
Free and open source software for the manipulation of digital images.
Solomon, Robert W
2009-06-01
Free and open source software is a type of software that is nearly as powerful as commercial software but is freely downloadable. This software can do almost everything that the expensive programs can. GIMP (GNU Image Manipulation Program) is the free program that is comparable to Photoshop, and versions are available for Windows, Macintosh, and Linux platforms. This article briefly describes how GIMP can be installed and used to manipulate radiology images. It is no longer necessary to budget large amounts of money for high-quality software to achieve the goals of image processing and document creation because free and open source software is available for the user to download at will.
Zhou, Carol L Ecale
2015-01-01
In order to better define regions of similarity among related protein structures, it is useful to identify the residue-residue correspondences among proteins. Few codes exist for constructing a one-to-many multiple sequence alignment derived from a set of structure or sequence alignments, and a need was evident for creating such a tool for combining pairwise structure alignments that would allow for insertion of gaps in the reference structure. This report describes a new Python code, CombAlign, which takes as input a set of pairwise sequence alignments (which may be structure based) and generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA). The use and utility of CombAlign was demonstrated by generating gapped MSSAs using sets of pairwise structure-based sequence alignments between structure models of the matrix protein (VP40) and pre-small/secreted glycoprotein (sGP) of Reston Ebolavirus and the corresponding proteins of several other filoviruses. The gapped MSSAs revealed structure-based residue-residue correspondences, which enabled identification of structurally similar versus differing regions in the Reston proteins compared to each of the other corresponding proteins. CombAlign is a new Python code that generates a one-to-many, gapped, multiple structure- or sequence-based sequence alignment (MSSA) given a set of pairwise sequence alignments (which may be structure based). CombAlign has utility in assisting the user in distinguishing structurally conserved versus divergent regions on a reference protein structure relative to other closely related proteins. CombAlign was developed in Python 2.6, and the source code is available for download from the GitHub code repository.
77 FR 43535 - Grantee Codes for Certified Radiofrequency Equipment
Federal Register 2010, 2011, 2012, 2013, 2014
2012-07-25
... Radiofrequency Equipment AGENCY: Federal Communications Commission. ACTION: Final rule. SUMMARY: This document... certify new equipment. DATES: Effective August 24, 2012. FOR FURTHER INFORMATION CONTACT: Hugh Van Tuyl... may also be downloaded at: www.fcc.gov . Summary of the Order 1. The Commission operates an equipment...
SiGN-SSM: open source parallel software for estimating gene networks with state space models.
Tamada, Yoshinori; Yamaguchi, Rui; Imoto, Seiya; Hirose, Osamu; Yoshida, Ryo; Nagasaki, Masao; Miyano, Satoru
2011-04-15
SiGN-SSM is an open-source gene network estimation software able to run in parallel on PCs and massively parallel supercomputers. The software estimates a state space model (SSM), that is, a statistical dynamic model suitable for analyzing short time and/or replicated time series gene expression profiles. SiGN-SSM implements a novel parameter constraint effective to stabilize the estimated models. Also, by using a supercomputer, it is able to determine the gene network structure by a statistical permutation test in a practical time. SiGN-SSM is applicable not only to analyzing temporal regulatory dependencies between genes, but also to extracting the differentially regulated genes from time series expression profiles. SiGN-SSM is distributed under GNU Affero General Public Licence (GNU AGPL) version 3 and can be downloaded at http://sign.hgc.jp/signssm/. The pre-compiled binaries for some architectures are available in addition to the source code. The pre-installed binaries are also available on the Human Genome Center supercomputer system. The online manual and the supplementary information of SiGN-SSM are available on our web site. tamada@ims.u-tokyo.ac.jp.
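For orientation, the generic linear-Gaussian state space model being estimated, with y_t the observed expression profile at time t and x_t a hidden low-dimensional state (SiGN-SSM's exact parameterization and constraints may differ):

```latex
\begin{align}
  x_{t+1} &= F x_t + w_t, & w_t &\sim \mathcal{N}(0, Q), \\
  y_t     &= H x_t + v_t, & v_t &\sim \mathcal{N}(0, R).
\end{align}
```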
ChelomEx: Isotope-assisted discovery of metal chelates in complex media using high-resolution LC-MS.
Baars, Oliver; Morel, François M M; Perlman, David H
2014-11-18
Chelating agents can control the speciation and reactivity of trace metals in biological, environmental, and laboratory-derived media. A large number of trace metals (including Fe, Cu, Zn, Hg, and others) show characteristic isotopic fingerprints that can be exploited for the discovery of known and unknown organic metal complexes and related chelating ligands in very complex sample matrices using high-resolution liquid chromatography mass spectrometry (LC-MS). However, there is currently no free open-source software available for this purpose. We present a novel software tool, ChelomEx, which identifies isotope pattern-matched chromatographic features associated with metal complexes along with free ligands and other related adducts in high-resolution LC-MS data. High sensitivity and exclusion of false positives are achieved by evaluation of the chromatographic coherence of the isotope pattern within chromatographic features, which we demonstrate through the analysis of bacterial culture media. A built-in graphical user interface and compound library aid in identification and efficient evaluation of results. ChelomEx is implemented in MatLab. The source code, binaries for MS Windows and MAC OS X as well as test LC-MS data are available for download at SourceForge ( http://sourceforge.net/projects/chelomex ).
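The core of an isotope-fingerprint screen: testing whether two peaks are spaced by a metal's isotope mass split; the iron masses below are the usual literature values (approximate), and the peaks and tolerance are invented.

```python
# Do two m/z peaks differ by the 54Fe/56Fe mass split at a given charge?
FE_DELTA = 55.93494 - 53.93961   # ~1.99533 Da between 56Fe and 54Fe

def is_fe_pair(mz_light, mz_heavy, charge=1, tol_ppm=5.0):
    gap = (mz_heavy - mz_light) * charge          # neutral-mass difference
    tol = tol_ppm * 1e-6 * mz_heavy * charge
    return abs(gap - FE_DELTA) <= tol

print(is_fe_pair(507.2531, 509.2484))  # candidate Fe-chelate pair -> True
```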
CAMPAIGN: an open-source library of GPU-accelerated data clustering algorithms.
Kohlhoff, Kai J; Sosnick, Marc H; Hsu, William T; Pande, Vijay S; Altman, Russ B
2011-08-15
Data clustering techniques are an essential component of a good data analysis toolbox. Many current bioinformatics applications are inherently compute-intense and work with very large datasets. Sequential algorithms are inadequate for providing the necessary performance. For this reason, we have created Clustering Algorithms for Massively Parallel Architectures, Including GPU Nodes (CAMPAIGN), a central resource for data clustering algorithms and tools that are implemented specifically for execution on massively parallel processing architectures. CAMPAIGN is a library of data clustering algorithms and tools, written in 'C for CUDA' for Nvidia GPUs. The library provides up to two orders of magnitude speed-up over respective CPU-based clustering algorithms and is intended as an open-source resource. New modules from the community will be accepted into the library and the layout of it is such that it can easily be extended to promising future platforms such as OpenCL. Releases of the CAMPAIGN library are freely available for download under the LGPL from https://simtk.org/home/campaign. Source code can also be obtained through anonymous subversion access as described on https://simtk.org/scm/?group_id=453. kjk33@cantab.net.
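A plain-CPU reference of one algorithm such a library accelerates, Lloyd's k-means in numpy; the GPU versions differ mainly in where the distance and update steps run.

```python
# Lloyd's k-means on toy 2D data; centers seeded from the data points.
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]  # keep empty clusters stable
                            for j in range(k)])
    return labels, centers

X = np.vstack([np.random.default_rng(1).normal(0, 1, (50, 2)),
               np.random.default_rng(2).normal(5, 1, (50, 2))])
labels, centers = kmeans(X, 2)
print(centers.round(2))
```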
CCDST: A free Canadian climate data scraping tool
NASA Astrophysics Data System (ADS)
Bonifacio, Charmaine; Barchyn, Thomas E.; Hugenholtz, Chris H.; Kienzle, Stefan W.
2015-02-01
In this paper we present a new software tool that automatically fetches, downloads and consolidates climate data from a Web database where the data are contained on multiple Web pages. The tool is called the Canadian Climate Data Scraping Tool (CCDST) and was developed to enhance access and simplify analysis of climate data from Canada's National Climate Data and Information Archive (NCDIA). The CCDST deconstructs a URL for a particular climate station in the NCDIA and then iteratively modifies the date parameters to download large volumes of data, remove individual file headers, and merge data files into one output file. This automated sequence enhances access to climate data by substantially reducing the time needed to manually download data from multiple Web pages. To this end, we present a case study of the temporal dynamics of blowing snow events that resulted in ~3.1 weeks of time savings. Without the CCDST, the time involved in manually downloading climate data limits access and restrains researchers and students from exploring climate trends. The tool is coded as a Microsoft Excel macro and is available to researchers and students for free. The main concept and structure of the tool can be modified for other Web databases hosting geophysical data.
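The scraping pattern described above, rewriting the date parameters in a station URL, fetching each page, and keeping a single header, shown in generic Python; the URL template and parameter names are hypothetical stand-ins (the CCDST itself is an Excel macro).

```python
# Iterate date parameters in a station URL, fetch each page, and merge
# the results while dropping repeated file headers.
import urllib.request

TEMPLATE = ("https://example.gc.ca/climate_data?stationID=2205"
            "&Year={year}&Month={month}&format=csv")  # assumed query format

def fetch_station(years, months=range(1, 13)):
    merged, header_done = [], False
    for year in years:
        for month in months:
            url = TEMPLATE.format(year=year, month=month)
            with urllib.request.urlopen(url) as resp:
                lines = resp.read().decode("utf-8").splitlines()
            body = lines[1:] if header_done else lines  # keep one header
            header_done = True
            merged.extend(body)
    return "\n".join(merged)

# print(fetch_station(range(2000, 2002)))  # ~24 requests, one output file
```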
JEWEL 2.0.0: directions for use
NASA Astrophysics Data System (ADS)
Zapp, Korinna
2014-02-01
In this publication the first official release of the Jewel 2.0.0 code is presented [the first version, Jewel 1 (Zapp et al. in Eur Phys J C 60:617, 2009), could only treat elastic scattering explicitly and was never published; the code can be downloaded from the official Jewel homepage http://jewel.hepforge.org]. Jewel is a Monte Carlo event generator simulating QCD jet evolution in heavy-ion collisions. It treats the interplay of QCD radiation and re-scattering in a medium with fully microscopic dynamics in a consistent perturbative framework with minimal assumptions. After a qualitative introduction into the physics of Jewel, detailed information about the practical aspects of using the code is given. The code is available from the official Jewel homepage http://jewel.hepforge.org.
DOE Office of Scientific and Technical Information (OSTI.GOV)
None
2000-04-01
The NITON® 800 series analyzer is a hand-held, battery-operated unit that measures 8-in x 3-in x 2-in and weighs 2.5 pounds. The analyzer uses x-ray fluorescence spectrum analysis to identify and quantify elements in metal and then compares the readings to a built-in library to determine a metal's alloy. The library contains 300 elements and alloys, and can be customized to identify other elements and alloys (depending on the sources in the instrument). The basic unit utilizes a Cadmium-109 source, but each analyzer unit can hold up to two sources. These sources include Iron-55 and Americium-241. Pushing a safety button located on the side of the unit and placing it against a surface opens the shutter window. Within seconds the unit beeps and displays the results. The analyzer stores up to 1,000 data sets, including sample identification codes using a barcode reader. The data is easily downloaded to a conventional computer when sampling has been completed. Batteries are good for 8 hours and charge in less than 2 hours, and the unit can be carried, shipped, or transported without exterior labeling, conforming to 49 CFR 143.421.
caGrid 1.0: an enterprise Grid infrastructure for biomedical research.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oster, S.; Langella, S.; Hastings, S.
To develop software infrastructure that will provide support for discovery, characterization, integrated access, and management of diverse and disparate collections of information sources, analysis methods, and applications in biomedical research. Design: An enterprise Grid software infrastructure, called caGrid version 1.0 (caGrid 1.0), has been developed as the core Grid architecture of the NCI-sponsored cancer Biomedical Informatics Grid (caBIG™) program. It is designed to support a wide range of use cases in basic, translational, and clinical research, including (1) discovery, (2) integrated and large-scale data analysis, and (3) coordinated study. Measurements: The caGrid is built as a Grid software infrastructure and leverages Grid computing technologies and the Web Services Resource Framework standards. It provides a set of core services, toolkits for the development and deployment of new community provided services, and application programming interfaces for building client applications. Results: The caGrid 1.0 was released to the caBIG community in December 2006. It is built on open source components and caGrid source code is publicly and freely available under a liberal open source license. The core software, associated tools, and documentation can be downloaded from the following URL:
Estimation of toxicity using a Java based software tool
A software tool has been developed that will allow a user to estimate the toxicity for a variety of endpoints (such as acute aquatic toxicity). The software tool is coded in Java and can be accessed using a web browser (or alternatively downloaded and run as a stand-alone application).
Changing the Latitudes and Attitudes about Content Analysis Research
ERIC Educational Resources Information Center
Brank, Eve M.; Fox, Kathleen A.; Youstin, Tasha J.; Boeppler, Lee C.
2008-01-01
The current research employs the use of content analysis to teach research methods concepts among students enrolled in an upper division research methods course. Students coded and analyzed Jimmy Buffett song lyrics rather than using a downloadable database or collecting survey data. Students' knowledge of content analysis concepts increased after…
EPEPT: A web service for enhanced P-value estimation in permutation tests
2011-01-01
Background In computational biology, permutation tests have become a widely used tool to assess the statistical significance of an event under investigation. However, the common way of computing the P-value, which expresses the statistical significance, requires a very large number of permutations when small (and thus interesting) P-values are to be accurately estimated. This is computationally expensive and often infeasible. Recently, we proposed an alternative estimator, which requires far fewer permutations compared to the standard empirical approach while still reliably estimating small P-values [1]. Results The proposed P-value estimator has been enriched with additional functionalities and is made available to the general community through a public website and web service, called EPEPT. This means that the EPEPT routines can be accessed not only via a website, but also programmatically using any programming language that can interact with the web. Examples of web service clients in multiple programming languages can be downloaded. Additionally, EPEPT accepts data of various common experiment types used in computational biology. For these experiment types EPEPT first computes the permutation values and then performs the P-value estimation. Finally, the source code of EPEPT can be downloaded. Conclusions Different types of users, such as biologists, bioinformaticians and software engineers, can use the method in an appropriate and simple way. Availability http://informatics.systemsbiology.net/EPEPT/ PMID:22024252
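For contrast, the standard empirical estimator the paper improves on: p = (r + 1)/(n + 1) from n permutations, which cannot resolve P-values much below 1/n, hence the need for far more permutations or a better estimator.

```python
# Standard empirical P-value from permutation statistics (toy data).
import numpy as np

rng = np.random.default_rng(42)
observed = 3.2
null_stats = rng.normal(0.0, 1.0, 10_000)    # stand-in permutation stats
r = int((null_stats >= observed).sum())      # permutations >= observed
p_empirical = (r + 1) / (len(null_stats) + 1)
print(p_empirical)  # floor of ~1e-4 here, however extreme 'observed' is
```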
A Grassroots Remote Sensing Toolkit Using Live Coding, Smartphones, Kites and Lightweight Drones
Anderson, K.; Griffiths, D.; DeBell, L.; Hancock, S.; Duffy, J. P.; Shutler, J. D.; Reinhardt, W. J.; Griffiths, A.
2016-01-01
This manuscript describes the development of an Android-based smartphone application for capturing aerial photographs and spatial metadata automatically, for use in grassroots mapping applications. The aim of the project was to exploit the plethora of on-board sensors within modern smartphones (accelerometer, GPS, compass, camera) to generate ready-to-use spatial data from lightweight aerial platforms such as drones or kites. A visual coding ‘scheme blocks’ framework was used to build the application (‘app’), so that users could customise their own data capture tools in the field. The paper reports on the coding framework, then shows the results of test flights from kites and lightweight drones and finally shows how open-source geospatial toolkits were used to generate geographical information system (GIS)-ready GeoTIFF images from the metadata stored by the app. Two Android smartphones were used in testing – a high-specification OnePlus One handset and a lower-cost Acer Liquid Z3 handset – to test the operational limits of the app on phones with different sensor sets. We demonstrate that best results were obtained when the phone was attached to a stable single line kite or to a gliding drone. Results show that engine or motor vibrations from powered aircraft required dampening to ensure capture of high quality images. We demonstrate how the products generated from the open-source processing workflow are easily used in GIS. The app can be downloaded freely from the Google store by searching for ‘UAV toolkit’ (UAV toolkit 2016), and used wherever an Android smartphone and aerial platform are available to deliver rapid spatial data (e.g. in supporting decision-making in humanitarian disaster-relief zones, in teaching or for grassroots remote sensing and democratic mapping). PMID:27144310
openQ*D simulation code for QCD+QED
NASA Astrophysics Data System (ADS)
Campos, Isabel; Fritzsch, Patrick; Hansen, Martin; Krstić Marinković, Marina; Patella, Agostino; Ramos, Alberto; Tantalo, Nazario
2018-03-01
The openQ*D code for the simulation of QCD+QED with C* boundary conditions is presented. This code is based on openQCD-1.6, from which it inherits the core features that ensure its efficiency: the locally-deflated SAP-preconditioned GCR solver, the twisted-mass frequency splitting of the fermion action, the multilevel integrator, the 4th order OMF integrator, the SSE/AVX intrinsics, etc. The photon field is treated as fully dynamical and C* boundary conditions can be chosen in the spatial directions. We discuss the main features of openQ*D, and we show basic test results and performance analysis. An alpha version of this code is publicly available and can be downloaded from http://rcstar.web.cern.ch/.
Behavior Change Techniques in Apps for Medication Adherence: A Content Analysis.
Morrissey, Eimear C; Corbett, Teresa K; Walsh, Jane C; Molloy, Gerard J
2016-05-01
There are a vast number of smartphone applications (apps) aimed at promoting medication adherence on the market; however, the theory and evidence base in terms of applying established health behavior change techniques underpinning these apps remains unclear. This study aimed to code these apps using the Behavior Change Technique Taxonomy (v1) for the presence or absence of established behavior change techniques. The sample of apps was identified through systematic searches in both the Google Play Store and Apple App Store in February 2015. All apps that fell into the search categories were downloaded for analysis. The downloaded apps were screened with exclusion criteria, and suitable apps were reviewed and coded for behavior change techniques in March 2015. Two researchers performed coding independently. In total, 166 medication adherence apps were identified and coded. The number of behavior change techniques contained in an app ranged from zero to seven (mean=2.77). A total of 12 of a possible 96 behavior change techniques were found to be present across apps. The most commonly included behavior change techniques were "action planning" and "prompt/cues," which were included in 96% of apps, followed by "self-monitoring" (37%) and "feedback on behavior" (36%). The current extent to which established behavior change techniques are used in medication adherence apps is limited. The development of medication adherence apps may not have benefited from advances in the theory and practice of health behavior change. Copyright © 2016 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.
GeneXplorer: an interactive web application for microarray data visualization and analysis.
Rees, Christian A; Demeter, Janos; Matese, John C; Botstein, David; Sherlock, Gavin
2004-10-01
When publishing large-scale microarray datasets, it is of great value to create supplemental websites where either the full data, or selected subsets corresponding to figures within the paper, can be browsed. We set out to create a CGI application containing many of the features of some of the existing standalone software for the visualization of clustered microarray data. We present GeneXplorer, a web application for interactive microarray data visualization and analysis in a web environment. GeneXplorer allows users to browse a microarray dataset in an intuitive fashion. It provides simple access to microarray data over the Internet and uses only HTML and JavaScript to display graphic and annotation information. It provides radar and zoom views of the data, allows display of the nearest neighbors to a gene expression vector based on their Pearson correlations and provides the ability to search gene annotation fields. The software is released under the permissive MIT Open Source license, and the complete documentation and the entire source code are freely available for download from CPAN http://search.cpan.org/dist/Microarray-GeneXplorer/.
compendiumdb: an R package for retrieval and storage of functional genomics data.
Nandal, Umesh K; van Kampen, Antoine H C; Moerland, Perry D
2016-09-15
Currently, the Gene Expression Omnibus (GEO) contains public data of over 1 million samples from more than 40 000 microarray-based functional genomics experiments. This provides a rich source of information for novel biological discoveries. However, unlocking this potential often requires retrieving and storing a large number of expression profiles from a wide range of different studies and platforms. The compendiumdb R package provides an environment for downloading functional genomics data from GEO, parsing the information into a local or remote database and interacting with the database using dedicated R functions, thus enabling seamless integration with other tools available in R/Bioconductor. The compendiumdb package is written in R, MySQL and Perl. Source code and binaries are available from CRAN (http://cran.r-project.org/web/packages/compendiumdb/) for all major platforms (Linux, MS Windows and OS X) under the GPLv3 license. Contact: p.d.moerland@amc.uva.nl. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
LEOPARD on a personal computer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lancaster, D.B.
1988-01-01
The LEOPARD code is very widely used to produce four- or two-group cross sections for water reactors. Although it is heavily used, it had not been downloaded to the PC. This paper has been written to announce the completion of downloading LEOPARD. LEOPARD can now be run on anything from the early PC to the most advanced 80386 machines. The only requirements are 512 Kbytes of memory (LEOPARD actually only needs 235, but with buffers, 256 Kbytes may not be enough) and two disk drives (preferably, one a hard drive). The run times for various machines and configurations are summarized. The accuracy of the PC-LEOPARD results is documented.
Landlab: an Open-Source Python Library for Modeling Earth Surface Dynamics
NASA Astrophysics Data System (ADS)
Gasparini, N. M.; Adams, J. M.; Hobley, D. E. J.; Hutton, E.; Nudurupati, S. S.; Istanbulluoglu, E.; Tucker, G. E.
2016-12-01
Landlab is an open-source Python modeling library that enables users to easily build unique models to explore earth surface dynamics. The Landlab library provides a number of tools and functionalities that are common to many earth surface models, thus eliminating the need for a user to recode fundamental model elements each time she explores a new problem. For example, Landlab provides a gridding engine so that a user can build a uniform or nonuniform grid in one line of code. The library has tools for setting boundary conditions, adding data to a grid, and performing basic operations on the data, such as calculating gradients and curvature. The library also includes a number of process components, which are numerical implementations of physical processes. To create a model, a user creates a grid and couples together process components that act on grid variables. The current library has components for modeling a diverse range of processes, from overland flow generation to bedrock river incision, from soil wetting and drying to vegetation growth, succession and death. The code is freely available for download (https://github.com/landlab/landlab) or can be installed as a Python package. Landlab models can also be built and run on Hydroshare (www.hydroshare.org), an online collaborative environment for sharing hydrologic data, models, and code. Tutorials illustrating a wide range of Landlab capabilities such as building a grid, setting boundary conditions, reading in data, plotting, using components and building models are also available (https://github.com/landlab/tutorials). The code is also comprehensively documented both online and natively in Python. In this presentation, we illustrate the diverse capabilities of Landlab. We highlight existing functionality by illustrating outcomes from a range of models built with Landlab - including applications that explore landscape evolution and ecohydrology. Finally, we describe the range of resources available for new users.
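To give a flavor of the workflow the abstract describes (a grid built in one line, process components coupled to grid fields), here is a minimal sketch written in the style of recent Landlab releases; exact names and signatures vary between versions, so treat the details as indicative rather than as the API presented here:

```python
from landlab import RasterModelGrid
from landlab.components import LinearDiffuser

# Build a uniform raster grid in one line, then attach an elevation field.
grid = RasterModelGrid((25, 40), xy_spacing=10.0)
z = grid.add_zeros("topographic__elevation", at="node")

# Couple a process component to the grid and advance it through time.
diffuser = LinearDiffuser(grid, linear_diffusivity=0.01)
for _ in range(100):
    diffuser.run_one_step(dt=1000.0)
print(z.mean())
```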
Seismic Canvas: Evolution as a Data Exploration and Analysis Tool
NASA Astrophysics Data System (ADS)
Kroeger, G. C.
2015-12-01
SeismicCanvas, originally developed as a prototype interactive waveform display and printing application for educational use, has evolved to include significant data exploration and analysis functionality. The most recent version supports data import from a variety of standard file formats including SAC and mini-SEED, as well as search and download capabilities via IRIS/FDSN Web Services. Data processing tools now include removal of means and trends, interactive windowing, filtering, smoothing, tapering, and resampling. Waveforms can be displayed in a free-form canvas or as a record section based on angular or great circle distance, azimuth or back azimuth. Integrated tau-p code allows the calculation and display of theoretical phase arrivals from a variety of radial Earth models. Waveforms can be aligned by absolute time, event time, picked or theoretical arrival times and can be stacked after alignment. Interactive measurements include means, amplitudes, time delays, ray parameters and apparent velocities. Interactive picking of an arbitrary list of seismic phases is supported. Bode plots of amplitude and phase spectra and spectrograms can be created from multiple seismograms or selected windows of seismograms. Direct printing is implemented on all supported platforms along with output of high-resolution pdf files. With these added capabilities, the application is now being used as a data exploration tool for research. Coded in C++ and using the cross-platform Qt framework, the most recent version is available as a 64-bit application for Windows 7-10, Mac OS X 10.6-10.11, and most distributions of Linux, and a 32-bit version for Windows XP and 7. With the latest improvements and refactoring of trace display classes, the 64-bit versions have been tested with over 250 million samples and remain responsive in interactive operations. The source code is available under an LGPLv3 license and both source and executables are available through the IRIS SeisCode repository.
IAC-POP: FINDING THE STAR FORMATION HISTORY OF RESOLVED GALAXIES
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aparicio, Antonio; Hidalgo, Sebastian L.
2009-08-15
IAC-pop is a code designed to solve the star formation history (SFH) of a complex stellar population system, like a galaxy, from the analysis of the color-magnitude diagram (CMD). It uses a genetic algorithm to minimize a χ² merit function comparing the star distributions in the observed CMD and the CMD of a synthetic stellar population. A parameterization of the CMDs is used, which is the main input of the code. In fact, the code can be applied to any problem in which a similar parameterization of an experimental set of data and models can be made. The method's internal consistency and robustness against several error sources, including observational effects, data sampling, and stellar evolution library differences, are tested. It is found that the best stability of the solution and the best way to estimate errors are obtained by several runs of IAC-pop while varying the input data parameterization. The routine MinnIAC is used to control this process. IAC-pop is offered for free use and can be downloaded from the site http://iac-star.iac.es/iac-pop. The routine MinnIAC is also offered upon request, but support cannot be provided for its use. The only requirement for the use of IAC-pop and MinnIAC is referencing this paper and crediting as indicated in the site.
Hu, Jialu; Kehr, Birte; Reinert, Knut
2014-02-15
Owing to recent advancements in high-throughput technologies, protein-protein interaction networks of more and more species have become available in public databases. The question of how to identify functionally conserved proteins across species attracts a lot of attention in computational biology. Network alignments provide a systematic way to solve this problem. However, most existing alignment tools encounter limitations in tackling this problem. Therefore, the demand for faster and more efficient alignment tools is growing. We present a fast and accurate algorithm, NetCoffee, which finds a global alignment of multiple protein-protein interaction networks. NetCoffee searches for a global alignment by maximizing a target function using simulated annealing on a set of weighted bipartite graphs that are constructed using a triplet approach similar to T-Coffee. To assess its performance, NetCoffee was applied to four real datasets. Our results suggest that NetCoffee remedies several limitations of previous algorithms, outperforms all existing alignment tools in terms of speed and nevertheless identifies biologically meaningful alignments. The source code and data are freely available for download under the GNU GPL v3 license at https://code.google.com/p/netcoffee/.
DoOR 2.0 - Comprehensive Mapping of Drosophila melanogaster Odorant Responses
NASA Astrophysics Data System (ADS)
Münch, Daniel; Galizia, C. Giovanni
2016-02-01
Odors elicit complex patterns of activated olfactory sensory neurons. Knowing the complete olfactome, i.e. the responses in all sensory neurons for all relevant odorants, is desirable to understand olfactory coding. The DoOR project combines all available Drosophila odorant response data into a single consensus response matrix. Since its first release, many studies have been published: receptors have been deorphanized and several response profiles expanded. In this study, we add unpublished data to the odor-response profiles for four odorant receptors (Or10a, Or42b, Or47b, Or56a). We deorphanize Or69a, showing a broad response spectrum with the best ligands including 3-hydroxyhexanoate, alpha-terpineol, 3-octanol and linalool. We include all of these datasets into DoOR, provide a comprehensive update of both code and data, and new tools for data analyses and visualizations. The DoOR project has a web interface for quick queries (http://neuro.uni.kn/DoOR), and a downloadable, open source toolbox written in R, including all processed and original datasets. DoOR now gives reliable odorant responses for nearly all Drosophila olfactory responding units, listing 693 odorants, for a total of 7381 data points.
BioFVM: an efficient, parallelized diffusive transport solver for 3-D biological simulations
Ghaffarizadeh, Ahmadreza; Friedman, Samuel H.; Macklin, Paul
2016-01-01
Motivation: Computational models of multicellular systems require solving systems of PDEs for release, uptake, decay and diffusion of multiple substrates in 3D, particularly when incorporating the impact of drugs, growth substrates and signaling factors on cell receptors and subcellular systems biology. Results: We introduce BioFVM, a diffusive transport solver tailored to biological problems. BioFVM can simulate release and uptake of many substrates by cell and bulk sources, diffusion and decay in large 3D domains. It has been parallelized with OpenMP, allowing efficient simulations on desktop workstations or single supercomputer nodes. The code is stable even for large time steps, with linear computational cost scalings. Solutions are first-order accurate in time and second-order accurate in space. The code can be run by itself or as part of a larger simulator. Availability and implementation: BioFVM is written in C++ with parallelization in OpenMP. It is maintained and available for download at http://BioFVM.MathCancer.org and http://BioFVM.sf.net under the Apache License (v2.0). Contact: paul.macklin@usc.edu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26656933
i-ADHoRe 2.0: an improved tool to detect degenerated genomic homology using genomic profiles.
Simillion, Cedric; Janssens, Koen; Sterck, Lieven; Van de Peer, Yves
2008-01-01
i-ADHoRe is a software tool that combines gene content and gene order information of homologous genomic segments into profiles to detect highly degenerated homology relations within and between genomes. The new version offers, besides a significant increase in performance, several optimizations to the algorithm, most importantly to the profile alignment routine. As a result, the annotations of multiple genomes, or parts thereof, can be fed simultaneously into the program, after which it will report all regions of homology, both within and between genomes. The i-ADHoRe 2.0 package contains the C++ source code for the main program as well as various Perl scripts and a fully documented Perl API to facilitate post-processing. The software runs on any Linux- or UNIX-based platform. The package is freely available for academic users and can be downloaded from http://bioinformatics.psb.ugent.be/
fluff: exploratory analysis and visualization of high-throughput sequencing data
Georgiou, Georgios
2016-01-01
Summary. In this article we describe fluff, a software package that allows for simple exploration, clustering and visualization of high-throughput sequencing data mapped to a reference genome. The package contains three command-line tools to generate publication-quality figures in an uncomplicated manner using sensible defaults. Genome-wide data can be aggregated, clustered and visualized in a heatmap, according to different clustering methods. This includes a predefined setting to identify dynamic clusters between different conditions or developmental stages. Alternatively, clustered data can be visualized in a bandplot. Finally, fluff includes a tool to generate genomic profiles. As command-line tools, the fluff programs can easily be integrated into standard analysis pipelines. The installation is straightforward and documentation is available at http://fluff.readthedocs.org. Availability. fluff is implemented in Python and runs on Linux. The source code is freely available for download at https://github.com/simonvh/fluff. PMID:27547532
Python for Large-Scale Electrophysiology
Spacek, Martin; Blanche, Tim; Swindale, Nicholas
2008-01-01
Electrophysiology is increasingly moving towards highly parallel recording techniques which generate large data sets. We record extracellularly in vivo in cat and rat visual cortex with 54-channel silicon polytrodes, under time-locked visual stimulation, from localized neuronal populations within a cortical column. To help deal with the complexity of generating and analysing these data, we used the Python programming language to develop three software projects: one for temporally precise visual stimulus generation (“dimstim”); one for electrophysiological waveform visualization and spike sorting (“spyke”); and one for spike train and stimulus analysis (“neuropy”). All three are open source and available for download (http://swindale.ecc.ubc.ca/code). The requirements and solutions for these projects differed greatly, yet we found Python to be well suited for all three. Here we present our software as a showcase of the extensive capabilities of Python in neuroscience. PMID:19198646
DendroPy: a Python library for phylogenetic computing.
Sukumaran, Jeet; Holder, Mark T
2010-06-15
DendroPy is a cross-platform library for the Python programming language that provides for object-oriented reading, writing, simulation and manipulation of phylogenetic data, with an emphasis on phylogenetic tree operations. DendroPy uses a splits-hash mapping to perform rapid calculations of tree distances, similarities and shape under various metrics. It contains rich simulation routines to generate trees under a number of different phylogenetic and coalescent models. DendroPy's data simulation and manipulation facilities, in conjunction with its support of a broad range of phylogenetic data formats (NEXUS, Newick, PHYLIP, FASTA, NeXML, etc.), allow it to serve a useful role in various phyloinformatics and phylogeographic pipelines. The stable release of the library is available for download and automated installation through the Python Package Index site (http://pypi.python.org/pypi/DendroPy), while the active development source code repository is available to the public from GitHub (http://github.com/jeetsukumaran/DendroPy).
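As an illustration of the tree operations described above, the following sketch (written against the current DendroPy 4 API, which differs slightly from the release described in this 2010 paper) reads two trees over a shared taxon namespace and computes a distance between them:

```python
import dendropy
from dendropy.calculate import treecompare

# A shared TaxonNamespace makes the two trees comparable.
tns = dendropy.TaxonNamespace()
t1 = dendropy.Tree.get(data="((A,B),(C,D));", schema="newick", taxon_namespace=tns)
t2 = dendropy.Tree.get(data="((A,C),(B,D));", schema="newick", taxon_namespace=tns)

# Unweighted Robinson-Foulds (symmetric difference) distance.
print(treecompare.symmetric_difference(t1, t2))
```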
PROVAT: a tool for Voronoi tessellation analysis of protein structures and complexes.
Gore, Swanand P; Burke, David F; Blundell, Tom L
2005-08-01
Voronoi tessellation has proved to be a useful tool in protein structure analysis. We have developed PROVAT, versatile public-domain software that enables computation and visualization of Voronoi tessellations of proteins and protein complexes. It is a set of Python scripts that integrate freely available specialized software (Qhull, PyMOL, etc.) into a pipeline. The calculation component of the tool computes the Voronoi tessellation of a given protein system in a way described by a user-supplied XML recipe and stores the resulting neighbourhood information as text files with various styles. The Python pickle file generated in the process is used by the visualization component, a PyMOL plug-in, that offers a GUI to explore the tessellation visually. PROVAT source code can be downloaded from http://raven.bioc.cam.ac.uk/~swanand/Provat1, which also provides a webserver for its calculation component, documentation and examples.
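This is not PROVAT's own code, but the core idea, deriving neighbour relations from a Voronoi tessellation, can be illustrated with SciPy's bindings to Qhull, the same tessellation engine PROVAT integrates into its pipeline:

```python
import numpy as np
from scipy.spatial import Voronoi

points = np.random.rand(20, 3)  # stand-in for atomic coordinates
vor = Voronoi(points)

# ridge_points lists pairs of input points whose Voronoi cells share a
# facet, i.e. geometric neighbours in the tessellation.
contacts = [(int(i), int(j)) for i, j in vor.ridge_points]
print(len(contacts), "neighbour pairs")
```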
ChromA: signal-based retention time alignment for chromatography-mass spectrometry data.
Hoffmann, Nils; Stoye, Jens
2009-08-15
We describe ChromA, a web-based alignment tool for chromatography-mass spectrometry data from the metabolomics and proteomics domains. Users can supply their data in open and standardized file formats for retention time alignment using dynamic time warping with different configurable local distance and similarity functions. Additionally, user-defined anchors can be used to constrain and speed up the alignment. A neighborhood around each anchor can be added to increase the flexibility of the constrained alignment. ChromA offers different visualizations of the alignment for easier qualitative interpretation and comparison of the data. For the multiple alignment of more than two data files, the center-star approximation is applied to select a reference among input files to align to. ChromA is available at http://bibiserv.techfak.uni-bielefeld.de/chroma. Executables and source code under the LGPL v3 license are provided for download at the same location.
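ChromA itself is a web service, but the dynamic time warping at its core is compact enough to sketch. The following toy implementation (not ChromA's code; squared difference stands in for its configurable local distance functions) computes the alignment cost of two signals:

```python
import numpy as np

def dtw_distance(x, y):
    # Classic dynamic programming recurrence for dynamic time warping.
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2  # local distance
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

print(dtw_distance([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 2.0, 3.0, 4.0]))
```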
OXSA: An open-source magnetic resonance spectroscopy analysis toolbox in MATLAB.
Purvis, Lucian A B; Clarke, William T; Biasiolli, Luca; Valkovič, Ladislav; Robson, Matthew D; Rodgers, Christopher T
2017-01-01
In vivo magnetic resonance spectroscopy provides insight into metabolism in the human body. New acquisition protocols are often proposed to improve the quality or efficiency of data collection. Processing pipelines must also be developed to use these data optimally. Current fitting software is either targeted at general spectroscopy fitting, or for specific protocols. We therefore introduce the MATLAB-based OXford Spectroscopy Analysis (OXSA) toolbox to allow researchers to rapidly develop their own customised processing pipelines. The toolbox aims to simplify development by: being easy to install and use; seamlessly importing Siemens Digital Imaging and Communications in Medicine (DICOM) standard data; allowing visualisation of spectroscopy data; offering a robust fitting routine; flexibly specifying prior knowledge when fitting; and allowing batch processing of spectra. This article demonstrates how each of these criteria have been fulfilled, and gives technical details about the implementation in MATLAB. The code is freely available to download from https://github.com/oxsatoolbox/oxsa.
Deterministic Design Optimization of Structures in OpenMDAO Framework
NASA Technical Reports Server (NTRS)
Coroneos, Rula M.; Pai, Shantaram S.
2012-01-01
Nonlinear programming algorithms play an important role in structural design optimization. Several such algorithms have been implemented in the OpenMDAO framework developed at NASA Glenn Research Center (GRC). OpenMDAO is an open source engineering analysis framework, written in Python, for analyzing and solving Multi-Disciplinary Analysis and Optimization (MDAO) problems. It provides a number of solvers and optimizers, referred to as components and drivers, which users can leverage to build new tools and processes quickly and efficiently. Users may download, use, modify, and distribute the OpenMDAO software at no cost. This paper summarizes the process involved in analyzing and optimizing structural components by utilizing the framework's structural solvers and several gradient-based optimizers along with a multi-objective genetic algorithm. For comparison purposes, the same structural components were analyzed and optimized using CometBoards, a NASA GRC-developed code. The reliability and efficiency of the OpenMDAO framework were compared and are reported in this paper.
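As a flavor of driver-based optimization in OpenMDAO, here is a minimal sketch using the current openmdao.api. The API has changed considerably since the 2012-era framework this paper describes, so the spellings below follow recent releases, not the code used in the study:

```python
import openmdao.api as om

# Minimize a simple paraboloid with a gradient-based driver.
prob = om.Problem()
prob.model.add_subsystem(
    "paraboloid",
    om.ExecComp("f = (x - 3.0)**2 + x*y + (y + 4.0)**2 - 3.0"),
    promotes=["*"],
)
prob.driver = om.ScipyOptimizeDriver(optimizer="SLSQP")
prob.model.add_design_var("x", lower=-50.0, upper=50.0)
prob.model.add_design_var("y", lower=-50.0, upper=50.0)
prob.model.add_objective("f")

prob.setup()
prob.run_driver()
print(prob.get_val("x"), prob.get_val("y"), prob.get_val("f"))
```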
EMERALD: A Flexible Framework for Managing Seismic Data
NASA Astrophysics Data System (ADS)
West, J. D.; Fouch, M. J.; Arrowsmith, R.
2010-12-01
The seismological community is challenged by the vast quantity of new broadband seismic data provided by large-scale seismic arrays such as EarthScope’s USArray. While this bonanza of new data enables transformative scientific studies of the Earth’s interior, it also illuminates limitations in the methods used to prepare and preprocess those data. At a recent seismic data processing focus group workshop, many participants expressed the need for better systems to minimize the time and tedium spent on data preparation in order to increase the efficiency of scientific research. Another challenge related to data from all large-scale transportable seismic experiments is that there currently exists no system for discovering and tracking changes in station metadata. This critical information, such as station location, sensor orientation, instrument response, and clock timing data, may change over the life of an experiment and/or be subject to post-experiment correction. Yet nearly all researchers utilize metadata acquired with the downloaded data, even though subsequent metadata updates might alter or invalidate results produced with older metadata. A third long-standing issue for the seismic community is the lack of easily exchangeable seismic processing codes. This problem stems directly from the storage of seismic data as individual time series files, and the history of each researcher developing his or her preferred data file naming convention and directory organization. Because most processing codes rely on the underlying data organization structure, such codes are not easily exchanged between investigators. To address these issues, we are developing EMERALD (Explore, Manage, Edit, Reduce, & Analyze Large Datasets). The goal of the EMERALD project is to provide seismic researchers with a unified, user-friendly, extensible system for managing seismic event data, thereby increasing the efficiency of scientific enquiry. EMERALD stores seismic data and metadata in a state-of-the-art open source relational database (PostgreSQL), and can, on a timed basis or on demand, download the most recent metadata, compare it with previously acquired values, and alert the user to changes. The backend relational database is capable of easily storing and managing many millions of records. The extensible, plug-in architecture of the EMERALD system allows any researcher to contribute new visualization and processing methods written in any of 12 programming languages, and a central Internet-enabled repository for such methods provides users with the opportunity to download, use, and modify new processing methods on demand. EMERALD includes data acquisition tools allowing direct importation of seismic data, and also imports data from a number of existing seismic file formats. Pre-processed clean sets of data can be exported as standard sac files with user-defined file naming and directory organization, for use with existing processing codes. The EMERALD system incorporates existing acquisition and processing tools, including SOD, TauP, GMT, and FISSURES/DHI, making much of the functionality of those tools available in a unified system with a user-friendly web browser interface. EMERALD is now in beta test. See emerald.asu.edu or contact john.d.west@asu.edu for more details.
WMT: The CSDMS Web Modeling Tool
NASA Astrophysics Data System (ADS)
Piper, M.; Hutton, E. W. H.; Overeem, I.; Syvitski, J. P.
2015-12-01
The Community Surface Dynamics Modeling System (CSDMS) has a mission to enable model use and development for research in earth surface processes. CSDMS strives to expand the use of quantitative modeling techniques, promotes best practices in coding, and advocates for the use of open-source software. To streamline and standardize access to models, CSDMS has developed the Web Modeling Tool (WMT), a RESTful web application with a client-side graphical interface and a server-side database and API that allows users to build coupled surface dynamics models in a web browser on a personal computer or a mobile device, and run them in a high-performance computing (HPC) environment. With WMT, users can: design a model from a set of components; edit component parameters; save models to a web-accessible server; share saved models with the community; submit runs to an HPC system; and download simulation results. The WMT client is an Ajax application written in Java with GWT, which allows developers to employ object-oriented design principles and development tools such as Ant, Eclipse and JUnit. For deployment on the web, the GWT compiler translates Java code to optimized and obfuscated JavaScript. The WMT client is supported on Firefox, Chrome, Safari, and Internet Explorer. The WMT server, written in Python and SQLite, is a layered system, with each layer exposing a web service API: wmt-db, a database of component, model, and simulation metadata and output; wmt-api, which configures and connects components; and wmt-exe, which launches simulations on remote execution servers. The database server provides, as JSON-encoded messages, the metadata for users to couple model components, including descriptions of component exchange items, uses and provides ports, and input parameters. Execution servers are network-accessible computational resources, ranging from HPC systems to desktop computers, containing the CSDMS software stack for running a simulation. Once a simulation completes, its output, in NetCDF, is packaged and uploaded to a data server where it is stored and from which a user can download it as a single compressed archive file.
The HYPE Open Source Community
NASA Astrophysics Data System (ADS)
Strömbäck, Lena; Arheimer, Berit; Pers, Charlotta; Isberg, Kristina
2013-04-01
The Hydrological Predictions for the Environment (HYPE) model is a dynamic, semi-distributed, process-based, integrated catchment model (Lindström et al., 2010). It uses well-known hydrological and nutrient transport concepts and can be applied for both small and large scale assessments of water resources and status. In the model, the landscape is divided into classes according to soil type, vegetation and altitude. The soil representation is stratified and can be divided in up to three layers. Water and substances are routed through the same flow paths and storages (snow, soil, groundwater, streams, rivers, lakes) considering turn-over and transformation on the way towards the sea. In Sweden, the model is used by water authorities to fulfil the Water Framework Directive and the Marine Strategy Framework Directive. It is used for characterization, forecasts, and scenario analyses. Model data can be downloaded for free from three different HYPE applications: Europe (www.smhi.se/e-hype), Baltic Sea basin (www.smhi.se/balt-hype), and Sweden (vattenweb.smhi.se). The HYPE OSC (hype.sourceforge.net) is an open source initiative under the Lesser GNU Public License taken by SMHI to strengthen international collaboration in hydrological modelling and hydrological data production. The hypothesis is that more brains and more testing will result in better models and better code. The code is transparent and can be changed and learnt from. New versions of the main code will be delivered frequently. The main objective of the HYPE OSC is to provide public access to a state-of-the-art operational hydrological model and to encourage hydrologic expertise from different parts of the world to contribute to model improvement. HYPE OSC is open to everyone interested in hydrology, hydrological modelling and code development - e.g. scientists, authorities, and consultancies. The HYPE Open Source Community was initiated in November 2011 by a kick-off and workshop with 50 eager participants from twelve different countries. At the beginning of 2013 we will release a new version of the code featuring new and better modularization corresponding to hydrological processes, which will make the code easier to understand and further develop. During 2013 we also plan a new workshop and a HYPE course for everyone interested in the community. Lindström, G., Pers, C.P., Rosberg, R., Strömqvist, J., Arheimer, B. 2010. Development and test of the HYPE (Hydrological Predictions for the Environment) model - A water quality model for different spatial scales. Hydrology Research 41.3-4:295-319.
An Analysis of Open Source Security Software Products Downloads
ERIC Educational Resources Information Center
Barta, Brian J.
2014-01-01
Despite the continued demand for open source security software, a gap in the identification of success factors related to the success of open source security software persists. There are no studies that accurately assess the extent of this persistent gap, particularly with respect to the strength of the relationships of open source software…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marsh, Amber; Harsch, Tim; Pitt, Julie
2007-08-31
The computer side of the IMAGE project consists of a collection of Perl scripts that perform a variety of tasks; scripts are available to insert, update and delete data from the underlying Oracle database, download data from NCBI's GenBank and other sources, and generate data files for download by interested parties. Web scripts make up the tracking interface, and various tools available on the project web-site (image.llnl.gov) provide a search interface to the database.
Proof Compression and the Mobius PCC Architecture for Embedded Devices
NASA Technical Reports Server (NTRS)
Jensen, Thomas
2009-01-01
The EU Mobius project has been concerned with the security of Java applications, and of mobile devices such as smart phones that execute such applications. In this talk, I'll give a brief overview of the results obtained on on-device checking of various security-related program properties. I'll then describe in more detail how the concept of certified abstract interpretation and abstraction-carrying code can be applied to polyhedral-based analysis of Java byte code in order to verify properties pertaining to the usage of resources of a downloaded application. Particular emphasis has been on finding ways of reducing the size of the certificates that accompany a piece of code.
Wolff, Hans-Georg; Preising, Katja
2005-02-01
To ease the interpretation of higher order factor analysis, the direct relationships between variables and higher order factors may be calculated by the Schmid-Leiman solution (SLS; Schmid & Leiman, 1957). This simple transformation of higher order factor analysis orthogonalizes first-order and higher order factors and thereby allows the interpretation of the relative impact of factor levels on variables. The Schmid-Leiman solution may also be used to facilitate theorizing and scale development. The rationale for the procedure is presented, supplemented by syntax codes for SPSS and SAS, since the transformation is not part of most statistical programs. Syntax codes may also be downloaded from www.psychonomic.org/archive/.
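Since the transformation itself is a pair of matrix products, it can also be stated outside SPSS or SAS. The following numpy sketch illustrates the standard Schmid-Leiman algebra for one higher-order level; it is an illustration of the published procedure, not the syntax distributed with the paper:

```python
import numpy as np

def schmid_leiman(L1, L2):
    # L1: variables x first-order factors; L2: first-order factors x general factor(s).
    general = L1 @ L2                    # direct loadings on the higher-order factor(s)
    u2 = 1.0 - np.sum(L2**2, axis=1)     # residual variance of each first-order factor
    group = L1 * np.sqrt(u2)             # residualized group-factor loadings
    return general, group
```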
NASA Astrophysics Data System (ADS)
Fraser, Ryan; Gross, Lutz; Wyborn, Lesley; Evans, Ben; Klump, Jens
2015-04-01
Recent investments in HPC, cloud and Petascale data stores have dramatically increased the scale and resolution at which earth science challenges can now be tackled. These new infrastructures are highly parallelised, and to fully utilise them and access the large volumes of earth science data now available, a new approach to software stack engineering needs to be developed. The size, complexity and cost of the new infrastructures mean any software deployed has to be reliable, trusted and reusable. Increasingly software is available via open source repositories, but these usually only enable code to be discovered and downloaded. As a user, it is hard for a scientist to judge the suitability and quality of individual codes: rarely is there information on how and where codes can be run, what the critical dependencies are, and in particular, on the version requirements and licensing of the underlying software stack. A trusted software framework is proposed to enable reliable software to be discovered, accessed and then deployed on multiple hardware environments. More specifically, this framework will enable those who generate the software, and those who fund the development of software, to gain credit for the effort, IP, time and dollars spent, and facilitate quantification of the impact of individual codes. For scientific users, the framework delivers reviewed and benchmarked scientific software with mechanisms to reproduce results. The trusted framework will have five separate, but connected components: Register, Review, Reference, Run, and Repeat. 1) The Register component will facilitate discovery of relevant software from multiple open source code repositories. The registration process should include information about licensing and the hardware environments the code can be run on, define appropriate validation (testing) procedures, and list the critical dependencies. 2) The Review component targets verification of the software, typically against a set of benchmark cases. This will be achieved by linking the code in the software framework to peer review forums such as Mozilla Science or appropriate journals (e.g. Geoscientific Model Development) to help users know which codes to trust. 3) Referencing will be accomplished by linking the Software Framework to groups such as Figshare or ImpactStory that help disseminate and measure the impact of scientific research, including program code. 4) The Run component will draw on information supplied in the registration process, benchmark cases described in the review, and other relevant information to instantiate the scientific code on the selected environment. 5) The Repeat component will tap into existing provenance workflow engines that automatically capture information relating to a particular run of the software, including identification of all input and output artefacts, and all elements and transactions within that workflow. The proposed trusted software framework will enable users to rapidly discover and access reliable code, reduce the time to deploy it, and greatly facilitate sharing, reuse and reinstallation of code. Properly designed, it could scale out to massively parallel systems and be accessed nationally and internationally for multiple use cases, including supercomputer centres, cloud facilities, and local computers.
SORTA: a system for ontology-based re-coding and technical annotation of biomedical phenotype data.
Pang, Chao; Sollie, Annet; Sijtsma, Anna; Hendriksen, Dennis; Charbon, Bart; de Haan, Mark; de Boer, Tommy; Kelpin, Fleur; Jetten, Jonathan; van der Velde, Joeri K; Smidt, Nynke; Sijmons, Rolf; Hillege, Hans; Swertz, Morris A
2015-01-01
There is an urgent need to standardize the semantics of biomedical data values, such as phenotypes, to enable comparative and integrative analyses. However, it is unlikely that all studies will use the same data collection protocols. As a result, retrospective standardization is often required, which involves matching of original (unstructured or locally coded) data to widely used coding or ontology systems such as SNOMED CT (clinical terms), ICD-10 (International Classification of Disease) and HPO (Human Phenotype Ontology). This data curation is usually a time-consuming process performed by a human expert. To help mechanize this process, we have developed SORTA, a computer-aided system for rapidly encoding free text or locally coded values to a formal coding system or ontology. SORTA matches original data values (uploaded in semicolon-delimited format) to a target coding system (uploaded in Excel spreadsheet, OWL ontology web language or OBO open biomedical ontologies format). It then semi-automatically shortlists candidate codes for each data value using Lucene and n-gram based matching algorithms, and can also learn from matches chosen by human experts. We evaluated SORTA's applicability in two use cases. For the LifeLines biobank, we used SORTA to recode 90 000 free text values (including 5211 unique values) about physical exercise to MET (Metabolic Equivalent of Task) codes. For the CINEAS clinical symptom coding system, we used SORTA to map to HPO, enriching HPO when necessary (315 terms matched so far). Out of the shortlists at rank 1, we found a precision/recall of 0.97/0.98 in LifeLines and of 0.58/0.45 in CINEAS. More importantly, users found the tool both a major time saver and a quality improvement because SORTA reduced the chances of human mistakes. Thus, SORTA can dramatically ease data (re)coding tasks and we believe it will prove useful for many more projects. Database URL: http://molgenis.org/sorta or as an open source download from http://www.molgenis.org/wiki/SORTA. © The Author(s) 2015. Published by Oxford University Press.
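The n-gram matching SORTA uses to shortlist candidate codes can be illustrated in a few lines. This toy similarity (not SORTA's Lucene-backed implementation) scores a free-text value against a candidate term by character-bigram overlap:

```python
def ngrams(s, n=2):
    s = s.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def similarity(a, b):
    # Jaccard overlap of character bigrams, a crude stand-in for the
    # lexical matching used to rank candidate codes for a data value.
    A, B = ngrams(a), ngrams(b)
    return len(A & B) / len(A | B) if A | B else 0.0

print(similarity("walking briskly", "brisk walking"))
```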
An interactive web application for the dissemination of human systems immunology data.
Speake, Cate; Presnell, Scott; Domico, Kelly; Zeitner, Brad; Bjork, Anna; Anderson, David; Mason, Michael J; Whalen, Elizabeth; Vargas, Olivia; Popov, Dimitry; Rinchai, Darawan; Jourde-Chiche, Noemie; Chiche, Laurent; Quinn, Charlie; Chaussabel, Damien
2015-06-19
Systems immunology approaches have proven invaluable in translational research settings. The current rate at which large-scale datasets are generated presents unique challenges and opportunities. Mining aggregates of these datasets could accelerate the pace of discovery, but new solutions are needed to integrate the heterogeneous data types with the contextual information that is necessary for interpretation. In addition, enabling tools and technologies facilitating investigators' interaction with large-scale datasets must be developed in order to promote insight and foster knowledge discovery. State-of-the-art application programming was employed to develop an interactive web application for browsing and visualizing large and complex datasets. A collection of human immune transcriptome datasets were loaded alongside contextual information about the samples. We provide a resource enabling interactive query and navigation of transcriptome datasets relevant to human immunology research. Detailed information about studies and samples is displayed dynamically; if desired, the associated data can be downloaded. Custom interactive visualizations of the data can be shared via email or social media. This application can be used to browse context-rich systems-scale data within and across systems immunology studies. This resource is publicly available online at the Gene Expression Browser landing page (https://gxb.benaroyaresearch.org/dm3/landing.gsp). The source code is also openly available (https://github.com/BenaroyaResearch/gxbrowser). We have developed a data browsing and visualization application capable of navigating increasingly large and complex datasets generated in the context of immunological studies. This intuitive tool ensures that, whether taken individually or as a whole, such datasets generated at great effort and expense remain interpretable and a ready source of insight for years to come.
An Evolving Ecosystem for Natural Language Processing in Department of Veterans Affairs.
Garvin, Jennifer H; Kalsy, Megha; Brandt, Cynthia; Luther, Stephen L; Divita, Guy; Coronado, Gregory; Redd, Doug; Christensen, Carrie; Hill, Brent; Kelly, Natalie; Treitler, Qing Zeng
2017-02-01
In an ideal clinical Natural Language Processing (NLP) ecosystem, researchers and developers would be able to collaborate with others, undertake validation of NLP systems, components, and related resources, and disseminate them. We captured requirements and formative evaluation data from the Veterans Affairs (VA) Clinical NLP Ecosystem stakeholders using semi-structured interviews and meeting discussions. We developed a coding rubric to code interviews. We assessed inter-coder reliability using percent agreement and the kappa statistic. We undertook 15 interviews and held two workshop discussions. The main areas of requirements related to design and functionality, resources, and information. Stakeholders also confirmed the vision of the second generation of the Ecosystem, and recommendations included adding mechanisms to better understand terms, measuring collaboration to demonstrate value, and datasets/tools to navigate spelling errors with consumer language, among others. Stakeholders also recommended the capability to communicate with developers working on the next version of the VA electronic health record (VistA Evolution), a mechanism to automatically monitor downloads of tools, and automatic summaries of the downloads for Ecosystem contributors and funders. After three rounds of coding and discussion, we determined the percent agreement of two coders to be 97.2% and the kappa to be 0.7851. The vision of the VA Clinical NLP Ecosystem met stakeholder needs. Interviews and discussion provided key requirements that inform the design of the VA Clinical NLP Ecosystem.
SunPy: Python for Solar Physics
NASA Astrophysics Data System (ADS)
Bobra, M.; Inglis, A. R.; Mumford, S.; Christe, S.; Freij, N.; Hewett, R.; Ireland, J.; Martinez Oliveros, J. C.; Reardon, K.; Savage, S. L.; Shih, A. Y.; Pérez-Suárez, D.
2017-12-01
SunPy is a community-developed open-source software library for solar physics. It is written in Python, a free, cross-platform, general-purpose, high-level programming language that is being increasingly adopted throughout the scientific community. SunPy aims to provide the software for obtaining and analyzing solar and heliospheric data. This poster introduces a new major release, SunPy version 0.8. The first major new feature introduced is Fido, the new primary interface to download data. It provides a consistent and powerful search interface to all major data providers including the VSO and the JSOC, as well as individual data sources such as GOES XRS time series. It is also easy to add new data sources as they become available, e.g., DKIST. The second major new feature is the SunPy coordinate framework. This provides a powerful way of representing coordinates, allowing simple and intuitive conversion between coordinate systems and viewpoints of different instruments (e.g., Solar Orbiter and the Parker Solar Probe), including transformation to astrophysical frames like ICRS. Other new features include new timeseries capabilities with better support for concatenation and metadata, updated documentation, and an expanded example gallery. SunPy is distributed through pip and conda and all of its code is publicly available (sunpy.org).
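A search-and-fetch with Fido, the interface highlighted above, looks roughly like this; attribute names such as the instrument spelling have shifted between SunPy releases, so treat the details as indicative rather than tied to version 0.8:

```python
from sunpy.net import Fido, attrs as a

# Query all registered clients for GOES XRS data over a one-day window,
# then download the matching files.
result = Fido.search(a.Time("2011-06-07", "2011-06-08"), a.Instrument("XRS"))
files = Fido.fetch(result)
print(files)
```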
NASA Astrophysics Data System (ADS)
Álvarez del Castillo, Alejandra; Alaniz-Álvarez, Susana Alicia; Nieto-Samaniego, Angel Francisco; Xu, Shunshan; Ochoa-González, Gil Humberto; Velasquillo-Martínez, Luis Germán
2017-07-01
In the oil, gas and geothermal industry, the extraction or injection of fluids induces changes in the stress field of the reservoir. If the in-situ stress state of a fault plane is sufficiently disturbed, the fault may slip and trigger fluid leakage, or the reservoir might fracture and become damaged. The goal of the SSLIPO 1.0 software is to obtain data that can reduce the risk of affecting the stability of wellbores. The input data are the magnitudes of the three principal stresses and their orientation in geographic coordinates. The output data are the slip direction of a fracture in geographic coordinates, and its normal (σn) and shear (τ) stresses resolved on a single or multiple fracture planes. With this information, it is possible to calculate the slip tendency (τ/σn) and the propensity to open a fracture, which is inversely proportional to σn. This software can analyze any compressional stress system, even non-Andersonian. An example is given from an oilfield in southern Mexico, in a region that contains fractures formed in three events of deformation. In the example, SSLIPO 1.0 was used to determine in which deformation event the oil migrated. SSLIPO 1.0 is an open code application developed in MATLAB. The URLs to obtain the source code and to download SSLIPO 1.0 are: http://www.geociencias.unam.mx/ alaniz/main_code.txt, http://www.geociencias.unam.mx/ alaniz/ SSLIPO_pkg.exe.
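The quantities SSLIPO reports follow from Cauchy's relation. The sketch below is not the MATLAB application itself, just a generic illustration of resolving normal stress, shear stress, and slip tendency on a plane whose unit normal is given in the principal-stress frame:

```python
import numpy as np

def resolve_stress(sigma1, sigma2, sigma3, normal):
    n = np.asarray(normal, dtype=float)
    n /= np.linalg.norm(n)
    S = np.diag([sigma1, sigma2, sigma3])      # stress tensor in principal axes
    t = S @ n                                  # traction vector on the plane
    sigma_n = float(n @ t)                     # normal stress
    tau = float(np.sqrt(t @ t - sigma_n**2))   # shear stress magnitude
    return sigma_n, tau, tau / sigma_n         # last value is the slip tendency

# Example: principal stresses of 60, 40, 20 MPa; plane at 45 degrees to sigma1.
print(resolve_stress(60.0, 40.0, 20.0, normal=[1.0, 0.0, 1.0]))
```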
Photo-z-SQL: Integrated, flexible photometric redshift computation in a database
NASA Astrophysics Data System (ADS)
Beck, R.; Dobos, L.; Budavári, T.; Szalay, A. S.; Csabai, I.
2017-04-01
We present a flexible template-based photometric redshift estimation framework, implemented in C#, that can be seamlessly integrated into a SQL database (or DB) server and executed on-demand in SQL. The DB integration eliminates the need to move large photometric datasets outside a database for redshift estimation, and utilizes the computational capabilities of DB hardware. The code is able to perform both maximum likelihood and Bayesian estimation, and can handle inputs of variable photometric filter sets and corresponding broad-band magnitudes. It is possible to take into account the full covariance matrix between filters, and filter zero points can be empirically calibrated using measurements with given redshifts. The list of spectral templates and the prior can be specified flexibly, and the expensive synthetic magnitude computations are done via lazy evaluation, coupled with a caching of results. Parallel execution is fully supported. For large upcoming photometric surveys such as the LSST, the ability to perform in-place photo-z calculation would be a significant advantage. Also, the efficient handling of variable filter sets is a necessity for heterogeneous databases, for example the Hubble Source Catalog, and for cross-match services such as SkyQuery. We illustrate the performance of our code on two reference photo-z estimation testing datasets, and provide an analysis of execution time and scalability with respect to different configurations. The code is available for download at https://github.com/beckrob/Photo-z-SQL.
SinEx DB: a database for single exon coding sequences in mammalian genomes.
Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S
2016-01-01
Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33, and the complete dataset of SEG sequences and their functional predictions is available for downloading. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) carrying out BLAST searches against the in-house SinEx DB of SEGs and (iii) an advanced search mode in which the database can be searched by keywords and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl. © The Author(s) 2016. Published by Oxford University Press.
The Athena Astrophysical MHD Code in Cylindrical Geometry
NASA Astrophysics Data System (ADS)
Skinner, M. A.; Ostriker, E. C.
2011-10-01
We have developed a method for implementing cylindrical coordinates in the Athena MHD code (Skinner & Ostriker 2010). The extension has been designed to alter the existing Cartesian-coordinates code (Stone et al. 2008) as minimally and transparently as possible. The numerical equations in cylindrical coordinates are formulated to maintain consistency with constrained transport, a central feature of the Athena algorithm, while making use of previously implemented code modules such as the eigensystems and Riemann solvers. Angular-momentum transport, which is critical in astrophysical disk systems dominated by rotation, is treated carefully. We describe modifications for cylindrical coordinates of the higher-order spatial reconstruction and characteristic evolution steps as well as the finite-volume and constrained transport updates. Finally, we have developed a test suite of standard and novel problems in one, two, and three dimensions designed to validate our algorithms and implementation and to be of use to other code developers. The code is suitable for use in a wide variety of astrophysical applications and is freely available for download on the web.
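The geometric heart of such an extension is that the radial flux divergence carries factors of r. A one-dimensional finite-volume sketch in Python of the update for ∂U/∂t + (1/r)∂(rF)/∂r = 0 (a schematic of the idea under simplified assumptions, not Athena's C implementation) is:

```python
import numpy as np

def radial_fv_step(U, F_iface, r_iface, dt):
    """One explicit finite-volume step for dU/dt + (1/r) d(r F)/dr = 0.

    U:       cell-averaged conserved variables, shape (n,)
    F_iface: interface fluxes from a Riemann solver, shape (n+1,)
    r_iface: interface radii, shape (n+1,)
    The cell "volume" (r_+^2 - r_-^2)/2 and the r-weighted interface fluxes
    are what distinguish this update from its Cartesian counterpart.
    """
    vol = 0.5 * (r_iface[1:] ** 2 - r_iface[:-1] ** 2)
    return U - dt / vol * (r_iface[1:] * F_iface[1:]
                           - r_iface[:-1] * F_iface[:-1])
```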
NASA Astrophysics Data System (ADS)
Veerraju, R. P. S. P.; Rao, A. Srinivasa; Murali, G.
2010-10-01
Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior. It improves internal code structure without altering external functionality by transforming functions and rethinking algorithms, and it is an iterative process. Refactorings include reducing scope, replacing complex instructions with simpler or built-in instructions, and combining multiple statements into one statement. Code transformed with refactoring techniques is faster to change, execute, and download, which makes refactoring an excellent practice for programmers wanting to improve their productivity. Refactoring is similar to performance optimization, which is also a behavior-preserving transformation. It also helps us find bugs when we are trying to fix a bug in difficult-to-understand code: by cleaning things up, we make it easier to expose the bug. Refactoring improves the quality of application design and implementation. In general, there are three cases concerning refactoring: iterative refactoring, refactoring only when necessary, and not refactoring at all. Martin Fowler identifies four key reasons to refactor: refactoring improves the design of software, makes software easier to understand, helps us find bugs, and helps us program faster. There is an additional benefit: refactoring changes the way a developer thinks about the implementation. There are three types of refactoring, illustrated below. 1) Code refactoring: often referred to simply as refactoring, this is the refactoring of programming source code. 2) Database refactoring: a simple change to a database schema that improves its design while retaining both its behavioral and informational semantics. 3) User interface (UI) refactoring: a simple change to the UI which retains its semantics. In conclusion, the benefits of refactoring are that it improves the design of software, makes software easier to understand, cleans up the code, helps us find bugs, and helps us program faster.
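A small hypothetical Python before-and-after makes the "code refactoring" case concrete: the behavior is identical, but the duplicated loops are collapsed, built-ins replace manual accumulation, and an extracted helper names the intent.

```python
# Before: duplicated loops, manual accumulation, unclear intent.
def report(orders):
    total = 0
    for o in orders:
        if o["status"] == "paid":
            total = total + o["amount"]
    count = 0
    for o in orders:
        if o["status"] == "paid":
            count = count + 1
    return "paid: " + str(count) + ", total: " + str(total)

# After: one extracted, well-named helper plus built-ins; same behavior.
def paid_orders(orders):
    return [o for o in orders if o["status"] == "paid"]

def report(orders):
    paid = paid_orders(orders)
    return f"paid: {len(paid)}, total: {sum(o['amount'] for o in paid)}"
```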
Free for All: Open Source Software
ERIC Educational Resources Information Center
Schneider, Karen
2008-01-01
Open source software has become a catchword in libraryland. Yet many remain unclear about open source's benefits--or even what it is. So what is open source software (OSS)? It's software that is free in every sense of the word: free to download, free to use, and free to view or modify. Most OSS is distributed on the Web and one doesn't need to…
CellAnimation: an open source MATLAB framework for microscopy assays.
Georgescu, Walter; Wikswo, John P; Quaranta, Vito
2012-01-01
Advances in microscopy technology have led to the creation of high-throughput microscopes that are capable of generating several hundred gigabytes of images in a few days. Analyzing such wealth of data manually is nearly impossible and requires an automated approach. There are at present a number of open-source and commercial software packages that allow the user to apply algorithms of different degrees of sophistication to the images and extract desired metrics. However, the types of metrics that can be extracted are severely limited by the specific image processing algorithms that the application implements, and by the expertise of the user. In most commercial software, code unavailability prevents implementation by the end user of newly developed algorithms better suited for a particular type of imaging assay. While it is possible to implement new algorithms in open-source software, rewiring an image processing application requires a high degree of expertise. To obviate these limitations, we have developed an open-source high-throughput application that allows implementation of different biological assays such as cell tracking or ancestry recording, through the use of small, relatively simple image processing modules connected into sophisticated imaging pipelines. By connecting modules, non-expert users can apply the particular combination of well-established and novel algorithms developed by us and others that are best suited for each individual assay type. In addition, our data exploration and visualization modules make it easy to discover or select specific cell phenotypes from a heterogeneous population. CellAnimation is distributed under the Creative Commons Attribution-NonCommercial 3.0 Unported license (http://creativecommons.org/licenses/by-nc/3.0/). CellAnimation source code and documentation may be downloaded from www.vanderbilt.edu/viibre/software/documents/CellAnimation.zip. Sample data are available at www.vanderbilt.edu/viibre/software/documents/movies.zip. Contact: walter.georgescu@vanderbilt.edu. Supplementary data are available at Bioinformatics online.
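The module-pipeline idea generalizes beyond MATLAB. A toy Python sketch (the module names and scipy-based steps are illustrative assumptions, not CellAnimation's API) shows how small processing modules compose into an assay:

```python
import numpy as np
from scipy import ndimage

def threshold(state, level=0.5):
    """Module: binarize the image at a fixed level."""
    state["mask"] = state["image"] > level
    return state

def label_cells(state):
    """Module: label connected components of the mask as candidate cells."""
    state["labels"], state["n_cells"] = ndimage.label(state["mask"])
    return state

def run_pipeline(modules, state):
    """An assay is just an ordered list of modules applied to shared state."""
    for module in modules:
        state = module(state)
    return state

result = run_pipeline([threshold, label_cells],
                      {"image": np.random.rand(64, 64)})  # placeholder image
print(result["n_cells"])
```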
Brief Pictorial Description of New Mobile Technologies Used in Cultural Institutions in Japan
ERIC Educational Resources Information Center
Awano, Yumi
2007-01-01
Many Japanese museums and other cultural institutions have been exploring ways to enrich visitors' experiences using new digital devices. This paper briefly describes some examples in Japan, ranging from a PDA, a mobile phone, podcasting, and an audio guide speaker-equipped vest to a Quick Response (QR) code on a brochure for downloading a short…
Yuan, Jie; Xu, Guan; Yu, Yao; Zhou, Yu; Carson, Paul L; Wang, Xueding; Liu, Xiaojun
2013-08-01
Photoacoustic tomography (PAT) offers structural and functional imaging of living biological tissue with highly sensitive optical absorption contrast and excellent spatial resolution comparable to medical ultrasound (US) imaging. We report the development of a fully integrated PAT and US dual-modality imaging system, which performs signal scanning, image reconstruction, and display for both photoacoustic (PA) and US imaging all in a truly real-time manner. The back-projection (BP) algorithm for PA image reconstruction is optimized to reduce the computational cost and facilitate parallel computation on a state-of-the-art graphics processing unit (GPU) card. For the first time, PAT and US imaging of the same object can be conducted simultaneously and continuously, at a real-time frame rate, presently limited by the laser repetition rate of 10 Hz. Noninvasive PAT and US imaging of human peripheral joints in vivo were achieved, demonstrating the satisfactory image quality realized with this system. Another experiment, simultaneous PAT and US imaging of contrast agent flowing through an artificial vessel, was conducted to verify the performance of this system for imaging fast biological events. The GPU-based image reconstruction software code for this dual-modality system is open source and available for download from http://sourceforge.net/projects/patrealtime.
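The BP reconstruction being optimized is, at its core, delay-and-sum. A naive NumPy version conveys the structure (geometry and sampling parameters are placeholders; a GPU implementation parallelizes exactly this per-pixel work):

```python
import numpy as np

def backproject(signals, fs, c, sensor_xy, grid_xy):
    """Naive delay-and-sum photoacoustic back-projection.

    signals:   (n_sensors, n_samples) recorded PA time series
    fs:        sampling rate [Hz]; c: speed of sound [m/s]
    sensor_xy: (n_sensors, 2) sensor positions [m]
    grid_xy:   (n_pixels, 2) image pixel positions [m]
    """
    image = np.zeros(len(grid_xy))
    for s, pos in zip(signals, sensor_xy):
        d = np.linalg.norm(grid_xy - pos, axis=1)        # pixel-sensor distances
        idx = np.clip((d / c * fs).astype(int), 0, s.size - 1)
        image += s[idx]                                  # sum the delayed samples
    return image
```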
ExcelAutomat: a tool for systematic processing of files as applied to quantum chemical calculations
NASA Astrophysics Data System (ADS)
Laloo, Jalal Z. A.; Laloo, Nassirah; Rhyman, Lydia; Ramasami, Ponnadurai
2017-07-01
The processing of the input and output files of quantum chemical calculations often necessitates a spreadsheet as a key component of the workflow. Spreadsheet packages with a built-in programming language editor can automate the steps involved and thus provide a direct link between processing files and the spreadsheet. This helps to reduce user-interventions as well as the need to switch between different programs to carry out each step. The ExcelAutomat tool is the implementation of this method in Microsoft Excel (MS Excel) using the default Visual Basic for Application (VBA) programming language. The code in ExcelAutomat was adapted to work with the platform-independent open-source LibreOffice Calc, which also supports VBA. ExcelAutomat provides an interface through the spreadsheet to automate repetitive tasks such as merging input files, splitting, parsing and compiling data from output files, and generation of unique filenames. Selected extracted parameters can be retrieved as variables which can be included in custom codes for a tailored approach. ExcelAutomat works with Gaussian files and is adapted for use with other computational packages including the non-commercial GAMESS. ExcelAutomat is available as a downloadable MS Excel workbook or as a LibreOffice workbook.
icoshift: A versatile tool for the rapid alignment of 1D NMR spectra
NASA Astrophysics Data System (ADS)
Savorani, F.; Tomasi, G.; Engelsen, S. B.
2010-02-01
The increasing scientific and industrial interest in metabonomics takes advantage of the high qualitative and quantitative information level of nuclear magnetic resonance (NMR) spectroscopy. However, several chemical and physical factors can affect the absolute and the relative position of an NMR signal, and it is not always possible or desirable to eliminate these effects a priori. To remove misalignment of NMR signals a posteriori, several algorithms have been proposed in the literature. The icoshift program presented here is an open-source, highly efficient program designed for solving signal alignment problems in metabonomic NMR data analysis. The icoshift algorithm is based on correlation shifting of spectral intervals and employs an FFT engine that aligns all spectra simultaneously. The algorithm is demonstrated to be faster than similar methods found in the literature, making full-resolution alignment of large datasets feasible and thus avoiding down-sampling steps such as binning. The algorithm uses missing values as a filling alternative in order to avoid spectral artifacts at the segment boundaries. The algorithm is made open source and the MATLAB code including documentation can be downloaded from www.models.life.ku.dk.
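The central operation, choosing for each spectral interval the integer shift that maximizes its correlation with a target, can be sketched in a few lines of Python (the published tool is MATLAB; this omits icoshift's missing-value filling and its simultaneous FFT-based alignment of whole datasets):

```python
import numpy as np

def best_shift(segment, target, max_shift):
    """Integer lag of `segment` that best correlates with `target`."""
    xcorr = np.correlate(target, segment, mode="full")  # lags -(n-1)..(n-1)
    lags = np.arange(-(len(segment) - 1), len(target))
    keep = np.abs(lags) <= max_shift
    return lags[keep][np.argmax(xcorr[keep])]

def align(spectrum, target, bounds, max_shift=20):
    """Align each interval [i0, i1) of `spectrum` to `target` independently."""
    out = spectrum.copy()
    for i0, i1 in bounds:
        k = best_shift(spectrum[i0:i1], target[i0:i1], max_shift)
        out[i0:i1] = np.roll(spectrum[i0:i1], k)  # icoshift inserts NaNs instead
    return out
```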
A collection of open source applications for mass spectrometry data mining.
Gallardo, Óscar; Ovelleiro, David; Gay, Marina; Carrascal, Montserrat; Abian, Joaquin
2014-10-01
We present several bioinformatics applications for the identification and quantification of phosphoproteome components by MS. These applications include a front-end graphical user interface that combines several Thermo RAW format to MASCOT™ Generic Format extractors (EasierMgf), two graphical user interfaces for the search engines OMSSA and SEQUEST (OmssaGui and SequestGui), and three applications: one for the management of databases in FASTA format (FastaTools), another for the integration of search results from up to three search engines (Integrator), and a third for the visualization of mass spectra and their corresponding database search results (JsonVisor). These applications were developed to solve some of the common problems found in proteomic and phosphoproteomic data analysis and were integrated in the workflow for data processing and feeding of our LymPHOS database. Applications were designed modularly and can be used standalone. These tools are written in the Perl and Python programming languages and are supported on Windows platforms. They are all released under an Open Source Software license and can be freely downloaded from our software repository hosted at GoogleCode. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
EnviroAtlas - Industrial Water Demand by 12-Digit HUC for the Conterminous United States
This EnviroAtlas dataset includes industrial water demand attributes which provide insight into the amount of water currently used for manufacturing and production of commodities in the contiguous United States. The values are based on 2005 water demand and Dun and Bradstreet's 2009/2010 source data, and have been summarized by watershed or 12-digit hydrologic unit code (HUC). For the purposes of this metric, industrial water use includes chemical, food, paper, wood, and metal production. The industrial water is for self-supplied only such as by private wells or reservoirs. Sources include either surface water or groundwater. This dataset was produced by the US EPA to support research and online mapping activities related to the EnviroAtlas. EnviroAtlas (https://www.epa.gov/enviroatlas) allows the user to interact with a web-based, easy-to-use, mapping application to view and analyze multiple ecosystem services for the contiguous United States. The dataset is available as downloadable data (https://edg.epa.gov/data/Public/ORD/EnviroAtlas) or as an EnviroAtlas map service. Additional descriptive information about each attribute in this dataset can be found in its associated EnviroAtlas Fact Sheet (https://www.epa.gov/enviroatlas/enviroatlas-fact-sheets).
IPeak: An open source tool to combine results from multiple MS/MS search engines.
Wen, Bo; Du, Chaoqin; Li, Guilin; Ghali, Fawaz; Jones, Andrew R; Käll, Lukas; Xu, Shaohang; Zhou, Ruo; Ren, Zhe; Feng, Qiang; Xu, Xun; Wang, Jun
2015-09-01
Liquid chromatography coupled tandem mass spectrometry (LC-MS/MS) is an important technique for detecting peptides in proteomics studies. Here, we present an open source software tool, termed IPeak, a peptide identification pipeline that is designed to combine the Percolator post-processing algorithm and multi-search strategy to enhance the sensitivity of peptide identifications without compromising accuracy. IPeak provides a graphical user interface (GUI) as well as a command-line interface, which is implemented in JAVA and can work on all three major operating system platforms: Windows, Linux/Unix and OS X. IPeak has been designed to work with the mzIdentML standard from the Proteomics Standards Initiative (PSI) as an input and output, and has also been fully integrated into the associated mzidLibrary project, providing access to the overall pipeline, as well as modules for calling Percolator on individual search engine result files. The integration thus enables IPeak (and Percolator) to be used in conjunction with any software packages implementing the mzIdentML data standard. IPeak is freely available and can be downloaded under an Apache 2.0 license at https://code.google.com/p/mzidentml-lib/. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
PsyScript: a Macintosh application for scripting experiments.
Bates, Timothy C; D'Oliveiro, Lawrence
2003-11-01
PsyScript is a scriptable application allowing users to describe experiments in Apple's compiled high-level object-oriented AppleScript language, while still supporting millisecond or better within-trial event timing (delays can be in milliseconds or refresh-based, and PsyScript can wait on external I/O, such as eye movement fixations). Because AppleScript is object oriented and system-wide, PsyScript experiments support complex branching, code reuse, and integration with other applications. Included AppleScript-based libraries support file handling and stimulus randomization and sampling, as well as more specialized tasks, such as adaptive testing. Advanced features include support for the BBox serial port button box, as well as a low-cost USB-based digital I/O card for millisecond timing, recording of any number and types of responses within a trial, novel responses, such as graphics tablet drawing, and use of the Macintosh sound facilities to provide an accurate voice key, saving voice responses to disk, scriptable image creation, support for flicker-free animation, and gaze-dependent masking. The application is open source, allowing researchers to enhance the feature set and verify internal functions. Both the application and the source are available for free download at www.maccs.mq.edu.au/-tim/psyscript/.
Ensemble Eclipse: A Process for Prefab Development Environment for the Ensemble Project
NASA Technical Reports Server (NTRS)
Wallick, Michael N.; Mittman, David S.; Shams, Khawaja S.; Bachmann, Andrew G.; Ludowise, Melissa
2013-01-01
This software simplifies the process of having to set up an Eclipse IDE programming environment for the members of the cross-NASA center project, Ensemble. It achieves this by assembling all the necessary add-ons and custom tools/preferences. This software is unique in that it allows developers in the Ensemble Project (approximately 20 to 40 at any time) across multiple NASA centers to set up a development environment almost instantly and work on Ensemble software. The software automatically has the source code repositories and other vital information and settings included. The Eclipse IDE is an open-source development framework. The NASA (Ensemble-specific) version of the software includes Ensemble-specific plug-ins as well as settings for the Ensemble project. This software saves developers the time and hassle of setting up a programming environment, making sure that everything is set up in the correct manner for Ensemble development. Existing software (i.e., standard Eclipse) requires an intensive setup process that is both time-consuming and error prone. This software is built once by a single user and tested, allowing other developers to simply download and use the software.
Forth, Thomas; McConkey, Glenn A; Westhead, David R
2010-09-15
An application has been developed to help with the creation and editing of Systems Biology Markup Language (SBML) format metabolic networks up to the organism scale. Networks are defined as a collection of Kyoto Encyclopedia of Genes and Genomes (KEGG) LIGAND reactions with an optional associated Enzyme Classification (EC) number for each reaction. Additional custom reactions can be defined by the user. Reactions within the network can be assigned flux constraints and compartmentalization is supported for each reaction in addition to the support for reactions that occur across compartment boundaries. Exported networks are fully SBML L2V4 compatible with an optional L2V1 export for compatibility with old versions of the COBRA toolbox. The software runs in the free Microsoft Access 2007 Runtime (Microsoft Inc.), which is included with the installer and works on Windows XP SP2 or better. Full source code is viewable in the full version of Access 2007 or 2010. Users must have a license to use the KEGG LIGAND database (free academic licensing is available). Please go to www.bioinformatics.leeds.ac.uk/~pytf/metnetmaker for software download, help and tutorials.
Sodium 3D COncentration MApping (COMA 3D) using 23Na and proton MRI
NASA Astrophysics Data System (ADS)
Truong, Milton L.; Harrington, Michael G.; Schepkin, Victor D.; Chekmenev, Eduard Y.
2014-10-01
Functional changes of sodium 3D MRI signals were converted into millimolar concentration changes using an open-source, fully automated MATLAB toolbox. These concentration changes are visualized via 3D sodium concentration maps, and they are overlaid on conventional 3D proton images to provide high-resolution co-registration for easy correlation of functional changes to anatomical regions. Nearly 5000 concentration maps per hour were generated on a personal computer (ca. 2012) using 21.1 T 3D sodium MRI brain images of live rats with a spatial resolution of 0.8 × 0.8 × 0.8 mm3 and imaging matrices of 60 × 60 × 60. The produced concentration maps allowed for non-invasive quantitative measurement of in vivo sodium concentration in the normal rat brain as a functional response to migraine-like conditions. The presented work can also be applied to sodium-associated changes in migraine, cancer, and other metabolic abnormalities that can be sensed by molecular imaging. The MATLAB toolbox allows for automated image analysis of the 3D images acquired on the Bruker platform and can be extended to other imaging platforms. The resulting images are presented as a series of 2D slices in all three dimensions in native MATLAB and PDF formats. The following is provided: (a) MATLAB source code for image processing, (b) the detailed processing procedures, (c) description of the code and all sub-routines, (d) example data sets of initial and processed data. The toolbox can be downloaded at: http://www.vuiis.vanderbilt.edu/truongm/COMA3D/.
The Open Spectral Database: an open platform for sharing and searching spectral data.
Chalk, Stuart J
2016-01-01
A number of websites make spectral data available for download (typically as JCAMP-DX text files), and one (ChemSpider) also allows users to contribute spectral files. As a result, searching and retrieving such spectral data can be time consuming, and the data can be difficult to reuse if they are compressed in the JCAMP-DX file. What is needed is a single resource that allows submission of JCAMP-DX files, export of the raw data in multiple formats, and searching based on multiple chemical identifiers, and that is open in terms of license and access. To address these issues a new online resource called the Open Spectral Database (OSDB) http://osdb.info/ has been developed and is now available. Built using open source tools, using open code (hosted on GitHub), providing open data, and open to community input about design and functionality, the OSDB is available for anyone to submit spectral data, making it searchable and available to the scientific community. This paper details the concept and coding, internal architecture, export formats, Representational State Transfer (REST) Application Programming Interface and options for submission of data. The OSDB website went live in November 2015. Concurrently, the GitHub repository was made available at https://github.com/stuchalk/OSDB/, and is open for collaborators to join the project, submit issues, and contribute code. The combination of a scripting environment (PHPStorm), a PHP framework (CakePHP), a relational database (MySQL) and a code repository (GitHub) provides all the capabilities to easily develop REST based websites for ingestion, curation and exposure of open chemical data to the community at all levels. It is hoped this software stack (or equivalent ones in other scripting languages) will be leveraged to make more chemical data available for both humans and computers.
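Programmatic access via the REST API might look like the following; the route and query parameter here are hypothetical placeholders, and the real endpoints are documented at http://osdb.info/.

```python
import requests

# Hypothetical REST query against the OSDB; consult http://osdb.info/ for
# the actual route names, parameters, and response shapes.
resp = requests.get("http://osdb.info/api/spectra",          # placeholder route
                    params={"inchikey": "UHOVQNZJYSORNB-UHFFFAOYSA-N"})
resp.raise_for_status()
for spectrum in resp.json():   # response shape is illustrative
    print(spectrum)
```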
GEO2D - Two-Dimensional Computer Model of a Ground Source Heat Pump System
James Menart
2013-06-07
This file contains a zipped file that contains the many files required to run GEO2D. GEO2D is a computer code for simulating ground source heat pump (GSHP) systems in two dimensions. GEO2D performs a detailed finite difference simulation of the heat transfer occurring within the working fluid, the tube wall, the grout, and the ground. Both horizontal and vertical wells can be simulated with this program, but it should be noted that the vertical well is modeled as a single tube. The program also models the heat pump in conjunction with the heat transfer occurring; GEO2D simulates the heat pump and ground loop as a system. Many results are produced by GEO2D as a function of time and position, such as heat transfer rates, temperatures and heat pump performance. On top of this, GEO2D performs an economic comparison between the simulated geothermal system and a comparable air heat pump system, or comparable gas, oil or propane heating systems with a vapor-compression air conditioner. The version of GEO2D in the attached file has been coupled to the DOE heating and cooling load software called ENERGYPLUS. This is a great convenience for the user because heating and cooling loads are an input to GEO2D. GEO2D is a user-friendly program that uses a graphical user interface for inputs and outputs. These make entering data simple, and the program produces many plotted results that are easy to understand. In order to run GEO2D, access to MATLAB is required. If MATLAB is not available on your computer, you can download the program MCRInstaller.exe (the 64-bit version) from the MATLAB website or from this geothermal depository. This free download will enable you to run GEO2D.
MetaPhinder—Identifying Bacteriophage Sequences in Metagenomic Data Sets
Jurtz, Vanessa Isabell; Villarroel, Julia; Lund, Ole; Voldby Larsen, Mette; Nielsen, Morten
2016-01-01
Bacteriophages are the most abundant biological entity on the planet, but at the same time do not account for much of the genetic material isolated from most environments due to their small genome sizes. They also show great genetic diversity and mosaic genomes, making it challenging to analyze and understand them. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e., contigs) of phage origin in metagenomic data sets. The method is based on a comparison to a database of whole-genome bacteriophage sequences, integrating hits to multiple genomes to accommodate the mosaic genome structure of many bacteriophages. The method is demonstrated to outperform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology https://cge.cbs.dtu.dk/services/MetaPhinder/, while the source code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder. PMID:27684958
ExoLocator--an online view into genetic makeup of vertebrate proteins.
Khoo, Aik Aun; Ogrizek-Tomas, Mario; Bulovic, Ana; Korpar, Matija; Gürler, Ece; Slijepcevic, Ivan; Šikic, Mile; Mihalek, Ivana
2014-01-01
ExoLocator (http://exolocator.eopsf.org) collects in a single place information needed for comparative analysis of protein-coding exons from vertebrate species. The main source of data--the genomic sequences, and the existing exon and homology annotation--is the ENSEMBL database of completed vertebrate genomes. To these, ExoLocator adds the search for ostensibly missing exons in orthologous protein pairs across species, using an extensive computational pipeline to narrow down the search region for the candidate exons and find a suitable template in the other species, as well as state-of-the-art implementations of pairwise alignment algorithms. The resulting complements of exons are organized in a way currently unique to ExoLocator: multiple sequence alignments, both on the nucleotide and on the peptide levels, clearly indicating the exon boundaries. The alignments can be inspected in the web-embedded viewer, downloaded or used on the spot to produce an estimate of conservation within orthologous sets, or functional divergence across paralogues.
BigWig and BigBed: enabling browsing of large distributed datasets.
Kent, W J; Zweig, A S; Barber, G; Hinrichs, A S; Karolchik, D
2010-09-01
BigWig and BigBed files are compressed binary indexed files containing data at several resolutions that allow the high-performance display of next-generation sequencing experiment results in the UCSC Genome Browser. The visualization is implemented using a multi-layered software approach that takes advantage of specific capabilities of web-based protocols and Linux and UNIX operating system files, R trees and various indexing and compression tricks. As a result, only the data needed to support the current browser view is transmitted rather than the entire file, enabling fast remote access to large distributed data sets. Binaries for the BigWig and BigBed creation and parsing utilities may be downloaded at http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/. Source code for the creation and visualization software is freely available for non-commercial use at http://hgdownload.cse.ucsc.edu/admin/jksrc.zip, implemented in C and supported on Linux. The UCSC Genome Browser is available at http://genome.ucsc.edu.
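The creation and parsing utilities themselves are C binaries, but the random-access property is easy to demonstrate from Python with the third-party pyBigWig bindings (the file name below is a placeholder): only the index plus the queried window is read, not the whole file.

```python
import pyBigWig

bw = pyBigWig.open("signal.bw")                 # local path or an http:// URL
print(bw.header())                              # summary built from the index
print(bw.stats("chr1", 1_000_000, 1_010_000))   # mean over just that window
print(bw.values("chr1", 1_000_000, 1_000_010))  # per-base values for the slice
bw.close()
```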
Azahar: a PyMOL plugin for construction, visualization and analysis of glycan molecules
NASA Astrophysics Data System (ADS)
Arroyuelo, Agustina; Vila, Jorge A.; Martin, Osvaldo A.
2016-08-01
Glycans are key molecules in many physiological and pathological processes. As with other molecules, like proteins, visualization of the 3D structures of glycans adds valuable information for understanding their biological function. Hence, here we introduce Azahar, a computing environment for the creation, visualization and analysis of glycan molecules. Azahar is implemented in Python and works as a plugin for the well-known PyMOL package (Schrodinger in The PyMOL molecular graphics system, version 1.3r1, 2010). Besides the visualization and analysis options already provided by PyMOL, Azahar includes 3 cartoon-like representations and tools for 3D structure characterization such as a conformational search using a Monte Carlo with minimization routine, and also tools to analyse single glycans or trajectories/ensembles including the calculation of radius of gyration, Ramachandran plots and hydrogen bonds. Azahar is freely available to download from http://www.pymolwiki.org/index.php/Azahar and the source code is available at https://github.com/agustinaarroyuelo/Azahar.
Estimating population diversity with CatchAll
Bunge, John; Woodard, Linda; Böhning, Dankmar; Foster, James A.; Connolly, Sean; Allen, Heather K.
2012-01-01
Motivation: The massive data produced by next-generation sequencing require advanced statistical tools. We address estimating the total diversity or species richness in a population. To date, only relatively simple methods have been implemented in available software. There is a need for software employing modern, computationally intensive statistical analyses including error, goodness-of-fit and robustness assessments. Results: We present CatchAll, a fast, easy-to-use, platform-independent program that computes maximum likelihood estimates for finite-mixture models, weighted linear regression-based analyses and coverage-based non-parametric methods, along with outlier diagnostics. Given sample ‘frequency count’ data, CatchAll computes 12 different diversity estimates and applies a model-selection algorithm. CatchAll also derives discounted diversity estimates to adjust for possibly uncertain low-frequency counts. It is accompanied by an Excel-based graphics program. Availability: Free executable downloads for Linux, Windows and Mac OS, with manual and source code, at www.northeastern.edu/catchall. Contact: jab18@cornell.edu PMID:22333246
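CatchAll's mixture models are beyond a snippet, but a minimal estimator of the same non-parametric genre, the classical Chao1 lower bound computed from the same kind of frequency-count input, fits in a few lines of Python (an illustration of the estimator family, not CatchAll's code):

```python
def chao1(frequency_counts):
    """Chao1 richness estimate from {frequency: number of species seen that often}.

    A simple non-parametric estimator related to CatchAll's coverage-based
    methods, not CatchAll's finite-mixture models themselves.
    """
    s_obs = sum(frequency_counts.values())
    f1 = frequency_counts.get(1, 0)                # singletons
    f2 = frequency_counts.get(2, 0)                # doubletons
    if f2 == 0:
        return s_obs + f1 * (f1 - 1) / 2.0         # bias-corrected variant
    return s_obs + f1 * f1 / (2.0 * f2)

print(chao1({1: 10, 2: 5, 3: 3, 7: 1}))           # 19 observed -> 29.0 estimated
```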
SIFTER search: a web server for accurate phylogeny-based protein function prediction
Sahraeian, Sayed M.; Luo, Kevin R.; Brenner, Steven E.
2015-05-15
We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. Lastly, the SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.
IMAGE EXPLORER: Astronomical Image Analysis on an HTML5-based Web Application
NASA Astrophysics Data System (ADS)
Gopu, A.; Hayashi, S.; Young, M. D.
2014-05-01
Large datasets produced by recent astronomical imagers cause the traditional paradigm for basic visual analysis - typically downloading one's entire image dataset and using desktop clients like DS9, Aladin, etc. - to no longer scale, despite advances in desktop computing power and storage. This paper describes Image Explorer, a web framework that offers several of the basic visualization and analysis functions commonly provided by tools like DS9, on any HTML5-capable web browser on various platforms. It uses a combination of the modern HTML5 canvas, JavaScript, and several layers of lossless PNG tiles produced from the FITS image data. Astronomers are able to rapidly and simultaneously open several images in their web browser; adjust the intensity min/max cutoff, its scaling function, and the zoom level; apply color-maps; view position and FITS header information; execute typically used data reduction codes on the corresponding FITS data using the FRIAA framework; and overlay tiles for source catalog objects, etc.
Faster sequence homology searches by clustering subsequences.
Suzuki, Shuji; Kakuta, Masanori; Ishida, Takashi; Akiyama, Yutaka
2015-04-15
Sequence homology searches are used in various fields. New sequencing technologies produce huge amounts of sequence data, which continuously increase the size of sequence databases. As a result, homology searches require large amounts of computational time, especially for metagenomic analysis. We developed a fast homology search method based on database subsequence clustering, and implemented it as GHOSTZ. This method clusters similar subsequences from a database to perform an efficient seed search and ungapped extension by reducing alignment candidates based on the triangle inequality. The database subsequence clustering technique achieved an ∼2-fold increase in speed without a large decrease in search sensitivity. When we measured with metagenomic data, GHOSTZ is ∼2.2-2.8 times faster than RAPSearch and is ∼185-261 times faster than BLASTX. The source code is freely available for download at http://www.bi.cs.titech.ac.jp/ghostz/. Contact: akiyama@cs.titech.ac.jp. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
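The pruning logic can be paraphrased in Python: if a query is far from a cluster representative, the triangle inequality bounds its distance to every member, so the whole cluster is skipped without examination (a schematic with a generic metric standing in for GHOSTZ's actual seed-similarity model):

```python
def candidates(query, clusters, dist, threshold):
    """clusters: list of (representative, radius, members); dist: a metric.

    By the triangle inequality, dist(query, m) >= dist(query, rep) - radius
    for every member m, so any cluster whose lower bound exceeds the
    threshold is pruned without touching its members.
    """
    hits = []
    for rep, radius, members in clusters:
        if dist(query, rep) - radius > threshold:
            continue                       # no member can be within threshold
        hits.extend(m for m in members if dist(query, m) <= threshold)
    return hits
```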
ChromA: signal-based retention time alignment for chromatography–mass spectrometry data
Hoffmann, Nils; Stoye, Jens
2009-01-01
Summary: We describe ChromA, a web-based alignment tool for chromatography–mass spectrometry data from the metabolomics and proteomics domains. Users can supply their data in open and standardized file formats for retention time alignment using dynamic time warping with different configurable local distance and similarity functions. Additionally, user-defined anchors can be used to constrain and speedup the alignment. A neighborhood around each anchor can be added to increase the flexibility of the constrained alignment. ChromA offers different visualizations of the alignment for easier qualitative interpretation and comparison of the data. For the multiple alignment of more than two data files, the center-star approximation is applied to select a reference among input files to align to. Availability: ChromA is available at http://bibiserv.techfak.uni-bielefeld.de/chroma. Executables and source code under the L-GPL v3 license are provided for download at the same location. Contact: stoye@techfak.uni-bielefeld.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19505941
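For reference, the dynamic-programming recurrence behind dynamic time warping is compact enough to state in full. A textbook Python version follows; ChromA's configurable distance functions, anchors, and neighborhood constraints are omitted.

```python
import numpy as np

def dtw(x, y, d=lambda a, b: abs(a - b)):
    """Dynamic time warping cost between sequences x and y.

    D[i, j] is the cheapest alignment of x[:i] with y[:j]; each cell extends
    the best of a match, an insertion, or a deletion.
    """
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = d(x[i - 1], y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```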
Optimized data fusion for K-means Laplacian clustering
Yu, Shi; Liu, Xinhai; Tranchevent, Léon-Charles; Glänzel, Wolfgang; Suykens, Johan A. K.; De Moor, Bart; Moreau, Yves
2011-01-01
Motivation: We propose a novel algorithm to combine multiple kernels and Laplacians for clustering analysis. The new algorithm is formulated on a Rayleigh quotient objective function and is solved as a bi-level alternating minimization procedure. Using the proposed algorithm, the coefficients of kernels and Laplacians can be optimized automatically. Results: Three variants of the algorithm are proposed. The performance is systematically validated on two real-life data fusion applications. The proposed Optimized Kernel Laplacian Clustering (OKLC) algorithms perform significantly better than other methods. Moreover, the coefficients of kernels and Laplacians optimized by OKLC show some correlation with the rank of performance of individual data source. Though in our evaluation the K values are predefined, in practical studies, the optimal cluster number can be consistently estimated from the eigenspectrum of the combined kernel Laplacian matrix. Availability: The MATLAB code of algorithms implemented in this paper is downloadable from http://homes.esat.kuleuven.be/~sistawww/bioi/syu/oklc.html. Contact: shiyu@uchicago.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:20980271
ChemoPy: freely available python package for computational biology and chemoinformatics.
Cao, Dong-Sheng; Xu, Qing-Song; Hu, Qian-Nan; Liang, Yi-Zeng
2013-04-15
Molecular representation for small molecules has been routinely used in QSAR/SAR, virtual screening, database search, ranking, drug ADME/T prediction and other drug discovery processes. To facilitate extensive studies of drug molecules, we developed a freely available, open-source python package called chemoinformatics in python (ChemoPy) for calculating the commonly used structural and physicochemical features. It computes 16 drug feature groups composed of 19 descriptors that include 1135 descriptor values. In addition, it provides seven types of molecular fingerprint systems for drug molecules, including topological fingerprints, electro-topological state (E-state) fingerprints, MACCS keys, FP4 keys, atom pairs fingerprints, topological torsion fingerprints and Morgan/circular fingerprints. By applying a semi-empirical quantum chemistry program MOPAC, ChemoPy can also compute a large number of 3D molecular descriptors conveniently. The python package, ChemoPy, is freely available via http://code.google.com/p/pychem/downloads/list, and it runs on Linux and MS-Windows. Supplementary data are available at Bioinformatics online.
BioPartsBuilder: a synthetic biology tool for combinatorial assembly of biological parts.
Yang, Kun; Stracquadanio, Giovanni; Luo, Jingchuan; Boeke, Jef D; Bader, Joel S
2016-03-15
Combinatorial assembly of DNA elements is an efficient method for building large-scale synthetic pathways from standardized, reusable components. These methods are particularly useful because they enable assembly of multiple DNA fragments in one reaction, at the cost of requiring that each fragment satisfies design constraints. We developed BioPartsBuilder as a biologist-friendly web tool to design biological parts that are compatible with DNA combinatorial assembly methods, such as Golden Gate and related methods. It retrieves biological sequences, enforces compliance with assembly design standards and provides a fabrication plan for each fragment. BioPartsBuilder is accessible at http://public.biopartsbuilder.org and an Amazon Web Services image is available from the AWS Market Place (AMI ID: ami-508acf38). Source code is released under the MIT license, and available for download at https://github.com/baderzone/biopartsbuilder joel.bader@jhu.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.
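One representative design constraint that Golden Gate tools enforce is the absence of internal Type IIS recognition sites. A minimal Python check for BsaI sites (recognition sequence GGTCTC) conveys the idea; this helper is an illustration, not BioPartsBuilder's code.

```python
BSAI = "GGTCTC"  # BsaI recognition sequence

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def internal_bsai_sites(seq):
    """Positions of BsaI sites on either strand; a Golden Gate part must
    contain none of these internally."""
    seq = seq.upper()
    sites = (BSAI, revcomp(BSAI))
    return [i for i in range(len(seq) - len(BSAI) + 1)
            if seq[i:i + len(BSAI)] in sites]

print(internal_bsai_sites("ATGGTCTCAAGAGACCAT"))  # hits on both strands
```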
The Plotting Library http://astroplotlib.stsci.edu
NASA Astrophysics Data System (ADS)
Úbeda, L.
2014-05-01
astroplotlib is a multi-language astronomical library of plots. It is a collection of software templates that are useful for creating paper-quality figures. All current templates are coded in IDL, and some also in Python and Mathematica. This free resource, supported at the Space Telescope Science Institute, allows users to download any plot and customize it to their own needs. It is also intended as an educational tool.
Common spaceborne multicomputer operating system and development environment
NASA Technical Reports Server (NTRS)
Craymer, L. G.; Lewis, B. F.; Hayes, P. J.; Jones, R. L.
1994-01-01
A preliminary technical specification for a multicomputer operating system is developed. The operating system is targeted for spaceborne flight missions and provides a broad range of real-time functionality, dynamic remote code-patching capability, and system fault tolerance and long-term survivability features. Dataflow concepts are used for representing application algorithms. Functional features are included to ensure real-time predictability for a class of algorithms which require data-driven execution on an iterative steady state basis. The development environment supports the development of algorithm code, design of control parameters, performance analysis, simulation of real-time dataflow applications, and compiling and downloading of the resulting application.
Uptake of an Incentive-Based mHealth App: Process Evaluation of the Carrot Rewards App
Oh, Paul; Alter, David; Leahey, Tricia; Kwan, Matthew; Faulkner, Guy
2017-01-01
Background Behavioral economics has stimulated renewed interest in financial health incentives worldwide. The Carrot Rewards app was developed as part of a public-private partnership to reward Canadians with loyalty points (eg, movies and groceries) for downloading the app, referring friends, and completing an average of 1 to 2 educational health quizzes per week (“micro-learning”), with long-term objectives of increasing health knowledge and encouraging healthy behaviors. Objective The main objective of this study was to evaluate uptake of a loyalty points-based mHealth app during the exclusive 3-month launch period in British Columbia (BC), Canada. The secondary aims were to describe the health and sociodemographic characteristics of users, as well as participation levels (eg, proportion of quizzes completed and friends referred). Methods The app was promoted via loyalty program email campaigns (1.64 million emails). Number of downloads and registrations (users enter age, gender, and valid BC postal code to register) were collected. Additional sociodemographics were inferred by linking postal codes with census data at the local health area (LHA) level. Health risk assessments were also deployed. Participation levels were collected over 3 months and descriptive data were presented. Results In 3 months, 67,464 individuals downloaded the app; in its first week, Carrot Rewards was the most downloaded health app in Canada. Among valid users (n=57,885; at least one quiz completed), the majority were female (62.96%; 36,446/57,885) and aged 18 to 34 years (54.34%; 31,459/57,885). More than half of the users (52.40%; 30,332/57,885) resided in LHAs where the median personal income was below the provincial average (Can $28,765). Furthermore, 64.42% (37,291/57,885) of users lived in metropolitan (ie, urban) LHAs, compared with 56.17% of the general BC population. The most prevalent risk factors were “not” meeting physical activity guidelines (72.70%; 31,765/43,692) and “not” getting the flu shot last year (67.69%; 30,286/44,739). Regarding participation, 60.05% (34,761/57,885) of users were classified as “very high” engagers (>75% quiz completion rate). Conclusions Early results suggest that loyalty points may promote mHealth app uptake. The app was downloaded by younger females especially, and BC residents from higher and lower income regions were equally represented. Loyalty points appear to have driven participation throughout the inaugural 3-month period (ie, quiz completion). PMID:28559224
Inventory of Data Sources for Estimating Health Care Costs in the United States
Lund, Jennifer L.; Yabroff, K. Robin; Ibuka, Yoko; Russell, Louise B.; Barnett, Paul G.; Lipscomb, Joseph; Lawrence, William F.; Brown, Martin L.
2011-01-01
Objective To develop an inventory of data sources for estimating health care costs in the United States and provide information to aid researchers in identifying appropriate data sources for their specific research questions. Methods We identified data sources for estimating health care costs using 3 approaches: (1) a review of the 18 articles included in this supplement, (2) an evaluation of websites of federal government agencies, nonprofit foundations, and related societies that support health care research or provide health care services, and (3) a systematic review of the recently published literature. Descriptive information was abstracted from each data source, including sponsor, website, lowest level of data aggregation, type of data source, population included, cross-sectional or longitudinal data capture, source of diagnosis information, and cost of obtaining the data source. Details about the cost elements available in each data source were also abstracted. Results We identified 88 data sources that can be used to estimate health care costs in the United States. Most data sources were sponsored by government agencies, national or nationally representative, and cross-sectional. About 40% were surveys, followed by administrative or linked administrative data, fee or cost schedules, discharges, and other types of data. Diagnosis information was available in most data sources through procedure or diagnosis codes, self-report, registry, or chart review. Cost elements included inpatient hospitalizations (42.0%), physician and other outpatient services (45.5%), outpatient pharmacy or laboratory (28.4%), out-of-pocket (22.7%), patient time and other direct nonmedical costs (35.2%), and wages (13.6%). About half were freely available for downloading or available for a nominal fee, and the cost of obtaining the remaining data sources varied by the scope of the project. Conclusions Available data sources vary in population included, type of data source, scope, and accessibility, and have different strengths and weaknesses for specific research questions. PMID:19536009
PathVisio 3: an extendable pathway analysis toolbox.
Kutmon, Martina; van Iersel, Martijn P; Bohler, Anwesha; Kelder, Thomas; Nunes, Nuno; Pico, Alexander R; Evelo, Chris T
2015-02-01
PathVisio is a commonly used pathway editor, visualization and analysis software. Biological pathways have been used by biologists for many years to describe the detailed steps in biological processes. Those powerful, visual representations help researchers to better understand, share and discuss knowledge. Since the first publication of PathVisio in 2008, the original paper has been cited more than 170 times and PathVisio has been used in many different biological studies. As an online editor PathVisio is also integrated in the community-curated pathway database WikiPathways. Here we present the third version of PathVisio with the newest additions and improvements of the application. The core features of PathVisio are pathway drawing, advanced data visualization and pathway statistics. Additionally, PathVisio 3 introduces a new powerful extension system that allows other developers to contribute additional functionality in the form of plugins without changing the core application. PathVisio can be downloaded from http://www.pathvisio.org and in 2014 PathVisio 3 was downloaded over 5,500 times. There are already more than 15 plugins available in the central plugin repository. PathVisio is a freely available, open-source tool published under the Apache 2.0 license (http://www.apache.org/licenses/LICENSE-2.0). It is implemented in Java and thus runs on all major operating systems. The code repository is available at http://svn.bigcat.unimaas.nl/pathvisio. The support mailing list for users is available on https://groups.google.com/forum/#!forum/wikipathways-discuss and for developers on https://groups.google.com/forum/#!forum/wikipathways-devel.
Design of FPGA ICA for hyperspectral imaging processing
NASA Astrophysics Data System (ADS)
Nordin, Anis; Hsu, Charles C.; Szu, Harold H.
2001-03-01
The remote sensing problem that uses hyperspectral imaging can be transformed into a blind source separation problem. Using this model, hyperspectral imagery can be de-mixed into sub-pixel spectra which indicate the different materials present in the pixel. This can be further used to deduce areas which contain forest, water or biomass, without even knowing the sources which constitute the image. This form of remote sensing allows previously blurred images to show the specific terrain involved in that region. The blind source separation problem can be implemented using an Independent Component Analysis (ICA) algorithm. The ICA algorithm has previously been successfully implemented using software packages such as MATLAB, which has a downloadable version of FastICA. The challenge now lies in implementing it in hardware, or firmware, in order to improve its computational speed. Hardware implementation also solves the insufficient-memory problem encountered by software packages like MATLAB when employing ICA for high-resolution images and a large number of channels. Here, a pipelined firmware solution, realized using FPGAs, is drawn out and simulated using C. Since C code can be translated into HDLs or be used directly on the FPGAs, it can be used to simulate the actual hardware implementation. The simulated results of the program are presented here, where seven channels are used to model the 200 different channels involved in hyperspectral imaging.
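The software baseline the paper starts from is easy to reproduce with scikit-learn's FastICA. The following sketch illustrates the blind-source-separation model on placeholder data (it is not the FPGA design): each pixel's spectrum is treated as a mixture, and the unmixed components become per-material maps of the scene.

```python
import numpy as np
from sklearn.decomposition import FastICA

# cube: (rows, cols, n_channels) hyperspectral image; random placeholder data
# stands in for the seven simulated channels mentioned in the abstract.
cube = np.random.rand(64, 64, 7)
X = cube.reshape(-1, cube.shape[-1])     # one spectral mixture per pixel

ica = FastICA(n_components=3, random_state=0)
S = ica.fit_transform(X)                 # (n_pixels, 3) independent sources
abundance_maps = S.reshape(64, 64, 3)    # sub-pixel material maps
```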
NASA Astrophysics Data System (ADS)
Hasenkopf, C. A.
2017-12-01
Increasingly, open data, open-source projects are unearthing rich datasets and tools that were previously impossible for more traditional avenues to generate. These projects are possible, in part, because of the emergence of online collaborative and code-sharing tools, decreasing costs of cloud-based services to fetch, store, and serve data, and the increasing interest of individuals in contributing their time and skills to 'open projects.' While such projects have generated palpable enthusiasm from many sectors, many of them face uncharted paths to sustainability, visibility, and acceptance. Our project, OpenAQ, is an example of an open-source, open data community that is currently forging its own path. OpenAQ is an open air quality data platform that aggregates and universally formats government and research-grade air quality data from 50 countries across the world. To date, we make available more than 76 million air quality (PM2.5, PM10, SO2, NO2, O3, CO, and black carbon) data points through an open Application Programming Interface (API) and a user-customizable download interface at https://openaq.org. The goal of the platform is to enable an ecosystem of users to advance air pollution efforts from science to policy to the private sector. The platform is also an open-source project (https://github.com/openaq) and has only been made possible through the coding and data contributions of individuals around the world. In our first two years of existence, we have seen requests to our API skyrocket to more than 6 million data points per month, with use cases as varied as ingesting data aggregated by our system into real-time wildfire models, building open-source statistical packages (e.g., ropenaq and py-openaq) on top of the platform, and creating public-friendly apps and chatbots. We will share a whirlwind tour through our evolution and the many lessons learned so far related to platform structure, community engagement, organizational model, and sustainability.
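The open API mentioned above can be queried over plain HTTP. The sketch below uses the Python requests library against what we assume is the v1 measurements endpoint; the endpoint path and parameter names are assumptions based on the platform described here, so consult the current OpenAQ documentation before relying on them:

```python
# Hedged sketch of pulling measurements from the OpenAQ API.
import requests

resp = requests.get(
    "https://api.openaq.org/v1/measurements",   # assumed v1 endpoint
    params={"parameter": "pm25", "country": "IN", "limit": 100},
    timeout=30,
)
resp.raise_for_status()
for rec in resp.json().get("results", []):
    print(rec["location"], rec["value"], rec["unit"], rec["date"]["utc"])
```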
NASA Technical Reports Server (NTRS)
Ancheta, T. C., Jr.
1976-01-01
A method of using error-correcting codes to obtain data compression, called syndrome-source-coding, is described in which the source sequence is treated as an error pattern whose syndrome forms the compressed data. It is shown that syndrome-source-coding can achieve arbitrarily small distortion with the number of compressed digits per source digit arbitrarily close to the entropy of a binary memoryless source. A 'universal' generalization of syndrome-source-coding is formulated which provides robustly effective distortionless coding of source ensembles. Two examples are given, comparing the performance of noiseless universal syndrome-source-coding to (1) run-length coding and (2) Lynch-Davisson-Schalkwijk-Cover universal coding for an ensemble of binary memoryless sources.
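As a concrete illustration of the idea, the toy sketch below compresses a sparse 7-bit source block to its 3-bit syndrome under the (7,4) Hamming code's parity-check matrix, and decompresses by finding the minimum-weight (most probable) pattern with that syndrome. This is a reconstruction of the scheme described above for teaching purposes, not the paper's own code:

```python
# Syndrome source coding with the (7,4) Hamming code: a 7-bit block s is
# "compressed" to the 3-bit syndrome H s^T (mod 2); the decoder returns the
# lowest-weight pattern with that syndrome, recovering s exactly whenever
# s contains at most one 1 (a sparse, low-entropy source).
import numpy as np
from itertools import product

H = np.array([[1, 0, 1, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1, 1],
              [0, 0, 0, 1, 1, 1, 1]])    # parity-check matrix of Hamming(7,4)

def compress(s):
    return H @ s % 2                      # 7 source bits -> 3 syndrome bits

def decompress(syndrome):
    # Exhaustive search for the minimum-weight pattern with this syndrome.
    best = None
    for bits in product([0, 1], repeat=7):
        s = np.array(bits)
        if np.array_equal(H @ s % 2, syndrome) and (best is None or s.sum() < best.sum()):
            best = s
    return best

source = np.array([0, 0, 0, 0, 1, 0, 0])  # sparse block from a low-entropy source
print(compress(source), decompress(compress(source)))
```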
Popular Mobile Phone Apps for Diet and Weight Loss: A Content Analysis.
Zaidan, Sarah; Roehrer, Erin
2016-07-11
A review of the literature has revealed that the rates of overweight and obesity have been increasing in Australia over the last two decades and that wellness mobile phone apps play a significant role in monitoring and managing individuals' weight. Although mobile phone app markets (iTunes and Google Play) list thousands of mobile phone health apps, it is not always clear whether those apps are supported by credible sources. Likewise, despite the prevailing use of mobile phone apps to aid with weight management, the usability features of these apps are not well characterized. The research explored how usability taxonomy could inform the popularity of downloaded, socially focused wellness mobile phone apps, in particular weight loss and diet apps. The aim of the study was to investigate the Australian mobile phone app stores (iTunes and Google Play) in order to examine the usability features of the most popular (ie, most downloaded) wellness apps. The design of this study comprises 3 main stages: stage 1, identifying apps; stage 2, development of weight loss and diet evaluation framework; and stage 3, application of the evaluation framework. Each stage includes specific data collection, analysis tools, and techniques. The study has resulted in the development of a justified evaluation framework for weight loss and diet mobile phone apps. Applying the evaluation framework to the identified apps has shown that the most downloaded iTunes and Google Play apps are not necessarily the most usable or effective. In addition, the research found that search algorithms for iTunes and Google Play are biased toward apps' titles and keywords that do not accurately define the real functionality of the app. Moreover, the study has also analyzed the apps' user reviews, which served as justification for the developed evaluation framework. The analysis has shown that ease of use, reminder, bar code scanning, motivation, usable for all, and synchronization are significant attributes that should be included in weight loss and diet mobile phone apps and ultimately in potential weight loss and diet evaluation frameworks.
Popular Mobile Phone Apps for Diet and Weight Loss: A Content Analysis
Roehrer, Erin
2016-01-01
Background A review of the literature has revealed that the rates of overweight and obesity have been increasing in Australia over the last two decades and that wellness mobile phone apps play a significant role in monitoring and managing individuals’ weight. Although mobile phone app markets (iTunes and Google Play) list thousands of mobile phone health apps, it is not always clear whether those apps are supported by credible sources. Likewise, despite the prevailing use of mobile phone apps to aid with weight management, the usability features of these apps are not well characterized. Objective The research explored how usability taxonomy could inform the popularity of downloaded, socially focused wellness mobile phone apps, in particular weight loss and diet apps. The aim of the study was to investigate the Australian mobile phone app stores (iTunes and Google Play) in order to examine the usability features of the most popular (ie, most downloaded) wellness apps. Methods The design of this study comprises 3 main stages: stage 1, identifying apps; stage 2, development of weight loss and diet evaluation framework; and stage 3, application of the evaluation framework. Each stage includes specific data collection, analysis tools, and techniques. Results The study has resulted in the development of a justified evaluation framework for weight loss and diet mobile phone apps. Applying the evaluation framework to the identified apps has shown that the most downloaded iTunes and Google Play apps are not necessarily the most usable or effective. In addition, the research found that search algorithms for iTunes and Google Play are biased toward apps’ titles and keywords that do not accurately define the real functionality of the app. Moreover, the study has also analyzed the apps’ user reviews, which served as justification for the developed evaluation framework. Conclusions The analysis has shown that ease of use, reminder, bar code scanning, motivation, usable for all, and synchronization are significant attributes that should be included in weight loss and diet mobile phone apps and ultimately in potential weight loss and diet evaluation frameworks. PMID:27400806
Forde, Arnell S.; Dadisman, Shawn V.; Kindinger, Jack G.; Miselis, Jennifer L.; Wiese, Dana S.; Buster, Noreen A.
2012-01-01
In September of 2010, the U.S. Geological Survey (USGS), in cooperation with the U.S. Army Corps of Engineers (USACE), conducted a geophysical survey to investigate the geologic controls on barrier island framework of Cat Island, Miss., as part of a broader USGS study on Barrier Island Mapping (BIM). These surveys were funded through the Mississippi Coastal Improvements Program (MsCIP) and the Northern Gulf of Mexico (NGOM) Ecosystem Change and Hazard Susceptibility Project as part of the Holocene Coastal Evolution of the Mississippi-Alabama Region Subtask. This report serves as an archive of unprocessed digital chirp subbottom data, trackline maps, navigation files, GIS files, Field Activity Collection System (FACS) logs, and formal FGDC metadata. Gained (showing a relative increase in signal amplitude) digital images of the seismic profiles are also provided. Refer to the Acronyms page for expansions of acronyms and abbreviations used in this report. The USGS Saint Petersburg Coastal and Marine Science Center (SPCMSC) assigns a unique identifier to each cruise or field activity. For example, 10BIM04 tells us the data were collected in 2010 during the fourth field activity for that project in that calendar year. Refer to http://walrus.wr.usgs.gov/infobank/programs/html/definition/activity.html for a detailed description of the method used to assign the field activity identification (ID). All chirp systems use a signal of continuously varying frequency; the EdgeTech SB-512i system used during this survey produces high-resolution, shallow-penetration (typically less than 50 milliseconds (ms)) profile images of sub-seafloor stratigraphy. The towfish contains a transducer that transmits and receives acoustic energy; it was housed within a float system (built at the SPCMSC), which allows the towfish to be towed at a constant depth of 1.07 meters (m) below the sea surface. As transmitted acoustic energy intersects density boundaries, such as the seafloor or sub-surface sediment layers, some energy is reflected back toward the transducer, received, and recorded by a Personal Computer (PC)-based seismic acquisition system. This process is repeated at regular time intervals (for example, 0.125 seconds (s)), and returned energy is recorded for a specific duration (for example, 50 ms). In this way, a two-dimensional (2-D) vertical image of the shallow geologic structure beneath the ship track is produced. Figure 1 displays the acquisition geometry. Refer to table 1 for a summary of acquisition parameters and table 2 for trackline statistics. The archived trace data are in standard Society of Exploration Geophysicists (SEG) SEG Y rev. 0 format (Barry and others, 1975); the first 3,200 bytes of the card image header are in American Standard Code for Information Interchange (ASCII) format instead of Extended Binary Coded Decimal Interchange Code (EBCDIC) format. The SEG Y files may be downloaded and processed with commercial or public domain software such as Seismic Unix (SU) (Cohen and Stockwell, 2010). See the How To Download SEG Y Data page for download instructions. The printable profiles provided here are GIF images that were processed and gained using SU software, and they can be viewed from the Profiles page or from links located on the trackline maps; refer to the Software page for links to example SU processing scripts. 
The SEG Y files are available on the DVD version of this report or on the Web, downloadable via the USGS Coastal and Marine Geoscience Data System (http://cmgds.marine.usgs.gov). The data are also available for viewing using GeoMapApp (http://www.geomapapp.org) and Virtual Ocean (http://www.virtualocean.org) multi-platform open source software.
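For readers who want a quick look at a downloaded file before loading it into Seismic Unix, the short Python sketch below reads the 3,200-byte textual file header described above, which in this archive is ASCII rather than EBCDIC; the file name is a placeholder:

```python
# Print the SEG Y textual header: conventionally 40 "cards" of 80 characters.
with open("line01.sgy", "rb") as f:          # placeholder file name
    text_header = f.read(3200).decode("ascii", errors="replace")

for i in range(0, 3200, 80):
    print(text_header[i:i + 80].rstrip())
```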
System Advisor Model (SAM) General Description (Version 2017.9.5)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Freeman, Janine M; DiOrio, Nicholas A; Blair, Nathan J
This document describes the capabilities of the System Advisor Model (SAM) developed and distributed by the U.S. Department of Energy's National Renewable Energy Laboratory. The document is for potential users and others wanting to learn about the model's capabilities. SAM is a techno-economic computer model that calculates performance and financial metrics of renewable energy projects. Project developers, policy makers, equipment manufacturers, and researchers use graphs and tables of SAM results in the process of evaluating financial, technology, and incentive options for renewable energy projects. SAM simulates the performance of photovoltaic, concentrating solar power, solar water heating, wind, geothermal, biomass, and conventional power systems. The financial models are for projects that either buy and sell electricity at retail rates (residential and commercial) or sell electricity at a price determined in a power purchase agreement (PPA). SAM's simulation tools facilitate parametric and sensitivity analyses, Monte Carlo simulation, and weather variability (P50/P90) studies. SAM can also read input variables from Microsoft Excel worksheets. For software developers, the SAM software development kit (SDK) makes it possible to use SAM simulation modules in their applications written in C/C++, C#, Java, Python, MATLAB, and other languages. NREL provides both SAM and the SDK as free downloads at https://sam.nrel.gov. SAM is an open source project, so its source code is available to the public. Researchers can study the code to understand the model algorithms, and software programmers can contribute their own models and enhancements to the project. Technical support and more information about the software are available on the website.
Berillo, Olga; Régnier, Mireille; Ivashchenko, Anatoly
2014-01-01
microRNAs are small RNA molecules that inhibit the translation of target genes. microRNA binding sites are located in the untranslated regions as well as in the coding domains. We describe TmiRUSite and TmiROSite, scripts developed in Python as tools for the extraction of nucleotide sequences of miRNA binding sites together with their encoded amino acid residue sequences. The scripts allow retrieving a set of additional sequences to the left and right of the binding site. The scripts present all retrieved data in table formats that are easy to analyse further. The predicted data find utility in molecular and evolutionary biology studies, for example in studying miRNA binding sites in animals and plants. TmiRUSite and TmiROSite are available for free from the authors upon request and for download at https://sites.google.com/site/malaheenee/downloads.
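An illustrative helper in the spirit of the scripts described above, extracting a binding site plus user-defined flanking context from a sequence; the function name, arguments, and sample sequence are hypothetical, not the scripts' actual interface:

```python
# Extract a site from a sequence with flanks; flanks shown in lower case.
def extract_site(seq, start, end, flank_left=10, flank_right=10):
    """Return seq[start:end] in upper case plus flanking context."""
    left = max(0, start - flank_left)
    right = min(len(seq), end + flank_right)
    return seq[left:start].lower() + seq[start:end].upper() + seq[end:right].lower()

mrna = "AUGGCUUACGGAUCCGAUUACGUAGCUAGCUAAGCUUACG"
print(extract_site(mrna, 12, 20, flank_left=5, flank_right=5))
```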
NASA Astrophysics Data System (ADS)
Miller, C. J.; Gasson, D.; Fuentes, E.
2007-10-01
The NOAO NVO Portal is a web application for one-stop discovery, analysis, and access to VO-compliant imaging data and services. The current release allows for GUI-based discovery of nearly a half million images from archives such as the NOAO Science Archive, the Hubble Space Telescope WFPC2 and ACS instruments, XMM-Newton, Chandra, and ESO's INT Wide-Field Survey, among others. The NOAO Portal allows users to view image metadata, footprint wire-frames, FITS image previews, and provides one-click access to science quality imaging data throughout the entire sky via the Firefox web browser (i.e., no applet or code to download). Users can stage images from multiple archives at the NOAO NVO Portal for quick and easy bulk downloads. The NOAO NVO Portal also provides simplified and direct access to VO analysis services, such as the WESIX catalog generation service. We highlight the features of the NOAO NVO Portal (http://nvo.noao.edu).
What will the future of cloud-based astronomical data processing look like?
NASA Astrophysics Data System (ADS)
Green, Andrew W.; Mannering, Elizabeth; Harischandra, Lloyd; Vuong, Minh; O'Toole, Simon; Sealey, Katrina; Hopkins, Andrew M.
2017-06-01
Astronomy is rapidly approaching an impasse: very large datasets require remote or cloud-based parallel processing, yet many astronomers still try to download the data and develop serial code locally. Astronomers understand the need for change, but the hurdles remain high. We are developing a data archive designed from the ground up to simplify and encourage cloud-based parallel processing. While the volume of data we host remains modest by some standards, it is still large enough that download and processing times are measured in days and even weeks. We plan to implement a Python-based, notebook-like interface that automatically parallelises execution. Our goal is to provide an interface sufficiently familiar and user-friendly that it encourages astronomers to run their analysis on our system in the cloud: astroinformatics as a service. We describe how our system addresses the approaching impasse in astronomy using the SAMI Galaxy Survey as an example.
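A toy sketch of the serial-to-parallel shift described above, using Python's standard multiprocessing module; the per-galaxy analysis function and identifiers are placeholders, not the archive's interface:

```python
# The same per-object analysis run serially (what users often do locally)
# versus fanned out across cores (what a cloud archive could do for them).
from multiprocessing import Pool

def analyse(galaxy_id):
    # Placeholder for a real per-object measurement on archive-hosted data.
    return galaxy_id, sum(ord(c) for c in galaxy_id) % 97

galaxies = [f"SAMI-{i:05d}" for i in range(1000)]

if __name__ == "__main__":
    serial = [analyse(g) for g in galaxies]
    with Pool() as pool:
        parallel = pool.map(analyse, galaxies)
    assert serial == parallel
```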
OpenStructure: a flexible software framework for computational structural biology.
Biasini, Marco; Mariani, Valerio; Haas, Jürgen; Scheuber, Stefan; Schenk, Andreas D; Schwede, Torsten; Philippsen, Ansgar
2010-10-15
Developers of new methods in computational structural biology are often hampered in their research by incompatible software tools and non-standardized data formats. To address this problem, we have developed OpenStructure as a modular open source platform to provide a powerful, yet flexible general working environment for structural bioinformatics. OpenStructure consists primarily of a set of libraries written in C++ with a cleanly designed application programming interface. All functionality can be accessed directly in C++ or in a Python layer, meeting both the requirements for high efficiency and ease of use. Powerful selection queries and the notion of entity views to represent these selections greatly facilitate the development and implementation of algorithms on structural data. The modular integration of computational core methods with powerful visualization tools makes OpenStructure an ideal working and development environment. Several applications, such as the latest versions of IPLT and QMean, have been implemented based on OpenStructure, demonstrating its value for the development of next-generation structural biology algorithms. Source code licensed under the GNU Lesser General Public License and binaries for MacOS X, Linux, and Windows are available for download at http://www.openstructure.org. Contact: torsten.schwede@unibas.ch. Supplementary data are available at Bioinformatics online.
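A minimal sketch of the selection-query workflow described above, run through the Python layer. The call and query names here are assumptions about the OpenStructure API and are not verified against a specific release; the file name is a placeholder:

```python
# Load a structure and build an entity view via a selection query.
from ost import io

ent = io.LoadPDB("model.pdb")                   # placeholder file name
calpha = ent.Select("peptide=true and aname=CA")  # assumed query syntax
print(ent.GetAtomCount(), calpha.GetAtomCount())
```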
An event database for rotational seismology
NASA Astrophysics Data System (ADS)
Salvermoser, Johannes; Hadziioannou, Celine; Hable, Sarah; Chow, Bryant; Krischer, Lion; Wassermann, Joachim; Igel, Heiner
2016-04-01
The ring laser sensor (G-ring) located at Wettzell, Germany, has routinely observed earthquake-induced rotational ground motions around a vertical axis since its installation in 2003. Here we present results from a recently installed event database, the first to provide ring laser event data in an open-access format. Based on the GCMT event catalogue and some search criteria, seismograms from the ring laser and the collocated broadband seismometer are extracted and processed. The ObsPy-based processing scheme generates plots showing waveform fits between rotation rate and transverse acceleration and extracts characteristic wavefield parameters such as peak ground motions, noise levels, Love wave phase velocities, and waveform coherence. For each event, these parameters are stored in a text file (a JSON dictionary) which is easily readable and accessible on the website. The database contains >10,000 events from 2007 onward (Mw>4.5). It is updated daily and therefore provides recent events with a lag of at most 24 hours. The user interface allows filtering events by epoch, magnitude, and source area, whereupon the events are displayed on a zoomable world map. We investigate how well the rotational motions are compatible with the expectations from the surface wave magnitude scale. In addition, the website offers some Python source code examples for downloading and processing the openly accessible waveforms.
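To illustrate consuming one of the per-event JSON parameter files described above, here is a hedged, standard-library-only sketch; the URL and key names are placeholders, since the real layout is defined by the event database itself:

```python
# Fetch one event's JSON parameter file and read two example parameters.
import json
import urllib.request

url = "https://example.org/ringlaser/events/2016-001.json"  # placeholder URL
with urllib.request.urlopen(url) as resp:
    params = json.load(resp)

# Key names are hypothetical stand-ins for the stored wavefield parameters.
print(params.get("peak_rotation_rate"), params.get("love_phase_velocity"))
```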
Syndrome source coding and its universal generalization
NASA Technical Reports Server (NTRS)
Ancheta, T. C., Jr.
1975-01-01
A method of using error-correcting codes to obtain data compression, called syndrome-source-coding, is described in which the source sequence is treated as an error pattern whose syndrome forms the compressed data. It is shown that syndrome-source-coding can achieve arbitrarily small distortion with the number of compressed digits per source digit arbitrarily close to the entropy of a binary memoryless source. A universal generalization of syndrome-source-coding is formulated which provides robustly effective, distortionless coding of source ensembles.
National Air Toxic Assessments (NATA) Results
The National Air Toxics Assessment was conducted by EPA in 2002 to assess air toxics emissions in order to identify and prioritize air toxics, emission source types and locations which are of greatest potential concern in terms of contributing to population risk. This data source provides downloadable information on emissions at the state, county and census tract level.
Simple and detailed conceptual model diagrams and associated narratives for ammonia, dissolved oxygen, flow alteration, herbicides, insecticides, ionic strength, metals, nutrients, pH, physical habitat, sediments, temperature, and unspecified toxic chemicals.
Expression Atlas: gene and protein expression across multiple studies and organisms
Tang, Y Amy; Bazant, Wojciech; Burke, Melissa; Fuentes, Alfonso Muñoz-Pomer; George, Nancy; Koskinen, Satu; Mohammed, Suhaib; Geniza, Matthew; Preece, Justin; Jarnuczak, Andrew F; Huber, Wolfgang; Stegle, Oliver; Brazma, Alvis; Petryszak, Robert
2018-01-01
Abstract Expression Atlas (http://www.ebi.ac.uk/gxa) is an added value database that provides information about gene and protein expression in different species and contexts, such as tissue, developmental stage, disease or cell type. The available public and controlled access data sets from different sources are curated and re-analysed using standardized, open source pipelines and made available for queries, download and visualization. As of August 2017, Expression Atlas holds data from 3,126 studies across 33 different species, including 731 from plants. Data from large-scale RNA sequencing studies including Blueprint, PCAWG, ENCODE, GTEx and HipSci can be visualized next to each other. In Expression Atlas, users can query genes or gene-sets of interest and explore their expression across or within species, tissues, developmental stages in a constitutive or differential context, representing the effects of diseases, conditions or experimental interventions. All processed data matrices are available for direct download in tab-delimited format or as R-data. In addition to the web interface, data sets can now be searched and downloaded through the Expression Atlas R package. Novel features and visualizations include the on-the-fly analysis of gene set overlaps and the option to view gene co-expression in experiments investigating constitutive gene expression across tissues or other conditions. PMID:29165655
NASA Astrophysics Data System (ADS)
De Vecchi, Daniele; Dell'Acqua, Fabio
2016-04-01
The EU FP7 MARSITE project aims at assessing the "state of the art" of seismic risk evaluation and management at the European level, as a starting point for moving a "step forward" towards new concepts of risk mitigation and management through long-term monitoring activities carried out both on land and at sea. Spaceborne Earth Observation (EO) is one of the means through which MARSITE is accomplishing this commitment, whose importance is growing as a consequence of the operational unfolding of the Copernicus initiative. Sentinel-2 data, with its open-data policy, represents an unprecedented opportunity to access global spaceborne multispectral data for various purposes including risk monitoring. In the framework of the EU FP7 projects MARSITE, RASOR, and SENSUM, our group has developed a suite of geospatial software tools to automatically extract risk-related features from EO data, especially on the exposure and vulnerability side of the "risk equation" [1]. These are, for example, the extent of a built-up area or the distribution of building density. These tools are available open-source as QGIS plug-ins [2] and their source code can be freely downloaded from GitHub [3]. A test case on the risk-prone megacity of Istanbul has been set up, and preliminary results will be presented in this paper. The output of the algorithms can be incorporated into a risk modeling process, whose results are very useful to stakeholders and decision makers who intend to assess and mitigate the risk level across the giant urban agglomerate. Keywords - Remote Sensing, Copernicus, Istanbul megacity, seismic risk, multi-risk, exposure, open-source References [1] Harb, M.M.; De Vecchi, D.; Dell'Acqua, F., "Physical Vulnerability Proxies from Remote Sensing: Reviewing, Implementing and Disseminating Selected Techniques," Geoscience and Remote Sensing Magazine, IEEE, vol.3, no.1, pp.20-33, March 2015. doi: 10.1109/MGRS.2015.2398672 [2] SENSUM QGIS plugin, 2016, available online at: https://plugins.qgis.org/plugins/sensum_eo_tools/ [3] SENSUM QGIS code repository, 2016, available online at: https://github.com/SENSUM-project/sensum_rs_qgis
AASG Wells Data for the EGS Test Site Planning and Analysis Task
Augustine, Chad
2013-10-09
AASG Wells Data for the EGS Test Site Planning and Analysis Task. Temperature measurement data obtained from boreholes for the Association of American State Geologists (AASG) geothermal data project. Typically, bottomhole temperatures are recorded from log headers, and this information is provided through a borehole temperature observation service for each state. The service includes header records, well logs, temperature measurements, and other information for each borehole. Information presented in Geothermal Prospector was derived from data aggregated from the borehole temperature observations for all states. For each observation, the given well location was recorded and the best available well identifier (name), temperature, and depth were chosen. The "Well Name Source," "Temp. Type," and "Depth Type" attributes indicate the field used from the original service. The data were then cleaned and converted to consistent units. The accuracy of an observation's location, name, temperature, or depth was not assessed beyond that originally provided by the service.
• AASG bottomhole temperature datasets were downloaded from repository.usgin.org between May 16 and May 24, 2013.
• Datasets were cleaned to remove "null" and non-real entries, and data were converted into consistent units across all datasets.
• Methodology for selecting the "best" temperature, depth, and identifier attributes from column headers in the AASG BHT datasets (see the sketch after this record):
  • Temperature: CorrectedTemperature (best), then MeasuredTemperature.
  • Depth: DepthOfMeasurement (best), then TrueVerticalDepth, then DrillerTotalDepth.
  • Well name/identifier: APINo (best), then WellName, then ObservationURI.
The column headers are as follows:
• gid = internal unique ID
• src_state = the state from which the well was downloaded (note: the low-temperature wells in Idaho are coded as "ID_LowTemp", while all other wells are simply the two-character state abbreviation)
• source_url = the URL for the source WFS service or Excel file
• temp_c = "best" temperature in Celsius
• temp_type = indicates whether temp_c comes from the corrected or measured temperature header column in the source document
• depth_m = "best" depth in meters
• depth_type = indicates whether depth_m comes from the measured, true vertical, or driller total depth header column in the source document
• well_name = "best" well name or ID
• name_src = indicates whether well_name came from the apino, wellname, or observationuri header column in the source document
• lat_wgs84 = latitude in WGS84
• lon_wgs84 = longitude in WGS84
• state = state in which the point is located
• county = county in which the point is located
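A minimal Python sketch of the "best available attribute" rule listed above; the field names follow the methodology list, while the sample record and helper name are invented for illustration:

```python
# Return the first non-empty field from an ordered preference list.
def pick_best(record, candidates):
    for name in candidates:
        value = record.get(name)
        if value not in (None, "", "null"):
            return value, name
    return None, None

well = {"MeasuredTemperature": 87.2, "TrueVerticalDepth": 1523.0}
temp, temp_type = pick_best(well, ["CorrectedTemperature", "MeasuredTemperature"])
depth, depth_type = pick_best(
    well, ["DepthOfMeasurement", "TrueVerticalDepth", "DrillerTotalDepth"])
print(temp, temp_type, depth, depth_type)   # 87.2 MeasuredTemperature 1523.0 TrueVerticalDepth
```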
Khomtchouk, Bohdan B; Van Booven, Derek J; Wahlestedt, Claes
2014-01-01
The graphical visualization of gene expression data using heatmaps has become an integral component of modern-day medical research. Heatmaps are used extensively to plot quantitative differences in gene expression levels, such as those measured with RNAseq and microarray experiments, to provide qualitative large-scale views of the transcriptomic landscape. Creating high-quality heatmaps is a computationally intensive task, often requiring considerable programming experience, particularly for customizing features to the specific dataset at hand. Software to create publication-quality heatmaps is developed with the R programming language, the C++ programming language, and the OpenGL application programming interface (API) to create industry-grade high performance graphics. We create a graphical user interface (GUI) software package called HeatmapGenerator for Windows OS and Mac OS X as an intuitive, user-friendly alternative for researchers with minimal prior coding experience, allowing them to create publication-quality heatmaps using R graphics without sacrificing their desired level of customization. The simplicity of HeatmapGenerator is that it only requires the user to upload a preformatted input file and download the publicly available R software language, among a few other operating-system-specific requirements. Advanced features such as color, text labels, scaling, legend construction, and even database storage can be easily customized with no prior programming knowledge. We provide an intuitive and user-friendly software package, HeatmapGenerator, to create high-quality, customizable heatmaps generated using the high-resolution color graphics capabilities of R. The software is available for Microsoft Windows and Apple Mac OS X. HeatmapGenerator is released under the GNU General Public License and publicly available at: http://sourceforge.net/projects/heatmapgenerator/. The Mac OS X direct download is available at: http://sourceforge.net/projects/heatmapgenerator/files/HeatmapGenerator_MAC_OSX.tar.gz/download. The Windows OS direct download is available at: http://sourceforge.net/projects/heatmapgenerator/files/HeatmapGenerator_WINDOWS.zip/download.
visPIG--a web tool for producing multi-region, multi-track, multi-scale plots of genetic data.
Scales, Matthew; Jäger, Roland; Migliorini, Gabriele; Houlston, Richard S; Henrion, Marc Y R
2014-01-01
We present VISual Plotting Interface for Genetics (visPIG; http://vispig.icr.ac.uk), a web application to produce multi-track, multi-scale, multi-region plots of genetic data. visPIG has been designed to allow users not well versed in mathematical software packages and/or programming languages such as R, Matlab®, Python, etc., to integrate data from multiple sources for interpretation and to easily create publication-ready figures. While web tools such as the UCSC Genome Browser or the WashU Epigenome Browser allow custom data uploads, such tools are primarily designed for data exploration. This is also true for the desktop-run Integrative Genomics Viewer (IGV). Other locally run data visualisation software such as Circos requires significant computer skills of the user. The visPIG web application is a menu-based interface that allows users to upload custom data tracks and set track-specific parameters. Figures can be downloaded as PDF or PNG files. For sensitive data, the underlying R code can also be downloaded and run locally. visPIG is multi-track: it can display many different data types (e.g., association, functional annotation, intensity, interaction, heat map data, …). It also allows annotation of genes and other custom features in the plotted region(s). Data tracks can be plotted individually or on a single figure. visPIG is multi-region: it supports plotting multiple regions, be they kilo- or megabases apart or even on different chromosomes. Finally, visPIG is multi-scale: a sub-region of particular interest can be 'zoomed in' on. We describe the various features of visPIG and illustrate its utility with examples. visPIG is freely available through http://vispig.icr.ac.uk under a GNU General Public License (GPLv3).
libChEBI: an API for accessing the ChEBI database.
Swainston, Neil; Hastings, Janna; Dekker, Adriano; Muthukrishnan, Venkatesh; May, John; Steinbeck, Christoph; Mendes, Pedro
2016-01-01
ChEBI is a database and ontology of chemical entities of biological interest. It is widely used as a source of identifiers to facilitate unambiguous reference to chemical entities within biological models, databases, ontologies and literature. ChEBI contains a wealth of chemical data, covering over 46,500 distinct chemical entities, and related data such as chemical formula, charge, molecular mass, structure, synonyms and links to external databases. Furthermore, ChEBI is an ontology, and thus provides meaningful links between chemical entities. Unlike many other resources, ChEBI is fully human-curated, providing a reliable, non-redundant collection of chemical entities and related data. While ChEBI is supported by a web service for programmatic access and a number of download files, it does not have an API library to facilitate the use of ChEBI and its data in cheminformatics software. To provide this missing functionality, libChEBI, a comprehensive API library for accessing ChEBI data, is introduced. libChEBI is available in Java, Python and MATLAB versions from http://github.com/libChEBI, and provides full programmatic access to all data held within the ChEBI database through a simple and documented API. libChEBI is reliant upon the (automated) download and regular update of flat files that are held locally. As such, libChEBI can be embedded in both on- and off-line software applications. libChEBI allows better support of ChEBI and its data in the development of new cheminformatics software. Covering three key programming languages, it allows for the entirety of the ChEBI database to be accessed easily and quickly through a simple API. All code is open access and freely available.
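A hedged sketch of what the Python flavour of this API looks like, assuming the libchebipy package is installed (pip install libchebipy); the class and method names used here are assumptions about that package and may differ between releases:

```python
# Look up a ChEBI entity by identifier through the local flat-file cache.
from libchebipy import ChebiEntity

water = ChebiEntity("CHEBI:15377")   # ChEBI identifier for water
print(water.get_name())              # expected to print something like 'water'
```

On first use, a library of this design would download and cache the ChEBI flat files locally, which is what allows it to work both on- and off-line as described above.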
caGrid 1.0: An Enterprise Grid Infrastructure for Biomedical Research
Oster, Scott; Langella, Stephen; Hastings, Shannon; Ervin, David; Madduri, Ravi; Phillips, Joshua; Kurc, Tahsin; Siebenlist, Frank; Covitz, Peter; Shanbhag, Krishnakant; Foster, Ian; Saltz, Joel
2008-01-01
Objective To develop software infrastructure that will provide support for discovery, characterization, integrated access, and management of diverse and disparate collections of information sources, analysis methods, and applications in biomedical research. Design An enterprise Grid software infrastructure, called caGrid version 1.0 (caGrid 1.0), has been developed as the core Grid architecture of the NCI-sponsored cancer Biomedical Informatics Grid (caBIG™) program. It is designed to support a wide range of use cases in basic, translational, and clinical research, including 1) discovery, 2) integrated and large-scale data analysis, and 3) coordinated study. Measurements The caGrid is built as a Grid software infrastructure and leverages Grid computing technologies and the Web Services Resource Framework standards. It provides a set of core services, toolkits for the development and deployment of new community provided services, and application programming interfaces for building client applications. Results The caGrid 1.0 was released to the caBIG community in December 2006. It is built on open source components and caGrid source code is publicly and freely available under a liberal open source license. The core software, associated tools, and documentation can be downloaded from the following URL: https://cabig.nci.nih.gov/workspaces/Architecture/caGrid. Conclusions While caGrid 1.0 is designed to address use cases in cancer research, the requirements associated with discovery, analysis and integration of large scale data, and coordinated studies are common in other biomedical fields. In this respect, caGrid 1.0 is the realization of a framework that can benefit the entire biomedical community. PMID:18096909
SunPy 0.8 - Python for Solar Physics
NASA Astrophysics Data System (ADS)
Inglis, Andrew; Bobra, Monica; Christe, Steven; Hewett, Russell; Ireland, Jack; Mumford, Stuart; Martinez Oliveros, Juan Carlos; Perez-Suarez, David; Reardon, Kevin P.; Savage, Sabrina; Shih, Albert Y.; Ryan, Daniel; Sipocz, Brigitta; Freij, Nabil
2017-08-01
SunPy is a community-developed, open-source software library for solar physics. It is written in Python, a free, cross-platform, general-purpose, high-level programming language that is being increasingly adopted throughout the scientific community. Python is one of the ten most often used programming languages and as such offers a wide array of software packages, ranging from numerical computation (NumPy, SciPy) and machine learning (scikit-learn) to signal processing (scikit-image, statsmodels) and visualization and plotting (matplotlib, mayavi). SunPy aims to provide the software for obtaining and analyzing solar and heliospheric data. This poster introduces a new major release of SunPy (0.8). This release includes two major new functionalities, as well as a number of bug fixes. It is based on 1120 contributions from 34 unique contributors. Fido is the new primary interface for downloading data. It provides a consistent and powerful search interface to all major data providers, including the VSO and JSOC, as well as individual data sources such as the GOES XRS time series, and is fully pluggable so that new data sources, e.g., DKIST, can be added. In anticipation of Solar Orbiter and the Parker Solar Probe, SunPy now provides a powerful way of representing coordinates, allowing conversion between the coordinate systems and viewpoints of different instruments, including preliminary reprojection capabilities. Other new features include new timeseries capabilities with better support for concatenation and metadata, updated documentation, and an example gallery. SunPy is distributed through pip and conda and all of its code is publicly available (sunpy.org).
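A short example of the Fido interface introduced in this release, assuming a SunPy installation of the 0.8 era or later; the time range is an arbitrary illustration and the GOES XRS instrument is used because it is named above:

```python
# Search for and download GOES XRS data through the unified Fido interface.
from sunpy.net import Fido, attrs as a

result = Fido.search(a.Time("2017-09-06 11:50", "2017-09-06 12:10"),
                     a.Instrument("XRS"))
files = Fido.fetch(result)     # downloads matching files locally
print(files)
```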
caGrid 1.0: an enterprise Grid infrastructure for biomedical research.
Oster, Scott; Langella, Stephen; Hastings, Shannon; Ervin, David; Madduri, Ravi; Phillips, Joshua; Kurc, Tahsin; Siebenlist, Frank; Covitz, Peter; Shanbhag, Krishnakant; Foster, Ian; Saltz, Joel
2008-01-01
To develop software infrastructure that will provide support for discovery, characterization, integrated access, and management of diverse and disparate collections of information sources, analysis methods, and applications in biomedical research. An enterprise Grid software infrastructure, called caGrid version 1.0 (caGrid 1.0), has been developed as the core Grid architecture of the NCI-sponsored cancer Biomedical Informatics Grid (caBIG) program. It is designed to support a wide range of use cases in basic, translational, and clinical research, including 1) discovery, 2) integrated and large-scale data analysis, and 3) coordinated study. The caGrid is built as a Grid software infrastructure and leverages Grid computing technologies and the Web Services Resource Framework standards. It provides a set of core services, toolkits for the development and deployment of new community provided services, and application programming interfaces for building client applications. The caGrid 1.0 was released to the caBIG community in December 2006. It is built on open source components and caGrid source code is publicly and freely available under a liberal open source license. The core software, associated tools, and documentation can be downloaded from the following URL: https://cabig.nci.nih.gov/workspaces/Architecture/caGrid. While caGrid 1.0 is designed to address use cases in cancer research, the requirements associated with discovery, analysis and integration of large scale data, and coordinated studies are common in other biomedical fields. In this respect, caGrid 1.0 is the realization of a framework that can benefit the entire biomedical community.
Automatic mathematical modeling for space application
NASA Technical Reports Server (NTRS)
Wang, Caroline K.
1987-01-01
A methodology for automatic mathematical modeling is described. The major objective is to create a very friendly environment for engineers to design, maintain and verify their model and also automatically convert the mathematical model into FORTRAN code for conventional computation. A demonstration program was designed for modeling the Space Shuttle Main Engine simulation mathematical model called Propulsion System Automatic Modeling (PSAM). PSAM provides a very friendly and well organized environment for engineers to build a knowledge base for base equations and general information. PSAM contains an initial set of component process elements for the Space Shuttle Main Engine simulation and a questionnaire that allows the engineer to answer a set of questions to specify a particular model. PSAM is then able to automatically generate the model and the FORTRAN code. A future goal is to download the FORTRAN code to the VAX/VMS system for conventional computation.
Standards-Based Open-Source Planetary Map Server: Lunaserv
NASA Astrophysics Data System (ADS)
Estes, N. M.; Silva, V. H.; Bowley, K. S.; Lanjewar, K. K.; Robinson, M. S.
2018-04-01
Lunaserv is a planetary-capable Web Map Service developed by the LROC SOC. It enables researchers to serve their own planetary data to a wide variety of GIS clients without any additional processing or download steps.
The Baghdad that Was: Using Primary Sources to Teach World History
ERIC Educational Resources Information Center
Schur, Joan Brodsky
2009-01-01
That primary source documents have the power to bring the past alive is no news to social studies teachers. What is new in the last 10 years is the number of digitized documents available online that teachers can download and use in their classrooms. Encouraging teachers to utilize this ever-increasing treasure trove of resources was the goal of…
IDGenerator: unique identifier generator for epidemiologic or clinical studies.
Olden, Matthias; Holle, Rolf; Heid, Iris M; Stark, Klaus
2016-09-15
Creating study identifiers and assigning them to study participants is an important feature in epidemiologic studies, ensuring the consistency and privacy of the study data. The numbering system for identifiers may need to be random within certain number constraints, to carry extensions coding for organizational information, or to contain multiple layers of numbers per participant to diversify data access. Available software can generate globally unique identifiers, but identifier-creating tools meeting the special needs of epidemiological studies are lacking. We have thus set out to develop a software program to generate IDs for epidemiological or clinical studies. Our software IDGenerator creates unique identifiers that not only carry a random identifier for a study participant, but also support the creation of structured IDs, where organizational information is coded into the ID directly. This may include study center (for multicenter studies), study track (for studies with diversified study programs), or study visit (baseline, follow-up, regularly repeated visits). Our software can be used to add a check digit to the ID to minimize data entry errors. It facilitates the generation of IDs in batches and the creation of layered IDs (personal data ID, study data ID, temporary ID, external data ID) to ensure a high standard of data privacy. The software is supported by a user-friendly graphical interface that enables the generation of IDs in both standard text and Code 128B barcode format. Our software IDGenerator can create identifiers meeting the specific needs of epidemiologic or clinical studies to facilitate study organization and data privacy. IDGenerator is freeware under the GNU General Public License version 3; a Windows port and the source code can be downloaded at the Open Science Framework website: https://osf.io/urs2g/ .
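As a generic illustration of a structured ID with a check digit (this is not IDGenerator's actual scheme), the sketch below encodes study centre, visit, and serial number, then appends a Luhn check digit, which catches single-digit typos and most adjacent transpositions at data entry:

```python
# Build a structured participant ID and finish it with a Luhn check digit.
def luhn_check_digit(digits):
    total = 0
    for i, d in enumerate(reversed(digits)):
        d = int(d)
        if i % 2 == 0:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10

def make_id(centre, visit, serial):
    body = f"{centre:02d}{visit:01d}{serial:05d}"   # centre + visit + serial
    return body + str(luhn_check_digit(body))

print(make_id(centre=3, visit=1, serial=42))        # e.g. '03100042' + check digit
```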
A Graphical User Interface for a Method to Infer Kinetics and Network Architecture (MIKANA)
Mourão, Márcio A.; Srividhya, Jeyaraman; McSharry, Patrick E.; Crampin, Edmund J.; Schnell, Santiago
2011-01-01
One of the main challenges in the biomedical sciences is the determination of reaction mechanisms that constitute a biochemical pathway. During the last decades, advances have been made in building complex diagrams showing the static interactions of proteins. The challenge for systems biologists is to build realistic models of the dynamical behavior of reactants, intermediates and products. For this purpose, several methods have been recently proposed to deduce the reaction mechanisms or to estimate the kinetic parameters of the elementary reactions that constitute the pathway. One such method is MIKANA: Method to Infer Kinetics And Network Architecture. MIKANA is a computational method to infer both reaction mechanisms and estimate the kinetic parameters of biochemical pathways from time course data. To make it available to the scientific community, we developed a Graphical User Interface (GUI) for MIKANA. Among other features, the GUI validates and processes input time course data, displays the inferred reactions, generates the differential equations for the chemical species in the pathway and plots the prediction curves on top of the input time course data. We also added a new feature to MIKANA that allows the user to exclude a priori known reactions from the inferred mechanism. This addition improves the performance of the method. In this article, we illustrate the GUI for MIKANA with three examples: an irreversible Michaelis–Menten reaction mechanism; the interaction map of chemical species of the muscle glycolytic pathway; and the glycolytic pathway of Lactococcus lactis. We also describe the code and methods in sufficient detail to allow researchers to further develop the code or reproduce the experiments described. The code for MIKANA is open source, free for academic and non-academic use and is available for download (Information S1). PMID:22096591
A graphical user interface for a method to infer kinetics and network architecture (MIKANA).
Mourão, Márcio A; Srividhya, Jeyaraman; McSharry, Patrick E; Crampin, Edmund J; Schnell, Santiago
2011-01-01
One of the main challenges in the biomedical sciences is the determination of reaction mechanisms that constitute a biochemical pathway. During the last decades, advances have been made in building complex diagrams showing the static interactions of proteins. The challenge for systems biologists is to build realistic models of the dynamical behavior of reactants, intermediates and products. For this purpose, several methods have been recently proposed to deduce the reaction mechanisms or to estimate the kinetic parameters of the elementary reactions that constitute the pathway. One such method is MIKANA: Method to Infer Kinetics And Network Architecture. MIKANA is a computational method to infer both reaction mechanisms and estimate the kinetic parameters of biochemical pathways from time course data. To make it available to the scientific community, we developed a Graphical User Interface (GUI) for MIKANA. Among other features, the GUI validates and processes input time course data, displays the inferred reactions, generates the differential equations for the chemical species in the pathway and plots the prediction curves on top of the input time course data. We also added a new feature to MIKANA that allows the user to exclude a priori known reactions from the inferred mechanism. This addition improves the performance of the method. In this article, we illustrate the GUI for MIKANA with three examples: an irreversible Michaelis-Menten reaction mechanism; the interaction map of chemical species of the muscle glycolytic pathway; and the glycolytic pathway of Lactococcus lactis. We also describe the code and methods in sufficient detail to allow researchers to further develop the code or reproduce the experiments described. The code for MIKANA is open source, free for academic and non-academic use and is available for download (Information S1).
Coding Complete Genome for the Mogiana Tick Virus, a Jingmenvirus Isolated from Ticks in Brazil
2017-05-04
sequences for all four genome segments. We downloaded the raw Illumina sequence reads from the NCBI Short Read Archive (GenBank …) … MGTV genome segments through sequence similarity (BLASTN) to the published genome of Jingmen tick virus (JMTV) isolate SY84 (GenBank: KJ001579–KJ001582) …
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hart, David; Klise, Katherine A.
The PyEPANET package is a set of commands for the Python programming language that wrap the EPANET toolkit library commands without requiring the end user to program with the ctypes package. This package does not contain the EPANET code, nor does it implement the functions within the EPANET software; it requires the separately downloaded or compiled EPANET2 toolkit dynamic library (epanet.dll, libepanet.so, or epanet.dylib) and/or the EPANET-MSX dynamic library in order to function.
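For contrast, the sketch below shows the kind of raw ctypes boilerplate such a wrapper is designed to hide. It assumes a separately obtained EPANET2 toolkit library and the classic ENopen/ENclose entry points of the EPANET toolkit; the library path and input file names are placeholders:

```python
# Open and close an EPANET project directly through ctypes.
import ctypes

lib = ctypes.CDLL("./libepanet.so")            # or epanet.dll / epanet.dylib

err = lib.ENopen(b"network.inp", b"report.rpt", b"output.bin")
if err != 0:
    raise RuntimeError(f"EPANET error code {err}")
lib.ENclose()
```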
Fast simulation tool for ultraviolet radiation at the earth's surface
NASA Astrophysics Data System (ADS)
Engelsen, Ola; Kylling, Arve
2005-04-01
FastRT is a fast, yet accurate, UV simulation tool that computes downward surface UV doses, UV indices, and irradiances in the spectral range 290 to 400 nm with a resolution as small as 0.05 nm. It computes a full UV spectrum within a few milliseconds on a standard PC, and enables the user to convolve the spectrum with user-defined and built-in spectral response functions including the International Commission on Illumination (CIE) erythemal response function used for UV index calculations. The program accounts for the main radiative input parameters, i.e., instrumental characteristics, solar zenith angle, ozone column, aerosol loading, clouds, surface albedo, and surface altitude. FastRT is based on look-up tables of carefully selected entries of atmospheric transmittances and spherical albedos, and exploits the smoothness of these quantities with respect to atmospheric, surface, geometrical, and spectral parameters. An interactive site, http://nadir.nilu.no/~olaeng/fastrt/fastrt.html, enables the public to run the FastRT program with most input options. This page also contains updated information about FastRT and links to freely downloadable source codes and binaries.
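As an illustration of the kind of post-processing FastRT supports, the sketch below weights a surface spectrum with the CIE erythemal response and converts the result to a UV index (one UV index unit equals an erythemally weighted irradiance of 25 mW m⁻², i.e. UVI = 40 × the weighted irradiance in W m⁻²); the input spectrum here is synthetic, not FastRT output:

```python
# Weight a spectral irradiance E(wl) [W m^-2 nm^-1] with the CIE erythemal
# response and convert the integral to a UV index.
import numpy as np

def cie_erythemal(wl):
    w = np.ones_like(wl, dtype=float)                         # 1 for wl <= 298 nm
    w = np.where((wl > 298) & (wl <= 328), 10 ** (0.094 * (298 - wl)), w)
    w = np.where((wl > 328) & (wl <= 400), 10 ** (0.015 * (140 - wl)), w)
    return w

wl = np.arange(290.0, 400.05, 0.05)                  # nm, FastRT's finest grid
irradiance = 1e-3 * np.exp(-((wl - 360) / 50) ** 2)  # synthetic spectrum

erythemal = np.trapz(irradiance * cie_erythemal(wl), wl)  # W m^-2
print("UV index:", 40.0 * erythemal)
```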
Scoria: a Python module for manipulating 3D molecular data.
Ropp, Patrick; Friedman, Aaron; Durrant, Jacob D
2017-09-18
Third-party packages have transformed the Python programming language into a powerful computational-biology tool. Package installation is easy for experienced users, but novices sometimes struggle with dependencies and compilers. This presents a barrier that can hinder the otherwise broad adoption of new tools. We present Scoria, a Python package for manipulating three-dimensional molecular data. Unlike similar packages, Scoria requires no dependencies, compilation, or system-wide installation. One can incorporate the Scoria source code directly into their own programs. But Scoria is not designed to compete with other similar packages. Rather, it complements them. Our package leverages others (e.g. NumPy, SciPy), if present, to speed and extend its own functionality. To show its utility, we use Scoria to analyze a molecular dynamics trajectory. Our FootPrint script colors the atoms of one chain by the frequency of their contacts with a second chain. We are hopeful that Scoria will be a useful tool for the computational-biology community. A copy is available for download free of charge (Apache License 2.0) at http://durrantlab.com/scoria/.
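The FootPrint idea can be sketched with plain NumPy; this illustrates the contact-frequency computation only and is not Scoria's actual API, with all coordinates synthetic:

```python
# For each chain-A atom, count the fraction of frames in which it lies
# within a cutoff of any chain-B atom.
import numpy as np

rng = np.random.default_rng(1)
n_frames, n_a, n_b, cutoff = 50, 20, 30, 4.0
traj_a = rng.random((n_frames, n_a, 3)) * 20.0   # stand-in coordinates (angstroms)
traj_b = rng.random((n_frames, n_b, 3)) * 20.0

# Pairwise distances per frame: shape (n_frames, n_a, n_b).
d = np.linalg.norm(traj_a[:, :, None, :] - traj_b[:, None, :, :], axis=-1)
contact_freq = (d < cutoff).any(axis=2).mean(axis=0)  # per-atom contact fraction
print(contact_freq.round(2))
```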
The annotation-enriched non-redundant patent sequence databases.
Li, Weizhong; Kondratowicz, Bartosz; McWilliam, Hamish; Nauche, Stephane; Lopez, Rodrigo
2013-01-01
The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL1, NRNL2, NRPL1 and NRPL2) based on sequence identity (level-1) and patent family (level-2). Annotation from the source entries in these databases is merged and enhanced with additional information from the patent literature and biological context. Corrections in patent publication numbers, kind-codes and patent equivalents significantly improve the data quality. Data are available through various user interfaces including web browser, downloads via FTP, SRS, Dbfetch and EBI-Search. Sequence similarity/homology searches against the databases are available using BLAST, FASTA and PSI-Search. In this article, we describe the data collection and annotation and also outline major changes and improvements introduced since 2009. Apart from data growth, these changes include additional annotation for singleton clusters, the identifier versioning for tracking entry change and the entry mappings between the two-level databases. Database URL: http://www.ebi.ac.uk/patentdata/nr/
The Annotation-enriched non-redundant patent sequence databases
Li, Weizhong; Kondratowicz, Bartosz; McWilliam, Hamish; Nauche, Stephane; Lopez, Rodrigo
2013-01-01
The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL1, NRNL2, NRPL1 and NRPL2) based on sequence identity (level-1) and patent family (level-2). Annotation from the source entries in these databases is merged and enhanced with additional information from the patent literature and biological context. Corrections in patent publication numbers, kind-codes and patent equivalents significantly improve the data quality. Data are available through various user interfaces including web browser, downloads via FTP, SRS, Dbfetch and EBI-Search. Sequence similarity/homology searches against the databases are available using BLAST, FASTA and PSI-Search. In this article, we describe the data collection and annotation and also outline major changes and improvements introduced since 2009. Apart from data growth, these changes include additional annotation for singleton clusters, the identifier versioning for tracking entry change and the entry mappings between the two-level databases. Database URL: http://www.ebi.ac.uk/patentdata/nr/ PMID:23396323
CellMiner Companion: an interactive web application to explore CellMiner NCI-60 data.
Wang, Sufang; Gribskov, Michael; Hazbun, Tony R; Pascuzzi, Pete E
2016-08-01
The NCI-60 human tumor cell line panel is an invaluable resource for cancer researchers, providing drug sensitivity, molecular, and phenotypic data for a range of cancer types. CellMiner is a web resource that provides tools for the acquisition and analysis of quality-controlled NCI-60 data. CellMiner supports queries of up to 150 drugs or genes, but the output is an Excel file for each drug or gene. This output format makes it difficult for researchers to explore the data from large queries. CellMiner Companion is a web application that facilitates the exploration and visualization of output from CellMiner, further increasing the accessibility of NCI-60 data. The web application is freely accessible at https://pul-bioinformatics.shinyapps.io/CellMinerCompanion. The R source code can be downloaded at https://github.com/pepascuzzi/CellMinerCompanion.git. Contact: ppascuzz@purdue.edu. Supplementary data are available at Bioinformatics online.
Chen Peng; Ao Li
2017-01-01
The emergence of multi-dimensional data offers opportunities for more comprehensive analysis of the molecular characteristics of human diseases and can thereby improve diagnosis, treatment, and prevention. In this study, we proposed a heterogeneous network based method integrating multi-dimensional data (HNMD) to identify GBM-related genes. The novelty of the method lies in the fact that the multi-dimensional data of GBM from the TCGA dataset, which provide comprehensive information about genes, are combined with protein-protein interactions to construct a weighted heterogeneous network that reflects both the general and disease-specific relationships between genes. In addition, a propagation algorithm with resistance is introduced to precisely score and rank GBM-related genes. The results of a comprehensive performance evaluation show that the proposed method significantly outperforms network based methods using single-dimensional data as well as other existing approaches. Subsequent analysis of the top ranked genes suggests they may be functionally implicated in GBM, which further corroborates the superiority of the proposed method. The source code and the results of HNMD can be downloaded from the following URL: http://bioinformatics.ustc.edu.cn/hnmd/.
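One common reading of a propagation algorithm with restart (or "resistance") is a random walk with restart over the gene network. The sketch below is an illustrative NumPy implementation on a toy network, not HNMD's exact algorithm; W is a column-normalised adjacency matrix and p0 flags seed genes:

```python
# Random walk with restart: p_{t+1} = (1 - r) * W @ p_t + r * p0.
import numpy as np

def propagate(W, p0, restart=0.3, tol=1e-8, max_iter=1000):
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * W @ p + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
W = A / A.sum(axis=0)                 # column-normalise the adjacency matrix
p0 = np.array([1.0, 0.0, 0.0, 0.0])  # seed: one known disease gene
print(propagate(W, p0).round(3))     # scores rank genes by network proximity
```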
gCUP: rapid GPU-based HIV-1 co-receptor usage prediction for next-generation sequencing.
Olejnik, Michael; Steuwer, Michel; Gorlatch, Sergei; Heider, Dominik
2014-11-15
Next-generation sequencing (NGS) has large potential in HIV diagnostics, and genotypic prediction models have been developed and successfully tested in recent years. However, although highly accurate, these computational models lack the computational efficiency needed to reach their full potential. In this study, we demonstrate the use of graphics processing units (GPUs) in combination with a computational prediction model for HIV tropism. Our new model, named gCUP, parallelized and optimized for GPUs, is highly accurate and can classify >175 000 sequences per second on an NVIDIA GeForce GTX 460. The computational efficiency of our new model is the next step in enabling NGS technologies to reach clinical significance in HIV diagnostics. Moreover, our approach is not limited to HIV tropism prediction; it can also easily be adapted to other settings, e.g. drug resistance prediction. The source code can be downloaded at http://www.heiderlab.de. Contact: d.heider@wz-straubing.de.
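A hedged illustration of the data-parallel scoring pattern such a model exploits: encode many sequences at once and score the whole batch with a single matrix operation. The encoding, alphabet, weights, and decision rule below are illustrative assumptions, with numpy on the CPU standing in for the GPU; this is not the gCUP model itself.

    import numpy as np

    ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"   # amino acids plus a gap symbol

    def one_hot(seqs):
        """One-hot encode equal-length, aligned sequences into a matrix."""
        idx = {a: i for i, a in enumerate(ALPHABET)}
        X = np.zeros((len(seqs), len(seqs[0]) * len(ALPHABET)))
        for r, s in enumerate(seqs):
            for p, a in enumerate(s):
                X[r, p * len(ALPHABET) + idx[a]] = 1.0
        return X

    def predict_tropism(seqs, w, b):
        scores = one_hot(seqs) @ w + b              # one product for the batch
        return np.where(scores > 0, "X4", "R5")     # hypothetical decision rule

    seqs = ["CTRPNNNTRK", "CTRPGNNTRK"]             # toy aligned V3 fragments
    rng = np.random.default_rng(0)
    w = rng.normal(size=len(seqs[0]) * len(ALPHABET))   # placeholder weights
    print(predict_tropism(seqs, w, b=0.0))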
Computer programming for generating visual stimuli.
Bukhari, Farhan; Kurylo, Daniel D
2008-02-01
Critical to vision research is the generation of visual displays with precise control over stimulus metrics. Generating stimuli often requires adapting commercial software or developing specialized software for specific research applications. To facilitate this process, we provide here an overview that allows nonexpert users to generate and customize stimuli for vision research. We first review relevant hardware and software considerations, to allow selection of the display hardware, operating system, programming language, and graphics packages most appropriate for specific research applications. We then describe the framework of a generic computer program that can be adapted for use with a broad range of experimental applications. Stimuli are generated in the context of trial events, allowing the display of text messages, the monitoring of subject responses and reaction times, and the inclusion of contingency algorithms. This approach allows direct control and management of computer-generated visual stimuli while utilizing the full capabilities of modern hardware and software systems. The flowchart and source code for the stimulus-generating program may be downloaded from www.psychonomic.org/archive.
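The trial-event framework described (stimulus display, text messages, response and reaction-time monitoring) can be sketched in a few lines. The example below uses Python with pygame purely for illustration; it is not the downloadable program from the archive, and the stimulus and response keys are arbitrary choices.

    import pygame

    def run_trial(screen):
        """One trial: draw a stimulus, then wait for a response key and its RT."""
        font = pygame.font.SysFont(None, 36)
        screen.fill((0, 0, 0))                                       # clear display
        screen.blit(font.render("Press F or J", True, (255, 255, 255)), (20, 20))
        pygame.draw.circle(screen, (255, 255, 255), (400, 300), 40)  # the stimulus
        pygame.display.flip()
        onset = pygame.time.get_ticks()                              # stimulus onset
        while True:
            for event in pygame.event.get():
                if event.type == pygame.QUIT:
                    return None, None
                if event.type == pygame.KEYDOWN and event.key in (pygame.K_f, pygame.K_j):
                    rt = pygame.time.get_ticks() - onset             # RT in ms
                    return pygame.key.name(event.key), rt

    pygame.init()
    screen = pygame.display.set_mode((800, 600))
    print(run_trial(screen))
    pygame.quit()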
ProGeRF: Proteome and Genome Repeat Finder Utilizing a Fast Parallel Hash Function
Moraes, Walas Jhony Lopes; Rodrigues, Thiago de Souza; Bartholomeu, Daniella Castanheira
2015-01-01
Repetitive elements are adjacent, repeating sequence patterns, also called motifs, and can be of different lengths; repetitions can involve exact or approximate copies of the motif. They have been widely used as molecular markers in population biology. Given the sizes of sequenced genomes, various bioinformatics tools have been developed for the extraction of repetitive elements from DNA sequences. However, currently available tools do not combine options for identifying repetitive elements in both genomes and proteomes, a user-friendly web interface, and exhaustive searches. ProGeRF is a web site for extracting repetitive regions from genome and proteome sequences. It was designed to be an efficient, fast, accurate and, above all, user-friendly web tool, allowing many ways to view and analyse the results. ProGeRF (Proteome and Genome Repeat Finder) is freely available as a stand-alone program, for which users can download the source code, and as a web tool. It was developed using a hash-table approach to extract perfect and imperfect repetitive regions from a (multi)FASTA file in linear time. PMID:25811026
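A minimal sketch of the hash-table idea, under the simplifying assumptions of perfect repeats and a single motif length: hashing every k-mer start position keeps the scan linear in the sequence length. ProGeRF itself also handles imperfect repeats and multiple motif lengths.

    from collections import defaultdict

    def perfect_repeats(seq, k=4, min_copies=2):
        """Find perfect tandem repeats of k-mers via one hashing pass.

        A run is counted only when successive occurrences of a motif sit
        exactly k apart (adjacent tandem copies).
        """
        positions = defaultdict(list)
        for i in range(len(seq) - k + 1):
            positions[seq[i:i + k]].append(i)      # hash k-mer -> start positions
        repeats = []
        for motif, pos in positions.items():
            start, copies = pos[0], 1
            for a, b in zip(pos, pos[1:]):
                if b - a == k:
                    copies += 1
                else:
                    if copies >= min_copies:
                        repeats.append((motif, start, copies))
                    start, copies = b, 1
            if copies >= min_copies:
                repeats.append((motif, start, copies))
        return repeats

    print(perfect_repeats("ACGTACGTACGTTTT"))   # [('ACGT', 0, 3), ('CGTA', 1, 2), ...]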
Filatov, Gleb; Bauwens, Bruno; Kertész-Farkas, Attila
2018-05-07
Bioinformatics studies often rely on similarity measures between sequence pairs, which can pose a bottleneck in large-scale sequence analysis. Here, we present a new convolutional kernel function for protein sequences called the LZW-Kernel. It is based on code words identified by the Lempel-Ziv-Welch (LZW) universal text compressor. The LZW-Kernel is an alignment-free method; it is symmetric, positive, always yields 1.0 for self-similarity, and can be used directly with Support Vector Machines (SVMs) in classification problems, in contrast to the normalized compression distance (NCD), which often violates the distance-metric properties in practice and requires further techniques before it can be used with SVMs. The LZW-Kernel is a one-pass algorithm, which makes it particularly well suited to big-data applications. Our experimental studies on remote protein homology detection and protein classification tasks reveal that the LZW-Kernel closely approaches the performance of the Local Alignment Kernel (LAK) and of the SVM-pairwise method combined with Smith-Waterman (SW) scoring at a fraction of the time. Moreover, the LZW-Kernel outperforms the SVM-pairwise method when combined with BLAST scores, which indicates that LZW code words might be a better basis for similarity measures than the local alignment approximations found with BLAST. In addition, the LZW-Kernel outperforms n-gram-based mismatch kernels, the hidden Markov model based SAM and Fisher kernels, and the protein-family-based PSI-BLAST, among others. Further advantages include the LZW-Kernel's reliance on a simple idea, its ease of implementation, and its high speed: three times faster than BLAST and several orders of magnitude faster than SW or LAK in our tests. LZW-Kernel is implemented as standalone C code and is a free open-source program distributed under the GPLv3 license; it can be downloaded from https://github.com/kfattila/LZW-Kernel. Contact: akerteszfarkas@hse.ru. Supplementary data are available at Bioinformatics Online.
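To make the code-word idea concrete, here is a minimal sketch: build the LZW phrase dictionary for each sequence in one pass, then compare dictionaries. The Jaccard overlap used below is a stand-in normalization chosen for brevity (it is symmetric, positive, and yields 1.0 for self-similarity), not the published LZW-Kernel function.

    def lzw_code_words(s):
        """Return the set of LZW dictionary phrases built while scanning s once."""
        dictionary = {c for c in s}      # seed with single characters
        phrase = ""
        for c in s:
            if phrase + c in dictionary:
                phrase += c              # extend the current phrase
            else:
                dictionary.add(phrase + c)   # emit a new code word
                phrase = c
        return dictionary

    def lzw_kernel(a, b):
        """Similarity from shared LZW code words (Jaccard as a simple stand-in)."""
        A, B = lzw_code_words(a), lzw_code_words(b)
        return len(A & B) / len(A | B)

    print(lzw_kernel("MKVLAAGG", "MKVLSAGG"))   # any sequence vs. itself gives 1.0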
Gillespie, Alex; Reader, Tom W
2016-01-01
Background Letters of complaint written by patients and their advocates reporting poor healthcare experiences represent an under-used data source. The lack of a method for extracting reliable data from these heterogeneous letters hinders their use for monitoring and learning. To address this gap, we report on the development and reliability testing of the Healthcare Complaints Analysis Tool (HCAT). Methods HCAT was developed from a taxonomy of healthcare complaints reported in a previously published systematic review. It introduces the novel idea that complaints should be analysed in terms of severity. Recruiting three groups of educated lay participants (n=58, n=58, n=55), we refined the taxonomy through three iterations of discriminant content validity testing. We then supplemented this refined taxonomy with explicit coding procedures for seven problem categories (each with four levels of severity), stage of care and harm. These combined elements were further refined through iterative coding of a UK national sample of healthcare complaints (n=25, n=80, n=137, n=839). To assess reliability and accuracy for the resultant tool, 14 educated lay participants coded a referent sample of 125 healthcare complaints. Results The seven HCAT problem categories (quality, safety, environment, institutional processes, listening, communication, and respect and patient rights) were found to be conceptually distinct. On average, raters identified 1.94 problems (SD=0.26) per complaint letter. Coders exhibited substantial reliability in identifying problems at four levels of severity; moderate and substantial reliability in identifying stages of care (except for ‘discharge/transfer’, which was only fairly reliable); and substantial reliability in identifying overall harm. Conclusions HCAT is not only the first reliable tool for coding complaints; it is also the first tool to measure the severity of complaints. It facilitates service monitoring and organisational learning and it enables future research examining whether healthcare complaints are a leading indicator of poor service outcomes. HCAT is freely available to download and use. PMID:26740496
Simonaitis, Linas; McDonald, Clement J
2009-10-01
The utility of National Drug Codes (NDCs) and drug knowledge bases (DKBs) in the organization of prescription records from multiple sources was studied. The master files of most pharmacy systems include NDCs and local codes to identify the products they dispense. We obtained a large sample of prescription records from seven different sources. These records carried a national product code or a local code that could be translated into a national product code via their formulary master. We obtained mapping tables from five DKBs. We measured the degree to which the DKB mapping tables covered the national product codes carried in or associated with the sample of prescription records. Considering the total prescription volume, DKBs covered 93.0-99.8% of the product codes from three outpatient sources and 77.4-97.0% of the product codes from four inpatient sources. Among the inpatient sources, invented codes explained 36-94% of the noncoverage. Outpatient pharmacy sources rarely invented codes, which comprised only 0.11-0.21% of their total prescription volume, compared with inpatient pharmacy sources, for which invented codes comprised 1.7-7.4% of their prescription volume. The distribution of prescribed products was highly skewed, with 1.4-4.4% of codes accounting for 50% of the message volume and 10.7-34.5% accounting for 90% of the message volume. DKBs cover the product codes used by outpatient sources sufficiently well to permit automatic mapping. Changes in policies and standards could increase coverage of product codes used by inpatient sources.
Jaitly, Navdeep; Mayampurath, Anoop; Littlefield, Kyle; Adkins, Joshua N; Anderson, Gordon A; Smith, Richard D
2009-01-01
Background Data generated from liquid chromatography coupled to high-resolution mass spectrometry (LC-MS)-based studies of a biological sample can contain large amounts of biologically significant information in the form of proteins, peptides, and metabolites. Interpreting this data involves inferring the masses and abundances of biomolecules injected into the instrument. Because of the inherent complexity of mass spectral patterns produced by these biomolecules, the analysis is significantly enhanced by using visualization capabilities to inspect and confirm results. In this paper we describe Decon2LS, an open-source software package for automated processing and visualization of high-resolution MS data. Drawing extensively on algorithms developed over the last ten years for ICR2LS, Decon2LS packages the algorithms as a rich set of modular, reusable processing classes for performing diverse functions such as reading raw data, routine peak finding, theoretical isotope distribution modelling, and deisotoping. Because the source code is openly available, these functionalities can now be used to build derivative applications in a relatively fast manner. In addition, Decon2LS provides an extensive set of visualization tools, such as high performance chart controls. Results With a variety of options that include peak processing, deisotoping, isotope composition, etc., Decon2LS supports processing of multiple raw data formats. Deisotoping can be performed on an individual scan, an individual dataset, or on multiple datasets using batch processing. Other processing options include creating a two dimensional view of mass and liquid chromatography (LC) elution time features, generating spectrum files for tandem MS data, creating total intensity chromatograms, and visualizing theoretical peptide profiles. Application of Decon2LS to deisotope different datasets obtained across different instruments yielded a high number of features that can be used to identify and quantify peptides in the biological sample. Conclusion Decon2LS is an efficient software package for discovering and visualizing features in proteomics studies that require automated interpretation of mass spectra. Besides being easy to use, fast, and reliable, Decon2LS is also open-source, which allows developers in the proteomics and bioinformatics communities to reuse and refine the algorithms to meet individual needs. Decon2LS source code, installer, and tutorials may be downloaded free of charge at http://ncrr.pnl.gov/software/. PMID:19292916
Jaitly, Navdeep; Mayampurath, Anoop; Littlefield, Kyle; Adkins, Joshua N; Anderson, Gordon A; Smith, Richard D
2009-03-17
Data generated from liquid chromatography coupled to high-resolution mass spectrometry (LC-MS)-based studies of a biological sample can contain large amounts of biologically significant information in the form of proteins, peptides, and metabolites. Interpreting this data involves inferring the masses and abundances of biomolecules injected into the instrument. Because of the inherent complexity of mass spectral patterns produced by these biomolecules, the analysis is significantly enhanced by using visualization capabilities to inspect and confirm results. In this paper we describe Decon2LS, an open-source software package for automated processing and visualization of high-resolution MS data. Drawing extensively on algorithms developed over the last ten years for ICR2LS, Decon2LS packages the algorithms as a rich set of modular, reusable processing classes for performing diverse functions such as reading raw data, routine peak finding, theoretical isotope distribution modelling, and deisotoping. Because the source code is openly available, these functionalities can now be used to build derivative applications in a relatively fast manner. In addition, Decon2LS provides an extensive set of visualization tools, such as high performance chart controls. With a variety of options that include peak processing, deisotoping, isotope composition, etc., Decon2LS supports processing of multiple raw data formats. Deisotoping can be performed on an individual scan, an individual dataset, or on multiple datasets using batch processing. Other processing options include creating a two dimensional view of mass and liquid chromatography (LC) elution time features, generating spectrum files for tandem MS data, creating total intensity chromatograms, and visualizing theoretical peptide profiles. Application of Decon2LS to deisotope different datasets obtained across different instruments yielded a high number of features that can be used to identify and quantify peptides in the biological sample. Decon2LS is an efficient software package for discovering and visualizing features in proteomics studies that require automated interpretation of mass spectra. Besides being easy to use, fast, and reliable, Decon2LS is also open-source, which allows developers in the proteomics and bioinformatics communities to reuse and refine the algorithms to meet individual needs. Decon2LS source code, installer, and tutorials may be downloaded free of charge at http://ncrr.pnl.gov/software/.
This program serves two purposes: (1) as a general-purpose indoor exposure model in buildings with multiple zones, multiple chemicals and multiple sources and sinks, and (2) as a special-purpose concentration model
Facilitating Internet-Scale Code Retrieval
ERIC Educational Resources Information Center
Bajracharya, Sushil Krishna
2010-01-01
Internet-Scale code retrieval deals with the representation, storage, and access of relevant source code from a large amount of source code available on the Internet. Internet-Scale code retrieval systems support common emerging practices among software developers related to finding and reusing source code. In this dissertation we focus on some…
NASA Technical Reports Server (NTRS)
Hogan, Patrick; Maxwell, Christopher; Kim, Randolph; Gaskins, Tom
2007-01-01
World Wind allows users to zoom from satellite altitude down to any place on Earth, leveraging high-resolution LandSat imagery and SRTM (Shuttle Radar Topography Mission) elevation data to experience Earth in visually rich 3D. In addition to Earth, World Wind can also visualize other planets, and there are already comprehensive data sets for Mars and the Earth's moon, which are as easily accessible as those of Earth. There have been more than 20 million downloads to date, and the software is being used heavily by the Department of Defense due to the code's ability to be extended and the evolution of the code courtesy of NASA and the user community. Primary features include dynamic access to public domain imagery and ease of use. All one needs to control World Wind is a two-button mouse. Additional guides and features can be accessed through a simplified menu. A Java version will be available soon. Navigation is automated with single clicks of a mouse, or by typing in any location to automatically zoom in to see it. The World Wind install package contains the necessary requirements such as the .NET runtime and managed DirectX library. World Wind can display combinations of data from a variety of sources, including Blue Marble, LandSat 7, SRTM, NASA Scientific Visualization Studio, GLOBE, and much more. A thorough list of features, the user manual, a key chart, and screen shots are available at http://worldwind.arc.nasa.gov.
Lewkowitz, Adam K; O'Donnell, Betsy E; Nakagawa, Sanae; Vargas, Juan E; Zlatnik, Marya G
2016-03-01
Text4baby is the only free text-message program for pregnancy available. Our objective was to determine whether content differed between Text4baby and popular pregnancy smart phone applications (apps). Researchers enrolled in Text4baby in 2012 and downloaded the four most popular free pregnancy smart phone apps in July 2013; content was re-extracted in February 2014. Messages were assigned thematic codes. Two researchers coded messages independently before reviewing all the codes jointly to ensure consistency. Logistic regression modeling determined statistical differences between Text4baby and the smart phone apps. In total, 1399 messages were delivered. Of these, 333 messages had content related to more than one theme and were coded as such, resulting in 1820 codes analyzed. Compared to smart phone apps, Text4baby was significantly more likely to have content regarding Postpartum Planning, Seeking Care, Recruitment and Prevention, and significantly less likely to mention Normal Pregnancy Symptoms. No messaging program included content regarding postpartum contraception. To improve content without increasing the number of text messages, Text4baby could replace messages on recruitment with messages regarding normal pregnancy symptoms, fetal development and postpartum contraception.
NASA Astrophysics Data System (ADS)
Cox, S. J.; Wyborn, L. A.; Fraser, R.; Rankine, T.; Woodcock, R.; Vote, J.; Evans, B.
2012-12-01
The Virtual Geophysics Laboratory (VGL) is a web portal that provides geoscientists with an integrated online environment that: seamlessly accesses geophysical and geoscience data services from the AuScope national geoscience information infrastructure; loosely couples these data to a variety of geoscience software tools; and provides large scale processing facilities via cloud computing. VGL is a collaboration between CSIRO, Geoscience Australia, National Computational Infrastructure, Monash University, Australian National University and the University of Queensland. The VGL provides a distributed system whereby a user can enter an online virtual laboratory to seamlessly connect to OGC web services for geoscience data. The data is supplied in open standards formats using international standards like GeoSciML. A VGL user uses a web mapping interface to discover and filter the data sources, using spatial and attribute filters to define a subset. Once the data is selected, the user is not required to download it. VGL collates the service query information for use later in the processing workflow, where it will be staged directly to the computing facilities. The combination of deferring data download and access to cloud computing enables VGL users to access their data at higher resolutions and to undertake larger scale inversions, more complex models and simulations than their own local computing facilities might allow. Inside the Virtual Geophysics Laboratory, the user has access to a library of existing models, complete with exemplar workflows for specific scientific problems based on those models. For example, the user can load a geological model published by Geoscience Australia, apply a basic deformation workflow provided by a CSIRO scientist, and have it run in a scientific code from Monash. Finally, the user can publish these results to share with a colleague or cite in a paper. This opens new opportunities for access and collaboration, as all the resources (models, code, data, processing) are shared in the one virtual laboratory. VGL provides end users with access to an intuitive, user-centered interface that leverages cloud storage and cloud and cluster processing from both the research communities and commercial suppliers (e.g. Amazon). As the underlying data and information services are agnostic of the scientific domain, they can support many other data types. This fundamental characteristic results in a highly reusable virtual laboratory infrastructure that could also be used for, for example, natural hazards, satellite processing, soil geochemistry, climate modeling, and agricultural crop modeling.
Joint source-channel coding for motion-compensated DCT-based SNR scalable video.
Kondi, Lisimachos P; Ishtiaq, Faisal; Katsaggelos, Aggelos K
2002-01-01
In this paper, we develop an approach toward joint source-channel coding for motion-compensated DCT-based scalable video coding and transmission. A framework for the optimal selection of the source and channel coding rates over all scalable layers is presented such that the overall distortion is minimized. The algorithm utilizes universal rate distortion characteristics which are obtained experimentally and show the sensitivity of the source encoder and decoder to channel errors. The proposed algorithm allocates the available bit rate between scalable layers and, within each layer, between source and channel coding. We present the results of this rate allocation algorithm for video transmission over a wireless channel using the H.263 Version 2 signal-to-noise ratio (SNR) scalable codec for source coding and rate-compatible punctured convolutional (RCPC) codes for channel coding. We discuss the performance of the algorithm with respect to the channel conditions, coding methodologies, layer rates, and number of layers.
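The allocation problem described above can be sketched as a small search: choose a source rate and an RCPC channel-code rate so that expected distortion is minimized under a total rate budget. The distortion model and rate values below are toy assumptions for illustration, not the paper's measured rate-distortion characteristics.

    from itertools import product

    source_rates = [64, 128, 256]        # candidate source rates (kbps)
    channel_rates = [1/2, 2/3, 3/4]      # candidate RCPC code rates

    def expected_distortion(rs, rc):
        # Toy model: source distortion falls with source rate; residual channel
        # distortion grows with weaker protection (higher code rate rc).
        return 1e6 / rs + 400 * rc

    budget = 400                          # total transmitted kbps (source + parity)
    feasible = ((rs, rc) for rs, rc in product(source_rates, channel_rates)
                if rs / rc <= budget)     # rs/rc is the rate after channel coding
    best = min(feasible, key=lambda p: expected_distortion(*p))
    print("source kbps, channel code rate:", best)

In a layered codec the same search runs per scalable layer, with the budget shared across layers; an exhaustive scan like this is feasible when the candidate rate sets are small.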
Evaluation of hierarchical models for integrative genomic analyses.
Denis, Marie; Tadesse, Mahlet G
2016-03-01
Advances in high-throughput technologies have led to the acquisition of various types of -omic data on the same biological samples. Each data type gives independent and complementary information that can explain the biological mechanisms of interest. While several studies performing independent analyses of each dataset have led to significant results, a better understanding of complex biological mechanisms requires an integrative analysis of different sources of data. Flexible modeling approaches, based on penalized likelihood methods and expectation-maximization (EM) algorithms, are studied and tested under various biological relationship scenarios between the different molecular features and their effects on a clinical outcome. The models are applied to genomic datasets from two cancer types in the Cancer Genome Atlas project: glioblastoma multiforme and ovarian serous cystadenocarcinoma. The integrative models lead to improved model fit and predictive performance. They also provide a better understanding of the biological mechanisms underlying patients' survival. Source code implementing the integrative models is freely available at https://github.com/mgt000/IntegrativeAnalysis along with example datasets and a sample R script applying the models to these data. The TCGA datasets used for analysis are publicly available at https://tcga-data.nci.nih.gov/tcga/tcgaDownload.jsp. Contact: marie.denis@cirad.fr or mgt26@georgetown.edu. Supplementary data are available at Bioinformatics online.
The BioGRID Interaction Database: 2011 update
Stark, Chris; Breitkreutz, Bobby-Joe; Chatr-aryamontri, Andrew; Boucher, Lorrie; Oughtred, Rose; Livstone, Michael S.; Nixon, Julie; Van Auken, Kimberly; Wang, Xiaodong; Shi, Xiaoqi; Reguly, Teresa; Rust, Jennifer M.; Winter, Andrew; Dolinski, Kara; Tyers, Mike
2011-01-01
The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein interaction data from model organisms and humans (http://www.thebiogrid.org). BioGRID currently holds 347 966 interactions (170 162 genetic, 177 804 protein) curated from both high-throughput data sets and individual focused studies, as derived from over 23 000 publications in the primary literature. Complete coverage of the entire literature is maintained for budding yeast (Saccharomyces cerevisiae), fission yeast (Schizosaccharomyces pombe) and thale cress (Arabidopsis thaliana), and efforts to expand curation across multiple metazoan species are underway. The BioGRID houses 48 831 human protein interactions that have been curated from 10 247 publications. Current curation drives are focused on particular areas of biology to enable insights into conserved networks and pathways that are relevant to human health. The BioGRID 3.0 web interface contains new search and display features that enable rapid queries across multiple data types and sources. An automated Interaction Management System (IMS) is used to prioritize, coordinate and track curation across international sites and projects. BioGRID provides interaction data to several model organism databases, resources such as Entrez-Gene and other interaction meta-databases. The entire BioGRID 3.0 data collection may be downloaded in multiple file formats, including PSI MI XML. Source code for BioGRID 3.0 is freely available without any restrictions. PMID:21071413
Weiss, Scott T.
2014-01-01
Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com. PMID:24922310
EVEREST: Pixel Level Decorrelation of K2 Light Curves
NASA Astrophysics Data System (ADS)
Luger, Rodrigo; Agol, Eric; Kruse, Ethan; Barnes, Rory; Becker, Andrew; Foreman-Mackey, Daniel; Deming, Drake
2016-10-01
We present EPIC Variability Extraction and Removal for Exoplanet Science Targets (EVEREST), an open-source pipeline for removing instrumental noise from K2 light curves. EVEREST employs a variant of pixel level decorrelation to remove systematics introduced by the spacecraft's pointing error and a Gaussian process to capture astrophysical variability. We apply EVEREST to all K2 targets in campaigns 0-7, yielding light curves with precision comparable to that of the original Kepler mission for stars brighter than Kp ≈ 13, and within a factor of two of the Kepler precision for fainter targets. We perform cross-validation and transit injection and recovery tests to validate the pipeline, and compare our light curves to the other de-trended light curves available for download at the MAST High Level Science Products archive. We find that EVEREST achieves the highest average precision of any of these pipelines for unsaturated K2 stars. The improved precision of these light curves will aid in exoplanet detection and characterization, investigations of stellar variability, asteroseismology, and other photometric studies. The EVEREST pipeline can also easily be applied to future surveys, such as the TESS mission, to correct for instrumental systematics and enable the detection of low signal-to-noise transiting exoplanets. The EVEREST light curves and the source code used to generate them are freely available online.
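A minimal sketch of first-order pixel level decorrelation, the core of the approach described above: regress the aperture flux on the fractional pixel fluxes and subtract the fit. EVEREST itself adds higher PLD orders and a Gaussian process for astrophysical variability; the data below are synthetic placeholders.

    import numpy as np

    def pld_detrend(flux, pixels):
        """First-order PLD sketch.

        flux   : (t,) aperture-summed light curve
        pixels : (t, p) individual pixel time series inside the aperture
        Regressing flux on *fractional* pixel fluxes removes pointing-drift
        systematics while leaving astrophysical signals largely intact.
        """
        basis = pixels / pixels.sum(axis=1, keepdims=True)   # fractional fluxes
        coeffs, *_ = np.linalg.lstsq(basis, flux, rcond=None)
        model = basis @ coeffs
        return flux - model + np.median(flux)                # detrended light curve

    # Synthetic demo: 1000 cadences, a 3x3 pixel aperture.
    rng = np.random.default_rng(1)
    pixels = rng.uniform(1.0, 2.0, (1000, 9))
    flux = pixels.sum(axis=1)
    print(pld_detrend(flux, pixels).std())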
McGeachie, Michael J; Chang, Hsun-Hsien; Weiss, Scott T
2014-06-01
Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
Closer Look at Cancer Imaging Tools
... download on the Apple Store and Google Play. SOURCE: National Institute of Biomedical Imaging and Bioengineering: Medical Scans. Spring 2018 Issue: Volume 13, Number 1, Page 11.
Gamma-Insensitive Fast Neutron Detector with Spectral Source Identification Potential
2011-03-01
MedlinePlus produces XML data sets that you are welcome to download and use. If you have questions about the MedlinePlus XML files, please contact us. For additional sources of MedlinePlus data in XML format, visit our Web service page.
A Guide to Lowering Test Scores.
ERIC Educational Resources Information Center
Rosenblum, Shelly; Spark, Barbara
2002-01-01
Discusses the adverse impact of poor classroom air quality on student performance and how school officials can eliminate the sources of indoor air pollution. Describes Environmental Protection Agency's "Indoor Air Quality Tools for Schools" program downloadable at www.epa.gov/iaq/schools/index.html. (PKP)
Are YouTube videos accurate and reliable on basic life support and cardiopulmonary resuscitation?
Yaylaci, Serpil; Serinken, Mustafa; Eken, Cenker; Karcioglu, Ozgur; Yilmaz, Atakan; Elicabuk, Hayri; Dal, Onur
2014-10-01
The objective of this study is to investigate the reliability and accuracy of the information in YouTube videos related to CPR and BLS against the 2010 CPR guidelines. YouTube was queried using four search terms ('CPR', 'cardiopulmonary resuscitation', 'BLS' and 'basic life support') for the period between 2011 and 2013. The sources that uploaded the videos, the recording time, the number of viewers in the study period, and the inclusion of humans or manikins were recorded. The videos were rated according to whether or not they displayed the correct order of resuscitative efforts in full accord with the 2010 CPR guidelines. Two hundred and nine videos met the inclusion criteria and comprised the study sample subjected to the analysis. The median score of the videos was 5 (IQR: 3.5-6). Only 11.5% (n = 24) of the videos were found to be compatible with the 2010 CPR guidelines with regard to the sequence of interventions. Videos uploaded by guideline bodies had significantly higher download rates than videos uploaded by other sources. The source of a video and its year of upload were not shown to have any significant effect on the scores received (P = 0.615 and 0.513, respectively). The number of downloads did not differ between videos that were and were not compatible with the guidelines (P = 0.832). Videos downloaded more than 10,000 times had higher scores than the others (P = 0.001). The majority of YouTube video clips purporting to be about CPR are not relevant educational material. Of those that do focus on teaching CPR, only a small minority optimally meet the 2010 Resuscitation Guidelines.
VizieR Online Data Catalog: A spectral approach to transit timing variations (Ofir+, 2018)
NASA Astrophysics Data System (ADS)
Ofir, A.; Xie, J.-W.; Jiang, C.-F.; Sari, R.; Aharonson, O.
2018-03-01
We used Kepler Data Release 24 as the source data and the Kepler Objects of Interest (KOIs) table downloaded from the NExSci archive on 2015 December 25 as the source of the list of candidate signals, and processed the 4706 objects not dispositioned as "false positive". We removed eclipsing binaries from the candidate list (see section 4.1 for further details). (1 data file).
Performance Assessment of Network Intrusion-Alert Prediction
2012-09-01
In this thesis, we use Snort to generate the intrusion detection alerts. Snort is an open source network intrusion detection system that has become a de facto standard for IPS (Snort, 2012). We chose Snort because it is an open source product that is free to download and can be deployed cross-platform.
Online Tools for Astronomy and Cosmochemistry
NASA Technical Reports Server (NTRS)
Meyer, B. S.
2005-01-01
Over the past year, the Webnucleo Group at Clemson University has been developing a web site with a number of interactive online tools for astronomy and cosmochemistry applications. The site uses SHP (Simplified Hypertext Preprocessor), which, because of its flexibility, allows us to embed almost any computer language into our web pages. For a description of SHP, please see http://www.joeldenny.com/. At our web site, an internet user may mine large and complex data sets, such as our stellar evolution models, and make graphs or tables of the results. The user may also run some of our detailed nuclear physics and astrophysics codes, such as our nuclear statistical equilibrium code, which is written in Fortran and C. Again, the user may make graphs and tables and download the results.
Mendel-GPU: haplotyping and genotype imputation on graphics processing units
Chen, Gary K.; Wang, Kai; Stram, Alex H.; Sobel, Eric M.; Lange, Kenneth
2012-01-01
Motivation: In modern sequencing studies, one can improve the confidence of genotype calls by phasing haplotypes using information from an external reference panel of fully typed unrelated individuals. However, the computational demands are so high that they prohibit researchers with limited computational resources from haplotyping large-scale sequence data. Results: Our graphics processing unit based software delivers haplotyping and imputation accuracies comparable to competing programs at a fraction of the computational cost and peak memory demand. Availability: Mendel-GPU, our OpenCL software, runs on Linux platforms and is portable across AMD and nVidia GPUs. Users can download both code and documentation at http://code.google.com/p/mendel-gpu/. Contact: gary.k.chen@usc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22954633
Scaling to diversity: The DERECHOS distributed infrastructure for analyzing and sharing data
NASA Astrophysics Data System (ADS)
Rilee, M. L.; Kuo, K. S.; Clune, T.; Oloso, A.; Brown, P. G.
2016-12-01
Integrating Earth Science data from diverse sources such as satellite imagery and simulation output can be expensive and time-consuming, limiting scientific inquiry and the quality of our analyses. Reducing these costs will improve innovation and quality in science. The current Earth Science data infrastructure focuses on downloading data based on requests formed from the search and analysis of associated metadata. And while the data products provided by archives may use the best available data sharing technologies, scientist end-users generally do not have such resources (including staff) available to them. Furthermore, only once an end-user has received the data from multiple diverse sources and has integrated them can the actual analysis and synthesis begin. The cost of getting from idea to where synthesis can start dramatically slows progress. In this presentation we discuss a distributed computational and data storage framework that eliminates much of the aforementioned cost. The SciDB distributed array database is central as it is optimized for scientific computing involving very large arrays, performing better than less specialized frameworks like Spark. Adding spatiotemporal functions to the SciDB creates a powerful platform for analyzing and integrating massive, distributed datasets. SciDB allows Big Earth Data analysis to be performed "in place" without the need for expensive downloads and end-user resources. Spatiotemporal indexing technologies such as the hierarchical triangular mesh enable the compute and storage affinity needed to efficiently perform co-located and conditional analyses minimizing data transfers. These technologies automate the integration of diverse data sources using the framework, a critical step beyond current metadata search and analysis. Instead of downloading data into their idiosyncratic local environments, end-users can generate and share data products integrated from diverse multiple sources using a common shared environment, turning distributed active archive centers (DAACs) from warehouses into distributed active analysis centers.
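The storage and compute affinity mentioned above comes from hierarchical spatial indexing: nearby points receive keys with shared prefixes, so related data land together. The toy quadtree key below illustrates the principle on a lat/lon grid; it is only a stand-in for the hierarchical triangular mesh, which subdivides spherical triangles rather than a rectangular grid.

    def quadkey(lat, lon, depth=8):
        """Toy spatial index key over (lat, lon); nearby points share prefixes."""
        key = ""
        lat0, lat1, lon0, lon1 = -90.0, 90.0, -180.0, 180.0
        for _ in range(depth):
            mlat, mlon = (lat0 + lat1) / 2, (lon0 + lon1) / 2
            # Encode which quadrant the point falls in at this level.
            q = (2 if lat >= mlat else 0) + (1 if lon >= mlon else 0)
            key += str(q)
            lat0, lat1 = (mlat, lat1) if lat >= mlat else (lat0, mlat)
            lon0, lon1 = (mlon, lon1) if lon >= mlon else (lon0, mlon)
        return key

    # Two nearby stations share a long key prefix, so co-located analyses can
    # be scheduled on the nodes that hold the matching key ranges.
    print(quadkey(38.90, -77.00))
    print(quadkey(38.91, -77.02))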
Authorship Attribution of Source Code
ERIC Educational Resources Information Center
Tennyson, Matthew F.
2013-01-01
Authorship attribution of source code is the task of deciding who wrote a program, given its source code. Applications include software forensics, plagiarism detection, and determining software ownership. A number of methods for the authorship attribution of source code have been presented in the past. A review of those existing methods is…
Two-step web-mining approach to study geology/geophysics-related open-source software projects
NASA Astrophysics Data System (ADS)
Behrends, Knut; Conze, Ronald
2013-04-01
Geology/geophysics is a highly interdisciplinary science, overlapping with, for instance, physics, biology and chemistry. In today's software-intensive work environments, geoscientists often encounter new open-source software from scientific fields that are only remotely related to the own field of expertise. We show how web-mining techniques can help to carry out systematic discovery and evaluation of such software. In a first step, we downloaded ~500 abstracts (each consisting of ~1 kb UTF-8 text) from agu-fm12.abstractcentral.com. This web site hosts the abstracts of all publications presented at AGU Fall Meeting 2012, the world's largest annual geology/geophysics conference. All abstracts belonged to the category "Earth and Space Science Informatics", an interdisciplinary label cross-cutting many disciplines such as "deep biosphere", "atmospheric research", and "mineral physics". Each publication was represented by a highly structured record with ~20 short data attributes, the largest attribute being the unstructured "abstract" field. We processed texts of the abstracts with the statistics software "R" to calculate a corpus and a term-document matrix. Using R package "tm", we applied text-mining techniques to filter data and develop hypotheses about software-development activities happening in various geology/geophysics fields. Analyzing the term-document matrix with basic techniques (e.g., word frequencies, co-occurrences, weighting) as well as more complex methods (clustering, classification), several key pieces of information were extracted. For example, text-mining can be used to identify scientists who are also developers of open-source scientific software, and the names of their programming projects and codes can also be identified. In a second step, based on the intermediate results found by processing the conference abstracts, any new hypotheses can be tested in another web-mining subproject: by merging the dataset with open data from github.com and stackoverflow.com. These popular, developer-centric websites have powerful application-programmer interfaces, and follow an open-data policy. In this regard, these sites offer a web-accessible reservoir of information that can be tapped to study questions such as: which open source software projects are eminent in the various geoscience fields? What are the most popular programming languages? How are they trending? Are there any interesting temporal patterns in committer activities? How large are programming teams and how do they change over time? What free software packages exist in the vast realms of related fields? Does the software from these fields have capabilities that might still be useful to me as a researcher, or can help me perform my work better? Are there any open-source projects that might be commercially interesting? This evaluation strategy reveals programming projects that tend to be new. As many important legacy codes are not hosted on open-source code-repositories, the presented search method might overlook some older projects.
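The first processing steps described above (corpus, term-document matrix, frequencies, co-occurrences) look like the following when sketched in Python with scikit-learn instead of the authors' R/tm workflow; the toy abstracts are placeholders for the downloaded records.

    from sklearn.feature_extraction.text import CountVectorizer

    abstracts = [
        "open source seismic inversion code on github",
        "python tool for atmospheric data, source code available",
        "mineral physics simulation, fortran source code",
    ]
    vec = CountVectorizer(stop_words="english")
    X = vec.fit_transform(abstracts)        # documents x terms matrix
    terms = vec.get_feature_names_out()
    freq = X.sum(axis=0).A1                 # corpus-wide term frequencies
    print(sorted(zip(freq, terms), reverse=True)[:5])   # most frequent terms
    cooc = (X.T @ X).toarray()              # term-term co-occurrence counts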
Crowd-sourcing relative preferences for ecosystem services in the St. Louis River AOC
Analysis of ecosystem service tradeoffs among project scenarios is more reliable when valuation data are available. Empirical valuation data are expensive and difficult to collect. As a possible alternative or supplement to empirical data, we downloaded and classified images from...
This tutorial provides instructions for accessing, retrieving, and downloading the following software to install on a host computer in support of Quantitative Microbial Risk Assessment (QMRA) modeling: • SDMProjectBuilder (which includes the Microbial Source Module as part...
75 FR 35457 - Draft of the 2010 Causal Analysis/Diagnosis Decision Information System (CADDIS)
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-22
... Causal Analysis/Diagnosis Decision Information System (CADDIS) AGENCY: Environmental Protection Agency... site, "2010 release of the Causal Analysis/Diagnosis Decision Information System (CADDIS)." The... analyses, downloadable software tools, and links to outside information sources. II. How to Submit Comments...
Navigating protected genomics data with UCSC Genome Browser in a Box.
Haeussler, Maximilian; Raney, Brian J; Hinrichs, Angie S; Clawson, Hiram; Zweig, Ann S; Karolchik, Donna; Casper, Jonathan; Speir, Matthew L; Haussler, David; Kent, W James
2015-03-01
Genome Browser in a Box (GBiB) is a small virtual machine version of the popular University of California Santa Cruz (UCSC) Genome Browser that can be run on a researcher's own computer. Once GBiB is installed, a standard web browser is used to access the virtual server and add personal data files from the local hard disk. Annotation data are loaded on demand through the Internet from UCSC or can be downloaded to the local computer for faster access. Software downloads and installation instructions are freely available for non-commercial use at https://genome-store.ucsc.edu/. GBiB requires the installation of the open-source software VirtualBox, available for all major operating systems, and the UCSC Genome Browser, which is open source and free for non-commercial use. Commercial use of GBiB and the Genome Browser requires a license (http://genome.ucsc.edu/license/).
Version 2.0 Visual Sample Plan (VSP): UXO Module Code Description and Verification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gilbert, Richard O.; Wilson, John E.; O'Brien, Robert F.
2003-05-06
The Pacific Northwest National Laboratory (PNNL) is developing statistical methods for determining the amount of geophysical surveying conducted along transects (swaths) that is needed to achieve specified levels of confidence of finding target areas (TAs) of anomalous readings and possibly unexploded ordnance (UXO) at closed, transferring and transferred (CTT) Department of Defense (DoD) ranges and other sites. The statistical methods developed by PNNL have been coded into the UXO module of the Visual Sample Plan (VSP) software, which is being developed by PNNL with support from the DoD, the U.S. Department of Energy (DOE), and the U.S. Environmental Protection Agency (EPA). (The VSP software and VSP Users Guide (Hassig et al., 2002) may be downloaded from http://dqo.pnl.gov/vsp.) This report describes and documents the statistical methods developed and the calculations and verification testing that have been conducted to verify that VSP's implementation of these methods is correct and accurate.
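A hedged illustration of the kind of geometric calculation behind transect-spacing design: for parallel transects of swath width w at spacing s, a circular target of diameter d is traversed with probability roughly (d + w)/s, capped at 1. This is the standard geometric argument, not necessarily the exact formula implemented in VSP.

    def traversal_probability(d, w, s):
        """Chance that a diameter-d circular target crosses some transect."""
        return min(1.0, (d + w) / s)

    def spacing_for_confidence(d, w, p):
        """Largest spacing that still traverses a diameter-d target with prob p."""
        return (d + w) / p

    print(traversal_probability(d=50, w=2, s=100))    # 0.52
    print(spacing_for_confidence(d=50, w=2, p=0.95))  # ~54.7 (same units as d, w)

Traversal is only the first stage; confidence of *detection* also depends on sensor performance over the traversed chord, which the full methodology models separately.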
An open real-time tele-stethoscopy system.
Foche-Perez, Ignacio; Ramirez-Payba, Rodolfo; Hirigoyen-Emparanza, German; Balducci-Gonzalez, Fernando; Simo-Reigadas, Francisco-Javier; Seoane-Pascual, Joaquin; Corral-Peñafiel, Jaime; Martinez-Fernandez, Andres
2012-08-23
Acute respiratory infections are the leading cause of childhood mortality. The lack of physicians in rural areas of developing countries makes their correct diagnosis and treatment difficult. The staff of rural health facilities (health-care technicians) may not be qualified to distinguish respiratory diseases by auscultation. For this reason, the goal of this project is the development of a tele-stethoscopy system that allows a physician to receive real-time cardio-respiratory sounds from a remote auscultation, as well as video images showing where the technician is placing the stethoscope on the patient's body. A real-time wireless stethoscopy system was designed. The initial requirements were: (1) the system must send audio and video synchronously over IP networks, without requiring an Internet connection; (2) it must preserve the quality of cardiorespiratory sounds, allowing the binaural pieces and the chestpiece of standard stethoscopes to be adapted; and (3) cardiorespiratory sounds should be recordable at both sides of the communication. In order to verify the diagnostic capacity of the system, a clinical validation with eight specialists has been designed. In a preliminary test, twelve patients were auscultated by all the physicians using the tele-stethoscopy system, versus a local auscultation using a traditional stethoscope. The system must allow listening to cardiac (systolic and diastolic murmurs, gallop sound, arrhythmias) and respiratory (rhonchi, rales and crepitations, wheeze, diminished and bronchial breath sounds, pleural friction rub) sounds. The design, development and initial validation of the real-time wireless tele-stethoscopy system are described in detail. The system was conceived from scratch as open-source and low-cost, and designed in such a way that many universities and small local companies in developing countries may manufacture it. Only free open-source software has been used in order to minimize manufacturing costs and to seek alliances to support its improvement and adaptation. The microcontroller firmware code, the computer software code and the PCB schematics are available for free download in a subversion repository hosted on SourceForge. It has been shown that real-time tele-stethoscopy, together with a videoconference system that allows a remote specialist to oversee the auscultation, may be a very helpful tool in rural areas of developing countries.
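A minimal sketch of the real-time streaming idea, assuming fixed-size PCM audio frames sent over UDP; the address, port, sample format, and frame size below are illustrative assumptions, and the actual system also carries synchronized video and uses its own microcontroller firmware.

    import socket

    AUDIO_FRAME = 320 * 2      # 20 ms of 16-bit mono audio at 16 kHz, in bytes
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def send_frame(pcm_bytes, host="192.0.2.10", port=5004):
        """Ship one audio frame to the remote physician's station (placeholder
        address from the TEST-NET range)."""
        assert len(pcm_bytes) == AUDIO_FRAME
        sock.sendto(pcm_bytes, (host, port))

    send_frame(bytes(AUDIO_FRAME))   # demo: one frame of silence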
Chandra Source Catalog: User Interfaces
NASA Astrophysics Data System (ADS)
Bonaventura, Nina; Evans, I. N.; Harbo, P. N.; Rots, A. H.; Tibbetts, M. S.; Van Stone, D. W.; Zografou, P.; Anderson, C. S.; Chen, J. C.; Davis, J. E.; Doe, S. M.; Evans, J. D.; Fabbiano, G.; Galle, E.; Gibbs, D. G.; Glotfelty, K. J.; Grier, J. D.; Hain, R.; Hall, D. M.; He, X.; Houck, J. C.; Karovska, M.; Lauer, J.; McCollough, M. L.; McDowell, J. C.; Miller, J. B.; Mitschang, A. W.; Morgan, D. L.; Nichols, J. S.; Nowak, M. A.; Plummer, D. A.; Primini, F. A.; Refsdal, B. L.; Siemiginowska, A. L.; Sundheim, B. A.; Winkelman, S. L.
2010-03-01
The CSCview data mining interface is available for browsing the Chandra Source Catalog (CSC) and downloading tables of quality-assured source properties and data products. Once the desired source properties and search criteria are entered into the CSCview query form, the resulting source matches are returned in a table along with the values of the requested source properties for each source. (The catalog can be searched on any source property, not just position.) At this point, the table of search results may be saved to a text file, and the available data products for each source may be downloaded. CSCview save files are output in RDB-like and VOTable format. The available CSC data products include event files, spectra, lightcurves, and images, all of which are processed with the CIAO software. CSC data may also be accessed non-interactively with Unix command-line tools such as cURL and Wget, using ADQL 2.0 query syntax. In fact, CSCview features a separate ADQL query form for those who wish to specify this type of query within the GUI. Several interfaces are available for learning if a source is included in the catalog (in addition to CSCview): 1) the CSC interface to Sky in Google Earth shows the footprint of each Chandra observation on the sky, along with the CSC footprint for comparison (CSC source properties are also accessible when a source within a Chandra field-of-view is clicked); 2) the CSC Limiting Sensitivity online tool indicates if a source at an input celestial location was too faint for detection; 3) an IVOA Simple Cone Search interface locates all CSC sources within a specified radius of an R.A. and Dec.; and 4) the CSC-SDSS cross-match service returns the list of sources common to the CSC and SDSS, either all such sources or a subset based on search criteria.
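Non-interactive access of the kind described (cURL or Wget with ADQL 2.0 query syntax) can be sketched as follows; the endpoint URL, table name, column names, and parameter names are placeholders, not the documented CSC interface, so consult the catalog documentation before use.

    import requests

    # ADQL 2.0 cone search: all sources within 0.05 deg of (246.8, -24.5).
    adql = """
    SELECT m.name, m.ra, m.dec
    FROM csc.master_source m
    WHERE 1 = CONTAINS(POINT('ICRS', m.ra, m.dec),
                       CIRCLE('ICRS', 246.8, -24.5, 0.05))
    """
    resp = requests.get("https://example.edu/csc/query",   # hypothetical endpoint
                        params={"QUERY": adql, "FORMAT": "votable"})
    print(resp.status_code)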
Côté, Richard G; Jones, Philip; Martens, Lennart; Kerrien, Samuel; Reisinger, Florian; Lin, Quan; Leinonen, Rasko; Apweiler, Rolf; Hermjakob, Henning
2007-10-18
Each major protein database uses its own conventions when assigning protein identifiers. Resolving the various, potentially unstable, identifiers that refer to identical proteins is a major challenge. This is a common problem when attempting to unify datasets that have been annotated with proteins from multiple data sources or querying data providers with one flavour of protein identifiers when the source database uses another. Partial solutions for protein identifier mapping exist but they are limited to specific species or techniques and to a very small number of databases. As a result, we have not found a solution that is generic enough and broad enough in mapping scope to suit our needs. We have created the Protein Identifier Cross-Reference (PICR) service, a web application that provides interactive and programmatic (SOAP and REST) access to a mapping algorithm that uses the UniProt Archive (UniParc) as a data warehouse to offer protein cross-references based on 100% sequence identity to proteins from over 70 distinct source databases loaded into UniParc. Mappings can be limited by source database, taxonomic ID and activity status in the source database. Users can copy/paste or upload files containing protein identifiers or sequences in FASTA format to obtain mappings using the interactive interface. Search results can be viewed in simple or detailed HTML tables or downloaded as comma-separated values (CSV) or Microsoft Excel (XLS) files suitable for use in a local database or a spreadsheet. Alternatively, a SOAP interface is available to integrate PICR functionality in other applications, as is a lightweight REST interface. We offer a publicly available service that can interactively map protein identifiers and protein sequences to the majority of commonly used protein databases. Programmatic access is available through a standards-compliant SOAP interface or a lightweight REST interface. The PICR interface, documentation and code examples are available at http://www.ebi.ac.uk/Tools/picr.
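Programmatic use of a REST identifier-mapping service like PICR reduces to a parameterized HTTP GET. The endpoint path and parameter names below are assumptions for illustration; the PICR documentation linked above describes the real SOAP and REST interfaces.

    import requests

    # Map one accession onto selected source databases (assumed REST endpoint).
    base = "http://www.ebi.ac.uk/Tools/picr/rest/getUPIForAccession"
    params = {
        "accession": "P29375",                    # example input identifier
        "database": ["SWISSPROT", "ENSEMBL"],     # restrict the mapping targets
    }
    resp = requests.get(base, params=params)
    print(resp.status_code)
    print(resp.text[:200])    # XML payload with cross-referenced identifiers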
Côté, Richard G; Jones, Philip; Martens, Lennart; Kerrien, Samuel; Reisinger, Florian; Lin, Quan; Leinonen, Rasko; Apweiler, Rolf; Hermjakob, Henning
2007-01-01
Background Each major protein database uses its own conventions when assigning protein identifiers. Resolving the various, potentially unstable, identifiers that refer to identical proteins is a major challenge. This is a common problem when attempting to unify datasets that have been annotated with proteins from multiple data sources or querying data providers with one flavour of protein identifiers when the source database uses another. Partial solutions for protein identifier mapping exist but they are limited to specific species or techniques and to a very small number of databases. As a result, we have not found a solution that is generic enough and broad enough in mapping scope to suit our needs. Results We have created the Protein Identifier Cross-Reference (PICR) service, a web application that provides interactive and programmatic (SOAP and REST) access to a mapping algorithm that uses the UniProt Archive (UniParc) as a data warehouse to offer protein cross-references based on 100% sequence identity to proteins from over 70 distinct source databases loaded into UniParc. Mappings can be limited by source database, taxonomic ID and activity status in the source database. Users can copy/paste or upload files containing protein identifiers or sequences in FASTA format to obtain mappings using the interactive interface. Search results can be viewed in simple or detailed HTML tables or downloaded as comma-separated values (CSV) or Microsoft Excel (XLS) files suitable for use in a local database or a spreadsheet. Alternatively, a SOAP interface is available to integrate PICR functionality in other applications, as is a lightweight REST interface. Conclusion We offer a publicly available service that can interactively map protein identifiers and protein sequences to the majority of commonly used protein databases. Programmatic access is available through a standards-compliant SOAP interface or a lightweight REST interface. The PICR interface, documentation and code examples are available at http://www.ebi.ac.uk/Tools/picr. PMID:17945017
Identifying personal microbiomes using metagenomic codes
Franzosa, Eric A.; Huang, Katherine; Meadow, James F.; Gevers, Dirk; Lemon, Katherine P.; Bohannan, Brendan J. M.; Huttenhower, Curtis
2015-01-01
Community composition within the human microbiome varies across individuals, but it remains unknown if this variation is sufficient to uniquely identify individuals within large populations or stable enough to identify them over time. We investigated this by developing a hitting set-based coding algorithm and applying it to the Human Microbiome Project population. Our approach defined body site-specific metagenomic codes: sets of microbial taxa or genes prioritized to uniquely and stably identify individuals. Codes capturing strain variation in clade-specific marker genes were able to distinguish among 100s of individuals at an initial sampling time point. In comparisons with follow-up samples collected 30–300 d later, ∼30% of individuals could still be uniquely pinpointed using metagenomic codes from a typical body site; coincidental (false positive) matches were rare. Codes based on the gut microbiome were exceptionally stable and pinpointed >80% of individuals. The failure of a code to match its owner at a later time point was largely explained by the loss of specific microbial strains (at current limits of detection) and was only weakly associated with the length of the sampling interval. In addition to highlighting patterns of temporal variation in the ecology of the human microbiome, this work demonstrates the feasibility of microbiome-based identifiability—a result with important ethical implications for microbiome study design. The datasets and code used in this work are available for download from huttenhower.sph.harvard.edu/idability. PMID:25964341
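The hitting-set idea above is easy to sketch: greedily choose features present in a target individual until every other individual lacks at least one chosen feature. The toy data and greedy rule below are illustrative only; the published algorithm (available from huttenhower.sph.harvard.edu/idability) adds prioritization for stability and detection limits.

    def greedy_code(target, others):
        """Pick features of `target` until each other individual lacks one."""
        code, unresolved = [], list(others)
        candidates = set(target)
        while unresolved:
            # feature that distinguishes the most remaining individuals
            best = max(candidates, key=lambda f: sum(f not in o for o in unresolved))
            if all(best in o for o in unresolved):
                return None  # remaining individuals share every candidate feature
            code.append(best)
            candidates.discard(best)
            unresolved = [o for o in unresolved if best in o]
        return code

    # Toy example: taxa observed in three subjects.
    subjects = {"A": {"t1", "t2", "t5"}, "B": {"t1", "t3"}, "C": {"t2", "t3", "t4"}}
    print(greedy_code(subjects["A"], [subjects["B"], subjects["C"]]))  # ['t5']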
Measuring Diagnoses: ICD Code Accuracy
O'Malley, Kimberly J; Cook, Karon F; Price, Matt D; Wildes, Kimberly Raiford; Hurdle, John F; Ashton, Carol M
2005-01-01
Objective To examine potential sources of errors at each step of the described inpatient International Classification of Diseases (ICD) coding process. Data Sources/Study Setting The use of disease codes from the ICD has expanded from classifying morbidity and mortality information for statistical purposes to diverse sets of applications in research, health care policy, and health care finance. By describing a brief history of ICD coding, detailing the process for assigning codes, identifying where errors can be introduced into the process, and reviewing methods for examining code accuracy, we help code users more systematically evaluate code accuracy for their particular applications. Study Design/Methods We summarize the inpatient ICD diagnostic coding process from patient admission to diagnostic code assignment. We examine potential sources of errors at each step and offer code users a tool for systematically evaluating code accuracy. Principal Findings Main error sources along the “patient trajectory” include amount and quality of information at admission, communication among patients and providers, the clinician's knowledge and experience with the illness, and the clinician's attention to detail. Main error sources along the “paper trail” include variance in the electronic and written records, coder training and experience, facility quality-control efforts, and unintentional and intentional coder errors, such as misspecification, unbundling, and upcoding. Conclusions By clearly specifying the code assignment process and heightening their awareness of potential error sources, code users can better evaluate the applicability and limitations of codes for their particular situations. ICD codes can then be used in the most appropriate ways. PMID:16178999
VizieR Online Data Catalog: Swift AGN and Cluster Survey (SACS). I. (Dai+, 2015)
NASA Astrophysics Data System (ADS)
Dai, X.; Griffin, R. D.; Kochanek, C. S.; Nugent, J. M.; Bregman, J. N.
2015-07-01
The Swift XRT observations of GRB fields were downloaded from the HEASARC website. This includes ~9yr of data through 2013 July 27. We matched the sources to the WISE all-sky catalog (see section 3.2). (5 data files).
The International Energy Portal includes a powerful data browser that provides country-level energy data; many countries have at least 30 years of historical data. The data browser provides users the ability to view and download complete datasets for consumption, production, trade, reserves, and carbon dioxide emissions for different fuels and energy sources.
Taxonomy of Spyware and Empirical Study of Network Drive-By-Downloads
2005-09-01
[Fragmentary table and reference excerpts: sampled website sectors with site counts (university homepages, 500, source: www.utexas.edu/world/univ/state/; adult entertainment, 418), a partial citation to a cyber-threat news article (.../industries/technology/2004-07-01-cyber-threat_x.htm, last accessed September 7, 2005) and [2] American Library Association, Great Web Sites for Kids.]
Data Mining Meets HCI: Making Sense of Large Graphs
2012-07-01
graph algorithms, won the Open Source Software World Challenge, Silver Award. We have released Pegasus as free, open-source software, downloaded by... METIS [77], spectral clustering [108], and the parameter-free “Cross-associations” (CA) [26]. Belief Propagation can also be used for clustering, as... number of tools have been developed to support “landscape” views of information. These include WebBook and Web Forager [23], which use a book metaphor
XtalOpt version r9: An open-source evolutionary algorithm for crystal structure prediction
Falls, Zackary; Lonie, David C.; Avery, Patrick; ...
2015-10-23
This is a new version of XtalOpt, an evolutionary algorithm for crystal structure prediction, available for download from the CPC library or the XtalOpt website, http://xtalopt.github.io. XtalOpt is published under the GNU Public License (GPL), which is an open source license that is recognized by the Open Source Initiative. The new version, detailed here, incorporates many bug fixes and new features, and predicts the crystal structure of a system from its stoichiometry alone using evolutionary algorithms.
Operational rate-distortion performance for joint source and channel coding of images.
Ruf, M J; Modestino, J W
1999-01-01
This paper describes a methodology for evaluating the operational rate-distortion behavior of combined source and channel coding schemes with particular application to images. In particular, we demonstrate use of the operational rate-distortion function to obtain the optimum tradeoff between source coding accuracy and channel error protection under the constraint of a fixed transmission bandwidth for the investigated transmission schemes. Furthermore, we develop information-theoretic bounds on performance for specific source and channel coding systems and demonstrate that our combined source-channel coding methodology applied to different schemes results in operational rate-distortion performance which closely approaches these theoretical limits. We concentrate specifically on a wavelet-based subband source coding scheme and the use of binary rate-compatible punctured convolutional (RCPC) codes for transmission over the additive white Gaussian noise (AWGN) channel. Explicit results for real-world images demonstrate the efficacy of this approach.
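The optimization described can be caricatured in a few lines: under a fixed bit budget, sweep the channel code rate and keep the split between source bits and error protection that minimizes expected end-to-end distortion. The distortion and residual-error models below are toy stand-ins, not the paper's measured curves or actual RCPC code performance.

    import math

    TOTAL_RATE = 1.0                      # fixed transmission budget (bits/pixel)

    def source_distortion(rate):
        return 2 ** (-2 * rate)           # classic D(R) ~ 2^(-2R) toy model

    def residual_error(code_rate):
        # stand-in for decoder failure probability: more redundancy
        # (lower code rate) means fewer residual channel errors
        return min(1.0, 0.5 * math.exp(-4 * (1 - code_rate) / code_rate))

    def end_to_end(code_rate, fail_penalty=1.0):
        src_rate = TOTAL_RATE * code_rate     # source bits that fit the budget
        p = residual_error(code_rate)
        return (1 - p) * source_distortion(src_rate) + p * fail_penalty

    # sweep typical punctured-code rates and keep the best operating point
    dist, rate = min((end_to_end(r), r) for r in (0.5, 0.57, 0.67, 0.8, 0.89, 1.0))
    print("best code rate %.2f gives distortion %.4f" % (rate, dist))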
NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations
NASA Astrophysics Data System (ADS)
Valiev, M.; Bylaska, E. J.; Govind, N.; Kowalski, K.; Straatsma, T. P.; Van Dam, H. J. J.; Wang, D.; Nieplocha, J.; Apra, E.; Windus, T. L.; de Jong, W. A.
2010-09-01
The latest release of NWChem delivers an open-source computational chemistry package with extensive capabilities for large scale simulations of chemical and biological systems. Utilizing a common computational framework, diverse theoretical descriptions can be used to provide the best solution for a given scientific problem. Scalable parallel implementations and modular software design enable efficient utilization of current computational architectures. This paper provides an overview of NWChem focusing primarily on the core theoretical modules provided by the code and their parallel performance.
Program summary
Program title: NWChem
Catalogue identifier: AEGI_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGI_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Open Source Educational Community License
No. of lines in distributed program, including test data, etc.: 11 709 543
No. of bytes in distributed program, including test data, etc.: 680 696 106
Distribution format: tar.gz
Programming language: Fortran 77, C
Computer: all Linux based workstations and parallel supercomputers, Windows and Apple machines
Operating system: Linux, OS X, Windows
Has the code been vectorised or parallelized?: Code is parallelized
Classification: 2.1, 2.2, 3, 7.3, 7.7, 16.1, 16.2, 16.3, 16.10, 16.13
Nature of problem: Large-scale atomistic simulations of chemical and biological systems require efficient and reliable methods for ground and excited solutions of many-electron Hamiltonian, analysis of the potential energy surface, and dynamics.
Solution method: Ground and excited solutions of many-electron Hamiltonian are obtained utilizing density-functional theory, many-body perturbation approach, and coupled cluster expansion. These solutions or a combination thereof with classical descriptions are then used to analyze potential energy surface and perform dynamical simulations.
Additional comments: Full documentation is provided in the distribution file. This includes an INSTALL file giving details of how to build the package. A set of test runs is provided in the examples directory. The distribution file for this program is over 90 Mbytes and therefore is not delivered directly when download or Email is requested. Instead a html file giving details of how the program can be obtained is sent.
Running time: Running time depends on the size of the chemical system, complexity of the method, number of cpu's and the computational task. It ranges from several seconds for serial DFT energy calculations on a few atoms to several hours for parallel coupled cluster energy calculations on tens of atoms or ab-initio molecular dynamics simulation on hundreds of atoms.
PCR Amplicon Prediction from Multiplex Degenerate Primer and Probe Sets
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gardner, S. N.
2013-08-08
Assessing primer specificity and predicting both desired and off-target amplification products is an essential step for robust PCR assay design. Code is described to predict potential polymerase chain reaction (PCR) amplicons in a large sequence database such as NCBI nt from either a singleplex or a large multiplexed set of primers, allowing degenerate primer and probe bases and target mismatches. The code annotates amplicons with gene information automatically downloaded from NCBI, and optionally it can predict whether there are also TaqMan/Luminex probe matches within predicted amplicons.
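The degenerate-base matching at the heart of such a tool can be sketched by expanding IUPAC codes into regular-expression character classes. This toy version ignores mismatch tolerance, multiplexing and the NCBI annotation step, and assumes a reverse primer without degenerate bases.

    import re

    IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]",
             "S": "[CG]", "W": "[AT]", "K": "[GT]", "M": "[AC]", "B": "[CGT]",
             "D": "[AGT]", "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]"}

    def revcomp(seq):
        comp = {"A": "T", "T": "A", "C": "G", "G": "C"}  # plain bases only
        return "".join(comp[b] for b in reversed(seq))

    def primer_regex(primer):
        return "".join(IUPAC[b] for b in primer)

    def find_amplicons(template, fwd, rev, max_insert=2000):
        # an amplicon runs from the forward primer to the reverse-complemented
        # reverse primer on the plus strand
        pattern = (primer_regex(fwd) + ".{0,%d}?" % max_insert
                   + primer_regex(revcomp(rev)))
        return [m.group(0) for m in re.finditer(pattern, template)]

    template = "CCGA" + "ATGACA" + "TTTTTTTT" + "TTAGCC" + "ACTT"
    print(find_amplicons(template, "ATGRCA", "GGCTAA"))  # ['ATGACA...TTAGCC']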
Yam, Chun-Shan
2007-11-01
The purpose of this article is to describe an alternative for creating scrollable movie loops for electronic presentations including PowerPoint. The alternative provided in this article enables academic radiologists to present scrollable movie loops in PowerPoint. The scrolling capability is created using Flash ActionScript. A Flash template with the required ActionScript code is provided. Users can simply download the template and follow the step-by-step demonstration to create scrollable movie loops. No previous ActionScript programming knowledge is necessary.
Development of a Spacecraft Materials Selector Expert System
NASA Technical Reports Server (NTRS)
Pippin, G.; Kauffman, W. (Technical Monitor)
2002-01-01
This report contains a description of the knowledge base tool and examples of its use. A downloadable version of the Spacecraft Materials Selector (SMS) knowledge base is available through the NASA Space Environments and Effects Program. The "Spacecraft Materials Selector" knowledge base is part of an electronic expert system. The expert system consists of an inference engine that contains the "decision-making" code and the knowledge base that contains the selected body of information. The inference engine is a software package previously developed at Boeing, called the Boeing Expert System Tool (BEST) kit.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thorn, David L.
Code is written in BASIC to run using the web-available Just BASIC interpreter, available at justbasic.com. It drives a set of stepper motors to mechanize the operation of pipetting radioactive solutions within a hot cell, and it communicates via serial port with the C4 stepper controller sold by Arrick; see http://www.arrickrobotics.com/c4md2.html. It is intended to operate stand-alone: the Just BASIC interpreter/application is downloaded onto a PC, the application runs the software program Pipettor, and the instructions are included as comments within the software.
Geospatial resources for the geologic community: The USGS National Map
Witt, Emitt C.
2015-01-01
Geospatial data are a key component of investigating, interpreting, and communicating the geological sciences. Locating geospatial data can be time-consuming, which detracts from time spent on a study because these data are not obviously placed in central locations or are served from many disparate databases. The National Map of the US Geological Survey is a publicly available resource for accessing the geospatial base map data needs of the geological community from a central location. The National Map data are available through a viewer and download platform providing access to eight primary data themes, plus the US Topo and scanned historical topographic maps. The eight themes are elevation, orthoimagery, hydrography, geographic names, boundaries, transportation, structures, and land cover, and they are being offered for download as predefined tiles in formats supported by leading geographic information system software. Data tiles are periodically refreshed to capture the most current content and are an efficient method for disseminating and receiving geospatial information. Elevation data, for example, are offered as a download from the National Map as 1° × 1° tiles for the 10- and 30- m products and as 15′ × 15′ tiles for the higher-resolution 3-m product. Vector data sets with smaller file sizes are offered at several tile sizes and formats. Partial tiles are not a download option—any prestaged data that intersect the requesting bounding box will be, in their entirety, part of the download order. While there are many options for accessing geospatial data via the Web, the National Map represents authoritative sources of data that are documented and can be referenced for citation and inclusion in scientific publications. Therefore, National Map products and services should be part of a geologist’s first stop for geospatial information and data.
Continuation of research into language concepts for the mission support environment: Source code
NASA Technical Reports Server (NTRS)
Barton, Timothy J.; Ratner, Jeremiah M.
1991-01-01
Research into language concepts for the Mission Control Center is presented. A computer code for generating and compiling source code files is presented. The file contains the routines which allow source code files to be created and compiled. The build process assumes that all elements and the COMP exist in the current directory. The build process places as much of the code generation as possible on the preprocessor. A summary is given of the source files as used and/or manipulated by the build routine.
Broadcasting a Lab Measurement over Existing Conductor Networks
ERIC Educational Resources Information Center
Knipp, Peter A.
2009-01-01
Students learn about physical laws and the scientific method when they analyze experimental data in a laboratory setting. Three common sources exist for the experimental data that they analyze: (1) "hands-on" measurements by the students themselves, (2) electronic transfer (by downloading a spreadsheet, video, or computer-aided data-acquisition…
Homology Modeling and Molecular Docking for the Science Curriculum
ERIC Educational Resources Information Center
McDougal, Owen M.; Cornia, Nic; Sambasivarao, S. V.; Remm, Andrew; Mallory, Chris; Oxford, Julia Thom; Maupin, C. Mark; Andersen, Tim
2014-01-01
DockoMatic 2.0 is a powerful open source software program (downloadable from sourceforge.net) that allows users to utilize a readily accessible computational tool to explore biomolecules and their interactions. This manuscript describes a practical tutorial for use in the undergraduate curriculum that introduces students to macromolecular…
No Strings Attached: Open Source Solutions
ERIC Educational Resources Information Center
Fredricks, Kathy
2009-01-01
Imagine downloading a new software application and not having to worry about licensing, finding dollars in the budget, or incurring additional maintenance costs. Imagine finding a Web design tool in the public domain--free for use. Imagine major universities that provide online courses with no strings attached. Imagine online textbooks without a…
Methods and Software for Building Bibliographic Data Bases.
ERIC Educational Resources Information Center
Daehn, Ralph M.
1985-01-01
This in-depth look at database management systems (DBMS) for microcomputers covers data entry, information retrieval, security, DBMS software and design, and downloading of literature search results. The advantages of in-house systems versus online search vendors are discussed, and specifications of three software packages and 14 sources are…
An audit of alcohol brand websites.
Gordon, Ross
2011-11-01
The study investigated the nature and content of alcohol brand websites in the UK. The research involved an audit of the websites of the 10 leading alcohol brands by sales in the UK across four categories: lager, spirits, Flavoured Alcoholic Beverages and cider/perry. Each site was visited twice over a 1-month period with site features and content recorded using a pro-forma. The content of websites was then reviewed against the regulatory codes governing broadcast advertising of alcohol. It was found that 27 of 40 leading alcohol brands had a dedicated website. Sites featured sophisticated content, including sports and music sections, games, downloads and competitions. Case studies of two brand websites demonstrate the range of content features on such sites. A review of the application of regulatory codes covering traditional advertising found some content may breach the codes. Study findings illustrate the sophisticated range of content accessible on alcohol brand websites. When applying regulatory codes covering traditional alcohol marketing channels it is apparent that some content on alcohol brand websites would breach the codes. This suggests the regulation of alcohol brand websites may be an issue requiring attention from policymakers. Further research in this area would help inform this process. © 2010 Australasian Professional Society on Alcohol and other Drugs.
RiboMaker: computational design of conformation-based riboregulation.
Rodrigo, Guillermo; Jaramillo, Alfonso
2014-09-01
The ability to engineer control systems of gene expression is instrumental for synthetic biology. Thus, bioinformatic methods that assist such engineering are appealing because they can guide the sequence design and prevent costly experimental screening. In particular, RNA is an ideal substrate to de novo design regulators of protein expression by following sequence-to-function models. We have implemented a novel algorithm, RiboMaker, aimed at the computational, automated design of bacterial riboregulation. RiboMaker reads the sequence and structure specifications, which codify for a gene regulatory behaviour, and optimizes the sequences of a small regulatory RNA and a 5'-untranslated region for an efficient intermolecular interaction. To this end, it implements an evolutionary design strategy, where random mutations are selected according to a physicochemical model based on free energies. The resulting sequences can then be tested experimentally, providing a new tool for synthetic biology, and also for investigating the riboregulation principles in natural systems. Web server is available at http://ribomaker.jaramillolab.org/. Source code, instructions and examples are freely available for download at http://sourceforge.net/projects/ribomaker/. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
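Stripped of the physics, the evolutionary strategy described is mutate-and-select against an energy objective. The sketch below substitutes a trivial scoring function for RiboMaker's physicochemical free-energy model.

    import random

    BASES = "ACGU"

    def score(seq):
        # placeholder objective standing in for the free-energy-based
        # interaction model; here we simply reward G-C content
        return seq.count("G") + seq.count("C")

    def evolve(seq, steps=1000, seed=0):
        rng = random.Random(seed)
        best = list(seq)
        for _ in range(steps):
            cand = best[:]
            cand[rng.randrange(len(cand))] = rng.choice(BASES)  # point mutation
            if score(cand) >= score(best):                      # selection
                best = cand
        return "".join(best)

    print(evolve("AUAUAUAUAUAU"))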
PhyloExplorer: a web server to validate, explore and query phylogenetic trees
Ranwez, Vincent; Clairon, Nicolas; Delsuc, Frédéric; Pourali, Saeed; Auberval, Nicolas; Diser, Sorel; Berry, Vincent
2009-01-01
Background Many important problems in evolutionary biology require molecular phylogenies to be reconstructed. Phylogenetic trees must then be manipulated for subsequent inclusion in publications or analyses such as supertree inference and tree comparisons. However, no tool is currently available to facilitate the management of tree collections providing, for instance: standardisation of taxon names among trees with respect to a reference taxonomy; selection of relevant subsets of trees or sub-trees according to a taxonomic query; or simply computation of descriptive statistics on the collection. Moreover, although several databases of phylogenetic trees exist, there is currently no easy way to find trees that are both relevant and complementary to a given collection of trees. Results We propose a tool to facilitate assessment and management of phylogenetic tree collections. Given an input collection of rooted trees, PhyloExplorer provides facilities for obtaining statistics describing the collection, correcting invalid taxon names, extracting taxonomically relevant parts of the collection using a dedicated query language, and identifying related trees in the TreeBASE database. Conclusion PhyloExplorer is a simple and interactive website implemented through underlying Python libraries and MySQL databases. It is available at: and the source code can be downloaded from: . PMID:19450253
LiGRO: a graphical user interface for protein-ligand molecular dynamics.
Kagami, Luciano Porto; das Neves, Gustavo Machado; da Silva, Alan Wilter Sousa; Caceres, Rafael Andrade; Kawano, Daniel Fábio; Eifler-Lima, Vera Lucia
2017-10-04
To speed up the drug-discovery process, molecular dynamics (MD) calculations performed in GROMACS can be coupled to docking simulations for the post-screening analyses of large compound libraries. This requires generating the topology of the ligands in different software, some basic knowledge of Linux command lines, and a certain familiarity in handling the output files. LiGRO, the Python-based graphical interface introduced here, was designed to overcome these protein-ligand parameterization challenges by allowing the graphical (non command line-based) control of GROMACS (MD and analysis), ACPYPE (ligand topology builder) and PLIP (protein-binder interactions monitor), programs that can be used together to fully perform and analyze the outputs of complex MD simulations (including energy minimization and NVT/NPT equilibration). By allowing the calculation of linear interaction energies in a simple and quick fashion, LiGRO can be used in the drug-discovery pipeline to select compounds with a better protein-binding interaction profile. The design of LiGRO allows researchers to freely download and modify the software, with the source code being available under the terms of a GPLv3 license from http://www.ufrgs.br/lasomfarmacia/ligro/.
Griss, Johannes; Reisinger, Florian; Hermjakob, Henning; Vizcaíno, Juan Antonio
2012-03-01
We here present the jmzReader library: a collection of Java application programming interfaces (APIs) to parse the most commonly used peak list and XML-based mass spectrometry (MS) data formats: DTA, MS2, MGF, PKL, mzXML, mzData, and mzML (based on the already existing API jmzML). The library is optimized to be used in conjunction with mzIdentML, the recently released standard data format for reporting protein and peptide identifications, developed by the HUPO proteomics standards initiative (PSI). mzIdentML files do not contain spectra data but contain references to different kinds of external MS data files. As a key functionality, all parsers implement a common interface that supports the various methods used by mzIdentML to reference external spectra. Thus, when developing software for mzIdentML, programmers no longer have to support multiple MS data file formats but only this one interface. The library (which includes a viewer) is open source and, together with detailed documentation, can be downloaded from http://code.google.com/p/jmzreader/. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Leveraging Hierarchical Population Structure in Discrete Association Studies
Carlson, Jonathan; Kadie, Carl; Mallal, Simon; Heckerman, David
2007-01-01
Population structure can confound the identification of correlations in biological data. Such confounding has been recognized in multiple biological disciplines, resulting in a disparate collection of proposed solutions. We examine several methods that correct for confounding on discrete data with hierarchical population structure and identify two distinct confounding processes, which we call coevolution and conditional influence. We describe these processes in terms of generative models and show that these generative models can be used to correct for the confounding effects. Finally, we apply the models to three applications: identification of escape mutations in HIV-1 in response to specific HLA-mediated immune pressure, prediction of coevolving residues in an HIV-1 peptide, and a search for genotypes that are associated with bacterial resistance traits in Arabidopsis thaliana. We show that coevolution is a better description of confounding in some applications and conditional influence is better in others. That is, we show that no single method is best for addressing all forms of confounding. Analysis tools based on these models are available on the internet as both web based applications and downloadable source code at http://atom.research.microsoft.com/bio/phylod.aspx. PMID:17611623
Adapting smart phone applications about physics education to blind students
NASA Astrophysics Data System (ADS)
Bülbül, M. Ş.; Yiğit, N.; Garip, B.
2016-04-01
Today, most of the necessary equipment in a physics laboratory is available to smartphone users via applications. Physics teachers may measure anything from acceleration to sound volume with a phone's internal sensors. These sensors collect data, and smartphone applications make the raw data visible. Teachers who do not have well-equipped laboratories at their schools may have an opportunity to conduct experiments with the help of smartphones. In this study, we analyzed available open source physics education applications from the perspective of blind users in inclusive learning environments. All apps were categorized as fully, partially or non-supported. The roles of a blind learner's friend during use of an application were categorized as reader, describer or user. The apps mentioned in the study are also compared on additional attributes such as size and download rates. Beyond dedicated apps, smartphones can also provide information such as weather data via the internet, along with other extra information for different experiments in the physics lab. QR-code reading and augmented reality are two other opportunities provided by smartphones for users in physics labs. We also summarized blind learners' smartphone experiences from the literature and listed some suggestions for application designers about concepts in physics.
Gradient Magnitude Similarity Deviation: A Highly Efficient Perceptual Image Quality Index.
Xue, Wufeng; Zhang, Lei; Mou, Xuanqin; Bovik, Alan C
2014-02-01
It is an important task to faithfully evaluate the perceptual quality of output images in many applications, such as image compression, image restoration, and multimedia streaming. A good image quality assessment (IQA) model should not only deliver high quality prediction accuracy, but also be computationally efficient. The efficiency of IQA metrics is becoming particularly important due to the increasing proliferation of high-volume visual data in high-speed networks. We present a new effective and efficient IQA model, called gradient magnitude similarity deviation (GMSD). The image gradients are sensitive to image distortions, while different local structures in a distorted image suffer different degrees of degradations. This motivates us to explore the use of global variation of gradient based local quality map for overall image quality prediction. We find that the pixel-wise gradient magnitude similarity (GMS) between the reference and distorted images combined with a novel pooling strategy, the standard deviation of the GMS map, can accurately predict perceptual image quality. The resulting GMSD algorithm is much faster than most state-of-the-art IQA methods, and delivers highly competitive prediction accuracy. MATLAB source code of GMSD can be downloaded at http://www4.comp.polyu.edu.hk/~cslzhang/IQA/GMSD/GMSD.htm.
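The recipe translates almost directly into NumPy: gradient magnitudes (Prewitt-style filters assumed here), a pixel-wise similarity map with a stabilizing constant, and standard-deviation pooling. The released MATLAB code is the authoritative reference and includes details (such as downsampling) omitted in this sketch.

    import numpy as np
    from scipy.ndimage import convolve

    def gmsd(ref, dist, c=170.0):
        hx = np.array([[1 / 3, 0, -1 / 3]] * 3)   # Prewitt-style filter
        hy = hx.T
        def grad_mag(img):
            img = img.astype(float)
            return np.sqrt(convolve(img, hx) ** 2 + convolve(img, hy) ** 2)
        g1, g2 = grad_mag(ref), grad_mag(dist)
        gms = (2 * g1 * g2 + c) / (g1 ** 2 + g2 ** 2 + c)  # similarity in (0, 1]
        return gms.std()                                    # deviation pooling

    rng = np.random.default_rng(0)
    ref = rng.uniform(0, 255, (64, 64))
    print(gmsd(ref, ref))                                   # identical -> 0.0
    print(gmsd(ref, ref + rng.normal(0, 20, ref.shape)))    # distorted -> larger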
Basin Hopping Graph: a computational framework to characterize RNA folding landscapes
Kucharík, Marcel; Hofacker, Ivo L.; Stadler, Peter F.; Qin, Jing
2014-01-01
Motivation: RNA folding is a complicated kinetic process. The minimum free energy structure provides only a static view of the most stable conformational state of the system. It is insufficient to give detailed insights into the dynamic behavior of RNAs. A sufficiently sophisticated analysis of the folding free energy landscape, however, can provide the relevant information. Results: We introduce the Basin Hopping Graph (BHG) as a novel coarse-grained model of folding landscapes. Each vertex of the BHG is a local minimum, which represents the corresponding basin in the landscape. Its edges connect basins when the direct transitions between them are ‘energetically favorable’. Edge weights encode the corresponding saddle heights and thus measure the difficulties of these favorable transitions. BHGs can be approximated accurately and efficiently for RNA molecules well beyond the length range accessible to enumerative algorithms. Availability and implementation: The algorithms described here are implemented in C++ as standalone programs. The source code and supplemental material can be freely downloaded from http://www.tbi.univie.ac.at/bhg.html. Contact: qin@bioinf.uni-leipzig.de Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24648041
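The BHG structure itself is just a weighted graph, as the toy construction below shows. The minimum and saddle energies are invented; the real tool computes them from RNA folding landscapes with flooding-style heuristics.

    import networkx as nx

    minima = {"m1": -10.2, "m2": -9.1, "m3": -8.7}      # free energies (toy)
    saddles = {("m1", "m2"): -6.5, ("m2", "m3"): -7.9}  # saddle heights (toy)

    bhg = nx.Graph()
    for name, energy in minima.items():
        bhg.add_node(name, energy=energy)
    for (a, b), s in saddles.items():
        # barrier as seen from the higher of the two minima
        bhg.add_edge(a, b, saddle=s, barrier=s - max(minima[a], minima[b]))

    # a low-barrier refolding route between two basins
    print(nx.shortest_path(bhg, "m1", "m3", weight="barrier"))  # ['m1','m2','m3']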
PLIP: fully automated protein-ligand interaction profiler.
Salentin, Sebastian; Schreiber, Sven; Haupt, V Joachim; Adasme, Melissa F; Schroeder, Michael
2015-07-01
The characterization of interactions in protein-ligand complexes is essential for research in structural bioinformatics, drug discovery and biology. However, comprehensive tools are not freely available to the research community. Here, we present the protein-ligand interaction profiler (PLIP), a novel web service for fully automated detection and visualization of relevant non-covalent protein-ligand contacts in 3D structures, freely available at projects.biotec.tu-dresden.de/plip-web. The input is either a Protein Data Bank structure, a protein or ligand name, or a custom protein-ligand complex (e.g. from docking). In contrast to other tools, the rule-based PLIP algorithm does not require any structure preparation. It returns a list of detected interactions on single atom level, covering seven interaction types (hydrogen bonds, hydrophobic contacts, pi-stacking, pi-cation interactions, salt bridges, water bridges and halogen bonds). PLIP stands out by offering publication-ready images, PyMOL session files to generate custom images and parsable result files to facilitate successive data processing. The full python source code is available for download on the website. PLIP's command-line mode allows for high-throughput interaction profiling. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
ChIPWig: a random access-enabling lossless and lossy compression method for ChIP-seq data.
Ravanmehr, Vida; Kim, Minji; Wang, Zhiying; Milenkovic, Olgica
2018-03-15
Chromatin immunoprecipitation sequencing (ChIP-seq) experiments are inexpensive and time-efficient, and result in massive datasets that introduce significant storage and maintenance challenges. To address the resulting Big Data problems, we propose a lossless and lossy compression framework specifically designed for ChIP-seq Wig data, termed ChIPWig. ChIPWig enables random access and summary statistics lookups, and is based on the asymptotic theory of optimal point density design for nonuniform quantizers. We tested the ChIPWig compressor on 10 ChIP-seq datasets generated by the ENCODE consortium. On average, lossless ChIPWig reduced the file sizes to merely 6% of the original, and offered 6-fold compression rate improvement compared to bigWig. The lossy feature further reduced file sizes 2-fold compared to the lossless mode, with little or no effects on peak calling and motif discovery using specialized NarrowPeaks methods. Compression and decompression speeds are of the order of 0.2 s/MB using general purpose computers. The source code and binaries, implemented in C++, are freely available for download at https://github.com/vidarmehr/ChIPWig-v2. Contact: milenkov@illinois.edu. Supplementary data are available at Bioinformatics online.
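The nonuniform-quantizer idea behind the lossy mode can be illustrated with a percentile-based codebook, which spends reconstruction levels where coverage values are dense. ChIPWig derives its point density from asymptotic quantization theory rather than this shortcut.

    import numpy as np

    def build_codebook(values, n_levels=16):
        # reconstruction points at evenly spaced percentiles: dense regions
        # of the value distribution get finer spacing
        return np.unique(np.percentile(values, np.linspace(0, 100, n_levels)))

    def quantize(values, codebook):
        idx = np.abs(values[:, None] - codebook[None, :]).argmin(axis=1)
        return idx, codebook[idx]

    rng = np.random.default_rng(3)
    coverage = rng.exponential(scale=5.0, size=10_000)  # skewed, ChIP-seq-like
    codebook = build_codebook(coverage)
    idx, approx = quantize(coverage, codebook)
    print(len(codebook), "levels, mean abs error", np.abs(coverage - approx).mean())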
Knee X-ray image analysis method for automated detection of Osteoarthritis
Shamir, Lior; Ling, Shari M.; Scott, William W.; Bos, Angelo; Orlov, Nikita; Macura, Tomasz; Eckley, D. Mark; Ferrucci, Luigi; Goldberg, Ilya G.
2008-01-01
We describe a method for automated detection of radiographic Osteoarthritis (OA) in knee X-ray images. The detection is based on the Kellgren-Lawrence classification grades, which correspond to the different stages of OA severity. The classifier was built using manually classified X-rays, representing the first four KL grades (normal, doubtful, minimal and moderate). Image analysis is performed by first identifying a set of image content descriptors and image transforms that are informative for the detection of OA in the X-rays, and assigning weights to these image features using Fisher scores. Then, a simple weighted nearest neighbor rule is used in order to predict the KL grade to which a given test X-ray sample belongs. The dataset used in the experiment contained 350 X-ray images classified manually by their KL grades. Experimental results show that moderate OA (KL grade 3) and minimal OA (KL grade 2) can be differentiated from normal cases with accuracy of 91.5% and 80.4%, respectively. Doubtful OA (KL grade 1) was detected automatically with a much lower accuracy of 57%. The source code developed and used in this study is available for free download at www.openmicroscopy.org. PMID:19342330
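The decision rule is compact enough to sketch: weight each feature by its Fisher score, then assign the KL grade of the nearest training sample under the weighted distance. The features below are synthetic; in the study they come from the image content descriptors and transforms mentioned above.

    import numpy as np

    def fisher_scores(X, y):
        classes = np.unique(y)
        means = np.array([X[y == c].mean(axis=0) for c in classes])
        within = np.array([X[y == c].var(axis=0) for c in classes]).mean(axis=0)
        return means.var(axis=0) / (within + 1e-12)  # between- over within-class

    def predict_grade(X_train, y_train, x, weights):
        dists = (weights * (X_train - x) ** 2).sum(axis=1)  # weighted distance
        return y_train[np.argmin(dists)]

    rng = np.random.default_rng(1)
    X = rng.normal(size=(40, 5))
    y = np.repeat([0, 1, 2, 3], 10)   # four KL grades, ten samples each
    X[:, 0] += y                      # make feature 0 carry the signal
    w = fisher_scores(X, y)
    print(predict_grade(X, y, X[25] + 0.01, w))  # nearest neighbour -> grade 2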
2011-01-01
Background Translational medicine requires the integration of knowledge using heterogeneous data from health care to the life sciences. Here, we describe a collaborative effort to produce a prototype Translational Medicine Knowledge Base (TMKB) capable of answering questions relating to clinical practice and pharmaceutical drug discovery. Results We developed the Translational Medicine Ontology (TMO) as a unifying ontology to integrate chemical, genomic and proteomic data with disease, treatment, and electronic health records. We demonstrate the use of Semantic Web technologies in the integration of patient and biomedical data, and reveal how such a knowledge base can aid physicians in providing tailored patient care and facilitate the recruitment of patients into active clinical trials. Thus, patients, physicians and researchers may explore the knowledge base to better understand therapeutic options, efficacy, and mechanisms of action. Conclusions This work takes an important step in using Semantic Web technologies to facilitate integration of relevant, distributed, external sources and progress towards a computational platform to support personalized medicine. Availability TMO can be downloaded from http://code.google.com/p/translationalmedicineontology and TMKB can be accessed at http://tm.semanticscience.org/sparql. PMID:21624155
Measuring diagnoses: ICD code accuracy.
O'Malley, Kimberly J; Cook, Karon F; Price, Matt D; Wildes, Kimberly Raiford; Hurdle, John F; Ashton, Carol M
2005-10-01
To examine potential sources of errors at each step of the described inpatient International Classification of Diseases (ICD) coding process. The use of disease codes from the ICD has expanded from classifying morbidity and mortality information for statistical purposes to diverse sets of applications in research, health care policy, and health care finance. By describing a brief history of ICD coding, detailing the process for assigning codes, identifying where errors can be introduced into the process, and reviewing methods for examining code accuracy, we help code users more systematically evaluate code accuracy for their particular applications. We summarize the inpatient ICD diagnostic coding process from patient admission to diagnostic code assignment. We examine potential sources of errors at each step and offer code users a tool for systematically evaluating code accuracy. Main error sources along the "patient trajectory" include amount and quality of information at admission, communication among patients and providers, the clinician's knowledge and experience with the illness, and the clinician's attention to detail. Main error sources along the "paper trail" include variance in the electronic and written records, coder training and experience, facility quality-control efforts, and unintentional and intentional coder errors, such as misspecification, unbundling, and upcoding. By clearly specifying the code assignment process and heightening their awareness of potential error sources, code users can better evaluate the applicability and limitations of codes for their particular situations. ICD codes can then be used in the most appropriate ways.
Comeau, Donald C.; Liu, Haibin; Islamaj Doğan, Rezarta; Wilbur, W. John
2014-01-01
BioC is a new format and associated code libraries for sharing text and annotations. We have implemented BioC natural language preprocessing pipelines in two popular programming languages: C++ and Java. The current implementations interface with the well-known MedPost and Stanford natural language processing tool sets. The pipeline functionality includes sentence segmentation, tokenization, part-of-speech tagging, lemmatization and sentence parsing. These pipelines can be easily integrated along with other BioC programs into any BioC compliant text mining systems. As an application, we converted the NCBI disease corpus to BioC format, and the pipelines have successfully run on this corpus to demonstrate their functionality. Code and data can be downloaded from http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net PMID:24935050
NASA Technical Reports Server (NTRS)
White, P. R.; Little, R. R.
1985-01-01
A research effort was undertaken to develop personal computer based software for vibrational analysis. The software was developed to analytically determine the natural frequencies and mode shapes for the uncoupled lateral vibrations of the blade and counterweight assemblies used in a single bladed wind turbine. The uncoupled vibration analysis was performed in both the flapwise and chordwise directions for static rotor conditions. The effects of rotation on the uncoupled flapwise vibration of the blade and counterweight assemblies were evaluated for various rotor speeds up to 90 rpm. The theory, used in the vibration analysis codes, is based on a lumped mass formulation for the blade and counterweight assemblies. The codes are general so that other designs can be readily analyzed. The input for the codes is generally interactive to facilitate usage. The output of the codes is both tabular and graphical. Listings of the codes are provided. Predicted natural frequencies of the first several modes show reasonable agreement with experimental results. The analysis codes were originally developed on a DEC PDP 11/34 minicomputer and then downloaded and modified to run on an ITT XTRA personal computer. Studies conducted to evaluate the efficiency of running the programs on a personal computer as compared with the minicomputer indicated that, with the proper combination of hardware and software options, the efficiency of using a personal computer exceeds that of a minicomputer.
Douzery, Emmanuel J P; Scornavacca, Celine; Romiguier, Jonathan; Belkhir, Khalid; Galtier, Nicolas; Delsuc, Frédéric; Ranwez, Vincent
2014-07-01
Comparative genomic studies extensively rely on alignments of orthologous sequences. Yet, selecting, gathering, and aligning orthologous exons and protein-coding sequences (CDS) that are relevant for a given evolutionary analysis can be a difficult and time-consuming task. In this context, we developed OrthoMaM, a database of ORTHOlogous MAmmalian Markers describing the evolutionary dynamics of orthologous genes in mammalian genomes using a phylogenetic framework. Since its first release in 2007, OrthoMaM has regularly evolved, not only to include newly available genomes but also to incorporate up-to-date software in its analytic pipeline. This eighth release integrates the 40 complete mammalian genomes available in Ensembl v73 and provides alignments, phylogenies, evolutionary descriptor information, and functional annotations for 13,404 single-copy orthologous CDS and 6,953 long exons. The graphical interface allows users to easily explore OrthoMaM to identify markers with specific characteristics (e.g., taxa availability, alignment size, %G+C, evolutionary rate, chromosome location). It hence provides an efficient solution to sample preprocessed markers adapted to user-specific needs. OrthoMaM has proven to be a valuable resource for researchers interested in mammalian phylogenomics and evolutionary genomics, and has served as a source of benchmark empirical data sets in several methodological studies. OrthoMaM is available for browsing, query and complete or filtered downloads at http://www.orthomam.univ-montp2.fr/. © The Author 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
SIBIS: a Bayesian model for inconsistent protein sequence estimation.
Khenoussi, Walyd; Vanhoutrève, Renaud; Poch, Olivier; Thompson, Julie D
2014-09-01
The prediction of protein coding genes is a major challenge that depends on the quality of genome sequencing, the accuracy of the model used to elucidate the exonic structure of the genes and the complexity of the gene splicing process leading to different protein variants. As a consequence, today's protein databases contain a huge amount of inconsistency, due to both natural variants and sequence prediction errors. We have developed a new method, called SIBIS, to detect such inconsistencies based on the evolutionary information in multiple sequence alignments. A Bayesian framework, combined with Dirichlet mixture models, is used to estimate the probability of observing specific amino acids and to detect inconsistent or erroneous sequence segments. We evaluated the performance of SIBIS on a reference set of protein sequences with experimentally validated errors and showed that the sensitivity is significantly higher than previous methods, with only a small loss of specificity. We also assessed a large set of human sequences from the UniProt database and found evidence of inconsistency in 48% of the previously uncharacterized sequences. We conclude that the integration of quality control methods like SIBIS in automatic analysis pipelines will be critical for the robust inference of structural, functional and phylogenetic information from these sequences. Source code, implemented in C on a linux system, and the datasets of protein sequences are freely available for download at http://www.lbgi.fr/∼julie/SIBIS. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
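A single-component caricature of the scoring step: estimate the probability of each residue from the rest of its alignment column under Dirichlet pseudocounts, and flag residues that are very unlikely given the others. SIBIS proper uses Dirichlet mixtures and segment-level decisions; the pseudocount and threshold here are invented.

    from collections import Counter

    AA = "ACDEFGHIKLMNPQRSTVWY"
    ALPHA = 0.1   # symmetric Dirichlet pseudocount (assumed)

    def column_posterior(column):
        counts = Counter(column)
        total = len(column) + ALPHA * len(AA)
        return {a: (counts.get(a, 0) + ALPHA) / total for a in AA}

    def flag_inconsistent(column, threshold=0.02):
        flags = []
        for i, aa in enumerate(column):
            rest = column[:i] + column[i + 1:]           # leave this residue out
            if column_posterior(rest)[aa] < threshold:   # unlikely given others
                flags.append((i, aa))
        return flags

    column = list("LLLLLLLLLIILLLLLLLLW")  # conserved L/I column, one stray W
    print(flag_inconsistent(column))       # -> [(19, 'W')]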
Automated Concurrent Blackboard System Generation in C++
NASA Technical Reports Server (NTRS)
Kaplan, J. A.; McManus, J. W.; Bynum, W. L.
1999-01-01
In his 1992 Ph.D. thesis, "Design and Analysis Techniques for Concurrent Blackboard Systems", John McManus defined several performance metrics for concurrent blackboard systems and developed a suite of tools for creating and analyzing such systems. These tools allow a user to analyze a concurrent blackboard system design and predict the performance of the system before any code is written. The design can be modified until simulated performance is satisfactory. Then, the code generator can be invoked to generate automatically all of the code required for the concurrent blackboard system except for the code implementing the functionality of each knowledge source. We have completed the port of the source code generator and a simulator for a concurrent blackboard system. The source code generator generates the necessary C++ source code to implement the concurrent blackboard system using Parallel Virtual Machine (PVM) running on a heterogeneous network of UNIX™ workstations. The concurrent blackboard simulator uses the blackboard specification file to predict the performance of the concurrent blackboard design. The only part of the source code for the concurrent blackboard system that the user must supply is the code implementing the functionality of the knowledge sources.
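The division of labor described above (framework-supplied blackboard and control loop, user-supplied knowledge sources) is the classic blackboard pattern, sketched minimally here in Python rather than the generated C++/PVM code.

    class Blackboard(dict):
        """Shared data store that knowledge sources read and post to."""

    def ks_tokenize(bb):        # user-supplied knowledge source
        if "text" in bb and "tokens" not in bb:
            bb["tokens"] = bb["text"].split()

    def ks_count(bb):           # user-supplied knowledge source
        if "tokens" in bb and "count" not in bb:
            bb["count"] = len(bb["tokens"])

    def run(bb, sources):
        # framework part: offer the blackboard to every knowledge source
        # until a full pass produces no change
        changed = True
        while changed:
            before = dict(bb)
            for ks in sources:
                ks(bb)
            changed = bb != before

    bb = Blackboard(text="knowledge sources post partial solutions here")
    run(bb, [ks_count, ks_tokenize])   # activation order does not matter
    print(bb["count"])                 # -> 6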
Integrating the Allen Brain Institute Cell Types Database into Automated Neuroscience Workflow.
Stockton, David B; Santamaria, Fidel
2017-10-01
We developed software tools to download, extract features, and organize the Cell Types Database from the Allen Brain Institute (ABI) in order to integrate its whole cell patch clamp characterization data into the automated modeling/data analysis cycle. To expand the potential user base we employed both Python and MATLAB. The basic set of tools downloads selected raw data and extracts cell, sweep, and spike features, using ABI's feature extraction code. To facilitate data manipulation we added a tool to build a local specialized database of raw data plus extracted features. Finally, to maximize automation, we extended our NeuroManager workflow automation suite to include these tools plus a separate investigation database. The extended suite allows the user to integrate ABI experimental and modeling data into an automated workflow deployed on heterogeneous computer infrastructures, from local servers, to high performance computing environments, to the cloud. Since our approach is focused on workflow procedures our tools can be modified to interact with the increasing number of neuroscience databases being developed to cover all scales and properties of the nervous system.
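The download-and-extract step can be reproduced with the publicly distributed Allen SDK, assuming its CellTypesCache interface; the tools described above wrap this kind of access and add the local feature database and NeuroManager integration.

    from allensdk.core.cell_types_cache import CellTypesCache

    ctc = CellTypesCache(manifest_file="cell_types/manifest.json")
    cells = ctc.get_cells()                 # metadata for all recorded cells
    cell_id = cells[0]["id"]
    data_set = ctc.get_ephys_data(cell_id)  # downloads and caches the NWB file
    sweep_numbers = data_set.get_sweep_numbers()
    sweep = data_set.get_sweep(sweep_numbers[0])  # 'stimulus', 'response', ...
    print(cell_id, len(sweep_numbers), sweep["sampling_rate"])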
You've Written a Cool Astronomy Code! Now What Do You Do with It?
NASA Astrophysics Data System (ADS)
Allen, Alice; Accomazzi, A.; Berriman, G. B.; DuPrie, K.; Hanisch, R. J.; Mink, J. D.; Nemiroff, R. J.; Shamir, L.; Shortridge, K.; Taylor, M. B.; Teuben, P. J.; Wallin, J. F.
2014-01-01
Now that you've written a useful astronomy code for your soon-to-be-published research, you have to figure out what you want to do with it. Our suggestion? Share it! This presentation highlights the means and benefits of sharing your code. Make your code citable -- submit it to the Astrophysics Source Code Library and have it indexed by ADS! The Astrophysics Source Code Library (ASCL) is a free online registry of source codes of interest to astronomers and astrophysicists. With over 700 codes, it is continuing its rapid growth, with an average of 17 new codes a month. The editors seek out codes for inclusion; indexing by ADS improves the discoverability of codes and provides a way to cite codes as separate entries, especially codes without papers that describe them.
Jones, David T; Singh, Tanya; Kosciolek, Tomasz; Tetchner, Stuart
2015-04-01
Recent developments of statistical techniques to infer direct evolutionary couplings between residue pairs have rendered covariation-based contact prediction a viable means for accurate 3D modelling of proteins, with no information other than the sequence required. To extend the usefulness of contact prediction, we have designed a new meta-predictor (MetaPSICOV) which combines three distinct approaches for inferring covariation signals from multiple sequence alignments, considers a broad range of other sequence-derived features and, uniquely, a range of metrics which describe both the local and global quality of the input multiple sequence alignment. Finally, we use a two-stage predictor, where the second stage filters the output of the first stage. This two-stage predictor is additionally evaluated on its ability to accurately predict the long range network of hydrogen bonds, including correctly assigning the donor and acceptor residues. Using the original PSICOV benchmark set of 150 protein families, MetaPSICOV achieves a mean precision of 0.54 for top-L predicted long range contacts, around 60% higher than PSICOV, and around 40% better than CCMpred. In de novo protein structure prediction using FRAGFOLD, MetaPSICOV is able to improve the TM-scores of models by a median of 0.05 compared with PSICOV. Lastly, for predicting long range hydrogen bonding, MetaPSICOV-HB achieves a precision of 0.69 for the top-L/10 hydrogen bonds compared with just 0.26 for the baseline MetaPSICOV. MetaPSICOV is available as a freely available web server at http://bioinf.cs.ucl.ac.uk/MetaPSICOV. Raw data (predicted contact lists and 3D models) and source code can be downloaded from http://bioinf.cs.ucl.ac.uk/downloads/MetaPSICOV. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
Tools for Implementing the Recent IAU Resolutions: USNO Circular 179 and the NOVAS Software Package
NASA Astrophysics Data System (ADS)
Kaplan, G. H.; Bangert, J. A.
2006-08-01
The resolutions on positional astronomy adopted at the 1997 and 2000 IAU General Assemblies are far-reaching in scope, affecting both the details of various computations and the basic concepts upon which they are built. For many scientists and engineers, applying these recommendations to practical problems is thus doubly challenging. Because the U.S. Naval Observatory (USNO) serves a broad base of users, we have provided two different tools to aid in implementing the resolutions, both of which are intended for the person who is knowledgeable but not necessarily expert in positional astronomy. These tools complement the new material that has been added to The Astronomical Almanac (see paper by Hohenkerk). USNO Circular 179 is a 118-page book that introduces the resolutions to non-specialists. It includes extensive narratives describing the basic concepts as well as compilations of the equations necessary to apply the recommendations. The resolutions have been logically grouped into six main chapters. The Circular is available as a hard-cover book or as a PDF file that can be downloaded from either the USNO/AA web site (http://aa.usno.navy.mil/) or arXiv.org. NOVAS (Naval Observatory Vector Astrometry Subroutines) is a source-code library available in both Fortran and C. It is a long established package with a wide user base that has recently been extensively revised (in version 3.0) to implement the recent IAU resolutions. However, use of NOVAS does not require detailed knowledge of the resolutions, since commonly requested high-level data (for example, topocentric positions of stars or planets) are provided in a single call. NOVAS can be downloaded from the USNO/AA web site. Both Circular 179 and NOVAS version 3.0 anticipate IAU adoption of the recommendations of the 2003-2006 working groups on precession and nomenclature.
Sources of global climate data and visualization portals
Douglas, David C.
2014-01-01
Climate is integral to the geophysical foundation upon which ecosystems are structured. Knowledge about mechanistic linkages between the geophysical and biological environments is essential for understanding how global warming may reshape contemporary ecosystems and ecosystem services. Numerous global data sources spanning several decades are available that document key geophysical metrics such as temperature and precipitation, and metrics of primary biological production such as vegetation phenology and ocean phytoplankton. This paper provides an internet directory to portals for visualizing or servers for downloading many of the more commonly used global datasets, as well as a description of how to write simple computer code to efficiently retrieve these data. The data are broadly useful for quantifying relationships between climate, habitat availability, and lower-trophic-level habitat quality - especially in Arctic regions where strong seasonality is accompanied by intrinsically high year-to-year variability. If defensible linkages between the geophysical (climate) and the biological environment can be established, general circulation model (GCM) projections of future climate conditions can be used to infer future biological responses. Robustness of this approach is, however, complicated by the number of direct, indirect, or interacting linkages involved. For example, response of a predator species to climate change will be influenced by the responses of its prey and competitors, and so forth throughout a trophic web. The complexities of ecological systems warrant sensible and parsimonious approaches for assessing and establishing the role of natural climate variability in order to substantiate inferences about the potential effects of global warming.
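Scripted retrieval of such datasets usually reduces to iterating over a documented URL pattern, as in the sketch below; the base URL here is hypothetical, so substitute the pattern published by the portal you are using.

    import urllib.request
    from pathlib import Path

    BASE = "https://example.gov/data/sst/monthly/{year}{month:02d}.nc"  # hypothetical
    out = Path("downloads")
    out.mkdir(exist_ok=True)

    for year in range(2000, 2003):
        for month in range(1, 13):
            url = BASE.format(year=year, month=month)
            target = out / url.rsplit("/", 1)[-1]
            if not target.exists():              # skip files already fetched
                urllib.request.urlretrieve(url, target)
                print("retrieved", target)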
QMachine: commodity supercomputing in web browsers.
Wilkinson, Sean R; Almeida, Jonas S
2014-06-09
Ongoing advancements in cloud computing provide novel opportunities in scientific computing, especially for distributed workflows. Modern web browsers can now be used as high-performance workstations for querying, processing, and visualizing genomics' "Big Data" from sources like The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) without local software installation or configuration. The design of QMachine (QM) was driven by the opportunity to use this pervasive computing model in the context of the Web of Linked Data in Biomedicine. QM is an open-sourced, publicly available web service that acts as a messaging system for posting tasks and retrieving results over HTTP. The illustrative application described here distributes the analyses of 20 Streptococcus pneumoniae genomes for shared suffixes. Because all analytical and data retrieval tasks are executed by volunteer machines, few server resources are required. Any modern web browser can submit those tasks and/or volunteer to execute them without installing any extra plugins or programs. A client library provides high-level distribution templates including MapReduce. This stark departure from the current reliance on expensive server hardware running "download and install" software has already gathered substantial community interest, as QM received more than 2.2 million API calls from 87 countries in 12 months. QM was found adequate to deliver the sort of scalable bioinformatics solutions that computation- and data-intensive workflows require. Paradoxically, the sandboxed execution of code by web browsers was also found to enable them, as compute nodes, to address critical privacy concerns that characterize biomedical environments.
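The posting/retrieval pattern is plain HTTP, roughly as below; the endpoint path, field names and key scheme are hypothetical stand-ins, not QM's actual API.

    import json
    import urllib.request

    API = "https://qmachine.example.org/box/demo"   # hypothetical mailbox URL

    def post_task(code, data_url):
        body = json.dumps({"exec": code, "data": data_url}).encode()
        req = urllib.request.Request(API, data=body,
                                     headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["key"]      # key for fetching the result later

    def get_result(key):
        with urllib.request.urlopen("%s?key=%s" % (API, key)) as resp:
            return json.load(resp)

    key = post_task("suffixes(genome)", "https://example.org/genome1.fa")
    print(get_result(key))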
RGG: A general GUI Framework for R scripts
Visne, Ilhami; Dilaveroglu, Erkan; Vierlinger, Klemens; Lauss, Martin; Yildiz, Ahmet; Weinhaeusel, Andreas; Noehammer, Christa; Leisch, Friedrich; Kriegner, Albert
2009-01-01
Background R is the leading open source statistics software with a vast number of biostatistical and bioinformatical analysis packages. To exploit the advantages of R, extensive scripting/programming skills are required. Results We have developed a software tool called R GUI Generator (RGG) which enables the easy generation of Graphical User Interfaces (GUIs) for the programming language R by adding a few Extensible Markup Language (XML) tags. RGG consists of an XML-based GUI definition language and a Java-based GUI engine. GUIs are generated in runtime from defined GUI tags that are embedded into the R script. User-GUI input is returned to the R code and replaces the XML-tags. RGG files can be developed using any text editor. The current version of RGG is available as a stand-alone software (RGGRunner) and as a plug-in for JGR. Conclusion RGG is a general GUI framework for R that has the potential to introduce R statistics (R packages, built-in functions and scripts) to users with limited programming skills and helps to bridge the gap between R developers and GUI-dependent users. RGG aims to abstract the GUI development from individual GUI toolkits by using an XML-based GUI definition language. Thus RGG can be easily integrated in any software. The RGG project further includes the development of a web-based repository for RGG-GUIs. RGG is an open source project licensed under the Lesser General Public License (LGPL) and can be downloaded freely at PMID:19254356
Web-based Toolkit for Dynamic Generation of Data Processors
NASA Astrophysics Data System (ADS)
Patel, J.; Dascalu, S.; Harris, F. C.; Benedict, K. K.; Gollberg, G.; Sheneman, L.
2011-12-01
All computation-intensive scientific research uses structured datasets, including hydrology and all other types of climate-related research. When it comes to testing their hypotheses, researchers might use the same dataset differently, and modify, transform, or convert it to meet their research needs. Currently, many researchers spend a good amount of time performing data processing and building tools to speed up this process. They might routinely repeat the same process activities for new research projects, spending precious time that otherwise could be dedicated to analyzing and interpreting the data. Numerous tools are available to run tests on prepared datasets and many of them work with datasets in different formats. However, there is still a significant need for applications that can comprehensively handle data transformation and conversion activities and help prepare the various processed datasets required by the researchers. We propose a web-based application (a software toolkit) that dynamically generates data processors capable of performing data conversions, transformations, and customizations based on user-defined mappings and selections. As a first step, the proposed solution allows the users to define various data structures and, in the next step, to select various file formats and data conversions for their datasets of interest. In a simple scenario, the core of the proposed web-based toolkit allows the users to define direct mappings between input and output data structures. The toolkit will also support defining complex mappings involving the use of pre-defined sets of mathematical, statistical, date/time, and text manipulation functions. Furthermore, the users will be allowed to define logical cases for input data filtering and sampling. At the end of the process, the toolkit is designed to generate reusable source code and executable binary files for download and use by the scientists. The application is also designed to store all data structures and mappings defined by a user (an author), and allow the original author to modify them using standard authoring techniques. The users can change or define new mappings to create new data processors for download and use. In essence, when executed, the generated data processor binary file can take an input data file in a given format and output this data, possibly transformed, in a different file format. If they so desire, the users will be able to modify the source code directly in order to define more complex mappings and transformations that are not currently supported by the toolkit. Initially aimed at supporting research in hydrology, the toolkit's functions and features can be either directly used or easily extended to other areas of climate-related research. The proposed web-based data processing toolkit will be able to generate various custom software processors for data conversion and transformation in a matter of seconds or minutes, saving a significant amount of researchers' time and allowing them to focus on core research issues.
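A generated "direct mapping" processor reduces to applying a field-by-field mapping while converting formats. A minimal sketch, with hypothetical field names and one unit-conversion function standing in for the toolkit's pre-defined function sets:

```python
# Minimal sketch of a "generated data processor": a user-defined mapping is
# applied to convert rows of an input CSV into a different JSON structure.
# Field names and the transformation are hypothetical examples.
import csv, json

mapping = {
    "site_id": ("StationID", str),
    "precip_mm": ("Precip_in", lambda v: round(float(v) * 25.4, 2)),  # inches -> mm
}

def process(in_path, out_path):
    with open(in_path, newline="") as f:
        rows = [
            {out: fn(row[src]) for out, (src, fn) in mapping.items()}
            for row in csv.DictReader(f)
        ]
    with open(out_path, "w") as f:
        json.dump(rows, f, indent=2)

process("input.csv", "output.json")
```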
JCoDA: a tool for detecting evolutionary selection.
Steinway, Steven N; Dannenfelser, Ruth; Laucius, Christopher D; Hayes, James E; Nayak, Sudhir
2010-05-27
The incorporation of annotated sequence information from multiple related species in commonly used databases (Ensembl, Flybase, Saccharomyces Genome Database, Wormbase, etc.) has increased dramatically over the last few years. This influx of information has provided a considerable amount of raw material for evaluation of evolutionary relationships. To aid in the process, we have developed JCoDA (Java Codon Delimited Alignment) as a simple-to-use visualization tool for the detection of site specific and regional positive/negative evolutionary selection amongst homologous coding sequences. JCoDA accepts user-inputted unaligned or pre-aligned coding sequences, performs a codon-delimited alignment using ClustalW, and determines the dN/dS calculations using PAML (Phylogenetic Analysis Using Maximum Likelihood, yn00 and codeml) in order to identify regions and sites under evolutionary selection. The JCoDA package includes a graphical interface for Phylip (Phylogeny Inference Package) to generate phylogenetic trees, manages formatting of all required file types, and streamlines passage of information between underlying programs. The raw data are output to user configurable graphs with sliding window options for straightforward visualization of pairwise or gene family comparisons. Additionally, codon-delimited alignments are output in a variety of common formats and all dN/dS calculations can be output in comma-separated value (CSV) format for downstream analysis. To illustrate the types of analyses that are facilitated by JCoDA, we have taken advantage of the well studied sex determination pathway in nematodes as well as the extensive sequence information available to identify genes under positive selection, examples of regional positive selection, and differences in selection based on the role of genes in the sex determination pathway. JCoDA is a configurable, open source, user-friendly visualization tool for performing evolutionary analysis on homologous coding sequences. JCoDA can be used to rapidly screen for genes and regions of genes under selection using PAML. It can be freely downloaded at http://www.tcnj.edu/~nayaklab/jcoda.
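The windowed plots JCoDA produces boil down to sliding-window averages of per-codon dN/dS values. A generic sketch of that smoothing step (not JCoDA's source code):

```python
# Sliding-window average of per-codon dN/dS values, the kind of smoothing
# behind JCoDA's windowed plots (illustrative only, not JCoDA's source).
def sliding_dnds(values, window=10, step=1):
    points = []
    for start in range(0, len(values) - window + 1, step):
        chunk = values[start:start + window]
        points.append((start + window // 2, sum(chunk) / window))
    return points  # (window-center codon, mean dN/dS) pairs

site_dnds = [0.2, 0.3, 1.5, 2.1, 1.9, 0.4, 0.3, 0.2, 0.8, 1.1, 0.5, 0.4]
for center, mean in sliding_dnds(site_dnds, window=4):
    flag = "  <-- possible positive selection" if mean > 1 else ""
    print(f"codon {center}: {mean:.2f}{flag}")
```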
Sarntivijai, Sirarat; Vasant, Drashtti; Jupp, Simon; Saunders, Gary; Bento, A Patrícia; Gonzalez, Daniel; Betts, Joanna; Hasan, Samiul; Koscielny, Gautier; Dunham, Ian; Parkinson, Helen; Malone, James
2016-01-01
The Centre for Therapeutic Target Validation (CTTV - https://www.targetvalidation.org/) was established to generate therapeutic target evidence from genome-scale experiments and analyses. CTTV aims to support the validity of therapeutic targets by integrating existing and newly-generated data. Data integration has been achieved in some resources by mapping metadata such as disease and phenotypes to the Experimental Factor Ontology (EFO). Additionally, the relationship between ontology descriptions of rare and common diseases and their phenotypes can offer insights into shared biological mechanisms and potential drug targets. Ontologies are not ideal for representing the sometimes-associated type of relationship required. This work addresses two challenges: annotation of diverse big data, and representation of complex, sometimes-associated relationships between concepts. Semantic mapping uses a combination of custom scripting, our annotation tool 'Zooma', and expert curation. Disease-phenotype associations were generated using literature mining on Europe PubMed Central abstracts, which were manually verified by experts for validity. Representation of the disease-phenotype association was achieved by the Ontology of Biomedical AssociatioN (OBAN), a generic association representation model. OBAN represents associations between a subject and an object, i.e., a disease and its associated phenotypes, and the source of evidence for that association. The indirect disease-to-disease associations are exposed through shared phenotypes. This was applied to the use case of linking rare to common diseases at the CTTV. EFO yields an average of over 80% mapping coverage in all data sources. A 42% precision is obtained from the manual verification of the text-mined disease-phenotype associations. This results in 1452 and 2810 disease-phenotype pairs for IBD and autoimmune disease, respectively, and contributes towards 11,338 rare disease associations (merged with existing published work [Am J Hum Genet 97:111-24, 2015]). An OBAN result file is downloadable at http://sourceforge.net/p/efo/code/HEAD/tree/trunk/src/efoassociations/. Twenty common diseases are linked to 85 rare diseases by shared phenotypes. A generalizable OBAN model for association representation is presented in this study. Here we present solutions to large-scale annotation-ontology mapping in the CTTV knowledge base, a process for disease-phenotype mining, and propose a generic association model, 'OBAN', as a means to integrate diseases using shared phenotypes. EFO is released monthly and available for download at http://www.ebi.ac.uk/efo/.
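The OBAN triple described above (subject, object, evidence source) can be sketched as a small data structure; class and field names here are simplified illustrations, and the identifiers are placeholders rather than real EFO/HPO terms.

```python
# Sketch of an OBAN-style association: a subject (disease), an object
# (phenotype), and the provenance of the evidence. Names and identifiers
# are simplified placeholders, not the published OBAN schema.
from dataclasses import dataclass

@dataclass(frozen=True)
class Association:
    subject: str          # e.g. a disease identifier
    object: str           # e.g. a phenotype term
    evidence_source: str  # e.g. "literature mining" or "manual curation"

a1 = Association("DISEASE:ibd", "PHENOTYPE:diarrhea", "literature mining")
a2 = Association("DISEASE:rare_x", "PHENOTYPE:diarrhea", "manual curation")
# Shared phenotypes expose indirect disease-to-disease links:
print(a1.object == a2.object)
```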
Soares, Patrícia; Alves, Renato J; Abecasis, Ana B; Penha-Gonçalves, Carlos; Gomes, M Gabriela M; Pereira-Leal, José B
2013-08-30
Tuberculosis is currently the second highest cause of death from infectious diseases worldwide. The emergence of multi- and extensive drug resistance is threatening to make tuberculosis incurable. There is growing evidence that the genetic diversity of Mycobacterium tuberculosis may have important clinical consequences. Therefore, combining genetic, clinical and socio-demographic data is critical to understand the epidemiology of this infectious disease, and how virulence and other phenotypic traits evolve over time. This requires dedicated bioinformatics platforms, capable of integrating and enabling analyses of this heterogeneous data. We developed inTB, a web-based system for integrated warehousing and analysis of clinical, socio-demographic and molecular data for Mycobacterium sp. isolates. As a database it can organize and display data from any of the standard genotyping methods (SNP, MIRU-VNTR, RFLP and spoligotype), as well as an extensive array of clinical and socio-demographic variables that are used in multiple countries to characterize the disease. Through the inTB interface it is possible to insert and download data, browse the database and search specific parameters. New isolates are automatically classified into strains according to an internal reference, and data uploaded or typed in is checked for internal consistency. As an analysis framework, the system provides simple, point-and-click analysis tools that allow multiple types of data plotting, as well as simple ways to download data for external analysis. Individual trees for each genotyping method are available, as well as a super tree combining all of them. The integrative nature of inTB grants the user the ability to generate trees for filtered subsets of data crossing molecular and clinical/socio-demographic information. inTB is built on open source software, can be easily installed locally and easily adapted to other diseases. Its design allows for use by research laboratories, hospitals or public health authorities. The full source code as well as ready-to-use packages is available at http://www.evocell.org/inTB. To the best of our knowledge, this is the only system capable of integrating different types of molecular data with clinical and socio-demographic data, empowering researchers and clinicians with easy-to-use analysis tools that were not possible before.
Teachers' Acceptance and Use of an Educational Portal
ERIC Educational Resources Information Center
Pynoo, Bram; Tondeur, Jo; van Braak, Johan; Duyck, Wouter; Sijnave, Bart; Duyck, Philippe
2012-01-01
In this study, teachers' acceptance and use of an educational portal is assessed based on data from two sources: usage data (number of logins, downloads, uploads, reactions and pages viewed) and an online acceptance questionnaire. The usage data is extracted on two occasions from the portal's database: at survey completion (T1) and twenty-two…
A High-Resolution Stopwatch for Cents
ERIC Educational Resources Information Center
Gingl, Z.; Kopasz, K.
2011-01-01
A very low-cost, easy-to-make stopwatch is presented to support various experiments in mechanics. The high-resolution stopwatch is based on two photodetectors connected directly to the microphone input of a sound card. Dedicated free open-source software has been developed and made available to download. The efficiency is demonstrated by a free…
How to Keep Your Health Information Private and Secure
…permanently. When using mobile devices: research mobile apps (software programs that perform one or more specific functions) before you download and install any of them, and be sure to use known app websites or trusted sources; read the terms of service and the privacy notice of the mobile app…
The random energy model in a magnetic field and joint source channel coding
NASA Astrophysics Data System (ADS)
Merhav, Neri
2008-09-01
We demonstrate that there is an intimate relationship between the magnetic properties of Derrida’s random energy model (REM) of spin glasses and the problem of joint source-channel coding in Information Theory. In particular, typical patterns of erroneously decoded messages in the coding problem have “magnetization” properties that are analogous to those of the REM in certain phases, where the non-uniformity of the distribution of the source in the coding problem plays the role of an external magnetic field applied to the REM. We also relate the ensemble performance (random coding exponents) of joint source-channel codes to the free energy of the REM in its different phases.
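For orientation, the zero-field free energy density of the REM (i.i.d. Gaussian level energies of variance NJ²/2) is a standard result; the paper's analysis concerns the generalization in which the source's non-uniformity enters like an external field. A zero-field sketch:

```latex
% Standard zero-field REM free energy density (Derrida's model); the
% field-dependent case analyzed in the paper reduces to this when the
% source distribution is uniform (zero effective field).
\[
  \phi(\beta) \;=\; \lim_{N\to\infty}\frac{1}{N}\,\mathbb{E}\log Z \;=\;
  \begin{cases}
    \ln 2 + \dfrac{\beta^{2}J^{2}}{4}, & \beta \le \beta_c = \dfrac{2\sqrt{\ln 2}}{J},\\[1ex]
    \beta J \sqrt{\ln 2}, & \beta > \beta_c .
  \end{cases}
\]
```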
2014-06-01
User Manual and Source Code for a LAMMPS Implementation of Constant Energy Dissipative Particle Dynamics (DPD-E), by James P. Larentzos et al.; U.S. Army Research Laboratory, Aberdeen Proving Ground, MD 21005-5069; report ARL-SR-290, June 2014; dates covered: September 2013–February 2014.
MetAMOS: a modular and open source metagenomic assembly and analysis pipeline
2013-01-01
We describe MetAMOS, an open source and modular metagenomic assembly and analysis pipeline. MetAMOS represents an important step towards fully automated metagenomic analysis, starting with next-generation sequencing reads and producing genomic scaffolds, open-reading frames and taxonomic or functional annotations. MetAMOS can aid in reducing assembly errors, commonly encountered when assembling metagenomic samples, and improves taxonomic assignment accuracy while also reducing computational cost. MetAMOS can be downloaded from: https://github.com/treangen/MetAMOS. PMID:23320958
Herbst, Christian T; Oh, Jinook; Vydrová, Jitka; Švec, Jan G
2015-07-01
In this short report we introduce DigitalVHI, a free open-source software application for obtaining Voice Handicap Index (VHI) and other questionnaire data, which can be put on a computer in clinics and used in clinical practice. The software can simplify performing clinical studies since it makes the VHI scores directly available for analysis in a digital form. It can be downloaded from http://www.christian-herbst.org/DigitalVHI/.
pytc: Open-Source Python Software for Global Analyses of Isothermal Titration Calorimetry Data.
Duvvuri, Hiranmayi; Wheeler, Lucas C; Harms, Michael J
2018-05-08
Here we describe pytc, an open-source Python package for global fits of thermodynamic models to multiple isothermal titration calorimetry experiments. Key features include simplicity, the ability to implement new thermodynamic models, a robust maximum likelihood fitter, a fast Bayesian Markov-Chain Monte Carlo sampler, rigorous implementation, extensive documentation, and full cross-platform compatibility. pytc fitting can be done using an application program interface or via a graphical user interface. It is available for download at https://github.com/harmslab/pytc .
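The essence of a global fit is that some parameters are shared across experiments while others are local. The sketch below shows that idea on simulated data with scipy; it is not pytc's API, just the shared-parameter structure.

```python
# Generic illustration of a "global fit": two simulated titration-like data
# sets share one parameter (K) while each keeps its own amplitude. This is
# not pytc's API -- only the shared-parameter idea, using scipy.
import numpy as np
from scipy.optimize import least_squares

x = np.linspace(0.1, 5.0, 40)
rng = np.random.default_rng(0)
y1 = 2.0 * x / (1.0 + 0.8 * x) + rng.normal(0, 0.02, x.size)  # amp1=2.0, K=0.8
y2 = 3.5 * x / (1.0 + 0.8 * x) + rng.normal(0, 0.02, x.size)  # amp2=3.5, same K

def residuals(p):
    amp1, amp2, K = p
    model1 = amp1 * x / (1.0 + K * x)
    model2 = amp2 * x / (1.0 + K * x)
    return np.concatenate([model1 - y1, model2 - y2])  # joint residual vector

fit = least_squares(residuals, x0=[1.0, 1.0, 1.0])
print("amp1, amp2, shared K:", fit.x.round(3))
```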
2008-01-01
[Slide/figure residue from a network flow-sensor report; recoverable content: each node is scored as (anomalous traffic of the node ÷ total anomalous traffic), and parent nodes are formed by merging child-node information (prefix/length, coverage/collateral), e.g. 0.0.0.0/0 100/100 at depth 0, divided into 0.0.0.0/1 50/30; the system includes data upload and download flow sensors and reconnaissance alerts from SNORT and DRAGON.]
Top of the charts: download versus citations in the International Journal of Cardiology.
Coats, Andrew J S
2005-11-02
The medical literature is growing at an alarming rate. Research assessment exercises, research quality frameworks, league tables and the like have attempted to quantify the volume, quality and impact of research. Yet the established measures (such as citation rates) are being challenged by the sheer number of journals, variability in the "gold standard" of peer-review and the emergence of open-source or web-based journals. In the last few years, we have seen a growth in downloads to individual journal articles that now easily exceeds formal journal subscriptions. We have recorded the 10 top cited articles over a 12-month period and compared them to the 10 most popular articles being downloaded over the same time period. The citation-based listing included basic and applied, observational and interventional original research reports. For downloaded articles, which have shown a dramatic increase for the International Journal of Cardiology from 48,000 in 2002 to 120,000 in 2003 to 200,000 in 2004, the most popular articles over the same period are very different and are dominated by up-to-date reviews of either cutting-edge topics (such as the potential of stem cells) or of the management of rare or unusual conditions. There is no overlap between the two lists despite covering exactly the same 12-month period and using measures of peer esteem. Perhaps the time has come to look at the usage of articles rather than, or in addition to, their referencing.
Torres, Maria Beatriz; Asmar, Lucy; Danh, Thu; Horvath, Keith J
2018-01-01
Background: Although many men who have sex with men (MSM) test for HIV at least once in their lifetime, opportunities to improve regular HIV testing, particularly among Hispanic or Latino MSM, are needed. Many mHealth interventions in development, including those on HIV testing, have primarily focused on English-speaking white, black, and MSM of other races. To date, no studies have assessed app use, attitudes, and motivations for downloading and sustaining use of mobile apps and preferences with respect to HIV prevention among Spanish-speaking, Hispanic MSM in the United States. Objective: The primary aims of this study were to determine (1) what features and functions of smartphone apps Hispanic, Spanish-speaking MSM believe are associated with downloading apps to their smartphones, (2) what features and functions of smartphone apps are most likely to influence men's sustained use of apps over time, and (3) what features and functions men prefer in a smartphone app aimed to promote regular testing for HIV. Methods: Interviews (N=15) were conducted with a racially diverse group of sexually active, HIV-negative, Spanish-speaking, Hispanic MSM in Miami, Florida. Interviews were digitally recorded, transcribed verbatim, translated back to English, and de-identified for analysis. A constant-comparison method (ie, grounded theory coding) was employed to examine themes that emerged from the interviews. Results: Personal interest was the primary reason associated with whether men downloaded an app. Keeping personal information secure, cost, influence by peers and posted reviews, ease of use, and functionality affected whether they downloaded and used the app over time. Men also reported that entertainment value and frequency of updates influenced whether they kept and continued to use an app over time. There were 4 reasons why participants chose to delete an app: dislike, lack of use, cost, and lack of memory or space. Participants also shared their preferences for an app to encourage regular HIV testing by providing feedback on test reminders, tailored testing interval recommendations, an HIV test locator, and monitoring of personal sexual behaviors. Conclusions: The features and functions of mobile apps that Spanish-speaking MSM in this study believed were associated with downloading and/or sustained engagement with an app generally reflected the priorities mentioned in an earlier study with English-speaking MSM. Unlike the earlier study, Spanish-speaking MSM prioritized personal interest in a mobile app and de-emphasized the efficiency of an app to make their lives easier in their decision to download an app to their mobile device. Tailoring mobile apps to the language and needs of Spanish-speaking MSM is critical to help increase their willingness to download a mobile app. Despite the growing number of HIV-prevention apps in development, few are tailored to Spanish-speaking MSM, representing an important gap that should be addressed in future research. PMID:29691205
ProteinTracker: an application for managing protein production and purification
2012-01-01
Background Laboratories that produce protein reagents for research and development face the challenge of deciding whether to track batch-related data using simple file based storage mechanisms (e.g. spreadsheets and notebooks), or commit the time and effort to install, configure and maintain a more complex laboratory information management system (LIMS). Managing reagent data stored in files is challenging because files are often copied, moved, and reformatted. Furthermore, there is no simple way to query the data if/when questions arise. Commercial LIMS often include additional modules that may be paid for but not actually used, and often require software expertise to truly customize them for a given environment. Findings This web-application allows small to medium-sized protein production groups to track data related to plasmid DNA, conditioned media samples (supes), cell lines used for expression, and purified protein information, including method of purification and quality control results. In addition, a request system was added that includes a means of prioritizing requests to help manage the high demand of protein production resources at most organizations. ProteinTracker makes extensive use of existing open-source libraries and is designed to track essential data related to the production and purification of proteins. Conclusions ProteinTracker is an open-source web-based application that provides organizations with the ability to track key data involved in the production and purification of proteins and may be modified to meet the specific needs of an organization. The source code and database setup script can be downloaded from http://sourceforge.net/projects/proteintracker. This site also contains installation instructions and a user guide. A demonstration version of the application can be viewed at http://www.proteintracker.org. PMID:22574679
An open source automatic quality assurance (OSAQA) tool for the ACR MRI phantom.
Sun, Jidi; Barnes, Michael; Dowling, Jason; Menk, Fred; Stanwell, Peter; Greer, Peter B
2015-03-01
Routine quality assurance (QA) is necessary and essential to ensure MR scanner performance. This includes geometric distortion, slice positioning and thickness accuracy, high contrast spatial resolution, intensity uniformity, ghosting artefact and low contrast object detectability. However, this manual process can be very time consuming. This paper describes the development and validation of an open source tool to automate the MR QA process, which aims to increase physicist efficiency and improve the consistency of QA results by reducing human error. The OSAQA software was developed in Matlab and the source code is available for download from http://jidisun.wix.com/osaqa-project/. During program execution, QA results are logged for immediate review and are also exported to a spreadsheet for long-term machine performance reporting. For the automatic contrast QA test, a user-specific contrast evaluation was designed to improve accuracy for individuals on different display monitors. American College of Radiology QA images were acquired over a period of 2 months to compare manual QA and the results from the proposed OSAQA software. OSAQA was found to significantly reduce the QA time from approximately 45 min to 2 min. Both the manual and OSAQA results were found to agree with regard to the recommended criteria and the differences were insignificant compared to the criteria. The intensity homogeneity filter is necessary to obtain an image with acceptable quality and at the same time keep the high contrast spatial resolution within the recommended criterion. The OSAQA tool has been validated on scanners with different field strengths and manufacturers. A number of suggestions have been made to improve both the phantom design and the QA protocol in the future.
The Bioperl Toolkit: Perl Modules for the Life Sciences
Stajich, Jason E.; Block, David; Boulez, Kris; Brenner, Steven E.; Chervitz, Stephen A.; Dagdigian, Chris; Fuellen, Georg; Gilbert, James G.R.; Korf, Ian; Lapp, Hilmar; Lehväslaiho, Heikki; Matsalla, Chad; Mungall, Chris J.; Osborne, Brian I.; Pocock, Matthew R.; Schattner, Peter; Senger, Martin; Stein, Lincoln D.; Stupka, Elia; Wilkinson, Mark D.; Birney, Ewan
2002-01-01
The Bioperl project is an international open-source collaboration of biologists, bioinformaticians, and computer scientists that has evolved over the past 7 yr into the most comprehensive library of Perl modules available for managing and manipulating life-science information. Bioperl provides an easy-to-use, stable, and consistent programming interface for bioinformatics application programmers. The Bioperl modules have been successfully and repeatedly used to reduce otherwise complex tasks to only a few lines of code. The Bioperl object model has been proven to be flexible enough to support enterprise-level applications such as EnsEMBL, while maintaining an easy learning curve for novice Perl programmers. Bioperl is capable of executing analyses and processing results from programs such as BLAST, ClustalW, or the EMBOSS suite. Interoperation with modules written in Python and Java is supported through the evolving BioCORBA bridge. Bioperl provides access to data stores such as GenBank and SwissProt via a flexible series of sequence input/output modules, and to the emerging common sequence data storage format of the Open Bioinformatics Database Access project. This study describes the overall architecture of the toolkit, the problem domains that it addresses, and gives specific examples of how the toolkit can be used to solve common life-sciences problems. We conclude with a discussion of how the open-source nature of the project has contributed to the development effort. [Supplemental material is available online at www.genome.org. Bioperl is available as open-source software free of charge and is licensed under the Perl Artistic License (http://www.perl.com/pub/a/language/misc/Artistic.html). It is available for download at http://www.bioperl.org. Support inquiries should be addressed to bioperl-l@bioperl.org.] PMID:12368254
Climate-Smart Seedlot Selection Tool: Reforestation and Restoration for the 21st Century
NASA Astrophysics Data System (ADS)
Stevenson-Molnar, N.; Howe, G.; St Clair, B.; Bachelet, D. M.; Ward, B. C.
2017-12-01
Local populations of trees are generally adapted to their local climates. Historically, this has meant that local seed zones based on geography and elevation have been used to guide restoration and reforestation. In the face of climate change, seeds from local sources will likely be subjected to climates significantly different from those to which they are currently adapted. The Seedlot Selection Tool (SST) offers a new approach for matching seed sources with planting sites based on future climate scenarios. The SST is a mapping program designed for forest managers and researchers. Users can use the tool to find seedlots for a given planting site, or to find potential planting sites for a given seedlot. Users select a location (seedlot or planting site), climate scenarios (a climate to which seeds are adapted, and a current or future climate scenario), climate variables, and transfer limits (the maximum climatic distance that is considered a suitable match). Transfer limits are provided by the user, or derived from the range of values within a geographically defined seed zone. The tool calculates scores across the landscape based on an area's similarity, in a multivariate climate space, to the input. Users can explore results on an interactive map, and export PDF and PowerPoint reports, including a map of the results along with the inputs used. Planned future improvements include support for non-forest use cases and the ability to download results as GeoTIFF data. The Seedlot Selection Tool and its source code are available online at https://seedlotselectiontool.org. It is co-developed by the United States Forest Service, Oregon State University, and the Conservation Biology Institute.
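The landscape scoring can be pictured as a climatic distance scaled by the transfer limits. The formula below is an illustrative sketch, not necessarily the tool's exact method:

```python
# Illustrative climate-match score in the spirit of the SST: similarity in a
# standardized multivariate climate space, zeroed beyond the transfer limit.
# The exact formula used by the tool may differ; this is a sketch.
import numpy as np

def match_score(site_climate, seedlot_climate, transfer_limits):
    # per-variable differences, scaled by the user-chosen transfer limits
    d = (np.asarray(site_climate) - np.asarray(seedlot_climate)) / np.asarray(transfer_limits)
    distance = np.sqrt(np.mean(d ** 2))  # multivariate climatic distance
    return max(0.0, 1.0 - distance)      # 1 = identical, 0 = at/beyond the limit

# mean annual temperature (deg C) and annual precipitation (mm); limits chosen by user
print(match_score([9.5, 800.0], [8.0, 950.0], transfer_limits=[2.0, 300.0]))
```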
Astronomy education and the Astrophysics Source Code Library
NASA Astrophysics Data System (ADS)
Allen, Alice; Nemiroff, Robert J.
2016-01-01
The Astrophysics Source Code Library (ASCL) is an online registry of source codes used in refereed astrophysics research. It currently lists nearly 1,200 codes and covers all aspects of computational astrophysics. How can this resource be of use to educators and to the graduate students they mentor? The ASCL serves as a discovery tool for codes that can be used for one's own research. Graduate students can also investigate existing codes to see how common astronomical problems are approached numerically in practice, and use these codes as benchmarks for their own solutions to these problems. Further, they can deepen their knowledge of software practices and techniques through examination of others' codes.
NASA Astrophysics Data System (ADS)
Nakagawa, Y.; Kawahara, S.; Araki, F.; Matsuoka, D.; Ishikawa, Y.; Fujita, M.; Sugimoto, S.; Okada, Y.; Kawazoe, S.; Watanabe, S.; Ishii, M.; Mizuta, R.; Murata, A.; Kawase, H.
2017-12-01
Analyses of large-ensemble data are quite useful for producing probabilistic projections of climate change effects. Ensemble data for "+2K future climate simulations" are currently produced by the Japanese national project "Social Implementation Program on Climate Change Adaptation Technology (SI-CAT)" as part of a database for Policy Decision making for Future climate change (d4PDF; Mizuta et al. 2016) produced by the Program for Risk Information on Climate Change. Those data consist of global warming simulations and regional downscaling simulations. Considering that those data volumes are too large (a few petabytes) to download to a user's local computer, a user-friendly system is required to search and download the data that satisfy the users' requests. Under SI-CAT, we are developing "a database system for near-future climate change projections" that provides functions to find the necessary data. The system mainly consists of a relational database, a data download function and a user interface. The relational database, using PostgreSQL, is the key component; temporally and spatially compressed data are registered in it. As a first step, we developed the relational database for precipitation, temperature and typhoon track data according to requests by SI-CAT members. The data download function, using the Open-source Project for a Network Data Access Protocol (OPeNDAP), allows users to download temporally and spatially extracted data based on search results obtained from the relational database. We have also developed a web-based user interface for the relational database and the data download function. A prototype of the system is currently in operational testing on our local server, and the system will be released on the Data Integration and Analysis System Program (DIAS) in fiscal year 2017. The techniques behind the system might also be quite useful for simulation and observational data in other research fields. We report the current status of development and some case studies of the system.
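The OPeNDAP extraction step can be illustrated with xarray, which opens remote datasets lazily so that only the selected temporal/spatial block is transferred. The URL and variable names below are placeholders, not the actual d4PDF endpoints.

```python
# Sketch of OPeNDAP-style extraction: open a remote dataset lazily and
# download only a temporal/spatial subset. URL and variable names are
# hypothetical placeholders, not the real d4PDF server.
import xarray as xr

url = "http://example-dias-server.org/opendap/d4pdf/precip_plus2K.nc"  # hypothetical
ds = xr.open_dataset(url)  # lazy: no bulk download yet

subset = ds["precip"].sel(
    time=slice("2051-06-01", "2051-08-31"),          # one summer season
    lat=slice(30.0, 46.0), lon=slice(128.0, 146.0),  # region around Japan
)
subset.to_netcdf("precip_subset.nc")  # only the selected block is transferred
```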
Comeau, Donald C; Liu, Haibin; Islamaj Doğan, Rezarta; Wilbur, W John
2014-01-01
BioC is a new format and associated code libraries for sharing text and annotations. We have implemented BioC natural language preprocessing pipelines in two popular programming languages: C++ and Java. The current implementations interface with the well-known MedPost and Stanford natural language processing tool sets. The pipeline functionality includes sentence segmentation, tokenization, part-of-speech tagging, lemmatization and sentence parsing. These pipelines can be easily integrated along with other BioC programs into any BioC compliant text mining systems. As an application, we converted the NCBI disease corpus to BioC format, and the pipelines have successfully run on this corpus to demonstrate their functionality. Code and data can be downloaded from http://bioc.sourceforge.net. Database URL: http://bioc.sourceforge.net.
Donkin, Chris; Averell, Lee; Brown, Scott; Heathcote, Andrew
2009-11-01
Cognitive models of the decision process provide greater insight into response time and accuracy than do standard ANOVA techniques. However, such models can be mathematically and computationally difficult to apply. We provide instructions and computer code for three methods for estimating the parameters of the linear ballistic accumulator (LBA), a new and computationally tractable model of decisions between two or more choices. These methods (a Microsoft Excel worksheet, scripts for the statistical program R, and code for implementing the LBA in the Bayesian sampling software WinBUGS) vary in their flexibility and user accessibility. We also provide scripts in R that produce a graphical summary of the data and model predictions. In a simulation study, we explored the effect of sample size on parameter recovery for each method. The materials discussed in this article may be downloaded as a supplement from http://brm.psychonomic-journals.org/content/supplemental.
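The LBA itself is simple to simulate from its standard definition: each accumulator draws a uniform start point in [0, A] and a normally distributed drift rate, and the first to reach threshold b determines the choice and response time. A sketch of that process (not the authors' Excel/R/WinBUGS materials):

```python
# Simulating decisions from the linear ballistic accumulator, per the
# standard model definition; a sketch, not the article's supplied code.
import numpy as np

def lba_trial(v_means, rng, A=0.5, b=1.0, s=0.25, t0=0.2):
    starts = rng.uniform(0.0, A, size=len(v_means))  # uniform start points
    drifts = rng.normal(v_means, s)                  # trial-to-trial drift rates
    drifts[drifts <= 0] = 1e-6                       # crude guard against stalled racers
    times = (b - starts) / drifts                    # linear race to threshold b
    winner = int(np.argmin(times))
    return winner, t0 + times[winner]                # (choice, response time in s)

rng = np.random.default_rng(1)
for choice, rt in (lba_trial([1.2, 0.8], rng) for _ in range(5)):
    print(f"choice {choice}, RT {rt:.3f} s")
```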
Code of Federal Regulations, 2013 CFR
2013-04-01
... limited to software, files, data, and prize schedules. (2) Downloads must use secure methodologies that... date of the completion of the download; (iii) The Class II gaming system components to which software was downloaded; (iv) The version(s) of download package and any software downloaded. Logging of the...
Code of Federal Regulations, 2014 CFR
2014-04-01
... limited to software, files, data, and prize schedules. (2) Downloads must use secure methodologies that... date of the completion of the download; (iii) The Class II gaming system components to which software was downloaded; (iv) The version(s) of download package and any software downloaded. Logging of the...
Data processing with microcode designed with source coding
McCoy, James A; Morrison, Steven E
2013-05-07
Programming for a data processor to execute a data processing application is provided using microcode source code. The microcode source code is assembled to produce microcode that includes digital microcode instructions with which to signal the data processor to execute the data processing application.
Monte Carlo errors with less errors
NASA Astrophysics Data System (ADS)
Wolff, Ulli; Alpha Collaboration
2004-01-01
We explain in detail how to estimate mean values and assess statistical errors for arbitrary functions of elementary observables in Monte Carlo simulations. The method is to estimate and sum the relevant autocorrelation functions, which is argued to produce more certain error estimates than binning techniques and hence to help toward a better exploitation of expensive simulations. An effective integrated autocorrelation time is computed which is suitable for benchmarking the efficiencies of simulation algorithms with regard to specific observables of interest. A Matlab code that implements the method is offered for download. It can also combine independent runs (replicas), allowing one to judge their consistency.
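A simplified Python analogue of the procedure: estimate the normalized autocorrelation function, sum it up to a window W to get the integrated autocorrelation time, and inflate the naive error accordingly. Wolff's code selects W automatically; here W is fixed for brevity.

```python
# Simplified sketch of the autocorrelation-summation error estimate:
# tau_int = 1/2 + sum_t rho(t), error = sqrt(2 * tau_int * var / n).
# (The paper's Matlab code chooses the window W automatically.)
import numpy as np

def mc_error(samples, W=50):
    x = np.asarray(samples, dtype=float)
    n = x.size
    dx = x - x.mean()
    var = np.mean(dx * dx)
    rho = [np.mean(dx[:n - t] * dx[t:]) / var for t in range(1, W + 1)]
    tau_int = 0.5 + sum(rho)  # integrated autocorrelation time
    return x.mean(), np.sqrt(2.0 * tau_int * var / n), tau_int

# correlated toy chain: AR(1) process with known tau_int ~ 9.5
rng = np.random.default_rng(0)
chain = np.zeros(100_000)
for i in range(1, chain.size):
    chain[i] = 0.9 * chain[i - 1] + rng.normal()
mean, err, tau = mc_error(chain)
print(f"mean = {mean:.4f} +/- {err:.4f}, tau_int = {tau:.1f}")
```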
genenames.org: the HGNC resources in 2011
Seal, Ruth L.; Gordon, Susan M.; Lush, Michael J.; Wright, Mathew W.; Bruford, Elspeth A.
2011-01-01
The HUGO Gene Nomenclature Committee (HGNC) aims to assign a unique gene symbol and name to every human gene. The HGNC database currently contains almost 30 000 approved gene symbols, over 19 000 of which represent protein-coding genes. The public website, www.genenames.org, displays all approved nomenclature within Symbol Reports that contain data curated by HGNC editors and links to related genomic, phenotypic and proteomic information. Here we describe improvements to our resources, including a new Quick Gene Search, a new List Search, an integrated HGNC BioMart and a new Statistics and Downloads facility. PMID:20929869
Adaptive distributed source coding.
Varodayan, David; Lin, Yao-Chung; Girod, Bernd
2012-05-01
We consider distributed source coding in the presence of hidden variables that parameterize the statistical dependence among sources. We derive the Slepian-Wolf bound and devise coding algorithms for a block-candidate model of this problem. The encoder sends, in addition to syndrome bits, a portion of the source to the decoder uncoded as doping bits. The decoder uses the sum-product algorithm to simultaneously recover the source symbols and the hidden statistical dependence variables. We also develop novel techniques based on density evolution (DE) to analyze the coding algorithms. We experimentally confirm that our DE analysis closely approximates practical performance. This result allows us to efficiently optimize parameters of the algorithms. In particular, we show that the system performs close to the Slepian-Wolf bound when an appropriate doping rate is selected. We then apply our coding and analysis techniques to a reduced-reference video quality monitoring system and show a bit rate saving of about 75% compared with fixed-length coding.
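For reference, the classical Slepian-Wolf region for two correlated sources, which the block-candidate analysis above specializes:

```latex
% Classical Slepian-Wolf region: lossless distributed coding of correlated
% sources X and Y is possible iff the rate pair satisfies
\[
  R_X \ge H(X \mid Y), \qquad
  R_Y \ge H(Y \mid X), \qquad
  R_X + R_Y \ge H(X, Y).
\]
```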
Alternative Fuels Data Center: Data Downloads
The Particle Accelerator Simulation Code PyORBIT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gorlov, Timofey V; Holmes, Jeffrey A; Cousineau, Sarah M
2015-01-01
The particle accelerator simulation code PyORBIT is presented. The structure, implementation, history, parallel and simulation capabilities, and future development of the code are discussed. The PyORBIT code is a new implementation and extension of algorithms of the original ORBIT code that was developed for the Spallation Neutron Source accelerator at the Oak Ridge National Laboratory. The PyORBIT code has a two-level structure. The upper level uses the Python programming language to control the flow of intensive calculations performed by the lower-level code implemented in the C++ language. The parallel capabilities are based on MPI communications. PyORBIT is an open source code accessible to the public through the Google Open Source Projects Hosting service.
An Open Source Simulation System
NASA Technical Reports Server (NTRS)
Slack, Thomas
2005-01-01
An investigation into the current state of the art of open source real-time programming practices. This document covers what technologies are available, how easy it is to obtain, configure, and use them, and some performance measures taken on the different systems. A matrix of vendors and their products is included as part of this investigation, but this is not an exhaustive list, and represents only a snapshot in time of a field that is changing rapidly. Specifically, three approaches are investigated: 1. Completely open source on generic hardware, downloaded from the net. 2. Open source packaged by a vendor and provided as a free evaluation copy. 3. Proprietary hardware with pre-loaded, source-available proprietary software provided by the vendor for our evaluation.
ERIC Educational Resources Information Center
Kroski, Ellyssa
2008-01-01
A widget displays Web content from external sources and can be embedded into a blog, social network, or other Web page, or downloaded to one's desktop. With widgets--sometimes referred to as gadgets--one can insert video into a blog post, display slideshows on MySpace, get the weather delivered to his mobile device, drag-and-drop his Netflix queue…
Twelve example local data support files are automatically downloaded when the SDMProjectBuilder is installed on a computer. They allow the user to modify values to parameters that impact the release, migration, fate, and transport of microbes within a watershed, and control delin...
2009-09-01
nuclear industry for conducting performance assessment calculations. The analytical FORTRAN code for the DNAPL source function, REMChlor, was...project. The first was to apply existing deterministic codes, such as T2VOC and UTCHEM, to the DNAPL source zone to simulate the remediation processes...but describe the spatial variability of source zones unlike one-dimensional flow and transport codes that assume homogeneity. The Lagrangian models
Phase II Evaluation of Clinical Coding Schemes
Campbell, James R.; Carpenter, Paul; Sneiderman, Charles; Cohn, Simon; Chute, Christopher G.; Warren, Judith
1997-01-01
Objective: To compare three potential sources of controlled clinical terminology (READ codes version 3.1, SNOMED International, and Unified Medical Language System (UMLS) version 1.6) relative to attributes of completeness, clinical taxonomy, administrative mapping, term definitions and clarity (duplicate coding rate). Methods: The authors assembled 1929 source concept records from a variety of clinical information taken from four medical centers across the United States. The source data included medical as well as ample nursing terminology. The source records were coded in each scheme by an investigator and checked by the coding scheme owner. The codings were then scored by an independent panel of clinicians for acceptability. Codes were checked for definitions provided with the scheme. Codes for a random sample of source records were analyzed by an investigator for “parent” and “child” codes within the scheme. Parent and child pairs were scored by an independent panel of medical informatics specialists for clinical acceptability. Administrative and billing code mapping from the published scheme were reviewed for all coded records and analyzed by independent reviewers for accuracy. The investigator for each scheme exhaustively searched a sample of coded records for duplications. Results: SNOMED was judged to be significantly more complete in coding the source material than the other schemes (SNOMED* 70%; READ 57%; UMLS 50%; *p < .00001). SNOMED also had a richer clinical taxonomy judged by the number of acceptable first-degree relatives per coded concept (SNOMED* 4.56; UMLS 3.17; READ 2.14; *p < .005). Only the UMLS provided any definitions; these were found for 49% of records which had a coding assignment. READ and UMLS had better administrative mappings (composite score: READ* 40.6%; UMLS* 36.1%; SNOMED 20.7%; *p < .00001), and SNOMED had substantially more duplications of coding assignments (duplication rate: READ 0%; UMLS 4.2%; SNOMED* 13.9%; *p < .004) associated with a loss of clarity. Conclusion: No major terminology source can lay claim to being the ideal resource for a computer-based patient record. However, based upon this analysis of releases for April 1995, SNOMED International is considerably more complete, has a compositional nature and a richer taxonomy. It suffers from less clarity, resulting from a lack of syntax and evolutionary changes in its coding scheme. READ has greater clarity and better mapping to administrative schemes (ICD-10 and OPCS-4), is rapidly changing and is less complete. UMLS is a rich lexical resource, with mappings to many source vocabularies. It provides definitions for many of its terms. However, due to the varying granularities and purposes of its source schemes, it has limitations for representation of clinical concepts within a computer-based patient record. PMID:9147343
NASA Technical Reports Server (NTRS)
Barry, Matthew R.; Osborne, Richard N.
2005-01-01
The RoseDoclet computer program extends the capability of Java doclet software to automatically synthesize Unified Modeling Language (UML) content from Java language source code. [Doclets are Java-language programs that use the doclet application programming interface (API) to specify the content and format of the output of Javadoc. Javadoc is a program, originally designed to generate API documentation from Java source code, now also useful as an extensible engine for processing Java source code.] RoseDoclet takes advantage of Javadoc comments and tags already in the source code to produce a UML model of that code. RoseDoclet applies the doclet API to create a doclet passed to Javadoc. The Javadoc engine applies the doclet to the source code, emitting the output format specified by the doclet. RoseDoclet emits a Rose model file and populates it with fully documented packages, classes, methods, variables, and class diagrams identified in the source code. The way in which UML models are generated can be controlled by use of new Javadoc comment tags that RoseDoclet provides. The advantage of using RoseDoclet is that Javadoc documentation becomes leveraged for two purposes: documenting the as-built API and keeping the design documentation up to date.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alexandrov, Boian S.; Lliev, Filip L.; Stanev, Valentin G.
This code is a toy (short) version of CODE-2016-83. From a general perspective, the code represents an unsupervised adaptive machine learning algorithm that allows efficient and high-performance de-mixing and feature extraction of a multitude of non-negative signals mixed and recorded by a network of uncorrelated sensor arrays. The code identifies the number of the mixed original signals and their locations. Further, the code also allows deciphering of signals that have been delayed with respect to the mixing process in each sensor. This code is highly customizable and can be used efficiently for fast macro-analyses of data. The code is applicable to a plethora of distinct problems: chemical decomposition, pressure transient decomposition, unknown source/signal allocation, and EM signal decomposition. An additional procedure for allocation of the unknown sources is incorporated in the code.
Joint Source-Channel Decoding of Variable-Length Codes with Soft Information: A Survey
NASA Astrophysics Data System (ADS)
Guillemot, Christine; Siohan, Pierre
2005-12-01
Multimedia transmission over time-varying wireless channels presents a number of challenges beyond existing capabilities conceived so far for third-generation networks. Efficient quality-of-service (QoS) provisioning for multimedia on these channels may in particular require a loosening and a rethinking of the layer separation principle. In that context, joint source-channel decoding (JSCD) strategies have gained attention as viable alternatives to separate decoding of source and channel codes. A statistical framework based on hidden Markov models (HMM) capturing dependencies between the source and channel coding components sets the foundation for optimal design of techniques of joint decoding of source and channel codes. The problem has been largely addressed in the research community, by considering both fixed-length codes (FLC) and variable-length source codes (VLC) widely used in compression standards. Joint source-channel decoding of VLC raises specific difficulties due to the fact that the segmentation of the received bitstream into source symbols is random. This paper makes a survey of recent theoretical and practical advances in the area of JSCD with soft information of VLC-encoded sources. It first describes the main paths followed for designing efficient estimators for VLC-encoded sources, the key component of the JSCD iterative structure. It then presents the main issues involved in the application of the turbo principle to JSCD of VLC-encoded sources as well as the main approaches to source-controlled channel decoding. This survey terminates by performance illustrations with real image and video decoding systems.
Development of tools and techniques for momentum compression of fast rare isotopes
DOE Office of Scientific and Technical Information (OSTI.GOV)
David J. Morrissey; Bradley M. Sherrill; Oleg Tarasov
2010-11-21
As part of our past research and development work, we have created and developed the LISE++ simulation code [Tar04, Tar08]. The LISE++ package was significantly extended with the addition of a Monte Carlo option that includes an option for calculating ion trajectories using a Taylor-series expansion up to fifth order, and implementation of the MOTER Monte Carlo code [Kow87] for ray tracing of the ions into the suite of LISE++ codes. The MOTER code was rewritten from FORTRAN into C++ and ported to the MS-Windows operating system. Extensive work went into the creation of a user-friendly interface for the code. An example of the graphical user interface created for the MOTER code is shown in the left panel of Figure 1, and the results of a typical calculation for the trajectories of particles that pass through the A1900 fragment separator are shown in the right panel. The MOTER code is presently included as part of the LISE++ package for downloading without restriction by the worldwide community. LISE++ was extensively developed and generalized to apply to any projectile fragment separator during the early phase of this grant. In addition to the inclusion of the MOTER code, other important additions were made to the LISE++ code during FY08/FY09. LISE++ is distributed over the web (http://groups.nscl.msu.edu/lise) and is available without charge to anyone by anonymous download; thus, the number of individual users is not recorded. The number of 'hits' on the servers that provide the LISE++ code is shown in Figure 3 for the last eight calendar years (left panel), along with the country from the IP address (right panel). The data show an increase in web activity with the release of the new version of the program during the grant period and a worldwide impact. An important part of the proposed work carried out during FY07, FY08 and FY09 by a graduate student in the MSU Physics program was to benchmark the codes by comparison of detailed measurements to the LISE++ predictions. A large data set was obtained for fission fragments from the reaction of 238U ions at 81 MeV/u in a 92 mg/cm2 beryllium target with the A1900 projectile fragment separator. The data were analyzed and form the bulk of a Ph.D. dissertation that is nearing completion. The rich data set provides a number of benchmarks for the improved LISE++ code and only a few examples can be shown here. The primary information obtained from the measurements is the yield of the products as a function of mass, charge and momentum. Examples of the momentum distributions of individually identified fragments can be seen in Figures 2 and 4, along with comparisons to the predicted distributions. The agreement is remarkably good and indicates the general validity of the model of the nuclear reactions producing these fragments and of the higher-order transmission calculations in the LISE++ code. The momentum distributions were integrated to provide the cross sections for the individual isotopes. As shown in Figure 5, there is good agreement with the model predictions, although the observed cross sections are a factor of five or so higher in this case. Other comparisons of measured production cross sections from abrasion-fission reactions have been published by our group working at the NSCL during this period [Fol09] and through our collaboration with Japanese researchers working at RIKEN with the BigRIPS separator [Ohn08, Ohn10].
The agreement of the model predictions with the data obtained with two different fragment separators is very good and indicates the usefulness of the new LISE++ code.« less
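To make the Taylor-map approach concrete: a ray-tracing code of this kind propagates each particle's phase-space coordinates through the optics by evaluating a truncated Taylor expansion whose coefficients encode the separator's fields. A minimal Python sketch with made-up second-order coefficients (real maps such as MOTER's extend to fifth order and more coordinates):

    import numpy as np

    def apply_map(x, xp, R, T):
        """Propagate one particle's (x, x') through a truncated Taylor map:
        z_out[i] = sum_j R[i,j] z[j] + sum_{j,k} T[i,j,k] z[j] z[k]."""
        z = np.array([x, xp])
        linear = R @ z
        second = np.einsum("ijk,j,k->i", T, z, z)
        return linear + second

    R = np.array([[1.0, 2.5],     # hypothetical first-order (linear) optics
                  [0.0, 1.0]])
    T = np.zeros((2, 2, 2))
    T[0, 1, 1] = -0.8             # hypothetical second-order aberration term

    x_out, xp_out = apply_map(x=0.001, xp=0.002, R=R, T=T)
    print(x_out, xp_out)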
Open-Source Development of the Petascale Reactive Flow and Transport Code PFLOTRAN
NASA Astrophysics Data System (ADS)
Hammond, G. E.; Andre, B.; Bisht, G.; Johnson, T.; Karra, S.; Lichtner, P. C.; Mills, R. T.
2013-12-01
Open-source software development has become increasingly popular in recent years. Open-source development encourages collaborative and transparent software development and promotes unlimited free redistribution of source code to the public. It is good for science, as it reveals implementation details that are critical to scientific reproducibility but generally excluded from journal publications; in addition, research funds that would have been spent on licensing fees can be redirected to code development that benefits more scientists. In 2006, the developers of PFLOTRAN open-sourced their code under the U.S. Department of Energy SciDAC-II program. Since that time, the code has gained popularity among code developers and users from around the world seeking to employ PFLOTRAN to simulate thermal, hydraulic, mechanical and biogeochemical processes in the Earth's surface/subsurface environment. PFLOTRAN is a massively parallel subsurface reactive multiphase flow and transport simulator designed from the ground up to run efficiently on computing platforms ranging from the laptop to leadership-class supercomputers, all from a single code base. The code employs domain decomposition for parallelism and is founded upon the well-established and open-source parallel PETSc and HDF5 frameworks. PFLOTRAN leverages modern Fortran (i.e., Fortran 2003-2008) in its extensible object-oriented design. The use of this progressive, yet domain-friendly, programming language has greatly facilitated collaboration in the code's software development. Over the past year, PFLOTRAN's top-level data structures were refactored as Fortran classes (i.e., extendible derived types) to improve the flexibility of the code, ease the addition of new process models, and enable coupling to external simulators. For instance, PFLOTRAN has been coupled to the parallel electrical resistivity tomography code E4D to enable hydrogeophysical inversion, while the same code base can be used as a third-party library to provide hydrologic flow, energy transport, and biogeochemical capability to the Community Land Model (CLM), part of the open-source Community Earth System Model (CESM) for climate. In this presentation, the advantages and disadvantages of open-source software development in support of geoscience research at government laboratories, universities, and the private sector are discussed. Since the code is open source (i.e., transparent and readily available to competitors), the PFLOTRAN team's development strategy within a competitive research environment is presented. Finally, the developers discuss their approach to object-oriented programming and the leveraging of modern Fortran in support of collaborative geoscience research as the Fortran standard evolves among compiler vendors.
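The design payoff described here, refactoring top-level data structures into extendible classes so that a new process model plugs in without touching the driver, can be sketched as follows (Python standing in for PFLOTRAN's Fortran 2003 derived types; all class and field names are illustrative only):

    class ProcessModel:
        """Base class; each process model implements the same interface."""
        def setup(self, grid): ...
        def residual(self, state): ...

    class RichardsFlow(ProcessModel):          # hypothetical subclass names
        def residual(self, state):
            return state["pressure"] * 0.0     # placeholder physics

    class HeatTransport(ProcessModel):
        def residual(self, state):
            return state["temperature"] * 0.0  # placeholder physics

    # The driver iterates over whatever models were registered, so adding
    # a new process model means adding a subclass, not editing the loop.
    def step(models, state):
        return [m.residual(state) for m in models]

    state = {"pressure": 1.0, "temperature": 20.0}
    print(step([RichardsFlow(), HeatTransport()], state))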
Multidimensional incremental parsing for universal source coding.
Bae, Soo Hyun; Juang, Biing-Hwang
2008-10-01
A multidimensional incremental parsing algorithm (MDIP) for multidimensional discrete sources, as a generalization of the Lempel-Ziv coding algorithm, is investigated. It consists of three essential component schemes: maximum decimation matching, a hierarchical structure for multidimensional source coding, and dictionary augmentation. As a counterpart of the longest-match search in the Lempel-Ziv algorithm, two classes of maximum decimation matching are studied. An underlying behavior of the dictionary augmentation scheme for estimating the source statistics is also examined. For an m-dimensional source, m augmentative patches are appended to the dictionary at each coding epoch, which would require the transmission of a substantial amount of information to the decoder. The hierarchical structure of the source coding algorithm resolves this issue by successively incorporating lower-dimensional coding procedures in the scheme. In regard to universal lossy source coders, we propose two distortion functions: the local average distortion and the local minimax distortion with a set of threshold levels for each source symbol. For performance evaluation, we implemented three image compression algorithms based upon the MDIP: one lossless and two lossy. The lossless image compression algorithm does not perform better than Lempel-Ziv-Welch coding, but experimentally shows efficiency in capturing the source structure. The two lossy image compression algorithms are implemented using the two distortion functions, respectively. The algorithm based on the local average distortion is efficient at minimizing signal distortion, while images produced with the local minimax distortion show good perceptual fidelity relative to other compression algorithms. Our insights inspire future research on feature extraction of multidimensional discrete sources.
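As background for readers unfamiliar with incremental parsing: the one-dimensional Lempel-Ziv (LZ78) scheme that MDIP generalizes parses the input into phrases, each a longest dictionary match extended by one new symbol, augmenting the dictionary as it goes. A minimal 1-D sketch (MDIP's contribution, not shown here, is doing this with m-dimensional patches and decimation matching):

    def lz78_parse(seq):
        """Incremental (LZ78-style) parsing: repeatedly emit the longest
        dictionary match plus one new symbol, then add the extended
        phrase to the dictionary (dictionary augmentation)."""
        dictionary = {"": 0}
        phrases = []
        w = ""
        for c in seq:
            if w + c in dictionary:
                w += c                               # keep extending the match
            else:
                phrases.append((dictionary[w], c))   # (match index, new symbol)
                dictionary[w + c] = len(dictionary)
                w = ""
        if w:
            phrases.append((dictionary[w], ""))
        return phrases

    print(lz78_parse("abababcabab"))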
Downloading from the OPAC: The Innovative Interfaces Environment.
ERIC Educational Resources Information Center
Spore, Stuart
1991-01-01
Discussion of downloading from online public access catalogs focuses on downloading to MS-DOS microcomputers from the INNOPAC online catalog system. Tools for capturing and postprocessing downloaded files are described, technical and institutional constraints on downloading are addressed, and an innovative program for overcoming such constraints…
An Efficient Variable Length Coding Scheme for an IID Source
NASA Technical Reports Server (NTRS)
Cheung, K. -M.
1995-01-01
A scheme is examined for using two alternating Huffman codes to encode a discrete independent and identically distributed source with a dominant symbol. This combined strategy, alternating runlength Huffman (ARH) coding, was found to be more efficient than ordinary Huffman coding in certain circumstances.
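A rough flavor of the approach, as a simplification: for a binary IID source dominated by one symbol, one can run-length encode the runs of the dominant symbol and Huffman-code the resulting run lengths. The sketch below computes Huffman codeword lengths with a heap; note that the actual ARH scheme alternates between two Huffman codes rather than using this single-code simplification:

    import heapq
    from collections import Counter

    def huffman_lengths(freqs):
        """Return {symbol: codeword length} for a Huffman code over freqs."""
        heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
        heapq.heapify(heap)
        depth = Counter()
        tiebreak = len(heap)
        while len(heap) > 1:
            f1, _, s1 = heapq.heappop(heap)
            f2, _, s2 = heapq.heappop(heap)
            for s in s1 + s2:
                depth[s] += 1          # each merge deepens the symbol by 1
            heapq.heappush(heap, (f1 + f2, tiebreak, s1 + s2))
            tiebreak += 1
        return dict(depth)

    # Source with a dominant symbol '0': run-length encode the 0-runs
    # between 1s, then Huffman-code the run lengths.
    data = "0001000000100001"
    runs = [len(r) for r in data.split("1")]
    print(runs, huffman_lengths(Counter(runs)))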
ITK: enabling reproducible research and open science
McCormick, Matthew; Liu, Xiaoxiao; Jomier, Julien; Marion, Charles; Ibanez, Luis
2014-01-01
Reproducibility verification is essential to the practice of the scientific method. Researchers report their findings, which are strengthened as other independent groups in the scientific community share similar outcomes. In the many scientific fields where software has become a fundamental tool for capturing and analyzing data, this requirement of reproducibility implies that reliable and comprehensive software platforms and tools should be made available to the scientific community. The tools will empower them and the public to verify, through practice, the reproducibility of observations that are reported in the scientific literature. Medical image analysis is one of the fields in which the use of computational resources, both software and hardware, are an essential platform for performing experimental work. In this arena, the introduction of the Insight Toolkit (ITK) in 1999 has transformed the field and facilitates its progress by accelerating the rate at which algorithmic implementations are developed, tested, disseminated and improved. By building on the efficiency and quality of open source methodologies, ITK has provided the medical image community with an effective platform on which to build a daily workflow that incorporates the true scientific practices of reproducibility verification. This article describes the multiple tools, methodologies, and practices that the ITK community has adopted, refined, and followed during the past decade, in order to become one of the research communities with the most modern reproducibility verification infrastructure. For example, 207 contributors have created over 2400 unit tests that provide over 84% code line test coverage. The Insight Journal, an open publication journal associated with the toolkit, has seen over 360,000 publication downloads. The median normalized closeness centrality, a measure of knowledge flow, resulting from the distributed peer code review system was high, 0.46. PMID:24600387
Greene, Jeremy A; Choudhry, Niteesh K; Kilabuk, Elaine; Shrank, William H
2011-03-01
Several disease-specific information exchanges now exist on Facebook and other online social networking sites. These new sources of knowledge, support, and engagement have become important for patients living with chronic disease, yet the quality and content of the information provided in these digital arenas are poorly understood. To qualitatively evaluate the content of communication in Facebook communities dedicated to diabetes. We identified the 15 largest Facebook groups focused on diabetes management. For each group, we downloaded the 15 most recent "wall posts" and the 15 most recent discussion topics from the 10 largest groups. Four hundred eighty unique users were identified in a series of 690 comments from wall posts and discussion topics. Posts were abstracted and aggregated into a database. Two investigators evaluated the posts, developed a thematic coding scheme, and applied codes to the data. Patients with diabetes, family members, and their friends use Facebook to share personal clinical information, to request disease-specific guidance and feedback, and to receive emotional support. Approximately two-thirds of posts included unsolicited sharing of diabetes management strategies, over 13% of posts provided specific feedback to information requested by other users, and almost 29% of posts featured an effort by the poster to provide emotional support to others as members of a community. Approximately 27% of posts featured some type of promotional activity, generally presented as testimonials advertising non-FDA approved, "natural" products. Clinically inaccurate recommendations were infrequent, but were usually associated with promotion of a specific product or service. Thirteen percent of posts contained requests for personal information from Facebook participants. Facebook provides a forum for reporting personal experiences, asking questions, and receiving direct feedback for people living with diabetes. However, promotional activity and personal data collection are also common, with no accountability or checks for authenticity.
Accelerating calculations of RNA secondary structure partition functions using GPUs
2013-01-01
Background RNA performs many diverse functions in the cell in addition to its role as a messenger of genetic information. These functions depend on its ability to fold to a unique three-dimensional structure determined by the sequence. The conformation of RNA is in part determined by its secondary structure, or the particular set of contacts between pairs of complementary bases. Prediction of the secondary structure of RNA from its sequence is therefore of great interest, but can be computationally expensive. In this work we accelerate computations of base-pair probabilities using parallel graphics processing units (GPUs). Results Calculation of the probabilities of base pairs in RNA secondary structures using nearest-neighbor standard free energy change parameters has been implemented using CUDA to run on hardware with multiprocessor GPUs. A modified set of recursions was introduced, which reduces memory usage by about 25%. GPUs are fastest in single precision, and for some hardware, restricted to single precision. This may introduce significant roundoff error. However, deviations in base-pair probabilities calculated using single precision were found to be negligible compared to those resulting from shifting the nearest-neighbor parameters by a random amount of magnitude similar to their experimental uncertainties. For large sequences running on our particular hardware, the GPU implementation reduces execution time by a factor of close to 60 compared with an optimized serial implementation, and by a factor of 116 compared with the original code. Conclusions Using GPUs can greatly accelerate computation of RNA secondary structure partition functions, allowing calculation of base-pair probabilities for large sequences in a reasonable amount of time, with a negligible compromise in accuracy due to working in single precision. The source code is integrated into the RNAstructure software package and available for download at http://rna.urmc.rochester.edu. PMID:24180434
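The single-precision concern is easy to demonstrate outside CUDA: accumulating many exponential (Boltzmann-like) terms in float32 versus float64, as a partition-function calculation does, exposes the size of the roundoff the authors found negligible relative to parameter uncertainty. A toy NumPy illustration, not the paper's code:

    import numpy as np

    # Sum many small Boltzmann-like terms, as a partition function does,
    # in single and double precision and compare the results.
    rng = np.random.default_rng(0)
    energies = rng.uniform(0.0, 10.0, size=1_000_000)

    z32 = np.exp(-energies.astype(np.float32)).sum(dtype=np.float32)
    z64 = np.exp(-energies.astype(np.float64)).sum(dtype=np.float64)

    print(z32, z64, abs(z32 - z64) / z64)  # relative error from float32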
An open real-time tele-stethoscopy system
2012-01-01
Background Acute respiratory infections are the leading cause of childhood mortality. The lack of physicians in rural areas of developing countries makes their correct diagnosis and treatment difficult. The staff of rural health facilities (health-care technicians) may not be qualified to distinguish respiratory diseases by auscultation. For this reason, the goal of this project is the development of a tele-stethoscopy system that allows a physician to receive real-time cardio-respiratory sounds from a remote auscultation, as well as video images showing where the technician is placing the stethoscope on the patient's body. Methods A real-time wireless stethoscopy system was designed. The initial requirements were: 1) the system must send audio and video synchronously over IP networks, not requiring an Internet connection; 2) it must preserve the quality of cardiorespiratory sounds, allowing the binaural pieces and chestpiece of standard stethoscopes to be adapted; and 3) cardiorespiratory sounds should be recordable at both sides of the communication. In order to verify the diagnostic capacity of the system, a clinical validation with eight specialists was designed. In a preliminary test, twelve patients were auscultated by all the physicians using the tele-stethoscopy system, versus local auscultation using a traditional stethoscope. The system must allow the physician to hear cardiac (systolic and diastolic murmurs, gallop sound, arrhythmias) and respiratory (rhonchi, rales and crepitations, wheeze, diminished and bronchial breath sounds, pleural friction rub) sounds. Results The design, development and initial validation of the real-time wireless tele-stethoscopy system are described in detail. The system was conceived from scratch as open-source, low-cost and designed in such a way that many universities and small local companies in developing countries may manufacture it. Only free open-source software has been used in order to minimize manufacturing costs and to seek alliances to support its improvement and adaptation. The microcontroller firmware code, the computer software code and the PCB schematics are available for free download in a subversion repository hosted on SourceForge. Conclusions It has been shown that real-time tele-stethoscopy, together with a videoconference system that allows a remote specialist to oversee the auscultation, may be a very helpful tool in rural areas of developing countries. PMID:22917062
Observer's Interface for Solar System Target Specification
NASA Astrophysics Data System (ADS)
Roman, Anthony; Link, Miranda; Moriarty, Christopher; Stansberry, John A.
2016-10-01
When observing an asteroid or comet with HST, it has been necessary for the observer to manually enter the target's orbital elements into the Astronomer's Proposal Tool (APT). This manual step allowed copy/paste transcription errors to creep in from the observer's source of orbital elements data. To address this issue, APT has now been improved with the capability to identify targets in the JPL Horizons database and then download their orbital elements directly. The observer first uses a target name resolver to choose the intended target from the Horizons database, and then downloads the orbital elements from Horizons directly into APT. A manual entry option is still retained if the observer does not wish to use elements from Horizons. This new capability is available for HST observing, and it will also be supported for JWST observing. The poster shows examples of this new interface.
Source Code Plagiarism--A Student Perspective
ERIC Educational Resources Information Center
Joy, M.; Cosma, G.; Yau, J. Y.-K.; Sinclair, J.
2011-01-01
This paper considers the problem of source code plagiarism by students within the computing disciplines and reports the results of a survey of students in Computing departments in 18 institutions in the U.K. This survey was designed to investigate how well students understand the concept of source code plagiarism and to discover what, if any,…
College Students' Moral Evaluations of Illegal Music Downloading
ERIC Educational Resources Information Center
Jambon, Marc M.; Smetana, Judith G.
2012-01-01
Although unauthorized music downloading is illegal, a majority of college students have downloaded music for free online. Evaluations of illegal music downloading and their association with downloading behavior were examined using social domain theory in a sample of 188 ethnically diverse college students (M_age = 19.80 years, SD =…
Hales, Sarah; Dunn, Caroline; Wilcox, Sara; Turner-McGrievy, Gabrielle M
2016-11-01
Apps using digital photos to track dietary intake and provide feedback are common, but no research has yet examined which evidence-based strategies these apps include. A content analysis of mobile apps for photo diet tracking was conducted, including whether the apps target techniques known to be effective in behavior-change interventions for healthy eating (HE), including self-regulation. An initial search of app stores yielded 34 apps (n = 8 Android and Apple; n = 11 Android; n = 15 Apple). One app was removed (unable to download), and other apps (n = 4) could not be rated (no longer available). The remaining apps (n = 29) were downloaded, reviewed, and coded by 2 independent reviewers to determine the number of known effective self-regulation and other behavior change techniques included. The raters met to compare their coding of the apps, calculate interrater agreement, resolve any discrepancies, and come to a consensus. Six apps (21%) did not utilize any of the behavior change techniques examined. Three apps (10%) provided feedback to users via crowdsourcing or collective feedback from other users and professionals, 7 apps (24%) used crowdsourcing or collective feedback, 1 app (3%) used professionals, and 18 apps (62%) did not provide any dietary feedback to users. Few photo diet-tracking apps include evidence-based strategies to improve dietary intake. Use of photos to self-monitor dietary intake and receive feedback has the potential to reduce user burden for self-monitoring, yet photo diet-tracking apps need to incorporate known effective behavior strategies for HE, including self-regulation. © 2016 Diabetes Technology Society.
Recent advances in coding theory for near error-free communications
NASA Technical Reports Server (NTRS)
Cheung, K.-M.; Deutsch, L. J.; Dolinar, S. J.; Mceliece, R. J.; Pollara, F.; Shahshahani, M.; Swanson, L.
1991-01-01
Channel and source coding theories are discussed. The following subject areas are covered: large constraint length convolutional codes (the Galileo code); decoder design (the big Viterbi decoder); Voyager's and Galileo's data compression scheme; current research in data compression for images; neural networks for soft decoding; neural networks for source decoding; finite-state codes; and fractals for data compression.
Hybrid concatenated codes and iterative decoding
NASA Technical Reports Server (NTRS)
Divsalar, Dariush (Inventor); Pollara, Fabrizio (Inventor)
2000-01-01
Several improved turbo code apparatuses and methods. The invention encompasses several classes: (1) A data source is applied to two or more encoders with an interleaver between the source and each of the second and subsequent encoders. Each encoder outputs a code element which may be transmitted or stored. A parallel decoder provides the ability to decode the code elements to derive the original source information d without use of a received data signal corresponding to d. The output may be coupled to a multilevel trellis-coded modulator (TCM). (2) A data source d is applied to two or more encoders with an interleaver between the source and each of the second and subsequent encoders. Each of the encoders outputs a code element. In addition, the original data source d is output from the encoder. All of the output elements are coupled to a TCM. (3) At least two data sources are applied to two or more encoders with an interleaver between each source and each of the second and subsequent encoders. The output may be coupled to a TCM. (4) At least two data sources are applied to two or more encoders with at least two interleavers between each source and each of the second and subsequent encoders. (5) At least one data source is applied to one or more serially linked encoders through at least one interleaver. The output may be coupled to a TCM. The invention includes a novel way of terminating a turbo coder.
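A minimal sketch of class (1), parallel concatenation, using a toy accumulator as the component encoder (the patent's encoders, interleaver design, termination scheme, and TCM stage are more elaborate):

    import random

    def rsc_encode(bits):
        """Toy recursive systematic convolutional (accumulator) encoder:
        parity[k] = parity[k-1] XOR bits[k]."""
        parity, state = [], 0
        for b in bits:
            state ^= b
            parity.append(state)
        return parity

    def turbo_encode(bits, interleaver):
        """Parallel concatenation: the same data feeds encoder 1 directly
        and encoder 2 through an interleaver; the output is the systematic
        bits plus both parity streams (rate about 1/3)."""
        parity1 = rsc_encode(bits)
        parity2 = rsc_encode([bits[i] for i in interleaver])
        return bits, parity1, parity2

    data = [1, 0, 1, 1, 0, 0, 1, 0]
    pi = list(range(len(data)))
    random.Random(42).shuffle(pi)          # pseudo-random interleaver
    print(turbo_encode(data, pi))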
Gschwind, Michael K
2013-07-23
Mechanisms for aggressively optimizing computer code are provided. With these mechanisms, a compiler determines an optimization to apply to a portion of source code and determines if the optimization as applied to the portion of source code will result in unsafe optimized code that introduces a new source of exceptions being generated by the optimized code. In response to a determination that the optimization is an unsafe optimization, the compiler generates an aggressively compiled code version, in which the unsafe optimization is applied, and a conservatively compiled code version in which the unsafe optimization is not applied. The compiler stores both versions and provides them for execution. Mechanisms are provided for switching between these versions during execution in the event of a failure of the aggressively compiled code version. Moreover, predictive mechanisms are provided for predicting whether such a failure is likely.
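The runtime switching idea can be sketched in a few lines: call the aggressively optimized version, and fall back to the conservative one when the exception introduced by the optimization fires. This Python sketch only mimics the control flow; the patent operates on compiled code versions, and all names here are illustrative:

    def make_switchable(aggressive, conservative):
        """Run the aggressively optimized version; on the new exception it
        may introduce, fall back to the conservatively compiled version."""
        def dispatch(*args):
            try:
                return aggressive(*args)
            except (OverflowError, ZeroDivisionError):
                return conservative(*args)   # safe version, optimization off
        return dispatch

    # Hypothetical example: a speculatively hoisted division that can trap.
    fast = lambda xs, d: [x // d for x in xs]         # assumes d != 0
    safe = lambda xs, d: [x // d if d else 0 for x in xs]

    f = make_switchable(fast, safe)
    print(f([4, 8], 2), f([4, 8], 0))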
YODA++: A proposal for a semi-automatic space mission control
NASA Astrophysics Data System (ADS)
Casolino, M.; de Pascale, M. P.; Nagni, M.; Picozza, P.
YODA++ is a proposal for a semi-automated data handling and analysis system for the PAMELA space experiment. The core routines have been developed to process a stream of raw data downlinked from the Resurs DK1 satellite (housing PAMELA) to the ground station in Moscow. Raw data consist of scientific data complemented by housekeeping information. Housekeeping information will be analyzed within a short time from download (1 h) in order to monitor the status of the experiment and to inform mission acquisition planning. A prototype for the data visualization will run on an Apache Tomcat web application server, providing an off-line analysis tool usable from a browser, along with part of the code for system maintenance. Data retrieval development is in the production phase, while a GUI interface for human-friendly monitoring is in a preliminary phase, as is a JavaServer Pages/JavaServer Faces (JSP/JSF) web application facility. On a longer timescale (1-3 h from download), scientific data are analyzed. The data storage core will be a mix of CERN's ROOT file structure and MySQL as a relational database. YODA++ is currently being used in the on-ground integration and testing of PAMELA data.
AnnotCompute: annotation-based exploration and meta-analysis of genomics experiments
Zheng, Jie; Stoyanovich, Julia; Manduchi, Elisabetta; Liu, Junmin; Stoeckert, Christian J.
2011-01-01
The ever-increasing scale of biological data sets, particularly those arising in the context of high-throughput technologies, requires the development of rich data exploration tools. In this article, we present AnnotCompute, an information discovery platform for repositories of functional genomics experiments such as ArrayExpress. Our system leverages semantic annotations of functional genomics experiments with controlled vocabulary and ontology terms, such as those from the MGED Ontology, to compute conceptual dissimilarities between pairs of experiments. These dissimilarities are then used to support two types of exploratory analysis—clustering and query-by-example. We show that our proposed dissimilarity measures correspond to a user's intuition about conceptual dissimilarity, and can be used to support effective query-by-example. We also evaluate the quality of clustering based on these measures. While AnnotCompute can support a richer data exploration experience, its effectiveness is limited in some cases, due to the quality of available annotations. Nonetheless, tools such as AnnotCompute may provide an incentive for richer annotations of experiments. Code is available for download at http://www.cbil.upenn.edu/downloads/AnnotCompute. Database URL: http://www.cbil.upenn.edu/annotCompute/ PMID:22190598
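A toy version of annotation-based dissimilarity and query-by-example, assuming a plain Jaccard measure over term sets (AnnotCompute's actual measure additionally weights ontology structure, so treat this only as a sketch with made-up annotations):

    def annotation_dissimilarity(terms_a, terms_b):
        """Jaccard-style conceptual dissimilarity between two experiments'
        controlled-vocabulary annotations."""
        a, b = set(terms_a), set(terms_b)
        return 1.0 - len(a & b) / len(a | b)

    # Query-by-example: rank repository experiments by dissimilarity.
    repo = {
        "E-1": {"liver", "RNA-seq", "homo sapiens"},
        "E-2": {"liver", "ChIP-seq", "mus musculus"},
        "E-3": {"kidney", "RNA-seq", "homo sapiens"},
    }
    query = {"liver", "RNA-seq", "homo sapiens"}
    for name, terms in sorted(repo.items(),
                              key=lambda kv: annotation_dissimilarity(query, kv[1])):
        print(name, round(annotation_dissimilarity(query, terms), 2))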
w4CSeq: software and web application to analyze 4C-seq data.
Cai, Mingyang; Gao, Fan; Lu, Wange; Wang, Kai
2016-11-01
Circularized Chromosome Conformation Capture followed by deep sequencing (4C-Seq) is a powerful technique to identify genome-wide partners interacting with a pre-specified genomic locus. Here, we present a computational and statistical approach to analyze 4C-Seq data generated from both enzyme digestion and sonication fragmentation-based methods. We implemented a command line software tool and a web interface called w4CSeq, which takes in the raw 4C sequencing data (FASTQ files) as input, performs automated statistical analysis and presents results in a user-friendly manner. Besides providing users with the list of candidate interacting sites/regions, w4CSeq generates figures showing the genome-wide distribution of interacting regions, and sketches the enrichment of key features such as TSSs, TTSs, CpG sites and DNA replication timing around 4C sites. Users can establish their own web server by downloading the source code at https://github.com/WGLab/w4CSeq. Additionally, a demo web server is available at http://w4cseq.wglab.org. Contact: kaiwang@usc.edu or wangelu@usc.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Kimura, Yasumasa; Soma, Takahiro; Kasahara, Naoko; Delobel, Diane; Hanami, Takeshi; Tanaka, Yuki; de Hoon, Michiel J L; Hayashizaki, Yoshihide; Usui, Kengo; Harbers, Matthias
2016-01-01
Analytical PCR experiments preferably use internal probes for monitoring the amplification reaction and specific detection of the amplicon. Such internal probes have to be designed in close context with the amplification primers, and may require additional considerations for the detection of genetic variations. Here we describe Edesign, a new online and stand-alone tool for designing sets of PCR primers together with an internal probe for conducting quantitative real-time PCR (qPCR) and genotypic experiments. Edesign can be used for selecting standard DNA oligonucleotides, such as TaqMan probes, but has been further extended with new functions and enhanced design features for Eprobes. Eprobes, with their single thiazole orange-labelled nucleotide, allow for highly sensitive genotypic assays because of their higher DNA binding affinity as compared to standard DNA oligonucleotides. Using new thermodynamic parameters, Edesign considers unique features of Eprobes during primer and probe design for establishing qPCR experiments and genotyping by melting curve analysis. Additional functions in Edesign allow probe design for effective discrimination between wild-type sequences and genetic variations either using standard DNA oligonucleotides or Eprobes. Edesign can be freely accessed online at http://www.dnaform.com/edesign2/, and the source code is available for download.
Jeong, Hyundoo; Yoon, Byung-Jun
2017-03-14
Network querying algorithms provide computational means to identify conserved network modules in large-scale biological networks that are similar to known functional modules, such as pathways or molecular complexes. Two main challenges for network querying algorithms are the high computational complexity of detecting potential isomorphism between the query and the target graphs and ensuring the biological significance of the query results. In this paper, we propose SEQUOIA, a novel network querying algorithm that effectively addresses these issues by utilizing a context-sensitive random walk (CSRW) model for network comparison and minimizing the network conductance of potential matches in the target network. The CSRW model, inspired by the pair hidden Markov model (pair-HMM) that has been widely used for sequence comparison and alignment, can accurately assess the node-to-node correspondence between different graphs by accounting for node insertions and deletions. The proposed algorithm identifies high-scoring network regions based on the CSRW scores, which are subsequently extended by maximally reducing the network conductance of the identified subnetworks. Performance assessment based on real PPI networks and known molecular complexes shows that SEQUOIA outperforms existing methods and clearly enhances the biological significance of the query results. The source code and datasets can be downloaded from http://www.ece.tamu.edu/~bjyoon/SEQUOIA.
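For readers unfamiliar with the objective being minimized: the conductance of a candidate subnetwork is the fraction of its edge volume that crosses the cut to the rest of the graph. A small sketch using networkx, with the karate-club graph standing in for a PPI network (SEQUOIA's seed-and-extend procedure is not reproduced):

    import networkx as nx

    def conductance(G, nodes):
        """Conductance of node set S: cut(S, V-S) / min(vol(S), vol(V-S)).
        SEQUOIA extends high-scoring seed matches so as to minimize this
        quantity; here we just evaluate it for a candidate subnetwork."""
        S = set(nodes)
        cut = sum(1 for u, v in G.edges() if (u in S) != (v in S))
        vol_S = sum(d for n, d in G.degree() if n in S)
        vol_rest = sum(d for n, d in G.degree() if n not in S)
        return cut / min(vol_S, vol_rest)

    G = nx.karate_club_graph()                 # stand-in for a PPI network
    print(conductance(G, [0, 1, 2, 3, 7]))     # lower = better-separated module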
Pedretti, Alessandro; Mazzolari, Angelica; Vistoli, Giulio
2018-05-21
The manuscript describes WarpEngine, a novel platform implemented within the VEGA ZZ suite of software for performing distributed simulations in both local and wide area networks. Despite being tailored for structure-based virtual screening campaigns, WarpEngine possesses the flexibility to carry out distributed calculations using various pieces of software, which can be easily encapsulated within the platform without changing their source code. WarpEngine takes advantage of all cheminformatics features implemented in the VEGA ZZ program as well as of its largely customizable scripting architecture, thus allowing efficient distribution of various time-demanding simulations. To offer an example of WarpEngine's potential, the manuscript includes a set of virtual screening campaigns based on the ACE data set of the DUD-E collection, using PLANTS as the docking application. Benchmarking analyses revealed a satisfactory linearity of the WarpEngine performances, the speed-up values being roughly equal to the number of utilized cores. The computed scalability values also emphasized that the vast majority (i.e., >90%) of the performed simulations benefit from the distributed platform presented here. WarpEngine can be freely downloaded, along with the VEGA ZZ program, at www.vegazz.net.
Lowekamp, Bradley C; Chen, David T; Ibáñez, Luis; Blezek, Daniel
2013-01-01
SimpleITK is a new interface to the Insight Segmentation and Registration Toolkit (ITK) designed to facilitate rapid prototyping, education and scientific activities via high-level programming languages. ITK is a templated C++ library of image processing algorithms and frameworks for biomedical and other applications, and it was designed to be generic, flexible and extensible. Initially, ITK provided a direct wrapping interface to languages such as Python and Tcl through the WrapITK system. Unlike WrapITK, which exposed ITK's complex templated interface, SimpleITK was designed to provide an easy-to-use and simplified interface to ITK's algorithms. It includes procedural methods, hides ITK's demand-driven pipeline, and provides a template-less layer. SimpleITK also provides practical conveniences such as binary distribution packages and overloaded operators. Our user-friendly design goals dictated a departure from the direct interface wrapping approach of WrapITK, toward a new facade class structure that only exposes the required functionality, hiding ITK's extensive template use. Internally SimpleITK utilizes a manual description of each filter with code generation and advanced C++ meta-programming to provide the higher-level interface, bringing the capabilities of ITK to a wider audience. SimpleITK is licensed as an open-source software library under the Apache License Version 2.0, and more information about downloading it can be found at http://www.simpleitk.org.
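A short example of the procedural, template-free style described above, using calls from SimpleITK's Python package (the file names are hypothetical):

    import SimpleITK as sitk

    # Procedural usage: no pipeline objects, no C++ templates.
    image = sitk.ReadImage("input.nii")                  # hypothetical file
    smoothed = sitk.SmoothingRecursiveGaussian(image, 2.0)
    mask = sitk.OtsuThreshold(smoothed, 0, 1)            # inside=0, outside=1
    sitk.WriteImage(mask, "mask.nii")

    # Overloaded operators work directly on images:
    masked = image * sitk.Cast(mask, image.GetPixelID())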
Devailly, Guillaume; Mantsoki, Anna; Joshi, Anagha
2016-11-01
Better protocols and decreasing costs have made high-throughput sequencing experiments accessible even to small experimental laboratories. However, comparing one or a few experiments generated by an individual lab to the vast amount of relevant data freely available in the public domain may be limited by a lack of bioinformatics expertise. Though several tools, including genome browsers, allow such comparison at a single-gene level, they do not provide a genome-wide view. We developed Heat*seq, a web tool that allows genome-scale comparison of high-throughput experiments (chromatin immuno-precipitation followed by sequencing, RNA-sequencing and Cap Analysis of Gene Expression) provided by a user to the data in the public domain. Heat*seq currently contains over 12 000 experiments across diverse tissues and cell types in human, mouse and drosophila. Heat*seq displays interactive correlation heatmaps, with an ability to dynamically subset datasets to contextualize user experiments. High-quality figures and tables are produced and can be downloaded in multiple formats. Web application: http://www.heatstarseq.roslin.ed.ac.uk/. Source code: https://github.com/gdevailly. Contact: Guillaume.Devailly@roslin.ed.ac.uk or Anagha.Joshi@roslin.ed.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Scalable clustering algorithms for continuous environmental flow cytometry.
Hyrkas, Jeremy; Clayton, Sophie; Ribalet, Francois; Halperin, Daniel; Armbrust, E Virginia; Howe, Bill
2016-02-01
Recent technological innovations in flow cytometry now allow oceanographers to collect high-frequency flow cytometry data from particles in aquatic environments on a scale far surpassing conventional flow cytometers. The SeaFlow cytometer continuously profiles microbial phytoplankton populations across thousands of kilometers of the surface ocean. The data streams produced by instruments such as SeaFlow challenge the traditional sample-by-sample approach in cytometric analysis and highlight the need for scalable clustering algorithms to extract population information from these large-scale, high-frequency flow cytometers. We explore how available algorithms commonly used for medical applications perform at classifying such large-scale environmental flow cytometry data. We apply large-scale Gaussian mixture models to massive datasets using Hadoop. This approach outperforms current state-of-the-art cytometry classification algorithms in accuracy and can be coupled with manual or automatic partitioning of data into homogeneous sections for further classification gains. We propose the Gaussian mixture model with partitioning approach for classification of large-scale, high-frequency flow cytometry data. Source code is available for download at https://github.com/jhyrkas/seaflow_cluster, implemented in Java for use with Hadoop. Contact: hyrkas@cs.washington.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
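The core classification step, fitting a Gaussian mixture to cytometry events and assigning each event to a population, can be sketched with scikit-learn on synthetic data (the paper's contribution is scaling this out over Hadoop partitions, which is not shown):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    # Toy stand-in for cytometry events: two log-scaled optical channels.
    rng = np.random.default_rng(1)
    pop_a = rng.normal([2.0, 3.0], 0.3, size=(500, 2))
    pop_b = rng.normal([4.0, 1.5], 0.4, size=(800, 2))
    events = np.vstack([pop_a, pop_b])

    # Fit a Gaussian mixture and assign each event to a population.
    gmm = GaussianMixture(n_components=2, random_state=0).fit(events)
    labels = gmm.predict(events)
    print(np.bincount(labels), gmm.means_.round(2))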
msap: a tool for the statistical analysis of methylation-sensitive amplified polymorphism data.
Pérez-Figueroa, A
2013-05-01
In this study msap, an R package for analysing methylation-sensitive amplified polymorphism (MSAP or MS-AFLP) data, is presented. The program provides a deep analysis of epigenetic variation starting from a binary data matrix indicating the banding pattern between the isoschizomeric endonucleases HpaII and MspI, which have differential sensitivity to cytosine methylation. After comparing the restriction fragments, the program determines whether each fragment is susceptible to methylation (representative of epigenetic variation) or shows no evidence of methylation (representative of genetic variation). The package provides, in a user-friendly command line interface, a pipeline of different analyses of the variation (genetic and epigenetic) among user-defined groups of samples, as well as the classification of the methylation occurrences in those groups. Statistical testing provides support to the analyses. A comprehensive report of the analyses and several useful plots can help researchers assess the epigenetic and genetic variation in their MSAP experiments. msap is downloadable from CRAN (http://cran.r-project.org/) and its own webpage (http://msap.r-forge.R-project.org/). The package is intended to be easy to use even for those unfamiliar with the R command line environment. Advanced users may take advantage of the available source code to adapt msap to more complex analyses. © 2013 Blackwell Publishing Ltd.
Culture and Creativity: World of Warcraft Modding in China and the US
NASA Astrophysics Data System (ADS)
Kow, Yong Ming; Nardi, Bonnie
Modding - end-user modification of commercial hardware and software - can be traced back at least to 1961 when Spacewar! was developed by a group of MIT students on a DEC PDP-1. Spacewar! evolved into arcade games including Space Wars produced in 1977 by Cinematronics (Sotamaa 2003). In 1992, players altering Wolfenstein 3-D (1992), a first person shooter game made by id Software, overwrote the graphics and sounds by editing the game files. Learning from this experience, id Software released Doom in 1993 with isolated media files and open source code for players to develop custom maps, images, sounds, and other utilities. Players were able to pass on their modifications to others. By 1996, with the release of Quake, end-user modifications had come to be known as "mods," and modding was an accepted part of the gaming community (Kucklich 2005; Postigo 2008a, b). Since late-2005, we have been studying World of Warcraft (WoW) in which the use of mods is an important aspect of player practice (Nardi and Harris 2006; Nardi et al. 2007). Technically minded players with an interest in extending the game write mods and make them available to players for free download on distribution sites. Most modders work for free, but the distribution sites are commercial enterprises with advertising.
UltraPse: A Universal and Extensible Software Platform for Representing Biological Sequences.
Du, Pu-Feng; Zhao, Wei; Miao, Yang-Yang; Wei, Le-Yi; Wang, Likun
2017-11-14
With the avalanche of biological sequences in public databases, one of the most challenging problems in computational biology is to predict their biological functions and cellular attributes. Most of the existing prediction algorithms can only handle fixed-length numerical vectors. Therefore, it is important to be able to represent biological sequences with various lengths using fixed-length numerical vectors. Although several algorithms, as well as software implementations, have been developed to address this problem, these existing programs can only provide a fixed number of representation modes. Every time a new sequence representation mode is developed, a new program will be needed. In this paper, we propose the UltraPse as a universal software platform for this problem. The function of the UltraPse is not only to generate various existing sequence representation modes, but also to simplify all future programming works in developing novel representation modes. The extensibility of UltraPse is particularly enhanced. It allows the users to define their own representation mode, their own physicochemical properties, or even their own types of biological sequences. Moreover, UltraPse is also the fastest software of its kind. The source code package, as well as the executables for both Linux and Windows platforms, can be downloaded from the GitHub repository.
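The underlying task, turning a variable-length sequence into a fixed-length numerical vector, is easy to illustrate with the simplest representation mode, k-mer composition (a sketch of the idea only, not UltraPse's code or its extensible mode definitions):

    from itertools import product

    def kmer_composition(seq, k=2, alphabet="ACGT"):
        """Map a variable-length sequence to a fixed-length vector: the
        normalized counts of every possible k-mer (length 4**k for DNA)."""
        kmers = ["".join(p) for p in product(alphabet, repeat=k)]
        counts = {km: 0 for km in kmers}
        for i in range(len(seq) - k + 1):
            km = seq[i:i + k]
            if km in counts:
                counts[km] += 1
        total = max(1, len(seq) - k + 1)
        return [counts[km] / total for km in kmers]

    vec = kmer_composition("ACGTACGGTT")
    print(len(vec), vec[:4])   # always 16 features for k=2, any input length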
Biographer: web-based editing and rendering of SBGN compliant biochemical networks.
Krause, Falko; Schulz, Marvin; Ripkens, Ben; Flöttmann, Max; Krantz, Marcus; Klipp, Edda; Handorf, Thomas
2013-06-01
The rapid accumulation of knowledge in the field of Systems Biology during the past years requires advanced, but simple-to-use, methods for the visualization of information in a structured and easily comprehensible manner. We have developed biographer, a web-based renderer and editor for reaction networks, which can be integrated as a library into tools dealing with network-related information. Our software enables visualizations based on the emerging standard Systems Biology Graphical Notation. It is able to import networks encoded in various formats such as SBML, SBGN-ML and jSBGN, a custom lightweight exchange format. The core package is implemented in HTML5, CSS and JavaScript and can be used within any kind of web-based project. It features interactive graph-editing tools and automatic graph layout algorithms. In addition, we provide a standalone graph editor and a web server, which contains enhanced features like web services for the import and export of models and visualizations in different formats. The biographer tool can be used at and downloaded from the web page http://biographer.biologie.hu-berlin.de/. The different software packages, including a server-independent version as well as a web server for Windows and Linux based systems, are available at http://code.google.com/p/biographer/ under the open-source license LGPL.
Tracking cells in Life Cell Imaging videos using topological alignments.
Mosig, Axel; Jäger, Stefan; Wang, Chaofeng; Nath, Sumit; Ersoy, Ilker; Palaniappan, Kannappan; Chen, Su-Shing
2009-07-16
With the increasing availability of live cell imaging technology, tracking cells and other moving objects in live cell videos has become a major challenge for bioimage informatics. An inherent problem for most cell tracking algorithms is over- or under-segmentation of cells - many algorithms tend to recognize one cell as several cells or vice versa. We propose to approach this problem through so-called topological alignments, which we apply to address the problem of linking segmentations of two consecutive frames in the video sequence. Starting from the output of a conventional segmentation procedure, we align pairs of consecutive frames through assigning sets of segments in one frame to sets of segments in the next frame. We achieve this through finding maximum weighted solutions to a generalized "bipartite matching" between two hierarchies of segments, where we derive weights from relative overlap scores of convex hulls of sets of segments. For solving the matching task, we rely on an integer linear program. Practical experiments demonstrate that the matching task can be solved efficiently in practice, and that our method is both effective and useful for tracking cells in data sets derived from a so-called Large Scale Digital Cell Analysis System (LSDCAS). The source code of the implementation is available for download from http://www.picb.ac.cn/patterns/Software/topaln.
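The frame-linking step reduces, in its simplest one-to-one form, to maximum-weight bipartite matching on segment overlaps, which SciPy solves directly. The overlap matrix below is made up, and the paper's full method matches sets of segments across hierarchies via an integer linear program:

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # overlap[i, j]: overlap score of segment i in frame t vs j in frame t+1.
    overlap = np.array([
        [0.90, 0.05, 0.00],
        [0.10, 0.75, 0.20],
        [0.00, 0.15, 0.60],
    ])
    rows, cols = linear_sum_assignment(-overlap)   # maximize total overlap
    for i, j in zip(rows, cols):
        print(f"frame-t segment {i} -> frame-t+1 segment {j} "
              f"(overlap {overlap[i, j]:.2f})")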
Evaluation environment for digital and analog pathology: a platform for validation studies
Gallas, Brandon D.; Gavrielides, Marios A.; Conway, Catherine M.; Ivansky, Adam; Keay, Tyler C.; Cheng, Wei-Chung; Hipp, Jason; Hewitt, Stephen M.
2014-01-01
We present a platform for designing and executing studies that compare pathologists interpreting histopathology of whole slide images (WSIs) on a computer display to pathologists interpreting glass slides on an optical microscope. eeDAP is an evaluation environment for digital and analog pathology. The key element in eeDAP is the registration of the WSI to the glass slide. Registration is accomplished through computer control of the microscope stage and a camera mounted on the microscope that acquires real-time images of the microscope field of view (FOV). Registration allows for the evaluation of the same regions of interest (ROIs) in both domains. This can reduce or eliminate disagreements that arise from pathologists interpreting different areas and focuses the comparison on image quality. We reduced the pathologist interpretation area from an entire glass slide (10 to 30 mm²) to small ROIs (<50 μm²). We also made possible the evaluation of individual cells. We summarize eeDAP's software and hardware and provide calculations and corresponding images of the microscope FOV and the ROIs extracted from the WSIs. The eeDAP software can be downloaded from the Google code website (project: eeDAP) as MATLAB source or as a precompiled stand-alone license-free application. PMID:26158076
Waese, Jamie; Fan, Jim; Yu, Hans; Fucile, Geoffrey; Shi, Ruian; Cumming, Matthew; Town, Chris; Stuerzlinger, Wolfgang
2017-01-01
A big challenge in current systems biology research arises when different types of data must be accessed from separate sources and visualized using separate tools. The high cognitive load required to navigate such a workflow is detrimental to hypothesis generation. Accordingly, there is a need for a robust research platform that incorporates all data and provides integrated search, analysis, and visualization features through a single portal. Here, we present ePlant (http://bar.utoronto.ca/eplant), a visual analytic tool for exploring multiple levels of Arabidopsis thaliana data through a zoomable user interface. ePlant connects to several publicly available web services to download genome, proteome, interactome, transcriptome, and 3D molecular structure data for one or more genes or gene products of interest. Data are displayed with a set of visualization tools that are presented using a conceptual hierarchy from big to small, and many of the tools combine information from more than one data type. We describe the development of ePlant in this article and present several examples illustrating its integrative features for hypothesis generation. We also describe the process of deploying ePlant as an “app” on Araport. Building on readily available web services, the code for ePlant is freely available and can be adapted for research on other biological species. PMID:28808136
Farris, Dominic James; Lichtwark, Glen A
2016-05-01
Dynamic measurements of human muscle fascicle length from sequences of B-mode ultrasound images have become increasingly prevalent in biomedical research. Manual digitisation of these images is time consuming and algorithms for automating the process have been developed. Here we present a freely available software implementation of a previously validated algorithm for semi-automated tracking of muscle fascicle length in dynamic ultrasound image recordings, "UltraTrack". UltraTrack implements an affine extension to an optic flow algorithm to track movement of the muscle fascicle end-points throughout dynamically recorded sequences of images. The underlying algorithm has been previously described and its reliability tested, but here we present the software implementation with features for: tracking multiple fascicles in multiple muscles simultaneously; correcting temporal drift in measurements; manually adjusting tracking results; saving and re-loading of tracking results and loading a range of file formats. Two example runs of the software are presented detailing the tracking of fascicles from several lower limb muscles during a squatting and walking activity. We have presented a software implementation of a validated fascicle-tracking algorithm and made the source code and standalone versions freely available for download. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
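UltraTrack's affine-extended optic-flow tracker is not reproduced here, but the underlying operation, propagating tracked end-points from one frame to the next with Lucas-Kanade optic flow, can be sketched with OpenCV on synthetic frames standing in for B-mode images:

    import numpy as np
    import cv2

    # Two synthetic 8-bit "frames": a bright blob that shifts 3 px right.
    frame0 = np.zeros((64, 64), np.uint8)
    cv2.circle(frame0, (20, 32), 5, 255, -1)
    frame1 = np.roll(frame0, 3, axis=1)

    # Fascicle end-points to track, as (x, y) float32 in OpenCV's layout.
    pts0 = np.array([[[20.0, 32.0]]], dtype=np.float32)

    pts1, status, err = cv2.calcOpticalFlowPyrLK(frame0, frame1, pts0, None)
    print(pts0.ravel(), "->", pts1.ravel(), "tracked:", int(status[0, 0]))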
RNAiFold: a web server for RNA inverse folding and molecular design.
Garcia-Martin, Juan Antonio; Clote, Peter; Dotu, Ivan
2013-07-01
Synthetic biology and nanotechnology are poised to make revolutionary contributions to the 21st century. In this article, we describe a new web server to support in silico RNA molecular design. Given an input target RNA secondary structure, together with optional constraints, such as requiring GC-content to lie within a certain range or requiring the number of strong (GC), weak (AU) and wobble (GU) base pairs to lie in a certain range, the RNAiFold web server determines one or more RNA sequences whose minimum free-energy secondary structure is the target structure. RNAiFold provides access to two servers: RNA-CPdesign, which applies constraint programming, and RNA-LNSdesign, which applies the large neighborhood search heuristic and is hence suitable for larger input structures. Both servers can also solve the RNA inverse hybridization problem: given a representation of the desired hybridization structure, RNAiFold returns two sequences whose minimum free-energy hybridization is the input target structure. The web server is publicly accessible at http://bioinformatics.bc.edu/clotelab/RNAiFold. Source code for the underlying algorithms, implemented in COMET and supported on Linux, can be downloaded at the server website.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zehtabian, M; Zaker, N; Sina, S
2015-06-15
Purpose: Different versions of the MCNP code are widely used for dosimetry purposes. The purpose of this study is to compare different versions of the MCNP code in the dosimetric evaluation of different brachytherapy sources. Methods: The TG-43 parameters, such as the dose rate constant, radial dose function, and anisotropy function, of different brachytherapy sources, i.e. Pd-103, I-125, Ir-192, and Cs-137, were calculated in a water phantom. The results obtained by three versions of the Monte Carlo code (MCNP4C, MCNPX, MCNP5) were compared for low- and high-energy brachytherapy sources. Then the cross section library of the MCNP4C code was changed to ENDF/B-VI release 8, which is used in the MCNP5 and MCNPX codes. Finally, the TG-43 parameters obtained using the MCNP4C-revised code were compared with those of the other codes. Results: The results of these investigations indicate that for high-energy sources, the differences in TG-43 parameters between the codes are less than 1% for Ir-192 and less than 0.5% for Cs-137. However, for low-energy sources like I-125 and Pd-103, large discrepancies are observed in the g(r) values obtained by MCNP4C and the two other codes. The differences between g(r) values calculated using MCNP4C and MCNP5 at a distance of 6 cm were found to be about 17% and 28% for I-125 and Pd-103, respectively. The results obtained with MCNP4C-revised and MCNPX were similar. However, the maximum difference between the results obtained with the MCNP5 and MCNP4C-revised codes was 2% at 6 cm. Conclusion: The results indicate that using the MCNP4C code for dosimetry of low-energy brachytherapy sources can cause large errors in the results. It is therefore recommended not to use this code for low-energy sources unless its cross section library is changed. Since the results obtained with MCNP4C-revised and MCNPX were similar, it is concluded that the difference between MCNP4C and MCNPX lies in their cross section libraries.
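For reference, the TG-43 parameters named above enter the published AAPM dose-rate formalism (in LaTeX notation) as

    \dot{D}(r,\theta) = S_K \, \Lambda \, \frac{G_L(r,\theta)}{G_L(r_0,\theta_0)} \, g_L(r) \, F(r,\theta)

where S_K is the air-kerma strength, Λ the dose rate constant, G_L the line-source geometry function, g_L(r) the radial dose function, and F(r,θ) the anisotropy function, evaluated relative to the reference point (r_0 = 1 cm, θ_0 = 90°). Discrepancies in g(r) between code versions therefore propagate multiplicatively into the computed dose rate.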
ERIC Educational Resources Information Center
Kadhim, Kais A.; Habeeb, Luwaytha S.; Sapar, Ahmad Arifin; Hussin, Zaharah; Abdullah, Muhammad Ridhuan Tony Lim
2013-01-01
Nowadays, online Machine Translation (MT) is used widely with translation software, such as Google and Babylon, being easily available and downloadable. This study aims to test the translation quality of these two machine systems in translating Arabic news headlines into English. 40 Arabic news headlines were selected from three online sources,…
77 FR 15369 - Mobility Fund Phase I Auction GIS Data of Potentially Eligible Census Blocks
Federal Register 2010, 2011, 2012, 2013, 2014
2012-03-15
....fcc.gov/auctions/901/ , are the following: Downloadable shapefile Web mapping service MapBox map tiles... GIS software allows you to add this service as a layer to your session or project. 6. MapBox map tiles are cached map tiles of the data. With this open source software approach, these image tiles can be...
Mitchell, Jason; Torres, Maria Beatriz; Asmar, Lucy; Danh, Thu; Horvath, Keith J
2018-04-24
Although many men who have sex with men (MSM) test for HIV at least once in their lifetime, opportunities to improve regular HIV testing, particularly among Hispanic or Latino MSM, are needed. Many mHealth interventions in development, including those on HIV testing, have primarily focused on English-speaking white, black, and MSM of other races. To date, no studies have assessed app use, attitudes, and motivations for downloading and sustaining use of mobile apps and preferences with respect to HIV prevention among Spanish-speaking, Hispanic MSM in the United States. The primary aims of this study were to determine (1) what features and functions of smartphone apps Hispanic, Spanish-speaking MSM believe are associated with downloading apps to their smartphones, (2) what features and functions of smartphone apps are most likely to influence men's sustained use of apps over time, and (3) what features and functions men prefer in a smartphone app aimed to promote regular testing for HIV. Interviews (N=15) were conducted with a racially diverse group of sexually active, HIV-negative, Spanish-speaking, Hispanic MSM in Miami, Florida. Interviews were digitally recorded, transcribed verbatim, translated back to English, and de-identified for analysis. A constant-comparison method (ie, grounded theory coding) was employed to examine themes that emerged from the interviews. Personal interest was the primary reason associated with whether men downloaded an app. Keeping personal information secure, cost, influence by peers and posted reviews, ease of use, and functionality affected whether they downloaded and used the app over time. Men also reported that entertainment value and frequency of updates influenced whether they kept and continued to use an app over time. There were 4 reasons why participants chose to delete an app: dislike, lack of use, cost, and lack of memory or space. Participants also shared their preferences for an app to encourage regular HIV testing by providing feedback on test reminders, tailored testing interval recommendations, an HIV test locator, and monitoring of personal sexual behaviors. The features and functions of mobile apps that Spanish-speaking MSM in this study believed were associated with downloading and/or sustained engagement with an app generally reflected the priorities mentioned in an earlier study with English-speaking MSM. Unlike the earlier study, Spanish-speaking MSM prioritized personal interest in a mobile app and de-emphasized the efficiency of an app to make their lives easier in their decision to download an app to their mobile device. Tailoring mobile apps to the language and needs of Spanish-speaking MSM is critical to help increase their willingness to download a mobile app. Despite the growing number of HIV-prevention apps in development, few are tailored to Spanish-speaking MSM, representing an important gap that should be addressed in future research. ©Jason Mitchell, Maria Beatriz Torres, Lucy Asmar, Thu Danh, Keith J Horvath. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 24.04.2018.
Process Model Improvement for Source Code Plagiarism Detection in Student Programming Assignments
ERIC Educational Resources Information Center
Kermek, Dragutin; Novak, Matija
2016-01-01
In programming courses there are various ways in which students attempt to cheat. The most commonly used method is copying source code from other students and making minimal changes in it, like renaming variable names. Several tools like Sherlock, JPlag and Moss have been devised to detect source code plagiarism. However, for larger student…
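The record above is truncated, but tools like Moss rest on token-level fingerprinting that survives identifier renaming. A minimal, generic sketch of that idea (not the paper's process model):

```python
import re

def fingerprints(code, k=5):
    # Strip comments, then tokenize into identifiers and single symbols.
    code = re.sub(r"/\*.*?\*/", "", code, flags=re.S)   # block comments
    code = re.sub(r"//[^\n]*", "", code)                # line comments
    tokens = re.findall(r"[A-Za-z_]\w*|\S", code)
    # Replace every identifier with a fixed tag so renaming variables has no effect.
    tokens = ["ID" if t[0].isalpha() or t[0] == "_" else t for t in tokens]
    grams = (" ".join(tokens[i:i + k]) for i in range(max(0, len(tokens) - k + 1)))
    return {hash(g) for g in grams}

def similarity(a, b, k=5):
    fa, fb = fingerprints(a, k), fingerprints(b, k)
    return len(fa & fb) / max(1, len(fa | fb))          # Jaccard overlap

print(similarity("int sum = 0; for (int i = 0; i < n; i++) sum += i;",
                 "int total = 0; for (int j = 0; j < n; j++) total += j;"))  # 1.0
```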
Finding Resolution for the Responsible Transparency of Economic Models in Health and Medicine.
Padula, William V; McQueen, Robert Brett; Pronovost, Peter J
2017-11-01
The Second Panel on Cost-Effectiveness in Health and Medicine recommendations for conduct, methodological practices, and reporting of cost-effectiveness analyses leave a number of questions unanswered with respect to the implementation of a transparent, open source code interface for economic models. Making economic model source code openly available could be positive and progressive for the field; however, several unintended consequences of this system should first be considered before complete implementation of this model. First, there is the concern regarding the intellectual property rights that modelers have to their analyses. Second, the open source code could make analyses more accessible to inexperienced modelers, leading to inaccurate or misinterpreted results. We propose several resolutions to these concerns. The field should establish a licensing system for open source code such that the model originators maintain control of the code's use and grant permissions to other investigators who wish to use it. The field should also be more forthcoming about the teaching of cost-effectiveness analysis in medical and health services education, so that providers and other professionals are familiar with economic modeling and able to conduct analyses with open source code. These unintended consequences need to be fully considered before the field can be ready to move forward into an era of model transparency with open source code.
The Ruby UCSC API: accessing the UCSC genome database using Ruby.
Mishima, Hiroyuki; Aerts, Jan; Katayama, Toshiaki; Bonnal, Raoul J P; Yoshiura, Koh-ichiro
2012-09-21
The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was, however, not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database, including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast. The API uses the bin index, if available, when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will help biologists query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help are provided via the website at http://rubyucscapi.userecho.com/.
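The Ruby API wraps queries against UCSC's public MySQL server; for a feel of what such a query does, here is a hedged sketch that queries the same documented public server directly from Python. The host and user are UCSC's published public credentials; the assembly, table, and gene are arbitrary examples, and pymysql is an assumed dependency.

```python
import pymysql

# UCSC's documented public MySQL server: user "genome", no password.
conn = pymysql.connect(host="genome-mysql.soe.ucsc.edu", user="genome", database="hg38")
try:
    with conn.cursor() as cur:
        # refGene stores transcript models; name2 is the gene symbol.
        cur.execute(
            "SELECT name, chrom, txStart, txEnd FROM refGene WHERE name2 = %s LIMIT 5",
            ("TP53",),
        )
        for name, chrom, tx_start, tx_end in cur.fetchall():
            print(f"{name}\t{chrom}:{tx_start}-{tx_end}")
finally:
    conn.close()
```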
Chikkagoudar, Satish; Wang, Kai; Li, Mingyao
2011-05-26
Gene-gene interaction in genetic association studies is computationally intensive when a large number of SNPs are involved. Most of the latest Central Processing Units (CPUs) have multiple cores, whereas Graphics Processing Units (GPUs) also have hundreds of cores and have been recently used to implement faster scientific software. However, currently there are no genetic analysis software packages that allow users to fully utilize the computing power of these multi-core devices for genetic interaction analysis for binary traits. Here we present a novel software package GENIE, which utilizes the power of multiple GPU or CPU processor cores to parallelize the interaction analysis. GENIE reads an entire genetic association study dataset into memory and partitions the dataset into fragments with non-overlapping sets of SNPs. For each fragment, GENIE analyzes: 1) the interaction of SNPs within it in parallel, and 2) the interaction between the SNPs of the current fragment and other fragments in parallel. We tested GENIE on a large-scale candidate gene study on high-density lipoprotein cholesterol. Using an NVIDIA Tesla C1060 graphics card, the GPU mode of GENIE achieves a speedup of 27 times over its single-core CPU mode run. GENIE is open-source, economical, user-friendly, and scalable. Since the computing power and memory capacity of graphics cards are increasing rapidly while their cost is going down, we anticipate that GENIE will achieve greater speedups with faster GPU cards. Documentation, source code, and precompiled binaries can be downloaded from http://www.cceb.upenn.edu/~mli/software/GENIE/.
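A minimal CPU-parallel sketch of GENIE's partitioning idea, assuming NumPy and SciPy: SNP pairs are spread across worker processes, and a simple correlation test on a genotype-product term stands in for GENIE's actual interaction statistics.

```python
import itertools
import numpy as np
from multiprocessing import Pool
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(500, 40))    # 500 subjects x 40 SNPs, coded 0/1/2
phenotype = rng.integers(0, 2, size=500)          # binary trait

def interaction_score(pair):
    i, j = pair
    product = genotypes[:, i] * genotypes[:, j]   # simple epistasis proxy term
    _, p = pearsonr(product, phenotype)
    return i, j, p

if __name__ == "__main__":
    # All SNP pairs, split across worker processes (one per CPU core).
    pairs = list(itertools.combinations(range(genotypes.shape[1]), 2))
    with Pool(4) as pool:
        results = pool.map(interaction_score, pairs)
    results.sort(key=lambda t: t[2])
    print(results[:5])                            # most significant SNP pairs
```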
Entropic Profiler – detection of conservation in genomes using information theory
Fernandes, Francisco; Freitas, Ana T; Almeida, Jonas S; Vinga, Susana
2009-01-01
Background In the last decades, with the successive availability of whole genome sequences, many research efforts have been made to mathematically model DNA. Entropic Profiles (EP) were proposed recently as a new measure of continuous entropy of genome sequences. EP are local information plots related to DNA randomness and are based on information theory and statistical concepts. They express the weighted relative abundance of motifs for each position in genomes. Their study is very relevant because under- or over-represented segments are often associated with significant biological meaning. Findings The Entropic Profiler application presented here is a new tool designed to detect and extract under- and over-represented DNA segments in genomes using EP. It computes them very efficiently by means of improved algorithms and data structures, which include modified suffix trees. Available through a web interface and as downloadable source code, it allows users to study positions and to search for motifs inside the whole sequence or within a specified range. DNA sequences can be entered from different sources, including FASTA files, pre-loaded examples or by resuming a previously saved work session. Besides the EP value plots, p-values and z-scores for each motif are also computed, along with the Chaos Game Representation of the sequence. Conclusion EP are directly related to the statistical significance of motifs and can be considered a new method to extract and classify significant regions in genomes and estimate local scales in DNA. The present implementation establishes an efficient and useful tool for whole genome analysis. PMID:19416538
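A minimal sketch of the motif-significance idea: observed k-mer counts are compared against an expectation from independent base frequencies and reported as z-scores. This is a simplification of entropic profiles, which weight occurrences by position and scale.

```python
from collections import Counter
from math import sqrt

def kmer_zscores(seq, k=4):
    n = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(n))
    base = Counter(seq)
    freq = {b: base[b] / len(seq) for b in base}
    scores = {}
    for kmer, obs in counts.items():
        p = 1.0
        for b in kmer:
            p *= freq[b]                      # expected probability under independent bases
        mean, var = n * p, n * p * (1 - p)    # binomial expectation and variance
        scores[kmer] = (obs - mean) / sqrt(var)
    return scores

scores = kmer_zscores("ACGTACGTTTTTTTTTTACGT" * 20)
print(sorted(scores.items(), key=lambda kv: -kv[1])[:3])   # most over-represented motifs
```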
Federated Giovanni: A Distributed Web Service for Analysis and Visualization of Remote Sensing Data
NASA Technical Reports Server (NTRS)
Lynnes, Chris
2014-01-01
The Geospatial Interactive Online Visualization and Analysis Interface (Giovanni) is a popular tool for users of the Goddard Earth Sciences Data and Information Services Center (GES DISC) and has been in use for over a decade. It provides a wide variety of algorithms and visualizations to explore large remote sensing datasets without having to download the data and without having to write readers and visualizers for it. Giovanni is now being extended to enable its capabilities at other data centers within the Earth Observing System Data and Information System (EOSDIS). This Federated Giovanni will allow four other data centers to add and maintain their data within Giovanni on behalf of their user community. Those data centers are the Physical Oceanography Distributed Active Archive Center (PO.DAAC), MODIS Adaptive Processing System (MODAPS), Ocean Biology Processing Group (OBPG), and Land Processes Distributed Active Archive Center (LP DAAC). Three tiers are supported: Tier 1 (GES DISC-hosted) gives the remote data center a data management interface to add and maintain data, which are provided through the Giovanni instance at the GES DISC. Tier 2 packages Giovanni up as a virtual machine for distribution to and deployment by the other data centers. Data variables are shared among data centers by sharing documents from the Solr database that underpins Giovanni's data management capabilities. However, each data center maintains their own instance of Giovanni, exposing the variables of most interest to their user community. Tier 3 is a Shared Source model, in which the data centers cooperate to extend the infrastructure by contributing source code.
QMachine: commodity supercomputing in web browsers
2014-01-01
Background Ongoing advancements in cloud computing provide novel opportunities in scientific computing, especially for distributed workflows. Modern web browsers can now be used as high-performance workstations for querying, processing, and visualizing genomics’ “Big Data” from sources like The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) without local software installation or configuration. The design of QMachine (QM) was driven by the opportunity to use this pervasive computing model in the context of the Web of Linked Data in Biomedicine. Results QM is an open-sourced, publicly available web service that acts as a messaging system for posting tasks and retrieving results over HTTP. The illustrative application described here distributes the analyses of 20 Streptococcus pneumoniae genomes for shared suffixes. Because all analytical and data retrieval tasks are executed by volunteer machines, few server resources are required. Any modern web browser can submit those tasks and/or volunteer to execute them without installing any extra plugins or programs. A client library provides high-level distribution templates including MapReduce. This stark departure from the current reliance on expensive server hardware running “download and install” software has already gathered substantial community interest, as QM received more than 2.2 million API calls from 87 countries in 12 months. Conclusions QM was found adequate to deliver the sort of scalable bioinformatics solutions that computation- and data-intensive workflows require. Paradoxically, the sandboxed execution of code by web browsers was also found to enable them, as compute nodes, to address critical privacy concerns that characterize biomedical environments. PMID:24913605
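QM's actual wire format is not reproduced here; the sketch below only illustrates the generic submit-then-poll messaging pattern the abstract describes, with a wholly hypothetical endpoint and JSON schema (requests is an assumed dependency).

```python
import time
import requests

API = "https://example.org/qm"   # hypothetical message-board endpoint, not QM's real API

# Post a task description for volunteer browsers to pick up and execute.
task = {"fn": "countSuffixes", "args": {"genome_url": "https://example.org/spn.fa"}}
task_id = requests.post(f"{API}/tasks", json=task, timeout=10).json()["id"]

# Poll until some volunteer machine has executed the task and posted a result.
while True:
    reply = requests.get(f"{API}/tasks/{task_id}", timeout=10).json()
    if reply.get("status") == "done":
        print(reply["result"])
        break
    time.sleep(2)
```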
40 CFR 51.50 - What definitions apply to this subpart?
Code of Federal Regulations, 2010 CFR
2010-07-01
... accuracy description (MAD) codes means a set of six codes used to define the accuracy of latitude/longitude data for point sources. The six codes and their definitions are: (1) Coordinate Data Source Code: The... physical piece of or a closely related set of equipment. The EPA's reporting format for a given inventory...
The Astrophysics Source Code Library by the numbers
NASA Astrophysics Data System (ADS)
Allen, Alice; Teuben, Peter; Berriman, G. Bruce; DuPrie, Kimberly; Mink, Jessica; Nemiroff, Robert; Ryan, PW; Schmidt, Judy; Shamir, Lior; Shortridge, Keith; Wallin, John; Warmels, Rein
2018-01-01
The Astrophysics Source Code Library (ASCL, ascl.net) was founded in 1999 by Robert Nemiroff and John Wallin. ASCL editors seek both new and old peer-reviewed papers that describe methods or experiments that involve the development or use of source code, and add entries for the found codes to the library. Software authors can submit their codes to the ASCL as well. This ensures a comprehensive listing covering a significant number of the astrophysics source codes used in peer-reviewed studies. The ASCL is indexed by both NASA’s Astrophysics Data System (ADS) and Web of Science, making software used in research more discoverable. This presentation covers the growth in the ASCL’s number of entries, the number of citations to its entries, and in which journals those citations appear. It also discusses what changes have been made to the ASCL recently, and what its plans are for the future.
Astrophysics Source Code Library
NASA Astrophysics Data System (ADS)
Allen, A.; DuPrie, K.; Berriman, B.; Hanisch, R. J.; Mink, J.; Teuben, P. J.
2013-10-01
The Astrophysics Source Code Library (ASCL), founded in 1999, is a free on-line registry for source codes of interest to astronomers and astrophysicists. The library is housed on the discussion forum for Astronomy Picture of the Day (APOD) and can be accessed at http://ascl.net. The ASCL has a comprehensive listing that covers a significant number of the astrophysics source codes used to generate results published in or submitted to refereed journals and continues to grow. The ASCL currently has entries for over 500 codes; its records are citable and are indexed by ADS. The editors of the ASCL and members of its Advisory Committee were on hand at a demonstration table in the ADASS poster room to present the ASCL, accept code submissions, show how the ASCL is starting to be used by the astrophysics community, and take questions on and suggestions for improving the resource.
Generating code adapted for interlinking legacy scalar code and extended vector code
Gschwind, Michael K
2013-06-04
Mechanisms for intermixing code are provided. Source code is received for compilation using an extended Application Binary Interface (ABI) that extends a legacy ABI and uses a different register configuration than the legacy ABI. First compiled code is generated based on the source code, the first compiled code comprising code for accommodating the difference in register configurations used by the extended ABI and the legacy ABI. The first compiled code and second compiled code are intermixed to generate intermixed code, the second compiled code being compiled code that uses the legacy ABI. The intermixed code comprises at least one call instruction that is one of a call from the first compiled code to the second compiled code or a call from the second compiled code to the first compiled code. The code for accommodating the difference in register configurations is associated with the at least one call instruction.
Optimal power allocation and joint source-channel coding for wireless DS-CDMA visual sensor networks
NASA Astrophysics Data System (ADS)
Pandremmenou, Katerina; Kondi, Lisimachos P.; Parsopoulos, Konstantinos E.
2011-01-01
In this paper, we propose a scheme for the optimal allocation of power, source coding rate, and channel coding rate for each of the nodes of a wireless Direct Sequence Code Division Multiple Access (DS-CDMA) visual sensor network. The optimization is quality-driven, i.e. the received quality of the video that is transmitted by the nodes is optimized. The scheme takes into account the fact that the sensor nodes may be imaging scenes with varying levels of motion. Nodes that image low-motion scenes will require a lower source coding rate, so they will be able to allocate a greater portion of the total available bit rate to channel coding. Stronger channel coding means that such nodes will be able to transmit at lower power, which will both increase battery life and reduce interference to other nodes. Two optimization criteria are considered: one that minimizes the average video distortion of the nodes and one that minimizes the maximum distortion among the nodes. The transmission powers are allowed to take continuous values, whereas the source and channel coding rates can assume only discrete values. Thus, the resulting optimization problem lies in the field of mixed-integer optimization tasks and is solved using Particle Swarm Optimization. Our experimental results show the importance of considering the characteristics of the video sequences when determining the transmission power, source coding rate and channel coding rate for the nodes of the visual sensor network.
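A minimal sketch of the mixed-integer PSO idea: the swarm moves in a continuous space, and the discrete rate dimension is snapped to its nearest allowed value inside the objective. The toy distortion model and all constants are illustrative, not the paper's video distortion model.

```python
import numpy as np

rng = np.random.default_rng(0)
RATES = np.array([0.25, 0.5, 0.75])                    # allowed channel-coding rates

def distortion(params):
    power = np.clip(params[0], 0.1, 2.0)
    rate = RATES[np.abs(RATES - params[1]).argmin()]   # snap to nearest discrete rate
    snr = power / (0.2 + 0.1 * rate)                   # toy channel model
    return 1.0 / snr + 0.5 * rate                      # toy source + channel distortion

# Standard PSO update: v <- w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)
X = rng.uniform([0.1, 0.25], [2.0, 0.75], size=(20, 2))
V = np.zeros_like(X)
P, Pval = X.copy(), np.array([distortion(x) for x in X])
for _ in range(100):
    g = P[Pval.argmin()]
    V = 0.7 * V + 1.5 * rng.random(X.shape) * (P - X) + 1.5 * rng.random(X.shape) * (g - X)
    X = X + V
    val = np.array([distortion(x) for x in X])
    better = val < Pval
    P[better], Pval[better] = X[better], val[better]
print("best power/rate particle:", P[Pval.argmin()], "distortion:", Pval.min())
```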
SnopViz, an interactive snow profile visualization tool
NASA Astrophysics Data System (ADS)
Fierz, Charles; Egger, Thomas; Gerber, Matthias; Bavay, Mathias; Techel, Frank
2016-04-01
SnopViz is a visualization tool for both simulation outputs of the snow-cover model SNOWPACK and observed snow profiles. It has been designed to fulfil the needs of operational services (Swiss Avalanche Warning Service, Avalanche Canada) as well as offer the flexibility required to satisfy the specific needs of researchers. This JavaScript application runs on any modern browser and does not require an active Internet connection. The open source code is available for download from models.slf.ch, where examples can also be run. Both the SnopViz library and the SnopViz User Interface will become a full replacement of the current research visualization tool SN_GUI for SNOWPACK. The SnopViz library is a stand-alone application that parses the provided input files, for example, a single snow profile (CAAML file format) or multiple snow profiles as output by SNOWPACK (PRO file format). A plugin architecture allows for handling JSON objects (JavaScript Object Notation) as well, and plugins for other file formats may be added easily. The outputs are provided either as vector graphics (SVG) or JSON objects. The SnopViz User Interface (UI) is a browser-based stand-alone interface. It runs in every modern browser, including IE, and allows user interaction with the graphs. SVG, the XML-based standard for vector graphics, was chosen because of its easy interaction with JavaScript and good software support (Adobe Illustrator, Inkscape) for manipulating graphs outside SnopViz for publication purposes. SnopViz provides new visualization for SNOWPACK timeline output as well as time series input and output. The existing output format for SNOWPACK timelines was retained, while time series are read from SMET files, a file format used in conjunction with the open source data handling code MeteoIO. Finally, SnopViz is able to render single snow profiles, either observed or modelled, that are provided as CAAML files. This file format (caaml.org/Schemas/V5.0/Profiles/SnowProfileIACS) is an international standard for exchanging snow profile data. It is supported by the International Association of Cryospheric Sciences (IACS) and was developed in collaboration with practitioners (Avalanche Canada).
Data compression for satellite images
NASA Technical Reports Server (NTRS)
Chen, P. H.; Wintz, P. A.
1976-01-01
An efficient data compression system is presented for satellite pictures and two grey level pictures derived from satellite pictures. The compression techniques take advantage of the correlation between adjacent picture elements. Several source coding methods are investigated. Double delta coding is presented and shown to be the most efficient. Both the predictive differential quantizing technique and double delta coding can be significantly improved by applying a background skipping technique. An extension code is constructed. This code requires very little storage space and operates efficiently. Simulation results are presented for various coding schemes and source codes.
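The abstract does not define double delta coding; in the standard construction it is the second difference of the pixel stream, which concentrates values near zero for smooth scan lines. A minimal sketch of encoding and exact decoding:

```python
def delta(xs):
    # First value kept verbatim, then differences between neighbors.
    return [xs[0]] + [b - a for a, b in zip(xs, xs[1:])]

def undelta(ds):
    out = [ds[0]]
    for d in ds[1:]:
        out.append(out[-1] + d)
    return out

line = [100, 102, 104, 107, 110, 112]        # one smooth scan line
dd = delta(delta(line))                      # double delta: mostly small values
assert undelta(undelta(dd)) == line          # decoding is exact
print(dd)                                    # small residuals are cheap to entropy-code
```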
NASA Astrophysics Data System (ADS)
Michel-Sendis, Franco; Martinez-González, Jesus; Gauld, Ian
2017-09-01
SFCOMPO-2.0 is a database of experimental isotopic concentrations measured in destructive radiochemical analysis of spent nuclear fuel (SNF) samples. The database includes corresponding design description of the fuel rods and assemblies, relevant operating conditions and characteristics of the host reactors necessary for modelling and simulation. Aimed at establishing a thorough, reliable, and publicly available resource for code and data validation of safety-related applications, SFCOMPO-2.0 is developed and maintained by the OECD Nuclear Energy Agency (NEA). The SFCOMPO-2.0 database is a Java application which is downloadable from the NEA website.
pyNBS: A Python implementation for network-based stratification of tumor mutations.
Huang, Justin K; Jia, Tongqiu; Carlin, Daniel E; Ideker, Trey
2018-03-28
We present pyNBS: a modularized Python 2.7 implementation of the network-based stratification (NBS) algorithm for stratifying tumor somatic mutation profiles into molecularly and clinically relevant subtypes. In addition to the release of the software, we benchmark its key parameters and provide a compact cancer reference network that increases the significance of tumor stratification using the NBS algorithm. The structure of the code exposes key steps of the algorithm to foster further collaborative development. The package, along with examples and data, can be downloaded and installed from the URL http://www.github.com/huangger/pyNBS/. Contact: jkh013@ucsd.edu.
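NBS first smooths each patient's sparse mutation profile over a gene interaction network before clustering. A minimal sketch of that propagation step (random walk with restart) on a toy network; the network, profile, and restart parameter are illustrative, not pyNBS defaults.

```python
import numpy as np

adj = np.array([[0, 1, 1, 0],      # 4-gene toy interaction network
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
W = adj / adj.sum(axis=0)          # column-normalized transition matrix

def propagate(F0, W, alpha=0.7, tol=1e-9):
    """Random walk with restart: F <- alpha * F @ W + (1 - alpha) * F0."""
    F = F0.copy()
    while True:
        F_next = alpha * F @ W + (1 - alpha) * F0
        if np.abs(F_next - F).max() < tol:
            return F_next
        F = F_next

profile = np.array([[1.0, 0.0, 0.0, 0.0]])   # one patient, one mutated gene
print(propagate(profile, W).round(3))        # mutation signal spread over neighbors
```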
NASA Astrophysics Data System (ADS)
Hirst, Paul; Cardenes, Ricardo
2016-08-01
We have developed and deployed a new data archive for the Gemini Observatory. Focused on simplicity and ease of use, the archive provides a number of powerful and novel features including automatic association of calibration data with the science data, and the ability to bookmark searches. A simple but powerful API allows programmatic search and download of data. The archive is hosted on Amazon Web Services, which provides us excellent internet connectivity and significant cost savings in both operations and development over more traditional deployment options. The code is written in python, utilizing a PostgreSQL database and Apache web server.
Theoretical White Dwarf Spectra on Demand: TheoSSA
NASA Astrophysics Data System (ADS)
Ringat, E.; Rauch, T.
2010-11-01
In the last decades, much progress has been made in spectral analysis. The quality (e.g. resolution, S/N ratio) of observed spectra has improved greatly, and several model-atmosphere codes have been developed. One of these is the ``Tübingen NLTE Model-Atmosphere Package'' (TMAP), a highly developed program for the calculation of model atmospheres of hot, compact objects. In the framework of the German Astrophysical Virtual Observatory (GAVO), theoretical spectral energy distributions (SEDs) can be downloaded via TheoSSA. In a pilot phase, TheoSSA is based on TMAP model atmospheres. We present the current state of this VO service.
Distributed Joint Source-Channel Coding in Wireless Sensor Networks
Zhu, Xuqi; Liu, Yu; Zhang, Lin
2009-01-01
Given that sensors are energy-limited and that wireless channel conditions in wireless sensor networks are harsh, there is an urgent need for a low-complexity coding method with a high compression ratio and noise-resistant features. This paper reviews the progress made in distributed joint source-channel coding, which can address this issue. The main existing deployments, from theory to practice, of distributed joint source-channel coding over independent channels, multiple access channels and broadcast channels are introduced, respectively. To this end, we also present a practical scheme for compressing multiple correlated sources over independent channels. The simulation results demonstrate the desired efficiency. PMID:22408560
NASA Technical Reports Server (NTRS)
Clark, Kenneth; Watney, Garth; Murray, Alexander; Benowitz, Edward
2007-01-01
A computer program translates Unified Modeling Language (UML) representations of state charts into source code in the C, C++, and Python computing languages. (State charts are graphical descriptions of states and state transitions of a spacecraft or other complex system.) The UML representations constituting the input to this program are generated by using a UML-compliant graphical design program to draw the state charts. The generated source code is consistent with the "quantum programming" approach, which is so named because it involves discrete states and state transitions that have features in common with states and state transitions in quantum mechanics. Quantum programming enables efficient implementation of state charts, suitable for real-time embedded flight software. In addition to source code, the autocoder program generates a graphical-user-interface (GUI) program that, in turn, generates a display of state transitions in response to events triggered by the user. The GUI program is wrapped around, and can be used to exercise the state-chart behavior of, the generated source code. Once the expected state-chart behavior is confirmed, the generated source code can be augmented with a software interface to the rest of the software with which it is required to interact.
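A much-reduced sketch of the style of code such an autocoder emits: each state is an event handler that returns the next state, so a drawn state chart maps mechanically onto methods. The two-state chart here is illustrative, not a flight example.

```python
class Switch:
    """Two-state chart: OFF --TOGGLE--> ON --TOGGLE--> OFF."""

    def __init__(self):
        self.state = self.off            # initial transition

    def dispatch(self, event):
        self.state = self.state(event)   # run the current state's handler

    def off(self, event):
        if event == "TOGGLE":
            print("entering ON")         # entry action
            return self.on
        return self.off                  # unhandled events keep the state

    def on(self, event):
        if event == "TOGGLE":
            print("entering OFF")
            return self.off
        return self.on

sm = Switch()
for ev in ["TOGGLE", "TICK", "TOGGLE"]:  # TICK is ignored in both states
    sm.dispatch(ev)
```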
Practices in source code sharing in astrophysics
NASA Astrophysics Data System (ADS)
Shamir, Lior; Wallin, John F.; Allen, Alice; Berriman, Bruce; Teuben, Peter; Nemiroff, Robert J.; Mink, Jessica; Hanisch, Robert J.; DuPrie, Kimberly
2013-02-01
While software and algorithms have become increasingly important in astronomy, the majority of authors who publish computational astronomy research do not share the source code they develop, making it difficult to replicate and reuse the work. In this paper we discuss the importance of sharing scientific source code with the entire astrophysics community, and propose that journals require authors to make their code publicly available when a paper is published. That is, we suggest that a paper that involves a computer program not be accepted for publication unless the source code becomes publicly available. The adoption of such a policy by editors, editorial boards, and reviewers will improve the ability to replicate scientific results, and will also make computational astronomy methods more available to other researchers who wish to apply them to their data.
NASA Astrophysics Data System (ADS)
Rajib, A.; Zhao, L.; Merwade, V.; Shin, J.; Smith, J.; Song, C. X.
2017-12-01
Despite the significant potential of remotely sensed earth observations, their application in water resources research, management and education is still limited. Inconsistent storage structures, data formats and spatial resolutions among different platforms/sources of earth observations hinder the use of these data. Available web services can help with bulk data downloading and visualization, but they are not sufficiently tailored to meet the degree of interoperability required for direct application of earth observations in hydrologic modeling at user-defined spatio-temporal scales. Similarly, the least ambiguous way for educators and watershed managers is to instantaneously obtain a time series for any watershed of interest without spending time and computational resources on data download and post-processing activities. To address this issue, an open access, online platform named HydroGlobe has been developed that minimizes these processing tasks and delivers ready-to-use data from different earth observation sources. HydroGlobe can provide spatially-averaged time series of earth observations using the following inputs: (i) data source, (ii) temporal extent in the form of start/end date, and (iii) geographic units (e.g., grid cell or sub-basin boundary) and extent in the form of a GIS shapefile. In its preliminary version, HydroGlobe simultaneously handles five data sources: surface and root zone soil moisture from SMAP (Soil Moisture Active Passive Mission), actual and potential evapotranspiration from MODIS (Moderate Resolution Imaging Spectroradiometer), and precipitation from GPM (Global Precipitation Measurements). This presentation will demonstrate the HydroGlobe interface and its applicability using a few test cases on watersheds from different parts of the globe.
NASA Astrophysics Data System (ADS)
Choi, Y.; Piasecki, M.
2008-12-01
The objective of this study is the preparation and indexing of rainfall data products for ingestion into the Chesapeake Bay Environmental Observatory (CBEO) node of the CUAHSI/WATERS network. Rainfall products (obtained and then processed based on the WSR-88D NEXRAD network) come from the NOAA/NWS Advanced Hydrologic Prediction Service, which combines the Multi-sensor Precipitation Estimate (MPE) data generated by the Regional River Forecast Centers, and from Hydro-NEXRAD rainfall data generated as a service by the University of Iowa. The former is collected on a 4 × 4 km (HRAP) grid with a daily average temporal resolution, and the latter on a 1 × 1 arc-minute grid with hourly values. We have generated a cut-out for the Chesapeake Bay Basin that contains about 9,300 nodes (sites) for the MPE data and about 300,000 nodes (sites) for the Hydro-NEXRAD product. Automated harvesting services have been implemented for both data products. The MPE data is harvested from its download site using ArcGIS, which in turn is used to extract the data for the Chesapeake Bay watershed before a scripting program scatters the data into the ODM. The Hydro-NEXRAD data is downloaded from a web-based system at the University of Iowa, which permits downloads for large-scale watersheds organized by Hydrologic Unit Codes (HUC). The resulting ASCII output is then automatically parsed and the information stored alongside the MPE data. Storing the two data products side by side then allows a comparison between them, addressing the accuracy of and agreement between the methods used to derive rainfall data, as both use the raw reflectivity data from the WSR-88D system.
Accessible and informative sectioned images, color-coded images, and surface models of the ear.
Park, Hyo Seok; Chung, Min Suk; Shin, Dong Sun; Jung, Yong Wook; Park, Jin Seo
2013-08-01
In our previous research, we created state-of-the-art sectioned images, color-coded images, and surface models of the human ear. Our ear data would be more beneficial and informative if they were more easily accessible. Therefore, the purpose of this study was to distribute the browsing software and the PDF file in which ear images can be readily obtained and freely explored. Another goal was to inform other researchers of our methods for establishing the browsing software and the PDF file. To achieve this, sectioned images and color-coded images of the ear were prepared (voxel size 0.1 mm). In the color-coded images, structures related to hearing and equilibrium, and structures originating from the first and second pharyngeal arches, were additionally segmented. The sectioned and color-coded images of the right ear were added to the browsing software, which displays the images serially along with structure names. The surface models were reconstructed and combined into the PDF file, where they can be freely manipulated. Using the browsing software and PDF file, sectional and three-dimensional shapes of ear structures can be comprehended in detail. Furthermore, using the PDF file, clinical knowledge can be acquired through virtual otoscopy. Therefore, the presented educational tools will be helpful to medical students and otologists by improving their knowledge of ear anatomy. The browsing software and PDF file can be downloaded without charge or registration at our homepage (http://anatomy.dongguk.ac.kr/ear/). Copyright © 2013 Wiley Periodicals, Inc.
Web-based data collection: detailed methods of a questionnaire and data gathering tool
Cooper, Charles J; Cooper, Sharon P; del Junco, Deborah J; Shipp, Eva M; Whitworth, Ryan; Cooper, Sara R
2006-01-01
There have been dramatic advances in the development of web-based data collection instruments. This paper outlines a systematic web-based approach to facilitate this process through locally developed code and describes the results of using this process after two years of data collection. We provide a detailed example of a web-based method that we developed for a study in Starr County, Texas, assessing high school students' work and health status. This web-based application includes the data instrument design, data entry and management, and the data tables needed to store the results, in a way that attempts to maximize the advantages of this data collection method. The software also efficiently produces a coding manual, web-based statistical summary and crosstab reports, as well as input templates for use by statistical packages. Overall, web-based data entry using a dynamic approach proved to be a very efficient and effective data collection system. This data collection method expedited data processing and analysis and eliminated the need for cumbersome and expensive transfer and tracking of forms, data entry, and verification. The code has been made available for non-profit use only to the public health research community as a free download [1]. PMID:16390556
DOE Office of Scientific and Technical Information (OSTI.GOV)
Palmer, M.E.
1997-12-05
This V&V report includes analysis of two revisions of the DMS [data management system] System Requirements Specification (SRS) and the Preliminary System Design Document (PSDD); the source code for the DMS Communication Module (DMSCOM) messages; the source code for selected DMS screens; and the code for the BWAS simulator. BDM Federal analysts used a series of matrices to: compare the requirements in the System Requirements Specification (SRS) to the specifications found in the System Design Document (SDD), to ensure the design supports the business functions; compare the discrete parts of the SDD with each other, to ensure that the design is consistent and cohesive; compare the source code of the DMS Communication Module with the specifications, to ensure that the resultant messages will support the design; compare the source code of selected screens to the specifications, to ensure that the resultant system screens will support the design; and compare the source code of the BWAS simulator with the requirements, to ensure it interfaces with DMS messages and data transfers relating to BWAS operations.
ReadyVax: A new mobile vaccine information app.
Bednarczyk, Robert A; Frew, Paula M; Salmon, Daniel A; Whitney, Ellen; Omer, Saad B
2017-05-04
Vaccine information of varying quality is available through many different sources. We describe the creation, release and utilization of ReadyVax, a new mobile smartphone app providing access to trustworthy, evidence-based vaccine information for a target audience of healthcare providers, pharmacists, and patients (including parents of children). We describe the information content and technical development of ReadyVax. Between the hard launch of the app on February 12, 2015 and October 8, 2016, the app has been downloaded by 5,142 unique users, with 6,841 total app sessions initiated, comprising a total of 15,491 screen views (2.3 screens/session on average). ReadyVax has been downloaded by users in 102 different countries; most users (52%) are from the United States. We are continuing outreach efforts to increase app use, and planning for development of an Android-compatible version of ReadyVax, to increase the available market for the app.
Share, steal, or buy? A social cognitive perspective of music downloading.
LaRose, Robert; Kim, Junghyun
2007-04-01
The music downloading phenomenon presents a unique opportunity to examine normative influences on media consumption behavior. Downloaders face moral, legal, and ethical quandaries that can be conceptualized as normative influences within the self-regulatory mechanism of social cognitive theory. The music industry hopes to eliminate illegal file sharing and to divert illegal downloaders to pay services by asserting normative influence through selective prosecutions and public information campaigns. However, the deficient self-regulation of downloaders counters these efforts, maintaining file sharing as a persistent habit that defies attempts to establish normative control. The present research tests and extends the social cognitive theory of downloading on a sample of college students. The expected outcomes of downloading behavior and deficient self-regulation of that behavior were found to be important determinants of intentions to continue downloading. Consistent with social cognitive theory, but in contrast to the theory of planned behavior, it was found that descriptive and prescriptive norms influenced deficient self-regulation but had no direct impact on behavioral intentions. Downloading intentions also had no direct relationship to either compact disc purchases or subscription to online pay music services.
Xu, Guoai; Li, Qi; Guo, Yanhui; Zhang, Miao
2017-01-01
Authorship attribution is the task of identifying the most likely author of a given sample among a set of candidate known authors. It can be applied not only to discover the original author of plain text, such as novels, blogs, emails and posts, but also to identify source code programmers. Authorship attribution of source code is required in diverse applications, ranging from malicious code tracking to settling authorship disputes and software plagiarism detection. This paper aims to propose a new method to identify the programmer of Java source code samples with higher accuracy. To this end, it first introduces a back propagation (BP) neural network based on particle swarm optimization (PSO) into authorship attribution of source code. It begins by computing a set of defined feature metrics, including lexical and layout metrics and structure and syntax metrics, 19 dimensions in total. These metrics are then input to the neural network for supervised learning, the weights of which are produced by the hybrid PSO-BP algorithm. The effectiveness of the proposed method is evaluated on a collected dataset of 3,022 Java files belonging to 40 authors. Experiment results show that the proposed method achieves 91.060% accuracy. A comparison with previous work on authorship attribution of source code for the Java language illustrates that the proposed method outperforms the others overall, also with acceptable overhead. PMID:29095934
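A minimal sketch of the feature-extraction stage: a handful of lexical and layout metrics computed from Java source text, of the kind fed to the classifier. This small metric set is illustrative, not the paper's full 19 dimensions.

```python
import re

def style_metrics(source):
    lines = source.splitlines() or [""]
    tokens = re.findall(r"[A-Za-z_]\w*", source)
    return {
        "avg_line_length": sum(map(len, lines)) / len(lines),
        "blank_line_ratio": sum(not l.strip() for l in lines) / len(lines),
        "comment_ratio": source.count("//") / max(1, len(lines)),
        "avg_identifier_length": sum(map(len, tokens)) / max(1, len(tokens)),
        "brace_on_own_line": sum(l.strip() == "{" for l in lines) / len(lines),
    }

java = """public class Hello {
    // entry point
    public static void main(String[] args) {
        System.out.println("hi");
    }
}"""
print(style_metrics(java))   # one feature vector per source file
```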
3D reconstruction software comparison for short sequences
NASA Astrophysics Data System (ADS)
Strupczewski, Adam; Czupryński, Błażej
2014-11-01
Large-scale multiview reconstruction has recently become a very popular area of research. There are many open source tools that can be downloaded and run on a personal computer. However, there are few, if any, comparisons between all the available software packages in terms of accuracy on small datasets that a single user can create. The typical datasets for testing such software are archeological sites or cities, comprising thousands of images. This paper presents a comparison of currently available open source multiview reconstruction software for small datasets. It also compares the open source solutions with a simple structure-from-motion pipeline developed by the authors from scratch using the OpenCV and Eigen libraries.
XTALOPT version r11: An open-source evolutionary algorithm for crystal structure prediction
NASA Astrophysics Data System (ADS)
Avery, Patrick; Falls, Zackary; Zurek, Eva
2018-01-01
Version 11 of XTALOPT, an evolutionary algorithm for crystal structure prediction, has now been made available for download from the CPC library or the XTALOPT website, http://xtalopt.github.io. Whereas the previous versions of XTALOPT were published under the GNU Public License (GPL), the current version is made available under the 3-Clause BSD License, which is an open source license recognized by the Open Source Initiative. Importantly, the new version can be executed via a command line interface (i.e., it does not require the use of a Graphical User Interface). Moreover, the new version is written as a stand-alone program, rather than as an extension to Avogadro.
The mathematical theory of signal processing and compression-designs
NASA Astrophysics Data System (ADS)
Feria, Erlan H.
2006-05-01
The mathematical theory of signal processing, named processor coding, will be shown to arise inherently as the computational-time dual of Shannon's mathematical theory of communication, which is also known as source coding. Source coding is concerned with compressing signal source memory space, while processor coding deals with compressing signal processor computational time. Their combination is named compression-designs, referred to as Conde for short. A compelling and pedagogically appealing diagram will be discussed, highlighting Conde's remarkably successful application to real-world knowledge-aided (KA) airborne moving target indicator (AMTI) radar.
Accurate Modeling of Ionospheric Electromagnetic Fields Generated by a Low-Altitude VLF Transmitter
2007-08-31
[List-of-figures residue: low- and high-altitude fields produced by 10-kHz and 20-kHz sources, computed using the FD and TD codes; the agreement is excellent, validating the new FD code.]
ICIS-NPDES Biosolids Annual Report Download Summary ...
ECHO, Enforcement and Compliance History Online, provides compliance and enforcement information for approximately 800,000 EPA-regulated facilities nationwide. ECHO includes permit, inspection, violation, enforcement action, and penalty information about facilities regulated under the Clean Air Act (CAA) Stationary Source Program, Clean Water Act (CWA) National Pollutant Discharge Elimination System (NPDES), and/or Resource Conservation and Recovery Act (RCRA). Information also is provided on surrounding demographics when available.
NPDES Monitoring Data Download | ECHO | US EPA
FRS Download Summary and Data Element Dictionary ...
ICIS-NPDES Download Summary and Data Element ...
RCRAInfo Download Summary and Data Element Dictionary ...
Providing open access data online to advance malaria research and control.
Moyes, Catherine L; Temperley, William H; Henry, Andrew J; Burgert, Clara R; Hay, Simon I
2013-05-16
To advance research on malaria, the outputs from existing studies and the data that fed into them need to be made freely available. This will ensure new studies can build on the work that has gone before. These data and results also need to be made available to groups who are developing public health policies based on up-to-date evidence. The Malaria Atlas Project (MAP) has collated and geopositioned over 50,000 parasite prevalence and vector occurrence survey records contributed by over 3,000 sources including research groups, government agencies and non-governmental organizations worldwide. This paper describes the results of a project set up to release data gathered, used and generated by MAP. Requests for permission to release data online were sent to 236 groups who had contributed unpublished prevalence (parasite rate) surveys. An online explorer tool was developed so that users can visualize the spatial distribution of the vector and parasite survey data before downloading it. In addition, a consultation group was convened to provide advice on the mode and format of release for data generated by MAP's modelling work. New software was developed to produce a suite of publication-quality map images for download from the internet for use in external publications. More than 40,000 survey records can now be visualized on a set of dynamic maps and downloaded from the MAP website on a free and unrestricted basis. As new data are added and new permissions to release existing data come in, the volume of data available for download will increase. The modelled data output from MAP's own analyses are also available online in a range of formats, including image files and GIS surface data, for use in advocacy, education, further research and to help parameterize or validate other mathematical models.
The State of Open Source Electronic Health Record Projects: A Software Anthropology Study
2017-01-01
Background Electronic health records (EHR) are a key tool in managing and storing patients' information. Currently, there are over 50 open source EHR systems available. Functionality and usability are important factors for determining the success of any system. These factors are often a direct reflection of the domain knowledge and developers' motivations. However, few published studies have focused on the characteristics of free and open source software (F/OSS) EHR systems, and none to date have discussed the motivation, knowledge background, and demographic characteristics of the developers involved in open source EHR projects. Objective This study analyzed the characteristics of prevailing F/OSS EHR systems and aimed to provide an understanding of the motivation, knowledge background, and characteristics of the developers. Methods This study identified F/OSS EHR projects on SourceForge and other websites from May to July 2014. Projects were classified and characterized by license type, downloads, programming languages, spoken languages, project age, development status, supporting materials, top downloads by country, and whether they were "certified" EHRs. Health care F/OSS developers were also surveyed using an online survey. Results At the time of the assessment, we uncovered 54 open source EHR projects, but only four of them had been successfully certified under the Office of the National Coordinator for Health Information Technology (ONC Health IT) Certification Program. In the majority of cases, the open source EHR software was downloaded by users in the United States (64.07%, 148,666/232,034), underscoring that there is significant interest in open source EHR applications in the United States. A survey of EHR open source developers was conducted, and a total of 103 developers responded to the online questionnaire. The majority of EHR F/OSS developers (65.3%, 66/101) are participating in F/OSS projects as part of a paid activity, and only 25.7% (26/101) of EHR F/OSS developers are, or have been, health care providers in their careers. In addition, 45% (45/99) of developers do not work in the health care field. Conclusion The research presented in this study highlights some challenges that may be hindering the future of health care F/OSS. A minority of developers have been health care professionals, and only 55% (54/99) work in the health care field. This undoubtedly limits the ability of the functional design of F/OSS EHR systems to serve as a competitive advantage over prevailing commercial EHR systems. Open source software seems to be of significant interest to many; however, given that only four F/OSS EHR systems are ONC-certified, this interest is unlikely to yield significant adoption of these systems in the United States. Although the Health Information Technology for Economic and Clinical Health (HITECH) act was responsible for a substantial infusion of capital into the EHR marketplace, the lack of a corporate entity in most F/OSS EHR projects translates to a marginal capacity to market the respective F/OSS system and to navigate certification. This has likely further disadvantaged F/OSS EHR adoption in the United States. PMID:28235750
NASA Astrophysics Data System (ADS)
Wünderlich, D.; Mochalskyy, S.; Montellano, I. M.; Revel, A.
2018-05-01
Particle-in-cell (PIC) codes have been used since the early 1960s to calculate self-consistently the motion of charged particles in plasmas, taking into account external electric and magnetic fields as well as the fields created by the particles themselves. Due to the very small time steps (on the order of the inverse plasma frequency) and mesh sizes used, the computational requirements can be very high, and they increase drastically with increasing plasma density and size of the calculation domain. Thus, usually small computational domains and/or reduced dimensionality are used. In recent years, the available central processing unit (CPU) power has increased strongly. Together with massive parallelization of the codes, it is now possible to describe in 3D the extraction of charged particles from a plasma, using calculation domains with an edge length of several centimeters, comprising one extraction aperture, the plasma in the direct vicinity of the aperture, and part of the extraction system. Large negative hydrogen or deuterium ion sources are essential parts of the neutral beam injection (NBI) systems of future fusion devices like the international fusion experiment ITER and the demonstration reactor DEMO. For ITER NBI, RF driven sources with a source area of 0.9 × 1.9 m2 and 1280 extraction apertures will be used. The extraction of negative ions is accompanied by the co-extraction of electrons, which are deflected onto an electron dump. Typically, the maximum extracted negative ion current is limited by the amount and the temporal instability of the co-extracted electrons, especially for operation in deuterium. Different PIC codes are available for the extraction region of large RF driven negative ion sources for fusion. Additionally, some effort is ongoing in developing codes that describe, in a simplified manner (coarser mesh or reduced dimensionality), the plasma of the whole ion source. The presentation first gives a brief overview of the current status of ion source development for ITER NBI and of the PIC method. Different PIC codes for the extraction region are introduced, as well as their coupling to codes describing the whole source (PIC codes or fluid codes). Different physical and numerical aspects of applying PIC codes to negative hydrogen ion sources for fusion are presented and discussed, as well as selected code results. The main focus of future calculations will be meniscus formation and identifying measures for reducing the co-extracted electrons, in particular for deuterium operation. Recent results of the 3D PIC code ONIX (calculation domain: one extraction aperture and its vicinity) for the ITER prototype source (1/8 the size of the ITER NBI source) are presented.
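To make the core PIC loop concrete, here is a minimal 1D electrostatic sketch in Python of the deposit/solve/push cycle the abstract describes; this is a toy model with normalized units and periodic boundaries, not one of the 3D extraction codes (such as ONIX) discussed above.

```python
# Toy 1D electrostatic PIC loop: deposit charge, solve Poisson, push particles.
import numpy as np

ng, n_part, L, dt, steps = 64, 10000, 1.0, 0.05, 200
dx = L / ng
rng = np.random.default_rng(0)
x = rng.uniform(0, L, n_part)              # particle positions
v = rng.normal(0, 0.1, n_part)             # particle velocities
qm = -1.0                                  # charge-to-mass ratio (electrons)

for _ in range(steps):
    # 1) deposit charge on the grid (nearest-grid-point weighting)
    cells = (x / dx).astype(int) % ng
    rho = np.bincount(cells, minlength=ng) / dx
    rho -= rho.mean()                      # neutralizing ion background
    # 2) solve Poisson's equation spectrally (periodic boundaries)
    k = 2 * np.pi * np.fft.fftfreq(ng, dx)
    k[0] = 1.0                             # avoid division by zero at k = 0
    phi = np.real(np.fft.ifft(np.fft.fft(rho) / k**2))
    E = -np.gradient(phi, dx)
    # 3) interpolate the field to the particles and push them (leapfrog)
    v += qm * E[cells] * dt
    x = (x + v * dt) % L

print("field energy:", 0.5 * np.sum(E**2) * dx)
```

The time step must resolve the plasma frequency, which is exactly why the computational cost the abstract mentions grows so quickly with plasma density.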
Genetics of PCOS: A systematic bioinformatics approach to unveil the proteins responsible for PCOS.
Panda, Pritam Kumar; Rane, Riya; Ravichandran, Rahul; Singh, Shrinkhla; Panchal, Hetalkumar
2016-06-01
Polycystic ovary syndrome (PCOS) is a hormonal imbalance in women that causes problems during the menstrual cycle and in pregnancy, sometimes with fatal outcomes. Although the genetics of PCOS is not fully understood, early diagnosis and treatment can prevent long-term effects. In this study, we examined the proteins involved in PCOS and their structural aspects using computational tools. The proteins involved were modeled using Modeller 9v14 and ab initio programs. All 43 proteins responsible for PCOS were subjected to phylogenetic analysis to identify their relatedness. Further, microarray data from PCOS datasets downloaded from GEO were analyzed to find significant protein-coding genes responsible for PCOS, in addition to the protein-coding genes already reported. Various statistical analyses were performed using R programming to gain insight into the structural aspects of PCOS proteins that can be used as drug targets to treat PCOS and other related reproductive diseases.
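As a hedged sketch of the kind of analysis described (the paper used R; this uses Python for consistency with the other examples here), the following identifies differentially expressed genes in a microarray-style expression matrix. The matrix, group sizes, and gene IDs are synthetic placeholders, not the GEO datasets used in the paper.

```python
# Toy differential-expression screen: per-gene t-test with Bonferroni correction.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
genes = [f"GENE{i}" for i in range(500)]       # hypothetical gene IDs
pcos = rng.normal(0.0, 1.0, (500, 10))         # 10 simulated PCOS samples
ctrl = rng.normal(0.0, 1.0, (500, 10))         # 10 simulated control samples
pcos[:5] += 3.0                                # spike in five "true" signals

t, p = stats.ttest_ind(pcos, ctrl, axis=1)
significant = [g for g, pv in zip(genes, p) if pv < 0.05 / len(genes)]  # Bonferroni
print(significant)                             # recovers the spiked genes
```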
DOE Office of Scientific and Technical Information (OSTI.GOV)
Warehime, Mick; Alexander, Millard H., E-mail: mha@umd.edu
We restate the application of the finite element method to collinear triatomic reactive scattering dynamics with a novel treatment of the scattering boundary conditions. The method provides directly the reactive scattering wave function and, subsequently, the probability current density field. Visualizing these quantities provides additional insight into the quantum dynamics of simple chemical reactions beyond simplistic one-dimensional models. Application is made here to a symmetric reaction (H+H2), a heavy-light-light reaction (F+H2), and a heavy-light-heavy reaction (F+HCl). To accompany this article, we have written a MATLAB code which is fast, simple enough to be accessible to a wide audience, and generally applicable to any problem that can be mapped onto a collinear atom-diatom reaction. The code and user's manual are available for download from http://www2.chem.umd.edu/groups/alexander/FEM.
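For readers unfamiliar with the finite element machinery, here is a minimal sketch of linear finite-element assembly for a 1D model problem (in Python rather than the authors' MATLAB); a toy showing the assemble-and-solve pattern, not the authors' 2D reactive-scattering solver with its special boundary conditions.

```python
# Linear FEM for -u'' = 1 on [0, 1] with u(0) = u(1) = 0.
import numpy as np

n = 50                                    # number of elements
h = 1.0 / n
nodes = np.linspace(0.0, 1.0, n + 1)
K = np.zeros((n + 1, n + 1))              # global stiffness matrix
F = np.zeros(n + 1)                       # global load vector

for e in range(n):                        # assemble element contributions
    i, j = e, e + 1
    ke = np.array([[1, -1], [-1, 1]]) / h      # element stiffness (linear basis)
    fe = h / 2 * np.ones(2)                    # element load for f(x) = 1
    K[np.ix_([i, j], [i, j])] += ke
    F[[i, j]] += fe

# impose homogeneous Dirichlet boundary conditions and solve
u = np.zeros(n + 1)
u[1:-1] = np.linalg.solve(K[1:-1, 1:-1], F[1:-1])
# the exact solution is u(x) = x(1 - x)/2; the nodal error is ~machine precision
print(np.max(np.abs(u - nodes * (1 - nodes) / 2)))
```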
ANNz2: Photometric Redshift and Probability Distribution Function Estimation using Machine Learning
NASA Astrophysics Data System (ADS)
Sadeh, I.; Abdalla, F. B.; Lahav, O.
2016-10-01
We present ANNz2, a new implementation of the public software for photometric redshift (photo-z) estimation of Collister & Lahav, which now includes generation of full probability distribution functions (PDFs). ANNz2 utilizes multiple machine learning methods, such as artificial neural networks and boosted decision/regression trees. The objective of the algorithm is to optimize the performance of the photo-z estimation, to properly derive the associated uncertainties, and to produce both single-value solutions and PDFs. In addition, estimators are made available, which mitigate possible problems of non-representative or incomplete spectroscopic training samples. ANNz2 has already been used as part of the first weak lensing analysis of the Dark Energy Survey, and is included in the experiment's first public data release. Here we illustrate the functionality of the code using data from the tenth data release of the Sloan Digital Sky Survey and the Baryon Oscillation Spectroscopic Survey. The code is available for download at http://github.com/IftachSadeh/ANNZ.
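As a hedged illustration of the general approach (tree-based photometric redshift regression), the sketch below uses scikit-learn's GradientBoostingRegressor as a stand-in for ANNz2's boosted decision/regression trees; the magnitudes and redshifts are synthetic placeholders, not SDSS/BOSS data.

```python
# Toy photo-z regression: map broadband magnitudes to redshift with boosted trees.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
mags = rng.uniform(18, 24, (5000, 5))        # ugriz-like magnitudes (synthetic)
z = 0.1 * (mags[:, 1] - mags[:, 3]) + rng.normal(0, 0.01, 5000)  # toy color-z relation

X_train, X_test, z_train, z_test = train_test_split(mags, z, test_size=0.2)
model = GradientBoostingRegressor().fit(X_train, z_train)
z_phot = model.predict(X_test)
print("photo-z scatter:", np.std(z_phot - z_test))
```

ANNz2 additionally combines many such learners to build full PDFs and uncertainty estimates, which a single point-estimate regressor like this cannot provide.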
Baek, Sora; Yoon, Dae Young; Lim, Kyoung Ja; Cho, Young Kwon; Seo, Young Lan; Yun, Eun Joo
2018-05-07
To evaluate and compare the characteristics of the most downloaded and most cited articles in radiology journals, we selected 41 radiology journals that provided lists of both the most downloaded and most cited articles on their websites, and identified the 596 most downloaded and the 596 most cited articles. We compared the following characteristics of the two groups: year of publication, journal title, department of the first author, country of origin, publication type, radiologic subspecialty, radiologic technique and accessibility. Compared to the most cited articles, the most downloaded articles were more frequently review articles (36.1% vs 17.1%, p < 0.05), case reports (5.9% vs 3.2%, p < 0.05), guidelines/consensus statements (5.4% vs 2.7%, p < 0.05), editorials/commentaries (3.7% vs 0.7%, p < 0.05) and pictorial essays (2.0% vs 0.2%, p < 0.05). Compared to the most cited articles, the most downloaded articles also more frequently originated from the UK (8.7% vs 5.0%, p < 0.05) and were more frequently free-access articles (46.0% vs 39.4%, p < 0.05). Educational and free-access articles are thus more frequent among the most downloaded articles. • There was only a small overlap between the most downloaded and most cited articles. • Educational articles were more frequent among the most downloaded articles. • Free-access articles were more frequent among the most downloaded articles.
Replacing the IRAF/PyRAF Code-base at STScI: The Advanced Camera for Surveys (ACS)
NASA Astrophysics Data System (ADS)
Lucas, Ray A.; Desjardins, Tyler D.; STScI ACS (Advanced Camera for Surveys) Team
2018-06-01
IRAF/PyRAF are no longer viable on the latest hardware often used by HST observers; therefore, STScI no longer actively supports IRAF or PyRAF for most purposes. STScI instrument teams are in the process of converting all of our data processing and analysis code from IRAF/PyRAF to Python, including our calibration reference file pipelines and data reduction software. This is exemplified by our latest ACS Data Handbook, version 9.0, published in February 2018. Examples of IRAF and PyRAF commands have now been replaced by code blocks in Python, with references linked to documentation on how to download and install the latest Python software via Conda and AstroConda. With the temporary exception of the ACS slitless spectroscopy tool aXe, all ACS-related software is now independent of IRAF/PyRAF. A concerted effort has been made across STScI divisions to help the astronomical community transition from IRAF/PyRAF to Python, with tools such as Jupyter notebooks created to give users workable examples. In addition to our code changes, the new ACS data handbook discusses the latest developments in charge transfer efficiency (CTE) correction, bias de-striping, and updates to the creation and format of calibration reference files, among other topics.
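To illustrate the flavor of this transition, here is a minimal sketch of an IRAF-style task (image statistics, in the spirit of imstat) done instead with astropy and numpy; the filename and extension layout are illustrative assumptions, not taken from the handbook.

```python
# Compute basic statistics of an ACS science extension, IRAF-free.
import numpy as np
from astropy.io import fits

# "jabc01010_flc.fits" is a hypothetical ACS calibrated file name.
with fits.open("jabc01010_flc.fits") as hdul:
    sci = hdul["SCI", 1].data   # first science extension, by (EXTNAME, EXTVER)
    print(f"mean={np.mean(sci):.3f}  stddev={np.std(sci):.3f}  "
          f"min={np.min(sci):.3f}  max={np.max(sci):.3f}")
```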
Coplen, Tyler B.
2000-01-01
The reliability and accuracy of isotopic data can be improved by utilizing database software to (i) store information about samples, (ii) store the results of mass spectrometric isotope-ratio analyses of samples, (iii) calculate analytical results using standardized algorithms stored in a database, (iv) normalize stable isotopic data to international scales using isotopic reference materials, and (v) generate multi-sheet paper templates for convenient sample loading of automated mass-spectrometer sample preparation manifolds. Such a database program, the Laboratory Information Management System (LIMS) for Light Stable Isotopes, is presented herein. Major benefits of this system include (i) a dramatic improvement in quality assurance, (ii) an increase in laboratory efficiency, (iii) a reduction in workload due to the elimination or reduction of retyping of data by laboratory personnel, and (iv) a decrease in errors in data reported to sample submitters. Such a database provides a complete record of when and how often laboratory reference materials have been analyzed and provides a record of what correction factors have been used through time. It provides an audit trail for laboratories. LIMS for Light Stable Isotopes is available for both Microsoft Office 97 Professional and Microsoft Office 2000 Professional as versions 7 and 8, respectively. Both source code (mdb file) and precompiled executable files (mde) are available. Numerous improvements have been made for continuous flow isotopic analysis in this version (specifically 7.13 for Microsoft Access 97 and 8.13 for Microsoft Access 2000). It is much easier to import isotopic results from Finnigan ISODAT worksheets, even worksheets on which corrections for amount of sample (linearity corrections) have been added. The capability to determine blank corrections using isotope mass balance from analyses of elemental analyzer samples has been added. It is now possible to calculate and apply drift corrections to isotopic data based on the time of day of analysis. Whereas Finnigan ISODAT software is confined to using only a single peak for calculating delta values, LIMS now enables one to use the mean of two or more reference injections during a continuous flow analysis to calculate delta values. This is useful with Finnigan's GasBench II online sample preparation system. Concentrations of carbon, nitrogen, and sulfur can be calculated based on one or more isotopic reference materials analyzed with a group of samples. Both sample data and isotopic analysis data can now be exported to Excel files. A calculator for determining the amount of sample needed for isotopic analysis based on a previous amount of sample and continuous flow area is now an integral part of LIMS for Light Stable Isotopes. LIMS for Light Stable Isotopes can now assign an error code to Finnigan elemental analyzer analyses in which one of the electrometers has saturated due to analysis of too much sample material, giving rise to incorrect isotopic abundances. Information on downloading this report and downloading code and databases is provided at the Internet addresses: http://water.usgs.gov/software/geochemical.html or http://www.geogr.uni-jena.de/software/geochemical.html in the Eastern Hemisphere.
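Step (iv), normalization to an international scale, is a simple linear mapping anchored by two reference materials. Here is a minimal sketch of that two-point normalization; the measured values below are illustrative numbers only (the accepted values 0.0 and -55.5 per mil are the defined d18O values of VSMOW and SLAP).

```python
# Two-point linear normalization of a measured delta value (per mil).
def normalize(delta_measured, rm1_measured, rm1_accepted,
              rm2_measured, rm2_accepted):
    """Map a measured delta onto the scale defined by two reference materials."""
    slope = (rm2_accepted - rm1_accepted) / (rm2_measured - rm1_measured)
    return rm1_accepted + slope * (delta_measured - rm1_measured)

# e.g. normalizing a d18O measurement against VSMOW and SLAP reference waters
print(normalize(-10.2, rm1_measured=0.3, rm1_accepted=0.0,
                rm2_measured=-54.8, rm2_accepted=-55.5))
```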
Wang, R; Li, X A
2001-02-01
The dose parameters for the beta-particle emitting 90Sr/90Y source for intravascular brachytherapy (IVBT) have been calculated by different investigators. At larger distances from the source, noticeable differences are seen in the parameters calculated using different Monte Carlo codes. The purpose of this work is to quantify and to understand these differences. We have compared a series of calculations using the EGS4, EGSnrc, and MCNP Monte Carlo codes. The data calculated and compared include the depth-dose curve for a broad parallel beam of electrons, and radial dose distributions for point electron sources (monoenergetic or polyenergetic) and for a real 90Sr/90Y source. For the 90Sr/90Y source, the doses at the reference position (2 mm radial distance) calculated by the three codes agree within 2%. However, the differences between the doses calculated by the three codes can exceed 20% in the radial distance range of interest in IVBT. The difference increases with radial distance from the source, reaching 30% at the tail of the dose curve. These differences may be partially attributed to the different multiple scattering theories and Monte Carlo models for electron transport adopted in the three codes. Doses calculated by the EGSnrc code are more accurate than those by EGS4. The two calculations agree within 5% for radial distances <6 mm.
Kim, Daehee; Kim, Dongwan; An, Sunshin
2016-07-09
Code dissemination in wireless sensor networks (WSNs) is a procedure for distributing a new code image over the air in order to update programs. Because WSNs are mostly deployed in unattended and hostile environments, secure code dissemination ensuring authenticity and integrity is essential. Recent work on dynamic packet size control in WSNs enhances the energy efficiency of code dissemination by dynamically changing the packet size on the basis of link quality. However, the authentication tokens attached by the base station become useless in the next hop, where the packet size can vary according to that hop's link quality. In this paper, we propose three source authentication schemes for code dissemination supporting dynamic packet size. Compared to traditional source authentication schemes such as μTESLA and digital signatures, our schemes provide secure source authentication in environments where the packet size changes at each hop, with lower energy consumption.
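For context, a classic building block for authenticating a code image split into packets is hash chaining: each packet carries the hash of the next, so a single signed value authenticates the whole stream. The sketch below illustrates that baseline; the paper's proposed per-hop schemes for dynamic packet sizes are not reproduced here.

```python
# Hash-chain authentication of a packet stream (toy baseline).
import hashlib

def build_hash_chain(packets):
    """Append to each packet the SHA-256 digest of the packet that follows it."""
    out, next_digest = [], b""
    for p in reversed(packets):
        out.append(p + next_digest)
        next_digest = hashlib.sha256(out[-1]).digest()
    out.reverse()
    return out, next_digest   # the digest of packet 0 would be signed by the base station

def verify_stream(packets, signed_digest):
    """Verify packets in order; each authenticated packet vouches for the next."""
    expected = signed_digest
    for p in packets:
        if hashlib.sha256(p).digest() != expected:
            return False
        expected = p[-32:]    # trailing 32 bytes hold the next packet's hash
    return True               # (the last packet's trailing bytes are unused)

chain, root = build_hash_chain([b"chunk-%d" % i for i in range(4)])
print(verify_stream(chain, root))
```

The weakness the paper addresses is visible here: the embedded hashes fix the packet boundaries, so re-segmenting packets hop by hop breaks the chain.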
DOE Office of Scientific and Technical Information (OSTI.GOV)
Santos-Villalobos, Hector J; Gregor, Jens; Bingham, Philip R
2014-01-01
At present, neutron sources cannot be fabricated small and powerful enough to achieve high-resolution radiography while maintaining an adequate flux. One solution is to employ computational imaging techniques such as a magnified coded source imaging (CSI) system. A coded mask is placed between the neutron source and the object. The system resolution is increased by reducing the size of the mask holes, and the flux is increased by increasing the size of the coded mask and/or the number of holes. One limitation of such a system is that the resolution of current state-of-the-art scintillator-based detectors caps at around 50 µm. To overcome this challenge, the coded mask and object are magnified by making the distance from the coded mask to the object much smaller than the distance from the object to the detector. In previous work, we showed via synthetic experiments that our least squares method outperforms other methods in image quality and reconstruction precision because it models the CSI system components. However, those validation experiments were limited to simplistic neutron sources. In this work, we aim to model the flux distribution of a real neutron source and incorporate such a model in our least squares computational system. We provide a full description of the methodology used to characterize the neutron source and validate the method with synthetic experiments.
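As a hedged sketch of the reconstruction idea: given a forward model A mapping the object to coded detector measurements, the object is recovered by least squares. The matrix below is a random stand-in, not the paper's physical system model, which includes the coded mask geometry and the measured source flux distribution.

```python
# Least-squares reconstruction against a synthetic forward model.
import numpy as np
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(3)
A = rng.random((200, 100))                   # stand-in coded-aperture system matrix
x_true = np.zeros(100)
x_true[40:60] = 1.0                          # simple 1D "object"
b = A @ x_true + rng.normal(0, 0.01, 200)    # noisy coded measurements

x_rec = lsqr(A, b)[0]                        # minimize ||A x - b||
print("reconstruction error:", np.linalg.norm(x_rec - x_true))
```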
Streamlined Genome Sequence Compression using Distributed Source Coding
Wang, Shuang; Jiang, Xiaoqian; Chen, Feng; Cui, Lijuan; Cheng, Samuel
2014-01-01
We aim at developing a streamlined genome sequence compression algorithm to support alternative miniaturized sequencing devices, which have limited communication, storage, and computation power. Existing techniques that require a heavy client (encoder side) cannot be applied. To tackle this challenge, we carefully examined distributed source coding theory and developed a customized reference-based genome compression protocol to meet the low-complexity need at the client side. Based on the variation between source and reference, our protocol adaptively picks either syndrome coding or hash coding to compress subsequences of varying code length. Our experimental results showed promising performance of the proposed method when compared with the state-of-the-art algorithm (GRS). PMID:25520552
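A toy illustration of the reference-based idea: when a read is close to the reference, store only the mismatch positions; otherwise fall back to the raw sequence. The paper's actual protocol chooses between syndrome coding and hash coding from distributed source coding theory, which this sketch only mimics in spirit with a simple diff.

```python
# Reference-based toy encoder: mismatch list for low variation, raw otherwise.
def encode(read, ref):
    mismatches = [(i, b) for i, b in enumerate(read) if b != ref[i]]
    if len(mismatches) <= len(read) // 4:    # "low variation" branch
        return ("diff", mismatches)
    return ("raw", read)                     # "high variation" branch

def decode(token, ref):
    kind, payload = token
    if kind == "raw":
        return payload
    seq = list(ref)
    for i, b in payload:
        seq[i] = b
    return "".join(seq)

ref = "ACGTACGTACGT"
read = "ACGTACGAACGT"
print(decode(encode(read, ref), ref) == read)
```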
Lin, Chin-Feng
2008-02-01
This study examined downloader cognitive structures toward Web service quality and downloader ethical attitudes across various levels of participation in a virtual community. Using four types of free downloads as the research subjects, the researcher found that users with different degrees of participation have different perception preferences. Owners of free-download Web sites can use the findings of this study to develop effective Web marketing strategies.
The FORTRAN static source code analyzer program (SAP) system description
NASA Technical Reports Server (NTRS)
Decker, W.; Taylor, W.; Merwarth, P.; Oneill, M.; Goorevich, C.; Waligora, S.
1982-01-01
A source code analyzer program (SAP) designed to assist personnel in conducting studies of FORTRAN programs is described. SAP scans FORTRAN source code and produces reports presenting statistics and measures of the statements and structures that make up a module. The processing performed by SAP, as well as the routines, COMMON blocks, and files used by SAP, is described. The system generation procedure for SAP is also presented.
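As a hedged sketch of the kind of static analysis SAP performs, here is a toy Python classifier that tallies statement types in fixed-form FORTRAN source; it is illustrative only, not the original implementation.

```python
# Toy static analyzer: count statement categories in fixed-form FORTRAN.
from collections import Counter

def analyze(source_lines):
    stats = Counter()
    for line in source_lines:
        code = line[6:72].strip().upper()        # fixed-form statement field
        if not code or line[:1] in ("C", "c", "*"):
            stats["comment/blank"] += 1          # comment indicator in column 1
        elif code.startswith(("IF", "DO", "GOTO", "GO TO")):
            stats["control"] += 1
        elif code.startswith("COMMON"):
            stats["common"] += 1
        else:
            stats["other"] += 1
    return stats

print(analyze(["C     SAMPLE", "      COMMON /BLK/ X", "      DO 10 I=1,N",
               "      X = X + I", "10    CONTINUE"]))
```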
Luyckx, Kim; Luyten, Léon; Daelemans, Walter; Van den Bulcke, Tim
2016-01-01
Objective Enormous amounts of healthcare data are becoming increasingly accessible through the large-scale adoption of electronic health records. In this work, structured and unstructured (textual) data are combined to assign clinical diagnostic and procedural codes (specifically ICD-9-CM) to patient stays. We investigate whether integrating these heterogeneous data types improves prediction strength compared to using the data types in isolation. Methods Two separate data integration approaches were evaluated. Early data integration combines features of several sources within a single model, and late data integration learns a separate model per data source and combines these predictions with a meta-learner. This is evaluated on data sources and clinical codes from a broad set of medical specialties. Results When compared with the best individual prediction source, late data integration leads to improvements in predictive power (eg, overall F-measure increased from 30.6% to 38.3% for International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) diagnostic codes), while early data integration is less consistent. The predictive strength strongly differs between medical specialties, both for ICD-9-CM diagnostic and procedural codes. Discussion Structured data provides complementary information to unstructured data (and vice versa) for predicting ICD-9-CM codes. This can be captured most effectively by the proposed late data integration approach. Conclusions We demonstrated that models using multiple electronic health record data sources systematically outperform models using data sources in isolation in the task of predicting ICD-9-CM codes over a broad range of medical specialties. PMID:26316458
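As a hedged sketch of the late data integration scheme described (one model per data source, combined by a meta-learner), the following uses synthetic features as stand-ins for the structured and textual EHR sources; note that proper stacking would train the meta-learner on held-out level-1 predictions, which this toy skips.

```python
# Late integration: per-source classifiers stacked under a meta-learner.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
y = rng.integers(0, 2, 1000)                              # has-code / no-code label
structured = y[:, None] + rng.normal(0, 1.0, (1000, 20))  # "structured" source
text = y[:, None] + rng.normal(0, 1.5, (1000, 50))        # "textual" source

idx_train, idx_test = train_test_split(np.arange(1000), test_size=0.3)
# level 1: a separate model per data source
m1 = LogisticRegression(max_iter=1000).fit(structured[idx_train], y[idx_train])
m2 = LogisticRegression(max_iter=1000).fit(text[idx_train], y[idx_train])
# level 2: a meta-learner over the per-source predicted probabilities
meta_X = np.column_stack([m1.predict_proba(structured)[:, 1],
                          m2.predict_proba(text)[:, 1]])
meta = LogisticRegression().fit(meta_X[idx_train], y[idx_train])
print("accuracy:", meta.score(meta_X[idx_test], y[idx_test]))
```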
Managing multicentre clinical trials with open source.
Raptis, Dimitri Aristotle; Mettler, Tobias; Fischer, Michael Alexander; Patak, Michael; Lesurtel, Mickael; Eshmuminov, Dilmurodjon; de Rougemont, Olivier; Graf, Rolf; Clavien, Pierre-Alain; Breitenstein, Stefan
2014-03-01
Multicentre clinical trials are challenged by high administrative burden, data management pitfalls and costs. This leads to reduced enthusiasm and commitment among the physicians involved and thus to a reluctance to conduct multicentre clinical trials. The purpose of this study was to develop a web-based open source platform to support a multicentre clinical trial. Using the design science research approach, we developed a web-based multicentre clinical trial management system on Drupal, an open source software platform distributed under the terms of the General Public License. The system was evaluated by user testing, has successfully supported several completed and ongoing clinical trials, and is available for free download. Open source clinical trial management systems are capable of supporting multicentre clinical trials by enhancing efficiency, quality of data management and collaboration.
IPO: a tool for automated optimization of XCMS parameters.
Libiseller, Gunnar; Dvorzak, Michaela; Kleb, Ulrike; Gander, Edgar; Eisenberg, Tobias; Madeo, Frank; Neumann, Steffen; Trausinger, Gert; Sinner, Frank; Pieber, Thomas; Magnes, Christoph
2015-04-16
Untargeted metabolomics generates a huge amount of data. Software packages for automated data processing are crucial to successfully process these data. A variety of such software packages exist, but the outcome of data processing strongly depends on algorithm parameter settings. If they are not carefully chosen, suboptimal parameter settings can easily lead to biased results. Therefore, parameter settings also require optimization. Several parameter optimization approaches have already been proposed, but a software package for parameter optimization that is free of intricate experimental labeling steps, fast, and widely applicable is still missing. We implemented the software package IPO ('Isotopologue Parameter Optimization'), which is fast, free of labeling steps, and applicable to data from different kinds of samples, different liquid chromatography-high resolution mass spectrometry methods, and different instruments. IPO optimizes XCMS peak picking parameters by using natural, stable (13)C isotopic peaks to calculate a peak picking score. Retention time correction is optimized by minimizing relative retention time differences within peak groups. Grouping parameters are optimized by maximizing the number of peak groups that show one peak from each injection of a pooled sample. The different parameter settings are achieved by design of experiments, and the resulting scores are evaluated using response surface models. IPO was tested on three different data sets, each consisting of a training set and a test set. IPO resulted in an increase of reliable groups (146%-361%), a decrease of non-reliable groups (3%-8%) and a decrease of the retention time deviation to one third. IPO was successfully applied to data derived from liquid chromatography coupled to high resolution mass spectrometry from three studies with different sample types and different chromatographic methods and devices. We were also able to show the potential of IPO to increase the reliability of metabolomics data. The source code is implemented in R, tested on Linux and Windows, and freely available for download at https://github.com/glibiseller/IPO . The training sets and test sets can be downloaded from https://health.joanneum.at/IPO .
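A hedged toy of the optimization strategy described (IPO itself is in R; this uses Python for consistency with the other examples): evaluate a score over a design of experiments, fit a response surface, and refine around its maximum. The score() function is a stand-in for IPO's isotopologue-based peak picking score, not the real thing.

```python
# Design-of-experiments loop with a quadratic response surface.
import numpy as np

def score(param):                        # hypothetical scoring function
    return -(param - 7.3) ** 2 + 50.0    # peaks at param = 7.3

design = np.linspace(1.0, 15.0, 5)       # initial design of experiments
for _ in range(3):
    y = np.array([score(p) for p in design])
    a, b, c = np.polyfit(design, y, 2)   # quadratic response surface fit
    best = -b / (2 * a)                  # vertex of the fitted parabola
    width = (design.max() - design.min()) / 4
    design = np.linspace(best - width, best + width, 5)  # refine around the optimum

print("optimized parameter:", best)
```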
Entropy-Based Bounds On Redundancies Of Huffman Codes
NASA Technical Reports Server (NTRS)
Smyth, Padhraic J.
1992-01-01
Report presents extension of theory of redundancy of binary prefix codes of Huffman type, including derivation of variety of bounds expressed in terms of entropy of source and size of alphabet. Recent developments yielded bounds on redundancy of Huffman code in terms of probabilities of various components in source alphabet. In practice, redundancies of optimal prefix codes often closer to 0 than to 1.
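The redundancy in question is the expected codeword length minus the source entropy. The short sketch below computes it for an example distribution, illustrating the final remark that optimal prefix codes often sit much closer to 0 than to 1.

```python
# Redundancy of a binary Huffman code: E[L] - H(p).
import heapq
import math

def huffman_lengths(probs):
    """Return codeword lengths of a binary Huffman code for the given probabilities."""
    heap = [(p, [i]) for i, p in enumerate(probs)]
    lengths = [0] * len(probs)
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, s1 = heapq.heappop(heap)
        p2, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1            # every symbol in the merged pair gains one bit
        heapq.heappush(heap, (p1 + p2, s1 + s2))
    return lengths

probs = [0.5, 0.25, 0.15, 0.1]
lengths = huffman_lengths(probs)
entropy = -sum(p * math.log2(p) for p in probs)
expected = sum(p, ) if False else sum(p * l for p, l in zip(probs, lengths))
print(f"entropy={entropy:.4f}  E[L]={expected:.4f}  redundancy={expected - entropy:.4f}")
```

For this distribution the redundancy is roughly 0.007 bits per symbol.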
40 CFR Appendix A to Subpart A of... - Tables
Code of Federal Regulations, 2010 CFR
2010-07-01
[Table fragment: data elements required for reporting emissions from nonpoint sources and nonroad mobile sources under 40 CFR, including inventory start and end dates, contact name and phone number, FIPS code, facility ID codes, unit ID code, and process ID code.]
SIDECACHE: Information access, management and dissemination framework for web services.
Doderer, Mark S; Burkhardt, Cory; Robbins, Kay A
2011-06-14
Many bioinformatics algorithms and data sets are deployed using web services so that the results can be explored via the Internet and easily integrated into other tools and services. These services often include data from other sites that is accessed either dynamically or through file downloads. Developers of these services face several problems because of the dynamic nature of the information from the upstream services. Many publicly available repositories of bioinformatics data frequently update their information. When such an update occurs, the developers of the downstream service may also need to update. For file downloads, this process is typically performed manually, followed by a web service restart. Requests for information obtained by dynamic access of upstream sources are sometimes subject to rate restrictions. SideCache provides a framework for deploying web services that integrate information extracted from other databases and from web sources that are periodically updated. This situation occurs frequently in biotechnology, where new information is being continuously generated and the latest information is important. SideCache provides several types of services including proxy access and rate control, local caching, and automatic web service updating. We have used the SideCache framework to automate the deployment and updating of a number of bioinformatics web services and tools that extract information from remote primary sources such as NCBI, NCIBI, and Ensembl. The SideCache framework also has been used to share research results through the use of a SideCache derived web service.
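The caching-proxy idea is simple to sketch: serve a locally cached copy of an upstream resource until a time-to-live expires, which also bounds the request rate seen by the upstream source. The URL and TTL below are illustrative; this is a minimal stand-in, not SideCache's implementation.

```python
# Minimal TTL cache in front of an upstream web resource.
import time
import urllib.request

_cache = {}  # url -> (fetched_at, body)

def fetch(url, ttl=3600):
    """Return a cached body if still fresh, otherwise refetch from upstream."""
    now = time.time()
    if url in _cache and now - _cache[url][0] < ttl:
        return _cache[url][1]
    with urllib.request.urlopen(url) as resp:
        body = resp.read()
    _cache[url] = (now, body)
    return body

data = fetch("https://example.org/some/dataset")  # hypothetical upstream source
print(len(data))
```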
JHelioviewer: Open-Source Software for Discovery and Image Access in the Petabyte Age (Invited)
NASA Astrophysics Data System (ADS)
Mueller, D.; Dimitoglou, G.; Langenberg, M.; Pagel, S.; Dau, A.; Nuhn, M.; Garcia Ortiz, J. P.; Dietert, H.; Schmidt, L.; Hughitt, V. K.; Ireland, J.; Fleck, B.
2010-12-01
The unprecedented torrent of data returned by the Solar Dynamics Observatory is both a blessing and a barrier: a blessing for making available data with significantly higher spatial and temporal resolution, but a barrier for scientists to access, browse and analyze them. With such a staggering data volume, the data are bound to be accessible only from a few repositories, and users will have to deal with data sets that are effectively immobile and practically difficult to download. From a scientist's perspective this poses three challenges: accessing, browsing and finding interesting data while avoiding the proverbial search for a needle in a haystack. To address these challenges, we have developed JHelioviewer, an open-source visualization software package that lets users browse large data volumes both as still images and movies. We did so by deploying an efficient image encoding, storage, and dissemination solution using the JPEG 2000 standard. This solution enables users to access remote images at different resolution levels as a single data stream. Users can view, manipulate, pan, zoom, and overlay JPEG 2000 compressed data quickly, without severe network bandwidth penalties. Besides viewing data, the browser provides third-party metadata and event catalog integration to quickly locate data of interest, as well as an interface to the Virtual Solar Observatory to download science-quality data. As part of the Helioviewer Project, JHelioviewer offers intuitive ways to browse large amounts of heterogeneous data remotely and provides an extensible and customizable open-source platform for the scientific community.
SOURCELESS STARTUP. A MACHINE CODE FOR COMPUTING LOW-SOURCE REACTOR STARTUPS
DOE Office of Scientific and Technical Information (OSTI.GOV)
MacMillan, D.B.
1960-06-01
A revision to the sourceless start-up code is presented. The code solves a system of differential equations encountered in computing the probability distribution of activity at an observed power level during reactor start-up from a very low source level. (J.R.D.)
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2011 CFR
2011-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2012 CFR
2012-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2014 CFR
2014-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... or will be developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar...
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2010 CFR
2010-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... developed exclusively with Government funds; (ii) Studies, analyses, test data, or similar data produced for...
LDPC-based iterative joint source-channel decoding for JPEG2000.
Pu, Lingling; Wu, Zhenyu; Bilgin, Ali; Marcellin, Michael W; Vasic, Bane
2007-02-01
A framework is proposed for iterative joint source-channel decoding of JPEG2000 codestreams. At the encoder, JPEG2000 is used to perform source coding with certain error-resilience (ER) modes, and LDPC codes are used to perform channel coding. During decoding, the source decoder uses the ER modes to identify corrupt sections of the codestream and provides this information to the channel decoder. Decoding is carried out jointly in an iterative fashion. Experimental results indicate that the proposed method requires fewer iterations and improves overall system performance.
48 CFR 252.227-7013 - Rights in technical data-Noncommercial items.
Code of Federal Regulations, 2013 CFR
2013-10-01
... causing a computer to perform a specific operation or series of operations. (3) Computer software means computer programs, source code, source code listings, object code listings, design details, algorithms... funds; (ii) Studies, analyses, test data, or similar data produced for this contract, when the study...
SolTrace | Concentrating Solar Power | NREL
SolTrace is available as an NREL packaged distribution or from source code at the SolTrace open source project website. The code uses a Monte Carlo ray-tracing methodology.
Chain of Custody Item Monitor Message Viewer v.1.0 Beta
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schwartz, Steven Robert; Fielder, Laura; Hymel, Ross W.
The CoCIM Message Viewer software allows users to connect to and download messages from a Chain of Custody Item Monitor (CoCIM) connected to a serial port on the user’s computer. The downloaded messages are authenticated and displayed in a Graphical User Interface that allows the user a limited degree of sorting and filtering of the downloaded messages as well as the ability to save downloaded files or to open previously downloaded message history files.
ICIS-NPDES Data Set Download | ECHO | US EPA
ECHO, Enforcement and Compliance History Online, provides compliance and enforcement information for approximately 800,000 EPA-regulated facilities nationwide. ECHO includes permit, inspection, violation, enforcement action, and penalty information about facilities regulated under the Clean Air Act (CAA) Stationary Source Program, Clean Water Act (CWA) National Pollutant Discharge Elimination System (NPDES), and/or Resource Conservation and Recovery Act (RCRA). Information also is provided on surrounding demographics when available.
Transport Traffic Analysis for Abusive Infrastructure Characterization
2012-12-14
Introduction: Abusive traffic abounds on the Internet, in the form of email, malware, vulnerability scanners, worms, denial-of-service, drive-by downloads, scam ... insight is two-fold. First, attackers have a basic requirement to source large amounts of data, be it denial-of-service, scam-hosting, spam, or other ... the network core. This paper explores the power of transport-layer traffic analysis to detect and characterize scam hosting infrastructure, including ...
EPA FRS Facilities Combined File CSV Download for the Marshall Islands
The Facility Registry System (FRS) identifies facilities, sites, or places subject to environmental regulation or of environmental interest to EPA programs or delegated states. Using rigorous verification and data management procedures, FRS integrates facility data from program national systems, state master facility records, tribal partners, and other federal agencies and provides the Agency with a centrally managed, single source of comprehensive and authoritative information on facilities.
EPA FRS Facilities Single File CSV Download for the Marshall Islands
The Facility Registry System (FRS) identifies facilities, sites, or places subject to environmental regulation or of environmental interest to EPA programs or delegated states. Using rigorous verification and data management procedures, FRS integrates facility data from program national systems, state master facility records, tribal partners, and other federal agencies and provides the Agency with a centrally managed, single source of comprehensive and authoritative information on facilities.
A nomenclator of extant and fossil taxa of the Valvatidae (Gastropoda, Ectobranchia)
Haszprunar, Gerhard
2014-01-01
A compilation of all supra- and (infra-) specific taxa of extant and fossil Valvatidae, a group of freshwater operculate snails, is provided, including taxa initially described in this family and subsequently classified in other families, as well as names containing errors or misspellings. The extensive reference list is directly linked to the available electronic source (digital view or pdf-download) of the respective papers. PMID:24578604
ICIS-Air Download Summary and Data Element Dictionary ...
ECHO, Enforcement and Compliance History Online, provides compliance and enforcement information for approximately 800,000 EPA-regulated facilities nationwide. ECHO includes permit, inspection, violation, enforcement action, and penalty information about facilities regulated under the Clean Air Act (CAA) Stationary Source Program, Clean Water Act (CWA) National Pollutant Discharge Elimination System (NPDES), and/or Resource Conservation and Recovery Act (RCRA). Information also is provided on surrounding demographics when available.
ICIS-FE&C Download Summary and Data Element Dictionary ...
ECHO, Enforcement and Compliance History Online, provides compliance and enforcement information for approximately 800,000 EPA-regulated facilities nationwide. ECHO includes permit, inspection, violation, enforcement action, and penalty information about facilities regulated under the Clean Air Act (CAA) Stationary Source Program, Clean Water Act (CWA) National Pollutant Discharge Elimination System (NPDES), and/or Resource Conservation and Recovery Act (RCRA). Information also is provided on surrounding demographics when available.