Attributes and Behaviors of Performance-Centered Systems.
ERIC Educational Resources Information Center
Gery, Gloria
1995-01-01
Examines attributes, characteristics, and behaviors of performance-centered software packages that are emerging in the consumer software marketplace and compares them with large-scale systems software being designed by internal information systems staffs and vendors of large-scale software designed for financial, manufacturing, processing, and…
2015-08-01
Atomic/Molecular Massively Parallel Simulator ( LAMMPS ) Software by N Scott Weingarten and James P Larentzos Approved for...Massively Parallel Simulator ( LAMMPS ) Software by N Scott Weingarten Weapons and Materials Research Directorate, ARL James P Larentzos Engility...Shifted Periodic Boundary Conditions in the Large-Scale Atomic/Molecular Massively Parallel Simulator ( LAMMPS ) Software 5a. CONTRACT NUMBER 5b
Large-scale structural optimization
NASA Technical Reports Server (NTRS)
Sobieszczanski-Sobieski, J.
1983-01-01
Problems encountered by aerospace designers in attempting to optimize whole aircraft are discussed, along with possible solutions. Large scale optimization, as opposed to component-by-component optimization, is hindered by computational costs, software inflexibility, concentration on a single, rather than trade-off, design methodology and the incompatibility of large-scale optimization with single program, single computer methods. The software problem can be approached by placing the full analysis outside of the optimization loop. Full analysis is then performed only periodically. Problem-dependent software can be removed from the generic code using a systems programming technique, and then embody the definitions of design variables, objective function and design constraints. Trade-off algorithms can be used at the design points to obtain quantitative answers. Finally, decomposing the large-scale problem into independent subproblems allows systematic optimization of the problems by an organization of people and machines.
Data Integration: Charting a Path Forward to 2035
2011-02-14
New York, NY: Gotham Books, 2004. Seligman , Len. Mitre Corporation, e-mail interview, 6 Dec 2010. Singer, P.W. Wired for War: The Robotics...articles.aspx (accessed 4 Dec 2010). Ultra-Large-Scale Systems: The Software Challenge of the Future. Study lead Linda Northrup. Pittsburgh, PA: Carnegie...Virtualization?‖ 1. 41 Ultra-Large-Scale Systems: The Software Challenge of the Future. Study lead Linda Northrup. Pittsburgh, PA: Carnegie Mellon Software
NASA Astrophysics Data System (ADS)
Kröger, Knut; Creutzburg, Reiner
2013-05-01
The aim of this paper is to show the usefulness of modern forensic software tools for processing large-scale digital investigations. In particular, we focus on the new version of Nuix 4.2 and compare it with AccessData FTK 4.2, X-Ways Forensics 16.9 and Guidance Encase Forensic 7 regarding its performance, functionality, usability and capability. We will show how these software tools work with large forensic images and how capable they are in examining complex and big data scenarios.
The large scale microelectronics Computer-Aided Design and Test (CADAT) system
NASA Technical Reports Server (NTRS)
Gould, J. M.
1978-01-01
The CADAT system consists of a number of computer programs written in FORTRAN that provide the capability to simulate, lay out, analyze, and create the artwork for large scale microelectronics. The function of each software component of the system is described with references to specific documentation for each software component.
Designing an External Evaluation of a Large-Scale Software Development Project.
ERIC Educational Resources Information Center
Collis, Betty; Moonen, Jef
This paper describes the design and implementation of the evaluation of the POCO Project, a large-scale national software project in the Netherlands which incorporates the perspective of an evaluator throughout the entire span of the project, and uses the experiences gained from it to suggest an evaluation procedure that could be applied to other…
Supporting Source Code Comprehension during Software Evolution and Maintenance
ERIC Educational Resources Information Center
Alhindawi, Nouh
2013-01-01
This dissertation addresses the problems of program comprehension to support the evolution of large-scale software systems. The research concerns how software engineers locate features and concepts along with categorizing changes within very large bodies of source code along with their versioned histories. More specifically, advanced Information…
Implicity restarted Arnoldi/Lanczos methods for large scale eigenvalue calculations
NASA Technical Reports Server (NTRS)
Sorensen, Danny C.
1996-01-01
Eigenvalues and eigenfunctions of linear operators are important to many areas of applied mathematics. The ability to approximate these quantities numerically is becoming increasingly important in a wide variety of applications. This increasing demand has fueled interest in the development of new methods and software for the numerical solution of large-scale algebraic eigenvalue problems. In turn, the existence of these new methods and software, along with the dramatically increased computational capabilities now available, has enabled the solution of problems that would not even have been posed five or ten years ago. Until very recently, software for large-scale nonsymmetric problems was virtually non-existent. Fortunately, the situation is improving rapidly. The purpose of this article is to provide an overview of the numerical solution of large-scale algebraic eigenvalue problems. The focus will be on a class of methods called Krylov subspace projection methods. The well-known Lanczos method is the premier member of this class. The Arnoldi method generalizes the Lanczos method to the nonsymmetric case. A recently developed variant of the Arnoldi/Lanczos scheme called the Implicitly Restarted Arnoldi Method is presented here in some depth. This method is highlighted because of its suitability as a basis for software development.
Sign: large-scale gene network estimation environment for high performance computing.
Tamada, Yoshinori; Shimamura, Teppei; Yamaguchi, Rui; Imoto, Seiya; Nagasaki, Masao; Miyano, Satoru
2011-01-01
Our research group is currently developing software for estimating large-scale gene networks from gene expression data. The software, called SiGN, is specifically designed for the Japanese flagship supercomputer "K computer" which is planned to achieve 10 petaflops in 2012, and other high performance computing environments including Human Genome Center (HGC) supercomputer system. SiGN is a collection of gene network estimation software with three different sub-programs: SiGN-BN, SiGN-SSM and SiGN-L1. In these three programs, five different models are available: static and dynamic nonparametric Bayesian networks, state space models, graphical Gaussian models, and vector autoregressive models. All these models require a huge amount of computational resources for estimating large-scale gene networks and therefore are designed to be able to exploit the speed of 10 petaflops. The software will be available freely for "K computer" and HGC supercomputer system users. The estimated networks can be viewed and analyzed by Cell Illustrator Online and SBiP (Systems Biology integrative Pipeline). The software project web site is available at http://sign.hgc.jp/ .
A Review of Feature Extraction Software for Microarray Gene Expression Data
Tan, Ching Siang; Ting, Wai Soon; Mohamad, Mohd Saberi; Chan, Weng Howe; Deris, Safaai; Ali Shah, Zuraini
2014-01-01
When gene expression data are too large to be processed, they are transformed into a reduced representation set of genes. Transforming large-scale gene expression data into a set of genes is called feature extraction. If the genes extracted are carefully chosen, this gene set can extract the relevant information from the large-scale gene expression data, allowing further analysis by using this reduced representation instead of the full size data. In this paper, we review numerous software applications that can be used for feature extraction. The software reviewed is mainly for Principal Component Analysis (PCA), Independent Component Analysis (ICA), Partial Least Squares (PLS), and Local Linear Embedding (LLE). A summary and sources of the software are provided in the last section for each feature extraction method. PMID:25250315
Approaching the exa-scale: a real-world evaluation of rendering extremely large data sets
DOE Office of Scientific and Technical Information (OSTI.GOV)
Patchett, John M; Ahrens, James P; Lo, Li - Ta
2010-10-15
Extremely large scale analysis is becoming increasingly important as supercomputers and their simulations move from petascale to exascale. The lack of dedicated hardware acceleration for rendering on today's supercomputing platforms motivates our detailed evaluation of the possibility of interactive rendering on the supercomputer. In order to facilitate our understanding of rendering on the supercomputing platform, we focus on scalability of rendering algorithms and architecture envisioned for exascale datasets. To understand tradeoffs for dealing with extremely large datasets, we compare three different rendering algorithms for large polygonal data: software based ray tracing, software based rasterization and hardware accelerated rasterization. We presentmore » a case study of strong and weak scaling of rendering extremely large data on both GPU and CPU based parallel supercomputers using Para View, a parallel visualization tool. Wc use three different data sets: two synthetic and one from a scientific application. At an extreme scale, algorithmic rendering choices make a difference and should be considered while approaching exascale computing, visualization, and analysis. We find software based ray-tracing offers a viable approach for scalable rendering of the projected future massive data sizes.« less
Methods for Large-Scale Nonlinear Optimization.
1980-05-01
STANFORD, CALIFORNIA 94305 METHODS FOR LARGE-SCALE NONLINEAR OPTIMIZATION by Philip E. Gill, Waiter Murray, I Michael A. Saunden, and Masgaret H. Wright...typical iteration can be partitioned so that where B is an m X m basise matrix. This partition effectively divides the vari- ables into three classes... attention is given to the standard of the coding or the documentation. A much better way of obtaining mathematical software is from a software library
Towards Portable Large-Scale Image Processing with High-Performance Computing.
Huo, Yuankai; Blaber, Justin; Damon, Stephen M; Boyd, Brian D; Bao, Shunxing; Parvathaneni, Prasanna; Noguera, Camilo Bermudez; Chaganti, Shikha; Nath, Vishwesh; Greer, Jasmine M; Lyu, Ilwoo; French, William R; Newton, Allen T; Rogers, Baxter P; Landman, Bennett A
2018-05-03
High-throughput, large-scale medical image computing demands tight integration of high-performance computing (HPC) infrastructure for data storage, job distribution, and image processing. The Vanderbilt University Institute for Imaging Science (VUIIS) Center for Computational Imaging (CCI) has constructed a large-scale image storage and processing infrastructure that is composed of (1) a large-scale image database using the eXtensible Neuroimaging Archive Toolkit (XNAT), (2) a content-aware job scheduling platform using the Distributed Automation for XNAT pipeline automation tool (DAX), and (3) a wide variety of encapsulated image processing pipelines called "spiders." The VUIIS CCI medical image data storage and processing infrastructure have housed and processed nearly half-million medical image volumes with Vanderbilt Advanced Computing Center for Research and Education (ACCRE), which is the HPC facility at the Vanderbilt University. The initial deployment was natively deployed (i.e., direct installations on a bare-metal server) within the ACCRE hardware and software environments, which lead to issues of portability and sustainability. First, it could be laborious to deploy the entire VUIIS CCI medical image data storage and processing infrastructure to another HPC center with varying hardware infrastructure, library availability, and software permission policies. Second, the spiders were not developed in an isolated manner, which has led to software dependency issues during system upgrades or remote software installation. To address such issues, herein, we describe recent innovations using containerization techniques with XNAT/DAX which are used to isolate the VUIIS CCI medical image data storage and processing infrastructure from the underlying hardware and software environments. The newly presented XNAT/DAX solution has the following new features: (1) multi-level portability from system level to the application level, (2) flexible and dynamic software development and expansion, and (3) scalable spider deployment compatible with HPC clusters and local workstations.
Living Design Memory: Framework, Implementation, Lessons Learned.
ERIC Educational Resources Information Center
Terveen, Loren G.; And Others
1995-01-01
Discusses large-scale software development and describes the development of the Designer Assistant to improve software development effectiveness. Highlights include the knowledge management problem; related work, including artificial intelligence and expert systems, software process modeling research, and other approaches to organizational memory;…
A process improvement model for software verification and validation
NASA Technical Reports Server (NTRS)
Callahan, John; Sabolish, George
1994-01-01
We describe ongoing work at the NASA Independent Verification and Validation (IV&V) Facility to establish a process improvement model for software verification and validation (V&V) organizations. This model, similar to those used by some software development organizations, uses measurement-based techniques to identify problem areas and introduce incremental improvements. We seek to replicate this model for organizations involved in V&V on large-scale software development projects such as EOS and space station. At the IV&V Facility, a university research group and V&V contractors are working together to collect metrics across projects in order to determine the effectiveness of V&V and improve its application. Since V&V processes are intimately tied to development processes, this paper also examines the repercussions for development organizations in large-scale efforts.
A process improvement model for software verification and validation
NASA Technical Reports Server (NTRS)
Callahan, John; Sabolish, George
1994-01-01
We describe ongoing work at the NASA Independent Verification and Validation (IV&V) Facility to establish a process improvement model for software verification and validation (V&V) organizations. This model, similar to those used by some software development organizations, uses measurement-based techniques to identify problem areas and introduce incremental improvements. We seek to replicate this model for organizations involved in V&V on large-scale software development projects such as EOS and Space Station. At the IV&V Facility, a university research group and V&V contractors are working together to collect metrics across projects in order to determine the effectiveness of V&V and improve its application. Since V&V processes are intimately tied to development processes, this paper also examines the repercussions for development organizations in large-scale efforts.
Knight, Christopher G.; Knight, Sylvia H. E.; Massey, Neil; Aina, Tolu; Christensen, Carl; Frame, Dave J.; Kettleborough, Jamie A.; Martin, Andrew; Pascoe, Stephen; Sanderson, Ben; Stainforth, David A.; Allen, Myles R.
2007-01-01
In complex spatial models, as used to predict the climate response to greenhouse gas emissions, parameter variation within plausible bounds has major effects on model behavior of interest. Here, we present an unprecedentedly large ensemble of >57,000 climate model runs in which 10 parameters, initial conditions, hardware, and software used to run the model all have been varied. We relate information about the model runs to large-scale model behavior (equilibrium sensitivity of global mean temperature to a doubling of carbon dioxide). We demonstrate that effects of parameter, hardware, and software variation are detectable, complex, and interacting. However, we find most of the effects of parameter variation are caused by a small subset of parameters. Notably, the entrainment coefficient in clouds is associated with 30% of the variation seen in climate sensitivity, although both low and high values can give high climate sensitivity. We demonstrate that the effect of hardware and software is small relative to the effect of parameter variation and, over the wide range of systems tested, may be treated as equivalent to that caused by changes in initial conditions. We discuss the significance of these results in relation to the design and interpretation of climate modeling experiments and large-scale modeling more generally. PMID:17640921
Large Scale Software Building with CMake in ATLAS
NASA Astrophysics Data System (ADS)
Elmsheuser, J.; Krasznahorkay, A.; Obreshkov, E.; Undrus, A.; ATLAS Collaboration
2017-10-01
The offline software of the ATLAS experiment at the Large Hadron Collider (LHC) serves as the platform for detector data reconstruction, simulation and analysis. It is also used in the detector’s trigger system to select LHC collision events during data taking. The ATLAS offline software consists of several million lines of C++ and Python code organized in a modular design of more than 2000 specialized packages. Because of different workflows, many stable numbered releases are in parallel production use. To accommodate specific workflow requests, software patches with modified libraries are distributed on top of existing software releases on a daily basis. The different ATLAS software applications also require a flexible build system that strongly supports unit and integration tests. Within the last year this build system was migrated to CMake. A CMake configuration has been developed that allows one to easily set up and build the above mentioned software packages. This also makes it possible to develop and test new and modified packages on top of existing releases. The system also allows one to detect and execute partial rebuilds of the release based on single package changes. The build system makes use of CPack for building RPM packages out of the software releases, and CTest for running unit and integration tests. We report on the migration and integration of the ATLAS software to CMake and show working examples of this large scale project in production.
NASA Technical Reports Server (NTRS)
Aiken, Alexander
2001-01-01
The Scalable Analysis Toolkit (SAT) project aimed to demonstrate that it is feasible and useful to statically detect software bugs in very large systems. The technical focus of the project was on a relatively new class of constraint-based techniques for analysis software, where the desired facts about programs (e.g., the presence of a particular bug) are phrased as constraint problems to be solved. At the beginning of this project, the most successful forms of formal software analysis were limited forms of automatic theorem proving (as exemplified by the analyses used in language type systems and optimizing compilers), semi-automatic theorem proving for full verification, and model checking. With a few notable exceptions these approaches had not been demonstrated to scale to software systems of even 50,000 lines of code. Realistic approaches to large-scale software analysis cannot hope to make every conceivable formal method scale. Thus, the SAT approach is to mix different methods in one application by using coarse and fast but still adequate methods at the largest scales, and reserving the use of more precise but also more expensive methods at smaller scales for critical aspects (that is, aspects critical to the analysis problem under consideration) of a software system. The principled method proposed for combining a heterogeneous collection of formal systems with different scalability characteristics is mixed constraints. This idea had been used previously in small-scale applications with encouraging results: using mostly coarse methods and narrowly targeted precise methods, useful information (meaning the discovery of bugs in real programs) was obtained with excellent scalability.
Achieving Agility and Stability in Large-Scale Software Development
2013-01-16
temporary team is assigned to prepare layers and frameworks for future feature teams. Presentation Layer Domain Layer Data Access Layer...http://www.sei.cmu.edu/training/ elearning ~ Software Engineering Institute CarnegieMellon
Achieving Agility and Stability in Large-Scale Software Development
2013-01-16
temporary team is assigned to prepare layers and frameworks for future feature teams. Presentation Layer Domain Layer Data Access Layer Framework...http://www.sei.cmu.edu/training/ elearning ~ Software Engineering Institute CarnegieMellon
An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Taylor, Ronald C.
Bioinformatics researchers are increasingly confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBasemore » project are defined, and current bioinformatics software that employ Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date.« less
Pynamic: the Python Dynamic Benchmark
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, G L; Ahn, D H; de Supinksi, B R
2007-07-10
Python is widely used in scientific computing to facilitate application development and to support features such as computational steering. Making full use of some of Python's popular features, which improve programmer productivity, leads to applications that access extremely high numbers of dynamically linked libraries (DLLs). As a result, some important Python-based applications severely stress a system's dynamic linking and loading capabilities and also cause significant difficulties for most development environment tools, such as debuggers. Furthermore, using the Python paradigm for large scale MPI-based applications can create significant file IO and further stress tools and operating systems. In this paper, wemore » present Pynamic, the first benchmark program to support configurable emulation of a wide-range of the DLL usage of Python-based applications for large scale systems. Pynamic has already accurately reproduced system software and tool issues encountered by important large Python-based scientific applications on our supercomputers. Pynamic provided insight for our system software and tool vendors, and our application developers, into the impact of several design decisions. As we describe the Pynamic benchmark, we will highlight some of the issues discovered in our large scale system software and tools using Pynamic.« less
Challenges in Managing Trustworthy Large-scale Digital Science
NASA Astrophysics Data System (ADS)
Evans, B. J. K.
2017-12-01
The increased use of large-scale international digital science has opened a number of challenges for managing, handling, using and preserving scientific information. The large volumes of information are driven by three main categories - model outputs including coupled models and ensembles, data products that have been processing to a level of usability, and increasingly heuristically driven data analysis. These data products are increasingly the ones that are usable by the broad communities, and far in excess of the raw instruments data outputs. The data, software and workflows are then shared and replicated to allow broad use at an international scale, which places further demands of infrastructure to support how the information is managed reliably across distributed resources. Users necessarily rely on these underlying "black boxes" so that they are productive to produce new scientific outcomes. The software for these systems depend on computational infrastructure, software interconnected systems, and information capture systems. This ranges from the fundamentals of the reliability of the compute hardware, system software stacks and libraries, and the model software. Due to these complexities and capacity of the infrastructure, there is an increased emphasis of transparency of the approach and robustness of the methods over the full reproducibility. Furthermore, with large volume data management, it is increasingly difficult to store the historical versions of all model and derived data. Instead, the emphasis is on the ability to access the updated products and the reliability by which both previous outcomes are still relevant and can be updated for the new information. We will discuss these challenges and some of the approaches underway that are being used to address these issues.
Formal Verification of Large Software Systems
NASA Technical Reports Server (NTRS)
Yin, Xiang; Knight, John
2010-01-01
We introduce a scalable proof structure to facilitate formal verification of large software systems. In our approach, we mechanically synthesize an abstract specification from the software implementation, match its static operational structure to that of the original specification, and organize the proof as the conjunction of a series of lemmas about the specification structure. By setting up a different lemma for each distinct element and proving each lemma independently, we obtain the important benefit that the proof scales easily for large systems. We present details of the approach and an illustration of its application on a challenge problem from the security domain
What makes computational open source software libraries successful?
NASA Astrophysics Data System (ADS)
Bangerth, Wolfgang; Heister, Timo
2013-01-01
Software is the backbone of scientific computing. Yet, while we regularly publish detailed accounts about the results of scientific software, and while there is a general sense of which numerical methods work well, our community is largely unaware of best practices in writing the large-scale, open source scientific software upon which our discipline rests. This is particularly apparent in the commonly held view that writing successful software packages is largely the result of simply ‘being a good programmer’ when in fact there are many other factors involved, for example the social skill of community building. In this paper, we consider what we have found to be the necessary ingredients for successful scientific software projects and, in particular, for software libraries upon which the vast majority of scientific codes are built today. In particular, we discuss the roles of code, documentation, communities, project management and licenses. We also briefly comment on the impact on academic careers of engaging in software projects.
Chen, Wenjin; Wong, Chung; Vosburgh, Evan; Levine, Arnold J; Foran, David J; Xu, Eugenia Y
2014-07-08
The increasing number of applications of three-dimensional (3D) tumor spheroids as an in vitro model for drug discovery requires their adaptation to large-scale screening formats in every step of a drug screen, including large-scale image analysis. Currently there is no ready-to-use and free image analysis software to meet this large-scale format. Most existing methods involve manually drawing the length and width of the imaged 3D spheroids, which is a tedious and time-consuming process. This study presents a high-throughput image analysis software application - SpheroidSizer, which measures the major and minor axial length of the imaged 3D tumor spheroids automatically and accurately; calculates the volume of each individual 3D tumor spheroid; then outputs the results in two different forms in spreadsheets for easy manipulations in the subsequent data analysis. The main advantage of this software is its powerful image analysis application that is adapted for large numbers of images. It provides high-throughput computation and quality-control workflow. The estimated time to process 1,000 images is about 15 min on a minimally configured laptop, or around 1 min on a multi-core performance workstation. The graphical user interface (GUI) is also designed for easy quality control, and users can manually override the computer results. The key method used in this software is adapted from the active contour algorithm, also known as Snakes, which is especially suitable for images with uneven illumination and noisy background that often plagues automated imaging processing in high-throughput screens. The complimentary "Manual Initialize" and "Hand Draw" tools provide the flexibility to SpheroidSizer in dealing with various types of spheroids and diverse quality images. This high-throughput image analysis software remarkably reduces labor and speeds up the analysis process. Implementing this software is beneficial for 3D tumor spheroids to become a routine in vitro model for drug screens in industry and academia.
Architecting for Large Scale Agile Software Development: A Risk-Driven Approach
2013-05-01
addressed aspect of scale in agile software development. Practices such as Scrum of Scrums are meant to address orchestration of multiple development...owner, Scrum master) have differing responsibilities from the roles in the existing phase-based waterfall program structures. Such differences may... Scrum . Communication with both internal and external stakeholders must be open and documentation should not be used as a substitute for communication
QuickEval: a web application for psychometric scaling experiments
NASA Astrophysics Data System (ADS)
Van Ngo, Khai; Storvik, Jehans J.; Dokkeberg, Christopher A.; Farup, Ivar; Pedersen, Marius
2015-01-01
QuickEval is a web application for carrying out psychometric scaling experiments. It offers the possibility of running controlled experiments in a laboratory, or large scale experiment over the web for people all over the world. It is a unique one of a kind web application, and it is a software needed in the image quality field. It is also, to the best of knowledge, the first software that supports the three most common scaling methods; paired comparison, rank order, and category judgement. It is also the first software to support rank order. Hopefully, a side effect of this newly created software is that it will lower the threshold to perform psychometric experiments, improve the quality of the experiments being carried out, make it easier to reproduce experiments, and increase research on image quality both in academia and industry. The web application is available at www.colourlab.no/quickeval.
Evaluating English Language Teaching Software for Kids: Education or Entertainment or Both?
ERIC Educational Resources Information Center
Kazanci, Zekeriya; Okan, Zuhal
2009-01-01
The purpose of this study is to offer a critical consideration of instructional software designed particularly for children. Since the early 1990s computer applications integrating education with entertainment have been adopted on a large scale by both educators and parents. It is expected that through edutainment software the process of learning…
Large-Scale 3D Printing: The Way Forward
NASA Astrophysics Data System (ADS)
Jassmi, Hamad Al; Najjar, Fady Al; Ismail Mourad, Abdel-Hamid
2018-03-01
Research on small-scale 3D printing has rapidly evolved, where numerous industrial products have been tested and successfully applied. Nonetheless, research on large-scale 3D printing, directed to large-scale applications such as construction and automotive manufacturing, yet demands a great a great deal of efforts. Large-scale 3D printing is considered an interdisciplinary topic and requires establishing a blended knowledge base from numerous research fields including structural engineering, materials science, mechatronics, software engineering, artificial intelligence and architectural engineering. This review article summarizes key topics of relevance to new research trends on large-scale 3D printing, particularly pertaining (1) technological solutions of additive construction (i.e. the 3D printers themselves), (2) materials science challenges, and (3) new design opportunities.
Durbin, Kenneth R.; Tran, John C.; Zamdborg, Leonid; Sweet, Steve M. M.; Catherman, Adam D.; Lee, Ji Eun; Li, Mingxi; Kellie, John F.; Kelleher, Neil L.
2011-01-01
Applying high-throughput Top-Down MS to an entire proteome requires a yet-to-be-established model for data processing. Since Top-Down is becoming possible on a large scale, we report our latest software pipeline dedicated to capturing the full value of intact protein data in automated fashion. For intact mass detection, we combine algorithms for processing MS1 data from both isotopically resolved (FT) and charge-state resolved (ion trap) LC-MS data, which are then linked to their fragment ions for database searching using ProSight. Automated determination of human keratin and tubulin isoforms is one result. Optimized for the intricacies of whole proteins, new software modules visualize proteome-scale data based on the LC retention time and intensity of intact masses and enable selective detection of PTMs to automatically screen for acetylation, phosphorylation, and methylation. Software functionality was demonstrated using comparative LC-MS data from yeast strains in addition to human cells undergoing chemical stress. We further these advances as a key aspect of realizing Top-Down MS on a proteomic scale. PMID:20848673
Software environment for implementing engineering applications on MIMD computers
NASA Technical Reports Server (NTRS)
Lopez, L. A.; Valimohamed, K. A.; Schiff, S.
1990-01-01
In this paper the concept for a software environment for developing engineering application systems for multiprocessor hardware (MIMD) is presented. The philosophy employed is to solve the largest problems possible in a reasonable amount of time, rather than solve existing problems faster. In the proposed environment most of the problems concerning parallel computation and handling of large distributed data spaces are hidden from the application program developer, thereby facilitating the development of large-scale software applications. Applications developed under the environment can be executed on a variety of MIMD hardware; it protects the application software from the effects of a rapidly changing MIMD hardware technology.
NASA Technical Reports Server (NTRS)
Mckay, Charles W.; Feagin, Terry; Bishop, Peter C.; Hallum, Cecil R.; Freedman, Glenn B.
1987-01-01
The principle focus of one of the RICIS (Research Institute for Computing and Information Systems) components is computer systems and software engineering in-the-large of the lifecycle of large, complex, distributed systems which: (1) evolve incrementally over a long time; (2) contain non-stop components; and (3) must simultaneously satisfy a prioritized balance of mission and safety critical requirements at run time. This focus is extremely important because of the contribution of the scaling direction problem to the current software crisis. The Computer Systems and Software Engineering (CSSE) component addresses the lifestyle issues of three environments: host, integration, and target.
Final Technical Report - Center for Technology for Advanced Scientific Component Software (TASCS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sussman, Alan
2014-10-21
This is a final technical report for the University of Maryland work in the SciDAC Center for Technology for Advanced Scientific Component Software (TASCS). The Maryland work focused on software tools for coupling parallel software components built using the Common Component Architecture (CCA) APIs. Those tools are based on the Maryland InterComm software framework that has been used in multiple computational science applications to build large-scale simulations of complex physical systems that employ multiple separately developed codes.
Implementation of highly parallel and large scale GW calculations within the OpenAtom software
NASA Astrophysics Data System (ADS)
Ismail-Beigi, Sohrab
The need to describe electronic excitations with better accuracy than provided by band structures produced by Density Functional Theory (DFT) has been a long-term enterprise for the computational condensed matter and materials theory communities. In some cases, appropriate theoretical frameworks have existed for some time but have been difficult to apply widely due to computational cost. For example, the GW approximation incorporates a great deal of important non-local and dynamical electronic interaction effects but has been too computationally expensive for routine use in large materials simulations. OpenAtom is an open source massively parallel ab initiodensity functional software package based on plane waves and pseudopotentials (http://charm.cs.uiuc.edu/OpenAtom/) that takes advantage of the Charm + + parallel framework. At present, it is developed via a three-way collaboration, funded by an NSF SI2-SSI grant (ACI-1339804), between Yale (Ismail-Beigi), IBM T. J. Watson (Glenn Martyna) and the University of Illinois at Urbana Champaign (Laxmikant Kale). We will describe the project and our current approach towards implementing large scale GW calculations with OpenAtom. Potential applications of large scale parallel GW software for problems involving electronic excitations in semiconductor and/or metal oxide systems will be also be pointed out.
Browndye: A Software Package for Brownian Dynamics
McCammon, J. Andrew
2010-01-01
A new software package, Browndye, is presented for simulating the diffusional encounter of two large biological molecules. It can be used to estimate second-order rate constants and encounter probabilities, and to explore reaction trajectories. Browndye builds upon previous knowledge and algorithms from software packages such as UHBD, SDA, and Macrodox, while implementing algorithms that scale to larger systems. PMID:21132109
CrossTalk. The Journal of Defense Software Engineering. Volume 13, Number 6, June 2000
2000-06-01
Techniques for Efficiently Generating and Testing Software This paper presents a proven process that uses advanced tools to design, develop and test... optimal software. by Keith R. Wegner Large Software Systems—Back to Basics Development methods that work on small problems seem to not scale well to...Ability Requirements for Teamwork: Implications for Human Resource Management, Journal of Management, Vol. 20, No. 2, 1994. 11. Ferguson, Pat, Watts S
NASA Technical Reports Server (NTRS)
Talbot, Bryan; Zhou, Shu-Jia; Higgins, Glenn; Zukor, Dorothy (Technical Monitor)
2002-01-01
One of the most significant challenges in large-scale climate modeling, as well as in high-performance computing in other scientific fields, is that of effectively integrating many software models from multiple contributors. A software framework facilitates the integration task, both in the development and runtime stages of the simulation. Effective software frameworks reduce the programming burden for the investigators, freeing them to focus more on the science and less on the parallel communication implementation. while maintaining high performance across numerous supercomputer and workstation architectures. This document surveys numerous software frameworks for potential use in Earth science modeling. Several frameworks are evaluated in depth, including Parallel Object-Oriented Methods and Applications (POOMA), Cactus (from (he relativistic physics community), Overture, Goddard Earth Modeling System (GEMS), the National Center for Atmospheric Research Flux Coupler, and UCLA/UCB Distributed Data Broker (DDB). Frameworks evaluated in less detail include ROOT, Parallel Application Workspace (PAWS), and Advanced Large-Scale Integrated Computational Environment (ALICE). A host of other frameworks and related tools are referenced in this context. The frameworks are evaluated individually and also compared with each other.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Trujillo, Angelina Michelle
Strategy, Planning, Acquiring- very large scale computing platforms come and go and planning for immensely scalable machines often precedes actual procurement by 3 years. Procurement can be another year or more. Integration- After Acquisition, machines must be integrated into the computing environments at LANL. Connection to scalable storage via large scale storage networking, assuring correct and secure operations. Management and Utilization – Ongoing operations, maintenance, and trouble shooting of the hardware and systems software at massive scale is required.
The Assignment of Scale to Object-Oriented Software Measures
NASA Technical Reports Server (NTRS)
Neal, Ralph D.; Weistroffer, H. Roland; Coppins, Richard J.
1997-01-01
In order to improve productivity (and quality), measurement of specific aspects of software has become imperative. As object oriented programming languages have become more widely used, metrics designed specifically for object-oriented software are required. Recently a large number of new metrics for object- oriented software has appeared in the literature. Unfortunately, many of these proposed metrics have not been validated to measure what they purport to measure. In this paper fifty (50) of these metrics are analyzed.
Big Software for Big Data: Scaling Up Photometry for LSST (Abstract)
NASA Astrophysics Data System (ADS)
Rawls, M.
2017-06-01
(Abstract only) The Large Synoptic Survey Telescope (LSST) will capture mosaics of the sky every few nights, each containing more data than your computer's hard drive can store. As a result, the software to process these images is as critical to the science as the telescope and the camera. I discuss the algorithms and software being developed by the LSST Data Management team to handle such a large volume of data. All of our work is open source and available to the community. Once LSST comes online, our software will produce catalogs of objects and a stream of alerts. These will bring exciting new opportunities for follow-up observations and collaborations with LSST scientists.
Coordination in Large Scale Software Development
1990-01-01
toward achieving common and explicitly recognized goals" (Blau and Scott, 1962) and "the integration or linking together of different parts of an...require a strong degree of integration of its components. Much software is built of thousands of modules that must mesh with each other perfectly for the...coordination between subgroups producing software modules could lead to failure in integrating the modules themselves. Informal communication. Both
Large-scale visualization projects for teaching software engineering.
Müller, Christoph; Reina, Guido; Burch, Michael; Weiskopf, Daniel
2012-01-01
The University of Stuttgart's software engineering major complements the traditional computer science major with more practice-oriented education. Two-semester software projects in various application areas offered by the university's different computer science institutes are a successful building block in the curriculum. With this realistic, complex project setting, students experience the practice of software engineering, including software development processes, technologies, and soft skills. In particular, visualization-based projects are popular with students. Such projects offer them the opportunity to gain profound knowledge that would hardly be possible with only regular lectures and homework assignments.
Instructional Support Software System. Final Report.
ERIC Educational Resources Information Center
McDonnell Douglas Astronautics Co. - East, St. Louis, MO.
This report describes the development of the Instructional Support System (ISS), a large-scale, computer-based training system that supports both computer-assisted instruction and computer-managed instruction. Written in the Ada programming language, the ISS software package is designed to be machine independent. It is also grouped into functional…
Discovering and Mitigating Software Vulnerabilities through Large-Scale Collaboration
ERIC Educational Resources Information Center
Zhao, Mingyi
2016-01-01
In today's rapidly digitizing society, people place their trust in a wide range of digital services and systems that deliver latest news, process financial transactions, store sensitive information, etc. However, this trust does not have a solid foundation, because software code that supports this digital world has security vulnerabilities. These…
Facilitating Internet-Scale Code Retrieval
ERIC Educational Resources Information Center
Bajracharya, Sushil Krishna
2010-01-01
Internet-Scale code retrieval deals with the representation, storage, and access of relevant source code from a large amount of source code available on the Internet. Internet-Scale code retrieval systems support common emerging practices among software developers related to finding and reusing source code. In this dissertation we focus on some…
NASA Technical Reports Server (NTRS)
Nguyen, D. T.; Watson, Willie R. (Technical Monitor)
2005-01-01
The overall objectives of this research work are to formulate and validate efficient parallel algorithms, and to efficiently design/implement computer software for solving large-scale acoustic problems, arised from the unified frameworks of the finite element procedures. The adopted parallel Finite Element (FE) Domain Decomposition (DD) procedures should fully take advantages of multiple processing capabilities offered by most modern high performance computing platforms for efficient parallel computation. To achieve this objective. the formulation needs to integrate efficient sparse (and dense) assembly techniques, hybrid (or mixed) direct and iterative equation solvers, proper pre-conditioned strategies, unrolling strategies, and effective processors' communicating schemes. Finally, the numerical performance of the developed parallel finite element procedures will be evaluated by solving series of structural, and acoustic (symmetrical and un-symmetrical) problems (in different computing platforms). Comparisons with existing "commercialized" and/or "public domain" software are also included, whenever possible.
Large - scale Rectangular Ruler Automated Verification Device
NASA Astrophysics Data System (ADS)
Chen, Hao; Chang, Luping; Xing, Minjian; Xie, Xie
2018-03-01
This paper introduces a large-scale rectangular ruler automated verification device, which consists of photoelectric autocollimator and self-designed mechanical drive car and data automatic acquisition system. The design of mechanical structure part of the device refer to optical axis design, drive part, fixture device and wheel design. The design of control system of the device refer to hardware design and software design, and the hardware mainly uses singlechip system, and the software design is the process of the photoelectric autocollimator and the automatic data acquisition process. This devices can automated achieve vertical measurement data. The reliability of the device is verified by experimental comparison. The conclusion meets the requirement of the right angle test procedure.
The large-scale structure of software-intensive systems
Booch, Grady
2012-01-01
The computer metaphor is dominant in most discussions of neuroscience, but the semantics attached to that metaphor are often quite naive. Herein, we examine the ontology of software-intensive systems, the nature of their structure and the application of the computer metaphor to the metaphysical questions of self and causation. PMID:23386964
A microkernel design for component-based parallel numerical software systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Balay, S.
1999-01-13
What is the minimal software infrastructure and what type of conventions are needed to simplify development of sophisticated parallel numerical application codes using a variety of software components that are not necessarily available as source code? We propose an opaque object-based model where the objects are dynamically loadable from the file system or network. The microkernel required to manage such a system needs to include, at most: (1) a few basic services, namely--a mechanism for loading objects at run time via dynamic link libraries, and consistent schemes for error handling and memory management; and (2) selected methods that all objectsmore » share, to deal with object life (destruction, reference counting, relationships), and object observation (viewing, profiling, tracing). We are experimenting with these ideas in the context of extensible numerical software within the ALICE (Advanced Large-scale Integrated Computational Environment) project, where we are building the microkernel to manage the interoperability among various tools for large-scale scientific simulations. This paper presents some preliminary observations and conclusions from our work with microkernel design.« less
Architecture independent environment for developing engineering software on MIMD computers
NASA Technical Reports Server (NTRS)
Valimohamed, Karim A.; Lopez, L. A.
1990-01-01
Engineers are constantly faced with solving problems of increasing complexity and detail. Multiple Instruction stream Multiple Data stream (MIMD) computers have been developed to overcome the performance limitations of serial computers. The hardware architectures of MIMD computers vary considerably and are much more sophisticated than serial computers. Developing large scale software for a variety of MIMD computers is difficult and expensive. There is a need to provide tools that facilitate programming these machines. First, the issues that must be considered to develop those tools are examined. The two main areas of concern were architecture independence and data management. Architecture independent software facilitates software portability and improves the longevity and utility of the software product. It provides some form of insurance for the investment of time and effort that goes into developing the software. The management of data is a crucial aspect of solving large engineering problems. It must be considered in light of the new hardware organizations that are available. Second, the functional design and implementation of a software environment that facilitates developing architecture independent software for large engineering applications are described. The topics of discussion include: a description of the model that supports the development of architecture independent software; identifying and exploiting concurrency within the application program; data coherence; engineering data base and memory management.
2004-10-01
MONITORING AGENCY NAME(S) AND ADDRESS(ES) Defense Advanced Research Projects Agency AFRL/IFTC 3701 North Fairfax Drive...Scalable Parallel Libraries for Large-Scale Concurrent Applications," Technical Report UCRL -JC-109251, Lawrence Livermore National Laboratory
Investigations on the Bundle Adjustment Results from Sfm-Based Software for Mapping Purposes
NASA Astrophysics Data System (ADS)
Lumban-Gaol, Y. A.; Murtiyoso, A.; Nugroho, B. H.
2018-05-01
Since its first inception, aerial photography has been used for topographic mapping. Large-scale aerial photography contributed to the creation of many of the topographic maps around the world. In Indonesia, a 2013 government directive on spatial management has re-stressed the need for topographic maps, with aerial photogrammetry providing the main method of acquisition. However, the large need to generate such maps is often limited by budgetary reasons. Today, SfM (Structure-from-Motion) offers quicker and less expensive solutions to this problem. However, considering the required precision for topographic missions, these solutions need to be assessed to see if they provide enough level of accuracy. In this paper, a popular SfM-based software Agisoft PhotoScan is used to perform bundle adjustment on a set of large-scale aerial images. The aim of the paper is to compare its bundle adjustment results with those generated by more classical photogrammetric software, namely Trimble Inpho and ERDAS IMAGINE. Furthermore, in order to provide more bundle adjustment statistics to be compared, the Damped Bundle Adjustment Toolbox (DBAT) was also used to reprocess the PhotoScan project. Results show that PhotoScan results are less stable than those generated by the two photogrammetric software programmes. This translates to lower accuracy, which may impact the final photogrammetric product.
Computer-aided software development process design
NASA Technical Reports Server (NTRS)
Lin, Chi Y.; Levary, Reuven R.
1989-01-01
The authors describe an intelligent tool designed to aid managers of software development projects in planning, managing, and controlling the development process of medium- to large-scale software projects. Its purpose is to reduce uncertainties in the budget, personnel, and schedule planning of software development projects. It is based on dynamic model for the software development and maintenance life-cycle process. This dynamic process is composed of a number of time-varying, interacting developmental phases, each characterized by its intended functions and requirements. System dynamics is used as a modeling methodology. The resulting Software LIfe-Cycle Simulator (SLICS) and the hybrid expert simulation system of which it is a subsystem are described.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gamblin, T.
2014-08-29
Large-scale systems like Sequoia allow running small numbers of very large (1M+ process) jobs, but their resource managers and schedulers do not allow large numbers of small (4, 8, 16, etc.) process jobs to run efficiently. Cram is a tool that allows users to launch many small MPI jobs within one large partition, and to overcome the limitations of current resource management software for large ensembles of jobs.
Virtual Systems Pharmacology (ViSP) software for simulation from mechanistic systems-level models.
Ermakov, Sergey; Forster, Peter; Pagidala, Jyotsna; Miladinov, Marko; Wang, Albert; Baillie, Rebecca; Bartlett, Derek; Reed, Mike; Leil, Tarek A
2014-01-01
Multiple software programs are available for designing and running large scale system-level pharmacology models used in the drug development process. Depending on the problem, scientists may be forced to use several modeling tools that could increase model development time, IT costs and so on. Therefore, it is desirable to have a single platform that allows setting up and running large-scale simulations for the models that have been developed with different modeling tools. We developed a workflow and a software platform in which a model file is compiled into a self-contained executable that is no longer dependent on the software that was used to create the model. At the same time the full model specifics is preserved by presenting all model parameters as input parameters for the executable. This platform was implemented as a model agnostic, therapeutic area agnostic and web-based application with a database back-end that can be used to configure, manage and execute large-scale simulations for multiple models by multiple users. The user interface is designed to be easily configurable to reflect the specifics of the model and the user's particular needs and the back-end database has been implemented to store and manage all aspects of the systems, such as Models, Virtual Patients, User Interface Settings, and Results. The platform can be adapted and deployed on an existing cluster or cloud computing environment. Its use was demonstrated with a metabolic disease systems pharmacology model that simulates the effects of two antidiabetic drugs, metformin and fasiglifam, in type 2 diabetes mellitus patients.
Virtual Systems Pharmacology (ViSP) software for simulation from mechanistic systems-level models
Ermakov, Sergey; Forster, Peter; Pagidala, Jyotsna; Miladinov, Marko; Wang, Albert; Baillie, Rebecca; Bartlett, Derek; Reed, Mike; Leil, Tarek A.
2014-01-01
Multiple software programs are available for designing and running large scale system-level pharmacology models used in the drug development process. Depending on the problem, scientists may be forced to use several modeling tools that could increase model development time, IT costs and so on. Therefore, it is desirable to have a single platform that allows setting up and running large-scale simulations for the models that have been developed with different modeling tools. We developed a workflow and a software platform in which a model file is compiled into a self-contained executable that is no longer dependent on the software that was used to create the model. At the same time the full model specifics is preserved by presenting all model parameters as input parameters for the executable. This platform was implemented as a model agnostic, therapeutic area agnostic and web-based application with a database back-end that can be used to configure, manage and execute large-scale simulations for multiple models by multiple users. The user interface is designed to be easily configurable to reflect the specifics of the model and the user's particular needs and the back-end database has been implemented to store and manage all aspects of the systems, such as Models, Virtual Patients, User Interface Settings, and Results. The platform can be adapted and deployed on an existing cluster or cloud computing environment. Its use was demonstrated with a metabolic disease systems pharmacology model that simulates the effects of two antidiabetic drugs, metformin and fasiglifam, in type 2 diabetes mellitus patients. PMID:25374542
A CCD experimental platform for large telescope in Antarctica based on FPGA
NASA Astrophysics Data System (ADS)
Zhu, Yuhua; Qi, Yongjun
2014-07-01
The CCD , as a detector , is one of the important components of astronomical telescopes. For a large telescope in Antarctica, a set of CCD detector system with large size, high sensitivity and low noise is indispensable. Because of the extremely low temperatures and unattended, system maintenance and software and hardware upgrade become hard problems. This paper introduces a general CCD controller experiment platform, using Field programmable gate array FPGA, which is, in fact, a large-scale field reconfigurable array. Taking the advantage of convenience to modify the system, construction of driving circuit, digital signal processing module, network communication interface, control algorithm validation, and remote reconfigurable module may realize. With the concept of integrated hardware and software, the paper discusses the key technology of building scientific CCD system suitable for the special work environment in Antarctica, focusing on the method of remote reconfiguration for controller via network and then offering a feasible hardware and software solution.
Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Younge, Andrew J.; Pedretti, Kevin; Grant, Ryan
While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component to large HPC simulations, but also as standalone scientific tools for knowledge discovery. With the path towards Exascale, new HPC runtime systems are also emerging in a way that differs from classical distributed com- puting models. However, system software for such capabilities on the latest extreme-scale DOE supercomputing needs to be enhanced to more appropriately support these types of emerging soft- ware ecosystems. In thismore » paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifi- cally, we have deployed the KVM hypervisor within Cray's Compute Node Linux on a XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer. We investigate the overhead our solution using HPC benchmarks, both evaluating single-node performance as well as weak scaling of a 32-node virtual cluster. Overall, we find single node performance of our solution using KVM on a Cray is very efficient with near-native performance. However overhead increases by up to 20% as virtual cluster size increases, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, ef- fectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.« less
Chiang, Michael F; Starren, Justin B
2002-01-01
The successful implementation of clinical information systems is difficult. In examining the reasons and potential solutions for this problem, the medical informatics community may benefit from the lessons of a rich body of software engineering and management literature about the failure of software projects. Based on previous studies, we present a conceptual framework for understanding the risk factors associated with large-scale projects. However, the vast majority of existing literature is based on large, enterprise-wide systems, and it unclear whether those results may be scaled down and applied to smaller projects such as departmental medical information systems. To examine this issue, we discuss the case study of a delayed electronic medical record implementation project in a small specialty practice at Columbia-Presbyterian Medical Center. While the factors contributing to the delay of this small project share some attributes with those found in larger organizations, there are important differences. The significance of these differences for groups implementing small medical information systems is discussed.
Piccinini, Filippo; Balassa, Tamas; Szkalisity, Abel; Molnar, Csaba; Paavolainen, Lassi; Kujala, Kaisa; Buzas, Krisztina; Sarazova, Marie; Pietiainen, Vilja; Kutay, Ulrike; Smith, Kevin; Horvath, Peter
2017-06-28
High-content, imaging-based screens now routinely generate data on a scale that precludes manual verification and interrogation. Software applying machine learning has become an essential tool to automate analysis, but these methods require annotated examples to learn from. Efficiently exploring large datasets to find relevant examples remains a challenging bottleneck. Here, we present Advanced Cell Classifier (ACC), a graphical software package for phenotypic analysis that addresses these difficulties. ACC applies machine-learning and image-analysis methods to high-content data generated by large-scale, cell-based experiments. It features methods to mine microscopic image data, discover new phenotypes, and improve recognition performance. We demonstrate that these features substantially expedite the training process, successfully uncover rare phenotypes, and improve the accuracy of the analysis. ACC is extensively documented, designed to be user-friendly for researchers without machine-learning expertise, and distributed as a free open-source tool at www.cellclassifier.org. Copyright © 2017 Elsevier Inc. All rights reserved.
DOT National Transportation Integrated Search
2013-01-01
The simulator was once a very expensive, large-scale mechanical device for training military pilots or astronauts. Modern computers, linking sophisticated software and large-screen displays, have yielded simulators for the desktop or configured as sm...
TomoMiner and TomoMinerCloud: A software platform for large-scale subtomogram structural analysis
Frazier, Zachary; Xu, Min; Alber, Frank
2017-01-01
SUMMARY Cryo-electron tomography (cryoET) captures the 3D electron density distribution of macromolecular complexes in close to native state. With the rapid advance of cryoET acquisition technologies, it is possible to generate large numbers (>100,000) of subtomograms, each containing a macromolecular complex. Often, these subtomograms represent a heterogeneous sample due to variations in structure and composition of a complex in situ form or because particles are a mixture of different complexes. In this case subtomograms must be classified. However, classification of large numbers of subtomograms is a time-intensive task and often a limiting bottleneck. This paper introduces an open source software platform, TomoMiner, for large-scale subtomogram classification, template matching, subtomogram averaging, and alignment. Its scalable and robust parallel processing allows efficient classification of tens to hundreds of thousands of subtomograms. Additionally, TomoMiner provides a pre-configured TomoMinerCloud computing service permitting users without sufficient computing resources instant access to TomoMiners high-performance features. PMID:28552576
Design Optimization Toolkit: Users' Manual
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aguilo Valentin, Miguel Alejandro
The Design Optimization Toolkit (DOTk) is a stand-alone C++ software package intended to solve complex design optimization problems. DOTk software package provides a range of solution methods that are suited for gradient/nongradient-based optimization, large scale constrained optimization, and topology optimization. DOTk was design to have a flexible user interface to allow easy access to DOTk solution methods from external engineering software packages. This inherent flexibility makes DOTk barely intrusive to other engineering software packages. As part of this inherent flexibility, DOTk software package provides an easy-to-use MATLAB interface that enables users to call DOTk solution methods directly from the MATLABmore » command window.« less
NASA Astrophysics Data System (ADS)
Gerke, Kirill M.; Vasilyev, Roman V.; Khirevich, Siarhei; Collins, Daniel; Karsanina, Marina V.; Sizonenko, Timofey O.; Korost, Dmitry V.; Lamontagne, Sébastien; Mallants, Dirk
2018-05-01
Permeability is one of the fundamental properties of porous media and is required for large-scale Darcian fluid flow and mass transport models. Whilst permeability can be measured directly at a range of scales, there are increasing opportunities to evaluate permeability from pore-scale fluid flow simulations. We introduce the free software Finite-Difference Method Stokes Solver (FDMSS) that solves Stokes equation using a finite-difference method (FDM) directly on voxelized 3D pore geometries (i.e. without meshing). Based on explicit convergence studies, validation on sphere packings with analytically known permeabilities, and comparison against lattice-Boltzmann and other published FDM studies, we conclude that FDMSS provides a computationally efficient and accurate basis for single-phase pore-scale flow simulations. By implementing an efficient parallelization and code optimization scheme, permeability inferences can now be made from 3D images of up to 109 voxels using modern desktop computers. Case studies demonstrate the broad applicability of the FDMSS software for both natural and artificial porous media.
Technology Infusion of CodeSonar into the Space Network Ground Segment
NASA Technical Reports Server (NTRS)
Benson, Markland J.
2009-01-01
This slide presentation reviews the applicability of CodeSonar to the Space Network software. CodeSonar is a commercial off the shelf system that analyzes programs written in C, C++ or Ada for defects in the code. Software engineers use CodeSonar results as an input to the existing source code inspection process. The study is focused on large scale software developed using formal processes. The systems studied are mission critical in nature but some use commodity computer systems.
NASA Technical Reports Server (NTRS)
Church, Victor E.; Long, D.; Hartenstein, Ray; Perez-Davila, Alfredo
1992-01-01
A set of functional requirements for software configuration management (CM) and metrics reporting for Space Station Freedom ground systems software are described. This report is one of a series from a study of the interfaces among the Ground Systems Development Environment (GSDE), the development systems for the Space Station Training Facility (SSTF) and the Space Station Control Center (SSCC), and the target systems for SSCC and SSTF. The focus is on the CM of the software following delivery to NASA and on the software metrics that relate to the quality and maintainability of the delivered software. The CM and metrics requirements address specific problems that occur in large-scale software development. Mechanisms to assist in the continuing improvement of mission operations software development are described.
Hong S. He; Wei Li; Brian R. Sturtevant; Jian Yang; Bo Z. Shang; Eric J. Gustafson; David J. Mladenoff
2005-01-01
LANDIS 4.0 is new-generation software that simulates forest landscape change over large spatial and temporal scales. It is used to explore how disturbances, succession, and management interact to determine forest composition and pattern. Also describes software architecture, model assumptions and provides detailed instructions on the use of the model.
Martín-Campos, Trinidad; Mylonas, Roman; Masselot, Alexandre; Waridel, Patrice; Petricevic, Tanja; Xenarios, Ioannis; Quadroni, Manfredo
2017-08-04
Mass spectrometry (MS) has become the tool of choice for the large scale identification and quantitation of proteins and their post-translational modifications (PTMs). This development has been enabled by powerful software packages for the automated analysis of MS data. While data on PTMs of thousands of proteins can nowadays be readily obtained, fully deciphering the complexity and combinatorics of modification patterns even on a single protein often remains challenging. Moreover, functional investigation of PTMs on a protein of interest requires validation of the localization and the accurate quantitation of its changes across several conditions, tasks that often still require human evaluation. Software tools for large scale analyses are highly efficient but are rarely conceived for interactive, in-depth exploration of data on individual proteins. We here describe MsViz, a web-based and interactive software tool that supports manual validation of PTMs and their relative quantitation in small- and medium-size experiments. The tool displays sequence coverage information, peptide-spectrum matches, tandem MS spectra and extracted ion chromatograms through a single, highly intuitive interface. We found that MsViz greatly facilitates manual data inspection to validate PTM location and quantitate modified species across multiple samples.
Using Computing and Data Grids for Large-Scale Science and Engineering
NASA Technical Reports Server (NTRS)
Johnston, William E.
2001-01-01
We use the term "Grid" to refer to a software system that provides uniform and location independent access to geographically and organizationally dispersed, heterogeneous resources that are persistent and supported. These emerging data and computing Grids promise to provide a highly capable and scalable environment for addressing large-scale science problems. We describe the requirements for science Grids, the resulting services and architecture of NASA's Information Power Grid (IPG) and DOE's Science Grid, and some of the scaling issues that have come up in their implementation.
Optical interconnect for large-scale systems
NASA Astrophysics Data System (ADS)
Dress, William
2013-02-01
This paper presents a switchless, optical interconnect module that serves as a node in a network of identical distribution modules for large-scale systems. Thousands to millions of hosts or endpoints may be interconnected by a network of such modules, avoiding the need for multi-level switches. Several common network topologies are reviewed and their scaling properties assessed. The concept of message-flow routing is discussed in conjunction with the unique properties enabled by the optical distribution module where it is shown how top-down software control (global routing tables, spanning-tree algorithms) may be avoided.
Scalable Analysis Methods and In Situ Infrastructure for Extreme Scale Knowledge Discovery
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bethel, Wes
2016-07-24
The primary challenge motivating this team’s work is the widening gap between the ability to compute information and to store it for subsequent analysis. This gap adversely impacts science code teams, who are able to perform analysis only on a small fraction of the data they compute, resulting in the very real likelihood of lost or missed science, when results are computed but not analyzed. Our approach is to perform as much analysis or visualization processing on data while it is still resident in memory, an approach that is known as in situ processing. The idea in situ processing wasmore » not new at the time of the start of this effort in 2014, but efforts in that space were largely ad hoc, and there was no concerted effort within the research community that aimed to foster production-quality software tools suitable for use by DOE science projects. In large, our objective was produce and enable use of production-quality in situ methods and infrastructure, at scale, on DOE HPC facilities, though we expected to have impact beyond DOE due to the widespread nature of the challenges, which affect virtually all large-scale computational science efforts. To achieve that objective, we assembled a unique team of researchers consisting of representatives from DOE national laboratories, academia, and industry, and engaged in software technology R&D, as well as engaged in close partnerships with DOE science code teams, to produce software technologies that were shown to run effectively at scale on DOE HPC platforms.« less
Cloud-enabled large-scale land surface model simulations with the NASA Land Information System
NASA Astrophysics Data System (ADS)
Duffy, D.; Vaughan, G.; Clark, M. P.; Peters-Lidard, C. D.; Nijssen, B.; Nearing, G. S.; Rheingrover, S.; Kumar, S.; Geiger, J. V.
2017-12-01
Developed by the Hydrological Sciences Laboratory at NASA Goddard Space Flight Center (GSFC), the Land Information System (LIS) is a high-performance software framework for terrestrial hydrology modeling and data assimilation. LIS provides the ability to integrate satellite and ground-based observational products and advanced modeling algorithms to extract land surface states and fluxes. Through a partnership with the National Center for Atmospheric Research (NCAR) and the University of Washington, the LIS model is currently being extended to include the Structure for Unifying Multiple Modeling Alternatives (SUMMA). With the addition of SUMMA in LIS, meaningful simulations containing a large multi-model ensemble will be enabled and can provide advanced probabilistic continental-domain modeling capabilities at spatial scales relevant for water managers. The resulting LIS/SUMMA application framework is difficult for non-experts to install due to the large amount of dependencies on specific versions of operating systems, libraries, and compilers. This has created a significant barrier to entry for domain scientists that are interested in using the software on their own systems or in the cloud. In addition, the requirement to support multiple run time environments across the LIS community has created a significant burden on the NASA team. To overcome these challenges, LIS/SUMMA has been deployed using Linux containers, which allows for an entire software package along with all dependences to be installed within a working runtime environment, and Kubernetes, which orchestrates the deployment of a cluster of containers. Within a cloud environment, users can now easily create a cluster of virtual machines and run large-scale LIS/SUMMA simulations. Installations that have taken weeks and months can now be performed in minutes of time. This presentation will discuss the steps required to create a cloud-enabled large-scale simulation, present examples of its use, and describe the potential deployment of this information technology with other NASA applications.
DOE Office of Scientific and Technical Information (OSTI.GOV)
2012-09-25
The Megatux platform enables the emulation of large scale (multi-million node) distributed systems. In particular, it allows for the emulation of large-scale networks interconnecting a very large number of emulated computer systems. It does this by leveraging virtualization and associated technologies to allow hundreds of virtual computers to be hosted on a single moderately sized server or workstation. Virtualization technology provided by modern processors allows for multiple guest OSs to run at the same time, sharing the hardware resources. The Megatux platform can be deployed on a single PC, a small cluster of a few boxes or a large clustermore » of computers. With a modest cluster, the Megatux platform can emulate complex organizational networks. By using virtualization, we emulate the hardware, but run actual software enabling large scale without sacrificing fidelity.« less
RT-Syn: A real-time software system generator
NASA Technical Reports Server (NTRS)
Setliff, Dorothy E.
1992-01-01
This paper presents research into providing highly reusable and maintainable components by using automatic software synthesis techniques. This proposal uses domain knowledge combined with automatic software synthesis techniques to engineer large-scale mission-critical real-time software. The hypothesis centers on a software synthesis architecture that specifically incorporates application-specific (in this case real-time) knowledge. This architecture synthesizes complex system software to meet a behavioral specification and external interaction design constraints. Some examples of these external constraints are communication protocols, precisions, timing, and space limitations. The incorporation of application-specific knowledge facilitates the generation of mathematical software metrics which are used to narrow the design space, thereby making software synthesis tractable. Success has the potential to dramatically reduce mission-critical system life-cycle costs not only by reducing development time, but more importantly facilitating maintenance, modifications, and extensions of complex mission-critical software systems, which are currently dominating life cycle costs.
Architectural Visualization of C/C++ Source Code for Program Comprehension
DOE Office of Scientific and Technical Information (OSTI.GOV)
Panas, T; Epperly, T W; Quinlan, D
2006-09-01
Structural and behavioral visualization of large-scale legacy systems to aid program comprehension is still a major challenge. The challenge is even greater when applications are implemented in flexible and expressive languages such as C and C++. In this paper, we consider visualization of static and dynamic aspects of large-scale scientific C/C++ applications. For our investigation, we reuse and integrate specialized analysis and visualization tools. Furthermore, we present a novel layout algorithm that permits a compressive architectural view of a large-scale software system. Our layout is unique in that it allows traditional program visualizations, i.e., graph structures, to be seen inmore » relation to the application's file structure.« less
Substructured multibody molecular dynamics.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grest, Gary Stephen; Stevens, Mark Jackson; Plimpton, Steven James
2006-11-01
We have enhanced our parallel molecular dynamics (MD) simulation software LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator, lammps.sandia.gov) to include many new features for accelerated simulation including articulated rigid body dynamics via coupling to the Rensselaer Polytechnic Institute code POEMS (Parallelizable Open-source Efficient Multibody Software). We use new features of the LAMMPS software package to investigate rhodopsin photoisomerization, and water model surface tension and capillary waves at the vapor-liquid interface. Finally, we motivate the recipes of MD for practitioners and researchers in numerical analysis and computational mechanics.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ouyang, Lizhi
Advanced Ultra Supercritical Boiler (AUSC) requires materials that can operate in corrosive environment at temperature and pressure as high as 760°C (or 1400°F) and 5000psi, respectively, while at the same time maintain good ductility at low temperature. We develop automated simulation software tools to enable fast large scale screening studies of candidate designs. While direct evaluation of creep rupture strength and ductility are currently not feasible, properties such as energy, elastic constants, surface energy, interface energy, and stack fault energy can be used to assess their relative ductility and creeping strength. We implemented software to automate the complex calculations tomore » minimize human inputs in the tedious screening studies which involve model structures generation, settings for first principles calculations, results analysis and reporting. The software developed in the project and library of computed mechanical properties of phases found in ferritic steels, many are complex solid solutions estimated for the first time, will certainly help the development of low cost ferritic steel for AUSC.« less
Large Scale Portability of Hospital Information System Software
Munnecke, Thomas H.; Kuhn, Ingeborg M.
1986-01-01
As part of its Decentralized Hospital Computer Program (DHCP) the Veterans Administration installed new hospital information systems in 169 of its facilities during 1984 and 1985. The application software for these systems is based on the ANS MUMPS language, is public domain, and is designed to be operating system and hardware independent. The software, developed by VA employees, is built upon a layered approach, where application packages layer on a common data dictionary which is supported by a Kernel of software. Communications between facilities are based on public domain Department of Defense ARPA net standards for domain naming, mail transfer protocols, and message formats, layered on a variety of communications technologies.
ERIC Educational Resources Information Center
Zhang, Xuesong; Dorn, Bradley
2012-01-01
Agile development has received increasing interest both in industry and academia due to its benefits in developing software quickly, meeting customer needs, and keeping pace with the rapidly changing requirements. However, agile practices and scrum in particular have been mainly tested in mid- to large-size projects. In this paper, we present…
Attacking Software Crisis: A Macro Approach.
1985-03-01
Advisor X0774R.. Dyns, Second Reader W.R. Greer r. armn, Department of AAministrative Sciences Kneale rf. mrh- Dean of Information and Policy siences ...was at least originally intended to have practical value, that is, to satisfy some real need. Even the recent wave of game software for microcomputer...Comparing Online an" Offline Programming Performance, Communications of the ACM, January, 1968. 31. Schwartz, ,J. "Analyzing Large-Scale System
PIRATE: pediatric imaging response assessment and targeting environment
NASA Astrophysics Data System (ADS)
Glenn, Russell; Zhang, Yong; Krasin, Matthew; Hua, Chiaho
2010-02-01
By combining the strengths of various imaging modalities, the multimodality imaging approach has potential to improve tumor staging, delineation of tumor boundaries, chemo-radiotherapy regime design, and treatment response assessment in cancer management. To address the urgent needs for efficient tools to analyze large-scale clinical trial data, we have developed an integrated multimodality, functional and anatomical imaging analysis software package for target definition and therapy response assessment in pediatric radiotherapy (RT) patients. Our software provides quantitative tools for automated image segmentation, region-of-interest (ROI) histogram analysis, spatial volume-of-interest (VOI) analysis, and voxel-wise correlation across modalities. To demonstrate the clinical applicability of this software, histogram analyses were performed on baseline and follow-up 18F-fluorodeoxyglucose (18F-FDG) PET images of nine patients with rhabdomyosarcoma enrolled in an institutional clinical trial at St. Jude Children's Research Hospital. In addition, we combined 18F-FDG PET, dynamic-contrast-enhanced (DCE) MR, and anatomical MR data to visualize the heterogeneity in tumor pathophysiology with the ultimate goal of adaptive targeting of regions with high tumor burden. Our software is able to simultaneously analyze multimodality images across multiple time points, which could greatly speed up the analysis of large-scale clinical trial data and validation of potential imaging biomarkers.
CoCoNUT: an efficient system for the comparison and analysis of genomes
2008-01-01
Background Comparative genomics is the analysis and comparison of genomes from different species. This area of research is driven by the large number of sequenced genomes and heavily relies on efficient algorithms and software to perform pairwise and multiple genome comparisons. Results Most of the software tools available are tailored for one specific task. In contrast, we have developed a novel system CoCoNUT (Computational Comparative geNomics Utility Toolkit) that allows solving several different tasks in a unified framework: (1) finding regions of high similarity among multiple genomic sequences and aligning them, (2) comparing two draft or multi-chromosomal genomes, (3) locating large segmental duplications in large genomic sequences, and (4) mapping cDNA/EST to genomic sequences. Conclusion CoCoNUT is competitive with other software tools w.r.t. the quality of the results. The use of state of the art algorithms and data structures allows CoCoNUT to solve comparative genomics tasks more efficiently than previous tools. With the improved user interface (including an interactive visualization component), CoCoNUT provides a unified, versatile, and easy-to-use software tool for large scale studies in comparative genomics. PMID:19014477
Tsugawa, Hiroshi; Arita, Masanori; Kanazawa, Mitsuhiro; Ogiwara, Atsushi; Bamba, Takeshi; Fukusaki, Eiichiro
2013-05-21
We developed a new software program, MRMPROBS, for widely targeted metabolomics by using the large-scale multiple reaction monitoring (MRM) mode. The strategy became increasingly popular for the simultaneous analysis of up to several hundred metabolites at high sensitivity, selectivity, and quantitative capability. However, the traditional method of assessing measured metabolomics data without probabilistic criteria is not only time-consuming but is often subjective and makeshift work. Our program overcomes these problems by detecting and identifying metabolites automatically, by separating isomeric metabolites, and by removing background noise using a probabilistic score defined as the odds ratio from an optimized multivariate logistic regression model. Our software program also provides a user-friendly graphical interface to curate and organize data matrices and to apply principal component analyses and statistical tests. For a demonstration, we conducted a widely targeted metabolome analysis (152 metabolites) of propagating Saccharomyces cerevisiae measured at 15 time points by gas and liquid chromatography coupled to triple quadrupole mass spectrometry. MRMPROBS is a useful and practical tool for the assessment of large-scale MRM data available to any instrument or any experimental condition.
Zhao, Yongli; He, Ruiying; Chen, Haoran; Zhang, Jie; Ji, Yuefeng; Zheng, Haomian; Lin, Yi; Wang, Xinbo
2014-04-21
Software defined networking (SDN) has become the focus in the current information and communication technology area because of its flexibility and programmability. It has been introduced into various network scenarios, such as datacenter networks, carrier networks, and wireless networks. Optical transport network is also regarded as an important application scenario for SDN, which is adopted as the enabling technology of data communication networks (DCN) instead of general multi-protocol label switching (GMPLS). However, the practical performance of SDN based DCN for large scale optical networks, which is very important for the technology selection in the future optical network deployment, has not been evaluated up to now. In this paper we have built a large scale flexi-grid optical network testbed with 1000 virtual optical transport nodes to evaluate the performance of SDN based DCN, including network scalability, DCN bandwidth limitation, and restoration time. A series of network performance parameters including blocking probability, bandwidth utilization, average lightpath provisioning time, and failure restoration time have been demonstrated under various network environments, such as with different traffic loads and different DCN bandwidths. The demonstration in this work can be taken as a proof for the future network deployment.
An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics
2010-01-01
Background Bioinformatics researchers are now confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. Description An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employ Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date. Conclusions Hadoop and the MapReduce programming paradigm already have a substantial base in the bioinformatics community, especially in the field of next-generation sequencing analysis, and such use is increasing. This is due to the cost-effectiveness of Hadoop-based analysis on commodity Linux clusters, and in the cloud via data upload to cloud vendors who have implemented Hadoop/HBase; and due to the effectiveness and ease-of-use of the MapReduce method in parallelization of many data analysis algorithms. PMID:21210976
An overview of the Hadoop/MapReduce/HBase framework and its current applications in bioinformatics.
Taylor, Ronald C
2010-12-21
Bioinformatics researchers are now confronted with analysis of ultra large-scale data sets, a problem that will only increase at an alarming rate in coming years. Recent developments in open source software, that is, the Hadoop project and associated software, provide a foundation for scaling to petabyte scale data warehouses on Linux clusters, providing fault-tolerant parallelized analysis on such data using a programming style named MapReduce. An overview is given of the current usage within the bioinformatics community of Hadoop, a top-level Apache Software Foundation project, and of associated open source software projects. The concepts behind Hadoop and the associated HBase project are defined, and current bioinformatics software that employ Hadoop is described. The focus is on next-generation sequencing, as the leading application area to date. Hadoop and the MapReduce programming paradigm already have a substantial base in the bioinformatics community, especially in the field of next-generation sequencing analysis, and such use is increasing. This is due to the cost-effectiveness of Hadoop-based analysis on commodity Linux clusters, and in the cloud via data upload to cloud vendors who have implemented Hadoop/HBase; and due to the effectiveness and ease-of-use of the MapReduce method in parallelization of many data analysis algorithms.
ERIC Educational Resources Information Center
Rutherford, Teomara; Kibrick, Melissa; Burchinal, Margaret; Richland, Lindsey; Conley, AnneMarie; Osborne, Keara; Schneider, Stephanie; Duran, Lauren; Coulson, Andrew; Antenore, Fran; Daniels, Abby; Martinez, Michael E.
2010-01-01
This paper describes the background, methodology, preliminary findings, and anticipated future directions of a large-scale multi-year randomized field experiment addressing the efficacy of ST Math [Spatial-Temporal Math], a fully-developed math curriculum that uses interactive animated software. ST Math's unique approach minimizes the use of…
Advanced Connectivity Analysis (ACA): a Large Scale Functional Connectivity Data Mining Environment.
Chen, Rong; Nixon, Erika; Herskovits, Edward
2016-04-01
Using resting-state functional magnetic resonance imaging (rs-fMRI) to study functional connectivity is of great importance to understand normal development and function as well as a host of neurological and psychiatric disorders. Seed-based analysis is one of the most widely used rs-fMRI analysis methods. Here we describe a freely available large scale functional connectivity data mining software package called Advanced Connectivity Analysis (ACA). ACA enables large-scale seed-based analysis and brain-behavior analysis. It can seamlessly examine a large number of seed regions with minimal user input. ACA has a brain-behavior analysis component to delineate associations among imaging biomarkers and one or more behavioral variables. We demonstrate applications of ACA to rs-fMRI data sets from a study of autism.
Versatile synchronized real-time MEG hardware controller for large-scale fast data acquisition.
Sun, Limin; Han, Menglai; Pratt, Kevin; Paulson, Douglas; Dinh, Christoph; Esch, Lorenz; Okada, Yoshio; Hämäläinen, Matti
2017-05-01
Versatile controllers for accurate, fast, and real-time synchronized acquisition of large-scale data are useful in many areas of science, engineering, and technology. Here, we describe the development of a controller software based on a technique called queued state machine for controlling the data acquisition (DAQ) hardware, continuously acquiring a large amount of data synchronized across a large number of channels (>400) at a fast rate (up to 20 kHz/channel) in real time, and interfacing with applications for real-time data analysis and display of electrophysiological data. This DAQ controller was developed specifically for a 384-channel pediatric whole-head magnetoencephalography (MEG) system, but its architecture is useful for wide applications. This controller running in a LabVIEW environment interfaces with microprocessors in the MEG sensor electronics to control their real-time operation. It also interfaces with a real-time MEG analysis software via transmission control protocol/internet protocol, to control the synchronous acquisition and transfer of the data in real time from >400 channels to acquisition and analysis workstations. The successful implementation of this controller for an MEG system with a large number of channels demonstrates the feasibility of employing the present architecture in several other applications.
Versatile synchronized real-time MEG hardware controller for large-scale fast data acquisition
NASA Astrophysics Data System (ADS)
Sun, Limin; Han, Menglai; Pratt, Kevin; Paulson, Douglas; Dinh, Christoph; Esch, Lorenz; Okada, Yoshio; Hämäläinen, Matti
2017-05-01
Versatile controllers for accurate, fast, and real-time synchronized acquisition of large-scale data are useful in many areas of science, engineering, and technology. Here, we describe the development of a controller software based on a technique called queued state machine for controlling the data acquisition (DAQ) hardware, continuously acquiring a large amount of data synchronized across a large number of channels (>400) at a fast rate (up to 20 kHz/channel) in real time, and interfacing with applications for real-time data analysis and display of electrophysiological data. This DAQ controller was developed specifically for a 384-channel pediatric whole-head magnetoencephalography (MEG) system, but its architecture is useful for wide applications. This controller running in a LabVIEW environment interfaces with microprocessors in the MEG sensor electronics to control their real-time operation. It also interfaces with a real-time MEG analysis software via transmission control protocol/internet protocol, to control the synchronous acquisition and transfer of the data in real time from >400 channels to acquisition and analysis workstations. The successful implementation of this controller for an MEG system with a large number of channels demonstrates the feasibility of employing the present architecture in several other applications.
Kelley, James J; Maor, Shay; Kim, Min Kyung; Lane, Anatoliy; Lun, Desmond S
2017-08-15
Visualization of metabolites, reactions and pathways in genome-scale metabolic networks (GEMs) can assist in understanding cellular metabolism. Three attributes are desirable in software used for visualizing GEMs: (i) automation, since GEMs can be quite large; (ii) production of understandable maps that provide ease in identification of pathways, reactions and metabolites; and (iii) visualization of the entire network to show how pathways are interconnected. No software currently exists for visualizing GEMs that satisfies all three characteristics, but MOST-Visualization, an extension of the software package MOST (Metabolic Optimization and Simulation Tool), satisfies (i), and by using a pre-drawn overview map of metabolism based on the Roche map satisfies (ii) and comes close to satisfying (iii). MOST is distributed for free on the GNU General Public License. The software and full documentation are available at http://most.ccib.rutgers.edu/. dslun@rutgers.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Leary, Patrick
The primary challenge motivating this project is the widening gap between the ability to compute information and to store it for subsequent analysis. This gap adversely impacts science code teams, who can perform analysis only on a small fraction of the data they calculate, resulting in the substantial likelihood of lost or missed science, when results are computed but not analyzed. Our approach is to perform as much analysis or visualization processing on data while it is still resident in memory, which is known as in situ processing. The idea in situ processing was not new at the time ofmore » the start of this effort in 2014, but efforts in that space were largely ad hoc, and there was no concerted effort within the research community that aimed to foster production-quality software tools suitable for use by Department of Energy (DOE) science projects. Our objective was to produce and enable the use of production-quality in situ methods and infrastructure, at scale, on DOE high-performance computing (HPC) facilities, though we expected to have an impact beyond DOE due to the widespread nature of the challenges, which affect virtually all large-scale computational science efforts. To achieve this objective, we engaged in software technology research and development (R&D), in close partnerships with DOE science code teams, to produce software technologies that were shown to run efficiently at scale on DOE HPC platforms.« less
Small-scale fixed wing airplane software verification flight test
NASA Astrophysics Data System (ADS)
Miller, Natasha R.
The increased demand for micro Unmanned Air Vehicles (UAV) driven by military requirements, commercial use, and academia is creating a need for the ability to quickly and accurately conduct low Reynolds Number aircraft design. There exist several open source software programs that are free or inexpensive that can be used for large scale aircraft design, but few software programs target the realm of low Reynolds Number flight. XFLR5 is an open source, free to download, software program that attempts to take into consideration viscous effects that occur at low Reynolds Number in airfoil design, 3D wing design, and 3D airplane design. An off the shelf, remote control airplane was used as a test bed to model in XFLR5 and then compared to flight test collected data. Flight test focused on the stability modes of the 3D plane, specifically the phugoid mode. Design and execution of the flight tests were accomplished for the RC airplane using methodology from full scale military airplane test procedures. Results from flight test were not conclusive in determining the accuracy of the XFLR5 software program. There were several sources of uncertainty that did not allow for a full analysis of the flight test results. An off the shelf drone autopilot was used as a data collection device for flight testing. The precision and accuracy of the autopilot is unknown. Potential future work should investigate flight test methods for small scale UAV flight.
The Cloud-Based Integrated Data Viewer (IDV)
NASA Astrophysics Data System (ADS)
Fisher, Ward
2015-04-01
Maintaining software compatibility across new computing environments and the associated underlying hardware is a common problem for software engineers and scientific programmers. While there are a suite of tools and methodologies used in traditional software engineering environments to mitigate this issue, they are typically ignored by developers lacking a background in software engineering. The result is a large body of software which is simultaneously critical and difficult to maintain. Visualization software is particularly vulnerable to this problem, given the inherent dependency on particular graphics hardware and software API's. The advent of cloud computing has provided a solution to this problem, which was not previously practical on a large scale; Application Streaming. This technology allows a program to run entirely on a remote virtual machine while still allowing for interactivity and dynamic visualizations, with little-to-no re-engineering required. Through application streaming we are able to bring the same visualization to a desktop, a netbook, a smartphone, and the next generation of hardware, whatever it may be. Unidata has been able to harness Application Streaming to provide a tablet-compatible version of our visualization software, the Integrated Data Viewer (IDV). This work will examine the challenges associated with adapting the IDV to an application streaming platform, and include a brief discussion of the underlying technologies involved. We will also discuss the differences between local software and software-as-a-service.
MRMPROBS suite for metabolomics using large-scale MRM assays.
Tsugawa, Hiroshi; Kanazawa, Mitsuhiro; Ogiwara, Atsushi; Arita, Masanori
2014-08-15
We developed new software environment for the metabolome analysis of large-scale multiple reaction monitoring (MRM) assays. It supports the data format of four major mass spectrometer vendors and mzML common data format. This program provides a process pipeline from the raw-format import to high-dimensional statistical analyses. The novel aspect is graphical user interface-based visualization to perform peak quantification, to interpolate missing values and to normalize peaks interactively based on quality control samples. Together with the software platform, the MRM standard library of 301 metabolites with 775 transitions is also available, which contributes to the reliable peak identification by using retention time and ion abundances. MRMPROBS is available for Windows OS under the creative-commons by-attribution license at http://prime.psc.riken.jp. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Evans, David D.; Mulholland, George W.; Baum, Howard R.; Walton, William D.; McGrattan, Kevin B.
2001-01-01
For more than a decade NIST conducted research to understand, measure and predict the important features of burning oil on water. Results of that research have been included in nationally recognized guidelines for approval of intentional burning. NIST measurements and predictions have played a major role in establishing in situ burning as a primary oil spill response method. Data are given for pool fire burning rates, smoke yield, smoke particulate size distribution, smoke aging, and polycyclic aromatic hydrocarbon content of the smoke for crude and fuel oil fires with effective diameters up to 17.2 m. New user-friendly software, ALOFT, was developed to quantify the large-scale features and trajectory of wind blown smoke plumes in the atmosphere and estimate the ground level smoke particulate concentrations. Predictions using the model were tested successfully against data from large-scale tests. ALOFT software is being used by oil spill response teams to help assess the potential impact of intentional burning. PMID:27500022
InterProScan 5: genome-scale protein function classification
Jones, Philip; Binns, David; Chang, Hsin-Yu; Fraser, Matthew; Li, Weizhong; McAnulla, Craig; McWilliam, Hamish; Maslen, John; Mitchell, Alex; Nuka, Gift; Pesseat, Sebastien; Quinn, Antony F.; Sangrador-Vegas, Amaia; Scheremetjew, Maxim; Yong, Siew-Yit; Lopez, Rodrigo; Hunter, Sarah
2014-01-01
Motivation: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs of the software and the complete reimplementation of the software framework, resulting in a flexible and stable system that is able to use both multiprocessor machines and/or conventional clusters to achieve scalable distributed data analysis. InterProScan is freely available for download from the EMBl-EBI FTP site and the open source code is hosted at Google Code. Availability and implementation: InterProScan is distributed via FTP at ftp://ftp.ebi.ac.uk/pub/software/unix/iprscan/5/ and the source code is available from http://code.google.com/p/interproscan/. Contact: http://www.ebi.ac.uk/support or interhelp@ebi.ac.uk or mitchell@ebi.ac.uk PMID:24451626
Kenny, Joseph P.; Janssen, Curtis L.; Gordon, Mark S.; ...
2008-01-01
Cutting-edge scientific computing software is complex, increasingly involving the coupling of multiple packages to combine advanced algorithms or simulations at multiple physical scales. Component-based software engineering (CBSE) has been advanced as a technique for managing this complexity, and complex component applications have been created in the quantum chemistry domain, as well as several other simulation areas, using the component model advocated by the Common Component Architecture (CCA) Forum. While programming models do indeed enable sound software engineering practices, the selection of programming model is just one building block in a comprehensive approach to large-scale collaborative development which must also addressmore » interface and data standardization, and language and package interoperability. We provide an overview of the development approach utilized within the Quantum Chemistry Science Application Partnership, identifying design challenges, describing the techniques which we have adopted to address these challenges and highlighting the advantages which the CCA approach offers for collaborative development.« less
Yoo, Sun K; Kim, Dong Keun; Kim, Jung C; Park, Youn Jung; Chang, Byung Chul
2008-01-01
With the increase in demand for high quality medical services, the need for an innovative hospital information system has become essential. An improved system has been implemented in all hospital units of the Yonsei University Health System. Interoperability between multi-units required appropriate hardware infrastructure and software architecture. This large-scale hospital information system encompassed PACS (Picture Archiving and Communications Systems), EMR (Electronic Medical Records) and ERP (Enterprise Resource Planning). It involved two tertiary hospitals and 50 community hospitals. The monthly data production rate by the integrated hospital information system is about 1.8 TByte and the total quantity of data produced so far is about 60 TByte. Large scale information exchange and sharing will be particularly useful for telemedicine applications.
Extremely Scalable Spiking Neuronal Network Simulation Code: From Laptops to Exascale Computers.
Jordan, Jakob; Ippen, Tammo; Helias, Moritz; Kitayama, Itaru; Sato, Mitsuhisa; Igarashi, Jun; Diesmann, Markus; Kunkel, Susanne
2018-01-01
State-of-the-art software tools for neuronal network simulations scale to the largest computing systems available today and enable investigations of large-scale networks of up to 10 % of the human cortex at a resolution of individual neurons and synapses. Due to an upper limit on the number of incoming connections of a single neuron, network connectivity becomes extremely sparse at this scale. To manage computational costs, simulation software ultimately targeting the brain scale needs to fully exploit this sparsity. Here we present a two-tier connection infrastructure and a framework for directed communication among compute nodes accounting for the sparsity of brain-scale networks. We demonstrate the feasibility of this approach by implementing the technology in the NEST simulation code and we investigate its performance in different scaling scenarios of typical network simulations. Our results show that the new data structures and communication scheme prepare the simulation kernel for post-petascale high-performance computing facilities without sacrificing performance in smaller systems.
Extremely Scalable Spiking Neuronal Network Simulation Code: From Laptops to Exascale Computers
Jordan, Jakob; Ippen, Tammo; Helias, Moritz; Kitayama, Itaru; Sato, Mitsuhisa; Igarashi, Jun; Diesmann, Markus; Kunkel, Susanne
2018-01-01
State-of-the-art software tools for neuronal network simulations scale to the largest computing systems available today and enable investigations of large-scale networks of up to 10 % of the human cortex at a resolution of individual neurons and synapses. Due to an upper limit on the number of incoming connections of a single neuron, network connectivity becomes extremely sparse at this scale. To manage computational costs, simulation software ultimately targeting the brain scale needs to fully exploit this sparsity. Here we present a two-tier connection infrastructure and a framework for directed communication among compute nodes accounting for the sparsity of brain-scale networks. We demonstrate the feasibility of this approach by implementing the technology in the NEST simulation code and we investigate its performance in different scaling scenarios of typical network simulations. Our results show that the new data structures and communication scheme prepare the simulation kernel for post-petascale high-performance computing facilities without sacrificing performance in smaller systems. PMID:29503613
Network models of biology, whether curated or derived from large-scale data analysis, are critical tools in the understanding of cancer mechanisms and in the design and personalization of therapies. The NDEx Project (Network Data Exchange) will create, deploy, and maintain an open-source, web-based software platform and public website to enable scientists, organizations, and software applications to share, store, manipulate, and publish biological networks.
Submarine pipeline on-bottom stability. Volume 2: Software and manuals
DOE Office of Scientific and Technical Information (OSTI.GOV)
NONE
1998-12-01
The state-of-the-art in pipeline stability design has been changing very rapidly recent. The physics governing on-bottom stability are much better understood now than they were eight years. This is due largely because of research and large scale model tests sponsored by PRCI. Analysis tools utilizing this new knowledge have been developed. These tools provide the design engineer with a rational approach have been developed. These tools provide the design engineer with a rational approach for weight coating design, which he can use with confidence because the tools have been developed based on full scale and near full scale model tests.more » These tools represent the state-of-the-art in stability design and model the complex behavior of pipes subjected to both wave and current loads. These include: hydrodynamic forces which account for the effect of the wake (generated by flow over the pipe) washing back and forth over the pipe in oscillatory flow; and the embedment (digging) which occurs as a pipe resting on the seabed is exposed to oscillatory loadings and small oscillatory deflections. This report has been developed as a reference handbook for use in on-bottom pipeline stability analysis It consists of two volumes. Volume one is devoted descriptions of the various aspects of the problem: the pipeline design process; ocean physics, wave mechanics, hydrodynamic forces, and meteorological data determination; geotechnical data collection and soil mechanics; and stability design procedures. Volume two describes, lists, and illustrates the analysis software. Diskettes containing the software and examples of the software are also included in Volume two.« less
Rueckl, Martin; Lenzi, Stephen C; Moreno-Velasquez, Laura; Parthier, Daniel; Schmitz, Dietmar; Ruediger, Sten; Johenning, Friedrich W
2017-01-01
The measurement of activity in vivo and in vitro has shifted from electrical to optical methods. While the indicators for imaging activity have improved significantly over the last decade, tools for analysing optical data have not kept pace. Most available analysis tools are limited in their flexibility and applicability to datasets obtained at different spatial scales. Here, we present SamuROI (Structured analysis of multiple user-defined ROIs), an open source Python-based analysis environment for imaging data. SamuROI simplifies exploratory analysis and visualization of image series of fluorescence changes in complex structures over time and is readily applicable at different spatial scales. In this paper, we show the utility of SamuROI in Ca 2+ -imaging based applications at three spatial scales: the micro-scale (i.e., sub-cellular compartments including cell bodies, dendrites and spines); the meso-scale, (i.e., whole cell and population imaging with single-cell resolution); and the macro-scale (i.e., imaging of changes in bulk fluorescence in large brain areas, without cellular resolution). The software described here provides a graphical user interface for intuitive data exploration and region of interest (ROI) management that can be used interactively within Jupyter Notebook: a publicly available interactive Python platform that allows simple integration of our software with existing tools for automated ROI generation and post-processing, as well as custom analysis pipelines. SamuROI software, source code and installation instructions are publicly available on GitHub and documentation is available online. SamuROI reduces the energy barrier for manual exploration and semi-automated analysis of spatially complex Ca 2+ imaging datasets, particularly when these have been acquired at different spatial scales.
Rueckl, Martin; Lenzi, Stephen C.; Moreno-Velasquez, Laura; Parthier, Daniel; Schmitz, Dietmar; Ruediger, Sten; Johenning, Friedrich W.
2017-01-01
The measurement of activity in vivo and in vitro has shifted from electrical to optical methods. While the indicators for imaging activity have improved significantly over the last decade, tools for analysing optical data have not kept pace. Most available analysis tools are limited in their flexibility and applicability to datasets obtained at different spatial scales. Here, we present SamuROI (Structured analysis of multiple user-defined ROIs), an open source Python-based analysis environment for imaging data. SamuROI simplifies exploratory analysis and visualization of image series of fluorescence changes in complex structures over time and is readily applicable at different spatial scales. In this paper, we show the utility of SamuROI in Ca2+-imaging based applications at three spatial scales: the micro-scale (i.e., sub-cellular compartments including cell bodies, dendrites and spines); the meso-scale, (i.e., whole cell and population imaging with single-cell resolution); and the macro-scale (i.e., imaging of changes in bulk fluorescence in large brain areas, without cellular resolution). The software described here provides a graphical user interface for intuitive data exploration and region of interest (ROI) management that can be used interactively within Jupyter Notebook: a publicly available interactive Python platform that allows simple integration of our software with existing tools for automated ROI generation and post-processing, as well as custom analysis pipelines. SamuROI software, source code and installation instructions are publicly available on GitHub and documentation is available online. SamuROI reduces the energy barrier for manual exploration and semi-automated analysis of spatially complex Ca2+ imaging datasets, particularly when these have been acquired at different spatial scales. PMID:28706482
FunRich proteomics software analysis, let the fun begin!
Benito-Martin, Alberto; Peinado, Héctor
2015-08-01
Protein MS analysis is the preferred method for unbiased protein identification. It is normally applied to a large number of both small-scale and high-throughput studies. However, user-friendly computational tools for protein analysis are still needed. In this issue, Mathivanan and colleagues (Proteomics 2015, 15, 2597-2601) report the development of FunRich software, an open-access software that facilitates the analysis of proteomics data, providing tools for functional enrichment and interaction network analysis of genes and proteins. FunRich is a reinterpretation of proteomic software, a standalone tool combining ease of use with customizable databases, free access, and graphical representations. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
NASA Technical Reports Server (NTRS)
Basili, V. R.; Zelkowitz, M. V.
1978-01-01
In a brief evaluation of software-related considerations, it is found that suitable approaches for software development depend to a large degree on the characteristics of the particular project involved. An analysis is conducted of development problems in an environment in which ground support software is produced for spacecraft control. The amount of work involved is in the range from 6 to 10 man-years. Attention is given to a general project summary, a programmer/analyst survey, a component summary, a component status report, a resource summary, a change report, a computer program run analysis, aspects of data collection on a smaller scale, progress forecasting, problems of overhead, and error analysis.
3D reconstruction software comparison for short sequences
NASA Astrophysics Data System (ADS)
Strupczewski, Adam; Czupryński, BłaŻej
2014-11-01
Large scale multiview reconstruction is recently a very popular area of research. There are many open source tools that can be downloaded and run on a personal computer. However, there are few, if any, comparisons between all the available software in terms of accuracy on small datasets that a single user can create. The typical datasets for testing of the software are archeological sites or cities, comprising thousands of images. This paper presents a comparison of currently available open source multiview reconstruction software for small datasets. It also compares the open source solutions with a simple structure from motion pipeline developed by the authors from scratch with the use of OpenCV and Eigen libraries.
Making automated computer program documentation a feature of total system design
NASA Technical Reports Server (NTRS)
Wolf, A. W.
1970-01-01
It is pointed out that in large-scale computer software systems, program documents are too often fraught with errors, out of date, poorly written, and sometimes nonexistent in whole or in part. The means are described by which many of these typical system documentation problems were overcome in a large and dynamic software project. A systems approach was employed which encompassed such items as: (1) configuration management; (2) standards and conventions; (3) collection of program information into central data banks; (4) interaction among executive, compiler, central data banks, and configuration management; and (5) automatic documentation. A complete description of the overall system is given.
Large scale systems : a study of computer organizations for air traffic control applications.
DOT National Transportation Integrated Search
1971-06-01
Based on current sizing estimates and tracking algorithms, some computer organizations applicable to future air traffic control computing systems are described and assessed. Hardware and software problem areas are defined and solutions are outlined.
A Large Scale, High Resolution Agent-Based Insurgency Model
2013-09-30
CUDA) is NVIDIA Corporation’s software development model for General Purpose Programming on Graphics Processing Units (GPGPU) ( NVIDIA Corporation ...Conference. Argonne National Laboratory, Argonne, IL, October, 2005. NVIDIA Corporation . NVIDIA CUDA Programming Guide 2.0 [Online]. NVIDIA Corporation
An Overview of Mesoscale Modeling Software for Energetic Materials Research
2010-03-01
12 2.9 Large-scale Atomic/Molecular Massively Parallel Simulator ( LAMMPS ...13 Table 10. LAMMPS summary...extensive reviews, lectures and workshops are available on multiscale modeling of materials applications (76-78). • Multi-phase mixtures of
Bellamy, Chloe; Altringham, John
2015-01-01
Conservation increasingly operates at the landscape scale. For this to be effective, we need landscape scale information on species distributions and the environmental factors that underpin them. Species records are becoming increasingly available via data centres and online portals, but they are often patchy and biased. We demonstrate how such data can yield useful habitat suitability models, using bat roost records as an example. We analysed the effects of environmental variables at eight spatial scales (500 m - 6 km) on roost selection by eight bat species (Pipistrellus pipistrellus, P. pygmaeus, Nyctalus noctula, Myotis mystacinus, M. brandtii, M. nattereri, M. daubentonii, and Plecotus auritus) using the presence-only modelling software MaxEnt. Modelling was carried out on a selection of 418 data centre roost records from the Lake District National Park, UK. Target group pseudoabsences were selected to reduce the impact of sampling bias. Multi-scale models, combining variables measured at their best performing spatial scales, were used to predict roosting habitat suitability, yielding models with useful predictive abilities. Small areas of deciduous woodland consistently increased roosting habitat suitability, but other habitat associations varied between species and scales. Pipistrellus were positively related to built environments at small scales, and depended on large-scale woodland availability. The other, more specialist, species were highly sensitive to human-altered landscapes, avoiding even small rural towns. The strength of many relationships at large scales suggests that bats are sensitive to habitat modifications far from the roost itself. The fine resolution, large extent maps will aid targeted decision-making by conservationists and planners. We have made available an ArcGIS toolbox that automates the production of multi-scale variables, to facilitate the application of our methods to other taxa and locations. Habitat suitability modelling has the potential to become a standard tool for supporting landscape-scale decision-making as relevant data and open source, user-friendly, and peer-reviewed software become widely available.
Implementing Extreme Programming in Distributed Software Project Teams: Strategies and Challenges
NASA Astrophysics Data System (ADS)
Maruping, Likoebe M.
Agile software development methods and distributed forms of organizing teamwork are two team process innovations that are gaining prominence in today's demanding software development environment. Individually, each of these innovations has yielded gains in the practice of software development. Agile methods have enabled software project teams to meet the challenges of an ever turbulent business environment through enhanced flexibility and responsiveness to emergent customer needs. Distributed software project teams have enabled organizations to access highly specialized expertise across geographic locations. Although much progress has been made in understanding how to more effectively manage agile development teams and how to manage distributed software development teams, managers have little guidance on how to leverage these two potent innovations in combination. In this chapter, I outline some of the strategies and challenges associated with implementing agile methods in distributed software project teams. These are discussed in the context of a study of a large-scale software project in the United States that lasted four months.
Multibiodose radiation emergency triage categorization software.
Ainsbury, Elizabeth A; Barnard, Stephen; Barrios, Lleonard; Fattibene, Paola; de Gelder, Virginie; Gregoire, Eric; Lindholm, Carita; Lloyd, David; Nergaard, Inger; Rothkamm, Kai; Romm, Horst; Scherthan, Harry; Thierens, Hubert; Vandevoorde, Charlot; Woda, Clemens; Wojcik, Andrzej
2014-07-01
In this note, the authors describe the MULTIBIODOSE software, which has been created as part of the MULTIBIODOSE project. The software enables doses estimated by networks of laboratories, using up to five retrospective (biological and physical) assays, to be combined to give a single estimate of triage category for each individual potentially exposed to ionizing radiation in a large scale radiation accident or incident. The MULTIBIODOSE software has been created in Java. The usage of the software is based on the MULTIBIODOSE Guidance: the program creates a link to a single SQLite database for each incident, and the database is administered by the lead laboratory. The software has been tested with Java runtime environment 6 and 7 on a number of different Windows, Mac, and Linux systems, using data from a recent intercomparison exercise. The Java program MULTIBIODOSE_1.0.jar is freely available to download from http://www.multibiodose.eu/software or by contacting the software administrator: MULTIBIODOSE-software@gmx.com.
COPS: Large-scale nonlinearly constrained optimization problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bondarenko, A.S.; Bortz, D.M.; More, J.J.
2000-02-10
The authors have started the development of COPS, a collection of large-scale nonlinearly Constrained Optimization Problems. The primary purpose of this collection is to provide difficult test cases for optimization software. Problems in the current version of the collection come from fluid dynamics, population dynamics, optimal design, and optimal control. For each problem they provide a short description of the problem, notes on the formulation of the problem, and results of computational experiments with general optimization solvers. They currently have results for DONLP2, LANCELOT, MINOS, SNOPT, and LOQO.
van Albada, Sacha J.; Rowley, Andrew G.; Senk, Johanna; Hopkins, Michael; Schmidt, Maximilian; Stokes, Alan B.; Lester, David R.; Diesmann, Markus; Furber, Steve B.
2018-01-01
The digital neuromorphic hardware SpiNNaker has been developed with the aim of enabling large-scale neural network simulations in real time and with low power consumption. Real-time performance is achieved with 1 ms integration time steps, and thus applies to neural networks for which faster time scales of the dynamics can be neglected. By slowing down the simulation, shorter integration time steps and hence faster time scales, which are often biologically relevant, can be incorporated. We here describe the first full-scale simulations of a cortical microcircuit with biological time scales on SpiNNaker. Since about half the synapses onto the neurons arise within the microcircuit, larger cortical circuits have only moderately more synapses per neuron. Therefore, the full-scale microcircuit paves the way for simulating cortical circuits of arbitrary size. With approximately 80, 000 neurons and 0.3 billion synapses, this model is the largest simulated on SpiNNaker to date. The scale-up is enabled by recent developments in the SpiNNaker software stack that allow simulations to be spread across multiple boards. Comparison with simulations using the NEST software on a high-performance cluster shows that both simulators can reach a similar accuracy, despite the fixed-point arithmetic of SpiNNaker, demonstrating the usability of SpiNNaker for computational neuroscience applications with biological time scales and large network size. The runtime and power consumption are also assessed for both simulators on the example of the cortical microcircuit model. To obtain an accuracy similar to that of NEST with 0.1 ms time steps, SpiNNaker requires a slowdown factor of around 20 compared to real time. The runtime for NEST saturates around 3 times real time using hybrid parallelization with MPI and multi-threading. However, achieving this runtime comes at the cost of increased power and energy consumption. The lowest total energy consumption for NEST is reached at around 144 parallel threads and 4.6 times slowdown. At this setting, NEST and SpiNNaker have a comparable energy consumption per synaptic event. Our results widen the application domain of SpiNNaker and help guide its development, showing that further optimizations such as synapse-centric network representation are necessary to enable real-time simulation of large biological neural networks. PMID:29875620
Knowledge-based reusable software synthesis system
NASA Technical Reports Server (NTRS)
Donaldson, Cammie
1989-01-01
The Eli system, a knowledge-based reusable software synthesis system, is being developed for NASA Langley under a Phase 2 SBIR contract. Named after Eli Whitney, the inventor of interchangeable parts, Eli assists engineers of large-scale software systems in reusing components while they are composing their software specifications or designs. Eli will identify reuse potential, search for components, select component variants, and synthesize components into the developer's specifications. The Eli project began as a Phase 1 SBIR to define a reusable software synthesis methodology that integrates reusabilityinto the top-down development process and to develop an approach for an expert system to promote and accomplish reuse. The objectives of the Eli Phase 2 work are to integrate advanced technologies to automate the development of reusable components within the context of large system developments, to integrate with user development methodologies without significant changes in method or learning of special languages, and to make reuse the easiest operation to perform. Eli will try to address a number of reuse problems including developing software with reusable components, managing reusable components, identifying reusable components, and transitioning reuse technology. Eli is both a library facility for classifying, storing, and retrieving reusable components and a design environment that emphasizes, encourages, and supports reuse.
Software architecture and engineering for patient records: current and future.
Weng, Chunhua; Levine, Betty A; Mun, Seong K
2009-05-01
During the "The National Forum on the Future of the Defense Health Information System," a track focusing on "Systems Architecture and Software Engineering" included eight presenters. These presenters identified three key areas of interest in this field, which include the need for open enterprise architecture and a federated database design, net centrality based on service-oriented architecture, and the need for focus on software usability and reusability. The eight panelists provided recommendations related to the suitability of service-oriented architecture and the enabling technologies of grid computing and Web 2.0 for building health services research centers and federated data warehouses to facilitate large-scale collaborative health care and research. Finally, they discussed the need to leverage industry best practices for software engineering to facilitate rapid software development, testing, and deployment.
Advances in knowledge-based software engineering
NASA Technical Reports Server (NTRS)
Truszkowski, Walt
1991-01-01
The underlying hypothesis of this work is that a rigorous and comprehensive software reuse methodology can bring about a more effective and efficient utilization of constrained resources in the development of large-scale software systems by both government and industry. It is also believed that correct use of this type of software engineering methodology can significantly contribute to the higher levels of reliability that will be required of future operational systems. An overview and discussion of current research in the development and application of two systems that support a rigorous reuse paradigm are presented: the Knowledge-Based Software Engineering Environment (KBSEE) and the Knowledge Acquisition fo the Preservation of Tradeoffs and Underlying Rationales (KAPTUR) systems. Emphasis is on a presentation of operational scenarios which highlight the major functional capabilities of the two systems.
NASA Astrophysics Data System (ADS)
Evans, B. J. K.; Pugh, T.; Wyborn, L. A.; Porter, D.; Allen, C.; Smillie, J.; Antony, J.; Trenham, C.; Evans, B. J.; Beckett, D.; Erwin, T.; King, E.; Hodge, J.; Woodcock, R.; Fraser, R.; Lescinsky, D. T.
2014-12-01
The National Computational Infrastructure (NCI) has co-located a priority set of national data assets within a HPC research platform. This powerful in-situ computational platform has been created to help serve and analyse the massive amounts of data across the spectrum of environmental collections - in particular the climate, observational data and geoscientific domains. This paper examines the infrastructure, innovation and opportunity for this significant research platform. NCI currently manages nationally significant data collections (10+ PB) categorised as 1) earth system sciences, climate and weather model data assets and products, 2) earth and marine observations and products, 3) geosciences, 4) terrestrial ecosystem, 5) water management and hydrology, and 6) astronomy, social science and biosciences. The data is largely sourced from the NCI partners (who include the custodians of many of the national scientific records), major research communities, and collaborating overseas organisations. By co-locating these large valuable data assets, new opportunities have arisen by harmonising the data collections, making a powerful transdisciplinary research platformThe data is accessible within an integrated HPC-HPD environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large scale and high-bandwidth Lustre filesystems. New scientific software, cloud-scale techniques, server-side visualisation and data services have been harnessed and integrated into the platform, so that analysis is performed seamlessly across the traditional boundaries of the underlying data domains. Characterisation of the techniques along with performance profiling ensures scalability of each software component, all of which can either be enhanced or replaced through future improvements. A Development-to-Operations (DevOps) framework has also been implemented to manage the scale of the software complexity alone. This ensures that software is both upgradable and maintainable, and can be readily reused with complexly integrated systems and become part of the growing global trusted community tools for cross-disciplinary research.
The Australian Computational Earth Systems Simulator
NASA Astrophysics Data System (ADS)
Mora, P.; Muhlhaus, H.; Lister, G.; Dyskin, A.; Place, D.; Appelbe, B.; Nimmervoll, N.; Abramson, D.
2001-12-01
Numerical simulation of the physics and dynamics of the entire earth system offers an outstanding opportunity for advancing earth system science and technology but represents a major challenge due to the range of scales and physical processes involved, as well as the magnitude of the software engineering effort required. However, new simulation and computer technologies are bringing this objective within reach. Under a special competitive national funding scheme to establish new Major National Research Facilities (MNRF), the Australian government together with a consortium of Universities and research institutions have funded construction of the Australian Computational Earth Systems Simulator (ACcESS). The Simulator or computational virtual earth will provide the research infrastructure to the Australian earth systems science community required for simulations of dynamical earth processes at scales ranging from microscopic to global. It will consist of thematic supercomputer infrastructure and an earth systems simulation software system. The Simulator models and software will be constructed over a five year period by a multi-disciplinary team of computational scientists, mathematicians, earth scientists, civil engineers and software engineers. The construction team will integrate numerical simulation models (3D discrete elements/lattice solid model, particle-in-cell large deformation finite-element method, stress reconstruction models, multi-scale continuum models etc) with geophysical, geological and tectonic models, through advanced software engineering and visualization technologies. When fully constructed, the Simulator aims to provide the software and hardware infrastructure needed to model solid earth phenomena including global scale dynamics and mineralisation processes, crustal scale processes including plate tectonics, mountain building, interacting fault system dynamics, and micro-scale processes that control the geological, physical and dynamic behaviour of earth systems. ACcESS represents a part of Australia's contribution to the APEC Cooperation for Earthquake Simulation (ACES) international initiative. Together with other national earth systems science initiatives including the Japanese Earth Simulator and US General Earthquake Model projects, ACcESS aims to provide a driver for scientific advancement and technological breakthroughs including: quantum leaps in understanding of earth evolution at global, crustal, regional and microscopic scales; new knowledge of the physics of crustal fault systems required to underpin the grand challenge of earthquake prediction; new understanding and predictive capabilities of geological processes such as tectonics and mineralisation.
Tsugawa, Hiroshi; Ohta, Erika; Izumi, Yoshihiro; Ogiwara, Atsushi; Yukihira, Daichi; Bamba, Takeshi; Fukusaki, Eiichiro; Arita, Masanori
2014-01-01
Based on theoretically calculated comprehensive lipid libraries, in lipidomics as many as 1000 multiple reaction monitoring (MRM) transitions can be monitored for each single run. On the other hand, lipid analysis from each MRM chromatogram requires tremendous manual efforts to identify and quantify lipid species. Isotopic peaks differing by up to a few atomic masses further complicate analysis. To accelerate the identification and quantification process we developed novel software, MRM-DIFF, for the differential analysis of large-scale MRM assays. It supports a correlation optimized warping (COW) algorithm to align MRM chromatograms and utilizes quality control (QC) sample datasets to automatically adjust the alignment parameters. Moreover, user-defined reference libraries that include the molecular formula, retention time, and MRM transition can be used to identify target lipids and to correct peak abundances by considering isotopic peaks. Here, we demonstrate the software pipeline and introduce key points for MRM-based lipidomics research to reduce the mis-identification and overestimation of lipid profiles. The MRM-DIFF program, example data set and the tutorials are downloadable at the "Standalone software" section of the PRIMe (Platform for RIKEN Metabolomics, http://prime.psc.riken.jp/) database website.
Ohue, Masahito; Shimoda, Takehiro; Suzuki, Shuji; Matsuzaki, Yuri; Ishida, Takashi; Akiyama, Yutaka
2014-11-15
The application of protein-protein docking in large-scale interactome analysis is a major challenge in structural bioinformatics and requires huge computing resources. In this work, we present MEGADOCK 4.0, an FFT-based docking software that makes extensive use of recent heterogeneous supercomputers and shows powerful, scalable performance of >97% strong scaling. MEGADOCK 4.0 is written in C++ with OpenMPI and NVIDIA CUDA 5.0 (or later) and is freely available to all academic and non-profit users at: http://www.bi.cs.titech.ac.jp/megadock. akiyama@cs.titech.ac.jp Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Hua, H.; Manipon, G.; Starch, M.
2017-12-01
NASA's upcoming missions are expected to be generating data volumes at least an order of magnitude larger than current missions. A significant increase in data processing, data rates, data volumes, and long-term data archive capabilities are needed. Consequently, new challenges are emerging that impact traditional data and software management approaches. At large-scales, next generation science data systems are exploring the move onto cloud computing paradigms to support these increased needs. New implications such as costs, data movement, collocation of data systems & archives, and moving processing closer to the data, may result in changes to the stewardship, preservation, and provenance of science data and software. With more science data systems being on-boarding onto cloud computing facilities, we can expect more Earth science data records to be both generated and kept in the cloud. But at large scales, the cost of processing and storing global data may impact architectural and system designs. Data systems will trade the cost of keeping data in the cloud with the data life-cycle approaches of moving "colder" data back to traditional on-premise facilities. How will this impact data citation and processing software stewardship? What are the impacts of cloud-based on-demand processing and its affect on reproducibility and provenance. Similarly, with more science processing software being moved onto cloud, virtual machines, and container based approaches, more opportunities arise for improved stewardship and preservation. But will the science community trust data reprocessed years or decades later? We will also explore emerging questions of the stewardship of the science data system software that is generating the science data records both during and after the life of mission.
Enterprise System Implementations: Lessons from the Trenches.
ERIC Educational Resources Information Center
McCredie, Jack; Updegrove, Dan
1999-01-01
Offers 20 practical guidelines for colleges and universities embarking on large-scale administrative software initiatives or enterprise resource planning systems. Guidelines are based on the extensive experience of the authors and a series of panel discussions at 1999 professional meetings. (DB)
Globus | Informatics Technology for Cancer Research (ITCR)
Globus software services provide secure cancer research data transfer, synchronization, and sharing in distributed environments at large scale. These services can be integrated into applications and research data gateways, leveraging Globus identity management, single sign-on, search, and authorization capabilities. Globus Genomics integrates Globus with the Galaxy genomics workflow engine and Amazon Web Services to enable cancer genomics analysis that can elastically scale compute resources with demand.
CmapTools: A Software Environment for Knowledge Modeling and Sharing
NASA Technical Reports Server (NTRS)
Canas, Alberto J.
2004-01-01
In an ongoing collaborative effort between a group of NASA Ames scientists and researchers at the Institute for Human and Machine Cognition (IHMC) of the University of West Florida, a new version of CmapTools has been developed that enable scientists to construct knowledge models of their domain of expertise, share them with other scientists, make them available to anybody on the Internet with access to a Web browser, and peer-review other scientists models. These software tools have been successfully used at NASA to build a large-scale multimedia on Mars and in knowledge model on Habitability Assessment. The new version of the software places emphasis on greater usability for experts constructing their own knowledge models, and support for the creation of large knowledge models with large number of supporting resources in the forms of images, videos, web pages, and other media. Additionally, the software currently allows scientists to cooperate with each other in the construction, sharing and criticizing of knowledge models. Scientists collaborating from remote distances, for example researchers at the Astrobiology Institute, can concurrently manipulate the knowledge models they are viewing without having to do this at a special videoconferencing facility.
Identification and evaluation of software measures
NASA Technical Reports Server (NTRS)
Card, D. N.
1981-01-01
A large scale, systematic procedure for identifying and evaluating measures that meaningfully characterize one or more elements of software development is described. The background of this research, the nature of the data involved, and the steps of the analytic procedure are discussed. An example of the application of this procedure to data from real software development projects is presented. As the term is used here, a measure is a count or numerical rating of the occurrence of some property. Examples of measures include lines of code, number of computer runs, person hours expended, and degree of use of top down design methodology. Measures appeal to the researcher and the manager as a potential means of defining, explaining, and predicting software development qualities, especially productivity and reliability.
NASA Astrophysics Data System (ADS)
Leng, Bi-Bin; Gong, Jian; Zhang, Wen-bo; Ji, Xue-Qiang
2017-11-01
According to the environmental pollution caused by large-scale pig breeding, the SPSS statistical software and factor analysis method were used to calculate the environmental pollution bearing index of China’s breeding scale from 2006 to 2015. The results showed that with the increase of scale the density of live pig farming and the amount of fertilizer application in agricultural production increased. However, due to the improvement of national environmental awareness, industrial waste water discharge is greatly reduced. China's hog farming environmental pollution load index is rising.
Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing.
Zhao, Shanrong; Prenger, Kurt; Smith, Lance; Messina, Thomas; Fan, Hongtao; Jaeger, Edward; Stephens, Susan
2013-06-27
Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from large-scale WGS projects. The data storage and CPU resources that are required for large-scale whole genome sequencing data analyses are too large for many core facilities and individual laboratories to provide. To help meet these challenges, we have developed Rainbow, a cloud-based software package that can assist in the automation of large-scale WGS data analyses. Here, we evaluated the performance of Rainbow by analyzing 44 different whole-genome-sequenced subjects. Rainbow has the capacity to process genomic data from more than 500 subjects in two weeks using cloud computing provided by the Amazon Web Service. The time includes the import and export of the data using Amazon Import/Export service. The average cost of processing a single sample in the cloud was less than 120 US dollars. Compared with Crossbow, the main improvements incorporated into Rainbow include the ability: (1) to handle BAM as well as FASTQ input files; (2) to split large sequence files for better load balance downstream; (3) to log the running metrics in data processing and monitoring multiple Amazon Elastic Compute Cloud (EC2) instances; and (4) to merge SOAPsnp outputs for multiple individuals into a single file to facilitate downstream genome-wide association studies. Rainbow is a scalable, cost-effective, and open-source tool for large-scale WGS data analysis. For human WGS data sequenced by either the Illumina HiSeq 2000 or HiSeq 2500 platforms, Rainbow can be used straight out of the box. Rainbow is available for third-party implementation and use, and can be downloaded from http://s3.amazonaws.com/jnj_rainbow/index.html.
Rainbow: a tool for large-scale whole-genome sequencing data analysis using cloud computing
2013-01-01
Background Technical improvements have decreased sequencing costs and, as a result, the size and number of genomic datasets have increased rapidly. Because of the lower cost, large amounts of sequence data are now being produced by small to midsize research groups. Crossbow is a software tool that can detect single nucleotide polymorphisms (SNPs) in whole-genome sequencing (WGS) data from a single subject; however, Crossbow has a number of limitations when applied to multiple subjects from large-scale WGS projects. The data storage and CPU resources that are required for large-scale whole genome sequencing data analyses are too large for many core facilities and individual laboratories to provide. To help meet these challenges, we have developed Rainbow, a cloud-based software package that can assist in the automation of large-scale WGS data analyses. Results Here, we evaluated the performance of Rainbow by analyzing 44 different whole-genome-sequenced subjects. Rainbow has the capacity to process genomic data from more than 500 subjects in two weeks using cloud computing provided by the Amazon Web Service. The time includes the import and export of the data using Amazon Import/Export service. The average cost of processing a single sample in the cloud was less than 120 US dollars. Compared with Crossbow, the main improvements incorporated into Rainbow include the ability: (1) to handle BAM as well as FASTQ input files; (2) to split large sequence files for better load balance downstream; (3) to log the running metrics in data processing and monitoring multiple Amazon Elastic Compute Cloud (EC2) instances; and (4) to merge SOAPsnp outputs for multiple individuals into a single file to facilitate downstream genome-wide association studies. Conclusions Rainbow is a scalable, cost-effective, and open-source tool for large-scale WGS data analysis. For human WGS data sequenced by either the Illumina HiSeq 2000 or HiSeq 2500 platforms, Rainbow can be used straight out of the box. Rainbow is available for third-party implementation and use, and can be downloaded from http://s3.amazonaws.com/jnj_rainbow/index.html. PMID:23802613
Optimizing Voltage Control for Large-Scale Solar - Text version | Energy
system software in the loop during simulations, both for long term studies, as well as in some hardware in the loop studies, where we had actual devices interacting with this. So this is a fairly unique
NASA Technical Reports Server (NTRS)
Chien, Steve A.; Tran, Daniel Q.; Rabideau, Gregg R.; Schaffer, Steven R.
2011-01-01
Software has been designed to schedule remote sensing with the Earth Observing One spacecraft. The software attempts to satisfy as many observation requests as possible considering each against spacecraft operation constraints such as data volume, thermal, pointing maneuvers, and others. More complex constraints such as temperature are approximated to enable efficient reasoning while keeping the spacecraft within safe limits. Other constraints are checked using an external software library. For example, an attitude control library is used to determine the feasibility of maneuvering between pairs of observations. This innovation can deal with a wide range of spacecraft constraints and solve large scale scheduling problems like hundreds of observations and thousands of combinations of observation sequences.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Evans, David Edward
A description of the development of the mc_runjob software package used to manage large scale computing tasks for the D0 Experiment at Fermilab is presented, along with a review of the Digital Front End Trigger electronics and the software used to control them. A tracking study is performed on detector data to determine that the D0 Experiment can detect charged B mesons, and that these results are in accordance with current results. B mesons are found by searching for the decay channel B ± → J / Ψ K ± .
NASA Technical Reports Server (NTRS)
Avizienis, A.; Gunningberg, P.; Kelly, J. P. J.; Strigini, L.; Traverse, P. J.; Tso, K. S.; Voges, U.
1986-01-01
To establish a long-term research facility for experimental investigations of design diversity as a means of achieving fault-tolerant systems, a distributed testbed for multiple-version software was designed. It is part of a local network, which utilizes the Locus distributed operating system to operate a set of 20 VAX 11/750 computers. It is used in experiments to measure the efficacy of design diversity and to investigate reliability increases under large-scale, controlled experimental conditions.
Characterizing and Assessing a Large-Scale Software Maintenance Organization
NASA Technical Reports Server (NTRS)
Briand, Lionel; Melo, Walcelio; Seaman, Carolyn; Basili, Victor
1995-01-01
One important component of a software process is the organizational context in which the process is enacted. This component is often missing or incomplete in current process modeling approaches. One technique for modeling this perspective is the Actor-Dependency (AD) Model. This paper reports on a case study which used this approach to analyze and assess a large software maintenance organization. Our goal was to identify the approach's strengths and weaknesses while providing practical recommendations for improvement and research directions. The AD model was found to be very useful in capturing the important properties of the organizational context of the maintenance process, and aided in the understanding of the flaws found in this process. However, a number of opportunities for extending and improving the AD model were identified. Among others, there is a need to incorporate quantitative information to complement the qualitative model.
The ATLAS Simulation Infrastructure
Aad, G.; Abbott, B.; Abdallah, J.; ...
2010-09-25
The simulation software for the ATLAS Experiment at the Large Hadron Collider is being used for large-scale production of events on the LHC Computing Grid. This simulation requires many components, from the generators that simulate particle collisions, through packages simulating the response of the various detectors and triggers. All of these components come together under the ATLAS simulation infrastructure. In this paper, that infrastructure is discussed, including that supporting the detector description, interfacing the event generation, and combining the GEANT4 simulation of the response of the individual detectors. Also described are the tools allowing the software validation, performance testing, andmore » the validation of the simulated output against known physics processes.« less
Final Report: Enabling Exascale Hardware and Software Design through Scalable System Virtualization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bridges, Patrick G.
2015-02-01
In this grant, we enhanced the Palacios virtual machine monitor to increase its scalability and suitability for addressing exascale system software design issues. This included a wide range of research on core Palacios features, large-scale system emulation, fault injection, perfomrance monitoring, and VMM extensibility. This research resulted in large number of high-impact publications in well-known venues, the support of a number of students, and the graduation of two Ph.D. students and one M.S. student. In addition, our enhanced version of the Palacios virtual machine monitor has been adopted as a core element of the Hobbes operating system under active DOE-fundedmore » research and development.« less
ALFA: The new ALICE-FAIR software framework
NASA Astrophysics Data System (ADS)
Al-Turany, M.; Buncic, P.; Hristov, P.; Kollegger, T.; Kouzinopoulos, C.; Lebedev, A.; Lindenstruth, V.; Manafov, A.; Richter, M.; Rybalchenko, A.; Vande Vyvre, P.; Winckler, N.
2015-12-01
The commonalities between the ALICE and FAIR experiments and their computing requirements led to the development of large parts of a common software framework in an experiment independent way. The FairRoot project has already shown the feasibility of such an approach for the FAIR experiments and extending it beyond FAIR to experiments at other facilities[1, 2]. The ALFA framework is a joint development between ALICE Online- Offline (O2) and FairRoot teams. ALFA is designed as a flexible, elastic system, which balances reliability and ease of development with performance using multi-processing and multithreading. A message- based approach has been adopted; such an approach will support the use of the software on different hardware platforms, including heterogeneous systems. Each process in ALFA assumes limited communication and reliance on other processes. Such a design will add horizontal scaling (multiple processes) to vertical scaling provided by multiple threads to meet computing and throughput demands. ALFA does not dictate any application protocols. Potentially, any content-based processor or any source can change the application protocol. The framework supports different serialization standards for data exchange between different hardware and software languages.
ERIC Educational Resources Information Center
Shim, Eunjae; Shim, Minsuk K.; Felner, Robert D.
Automation of the survey process has proved successful in many industries, yet it is still underused in educational research. This is largely due to the facts (1) that number crunching is usually carried out using software that was developed before information technology existed, and (2) that the educational research is to a great extent trapped…
Information Power Grid Posters
NASA Technical Reports Server (NTRS)
Vaziri, Arsi
2003-01-01
This document is a summary of the accomplishments of the Information Power Grid (IPG). Grids are an emerging technology that provide seamless and uniform access to the geographically dispersed, computational, data storage, networking, instruments, and software resources needed for solving large-scale scientific and engineering problems. The goal of the NASA IPG is to use NASA's remotely located computing and data system resources to build distributed systems that can address problems that are too large or complex for a single site. The accomplishments outlined in this poster presentation are: access to distributed data, IPG heterogeneous computing, integration of large-scale computing node into distributed environment, remote access to high data rate instruments,and exploratory grid environment.
A Networked Sensor System for the Analysis of Plot-Scale Hydrology.
Villalba, German; Plaza, Fernando; Zhong, Xiaoyang; Davis, Tyler W; Navarro, Miguel; Li, Yimei; Slater, Thomas A; Liang, Yao; Liang, Xu
2017-03-20
This study presents the latest updates to the Audubon Society of Western Pennsylvania (ASWP) testbed, a $50,000 USD, 104-node outdoor multi-hop wireless sensor network (WSN). The network collects environmental data from over 240 sensors, including the EC-5, MPS-1 and MPS-2 soil moisture and soil water potential sensors and self-made sap flow sensors, across a heterogeneous deployment comprised of MICAz, IRIS and TelosB wireless motes. A low-cost sensor board and software driver was developed for communicating with the analog and digital sensors. Innovative techniques (e.g., balanced energy efficient routing and heterogeneous over-the-air mote reprogramming) maintained high success rates (>96%) and enabled effective software updating, throughout the large-scale heterogeneous WSN. The edaphic properties monitored by the network showed strong agreement with data logger measurements and were fitted to pedotransfer functions for estimating local soil hydraulic properties. Furthermore, sap flow measurements, scaled to tree stand transpiration, were found to be at or below potential evapotranspiration estimates. While outdoor WSNs still present numerous challenges, the ASWP testbed proves to be an effective and (relatively) low-cost environmental monitoring solution and represents a step towards developing a platform for monitoring and quantifying statistically relevant environmental parameters from large-scale network deployments.
A Networked Sensor System for the Analysis of Plot-Scale Hydrology
Villalba, German; Plaza, Fernando; Zhong, Xiaoyang; Davis, Tyler W.; Navarro, Miguel; Li, Yimei; Slater, Thomas A.; Liang, Yao; Liang, Xu
2017-01-01
This study presents the latest updates to the Audubon Society of Western Pennsylvania (ASWP) testbed, a $50,000 USD, 104-node outdoor multi-hop wireless sensor network (WSN). The network collects environmental data from over 240 sensors, including the EC-5, MPS-1 and MPS-2 soil moisture and soil water potential sensors and self-made sap flow sensors, across a heterogeneous deployment comprised of MICAz, IRIS and TelosB wireless motes. A low-cost sensor board and software driver was developed for communicating with the analog and digital sensors. Innovative techniques (e.g., balanced energy efficient routing and heterogeneous over-the-air mote reprogramming) maintained high success rates (>96%) and enabled effective software updating, throughout the large-scale heterogeneous WSN. The edaphic properties monitored by the network showed strong agreement with data logger measurements and were fitted to pedotransfer functions for estimating local soil hydraulic properties. Furthermore, sap flow measurements, scaled to tree stand transpiration, were found to be at or below potential evapotranspiration estimates. While outdoor WSNs still present numerous challenges, the ASWP testbed proves to be an effective and (relatively) low-cost environmental monitoring solution and represents a step towards developing a platform for monitoring and quantifying statistically relevant environmental parameters from large-scale network deployments. PMID:28335534
NASA Technical Reports Server (NTRS)
Greene, P. H.
1972-01-01
Both in practical engineering and in control of muscular systems, low level subsystems automatically provide crude approximations to the proper response. Through low level tuning of these approximations, the proper response variant can emerge from standardized high level commands. Such systems are expressly suited to emerging large scale integrated circuit technology. A computer, using symbolic descriptions of subsystem responses, can select and shape responses of low level digital or analog microcircuits. A mathematical theory that reveals significant informational units in this style of control and software for realizing such information structures are formulated.
Large-Scale Wireless Temperature Monitoring System for Liquefied Petroleum Gas Storage Tanks.
Fan, Guangwen; Shen, Yu; Hao, Xiaowei; Yuan, Zongming; Zhou, Zhi
2015-09-18
Temperature distribution is a critical indicator of the health condition for Liquefied Petroleum Gas (LPG) storage tanks. In this paper, we present a large-scale wireless temperature monitoring system to evaluate the safety of LPG storage tanks. The system includes wireless sensors networks, high temperature fiber-optic sensors, and monitoring software. Finally, a case study on real-world LPG storage tanks proves the feasibility of the system. The unique features of wireless transmission, automatic data acquisition and management, local and remote access make the developed system a good alternative for temperature monitoring of LPG storage tanks in practical applications.
Ry Horsey Photo of Henry Horsey Ry Horsey Software Developer - Commercial Buildings Energy Modeling the field of commercial building energy modeling. He is particularly interested in the increasing tools to support large-scale commercial building energy modeling. This work has led to contributions to
Static analysis techniques for semiautomatic synthesis of message passing software skeletons
Sottile, Matthew; Dagit, Jason; Zhang, Deli; ...
2015-06-29
The design of high-performance computing architectures demands performance analysis of large-scale parallel applications to derive various parameters concerning hardware design and software development. The process of performance analysis and benchmarking an application can be done in several ways with varying degrees of fidelity. One of the most cost-effective ways is to do a coarse-grained study of large-scale parallel applications through the use of program skeletons. The concept of a “program skeleton” that we discuss in this article is an abstracted program that is derived from a larger program where source code that is determined to be irrelevant is removed formore » the purposes of the skeleton. In this work, we develop a semiautomatic approach for extracting program skeletons based on compiler program analysis. Finally, we demonstrate correctness of our skeleton extraction process by comparing details from communication traces, as well as show the performance speedup of using skeletons by running simulations in the SST/macro simulator.« less
Hiraki, Masahiko; Kato, Ryuichi; Nagai, Minoru; Satoh, Tadashi; Hirano, Satoshi; Ihara, Kentaro; Kudo, Norio; Nagae, Masamichi; Kobayashi, Masanori; Inoue, Michio; Uejima, Tamami; Oda, Shunichiro; Chavas, Leonard M G; Akutsu, Masato; Yamada, Yusuke; Kawasaki, Masato; Matsugaki, Naohiro; Igarashi, Noriyuki; Suzuki, Mamoru; Wakatsuki, Soichi
2006-09-01
Protein crystallization remains one of the bottlenecks in crystallographic analysis of macromolecules. An automated large-scale protein-crystallization system named PXS has been developed consisting of the following subsystems, which proceed in parallel under unified control software: dispensing precipitants and protein solutions, sealing crystallization plates, carrying robot, incubators, observation system and image-storage server. A sitting-drop crystallization plate specialized for PXS has also been designed and developed. PXS can set up 7680 drops for vapour diffusion per hour, which includes time for replenishing supplies such as disposable tips and crystallization plates. Images of the crystallization drops are automatically recorded according to a preprogrammed schedule and can be viewed by users remotely using web-based browser software. A number of protein crystals were successfully produced and several protein structures could be determined directly from crystals grown by PXS. In other cases, X-ray quality crystals were obtained by further optimization by manual screening based on the conditions found by PXS.
DOE Office of Scientific and Technical Information (OSTI.GOV)
2014-08-21
Recent advancements in technology scaling have shown a trend towards greater integration with large-scale chips containing thousands of processors connected to memories and other I/O devices using non-trivial network topologies. Software simulation proves insufficient to study the tradeoffs in such complex systems due to slow execution time, whereas hardware RTL development is too time-consuming. We present OpenSoC Fabric, an on-chip network generation infrastructure which aims to provide a parameterizable and powerful on-chip network generator for evaluating future high performance computing architectures based on SoC technology. OpenSoC Fabric leverages a new hardware DSL, Chisel, which contains powerful abstractions provided by itsmore » base language, Scala, and generates both software (C++) and hardware (Verilog) models from a single code base. The OpenSoC Fabric2 infrastructure is modeled after existing state-of-the-art simulators, offers large and powerful collections of configuration options, and follows object-oriented design and functional programming to make functionality extension as easy as possible.« less
Requirements: Towards an understanding on why software projects fail
NASA Astrophysics Data System (ADS)
Hussain, Azham; Mkpojiogu, Emmanuel O. C.
2016-08-01
Requirement engineering is at the foundation of every successful software project. There are many reasons for software project failures; however, poorly engineered requirements process contributes immensely to the reason why software projects fail. Software project failure is usually costly and risky and could also be life threatening. Projects that undermine requirements engineering suffer or are likely to suffer from failures, challenges and other attending risks. The cost of project failures and overruns when estimated is very huge. Furthermore, software project failures or overruns pose a challenge in today's competitive market environment. It affects the company's image, goodwill, and revenue drive and decreases the perceived satisfaction of customers and clients. In this paper, requirements engineering was discussed. Its role in software projects success was elaborated. The place of software requirements process in relation to software project failure was explored and examined. Also, project success and failure factors were also discussed with emphasis placed on requirements factors as they play a major role in software projects' challenges, successes and failures. The paper relied on secondary data and empirical statistics to explore and examine factors responsible for the successes, challenges and failures of software projects in large, medium and small scaled software companies.
Averting Denver Airports on a Chip
NASA Technical Reports Server (NTRS)
Sullivan, Kevin J.
1995-01-01
As a result of recent advances in software engineering capabilities, we are now in a more stable environment. De-facto hardware and software standards are emerging. Work on software architecture and design patterns signals a consensus on the importance of early system-level design decisions, and agreements on the uses of certain paradigmatic software structures. We now routinely build systems that would have been risky or infeasible a few years ago. Unfortunately, technological developments threaten to destabilize software design again. Systems designed around novel computing and peripheral devices will spark ambitious new projects that will stress current software design and engineering capabilities. Micro-electro-mechanical systems (MEMS) and related technologies provide the physical basis for new systems with the potential to produce this kind of destabilizing effect. One important response to anticipated software engineering and design difficulties is carefully directed engineering-scientific research. Two specific problems meriting substantial research attention are: A lack of sufficient means to build software systems by generating, extending, specializing, and integrating large-scale reusable components; and a lack of adequate computational and analytic tools to extend and aid engineers in maintaining intellectual control over complex software designs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Terwilliger, Thomas C., E-mail: terwilliger@lanl.gov; Bricogne, Gerard, E-mail: terwilliger@lanl.gov; Los Alamos National Laboratory, Mail Stop M888, Los Alamos, NM 87507
Macromolecular structures deposited in the PDB can and should be continually reinterpreted and improved on the basis of their accompanying experimental X-ray data, exploiting the steady progress in methods and software that the deposition of such data into the PDB on a massive scale has made possible. Accurate crystal structures of macromolecules are of high importance in the biological and biomedical fields. Models of crystal structures in the Protein Data Bank (PDB) are in general of very high quality as deposited. However, methods for obtaining the best model of a macromolecular structure from a given set of experimental X-ray datamore » continue to progress at a rapid pace, making it possible to improve most PDB entries after their deposition by re-analyzing the original deposited data with more recent software. This possibility represents a very significant departure from the situation that prevailed when the PDB was created, when it was envisioned as a cumulative repository of static contents. A radical paradigm shift for the PDB is therefore proposed, away from the static archive model towards a much more dynamic body of continuously improving results in symbiosis with continuously improving methods and software. These simultaneous improvements in methods and final results are made possible by the current deposition of processed crystallographic data (structure-factor amplitudes) and will be supported further by the deposition of raw data (diffraction images). It is argued that it is both desirable and feasible to carry out small-scale and large-scale efforts to make this paradigm shift a reality. Small-scale efforts would focus on optimizing structures that are of interest to specific investigators. Large-scale efforts would undertake a systematic re-optimization of all of the structures in the PDB, or alternatively the redetermination of groups of structures that are either related to or focused on specific questions. All of the resulting structures should be made generally available, along with the precursor entries, with various views of the structures being made available depending on the types of questions that users are interested in answering.« less
NASA Astrophysics Data System (ADS)
Evans, Ben; Allen, Chris; Antony, Joseph; Bastrakova, Irina; Gohar, Kashif; Porter, David; Pugh, Tim; Santana, Fabiana; Smillie, Jon; Trenham, Claire; Wang, Jingbo; Wyborn, Lesley
2015-04-01
The National Computational Infrastructure (NCI) has established a powerful and flexible in-situ petascale computational environment to enable both high performance computing and Data-intensive Science across a wide spectrum of national environmental and earth science data collections - in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that supports the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress so far to harmonise the underlying data collections for future interdisciplinary research across these large volume data collections. NCI has established 10+ PBytes of major national and international data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the major Australian national-scale scientific collections), leading research communities, and collaborating overseas organisations. New infrastructures created at NCI mean the data collections are now accessible within an integrated High Performance Computing and Data (HPC-HPD) environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large-scale high-bandwidth Lustre filesystems. The hardware was designed at inception to ensure that it would allow the layered software environment to flexibly accommodate the advancement of future data science. New approaches to software technology and data models have also had to be developed to enable access to these large and exponentially increasing data volumes at NCI. Traditional HPC and data environments are still made available in a way that flexibly provides the tools, services and supporting software systems on these new petascale infrastructures. But to enable the research to take place at this scale, the data, metadata and software now need to evolve together - creating a new integrated high performance infrastructure. The new infrastructure at NCI currently supports a catalogue of integrated, reusable software and workflows from earth system and ecosystem modelling, weather research, satellite and other observed data processing and analysis. One of the challenges for NCI has been to support existing techniques and methods, while carefully preparing the underlying infrastructure for the transition needed for the next class of Data-intensive Science. In doing so, a flexible range of techniques and software can be made available for application across the corpus of data collections available, and to provide a new infrastructure for future interdisciplinary research.
GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit
Pronk, Sander; Páll, Szilárd; Schulz, Roland; Larsson, Per; Bjelkmar, Pär; Apostolov, Rossen; Shirts, Michael R.; Smith, Jeremy C.; Kasson, Peter M.; van der Spoel, David; Hess, Berk; Lindahl, Erik
2013-01-01
Motivation: Molecular simulation has historically been a low-throughput technique, but faster computers and increasing amounts of genomic and structural data are changing this by enabling large-scale automated simulation of, for instance, many conformers or mutants of biomolecules with or without a range of ligands. At the same time, advances in performance and scaling now make it possible to model complex biomolecular interaction and function in a manner directly testable by experiment. These applications share a need for fast and efficient software that can be deployed on massive scale in clusters, web servers, distributed computing or cloud resources. Results: Here, we present a range of new simulation algorithms and features developed during the past 4 years, leading up to the GROMACS 4.5 software package. The software now automatically handles wide classes of biomolecules, such as proteins, nucleic acids and lipids, and comes with all commonly used force fields for these molecules built-in. GROMACS supports several implicit solvent models, as well as new free-energy algorithms, and the software now uses multithreading for efficient parallelization even on low-end systems, including windows-based workstations. Together with hand-tuned assembly kernels and state-of-the-art parallelization, this provides extremely high performance and cost efficiency for high-throughput as well as massively parallel simulations. Availability: GROMACS is an open source and free software available from http://www.gromacs.org. Contact: erik.lindahl@scilifelab.se Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23407358
Applications of large-scale density functional theory in biology
NASA Astrophysics Data System (ADS)
Cole, Daniel J.; Hine, Nicholas D. M.
2016-10-01
Density functional theory (DFT) has become a routine tool for the computation of electronic structure in the physics, materials and chemistry fields. Yet the application of traditional DFT to problems in the biological sciences is hindered, to a large extent, by the unfavourable scaling of the computational effort with system size. Here, we review some of the major software and functionality advances that enable insightful electronic structure calculations to be performed on systems comprising many thousands of atoms. We describe some of the early applications of large-scale DFT to the computation of the electronic properties and structure of biomolecules, as well as to paradigmatic problems in enzymology, metalloproteins, photosynthesis and computer-aided drug design. With this review, we hope to demonstrate that first principles modelling of biological structure-function relationships are approaching a reality.
Autonomous smart sensor network for full-scale structural health monitoring
NASA Astrophysics Data System (ADS)
Rice, Jennifer A.; Mechitov, Kirill A.; Spencer, B. F., Jr.; Agha, Gul A.
2010-04-01
The demands of aging infrastructure require effective methods for structural monitoring and maintenance. Wireless smart sensor networks offer the ability to enhance structural health monitoring (SHM) practices through the utilization of onboard computation to achieve distributed data management. Such an approach is scalable to the large number of sensor nodes required for high-fidelity modal analysis and damage detection. While smart sensor technology is not new, the number of full-scale SHM applications has been limited. This slow progress is due, in part, to the complex network management issues that arise when moving from a laboratory setting to a full-scale monitoring implementation. This paper presents flexible network management software that enables continuous and autonomous operation of wireless smart sensor networks for full-scale SHM applications. The software components combine sleep/wake cycling for enhanced power management with threshold detection for triggering network wide tasks, such as synchronized sensing or decentralized modal analysis, during periods of critical structural response.
ERIC Educational Resources Information Center
Arasasingham, Ramesh D.; Taagepera, Mare; Potter, Frank; Martorell, Ingrid; Lonjers, Stacy
2005-01-01
Student achievement in web-based learning tools is assessed by using in-class examination, pretests, and posttests. The study reveals that using mastering chemistry web software in large-scale instruction provides an overall benefit to introductory chemistry students.
Wang, Tongli; Hamann, Andreas; Spittlehouse, Dave; Carroll, Carlos
2016-01-01
Large volumes of gridded climate data have become available in recent years including interpolated historical data from weather stations and future predictions from general circulation models. These datasets, however, are at various spatial resolutions that need to be converted to scales meaningful for applications such as climate change risk and impact assessments or sample-based ecological research. Extracting climate data for specific locations from large datasets is not a trivial task and typically requires advanced GIS and data management skills. In this study, we developed a software package, ClimateNA, that facilitates this task and provides a user-friendly interface suitable for resource managers and decision makers as well as scientists. The software locally downscales historical and future monthly climate data layers into scale-free point estimates of climate values for the entire North American continent. The software also calculates a large number of biologically relevant climate variables that are usually derived from daily weather data. ClimateNA covers 1) 104 years of historical data (1901–2014) in monthly, annual, decadal and 30-year time steps; 2) three paleoclimatic periods (Last Glacial Maximum, Mid Holocene and Last Millennium); 3) three future periods (2020s, 2050s and 2080s); and 4) annual time-series of model projections for 2011–2100. Multiple general circulation models (GCMs) were included for both paleo and future periods, and two representative concentration pathways (RCP4.5 and 8.5) were chosen for future climate data. PMID:27275583
Wang, Tongli; Hamann, Andreas; Spittlehouse, Dave; Carroll, Carlos
2016-01-01
Large volumes of gridded climate data have become available in recent years including interpolated historical data from weather stations and future predictions from general circulation models. These datasets, however, are at various spatial resolutions that need to be converted to scales meaningful for applications such as climate change risk and impact assessments or sample-based ecological research. Extracting climate data for specific locations from large datasets is not a trivial task and typically requires advanced GIS and data management skills. In this study, we developed a software package, ClimateNA, that facilitates this task and provides a user-friendly interface suitable for resource managers and decision makers as well as scientists. The software locally downscales historical and future monthly climate data layers into scale-free point estimates of climate values for the entire North American continent. The software also calculates a large number of biologically relevant climate variables that are usually derived from daily weather data. ClimateNA covers 1) 104 years of historical data (1901-2014) in monthly, annual, decadal and 30-year time steps; 2) three paleoclimatic periods (Last Glacial Maximum, Mid Holocene and Last Millennium); 3) three future periods (2020s, 2050s and 2080s); and 4) annual time-series of model projections for 2011-2100. Multiple general circulation models (GCMs) were included for both paleo and future periods, and two representative concentration pathways (RCP4.5 and 8.5) were chosen for future climate data.
An interactive display system for large-scale 3D models
NASA Astrophysics Data System (ADS)
Liu, Zijian; Sun, Kun; Tao, Wenbing; Liu, Liman
2018-04-01
With the improvement of 3D reconstruction theory and the rapid development of computer hardware technology, the reconstructed 3D models are enlarging in scale and increasing in complexity. Models with tens of thousands of 3D points or triangular meshes are common in practical applications. Due to storage and computing power limitation, it is difficult to achieve real-time display and interaction with large scale 3D models for some common 3D display software, such as MeshLab. In this paper, we propose a display system for large-scale 3D scene models. We construct the LOD (Levels of Detail) model of the reconstructed 3D scene in advance, and then use an out-of-core view-dependent multi-resolution rendering scheme to realize the real-time display of the large-scale 3D model. With the proposed method, our display system is able to render in real time while roaming in the reconstructed scene and 3D camera poses can also be displayed. Furthermore, the memory consumption can be significantly decreased via internal and external memory exchange mechanism, so that it is possible to display a large scale reconstructed scene with over millions of 3D points or triangular meshes in a regular PC with only 4GB RAM.
ERIC Educational Resources Information Center
Cor, Ken; Alves, Cecilia; Gierl, Mark J.
2008-01-01
This review describes and evaluates a software add-in created by Frontline Systems, Inc., that can be used with Microsoft Excel 2007 to solve large, complex test assembly problems. The combination of Microsoft Excel 2007 with the Frontline Systems Premium Solver Platform is significant because Microsoft Excel is the most commonly used spreadsheet…
A Validation Framework for the Long Term Preservation of High Energy Physics Data
NASA Astrophysics Data System (ADS)
Ozerov, Dmitri; South, David M.
2014-06-01
The study group on data preservation in high energy physics, DPHEP, is moving to a new collaboration structure, which will focus on the implementation of preservation projects, such as those described in the group's large scale report published in 2012. One such project is the development of a validation framework, which checks the compatibility of evolving computing environments and technologies with the experiments software for as long as possible, with the aim of substantially extending the lifetime of the analysis software, and hence of the usability of the data. The framework is designed to automatically test and validate the software and data of an experiment against changes and upgrades to the computing environment, as well as changes to the experiment software itself. Technically, this is realised using a framework capable of hosting a number of virtual machine images, built with different configurations of operating systems and the relevant software, including any necessary external dependencies.
A bridge role metric model for nodes in software networks.
Li, Bo; Feng, Yanli; Ge, Shiyu; Li, Dashe
2014-01-01
A bridge role metric model is put forward in this paper. Compared with previous metric models, our solution of a large-scale object-oriented software system as a complex network is inherently more realistic. To acquire nodes and links in an undirected network, a new model that presents the crucial connectivity of a module or the hub instead of only centrality as in previous metric models is presented. Two previous metric models are described for comparison. In addition, it is obvious that the fitting curve between the Bre results and degrees can well be fitted by a power law. The model represents many realistic characteristics of actual software structures, and a hydropower simulation system is taken as an example. This paper makes additional contributions to an accurate understanding of module design of software systems and is expected to be beneficial to software engineering practices.
A Bridge Role Metric Model for Nodes in Software Networks
Li, Bo; Feng, Yanli; Ge, Shiyu; Li, Dashe
2014-01-01
A bridge role metric model is put forward in this paper. Compared with previous metric models, our solution of a large-scale object-oriented software system as a complex network is inherently more realistic. To acquire nodes and links in an undirected network, a new model that presents the crucial connectivity of a module or the hub instead of only centrality as in previous metric models is presented. Two previous metric models are described for comparison. In addition, it is obvious that the fitting curve between the results and degrees can well be fitted by a power law. The model represents many realistic characteristics of actual software structures, and a hydropower simulation system is taken as an example. This paper makes additional contributions to an accurate understanding of module design of software systems and is expected to be beneficial to software engineering practices. PMID:25364938
Large-scale virtual screening on public cloud resources with Apache Spark.
Capuccini, Marco; Ahmed, Laeeq; Schaal, Wesley; Laure, Erwin; Spjuth, Ola
2017-01-01
Structure-based virtual screening is an in-silico method to screen a target receptor against a virtual molecular library. Applying docking-based screening to large molecular libraries can be computationally expensive, however it constitutes a trivially parallelizable task. Most of the available parallel implementations are based on message passing interface, relying on low failure rate hardware and fast network connection. Google's MapReduce revolutionized large-scale analysis, enabling the processing of massive datasets on commodity hardware and cloud resources, providing transparent scalability and fault tolerance at the software level. Open source implementations of MapReduce include Apache Hadoop and the more recent Apache Spark. We developed a method to run existing docking-based screening software on distributed cloud resources, utilizing the MapReduce approach. We benchmarked our method, which is implemented in Apache Spark, docking a publicly available target receptor against [Formula: see text]2.2 M compounds. The performance experiments show a good parallel efficiency (87%) when running in a public cloud environment. Our method enables parallel Structure-based virtual screening on public cloud resources or commodity computer clusters. The degree of scalability that we achieve allows for trying out our method on relatively small libraries first and then to scale to larger libraries. Our implementation is named Spark-VS and it is freely available as open source from GitHub (https://github.com/mcapuccini/spark-vs).Graphical abstract.
NASA Technical Reports Server (NTRS)
Zhang, Zhong
1997-01-01
The development of large-scale, composite software in a geographically distributed environment is an evolutionary process. Often, in such evolving systems, striving for consistency is complicated by many factors, because development participants have various locations, skills, responsibilities, roles, opinions, languages, terminology and different degrees of abstraction they employ. This naturally leads to many partial specifications or viewpoints. These multiple views on the system being developed usually overlap. From another aspect, these multiple views give rise to the potential for inconsistency. Existing CASE tools do not efficiently manage inconsistencies in distributed development environment for a large-scale project. Based on the ViewPoints framework the WHERE (Web-Based Hypertext Environment for requirements Evolution) toolkit aims to tackle inconsistency management issues within geographically distributed software development projects. Consequently, WHERE project helps make more robust software and support software assurance process. The long term goal of WHERE tools aims to the inconsistency analysis and management in requirements specifications. A framework based on Graph Grammar theory and TCMJAVA toolkit is proposed to detect inconsistencies among viewpoints. This systematic approach uses three basic operations (UNION, DIFFERENCE, INTERSECTION) to study the static behaviors of graphic and tabular notations. From these operations, subgraphs Query, Selection, Merge, Replacement operations can be derived. This approach uses graph PRODUCTIONS (rewriting rules) to study the dynamic transformations of graphs. We discuss the feasibility of implementation these operations. Also, We present the process of porting original TCM (Toolkit for Conceptual Modeling) project from C++ to Java programming language in this thesis. A scenario based on NASA International Space Station Specification is discussed to show the applicability of our approach. Finally, conclusion and future work about inconsistency management issues in WHERE project will be summarized.
Implementation of Dynamic Extensible Adaptive Locally Exchangeable Measures (IDEALEM) v 0.1
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sim, Alex; Lee, Dongeun; Wu, K. John
2016-03-04
Handling large streaming data is essential for various applications such as network traffic analysis, social networks, energy cost trends, and environment modeling. However, it is in general intractable to store, compute, search, and retrieve large streaming data. This software addresses a fundamental issue, which is to reduce the size of large streaming data and still obtain accurate statistical analysis. As an example, when a high-speed network such as 100 Gbps network is monitored, the collected measurement data rapidly grows so that polynomial time algorithms (e.g., Gaussian processes) become intractable. One possible solution to reduce the storage of vast amounts ofmore » measured data is to store a random sample, such as one out of 1000 network packets. However, such static sampling methods (linear sampling) have drawbacks: (1) it is not scalable for high-rate streaming data, and (2) there is no guarantee of reflecting the underlying distribution. In this software, we implemented a dynamic sampling algorithm, based on the recent technology from the relational dynamic bayesian online locally exchangeable measures, that reduces the storage of data records in a large scale, and still provides accurate analysis of large streaming data. The software can be used for both online and offline data records.« less
Validating the simulation of large-scale parallel applications using statistical characteristics
Zhang, Deli; Wilke, Jeremiah; Hendry, Gilbert; ...
2016-03-01
Simulation is a widely adopted method to analyze and predict the performance of large-scale parallel applications. Validating the hardware model is highly important for complex simulations with a large number of parameters. Common practice involves calculating the percent error between the projected and the real execution time of a benchmark program. However, in a high-dimensional parameter space, this coarse-grained approach often suffers from parameter insensitivity, which may not be known a priori. Moreover, the traditional approach cannot be applied to the validation of software models, such as application skeletons used in online simulations. In this work, we present a methodologymore » and a toolset for validating both hardware and software models by quantitatively comparing fine-grained statistical characteristics obtained from execution traces. Although statistical information has been used in tasks like performance optimization, this is the first attempt to apply it to simulation validation. Lastly, our experimental results show that the proposed evaluation approach offers significant improvement in fidelity when compared to evaluation using total execution time, and the proposed metrics serve as reliable criteria that progress toward automating the simulation tuning process.« less
Modeling and Performance Considerations for Automated Fault Isolation in Complex Systems
NASA Technical Reports Server (NTRS)
Ferrell, Bob; Oostdyk, Rebecca
2010-01-01
The purpose of this paper is to document the modeling considerations and performance metrics that were examined in the development of a large-scale Fault Detection, Isolation and Recovery (FDIR) system. The FDIR system is envisioned to perform health management functions for both a launch vehicle and the ground systems that support the vehicle during checkout and launch countdown by using suite of complimentary software tools that alert operators to anomalies and failures in real-time. The FDIR team members developed a set of operational requirements for the models that would be used for fault isolation and worked closely with the vendor of the software tools selected for fault isolation to ensure that the software was able to meet the requirements. Once the requirements were established, example models of sufficient complexity were used to test the performance of the software. The results of the performance testing demonstrated the need for enhancements to the software in order to meet the demands of the full-scale ground and vehicle FDIR system. The paper highlights the importance of the development of operational requirements and preliminary performance testing as a strategy for identifying deficiencies in highly scalable systems and rectifying those deficiencies before they imperil the success of the project
Software Engineering for Scientific Computer Simulations
NASA Astrophysics Data System (ADS)
Post, Douglass E.; Henderson, Dale B.; Kendall, Richard P.; Whitney, Earl M.
2004-11-01
Computer simulation is becoming a very powerful tool for analyzing and predicting the performance of fusion experiments. Simulation efforts are evolving from including only a few effects to many effects, from small teams with a few people to large teams, and from workstations and small processor count parallel computers to massively parallel platforms. Successfully making this transition requires attention to software engineering issues. We report on the conclusions drawn from a number of case studies of large scale scientific computing projects within DOE, academia and the DoD. The major lessons learned include attention to sound project management including setting reasonable and achievable requirements, building a good code team, enforcing customer focus, carrying out verification and validation and selecting the optimum computational mathematics approaches.
Building software tools to help contextualize and interpret monitoring data
USDA-ARS?s Scientific Manuscript database
Even modest monitoring efforts at landscape scales produce large volumes of data.These are most useful if they can be interpreted relative to land potential or other similar sites. However, for many ecological systems reference conditions may not be defined or are poorly described, which hinders und...
Parallel Computing for Probabilistic Response Analysis of High Temperature Composites
NASA Technical Reports Server (NTRS)
Sues, R. H.; Lua, Y. J.; Smith, M. D.
1994-01-01
The objective of this Phase I research was to establish the required software and hardware strategies to achieve large scale parallelism in solving PCM problems. To meet this objective, several investigations were conducted. First, we identified the multiple levels of parallelism in PCM and the computational strategies to exploit these parallelisms. Next, several software and hardware efficiency investigations were conducted. These involved the use of three different parallel programming paradigms and solution of two example problems on both a shared-memory multiprocessor and a distributed-memory network of workstations.
pcircle - A Suite of Scalable Parallel File System Tools
DOE Office of Scientific and Technical Information (OSTI.GOV)
WANG, FEIYI
2015-10-01
Most of the software related to file system are written for conventional local file system, they are serialized and can't take advantage of the benefit of a large scale parallel file system. "pcircle" software builds on top of ubiquitous MPI in cluster computing environment and "work-stealing" pattern to provide a scalable, high-performance suite of file system tools. In particular - it implemented parallel data copy and parallel data checksumming, with advanced features such as async progress report, checkpoint and restart, as well as integrity checking.
Simulation Of Combat With An Expert System
NASA Technical Reports Server (NTRS)
Provenzano, J. P.
1989-01-01
Proposed expert system predicts outcomes of combat situations. Called "COBRA", combat outcome based on rules for attrition, system selects rules for mathematical modeling of losses and discrete events in combat according to previous experiences. Used with another software module known as the "Game". Game/COBRA software system, consisting of Game and COBRA modules, provides for both quantitative aspects and qualitative aspects in simulations of battles. COBRA intended for simulation of large-scale military exercises, concepts embodied in it have much broader applicability. In industrial research, knowledge-based system enables qualitative as well as quantitative simulations.
NASA Astrophysics Data System (ADS)
Plebe, Alice; Grasso, Giorgio
2016-12-01
This paper describes a system developed for the simulation of flames inside an open-source 3D computer graphic software, Blender, with the aim of analyzing in virtual reality scenarios of hazards in large-scale industrial plants. The advantages of Blender are of rendering at high resolution the very complex structure of large industrial plants, and of embedding a physical engine based on smoothed particle hydrodynamics. This particle system is used to evolve a simulated fire. The interaction of this fire with the components of the plant is computed using polyhedron separation distance, adopting a Voronoi-based strategy that optimizes the number of feature distance computations. Results on a real oil and gas refining industry are presented.
CImbinator: a web-based tool for drug synergy analysis in small- and large-scale datasets.
Flobak, Åsmund; Vazquez, Miguel; Lægreid, Astrid; Valencia, Alfonso
2017-08-01
Drug synergies are sought to identify combinations of drugs particularly beneficial. User-friendly software solutions that can assist analysis of large-scale datasets are required. CImbinator is a web-service that can aid in batch-wise and in-depth analyzes of data from small-scale and large-scale drug combination screens. CImbinator offers to quantify drug combination effects, using both the commonly employed median effect equation, as well as advanced experimental mathematical models describing dose response relationships. CImbinator is written in Ruby and R. It uses the R package drc for advanced drug response modeling. CImbinator is available at http://cimbinator.bioinfo.cnio.es , the source-code is open and available at https://github.com/Rbbt-Workflows/combination_index . A Docker image is also available at https://hub.docker.com/r/mikisvaz/rbbt-ci_mbinator/ . asmund.flobak@ntnu.no or miguel.vazquez@cnio.es. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
A unifying framework for systems modeling, control systems design, and system operation
NASA Technical Reports Server (NTRS)
Dvorak, Daniel L.; Indictor, Mark B.; Ingham, Michel D.; Rasmussen, Robert D.; Stringfellow, Margaret V.
2005-01-01
Current engineering practice in the analysis and design of large-scale multi-disciplinary control systems is typified by some form of decomposition- whether functional or physical or discipline-based-that enables multiple teams to work in parallel and in relative isolation. Too often, the resulting system after integration is an awkward marriage of different control and data mechanisms with poor end-to-end accountability. System of systems engineering, which faces this problem on a large scale, cries out for a unifying framework to guide analysis, design, and operation. This paper describes such a framework based on a state-, model-, and goal-based architecture for semi-autonomous control systems that guides analysis and modeling, shapes control system software design, and directly specifies operational intent. This paper illustrates the key concepts in the context of a large-scale, concurrent, globally distributed system of systems: NASA's proposed Array-based Deep Space Network.
Deployment dynamics and control of large-scale flexible solar array system with deployable mast
NASA Astrophysics Data System (ADS)
Li, Hai-Quan; Liu, Xiao-Feng; Guo, Shao-Jing; Cai, Guo-Ping
2016-10-01
In this paper, deployment dynamics and control of large-scale flexible solar array system with deployable mast are investigated. The adopted solar array system is introduced firstly, including system configuration, deployable mast and solar arrays with several mechanisms. Then dynamic equation of the solar array system is established by the Jourdain velocity variation principle and a method for dynamics with topology changes is introduced. In addition, a PD controller with disturbance estimation is designed to eliminate the drift of spacecraft mainbody. Finally the validity of the dynamic model is verified through a comparison with ADAMS software and the deployment process and dynamic behavior of the system are studied in detail. Simulation results indicate that the proposed model is effective to describe the deployment dynamics of the large-scale flexible solar arrays and the proposed controller is practical to eliminate the drift of spacecraft mainbody.
Deelman, E.; Callaghan, S.; Field, E.; Francoeur, H.; Graves, R.; Gupta, N.; Gupta, V.; Jordan, T.H.; Kesselman, C.; Maechling, P.; Mehringer, J.; Mehta, G.; Okaya, D.; Vahi, K.; Zhao, L.
2006-01-01
This paper discusses the process of building an environment where large-scale, complex, scientific analysis can be scheduled onto a heterogeneous collection of computational and storage resources. The example application is the Southern California Earthquake Center (SCEC) CyberShake project, an analysis designed to compute probabilistic seismic hazard curves for sites in the Los Angeles area. We explain which software tools were used to build to the system, describe their functionality and interactions. We show the results of running the CyberShake analysis that included over 250,000 jobs using resources available through SCEC and the TeraGrid. ?? 2006 IEEE.
Large-Scale Wireless Temperature Monitoring System for Liquefied Petroleum Gas Storage Tanks
Fan, Guangwen; Shen, Yu; Hao, Xiaowei; Yuan, Zongming; Zhou, Zhi
2015-01-01
Temperature distribution is a critical indicator of the health condition for Liquefied Petroleum Gas (LPG) storage tanks. In this paper, we present a large-scale wireless temperature monitoring system to evaluate the safety of LPG storage tanks. The system includes wireless sensors networks, high temperature fiber-optic sensors, and monitoring software. Finally, a case study on real-world LPG storage tanks proves the feasibility of the system. The unique features of wireless transmission, automatic data acquisition and management, local and remote access make the developed system a good alternative for temperature monitoring of LPG storage tanks in practical applications. PMID:26393596
Integrating Cloud-Computing-Specific Model into Aircraft Design
NASA Astrophysics Data System (ADS)
Zhimin, Tian; Qi, Lin; Guangwen, Yang
Cloud Computing is becoming increasingly relevant, as it will enable companies involved in spreading this technology to open the door to Web 3.0. In the paper, the new categories of services introduced will slowly replace many types of computational resources currently used. In this perspective, grid computing, the basic element for the large scale supply of cloud services, will play a fundamental role in defining how those services will be provided. The paper tries to integrate cloud computing specific model into aircraft design. This work has acquired good results in sharing licenses of large scale and expensive software, such as CFD (Computational Fluid Dynamics), UG, CATIA, and so on.
Seqcrawler: biological data indexing and browsing platform.
Sallou, Olivier; Bretaudeau, Anthony; Roult, Aurelien
2012-07-24
Seqcrawler takes its roots in software like SRS or Lucegene. It provides an indexing platform to ease the search of data and meta-data in biological banks and it can scale to face the current flow of data. While many biological bank search tools are available on the Internet, mainly provided by large organizations to search their data, there is a lack of free and open source solutions to browse one's own set of data with a flexible query system and able to scale from a single computer to a cloud system. A personal index platform will help labs and bioinformaticians to search their meta-data but also to build a larger information system with custom subsets of data. The software is scalable from a single computer to a cloud-based infrastructure. It has been successfully tested in a private cloud with 3 index shards (pieces of index) hosting ~400 millions of sequence information (whole GenBank, UniProt, PDB and others) for a total size of 600 GB in a fault tolerant architecture (high-availability). It has also been successfully integrated with software to add extra meta-data from blast results to enhance users' result analysis. Seqcrawler provides a complete open source search and store solution for labs or platforms needing to manage large amount of data/meta-data with a flexible and customizable web interface. All components (search engine, visualization and data storage), though independent, share a common and coherent data system that can be queried with a simple HTTP interface. The solution scales easily and can also provide a high availability infrastructure.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bartlett, Roscoe A; Heroux, Dr. Michael A; Willenbring, James
2012-01-01
Software lifecycles are becoming an increasingly important issue for computational science & engineering (CSE) software. The process by which a piece of CSE software begins life as a set of research requirements and then matures into a trusted high-quality capability is both commonplace and extremely challenging. Although an implicit lifecycle is obviously being used in any effort, the challenges of this process--respecting the competing needs of research vs. production--cannot be overstated. Here we describe a proposal for a well-defined software lifecycle process based on modern Lean/Agile software engineering principles. What we propose is appropriate for many CSE software projects thatmore » are initially heavily focused on research but also are expected to eventually produce usable high-quality capabilities. The model is related to TriBITS, a build, integration and testing system, which serves as a strong foundation for this lifecycle model, and aspects of this lifecycle model are ingrained in the TriBITS system. Indeed this lifecycle process, if followed, will enable large-scale sustainable integration of many complex CSE software efforts across several institutions.« less
Zhao, Shanrong; Prenger, Kurt; Smith, Lance
2013-01-01
RNA-Seq is becoming a promising replacement to microarrays in transcriptome profiling and differential gene expression study. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by practically applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost effective, and open-source based tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of box to process Illumina RNA-Seq datasets. PMID:25937948
Zhao, Shanrong; Prenger, Kurt; Smith, Lance
2013-01-01
RNA-Seq is becoming a promising replacement to microarrays in transcriptome profiling and differential gene expression study. Technical improvements have decreased sequencing costs and, as a result, the size and number of RNA-Seq datasets have increased rapidly. However, the increasing volume of data from large-scale RNA-Seq studies poses a practical challenge for data analysis in a local environment. To meet this challenge, we developed Stormbow, a cloud-based software package, to process large volumes of RNA-Seq data in parallel. The performance of Stormbow has been tested by practically applying it to analyse 178 RNA-Seq samples in the cloud. In our test, it took 6 to 8 hours to process an RNA-Seq sample with 100 million reads, and the average cost was $3.50 per sample. Utilizing Amazon Web Services as the infrastructure for Stormbow allows us to easily scale up to handle large datasets with on-demand computational resources. Stormbow is a scalable, cost effective, and open-source based tool for large-scale RNA-Seq data analysis. Stormbow can be freely downloaded and can be used out of box to process Illumina RNA-Seq datasets.
Large scale rigidity-based flexibility analysis of biomolecules
Streinu, Ileana
2016-01-01
KINematics And RIgidity (KINARI) is an on-going project for in silico flexibility analysis of proteins. The new version of the software, Kinari-2, extends the functionality of our free web server KinariWeb, incorporates advanced web technologies, emphasizes the reproducibility of its experiments, and makes substantially improved tools available to the user. It is designed specifically for large scale experiments, in particular, for (a) very large molecules, including bioassemblies with high degree of symmetry such as viruses and crystals, (b) large collections of related biomolecules, such as those obtained through simulated dilutions, mutations, or conformational changes from various types of dynamics simulations, and (c) is intended to work as seemlessly as possible on the large, idiosyncratic, publicly available repository of biomolecules, the Protein Data Bank. We describe the system design, along with the main data processing, computational, mathematical, and validation challenges underlying this phase of the KINARI project. PMID:26958583
Design for Run-Time Monitor on Cloud Computing
NASA Astrophysics Data System (ADS)
Kang, Mikyung; Kang, Dong-In; Yun, Mira; Park, Gyung-Leen; Lee, Junghoon
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications as services over the Internet as well as infrastructure. A cloud is the type of a parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring the system status change, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design Run-Time Monitor (RTM) which is a system software to monitor the application behavior at run-time, analyze the collected information, and optimize resources on cloud computing. RTM monitors application software through library instrumentation as well as underlying hardware through performance counter optimizing its computing configuration based on the analyzed data.
caGrid 1.0: An Enterprise Grid Infrastructure for Biomedical Research
Oster, Scott; Langella, Stephen; Hastings, Shannon; Ervin, David; Madduri, Ravi; Phillips, Joshua; Kurc, Tahsin; Siebenlist, Frank; Covitz, Peter; Shanbhag, Krishnakant; Foster, Ian; Saltz, Joel
2008-01-01
Objective To develop software infrastructure that will provide support for discovery, characterization, integrated access, and management of diverse and disparate collections of information sources, analysis methods, and applications in biomedical research. Design An enterprise Grid software infrastructure, called caGrid version 1.0 (caGrid 1.0), has been developed as the core Grid architecture of the NCI-sponsored cancer Biomedical Informatics Grid (caBIG™) program. It is designed to support a wide range of use cases in basic, translational, and clinical research, including 1) discovery, 2) integrated and large-scale data analysis, and 3) coordinated study. Measurements The caGrid is built as a Grid software infrastructure and leverages Grid computing technologies and the Web Services Resource Framework standards. It provides a set of core services, toolkits for the development and deployment of new community provided services, and application programming interfaces for building client applications. Results The caGrid 1.0 was released to the caBIG community in December 2006. It is built on open source components and caGrid source code is publicly and freely available under a liberal open source license. The core software, associated tools, and documentation can be downloaded from the following URL: https://cabig.nci.nih.gov/workspaces/Architecture/caGrid. Conclusions While caGrid 1.0 is designed to address use cases in cancer research, the requirements associated with discovery, analysis and integration of large scale data, and coordinated studies are common in other biomedical fields. In this respect, caGrid 1.0 is the realization of a framework that can benefit the entire biomedical community. PMID:18096909
caGrid 1.0: an enterprise Grid infrastructure for biomedical research.
Oster, Scott; Langella, Stephen; Hastings, Shannon; Ervin, David; Madduri, Ravi; Phillips, Joshua; Kurc, Tahsin; Siebenlist, Frank; Covitz, Peter; Shanbhag, Krishnakant; Foster, Ian; Saltz, Joel
2008-01-01
To develop software infrastructure that will provide support for discovery, characterization, integrated access, and management of diverse and disparate collections of information sources, analysis methods, and applications in biomedical research. An enterprise Grid software infrastructure, called caGrid version 1.0 (caGrid 1.0), has been developed as the core Grid architecture of the NCI-sponsored cancer Biomedical Informatics Grid (caBIG) program. It is designed to support a wide range of use cases in basic, translational, and clinical research, including 1) discovery, 2) integrated and large-scale data analysis, and 3) coordinated study. The caGrid is built as a Grid software infrastructure and leverages Grid computing technologies and the Web Services Resource Framework standards. It provides a set of core services, toolkits for the development and deployment of new community provided services, and application programming interfaces for building client applications. The caGrid 1.0 was released to the caBIG community in December 2006. It is built on open source components and caGrid source code is publicly and freely available under a liberal open source license. The core software, associated tools, and documentation can be downloaded from the following URL: https://cabig.nci.nih.gov/workspaces/Architecture/caGrid. While caGrid 1.0 is designed to address use cases in cancer research, the requirements associated with discovery, analysis and integration of large scale data, and coordinated studies are common in other biomedical fields. In this respect, caGrid 1.0 is the realization of a framework that can benefit the entire biomedical community.
Software engineering and automatic continuous verification of scientific software
NASA Astrophysics Data System (ADS)
Piggott, M. D.; Hill, J.; Farrell, P. E.; Kramer, S. C.; Wilson, C. R.; Ham, D.; Gorman, G. J.; Bond, T.
2011-12-01
Software engineering of scientific code is challenging for a number of reasons including pressure to publish and a lack of awareness of the pitfalls of software engineering by scientists. The Applied Modelling and Computation Group at Imperial College is a diverse group of researchers that employ best practice software engineering methods whilst developing open source scientific software. Our main code is Fluidity - a multi-purpose computational fluid dynamics (CFD) code that can be used for a wide range of scientific applications from earth-scale mantle convection, through basin-scale ocean dynamics, to laboratory-scale classic CFD problems, and is coupled to a number of other codes including nuclear radiation and solid modelling. Our software development infrastructure consists of a number of free tools that could be employed by any group that develops scientific code and has been developed over a number of years with many lessons learnt. A single code base is developed by over 30 people for which we use bazaar for revision control, making good use of the strong branching and merging capabilities. Using features of Canonical's Launchpad platform, such as code review, blueprints for designing features and bug reporting gives the group, partners and other Fluidity uers an easy-to-use platform to collaborate and allows the induction of new members of the group into an environment where software development forms a central part of their work. The code repositoriy are coupled to an automated test and verification system which performs over 20,000 tests, including unit tests, short regression tests, code verification and large parallel tests. Included in these tests are build tests on HPC systems, including local and UK National HPC services. The testing of code in this manner leads to a continuous verification process; not a discrete event performed once development has ceased. Much of the code verification is done via the "gold standard" of comparisons to analytical solutions via the method of manufactured solutions. By developing and verifying code in tandem we avoid a number of pitfalls in scientific software development and advocate similar procedures for other scientific code applications.
Measures of Agreement Between Many Raters for Ordinal Classifications
Nelson, Kerrie P.; Edwards, Don
2015-01-01
Screening and diagnostic procedures often require a physician's subjective interpretation of a patient's test result using an ordered categorical scale to define the patient's disease severity. Due to wide variability observed between physicians’ ratings, many large-scale studies have been conducted to quantify agreement between multiple experts’ ordinal classifications in common diagnostic procedures such as mammography. However, very few statistical approaches are available to assess agreement in these large-scale settings. Existing summary measures of agreement rely on extensions of Cohen's kappa [1 - 5]. These are prone to prevalence and marginal distribution issues, become increasingly complex for more than three experts or are not easily implemented. Here we propose a model-based approach to assess agreement in large-scale studies based upon a framework of ordinal generalized linear mixed models. A summary measure of agreement is proposed for multiple experts assessing the same sample of patients’ test results according to an ordered categorical scale. This measure avoids some of the key flaws associated with Cohen's kappa and its extensions. Simulation studies are conducted to demonstrate the validity of the approach with comparison to commonly used agreement measures. The proposed methods are easily implemented using the software package R and are applied to two large-scale cancer agreement studies. PMID:26095449
Fan, Long; Hui, Jerome H L; Yu, Zu Guo; Chu, Ka Hou
2014-07-01
Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern and has been criticized for its local optimization. However, current more accurate software requires sequence alignment or complex calculations, which are time-consuming when dealing with large data sets during data preprocessing or during the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: a user-friendly software in graphical user interface for rapid DNA barcoding. It adopts a hybrid, two-stage algorithm. First, an alignment-free composition vector (CV) method is utilized to reduce searching space by screening a reference database. The alignment-based K2P distance nearest-neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment-free methods and (ii) higher scalability than alignment-based distance methods and character-based methods. These results suggest that this platform is able to deal with both large-scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/. © 2014 John Wiley & Sons Ltd.
Landscape pattern metrics and regional assessment
Robert V. O' Neill; Kurt H. Riitters; J.D. Wickham; Bruce K. Jones
1999-01-01
The combination of remote imagery data, geographic information systems software, and landscape ecology theory provides a unique basis for monitoring and assessing large-scale ecological systems. The unique feature of the work has been the need to develop interpret quantitative measures of spatial patter-the landscape indices. This article reviews what is known about...
Software Tools | Office of Cancer Clinical Proteomics Research
The CPTAC program develops new approaches to elucidate aspects of the molecular complexity of cancer made from large-scale proteogenomic datasets, and advance them toward precision medicine. Part of the CPTAC mission is to make data and tools available and accessible to the greater research community to accelerate the discovery process.
Jeremy S. Fried; Larry D. Potts; Sara M. Loreno; Glenn A. Christensen; R. Jamie Barbour
2017-01-01
The Forest Inventory and Analysis (FIA)-based BioSum (Bioregional Inventory Originated Simulation Under Management) is a free policy analysis framework and workflow management software solution. It addresses complex management questions concerning forest health and vulnerability for large, multimillion acre, multiowner landscapes using FIA plot data as the initial...
ERIC Educational Resources Information Center
Kharabe, Amol T.
2012-01-01
Over the last two decades, firms have operated in "increasingly" accelerated "high-velocity" dynamic markets, which require them to become "agile." During the same time frame, firms have increasingly deployed complex enterprise systems--large-scale packaged software "innovations" that integrate and automate…
Large-Scale Dynamic Observation Planning for Unmanned Surface Vessels
2007-06-01
programming language. In addition, the useful development software NetBeans IDE is free and makes the use of Java very user-friendly. 92...3. We implemented the greedy and 3PAA algorithms in Java using the NetBeans IDE version 5.5. 4. The test datasets were generated in MATLAB. 5
Universal distribution of component frequencies in biological and technological systems
Pang, Tin Yau; Maslov, Sergei
2013-01-01
Bacterial genomes and large-scale computer software projects both consist of a large number of components (genes or software packages) connected via a network of mutual dependencies. Components can be easily added or removed from individual systems, and their use frequencies vary over many orders of magnitude. We study this frequency distribution in genomes of ∼500 bacterial species and in over 2 million Linux computers and find that in both cases it is described by the same scale-free power-law distribution with an additional peak near the tail of the distribution corresponding to nearly universal components. We argue that the existence of a power law distribution of frequencies of components is a general property of any modular system with a multilayered dependency network. We demonstrate that the frequency of a component is positively correlated with its dependency degree given by the total number of upstream components whose operation directly or indirectly depends on the selected component. The observed frequency/dependency degree distributions are reproduced in a simple mathematically tractable model introduced and analyzed in this study. PMID:23530195
Matrix Algebra for GPU and Multicore Architectures (MAGMA) for Large Petascale Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dongarra, Jack J.; Tomov, Stanimire
2014-03-24
The goal of the MAGMA project is to create a new generation of linear algebra libraries that achieve the fastest possible time to an accurate solution on hybrid Multicore+GPU-based systems, using all the processing power that future high-end systems can make available within given energy constraints. Our efforts at the University of Tennessee achieved the goals set in all of the five areas identified in the proposal: 1. Communication optimal algorithms; 2. Autotuning for GPU and hybrid processors; 3. Scheduling and memory management techniques for heterogeneity and scale; 4. Fault tolerance and robustness for large scale systems; 5. Building energymore » efficiency into software foundations. The University of Tennessee’s main contributions, as proposed, were the research and software development of new algorithms for hybrid multi/many-core CPUs and GPUs, as related to two-sided factorizations and complete eigenproblem solvers, hybrid BLAS, and energy efficiency for dense, as well as sparse, operations. Furthermore, as proposed, we investigated and experimented with various techniques targeting the five main areas outlined.« less
Pazos, Florencio; Chagoyen, Monica
2018-01-16
Daily work in molecular biology presently depends on a large number of computational tools. An in-depth, large-scale study of that 'ecosystem' of Web tools, its characteristics, interconnectivity, patterns of usage/citation, temporal evolution and rate of decay is crucial for understanding the forces that shape it and for informing initiatives aimed at its funding, long-term maintenance and improvement. In particular, the long-term maintenance of these tools is compromised because of their specific development model. Hundreds of published studies become irreproducible de facto, as the software tools used to conduct them become unavailable. In this study, we present a large-scale survey of >5400 publications describing Web servers within the two main bibliographic resources for disseminating new software developments in molecular biology. For all these servers, we studied their citation patterns, the subjects they address, their citation networks and the temporal evolution of these factors. We also analysed how these factors affect the availability of these servers (whether they are alive). Our results show that this ecosystem of tools is highly interconnected and adapts to the 'trendy' subjects in every moment. The servers present characteristic temporal patterns of citation/usage, and there is a worrying rate of server 'death', which is influenced by factors such as the server popularity and the institutions that hosts it. These results can inform initiatives aimed at the long-term maintenance of these resources. © The Author(s) 2018. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
MrEnt: an editor for publication-quality phylogenetic tree illustrations.
Zuccon, Alessandro; Zuccon, Dario
2014-09-01
We developed MrEnt, a Windows-based, user-friendly software that allows the production of complex, high-resolution, publication-quality phylogenetic trees in few steps, directly from the analysis output. The program recognizes the standard Nexus tree format and the annotated tree files produced by BEAST and MrBayes. MrEnt combines in a single software a large suite of tree manipulation functions (e.g. handling of multiple trees, tree rotation, character mapping, node collapsing, compression of large clades, handling of time scale and error bars for chronograms) with drawing tools typical of standard graphic editors, including handling of graphic elements and images. The tree illustration can be printed or exported in several standard formats suitable for journal publication, PowerPoint presentation or Web publication. © 2014 John Wiley & Sons Ltd.
2013-01-01
Background The binding of transcription factors to DNA plays an essential role in the regulation of gene expression. Numerous experiments elucidated binding sequences which subsequently have been used to derive statistical models for predicting potential transcription factor binding sites (TFBS). The rapidly increasing number of genome sequence data requires sophisticated computational approaches to manage and query experimental and predicted TFBS data in the context of other epigenetic factors and across different organisms. Results We have developed D-Light, a novel client-server software package to store and query large amounts of TFBS data for any number of genomes. Users can add small-scale data to the server database and query them in a large scale, genome-wide promoter context. The client is implemented in Java and provides simple graphical user interfaces and data visualization. Here we also performed a statistical analysis showing what a user can expect for certain parameter settings and we illustrate the usage of D-Light with the help of a microarray data set. Conclusions D-Light is an easy to use software tool to integrate, store and query annotation data for promoters. A public D-Light server, the client and server software for local installation and the source code under GNU GPL license are available at http://biwww.che.sbg.ac.at/dlight. PMID:23617301
OTEC Cold Water Pipe-Platform Subsystem Dynamic Interaction Validation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Varley, Robert; Halkyard, John; Johnson, Peter
A commercial floating 100-megawatt (MW) ocean thermal energy conversion (OTEC) power plant will require a cold water pipe (CWP) with a diameter of 10-meter (m) and length of up to 1,000 m. The mass of the cold water pipe, including entrained water, can exceed the mass of the platform supporting it. The offshore industry uses software-modeling tools to develop platform and riser (pipe) designs to survive the offshore environment. These tools are typically validated by scale model tests in facilities able to replicate real at-sea meteorological and ocean (metocean) conditions to provide the understanding and confidence to proceed to finalmore » design and full-scale fabrication. However, today’s offshore platforms (similar to and usually larger than those needed for OTEC applications) incorporate risers (or pipes) with diameters well under one meter. Secondly, the preferred construction method for large diameter OTEC CWPs is the use of composite materials, primarily a form of fiber-reinforced plastic (FRP). The use of these material results in relatively low pipe stiffness and large strains compared to steel construction. These factors suggest the need for further validation of offshore industry software tools. The purpose of this project was to validate the ability to model numerically the dynamic interaction between a large cold water-filled fiberglass pipe and a floating OTEC platform excited by metocean weather conditions using measurements from a scale model tested in an ocean basin test facility.« less
2014-09-30
repeating pulse-like signals were investigated. Software prototypes were developed and integrated into distinct streams of reseach ; projects...to study complex sound archives spanning large spatial and temporal scales. A new post processing method for detection and classifcation was also...false positive rates. HK-ANN was successfully tested for a large minke whale dataset, but could easily be used on other signal types. Various
ENCoRE: an efficient software for CRISPR screens identifies new players in extrinsic apoptosis.
Trümbach, Dietrich; Pfeiffer, Susanne; Poppe, Manuel; Scherb, Hagen; Doll, Sebastian; Wurst, Wolfgang; Schick, Joel A
2017-11-25
As CRISPR/Cas9 mediated screens with pooled guide libraries in somatic cells become increasingly established, an unmet need for rapid and accurate companion informatics tools has emerged. We have developed a lightweight and efficient software to easily manipulate large raw next generation sequencing datasets derived from such screens into informative relational context with graphical support. The advantages of the software entitled ENCoRE (Easy NGS-to-Gene CRISPR REsults) include a simple graphical workflow, platform independence, local and fast multithreaded processing, data pre-processing and gene mapping with custom library import. We demonstrate the capabilities of ENCoRE to interrogate results from a pooled CRISPR cellular viability screen following Tumor Necrosis Factor-alpha challenge. The results not only identified stereotypical players in extrinsic apoptotic signaling but two as yet uncharacterized members of the extrinsic apoptotic cascade, Smg7 and Ces2a. We further validated and characterized cell lines containing mutations in these genes against a panel of cell death stimuli and involvement in p53 signaling. In summary, this software enables bench scientists with sensitive data or without access to informatic cores to rapidly interpret results from large scale experiments resulting from pooled CRISPR/Cas9 library screens.
Optimizing CyberShake Seismic Hazard Workflows for Large HPC Resources
NASA Astrophysics Data System (ADS)
Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.
2014-12-01
The CyberShake computational platform is a well-integrated collection of scientific software and middleware that calculates 3D simulation-based probabilistic seismic hazard curves and hazard maps for the Los Angeles region. Currently each CyberShake model comprises about 235 million synthetic seismograms from about 415,000 rupture variations computed at 286 sites. CyberShake integrates large-scale parallel and high-throughput serial seismological research codes into a processing framework in which early stages produce files used as inputs by later stages. Scientific workflow tools are used to manage the jobs, data, and metadata. The Southern California Earthquake Center (SCEC) developed the CyberShake platform using USC High Performance Computing and Communications systems and open-science NSF resources.CyberShake calculations were migrated to the NSF Track 1 system NCSA Blue Waters when it became operational in 2013, via an interdisciplinary team approach including domain scientists, computer scientists, and middleware developers. Due to the excellent performance of Blue Waters and CyberShake software optimizations, we reduced the makespan (a measure of wallclock time-to-solution) of a CyberShake study from 1467 to 342 hours. We will describe the technical enhancements behind this improvement, including judicious introduction of new GPU software, improved scientific software components, increased workflow-based automation, and Blue Waters-specific workflow optimizations.Our CyberShake performance improvements highlight the benefits of scientific workflow tools. The CyberShake workflow software stack includes the Pegasus Workflow Management System (Pegasus-WMS, which includes Condor DAGMan), HTCondor, and Globus GRAM, with Pegasus-mpi-cluster managing the high-throughput tasks on the HPC resources. The workflow tools handle data management, automatically transferring about 13 TB back to SCEC storage.We will present performance metrics from the most recent CyberShake study, executed on Blue Waters. We will compare the performance of CPU and GPU versions of our large-scale parallel wave propagation code, AWP-ODC-SGT. Finally, we will discuss how these enhancements have enabled SCEC to move forward with plans to increase the CyberShake simulation frequency to 1.0 Hz.
NASA Software Cost Estimation Model: An Analogy Based Estimation Model
NASA Technical Reports Server (NTRS)
Hihn, Jairus; Juster, Leora; Menzies, Tim; Mathew, George; Johnson, James
2015-01-01
The cost estimation of software development activities is increasingly critical for large scale integrated projects such as those at DOD and NASA especially as the software systems become larger and more complex. As an example MSL (Mars Scientific Laboratory) developed at the Jet Propulsion Laboratory launched with over 2 million lines of code making it the largest robotic spacecraft ever flown (Based on the size of the software). Software development activities are also notorious for their cost growth, with NASA flight software averaging over 50% cost growth. All across the agency, estimators and analysts are increasingly being tasked to develop reliable cost estimates in support of program planning and execution. While there has been extensive work on improving parametric methods there is very little focus on the use of models based on analogy and clustering algorithms. In this paper we summarize our findings on effort/cost model estimation and model development based on ten years of software effort estimation research using data mining and machine learning methods to develop estimation models based on analogy and clustering. The NASA Software Cost Model performance is evaluated by comparing it to COCOMO II, linear regression, and K- nearest neighbor prediction model performance on the same data set.
Multi-Scale Three-Dimensional Variational Data Assimilation System for Coastal Ocean Prediction
NASA Technical Reports Server (NTRS)
Li, Zhijin; Chao, Yi; Li, P. Peggy
2012-01-01
A multi-scale three-dimensional variational data assimilation system (MS-3DVAR) has been formulated and the associated software system has been developed for improving high-resolution coastal ocean prediction. This system helps improve coastal ocean prediction skill, and has been used in support of operational coastal ocean forecasting systems and field experiments. The system has been developed to improve the capability of data assimilation for assimilating, simultaneously and effectively, sparse vertical profiles and high-resolution remote sensing surface measurements into coastal ocean models, as well as constraining model biases. In this system, the cost function is decomposed into two separate units for the large- and small-scale components, respectively. As such, data assimilation is implemented sequentially from large to small scales, the background error covariance is constructed to be scale-dependent, and a scale-dependent dynamic balance is incorporated. This scheme then allows effective constraining large scales and model bias through assimilating sparse vertical profiles, and small scales through assimilating high-resolution surface measurements. This MS-3DVAR enhances the capability of the traditional 3DVAR for assimilating highly heterogeneously distributed observations, such as along-track satellite altimetry data, and particularly maximizing the extraction of information from limited numbers of vertical profile observations.
VALIDATION OF ANSYS FINITE ELEMENT ANALYSIS SOFTWARE
DOE Office of Scientific and Technical Information (OSTI.GOV)
HAMM, E.R.
2003-06-27
This document provides a record of the verification and Validation of the ANSYS Version 7.0 software that is installed on selected CH2M HILL computers. The issues addressed include: Software verification, installation, validation, configuration management and error reporting. The ANSYS{reg_sign} computer program is a large scale multi-purpose finite element program which may be used for solving several classes of engineering analysis. The analysis capabilities of ANSYS Full Mechanical Version 7.0 installed on selected CH2M Hill Hanford Group (CH2M HILL) Intel processor based computers include the ability to solve static and dynamic structural analyses, steady-state and transient heat transfer problems, mode-frequency andmore » buckling eigenvalue problems, static or time-varying magnetic analyses and various types of field and coupled-field applications. The program contains many special features which allow nonlinearities or secondary effects to be included in the solution, such as plasticity, large strain, hyperelasticity, creep, swelling, large deflections, contact, stress stiffening, temperature dependency, material anisotropy, and thermal radiation. The ANSYS program has been in commercial use since 1970, and has been used extensively in the aerospace, automotive, construction, electronic, energy services, manufacturing, nuclear, plastics, oil and steel industries.« less
Radiation breakage of DNA: a model based on random-walk chromatin structure
NASA Technical Reports Server (NTRS)
Ponomarev, A. L.; Sachs, R. K.
2001-01-01
Monte Carlo computer software, called DNAbreak, has recently been developed to analyze observed non-random clustering of DNA double strand breaks in chromatin after exposure to densely ionizing radiation. The software models coarse-grained configurations of chromatin and radiation tracks, small-scale details being suppressed in order to obtain statistical results for larger scales, up to the size of a whole chromosome. We here give an analytic counterpart of the numerical model, useful for benchmarks, for elucidating the numerical results, for analyzing the assumptions of a more general but less mechanistic "randomly-located-clusters" formalism, and, potentially, for speeding up the calculations. The equations characterize multi-track DNA fragment-size distributions in terms of one-track action; an important step in extrapolating high-dose laboratory results to the much lower doses of main interest in environmental or occupational risk estimation. The approach can utilize the experimental information on DNA fragment-size distributions to draw inferences about large-scale chromatin geometry during cell-cycle interphase.
Taheri, Mohammadreza; Moazeni-Pourasil, Roudabeh Sadat; Sheikh-Olia-Lavasani, Majid; Karami, Ahmad; Ghassempour, Alireza
2016-03-01
Chromatographic method development for preparative targets is a time-consuming and subjective process. This can be particularly problematic because of the use of valuable samples for isolation and the large consumption of solvents in preparative scale. These processes could be improved by using statistical computations to save time, solvent and experimental efforts. Thus, contributed by ESI-MS, after applying DryLab software to gain an overview of the most effective parameters in separation of synthesized celecoxib and its co-eluted compounds, design of experiment software that relies on multivariate modeling as a chemometric approach was used to predict the optimized touching-band overloading conditions by objective functions according to the relationship between selectivity and stationary phase properties. The loadability of the method was investigated on the analytical and semi-preparative scales, and the performance of this chemometric approach was approved by peak shapes beside recovery and purity of products. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Implementation of AN Unmanned Aerial Vehicle System for Large Scale Mapping
NASA Astrophysics Data System (ADS)
Mah, S. B.; Cryderman, C. S.
2015-08-01
Unmanned Aerial Vehicles (UAVs), digital cameras, powerful personal computers, and software have made it possible for geomatics professionals to capture aerial photographs and generate digital terrain models and orthophotographs without using full scale aircraft or hiring mapping professionals. This has been made possible by the availability of miniaturized computers and sensors, and software which has been driven, in part, by the demand for this technology in consumer items such as smartphones. The other force that is in play is the increasing number of Do-It-Yourself (DIY) people who are building UAVs as a hobby or for professional use. Building a UAV system for mapping is an alternative to purchasing a turnkey system. This paper describes factors to be considered when building a UAV mapping system, the choices made, and the test results of a project using this completed system.
ReaDDy - A Software for Particle-Based Reaction-Diffusion Dynamics in Crowded Cellular Environments
Schöneberg, Johannes; Noé, Frank
2013-01-01
We introduce the software package ReaDDy for simulation of detailed spatiotemporal mechanisms of dynamical processes in the cell, based on reaction-diffusion dynamics with particle resolution. In contrast to other particle-based reaction kinetics programs, ReaDDy supports particle interaction potentials. This permits effects such as space exclusion, molecular crowding and aggregation to be modeled. The biomolecules simulated can be represented as a sphere, or as a more complex geometry such as a domain structure or polymer chain. ReaDDy bridges the gap between small-scale but highly detailed molecular dynamics or Brownian dynamics simulations and large-scale but little-detailed reaction kinetics simulations. ReaDDy has a modular design that enables the exchange of the computing core by efficient platform-specific implementations or dynamical models that are different from Brownian dynamics. PMID:24040218
Quantum Computing Architectural Design
NASA Astrophysics Data System (ADS)
West, Jacob; Simms, Geoffrey; Gyure, Mark
2006-03-01
Large scale quantum computers will invariably require scalable architectures in addition to high fidelity gate operations. Quantum computing architectural design (QCAD) addresses the problems of actually implementing fault-tolerant algorithms given physical and architectural constraints beyond those of basic gate-level fidelity. Here we introduce a unified framework for QCAD that enables the scientist to study the impact of varying error correction schemes, architectural parameters including layout and scheduling, and physical operations native to a given architecture. Our software package, aptly named QCAD, provides compilation, manipulation/transformation, multi-paradigm simulation, and visualization tools. We demonstrate various features of the QCAD software package through several examples.
Zheng, Guangyong; Xu, Yaochen; Zhang, Xiujun; Liu, Zhi-Ping; Wang, Zhuo; Chen, Luonan; Zhu, Xin-Guang
2016-12-23
A gene regulatory network (GRN) represents interactions of genes inside a cell or tissue, in which vertexes and edges stand for genes and their regulatory interactions respectively. Reconstruction of gene regulatory networks, in particular, genome-scale networks, is essential for comparative exploration of different species and mechanistic investigation of biological processes. Currently, most of network inference methods are computationally intensive, which are usually effective for small-scale tasks (e.g., networks with a few hundred genes), but are difficult to construct GRNs at genome-scale. Here, we present a software package for gene regulatory network reconstruction at a genomic level, in which gene interaction is measured by the conditional mutual information measurement using a parallel computing framework (so the package is named CMIP). The package is a greatly improved implementation of our previous PCA-CMI algorithm. In CMIP, we provide not only an automatic threshold determination method but also an effective parallel computing framework for network inference. Performance tests on benchmark datasets show that the accuracy of CMIP is comparable to most current network inference methods. Moreover, running tests on synthetic datasets demonstrate that CMIP can handle large datasets especially genome-wide datasets within an acceptable time period. In addition, successful application on a real genomic dataset confirms its practical applicability of the package. This new software package provides a powerful tool for genomic network reconstruction to biological community. The software can be accessed at http://www.picb.ac.cn/CMIP/ .
Parallel software for lattice N = 4 supersymmetric Yang-Mills theory
NASA Astrophysics Data System (ADS)
Schaich, David; DeGrand, Thomas
2015-05-01
We present new parallel software, SUSY LATTICE, for lattice studies of four-dimensional N = 4 supersymmetric Yang-Mills theory with gauge group SU(N). The lattice action is constructed to exactly preserve a single supersymmetry charge at non-zero lattice spacing, up to additional potential terms included to stabilize numerical simulations. The software evolved from the MILC code for lattice QCD, and retains a similar large-scale framework despite the different target theory. Many routines are adapted from an existing serial code (Catterall and Joseph, 2012), which SUSY LATTICE supersedes. This paper provides an overview of the new parallel software, summarizing the lattice system, describing the applications that are currently provided and explaining their basic workflow for non-experts in lattice gauge theory. We discuss the parallel performance of the code, and highlight some notable aspects of the documentation for those interested in contributing to its future development.
Large Scale Bacterial Colony Screening of Diversified FRET Biosensors
Litzlbauer, Julia; Schifferer, Martina; Ng, David; Fabritius, Arne; Thestrup, Thomas; Griesbeck, Oliver
2015-01-01
Biosensors based on Förster Resonance Energy Transfer (FRET) between fluorescent protein mutants have started to revolutionize physiology and biochemistry. However, many types of FRET biosensors show relatively small FRET changes, making measurements with these probes challenging when used under sub-optimal experimental conditions. Thus, a major effort in the field currently lies in designing new optimization strategies for these types of sensors. Here we describe procedures for optimizing FRET changes by large scale screening of mutant biosensor libraries in bacterial colonies. We describe optimization of biosensor expression, permeabilization of bacteria, software tools for analysis, and screening conditions. The procedures reported here may help in improving FRET changes in multiple suitable classes of biosensors. PMID:26061878
A Comparative Study of Point Cloud Data Collection and Processing
NASA Astrophysics Data System (ADS)
Pippin, J. E.; Matheney, M.; Gentle, J. N., Jr.; Pierce, S. A.; Fuentes-Pineda, G.
2016-12-01
Over the past decade, there has been dramatic growth in the acquisition of publicly funded high-resolution topographic data for scientific, environmental, engineering and planning purposes. These data sets are valuable for applications of interest across a large and varied user community. However, because of the large volumes of data produced by high-resolution mapping technologies and expense of aerial data collection, it is often difficult to collect and distribute these datasets. Furthermore, the data can be technically challenging to process, requiring software and computing resources not readily available to many users. This study presents a comparison of advanced computing hardware and software that is used to collect and process point cloud datasets, such as LIDAR scans. Activities included implementation and testing of open source libraries and applications for point cloud data processing such as, Meshlab, Blender, PDAL, and PCL. Additionally, a suite of commercial scale applications, Skanect and Cloudcompare, were applied to raw datasets. Handheld hardware solutions, a Structure Scanner and Xbox 360 Kinect V1, were tested for their ability to scan at three field locations. The resultant data projects successfully scanned and processed subsurface karst features ranging from small stalactites to large rooms, as well as a surface waterfall feature. Outcomes support the feasibility of rapid sensing in 3D at field scales.
The software architecture to control the Cherenkov Telescope Array
NASA Astrophysics Data System (ADS)
Oya, I.; Füßling, M.; Antonino, P. O.; Conforti, V.; Hagge, L.; Melkumyan, D.; Morgenstern, A.; Tosti, G.; Schwanke, U.; Schwarz, J.; Wegner, P.; Colomé, J.; Lyard, E.
2016-07-01
The Cherenkov Telescope Array (CTA) project is an initiative to build two large arrays of Cherenkov gamma- ray telescopes. CTA will be deployed as two installations, one in the northern and the other in the southern hemisphere, containing dozens of telescopes of different sizes. CTA is a big step forward in the field of ground- based gamma-ray astronomy, not only because of the expected scientific return, but also due to the order-of- magnitude larger scale of the instrument to be controlled. The performance requirements associated with such a large and distributed astronomical installation require a thoughtful analysis to determine the best software solutions. The array control and data acquisition (ACTL) work-package within the CTA initiative will deliver the software to control and acquire the data from the CTA instrumentation. In this contribution we present the current status of the formal ACTL system decomposition into software building blocks and the relationships among them. The system is modelled via the Systems Modelling Language (SysML) formalism. To cope with the complexity of the system, this architecture model is sub-divided into different perspectives. The relationships with the stakeholders and external systems are used to create the first perspective, the context of the ACTL software system. Use cases are employed to describe the interaction of those external elements with the ACTL system and are traced to a hierarchy of functionalities (abstract system functions) describing the internal structure of the ACTL system. These functions are then traced to fully specified logical elements (software components), the deployment of which as technical elements, is also described. This modelling approach allows us to decompose the ACTL software in elements to be created and the ow of information within the system, providing us with a clear way to identify sub-system interdependencies. This architectural approach allows us to build the ACTL system model and trace requirements to deliverables (source code, documentation, etc.), and permits the implementation of a flexible use-case driven software development approach thanks to the traceability from use cases to the logical software elements. The Alma Common Software (ACS) container/component framework, used for the control of the Atacama Large Millimeter/submillimeter Array (ALMA) is the basis for the ACTL software and as such it is considered as an integral part of the software architecture.
Parallel computing for probabilistic fatigue analysis
NASA Technical Reports Server (NTRS)
Sues, Robert H.; Lua, Yuan J.; Smith, Mark D.
1993-01-01
This paper presents the results of Phase I research to investigate the most effective parallel processing software strategies and hardware configurations for probabilistic structural analysis. We investigate the efficiency of both shared and distributed-memory architectures via a probabilistic fatigue life analysis problem. We also present a parallel programming approach, the virtual shared-memory paradigm, that is applicable across both types of hardware. Using this approach, problems can be solved on a variety of parallel configurations, including networks of single or multiprocessor workstations. We conclude that it is possible to effectively parallelize probabilistic fatigue analysis codes; however, special strategies will be needed to achieve large-scale parallelism to keep large number of processors busy and to treat problems with the large memory requirements encountered in practice. We also conclude that distributed-memory architecture is preferable to shared-memory for achieving large scale parallelism; however, in the future, the currently emerging hybrid-memory architectures will likely be optimal.
WEST-3 wind turbine simulator development
NASA Technical Reports Server (NTRS)
Hoffman, J. A.; Sridhar, S.
1985-01-01
The software developed for WEST-3, a new, all digital, and fully programmable wind turbine simulator is given. The process of wind turbine simulation on WEST-3 is described in detail. The major steps are, the processing of the mathematical models, the preparation of the constant data, and the use of system software generated executable code for running on WEST-3. The mechanics of reformulation, normalization, and scaling of the mathematical models is discussed in detail, in particulr, the significance of reformulation which leads to accurate simulations. Descriptions for the preprocessor computer programs which are used to prepare the constant data needed in the simulation are given. These programs, in addition to scaling and normalizing all the constants, relieve the user from having to generate a large number of constants used in the simulation. Also given are brief descriptions of the components of the WEST-3 system software: Translator, Assembler, Linker, and Loader. Also included are: details of the aeroelastic rotor analysis, which is the center of a wind turbine simulation model, analysis of the gimbal subsystem; and listings of the variables, constants, and equations used in the simulation.
IDEAL: Images Across Domains, Experiments, Algorithms and Learning
NASA Astrophysics Data System (ADS)
Ushizima, Daniela M.; Bale, Hrishikesh A.; Bethel, E. Wes; Ercius, Peter; Helms, Brett A.; Krishnan, Harinarayan; Grinberg, Lea T.; Haranczyk, Maciej; Macdowell, Alastair A.; Odziomek, Katarzyna; Parkinson, Dilworth Y.; Perciano, Talita; Ritchie, Robert O.; Yang, Chao
2016-11-01
Research across science domains is increasingly reliant on image-centric data. Software tools are in high demand to uncover relevant, but hidden, information in digital images, such as those coming from faster next generation high-throughput imaging platforms. The challenge is to analyze the data torrent generated by the advanced instruments efficiently, and provide insights such as measurements for decision-making. In this paper, we overview work performed by an interdisciplinary team of computational and materials scientists, aimed at designing software applications and coordinating research efforts connecting (1) emerging algorithms for dealing with large and complex datasets; (2) data analysis methods with emphasis in pattern recognition and machine learning; and (3) advances in evolving computer architectures. Engineering tools around these efforts accelerate the analyses of image-based recordings, improve reusability and reproducibility, scale scientific procedures by reducing time between experiments, increase efficiency, and open opportunities for more users of the imaging facilities. This paper describes our algorithms and software tools, showing results across image scales, demonstrating how our framework plays a role in improving image understanding for quality control of existent materials and discovery of new compounds.
DynamO: a free O(N) general event-driven molecular dynamics simulator.
Bannerman, M N; Sargant, R; Lue, L
2011-11-30
Molecular dynamics algorithms for systems of particles interacting through discrete or "hard" potentials are fundamentally different to the methods for continuous or "soft" potential systems. Although many software packages have been developed for continuous potential systems, software for discrete potential systems based on event-driven algorithms are relatively scarce and specialized. We present DynamO, a general event-driven simulation package, which displays the optimal O(N) asymptotic scaling of the computational cost with the number of particles N, rather than the O(N) scaling found in most standard algorithms. DynamO provides reference implementations of the best available event-driven algorithms. These techniques allow the rapid simulation of both complex and large (>10(6) particles) systems for long times. The performance of the program is benchmarked for elastic hard sphere systems, homogeneous cooling and sheared inelastic hard spheres, and equilibrium Lennard-Jones fluids. This software and its documentation are distributed under the GNU General Public license and can be freely downloaded from http://marcusbannerman.co.uk/dynamo. Copyright © 2011 Wiley Periodicals, Inc.
AnClim and ProClimDB software for data quality control and homogenization of time series
NASA Astrophysics Data System (ADS)
Stepanek, Petr
2015-04-01
During the last decade, a software package consisting of AnClim, ProClimDB and LoadData for processing (mainly climatological) data has been created. This software offers a complex solution for processing of climatological time series, starting from loading the data from a central database (e.g. Oracle, software LoadData), through data duality control and homogenization to time series analysis, extreme value evaluations and RCM outputs verification and correction (ProClimDB and AnClim software). The detection of inhomogeneities is carried out on a monthly scale through the application of AnClim, or newly by R functions called from ProClimDB, while quality control, the preparation of reference series and the correction of found breaks is carried out by the ProClimDB software. The software combines many statistical tests, types of reference series and time scales (monthly, seasonal and annual, daily and sub-daily ones). These can be used to create an "ensemble" of solutions, which may be more reliable than any single method. AnClim software is suitable for educational purposes: e.g. for students getting acquainted with methods used in climatology. Built-in graphical tools and comparison of various statistical tests help in better understanding of a given method. ProClimDB is, on the contrary, tool aimed for processing of large climatological datasets. Recently, functions from R may be used within the software making it more efficient in data processing and capable of easy inclusion of new methods (when available under R). An example of usage is easy comparison of methods for correction of inhomogeneities in daily data (HOM of Paul Della-Marta, SPLIDHOM method of Olivier Mestre, DAP - own method, QM of Xiaolan Wang and others). The software is available together with further information on www.climahom.eu . Acknowledgement: this work was partially funded by the project "Building up a multidisciplinary scientific team focused on drought" No. CZ.1.07/2.3.00/20.0248.
A Matter of Time: Faster Percolator Analysis via Efficient SVM Learning for Large-Scale Proteomics.
Halloran, John T; Rocke, David M
2018-05-04
Percolator is an important tool for greatly improving the results of a database search and subsequent downstream analysis. Using support vector machines (SVMs), Percolator recalibrates peptide-spectrum matches based on the learned decision boundary between targets and decoys. To improve analysis time for large-scale data sets, we update Percolator's SVM learning engine through software and algorithmic optimizations rather than heuristic approaches that necessitate the careful study of their impact on learned parameters across different search settings and data sets. We show that by optimizing Percolator's original learning algorithm, l 2 -SVM-MFN, large-scale SVM learning requires nearly only a third of the original runtime. Furthermore, we show that by employing the widely used Trust Region Newton (TRON) algorithm instead of l 2 -SVM-MFN, large-scale Percolator SVM learning is reduced to nearly only a fifth of the original runtime. Importantly, these speedups only affect the speed at which Percolator converges to a global solution and do not alter recalibration performance. The upgraded versions of both l 2 -SVM-MFN and TRON are optimized within the Percolator codebase for multithreaded and single-thread use and are available under Apache license at bitbucket.org/jthalloran/percolator_upgrade .
Vulnerability Analyst’s Guide to Geometric Target Description
1992-09-01
not constitute indorsement of any commercial product. Form Approved REPORT DOCUMENTATION PAGE OMB No. 0704-O,8 public reporting burden for this...46 5.3 Surrogacy ..............................................46 5.4 Specialized Targets......................................46 5.5... commercially available documents for other large-scale software. The documentation is not a BRL technical report, but can be obtained by contacting
ERIC Educational Resources Information Center
Schagen, Ian; Schagen, Sandie
2005-01-01
The advent of large-scale matched data sets, linking pupils' attainment across key stages, gives new opportunities to explore the effects of school organisational factors on pupil performance. Combined with currently available sophisticated and efficient software for multilevel analysis, it offers educational researchers the chance to develop…
Application of Open-Source Enterprise Information System Modules: An Empirical Study
ERIC Educational Resources Information Center
Lee, Sang-Heui
2010-01-01
Although there have been a number of studies on large scale implementation of proprietary enterprise information systems (EIS), open-source software (OSS) for EIS has received limited attention in spite of its potential as a disruptive innovation. Cost saving is the main driver for adopting OSS among the other possible benefits including security…
Improving Real World Performance of Vision Aided Navigation in a Flight Environment
2016-09-15
Introduction . . . . . . . 63 4.2 Wide Area Search Extent . . . . . . . . . . . . . . . . . 64 4.3 Large-Scale Image Navigation Histogram Filter ...65 4.3.1 Location Model . . . . . . . . . . . . . . . . . . 66 4.3.2 Measurement Model . . . . . . . . . . . . . . . 66 4.3.3 Histogram Filter ...Iteration of Histogram Filter . . . . . . . . . . . 70 4.4 Implementation and Flight Test Campaign . . . . . . . . 71 4.4.1 Software Implementation
System and Method for Dynamic Aeroelastic Control
NASA Technical Reports Server (NTRS)
Suh, Peter M. (Inventor)
2015-01-01
The present invention proposes a hardware and software architecture for dynamic modal structural monitoring that uses a robust modal filter to monitor a potentially very large-scale array of sensors in real time, and tolerant of asymmetric sensor noise and sensor failures, to achieve aircraft performance optimization such as minimizing aircraft flutter, drag and maximizing fuel efficiency.
ERIC Educational Resources Information Center
Filinger, Ronald H.; Hall, Paul W.
Because large scale individualized learning systems place excessive demands on conventional means of producing audiovisual software, electronic image generation has been investigated as an alternative. A prototype, experimental device, Scanimate-500, was designed and built by the Computer Image Corporation. It uses photographic, television, and…
YaQ: an architecture for real-time navigation and rendering of varied crowds.
Maïm, Jonathan; Yersin, Barbara; Thalmann, Daniel
2009-01-01
The YaQ software platform is a complete system dedicated to real-time crowd simulation and rendering. Fitting multiple application domains, such as video games and VR, YaQ aims to provide efficient algorithms to generate crowds comprising up to thousands of varied virtual humans navigating in large-scale, global environments.
cOSPREY: A Cloud-Based Distributed Algorithm for Large-Scale Computational Protein Design
Pan, Yuchao; Dong, Yuxi; Zhou, Jingtian; Hallen, Mark; Donald, Bruce R.; Xu, Wei
2016-01-01
Abstract Finding the global minimum energy conformation (GMEC) of a huge combinatorial search space is the key challenge in computational protein design (CPD) problems. Traditional algorithms lack a scalable and efficient distributed design scheme, preventing researchers from taking full advantage of current cloud infrastructures. We design cloud OSPREY (cOSPREY), an extension to a widely used protein design software OSPREY, to allow the original design framework to scale to the commercial cloud infrastructures. We propose several novel designs to integrate both algorithm and system optimizations, such as GMEC-specific pruning, state search partitioning, asynchronous algorithm state sharing, and fault tolerance. We evaluate cOSPREY on three different cloud platforms using different technologies and show that it can solve a number of large-scale protein design problems that have not been possible with previous approaches. PMID:27154509
Chalise, D. R.; Haj, Adel E.; Fontaine, T.A.
2018-01-01
The hydrological simulation program Fortran (HSPF) [Hydrological Simulation Program Fortran version 12.2 (Computer software). USEPA, Washington, DC] and the precipitation runoff modeling system (PRMS) [Precipitation Runoff Modeling System version 4.0 (Computer software). USGS, Reston, VA] models are semidistributed, deterministic hydrological tools for simulating the impacts of precipitation, land use, and climate on basin hydrology and streamflow. Both models have been applied independently to many watersheds across the United States. This paper reports the statistical results assessing various temporal (daily, monthly, and annual) and spatial (small versus large watershed) scale biases in HSPF and PRMS simulations using two watersheds in the Black Hills, South Dakota. The Nash-Sutcliffe efficiency (NSE), Pearson correlation coefficient (r">rr), and coefficient of determination (R2">R2R2) statistics for the daily, monthly, and annual flows were used to evaluate the models’ performance. Results from the HSPF models showed that the HSPF consistently simulated the annual flows for both large and small basins better than the monthly and daily flows, and the simulated flows for the small watershed better than flows for the large watershed. In comparison, the PRMS model results show that the PRMS simulated the monthly flows for both the large and small watersheds better than the daily and annual flows, and the range of statistical error in the PRMS models was greater than that in the HSPF models. Moreover, it can be concluded that the statistical error in the HSPF and the PRMSdaily, monthly, and annual flow estimates for watersheds in the Black Hills was influenced by both temporal and spatial scale variability.
Job Management and Task Bundling
NASA Astrophysics Data System (ADS)
Berkowitz, Evan; Jansen, Gustav R.; McElvain, Kenneth; Walker-Loud, André
2018-03-01
High Performance Computing is often performed on scarce and shared computing resources. To ensure computers are used to their full capacity, administrators often incentivize large workloads that are not possible on smaller systems. Measurements in Lattice QCD frequently do not scale to machine-size workloads. By bundling tasks together we can create large jobs suitable for gigantic partitions. We discuss METAQ and mpi_jm, software developed to dynamically group computational tasks together, that can intelligently backfill to consume idle time without substantial changes to users' current workflows or executables.
Compiler-directed cache management in multiprocessors
NASA Technical Reports Server (NTRS)
Cheong, Hoichi; Veidenbaum, Alexander V.
1990-01-01
The necessity of finding alternatives to hardware-based cache coherence strategies for large-scale multiprocessor systems is discussed. Three different software-based strategies sharing the same goals and general approach are presented. They consist of a simple invalidation approach, a fast selective invalidation scheme, and a version control scheme. The strategies are suitable for shared-memory multiprocessor systems with interconnection networks and a large number of processors. Results of trace-driven simulations conducted on numerical benchmark routines to compare the performance of the three schemes are presented.
Strengthening Software Authentication with the ROSE Software Suite
DOE Office of Scientific and Technical Information (OSTI.GOV)
White, G
2006-06-15
Many recent nonproliferation and arms control software projects include a software authentication regime. These include U.S. Government-sponsored projects both in the United States and in the Russian Federation (RF). This trend toward requiring software authentication is only accelerating. Demonstrating assurance that software performs as expected without hidden ''backdoors'' is crucial to a project's success. In this context, ''authentication'' is defined as determining that a software package performs only its intended purpose and performs said purpose correctly and reliably over the planned duration of an agreement. In addition to visual inspections by knowledgeable computer scientists, automated tools are needed to highlightmore » suspicious code constructs, both to aid visual inspection and to guide program development. While many commercial tools are available for portions of the authentication task, they are proprietary and not extensible. An open-source, extensible tool can be customized to the unique needs of each project (projects can have both common and custom rules to detect flaws and security holes). Any such extensible tool has to be based on a complete language compiler. ROSE is precisely such a compiler infrastructure developed within the Department of Energy (DOE) and targeted at the optimization of scientific applications and user-defined libraries within large-scale applications (typically applications of a million lines of code). ROSE is a robust, source-to-source analysis and optimization infrastructure currently addressing large, million-line DOE applications in C and C++ (handling the full C, C99, C++ languages and with current collaborations to support Fortran90). We propose to extend ROSE to address a number of security-specific requirements, and apply it to software authentication for nonproliferation and arms control projects.« less
STEM_CELL: a software tool for electron microscopy: part 2--analysis of crystalline materials.
Grillo, Vincenzo; Rossi, Francesca
2013-02-01
A new graphical software (STEM_CELL) for analysis of HRTEM and STEM-HAADF images is here introduced in detail. The advantage of the software, beyond its graphic interface, is to put together different analysis algorithms and simulation (described in an associated article) to produce novel analysis methodologies. Different implementations and improvements to state of the art approach are reported in the image analysis, filtering, normalization, background subtraction. In particular two important methodological results are here highlighted: (i) the definition of a procedure for atomic scale quantitative analysis of HAADF images, (ii) the extension of geometric phase analysis to large regions up to potentially 1μm through the use of under sampled images with aliasing effects. Copyright © 2012 Elsevier B.V. All rights reserved.
Towards a Methodology for Identifying Program Constraints During Requirements Analysis
NASA Technical Reports Server (NTRS)
Romo, Lilly; Gates, Ann Q.; Della-Piana, Connie Kubo
1997-01-01
Requirements analysis is the activity that involves determining the needs of the customer, identifying the services that the software system should provide and understanding the constraints on the solution. The result of this activity is a natural language document, typically referred to as the requirements definition document. Some of the problems that exist in defining requirements in large scale software projects includes synthesizing knowledge from various domain experts and communicating this information across multiple levels of personnel. One approach that addresses part of this problem is called context monitoring and involves identifying the properties of and relationships between objects that the system will manipulate. This paper examines several software development methodologies, discusses the support that each provide for eliciting such information from experts and specifying the information, and suggests refinements to these methodologies.
The GeoClaw software for depth-averaged flows with adaptive refinement
Berger, M.J.; George, D.L.; LeVeque, R.J.; Mandli, Kyle T.
2011-01-01
Many geophysical flow or wave propagation problems can be modeled with two-dimensional depth-averaged equations, of which the shallow water equations are the simplest example. We describe the GeoClaw software that has been designed to solve problems of this nature, consisting of open source Fortran programs together with Python tools for the user interface and flow visualization. This software uses high-resolution shock-capturing finite volume methods on logically rectangular grids, including latitude-longitude grids on the sphere. Dry states are handled automatically to model inundation. The code incorporates adaptive mesh refinement to allow the efficient solution of large-scale geophysical problems. Examples are given illustrating its use for modeling tsunamis and dam-break flooding problems. Documentation and download information is available at www.clawpack.org/geoclaw. ?? 2011.
Large scale database scrubbing using object oriented software components.
Herting, R L; Barnes, M R
1998-01-01
Now that case managers, quality improvement teams, and researchers use medical databases extensively, the ability to share and disseminate such databases while maintaining patient confidentiality is paramount. A process called scrubbing addresses this problem by removing personally identifying information while keeping the integrity of the medical information intact. Scrubbing entire databases, containing multiple tables, requires that the implicit relationships between data elements in different tables of the database be maintained. To address this issue we developed DBScrub, a Java program that interfaces with any JDBC compliant database and scrubs the database while maintaining the implicit relationships within it. DBScrub uses a small number of highly configurable object-oriented software components to carry out the scrubbing. We describe the structure of these software components and how they maintain the implicit relationships within the database.
Modal Analysis for Grid Operation
DOE Office of Scientific and Technical Information (OSTI.GOV)
MANGO software is to provide a solution for improving small signal stability of power systems through adjusting operator-controllable variables using PMU measurement. System oscillation problems are one of the major threats to the grid stability and reliability in California and the Western Interconnection. These problems result in power fluctuations, lower grid operation efficiency, and may even lead to large-scale grid breakup and outages. This MANGO software aims to solve this problem by automatically generating recommended operation procedures termed Modal Analysis for Grid Operation (MANGO) to improve damping of inter-area oscillation modes. The MANGO procedure includes three steps: recognizing small signalmore » stability problems, implementing operating point adjustment using modal sensitivity, and evaluating the effectiveness of the adjustment. The MANGO software package is designed to help implement the MANGO procedure.« less
Terwilliger, Thomas C; Bricogne, Gerard
2014-10-01
Accurate crystal structures of macromolecules are of high importance in the biological and biomedical fields. Models of crystal structures in the Protein Data Bank (PDB) are in general of very high quality as deposited. However, methods for obtaining the best model of a macromolecular structure from a given set of experimental X-ray data continue to progress at a rapid pace, making it possible to improve most PDB entries after their deposition by re-analyzing the original deposited data with more recent software. This possibility represents a very significant departure from the situation that prevailed when the PDB was created, when it was envisioned as a cumulative repository of static contents. A radical paradigm shift for the PDB is therefore proposed, away from the static archive model towards a much more dynamic body of continuously improving results in symbiosis with continuously improving methods and software. These simultaneous improvements in methods and final results are made possible by the current deposition of processed crystallographic data (structure-factor amplitudes) and will be supported further by the deposition of raw data (diffraction images). It is argued that it is both desirable and feasible to carry out small-scale and large-scale efforts to make this paradigm shift a reality. Small-scale efforts would focus on optimizing structures that are of interest to specific investigators. Large-scale efforts would undertake a systematic re-optimization of all of the structures in the PDB, or alternatively the redetermination of groups of structures that are either related to or focused on specific questions. All of the resulting structures should be made generally available, along with the precursor entries, with various views of the structures being made available depending on the types of questions that users are interested in answering.
Terwilliger, Thomas C.; Bricogne, Gerard
2014-09-30
Accurate crystal structures of macromolecules are of high importance in the biological and biomedical fields. Models of crystal structures in the Protein Data Bank (PDB) are in general of very high quality as deposited. However, methods for obtaining the best model of a macromolecular structure from a given set of experimental X-ray data continue to progress at a rapid pace, making it possible to improve most PDB entries after their deposition by re-analyzing the original deposited data with more recent software. This possibility represents a very significant departure from the situation that prevailed when the PDB was created, when itmore » was envisioned as a cumulative repository of static contents. A radical paradigm shift for the PDB is therefore proposed, away from the static archive model towards a much more dynamic body of continuously improving results in symbiosis with continuously improving methods and software. These simultaneous improvements in methods and final results are made possible by the current deposition of processed crystallographic data (structure-factor amplitudes) and will be supported further by the deposition of raw data (diffraction images). It is argued that it is both desirable and feasible to carry out small-scale and large-scale efforts to make this paradigm shift a reality. Small-scale efforts would focus on optimizing structures that are of interest to specific investigators. Large-scale efforts would undertake a systematic re-optimization of all of the structures in the PDB, or alternatively the redetermination of groups of structures that are either related to or focused on specific questions. All of the resulting structures should be made generally available, along with the precursor entries, with various views of the structures being made available depending on the types of questions that users are interested in answering.« less
Terwilliger, Thomas C.; Bricogne, Gerard
2014-01-01
Accurate crystal structures of macromolecules are of high importance in the biological and biomedical fields. Models of crystal structures in the Protein Data Bank (PDB) are in general of very high quality as deposited. However, methods for obtaining the best model of a macromolecular structure from a given set of experimental X-ray data continue to progress at a rapid pace, making it possible to improve most PDB entries after their deposition by re-analyzing the original deposited data with more recent software. This possibility represents a very significant departure from the situation that prevailed when the PDB was created, when it was envisioned as a cumulative repository of static contents. A radical paradigm shift for the PDB is therefore proposed, away from the static archive model towards a much more dynamic body of continuously improving results in symbiosis with continuously improving methods and software. These simultaneous improvements in methods and final results are made possible by the current deposition of processed crystallographic data (structure-factor amplitudes) and will be supported further by the deposition of raw data (diffraction images). It is argued that it is both desirable and feasible to carry out small-scale and large-scale efforts to make this paradigm shift a reality. Small-scale efforts would focus on optimizing structures that are of interest to specific investigators. Large-scale efforts would undertake a systematic re-optimization of all of the structures in the PDB, or alternatively the redetermination of groups of structures that are either related to or focused on specific questions. All of the resulting structures should be made generally available, along with the precursor entries, with various views of the structures being made available depending on the types of questions that users are interested in answering. PMID:25286839
DOE Office of Scientific and Technical Information (OSTI.GOV)
Terwilliger, Thomas C.; Bricogne, Gerard
Accurate crystal structures of macromolecules are of high importance in the biological and biomedical fields. Models of crystal structures in the Protein Data Bank (PDB) are in general of very high quality as deposited. However, methods for obtaining the best model of a macromolecular structure from a given set of experimental X-ray data continue to progress at a rapid pace, making it possible to improve most PDB entries after their deposition by re-analyzing the original deposited data with more recent software. This possibility represents a very significant departure from the situation that prevailed when the PDB was created, when itmore » was envisioned as a cumulative repository of static contents. A radical paradigm shift for the PDB is therefore proposed, away from the static archive model towards a much more dynamic body of continuously improving results in symbiosis with continuously improving methods and software. These simultaneous improvements in methods and final results are made possible by the current deposition of processed crystallographic data (structure-factor amplitudes) and will be supported further by the deposition of raw data (diffraction images). It is argued that it is both desirable and feasible to carry out small-scale and large-scale efforts to make this paradigm shift a reality. Small-scale efforts would focus on optimizing structures that are of interest to specific investigators. Large-scale efforts would undertake a systematic re-optimization of all of the structures in the PDB, or alternatively the redetermination of groups of structures that are either related to or focused on specific questions. All of the resulting structures should be made generally available, along with the precursor entries, with various views of the structures being made available depending on the types of questions that users are interested in answering.« less
DOVIS 2.0: an efficient and easy to use parallel virtual screening tool based on AutoDock 4.0.
Jiang, Xiaohui; Kumar, Kamal; Hu, Xin; Wallqvist, Anders; Reifman, Jaques
2008-09-08
Small-molecule docking is an important tool in studying receptor-ligand interactions and in identifying potential drug candidates. Previously, we developed a software tool (DOVIS) to perform large-scale virtual screening of small molecules in parallel on Linux clusters, using AutoDock 3.05 as the docking engine. DOVIS enables the seamless screening of millions of compounds on high-performance computing platforms. In this paper, we report significant advances in the software implementation of DOVIS 2.0, including enhanced screening capability, improved file system efficiency, and extended usability. To keep DOVIS up-to-date, we upgraded the software's docking engine to the more accurate AutoDock 4.0 code. We developed a new parallelization scheme to improve runtime efficiency and modified the AutoDock code to reduce excessive file operations during large-scale virtual screening jobs. We also implemented an algorithm to output docked ligands in an industry standard format, sd-file format, which can be easily interfaced with other modeling programs. Finally, we constructed a wrapper-script interface to enable automatic rescoring of docked ligands by arbitrarily selected third-party scoring programs. The significance of the new DOVIS 2.0 software compared with the previous version lies in its improved performance and usability. The new version makes the computation highly efficient by automating load balancing, significantly reducing excessive file operations by more than 95%, providing outputs that conform to industry standard sd-file format, and providing a general wrapper-script interface for rescoring of docked ligands. The new DOVIS 2.0 package is freely available to the public under the GNU General Public License.
Tethys – A Python Package for Spatial and Temporal Downscaling of Global Water Withdrawals
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Xinya; Vernon, Chris R.; Hejazi, Mohamad I.
Downscaling of water withdrawals from regional/national to local scale is a fundamental step and also a common problem when integrating large scale economic and integrated assessment models with high-resolution detailed sectoral models. Tethys, an open-access software written in Python, is developed with statistical downscaling algorithms, to spatially and temporally downscale water withdrawal data to a finer scale. The spatial resolution will be downscaled from region/basin scale to grid (0.5 geographic degree) scale and the temporal resolution will be downscaled from year to month. Tethys is used to produce monthly global gridded water withdrawal products based on estimates from the Globalmore » Change Assessment Model (GCAM).« less
Tethys – A Python Package for Spatial and Temporal Downscaling of Global Water Withdrawals
Li, Xinya; Vernon, Chris R.; Hejazi, Mohamad I.; ...
2018-02-09
Downscaling of water withdrawals from regional/national to local scale is a fundamental step and also a common problem when integrating large scale economic and integrated assessment models with high-resolution detailed sectoral models. Tethys, an open-access software written in Python, is developed with statistical downscaling algorithms, to spatially and temporally downscale water withdrawal data to a finer scale. The spatial resolution will be downscaled from region/basin scale to grid (0.5 geographic degree) scale and the temporal resolution will be downscaled from year to month. Tethys is used to produce monthly global gridded water withdrawal products based on estimates from the Globalmore » Change Assessment Model (GCAM).« less
Exploring Cloud Computing for Large-scale Scientific Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Guang; Han, Binh; Yin, Jian
This paper explores cloud computing for large-scale data-intensive scientific applications. Cloud computing is attractive because it provides hardware and software resources on-demand, which relieves the burden of acquiring and maintaining a huge amount of resources that may be used only once by a scientific application. However, unlike typical commercial applications that often just requires a moderate amount of ordinary resources, large-scale scientific applications often need to process enormous amount of data in the terabyte or even petabyte range and require special high performance hardware with low latency connections to complete computation in a reasonable amount of time. To address thesemore » challenges, we build an infrastructure that can dynamically select high performance computing hardware across institutions and dynamically adapt the computation to the selected resources to achieve high performance. We have also demonstrated the effectiveness of our infrastructure by building a system biology application and an uncertainty quantification application for carbon sequestration, which can efficiently utilize data and computation resources across several institutions.« less
Parallel Index and Query for Large Scale Data Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chou, Jerry; Wu, Kesheng; Ruebel, Oliver
2011-07-18
Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies are critical for facilitating interactive exploration of large datasets, but numerous challenges remain in terms of designing a system for process- ing general scientific datasets. The system needs to be able to run on distributed multi-core platforms, efficiently utilize underlying I/O infrastructure, and scale to massive datasets. We present FastQuery, a novel software framework that address these challenges. FastQuery utilizes a state-of-the-art index and query technology (FastBit) and is designed to process mas- sive datasets on modern supercomputing platforms. We apply FastQuery to processing ofmore » a massive 50TB dataset generated by a large scale accelerator modeling code. We demonstrate the scalability of the tool to 11,520 cores. Motivated by the scientific need to search for inter- esting particles in this dataset, we use our framework to reduce search time from hours to tens of seconds.« less
Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues.
Ernst, Jason; Kellis, Manolis
2015-04-01
With hundreds of epigenomic maps, the opportunity arises to exploit the correlated nature of epigenetic signals, across both marks and samples, for large-scale prediction of additional datasets. Here, we undertake epigenome imputation by leveraging such correlations through an ensemble of regression trees. We impute 4,315 high-resolution signal maps, of which 26% are also experimentally observed. Imputed signal tracks show overall similarity to observed signals and surpass experimental datasets in consistency, recovery of gene annotations and enrichment for disease-associated variants. We use the imputed data to detect low-quality experimental datasets, to find genomic sites with unexpected epigenomic signals, to define high-priority marks for new experiments and to delineate chromatin states in 127 reference epigenomes spanning diverse tissues and cell types. Our imputed datasets provide the most comprehensive human regulatory region annotation to date, and our approach and the ChromImpute software constitute a useful complement to large-scale experimental mapping of epigenomic information.
Python for large-scale electrophysiology.
Spacek, Martin; Blanche, Tim; Swindale, Nicholas
2008-01-01
Electrophysiology is increasingly moving towards highly parallel recording techniques which generate large data sets. We record extracellularly in vivo in cat and rat visual cortex with 54-channel silicon polytrodes, under time-locked visual stimulation, from localized neuronal populations within a cortical column. To help deal with the complexity of generating and analysing these data, we used the Python programming language to develop three software projects: one for temporally precise visual stimulus generation ("dimstim"); one for electrophysiological waveform visualization and spike sorting ("spyke"); and one for spike train and stimulus analysis ("neuropy"). All three are open source and available for download (http://swindale.ecc.ubc.ca/code). The requirements and solutions for these projects differed greatly, yet we found Python to be well suited for all three. Here we present our software as a showcase of the extensive capabilities of Python in neuroscience.
ART-Ada: An Ada-based expert system tool
NASA Technical Reports Server (NTRS)
Lee, S. Daniel; Allen, Bradley P.
1990-01-01
The Department of Defense mandate to standardize on Ada as the language for software systems development has resulted in an increased interest in making expert systems technology readily available in Ada environments. NASA's Space Station Freedom is an example of the large Ada software development projects that will require expert systems in the 1990's. Another large scale application that can benefit from Ada based expert system tool technology is the Pilot's Associate (PA) expert system project for military combat aircraft. The Automated Reasoning Tool-Ada (ART-Ada), an Ada expert system tool, is explained. ART-Ada allows applications of a C-based expert system tool called ART-IM to be deployed in various Ada environments. ART-Ada is being used to implement several prototype expert systems for NASA's Space Station Freedom program and the U.S. Air Force.
ART-Ada: An Ada-based expert system tool
NASA Technical Reports Server (NTRS)
Lee, S. Daniel; Allen, Bradley P.
1991-01-01
The Department of Defense mandate to standardize on Ada as the language for software systems development has resulted in increased interest in making expert systems technology readily available in Ada environments. NASA's Space Station Freedom is an example of the large Ada software development projects that will require expert systems in the 1990's. Another large scale application that can benefit from Ada based expert system tool technology is the Pilot's Associate (PA) expert system project for military combat aircraft. Automated Reasoning Tool (ART) Ada, an Ada Expert system tool is described. ART-Ada allow applications of a C-based expert system tool called ART-IM to be deployed in various Ada environments. ART-Ada is being used to implement several prototype expert systems for NASA's Space Station Freedom Program and the U.S. Air Force.
Panoptes: web-based exploration of large scale genome variation data.
Vauterin, Paul; Jeffery, Ben; Miles, Alistair; Amato, Roberto; Hart, Lee; Wright, Ian; Kwiatkowski, Dominic
2017-10-15
The size and complexity of modern large-scale genome variation studies demand novel approaches for exploring and sharing the data. In order to unlock the potential of these data for a broad audience of scientists with various areas of expertise, a unified exploration framework is required that is accessible, coherent and user-friendly. Panoptes is an open-source software framework for collaborative visual exploration of large-scale genome variation data and associated metadata in a web browser. It relies on technology choices that allow it to operate in near real-time on very large datasets. It can be used to browse rich, hybrid content in a coherent way, and offers interactive visual analytics approaches to assist the exploration. We illustrate its application using genome variation data of Anopheles gambiae, Plasmodium falciparum and Plasmodium vivax. Freely available at https://github.com/cggh/panoptes, under the GNU Affero General Public License. paul.vauterin@gmail.com. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Analysis of BJ493 diesel engine lubrication system properties
NASA Astrophysics Data System (ADS)
Liu, F.
2017-12-01
The BJ493ZLQ4A diesel engine design is based on the primary model of BJ493ZLQ3, of which exhaust level is upgraded to the National GB5 standard due to the improved design of combustion and injection systems. Given the above changes in the diesel lubrication system, its improved properties are analyzed in this paper. According to the structures, technical parameters and indices of the lubrication system, the lubrication system model of BJ493ZLQ4A diesel engine was constructed using the Flowmaster flow simulation software. The properties of the diesel engine lubrication system, such as the oil flow rate and pressure at different rotational speeds were analyzed for the schemes involving large- and small-scale oil filters. The calculated values of the main oil channel pressure are in good agreement with the experimental results, which verifies the proposed model feasibility. The calculation results show that the main oil channel pressure and maximum oil flow rate values for the large-scale oil filter scheme satisfy the design requirements, while the small-scale scheme yields too low main oil channel’s pressure and too high. Therefore, application of small-scale oil filters is hazardous, and the large-scale scheme is recommended.
Towards Large-area Field-scale Operational Evapotranspiration for Water Use Mapping
NASA Astrophysics Data System (ADS)
Senay, G. B.; Friedrichs, M.; Morton, C.; Huntington, J. L.; Verdin, J.
2017-12-01
Field-scale evapotranspiration (ET) estimates are needed for improving surface and groundwater use and water budget studies. Ideally, field-scale ET estimates would be at regional to national levels and cover long time periods. As a result of large data storage and computational requirements associated with processing field-scale satellite imagery such as Landsat, numerous challenges remain to develop operational ET estimates over large areas for detailed water use and availability studies. However, the combination of new science, data availability, and cloud computing technology is enabling unprecedented capabilities for ET mapping. To demonstrate this capability, we used Google's Earth Engine cloud computing platform to create nationwide annual ET estimates with 30-meter resolution Landsat ( 16,000 images) and gridded weather data using the Operational Simplified Surface Energy Balance (SSEBop) model in support of the National Water Census, a USGS research program designed to build decision support capacity for water management agencies and other natural resource managers. By leveraging Google's Earth Engine Application Programming Interface (API) and developing software in a collaborative, open-platform environment, we rapidly advance from research towards applications for large-area field-scale ET mapping. Cloud computing of the Landsat image archive combined with other satellite, climate, and weather data, is creating never imagined opportunities for assessing ET model behavior and uncertainty, and ultimately providing the ability for more robust operational monitoring and assessment of water use at field-scales.
Bi-Force: large-scale bicluster editing and its application to gene expression data biclustering
Sun, Peng; Speicher, Nora K.; Röttger, Richard; Guo, Jiong; Baumbach, Jan
2014-01-01
Abstract The explosion of the biological data has dramatically reformed today's biological research. The need to integrate and analyze high-dimensional biological data on a large scale is driving the development of novel bioinformatics approaches. Biclustering, also known as ‘simultaneous clustering’ or ‘co-clustering’, has been successfully utilized to discover local patterns in gene expression data and similar biomedical data types. Here, we contribute a new heuristic: ‘Bi-Force’. It is based on the weighted bicluster editing model, to perform biclustering on arbitrary sets of biological entities, given any kind of pairwise similarities. We first evaluated the power of Bi-Force to solve dedicated bicluster editing problems by comparing Bi-Force with two existing algorithms in the BiCluE software package. We then followed a biclustering evaluation protocol in a recent review paper from Eren et al. (2013) (A comparative analysis of biclustering algorithms for gene expressiondata. Brief. Bioinform., 14:279–292.) and compared Bi-Force against eight existing tools: FABIA, QUBIC, Cheng and Church, Plaid, BiMax, Spectral, xMOTIFs and ISA. To this end, a suite of synthetic datasets as well as nine large gene expression datasets from Gene Expression Omnibus were analyzed. All resulting biclusters were subsequently investigated by Gene Ontology enrichment analysis to evaluate their biological relevance. The distinct theoretical foundation of Bi-Force (bicluster editing) is more powerful than strict biclustering. We thus outperformed existing tools with Bi-Force at least when following the evaluation protocols from Eren et al. Bi-Force is implemented in Java and integrated into the open source software package of BiCluE. The software as well as all used datasets are publicly available at http://biclue.mpi-inf.mpg.de. PMID:24682815
xSDK Foundations: Toward an Extreme-scale Scientific Software Development Kit
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heroux, Michael A.; Bartlett, Roscoe; Demeshko, Irina
Here, extreme-scale computational science increasingly demands multiscale and multiphysics formulations. Combining software developed by independent groups is imperative: no single team has resources for all predictive science and decision support capabilities. Scientific libraries provide high-quality, reusable software components for constructing applications with improved robustness and portability. However, without coordination, many libraries cannot be easily composed. Namespace collisions, inconsistent arguments, lack of third-party software versioning, and additional difficulties make composition costly. The Extreme-scale Scientific Software Development Kit (xSDK) defines community policies to improve code quality and compatibility across independently developed packages (hypre, PETSc, SuperLU, Trilinos, and Alquimia) and provides a foundationmore » for addressing broader issues in software interoperability, performance portability, and sustainability. The xSDK provides turnkey installation of member software and seamless combination of aggregate capabilities, and it marks first steps toward extreme-scale scientific software ecosystems from which future applications can be composed rapidly with assured quality and scalability.« less
xSDK Foundations: Toward an Extreme-scale Scientific Software Development Kit
Heroux, Michael A.; Bartlett, Roscoe; Demeshko, Irina; ...
2017-03-01
Here, extreme-scale computational science increasingly demands multiscale and multiphysics formulations. Combining software developed by independent groups is imperative: no single team has resources for all predictive science and decision support capabilities. Scientific libraries provide high-quality, reusable software components for constructing applications with improved robustness and portability. However, without coordination, many libraries cannot be easily composed. Namespace collisions, inconsistent arguments, lack of third-party software versioning, and additional difficulties make composition costly. The Extreme-scale Scientific Software Development Kit (xSDK) defines community policies to improve code quality and compatibility across independently developed packages (hypre, PETSc, SuperLU, Trilinos, and Alquimia) and provides a foundationmore » for addressing broader issues in software interoperability, performance portability, and sustainability. The xSDK provides turnkey installation of member software and seamless combination of aggregate capabilities, and it marks first steps toward extreme-scale scientific software ecosystems from which future applications can be composed rapidly with assured quality and scalability.« less
NASA Astrophysics Data System (ADS)
Afanasiev, M.; Boehm, C.; van Driel, M.; Krischer, L.; May, D.; Rietmann, M.; Fichtner, A.
2016-12-01
Recent years have been witness to the application of waveform inversion to new and exciting domains, ranging from non-destructive testing to global seismology. Often, each new application brings with it novel wave propagation physics, spatial and temporal discretizations, and models of variable complexity. Adapting existing software to these novel applications often requires a significant investment of time, and acts as a barrier to progress. To combat these problems we introduce Salvus, a software package designed to solve large-scale full-waveform inverse problems, with a focus on both flexibility and performance. Based on a high order finite (spectral) element discretization, we have built Salvus to work on unstructured quad/hex meshes in both 2 or 3 dimensions, with support for P1-P3 bases on triangles and tetrahedra. A diverse (and expanding) collection of wave propagation physics are supported (i.e. coupled solid-fluid). With a focus on the inverse problem, functionality is provided to ease integration with internal and external optimization libraries. Additionally, a python-based meshing package is included to simplify the generation and manipulation of regional to global scale Earth models (quad/hex), with interfaces available to external mesh generators for complex engineering-scale applications (quad/hex/tri/tet). Finally, to ensure that the code remains accurate and maintainable, we build upon software libraries such as PETSc and Eigen, and follow modern software design and testing protocols. Salvus bridges the gap between research and production codes with a design based on C++ mixins and Python wrappers that separates the physical equations from the numerical core. This allows domain scientists to add new equations using a high-level interface, without having to worry about optimized implementation details. Our goal in this presentation is to introduce the code, show several examples across the scales, and discuss some of the extensible design points.
NASA Astrophysics Data System (ADS)
Afanasiev, Michael; Boehm, Christian; van Driel, Martin; Krischer, Lion; May, Dave; Rietmann, Max; Fichtner, Andreas
2017-04-01
Recent years have been witness to the application of waveform inversion to new and exciting domains, ranging from non-destructive testing to global seismology. Often, each new application brings with it novel wave propagation physics, spatial and temporal discretizations, and models of variable complexity. Adapting existing software to these novel applications often requires a significant investment of time, and acts as a barrier to progress. To combat these problems we introduce Salvus, a software package designed to solve large-scale full-waveform inverse problems, with a focus on both flexibility and performance. Currently based on an abstract implementation of high order finite (spectral) elements, we have built Salvus to work on unstructured quad/hex meshes in both 2 or 3 dimensions, with support for P1-P3 bases on triangles and tetrahedra. A diverse (and expanding) collection of wave propagation physics are supported (i.e. viscoelastic, coupled solid-fluid). With a focus on the inverse problem, functionality is provided to ease integration with internal and external optimization libraries. Additionally, a python-based meshing package is included to simplify the generation and manipulation of regional to global scale Earth models (quad/hex), with interfaces available to external mesh generators for complex engineering-scale applications (quad/hex/tri/tet). Finally, to ensure that the code remains accurate and maintainable, we build upon software libraries such as PETSc and Eigen, and follow modern software design and testing protocols. Salvus bridges the gap between research and production codes with a design based on C++ template mixins and Python wrappers that separates the physical equations from the numerical core. This allows domain scientists to add new equations using a high-level interface, without having to worry about optimized implementation details. Our goal in this presentation is to introduce the code, show several examples across the scales, and discuss some of the extensible design points.
Design and Development of a Run-Time Monitor for Multi-Core Architectures in Cloud Computing
Kang, Mikyung; Kang, Dong-In; Crago, Stephen P.; Park, Gyung-Leen; Lee, Junghoon
2011-01-01
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications as services over the Internet as well as infrastructure. A cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring system status changes, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design and develop a Run-Time Monitor (RTM) which is a system software to monitor the application behavior at run-time, analyze the collected information, and optimize cloud computing resources for multi-core architectures. RTM monitors application software through library instrumentation as well as underlying hardware through a performance counter optimizing its computing configuration based on the analyzed data. PMID:22163811
Design and development of a run-time monitor for multi-core architectures in cloud computing.
Kang, Mikyung; Kang, Dong-In; Crago, Stephen P; Park, Gyung-Leen; Lee, Junghoon
2011-01-01
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications as services over the Internet as well as infrastructure. A cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring system status changes, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design and develop a Run-Time Monitor (RTM) which is a system software to monitor the application behavior at run-time, analyze the collected information, and optimize cloud computing resources for multi-core architectures. RTM monitors application software through library instrumentation as well as underlying hardware through a performance counter optimizing its computing configuration based on the analyzed data.
Tessera: Open source software for accelerated data science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sego, Landon H.; Hafen, Ryan P.; Director, Hannah M.
2014-06-30
Extracting useful, actionable information from data can be a formidable challenge for the safeguards, nonproliferation, and arms control verification communities. Data scientists are often on the “front-lines” of making sense of complex and large datasets. They require flexible tools that make it easy to rapidly reformat large datasets, interactively explore and visualize data, develop statistical algorithms, and validate their approaches—and they need to perform these activities with minimal lines of code. Existing commercial software solutions often lack extensibility and the flexibility required to address the nuances of the demanding and dynamic environments where data scientists work. To address this need,more » Pacific Northwest National Laboratory developed Tessera, an open source software suite designed to enable data scientists to interactively perform their craft at the terabyte scale. Tessera automatically manages the complicated tasks of distributed storage and computation, empowering data scientists to do what they do best: tackling critical research and mission objectives by deriving insight from data. We illustrate the use of Tessera with an example analysis of computer network data.« less
NASA Technical Reports Server (NTRS)
Vanderplaats, Garrett; Townsend, James C. (Technical Monitor)
2002-01-01
The purpose of this research under the NASA Small Business Innovative Research program was to develop algorithms and associated software to solve very large nonlinear, constrained optimization tasks. Key issues included efficiency, reliability, memory, and gradient calculation requirements. This report describes the general optimization problem, ten candidate methods, and detailed evaluations of four candidates. The algorithm chosen for final development is a modern recreation of a 1960s external penalty function method that uses very limited computer memory and computational time. Although of lower efficiency, the new method can solve problems orders of magnitude larger than current methods. The resulting BIGDOT software has been demonstrated on problems with 50,000 variables and about 50,000 active constraints. For unconstrained optimization, it has solved a problem in excess of 135,000 variables. The method includes a technique for solving discrete variable problems that finds a "good" design, although a theoretical optimum cannot be guaranteed. It is very scalable in that the number of function and gradient evaluations does not change significantly with increased problem size. Test cases are provided to demonstrate the efficiency and reliability of the methods and software.
Evolving from bioinformatics in-the-small to bioinformatics in-the-large.
Parker, D Stott; Gorlick, Michael M; Lee, Christopher J
2003-01-01
We argue the significance of a fundamental shift in bioinformatics, from in-the-small to in-the-large. Adopting a large-scale perspective is a way to manage the problems endemic to the world of the small-constellations of incompatible tools for which the effort required to assemble an integrated system exceeds the perceived benefit of the integration. Where bioinformatics in-the-small is about data and tools, bioinformatics in-the-large is about metadata and dependencies. Dependencies represent the complexities of large-scale integration, including the requirements and assumptions governing the composition of tools. The popular make utility is a very effective system for defining and maintaining simple dependencies, and it offers a number of insights about the essence of bioinformatics in-the-large. Keeping an in-the-large perspective has been very useful to us in large bioinformatics projects. We give two fairly different examples, and extract lessons from them showing how it has helped. These examples both suggest the benefit of explicitly defining and managing knowledge flows and knowledge maps (which represent metadata regarding types, flows, and dependencies), and also suggest approaches for developing bioinformatics database systems. Generally, we argue that large-scale engineering principles can be successfully adapted from disciplines such as software engineering and data management, and that having an in-the-large perspective will be a key advantage in the next phase of bioinformatics development.
bioNerDS: exploring bioinformatics’ database and software use through literature mining
2013-01-01
Background Biology-focused databases and software define bioinformatics and their use is central to computational biology. In such a complex and dynamic field, it is of interest to understand what resources are available, which are used, how much they are used, and for what they are used. While scholarly literature surveys can provide some insights, large-scale computer-based approaches to identify mentions of bioinformatics databases and software from primary literature would automate systematic cataloguing, facilitate the monitoring of usage, and provide the foundations for the recovery of computational methods for analysing biological data, with the long-term aim of identifying best/common practice in different areas of biology. Results We have developed bioNerDS, a named entity recogniser for the recovery of bioinformatics databases and software from primary literature. We identify such entities with an F-measure ranging from 63% to 91% at the mention level and 63-78% at the document level, depending on corpus. Not attaining a higher F-measure is mostly due to high ambiguity in resource naming, which is compounded by the on-going introduction of new resources. To demonstrate the software, we applied bioNerDS to full-text articles from BMC Bioinformatics and Genome Biology. General mention patterns reflect the remit of these journals, highlighting BMC Bioinformatics’s emphasis on new tools and Genome Biology’s greater emphasis on data analysis. The data also illustrates some shifts in resource usage: for example, the past decade has seen R and the Gene Ontology join BLAST and GenBank as the main components in bioinformatics processing. Abstract Conclusions We demonstrate the feasibility of automatically identifying resource names on a large-scale from the scientific literature and show that the generated data can be used for exploration of bioinformatics database and software usage. For example, our results help to investigate the rate of change in resource usage and corroborate the suspicion that a vast majority of resources are created, but rarely (if ever) used thereafter. bioNerDS is available at http://bionerds.sourceforge.net/. PMID:23768135
Strategies for Energy Efficient Resource Management of Hybrid Programming Models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Dong; Supinski, Bronis de; Schulz, Martin
2013-01-01
Many scientific applications are programmed using hybrid programming models that use both message-passing and shared-memory, due to the increasing prevalence of large-scale systems with multicore, multisocket nodes. Previous work has shown that energy efficiency can be improved using software-controlled execution schemes that consider both the programming model and the power-aware execution capabilities of the system. However, such approaches have focused on identifying optimal resource utilization for one programming model, either shared-memory or message-passing, in isolation. The potential solution space, thus the challenge, increases substantially when optimizing hybrid models since the possible resource configurations increase exponentially. Nonetheless, with the accelerating adoptionmore » of hybrid programming models, we increasingly need improved energy efficiency in hybrid parallel applications on large-scale systems. In this work, we present new software-controlled execution schemes that consider the effects of dynamic concurrency throttling (DCT) and dynamic voltage and frequency scaling (DVFS) in the context of hybrid programming models. Specifically, we present predictive models and novel algorithms based on statistical analysis that anticipate application power and time requirements under different concurrency and frequency configurations. We apply our models and methods to the NPB MZ benchmarks and selected applications from the ASC Sequoia codes. Overall, we achieve substantial energy savings (8.74% on average and up to 13.8%) with some performance gain (up to 7.5%) or negligible performance loss.« less
Extreme-Scale De Novo Genome Assembly
DOE Office of Scientific and Technical Information (OSTI.GOV)
Georganas, Evangelos; Hofmeyr, Steven; Egan, Rob
De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMER, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and themore » large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.« less
PD5: a general purpose library for primer design software.
Riley, Michael C; Aubrey, Wayne; Young, Michael; Clare, Amanda
2013-01-01
Complex PCR applications for large genome-scale projects require fast, reliable and often highly sophisticated primer design software applications. Presently, such applications use pipelining methods to utilise many third party applications and this involves file parsing, interfacing and data conversion, which is slow and prone to error. A fully integrated suite of software tools for primer design would considerably improve the development time, the processing speed, and the reliability of bespoke primer design software applications. The PD5 software library is an open-source collection of classes and utilities, providing a complete collection of software building blocks for primer design and analysis. It is written in object-oriented C(++) with an emphasis on classes suitable for efficient and rapid development of bespoke primer design programs. The modular design of the software library simplifies the development of specific applications and also integration with existing third party software where necessary. We demonstrate several applications created using this software library that have already proved to be effective, but we view the project as a dynamic environment for building primer design software and it is open for future development by the bioinformatics community. Therefore, the PD5 software library is published under the terms of the GNU General Public License, which guarantee access to source-code and allow redistribution and modification. The PD5 software library is downloadable from Google Code and the accompanying Wiki includes instructions and examples: http://code.google.com/p/primer-design.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, Joseph; Pirrung, Meg; McCue, Lee Ann
FQC is software that facilitates large-scale quality control of FASTQ files by carrying out a QC protocol, parsing results, and aggregating quality metrics within and across experiments into an interactive dashboard. The dashboard utilizes human-readable configuration files to manipulate the pages and tabs, and is extensible with CSV data.
Avionic Data Bus Integration Technology
1991-12-01
address the hardware-software interaction between a digital data bus and an avionic system. Very Large Scale Integration (VLSI) ICs and multiversion ...the SCP. In 1984, the Sperry Corporation developed a fault tolerant system which employed multiversion programming, voting, and monitoring for error... MULTIVERSION PROGRAMMING. N-version programming. 226 N-VERSION PROGRAMMING. The independent coding of a number, N, of redundant computer programs that
Structural dynamics payload loads estimates
NASA Technical Reports Server (NTRS)
Engels, R. C.
1982-01-01
Methods for the prediction of loads on large space structures are discussed. Existing approaches to the problem of loads calculation are surveyed. A full scale version of an alternate numerical integration technique to solve the response part of a load cycle is presented, and a set of short cut versions of the algorithm developed. The implementation of these techniques using the software package developed is discussed.
Teaching the blind to find their way by playing video games.
Merabet, Lotfi B; Connors, Erin C; Halko, Mark A; Sánchez, Jaime
2012-01-01
Computer based video games are receiving great interest as a means to learn and acquire new skills. As a novel approach to teaching navigation skills in the blind, we have developed Audio-based Environment Simulator (AbES); a virtual reality environment set within the context of a video game metaphor. Despite the fact that participants were naïve to the overall purpose of the software, we found that early blind users were able to acquire relevant information regarding the spatial layout of a previously unfamiliar building using audio based cues alone. This was confirmed by a series of behavioral performance tests designed to assess the transfer of acquired spatial information to a large-scale, real-world indoor navigation task. Furthermore, learning the spatial layout through a goal directed gaming strategy allowed for the mental manipulation of spatial information as evidenced by enhanced navigation performance when compared to an explicit route learning strategy. We conclude that the immersive and highly interactive nature of the software greatly engages the blind user to actively explore the virtual environment. This in turn generates an accurate sense of a large-scale three-dimensional space and facilitates the learning and transfer of navigation skills to the physical world.
Tsugawa, Hiroshi; Ohta, Erika; Izumi, Yoshihiro; Ogiwara, Atsushi; Yukihira, Daichi; Bamba, Takeshi; Fukusaki, Eiichiro; Arita, Masanori
2015-01-01
Based on theoretically calculated comprehensive lipid libraries, in lipidomics as many as 1000 multiple reaction monitoring (MRM) transitions can be monitored for each single run. On the other hand, lipid analysis from each MRM chromatogram requires tremendous manual efforts to identify and quantify lipid species. Isotopic peaks differing by up to a few atomic masses further complicate analysis. To accelerate the identification and quantification process we developed novel software, MRM-DIFF, for the differential analysis of large-scale MRM assays. It supports a correlation optimized warping (COW) algorithm to align MRM chromatograms and utilizes quality control (QC) sample datasets to automatically adjust the alignment parameters. Moreover, user-defined reference libraries that include the molecular formula, retention time, and MRM transition can be used to identify target lipids and to correct peak abundances by considering isotopic peaks. Here, we demonstrate the software pipeline and introduce key points for MRM-based lipidomics research to reduce the mis-identification and overestimation of lipid profiles. The MRM-DIFF program, example data set and the tutorials are downloadable at the “Standalone software” section of the PRIMe (Platform for RIKEN Metabolomics, http://prime.psc.riken.jp/) database website. PMID:25688256
ProteoLens: a visual analytic tool for multi-scale database-driven biological network data mining.
Huan, Tianxiao; Sivachenko, Andrey Y; Harrison, Scott H; Chen, Jake Y
2008-08-12
New systems biology studies require researchers to understand how interplay among myriads of biomolecular entities is orchestrated in order to achieve high-level cellular and physiological functions. Many software tools have been developed in the past decade to help researchers visually navigate large networks of biomolecular interactions with built-in template-based query capabilities. To further advance researchers' ability to interrogate global physiological states of cells through multi-scale visual network explorations, new visualization software tools still need to be developed to empower the analysis. A robust visual data analysis platform driven by database management systems to perform bi-directional data processing-to-visualizations with declarative querying capabilities is needed. We developed ProteoLens as a JAVA-based visual analytic software tool for creating, annotating and exploring multi-scale biological networks. It supports direct database connectivity to either Oracle or PostgreSQL database tables/views, on which SQL statements using both Data Definition Languages (DDL) and Data Manipulation languages (DML) may be specified. The robust query languages embedded directly within the visualization software help users to bring their network data into a visualization context for annotation and exploration. ProteoLens supports graph/network represented data in standard Graph Modeling Language (GML) formats, and this enables interoperation with a wide range of other visual layout tools. The architectural design of ProteoLens enables the de-coupling of complex network data visualization tasks into two distinct phases: 1) creating network data association rules, which are mapping rules between network node IDs or edge IDs and data attributes such as functional annotations, expression levels, scores, synonyms, descriptions etc; 2) applying network data association rules to build the network and perform the visual annotation of graph nodes and edges according to associated data values. We demonstrated the advantages of these new capabilities through three biological network visualization case studies: human disease association network, drug-target interaction network and protein-peptide mapping network. The architectural design of ProteoLens makes it suitable for bioinformatics expert data analysts who are experienced with relational database management to perform large-scale integrated network visual explorations. ProteoLens is a promising visual analytic platform that will facilitate knowledge discoveries in future network and systems biology studies.
Validation of a free software for unsupervised assessment of abdominal fat in MRI.
Maddalo, Michele; Zorza, Ivan; Zubani, Stefano; Nocivelli, Giorgio; Calandra, Giulio; Soldini, Pierantonio; Mascaro, Lorella; Maroldi, Roberto
2017-05-01
To demonstrate the accuracy of an unsupervised (fully automated) software for fat segmentation in magnetic resonance imaging. The proposed software is a freeware solution developed in ImageJ that enables the quantification of metabolically different adipose tissues in large cohort studies. The lumbar part of the abdomen (19cm in craniocaudal direction, centered in L3) of eleven healthy volunteers (age range: 21-46years, BMI range: 21.7-31.6kg/m 2 ) was examined in a breath hold on expiration with a GE T1 Dixon sequence. Single-slice and volumetric data were considered for each subject. The results of the visceral and subcutaneous adipose tissue assessments obtained by the unsupervised software were compared to supervised segmentations of reference. The associated statistical analysis included Pearson correlations, Bland-Altman plots and volumetric differences (VD % ). Values calculated by the unsupervised software significantly correlated with corresponding supervised segmentations of reference for both subcutaneous adipose tissue - SAT (R=0.9996, p<0.001) and visceral adipose tissue - VAT (R=0.995, p<0.001). Bland-Altman plots showed the absence of systematic errors and a limited spread of the differences. In the single-slice analysis, VD % were (1.6±2.9)% for SAT and (4.9±6.9)% for VAT. In the volumetric analysis, VD % were (1.3±0.9)% for SAT and (2.9±2.7)% for VAT. The developed software is capable of segmenting the metabolically different adipose tissues with a high degree of accuracy. This free add-on software for ImageJ can easily have a widespread and enable large-scale population studies regarding the adipose tissue and its related diseases. Copyright © 2017 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
hEIDI: An Intuitive Application Tool To Organize and Treat Large-Scale Proteomics Data.
Hesse, Anne-Marie; Dupierris, Véronique; Adam, Claire; Court, Magali; Barthe, Damien; Emadali, Anouk; Masselon, Christophe; Ferro, Myriam; Bruley, Christophe
2016-10-07
Advances in high-throughput proteomics have led to a rapid increase in the number, size, and complexity of the associated data sets. Managing and extracting reliable information from such large series of data sets require the use of dedicated software organized in a consistent pipeline to reduce, validate, exploit, and ultimately export data. The compilation of multiple mass-spectrometry-based identification and quantification results obtained in the context of a large-scale project represents a real challenge for developers of bioinformatics solutions. In response to this challenge, we developed a dedicated software suite called hEIDI to manage and combine both identifications and semiquantitative data related to multiple LC-MS/MS analyses. This paper describes how, through a user-friendly interface, hEIDI can be used to compile analyses and retrieve lists of nonredundant protein groups. Moreover, hEIDI allows direct comparison of series of analyses, on the basis of protein groups, while ensuring consistent protein inference and also computing spectral counts. hEIDI ensures that validated results are compliant with MIAPE guidelines as all information related to samples and results is stored in appropriate databases. Thanks to the database structure, validated results generated within hEIDI can be easily exported in the PRIDE XML format for subsequent publication. hEIDI can be downloaded from http://biodev.extra.cea.fr/docs/heidi .
Estimate of the Potential Costs and Effectiveness of Scaling Up CRESST Assessment Software.
ERIC Educational Resources Information Center
Chung, Gregory K. W. K.; Herl, Howard E.; Klein, Davina C. D.; O'Neil, Harold F., Jr.; Schacter, John
This report examines issues in the scale-up of assessment software from the Center for Research on Evaluation, Standards, and Student Testing (CRESST). "Scale-up" is used in a metaphorical sense, meaning adding new assessment tools to CRESST's assessment software. During the past several years, CRESST has been developing and evaluating a…
A new parallel-vector finite element analysis software on distributed-memory computers
NASA Technical Reports Server (NTRS)
Qin, Jiangning; Nguyen, Duc T.
1993-01-01
A new parallel-vector finite element analysis software package MPFEA (Massively Parallel-vector Finite Element Analysis) is developed for large-scale structural analysis on massively parallel computers with distributed-memory. MPFEA is designed for parallel generation and assembly of the global finite element stiffness matrices as well as parallel solution of the simultaneous linear equations, since these are often the major time-consuming parts of a finite element analysis. Block-skyline storage scheme along with vector-unrolling techniques are used to enhance the vector performance. Communications among processors are carried out concurrently with arithmetic operations to reduce the total execution time. Numerical results on the Intel iPSC/860 computers (such as the Intel Gamma with 128 processors and the Intel Touchstone Delta with 512 processors) are presented, including an aircraft structure and some very large truss structures, to demonstrate the efficiency and accuracy of MPFEA.
Python for Large-Scale Electrophysiology
Spacek, Martin; Blanche, Tim; Swindale, Nicholas
2008-01-01
Electrophysiology is increasingly moving towards highly parallel recording techniques which generate large data sets. We record extracellularly in vivo in cat and rat visual cortex with 54-channel silicon polytrodes, under time-locked visual stimulation, from localized neuronal populations within a cortical column. To help deal with the complexity of generating and analysing these data, we used the Python programming language to develop three software projects: one for temporally precise visual stimulus generation (“dimstim”); one for electrophysiological waveform visualization and spike sorting (“spyke”); and one for spike train and stimulus analysis (“neuropy”). All three are open source and available for download (http://swindale.ecc.ubc.ca/code). The requirements and solutions for these projects differed greatly, yet we found Python to be well suited for all three. Here we present our software as a showcase of the extensive capabilities of Python in neuroscience. PMID:19198646
Topal, Taner; Polat, Hüseyin; Güler, Inan
2008-10-01
In this paper, a time-frequency spectral analysis software (Heart Sound Analyzer) for the computer-aided analysis of cardiac sounds has been developed with LabVIEW. Software modules reveal important information for cardiovascular disorders, it can also assist to general physicians to come up with more accurate and reliable diagnosis at early stages. Heart sound analyzer (HSA) software can overcome the deficiency of expert doctors and help them in rural as well as urban clinics and hospitals. HSA has two main blocks: data acquisition and preprocessing, time-frequency spectral analyses. The heart sounds are first acquired using a modified stethoscope which has an electret microphone in it. Then, the signals are analysed using the time-frequency/scale spectral analysis techniques such as STFT, Wigner-Ville distribution and wavelet transforms. HSA modules have been tested with real heart sounds from 35 volunteers and proved to be quite efficient and robust while dealing with a large variety of pathological conditions.
Design Aspects of the Rayleigh Convection Code
NASA Astrophysics Data System (ADS)
Featherstone, N. A.
2017-12-01
Understanding the long-term generation of planetary or stellar magnetic field requires complementary knowledge of the large-scale fluid dynamics pervading large fractions of the object's interior. Such large-scale motions are sensitive to the system's geometry which, in planets and stars, is spherical to a good approximation. As a result, computational models designed to study such systems often solve the MHD equations in spherical geometry, frequently employing a spectral approach involving spherical harmonics. We present computational and user-interface design aspects of one such modeling tool, the Rayleigh convection code, which is suitable for deployment on desktop and petascale-hpc architectures alike. In this poster, we will present an overview of this code's parallel design and its built-in diagnostics-output package. Rayleigh has been developed with NSF support through the Computational Infrastructure for Geodynamics and is expected to be released as open-source software in winter 2017/2018.
Lee, Jae H.; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T.; Seo, Youngho
2014-01-01
The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum likelihood expectation maximum (MLEM) used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to- program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains with the goal to eventually make it useable in clinical setting. PMID:27081299
Lee, Jae H; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T; Seo, Youngho
2014-11-01
The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum likelihood expectation maximum (MLEM) used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to- program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains with the goal to eventually make it useable in clinical setting.
NASA Astrophysics Data System (ADS)
Xie, Hongbo; Mao, Chensheng; Ren, Yongjie; Zhu, Jigui; Wang, Chao; Yang, Lei
2017-10-01
In high precision and large-scale coordinate measurement, one commonly used approach to determine the coordinate of a target point is utilizing the spatial trigonometric relationships between multiple laser transmitter stations and the target point. A light receiving device at the target point is the key element in large-scale coordinate measurement systems. To ensure high-resolution and highly sensitive spatial coordinate measurement, a high-performance and miniaturized omnidirectional single-point photodetector (OSPD) is greatly desired. We report one design of OSPD using an aspheric lens, which achieves an enhanced reception angle of -5 deg to 45 deg in vertical and 360 deg in horizontal. As the heart of our OSPD, the aspheric lens is designed in a geometric model and optimized by LightTools Software, which enables the reflection of a wide-angle incident light beam into the single-point photodiode. The performance of home-made OSPD is characterized with working distances from 1 to 13 m and further analyzed utilizing developed a geometric model. The experimental and analytic results verify that our device is highly suitable for large-scale coordinate metrology. The developed device also holds great potential in various applications such as omnidirectional vision sensor, indoor global positioning system, and optical wireless communication systems.
Active Self-Testing Noise Measurement Sensors for Large-Scale Environmental Sensor Networks
Domínguez, Federico; Cuong, Nguyen The; Reinoso, Felipe; Touhafi, Abdellah; Steenhaut, Kris
2013-01-01
Large-scale noise pollution sensor networks consist of hundreds of spatially distributed microphones that measure environmental noise. These networks provide historical and real-time environmental data to citizens and decision makers and are therefore a key technology to steer environmental policy. However, the high cost of certified environmental microphone sensors render large-scale environmental networks prohibitively expensive. Several environmental network projects have started using off-the-shelf low-cost microphone sensors to reduce their costs, but these sensors have higher failure rates and produce lower quality data. To offset this disadvantage, we developed a low-cost noise sensor that actively checks its condition and indirectly the integrity of the data it produces. The main design concept is to embed a 13 mm speaker in the noise sensor casing and, by regularly scheduling a frequency sweep, estimate the evolution of the microphone's frequency response over time. This paper presents our noise sensor's hardware and software design together with the results of a test deployment in a large-scale environmental network in Belgium. Our middle-range-value sensor (around €50) effectively detected all experienced malfunctions, in laboratory tests and outdoor deployments, with a few false positives. Future improvements could further lower the cost of our sensor below €10. PMID:24351634
Computer-generated forces in distributed interactive simulation
NASA Astrophysics Data System (ADS)
Petty, Mikel D.
1995-04-01
Distributed Interactive Simulation (DIS) is an architecture for building large-scale simulation models from a set of independent simulator nodes communicating via a common network protocol. DIS is most often used to create a simulated battlefield for military training. Computer Generated Forces (CGF) systems control large numbers of autonomous battlefield entities in a DIS simulation using computer equipment and software rather than humans in simulators. CGF entities serve as both enemy forces and supplemental friendly forces in a DIS exercise. Research into various aspects of CGF systems is ongoing. Several CGF systems have been implemented.
Optimization of large matrix calculations for execution on the Cray X-MP vector supercomputer
NASA Technical Reports Server (NTRS)
Hornfeck, William A.
1988-01-01
A considerable volume of large computational computer codes were developed for NASA over the past twenty-five years. This code represents algorithms developed for machines of earlier generation. With the emergence of the vector supercomputer as a viable, commercially available machine, an opportunity exists to evaluate optimization strategies to improve the efficiency of existing software. This result is primarily due to architectural differences in the latest generation of large-scale machines and the earlier, mostly uniprocessor, machines. A sofware package being used by NASA to perform computations on large matrices is described, and a strategy for conversion to the Cray X-MP vector supercomputer is also described.
FracPaQ: a MATLAB™ Toolbox for the Quantification of Fracture Patterns
NASA Astrophysics Data System (ADS)
Healy, D.; Rizzo, R. E.; Cornwell, D. G.; Timms, N.; Farrell, N. J.; Watkins, H.; Gomez-Rivas, E.; Smith, M.
2016-12-01
The patterns of fractures in deformed rocks are rarely uniform or random. Fracture orientations, sizes, shapes and spatial distributions often exhibit some kind of order. In detail, there may be relationships among the different fracture attributes e.g. small fractures dominated by one orientation, larger fractures by another. These relationships are important because the mechanical (e.g. strength, anisotropy) and transport (e.g. fluids, heat) properties of rock depend on these fracture patterns and fracture attributes. This presentation describes an open source toolbox to quantify fracture patterns, including distributions in fracture attributes and their spatial variation. Software has been developed to quantify fracture patterns from 2-D digital images, such as thin section micrographs, geological maps, outcrop or aerial photographs or satellite images. The toolbox comprises a suite of MATLAB™ scripts based on published quantitative methods for the analysis of fracture attributes: orientations, lengths, intensity, density and connectivity. An estimate of permeability in 2-D is made using a parallel plate model. The software provides an objective and consistent methodology for quantifying fracture patterns and their variations in 2-D across a wide range of length scales. Our current focus for the application of the software is on quantifying the fracture patterns in and around fault zones. There is a large body of published work on the quantification of relatively simple joint patterns, but fault zones present a bigger, and arguably more important, challenge. The method presented is inherently scale independent, and a key task will be to analyse and integrate quantitative fracture pattern data from micro- to macro-scales. Planned future releases will incorporate multi-scale analyses based on a wavelet method to look for scale transitions, and combining fracture traces from multiple 2-D images to derive the statistically equivalent 3-D fracture pattern.
NASA Astrophysics Data System (ADS)
Wyborn, L. A.; Evans, B. J. K.; Pugh, T.; Lescinsky, D. T.; Foster, C.; Uhlherr, A.
2014-12-01
The National Computational Infrastructure (NCI) at the Australian National University (ANU) is a partnership between CSIRO, ANU, Bureau of Meteorology (BoM) and Geoscience Australia. Recent investments in a 1.2 PFlop Supercomputer (Raijin), ~ 20 PB data storage using Lustre filesystems and a 3000 core high performance cloud have created a hybrid platform for higher performance computing and data-intensive science to enable large scale earth and climate systems modelling and analysis. There are > 3000 users actively logging in and > 600 projects on the NCI system. Efficiently scaling and adapting data and software systems to petascale infrastructures requires the collaborative development of an architecture that is designed, programmed and operated to enable users to interactively invoke different forms of in-situ computation over complex and large scale data collections. NCI makes available major and long tail data collections from both the government and research sectors based on six themes: 1) weather, climate and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology and 6) astronomy, bio and social. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. Collections are the operational form for data management and access. Similar data types from individual custodians are managed cohesively. Use of international standards for discovery and interoperability allow complex interactions within and between the collections. This design facilitates a transdisciplinary approach to research and enables a shift from small scale, 'stove-piped' science efforts to large scale, collaborative systems science. This new and complex infrastructure requires a move to shared, globally trusted software frameworks that can be maintained and updated. Workflow engines become essential and need to integrate provenance, versioning, traceability, repeatability and publication. There are also human resource challenges as highly skilled HPC/HPD specialists, specialist programmers, and data scientists are required whose skills can support scaling to the new paradigm of effective and efficient data-intensive earth science analytics on petascale, and soon to be exascale systems.
NASA Astrophysics Data System (ADS)
Harris, B.; McDougall, K.; Barry, M.
2012-07-01
Digital Elevation Models (DEMs) allow for the efficient and consistent creation of waterways and catchment boundaries over large areas. Studies of waterway delineation from DEMs are usually undertaken over small or single catchment areas due to the nature of the problems being investigated. Improvements in Geographic Information Systems (GIS) techniques, software, hardware and data allow for analysis of larger data sets and also facilitate a consistent tool for the creation and analysis of waterways over extensive areas. However, rarely are they developed over large regional areas because of the lack of available raw data sets and the amount of work required to create the underlying DEMs. This paper examines definition of waterways and catchments over an area of approximately 25,000 km2 to establish the optimal DEM scale required for waterway delineation over large regional projects. The comparative study analysed multi-scale DEMs over two test areas (Wivenhoe catchment, 543 km2 and a detailed 13 km2 within the Wivenhoe catchment) including various data types, scales, quality, and variable catchment input parameters. Historic and available DEM data was compared to high resolution Lidar based DEMs to assess variations in the formation of stream networks. The results identified that, particularly in areas of high elevation change, DEMs at 20 m cell size created from broad scale 1:25,000 data (combined with more detailed data or manual delineation in flat areas) are adequate for the creation of waterways and catchments at a regional scale.
Static Schedulers for Embedded Real-Time Systems
1989-12-01
Because of the need for having efficient scheduling algorithms in large scale real time systems , software engineers put a lot of effort on developing...provide static schedulers for he Embedded Real Time Systems with single processor using Ada programming language. The independent nonpreemptable...support the Computer Aided Rapid Prototyping for Embedded Real Time Systems so that we determine whether the system, as designed, meets the required
Hardware-Assisted Large-Scale Neuroevolution for Multiagent Learning
2014-12-30
SECURITY CLASSIFICATION OF: This DURIP equipment award was used to purchase, install, and bring on-line two Berkeley Emulation Engines ( BEEs ) and two...mini- BEE machines to establish an FPGA-based high-performance multiagent training platform and its associated software. This acquisition of BEE4-W...Platform; Probabilistic Domain Transformation; Hardware-Assisted; FPGA; BEE ; Hive Brain; Multiagent. REPORT DOCUMENTATION PAGE 11. SPONSOR/MONITOR’S
NASA Astrophysics Data System (ADS)
Callaghan, S.; Maechling, P. J.; Juve, G.; Vahi, K.; Deelman, E.; Jordan, T. H.
2015-12-01
The CyberShake computational platform, developed by the Southern California Earthquake Center (SCEC), is an integrated collection of scientific software and middleware that performs 3D physics-based probabilistic seismic hazard analysis (PSHA) for Southern California. CyberShake integrates large-scale and high-throughput research codes to produce probabilistic seismic hazard curves for individual locations of interest and hazard maps for an entire region. A recent CyberShake calculation produced about 500,000 two-component seismograms for each of 336 locations, resulting in over 300 million synthetic seismograms in a Los Angeles-area probabilistic seismic hazard model. CyberShake calculations require a series of scientific software programs. Early computational stages produce data used as inputs by later stages, so we describe CyberShake calculations using a workflow definition language. Scientific workflow tools automate and manage the input and output data and enable remote job execution on large-scale HPC systems. To satisfy the requests of broad impact users of CyberShake data, such as seismologists, utility companies, and building code engineers, we successfully completed CyberShake Study 15.4 in April and May 2015, calculating a 1 Hz urban seismic hazard map for Los Angeles. We distributed the calculation between the NSF Track 1 system NCSA Blue Waters, the DOE Leadership-class system OLCF Titan, and USC's Center for High Performance Computing. This study ran for over 5 weeks, burning about 1.1 million node-hours and producing over half a petabyte of data. The CyberShake Study 15.4 results doubled the maximum simulated seismic frequency from 0.5 Hz to 1.0 Hz as compared to previous studies, representing a factor of 16 increase in computational complexity. We will describe how our workflow tools supported splitting the calculation across multiple systems. We will explain how we modified CyberShake software components, including GPU implementations and migrating from file-based communication to MPI messaging, to greatly reduce the I/O demands and node-hour requirements of CyberShake. We will also present performance metrics from CyberShake Study 15.4, and discuss challenges that producers of Big Data on open-science HPC resources face moving forward.
QuantWorm: a comprehensive software package for Caenorhabditis elegans phenotypic assays.
Jung, Sang-Kyu; Aleman-Meza, Boanerges; Riepe, Celeste; Zhong, Weiwei
2014-01-01
Phenotypic assays are crucial in genetics; however, traditional methods that rely on human observation are unsuitable for quantitative, large-scale experiments. Furthermore, there is an increasing need for comprehensive analyses of multiple phenotypes to provide multidimensional information. Here we developed an automated, high-throughput computer imaging system for quantifying multiple Caenorhabditis elegans phenotypes. Our imaging system is composed of a microscope equipped with a digital camera and a motorized stage connected to a computer running the QuantWorm software package. Currently, the software package contains one data acquisition module and four image analysis programs: WormLifespan, WormLocomotion, WormLength, and WormEgg. The data acquisition module collects images and videos. The WormLifespan software counts the number of moving worms by using two time-lapse images; the WormLocomotion software computes the velocity of moving worms; the WormLength software measures worm body size; and the WormEgg software counts the number of eggs. To evaluate the performance of our software, we compared the results of our software with manual measurements. We then demonstrated the application of the QuantWorm software in a drug assay and a genetic assay. Overall, the QuantWorm software provided accurate measurements at a high speed. Software source code, executable programs, and sample images are available at www.quantworm.org. Our software package has several advantages over current imaging systems for C. elegans. It is an all-in-one package for quantifying multiple phenotypes. The QuantWorm software is written in Java and its source code is freely available, so it does not require use of commercial software or libraries. It can be run on multiple platforms and easily customized to cope with new methods and requirements.
Abdoun, Oussama; Joucla, Sébastien; Mazzocco, Claire; Yvert, Blaise
2010-01-01
A major characteristic of neural networks is the complexity of their organization at various spatial scales, from microscopic local circuits to macroscopic brain-scale areas. Understanding how neural information is processed thus entails the ability to study them at multiple scales simultaneously. This is made possible using microelectrodes array (MEA) technology. Indeed, high-density MEAs provide large-scale coverage (several square millimeters) of whole neural structures combined with microscopic resolution (about 50 μm) of unit activity. Yet, current options for spatiotemporal representation of MEA-collected data remain limited. Here we present NeuroMap, a new interactive Matlab-based software for spatiotemporal mapping of MEA data. NeuroMap uses thin plate spline interpolation, which provides several assets with respect to conventional mapping methods used currently. First, any MEA design can be considered, including 2D or 3D, regular or irregular, arrangements of electrodes. Second, spline interpolation allows the estimation of activity across the tissue with local extrema not necessarily at recording sites. Finally, this interpolation approach provides a straightforward analytical estimation of the spatial Laplacian for better current sources localization. In this software, coregistration of 2D MEA data on the anatomy of the neural tissue is made possible by fine matching of anatomical data with electrode positions using rigid-deformation-based correction of anatomical pictures. Overall, NeuroMap provides substantial material for detailed spatiotemporal analysis of MEA data. The package is distributed under GNU General Public License and available at http://sites.google.com/site/neuromapsoftware. PMID:21344013
Abdoun, Oussama; Joucla, Sébastien; Mazzocco, Claire; Yvert, Blaise
2011-01-01
A major characteristic of neural networks is the complexity of their organization at various spatial scales, from microscopic local circuits to macroscopic brain-scale areas. Understanding how neural information is processed thus entails the ability to study them at multiple scales simultaneously. This is made possible using microelectrodes array (MEA) technology. Indeed, high-density MEAs provide large-scale coverage (several square millimeters) of whole neural structures combined with microscopic resolution (about 50 μm) of unit activity. Yet, current options for spatiotemporal representation of MEA-collected data remain limited. Here we present NeuroMap, a new interactive Matlab-based software for spatiotemporal mapping of MEA data. NeuroMap uses thin plate spline interpolation, which provides several assets with respect to conventional mapping methods used currently. First, any MEA design can be considered, including 2D or 3D, regular or irregular, arrangements of electrodes. Second, spline interpolation allows the estimation of activity across the tissue with local extrema not necessarily at recording sites. Finally, this interpolation approach provides a straightforward analytical estimation of the spatial Laplacian for better current sources localization. In this software, coregistration of 2D MEA data on the anatomy of the neural tissue is made possible by fine matching of anatomical data with electrode positions using rigid-deformation-based correction of anatomical pictures. Overall, NeuroMap provides substantial material for detailed spatiotemporal analysis of MEA data. The package is distributed under GNU General Public License and available at http://sites.google.com/site/neuromapsoftware.
Large-scale automated histology in the pursuit of connectomes.
Kleinfeld, David; Bharioke, Arjun; Blinder, Pablo; Bock, Davi D; Briggman, Kevin L; Chklovskii, Dmitri B; Denk, Winfried; Helmstaedter, Moritz; Kaufhold, John P; Lee, Wei-Chung Allen; Meyer, Hanno S; Micheva, Kristina D; Oberlaender, Marcel; Prohaska, Steffen; Reid, R Clay; Smith, Stephen J; Takemura, Shinya; Tsai, Philbert S; Sakmann, Bert
2011-11-09
How does the brain compute? Answering this question necessitates neuronal connectomes, annotated graphs of all synaptic connections within defined brain areas. Further, understanding the energetics of the brain's computations requires vascular graphs. The assembly of a connectome requires sensitive hardware tools to measure neuronal and neurovascular features in all three dimensions, as well as software and machine learning for data analysis and visualization. We present the state of the art on the reconstruction of circuits and vasculature that link brain anatomy and function. Analysis at the scale of tens of nanometers yields connections between identified neurons, while analysis at the micrometer scale yields probabilistic rules of connection between neurons and exact vascular connectivity.
Large-Scale Automated Histology in the Pursuit of Connectomes
Bharioke, Arjun; Blinder, Pablo; Bock, Davi D.; Briggman, Kevin L.; Chklovskii, Dmitri B.; Denk, Winfried; Helmstaedter, Moritz; Kaufhold, John P.; Lee, Wei-Chung Allen; Meyer, Hanno S.; Micheva, Kristina D.; Oberlaender, Marcel; Prohaska, Steffen; Reid, R. Clay; Smith, Stephen J.; Takemura, Shinya; Tsai, Philbert S.; Sakmann, Bert
2011-01-01
How does the brain compute? Answering this question necessitates neuronal connectomes, annotated graphs of all synaptic connections within defined brain areas. Further, understanding the energetics of the brain's computations requires vascular graphs. The assembly of a connectome requires sensitive hardware tools to measure neuronal and neurovascular features in all three dimensions, as well as software and machine learning for data analysis and visualization. We present the state of the art on the reconstruction of circuits and vasculature that link brain anatomy and function. Analysis at the scale of tens of nanometers yields connections between identified neurons, while analysis at the micrometer scale yields probabilistic rules of connection between neurons and exact vascular connectivity. PMID:22072665
A 2D Fourier tool for the analysis of photo-elastic effect in large granular assemblies
NASA Astrophysics Data System (ADS)
Leśniewska, Danuta
2017-06-01
Fourier transforms are the basic tool in constructing different types of image filters, mainly those reducing optical noise. Some DIC or PIV software also uses frequency space to obtain displacement fields from a series of digital images of a deforming body. The paper presents series of 2D Fourier transforms of photo-elastic transmission images, representing large pseudo 2D granular assembly, deforming under varying boundary conditions. The images related to different scales were acquired using the same image resolution, but taken at different distance from the sample. Fourier transforms of images, representing different stages of deformation, reveal characteristic features at the three (`macro-`, `meso-` and `micro-`) scales, which can serve as a data to study internal order-disorder transition within granular materials.
Maestro: an orchestration framework for large-scale WSN simulations.
Riliskis, Laurynas; Osipov, Evgeny
2014-03-18
Contemporary wireless sensor networks (WSNs) have evolved into large and complex systems and are one of the main technologies used in cyber-physical systems and the Internet of Things. Extensive research on WSNs has led to the development of diverse solutions at all levels of software architecture, including protocol stacks for communications. This multitude of solutions is due to the limited computational power and restrictions on energy consumption that must be accounted for when designing typical WSN systems. It is therefore challenging to develop, test and validate even small WSN applications, and this process can easily consume significant resources. Simulations are inexpensive tools for testing, verifying and generally experimenting with new technologies in a repeatable fashion. Consequently, as the size of the systems to be tested increases, so does the need for large-scale simulations. This article describes a tool called Maestro for the automation of large-scale simulation and investigates the feasibility of using cloud computing facilities for such task. Using tools that are built into Maestro, we demonstrate a feasible approach for benchmarking cloud infrastructure in order to identify cloud Virtual Machine (VM)instances that provide an optimal balance of performance and cost for a given simulation.
Stanzel, Sven; Weimer, Marc; Kopp-Schneider, Annette
2013-06-01
High-throughput screening approaches are carried out for the toxicity assessment of a large number of chemical compounds. In such large-scale in vitro toxicity studies several hundred or thousand concentration-response experiments are conducted. The automated evaluation of concentration-response data using statistical analysis scripts saves time and yields more consistent results in comparison to data analysis performed by the use of menu-driven statistical software. Automated statistical analysis requires that concentration-response data are available in a standardised data format across all compounds. To obtain consistent data formats, a standardised data management workflow must be established, including guidelines for data storage, data handling and data extraction. In this paper two procedures for data management within large-scale toxicological projects are proposed. Both procedures are based on Microsoft Excel files as the researcher's primary data format and use a computer programme to automate the handling of data files. The first procedure assumes that data collection has not yet started whereas the second procedure can be used when data files already exist. Successful implementation of the two approaches into the European project ACuteTox is illustrated. Copyright © 2012 Elsevier Ltd. All rights reserved.
Maestro: An Orchestration Framework for Large-Scale WSN Simulations
Riliskis, Laurynas; Osipov, Evgeny
2014-01-01
Contemporary wireless sensor networks (WSNs) have evolved into large and complex systems and are one of the main technologies used in cyber-physical systems and the Internet of Things. Extensive research on WSNs has led to the development of diverse solutions at all levels of software architecture, including protocol stacks for communications. This multitude of solutions is due to the limited computational power and restrictions on energy consumption that must be accounted for when designing typical WSN systems. It is therefore challenging to develop, test and validate even small WSN applications, and this process can easily consume significant resources. Simulations are inexpensive tools for testing, verifying and generally experimenting with new technologies in a repeatable fashion. Consequently, as the size of the systems to be tested increases, so does the need for large-scale simulations. This article describes a tool called Maestro for the automation of large-scale simulation and investigates the feasibility of using cloud computing facilities for such task. Using tools that are built into Maestro, we demonstrate a feasible approach for benchmarking cloud infrastructure in order to identify cloud Virtual Machine (VM)instances that provide an optimal balance of performance and cost for a given simulation. PMID:24647123
Benchmark of Client and Server-Side Catchment Delineation Approaches on Web-Based Systems
NASA Astrophysics Data System (ADS)
Demir, I.; Sermet, M. Y.; Sit, M. A.
2016-12-01
Recent advances in internet and cyberinfrastructure technologies have provided the capability to acquire large scale spatial data from various gauges and sensor networks. The collection of environmental data increased demand for applications which are capable of managing and processing large-scale and high-resolution data sets. With the amount and resolution of data sets provided, one of the challenging tasks for organizing and customizing hydrological data sets is delineation of watersheds on demand. Watershed delineation is a process for creating a boundary that represents the contributing area for a specific control point or water outlet, with intent of characterization and analysis of portions of a study area. Although many GIS tools and software for watershed analysis are available on desktop systems, there is a need for web-based and client-side techniques for creating a dynamic and interactive environment for exploring hydrological data. In this project, we demonstrated several watershed delineation techniques on the web with various techniques implemented on the client-side using JavaScript and WebGL, and on the server-side using Python and C++. We also developed a client-side GPGPU (General Purpose Graphical Processing Unit) algorithm to analyze high-resolution terrain data for watershed delineation which allows parallelization using GPU. The web-based real-time analysis of watershed segmentation can be helpful for decision-makers and interested stakeholders while eliminating the need of installing complex software packages and dealing with large-scale data sets. Utilization of the client-side hardware resources also eliminates the need of servers due its crowdsourcing nature. Our goal for future work is to improve other hydrologic analysis methods such as rain flow tracking by adapting presented approaches.
NASA Astrophysics Data System (ADS)
Liang, Cunren; Zeng, Qiming; Jia, Jianying; Jiao, Jian; Cui, Xi'ai
2013-02-01
Scanning synthetic aperture radar (ScanSAR) mode is an efficient way to map large scale geophysical phenomena at low cost. The work presented in this paper is dedicated to ScanSAR interferometric processing and its implementation by making full use of existing standard interferometric synthetic aperture radar (InSAR) software. We first discuss the properties of the ScanSAR signal and its phase-preserved focusing using the full aperture algorithm in terms of interferometry. Then a complete interferometric processing flow is proposed. The standard ScanSAR product is decoded subswath by subswath with burst gaps padded with zero-pulses, followed by a Doppler centroid frequency estimation for each subswath and a polynomial fit of all of the subswaths for the whole scene. The burst synchronization of the interferometric pair is then calculated, and only the synchronized pulses are kept for further interferometric processing. After the complex conjugate multiplication of the interferometric pair, the residual non-integer pulse repetition interval (PRI) part between adjacent bursts caused by zero padding is compensated by resampling using a sinc kernel. The subswath interferograms are then mosaicked, in which a method is proposed to remove the subswath discontinuities in the overlap area. Then the following interferometric processing goes back to the traditional stripmap processing flow. A processor written with C and Fortran languages and controlled by Perl scripts is developed to implement these algorithms and processing flow based on the JPL/Caltech Repeat Orbit Interferometry PACkage (ROI_PAC). Finally, we use the processor to process ScanSAR data from the Envisat and ALOS satellites and obtain large scale deformation maps in the radar line-of-sight (LOS) direction.
A new practice-driven approach to develop software in a cyber-physical system environment
NASA Astrophysics Data System (ADS)
Jiang, Yiping; Chen, C. L. Philip; Duan, Junwei
2016-02-01
Cyber-physical system (CPS) is an emerging area, which cannot work efficiently without proper software handling of the data and business logic. Software and middleware is the soul of the CPS. The software development of CPS is a critical issue because of its complicity in a large scale realistic system. Furthermore, object-oriented approach (OOA) is often used to develop CPS software, which needs some improvements according to the characteristics of CPS. To develop software in a CPS environment, a new systematic approach is proposed in this paper. It comes from practice, and has been evolved from software companies. It consists of (A) Requirement analysis in event-oriented way, (B) architecture design in data-oriented way, (C) detailed design and coding in object-oriented way and (D) testing in event-oriented way. It is a new approach based on OOA; the difference when compared with OOA is that the proposed approach has different emphases and measures in every stage. It is more accord with the characteristics of event-driven CPS. In CPS software development, one should focus on the events more than the functions or objects. A case study of a smart home system is designed to reveal the effectiveness of the approach. It shows that the approach is also easy to be operated in the practice owing to some simplifications. The running result illustrates the validity of this approach.
The Waterfall Model in Large-Scale Development
NASA Astrophysics Data System (ADS)
Petersen, Kai; Wohlin, Claes; Baca, Dejan
Waterfall development is still a widely used way of working in software development companies. Many problems have been reported related to the model. Commonly accepted problems are for example to cope with change and that defects all too often are detected too late in the software development process. However, many of the problems mentioned in literature are based on beliefs and experiences, and not on empirical evidence. To address this research gap, we compare the problems in literature with the results of a case study at Ericsson AB in Sweden, investigating issues in the waterfall model. The case study aims at validating or contradicting the beliefs of what the problems are in waterfall development through empirical research.
Web-based segmentation and display of three-dimensional radiologic image data.
Silverstein, J; Rubenstein, J; Millman, A; Panko, W
1998-01-01
In many clinical circumstances, viewing sequential radiological image data as three-dimensional models is proving beneficial. However, designing customized computer-generated radiological models is beyond the scope of most physicians, due to specialized hardware and software requirements. We have created a simple method for Internet users to remotely construct and locally display three-dimensional radiological models using only a standard web browser. Rapid model construction is achieved by distributing the hardware intensive steps to a remote server. Once created, the model is automatically displayed on the requesting browser and is accessible to multiple geographically distributed users. Implementation of our server software on large scale systems could be of great service to the worldwide medical community.
ASERA: A Spectrum Eye Recognition Assistant
NASA Astrophysics Data System (ADS)
Yuan, Hailong; Zhang, Haotong; Zhang, Yanxia; Lei, Yajuan; Dong, Yiqiao; Zhao, Yongheng
2018-04-01
ASERA, ASpectrum Eye Recognition Assistant, aids in quasar spectral recognition and redshift measurement and can also be used to recognize various types of spectra of stars, galaxies and AGNs (Active Galactic Nucleus). This interactive software allows users to visualize observed spectra, superimpose template spectra from the Sloan Digital Sky Survey (SDSS), and interactively access related spectral line information. ASERA is an efficient and user-friendly semi-automated toolkit for the accurate classification of spectra observed by LAMOST (the Large Sky Area Multi-object Fiber Spectroscopic Telescope) and is available as a standalone Java application and as a Java applet. The software offers several functions, including wavelength and flux scale settings, zoom in and out, redshift estimation, and spectral line identification.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sierra Thermal /Fluid Team
The SIERRA Low Mach Module: Fuego along with the SIERRA Participating Media Radiation Module: Syrinx, henceforth referred to as Fuego and Syrinx, respectively, are the key elements of the ASCI fire environment simulation project. The fire environment simulation project is directed at characterizing both open large-scale pool fires and building enclosure fires. Fuego represents the turbulent, buoyantly-driven incompressible flow, heat transfer, mass transfer, combustion, soot, and absorption coefficient model portion of the simulation software. Syrinx represents the participating-media thermal radiation mechanics. This project is an integral part of the SIERRA multi-mechanics software development project. Fuego depends heavily upon the coremore » architecture developments provided by SIERRA for massively parallel computing, solution adaptivity, and mechanics coupling on unstructured grids.« less
A Grid Infrastructure for Supporting Space-based Science Operations
NASA Technical Reports Server (NTRS)
Bradford, Robert N.; Redman, Sandra H.; McNair, Ann R. (Technical Monitor)
2002-01-01
Emerging technologies for computational grid infrastructures have the potential for revolutionizing the way computers are used in all aspects of our lives. Computational grids are currently being implemented to provide a large-scale, dynamic, and secure research and engineering environments based on standards and next-generation reusable software, enabling greater science and engineering productivity through shared resources and distributed computing for less cost than traditional architectures. Combined with the emerging technologies of high-performance networks, grids provide researchers, scientists and engineers the first real opportunity for an effective distributed collaborative environment with access to resources such as computational and storage systems, instruments, and software tools and services for the most computationally challenging applications.
Gennaro, G; Ballaminut, A; Contento, G
2017-09-01
This study aims to illustrate a multiparametric automatic method for monitoring long-term reproducibility of digital mammography systems, and its application on a large scale. Twenty-five digital mammography systems employed within a regional screening programme were controlled weekly using the same type of phantom, whose images were analysed by an automatic software tool. To assess system reproducibility levels, 15 image quality indices (IQIs) were extracted and compared with the corresponding indices previously determined by a baseline procedure. The coefficients of variation (COVs) of the IQIs were used to assess the overall variability. A total of 2553 phantom images were collected from the 25 digital mammography systems from March 2013 to December 2014. Most of the systems showed excellent image quality reproducibility over the surveillance interval, with mean variability below 5%. Variability of each IQI was 5%, with the exception of one index associated with the smallest phantom objects (0.25 mm), which was below 10%. The method applied for reproducibility tests-multi-detail phantoms, cloud automatic software tool to measure multiple image quality indices and statistical process control-was proven to be effective and applicable on a large scale and to any type of digital mammography system. • Reproducibility of mammography image quality should be monitored by appropriate quality controls. • Use of automatic software tools allows image quality evaluation by multiple indices. • System reproducibility can be assessed comparing current index value with baseline data. • Overall system reproducibility of modern digital mammography systems is excellent. • The method proposed and applied is cost-effective and easily scalable.
Swan, D; Hannigan, A; Higgins, S; McDonnell, R; Meagher, D; Cullen, W
2017-02-01
In Ireland, as in many other healthcare systems, mental health service provision is being reconfigured with a move toward more care in the community, and particularly primary care. Recording and surveillance systems for mental health information and activities in primary care are needed for service planning and quality improvement. We describe the development and initial implementation of a software tool ('mental health finder') within a widely used primary care electronic medical record system (EMR) in Ireland to enable large-scale data collection on the epidemiology and management of mental health and substance use problems among patients attending general practice. In collaboration with the Irish Primary Care Research Network (IPCRN), we developed the 'Mental Health Finder' as a software plug-in to a commonly used primary care EMR system to facilitate data collection on mental health diagnoses and pharmacological treatments among patients. The finder searches for and identifies patients based on diagnostic coding and/or prescribed medicines. It was initially implemented among a convenience sample of six GP practices. Prevalence of mental health and substance use problems across the six practices, as identified by the finder, was 9.4% (range 6.9-12.7%). 61.9% of identified patients were female; 25.8% were private patients. One-third (33.4%) of identified patients were prescribed more than one class of psychotropic medication. Of the patients identified by the finder, 89.9% were identifiable via prescribing data, 23.7% via diagnostic coding. The finder is a feasible and promising methodology for large-scale data collection on mental health problems in primary care.
Interactive, graphical processing unitbased evaluation of evacuation scenarios at the state scale
DOE Office of Scientific and Technical Information (OSTI.GOV)
Perumalla, Kalyan S; Aaby, Brandon G; Yoginath, Srikanth B
2011-01-01
In large-scale scenarios, transportation modeling and simulation is severely constrained by simulation time. For example, few real- time simulators scale to evacuation traffic scenarios at the level of an entire state, such as Louisiana (approximately 1 million links) or Florida (2.5 million links). New simulation approaches are needed to overcome severe computational demands of conventional (microscopic or mesoscopic) modeling techniques. Here, a new modeling and execution methodology is explored that holds the potential to provide a tradeoff among the level of behavioral detail, the scale of transportation network, and real-time execution capabilities. A novel, field-based modeling technique and its implementationmore » on graphical processing units are presented. Although additional research with input from domain experts is needed for refining and validating the models, the techniques reported here afford interactive experience at very large scales of multi-million road segments. Illustrative experiments on a few state-scale net- works are described based on an implementation of this approach in a software system called GARFIELD. Current modeling cap- abilities and implementation limitations are described, along with possible use cases and future research.« less
Implementation of Audio Computer-Assisted Interviewing Software in HIV/AIDS Research
Pluhar, Erika; Yeager, Katherine A.; Corkran, Carol; McCarty, Frances; Holstad, Marcia McDonnell; Denzmore-Nwagbara, Pamela; Fielder, Bridget; DiIorio, Colleen
2007-01-01
Computer assisted interviewing (CAI) has begun to play a more prominent role in HIV/AIDS prevention research. Despite the increased popularity of CAI, particularly audio computer assisted self-interviewing (ACASI), some research teams are still reluctant to implement ACASI technology due to lack of familiarity with the practical issues related to using these software packages. The purpose of this paper is to describe the implementation of one particular ACASI software package, the Questionnaire Development System™ (QDS™), in several nursing and HIV/AIDS prevention research settings. We present acceptability and satisfaction data from two large-scale public health studies in which we have used QDS with diverse populations. We also address issues related to developing and programming a questionnaire, discuss practical strategies related to planning for and implementing ACASI in the field, including selecting equipment, training staff, and collecting and transferring data, and summarize advantages and disadvantages of computer assisted research methods. PMID:17662924
TDat: An Efficient Platform for Processing Petabyte-Scale Whole-Brain Volumetric Images.
Li, Yuxin; Gong, Hui; Yang, Xiaoquan; Yuan, Jing; Jiang, Tao; Li, Xiangning; Sun, Qingtao; Zhu, Dan; Wang, Zhenyu; Luo, Qingming; Li, Anan
2017-01-01
Three-dimensional imaging of whole mammalian brains at single-neuron resolution has generated terabyte (TB)- and even petabyte (PB)-sized datasets. Due to their size, processing these massive image datasets can be hindered by the computer hardware and software typically found in biological laboratories. To fill this gap, we have developed an efficient platform named TDat, which adopts a novel data reformatting strategy by reading cuboid data and employing parallel computing. In data reformatting, TDat is more efficient than any other software. In data accessing, we adopted parallelization to fully explore the capability for data transmission in computers. We applied TDat in large-volume data rigid registration and neuron tracing in whole-brain data with single-neuron resolution, which has never been demonstrated in other studies. We also showed its compatibility with various computing platforms, image processing software and imaging systems.
NASA Astrophysics Data System (ADS)
Niu, Yingli; Li, Wenqiang; Peng, Qian; Geng, Hua; Yi, Yuanping; Wang, Linjun; Nan, Guangjun; Wang, Dong; Shuai, Zhigang
2018-04-01
MOlecular MAterials Property Prediction Package (MOMAP) is a software toolkit for molecular materials property prediction. It focuses on luminescent properties and charge mobility properties. This article contains a brief descriptive introduction of key features, theoretical models and algorithms of the software, together with examples that illustrate the performance. First, we present the theoretical models and algorithms for molecular luminescent properties calculation, which includes the excited-state radiative/non-radiative decay rate constant and the optical spectra. Then, a multi-scale simulation approach and its algorithm for the molecular charge mobility are described. This approach is based on hopping model and combines with Kinetic Monte Carlo and molecular dynamics simulations, and it is especially applicable for describing a large category of organic semiconductors, whose inter-molecular electronic coupling is much smaller than intra-molecular charge reorganisation energy.
Shen, Yiwen; Hattink, Maarten H N; Samadi, Payman; Cheng, Qixiang; Hu, Ziyiz; Gazman, Alexander; Bergman, Keren
2018-04-16
Silicon photonics based switches offer an effective option for the delivery of dynamic bandwidth for future large-scale Datacom systems while maintaining scalable energy efficiency. The integration of a silicon photonics-based optical switching fabric within electronic Datacom architectures requires novel network topologies and arbitration strategies to effectively manage the active elements in the network. We present a scalable software-defined networking control plane to integrate silicon photonic based switches with conventional Ethernet or InfiniBand networks. Our software-defined control plane manages both electronic packet switches and multiple silicon photonic switches for simultaneous packet and circuit switching. We built an experimental Dragonfly network testbed with 16 electronic packet switches and 2 silicon photonic switches to evaluate our control plane. Observed latencies occupied by each step of the switching procedure demonstrate a total of 344 µs control plane latency for data-center and high performance computing platforms.
Kedalion: NASA's Adaptable and Agile Hardware/Software Integration and Test Lab
NASA Technical Reports Server (NTRS)
Mangieri, Mark L.; Vice, Jason
2011-01-01
NASA fs Kedalion engineering analysis lab at Johnson Space Center is on the forefront of validating and using many contemporary avionics hardware/software development and integration techniques, which represent new paradigms to heritage NASA culture. Kedalion has validated many of the Orion hardware/software engineering techniques borrowed from the adjacent commercial aircraft avionics solution space, with the intention to build upon such techniques to better align with today fs aerospace market. Using agile techniques, commercial products, early rapid prototyping, in-house expertise and tools, and customer collaboration, Kedalion has demonstrated that cost effective contemporary paradigms hold the promise to serve future NASA endeavors within a diverse range of system domains. Kedalion provides a readily adaptable solution for medium/large scale integration projects. The Kedalion lab is currently serving as an in-line resource for the project and the Multipurpose Crew Vehicle (MPCV) program.
A second generation experiment in fault-tolerant software
NASA Technical Reports Server (NTRS)
Knight, J. C.
1986-01-01
Information was collected on the efficacy of fault-tolerant software by conducting two large-scale controlled experiments. In the first, an empirical study of multi-version software (MVS) was conducted. The second experiment is an empirical evaluation of self testing as a method of error detection (STED). The purpose ot the MVS experiment was to obtain empirical measurement of the performance of multi-version systems. Twenty versions of a program were prepared at four different sites under reasonably realistic development conditions from the same specifications. The purpose of the STED experiment was to obtain empirical measurements of the performance of assertions in error detection. Eight versions of a program were modified to include assertions at two different sites under controlled conditions. The overall structure of the testing environment for the MVS experiment and its status are described. Work to date in the STED experiment is also presented.
SOURCE EXPLORER: Towards Web Browser Based Tools for Astronomical Source Visualization and Analysis
NASA Astrophysics Data System (ADS)
Young, M. D.; Hayashi, S.; Gopu, A.
2014-05-01
As a new generation of large format, high-resolution imagers come online (ODI, DECAM, LSST, etc.) we are faced with the daunting prospect of astronomical images containing upwards of hundreds of thousands of identifiable sources. Visualizing and interacting with such large datasets using traditional astronomical tools appears to be unfeasible, and a new approach is required. We present here a method for the display and analysis of arbitrarily large source datasets using dynamically scaling levels of detail, enabling scientists to rapidly move from large-scale spatial overviews down to the level of individual sources and everything in-between. Based on the recognized standards of HTML5+JavaScript, we enable observers and archival users to interact with their images and sources from any modern computer without having to install specialized software. We demonstrate the ability to produce large-scale source lists from the images themselves, as well as overlaying data from publicly available source ( 2MASS, GALEX, SDSS, etc.) or user provided source lists. A high-availability cluster of computational nodes allows us to produce these source maps on demand and customized based on user input. User-generated source lists and maps are persistent across sessions and are available for further plotting, analysis, refinement, and culling.
Singh, Kumar Saurabh; Thual, Dominique; Spurio, Roberto; Cannata, Nicola
2015-01-01
One of the most crucial characteristics of day-to-day laboratory information management is the collection, storage and retrieval of information about research subjects and environmental or biomedical samples. An efficient link between sample data and experimental results is absolutely important for the successful outcome of a collaborative project. Currently available software solutions are largely limited to large scale, expensive commercial Laboratory Information Management Systems (LIMS). Acquiring such LIMS indeed can bring laboratory information management to a higher level, but most of the times this requires a sufficient investment of money, time and technical efforts. There is a clear need for a light weighted open source system which can easily be managed on local servers and handled by individual researchers. Here we present a software named SaDA for storing, retrieving and analyzing data originated from microorganism monitoring experiments. SaDA is fully integrated in the management of environmental samples, oligonucleotide sequences, microarray data and the subsequent downstream analysis procedures. It is simple and generic software, and can be extended and customized for various environmental and biomedical studies. PMID:26047146
Development of high performance scientific components for interoperability of computing packages
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gulabani, Teena Pratap
2008-01-01
Three major high performance quantum chemistry computational packages, NWChem, GAMESS and MPQC have been developed by different research efforts following different design patterns. The goal is to achieve interoperability among these packages by overcoming the challenges caused by the different communication patterns and software design of each of these packages. A chemistry algorithm is hard to develop as well as being a time consuming process; integration of large quantum chemistry packages will allow resource sharing and thus avoid reinvention of the wheel. Creating connections between these incompatible packages is the major motivation of the proposed work. This interoperability is achievedmore » by bringing the benefits of Component Based Software Engineering through a plug-and-play component framework called Common Component Architecture (CCA). In this thesis, I present a strategy and process used for interfacing two widely used and important computational chemistry methodologies: Quantum Mechanics and Molecular Mechanics. To show the feasibility of the proposed approach the Tuning and Analysis Utility (TAU) has been coupled with NWChem code and its CCA components. Results show that the overhead is negligible when compared to the ease and potential of organizing and coping with large-scale software applications.« less
Singh, Kumar Saurabh; Thual, Dominique; Spurio, Roberto; Cannata, Nicola
2015-06-03
One of the most crucial characteristics of day-to-day laboratory information management is the collection, storage and retrieval of information about research subjects and environmental or biomedical samples. An efficient link between sample data and experimental results is absolutely important for the successful outcome of a collaborative project. Currently available software solutions are largely limited to large scale, expensive commercial Laboratory Information Management Systems (LIMS). Acquiring such LIMS indeed can bring laboratory information management to a higher level, but most of the times this requires a sufficient investment of money, time and technical efforts. There is a clear need for a light weighted open source system which can easily be managed on local servers and handled by individual researchers. Here we present a software named SaDA for storing, retrieving and analyzing data originated from microorganism monitoring experiments. SaDA is fully integrated in the management of environmental samples, oligonucleotide sequences, microarray data and the subsequent downstream analysis procedures. It is simple and generic software, and can be extended and customized for various environmental and biomedical studies.
Ergatis: a web interface and scalable software system for bioinformatics workflows
Orvis, Joshua; Crabtree, Jonathan; Galens, Kevin; Gussman, Aaron; Inman, Jason M.; Lee, Eduardo; Nampally, Sreenath; Riley, David; Sundaram, Jaideep P.; Felix, Victor; Whitty, Brett; Mahurkar, Anup; Wortman, Jennifer; White, Owen; Angiuoli, Samuel V.
2010-01-01
Motivation: The growth of sequence data has been accompanied by an increasing need to analyze data on distributed computer clusters. The use of these systems for routine analysis requires scalable and robust software for data management of large datasets. Software is also needed to simplify data management and make large-scale bioinformatics analysis accessible and reproducible to a wide class of target users. Results: We have developed a workflow management system named Ergatis that enables users to build, execute and monitor pipelines for computational analysis of genomics data. Ergatis contains preconfigured components and template pipelines for a number of common bioinformatics tasks such as prokaryotic genome annotation and genome comparisons. Outputs from many of these components can be loaded into a Chado relational database. Ergatis was designed to be accessible to a broad class of users and provides a user friendly, web-based interface. Ergatis supports high-throughput batch processing on distributed compute clusters and has been used for data management in a number of genome annotation and comparative genomics projects. Availability: Ergatis is an open-source project and is freely available at http://ergatis.sourceforge.net Contact: jorvis@users.sourceforge.net PMID:20413634
Using Selection Pressure as an Asset to Develop Reusable, Adaptable Software Systems
NASA Technical Reports Server (NTRS)
Berrick, Stephen; Lynnes, Christopher
2007-01-01
The Goddard Earth Sciences Data and Information Services Center (GES DISC) at NASA has over the years developed and honed several reusable architectural components for supporting large-scale data centers with a large customer base. These include a processing system (S4PM) and an archive system (S4PA) based upon a workflow engine called the Simple Scalable Script based Science Processor (S4P) and an online data visualization and analysis system (Giovanni). These subsystems are currently reused internally in a variety of combinations to implement customized data management on behalf of instrument science teams and other science investigators. Some of these subsystems (S4P and S4PM) have also been reused by other data centers for operational science processing. Our experience has been that development and utilization of robust interoperable and reusable software systems can actually flourish in environments defined by heterogeneous commodity hardware systems the emphasis on value-added customer service and the continual goal for achieving higher cost efficiencies. The repeated internal reuse that is fostered by such an environment encourages and even forces changes to the software that make it more reusable and adaptable. Allowing and even encouraging such selective pressures to software development has been a key factor In the success of S4P and S4PM which are now available to the open source community under the NASA Open source Agreement
Design and implementation of a privacy preserving electronic health record linkage tool in Chicago
Cashy, John P; Jackson, Kathryn L; Pah, Adam R; Goel, Satyender; Boehnke, Jörn; Humphries, John Eric; Kominers, Scott Duke; Hota, Bala N; Sims, Shannon A; Malin, Bradley A; French, Dustin D; Walunas, Theresa L; Meltzer, David O; Kaleba, Erin O; Jones, Roderick C; Galanter, William L
2015-01-01
Objective To design and implement a tool that creates a secure, privacy preserving linkage of electronic health record (EHR) data across multiple sites in a large metropolitan area in the United States (Chicago, IL), for use in clinical research. Methods The authors developed and distributed a software application that performs standardized data cleaning, preprocessing, and hashing of patient identifiers to remove all protected health information. The application creates seeded hash code combinations of patient identifiers using a Health Insurance Portability and Accountability Act compliant SHA-512 algorithm that minimizes re-identification risk. The authors subsequently linked individual records using a central honest broker with an algorithm that assigns weights to hash combinations in order to generate high specificity matches. Results The software application successfully linked and de-duplicated 7 million records across 6 institutions, resulting in a cohort of 5 million unique records. Using a manually reconciled set of 11 292 patients as a gold standard, the software achieved a sensitivity of 96% and a specificity of 100%, with a majority of the missed matches accounted for by patients with both a missing social security number and last name change. Using 3 disease examples, it is demonstrated that the software can reduce duplication of patient records across sites by as much as 28%. Conclusions Software that standardizes the assignment of a unique seeded hash identifier merged through an agreed upon third-party honest broker can enable large-scale secure linkage of EHR data for epidemiologic and public health research. The software algorithm can improve future epidemiologic research by providing more comprehensive data given that patients may make use of multiple healthcare systems. PMID:26104741
Design and implementation of a privacy preserving electronic health record linkage tool in Chicago.
Kho, Abel N; Cashy, John P; Jackson, Kathryn L; Pah, Adam R; Goel, Satyender; Boehnke, Jörn; Humphries, John Eric; Kominers, Scott Duke; Hota, Bala N; Sims, Shannon A; Malin, Bradley A; French, Dustin D; Walunas, Theresa L; Meltzer, David O; Kaleba, Erin O; Jones, Roderick C; Galanter, William L
2015-09-01
To design and implement a tool that creates a secure, privacy preserving linkage of electronic health record (EHR) data across multiple sites in a large metropolitan area in the United States (Chicago, IL), for use in clinical research. The authors developed and distributed a software application that performs standardized data cleaning, preprocessing, and hashing of patient identifiers to remove all protected health information. The application creates seeded hash code combinations of patient identifiers using a Health Insurance Portability and Accountability Act compliant SHA-512 algorithm that minimizes re-identification risk. The authors subsequently linked individual records using a central honest broker with an algorithm that assigns weights to hash combinations in order to generate high specificity matches. The software application successfully linked and de-duplicated 7 million records across 6 institutions, resulting in a cohort of 5 million unique records. Using a manually reconciled set of 11 292 patients as a gold standard, the software achieved a sensitivity of 96% and a specificity of 100%, with a majority of the missed matches accounted for by patients with both a missing social security number and last name change. Using 3 disease examples, it is demonstrated that the software can reduce duplication of patient records across sites by as much as 28%. Software that standardizes the assignment of a unique seeded hash identifier merged through an agreed upon third-party honest broker can enable large-scale secure linkage of EHR data for epidemiologic and public health research. The software algorithm can improve future epidemiologic research by providing more comprehensive data given that patients may make use of multiple healthcare systems. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Implementing Large Projects in Software Engineering Courses
ERIC Educational Resources Information Center
Coppit, David
2006-01-01
In software engineering education, large projects are widely recognized as a useful way of exposing students to the real-world difficulties of team software development. But large projects are difficult to put into practice. First, educators rarely have additional time to manage software projects. Second, classrooms have inherent limitations that…
PASSIM--an open source software system for managing information in biomedical studies.
Viksna, Juris; Celms, Edgars; Opmanis, Martins; Podnieks, Karlis; Rucevskis, Peteris; Zarins, Andris; Barrett, Amy; Neogi, Sudeshna Guha; Krestyaninova, Maria; McCarthy, Mark I; Brazma, Alvis; Sarkans, Ugis
2007-02-09
One of the crucial aspects of day-to-day laboratory information management is collection, storage and retrieval of information about research subjects and biomedical samples. An efficient link between sample data and experiment results is absolutely imperative for a successful outcome of a biomedical study. Currently available software solutions are largely limited to large-scale, expensive commercial Laboratory Information Management Systems (LIMS). Acquiring such LIMS indeed can bring laboratory information management to a higher level, but often implies sufficient investment of time, effort and funds, which are not always available. There is a clear need for lightweight open source systems for patient and sample information management. We present a web-based tool for submission, management and retrieval of sample and research subject data. The system secures confidentiality by separating anonymized sample information from individuals' records. It is simple and generic, and can be customised for various biomedical studies. Information can be both entered and accessed using the same web interface. User groups and their privileges can be defined. The system is open-source and is supplied with an on-line tutorial and necessary documentation. It has proven to be successful in a large international collaborative project. The presented system closes the gap between the need and the availability of lightweight software solutions for managing information in biomedical studies involving human research subjects.
Computer-intensive simulation of solid-state NMR experiments using SIMPSON.
Tošner, Zdeněk; Andersen, Rasmus; Stevensson, Baltzar; Edén, Mattias; Nielsen, Niels Chr; Vosegaard, Thomas
2014-09-01
Conducting large-scale solid-state NMR simulations requires fast computer software potentially in combination with efficient computational resources to complete within a reasonable time frame. Such simulations may involve large spin systems, multiple-parameter fitting of experimental spectra, or multiple-pulse experiment design using parameter scan, non-linear optimization, or optimal control procedures. To efficiently accommodate such simulations, we here present an improved version of the widely distributed open-source SIMPSON NMR simulation software package adapted to contemporary high performance hardware setups. The software is optimized for fast performance on standard stand-alone computers, multi-core processors, and large clusters of identical nodes. We describe the novel features for fast computation including internal matrix manipulations, propagator setups and acquisition strategies. For efficient calculation of powder averages, we implemented interpolation method of Alderman, Solum, and Grant, as well as recently introduced fast Wigner transform interpolation technique. The potential of the optimal control toolbox is greatly enhanced by higher precision gradients in combination with the efficient optimization algorithm known as limited memory Broyden-Fletcher-Goldfarb-Shanno. In addition, advanced parallelization can be used in all types of calculations, providing significant time reductions. SIMPSON is thus reflecting current knowledge in the field of numerical simulations of solid-state NMR experiments. The efficiency and novel features are demonstrated on the representative simulations. Copyright © 2014 Elsevier Inc. All rights reserved.
Complexity as a Factor of Quality and Cost in Large Scale Software Development.
1979-12-01
allocating testing resources." [69 69I V. THE ROLE OF COMPLEXITY IN RESOURCE ESTIMATION AND ALLOCATION A. GENERAL It can be argued that blame for the...and allocation of testing resource by - identifying independent substructures and - identifying heavily used logic paths. 2. Setting a Design Threshold... RESOURCE ESTIMATION -------- 70 1. New Dynamic Field ------------------------- 70 2. Quality and Testing ----------------------- 71 3. Programming Units of
Fault-Tolerant Sequencer Using FPGA-Based Logic Designs for Space Applications
2013-12-01
Prototype Board SBU single bit upset SDK software development kit SDRAM synchronous dynamic random-access memory SEB single-event burnout ...current VHDL VHSIC hardware description language VHSIC very-high-speed integrated circuits VLSI very-large- scale integration VQFP very...transient pulse, called a single-event transient (SET), or even cause permanent damage to the device in the form of a burnout or gate rupture. The SEE
Web Archiving for the Rest of Us: How to Collect and Manage Websites Using Free and Easy Software
ERIC Educational Resources Information Center
Dunn, Katharine; Szydlowski, Nick
2009-01-01
Large-scale projects such as the Internet Archive (www.archive.org) send out crawlers to gather snapshots of much of the web. This massive collection of archived websites may include content of interest to one's patrons. But if librarians want to control exactly when and what is archived, relying on someone else to do the archiving is not ideal.…
Digital Systems Validation Handbook. Volume 2. Chapter 18. Avionic Data Bus Integration Technology
1993-11-01
interaction between a digital data bus and an avionic system. Very Large Scale Integration (VLSI) ICs and multiversion software, which make up digital...1984, the Sperry Corporation developed a fault tolerant system which employed multiversion programming, voting, and monitoring for error detection and...formulate all the significant behavior of a system. MULTIVERSION PROGRAMMING. N-version programming. N-VERSION PROGRAMMING. The independent coding of a
2006-05-01
a significant design project that requires development of a large scale software project . A distinct shortcoming of Purdue ECE...18-540: Rapid Prototyping of Computer Systems This is a project -oriented course which will deal with all four aspects of project development ; the...instructors, will develop specifications for a mobile computer to assist in inspection and maintenance. The application will be partitioned
NASA Technical Reports Server (NTRS)
Plesniak, Michael W.; Johnston, J. P.
1989-01-01
The construction and development of the multi-component traversing system and associated control hardware and software are presented. A hydrogen bubble/laser sheet flow visualization technique was developed to visually study the characteristics of the mixing layers. With this technique large-scale rollers arising from the Taylor-Gortler instability and its interaction with the primary Kelvin-Helmholtz structures can be studied.
Unmanned Aircraft Systems Traffic Management (UTM)
NASA Technical Reports Server (NTRS)
Johnson, Ronald D.
2018-01-01
UTM is an 'air traffic management' ecosystem for uncontrolled operations. UTM utilizes industry's ability to supply services under FAA's regulatory authority where these services do not exist. UTM development will ultimately enable the management of large scale, low-altitude UAS operations. Operational concept will address beyond visual line of sight UAS operations under 400 ft. AGL. Information architecture, data exchange protocols, software functions. Roles/responsibilities of FAA and operators. Performance requirements.
Research on computer-aided design of modern marine power systems
NASA Astrophysics Data System (ADS)
Ding, Dongdong; Zeng, Fanming; Chen, Guojun
2004-03-01
To make the MPS (Marine Power System) design process more economical and easier, a new CAD scheme is brought forward which takes much advantage of VR (Virtual Reality) and AI (Artificial Intelligence) technologies. This CAD system can shorten the period of design and reduce the requirements on designers' experience in large scale. And some key issues like the selection of hardware and software of such a system are discussed.
Large Scale Hierarchical K-Means Based Image Retrieval With MapReduce
2014-03-27
hadoop distributed file system: Architecture and design, 2007. [10] G. Bradski. Dr. Dobb’s Journal of Software Tools, 2000. [11] Terry Costlow. Big data ...million images running on 20 virtual machines are shown. 15. SUBJECT TERMS Image Retrieval, MapReduce, Hierarchical K-Means, Big Data , Hadoop U U U UU 87...13 2.1.1.2 HDFS Data Representation . . . . . . . . . . . . . . . . 14 2.1.1.3 Hadoop Engine
Colak, Recep; Moser, Flavia; Chu, Jeffrey Shih-Chieh; Schönhuth, Alexander; Chen, Nansheng; Ester, Martin
2010-10-25
Computational prediction of functionally related groups of genes (functional modules) from large-scale data is an important issue in computational biology. Gene expression experiments and interaction networks are well studied large-scale data sources, available for many not yet exhaustively annotated organisms. It has been well established, when analyzing these two data sources jointly, modules are often reflected by highly interconnected (dense) regions in the interaction networks whose participating genes are co-expressed. However, the tractability of the problem had remained unclear and methods by which to exhaustively search for such constellations had not been presented. We provide an algorithmic framework, referred to as Densely Connected Biclustering (DECOB), by which the aforementioned search problem becomes tractable. To benchmark the predictive power inherent to the approach, we computed all co-expressed, dense regions in physical protein and genetic interaction networks from human and yeast. An automatized filtering procedure reduces our output which results in smaller collections of modules, comparable to state-of-the-art approaches. Our results performed favorably in a fair benchmarking competition which adheres to standard criteria. We demonstrate the usefulness of an exhaustive module search, by using the unreduced output to more quickly perform GO term related function prediction tasks. We point out the advantages of our exhaustive output by predicting functional relationships using two examples. We demonstrate that the computation of all densely connected and co-expressed regions in interaction networks is an approach to module discovery of considerable value. Beyond confirming the well settled hypothesis that such co-expressed, densely connected interaction network regions reflect functional modules, we open up novel computational ways to comprehensively analyze the modular organization of an organism based on prevalent and largely available large-scale datasets. Software and data sets are available at http://www.sfu.ca/~ester/software/DECOB.zip.
Teaching the Blind to Find Their Way by Playing Video Games
Merabet, Lotfi B.; Connors, Erin C.; Halko, Mark A.; Sánchez, Jaime
2012-01-01
Computer based video games are receiving great interest as a means to learn and acquire new skills. As a novel approach to teaching navigation skills in the blind, we have developed Audio-based Environment Simulator (AbES); a virtual reality environment set within the context of a video game metaphor. Despite the fact that participants were naïve to the overall purpose of the software, we found that early blind users were able to acquire relevant information regarding the spatial layout of a previously unfamiliar building using audio based cues alone. This was confirmed by a series of behavioral performance tests designed to assess the transfer of acquired spatial information to a large-scale, real-world indoor navigation task. Furthermore, learning the spatial layout through a goal directed gaming strategy allowed for the mental manipulation of spatial information as evidenced by enhanced navigation performance when compared to an explicit route learning strategy. We conclude that the immersive and highly interactive nature of the software greatly engages the blind user to actively explore the virtual environment. This in turn generates an accurate sense of a large-scale three-dimensional space and facilitates the learning and transfer of navigation skills to the physical world. PMID:23028703
BRepertoire: a user-friendly web server for analysing antibody repertoire data.
Margreitter, Christian; Lu, Hui-Chun; Townsend, Catherine; Stewart, Alexander; Dunn-Walters, Deborah K; Fraternali, Franca
2018-04-14
Antibody repertoire analysis by high throughput sequencing is now widely used, but a persisting challenge is enabling immunologists to explore their data to discover discriminating repertoire features for their own particular investigations. Computational methods are necessary for large-scale evaluation of antibody properties. We have developed BRepertoire, a suite of user-friendly web-based software tools for large-scale statistical analyses of repertoire data. The software is able to use data preprocessed by IMGT, and performs statistical and comparative analyses with versatile plotting options. BRepertoire has been designed to operate in various modes, for example analysing sequence-specific V(D)J gene usage, discerning physico-chemical properties of the CDR regions and clustering of clonotypes. Those analyses are performed on the fly by a number of R packages and are deployed by a shiny web platform. The user can download the analysed data in different table formats and save the generated plots as image files ready for publication. We believe BRepertoire to be a versatile analytical tool that complements experimental studies of immune repertoires. To illustrate the server's functionality, we show use cases including differential gene usage in a vaccination dataset and analysis of CDR3H properties in old and young individuals. The server is accessible under http://mabra.biomed.kcl.ac.uk/BRepertoire.
Implementationof a modular software system for multiphysical processes in porous media
NASA Astrophysics Data System (ADS)
Naumov, Dmitri; Watanabe, Norihiro; Bilke, Lars; Fischer, Thomas; Lehmann, Christoph; Rink, Karsten; Walther, Marc; Wang, Wenqing; Kolditz, Olaf
2016-04-01
Subsurface georeservoirs are a candidate technology for large scale energy storage required as part of the transition to renewable energy sources. The increased use of the subsurface results in competing interests and possible impacts on protected entities. To optimize and plan the use of the subsurface in large scale scenario analyses,powerful numerical frameworks are required that aid process understanding and can capture the coupled thermal (T), hydraulic (H), mechanical (M), and chemical (C) processes with high computational efficiency. Due to having a multitude of different couplings between basic T, H, M, or C processes and the necessity to implement new numerical schemes the development focus has moved to software's modularity. The decreased coupling between the components results in two major advantages: easier addition of specialized processes and improvement of the code's testability and therefore its quality. The idea of modularization is implemented on several levels, in addition to library based separation of the previous code version, by using generalized algorithms available in the Standard Template Library and the Boost library, relying on efficient implementations of liner algebra solvers, using concepts when designing new types, and localization of frequently accessed data structures. This procedure shows certain benefits for a flexible high-performance framework applied to the analysis of multipurpose georeservoirs.
Identifying currents in the gene pool for bacterial populations using an integrative approach.
Tang, Jing; Hanage, William P; Fraser, Christophe; Corander, Jukka
2009-08-01
The evolution of bacterial populations has recently become considerably better understood due to large-scale sequencing of population samples. It has become clear that DNA sequences from a multitude of genes, as well as a broad sample coverage of a target population, are needed to obtain a relatively unbiased view of its genetic structure and the patterns of ancestry connected to the strains. However, the traditional statistical methods for evolutionary inference, such as phylogenetic analysis, are associated with several difficulties under such an extensive sampling scenario, in particular when a considerable amount of recombination is anticipated to have taken place. To meet the needs of large-scale analyses of population structure for bacteria, we introduce here several statistical tools for the detection and representation of recombination between populations. Also, we introduce a model-based description of the shape of a population in sequence space, in terms of its molecular variability and affinity towards other populations. Extensive real data from the genus Neisseria are utilized to demonstrate the potential of an approach where these population genetic tools are combined with an phylogenetic analysis. The statistical tools introduced here are freely available in BAPS 5.2 software, which can be downloaded from http://web.abo.fi/fak/mnf/mate/jc/software/baps.html.
Canosa, Joel
2018-01-01
The aim of this study is the application of a software tool to the design of stripping columns to calculate the removal of trihalomethanes (THMs) from drinking water. The tool also allows calculating the rough capital cost of the column and the decrease in carcinogenic risk indeces associated with the elimination of THMs and, thus, the investment to save a human life. The design of stripping columns includes the determination, among other factors, of the height (HOG), the theoretical number of plates (NOG), and the section (S) of the columns based on the study of pressure drop. These results have been compared with THM stripping literature values, showing that simulation is sufficiently conservative. Three case studies were chosen to apply the developed software. The first case study was representative of small-scale application to a community in Córdoba (Spain) where chloroform is predominant and has a low concentration. The second case study was of an intermediate scale in a region in Venezuela, and the third case study was representative of large-scale treatment of water in the Barcelona metropolitan region (Spain). Results showed that case studies with larger scale and higher initial risk offer the best capital investment to decrease the risk. PMID:29562670
Parallel Computation of the Regional Ocean Modeling System (ROMS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, P; Song, Y T; Chao, Y
2005-04-05
The Regional Ocean Modeling System (ROMS) is a regional ocean general circulation modeling system solving the free surface, hydrostatic, primitive equations over varying topography. It is free software distributed world-wide for studying both complex coastal ocean problems and the basin-to-global scale ocean circulation. The original ROMS code could only be run on shared-memory systems. With the increasing need to simulate larger model domains with finer resolutions and on a variety of computer platforms, there is a need in the ocean-modeling community to have a ROMS code that can be run on any parallel computer ranging from 10 to hundreds ofmore » processors. Recently, we have explored parallelization for ROMS using the MPI programming model. In this paper, an efficient parallelization strategy for such a large-scale scientific software package, based on an existing shared-memory computing model, is presented. In addition, scientific applications and data-performance issues on a couple of SGI systems, including Columbia, the world's third-fastest supercomputer, are discussed.« less
Study of multi-functional precision optical measuring system for large scale equipment
NASA Astrophysics Data System (ADS)
Jiang, Wei; Lao, Dabao; Zhou, Weihu; Zhang, Wenying; Jiang, Xingjian; Wang, Yongxi
2017-10-01
The effective application of high performance measurement technology can greatly improve the large-scale equipment manufacturing ability. Therefore, the geometric parameters measurement, such as size, attitude and position, requires the measurement system with high precision, multi-function, portability and other characteristics. However, the existing measuring instruments, such as laser tracker, total station, photogrammetry system, mostly has single function, station moving and other shortcomings. Laser tracker needs to work with cooperative target, but it can hardly meet the requirement of measurement in extreme environment. Total station is mainly used for outdoor surveying and mapping, it is hard to achieve the demand of accuracy in industrial measurement. Photogrammetry system can achieve a wide range of multi-point measurement, but the measuring range is limited and need to repeatedly move station. The paper presents a non-contact opto-electronic measuring instrument, not only it can work by scanning the measurement path but also measuring the cooperative target by tracking measurement. The system is based on some key technologies, such as absolute distance measurement, two-dimensional angle measurement, automatically target recognition and accurate aiming, precision control, assembly of complex mechanical system and multi-functional 3D visualization software. Among them, the absolute distance measurement module ensures measurement with high accuracy, and the twodimensional angle measuring module provides precision angle measurement. The system is suitable for the case of noncontact measurement of large-scale equipment, it can ensure the quality and performance of large-scale equipment throughout the process of manufacturing and improve the manufacturing ability of large-scale and high-end equipment.
Fractals and Spatial Methods for Mining Remote Sensing Imagery
NASA Technical Reports Server (NTRS)
Lam, Nina; Emerson, Charles; Quattrochi, Dale
2003-01-01
The rapid increase in digital remote sensing and GIS data raises a critical problem -- how can such an enormous amount of data be handled and analyzed so that useful information can be derived quickly? Efficient handling and analysis of large spatial data sets is central to environmental research, particularly in global change studies that employ time series. Advances in large-scale environmental monitoring and modeling require not only high-quality data, but also reliable tools to analyze the various types of data. A major difficulty facing geographers and environmental scientists in environmental assessment and monitoring is that spatial analytical tools are not easily accessible. Although many spatial techniques have been described recently in the literature, they are typically presented in an analytical form and are difficult to transform to a numerical algorithm. Moreover, these spatial techniques are not necessarily designed for remote sensing and GIS applications, and research must be conducted to examine their applicability and effectiveness in different types of environmental applications. This poses a chicken-and-egg problem: on one hand we need more research to examine the usability of the newer techniques and tools, yet on the other hand, this type of research is difficult to conduct if the tools to be explored are not accessible. Another problem that is fundamental to environmental research are issues related to spatial scale. The scale issue is especially acute in the context of global change studies because of the need to integrate remote-sensing and other spatial data that are collected at different scales and resolutions. Extrapolation of results across broad spatial scales remains the most difficult problem in global environmental research. There is a need for basic characterization of the effects of scale on image data, and the techniques used to measure these effects must be developed and implemented to allow for a multiple scale assessment of the data before any useful process-oriented modeling involving scale-dependent data can be conducted. Through the support of research grants from NASA, we have developed a software module called ICAMS (Image Characterization And Modeling System) to address the need to develop innovative spatial techniques and make them available to the broader scientific communities. ICAMS provides new spatial techniques, such as fractal analysis, geostatistical functions, and multiscale analysis that are not easily available in commercial GIS/image processing software. By bundling newer spatial methods in a user-friendly software module, researchers can begin to test and experiment with the new spatial analysis methods and they can gauge scale effects using a variety of remote sensing imagery. In the following, we describe briefly the development of ICAMS and present application examples.
Shi, Yulin; Veidenbaum, Alexander V; Nicolau, Alex; Xu, Xiangmin
2015-01-15
Modern neuroscience research demands computing power. Neural circuit mapping studies such as those using laser scanning photostimulation (LSPS) produce large amounts of data and require intensive computation for post hoc processing and analysis. Here we report on the design and implementation of a cost-effective desktop computer system for accelerated experimental data processing with recent GPU computing technology. A new version of Matlab software with GPU enabled functions is used to develop programs that run on Nvidia GPUs to harness their parallel computing power. We evaluated both the central processing unit (CPU) and GPU-enabled computational performance of our system in benchmark testing and practical applications. The experimental results show that the GPU-CPU co-processing of simulated data and actual LSPS experimental data clearly outperformed the multi-core CPU with up to a 22× speedup, depending on computational tasks. Further, we present a comparison of numerical accuracy between GPU and CPU computation to verify the precision of GPU computation. In addition, we show how GPUs can be effectively adapted to improve the performance of commercial image processing software such as Adobe Photoshop. To our best knowledge, this is the first demonstration of GPU application in neural circuit mapping and electrophysiology-based data processing. Together, GPU enabled computation enhances our ability to process large-scale data sets derived from neural circuit mapping studies, allowing for increased processing speeds while retaining data precision. Copyright © 2014 Elsevier B.V. All rights reserved.
Shi, Yulin; Veidenbaum, Alexander V.; Nicolau, Alex; Xu, Xiangmin
2014-01-01
Background Modern neuroscience research demands computing power. Neural circuit mapping studies such as those using laser scanning photostimulation (LSPS) produce large amounts of data and require intensive computation for post-hoc processing and analysis. New Method Here we report on the design and implementation of a cost-effective desktop computer system for accelerated experimental data processing with recent GPU computing technology. A new version of Matlab software with GPU enabled functions is used to develop programs that run on Nvidia GPUs to harness their parallel computing power. Results We evaluated both the central processing unit (CPU) and GPU-enabled computational performance of our system in benchmark testing and practical applications. The experimental results show that the GPU-CPU co-processing of simulated data and actual LSPS experimental data clearly outperformed the multi-core CPU with up to a 22x speedup, depending on computational tasks. Further, we present a comparison of numerical accuracy between GPU and CPU computation to verify the precision of GPU computation. In addition, we show how GPUs can be effectively adapted to improve the performance of commercial image processing software such as Adobe Photoshop. Comparison with Existing Method(s) To our best knowledge, this is the first demonstration of GPU application in neural circuit mapping and electrophysiology-based data processing. Conclusions Together, GPU enabled computation enhances our ability to process large-scale data sets derived from neural circuit mapping studies, allowing for increased processing speeds while retaining data precision. PMID:25277633
Review of Software Tools for Design and Analysis of Large scale MRM Proteomic Datasets
Colangelo, Christopher M.; Chung, Lisa; Bruce, Can; Cheung, Kei-Hoi
2013-01-01
Selective or Multiple Reaction monitoring (SRM/MRM) is a liquid-chromatography (LC)/tandem-mass spectrometry (MS/MS) method that enables the quantitation of specific proteins in a sample by analyzing precursor ions and the fragment ions of their selected tryptic peptides. Instrumentation software has advanced to the point that thousands of transitions (pairs of primary and secondary m/z values) can be measured in a triple quadrupole instrument coupled to an LC, by a well-designed scheduling and selection of m/z windows. The design of a good MRM assay relies on the availability of peptide spectra from previous discovery-phase LC-MS/MS studies. The tedious aspect of manually developing and processing MRM assays involving thousands of transitions has spurred to development of software tools to automate this process. Software packages have been developed for project management, assay development, assay validation, data export, peak integration, quality assessment, and biostatistical analysis. No single tool provides a complete end-to-end solution, thus this article reviews the current state and discusses future directions of these software tools in order to enable researchers to combine these tools for a comprehensive targeted proteomics workflow. PMID:23702368
Distributed agile software development for the SKA
NASA Astrophysics Data System (ADS)
Wicenec, Andreas; Parsons, Rebecca; Kitaeff, Slava; Vinsen, Kevin; Wu, Chen; Nelson, Paul; Reed, David
2012-09-01
The SKA software will most probably be developed by many groups distributed across the globe and coming from dierent backgrounds, like industries and research institutions. The SKA software subsystems will have to cover a very wide range of dierent areas, but still they have to react and work together like a single system to achieve the scientic goals and satisfy the challenging data ow requirements. Designing and developing such a system in a distributed fashion requires proper tools and the setup of an environment to allow for ecient detection and tracking of interface and integration issues in particular in a timely way. Agile development can provide much faster feedback mechanisms and also much tighter collaboration between the customer (scientist) and the developer. Continuous integration and continuous deployment on the other hand can provide much faster feedback of integration issues from the system level to the subsystem developers. This paper describes the results obtained from trialing a potential SKA development environment based on existing science software development processes like ALMA, the expected distribution of the groups potentially involved in the SKA development and experience gained in the development of large scale commercial software projects.
2009-01-01
Background Insertional mutagenesis is an effective method for functional genomic studies in various organisms. It can rapidly generate easily tractable mutations. A large-scale insertional mutagenesis with the piggyBac (PB) transposon is currently performed in mice at the Institute of Developmental Biology and Molecular Medicine (IDM), Fudan University in Shanghai, China. This project is carried out via collaborations among multiple groups overseeing interconnected experimental steps and generates a large volume of experimental data continuously. Therefore, the project calls for an efficient database system for recording, management, statistical analysis, and information exchange. Results This paper presents a database application called MP-PBmice (insertional mutation mapping system of PB Mutagenesis Information Center), which is developed to serve the on-going large-scale PB insertional mutagenesis project. A lightweight enterprise-level development framework Struts-Spring-Hibernate is used here to ensure constructive and flexible support to the application. The MP-PBmice database system has three major features: strict access-control, efficient workflow control, and good expandability. It supports the collaboration among different groups that enter data and exchange information on daily basis, and is capable of providing real time progress reports for the whole project. MP-PBmice can be easily adapted for other large-scale insertional mutation mapping projects and the source code of this software is freely available at http://www.idmshanghai.cn/PBmice. Conclusion MP-PBmice is a web-based application for large-scale insertional mutation mapping onto the mouse genome, implemented with the widely used framework Struts-Spring-Hibernate. This system is already in use by the on-going genome-wide PB insertional mutation mapping project at IDM, Fudan University. PMID:19958505
Chikkagoudar, Satish; Wang, Kai; Li, Mingyao
2011-05-26
Gene-gene interaction in genetic association studies is computationally intensive when a large number of SNPs are involved. Most of the latest Central Processing Units (CPUs) have multiple cores, whereas Graphics Processing Units (GPUs) also have hundreds of cores and have been recently used to implement faster scientific software. However, currently there are no genetic analysis software packages that allow users to fully utilize the computing power of these multi-core devices for genetic interaction analysis for binary traits. Here we present a novel software package GENIE, which utilizes the power of multiple GPU or CPU processor cores to parallelize the interaction analysis. GENIE reads an entire genetic association study dataset into memory and partitions the dataset into fragments with non-overlapping sets of SNPs. For each fragment, GENIE analyzes: 1) the interaction of SNPs within it in parallel, and 2) the interaction between the SNPs of the current fragment and other fragments in parallel. We tested GENIE on a large-scale candidate gene study on high-density lipoprotein cholesterol. Using an NVIDIA Tesla C1060 graphics card, the GPU mode of GENIE achieves a speedup of 27 times over its single-core CPU mode run. GENIE is open-source, economical, user-friendly, and scalable. Since the computing power and memory capacity of graphics cards are increasing rapidly while their cost is going down, we anticipate that GENIE will achieve greater speedups with faster GPU cards. Documentation, source code, and precompiled binaries can be downloaded from http://www.cceb.upenn.edu/~mli/software/GENIE/.
GiA Roots: software for the high throughput analysis of plant root system architecture.
Galkovskyi, Taras; Mileyko, Yuriy; Bucksch, Alexander; Moore, Brad; Symonova, Olga; Price, Charles A; Topp, Christopher N; Iyer-Pascuzzi, Anjali S; Zurek, Paul R; Fang, Suqin; Harer, John; Benfey, Philip N; Weitz, Joshua S
2012-07-26
Characterizing root system architecture (RSA) is essential to understanding the development and function of vascular plants. Identifying RSA-associated genes also represents an underexplored opportunity for crop improvement. Software tools are needed to accelerate the pace at which quantitative traits of RSA are estimated from images of root networks. We have developed GiA Roots (General Image Analysis of Roots), a semi-automated software tool designed specifically for the high-throughput analysis of root system images. GiA Roots includes user-assisted algorithms to distinguish root from background and a fully automated pipeline that extracts dozens of root system phenotypes. Quantitative information on each phenotype, along with intermediate steps for full reproducibility, is returned to the end-user for downstream analysis. GiA Roots has a GUI front end and a command-line interface for interweaving the software into large-scale workflows. GiA Roots can also be extended to estimate novel phenotypes specified by the end-user. We demonstrate the use of GiA Roots on a set of 2393 images of rice roots representing 12 genotypes from the species Oryza sativa. We validate trait measurements against prior analyses of this image set that demonstrated that RSA traits are likely heritable and associated with genotypic differences. Moreover, we demonstrate that GiA Roots is extensible and an end-user can add functionality so that GiA Roots can estimate novel RSA traits. In summary, we show that the software can function as an efficient tool as part of a workflow to move from large numbers of root images to downstream analysis.
2011-01-01
Background Gene-gene interaction in genetic association studies is computationally intensive when a large number of SNPs are involved. Most of the latest Central Processing Units (CPUs) have multiple cores, whereas Graphics Processing Units (GPUs) also have hundreds of cores and have been recently used to implement faster scientific software. However, currently there are no genetic analysis software packages that allow users to fully utilize the computing power of these multi-core devices for genetic interaction analysis for binary traits. Findings Here we present a novel software package GENIE, which utilizes the power of multiple GPU or CPU processor cores to parallelize the interaction analysis. GENIE reads an entire genetic association study dataset into memory and partitions the dataset into fragments with non-overlapping sets of SNPs. For each fragment, GENIE analyzes: 1) the interaction of SNPs within it in parallel, and 2) the interaction between the SNPs of the current fragment and other fragments in parallel. We tested GENIE on a large-scale candidate gene study on high-density lipoprotein cholesterol. Using an NVIDIA Tesla C1060 graphics card, the GPU mode of GENIE achieves a speedup of 27 times over its single-core CPU mode run. Conclusions GENIE is open-source, economical, user-friendly, and scalable. Since the computing power and memory capacity of graphics cards are increasing rapidly while their cost is going down, we anticipate that GENIE will achieve greater speedups with faster GPU cards. Documentation, source code, and precompiled binaries can be downloaded from http://www.cceb.upenn.edu/~mli/software/GENIE/. PMID:21615923
Wawrzyniak, Zbigniew M; Paczesny, Daniel; Mańczuk, Marta; Zatoński, Witold A
2011-01-01
Large-scale epidemiologic studies can assess health indicators differentiating social groups and important health outcomes of the incidence and mortality of cancer, cardiovascular disease, and others, to establish a solid knowledgebase for the prevention management of premature morbidity and mortality causes. This study presents new advanced methods of data collection and data management systems with current data quality control and security to ensure high quality data assessment of health indicators in the large epidemiologic PONS study (The Polish-Norwegian Study). The material for experiment is the data management design of the large-scale population study in Poland (PONS) and the managed processes are applied into establishing a high quality and solid knowledge. The functional requirements of the PONS study data collection, supported by the advanced IT web-based methods, resulted in medical data of a high quality, data security, with quality data assessment, control process and evolution monitoring are fulfilled and shared by the IT system. Data from disparate and deployed sources of information are integrated into databases via software interfaces, and archived by a multi task secure server. The practical and implemented solution of modern advanced database technologies and remote software/hardware structure successfully supports the research of the big PONS study project. Development and implementation of follow-up control of the consistency and quality of data analysis and the processes of the PONS sub-databases have excellent measurement properties of data consistency of more than 99%. The project itself, by tailored hardware/software application, shows the positive impact of Quality Assurance (QA) on the quality of outcomes analysis results, effective data management within a shorter time. This efficiency ensures the quality of the epidemiological data and indicators of health by the elimination of common errors of research questionnaires and medical measurements.
Bi-Force: large-scale bicluster editing and its application to gene expression data biclustering.
Sun, Peng; Speicher, Nora K; Röttger, Richard; Guo, Jiong; Baumbach, Jan
2014-05-01
The explosion of the biological data has dramatically reformed today's biological research. The need to integrate and analyze high-dimensional biological data on a large scale is driving the development of novel bioinformatics approaches. Biclustering, also known as 'simultaneous clustering' or 'co-clustering', has been successfully utilized to discover local patterns in gene expression data and similar biomedical data types. Here, we contribute a new heuristic: 'Bi-Force'. It is based on the weighted bicluster editing model, to perform biclustering on arbitrary sets of biological entities, given any kind of pairwise similarities. We first evaluated the power of Bi-Force to solve dedicated bicluster editing problems by comparing Bi-Force with two existing algorithms in the BiCluE software package. We then followed a biclustering evaluation protocol in a recent review paper from Eren et al. (2013) (A comparative analysis of biclustering algorithms for gene expressiondata. Brief. Bioinform., 14:279-292.) and compared Bi-Force against eight existing tools: FABIA, QUBIC, Cheng and Church, Plaid, BiMax, Spectral, xMOTIFs and ISA. To this end, a suite of synthetic datasets as well as nine large gene expression datasets from Gene Expression Omnibus were analyzed. All resulting biclusters were subsequently investigated by Gene Ontology enrichment analysis to evaluate their biological relevance. The distinct theoretical foundation of Bi-Force (bicluster editing) is more powerful than strict biclustering. We thus outperformed existing tools with Bi-Force at least when following the evaluation protocols from Eren et al. Bi-Force is implemented in Java and integrated into the open source software package of BiCluE. The software as well as all used datasets are publicly available at http://biclue.mpi-inf.mpg.de. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Scaling Semantic Graph Databases in Size and Performance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morari, Alessandro; Castellana, Vito G.; Villa, Oreste
In this paper we present SGEM, a full software system for accelerating large-scale semantic graph databases on commodity clusters. Unlike current approaches, SGEM addresses semantic graph databases by only employing graph methods at all the levels of the stack. On one hand, this allows exploiting the space efficiency of graph data structures and the inherent parallelism of graph algorithms. These features adapt well to the increasing system memory and core counts of modern commodity clusters. On the other hand, however, these systems are optimized for regular computation and batched data transfers, while graph methods usually are irregular and generate fine-grainedmore » data accesses with poor spatial and temporal locality. Our framework comprises a SPARQL to data parallel C compiler, a library of parallel graph methods and a custom, multithreaded runtime system. We introduce our stack, motivate its advantages with respect to other solutions and show how we solved the challenges posed by irregular behaviors. We present the result of our software stack on the Berlin SPARQL benchmarks with datasets up to 10 billion triples (a triple corresponds to a graph edge), demonstrating scaling in dataset size and in performance as more nodes are added to the cluster.« less
Northwest Trajectory Analysis Capability: A Platform for Enhancing Computational Biophysics Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peterson, Elena S.; Stephan, Eric G.; Corrigan, Abigail L.
2008-07-30
As computational resources continue to increase, the ability of computational simulations to effectively complement, and in some cases replace, experimentation in scientific exploration also increases. Today, large-scale simulations are recognized as an effective tool for scientific exploration in many disciplines including chemistry and biology. A natural side effect of this trend has been the need for an increasingly complex analytical environment. In this paper, we describe Northwest Trajectory Analysis Capability (NTRAC), an analytical software suite developed to enhance the efficiency of computational biophysics analyses. Our strategy is to layer higher-level services and introduce improved tools within the user’s familiar environmentmore » without preventing researchers from using traditional tools and methods. Our desire is to share these experiences to serve as an example for effectively analyzing data intensive large scale simulation data.« less
DistributedFBA.jl: High-level, high-performance flux balance analysis in Julia
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heirendt, Laurent; Thiele, Ines; Fleming, Ronan M. T.
Flux balance analysis and its variants are widely used methods for predicting steady-state reaction rates in biochemical reaction networks. The exploration of high dimensional networks with such methods is currently hampered by software performance limitations. DistributedFBA.jl is a high-level, high-performance, open-source implementation of flux balance analysis in Julia. It is tailored to solve multiple flux balance analyses on a subset or all the reactions of large and huge-scale networks, on any number of threads or nodes. DistributedFBA.jl is a high-level, high-performance, open-source implementation of flux balance analysis in Julia. It is tailored to solve multiple flux balance analyses on amore » subset or all the reactions of large and huge-scale networks, on any number of threads or nodes.« less
DistributedFBA.jl: High-level, high-performance flux balance analysis in Julia
Heirendt, Laurent; Thiele, Ines; Fleming, Ronan M. T.
2017-01-16
Flux balance analysis and its variants are widely used methods for predicting steady-state reaction rates in biochemical reaction networks. The exploration of high dimensional networks with such methods is currently hampered by software performance limitations. DistributedFBA.jl is a high-level, high-performance, open-source implementation of flux balance analysis in Julia. It is tailored to solve multiple flux balance analyses on a subset or all the reactions of large and huge-scale networks, on any number of threads or nodes. DistributedFBA.jl is a high-level, high-performance, open-source implementation of flux balance analysis in Julia. It is tailored to solve multiple flux balance analyses on amore » subset or all the reactions of large and huge-scale networks, on any number of threads or nodes.« less
[Stress management in large-scale establishments].
Fukasawa, Kenji
2002-07-01
Due to a recent dramatic change in industrial structures in Japan, the role of large-scale enterprises is changing. Mass production used to be the major income sources of companies, but nowadays it has changed to high value-added products, including, software development. As a consequence of highly competitive inter-corporate development, there are various sources of job stress which induce health problems in employees, especially those concerned with development or management. To simply to obey the law or offer medical care are not enough to achieve management of these problems. Occupational health staff need to act according to the disease type and provide care with support from the Supervisor and Personnel Division. And for the training, development and consultation system, occupational health staff must work with the Personnel Division and Safety Division, and be approved by management supervisors.
Wan, Shixiang; Zou, Quan
2017-01-01
Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. Extreme increase in next-generation sequencing results in shortage of efficient ultra-large biological sequence alignment approaches for coping with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g. files more than 1 GB) sequence analyses. Based on HAlign and Spark distributed computing system, we implement a highly cost-efficient and time-efficient HAlign-II tool to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. The experiments in the DNA and protein large scale data sets, which are more than 1GB files, showed that HAlign II could save time and space. It outperformed the current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences. HAlign-II shows extremely high memory efficiency and scales well with increases in computing resource. THAlign-II provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II with open-source codes and datasets was established at http://lab.malab.cn/soft/halign.
Event management for large scale event-driven digital hardware spiking neural networks.
Caron, Louis-Charles; D'Haene, Michiel; Mailhot, Frédéric; Schrauwen, Benjamin; Rouat, Jean
2013-09-01
The interest in brain-like computation has led to the design of a plethora of innovative neuromorphic systems. Individually, spiking neural networks (SNNs), event-driven simulation and digital hardware neuromorphic systems get a lot of attention. Despite the popularity of event-driven SNNs in software, very few digital hardware architectures are found. This is because existing hardware solutions for event management scale badly with the number of events. This paper introduces the structured heap queue, a pipelined digital hardware data structure, and demonstrates its suitability for event management. The structured heap queue scales gracefully with the number of events, allowing the efficient implementation of large scale digital hardware event-driven SNNs. The scaling is linear for memory, logarithmic for logic resources and constant for processing time. The use of the structured heap queue is demonstrated on a field-programmable gate array (FPGA) with an image segmentation experiment and a SNN of 65,536 neurons and 513,184 synapses. Events can be processed at the rate of 1 every 7 clock cycles and a 406×158 pixel image is segmented in 200 ms. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
George, D. L.; Iverson, R. M.
2012-12-01
Numerically simulating debris-flow motion presents many challenges due to the complicated physics of flowing granular-fluid mixtures, the diversity of spatial scales (ranging from a characteristic particle size to the extent of the debris flow deposit), and the unpredictability of the flow domain prior to a simulation. Accurately predicting debris-flows requires models that are complex enough to represent the dominant effects of granular-fluid interaction, while remaining mathematically and computationally tractable. We have developed a two-phase depth-averaged mathematical model for debris-flow initiation and subsequent motion. Additionally, we have developed software that numerically solves the model equations efficiently on large domains. A unique feature of the mathematical model is that it includes the feedback between pore-fluid pressure and the evolution of the solid grain volume fraction, a process that regulates flow resistance. This feature endows the model with the ability to represent the transition from a stationary mass to a dynamic flow. With traditional approaches, slope stability analysis and flow simulation are treated separately, and the latter models are often initialized with force balances that are unrealistically far from equilibrium. Additionally, our new model relies on relatively few dimensionless parameters that are functions of well-known material properties constrained by physical data (eg. hydraulic permeability, pore-fluid viscosity, debris compressibility, Coulomb friction coefficient, etc.). We have developed numerical methods and software for accurately solving the model equations. By employing adaptive mesh refinement (AMR), the software can efficiently resolve an evolving debris flow as it advances through irregular topography, without needing terrain-fit computational meshes. The AMR algorithms utilize multiple levels of grid resolutions, so that computationally inexpensive coarse grids can be used where the flow is absent, and much higher resolution grids evolve with the flow. The reduction in computational cost, due to AMR, makes very large-scale problems tractable on personal computers. Model accuracy can be tested by comparison of numerical predictions and empirical data. These comparisons utilize controlled experiments conducted at the USGS debris-flow flume, which provide detailed data about flow mobilization and dynamics. Additionally, we have simulated historical large-scale debris flows, such as the (≈50 million m^3) debris flow that originated on Mt. Meager, British Columbia in 2010. This flow took a very complex route through highly variable topography and provides a valuable benchmark for testing. Maps of the debris flow deposit and data from seismic stations provide evidence regarding flow initiation, transit times and deposition. Our simulations reproduce many of the complex patterns of the event, such as run-out geometry and extent, and the large-scale nature of the flow and the complex topographical features demonstrate the utility of AMR in flow simulations.
Image processing in biodosimetry: A proposal of a generic free software platform.
Dumpelmann, Matthias; Cadena da Matta, Mariel; Pereira de Lemos Pinto, Marcela Maria; de Salazar E Fernandes, Thiago; Borges da Silva, Edvane; Amaral, Ademir
2015-08-01
The scoring of chromosome aberrations is the most reliable biological method for evaluating individual exposure to ionizing radiation. However, microscopic analyses of chromosome human metaphases, generally employed to identify aberrations mainly dicentrics (chromosome with two centromeres), is a laborious task. This method is time consuming and its application in biological dosimetry would be almost impossible in case of a large scale radiation incidents. In this project, a generic software was enhanced for automatic chromosome image processing from a framework originally developed for the Framework V project Simbio, of the European Union for applications in the area of source localization from electroencephalographic signals. The platforms capability is demonstrated by a study comparing automatic segmentation strategies of chromosomes from microscopic images.
Apollo experience report: Guidance and control systems. Engineering simulation program
NASA Technical Reports Server (NTRS)
Gilbert, D. W.
1973-01-01
The Apollo Program experience from early 1962 to July 1969 with respect to the engineering-simulation support and the problems encountered is summarized in this report. Engineering simulation in support of the Apollo guidance and control system is discussed in terms of design analysis and verification, certification of hardware in closed-loop operation, verification of hardware/software compatibility, and verification of both software and procedures for each mission. The magnitude, time, and cost of the engineering simulations are described with respect to hardware availability, NASA and contractor facilities (for verification of the command module, the lunar module, and the primary guidance, navigation, and control system), and scheduling and planning considerations. Recommendations are made regarding implementation of similar, large-scale simulations for future programs.
Parallel Multiscale Algorithms for Astrophysical Fluid Dynamics Simulations
NASA Technical Reports Server (NTRS)
Norman, Michael L.
1997-01-01
Our goal is to develop software libraries and applications for astrophysical fluid dynamics simulations in multidimensions that will enable us to resolve the large spatial and temporal variations that inevitably arise due to gravity, fronts and microphysical phenomena. The software must run efficiently on parallel computers and be general enough to allow the incorporation of a wide variety of physics. Cosmological structure formation with realistic gas physics is the primary application driver in this work. Accurate simulations of e.g. galaxy formation require a spatial dynamic range (i.e., ratio of system scale to smallest resolved feature) of 104 or more in three dimensions in arbitrary topologies. We take this as our technical requirement. We have achieved, and in fact, surpassed these goals.
An Integrated Unix-based CAD System for the Design and Testing of Custom VLSI Chips
NASA Technical Reports Server (NTRS)
Deutsch, L. J.
1985-01-01
A computer aided design (CAD) system that is being used at the Jet Propulsion Laboratory for the design of custom and semicustom very large scale integrated (VLSI) chips is described. The system consists of a Digital Equipment Corporation VAX computer with the UNIX operating system and a collection of software tools for the layout, simulation, and verification of microcircuits. Most of these tools were written by the academic community and are, therefore, available to JPL at little or no cost. Some small pieces of software have been written in-house in order to make all the tools interact with each other with a minimal amount of effort on the part of the designer.
Computing Properties of Hadrons, Nuclei and Nuclear Matter from Quantum Chromodynamics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Savage, Martin J.
This project was part of a coordinated software development effort which the nuclear physics lattice QCD community pursues in order to ensure that lattice calculations can make optimal use of present, and forthcoming leadership-class and dedicated hardware, including those of the national laboratories, and prepares for the exploitation of future computational resources in the exascale era. The UW team improved and extended software libraries used in lattice QCD calculations related to multi-nucleon systems, enhanced production running codes related to load balancing multi-nucleon production on large-scale computing platforms, and developed SQLite (addressable database) interfaces to efficiently archive and analyze multi-nucleon datamore » and developed a Mathematica interface for the SQLite databases.« less
NASA Technical Reports Server (NTRS)
Chang, C. L.; Stachowitz, R. A.
1988-01-01
Software quality is of primary concern in all large-scale expert system development efforts. Building appropriate validation and test tools for ensuring software reliability of expert systems is therefore required. The Expert Systems Validation Associate (EVA) is a validation system under development at the Lockheed Artificial Intelligence Center. EVA provides a wide range of validation and test tools to check correctness, consistency, and completeness of an expert system. Testing a major function of EVA. It means executing an expert system with test cases with the intent of finding errors. In this paper, we describe many different types of testing such as function-based testing, structure-based testing, and data-based testing. We describe how appropriate test cases may be selected in order to perform good and thorough testing of an expert system.
Seeing beyond Computer Science and Software Engineering
NASA Astrophysics Data System (ADS)
Nori, Kesav Vithal
The boundaries of computer science are defined by what symbolic computation can accomplish. Software Engineering is concerned with effective use of computing technology to support automatic computation on a large scale so as to construct desirable solutions to worthwhile problems. Both focus on what happens within the machine. In contrast, most practical applications of computing support end-users in realizing (often unsaid) objectives. It is often said that such objectives cannot be even specified, e.g., what is the specification of MS Word, or for that matter, any flavour of UNIX? This situation points to the need for architecting what people do with computers. Based on Systems Thinking and Cybernetics, we present such a viewpoint which hinges on Human Responsibility and means of living up to it.
Halligan, Brian D.; Geiger, Joey F.; Vallejos, Andrew K.; Greene, Andrew S.; Twigger, Simon N.
2009-01-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step by step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center website (http://proteomics.mcw.edu/vipdac). PMID:19358578
Halligan, Brian D; Geiger, Joey F; Vallejos, Andrew K; Greene, Andrew S; Twigger, Simon N
2009-06-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step-by-step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center Web site ( http://proteomics.mcw.edu/vipdac ).
Sirocco Storage Server v. pre-alpha 0.1
DOE Office of Scientific and Technical Information (OSTI.GOV)
Curry, Matthew L.; Danielson, Geoffrey; Ward, H. Lee
Sirocco is a parallel storage system under development, designed for write-intensive workloads on large-scale HPC platforms. It implements a keyvalue object store on top of a set of loosely federated storage servers that cooperate to ensure data integrity and performance. It includes support for a range of different types of storage transactions. This software release constitutes a conformant storage server, along with the client-side libraries to access the storage over a network.
Streaming fragment assignment for real-time analysis of sequencing experiments
Roberts, Adam; Pachter, Lior
2013-01-01
We present eXpress, a software package for highly efficient probabilistic assignment of ambiguously mapping sequenced fragments. eXpress uses a streaming algorithm with linear run time and constant memory use. It can determine abundances of sequenced molecules in real time, and can be applied to ChIP-seq, metagenomics and other large-scale sequencing data. We demonstrate its use on RNA-seq data, showing greater efficiency than other quantification methods. PMID:23160280
YF-12 cooperative airframe/propulsion control system program, volume 1
NASA Technical Reports Server (NTRS)
Anderson, D. L.; Connolly, G. F.; Mauro, F. M.; Reukauf, P. J.; Marks, R. (Editor)
1980-01-01
Several YF-12C airplane analog control systems were converted to a digital system. Included were the air data computer, autopilot, inlet control system, and autothrottle systems. This conversion was performed to allow assessment of digital technology applications to supersonic cruise aircraft. The digital system was composed of a digital computer and specialized interface unit. A large scale mathematical simulation of the airplane was used for integration testing and software checkout.
Final Report: Quantification of Uncertainty in Extreme Scale Computations (QUEST)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Marzouk, Youssef; Conrad, Patrick; Bigoni, Daniele
QUEST (\\url{www.quest-scidac.org}) is a SciDAC Institute that is focused on uncertainty quantification (UQ) in large-scale scientific computations. Our goals are to (1) advance the state of the art in UQ mathematics, algorithms, and software; and (2) provide modeling, algorithmic, and general UQ expertise, together with software tools, to other SciDAC projects, thereby enabling and guiding a broad range of UQ activities in their respective contexts. QUEST is a collaboration among six institutions (Sandia National Laboratories, Los Alamos National Laboratory, the University of Southern California, Massachusetts Institute of Technology, the University of Texas at Austin, and Duke University) with a historymore » of joint UQ research. Our vision encompasses all aspects of UQ in leadership-class computing. This includes the well-founded setup of UQ problems; characterization of the input space given available data/information; local and global sensitivity analysis; adaptive dimensionality and order reduction; forward and inverse propagation of uncertainty; handling of application code failures, missing data, and hardware/software fault tolerance; and model inadequacy, comparison, validation, selection, and averaging. The nature of the UQ problem requires the seamless combination of data, models, and information across this landscape in a manner that provides a self-consistent quantification of requisite uncertainties in predictions from computational models. Accordingly, our UQ methods and tools span an interdisciplinary space across applied math, information theory, and statistics. The MIT QUEST effort centers on statistical inference and methods for surrogate or reduced-order modeling. MIT personnel have been responsible for the development of adaptive sampling methods, methods for approximating computationally intensive models, and software for both forward uncertainty propagation and statistical inverse problems. A key software product of the MIT QUEST effort is the MIT Uncertainty Quantification library, called MUQ (\\url{muq.mit.edu}).« less
Östling, Gerd; Persson, Margaretha; Hedblad, Bo; Gonçalves, Isabel
2013-11-01
Grey scale median (GSM) measured on ultrasound images of carotid plaques has been used for several years now in research to find the vulnerable plaque. Centres have used different software and also different methods for GSM measurement. This has resulted in a wide range of GSM values and cut-off values for the detection of the vulnerable plaque. The aim of this study was to compare the values obtained with two different softwares, using different standardization methods, for the measurement of GSM on ultrasound images of carotid human plaques. GSM was measured with Adobe Photoshop(®) and with Artery Measurement System (AMS) on duplex ultrasound images of 100 consecutive medium- to large-sized carotid plaques of the Beta-blocker Cholesterol-lowering Asymptomatic Plaque Study (BCAPS). The mean values of GSM were 35·2 ± 19·3 and 55·8 ± 22·5 for Adobe Photoshop(®) and AMS, respectively. Mean difference was 20·45 (95% CI: 19·17-21·73). Although the absolute values of GSM differed, the agreement between the two measurements was good, correlation coefficient 0·95. A chi-square test revealed a kappa value of 0·68 when studying quartiles of GSM. The intra-observer variability was 1·9% for AMS and 2·5% for Adobe Photoshop. The difference between softwares and standardization methods must be taken into consideration when comparing studies. To avoid these problems, researcher should come to a consensus regarding software and standardization method for GSM measurement on ultrasound images of plaque in the arteries. © 2013 Scandinavian Society of Clinical Physiology and Nuclear Medicine. Published by John Wiley & Sons Ltd.
Resources for Functional Genomics Studies in Drosophila melanogaster
Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert
2014-01-01
Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003
Data-Driven Simulation-Enhanced Optimization of People-Based Print Production Service
NASA Astrophysics Data System (ADS)
Rai, Sudhendu
This paper describes a systematic six-step data-driven simulation-based methodology for optimizing people-based service systems on a large distributed scale that exhibit high variety and variability. The methodology is exemplified through its application within the printing services industry where it has been successfully deployed by Xerox Corporation across small, mid-sized and large print shops generating over 250 million in profits across the customer value chain. Each step of the methodology consisting of innovative concepts co-development and testing in partnership with customers, development of software and hardware tools to implement the innovative concepts, establishment of work-process and practices for customer-engagement and service implementation, creation of training and infrastructure for large scale deployment, integration of the innovative offering within the framework of existing corporate offerings and lastly the monitoring and deployment of the financial and operational metrics for estimating the return-on-investment and the continual renewal of the offering are described in detail.
NASA Astrophysics Data System (ADS)
Semple, Lucas M.; Carriveau, Rupp; Ting, David S.-K.
2018-04-01
In the Ontario greenhouse sector the misalignment of available solar radiation during the summer months and large heating demand during the winter months makes solar thermal collector systems an unviable option without some form of seasonal energy storage. Information obtained from Ontario greenhouse operators has shown that over 20% of annual natural gas usage occurs during the summer months for greenhouse pre-heating prior to sunrise. A transient model of the greenhouse microclimate and indoor conditioning systems is carried out using TRNSYS software and validated with actual natural gas usage data. A large-scale solar thermal collector system is then incorporated and found to reduce the annual heating energy demand by approximately 35%. The inclusion of the collector system correlates to a reduction of about 120 tonnes of CO2 equivalent emissions per acre of greenhouse per year. System payback period is discussed considering the benefits of a future Ontario carbon tax.
Constructing Flexible, Configurable, ETL Pipelines for the Analysis of "Big Data" with Apache OODT
NASA Astrophysics Data System (ADS)
Hart, A. F.; Mattmann, C. A.; Ramirez, P.; Verma, R.; Zimdars, P. A.; Park, S.; Estrada, A.; Sumarlidason, A.; Gil, Y.; Ratnakar, V.; Krum, D.; Phan, T.; Meena, A.
2013-12-01
A plethora of open source technologies for manipulating, transforming, querying, and visualizing 'big data' have blossomed and matured in the last few years, driven in large part by recognition of the tremendous value that can be derived by leveraging data mining and visualization techniques on large data sets. One facet of many of these tools is that input data must often be prepared into a particular format (e.g.: JSON, CSV), or loaded into a particular storage technology (e.g.: HDFS) before analysis can take place. This process, commonly known as Extract-Transform-Load, or ETL, often involves multiple well-defined steps that must be executed in a particular order, and the approach taken for a particular data set is generally sensitive to the quantity and quality of the input data, as well as the structure and complexity of the desired output. When working with very large, heterogeneous, unstructured or semi-structured data sets, automating the ETL process and monitoring its progress becomes increasingly important. Apache Object Oriented Data Technology (OODT) provides a suite of complementary data management components called the Process Control System (PCS) that can be connected together to form flexible ETL pipelines as well as browser-based user interfaces for monitoring and control of ongoing operations. The lightweight, metadata driven middleware layer can be wrapped around custom ETL workflow steps, which themselves can be implemented in any language. Once configured, it facilitates communication between workflow steps and supports execution of ETL pipelines across a distributed cluster of compute resources. As participants in a DARPA-funded effort to develop open source tools for large-scale data analysis, we utilized Apache OODT to rapidly construct custom ETL pipelines for a variety of very large data sets to prepare them for analysis and visualization applications. We feel that OODT, which is free and open source software available through the Apache Software Foundation, is particularly well suited to developing and managing arbitrary large-scale ETL processes both for the simplicity and flexibility of its wrapper framework, as well as the detailed provenance information it exposes throughout the process. Our experience using OODT to manage processing of large-scale data sets in domains as diverse as radio astronomy, life sciences, and social network analysis demonstrates the flexibility of the framework, and the range of potential applications to a broad array of big data ETL challenges.
Pérez-Rodríguez, Gael; Glez-Peña, Daniel; Azevedo, Nuno F; Pereira, Maria Olívia; Fdez-Riverola, Florentino; Lourenço, Anália
2015-03-01
Biofilms are receiving increasing attention from the biomedical community. Biofilm-like growth within human body is considered one of the key microbial strategies to augment resistance and persistence during infectious processes. The Biofilms Experiment Workbench is a novel software workbench for the operation and analysis of biofilms experimental data. The goal is to promote the interchange and comparison of data among laboratories, providing systematic, harmonised and large-scale data computation. The workbench was developed with AIBench, an open-source Java desktop application framework for scientific software development in the domain of translational biomedicine. Implementation favours free and open-source third-parties, such as the R statistical package, and reaches for the Web services of the BiofOmics database to enable public experiment deposition. First, we summarise the novel, free, open, XML-based interchange format for encoding biofilms experimental data. Then, we describe the execution of common scenarios of operation with the new workbench, such as the creation of new experiments, the importation of data from Excel spreadsheets, the computation of analytical results, the on-demand and highly customised construction of Web publishable reports, and the comparison of results between laboratories. A considerable and varied amount of biofilms data is being generated, and there is a critical need to develop bioinformatics tools that expedite the interchange and comparison of microbiological and clinical results among laboratories. We propose a simple, open-source software infrastructure which is effective, extensible and easy to understand. The workbench is freely available for non-commercial use at http://sing.ei.uvigo.es/bew under LGPL license. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
A Web-based Distributed Voluntary Computing Platform for Large Scale Hydrological Computations
NASA Astrophysics Data System (ADS)
Demir, I.; Agliamzanov, R.
2014-12-01
Distributed volunteer computing can enable researchers and scientist to form large parallel computing environments to utilize the computing power of the millions of computers on the Internet, and use them towards running large scale environmental simulations and models to serve the common good of local communities and the world. Recent developments in web technologies and standards allow client-side scripting languages to run at speeds close to native application, and utilize the power of Graphics Processing Units (GPU). Using a client-side scripting language like JavaScript, we have developed an open distributed computing framework that makes it easy for researchers to write their own hydrologic models, and run them on volunteer computers. Users will easily enable their websites for visitors to volunteer sharing their computer resources to contribute running advanced hydrological models and simulations. Using a web-based system allows users to start volunteering their computational resources within seconds without installing any software. The framework distributes the model simulation to thousands of nodes in small spatial and computational sizes. A relational database system is utilized for managing data connections and queue management for the distributed computing nodes. In this paper, we present a web-based distributed volunteer computing platform to enable large scale hydrological simulations and model runs in an open and integrated environment.
Reverse engineering and analysis of large genome-scale gene networks
Aluru, Maneesha; Zola, Jaroslaw; Nettleton, Dan; Aluru, Srinivas
2013-01-01
Reverse engineering the whole-genome networks of complex multicellular organisms continues to remain a challenge. While simpler models easily scale to large number of genes and gene expression datasets, more accurate models are compute intensive limiting their scale of applicability. To enable fast and accurate reconstruction of large networks, we developed Tool for Inferring Network of Genes (TINGe), a parallel mutual information (MI)-based program. The novel features of our approach include: (i) B-spline-based formulation for linear-time computation of MI, (ii) a novel algorithm for direct permutation testing and (iii) development of parallel algorithms to reduce run-time and facilitate construction of large networks. We assess the quality of our method by comparison with ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks) and GeneNet and demonstrate its unique capability by reverse engineering the whole-genome network of Arabidopsis thaliana from 3137 Affymetrix ATH1 GeneChips in just 9 min on a 1024-core cluster. We further report on the development of a new software Gene Network Analyzer (GeNA) for extracting context-specific subnetworks from a given set of seed genes. Using TINGe and GeNA, we performed analysis of 241 Arabidopsis AraCyc 8.0 pathways, and the results are made available through the web. PMID:23042249
Modeling the Hydrologic Effects of Large-Scale Green Infrastructure Projects with GIS
NASA Astrophysics Data System (ADS)
Bado, R. A.; Fekete, B. M.; Khanbilvardi, R.
2015-12-01
Impervious surfaces in urban areas generate excess runoff, which in turn causes flooding, combined sewer overflows, and degradation of adjacent surface waters. Municipal environmental protection agencies have shown a growing interest in mitigating these effects with 'green' infrastructure practices that partially restore the perviousness and water holding capacity of urban centers. Assessment of the performance of current and future green infrastructure projects is hindered by the lack of adequate hydrological modeling tools; conventional techniques fail to account for the complex flow pathways of urban environments, and detailed analyses are difficult to prepare for the very large domains in which green infrastructure projects are implemented. Currently, no standard toolset exists that can rapidly and conveniently predict runoff, consequent inundations, and sewer overflows at a city-wide scale. We demonstrate how streamlined modeling techniques can be used with open-source GIS software to efficiently model runoff in large urban catchments. Hydraulic parameters and flow paths through city blocks, roadways, and sewer drains are automatically generated from GIS layers, and ultimately urban flow simulations can be executed for a variety of rainfall conditions. With this methodology, users can understand the implications of large-scale land use changes and green/gray storm water retention systems on hydraulic loading, peak flow rates, and runoff volumes.
A parallel and sensitive software tool for methylation analysis on multicore platforms.
Tárraga, Joaquín; Pérez, Mariano; Orduña, Juan M; Duato, José; Medina, Ignacio; Dopazo, Joaquín
2015-10-01
DNA methylation analysis suffers from very long processing time, as the advent of Next-Generation Sequencers has shifted the bottleneck of genomic studies from the sequencers that obtain the DNA samples to the software that performs the analysis of these samples. The existing software for methylation analysis does not seem to scale efficiently neither with the size of the dataset nor with the length of the reads to be analyzed. As it is expected that the sequencers will provide longer and longer reads in the near future, efficient and scalable methylation software should be developed. We present a new software tool, called HPG-Methyl, which efficiently maps bisulphite sequencing reads on DNA, analyzing DNA methylation. The strategy used by this software consists of leveraging the speed of the Burrows-Wheeler Transform to map a large number of DNA fragments (reads) rapidly, as well as the accuracy of the Smith-Waterman algorithm, which is exclusively employed to deal with the most ambiguous and shortest reads. Experimental results on platforms with Intel multicore processors show that HPG-Methyl significantly outperforms in both execution time and sensitivity state-of-the-art software such as Bismark, BS-Seeker or BSMAP, particularly for long bisulphite reads. Software in the form of C libraries and functions, together with instructions to compile and execute this software. Available by sftp to anonymous@clariano.uv.es (password 'anonymous'). juan.orduna@uv.es or jdopazo@cipf.es. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Bockholt, Henry J.; Scully, Mark; Courtney, William; Rachakonda, Srinivas; Scott, Adam; Caprihan, Arvind; Fries, Jill; Kalyanam, Ravi; Segall, Judith M.; de la Garza, Raul; Lane, Susan; Calhoun, Vince D.
2009-01-01
A neuroinformatics (NI) system is critical to brain imaging research in order to shorten the time between study conception and results. Such a NI system is required to scale well when large numbers of subjects are studied. Further, when multiple sites participate in research projects organizational issues become increasingly difficult. Optimized NI applications mitigate these problems. Additionally, NI software enables coordination across multiple studies, leveraging advantages potentially leading to exponential research discoveries. The web-based, Mind Research Network (MRN), database system has been designed and improved through our experience with 200 research studies and 250 researchers from seven different institutions. The MRN tools permit the collection, management, reporting and efficient use of large scale, heterogeneous data sources, e.g., multiple institutions, multiple principal investigators, multiple research programs and studies, and multimodal acquisitions. We have collected and analyzed data sets on thousands of research participants and have set up a framework to automatically analyze the data, thereby making efficient, practical data mining of this vast resource possible. This paper presents a comprehensive framework for capturing and analyzing heterogeneous neuroscience research data sources that has been fully optimized for end-users to perform novel data mining. PMID:20461147
Zhou, Juntuo; Liu, Huiying; Liu, Yang; Liu, Jia; Zhao, Xuyang; Yin, Yuxin
2016-04-19
Recent advances in mass spectrometers which have yielded higher resolution and faster scanning speeds have expanded their application in metabolomics of diverse diseases. Using a quadrupole-Orbitrap LC-MS system, we developed an efficient large-scale quantitative method targeting 237 metabolites involved in various metabolic pathways using scheduled, parallel reaction monitoring (PRM). We assessed the dynamic range, linearity, reproducibility, and system suitability of the PRM assay by measuring concentration curves, biological samples, and clinical serum samples. The quantification performances of PRM and MS1-based assays in Q-Exactive were compared, and the MRM assay in QTRAP 6500 was also compared. The PRM assay monitoring 237 polar metabolites showed greater reproducibility and quantitative accuracy than MS1-based quantification and also showed greater flexibility in postacquisition assay refinement than the MRM assay in QTRAP 6500. We present a workflow for convenient PRM data processing using Skyline software which is free of charge. In this study we have established a reliable PRM methodology on a quadrupole-Orbitrap platform for evaluation of large-scale targeted metabolomics, which provides a new choice for basic and clinical metabolomics study.
Using Selection Pressure as an Asset to Develop Reusable, Adaptable Software Systems
NASA Astrophysics Data System (ADS)
Berrick, S. W.; Lynnes, C.
2007-12-01
The Goddard Earth Sciences Data and Information Services Center (GES DISC) at NASA has over the years developed and honed a number of reusable architectural components for supporting large-scale data centers with a large customer base. These include a processing system (S4PM) and an archive system (S4PA) based upon a workflow engine called the Simple, Scalable, Script-based Science Processor (S4P); an online data visualization and analysis system (Giovanni); and the radically simple and fast data search tool, Mirador. These subsystems are currently reused internally in a variety of combinations to implement customized data management on behalf of instrument science teams and other science investigators. Some of these subsystems (S4P and S4PM) have also been reused by other data centers for operational science processing. Our experience has been that development and utilization of robust, interoperable, and reusable software systems can actually flourish in environments defined by heterogeneous commodity hardware systems, the emphasis on value-added customer service, and continual cost reduction pressures. The repeated internal reuse that is fostered by such an environment encourages and even forces changes to the software that make it more reusable and adaptable. Allowing and even encouraging such selective pressures to software development has been a key factor in the success of S4P and S4PM, which are now available to the open source community under the NASA Open Source Agreement.
Visualization for Molecular Dynamics Simulation of Gas and Metal Surface Interaction
NASA Astrophysics Data System (ADS)
Puzyrkov, D.; Polyakov, S.; Podryga, V.
2016-02-01
The development of methods, algorithms and applications for visualization of molecular dynamics simulation outputs is discussed. The visual analysis of the results of such calculations is a complex and actual problem especially in case of the large scale simulations. To solve this challenging task it is necessary to decide on: 1) what data parameters to render, 2) what type of visualization to choose, 3) what development tools to use. In the present work an attempt to answer these questions was made. For visualization it was offered to draw particles in the corresponding 3D coordinates and also their velocity vectors, trajectories and volume density in the form of isosurfaces or fog. We tested the way of post-processing and visualization based on the Python language with use of additional libraries. Also parallel software was developed that allows processing large volumes of data in the 3D regions of the examined system. This software gives the opportunity to achieve desired results that are obtained in parallel with the calculations, and at the end to collect discrete received frames into a video file. The software package "Enthought Mayavi2" was used as the tool for visualization. This visualization application gave us the opportunity to study the interaction of a gas with a metal surface and to closely observe the adsorption effect.
NASA Astrophysics Data System (ADS)
Mohan, C.
In this paper, I survey briefly some of the recent and emerging trends in hardware and software features which impact high performance transaction processing and data analytics applications. These features include multicore processor chips, ultra large main memories, flash storage, storage class memories, database appliances, field programmable gate arrays, transactional memory, key-value stores, and cloud computing. While some applications, e.g., Web 2.0 ones, were initially built without traditional transaction processing functionality in mind, slowly system architects and designers are beginning to address such previously ignored issues. The availability, analytics and response time requirements of these applications were initially given more importance than ACID transaction semantics and resource consumption characteristics. A project at IBM Almaden is studying the implications of phase change memory on transaction processing, in the context of a key-value store. Bitemporal data management has also become an important requirement, especially for financial applications. Power consumption and heat dissipation properties are also major considerations in the emergence of modern software and hardware architectural features. Considerations relating to ease of configuration, installation, maintenance and monitoring, and improvement of total cost of ownership have resulted in database appliances becoming very popular. The MapReduce paradigm is now quite popular for large scale data analysis, in spite of the major inefficiencies associated with it.
Daily quality assurance software for a satellite radiometer system
NASA Technical Reports Server (NTRS)
Keegstra, P. B.; Smoot, G. F.; Bennett, C. L.; Aymon, J.; Backus, C.; Deamici, G.; Hinshaw, G.; Jackson, P. D.; Kogut, A.; Lineweaver, C.
1992-01-01
Six Differential Microwave Radiometers (DMR) on COBE (Cosmic Background Explorer) measure the large-angular-scale isotropy of the cosmic microwave background (CMB) at 31.5, 53, and 90 GHz. Quality assurance software analyzes the daily telemetry from the spacecraft to ensure that the instrument is operating correctly and that the data are not corrupted. Quality assurance for DMR poses challenging requirements. The data are differential, so a single bad point can affect a large region of the sky, yet the CMB isotropy requires lengthy integration times (greater than 1 year) to limit potential CMB anisotropies. Celestial sources (with the exception of the moon) are not, in general, visible in the raw differential data. A 'quicklook' software system was developed that, in addition to basic plotting and limit-checking, implements a collection of data tests as well as long-term trending. Some of the key capabilities include the following: (1) stability analysis showing how well the data RMS averages down with increased data; (2) a Fourier analysis and autocorrelation routine to plot the power spectrum and confirm the presence of the 3 mK 'cosmic' dipole signal; (3) binning of the data against basic spacecraft quantities such as orbit angle; (4) long-term trending; and (5) dipole fits to confirm the spacecraft attitude azimuth angle.
Science Gateways, Scientific Workflows and Open Community Software
NASA Astrophysics Data System (ADS)
Pierce, M. E.; Marru, S.
2014-12-01
Science gateways and scientific workflows occupy different ends of the spectrum of user-focused cyberinfrastructure. Gateways, sometimes called science portals, provide a way for enabling large numbers of users to take advantage of advanced computing resources (supercomputers, advanced storage systems, science clouds) by providing Web and desktop interfaces and supporting services. Scientific workflows, at the other end of the spectrum, support advanced usage of cyberinfrastructure that enable "power users" to undertake computational experiments that are not easily done through the usual mechanisms (managing simulations across multiple sites, for example). Despite these different target communities, gateways and workflows share many similarities and can potentially be accommodated by the same software system. For example, pipelines to process InSAR imagery sets or to datamine GPS time series data are workflows. The results and the ability to make downstream products may be made available through a gateway, and power users may want to provide their own custom pipelines. In this abstract, we discuss our efforts to build an open source software system, Apache Airavata, that can accommodate both gateway and workflow use cases. Our approach is general, and we have applied the software to problems in a number of scientific domains. In this talk, we discuss our applications to usage scenarios specific to earth science, focusing on earthquake physics examples drawn from the QuakSim.org and GeoGateway.org efforts. We also examine the role of the Apache Software Foundation's open community model as a way to build up common commmunity codes that do not depend upon a single "owner" to sustain. Pushing beyond open source software, we also see the need to provide gateways and workflow systems as cloud services. These services centralize operations, provide well-defined programming interfaces, scale elastically, and have global-scale fault tolerance. We discuss our work providing Apache Airavata as a hosted service to provide these features.
Khomtchouk, Bohdan B; Van Booven, Derek J; Wahlestedt, Claes
2014-01-01
The graphical visualization of gene expression data using heatmaps has become an integral component of modern-day medical research. Heatmaps are used extensively to plot quantitative differences in gene expression levels, such as those measured with RNAseq and microarray experiments, to provide qualitative large-scale views of the transcriptonomic landscape. Creating high-quality heatmaps is a computationally intensive task, often requiring considerable programming experience, particularly for customizing features to a specific dataset at hand. Software to create publication-quality heatmaps is developed with the R programming language, C++ programming language, and OpenGL application programming interface (API) to create industry-grade high performance graphics. We create a graphical user interface (GUI) software package called HeatmapGenerator for Windows OS and Mac OS X as an intuitive, user-friendly alternative to researchers with minimal prior coding experience to allow them to create publication-quality heatmaps using R graphics without sacrificing their desired level of customization. The simplicity of HeatmapGenerator is that it only requires the user to upload a preformatted input file and download the publicly available R software language, among a few other operating system-specific requirements. Advanced features such as color, text labels, scaling, legend construction, and even database storage can be easily customized with no prior programming knowledge. We provide an intuitive and user-friendly software package, HeatmapGenerator, to create high-quality, customizable heatmaps generated using the high-resolution color graphics capabilities of R. The software is available for Microsoft Windows and Apple Mac OS X. HeatmapGenerator is released under the GNU General Public License and publicly available at: http://sourceforge.net/projects/heatmapgenerator/. The Mac OS X direct download is available at: http://sourceforge.net/projects/heatmapgenerator/files/HeatmapGenerator_MAC_OSX.tar.gz/download. The Windows OS direct download is available at: http://sourceforge.net/projects/heatmapgenerator/files/HeatmapGenerator_WINDOWS.zip/download.
FracPaQ: a MATLAB™ toolbox for the quantification of fracture patterns
NASA Astrophysics Data System (ADS)
Healy, David; Rizzo, Roberto; Farrell, Natalie; Watkins, Hannah; Cornwell, David; Gomez-Rivas, Enrique; Timms, Nick
2017-04-01
The patterns of fractures in deformed rocks are rarely uniform or random. Fracture orientations, sizes, shapes and spatial distributions often exhibit some kind of order. In detail, there may be relationships among the different fracture attributes e.g. small fractures dominated by one orientation, larger fractures by another. These relationships are important because the mechanical (e.g. strength, anisotropy) and transport (e.g. fluids, heat) properties of rock depend on these fracture patterns and fracture attributes. This presentation describes an open source toolbox to quantify fracture patterns, including distributions in fracture attributes and their spatial variation. Software has been developed to quantify fracture patterns from 2-D digital images, such as thin section micrographs, geological maps, outcrop or aerial photographs or satellite images. The toolbox comprises a suite of MATLAB™ scripts based on published quantitative methods for the analysis of fracture attributes: orientations, lengths, intensity, density and connectivity. An estimate of permeability in 2-D is made using a parallel plate model. The software provides an objective and consistent methodology for quantifying fracture patterns and their variations in 2-D across a wide range of length scales. Our current focus for the application of the software is on quantifying crack and fracture patterns in and around fault zones. There is a large body of published work on the quantification of relatively simple joint patterns, but fault zones present a bigger, and arguably more important, challenge. The methods presented are inherently scale independent, and a key task will be to analyse and integrate quantitative fracture pattern data from micro- to macro-scales. New features in this release include multi-scale analyses based on a wavelet method to look for scale transitions, support for multi-colour traces in the input file processed as separate fracture sets, and combining fracture traces from multiple 2-D images to derive the statistically equivalent 3-D fracture pattern expressed as a 2nd rank crack tensor.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Willenbring, James M.; Bartlett, Roscoe Ainsworth; Heroux, Michael Allen
2012-01-01
Software lifecycles are becoming an increasingly important issue for computational science and engineering (CSE) software. The process by which a piece of CSE software begins life as a set of research requirements and then matures into a trusted high-quality capability is both commonplace and extremely challenging. Although an implicit lifecycle is obviously being used in any effort, the challenges of this process - respecting the competing needs of research vs. production - cannot be overstated. Here we describe a proposal for a well-defined software lifecycle process based on modern Lean/Agile software engineering principles. What we propose is appropriate for manymore » CSE software projects that are initially heavily focused on research but also are expected to eventually produce usable high-quality capabilities. The model is related to TriBITS, a build, integration and testing system, which serves as a strong foundation for this lifecycle model, and aspects of this lifecycle model are ingrained in the TriBITS system. Here, we advocate three to four phases or maturity levels that address the appropriate handling of many issues associated with the transition from research to production software. The goals of this lifecycle model are to better communicate maturity levels with customers and to help to identify and promote Software Engineering (SE) practices that will help to improve productivity and produce better software. An important collection of software in this domain is Trilinos, which is used as the motivation and the initial target for this lifecycle model. However, many other related and similar CSE (and non-CSE) software projects can also make good use of this lifecycle model, especially those that use the TriBITS system. Indeed this lifecycle process, if followed, will enable large-scale sustainable integration of many complex CSE software efforts across several institutions.« less
NASA Astrophysics Data System (ADS)
Cui, Tie Jun; Wu, Rui Yuan; Wu, Wei; Shi, Chuan Bo; Li, Yun Bo
2017-10-01
We propose fast and accurate designs to large-scale and low-profile transmission-type anisotropic coding metasurfaces with multiple functions in the millimeter-wave frequencies based on the antenna-array method. The numerical simulation of an anisotropic coding metasurface with the size of 30λ × 30λ by the proposed method takes only 20 min, which however cannot be realized by commercial software due to huge memory usage in personal computers. To inspect the performance of coding metasurfaces in the millimeter-wave band, the working frequency is chosen as 60 GHz. Based on the convolution operations and holographic theory, the proposed multifunctional anisotropic coding metasurface exhibits different effects excited by y-polarized and x-polarized incidences. This study extends the frequency range of coding metasurfaces, filling the gap between microwave and terahertz bands, and implying promising applications in millimeter-wave communication and imaging.
What do the data show? Fostering physical intuition with ClimateBits and NASA Earth Observations
NASA Astrophysics Data System (ADS)
Schollaert Uz, S.; Ward, K.
2017-12-01
Through data visualizations using global satellite imagery available in NASA Earth Observations (NEO), we explain Earth science concepts (e.g. albedo, urban heat island effect, phytoplankton). We also provide examples of ways to explore the satellite data in NEO within a new blog series. This is an ideal tool for scientists and non-scientists alike who want to quickly check satellite imagery for large scale features or patterns. NEO analysis requires no software or plug-ins; only a browser and an internet connection. You can even check imagery and perform simple analyses from your smart phone. NEO can be used to create graphics for presentations and papers or as a first step before acquiring data for more rigorous analysis. NEO has potential application to easily explore large scale environmental and climate patterns that impact operations and infrastructure. This is something we are currently exploring with end user groups.
NASA Technical Reports Server (NTRS)
Venkatachari, Balaji Shankar; Streett, Craig L.; Chang, Chau-Lyan; Friedlander, David J.; Wang, Xiao-Yen; Chang, Sin-Chung
2016-01-01
Despite decades of development of unstructured mesh methods, high-fidelity time-accurate simulations are still predominantly carried out on structured, or unstructured hexahedral meshes by using high-order finite-difference, weighted essentially non-oscillatory (WENO), or hybrid schemes formed by their combinations. In this work, the space-time conservation element solution element (CESE) method is used to simulate several flow problems including supersonic jet/shock interaction and its impact on launch vehicle acoustics, and direct numerical simulations of turbulent flows using tetrahedral meshes. This paper provides a status report for the continuing development of the space-time conservation element solution element (CESE) numerical and software framework under the Revolutionary Computational Aerosciences (RCA) project. Solution accuracy and large-scale parallel performance of the numerical framework is assessed with the goal of providing a viable paradigm for future high-fidelity flow physics simulations.
Measuring Large-Scale Social Networks with High Resolution
Stopczynski, Arkadiusz; Sekara, Vedran; Sapiezynski, Piotr; Cuttone, Andrea; Madsen, Mette My; Larsen, Jakob Eg; Lehmann, Sune
2014-01-01
This paper describes the deployment of a large-scale study designed to measure human interactions across a variety of communication channels, with high temporal resolution and spanning multiple years—the Copenhagen Networks Study. Specifically, we collect data on face-to-face interactions, telecommunication, social networks, location, and background information (personality, demographics, health, politics) for a densely connected population of 1 000 individuals, using state-of-the-art smartphones as social sensors. Here we provide an overview of the related work and describe the motivation and research agenda driving the study. Additionally, the paper details the data-types measured, and the technical infrastructure in terms of both backend and phone software, as well as an outline of the deployment procedures. We document the participant privacy procedures and their underlying principles. The paper is concluded with early results from data analysis, illustrating the importance of multi-channel high-resolution approach to data collection. PMID:24770359
Policy Driven Development: Flexible Policy Insertion for Large Scale Systems.
Demchak, Barry; Krüger, Ingolf
2012-07-01
The success of a software system depends critically on how well it reflects and adapts to stakeholder requirements. Traditional development methods often frustrate stakeholders by creating long latencies between requirement articulation and system deployment, especially in large scale systems. One source of latency is the maintenance of policy decisions encoded directly into system workflows at development time, including those involving access control and feature set selection. We created the Policy Driven Development (PDD) methodology to address these development latencies by enabling the flexible injection of decision points into existing workflows at runtime , thus enabling policy composition that integrates requirements furnished by multiple, oblivious stakeholder groups. Using PDD, we designed and implemented a production cyberinfrastructure that demonstrates policy and workflow injection that quickly implements stakeholder requirements, including features not contemplated in the original system design. PDD provides a path to quickly and cost effectively evolve such applications over a long lifetime.
A large-scale computer facility for computational aerodynamics
NASA Technical Reports Server (NTRS)
Bailey, F. R.; Ballhaus, W. F., Jr.
1985-01-01
As a result of advances related to the combination of computer system technology and numerical modeling, computational aerodynamics has emerged as an essential element in aerospace vehicle design methodology. NASA has, therefore, initiated the Numerical Aerodynamic Simulation (NAS) Program with the objective to provide a basis for further advances in the modeling of aerodynamic flowfields. The Program is concerned with the development of a leading-edge, large-scale computer facility. This facility is to be made available to Government agencies, industry, and universities as a necessary element in ensuring continuing leadership in computational aerodynamics and related disciplines. Attention is given to the requirements for computational aerodynamics, the principal specific goals of the NAS Program, the high-speed processor subsystem, the workstation subsystem, the support processing subsystem, the graphics subsystem, the mass storage subsystem, the long-haul communication subsystem, the high-speed data-network subsystem, and software.
Eissing, Thomas; Kuepfer, Lars; Becker, Corina; Block, Michael; Coboeken, Katrin; Gaub, Thomas; Goerlitz, Linus; Jaeger, Juergen; Loosen, Roland; Ludewig, Bernd; Meyer, Michaela; Niederalt, Christoph; Sevestre, Michael; Siegmund, Hans-Ulrich; Solodenko, Juri; Thelen, Kirstin; Telle, Ulrich; Weiss, Wolfgang; Wendl, Thomas; Willmann, Stefan; Lippert, Joerg
2011-01-01
Today, in silico studies and trial simulations already complement experimental approaches in pharmaceutical R&D and have become indispensable tools for decision making and communication with regulatory agencies. While biology is multiscale by nature, project work, and software tools usually focus on isolated aspects of drug action, such as pharmacokinetics at the organism scale or pharmacodynamic interaction on the molecular level. We present a modeling and simulation software platform consisting of PK-Sim® and MoBi® capable of building and simulating models that integrate across biological scales. A prototypical multiscale model for the progression of a pancreatic tumor and its response to pharmacotherapy is constructed and virtual patients are treated with a prodrug activated by hepatic metabolization. Tumor growth is driven by signal transduction leading to cell cycle transition and proliferation. Free tumor concentrations of the active metabolite inhibit Raf kinase in the signaling cascade and thereby cell cycle progression. In a virtual clinical study, the individual therapeutic outcome of the chemotherapeutic intervention is simulated for a large population with heterogeneous genomic background. Thereby, the platform allows efficient model building and integration of biological knowledge and prior data from all biological scales. Experimental in vitro model systems can be linked with observations in animal experiments and clinical trials. The interplay between patients, diseases, and drugs and topics with high clinical relevance such as the role of pharmacogenomics, drug–drug, or drug–metabolite interactions can be addressed using this mechanistic, insight driven multiscale modeling approach. PMID:21483730
Scalability and Validation of Big Data Bioinformatics Software.
Yang, Andrian; Troup, Michael; Ho, Joshua W K
2017-01-01
This review examines two important aspects that are central to modern big data bioinformatics analysis - software scalability and validity. We argue that not only are the issues of scalability and validation common to all big data bioinformatics analyses, they can be tackled by conceptually related methodological approaches, namely divide-and-conquer (scalability) and multiple executions (validation). Scalability is defined as the ability for a program to scale based on workload. It has always been an important consideration when developing bioinformatics algorithms and programs. Nonetheless the surge of volume and variety of biological and biomedical data has posed new challenges. We discuss how modern cloud computing and big data programming frameworks such as MapReduce and Spark are being used to effectively implement divide-and-conquer in a distributed computing environment. Validation of software is another important issue in big data bioinformatics that is often ignored. Software validation is the process of determining whether the program under test fulfils the task for which it was designed. Determining the correctness of the computational output of big data bioinformatics software is especially difficult due to the large input space and complex algorithms involved. We discuss how state-of-the-art software testing techniques that are based on the idea of multiple executions, such as metamorphic testing, can be used to implement an effective bioinformatics quality assurance strategy. We hope this review will raise awareness of these critical issues in bioinformatics.
Precision measurements from very-large scale aerial digital imagery.
Booth, D Terrance; Cox, Samuel E; Berryman, Robert D
2006-01-01
Managers need measurements and resource managers need the length/width of a variety of items including that of animals, logs, streams, plant canopies, man-made objects, riparian habitat, vegetation patches and other things important in resource monitoring and land inspection. These types of measurements can now be easily and accurately obtained from very large scale aerial (VLSA) imagery having spatial resolutions as fine as 1 millimeter per pixel by using the three new software programs described here. VLSA images have small fields of view and are used for intermittent sampling across extensive landscapes. Pixel-coverage among images is influenced by small changes in airplane altitude above ground level (AGL) and orientation relative to the ground, as well as by changes in topography. These factors affect the object-to-camera distance used for image-resolution calculations. 'ImageMeasurement' offers a user-friendly interface for accounting for pixel-coverage variation among images by utilizing a database. 'LaserLOG' records and displays airplane altitude AGL measured from a high frequency laser rangefinder, and displays the vertical velocity. 'Merge' sorts through large amounts of data generated by LaserLOG and matches precise airplane altitudes with camera trigger times for input to the ImageMeasurement database. We discuss application of these tools, including error estimates. We found measurements from aerial images (collection resolution: 5-26 mm/pixel as projected on the ground) using ImageMeasurement, LaserLOG, and Merge, were accurate to centimeters with an error less than 10%. We recommend these software packages as a means for expanding the utility of aerial image data.
Efficient analysis of large-scale genome-wide data with two R packages: bigstatsr and bigsnpr.
Privé, Florian; Aschard, Hugues; Ziyatdinov, Andrey; Blum, Michael G B
2017-03-30
Genome-wide datasets produced for association studies have dramatically increased in size over the past few years, with modern datasets commonly including millions of variants measured in dozens of thousands of individuals. This increase in data size is a major challenge severely slowing down genomic analyses, leading to some software becoming obsolete and researchers having limited access to diverse analysis tools. Here we present two R packages, bigstatsr and bigsnpr, allowing for the analysis of large scale genomic data to be performed within R. To address large data size, the packages use memory-mapping for accessing data matrices stored on disk instead of in RAM. To perform data pre-processing and data analysis, the packages integrate most of the tools that are commonly used, either through transparent system calls to existing software, or through updated or improved implementation of existing methods. In particular, the packages implement fast and accurate computations of principal component analysis and association studies, functions to remove SNPs in linkage disequilibrium and algorithms to learn polygenic risk scores on millions of SNPs. We illustrate applications of the two R packages by analyzing a case-control genomic dataset for celiac disease, performing an association study and computing Polygenic Risk Scores. Finally, we demonstrate the scalability of the R packages by analyzing a simulated genome-wide dataset including 500,000 individuals and 1 million markers on a single desktop computer. https://privefl.github.io/bigstatsr/ & https://privefl.github.io/bigsnpr/. florian.prive@univ-grenoble-alpes.fr & michael.blum@univ-grenoble-alpes.fr. Supplementary materials are available at Bioinformatics online.
Responding to the Event Deluge
NASA Technical Reports Server (NTRS)
Williams, Roy D.; Barthelmy, Scott D.; Denny, Robert B.; Graham, Matthew J.; Swinbank, John
2012-01-01
We present the VOEventNet infrastructure for large-scale rapid follow-up of astronomical events, including selection, annotation, machine intelligence, and coordination of observations. The VOEvent.standard is central to this vision, with distributed and replicated services rather than centralized facilities. We also describe some of the event brokers, services, and software that .are connected to the network. These technologies will become more important in the coming years, with new event streams from Gaia, LOF AR, LIGO, LSST, and many others
2013-11-01
big data with R is relatively new. RHadoop is a mature product from Revolution Analytics that uses R with Hadoop Streaming [15] and provides...agnostic all- data summaries or computations, in which case we use MapReduce directly. 2.3 D&R Software Environment In this work, we use the Hadoop ...job scheduling and tracking, data distribu- tion, system architecture, heterogeneity, and fault-tolerance. Hadoop also provides a distributed key-value
Staghorn: An Automated Large-Scale Distributed System Analysis Platform
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gabert, Kasimir; Burns, Ian; Elliott, Steven
2016-09-01
Conducting experiments on large-scale distributed computing systems is becoming significantly easier with the assistance of emulation. Researchers can now create a model of a distributed computing environment and then generate a virtual, laboratory copy of the entire system composed of potentially thousands of virtual machines, switches, and software. The use of real software, running at clock rate in full virtual machines, allows experiments to produce meaningful results without necessitating a full understanding of all model components. However, the ability to inspect and modify elements within these models is bound by the limitation that such modifications must compete with the model,more » either running in or alongside it. This inhibits entire classes of analyses from being conducted upon these models. We developed a mechanism to snapshot an entire emulation-based model as it is running. This allows us to \\freeze time" and subsequently fork execution, replay execution, modify arbitrary parts of the model, or deeply explore the model. This snapshot includes capturing packets in transit and other input/output state along with the running virtual machines. We were able to build this system in Linux using Open vSwitch and Kernel Virtual Machines on top of Sandia's emulation platform Firewheel. This primitive opens the door to numerous subsequent analyses on models, including state space exploration, debugging distributed systems, performance optimizations, improved training environments, and improved experiment repeatability.« less
Griffin, Kingsley J; Hedge, Luke H; González-Rivero, Manuel; Hoegh-Guldberg, Ove I; Johnston, Emma L
2017-07-01
Historically, marine ecologists have lacked efficient tools that are capable of capturing detailed species distribution data over large areas. Emerging technologies such as high-resolution imaging and associated machine-learning image-scoring software are providing new tools to map species over large areas in the ocean. Here, we combine a novel diver propulsion vehicle (DPV) imaging system with free-to-use machine-learning software to semi-automatically generate dense and widespread abundance records of a habitat-forming algae over ~5,000 m 2 of temperate reef. We employ replicable spatial techniques to test the effectiveness of traditional diver-based sampling, and better understand the distribution and spatial arrangement of one key algal species. We found that the effectiveness of a traditional survey depended on the level of spatial structuring, and generally 10-20 transects (50 × 1 m) were required to obtain reliable results. This represents 2-20 times greater replication than have been collected in previous studies. Furthermore, we demonstrate the usefulness of fine-resolution distribution modeling for understanding patterns in canopy algae cover at multiple spatial scales, and discuss applications to other marine habitats. Our analyses demonstrate that semi-automated methods of data gathering and processing provide more accurate results than traditional methods for describing habitat structure at seascape scales, and therefore represent vastly improved techniques for understanding and managing marine seascapes.
Implementation of audio computer-assisted interviewing software in HIV/AIDS research.
Pluhar, Erika; McDonnell Holstad, Marcia; Yeager, Katherine A; Denzmore-Nwagbara, Pamela; Corkran, Carol; Fielder, Bridget; McCarty, Frances; Diiorio, Colleen
2007-01-01
Computer-assisted interviewing (CAI) has begun to play a more prominent role in HIV/AIDS prevention research. Despite the increased popularity of CAI, particularly audio computer-assisted self-interviewing (ACASI), some research teams are still reluctant to implement ACASI technology because of lack of familiarity with the practical issues related to using these software packages. The purpose of this report is to describe the implementation of one particular ACASI software package, the Questionnaire Development System (QDS; Nova Research Company, Bethesda, MD), in several nursing and HIV/AIDS prevention research settings. The authors present acceptability and satisfaction data from two large-scale public health studies in which they have used QDS with diverse populations. They also address issues related to developing and programming a questionnaire; discuss practical strategies related to planning for and implementing ACASI in the field, including selecting equipment, training staff, and collecting and transferring data; and summarize advantages and disadvantages of computer-assisted research methods.
Shen, Yiwen; Hattink, Maarten; Samadi, Payman; ...
2018-04-13
Silicon photonics based switches offer an effective option for the delivery of dynamic bandwidth for future large-scale Datacom systems while maintaining scalable energy efficiency. The integration of a silicon photonics-based optical switching fabric within electronic Datacom architectures requires novel network topologies and arbitration strategies to effectively manage the active elements in the network. Here, we present a scalable software-defined networking control plane to integrate silicon photonic based switches with conventional Ethernet or InfiniBand networks. Our software-defined control plane manages both electronic packet switches and multiple silicon photonic switches for simultaneous packet and circuit switching. We built an experimental Dragonfly networkmore » testbed with 16 electronic packet switches and 2 silicon photonic switches to evaluate our control plane. Observed latencies occupied by each step of the switching procedure demonstrate a total of 344 microsecond control plane latency for data-center and high performance computing platforms.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shen, Yiwen; Hattink, Maarten; Samadi, Payman
Silicon photonics based switches offer an effective option for the delivery of dynamic bandwidth for future large-scale Datacom systems while maintaining scalable energy efficiency. The integration of a silicon photonics-based optical switching fabric within electronic Datacom architectures requires novel network topologies and arbitration strategies to effectively manage the active elements in the network. Here, we present a scalable software-defined networking control plane to integrate silicon photonic based switches with conventional Ethernet or InfiniBand networks. Our software-defined control plane manages both electronic packet switches and multiple silicon photonic switches for simultaneous packet and circuit switching. We built an experimental Dragonfly networkmore » testbed with 16 electronic packet switches and 2 silicon photonic switches to evaluate our control plane. Observed latencies occupied by each step of the switching procedure demonstrate a total of 344 microsecond control plane latency for data-center and high performance computing platforms.« less
Comparison of Fiber Optic Strain Demodulation Implementations
NASA Technical Reports Server (NTRS)
Quach, Cuong C.; Vazquez, Sixto L.
2005-01-01
NASA Langley Research Center is developing instrumentation based upon principles of Optical Frequency-Domain Reflectometry (OFDR) for the provision of large-scale, dense distribution of strain sensors using fiber optics embedded with Bragg gratings. Fiber Optic Bragg Grating technology enables the distribution of thousands of sensors immune to moisture and electromagnetic interference with negligible weight penalty. At Langley, this technology provides a key component for research and development relevant to comprehensive aerospace vehicle structural health monitoring. A prototype system is under development that includes hardware and software necessary for the acquisition of data from an optical network and conversion of the data into strain measurements. This report documents the steps taken to verify the software that implements the algorithm for calculating the fiber strain. Brief descriptions of the strain measurement system and the test article are given. The scope of this report is the verification of software implementations as compared to a reference model. The algorithm will be detailed along with comparison results.
NASA Technical Reports Server (NTRS)
Madrid, G. A.; Westmoreland, P. T.
1983-01-01
A progress report is presented on a program to upgrade the existing NASA Deep Space Network in terms of a redesigned computer-controlled data acquisition system for channelling tracking, telemetry, and command data between a California-based control center and three signal processing centers in Australia, California, and Spain. The methodology for the improvements is oriented towards single subsystem development with consideration for a multi-system and multi-subsystem network of operational software. Details of the existing hardware configurations and data transmission links are provided. The program methodology includes data flow design, interface design and coordination, incremental capability availability, increased inter-subsystem developmental synthesis and testing, system and network level synthesis and testing, and system verification and validation. The software has been implemented thus far to a 65 percent completion level, and the methodology being used to effect the changes, which will permit enhanced tracking and communication with spacecraft, has been concluded to feature effective techniques.
Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context
Faith, Jeremiah J; Olson, Andrew J; Gardner, Timothy S; Sachidanandam, Ravi
2007-01-01
Background Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aide to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. Results lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. Conclusion lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired. PMID:17877794
Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context.
Faith, Jeremiah J; Olson, Andrew J; Gardner, Timothy S; Sachidanandam, Ravi
2007-09-18
Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aide to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired.
Visualization in aerospace research with a large wall display system
NASA Astrophysics Data System (ADS)
Matsuo, Yuichi
2002-05-01
National Aerospace Laboratory of Japan has built a large- scale visualization system with a large wall-type display. The system has been operational since April 2001 and comprises a 4.6x1.5-meter (15x5-foot) rear projection screen with 3 BARCO 812 high-resolution CRT projectors. The reason we adopted the 3-gun CRT projectors is support for stereoscopic viewing, ease with color/luminosity matching and accuracy of edge-blending. The system is driven by a new SGI Onyx 3400 server of distributed shared-memory architecture with 32 CPUs, 64Gbytes memory, 1.5TBytes FC RAID disk and 6 IR3 graphics pipelines. Software is another important issue for us to make full use of the system. We have introduced some applications available in a multi- projector environment such as AVS/MPE, EnSight Gold and COVISE, and been developing some software tools that create volumetric images with using SGI graphics libraries. The system is mainly used for visualization fo computational fluid dynamics (CFD) simulation sin aerospace research. Visualized CFD results are of our help for designing an improved configuration of aerospace vehicles and analyzing their aerodynamic performances. These days we also use it for various collaborations among researchers.
The Hyper Suprime-Cam software pipeline
NASA Astrophysics Data System (ADS)
Bosch, James; Armstrong, Robert; Bickerton, Steven; Furusawa, Hisanori; Ikeda, Hiroyuki; Koike, Michitaro; Lupton, Robert; Mineo, Sogo; Price, Paul; Takata, Tadafumi; Tanaka, Masayuki; Yasuda, Naoki; AlSayyad, Yusra; Becker, Andrew C.; Coulton, William; Coupon, Jean; Garmilla, Jose; Huang, Song; Krughoff, K. Simon; Lang, Dustin; Leauthaud, Alexie; Lim, Kian-Tat; Lust, Nate B.; MacArthur, Lauren A.; Mandelbaum, Rachel; Miyatake, Hironao; Miyazaki, Satoshi; Murata, Ryoma; More, Surhud; Okura, Yuki; Owen, Russell; Swinbank, John D.; Strauss, Michael A.; Yamada, Yoshihiko; Yamanoi, Hitomi
2018-01-01
In this paper, we describe the optical imaging data processing pipeline developed for the Subaru Telescope's Hyper Suprime-Cam (HSC) instrument. The HSC Pipeline builds on the prototype pipeline being developed by the Large Synoptic Survey Telescope's Data Management system, adding customizations for HSC, large-scale processing capabilities, and novel algorithms that have since been reincorporated into the LSST codebase. While designed primarily to reduce HSC Subaru Strategic Program (SSP) data, it is also the recommended pipeline for reducing general-observer HSC data. The HSC pipeline includes high-level processing steps that generate coadded images and science-ready catalogs as well as low-level detrending and image characterizations.
Review of software tools for design and analysis of large scale MRM proteomic datasets.
Colangelo, Christopher M; Chung, Lisa; Bruce, Can; Cheung, Kei-Hoi
2013-06-15
Selective or Multiple Reaction monitoring (SRM/MRM) is a liquid-chromatography (LC)/tandem-mass spectrometry (MS/MS) method that enables the quantitation of specific proteins in a sample by analyzing precursor ions and the fragment ions of their selected tryptic peptides. Instrumentation software has advanced to the point that thousands of transitions (pairs of primary and secondary m/z values) can be measured in a triple quadrupole instrument coupled to an LC, by a well-designed scheduling and selection of m/z windows. The design of a good MRM assay relies on the availability of peptide spectra from previous discovery-phase LC-MS/MS studies. The tedious aspect of manually developing and processing MRM assays involving thousands of transitions has spurred to development of software tools to automate this process. Software packages have been developed for project management, assay development, assay validation, data export, peak integration, quality assessment, and biostatistical analysis. No single tool provides a complete end-to-end solution, thus this article reviews the current state and discusses future directions of these software tools in order to enable researchers to combine these tools for a comprehensive targeted proteomics workflow. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.
NASA Technical Reports Server (NTRS)
Talbot, Bryan; Zhou, Shu-Jia; Higgins, Glenn
2002-01-01
One of the most significant challenges in large-scale climate modeling, as well as in high-performance computing in other scientific fields, is that of effectively integrating many software models from multiple contributors. A software framework facilitates the integration task. both in the development and runtime stages of the simulation. Effective software frameworks reduce the programming burden for the investigators, freeing them to focus more on the science and less on the parallel communication implementation, while maintaining high performance across numerous supercomputer and workstation architectures. This document proposes a strawman framework design for the climate community based on the integration of Cactus, from the relativistic physics community, and UCLA/UCB Distributed Data Broker (DDB) from the climate community. This design is the result of an extensive survey of climate models and frameworks in the climate community as well as frameworks from many other scientific communities. The design addresses fundamental development and runtime needs using Cactus, a framework with interfaces for FORTRAN and C-based languages, and high-performance model communication needs using DDB. This document also specifically explores object-oriented design issues in the context of climate modeling as well as climate modeling issues in terms of object-oriented design.
Mercury⊕: An evidential reasoning image classifier
NASA Astrophysics Data System (ADS)
Peddle, Derek R.
1995-12-01
MERCURY⊕ is a multisource evidential reasoning classification software system based on the Dempster-Shafer theory of evidence. The design and implementation of this software package is described for improving the classification and analysis of multisource digital image data necessary for addressing advanced environmental and geoscience applications. In the remote-sensing context, the approach provides a more appropriate framework for classifying modern, multisource, and ancillary data sets which may contain a large number of disparate variables with different statistical properties, scales of measurement, and levels of error which cannot be handled using conventional Bayesian approaches. The software uses a nonparametric, supervised approach to classification, and provides a more objective and flexible interface to the evidential reasoning framework using a frequency-based method for computing support values from training data. The MERCURY⊕ software package has been implemented efficiently in the C programming language, with extensive use made of dynamic memory allocation procedures and compound linked list and hash-table data structures to optimize the storage and retrieval of evidence in a Knowledge Look-up Table. The software is complete with a full user interface and runs under Unix, Ultrix, VAX/VMS, MS-DOS, and Apple Macintosh operating system. An example of classifying alpine land cover and permafrost active layer depth in northern Canada is presented to illustrate the use and application of these ideas.
CARGO: effective format-free compressed storage of genomic information
Roguski, Łukasz; Ribeca, Paolo
2016-01-01
The recent super-exponential growth in the amount of sequencing data generated worldwide has put techniques for compressed storage into the focus. Most available solutions, however, are strictly tied to specific bioinformatics formats, sometimes inheriting from them suboptimal design choices; this hinders flexible and effective data sharing. Here, we present CARGO (Compressed ARchiving for GenOmics), a high-level framework to automatically generate software systems optimized for the compressed storage of arbitrary types of large genomic data collections. Straightforward applications of our approach to FASTQ and SAM archives require a few lines of code, produce solutions that match and sometimes outperform specialized format-tailored compressors and scale well to multi-TB datasets. All CARGO software components can be freely downloaded for academic and non-commercial use from http://bio-cargo.sourceforge.net. PMID:27131376
SIERRA Low Mach Module: Fuego User Manual Version 4.46.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sierra Thermal/Fluid Team
2017-09-01
The SIERRA Low Mach Module: Fuego along with the SIERRA Participating Media Radiation Module: Syrinx, henceforth referred to as Fuego and Syrinx, respectively, are the key elements of the ASCI fire environment simulation project. The fire environment simulation project is directed at characterizing both open large-scale pool fires and building enclosure fires. Fuego represents the turbulent, buoyantly-driven incompressible flow, heat transfer, mass transfer, combustion, soot, and absorption coefficient model portion of the simulation software. Syrinx represents the participating-media thermal radiation mechanics. This project is an integral part of the SIERRA multi-mechanics software development project. Fuego depends heavily upon the coremore » architecture developments provided by SIERRA for massively parallel computing, solution adaptivity, and mechanics coupling on unstructured grids.« less
SIERRA Low Mach Module: Fuego Theory Manual Version 4.44
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sierra Thermal /Fluid Team
2017-04-01
The SIERRA Low Mach Module: Fuego along with the SIERRA Participating Media Radiation Module: Syrinx, henceforth referred to as Fuego and Syrinx, respectively, are the key elements of the ASCI fire environment simulation project. The fire environment simulation project is directed at characterizing both open large-scale pool fires and building enclosure fires. Fuego represents the turbulent, buoyantly-driven incompressible flow, heat transfer, mass transfer, combustion, soot, and absorption coefficient model portion of the simulation software. Syrinx represents the participating-media thermal radiation mechanics. This project is an integral part of the SIERRA multi-mechanics software development project. Fuego depends heavily upon the coremore » architecture developments provided by SIERRA for massively parallel computing, solution adaptivity, and mechanics coupling on unstructured grids.« less
SIERRA Low Mach Module: Fuego Theory Manual Version 4.46.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sierra Thermal/Fluid Team
The SIERRA Low Mach Module: Fuego along with the SIERRA Participating Media Radiation Module: Syrinx, henceforth referred to as Fuego and Syrinx, respectively, are the key elements of the ASCI fire environment simulation project. The fire environment simulation project is directed at characterizing both open large-scale pool fires and building enclosure fires. Fuego represents the turbulent, buoyantly-driven incompressible flow, heat transfer, mass transfer, combustion, soot, and absorption coefficient model portion of the simulation software. Syrinx represents the participating-media thermal radiation mechanics. This project is an integral part of the SIERRA multi-mechanics software development project. Fuego depends heavily upon the coremore » architecture developments provided by SIERRA for massively parallel computing, solution adaptivity, and mechanics coupling on unstructured grids.« less
Mendel-GPU: haplotyping and genotype imputation on graphics processing units
Chen, Gary K.; Wang, Kai; Stram, Alex H.; Sobel, Eric M.; Lange, Kenneth
2012-01-01
Motivation: In modern sequencing studies, one can improve the confidence of genotype calls by phasing haplotypes using information from an external reference panel of fully typed unrelated individuals. However, the computational demands are so high that they prohibit researchers with limited computational resources from haplotyping large-scale sequence data. Results: Our graphics processing unit based software delivers haplotyping and imputation accuracies comparable to competing programs at a fraction of the computational cost and peak memory demand. Availability: Mendel-GPU, our OpenCL software, runs on Linux platforms and is portable across AMD and nVidia GPUs. Users can download both code and documentation at http://code.google.com/p/mendel-gpu/. Contact: gary.k.chen@usc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22954633
Cloud4Psi: cloud computing for 3D protein structure similarity searching.
Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Kłapciński, Artur
2014-10-01
Popular methods for 3D protein structure similarity searching, especially those that generate high-quality alignments such as Combinatorial Extension (CE) and Flexible structure Alignment by Chaining Aligned fragment pairs allowing Twists (FATCAT) are still time consuming. As a consequence, performing similarity searching against large repositories of structural data requires increased computational resources that are not always available. Cloud computing provides huge amounts of computational power that can be provisioned on a pay-as-you-go basis. We have developed the cloud-based system that allows scaling of the similarity searching process vertically and horizontally. Cloud4Psi (Cloud for Protein Similarity) was tested in the Microsoft Azure cloud environment and provided good, almost linearly proportional acceleration when scaled out onto many computational units. Cloud4Psi is available as Software as a Service for testing purposes at: http://cloud4psi.cloudapp.net/. For source code and software availability, please visit the Cloud4Psi project home page at http://zti.polsl.pl/dmrozek/science/cloud4psi.htm. © The Author 2014. Published by Oxford University Press.
NASA Astrophysics Data System (ADS)
Zhao, Ziyue; Gan, Xiaochuan; Zou, Zhi; Ma, Liqun
2018-01-01
The dynamic envelope measurement plays very important role in the external dimension design for high-speed train. Recently there is no digital measurement system to solve this problem. This paper develops an optoelectronic measurement system by using monocular digital camera, and presents the research of measurement theory, visual target design, calibration algorithm design, software programming and so on. This system consists of several CMOS digital cameras, several luminous targets for measuring, a scale bar, data processing software and a terminal computer. The system has such advantages as large measurement scale, high degree of automation, strong anti-interference ability, noise rejection and real-time measurement. In this paper, we resolve the key technology such as the transformation, storage and calculation of multiple cameras' high resolution digital image. The experimental data show that the repeatability of the system is within 0.02mm and the distance error of the system is within 0.12mm in the whole workspace. This experiment has verified the rationality of the system scheme, the correctness, the precision and effectiveness of the relevant methods.
Final Technical Report: Quantification of Uncertainty in Extreme Scale Computations (QUEST)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Knio, Omar M.
QUEST is a SciDAC Institute comprising Sandia National Laboratories, Los Alamos National Laboratory, University of Southern California, Massachusetts Institute of Technology, University of Texas at Austin, and Duke University. The mission of QUEST is to: (1) develop a broad class of uncertainty quantification (UQ) methods/tools, and (2) provide UQ expertise and software to other SciDAC projects, thereby enabling/guiding their UQ activities. The Duke effort focused on the development of algorithms and utility software for non-intrusive sparse UQ representations, and on participation in the organization of annual workshops and tutorials to disseminate UQ tools to the community, and to gather inputmore » in order to adapt approaches to the needs of SciDAC customers. In particular, fundamental developments were made in (a) multiscale stochastic preconditioners, (b) gradient-based approaches to inverse problems, (c) adaptive pseudo-spectral approximations, (d) stochastic limit cycles, and (e) sensitivity analysis tools for noisy systems. In addition, large-scale demonstrations were performed, namely in the context of ocean general circulation models.« less
THE VIRTUAL INSTRUMENT: SUPPORT FOR GRID-ENABLED MCELL SIMULATIONS
Casanova, Henri; Berman, Francine; Bartol, Thomas; Gokcay, Erhan; Sejnowski, Terry; Birnbaum, Adam; Dongarra, Jack; Miller, Michelle; Ellisman, Mark; Faerman, Marcio; Obertelli, Graziano; Wolski, Rich; Pomerantz, Stuart; Stiles, Joel
2010-01-01
Ensembles of widely distributed, heterogeneous resources, or Grids, have emerged as popular platforms for large-scale scientific applications. In this paper we present the Virtual Instrument project, which provides an integrated application execution environment that enables end-users to run and interact with running scientific simulations on Grids. This work is performed in the specific context of MCell, a computational biology application. While MCell provides the basis for running simulations, its capabilities are currently limited in terms of scale, ease-of-use, and interactivity. These limitations preclude usage scenarios that are critical for scientific advances. Our goal is to create a scientific “Virtual Instrument” from MCell by allowing its users to transparently access Grid resources while being able to steer running simulations. In this paper, we motivate the Virtual Instrument project and discuss a number of relevant issues and accomplishments in the area of Grid software development and application scheduling. We then describe our software design and report on the current implementation. We verify and evaluate our design via experiments with MCell on a real-world Grid testbed. PMID:20689618
Cloud4Psi: cloud computing for 3D protein structure similarity searching
Mrozek, Dariusz; Małysiak-Mrozek, Bożena; Kłapciński, Artur
2014-01-01
Summary: Popular methods for 3D protein structure similarity searching, especially those that generate high-quality alignments such as Combinatorial Extension (CE) and Flexible structure Alignment by Chaining Aligned fragment pairs allowing Twists (FATCAT) are still time consuming. As a consequence, performing similarity searching against large repositories of structural data requires increased computational resources that are not always available. Cloud computing provides huge amounts of computational power that can be provisioned on a pay-as-you-go basis. We have developed the cloud-based system that allows scaling of the similarity searching process vertically and horizontally. Cloud4Psi (Cloud for Protein Similarity) was tested in the Microsoft Azure cloud environment and provided good, almost linearly proportional acceleration when scaled out onto many computational units. Availability and implementation: Cloud4Psi is available as Software as a Service for testing purposes at: http://cloud4psi.cloudapp.net/. For source code and software availability, please visit the Cloud4Psi project home page at http://zti.polsl.pl/dmrozek/science/cloud4psi.htm. Contact: dariusz.mrozek@polsl.pl PMID:24930141
Integration experiences and performance studies of A COTS parallel archive systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Hsing-bung; Scott, Cody; Grider, Bary
2010-01-01
Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interface, and (f) utilize commercial-off-the-shelf(COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds such as more caching and lessmore » robust semantics. Currently the number of extreme highly scalable parallel archive solutions is very small especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, ls, tar, etc. We have successfully applied our working COTS Parallel Archive System to the current world's first petaflop/s computing system, LANL's Roadrunner, and demonstrated its capability to address requirements of future archival storage systems.« less
Integration experiments and performance studies of a COTS parallel archive system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Hsing-bung; Scott, Cody; Grider, Gary
2010-06-16
Current and future Archive Storage Systems have been asked to (a) scale to very high bandwidths, (b) scale in metadata performance, (c) support policy-based hierarchical storage management capability, (d) scale in supporting changing needs of very large data sets, (e) support standard interface, and (f) utilize commercial-off-the-shelf (COTS) hardware. Parallel file systems have been asked to do the same thing but at one or more orders of magnitude faster in performance. Archive systems continue to move closer to file systems in their design due to the need for speed and bandwidth, especially metadata searching speeds such as more caching andmore » less robust semantics. Currently the number of extreme highly scalable parallel archive solutions is very small especially those that will move a single large striped parallel disk file onto many tapes in parallel. We believe that a hybrid storage approach of using COTS components and innovative software technology can bring new capabilities into a production environment for the HPC community much faster than the approach of creating and maintaining a complete end-to-end unique parallel archive software solution. In this paper, we relay our experience of integrating a global parallel file system and a standard backup/archive product with a very small amount of additional code to provide a scalable, parallel archive. Our solution has a high degree of overlap with current parallel archive products including (a) doing parallel movement to/from tape for a single large parallel file, (b) hierarchical storage management, (c) ILM features, (d) high volume (non-single parallel file) archives for backup/archive/content management, and (e) leveraging all free file movement tools in Linux such as copy, move, Is, tar, etc. We have successfully applied our working COTS Parallel Archive System to the current world's first petafiop/s computing system, LANL's Roadrunner machine, and demonstrated its capability to address requirements of future archival storage systems.« less
NASA Astrophysics Data System (ADS)
Darch, Peter T.; Sands, Ashley E.
2016-06-01
Sky surveys, such as the Sloan Digital Sky Survey (SDSS) and the Large Synoptic Survey Telescope (LSST), generate data on an unprecedented scale. While many scientific projects span a few years from conception to completion, sky surveys are typically on the scale of decades. This paper focuses on critical challenges arising from long timescales, and how sky surveys address these challenges.We present findings from a study of LSST, comprising interviews (n=58) and observation. Conceived in the 1990s, the LSST Corporation was formed in 2003, and construction began in 2014. LSST will commence data collection operations in 2022 for ten years.One challenge arising from this long timescale is uncertainty about future needs of the astronomers who will use these data many years hence. Sources of uncertainty include scientific questions to be posed, astronomical phenomena to be studied, and tools and practices these astronomers will have at their disposal. These uncertainties are magnified by the rapid technological and scientific developments anticipated between now and the start of LSST operations.LSST is implementing a range of strategies to address these challenges. Some strategies involve delaying resolution of uncertainty, placing this resolution in the hands of future data users. Other strategies aim to reduce uncertainty by shaping astronomers’ data analysis practices so that these practices will integrate well with LSST once operations begin.One approach that exemplifies both types of strategy is the decision to make LSST data management software open source, even now as it is being developed. This policy will enable future data users to adapt this software to evolving needs. In addition, LSST intends for astronomers to start using this software well in advance of 2022, thereby embedding LSST software and data analysis approaches in the practices of astronomers.These findings strengthen arguments for making the software supporting sky surveys available as open source. Such arguments usually focus on reuse potential of software, and enhancing replicability of analyses. In this case, however, open source software also promises to mitigate the critical challenge of anticipating the needs of future data users.
NASA Astrophysics Data System (ADS)
Celicourt, P.; Sam, R.; Piasecki, M.
2016-12-01
Global phenomena such as climate change and large scale environmental degradation require the collection of accurate environmental data at detailed spatial and temporal scales from which knowledge and actionable insights can be derived using data science methods. Despite significant advances in sensor network technologies, sensors and sensor network deployment remains a labor-intensive, time consuming, cumbersome and expensive task. These factors demonstrate why environmental data collection remains a challenge especially in developing countries where technical infrastructure, expertise and pecuniary resources are scarce. In addition, they also demonstrate the reason why dense and long-term environmental data collection has been historically quite difficult. Moreover, hydrometeorological data collection efforts usually overlook the (critically important) inclusion of a standards-based system for storing, managing, organizing, indexing, documenting and sharing sensor data. We are developing a cross-platform software framework using the Python programming language that will allow us to develop a low cost end-to-end (from sensor to publication) system for hydrometeorological conditions monitoring. The software framework contains provision for sensor, sensor platforms, calibration and network protocols description, sensor programming, data storage, data publication and visualization and more importantly data retrieval in a desired unit system. It is being tested on the Raspberry Pi microcomputer as end node and a laptop PC as the base station in a wireless setting.
RevEcoR: an R package for the reverse ecology analysis of microbiomes.
Cao, Yang; Wang, Yuanyuan; Zheng, Xiaofei; Li, Fei; Bo, Xiaochen
2016-07-29
All species live in complex ecosystems. The structure and complexity of a microbial community reflects not only diversity and function, but also the environment in which it occurs. However, traditional ecological methods can only be applied on a small scale and for relatively well-understood biological systems. Recently, a graph-theory-based algorithm called the reverse ecology approach has been developed that can analyze the metabolic networks of all the species in a microbial community, and predict the metabolic interface between species and their environment. Here, we present RevEcoR, an R package and a Shiny Web application that implements the reverse ecology algorithm for determining microbe-microbe interactions in microbial communities. This software allows users to obtain large-scale ecological insights into species' ecology directly from high-throughput metagenomic data. The software has great potential for facilitating the study of microbiomes. RevEcoR is open source software for the study of microbial community ecology. The RevEcoR R package is freely available under the GNU General Public License v. 2.0 at http://cran.r-project.org/web/packages/RevEcoR/ with the vignette and typical usage examples, and the interactive Shiny web application is available at http://yiluheihei.shinyapps.io/shiny-RevEcoR , or can be installed locally with the source code accessed from https://github.com/yiluheihei/shiny-RevEcoR .
Software technology insertion: A study of success factors
NASA Technical Reports Server (NTRS)
Lydon, Tom
1990-01-01
Managing software development in large organizations has become increasingly difficult due to increasing technical complexity, stricter government standards, a shortage of experienced software engineers, competitive pressure for improved productivity and quality, the need to co-develop hardware and software together, and the rapid changes in both hardware and software technology. The 'software factory' approach to software development minimizes risks while maximizing productivity and quality through standardization, automation, and training. However, in practice, this approach is relatively inflexible when adopting new software technologies. The methods that a large multi-project software engineering organization can use to increase the likelihood of successful software technology insertion (STI), especially in a standardized engineering environment, are described.
Academic and Non-Profit Accessibility to Commercial Remote Sensing Software
NASA Astrophysics Data System (ADS)
O'Connor, A. S.; Farr, B.
2013-12-01
Remote Sensing as a topic of teaching and research at the university and college level continues to increase. As more data is made freely available and software becomes easier to use, more and more academic and non-profits institutions are turning to remote sensing to solve their tough and large spatial scale problems. Exelis Visual Information Solutions (VIS) has been supporting teaching and research endeavors for over 30 years with a special emphasis over the last 5 years with scientifically proven software and accessible training materials. The Exelis VIS academic program extends to US and Canadian 2 year and 4 year colleges and universities with tools for analyzing aerial and satellite multispectral and hyperspectral imagery, airborne LiDAR and Synthetic Aperture Radar. The Exelis VIS academic programs, using the ENVI Platform, enables labs and classrooms to be outfitted with software and makes software accessible to students. The ENVI software provides students hands on experience with remote sensing software, an easy teaching platform for professors and allows researchers scientifically vetted software they can trust. Training materials are provided at no additional cost and can either serve as a basis for course curriculum development or self paced learning. Non-profit organizations like The Nature Conservancy (TNC) and CGIAR have deployed ENVI and IDL enterprise wide licensing allowing researchers all over the world to have cost effective access COTS software for their research. Exelis VIS has also contributed licenses to the NASA DEVELOP program. Exelis VIS is committed to supporting the academic and NGO community with affordable enterprise licensing, access to training materials, and technical expertise to help researchers tackle today's Earth and Planetary science big data challenges.
DupTree: a program for large-scale phylogenetic analyses using gene tree parsimony.
Wehe, André; Bansal, Mukul S; Burleigh, J Gordon; Eulenstein, Oliver
2008-07-01
DupTree is a new software program for inferring rooted species trees from collections of gene trees using the gene tree parsimony approach. The program implements a novel algorithm that significantly improves upon the run time of standard search heuristics for gene tree parsimony, and enables the first truly genome-scale phylogenetic analyses. In addition, DupTree allows users to examine alternate rootings and to weight the reconciliation costs for gene trees. DupTree is an open source project written in C++. DupTree for Mac OS X, Windows, and Linux along with a sample dataset and an on-line manual are available at http://genome.cs.iastate.edu/CBL/DupTree
Autonomous system for Web-based microarray image analysis.
Bozinov, Daniel
2003-12-01
Software-based feature extraction from DNA microarray images still requires human intervention on various levels. Manual adjustment of grid and metagrid parameters, precise alignment of superimposed grid templates and gene spots, or simply identification of large-scale artifacts have to be performed beforehand to reliably analyze DNA signals and correctly quantify their expression values. Ideally, a Web-based system with input solely confined to a single microarray image and a data table as output containing measurements for all gene spots would directly transform raw image data into abstracted gene expression tables. Sophisticated algorithms with advanced procedures for iterative correction function can overcome imminent challenges in image processing. Herein is introduced an integrated software system with a Java-based interface on the client side that allows for decentralized access and furthermore enables the scientist to instantly employ the most updated software version at any given time. This software tool is extended from PixClust as used in Extractiff incorporated with Java Web Start deployment technology. Ultimately, this setup is destined for high-throughput pipelines in genome-wide medical diagnostics labs or microarray core facilities aimed at providing fully automated service to its users.
Existing Fortran interfaces to Trilinos in preparation for exascale ForTrilinos development
DOE Office of Scientific and Technical Information (OSTI.GOV)
Evans, Katherine J.; Young, Mitchell T.; Collins, Benjamin S.
This report summarizes the current state of Fortran interfaces to the Trilinos library within several key applications of the Exascale Computing Program (ECP), with the aim of informing developers about strategies to develop ForTrilinos, an exascale-ready, Fortran interface software package within Trilinos. The two software projects assessed within are the DOE Office of Science's Accelerated Climate Model for Energy (ACME) atmosphere component, CAM, and the DOE Office of Nuclear Energy's core-simulator portion of VERA, a nuclear reactor simulation code. Trilinos is an object-oriented, C++ based software project, and spans a collection of algorithms and other enabling technologies such as uncertaintymore » quantification and mesh generation. To date, Trilinos has enabled these codes to achieve large-scale simulation results, however the simulation needs of CAM and VERA-CS will approach exascale over the next five years. A Fortran interface to Trilinos that enables efficient use of programming models and more advanced algorithms is necessary. Where appropriate, the needs of the CAM and VERA-CS software to achieve their simulation goals are called out specifically. With this report, a design document and execution plan for ForTrilinos development can proceed.« less
Hufnagel, S; Harbison, K; Silva, J; Mettala, E
1994-01-01
This paper describes a new method for the evolutionary determination of user requirements and system specifications called scenario-based engineering process (SEP). Health care professional workstations are critical components of large scale health care system architectures. We suggest that domain-specific software architectures (DSSAs) be used to specify standard interfaces and protocols for reusable software components throughout those architectures, including workstations. We encourage the use of engineering principles and abstraction mechanisms. Engineering principles are flexible guidelines, adaptable to particular situations. Abstraction mechanisms are simplifications for management of complexity. We recommend object-oriented design principles, graphical structural specifications, and formal components' behavioral specifications. We give an ambulatory care scenario and associated models to demonstrate SEP. The scenario uses health care terminology and gives patients' and health care providers' system views. Our goal is to have a threefold benefit. (i) Scenario view abstractions provide consistent interdisciplinary communications. (ii) Hierarchical object-oriented structures provide useful abstractions for reuse, understandability, and long term evolution. (iii) SEP and health care DSSA integration into computer aided software engineering (CASE) environments. These environments should support rapid construction and certification of individualized systems, from reuse libraries.
NASA Astrophysics Data System (ADS)
Yulaeva, E.; Fan, Y.; Moosdorf, N.; Richard, S. M.; Bristol, S.; Peters, S. E.; Zaslavsky, I.; Ingebritsen, S.
2015-12-01
The Digital Crust EarthCube building block creates a framework for integrating disparate 3D/4D information from multiple sources into a comprehensive model of the structure and composition of the Earth's upper crust, and to demonstrate the utility of this model in several research scenarios. One of such scenarios is estimation of various crustal properties related to fluid dynamics (e.g. permeability and porosity) at each node of any arbitrary unstructured 3D grid to support continental-scale numerical models of fluid flow and transport. Starting from Macrostrat, an existing 4D database of 33,903 chronostratigraphic units, and employing GeoDeepDive, a software system for extracting structured information from unstructured documents, we construct 3D gridded fields of sediment/rock porosity, permeability and geochemistry for large sedimentary basins of North America, which will be used to improve our understanding of large-scale fluid flow, chemical weathering rates, and geochemical fluxes into the ocean. In this talk, we discuss the methods, data gaps (particularly in geologically complex terrain), and various physical and geological constraints on interpolation and uncertainty estimation.
NASA Technical Reports Server (NTRS)
Perusich, Stephen; Moos, Thomas; Muscatello, Anthony
2011-01-01
This innovation provides the user with autonomous on-screen monitoring, embedded computations, and tabulated output for two new processes. The software was originally written for the Continuous Lunar Water Separation Process (CLWSP), but was found to be general enough to be applicable to the Lunar Greenhouse Amplifier (LGA) as well, with minor alterations. The resultant program should have general applicability to many laboratory processes (see figure). The objective for these programs was to create a software application that would provide both autonomous monitoring and data storage, along with manual manipulation. The software also allows operators the ability to input experimental changes and comments in real time without modifying the code itself. Common process elements, such as thermocouples, pressure transducers, and relative humidity sensors, are easily incorporated into the program in various configurations, along with specialized devices such as photodiode sensors. The goal of the CLWSP research project is to design, build, and test a new method to continuously separate, capture, and quantify water from a gas stream. The application is any In-Situ Resource Utilization (ISRU) process that desires to extract or produce water from lunar or planetary regolith. The present work is aimed at circumventing current problems and ultimately producing a system capable of continuous operation at moderate temperatures that can be scaled over a large capacity range depending on the ISRU process. The goal of the LGA research project is to design, build, and test a new type of greenhouse that could be used on the moon or Mars. The LGA uses super greenhouse gases (SGGs) to absorb long-wavelength radiation, thus creating a highly efficient greenhouse at a future lunar or Mars outpost. Silica-based glass, although highly efficient at trapping heat, is heavy, fragile, and not suitable for space greenhouse applications. Plastics are much lighter and resilient, but are not efficient for absorbing longwavelength infrared radiation and therefore will lose more heat to the environment compared to glass. The LGA unit uses a transparent polymer antechamber that surrounds part of the greenhouse and encases the SGGs, thereby minimizing infrared losses through the plastic windows. With ambient temperatures at the lunar poles at 50 C, the LGA should provide a substantial enhancement to currently conceived lunar greenhouses. Positive results obtained from this project could lead to a future large-scale system capable of running autonomously on the Moon, Mars, and beyond. The software for both applications needs to run the entire units and all subprocesses; however, throughout testing, many variables and parameters need to be changed as more is learned about the system operation. The software provides the versatility to permit the software operation to change as the user requirements evolve.
Large-scale structural analysis: The structural analyst, the CSM Testbed and the NAS System
NASA Technical Reports Server (NTRS)
Knight, Norman F., Jr.; Mccleary, Susan L.; Macy, Steven C.; Aminpour, Mohammad A.
1989-01-01
The Computational Structural Mechanics (CSM) activity is developing advanced structural analysis and computational methods that exploit high-performance computers. Methods are developed in the framework of the CSM testbed software system and applied to representative complex structural analysis problems from the aerospace industry. An overview of the CSM testbed methods development environment is presented and some numerical methods developed on a CRAY-2 are described. Selected application studies performed on the NAS CRAY-2 are also summarized.
Reengineering Real-Time Software Systems
1993-09-09
reengineering existing large-scale (or real-time) systems; systems designed prior to or during the advent of applied SE (Parnas 1979, Freeman 1980). Is... Advisor : Yutaka Kanayama Approved for public release; distribution is unlimited. 93-29769 93 12 6 098 Form Appmoved REPORT DOCUMENTATION PAGE 1o No. PI rep...trm b Idn 1o tl# caik t al wdornon s easnated to waere 1how per response. fr4ikcdm the time rem matnodons. siauide exetig da"a siuo a i and mami diqw
1982-11-09
PROGRAMMING LANGUAGE U. S . ARMY CECOM CONTRACT NO. DAAK8O-81-C-3107 CONTROL DATA CORPORATION GOVERNMENT SYSTEMS 40 AVENUE AT THE COMMON SHREWSBURY, NJ...contained in this report :. are those of the author( s ) and should not be construed as an .J : . official Department of the Army position, policy or...fomIn Orimaniatil Rt. No. Control Data Corporation S . PfmgOrgnizatla Name and Address 10. PmolatTasklWwrk Unit Mo. Control Data Corporation
Dynamic Simulation of AN Helium Refrigerator
NASA Astrophysics Data System (ADS)
Deschildre, C.; Barraud, A.; Bonnay, P.; Briend, P.; Girard, A.; Poncet, J. M.; Roussel, P.; Sequeira, S. E.
2008-03-01
A dynamic simulation of a large scale existing refrigerator has been performed using the software Aspen Hysys®. The model comprises the typical equipments of a cryogenic system: heat exchangers, expanders, helium phase separators and cold compressors. It represents the 400 W @ 1.8 K Test Facility located at CEA—Grenoble. This paper describes the model development and shows the possibilities and limitations of the dynamic module of Aspen Hysys®. Then, comparison between simulation results and experimental data are presented; the simulation of cooldown process was also performed.
Comprehensive Analysis of DNA Methylation Data with RnBeads
Walter, Jörn; Lengauer, Thomas; Bock, Christoph
2014-01-01
RnBeads is a software tool for large-scale analysis and interpretation of DNA methylation data, providing a user-friendly analysis workflow that yields detailed hypertext reports (http://rnbeads.mpi-inf.mpg.de). Supported assays include whole genome bisulfite sequencing, reduced representation bisulfite sequencing, Infinium microarrays, and any other protocol that produces high-resolution DNA methylation data. Important applications of RnBeads include the analysis of epigenome-wide association studies and epigenetic biomarker discovery in cancer cohorts. PMID:25262207
Final Report for Project FG02-05ER25685
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xiaosong Ma
2009-05-07
In this report, the PI summarizes the results and achievements obtained in the sponsored project. Overall, the project has been very successful and produced both research results in massive data-intensive computing and data management for large scale supercomputers today, and in open-source software products. During the project period, 14 conference/journal publications, as well as two PhD students, have been produced due to exclusive or shared support from this award. In addition, the PI has recently been granted tenure from NC State University.
Advanced Artificial Intelligence Technology Testbed
NASA Technical Reports Server (NTRS)
Anken, Craig S.
1993-01-01
The Advanced Artificial Intelligence Technology Testbed (AAITT) is a laboratory testbed for the design, analysis, integration, evaluation, and exercising of large-scale, complex, software systems, composed of both knowledge-based and conventional components. The AAITT assists its users in the following ways: configuring various problem-solving application suites; observing and measuring the behavior of these applications and the interactions between their constituent modules; gathering and analyzing statistics about the occurrence of key events; and flexibly and quickly altering the interaction of modules within the applications for further study.
From Petascale to Exascale: Eight Focus Areas of R&D Challenges for HPC Simulation Environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Springmeyer, R; Still, C; Schulz, M
2011-03-17
Programming models bridge the gap between the underlying hardware architecture and the supporting layers of software available to applications. Programming models are different from both programming languages and application programming interfaces (APIs). Specifically, a programming model is an abstraction of the underlying computer system that allows for the expression of both algorithms and data structures. In comparison, languages and APIs provide implementations of these abstractions and allow the algorithms and data structures to be put into practice - a programming model exists independently of the choice of both the programming language and the supporting APIs. Programming models are typically focusedmore » on achieving increased developer productivity, performance, and portability to other system designs. The rapidly changing nature of processor architectures and the complexity of designing an exascale platform provide significant challenges for these goals. Several other factors are likely to impact the design of future programming models. In particular, the representation and management of increasing levels of parallelism, concurrency and memory hierarchies, combined with the ability to maintain a progressive level of interoperability with today's applications are of significant concern. Overall the design of a programming model is inherently tied not only to the underlying hardware architecture, but also to the requirements of applications and libraries including data analysis, visualization, and uncertainty quantification. Furthermore, the successful implementation of a programming model is dependent on exposed features of the runtime software layers and features of the operating system. Successful use of a programming model also requires effective presentation to the software developer within the context of traditional and new software development tools. Consideration must also be given to the impact of programming models on both languages and the associated compiler infrastructure. Exascale programming models must reflect several, often competing, design goals. These design goals include desirable features such as abstraction and separation of concerns. However, some aspects are unique to large-scale computing. For example, interoperability and composability with existing implementations will prove critical. In particular, performance is the essential underlying goal for large-scale systems. A key evaluation metric for exascale models will be the extent to which they support these goals rather than merely enable them.« less
Bonnal, Raoul J P; Aerts, Jan; Githinji, George; Goto, Naohisa; MacLean, Dan; Miller, Chase A; Mishima, Hiroyuki; Pagani, Massimiliano; Ramirez-Gonzalez, Ricardo; Smant, Geert; Strozzi, Francesco; Syme, Rob; Vos, Rutger; Wennblom, Trevor J; Woodcroft, Ben J; Katayama, Toshiaki; Prins, Pjotr
2012-04-01
Biogem provides a software development environment for the Ruby programming language, which encourages community-based software development for bioinformatics while lowering the barrier to entry and encouraging best practices. Biogem, with its targeted modular and decentralized approach, software generator, tools and tight web integration, is an improved general model for scaling up collaborative open source software development in bioinformatics. Biogem and modules are free and are OSS. Biogem runs on all systems that support recent versions of Ruby, including Linux, Mac OS X and Windows. Further information at http://www.biogems.info. A tutorial is available at http://www.biogems.info/howto.html bonnal@ingm.org.
Desland, Fiona A; Afzal, Aqeela; Warraich, Zuha; Mocco, J
2014-01-01
Animal models of stroke have been crucial in advancing our understanding of the pathophysiology of cerebral ischemia. Currently, the standards for determining neurological deficit in rodents are the Bederson and Garcia scales, manual assessments scoring animals based on parameters ranked on a narrow scale of severity. Automated open field analysis of a live-video tracking system that analyzes animal behavior may provide a more sensitive test. Results obtained from the manual Bederson and Garcia scales did not show significant differences between pre- and post-stroke animals in a small cohort. When using the same cohort, however, post-stroke data obtained from automated open field analysis showed significant differences in several parameters. Furthermore, large cohort analysis also demonstrated increased sensitivity with automated open field analysis versus the Bederson and Garcia scales. These early data indicate use of automated open field analysis software may provide a more sensitive assessment when compared to traditional Bederson and Garcia scales.
Digital geomorphological landslide hazard mapping of the Alpago area, Italy
NASA Astrophysics Data System (ADS)
van Westen, Cees J.; Soeters, Rob; Sijmons, Koert
Large-scale geomorphological maps of mountainous areas are traditionally made using complex symbol-based legends. They can serve as excellent "geomorphological databases", from which an experienced geomorphologist can extract a large amount of information for hazard mapping. However, these maps are not designed to be used in combination with a GIS, due to their complex cartographic structure. In this paper, two methods are presented for digital geomorphological mapping at large scales using GIS and digital cartographic software. The methods are applied to an area with a complex geomorphological setting on the Borsoia catchment, located in the Alpago region, near Belluno in the Italian Alps. The GIS database set-up is presented with an overview of the data layers that have been generated and how they are interrelated. The GIS database was also converted into a paper map, using a digital cartographic package. The resulting largescale geomorphological hazard map is attached. The resulting GIS database and cartographic product can be used to analyse the hazard type and hazard degree for each polygon, and to find the reasons for the hazard classification.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crossno, Patricia Joyce; Dunlavy, Daniel M.; Stanton, Eric T.
This report is a summary of the accomplishments of the 'Scalable Solutions for Processing and Searching Very Large Document Collections' LDRD, which ran from FY08 through FY10. Our goal was to investigate scalable text analysis; specifically, methods for information retrieval and visualization that could scale to extremely large document collections. Towards that end, we designed, implemented, and demonstrated a scalable framework for text analysis - ParaText - as a major project deliverable. Further, we demonstrated the benefits of using visual analysis in text analysis algorithm development, improved performance of heterogeneous ensemble models in data classification problems, and the advantages ofmore » information theoretic methods in user analysis and interpretation in cross language information retrieval. The project involved 5 members of the technical staff and 3 summer interns (including one who worked two summers). It resulted in a total of 14 publications, 3 new software libraries (2 open source and 1 internal to Sandia), several new end-user software applications, and over 20 presentations. Several follow-on projects have already begun or will start in FY11, with additional projects currently in proposal.« less
Source Code Analysis Laboratory (SCALe) for Energy Delivery Systems
2010-12-01
the software for reevaluation. Once the ree- valuation process is completed, CERT provides the client a report detailing the software’s con - formance...Flagged Nonconformities (FNC) Software System TP/FNC Ratio Mozilla Firefox version 2.0 6/12 50% Linux kernel version 2.6.15 10/126 8% Wine...inappropriately tuned for analysis of the Linux kernel, which has anomalous results. Customizing SCALe to work with energy system software will help
The (mis)use of subjective process measures in software engineering
NASA Technical Reports Server (NTRS)
Valett, Jon D.; Condon, Steven E.
1993-01-01
A variety of measures are used in software engineering research to develop an understanding of the software process and product. These measures fall into three broad categories: quantitative, characteristics, and subjective. Quantitative measures are those to which a numerical value can be assigned, for example effort or lines of code (LOC). Characteristics describe the software process or product; they might include programming language or the type of application. While such factors do not provide a quantitative measurement of a process or product, they do help characterize them. Subjective measures (as defined in this study) are those that are based on the opinion or opinions of individuals; they are somewhat unique and difficult to quantify. Capturing of subjective measure data typically involves development of some type of scale. For example, 'team experience' is one of the subjective measures that were collected and studied by the Software Engineering Laboratory (SEL). Certainly, team experience could have an impact on the software process or product; actually measuring a team's experience, however, is not a strictly mathematical exercise. Simply adding up each team member's years of experience appears inadequate. In fact, most researchers would agree that 'years' do not directly translate into 'experience.' Team experience must be defined subjectively and then a scale must be developed e.g., high experience versus low experience; or high, medium, low experience; or a different or more granular scale. Using this type of scale, a particular team's overall experience can be compared with that of other teams in the development environment. Defining, collecting, and scaling subjective measures is difficult. First, precise definitions of the measures must be established. Next, choices must be made about whose opinions will be solicited to constitute the data. Finally, care must be given to defining the right scale and level of granularity for measurement.
Large-scale deep learning for robotically gathered imagery for science
NASA Astrophysics Data System (ADS)
Skinner, K.; Johnson-Roberson, M.; Li, J.; Iscar, E.
2016-12-01
With the explosion of computing power, the intelligence and capability of mobile robotics has dramatically increased over the last two decades. Today, we can deploy autonomous robots to achieve observations in a variety of environments ripe for scientific exploration. These platforms are capable of gathering a volume of data previously unimaginable. Additionally, optical cameras, driven by mobile phones and consumer photography, have rapidly improved in size, power consumption, and quality making their deployment cheaper and easier. Finally, in parallel we have seen the rise of large-scale machine learning approaches, particularly deep neural networks (DNNs), increasing the quality of the semantic understanding that can be automatically extracted from optical imagery. In concert this enables new science using a combination of machine learning and robotics. This work will discuss the application of new low-cost high-performance computing approaches and the associated software frameworks to enable scientists to rapidly extract useful science data from millions of robotically gathered images. The automated analysis of imagery on this scale opens up new avenues of inquiry unavailable using more traditional manual or semi-automated approaches. We will use a large archive of millions of benthic images gathered with an autonomous underwater vehicle to demonstrate how these tools enable new scientific questions to be posed.
A Scalable Cyberinfrastructure for Interactive Visualization of Terascale Microscopy Data
Venkat, A.; Christensen, C.; Gyulassy, A.; Summa, B.; Federer, F.; Angelucci, A.; Pascucci, V.
2017-01-01
The goal of the recently emerged field of connectomics is to generate a wiring diagram of the brain at different scales. To identify brain circuitry, neuroscientists use specialized microscopes to perform multichannel imaging of labeled neurons at a very high resolution. CLARITY tissue clearing allows imaging labeled circuits through entire tissue blocks, without the need for tissue sectioning and section-to-section alignment. Imaging the large and complex non-human primate brain with sufficient resolution to identify and disambiguate between axons, in particular, produces massive data, creating great computational challenges to the study of neural circuits. Researchers require novel software capabilities for compiling, stitching, and visualizing large imagery. In this work, we detail the image acquisition process and a hierarchical streaming platform, ViSUS, that enables interactive visualization of these massive multi-volume datasets using a standard desktop computer. The ViSUS visualization framework has previously been shown to be suitable for 3D combustion simulation, climate simulation and visualization of large scale panoramic images. The platform is organized around a hierarchical cache oblivious data layout, called the IDX file format, which enables interactive visualization and exploration in ViSUS, scaling to the largest 3D images. In this paper we showcase the VISUS framework used in an interactive setting with the microscopy data. PMID:28638896
A Scalable Cyberinfrastructure for Interactive Visualization of Terascale Microscopy Data.
Venkat, A; Christensen, C; Gyulassy, A; Summa, B; Federer, F; Angelucci, A; Pascucci, V
2016-08-01
The goal of the recently emerged field of connectomics is to generate a wiring diagram of the brain at different scales. To identify brain circuitry, neuroscientists use specialized microscopes to perform multichannel imaging of labeled neurons at a very high resolution. CLARITY tissue clearing allows imaging labeled circuits through entire tissue blocks, without the need for tissue sectioning and section-to-section alignment. Imaging the large and complex non-human primate brain with sufficient resolution to identify and disambiguate between axons, in particular, produces massive data, creating great computational challenges to the study of neural circuits. Researchers require novel software capabilities for compiling, stitching, and visualizing large imagery. In this work, we detail the image acquisition process and a hierarchical streaming platform, ViSUS, that enables interactive visualization of these massive multi-volume datasets using a standard desktop computer. The ViSUS visualization framework has previously been shown to be suitable for 3D combustion simulation, climate simulation and visualization of large scale panoramic images. The platform is organized around a hierarchical cache oblivious data layout, called the IDX file format, which enables interactive visualization and exploration in ViSUS, scaling to the largest 3D images. In this paper we showcase the VISUS framework used in an interactive setting with the microscopy data.
Accelerating large-scale protein structure alignments with graphics processing units
2012-01-01
Background Large-scale protein structure alignment, an indispensable tool to structural bioinformatics, poses a tremendous challenge on computational resources. To ensure structure alignment accuracy and efficiency, efforts have been made to parallelize traditional alignment algorithms in grid environments. However, these solutions are costly and of limited accessibility. Others trade alignment quality for speedup by using high-level characteristics of structure fragments for structure comparisons. Findings We present ppsAlign, a parallel protein structure Alignment framework designed and optimized to exploit the parallelism of Graphics Processing Units (GPUs). As a general-purpose GPU platform, ppsAlign could take many concurrent methods, such as TM-align and Fr-TM-align, into the parallelized algorithm design. We evaluated ppsAlign on an NVIDIA Tesla C2050 GPU card, and compared it with existing software solutions running on an AMD dual-core CPU. We observed a 36-fold speedup over TM-align, a 65-fold speedup over Fr-TM-align, and a 40-fold speedup over MAMMOTH. Conclusions ppsAlign is a high-performance protein structure alignment tool designed to tackle the computational complexity issues from protein structural data. The solution presented in this paper allows large-scale structure comparisons to be performed using massive parallel computing power of GPU. PMID:22357132
NASA Astrophysics Data System (ADS)
Ulrich, T.; Gabriel, A. A.
2016-12-01
The geometry of faults is subject to a large degree of uncertainty. As buried structures being not directly observable, their complex shapes may only be inferred from surface traces, if available, or through geophysical methods, such as reflection seismology. As a consequence, most studies aiming at assessing the potential hazard of faults rely on idealized fault models, based on observable large-scale features. Yet, real faults are known to be wavy at all scales, their geometric features presenting similar statistical properties from the micro to the regional scale. The influence of roughness on the earthquake rupture process is currently a driving topic in the computational seismology community. From the numerical point of view, rough faults problems are challenging problems that require optimized codes able to run efficiently on high-performance computing infrastructure and simultaneously handle complex geometries. Physically, simulated ruptures hosted by rough faults appear to be much closer to source models inverted from observation in terms of complexity. Incorporating fault geometry on all scales may thus be crucial to model realistic earthquake source processes and to estimate more accurately seismic hazard. In this study, we use the software package SeisSol, based on an ADER-Discontinuous Galerkin scheme, to run our numerical simulations. SeisSol allows solving the spontaneous dynamic earthquake rupture problem and the wave propagation problem with high-order accuracy in space and time efficiently on large-scale machines. In this study, the influence of fault roughness on dynamic rupture style (e.g. onset of supershear transition, rupture front coherence, propagation of self-healing pulses, etc) at different length scales is investigated by analyzing ruptures on faults of varying roughness spectral content. In particular, we investigate the existence of a minimum roughness length scale in terms of rupture inherent length scales below which the rupture ceases to be sensible. Finally, the effect of fault geometry on ground-motions, in the near-field, is considered. Our simulations feature a classical linear slip weakening on the fault and a viscoplastic constitutive model off the fault. The benefits of using a more elaborate fast velocity-weakening friction law will also be considered.
Kralj, Damir; Kern, Josipa; Tonkovic, Stanko; Koncar, Miroslav
2015-09-09
Family medicine practices (FMPs) make the basis for the Croatian health care system. Use of electronic health record (EHR) software is mandatory and it plays an important role in running these practices, but important functional features still remain uneven and largely left to the will of the software developers. The objective of this study was to develop a novel and comprehensive model for functional evaluation of the EHR software in FMPs, based on current world standards, models and projects, as well as on actual user satisfaction and requirements. Based on previous theoretical and experimental research in this area, we made the initial framework model consisting of six basic categories as a base for online survey questionnaire. Family doctors assessed perceived software quality by using a five-point Likert-type scale. Using exploratory factor analysis and appropriate statistical methods over the collected data, the final optimal structure of the novel model was formed. Special attention was focused on the validity and quality of the novel model. The online survey collected a total of 384 cases. The obtained results indicate both the quality of the assessed software and the quality in use of the novel model. The intense ergonomic orientation of the novel measurement model was particularly emphasised. The resulting novel model is multiple validated, comprehensive and universal. It could be used to assess the user-perceived quality of almost all forms of the ambulatory EHR software and therefore useful to all stakeholders in this area of the health care informatisation.
Identification of MS-Cleavable and Non-Cleavable Chemically Crosslinked Peptides with MetaMorpheus.
Lu, Lei; Millikin, Robert J; Solntsev, Stefan K; Rolfs, Zach; Scalf, Mark; Shortreed, Michael R; Smith, Lloyd M
2018-05-25
Protein chemical crosslinking combined with mass spectrometry has become an important technique for the analysis of protein structure and protein-protein interactions. A variety of crosslinkers are well developed, but reliable, rapid, and user-friendly tools for large-scale analysis of crosslinked proteins are still in need. Here we report MetaMorpheusXL, a new search module within the MetaMorpheus software suite that identifies both MS-cleavable and non-cleavable crosslinked peptides in MS data. MetaMorpheusXL identifies MS-cleavable crosslinked peptides with an ion-indexing algorithm, which enables an efficient large database search. The identification does not require the presence of signature fragment ions, an advantage compared to similar programs such as XlinkX. One complication associated with the need for signature ions from cleavable crosslinkers such as DSSO (disuccinimidyl sulfoxide) is the requirement for multiple fragmentation types and energy combinations, which is not necessary for MetaMorpheusXL. The ability to perform proteome-wide analysis is another advantage of MetaMorpheusXl compared to such programs as MeroX and DXMSMS. MetaMorpheusXL is also faster than other currently available MS-cleavable crosslink search software programs. It is imbedded in MetaMorpheus, an open-source and freely available software suite that provides a reliable, fast, user-friendly graphical user interface that is readily accessible to researchers.
The Hyper Suprime-Cam software pipeline
Bosch, James; Armstrong, Robert; Bickerton, Steven; ...
2017-10-12
Here in this article, we describe the optical imaging data processing pipeline developed for the Subaru Telescope’s Hyper Suprime-Cam (HSC) instrument. The HSC Pipeline builds on the prototype pipeline being developed by the Large Synoptic Survey Telescope’s Data Management system, adding customizations for HSC, large-scale processing capabilities, and novel algorithms that have since been reincorporated into the LSST codebase. While designed primarily to reduce HSC Subaru Strategic Program (SSP) data, it is also the recommended pipeline for reducing general-observer HSC data. The HSC pipeline includes high-level processing steps that generate coadded images and science-ready catalogs as well as low-level detrendingmore » and image characterizations.« less
The Hyper Suprime-Cam software pipeline
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bosch, James; Armstrong, Robert; Bickerton, Steven
Here in this article, we describe the optical imaging data processing pipeline developed for the Subaru Telescope’s Hyper Suprime-Cam (HSC) instrument. The HSC Pipeline builds on the prototype pipeline being developed by the Large Synoptic Survey Telescope’s Data Management system, adding customizations for HSC, large-scale processing capabilities, and novel algorithms that have since been reincorporated into the LSST codebase. While designed primarily to reduce HSC Subaru Strategic Program (SSP) data, it is also the recommended pipeline for reducing general-observer HSC data. The HSC pipeline includes high-level processing steps that generate coadded images and science-ready catalogs as well as low-level detrendingmore » and image characterizations.« less
NASA Astrophysics Data System (ADS)
McFall, Steve
1994-03-01
With the increase in business automation and the widespread availability and low cost of computer systems, law enforcement agencies have seen a corresponding increase in criminal acts involving computers. The examination of computer evidence is a new field of forensic science with numerous opportunities for research and development. Research is needed to develop new software utilities to examine computer storage media, expert systems capable of finding criminal activity in large amounts of data, and to find methods of recovering data from chemically and physically damaged computer storage media. In addition, defeating encryption and password protection of computer files is also a topic requiring more research and development.
Implementing large projects in software engineering courses
NASA Astrophysics Data System (ADS)
Coppit, David
2006-03-01
In software engineering education, large projects are widely recognized as a useful way of exposing students to the real-world difficulties of team software development. But large projects are difficult to put into practice. First, educators rarely have additional time to manage software projects. Second, classrooms have inherent limitations that threaten the realism of large projects. Third, quantitative evaluation of individuals who work in groups is notoriously difficult. As a result, many software engineering courses compromise the project experience by reducing the team sizes, project scope, and risk. In this paper, we present an approach to teaching a one-semester software engineering course in which 20 to 30 students work together to construct a moderately sized (15KLOC) software system. The approach combines carefully coordinated lectures and homeworks, a hierarchical project management structure, modern communication technologies, and a web-based project tracking and individual assessment system. Our approach provides a more realistic project experience for the students, without incurring significant additional overhead for the instructor. We present our experiences using the approach the last 2 years for the software engineering course at The College of William and Mary. Although the approach has some weaknesses, we believe that they are strongly outweighed by the pedagogical benefits.
Bridging scales from satellite to grains: Structural mapping aided by tablet and photogrammetry
NASA Astrophysics Data System (ADS)
Hawemann, Friedrich; Mancktelow, Neil; Pennacchioni, Giorgio; Wex, Sebastian; Camacho, Alfredo
2016-04-01
Bridging scales from satellite to grains: Structural mapping aided by tablet and photogrammetry A fundamental problem in small-scale mapping is linking outcrop observations to the large scale deformation pattern. The evolution of handheld devices such as tablets with integrated GPS and the availability of airborne imagery allows a precise localization of outcrops. Detailed structural geometries can be analyzed through ortho-rectified photo mosaics generated by photogrammetry software. In this study, we use a cheap standard Samsung-tablet (< 300 Euro) to map individual, up to 60 m long shear zones with the tracking option offered by the program Locus Map. Even though GPS accuracy is about 3 m, the relative error from one point to another during tracking is on the order of only about 1 dm. Parts of the shear zone with excellent outcrop are photographed with a standard camera with a relatively wide angle in a mosaic array. An area of about 30 sqm needs about 50 photographs with enough overlap to be used for photogrammetry. The software PhotoScan from Agisoft matches the photographs in a fully automated manner, calculates a 3D model of the outcrop, and has the option to project this as an orthophoto onto a flat surface. This allows original orientations of grain-scale structures to be recorded over areas on a scale up to tens to hundreds of metres. The photo mosaics can then be georeferenced with the aid of the GPS-tracks of the shear zones and included in a GIS. This provides a cheap recording of the structures in high detail. The great advantages over mapping with UAVs (drones) is the resolution (<1mm to >1cm), the independence from weather and energy source, and the low cost.
Appel, R D; Palagi, P M; Walther, D; Vargas, J R; Sanchez, J C; Ravier, F; Pasquali, C; Hochstrasser, D F
1997-12-01
Although two-dimensional electrophoresis (2-DE) computer analysis software packages have existed ever since 2-DE technology was developed, it is only now that the hardware and software technology allows large-scale studies to be performed on low-cost personal computers or workstations, and that setting up a 2-DE computer analysis system in a small laboratory is no longer considered a luxury. After a first attempt in the seventies and early eighties to develop 2-DE analysis software systems on hardware that had poor or even no graphical capabilities, followed in the late eighties by a wave of innovative software developments that were possible thanks to new graphical interface standards such as XWindows, a third generation of 2-DE analysis software packages has now come to maturity. It can be run on a variety of low-cost, general-purpose personal computers, thus making the purchase of a 2-DE analysis system easily attainable for even the smallest laboratory that is involved in proteome research. Melanie II 2-D PAGE, developed at the University Hospital of Geneva, is such a third-generation software system for 2-DE analysis. Based on unique image processing algorithms, this user-friendly object-oriented software package runs on multiple platforms, including Unix, MS-Windows 95 and NT, and Power Macintosh. It provides efficient spot detection and quantitation, state-of-the-art image comparison, statistical data analysis facilities, and is Internet-ready. Linked to proteome databases such as those available on the World Wide Web, it represents a valuable tool for the "Virtual Lab" of the post-genome area.
Prototype Packaged Databases and Software in Health
Gardenier, Turkan K.
1980-01-01
This paper describes the recent demand for packaged databases and software for health applications in light of developments in mini-and micro-computer technology. Specific features for defining prospective user groups are discussed; criticisms generated for large-scale epidemiological data use as a means of replacing clinical trials and associated controls are posed to the reader. The available collaborative efforts for access and analysis of jointly structured health data are stressed, with recommendations for new analytical techniques specifically geared to monitoring data such as the CTSS (Cumulative Transitional State Score) generated for tacking ongoing patient status over time in clinical trials. Examples of graphic display are given from the Domestic Information Display System (DIDS) which is a collaborative multi-agency effort to computerize and make accessible user-specified U.S. and local maps relating to health, environment, socio-economic and energy data.
Open source Modeling and optimization tools for Planning
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peles, S.
Open source modeling and optimization tools for planning The existing tools and software used for planning and analysis in California are either expensive, difficult to use, or not generally accessible to a large number of participants. These limitations restrict the availability of participants for larger scale energy and grid studies in the state. The proposed initiative would build upon federal and state investments in open source software, and create and improve open source tools for use in the state planning and analysis activities. Computational analysis and simulation frameworks in development at national labs and universities can be brought forward tomore » complement existing tools. An open source platform would provide a path for novel techniques and strategies to be brought into the larger community and reviewed by a broad set of stakeholders.« less
PC Software graphics tool for conceptual design of space/planetary electrical power systems
NASA Technical Reports Server (NTRS)
Truong, Long V.
1995-01-01
This paper describes the Decision Support System (DSS), a personal computer software graphics tool for designing conceptual space and/or planetary electrical power systems. By using the DSS, users can obtain desirable system design and operating parameters, such as system weight, electrical distribution efficiency, and bus power. With this tool, a large-scale specific power system was designed in a matter of days. It is an excellent tool to help designers make tradeoffs between system components, hardware architectures, and operation parameters in the early stages of the design cycle. The DSS is a user-friendly, menu-driven tool with online help and a custom graphical user interface. An example design and results are illustrated for a typical space power system with multiple types of power sources, frequencies, energy storage systems, and loads.
Decentralized formation flying control in a multiple-team hierarchy.
Mueller, Joseph B; Thomas, Stephanie J
2005-12-01
In recent years, formation flying has been recognized as an enabling technology for a variety of mission concepts in both the scientific and defense arenas. Examples of developing missions at NASA include magnetospheric multiscale (MMS), solar imaging radio array (SIRA), and terrestrial planet finder (TPF). For each of these missions, a multiple satellite approach is required in order to accomplish the large-scale geometries imposed by the science objectives. In addition, the paradigm shift of using a multiple satellite cluster rather than a large, monolithic spacecraft has also been motivated by the expected benefits of increased robustness, greater flexibility, and reduced cost. However, the operational costs of monitoring and commanding a fleet of close-orbiting satellites is likely to be unreasonable unless the onboard software is sufficiently autonomous, robust, and scalable to large clusters. This paper presents the prototype of a system that addresses these objectives-a decentralized guidance and control system that is distributed across spacecraft using a multiple team framework. The objective is to divide large clusters into teams of "manageable" size, so that the communication and computation demands driven by N decentralized units are related to the number of satellites in a team rather than the entire cluster. The system is designed to provide a high level of autonomy, to support clusters with large numbers of satellites, to enable the number of spacecraft in the cluster to change post-launch, and to provide for on-orbit software modification. The distributed guidance and control system will be implemented in an object-oriented style using a messaging architecture for networking and threaded applications (MANTA). In this architecture, tasks may be remotely added, removed, or replaced post launch to increase mission flexibility and robustness. This built-in adaptability will allow software modifications to be made on-orbit in a robust manner. The prototype system, which is implemented in Matlab, emulates the object-oriented and message-passing features of the MANTA software. In this paper, the multiple team organization of the cluster is described, and the modular software architecture is presented. The relative dynamics in eccentric reference orbits is reviewed, and families of periodic, relative trajectories are identified, expressed as sets of static geometric parameters. The guidance law design is presented, and an example reconfiguration scenario is used to illustrate the distributed process of assigning geometric goals to the cluster. Next, a decentralized maneuver planning approach is presented that utilizes linear-programming methods to enact reconfiguration and coarse formation keeping maneuvers. Finally, a method for performing online collision avoidance is discussed, and an example is provided to gauge its performance.
NASA Astrophysics Data System (ADS)
Fraser, Ryan; Gross, Lutz; Wyborn, Lesley; Evans, Ben; Klump, Jens
2015-04-01
Recent investments in HPC, cloud and Petascale data stores, have dramatically increased the scale and resolution that earth science challenges can now be tackled. These new infrastructures are highly parallelised and to fully utilise them and access the large volumes of earth science data now available, a new approach to software stack engineering needs to be developed. The size, complexity and cost of the new infrastructures mean any software deployed has to be reliable, trusted and reusable. Increasingly software is available via open source repositories, but these usually only enable code to be discovered and downloaded. As a user it is hard for a scientist to judge the suitability and quality of individual codes: rarely is there information on how and where codes can be run, what the critical dependencies are, and in particular, on the version requirements and licensing of the underlying software stack. A trusted software framework is proposed to enable reliable software to be discovered, accessed and then deployed on multiple hardware environments. More specifically, this framework will enable those who generate the software, and those who fund the development of software, to gain credit for the effort, IP, time and dollars spent, and facilitate quantification of the impact of individual codes. For scientific users, the framework delivers reviewed and benchmarked scientific software with mechanisms to reproduce results. The trusted framework will have five separate, but connected components: Register, Review, Reference, Run, and Repeat. 1) The Register component will facilitate discovery of relevant software from multiple open source code repositories. The registration process of the code should include information about licensing, hardware environments it can be run on, define appropriate validation (testing) procedures and list the critical dependencies. 2) The Review component is targeting on the verification of the software typically against a set of benchmark cases. This will be achieved by linking the code in the software framework to peer review forums such as Mozilla Science or appropriate Journals (e.g. Geoscientific Model Development Journal) to assist users to know which codes to trust. 3) Referencing will be accomplished by linking the Software Framework to groups such as Figshare or ImpactStory that help disseminate and measure the impact of scientific research, including program code. 4) The Run component will draw on information supplied in the registration process, benchmark cases described in the review and relevant information to instantiate the scientific code on the selected environment. 5) The Repeat component will tap into existing Provenance Workflow engines that will automatically capture information that relate to a particular run of that software, including identification of all input and output artefacts, and all elements and transactions within that workflow. The proposed trusted software framework will enable users to rapidly discover and access reliable code, reduce the time to deploy it and greatly facilitate sharing, reuse and reinstallation of code. Properly designed it could enable an ability to scale out to massively parallel systems and be accessed nationally/ internationally for multiple use cases, including Supercomputer centres, cloud facilities, and local computers.
A decentralized software bus based on IP multicas ting
NASA Technical Reports Server (NTRS)
Callahan, John R.; Montgomery, Todd
1995-01-01
We describe decentralized reconfigurable implementation of a conference management system based on the low-level Internet Protocol (IP) multicasting protocol. IP multicasting allows low-cost, world-wide, two-way transmission of data between large numbers of conferencing participants through the Multicasting Backbone (MBone). Each conference is structured as a software bus -- a messaging system that provides a run-time interconnection model that acts as a separate agent (i.e., the bus) for routing, queuing, and delivering messages between distributed programs. Unlike the client-server interconnection model, the software bus model provides a level of indirection that enhances the flexibility and reconfigurability of a distributed system. Current software bus implementations like POLYLITH, however, rely on a centralized bus process and point-to-point protocols (i.e., TCP/IP) to route, queue, and deliver messages. We implement a software bus called the MULTIBUS that relies on a separate process only for routing and uses a reliable IP multicasting protocol for delivery of messages. The use of multicasting means that interconnections are independent of IP machine addresses. This approach allows reconfiguration of bus participants during system execution without notifying other participants of new IP addresses. The use of IP multicasting also permits an economy of scale in the number of participants. We describe the MULITIBUS protocol elements and show how our implementation performs better than centralized bus implementations.
Tools and Approaches for the Construction of Knowledge Models from the Neuroscientific Literature
Burns, Gully A. P. C.; Khan, Arshad M.; Ghandeharizadeh, Shahram; O’Neill, Mark A.; Chen, Yi-Shin
2015-01-01
Within this paper, we describe a neuroinformatics project (called “NeuroScholar,” http://www.neuroscholar.org/) that enables researchers to examine, manage, manipulate, and use the information contained within the published neuroscientific literature. The project is built within a multi-level, multi-component framework constructed with the use of software engineering methods that themselves provide code-building functionality for neuroinformaticians. We describe the different software layers of the system. First, we present a hypothetical usage scenario illustrating how NeuroScholar permits users to address large-scale questions in a way that would otherwise be impossible. We do this by applying NeuroScholar to a “real-world” neuroscience question: How is stress-related information processed in the brain? We then explain how the overall design of NeuroScholar enables the system to work and illustrate different components of the user interface. We then describe the knowledge management strategy we use to store interpretations. Finally, we describe the software engineering framework we have devised (called the “View-Primitive-Data Model framework,” [VPDMf]) to provide an open-source, accelerated software development environment for the project. We believe that NeuroScholar will be useful to experimental neuroscientists by helping them interact with the primary neuroscientific literature in a meaningful way, and to neuroinformaticians by providing them with useful, affordable software engineering tools. PMID:15055395
Image-Based Single Cell Profiling: High-Throughput Processing of Mother Machine Experiments
Sachs, Christian Carsten; Grünberger, Alexander; Helfrich, Stefan; Probst, Christopher; Wiechert, Wolfgang; Kohlheyer, Dietrich; Nöh, Katharina
2016-01-01
Background Microfluidic lab-on-chip technology combined with live-cell imaging has enabled the observation of single cells in their spatio-temporal context. The mother machine (MM) cultivation system is particularly attractive for the long-term investigation of rod-shaped bacteria since it facilitates continuous cultivation and observation of individual cells over many generations in a highly parallelized manner. To date, the lack of fully automated image analysis software limits the practical applicability of the MM as a phenotypic screening tool. Results We present an image analysis pipeline for the automated processing of MM time lapse image stacks. The pipeline supports all analysis steps, i.e., image registration, orientation correction, channel/cell detection, cell tracking, and result visualization. Tailored algorithms account for the specialized MM layout to enable a robust automated analysis. Image data generated in a two-day growth study (≈ 90 GB) is analyzed in ≈ 30 min with negligible differences in growth rate between automated and manual evaluation quality. The proposed methods are implemented in the software molyso (MOther machine AnaLYsis SOftware) that provides a new profiling tool to analyze unbiasedly hitherto inaccessible large-scale MM image stacks. Conclusion Presented is the software molyso, a ready-to-use open source software (BSD-licensed) for the unsupervised analysis of MM time-lapse image stacks. molyso source code and user manual are available at https://github.com/modsim/molyso. PMID:27661996
Redshift Measurement and Spectral Classification for eBoss Galaxies with the Redmonster Software
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hutchinson, Timothy A.; Bolton, Adam S.; Dawson, Kyle S.
“Cosmological redshift surveys” are experiments conducted with astronomical telescopes, imagers, and spectrographs, which map the three-dimensional structure of the universe on the largest scales. These maps are delineated by the positions of galaxies, quasars, and intergalactic hydrogen clouds. When interpreted in the context of Einstein’s theory of gravity, these maps can be used to infer the nature of the contents of the universe, including the mysterious “dark energy” that is driving the expansion of the universe to accelerate. While the directional positions of galaxies and other objects can be measured directly in images of the sky, the third dimension ofmore » their position (i.e., their distance from the Earth and the Milky Way Galaxy) must be measured by spectrographs that distribute their light as a function of frequency, enabling a measurement of their cosmological Doppler shift (or “redshift”), which serves as an observable proxy for distance. The largest cosmological redshift surveys, such as the “eBOSS” experiment of the fourth Sloan Digital Sky Survey, collect spectroscopic data for hundreds of thousands to millions of galaxies. Future experiments such as the Dark Energy Spectroscopic Instrument will in turn collect tens of millions of spectra. To be feasible, redshift measurement methods in datasets of this scale must be made with automated software. This paper describes the algorithms, astrophysical templates, and implementation of a new redshift measurement software package that is optimized to run on large numbers of spectra with relatively low signal-to-noise ratio, typical of the most ambitious current and future cosmological redshift surveys. The software is demonstrated on spectroscopic data from the eBOSS survey, with performance that meets the scientific requirements of that experiment. The software is implemented in a general framework that will allow application to spectra from the DESI project in the future.« less
NASA Astrophysics Data System (ADS)
Murphy, Elizabeth Drummond
As advances in technology are applied in complex, semi-automated domains, human controllers are distanced from the controlled process. This physical and psychological distance may both facilitate and degrade human performance. To investigate cognitive issues in spacecraft ground-control operations, the present experimental research was undertaken. The primary issue concerned the ability of operations analysts who do not monitor operations to make timely, accurate decisions when autonomous software calls for human help. Another key issue involved the potential effects of spatial-visualization ability (SVA) in environments that present data in graphical formats. Hypotheses were derived largely from previous findings and predictions in the literature. Undergraduate psychology students were assigned at random to a monitoring condition or an on-call condition in a scaled environment. The experimental task required subjects to decide on the veracity of a problem diagnosis delivered by a software process on-board a simulated spacecraft. To support decision-making, tabular and graphical data displays presented information on system status. A level of software confidence in the problem diagnosis was displayed, and subjects reported their own level of confidence in their decisions. Contrary to expectations, the performance of on-call subjects did not differ significantly from that of continuous monitors. Analysis yielded a significant interaction of sex and condition: Females in the on-call condition had the lowest mean accuracy. Results included a preference for bar charts over line graphs and faster performance with tables than with line graphs. A significant correlation was found between subjective confidence and decision accuracy. SVA was found to be predictive of accuracy but not speed; and SVA was found to be a stronger predictor of performance for males than for females. Low-SVA subjects reported that they relied more on software confidence than did medium- or high-SVA subjects. These and other findings have implications for the design of user interfaces to support human decision-making in on-call situations and to accommodate low-SVA users.
Redshift Measurement and Spectral Classification for eBoss Galaxies with the Redmonster Software
Hutchinson, Timothy A.; Bolton, Adam S.; Dawson, Kyle S.; ...
2016-12-02
“Cosmological redshift surveys” are experiments conducted with astronomical telescopes, imagers, and spectrographs, which map the three-dimensional structure of the universe on the largest scales. These maps are delineated by the positions of galaxies, quasars, and intergalactic hydrogen clouds. When interpreted in the context of Einstein’s theory of gravity, these maps can be used to infer the nature of the contents of the universe, including the mysterious “dark energy” that is driving the expansion of the universe to accelerate. While the directional positions of galaxies and other objects can be measured directly in images of the sky, the third dimension ofmore » their position (i.e., their distance from the Earth and the Milky Way Galaxy) must be measured by spectrographs that distribute their light as a function of frequency, enabling a measurement of their cosmological Doppler shift (or “redshift”), which serves as an observable proxy for distance. The largest cosmological redshift surveys, such as the “eBOSS” experiment of the fourth Sloan Digital Sky Survey, collect spectroscopic data for hundreds of thousands to millions of galaxies. Future experiments such as the Dark Energy Spectroscopic Instrument will in turn collect tens of millions of spectra. To be feasible, redshift measurement methods in datasets of this scale must be made with automated software. This paper describes the algorithms, astrophysical templates, and implementation of a new redshift measurement software package that is optimized to run on large numbers of spectra with relatively low signal-to-noise ratio, typical of the most ambitious current and future cosmological redshift surveys. The software is demonstrated on spectroscopic data from the eBOSS survey, with performance that meets the scientific requirements of that experiment. The software is implemented in a general framework that will allow application to spectra from the DESI project in the future.« less
MEGADOCK: An All-to-All Protein-Protein Interaction Prediction System Using Tertiary Structure Data
Ohue, Masahito; Matsuzaki, Yuri; Uchikoga, Nobuyuki; Ishida, Takashi; Akiyama, Yutaka
2014-01-01
The elucidation of protein-protein interaction (PPI) networks is important for understanding cellular structure and function and structure-based drug design. However, the development of an effective method to conduct exhaustive PPI screening represents a computational challenge. We have been investigating a protein docking approach based on shape complementarity and physicochemical properties. We describe here the development of the protein-protein docking software package “MEGADOCK” that samples an extremely large number of protein dockings at high speed. MEGADOCK reduces the calculation time required for docking by using several techniques such as a novel scoring function called the real Pairwise Shape Complementarity (rPSC) score. We showed that MEGADOCK is capable of exhaustive PPI screening by completing docking calculations 7.5 times faster than the conventional docking software, ZDOCK, while maintaining an acceptable level of accuracy. When MEGADOCK was applied to a subset of a general benchmark dataset to predict 120 relevant interacting pairs from 120 x 120 = 14,400 combinations of proteins, an F-measure value of 0.231 was obtained. Further, we showed that MEGADOCK can be applied to a large-scale protein-protein interaction-screening problem with accuracy better than random. When our approach is combined with parallel high-performance computing systems, it is now feasible to search and analyze protein-protein interactions while taking into account three-dimensional structures at the interactome scale. MEGADOCK is freely available at http://www.bi.cs.titech.ac.jp/megadock. PMID:23855673
OSIRIX: open source multimodality image navigation software
NASA Astrophysics Data System (ADS)
Rosset, Antoine; Pysher, Lance; Spadola, Luca; Ratib, Osman
2005-04-01
The goal of our project is to develop a completely new software platform that will allow users to efficiently and conveniently navigate through large sets of multidimensional data without the need of high-end expensive hardware or software. We also elected to develop our system on new open source software libraries allowing other institutions and developers to contribute to this project. OsiriX is a free and open-source imaging software designed manipulate and visualize large sets of medical images: http://homepage.mac.com/rossetantoine/osirix/
NASA Astrophysics Data System (ADS)
Chen, J.; Wang, D.; Zhao, R. L.; Zhang, H.; Liao, A.; Jiu, J.
2014-04-01
Geospatial databases are irreplaceable national treasure of immense importance. Their up-to-dateness referring to its consistency with respect to the real world plays a critical role in its value and applications. The continuous updating of map databases at 1:50,000 scales is a massive and difficult task for larger countries of the size of more than several million's kilometer squares. This paper presents the research and technological development to support the national map updating at 1:50,000 scales in China, including the development of updating models and methods, production tools and systems for large-scale and rapid updating, as well as the design and implementation of the continuous updating workflow. The use of many data sources and the integration of these data to form a high accuracy, quality checked product were required. It had in turn required up to date techniques of image matching, semantic integration, generalization, data base management and conflict resolution. Design and develop specific software tools and packages to support the large-scale updating production with high resolution imagery and large-scale data generalization, such as map generalization, GIS-supported change interpretation from imagery, DEM interpolation, image matching-based orthophoto generation, data control at different levels. A national 1:50,000 databases updating strategy and its production workflow were designed, including a full coverage updating pattern characterized by all element topographic data modeling, change detection in all related areas, and whole process data quality controlling, a series of technical production specifications, and a network of updating production units in different geographic places in the country.
Parameter Uncertainty Analysis Using Monte Carlo Simulations for a Regional-Scale Groundwater Model
NASA Astrophysics Data System (ADS)
Zhang, Y.; Pohlmann, K.
2016-12-01
Regional-scale grid-based groundwater models for flow and transport often contain multiple types of parameters that can intensify the challenge of parameter uncertainty analysis. We propose a Monte Carlo approach to systematically quantify the influence of various types of model parameters on groundwater flux and contaminant travel times. The Monte Carlo simulations were conducted based on the steady-state conversion of the original transient model, which was then combined with the PEST sensitivity analysis tool SENSAN and particle tracking software MODPATH. Results identified hydrogeologic units whose hydraulic conductivity can significantly affect groundwater flux, and thirteen out of 173 model parameters that can cause large variation in travel times for contaminant particles originating from given source zones.
Jankowski, Stéphane; Currie-Fraser, Erica; Xu, Licen; Coffa, Jordy
2008-09-01
Annotated DNA samples that had been previously analyzed were tested using multiplex ligation-dependent probe amplification (MLPA) assays containing probes targeting BRCA1, BRCA2, and MMR (MLH1/MSH2 genes) and the 9p21 chromosomal region. MLPA polymerase chain reaction products were separated on a capillary electrophoresis platform, and the data were analyzed using GeneMapper v4.0 software (Applied Biosystems, Foster City, CA). After signal normalization, loci regions that had undergone deletions or duplications were identified using the GeneMapper Report Manager and verified using the DyeScale functionality. The results highlight an easy-to-use, optimal sample preparation and analysis workflow that can be used for both small- and large-scale studies.
A Biosequence-based Approach to Software Characterization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oehmen, Christopher S.; Peterson, Elena S.; Phillips, Aaron R.
For many applications, it is desirable to have some process for recognizing when software binaries are closely related without relying on them to be identical or have identical segments. Some examples include monitoring utilization of high performance computing centers or service clouds, detecting freeware in licensed code, and enforcing application whitelists. But doing so in a dynamic environment is a nontrivial task because most approaches to software similarity require extensive and time-consuming analysis of a binary, or they fail to recognize executables that are similar but nonidentical. Presented herein is a novel biosequence-based method for quantifying similarity of executable binaries.more » Using this method, it is shown in an example application on large-scale multi-author codes that 1) the biosequence-based method has a statistical performance in recognizing and distinguishing between a collection of real-world high performance computing applications better than 90% of ideal; and 2) an example of using family tree analysis to tune identification for a code subfamily can achieve better than 99% of ideal performance.« less