Next Generation Distributed Computing for Cancer Research
Agarwal, Pankaj; Owzar, Kouros
2014-01-01
Advances in next generation sequencing (NGS) and mass spectrometry (MS) technologies have provided many new opportunities and angles for extending the scope of translational cancer research while creating tremendous challenges in data management and analysis. The resulting informatics challenge is invariably not amenable to the use of traditional computing models. Recent advances in scalable computing and associated infrastructure, particularly distributed computing for Big Data, can provide solutions for addressing these challenges. In this review, the next generation of distributed computing technologies that can address these informatics problems is described from the perspective of three key components of a computational platform, namely computing, data storage and management, and networking. A broad overview of scalable computing is provided to set the context for a detailed description of Hadoop, a technology that is being rapidly adopted for large-scale distributed computing. A proof-of-concept Hadoop cluster, set up for performance benchmarking of NGS read alignment, is described as an example of how to work with Hadoop. Finally, Hadoop is compared with a number of other current technologies for distributed computing. PMID:25983539
The future of PanDA in ATLAS distributed computing
NASA Astrophysics Data System (ADS)
De, K.; Klimentov, A.; Maeno, T.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Schovancova, J.; Vaniachine, A.; Wenaus, T.
2015-12-01
Experiments at the Large Hadron Collider (LHC) face unprecedented computing challenges. Heterogeneous resources are distributed worldwide at hundreds of sites, thousands of physicists analyse the data remotely, the volume of processed data is beyond the exabyte scale, while data processing requires more than a few billion hours of computing usage per year. The PanDA (Production and Distributed Analysis) system was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. In the process, the old batch job paradigm of locally managed computing in HEP was discarded in favour of a far more automated, flexible and scalable model. The success of PanDA in ATLAS is leading to widespread adoption and testing by other experiments. PanDA is the first exascale workload management system in HEP, already operating at more than a million computing jobs per day, and processing over an exabyte of data in 2013. There are many new challenges that PanDA will face in the near future, in addition to new challenges of scale, heterogeneity and increasing user base. PanDA will need to handle rapidly changing computing infrastructure, will require factorization of code for easier deployment, will need to incorporate additional information sources including network metrics in decision making, be able to control network circuits, handle dynamically sized workload processing, provide improved visualization, and face many other challenges. In this talk we will focus on the new features, planned or recently implemented, that are relevant to the next decade of distributed computing workload management using PanDA.
Global Software Development with Cloud Platforms
NASA Astrophysics Data System (ADS)
Yara, Pavan; Ramachandran, Ramaseshan; Balasubramanian, Gayathri; Muthuswamy, Karthik; Chandrasekar, Divya
Offshore and outsourced distributed software development models and processes are facing challenges, previously unknown, with respect to computing capacity, bandwidth, storage, security, complexity, reliability, and business uncertainty. Clouds promise to address these challenges by adopting recent advances in virtualization, parallel and distributed systems, utility computing, and software services. In this paper, we envision a cloud-based platform that addresses some of these core problems. We outline a generic cloud architecture, its design and our first implementation results for three cloud forms - a compute cloud, a storage cloud and a cloud-based software service - in the context of global distributed software development (GSD). Our "compute cloud" provides computational services such as continuous code integration and a compile server farm, the "storage cloud" offers storage (block or file-based) services with an on-line virtual storage service, whereas the on-line virtual labs represent a useful cloud service. We note some of the use cases for clouds in GSD, the lessons learned with our prototypes, and identify challenges that must be conquered before realizing the full business benefits. We believe that in the future, software practitioners will focus more on these cloud computing platforms and see clouds as a means of supporting an ecosystem of clients, developers and other key stakeholders.
Challenges in reducing the computational time of QSTS simulations for distribution system analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Deboever, Jeremiah; Zhang, Xiaochen; Reno, Matthew J.
The rapid increase in penetration of distributed energy resources on the electric power distribution system has created a need for more comprehensive interconnection modelling and impact analysis. Unlike conventional scenario-based studies, quasi-static time-series (QSTS) simulations can realistically model time-dependent voltage controllers and the diversity of potential impacts that can occur at different times of year. However, to accurately model a distribution system with all its controllable devices, a yearlong simulation at 1-second resolution is often required, which could take conventional computers a computational time of 10 to 120 hours when an actual unbalanced distribution feeder is modeled. This computational burden is a clear limitation to the adoption of QSTS simulations in interconnection studies and for determining optimal control solutions for utility operations. Our ongoing research to improve the speed of QSTS simulation has revealed many unique aspects of distribution system modelling and sequential power flow analysis that make fast QSTS a very difficult problem to solve. In this report, the most relevant challenges in reducing the computational time of QSTS simulations are presented: number of power flows to solve, circuit complexity, time dependence between time steps, multiple valid power flow solutions, controllable element interactions, and extensive accurate simulation analysis.
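To make the computational burden concrete, a rough back-of-the-envelope calculation (the per-solve times below are illustrative assumptions, not figures from the report) shows why a yearlong simulation at 1-second resolution lands in the reported 10 to 120 hour range:

```python
# Scale of a yearlong QSTS run at 1-second resolution (illustrative arithmetic only).
time_steps = 365 * 24 * 3600                      # ~31.5 million power flows to solve
print(f"power flows to solve: {time_steps:,}")    # 31,536,000

# Assuming (hypothetically) 1-10 ms per unbalanced power-flow solution:
for ms_per_solve in (1, 5, 10):
    hours = time_steps * ms_per_solve / 1000 / 3600
    print(f"{ms_per_solve:3d} ms/solve -> ~{hours:.0f} h of serial computation")
```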
The process group approach to reliable distributed computing
NASA Technical Reports Server (NTRS)
Birman, Kenneth P.
1992-01-01
The difficulty of developing reliable distributed software is an impediment to applying distributed computing technology in many settings. Experience with the ISIS system suggests that a structured approach based on virtually synchronous process groups yields systems that are substantially easier to develop, exploit sophisticated forms of cooperative computation, and achieve high reliability. Six years of research on ISIS are reviewed, describing the model, its implementation challenges, and the types of applications to which ISIS has been applied.
Low latency network and distributed storage for next generation HPC systems: the ExaNeSt project
NASA Astrophysics Data System (ADS)
Ammendola, R.; Biagioni, A.; Cretaro, P.; Frezza, O.; Lo Cicero, F.; Lonardo, A.; Martinelli, M.; Paolucci, P. S.; Pastorelli, E.; Pisani, F.; Simula, F.; Vicini, P.; Navaridas, J.; Chaix, F.; Chrysos, N.; Katevenis, M.; Papaeustathiou, V.
2017-10-01
With processor architecture evolution, the HPC market has undergone a paradigm shift. The adoption of low-cost, Linux-based clusters extended the reach of HPC from its roots in modelling and simulation of complex physical systems to a broader range of industries, from biotechnology, cloud computing, computer analytics and big data challenges to manufacturing sectors. In this perspective, the near future HPC systems can be envisioned as composed of millions of low-power computing cores, densely packed — meaning cooling by appropriate technology — with a tightly interconnected, low latency and high performance network and equipped with a distributed storage architecture. Each of these features — dense packing, distributed storage and high performance interconnect — represents a challenge, made all the harder by the need to solve them at the same time. These challenges lie as stumbling blocks along the road towards Exascale-class systems; the ExaNeSt project acknowledges them and tasks itself with investigating ways around them.
Integrating Computing Resources: A Shared Distributed Architecture for Academics and Administrators.
ERIC Educational Resources Information Center
Beltrametti, Monica; English, Will
1994-01-01
Development and implementation of a shared distributed computing architecture at the University of Alberta (Canada) are described. Aspects discussed include design of the architecture, users' views of the electronic environment, technical and managerial challenges, and the campuswide human infrastructures needed to manage such an integrated…
Iterative Importance Sampling Algorithms for Parameter Estimation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grout, Ray W; Morzfeld, Matthias; Day, Marcus S.
In parameter estimation problems one computes a posterior distribution over uncertain parameters defined jointly by a prior distribution, a model, and noisy data. Markov chain Monte Carlo (MCMC) is often used for the numerical solution of such problems. An alternative to MCMC is importance sampling, which can exhibit near perfect scaling with the number of cores on high performance computing systems because samples are drawn independently. However, finding a suitable proposal distribution is a challenging task. Several sampling algorithms have been proposed over the past years that take an iterative approach to constructing a proposal distribution. We investigate the applicability of such algorithms by applying them to two realistic and challenging test problems, one in subsurface flow, and one in combustion modeling. More specifically, we implement importance sampling algorithms that iterate over the mean and covariance matrix of Gaussian or multivariate t-proposal distributions. Our implementation leverages massively parallel computers, and we present strategies to initialize the iterations using 'coarse' MCMC runs or Gaussian mixture models.
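As a rough illustration of the kind of iteration described above, the sketch below moment-matches a Gaussian proposal (mean and covariance) to self-normalized importance weights; the target density is an illustrative stand-in, not the subsurface-flow or combustion posteriors from the report, and the embarrassingly parallel evaluation of samples is left out.

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_posterior(theta):
    # Stand-in unnormalized target density (illustrative only).
    d = theta - np.array([1.0, -2.0])
    prec = np.array([[2.0, 0.6], [0.6, 1.0]])
    return -0.5 * d @ prec @ d

def iterative_importance_sampling(log_post, dim=2, n_samples=5000, n_iters=10, seed=0):
    rng = np.random.default_rng(seed)
    mu, cov = np.zeros(dim), 4.0 * np.eye(dim)   # initial proposal, e.g. from a coarse MCMC run
    for _ in range(n_iters):
        samples = rng.multivariate_normal(mu, cov, size=n_samples)
        log_w = np.array([log_post(s) for s in samples]) \
                - multivariate_normal.logpdf(samples, mu, cov)
        w = np.exp(log_w - log_w.max())
        w /= w.sum()                             # self-normalized importance weights
        mu = w @ samples                         # moment-match the proposal mean ...
        centered = samples - mu
        cov = centered.T @ (centered * w[:, None]) + 1e-9 * np.eye(dim)  # ... and covariance
    return samples, w

samples, weights = iterative_importance_sampling(log_posterior)
print("posterior mean estimate:", weights @ samples)   # ~ [1, -2] for this stand-in target
```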
Object-oriented Tools for Distributed Computing
NASA Technical Reports Server (NTRS)
Adler, Richard M.
1993-01-01
Distributed computing systems are proliferating, owing to the availability of powerful, affordable microcomputers and inexpensive communication networks. A critical problem in developing such systems is getting application programs to interact with one another across a computer network. Remote interprogram connectivity is particularly challenging across heterogeneous environments, where applications run on different kinds of computers and operating systems. NetWorks! (trademark) is an innovative software product that provides an object-oriented messaging solution to these problems. This paper describes the design and functionality of NetWorks! and illustrates how it is being used to build complex distributed applications for NASA and in the commercial sector.
Simplified Key Management for Digital Access Control of Information Objects
2016-07-02
Distributed Pervasive Worlds: The Case of Exergames
ERIC Educational Resources Information Center
Laine, Teemu H.; Sedano, Carolina Islas
2015-01-01
Pervasive worlds are computing environments where a virtual world converges with the physical world through context-aware technologies such as sensors. In pervasive worlds, technology is distributed among entities that may be distributed geographically. We explore the concept, possibilities, and challenges of distributed pervasive worlds in a case…
Flórez-Arango, José F; Sriram Iyengar, M; Caicedo, Indira T; Escobar, German
2017-01-01
Development and electronic distribution of clinical practice guidelines is costly and challenging. This poster presents a rapid method to represent existing guidelines in an auditable, computer-executable multimedia format. We used a technology that enables a small number of clinicians to develop, in a short period of time, a substantial amount of computer-executable guidelines without programming.
A Disciplined Architectural Approach to Scaling Data Analysis for Massive, Scientific Data
NASA Astrophysics Data System (ADS)
Crichton, D. J.; Braverman, A. J.; Cinquini, L.; Turmon, M.; Lee, H.; Law, E.
2014-12-01
Data collections across remote sensing and ground-based instruments in astronomy, Earth science, and planetary science are outpacing scientists' ability to analyze them. Furthermore, the distribution, structure, and heterogeneity of the measurements themselves pose challenges that limit the scalability of data analysis using traditional approaches. Methods for developing science data processing pipelines, distribution of scientific datasets, and performing analysis will require innovative approaches that integrate cyber-infrastructure, algorithms, and data into more systematic approaches that can more efficiently compute and reduce data, particularly distributed data. This requires the integration of computer science, machine learning, statistics and domain expertise to identify scalable architectures for data analysis. The size of data returned from Earth Science observing satellites and the magnitude of data from climate model output is predicted to grow into the tens of petabytes, challenging current data analysis paradigms. This same kind of growth is present in astronomy and planetary science data. One of the major challenges in data science and related disciplines is defining new approaches to scaling systems and analysis in order to increase scientific productivity and yield. Specific needs include: 1) identification of optimized system architectures for analyzing massive, distributed data sets; 2) algorithms for systematic analysis of massive data sets in distributed environments; and 3) the development of software infrastructures that are capable of performing massive, distributed data analysis across a comprehensive data science framework. NASA/JPL has begun an initiative in data science to address these challenges. Our goal is to evaluate how scientific productivity can be improved through optimized architectural topologies that identify how to deploy and manage the access, distribution, computation, and reduction of massive, distributed data, while managing the uncertainties of scientific conclusions derived from such capabilities. This talk will provide an overview of JPL's efforts in developing a comprehensive architectural approach to data science.
Integrating autonomous distributed control into a human-centric C4ISR environment
NASA Astrophysics Data System (ADS)
Straub, Jeremy
2017-05-01
This paper considers incorporating autonomy into human-centric Command, Control, Communications, Computers, Intelligence, Surveillance and Reconnaissance (C4ISR) environments. Specifically, it focuses on identifying ways that current autonomy technologies can augment human control and the challenges presented by additive autonomy. Three approaches to this challenge are considered, stemming from prior work in two converging areas. In the first, the problem is approached as augmenting what humans currently do with automation. In the second, humans are treated as actors within a cyber-physical system-of-systems (stemming from robotic distributed computing). A third approach combines elements of both.
Efficient computation of hashes
NASA Astrophysics Data System (ADS)
Lopes, Raul H. C.; Franqueira, Virginia N. L.; Hobson, Peter R.
2014-06-01
The sequential computation of hashes, which lies at the core of many distributed storage systems and is found, for example, in grid services, can hinder efficiency in service quality and even pose security challenges that can only be addressed by the use of parallel hash tree modes. The main contributions of this paper are, first, the identification of several efficiency and security challenges posed by the use of sequential hash computation based on the Merkle-Damgard engine. In addition, alternatives for the parallel computation of hash trees are discussed, and a prototype for a new parallel implementation of the Keccak function, the SHA-3 winner, is introduced.
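As a rough sketch of the tree-mode idea (not the paper's Keccak prototype or an official tree-hashing standard), the example below hashes fixed-size blocks in parallel with SHA3-256 and reduces the digests pairwise to a single root:

```python
import hashlib
from concurrent.futures import ProcessPoolExecutor

def leaf_hash(block: bytes) -> bytes:
    return hashlib.sha3_256(b"\x00" + block).digest()    # domain-separate leaves from nodes

def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha3_256(b"\x01" + left + right).digest()

def merkle_root(data: bytes, block_size: int = 1 << 20) -> bytes:
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]
    # Leaves are independent of one another, so they can be hashed in parallel.
    with ProcessPoolExecutor() as pool:
        level = list(pool.map(leaf_hash, blocks))
    # Reduce pairwise until a single root digest remains.
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])                       # duplicate the last node on odd levels
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

if __name__ == "__main__":
    print(merkle_root(b"example payload " * 100000).hex())
```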
Validation of the thermal challenge problem using Bayesian Belief Networks.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McFarland, John; Swiler, Laura Painton
The thermal challenge problem has been developed at Sandia National Laboratories as a testbed for demonstrating various types of validation approaches and prediction methods. This report discusses one particular methodology to assess the validity of a computational model given experimental data. This methodology is based on Bayesian Belief Networks (BBNs) and can incorporate uncertainty in experimental measurements, in physical quantities, and model uncertainties. The approach uses the prior and posterior distributions of model output to compute a validation metric based on Bayesian hypothesis testing (a Bayes' factor). This report discusses various aspects of the BBN, specifically in the context of the thermal challenge problem. A BBN is developed for a given set of experimental data in a particular experimental configuration. The development of the BBN and the method for "solving" the BBN to develop the posterior distribution of model output through Monte Carlo Markov Chain sampling is discussed in detail. The use of the BBN to compute a Bayes' factor is demonstrated.
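One common way such a metric can be formed from prior and posterior distributions of model output is a density ratio evaluated at the observed measurement; the sketch below uses Gaussian stand-ins and an illustrative observation, not the thermal-challenge quantities or the BBN machinery from the report.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)

# Stand-in samples of the model output pushed through the prior and the posterior
# (in the report these come from Monte Carlo sampling of the BBN).
prior_output = rng.normal(loc=0.0, scale=3.0, size=20000)
posterior_output = rng.normal(loc=1.2, scale=0.8, size=20000)

y_observed = 1.0   # experimental measurement (illustrative value)

# Density ratio at the observed value: how much more plausible the observation
# became after conditioning on the data.
bayes_factor = gaussian_kde(posterior_output)(y_observed)[0] / \
               gaussian_kde(prior_output)(y_observed)[0]
print(f"Bayes factor ~ {bayes_factor:.2f}  (>1 favours the model)")
```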
Optimization of tomographic reconstruction workflows on geographically distributed resources
Bicer, Tekin; Gürsoy, Doǧa; Kettimuthu, Rajkumar; De Carlo, Francesco; Foster, Ian T.
2016-01-01
New technological advancements in synchrotron light sources enable data acquisitions at unprecedented levels. This emergent trend affects not only the size of the generated data but also the need for larger computational resources. Although beamline scientists and users have access to local computational resources, these are typically limited and can result in extended execution times. Applications that are based on iterative processing as in tomographic reconstruction methods require high-performance compute clusters for timely analysis of data. Here, time-sensitive analysis and processing of Advanced Photon Source data on geographically distributed resources are focused on. Two main challenges are considered: (i) modeling of the performance of tomographic reconstruction workflows and (ii) transparent execution of these workflows on distributed resources. For the former, three main stages are considered: (i) data transfer between storage and computational resources, (ii) wait/queue time of reconstruction jobs at compute resources, and (iii) computation of reconstruction tasks. These performance models allow evaluation and estimation of the execution time of any given iterative tomographic reconstruction workflow that runs on geographically distributed resources. For the latter challenge, a workflow management system is built, which can automate the execution of workflows and minimize the user interaction with the underlying infrastructure. The system utilizes Globus to perform secure and efficient data transfer operations. The proposed models and the workflow management system are evaluated by using three high-performance computing and two storage resources, all of which are geographically distributed. Workflows were created with different computational requirements using two compute-intensive tomographic reconstruction algorithms. Experimental evaluation shows that the proposed models and system can be used for selecting the optimum resources, which in turn can provide up to 3.13× speedup (on experimented resources). Moreover, the error rates of the models range between 2.1 and 23.3% (considering workflow execution times), where the accuracy of the model estimations increases with higher computational demands in reconstruction tasks. PMID:27359149
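The three-stage decomposition lends itself to a simple additive estimate of end-to-end workflow time; the sketch below is a hedged illustration with placeholder functional forms and numbers, not the fitted performance models from the paper.

```python
def estimate_workflow_time(data_gb, bandwidth_gbps, queue_wait_s,
                           n_tasks, s_per_task, n_nodes):
    """Illustrative three-stage estimate: transfer + queue + compute.
    The functional forms and parameters are placeholders, not the paper's models."""
    transfer_s = data_gb * 8.0 / bandwidth_gbps           # storage -> compute resource
    compute_s = n_tasks * s_per_task / n_nodes            # assumes near-linear scaling
    return transfer_s + queue_wait_s + compute_s

# Example: 200 GB dataset, 10 Gb/s link, 15 min average queue wait,
# 50,000 reconstruction tasks at 2 s each spread over 128 nodes.
t = estimate_workflow_time(200, 10, 900, 50_000, 2.0, 128)
print(f"estimated end-to-end time: {t / 3600:.2f} h")
```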
NASA Astrophysics Data System (ADS)
Marinos, Alexandros; Briscoe, Gerard
Cloud Computing is rising fast, with its data centres growing at an unprecedented rate. However, this has come with concerns over privacy, efficiency at the expense of resilience, and environmental sustainability, because of the dependence on Cloud vendors such as Google, Amazon and Microsoft. Our response is an alternative model for the Cloud conceptualisation, providing a paradigm for Clouds in the community, utilising networked personal computers for liberation from the centralised vendor model. Community Cloud Computing (C3) offers an alternative architecture, created by combining the Cloud with paradigms from Grid Computing, principles from Digital Ecosystems, and sustainability from Green Computing, while remaining true to the original vision of the Internet. It is more technically challenging than Cloud Computing, having to deal with distributed computing issues, including heterogeneous nodes, varying quality of service, and additional security constraints. However, these are not insurmountable challenges, and with the need to retain control over our digital lives and the potential environmental consequences, it is a challenge we must pursue.
Secure distributed genome analysis for GWAS and sequence comparison computation.
Zhang, Yihua; Blanton, Marina; Almashaqbeh, Ghada
2015-01-01
The rapid increase in the availability and volume of genomic data makes significant advances in biomedical research possible, but sharing of genomic data poses challenges due to the highly sensitive nature of such data. To address the challenges, a competition for secure distributed processing of genomic data was organized by the iDASH research center. In this work we propose techniques for securing computation with real-life genomic data for minor allele frequency and chi-squared statistics computation, as well as distance computation between two genomic sequences, as specified by the iDASH competition tasks. We put forward novel optimizations, including a generalization of a version of mergesort, which might be of independent interest. We provide implementation results of our techniques based on secret sharing that demonstrate practicality of the suggested protocols and also report on performance improvements due to our optimization techniques. This work describes our techniques, findings, and experimental results developed and obtained as part of iDASH 2015 research competition to secure real-life genomic computations and shows feasibility of securely computing with genomic data in practice.
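As a minimal illustration of the secret-sharing idea behind such protocols (the party count, field size, and counts are illustrative; the actual iDASH protocols involve considerably more machinery, such as secure comparison and the mergesort generalization mentioned above), additive shares let sites aggregate allele counts so that only the pooled totals are ever reconstructed:

```python
import secrets

PRIME = 2**61 - 1          # field modulus (illustrative choice)
N_PARTIES = 3

def share(value, n=N_PARTIES):
    """Split an integer into n additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

# Each data holder secret-shares its local (minor allele count, allele total).
local_counts = [(42, 200), (55, 180), (30, 120)]     # illustrative per-site counts
count_shares = [share(c) for c, _ in local_counts]
total_shares = [share(t) for _, t in local_counts]

# Additions are done share-wise, so no party ever sees another site's raw counts;
# only the aggregated sums are reconstructed.
summed_count = [sum(col) % PRIME for col in zip(*count_shares)]
summed_total = [sum(col) % PRIME for col in zip(*total_shares)]

maf = reconstruct(summed_count) / reconstruct(summed_total)
print(f"minor allele frequency: {maf:.3f}")   # (42+55+30)/(200+180+120) = 0.254
```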
BigData and computing challenges in high energy and nuclear physics
NASA Astrophysics Data System (ADS)
Klimentov, A.; Grigorieva, M.; Kiryanov, A.; Zarochentsev, A.
2017-06-01
In this contribution we discuss the various aspects of the computing resource needs of experiments in High Energy and Nuclear Physics, in particular at the Large Hadron Collider. This will evolve in the future when moving from LHC to HL-LHC in ten years from now, when the already exascale levels of data we are processing could increase by a further order of magnitude. The distributed computing environment has been a great success, and the inclusion of new super-computing facilities, cloud computing and volunteer computing for the future is a big challenge, which we are successfully mastering with considerable contributions from many super-computing centres around the world and from academic and commercial cloud providers. We also discuss R&D computing projects started recently in the National Research Center "Kurchatov Institute".
A Grid Infrastructure for Supporting Space-based Science Operations
NASA Technical Reports Server (NTRS)
Bradford, Robert N.; Redman, Sandra H.; McNair, Ann R. (Technical Monitor)
2002-01-01
Emerging technologies for computational grid infrastructures have the potential for revolutionizing the way computers are used in all aspects of our lives. Computational grids are currently being implemented to provide large-scale, dynamic, and secure research and engineering environments based on standards and next-generation reusable software, enabling greater science and engineering productivity through shared resources and distributed computing for less cost than traditional architectures. Combined with the emerging technologies of high-performance networks, grids provide researchers, scientists and engineers the first real opportunity for an effective distributed collaborative environment with access to resources such as computational and storage systems, instruments, and software tools and services for the most computationally challenging applications.
GISpark: A Geospatial Distributed Computing Platform for Spatiotemporal Big Data
NASA Astrophysics Data System (ADS)
Wang, S.; Zhong, E.; Wang, E.; Zhong, Y.; Cai, W.; Li, S.; Gao, S.
2016-12-01
Geospatial data are growing exponentially because of the proliferation of cost effective and ubiquitous positioning technologies such as global remote-sensing satellites and location-based devices. Analyzing large amounts of geospatial data can provide great value for both industrial and scientific applications. Data- and compute-intensive characteristics inherent in geospatial big data increasingly pose great challenges to technologies of data storing, computing and analyzing. Such challenges require a scalable and efficient architecture that can store, query, analyze, and visualize large-scale spatiotemporal data. Therefore, we developed GISpark, a geospatial distributed computing platform for processing large-scale vector, raster and stream data. GISpark is constructed based on the latest virtualized computing infrastructures and distributed computing architecture. OpenStack and Docker are used to build a multi-user hosting cloud computing infrastructure for GISpark. Virtual storage systems such as HDFS, Ceph, and MongoDB are combined and adopted for spatiotemporal data storage management. A Spark-based algorithm framework is developed for efficient parallel computing. Within this framework, SuperMap GIScript and various open-source GIS libraries can be integrated into GISpark. GISpark can also be integrated with scientific computing environments (e.g., Anaconda), interactive computing web applications (e.g., Jupyter notebook), and machine learning tools (e.g., TensorFlow/Orange). The associated geospatial facilities of GISpark, in conjunction with the scientific computing environment, exploratory spatial data analysis tools, and temporal data management and analysis systems, make up a powerful geospatial computing tool. GISpark not only provides spatiotemporal big data processing capacity in the geospatial field, but also provides a spatiotemporal computational model and advanced geospatial visualization tools that deal with other domains related to spatial properties. We tested the performance of the platform based on taxi trajectory analysis. Results suggested that GISpark achieves excellent run time performance in spatiotemporal big data applications.
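A minimal PySpark sketch of the kind of spatiotemporal aggregation GISpark targets, binning taxi GPS points into a regular grid and counting them; the HDFS path, column names, and cell size are assumptions for illustration, and GISpark's own APIs are not shown.

```python
# Requires pyspark to be installed and a running Spark environment.
from pyspark.sql import SparkSession

CELL = 0.01   # grid cell size in degrees (illustrative)

spark = SparkSession.builder.appName("taxi-grid-count").getOrCreate()

# Hypothetical input: CSV of GPS points with 'lon' and 'lat' columns.
points = (spark.read.csv("hdfs:///data/taxi_trajectories.csv", header=True)
          .selectExpr("cast(lon as double) as lon", "cast(lat as double) as lat")
          .dropna())

# Map each point to its grid cell and count points per cell.
counts = (points.rdd
          .map(lambda r: ((int(r.lon / CELL), int(r.lat / CELL)), 1))
          .reduceByKey(lambda a, b: a + b))

for cell, n in counts.top(10, key=lambda kv: kv[1]):   # ten busiest cells
    print(cell, n)

spark.stop()
```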
Peek, N; Holmes, J H; Sun, J
2014-08-15
To review technical and methodological challenges for big data research in biomedicine and health. We discuss sources of big datasets, survey infrastructures for big data storage and big data processing, and describe the main challenges that arise when analyzing big data. The life and biomedical sciences are massively contributing to the big data revolution through secondary use of data that were collected during routine care and through new data sources such as social media. Efficient processing of big datasets is typically achieved by distributing computation over a cluster of computers. Data analysts should be aware of pitfalls related to big data such as bias in routine care data and the risk of false-positive findings in high-dimensional datasets. The major challenge for the near future is to transform analytical methods that are used in the biomedical and health domain, to fit the distributed storage and processing model that is required to handle big data, while ensuring confidentiality of the data being analyzed.
Generic algorithms for high performance scalable geocomputing
NASA Astrophysics Data System (ADS)
de Jong, Kor; Schmitz, Oliver; Karssenberg, Derek
2016-04-01
During the last decade, the characteristics of computing hardware have changed a lot. For example, instead of a single general purpose CPU core, personal computers nowadays contain multiple cores per CPU and often general purpose accelerators, like GPUs. Additionally, compute nodes are often grouped together to form clusters or a supercomputer, providing enormous amounts of compute power. For existing earth simulation models to be able to use modern hardware platforms, their compute intensive parts must be rewritten. This can be a major undertaking and may involve many technical challenges. Compute tasks must be distributed over CPU cores, offloaded to hardware accelerators, or distributed to different compute nodes. And ideally, all of this should be done in such a way that the compute task scales well with the hardware resources. This presents two challenges: 1) how to make good use of all the compute resources and 2) how to make these compute resources available for developers of simulation models, who may not (want to) have the required technical background for distributing compute tasks. The first challenge requires the use of specialized technology (e.g.: threads, OpenMP, MPI, OpenCL, CUDA). The second challenge requires the abstraction of the logic handling the distribution of compute tasks from the model-specific logic, hiding the technical details from the model developer. To assist the model developer, we are developing a C++ software library (called Fern) containing algorithms that can use all CPU cores available in a single compute node (distributing tasks over multiple compute nodes will be done at a later stage). The algorithms are grid-based (finite difference) and include local and spatial operations such as convolution filters. The algorithms handle distribution of the compute tasks to CPU cores internally. In the resulting model the low-level details of how this is done are separated from the model-specific logic representing the modeled system. This contrasts with practices in which code for distributing compute tasks is mixed with model-specific code, and results in a more maintainable model. For flexibility and efficiency, the algorithms are configurable at compile-time with respect to the following aspects: data type, value type, no-data handling, input value domain handling, and output value range handling. This makes the algorithms usable in very different contexts, without the need for making intrusive changes to existing models when using them. Applications that benefit from using the Fern library include the construction of forward simulation models in (global) hydrology (e.g. PCR-GLOBWB (Van Beek et al. 2011)), ecology, geomorphology, or land use change (e.g. PLUC (Verstegen et al. 2014)) and manipulation of hyper-resolution land surface data such as digital elevation models and remote sensing data. Using the Fern library, we have also created an add-on to the PCRaster Python Framework (Karssenberg et al. 2010) allowing its users to speed up their spatio-temporal models, sometimes by changing just a single line of Python code in their model. In our presentation we will give an overview of the design of the algorithms, providing examples of different contexts where they can be used to replace existing sequential algorithms, including the PCRaster environmental modeling software (www.pcraster.eu). We will show how the algorithms can be configured to behave differently when necessary. References: Karssenberg, D., Schmitz, O., Salamon, P., De Jong, K. and Bierkens, M.F.P., 2010. A software framework for construction of process-based stochastic spatio-temporal models and data assimilation. Environmental Modelling & Software, 25, pp. 489-502. Van Beek, L. P. H., Y. Wada, and M. F. P. Bierkens, 2011. Global monthly water stress: 1. Water balance and water availability. Water Resources Research, 47. Verstegen, J. A., D. Karssenberg, F. van der Hilst, and A. P. C. Faaij, 2014. Identifying a land use change cellular automaton by Bayesian data assimilation. Environmental Modelling & Software, 53, pp. 121-136.
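Fern itself is a C++ library; as a rough Python illustration of the underlying idea, distributing a grid-based local operation over CPU cores while keeping the halo bookkeeping out of the model code, consider the sketch below (the focal-mean choice, block count, and names are illustrative, not Fern's API).

```python
import numpy as np
from scipy import ndimage
from concurrent.futures import ProcessPoolExecutor

def focal_mean_block(args):
    """3x3 focal mean for one block of rows, including a one-row halo on each side."""
    block, has_top_halo, has_bottom_halo = args
    out = ndimage.uniform_filter(block, size=3, mode="nearest")
    top = 1 if has_top_halo else 0
    bottom = block.shape[0] - (1 if has_bottom_halo else 0)
    return out[top:bottom]                       # drop the halo rows again

def parallel_focal_mean(raster, n_blocks=4):
    bounds = np.linspace(0, raster.shape[0], n_blocks + 1, dtype=int)
    tasks = []
    for i in range(n_blocks):
        lo = max(bounds[i] - 1, 0)               # one halo row above the block
        hi = min(bounds[i + 1] + 1, raster.shape[0])   # one halo row below
        tasks.append((raster[lo:hi], bounds[i] > 0, bounds[i + 1] < raster.shape[0]))
    with ProcessPoolExecutor() as pool:          # each block is processed on its own core
        parts = list(pool.map(focal_mean_block, tasks))
    return np.vstack(parts)

if __name__ == "__main__":
    dem = np.random.default_rng(0).random((1000, 1000))
    smoothed = parallel_focal_mean(dem)
    print(smoothed.shape)    # (1000, 1000)
```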
The Montage architecture for grid-enabled science processing of large, distributed datasets
NASA Technical Reports Server (NTRS)
Jacob, Joseph C.; Katz, Daniel S .; Prince, Thomas; Berriman, Bruce G.; Good, John C.; Laity, Anastasia C.; Deelman, Ewa; Singh, Gurmeet; Su, Mei-Hui
2004-01-01
Montage is an Earth Science Technology Office (ESTO) Computational Technologies (CT) Round III Grand Challenge investigation to deploy a portable, compute-intensive, custom astronomical image mosaicking service for the National Virtual Observatory (NVO). Although Montage is developing a compute- and data-intensive service for the astronomy community, we are also helping to address a problem that spans both Earth and Space science, namely how to efficiently access and process multi-terabyte, distributed datasets. In both communities, the datasets are massive, and are stored in distributed archives that are, in most cases, remote from the available computational resources. Therefore, state-of-the-art computational grid technologies are a key element of the Montage portal architecture. This paper describes the aspects of the Montage design that are applicable to both the Earth and Space science communities.
Support for Debugging Automatically Parallelized Programs
NASA Technical Reports Server (NTRS)
Hood, Robert; Jost, Gabriele; Biegel, Bryan (Technical Monitor)
2001-01-01
This viewgraph presentation provides information on the technical aspects of debugging computer code that has been automatically converted for use in a parallel computing system. Shared memory parallelization and distributed memory parallelization entail separate and distinct challenges for a debugging program. A prototype system has been developed which integrates various tools for the debugging of automatically parallelized programs including the CAPTools Database which provides variable definition information across subroutines as well as array distribution information.
A novel strategy for load balancing of distributed medical applications.
Logeswaran, Rajasvaran; Chen, Li-Choo
2012-04-01
Current trends in medicine, specifically in the electronic handling of medical applications, ranging from digital imaging, paperless hospital administration and electronic medical records, telemedicine, to computer-aided diagnosis, create a burden on the network. Distributed Service Architectures, such as Intelligent Network (IN), Telecommunication Information Networking Architecture (TINA) and Open Service Access (OSA), are able to meet this new challenge. Distribution enables computational tasks to be spread among multiple processors; hence, performance is an important issue. This paper proposes a novel approach to load balancing, the Random Sender Initiated Algorithm, for distribution of tasks among several nodes sharing the same computational object (CO) instances in Distributed Service Architectures. Simulations illustrate that the proposed algorithm produces better network performance than the benchmark load balancing algorithms, the Random Node Selection Algorithm and the Shortest Queue Algorithm, especially under medium and heavily loaded conditions.
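A much-simplified sketch of a sender-initiated, random-target policy (omitting the published algorithm's details such as probe limits and acceptance checks), shown mainly to contrast it with shortest-queue selection:

```python
import random
from collections import deque

THRESHOLD = 5   # illustrative queue-length threshold for declaring a node "loaded"

class Node:
    def __init__(self, name):
        self.name = name
        self.queue = deque()

def dispatch(task, origin, nodes, rng=random):
    """Simplified sender-initiated policy: an overloaded origin forwards the task
    to a randomly chosen peer instead of probing every queue for the shortest one."""
    if len(origin.queue) < THRESHOLD:
        origin.queue.append(task)
        return origin
    target = rng.choice([n for n in nodes if n is not origin])
    target.queue.append(task)
    return target

if __name__ == "__main__":
    random.seed(1)
    nodes = [Node(f"node-{i}") for i in range(4)]
    for t in range(40):                          # a burst of requests arriving at node-0
        dispatch(f"task-{t}", origin=nodes[0], nodes=nodes)
    print({n.name: len(n.queue) for n in nodes})
```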
NASA Astrophysics Data System (ADS)
Bednar, Earl; Drager, Steven L.
2007-04-01
Quantum information processing's objective is to utilize revolutionary computing capability based on harnessing the paradigm shift offered by quantum computing to solve classically hard and computationally challenging problems. Some of our computationally challenging problems of interest include: the capability for rapid image processing, rapid optimization of logistics, protecting information, secure distributed simulation, and massively parallel computation. Currently, one important problem with quantum information processing is that the implementation of quantum computers is difficult to realize due to poor scalability and the prevalence of errors. Therefore, we have supported the development of Quantum eXpress and QuIDD Pro, two quantum computer simulators running on classical computers for the development and testing of new quantum algorithms and processes. This paper examines the different methods used by these two quantum computing simulators. It reviews both simulators, highlighting each simulator's background, interface, and special features. It also demonstrates the implementation of current quantum algorithms on each simulator. It concludes with summary comments on both simulators.
Comparing the Performance of Two Dynamic Load Distribution Methods
NASA Technical Reports Server (NTRS)
Kale, L. V.
1987-01-01
Parallel processing of symbolic computations on a message-passing multi-processor presents one challenge: to effectively utilize the available processors, the load must be distributed uniformly to all the processors. However, the structure of these computations cannot be predicted in advance, so static scheduling methods are not applicable. In this paper, we compare the performance of two dynamic, distributed load balancing methods with extensive simulation studies. The two schemes are: the Contracting Within a Neighborhood (CWN) scheme proposed by us, and the Gradient Model proposed by Lin and Keller. We conclude that although simpler, the CWN is significantly more effective at distributing the work than the Gradient Model.
NASA Technical Reports Server (NTRS)
Mckay, C. W.; Bown, R. L.
1985-01-01
The space station data management system involves networks of computing resources that must work cooperatively and reliably over an indefinite life span. This program requires a long schedule of modular growth and an even longer period of maintenance and operation. The development and operation of space station computing resources will involve a spectrum of systems and software life cycle activities distributed across a variety of hosts, an integration, verification, and validation host with test bed, and distributed targets. The requirement for the early establishment and use of an appropriate Computer Systems and Software Engineering Support Environment is identified. This environment will support the Research and Development Productivity challenges presented by the space station computing system.
TeleMed: Wide-area, secure, collaborative object computing with Java and CORBA for healthcare
DOE Office of Scientific and Technical Information (OSTI.GOV)
Forslund, D.W.; George, J.E.; Gavrilov, E.M.
1998-12-31
Distributed computing is becoming commonplace in a variety of industries, with healthcare being a particularly important one for society. The authors describe the development and deployment of TeleMed in a few healthcare domains. TeleMed is a 100% Java distributed application built on CORBA and OMG standards, enabling the collaboration on the treatment of chronically ill patients in a secure manner over the Internet. These standards enable other systems to work interoperably with TeleMed and provide transparent access to high performance distributed computing to the healthcare domain. The goal of wide scale integration of electronic medical records is a grand-challenge scale problem of global proportions with far-reaching social benefits.
ERIC Educational Resources Information Center
Scott, William L.; Denton, Ryan E.; Marrs, Kathleen A.; Durrant, Jacob D.; Samaritoni, J. Geno; Abraham, Milata M.; Brown, Stephen P.; Carnahan, Jon M.; Fischer, Lindsey G.; Glos, Courtney E.; Sempsrott, Peter J.; O'Donnell, Martin J.
2015-01-01
The Distributed Drug Discovery (D3) program trains students in three drug discovery disciplines (synthesis, computational analysis, and biological screening) while addressing the important challenge of discovering drug leads for neglected diseases. This article focuses on implementation of the synthesis component in the second-semester…
Schopf, Jennifer M.; Nitzberg, Bill
2002-01-01
The design and implementation of a national computing system and data grid has become a reachable goal from both the computer science and computational science point of view. A distributed infrastructure capable of sophisticated computational functions can bring many benefits to scientific work, but poses many challenges, both technical and socio-political. Technical challenges include having basic software tools, higher-level services, functioning and pervasive security, and standards, while socio-political issues include building a user community, adding incentives for sites to be part of a user-centric environment, and educating funding sources about the needs of this community. This paper details the areas relating to Grid research that we feel still need to be addressed to fully leverage the advantages of the Grid.
Selecting summary statistics in approximate Bayesian computation for calibrating stochastic models.
Burr, Tom; Skurikhin, Alexei
2013-01-01
Approximate Bayesian computation (ABC) is an approach for using measurement data to calibrate stochastic computer models, which are common in biology applications. ABC is becoming the "go-to" option when the data and/or parameter dimension is large because it relies on user-chosen summary statistics rather than the full data and is therefore computationally feasible. One technical challenge with ABC is that the quality of the approximation to the posterior distribution of model parameters depends on the user-chosen summary statistics. In this paper, the user requirement to choose effective summary statistics in order to accurately estimate the posterior distribution of model parameters is investigated and illustrated by example, using a model and corresponding real data of mitochondrial DNA population dynamics. We show that for some choices of summary statistics, the posterior distribution of model parameters is closely approximated and for other choices of summary statistics, the posterior distribution is not closely approximated. A strategy to choose effective summary statistics is suggested in cases where the stochastic computer model can be run at many trial parameter settings, as in the example.
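A minimal ABC rejection sketch showing where the user-chosen summary statistics enter; the stochastic model here is an illustrative stand-in (i.i.d. exponential waiting times), not the mitochondrial DNA population model from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate(theta, n=200):
    """Stand-in stochastic model: i.i.d. exponential waiting times with rate theta."""
    return rng.exponential(1.0 / theta, size=n)

def summaries(x):
    # The user-chosen statistics; ABC accuracy hinges on this choice.
    return np.array([x.mean(), np.median(x)])

observed = simulate(theta=2.0)      # pretend this is the real data
s_obs = summaries(observed)

def abc_rejection(n_draws=20_000, tol=0.05):
    prior = rng.uniform(0.1, 10.0, size=n_draws)     # prior over theta
    accepted = []
    for theta in prior:
        dist = np.linalg.norm(summaries(simulate(theta)) - s_obs)
        if dist < tol:                               # keep draws whose summaries match
            accepted.append(theta)
    return np.array(accepted)

posterior = abc_rejection()
print(f"accepted {posterior.size} draws, posterior mean ~ {posterior.mean():.2f}")
```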
AGIS: Evolution of Distributed Computing information system for ATLAS
NASA Astrophysics Data System (ADS)
Anisenkov, A.; Di Girolamo, A.; Alandes, M.; Karavakis, E.
2015-12-01
ATLAS, a particle physics experiment at the Large Hadron Collider at CERN, produces petabytes of data annually through simulation production and tens of petabytes of data per year from the detector itself. The ATLAS computing model embraces the Grid paradigm and a high degree of decentralization of computing resources in order to meet the ATLAS requirements of petabyte-scale data operations. It has evolved after the first period of LHC data taking (Run-1) in order to cope with the new challenges of the upcoming Run-2. In this paper we describe the evolution and recent developments of the ATLAS Grid Information System (AGIS), developed in order to integrate configuration and status information about resources, services and topology of the computing infrastructure used by the ATLAS Distributed Computing applications and services.
ERIC Educational Resources Information Center
Fells, Stephanie
2012-01-01
The design of online or distributed problem-based learning (dPBL) is a nascent, complex design problem. Instructional designers are challenged to effectively unite the constructivist principles of problem-based learning (PBL) with appropriate media in order to create quality dPBL environments. While computer-mediated communication (CMC) tools and…
BelleII@home: Integrate volunteer computing resources into DIRAC in a secure way
NASA Astrophysics Data System (ADS)
Wu, Wenjing; Hara, Takanori; Miyake, Hideki; Ueda, Ikuo; Kan, Wenxiao; Urquijo, Phillip
2017-10-01
The exploitation of volunteer computing resources has become a popular practice in the HEP computing community because of the huge amount of potential computing power it provides. In recent HEP experiments, the grid middleware has been used to organize the services and the resources; however, it relies heavily on X.509 authentication, which is at odds with the untrusted nature of volunteer computing resources. Therefore, one big challenge in utilizing volunteer computing resources is how to integrate them into the grid middleware in a secure way. The DIRAC interware, which is commonly used as the major component of the grid computing infrastructure for several HEP experiments, poses an even bigger challenge to this paradox, as its pilot is more closely coupled with operations requiring X.509 authentication than the pilot implementations in its peer grid interware. The Belle II experiment is a B-factory experiment at KEK, and it uses DIRAC for its distributed computing. In the BelleII@home project, in order to integrate the volunteer computing resources into the Belle II distributed computing platform in a secure way, we adopted a new approach that detaches the payload execution from the Belle II DIRAC pilot, a customized pilot that pulls and processes jobs from the Belle II distributed computing platform, so that the payload can run on volunteer computers without requiring any X.509 authentication. In this approach we developed a gateway service running on a trusted server which handles all the operations requiring X.509 authentication. So far, we have developed and deployed the prototype of BelleII@home and tested its full workflow, which proves the feasibility of this approach. This approach can also be applied on HPC systems whose worker nodes do not have outbound connectivity to interact with the DIRAC system in general.
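A minimal sketch of the volunteer-side half of this detachment: the payload is pulled from, and reported back to, the trusted gateway over plain HTTPS, so no X.509 credential ever resides on the volunteer host. The endpoint URLs and JSON fields are hypothetical placeholders, not the actual BelleII@home interfaces.

```python
# Volunteer-side sketch only; the gateway (the component that holds the X.509
# proxy and talks to DIRAC) runs elsewhere on a trusted server and is not shown.
import json
import subprocess
import urllib.request

GATEWAY = "https://gateway.example.org"   # hypothetical trusted gateway URL

def fetch_payload():
    with urllib.request.urlopen(f"{GATEWAY}/next-payload") as resp:
        return json.load(resp)            # e.g. {"job_id": ..., "command": [...]}

def run_and_report(payload):
    result = subprocess.run(payload["command"], capture_output=True, text=True)
    report = json.dumps({"job_id": payload["job_id"],
                         "returncode": result.returncode}).encode()
    req = urllib.request.Request(f"{GATEWAY}/report", data=report,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

if __name__ == "__main__":
    run_and_report(fetch_payload())
```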
Reduction of collisional-radiative models for transient, atomic plasmas
NASA Astrophysics Data System (ADS)
Abrantes, Richard June; Karagozian, Ann; Bilyeu, David; Le, Hai
2017-10-01
Interactions between plasmas and any radiation field, whether from lasers or plasma emissions, introduce many computational challenges. One of these computational challenges involves resolving the atomic physics, which can influence other physical phenomena in the radiated system. In this work, a collisional-radiative (CR) model with reduction capabilities is developed to capture the atomic physics at a reduced computational cost. Although the model is designed with any element in mind, it is currently supplemented by LANL's argon database, which includes the relevant collisional and radiative processes for all of the ionic stages. Using the detailed data set as the true solution, reduction mechanisms in the form of Boltzmann grouping, uniform grouping, and quasi-steady-state (QSS) are implemented and compared against it. Effects on the transient plasma stemming from the grouping methods are compared. Distribution A: Approved for public release; unlimited distribution, PA (Public Affairs) Clearance Number 17449. This work was supported by the Air Force Office of Scientific Research (AFOSR), Grant Number 17RQCOR463 (Dr. Jason Marshall).
Survey of MapReduce frame operation in bioinformatics.
Zou, Quan; Li, Xu-Bin; Jiang, Wen-Rui; Lin, Zi-Yu; Li, Gui-Lin; Chen, Ke
2014-07-01
Bioinformatics is challenged by the fact that traditional analysis tools have difficulty processing large-scale data from high-throughput sequencing. The open source Apache Hadoop project, which adopts the MapReduce framework and a distributed file system, has recently given bioinformatics researchers an opportunity to achieve scalable, efficient and reliable computing performance on Linux clusters and on cloud computing services. In this article, we present MapReduce framework-based applications that can be employed in next-generation sequencing and other biological domains. In addition, we discuss the challenges faced by this field as well as future work on parallel computing in bioinformatics. © The Author 2013. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
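As a concrete illustration of the MapReduce pattern discussed above, the sketch below shows a Hadoop Streaming style k-mer counting job in Python: the mapper emits (k-mer, 1) pairs from sequencing reads and the reducer sums the counts per k-mer. The file name, k value and command line are assumptions for local testing only.

```python
# Hedged sketch of a Hadoop Streaming style job for counting k-mers in
# sequencing reads. Run locally (illustrative file name, k=8) as:
#   cat reads.txt | python kmer_mr.py map | sort | python kmer_mr.py reduce
import sys

K = 8


def mapper(stream):
    # Emit (k-mer, 1) for every k-mer in every read (one read per line).
    for line in stream:
        read = line.strip().upper()
        for i in range(len(read) - K + 1):
            print(f"{read[i:i + K]}\t1")


def reducer(stream):
    # Input arrives sorted by key, so counts for one k-mer are contiguous.
    current, count = None, 0
    for line in stream:
        kmer, n = line.rstrip("\n").split("\t")
        if kmer != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = kmer, 0
        count += int(n)
    if current is not None:
        print(f"{current}\t{count}")


if __name__ == "__main__":
    mapper(sys.stdin) if sys.argv[1] == "map" else reducer(sys.stdin)
```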
Cloud Computing and Its Applications in GIS
NASA Astrophysics Data System (ADS)
Kang, Cao
2011-12-01
Cloud computing is a novel computing paradigm that offers highly scalable and highly available distributed computing services. The objectives of this research are to: 1. analyze and understand cloud computing and its potential for GIS; 2. discover the feasibilities of migrating truly spatial GIS algorithms to distributed computing infrastructures; 3. explore a solution to host and serve large volumes of raster GIS data efficiently and speedily. These objectives thus form the basis for three professional articles. The first article is entitled "Cloud Computing and Its Applications in GIS". This paper introduces the concept, structure, and features of cloud computing. Features of cloud computing such as scalability, parallelization, and high availability make it a very capable computing paradigm. Unlike High Performance Computing (HPC), cloud computing uses inexpensive commodity computers. The uniform administration systems in cloud computing make it easier to use than GRID computing. Potential advantages of cloud-based GIS systems such as lower barrier to entry are consequently presented. Three cloud-based GIS system architectures are proposed: public cloud- based GIS systems, private cloud-based GIS systems and hybrid cloud-based GIS systems. Public cloud-based GIS systems provide the lowest entry barriers for users among these three architectures, but their advantages are offset by data security and privacy related issues. Private cloud-based GIS systems provide the best data protection, though they have the highest entry barriers. Hybrid cloud-based GIS systems provide a compromise between these extremes. The second article is entitled "A cloud computing algorithm for the calculation of Euclidian distance for raster GIS". Euclidean distance is a truly spatial GIS algorithm. Classical algorithms such as the pushbroom and growth ring techniques require computational propagation through the entire raster image, which makes it incompatible with the distributed nature of cloud computing. This paper presents a parallel Euclidean distance algorithm that works seamlessly with the distributed nature of cloud computing infrastructures. The mechanism of this algorithm is to subdivide a raster image into sub-images and wrap them with a one pixel deep edge layer of individually computed distance information. Each sub-image is then processed by a separate node, after which the resulting sub-images are reassembled into the final output. It is shown that while any rectangular sub-image shape can be used, those approximating squares are computationally optimal. This study also serves as a demonstration of this subdivide and layer-wrap strategy, which would enable the migration of many truly spatial GIS algorithms to cloud computing infrastructures. However, this research also indicates that certain spatial GIS algorithms such as cost distance cannot be migrated by adopting this mechanism, which presents significant challenges for the development of cloud-based GIS systems. The third article is entitled "A Distributed Storage Schema for Cloud Computing based Raster GIS Systems". This paper proposes a NoSQL Database Management System (NDDBMS) based raster GIS data storage schema. NDDBMS has good scalability and is able to use distributed commodity computers, which make it superior to Relational Database Management Systems (RDBMS) in a cloud computing environment. In order to provide optimized data service performance, the proposed storage schema analyzes the nature of commonly used raster GIS data sets. 
It discriminates two categories of commonly used data sets, and then designs corresponding data storage models for both categories. As a result, the proposed storage schema is capable of hosting and serving enormous volumes of raster GIS data speedily and efficiently on cloud computing infrastructures. In addition, the scheme also takes advantage of the data compression characteristics of Quadtrees, thus promoting efficient data storage. Through this assessment of cloud computing technology, the exploration of the challenges and solutions to the migration of GIS algorithms to cloud computing infrastructures, and the examination of strategies for serving large amounts of GIS data in a cloud computing infrastructure, this dissertation lends support to the feasibility of building a cloud-based GIS system. However, there are still challenges that need to be addressed before a full-scale functional cloud-based GIS system can be successfully implemented. (Abstract shortened by UMI.)
Silberstein, M.; Tzemach, A.; Dovgolevsky, N.; Fishelson, M.; Schuster, A.; Geiger, D.
2006-01-01
Computation of LOD scores is a valuable tool for mapping disease-susceptibility genes in the study of Mendelian and complex diseases. However, computation of exact multipoint likelihoods of large inbred pedigrees with extensive missing data is often beyond the capabilities of a single computer. We present a distributed system called “SUPERLINK-ONLINE,” for the computation of multipoint LOD scores of large inbred pedigrees. It achieves high performance via the efficient parallelization of the algorithms in SUPERLINK, a state-of-the-art serial program for these tasks, and through the use of the idle cycles of thousands of personal computers. The main algorithmic challenge has been to efficiently split a large task for distributed execution in a highly dynamic, nondedicated running environment. Notably, the system is available online, which allows computationally intensive analyses to be performed with no need for either the installation of software or the maintenance of a complicated distributed environment. As the system was being developed, it was extensively tested by collaborating medical centers worldwide on a variety of real data sets, some of which are presented in this article. PMID:16685644
Final Report for DOE Award ER25756
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kesselman, Carl
2014-11-17
The SciDAC-funded Center for Enabling Distributed Petascale Science (CEDPS) was established to address technical challenges that arise due to the frequent geographic distribution of data producers (in particular, supercomputers and scientific instruments) and data consumers (people and computers) within the DOE laboratory system. Its goal is to produce technical innovations that meet DOE end-user needs for (a) rapid and dependable placement of large quantities of data within a distributed high-performance environment, and (b) the convenient construction of scalable science services that provide for the reliable and high-performance processing of computation and data analysis requests from many remote clients. The Center is also addressing (c) the important problem of troubleshooting these and other related ultra-high-performance distributed activities from the perspective of both performance and functionality.
An adaptive importance sampling algorithm for Bayesian inversion with multimodal distributions
Li, Weixuan; Lin, Guang
2015-03-21
Parametric uncertainties are encountered in the simulations of many physical systems, and may be reduced by an inverse modeling procedure that calibrates the simulation results to observations on the real system being simulated. Following Bayes' rule, a general approach for inverse modeling problems is to sample from the posterior distribution of the uncertain model parameters given the observations. However, the large number of repetitive forward simulations required in the sampling process could pose a prohibitive computational burden. This difficulty is particularly challenging when the posterior is multimodal. We present in this paper an adaptive importance sampling algorithm to tackle these challenges. Two essential ingredients of the algorithm are: 1) a Gaussian mixture (GM) model adaptively constructed as the proposal distribution to approximate the possibly multimodal target posterior, and 2) a mixture of polynomial chaos (PC) expansions, built according to the GM proposal, as a surrogate model to alleviate the computational burden caused by computationally demanding forward model evaluations. In three illustrative examples, the proposed adaptive importance sampling algorithm demonstrates its capabilities of automatically finding a GM proposal with an appropriate number of modes for the specific problem under study, and obtaining a sample that accurately and efficiently represents the posterior with a limited number of forward simulations.
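For readers unfamiliar with the core mechanism, the following minimal sketch shows plain importance sampling with a fixed two-component Gaussian mixture proposal on a bimodal one-dimensional target. The target density, proposal parameters and sample size are made up for illustration; the adaptive construction of the GM proposal and the polynomial chaos surrogate described in the abstract are not reproduced here.

```python
# Minimal importance-sampling sketch with a Gaussian mixture (GM) proposal for
# a bimodal 1-D target density. Target, proposal parameters and sample size
# are illustrative; the adaptive GM fitting and the polynomial-chaos surrogate
# from the paper are not reproduced here.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)


def target_pdf(x):
    """Bimodal 'posterior' used only for demonstration."""
    return 0.6 * stats.norm.pdf(x, loc=-2.0, scale=0.5) + \
           0.4 * stats.norm.pdf(x, loc=3.0, scale=0.8)


# Two-component GM proposal roughly covering both modes.
weights = np.array([0.5, 0.5])
means = np.array([-2.0, 3.0])
stds = np.array([1.0, 1.0])

n = 20000
component = rng.choice(2, size=n, p=weights)
samples = rng.normal(means[component], stds[component])

proposal_pdf = (weights[None, :] *
                stats.norm.pdf(samples[:, None], means[None, :], stds[None, :])).sum(axis=1)
w = target_pdf(samples) / proposal_pdf          # importance weights
w /= w.sum()                                    # self-normalize

posterior_mean = np.sum(w * samples)
ess = 1.0 / np.sum(w ** 2)                      # effective sample size
print(f"posterior mean ~ {posterior_mean:.3f}, ESS ~ {ess:.0f}")
```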
Parallel task processing of very large datasets
NASA Astrophysics Data System (ADS)
Romig, Phillip Richardson, III
This research concerns the use of distributed computer technologies for the analysis and management of very large datasets. Improvements in sensor technology, an emphasis on global change research, and greater access to data warehouses all increase the number of non-traditional users of remotely sensed data. We present a framework for distributed solutions to the challenges of datasets which exceed the online storage capacity of individual workstations. This framework, called parallel task processing (PTP), incorporates both the task- and data-level parallelism exemplified by many image processing operations. An implementation based on the principles of PTP, called Tricky, is also presented. Additionally, we describe the challenges and practical issues in modeling the performance of parallel task processing with large datasets. We present a mechanism for estimating the running time of each unit of work within a system and an algorithm that uses these estimates to simulate the execution environment and produce estimated runtimes. Finally, we describe and discuss experimental results which validate the design. Specifically, the system (a) is able to perform computation on datasets which exceed the capacity of any one disk, (b) provides reduction of overall computation time as a result of the task distribution even with the additional cost of data transfer and management, and (c) in the simulation mode accurately predicts the performance of the real execution environment.
An adaptive importance sampling algorithm for Bayesian inversion with multimodal distributions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Weixuan; Lin, Guang, E-mail: guanglin@purdue.edu
2015-08-01
Parametric uncertainties are encountered in the simulations of many physical systems, and may be reduced by an inverse modeling procedure that calibrates the simulation results to observations on the real system being simulated. Following Bayes' rule, a general approach for inverse modeling problems is to sample from the posterior distribution of the uncertain model parameters given the observations. However, the large number of repetitive forward simulations required in the sampling process could pose a prohibitive computational burden. This difficulty is particularly challenging when the posterior is multimodal. We present in this paper an adaptive importance sampling algorithm to tackle these challenges. Two essential ingredients of the algorithm are: 1) a Gaussian mixture (GM) model adaptively constructed as the proposal distribution to approximate the possibly multimodal target posterior, and 2) a mixture of polynomial chaos (PC) expansions, built according to the GM proposal, as a surrogate model to alleviate the computational burden caused by computationally demanding forward model evaluations. In three illustrative examples, the proposed adaptive importance sampling algorithm demonstrates its capabilities of automatically finding a GM proposal with an appropriate number of modes for the specific problem under study, and obtaining a sample that accurately and efficiently represents the posterior with a limited number of forward simulations.
Gallicchio, Emilio; Deng, Nanjie; He, Peng; Wickstrom, Lauren; Perryman, Alexander L.; Santiago, Daniel N.; Forli, Stefano; Olson, Arthur J.; Levy, Ronald M.
2014-01-01
As part of the SAMPL4 blind challenge, filtered AutoDock Vina ligand docking predictions and large-scale binding energy distribution analysis method binding free energy calculations have been applied to the virtual screening of a focused library of candidate binders to the LEDGF site of the HIV integrase protein. The computational protocol leveraged docking and high level atomistic models to improve enrichment. The enrichment factor of our blind predictions ranked best among all of the computational submissions, and second best overall. This work represents, to our knowledge, the first example of the application of an all-atom physics-based binding free energy model to large scale virtual screening. A total of 285 parallel Hamiltonian replica exchange molecular dynamics absolute protein-ligand binding free energy simulations were conducted starting from docked poses. The setup of the simulations was fully automated, calculations were distributed on multiple computing resources and were completed in a 6-week period. The accuracy of the docked poses and the inclusion of intramolecular strain and entropic losses in the binding free energy estimates were the major factors behind the success of the method. Lack of sufficient time and computing resources to investigate additional protonation states of the ligands was a major cause of mispredictions. The experiment demonstrated the applicability of binding free energy modeling to improve hit rates in challenging virtual screening of focused ligand libraries during lead optimization. PMID:24504704
Bringing the CMS distributed computing system into scalable operations
NASA Astrophysics Data System (ADS)
Belforte, S.; Fanfani, A.; Fisk, I.; Flix, J.; Hernández, J. M.; Kress, T.; Letts, J.; Magini, N.; Miccio, V.; Sciabà, A.
2010-04-01
Establishing efficient and scalable operations of the CMS distributed computing system critically relies on the proper integration, commissioning and scale testing of the data and workload management tools, the various computing workflows and the underlying computing infrastructure, located at more than 50 computing centres worldwide and interconnected by the Worldwide LHC Computing Grid. Computing challenges periodically undertaken by CMS in the past years with increasing scale and complexity have revealed the need for a sustained effort on computing integration and commissioning activities. The Processing and Data Access (PADA) Task Force was established at the beginning of 2008 within the CMS Computing Program with the mandate of validating the infrastructure for organized processing and user analysis including the sites and the workload and data management tools, validating the distributed production system by performing functionality, reliability and scale tests, helping sites to commission, configure and optimize the networking and storage through scale testing data transfers and data processing, and improving the efficiency of accessing data across the CMS computing system from global transfers to local access. This contribution reports on the tools and procedures developed by CMS for computing commissioning and scale testing as well as the improvements accomplished towards efficient, reliable and scalable computing operations. The activities include the development and operation of load generators for job submission and data transfers with the aim of stressing the experiment and Grid data management and workload management systems, site commissioning procedures and tools to monitor and improve site availability and reliability, as well as activities targeted to the commissioning of the distributed production, user analysis and monitoring systems.
Enterprise Cloud Architecture for Chinese Ministry of Railway
NASA Astrophysics Data System (ADS)
Shan, Xumei; Liu, Hefeng
Enterprises like the PRC Ministry of Railways (MOR) are facing various challenges, ranging from a highly distributed computing environment to low legacy-system utilization, and Cloud Computing is increasingly regarded as one workable solution to address them. This article describes a full-scale cloud solution with Intel Tashi as the virtual machine infrastructure layer, Hadoop HDFS as the computing platform, and a self-developed SaaS interface, gluing the virtual machines and HDFS together with the Xen hypervisor. As a result, on-demand computing task application and deployment are tackled for MOR's real working scenarios at the end of the article.
Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Shuangshuang; Chen, Yousu; Wu, Di
2015-12-09
Power system dynamic simulation computes the system response to a sequence of large disturbances, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operations. It consists of a large set of differential and algebraic equations, which is computationally intensive and challenging to solve using a single-processor-based dynamic simulation solution. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-Processing (OpenMP) on a shared-memory platform, and Message Passing Interface (MPI) on distributed-memory clusters, respectively. The differences between the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performances for running parallel dynamic simulation are compared and demonstrated.
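The following hedged sketch illustrates the distributed-memory (MPI) flavour of such a parallel implementation using mpi4py: each rank integrates its own subset of simplified, decoupled swing-like generator equations and the results are gathered on rank 0. Real power-grid dynamic simulation couples all machines through the network solution, which this toy example deliberately omits; the machine parameters are invented.

```python
# Toy MPI decomposition of a dynamic simulation: each rank integrates its own
# subset of simplified swing-like generator equations (no network coupling),
# and results are gathered on rank 0. Machine parameters are made up.
# Run with e.g.:  mpiexec -n 4 python parallel_swing.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_machines = 16
dt, n_steps = 0.001, 2000
H, D = 5.0, 1.0                           # inertia and damping (illustrative)

# Static cyclic decomposition of machines across ranks.
my_ids = np.arange(rank, n_machines, size)
delta = np.zeros(len(my_ids))             # rotor angles
omega = np.zeros(len(my_ids))             # speed deviations
p_acc = 0.1 * (my_ids + 1) / n_machines   # constant accelerating power (toy)

for _ in range(n_steps):                  # explicit Euler integration
    d_omega = (p_acc - D * omega) / (2.0 * H)
    omega += dt * d_omega
    delta += dt * omega

all_deltas = comm.gather((my_ids.tolist(), delta.tolist()), root=0)
if rank == 0:
    merged = {}
    for ids, vals in all_deltas:
        merged.update(zip(ids, vals))
    print("final rotor angles:", {k: round(v, 4) for k, v in sorted(merged.items())})
```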
Interoperating Cloud-based Virtual Farms
NASA Astrophysics Data System (ADS)
Bagnasco, S.; Colamaria, F.; Colella, D.; Casula, E.; Elia, D.; Franco, A.; Lusso, S.; Luparello, G.; Masera, M.; Miniello, G.; Mura, D.; Piano, S.; Vallero, S.; Venaruzzo, M.; Vino, G.
2015-12-01
The present work aims at optimizing the use of computing resources available at the grid Italian Tier-2 sites of the ALICE experiment at CERN LHC by making them accessible to interactive distributed analysis, thanks to modern solutions based on cloud computing. The scalability and elasticity of the computing resources via dynamic (“on-demand”) provisioning is essentially limited by the size of the computing site, reaching the theoretical optimum only in the asymptotic case of infinite resources. The main challenge of the project is to overcome this limitation by federating different sites through a distributed cloud facility. Storage capacities of the participating sites are seen as a single federated storage area, preventing the need of mirroring data across them: high data access efficiency is guaranteed by location-aware analysis software and storage interfaces, in a transparent way from an end-user perspective. Moreover, the interactive analysis on the federated cloud reduces the execution time with respect to grid batch jobs. The tests of the investigated solutions for both cloud computing and distributed storage on wide area network will be presented.
NASA Astrophysics Data System (ADS)
Lynch, Amanda H.; Abramson, David; Görgen, Klaus; Beringer, Jason; Uotila, Petteri
2007-10-01
Fires in the Australian savanna have been hypothesized to affect monsoon evolution, but the hypothesis is controversial and the effects have not been quantified. A distributed computing approach allows the development of a challenging experimental design that permits simultaneous variation of all fire attributes. The climate model simulations are distributed around multiple independent computer clusters in six countries, an approach that has potential for a range of other large simulation applications in the earth sciences. The experiment clarifies that savanna burning can shape the monsoon through two mechanisms. Boundary-layer circulation and large-scale convergence is intensified monotonically through increasing fire intensity and area burned. However, thresholds of fire timing and area are evident in the consequent influence on monsoon rainfall. In the optimal band of late, high intensity fires with a somewhat limited extent, it is possible for the wet season to be significantly enhanced.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Katz, Daniel S; Jha, Shantenu; Weissman, Jon
2017-01-31
This is the final technical report for the AIMES project. Many important advances in science and engineering are due to large-scale distributed computing. Notwithstanding this reliance, we are still learning how to design and deploy large-scale production Distributed Computing Infrastructures (DCI). This is evidenced by missing design principles for DCI, and an absence of generally acceptable and usable distributed computing abstractions. The AIMES project was conceived against this backdrop, following on the heels of a comprehensive survey of scientific distributed applications. AIMES laid the foundations to address the tripartite challenge of dynamic resource management, integrating information, and portable and interoperable distributed applications. Four abstractions were defined and implemented: skeleton, resource bundle, pilot, and execution strategy. The four abstractions were implemented into software modules and then aggregated into the AIMES middleware. This middleware successfully integrates information across the application layer (skeletons) and resource layer (Bundles), derives a suitable execution strategy for the given skeleton and enacts its execution by means of pilots on one or more resources, depending on the application requirements, and resource availabilities and capabilities.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Weissman, Jon; Katz, Dan; Jha, Shantenu
2017-01-31
This is the final technical report for the AIMES project. Many important advances in science and engineering are due to large scale distributed computing. Notwithstanding this reliance, we are still learning how to design and deploy large-scale production Distributed Computing Infrastructures (DCI). This is evidenced by missing design principles for DCI, and an absence of generally acceptable and usable distributed computing abstractions. The AIMES project was conceived against this backdrop, following on the heels of a comprehensive survey of scientific distributed applications. AIMES laid the foundations to address the tripartite challenge of dynamic resource management, integrating information, and portable and interoperable distributed applications. Four abstractions were defined and implemented: skeleton, resource bundle, pilot, and execution strategy. The four abstractions were implemented into software modules and then aggregated into the AIMES middleware. This middleware successfully integrates information across the application layer (skeletons) and resource layer (Bundles), derives a suitable execution strategy for the given skeleton and enacts its execution by means of pilots on one or more resources, depending on the application requirements, and resource availabilities and capabilities.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
The Computing and Communications (C) Division is responsible for the Laboratory's Integrated Computing Network (ICN) as well as Laboratory-wide communications. Our computing network, used by 8,000 people distributed throughout the nation, constitutes one of the most powerful scientific computing facilities in the world. In addition to the stable production environment of the ICN, we have taken a leadership role in high-performance computing and have established the Advanced Computing Laboratory (ACL), the site of research on experimental, massively parallel computers; high-speed communication networks; distributed computing; and a broad variety of advanced applications. The computational resources available in the ACL are of the type needed to solve problems critical to national needs, the so-called "Grand Challenge" problems. The purpose of this publication is to inform our clients of our strategic and operating plans in these important areas. We review major accomplishments since late 1990 and describe our strategic planning goals and specific projects that will guide our operations over the next few years. Our mission statement, planning considerations, and management policies and practices are also included.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
The Computing and Communications (C) Division is responsible for the Laboratory's Integrated Computing Network (ICN) as well as Laboratory-wide communications. Our computing network, used by 8,000 people distributed throughout the nation, constitutes one of the most powerful scientific computing facilities in the world. In addition to the stable production environment of the ICN, we have taken a leadership role in high-performance computing and have established the Advanced Computing Laboratory (ACL), the site of research on experimental, massively parallel computers; high-speed communication networks; distributed computing; and a broad variety of advanced applications. The computational resources available in the ACL are of the type needed to solve problems critical to national needs, the so-called "Grand Challenge" problems. The purpose of this publication is to inform our clients of our strategic and operating plans in these important areas. We review major accomplishments since late 1990 and describe our strategic planning goals and specific projects that will guide our operations over the next few years. Our mission statement, planning considerations, and management policies and practices are also included.
WLCG scale testing during CMS data challenges
NASA Astrophysics Data System (ADS)
Gutsche, O.; Hajdu, C.
2008-07-01
The CMS computing model to process and analyze LHC collision data follows a data-location driven approach and is using the WLCG infrastructure to provide access to GRID resources. As a preparation for data taking, CMS tests its computing model during dedicated data challenges. An important part of the challenges is the test of the user analysis which poses a special challenge for the infrastructure with its random distributed access patterns. The CMS Remote Analysis Builder (CRAB) handles all interactions with the WLCG infrastructure transparently for the user. During the 2006 challenge, CMS set its goal to test the infrastructure at a scale of 50,000 user jobs per day using CRAB. Both direct submissions by individual users and automated submissions by robots were used to achieve this goal. A report will be given about the outcome of the user analysis part of the challenge using both the EGEE and OSG parts of the WLCG. In particular, the difference in submission between both GRID middlewares (resource broker vs. direct submission) will be discussed. In the end, an outlook for the 2007 data challenge is given.
Automation of multi-agent control for complex dynamic systems in heterogeneous computational network
NASA Astrophysics Data System (ADS)
Oparin, Gennady; Feoktistov, Alexander; Bogdanova, Vera; Sidorov, Ivan
2017-01-01
The rapid progress of high-performance computing entails new challenges related to solving large scientific problems in various subject domains in a heterogeneous distributed computing environment (e.g., a network, Grid system, or Cloud infrastructure). Specialists in the field of parallel and distributed computing pay special attention to the scalability of applications for problem solving. Effective management of a scalable application in a heterogeneous distributed computing environment is still a non-trivial issue, and control systems that operate in networks are especially affected by it. We propose a new approach to multi-agent management of scalable applications in a heterogeneous computational network. The fundamentals of our approach are the integrated use of conceptual programming, simulation modeling, network monitoring, multi-agent management, and service-oriented programming. We developed a special framework for automating the problem solving. Advantages of the proposed approach are demonstrated on the example of parametric synthesis of a static linear regulator for complex dynamic systems. Benefits of the scalable application for solving this problem include automation of multi-agent control of the systems in parallel mode with various degrees of detail.
Challenging Density Functional Theory Calculations with Hemes and Porphyrins.
de Visser, Sam P; Stillman, Martin J
2016-04-07
In this paper we review recent advances in computational chemistry, focusing on the chemical description of heme proteins and synthetic porphyrins that both mimic natural processes and have technological uses. These are challenging biochemical systems involved in electron transfer as well as biocatalysis processes. In recent years computational tools have improved considerably and can now reproduce experimental spectroscopic and reactivity studies within a reasonable error margin (several kcal·mol(-1)). This paper gives recent examples from our groups, where we investigated heme and synthetic metal-porphyrin systems. The four case studies highlight how computational modelling can correctly reproduce experimental product distributions, predict reactivity trends and guide the interpretation of electronic structures of complex systems. The case studies focus on the calculations of a variety of spectroscopic features of porphyrins and show how computational modelling gives important insight that explains the experimental spectra and can lead to the design of porphyrins with tuned properties.
De La Flor, Grace; Ojaghi, Mobin; Martínez, Ignacio Lamata; Jirotka, Marina; Williams, Martin S; Blakeborough, Anthony
2010-09-13
When transitioning local laboratory practices into distributed environments, the interdependent relationship between experimental procedure and the technologies used to execute experiments becomes highly visible and a focal point for system requirements. We present an analysis of ways in which this reciprocal relationship is reconfiguring laboratory practices in earthquake engineering as a new computing infrastructure is embedded within three laboratories in order to facilitate the execution of shared experiments across geographically distributed sites. The system has been developed as part of the UK Network for Earthquake Engineering Simulation e-Research project, which links together three earthquake engineering laboratories at the universities of Bristol, Cambridge and Oxford. We consider the ways in which researchers have successfully adapted their local laboratory practices through the modification of experimental procedure so that they may meet the challenges of coordinating distributed earthquake experiments.
Generation of distributed W-states over long distances
NASA Astrophysics Data System (ADS)
Li, Yi
2017-08-01
Ultra-secure quantum communication between distant locations requires distributed entangled states between nodes. Various methodologies have been proposed to tackle this technological challenge, of which the so-called DLCZ protocol is the most promising and widely adopted scheme. This paper aims to extend this well-known protocol to a multi-node setting where the entangled W-state is generated between nodes over long distances. The generation of multipartite W-states is the foundation of quantum networks, paving the way for quantum communication and distributed quantum computation.
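For reference, the W state mentioned above is the equal-amplitude superposition of the n computational basis states with exactly one excitation; the short sketch below simply constructs the state vector numerically and says nothing about the DLCZ-style generation protocol discussed in the paper.

```python
# Small numerical illustration of the n-qubit W state: an equal-amplitude
# superposition of the n computational basis states containing exactly one
# |1>. This only builds the state vector; it does not model the DLCZ-style
# generation scheme described in the paper.
import numpy as np


def w_state(n_qubits):
    dim = 2 ** n_qubits
    psi = np.zeros(dim, dtype=complex)
    for q in range(n_qubits):
        psi[1 << q] = 1.0          # basis index with a single set bit
    return psi / np.sqrt(n_qubits)


psi3 = w_state(3)                   # (|001> + |010> + |100>)/sqrt(3)
print(np.round(psi3, 3))
print("normalized:", np.isclose(np.vdot(psi3, psi3).real, 1.0))
```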
He, Chenlong; Feng, Zuren; Ren, Zhigang
2018-02-03
For Wireless Sensor Networks (WSNs), the Voronoi partition of a region is a challenging problem owing to the limited sensing ability of each sensor and the distributed organization of the network. In this paper, an algorithm is proposed for each sensor having a limited sensing range to compute its limited Voronoi cell autonomously, so that the limited Voronoi partition of the entire WSN is generated in a distributed manner. Inspired by Graham's Scan (GS) algorithm used to compute the convex hull of a point set, the limited Voronoi cell of each sensor is obtained by sequentially scanning two consecutive bisectors between the sensor and its neighbors. The proposed algorithm, called the Boundary Scan (BS) algorithm, has a lower computational complexity than the existing Range-Constrained Voronoi Cell (RCVC) algorithm and reaches the lower bound of the computational complexity of algorithms used to solve problems of this kind. Moreover, it also improves the time efficiency of a key step in the Adjust-Sensing-Radius (ASR) algorithm used to compute the exact Voronoi cell. Extensive numerical simulations are performed to demonstrate the correctness and effectiveness of the BS algorithm. The distributed realization of the BS algorithm combined with a localization algorithm in WSNs is used to justify the WSN nature of the proposed algorithm.
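As a geometric illustration of what a limited Voronoi cell is, the brute-force sketch below clips a sensor's sensing disk (approximated by a polygon) by the perpendicular-bisector half-planes toward each neighbor. It shows the object being computed, not the optimized scan order of the Boundary Scan algorithm itself; the coordinates and sensing radius are made up.

```python
# Brute-force geometric sketch of a sensor's limited Voronoi cell: the sensing
# disk (approximated by a polygon) is clipped by the perpendicular-bisector
# half-planes toward each neighbour. This illustrates the cell, not the BS
# algorithm's optimized scan order; coordinates and radius are made up.
import numpy as np


def clip_halfplane(poly, normal, point):
    """Sutherland-Hodgman clip of polygon 'poly' to {x : (x - point).normal <= 0}."""
    out = []
    for i in range(len(poly)):
        a, b = poly[i], poly[(i + 1) % len(poly)]
        da, db = np.dot(a - point, normal), np.dot(b - point, normal)
        if da <= 0:
            out.append(a)
        if (da <= 0) != (db <= 0):           # edge crosses the boundary
            t = da / (da - db)
            out.append(a + t * (b - a))
    return np.array(out)


def limited_voronoi_cell(sensor, neighbors, radius, n_sides=64):
    theta = np.linspace(0.0, 2.0 * np.pi, n_sides, endpoint=False)
    cell = sensor + radius * np.c_[np.cos(theta), np.sin(theta)]   # sensing disk
    for q in neighbors:
        mid, normal = 0.5 * (sensor + q), q - sensor
        cell = clip_halfplane(cell, normal, mid)
        if len(cell) == 0:
            break
    return cell


sensor = np.array([0.0, 0.0])
neighbors = [np.array([1.5, 0.0]), np.array([0.0, 1.2]), np.array([-1.0, -1.0])]
cell = limited_voronoi_cell(sensor, neighbors, radius=1.0)
print(f"limited Voronoi cell has {len(cell)} vertices")
```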
Distributed Algorithm for Voronoi Partition of Wireless Sensor Networks with a Limited Sensing Range
Feng, Zuren; Ren, Zhigang
2018-01-01
For Wireless Sensor Networks (WSNs), the Voronoi partition of a region is a challenging problem owing to the limited sensing ability of each sensor and the distributed organization of the network. In this paper, an algorithm is proposed for each sensor having a limited sensing range to compute its limited Voronoi cell autonomously, so that the limited Voronoi partition of the entire WSN is generated in a distributed manner. Inspired by Graham’s Scan (GS) algorithm used to compute the convex hull of a point set, the limited Voronoi cell of each sensor is obtained by sequentially scanning two consecutive bisectors between the sensor and its neighbors. The proposed algorithm, called the Boundary Scan (BS) algorithm, has a lower computational complexity than the existing Range-Constrained Voronoi Cell (RCVC) algorithm and reaches the lower bound of the computational complexity of algorithms used to solve problems of this kind. Moreover, it also improves the time efficiency of a key step in the Adjust-Sensing-Radius (ASR) algorithm used to compute the exact Voronoi cell. Extensive numerical simulations are performed to demonstrate the correctness and effectiveness of the BS algorithm. The distributed realization of the BS algorithm combined with a localization algorithm in WSNs is used to justify the WSN nature of the proposed algorithm. PMID:29401649
PanDA: Exascale Federation of Resources for the ATLAS Experiment at the LHC
NASA Astrophysics Data System (ADS)
Barreiro Megino, Fernando; Caballero Bejar, Jose; De, Kaushik; Hover, John; Klimentov, Alexei; Maeno, Tadashi; Nilsson, Paul; Oleynik, Danila; Padolski, Siarhei; Panitkin, Sergey; Petrosyan, Artem; Wenaus, Torre
2016-02-01
After a scheduled maintenance and upgrade period, the world's largest and most powerful machine - the Large Hadron Collider (LHC) - is about to enter its second run at unprecedented energies. In order to exploit the scientific potential of the machine, the experiments at the LHC face computational challenges with enormous data volumes that need to be analysed by thousands of physics users and compared to simulated data. Given diverse funding constraints, the computational resources for the LHC have been deployed in a worldwide mesh of data centres, connected to each other through Grid technologies. The PanDA (Production and Distributed Analysis) system was developed in 2005 for the ATLAS experiment on top of this heterogeneous infrastructure to seamlessly integrate the computational resources and give the users the feeling of a unique system. Since its origins, PanDA has evolved together with upcoming computing paradigms in and outside HEP, such as changes in the networking model, Cloud Computing and HPC. It is currently running steadily up to 200 thousand simultaneous cores (limited by the available resources for ATLAS), up to two million aggregated jobs per day and processes over an exabyte of data per year. The success of PanDA in ATLAS is triggering widespread adoption and testing by other experiments. In this contribution we will give an overview of the PanDA components and focus on the new features and upcoming challenges that are relevant to the next decade of distributed computing workload management using PanDA.
Distributed GPU Computing in GIScience
NASA Astrophysics Data System (ADS)
Jiang, Y.; Yang, C.; Huang, Q.; Li, J.; Sun, M.
2013-12-01
Geoscientists strive to discover potential principles and patterns hidden inside ever-growing Big Data for scientific discoveries. To better achieve this objective, more capable computing resources are required to process, analyze and visualize Big Data (Ferreira et al., 2003; Li et al., 2013). Current CPU-based computing techniques cannot promptly meet the computing challenges caused by the increasing amount of datasets from different domains, such as social media, earth observation, and environmental sensing (Li et al., 2013). Meanwhile, CPU-based computing resources structured as clusters or supercomputers are costly. In the past several years, with GPU-based technology maturing in both capability and performance, GPU-based computing has emerged as a new computing paradigm. Compared to the traditional microprocessor, the modern GPU, as a compelling alternative microprocessor, has outstanding parallel processing capability with cost-effectiveness and efficiency (Owens et al., 2008), although it was initially designed for graphical rendering in the visualization pipeline. This presentation reports a distributed GPU computing framework for integrating GPU-based computing within a distributed environment. Within this framework, 1) on each single computer, the computing resources of both GPU and CPU can be fully utilized to improve the performance of visualizing and processing Big Data; 2) within a network environment, a variety of computers can be used to build up a virtual supercomputer to support CPU-based and GPU-based computing in a distributed computing environment; 3) GPUs, as graphics-targeted devices, are used to greatly improve the rendering efficiency in distributed geo-visualization, especially for 3D/4D visualization. Key words: Geovisualization, GIScience, Spatiotemporal Studies. References: 1. Ferreira de Oliveira, M. C., & Levkowitz, H. (2003). From visual data exploration to visual data mining: A survey. IEEE Transactions on Visualization and Computer Graphics, 9(3), 378-394. 2. Li, J., Jiang, Y., Yang, C., Huang, Q., & Rice, M. (2013). Visualizing 3D/4D Environmental Data Using Many-core Graphics Processing Units (GPUs) and Multi-core Central Processing Units (CPUs). Computers & Geosciences, 59(9), 78-89. 3. Owens, J. D., Houston, M., Luebke, D., Green, S., Stone, J. E., & Phillips, J. C. (2008). GPU computing. Proceedings of the IEEE, 96(5), 879-899.
NASA Technical Reports Server (NTRS)
1987-01-01
The Research Institute for Advanced Computer Science (RIACS) was established at the NASA Ames Research Center in June of 1983. RIACS is privately operated by the Universities Space Research Association (USRA), a consortium of 64 universities with graduate programs in the aerospace sciences, under several Cooperative Agreements with NASA. RIACS's goal is to provide preeminent leadership in basic and applied computer science research as partners in support of NASA's goals and missions. In pursuit of this goal, RIACS contributes to several of the grand challenges in science and engineering facing NASA: flying an airplane inside a computer; determining the chemical properties of materials under hostile conditions in the atmospheres of earth and the planets; sending intelligent machines on unmanned space missions; creating a one-world network that makes all scientific resources, including those in space, accessible to all the world's scientists; providing intelligent computational support to all stages of the process of scientific investigation from problem formulation to results dissemination; and developing accurate global models for climatic behavior throughout the world. In working with these challenges, we seek novel architectures, and novel ways to use them, that exploit the potential of parallel and distributed computation and make possible new functions that are beyond the current reach of computing machines. The investigation includes pattern computers as well as the more familiar numeric and symbolic computers, and it includes networked systems of resources distributed around the world. We believe that successful computer science research is interdisciplinary: it is driven by (and drives) important problems in other disciplines. We believe that research should be guided by a clear long-term vision with planned milestones. And we believe that our environment must foster and exploit innovation. Our activities and accomplishments for the calendar year 1987 and our plans for 1988 are reported.
Shipping Science Worldwide with Open Source Containers
NASA Astrophysics Data System (ADS)
Molineaux, J. P.; McLaughlin, B. D.; Pilone, D.; Plofchan, P. G.; Murphy, K. J.
2014-12-01
Scientific applications often present difficult web-hosting needs. Their compute- and data-intensive nature, as well as an increasing need for high availability and distribution, combine to create a challenging set of hosting requirements. In the past year, advancements in container-based virtualization and related tooling have offered new lightweight and flexible ways to accommodate diverse applications with all the isolation and portability benefits of traditional virtualization. This session will introduce and demonstrate an open-source, single-interface Platform-as-a-Service (PaaS) that empowers application developers to seamlessly leverage geographically distributed, public and private compute resources to achieve highly-available, performant hosting for scientific applications.
Wynden, Rob; Anderson, Nick; Casale, Marco; Lakshminarayanan, Prakash; Anderson, Kent; Prosser, Justin; Errecart, Larry; Livshits, Alice; Thimman, Tim; Weiner, Mark
2011-01-01
Within the CTSA (Clinical Translational Sciences Awards) program, academic medical centers are tasked with the storage of clinical formulary data within an Integrated Data Repository (IDR) and the subsequent exposure of that data over grid computing environments for hypothesis generation and cohort selection. Formulary data collected over long periods of time across multiple institutions requires normalization of terms before those data sets can be aggregated and compared. This paper sets forth a solution to the challenge of generating derived aggregated normalized views from large, distributed data sets of clinical formulary data intended for re-use within clinical translational research.
A Technical Survey on Optimization of Processing Geo Distributed Data
NASA Astrophysics Data System (ADS)
Naga Malleswari, T. Y. J.; Ushasukhanya, S.; Nithyakalyani, A.; Girija, S.
2018-04-01
With growing cloud services and technology, there is growth in the number of geographically distributed data centers used to store large amounts of data. Analysis of geo-distributed data is required by various services for data processing, storage of essential information, etc.; processing this geo-distributed data and performing analytics on it is a challenging task. Distributed data processing is accompanied by issues in storage, computation and communication. The key issues to be dealt with are time efficiency, cost minimization and utility maximization. This paper describes various optimization methods, such as end-to-end multiphase and G-MR, that use techniques like MapReduce, CDS (Community Detection based Scheduling), ROUT, Workload-Aware Scheduling, SAGE and AMP (Ant Colony Optimization) to handle these issues. The various optimization methods and techniques used are analyzed. It has been observed that end-to-end multiphase achieves time efficiency; cost minimization concentrates on achieving Quality of Service and reducing computation and communication cost; and SAGE achieves performance improvement in processing geo-distributed data sets.
Blind prediction of cyclohexane-water distribution coefficients from the SAMPL5 challenge.
Bannan, Caitlin C; Burley, Kalistyn H; Chiu, Michael; Shirts, Michael R; Gilson, Michael K; Mobley, David L
2016-11-01
In the recent SAMPL5 challenge, participants submitted predictions for cyclohexane/water distribution coefficients for a set of 53 small molecules. Distribution coefficients (log D) replace the hydration free energies that were a central part of the past five SAMPL challenges. A wide variety of computational methods were represented by the 76 submissions from 18 participating groups. Here, we analyze submissions by a variety of error metrics and provide details for a number of reference calculations we performed. As in the SAMPL4 challenge, we assessed the ability of participants to evaluate not just their statistical uncertainty, but their model uncertainty-how well they can predict the magnitude of their model or force field error for specific predictions. Unfortunately, this remains an area where prediction and analysis need improvement. In SAMPL4 the top performing submissions achieved a root-mean-squared error (RMSE) around 1.5 kcal/mol. If we anticipate accuracy in log D predictions to be similar to the hydration free energy predictions in SAMPL4, the expected error here would be around 1.54 log units. Only a few submissions had an RMSE below 2.5 log units in their predicted log D values. However, distribution coefficients introduced complexities not present in past SAMPL challenges, including tautomer enumeration, that are likely to be important in predicting biomolecular properties of interest to drug discovery, therefore some decrease in accuracy would be expected. Overall, the SAMPL5 distribution coefficient challenge provided great insight into the importance of modeling a variety of physical effects. We believe these types of measurements will be a promising source of data for future blind challenges, especially in view of the relatively straightforward nature of the experiments and the level of insight provided.
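For context on the error metrics discussed above, the following small sketch computes RMSE, mean signed error and Kendall tau for a table of predicted versus experimental log D values; the four data points are invented placeholders, not SAMPL5 results.

```python
# Hedged sketch of the error metrics discussed above (RMSE, mean signed error,
# Kendall tau) for predicted vs. experimental cyclohexane/water log D values.
# The four data points below are invented placeholders, not SAMPL5 data.
import numpy as np
from scipy import stats

experimental = np.array([-1.9, 1.2, 2.5, -0.3])   # log D (cyclohexane/water)
predicted = np.array([-2.6, 0.4, 3.9, 0.5])

errors = predicted - experimental
rmse = np.sqrt(np.mean(errors ** 2))
mse = np.mean(errors)                               # mean signed error (bias)
tau, _ = stats.kendalltau(experimental, predicted)  # ranking quality

print(f"RMSE = {rmse:.2f} log units, mean signed error = {mse:.2f}, tau = {tau:.2f}")
```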
Blind prediction of cyclohexane-water distribution coefficients from the SAMPL5 challenge
Bannan, Caitlin C.; Burley, Kalistyn H.; Chiu, Michael; Shirts, Michael R.; Gilson, Michael K.; Mobley, David L.
2016-01-01
In the recent SAMPL5 challenge, participants submitted predictions for cyclohexane/water distribution coefficients for a set of 53 small molecules. Distribution coefficients (log D) replace the hydration free energies that were a central part of the past five SAMPL challenges. A wide variety of computational methods were represented by the 76 submissions from 18 participating groups. Here, we analyze submissions by a variety of error metrics and provide details for a number of reference calculations we performed. As in the SAMPL4 challenge, we assessed the ability of participants to evaluate not just their statistical uncertainty, but their model uncertainty – how well they can predict the magnitude of their model or force field error for specific predictions. Unfortunately, this remains an area where prediction and analysis need improvement. In SAMPL4 the top performing submissions achieved a root-mean-squared error (RMSE) around 1.5 kcal/mol. If we anticipate accuracy in log D predictions to be similar to the hydration free energy predictions in SAMPL4, the expected error here would be around 1.54 log units. Only a few submissions had an RMSE below 2.5 log units in their predicted log D values. However, distribution coefficients introduced complexities not present in past SAMPL challenges, including tautomer enumeration, that are likely to be important in predicting biomolecular properties of interest to drug discovery, therefore some decrease in accuracy would be expected. Overall, the SAMPL5 distribution coefficient challenge provided great insight into the importance of modeling a variety of physical effects. We believe these types of measurements will be a promising source of data for future blind challenges, especially in view of the relatively straightforward nature of the experiments and the level of insight provided. PMID:27677750
Mock Data Challenge for the MPD/NICA Experiment on the HybriLIT Cluster
NASA Astrophysics Data System (ADS)
Gertsenberger, Konstantin; Rogachevsky, Oleg
2018-02-01
Simulating data processing before the first experimental data are received is an important task in high-energy physics experiments. This article presents the current Event Data Model and the Mock Data Challenge for the MPD experiment at the NICA accelerator complex, which uses ongoing simulation studies to stress-test the distributed computing infrastructure and the experiment software in the full production environment, from simulated data through to physics analysis.
Dinov, Ivo D
2016-01-01
Managing, processing and understanding big healthcare data is challenging, costly and demanding. Without a robust fundamental theory for representation, analysis and inference, a roadmap for uniform handling and analyzing of such complex data remains elusive. In this article, we outline various big data challenges, opportunities, modeling methods and software techniques for blending complex healthcare data, advanced analytic tools, and distributed scientific computing. Using imaging, genetic and healthcare data we provide examples of processing heterogeneous datasets using distributed cloud services, automated and semi-automated classification techniques, and open-science protocols. Despite substantial advances, new innovative technologies need to be developed that enhance, scale and optimize the management and processing of large, complex and heterogeneous data. Stakeholder investments in data acquisition, research and development, computational infrastructure and education will be critical to realize the huge potential of big data, to reap the expected information benefits and to build lasting knowledge assets. Multi-faceted proprietary, open-source, and community developments will be essential to enable broad, reliable, sustainable and efficient data-driven discovery and analytics. Big data will affect every sector of the economy and their hallmark will be 'team science'.
Data Reprocessing on Worldwide Distributed Systems
NASA Astrophysics Data System (ADS)
Wicke, Daniel
The DØ experiment faces many challenges in terms of enabling access to large datasets for physicists on four continents. The strategy for solving these problems on worldwide distributed computing clusters is presented. Since the beginning of Run II of the Tevatron (March 2001) all Monte-Carlo simulations for the experiment have been produced at remote systems. For data analysis, a system of regional analysis centers (RACs) was established which supply the associated institutes with the data. This structure, which is similar to the tiered structure foreseen for the LHC was used in Fall 2003 to reprocess all DØ data with a much improved version of the reconstruction software. This makes DØ the first running experiment that has implemented and operated all important computing tasks of a high energy physics experiment on systems distributed worldwide.
Parallel Distributed Processing Theory in the Age of Deep Networks.
Bowers, Jeffrey S
2017-12-01
Parallel distributed processing (PDP) models in psychology are the precursors of deep networks used in computer science. However, only PDP models are associated with two core psychological claims, namely that all knowledge is coded in a distributed format and cognition is mediated by non-symbolic computations. These claims have long been debated in cognitive science, and recent work with deep networks speaks to this debate. Specifically, single-unit recordings show that deep networks learn units that respond selectively to meaningful categories, and researchers are finding that deep networks need to be supplemented with symbolic systems to perform some tasks. Given the close links between PDP and deep networks, it is surprising that research with deep networks is challenging PDP theory. Copyright © 2017. Published by Elsevier Ltd.
QMC Goes BOINC: Using Public Resource Computing to Perform Quantum Monte Carlo Calculations
NASA Astrophysics Data System (ADS)
Rainey, Cameron; Engelhardt, Larry; Schröder, Christian; Hilbig, Thomas
2008-10-01
Theoretical modeling of magnetic molecules traditionally involves the diagonalization of quantum Hamiltonian matrices. However, as the complexity of these molecules increases, the matrices become so large that this process becomes unusable. An additional challenge to this modeling is that many repetitive calculations must be performed, further increasing the need for computing power. Both of these obstacles can be overcome by using a quantum Monte Carlo (QMC) method and a distributed computing project. We have recently implemented a QMC method within the Spinhenge@home project, which is a Public Resource Computing (PRC) project where private citizens allow part-time usage of their PCs for scientific computing. The use of PRC for scientific computing will be described in detail, as well as how you can contribute to the project. See, e.g., L. Engelhardt et al., Angew. Chem. Int. Ed. 47, 924 (2008); C. Schröder, in Distributed & Grid Computing - Science Made Transparent for Everyone. Principles, Applications and Supporting Communities (Weber, M.H.W., ed., 2008). Project URL: http://spin.fh-bielefeld.de
Models@Home: distributed computing in bioinformatics using a screensaver based approach.
Krieger, Elmar; Vriend, Gert
2002-02-01
Due to the steadily growing computational demands in bioinformatics and related scientific disciplines, one is forced to make optimal use of the available resources. A straightforward solution is to build a network of idle computers and let each of them work on a small piece of a scientific challenge, as done by Seti@Home (http://setiathome.berkeley.edu), the world's largest distributed computing project. We developed a generally applicable distributed computing solution that uses a screensaver system similar to Seti@Home. The software exploits the coarse-grained nature of typical bioinformatics projects. Three major considerations for the design were: (1) often, many different programs are needed, while the time is lacking to parallelize them. Models@Home can run any program in parallel without modifications to the source code; (2) in contrast to the Seti project, bioinformatics applications are normally more sensitive to lost jobs. Models@Home therefore includes stringent control over job scheduling; (3) to allow use in heterogeneous environments, Linux and Windows based workstations can be combined with dedicated PCs to build a homogeneous cluster. We present three practical applications of Models@Home, running the modeling programs WHAT IF and YASARA on 30 PCs: force field parameterization, molecular dynamics docking, and database maintenance.
Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing
NASA Astrophysics Data System (ADS)
Klimentov, A.; Buncic, P.; De, K.; Jha, S.; Maeno, T.; Mount, R.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Porter, R. J.; Read, K. F.; Vaniachine, A.; Wells, J. C.; Wenaus, T.
2015-05-01
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Distributed Analysis) Workload Management System (WMS) for managing the workflow for all data processing on hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10^2) sites, O(10^5) cores, O(10^8) jobs per year, O(10^3) users, and ATLAS data volume is O(10^17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project titled ‘Next Generation Workload Management and Analysis System for Big Data’ (BigPanDA) is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to set up and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center "Kurchatov Institute" together with ALICE distributed computing and ORNL computing professionals. Our approach to the integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. We will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.
Thurman, Andrew L; Choi, Jiwoong; Choi, Sanghun; Lin, Ching-Long; Hoffman, Eric A; Lee, Chang Hyun; Chan, Kung-Sik
2017-05-10
Methacholine challenge tests are used to measure changes in pulmonary function that indicate symptoms of asthma. In addition to pulmonary function tests, which measure global changes in pulmonary function, computed tomography images taken at full inspiration before and after administration of methacholine provide local air volume changes (hyper-inflation post methacholine) at individual acinar units, indicating local airway hyperresponsiveness. Some of the acini may have extreme air volume changes relative to the global average, indicating hyperresponsiveness, and those extreme values may occur in clusters. We propose a Gaussian mixture model with a spatial smoothness penalty to improve prediction of hyperresponsive locations that occur in spatial clusters. A simulation study provides evidence that the spatial smoothness penalty improves prediction under different data-generating mechanisms. We apply this method to computed tomography data from Seoul National University Hospital on five healthy and ten asthmatic subjects. Copyright © 2017 John Wiley & Sons, Ltd.
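The abstract does not state the penalty explicitly; a common way to write a spatially penalized Gaussian-mixture criterion of this kind (a sketch under that assumption, not necessarily the authors' exact formulation) is

\ell_\lambda(\theta) \;=\; \sum_{i=1}^{n} \log \sum_{k=1}^{K} \pi_k\, \phi\!\left(y_i \mid \mu_k, \sigma_k^2\right) \;-\; \lambda \sum_{i \sim j} \mathbf{1}\{z_i \neq z_j\},

where y_i is the air-volume change at acinus i, z_i its component label (hyperresponsive vs. normal), i \sim j ranges over spatially adjacent acini, and \lambda \ge 0 controls how strongly neighbouring labels are encouraged to agree.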
Open source tools for large-scale neuroscience.
Freeman, Jeremy
2015-06-01
New technologies for monitoring and manipulating the nervous system promise exciting biology but pose challenges for analysis and computation. Solutions can be found in the form of modern approaches to distributed computing, machine learning, and interactive visualization. But embracing these new technologies will require a cultural shift: away from independent efforts and proprietary methods and toward an open source and collaborative neuroscience. Copyright © 2015 The Author. Published by Elsevier Ltd. All rights reserved.
Parallel volume ray-casting for unstructured-grid data on distributed-memory architectures
NASA Technical Reports Server (NTRS)
Ma, Kwan-Liu
1995-01-01
As computing technology continues to advance, computational modeling of scientific and engineering problems produces data of increasing complexity: large in size and unstructured in shape. Volume visualization of such data is a challenging problem. This paper proposes a distributed parallel solution that makes ray-casting volume rendering of unstructured-grid data practical. Both the data and the rendering process are distributed among processors. At each processor, ray-casting of local data is performed independently of the other processors. The global image composing processes, which require inter-processor communication, are overlapped with the local ray-casting processes to achieve maximum parallel efficiency. This algorithm differs from previous ones in four ways: it is completely distributed, less view-dependent, reasonably scalable, and flexible. Without using dynamic load balancing, test results on the Intel Paragon using from two to 128 processors show, on average, about 60% parallel efficiency.
The role of the host in a cooperating mainframe and workstation environment, volumes 1 and 2
NASA Technical Reports Server (NTRS)
Kusmanoff, Antone; Martin, Nancy L.
1989-01-01
In recent years, advancements made in computer systems have prompted a move from centralized computing based on timesharing a large mainframe computer to distributed computing based on a connected set of engineering workstations. A major factor in this advancement is the increased performance and lower cost of engineering workstations. The shift to distributed computing from centralized computing has led to challenges associated with the residency of application programs within the system. In a combined system of multiple engineering workstations attached to a mainframe host, the question arises as to how a system designer should assign applications between the larger mainframe host and the smaller, yet powerful, workstations. The concepts related to real-time data processing are analyzed and systems are displayed which use a host mainframe and a number of engineering workstations interconnected by a local area network. In most cases, distributed systems can be classified as having a single function or multiple functions and as executing programs in real time or non-real time. In a system of multiple computers, the degree of autonomy of the computers is important; a system with one master control computer generally differs in reliability, performance, and complexity from a system in which all computers share the control. This research is concerned with generating general criteria principles for software residency decisions (host or workstation) for a diverse yet coupled group of users (the clustered workstations) which may need the use of a shared resource (the mainframe) to perform their functions.
Imam, Neena; Barhen, Jacob
2009-01-01
For real-time acoustic source localization applications, one of the primary challenges is the considerable growth in computational complexity associated with the emergence of ever larger, active or passive, distributed sensor networks. These sensors rely heavily on battery-operated system components to achieve highly functional automation in signal and information processing. In order to keep communication requirements minimal, it is desirable to perform as much processing on the receiver platforms as possible. However, the complexity of the calculations needed to achieve accurate source localization increases dramatically with the size of sensor arrays, resulting in substantial growth of computational requirements that cannot be readily met with standard hardware. One option to meet this challenge builds upon the emergence of digital optical-core devices. The objective of this work was to explore the implementation of key building block algorithms used in underwater source localization on the optical-core digital processing platform recently introduced by Lenslet Inc. This demonstration of considerably faster signal processing capability should be of substantial significance to the design and innovation of future generations of distributed sensor networks.
Towards Integrating Distributed Energy Resources and Storage Devices in Smart Grid.
Xu, Guobin; Yu, Wei; Griffith, David; Golmie, Nada; Moulema, Paul
2017-02-01
Internet of Things (IoT) provides a generic infrastructure for different applications to integrate information communication techniques with physical components to achieve automatic data collection, transmission, exchange, and computation. The smart grid, one of the typical applications supported by IoT and a re-engineering and modernization of the traditional power grid, aims to provide reliable, secure, and efficient energy transmission and distribution to consumers. How to effectively integrate distributed (renewable) energy resources and storage devices to satisfy the energy service requirements of users, while minimizing the power generation and transmission cost, remains a highly pressing challenge in the smart grid. To address this challenge and assess the effectiveness of integrating distributed energy resources and storage devices, in this paper we develop a theoretical framework to model and analyze three types of power grid systems: the power grid with only bulk energy generators, the power grid with distributed energy resources, and the power grid with both distributed energy resources and storage devices. Based on the metrics of the power cumulative cost and the service reliability to users, we formally model and analyze the impact of integrating distributed energy resources and storage devices in the power grid. We also use the concept of network calculus, which has been traditionally used for carrying out traffic engineering in computer networks, to derive the bounds of both power supply and user demand to achieve a high service reliability to users. Through an extensive performance evaluation, our data shows that integrating distributed energy resources conjointly with energy storage devices can reduce generation costs, smooth the curve of bulk power generation over time, reduce bulk power generation and power distribution losses, and provide a sustainable service reliability to users in the power grid.
Towards Integrating Distributed Energy Resources and Storage Devices in Smart Grid
Xu, Guobin; Yu, Wei; Griffith, David; Golmie, Nada; Moulema, Paul
2017-01-01
Internet of Things (IoT) provides a generic infrastructure for different applications to integrate information communication techniques with physical components to achieve automatic data collection, transmission, exchange, and computation. The smart grid, one of the typical applications supported by IoT and a re-engineering and modernization of the traditional power grid, aims to provide reliable, secure, and efficient energy transmission and distribution to consumers. How to effectively integrate distributed (renewable) energy resources and storage devices to satisfy the energy service requirements of users, while minimizing the power generation and transmission cost, remains a highly pressing challenge in the smart grid. To address this challenge and assess the effectiveness of integrating distributed energy resources and storage devices, in this paper we develop a theoretical framework to model and analyze three types of power grid systems: the power grid with only bulk energy generators, the power grid with distributed energy resources, and the power grid with both distributed energy resources and storage devices. Based on the metrics of the power cumulative cost and the service reliability to users, we formally model and analyze the impact of integrating distributed energy resources and storage devices in the power grid. We also use the concept of network calculus, which has been traditionally used for carrying out traffic engineering in computer networks, to derive the bounds of both power supply and user demand to achieve a high service reliability to users. Through an extensive performance evaluation, our data shows that integrating distributed energy resources conjointly with energy storage devices can reduce generation costs, smooth the curve of bulk power generation over time, reduce bulk power generation and power distribution losses, and provide a sustainable service reliability to users in the power grid. PMID:29354654
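For readers unfamiliar with network calculus, the textbook deterministic bounds that such a framework typically builds on are (a generic sketch, not the paper's specific derivation): if cumulative demand A(t) is constrained by an arrival curve \alpha, i.e. A(t) - A(s) \le \alpha(t - s), and the supply side offers a service curve \beta, then

\text{backlog} \;\le\; \sup_{s \ge 0}\,\bigl[\alpha(s) - \beta(s)\bigr], \qquad \text{delay} \;\le\; \inf\bigl\{\, d \ge 0 : \alpha(s) \le \beta(s + d)\ \ \forall s \ge 0 \,\bigr\},

the maximal vertical and horizontal deviations between \alpha and \beta. In the smart-grid setting, bounds of this form relate aggregate user demand to the supply capability required for a target service reliability.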
NASA Astrophysics Data System (ADS)
Sánchez-Martínez, V.; Borges, G.; Borrego, C.; del Peso, J.; Delfino, M.; Gomes, J.; González de la Hoz, S.; Pacheco Pages, A.; Salt, J.; Sedov, A.; Villaplana, M.; Wolters, H.
2014-06-01
In this contribution we describe the performance of the Iberian (Spain and Portugal) ATLAS cloud during the first LHC running period (March 2010-January 2013) in the context of the GRID Computing and Data Distribution Model. The evolution of the resources for CPU, disk and tape in the Iberian Tier-1 and Tier-2s is summarized. The data distribution over all ATLAS destinations is shown, focusing on the number of files transferred and the size of the data. The status and distribution of simulation and analysis jobs within the cloud are discussed. The Distributed Analysis tools used to perform physics analysis are explained as well. Cloud performance in terms of the availability and reliability of its sites is discussed. The effect of the changes in the ATLAS Computing Model on the cloud is analyzed. Finally, the readiness of the Iberian Cloud towards the first Long Shutdown (LS1) is evaluated and an outline of the foreseen actions to take in the coming years is given. The shutdown will be a good opportunity to improve and evolve the ATLAS Distributed Computing system to prepare for the future challenges of the LHC operation.
Warris, Sven; Boymans, Sander; Muiser, Iwe; Noback, Michiel; Krijnen, Wim; Nap, Jan-Peter
2014-01-13
Small RNAs are important regulators of genome function, yet their prediction in genomes is still a major computational challenge. Statistical analyses of pre-miRNA sequences indicated that their 2D structure tends to have a minimal free energy (MFE) significantly lower than MFE values of equivalently randomized sequences with the same nucleotide composition, in contrast to other classes of non-coding RNA. The computation of many MFEs is, however, too intensive to allow for genome-wide screenings. Using a local grid infrastructure, MFE distributions of random sequences were pre-calculated on a large scale. These distributions follow a normal distribution and can be used to determine the MFE distribution for any given sequence composition by interpolation. It allows on-the-fly calculation of the normal distribution for any candidate sequence composition. The speedup achieved makes genome-wide screening with this characteristic of a pre-miRNA sequence practical. Although this particular property alone will not be sufficiently discriminative to distinguish miRNAs from other sequences, the MFE-based P-value should be added to the parameters of choice to be included in the selection of potential miRNA candidates for experimental verification.
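A minimal sketch of how such an MFE-based P-value could be computed from pre-calculated normal distributions, assuming a hypothetical lookup table keyed by nucleotide composition that interpolation has already filled in (illustrative only; not the authors' pipeline):

from scipy.stats import norm

def mfe_p_value(candidate_mfe, composition, mfe_params):
    """P(MFE <= candidate_mfe) for random sequences of the same composition.

    mfe_params: dict mapping a composition key, e.g. (A%, C%, G%, U%) rounded
    to the pre-calculation grid, to an interpolated (mean, sd) pair.
    """
    mean, sd = mfe_params[composition]      # interpolated normal parameters
    z = (candidate_mfe - mean) / sd         # standardize the candidate MFE
    return norm.cdf(z)                      # small P-value => unusually low MFE

# Hypothetical usage: candidate MFE of -42.1 kcal/mol, 25/25/25/25 composition
params = {(25, 25, 25, 25): (-30.0, 5.0)}   # toy interpolated values
p = mfe_p_value(-42.1, (25, 25, 25, 25), params)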
NASA Astrophysics Data System (ADS)
Manfredi, Sabato
2016-06-01
Large-scale dynamic systems are becoming highly pervasive, with applications ranging from systems biology and environmental monitoring to sensor networks and power systems. They are characterised by high dimensionality, complexity, and uncertainty in the node dynamics and interactions, and their analysis and control design require methods whose computational demands grow as the network size and node/interaction complexity increase. Finding scalable computational methods for distributed control design of large-scale networks is therefore a challenging problem. In this paper, we investigate the robust distributed stabilisation problem of large-scale nonlinear multi-agent systems (briefly, MASs) composed of non-identical (heterogeneous) linear dynamical systems coupled by uncertain nonlinear time-varying interconnections. By employing Lyapunov stability theory and the linear matrix inequality (LMI) technique, new conditions are given for the distributed control design of large-scale MASs that can be easily solved with MATLAB toolboxes. Stabilisability of each node dynamic is a sufficient assumption for designing a globally stabilising distributed control. The proposed approach improves some of the existing LMI-based results on MASs by both overcoming their computational limits and extending the applicable scenario to large-scale nonlinear heterogeneous MASs. Additionally, the proposed LMI conditions are further reduced in computational requirements in the case of weakly heterogeneous MASs, a common scenario in real applications where the network nodes and links are affected by parameter uncertainties. One of the main advantages of the proposed approach is that it allows moving from a centralised towards a distributed computing architecture, so that the expensive computational workload spent solving LMIs may be shared among processors located at the networked nodes, thus increasing the scalability of the approach with respect to the network size. Finally, a numerical example shows the applicability of the proposed method and its advantage in terms of computational complexity when compared with existing approaches.
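To illustrate the kind of LMI feasibility problem involved, here is a plain centralized Lyapunov stability check in Python with CVXPY (a stand-in for the MATLAB toolboxes mentioned above, and far simpler than the paper's distributed conditions):

import numpy as np
import cvxpy as cp

# Check stability of dx/dt = A x by searching for P > 0 with A^T P + P A < 0.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])          # example stable node dynamics
n = A.shape[0]
eps = 1e-6

P = cp.Variable((n, n), symmetric=True)
constraints = [P >> eps * np.eye(n),
               A.T @ P + P @ A << -eps * np.eye(n)]
prob = cp.Problem(cp.Minimize(0), constraints)
prob.solve()                          # any SDP-capable solver (e.g. SCS) works

print(prob.status)                    # 'optimal' => a Lyapunov certificate exists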
A uniform approach for programming distributed heterogeneous computing systems
Grasso, Ivan; Pellegrini, Simone; Cosenza, Biagio; Fahringer, Thomas
2014-01-01
Large-scale compute clusters of heterogeneous nodes equipped with multi-core CPUs and GPUs are getting increasingly popular in the scientific community. However, such systems require a combination of different programming paradigms making application development very challenging. In this article we introduce libWater, a library-based extension of the OpenCL programming model that simplifies the development of heterogeneous distributed applications. libWater consists of a simple interface, which is a transparent abstraction of the underlying distributed architecture, offering advanced features such as inter-context and inter-node device synchronization. It provides a runtime system which tracks dependency information enforced by event synchronization to dynamically build a DAG of commands, on which we automatically apply two optimizations: collective communication pattern detection and device-host-device copy removal. We assess libWater’s performance in three compute clusters available from the Vienna Scientific Cluster, the Barcelona Supercomputing Center and the University of Innsbruck, demonstrating improved performance and scaling with different test applications and configurations. PMID:25844015
A uniform approach for programming distributed heterogeneous computing systems.
Grasso, Ivan; Pellegrini, Simone; Cosenza, Biagio; Fahringer, Thomas
2014-12-01
Large-scale compute clusters of heterogeneous nodes equipped with multi-core CPUs and GPUs are getting increasingly popular in the scientific community. However, such systems require a combination of different programming paradigms making application development very challenging. In this article we introduce libWater, a library-based extension of the OpenCL programming model that simplifies the development of heterogeneous distributed applications. libWater consists of a simple interface, which is a transparent abstraction of the underlying distributed architecture, offering advanced features such as inter-context and inter-node device synchronization. It provides a runtime system which tracks dependency information enforced by event synchronization to dynamically build a DAG of commands, on which we automatically apply two optimizations: collective communication pattern detection and device-host-device copy removal. We assess libWater's performance in three compute clusters available from the Vienna Scientific Cluster, the Barcelona Supercomputing Center and the University of Innsbruck, demonstrating improved performance and scaling with different test applications and configurations.
A challenge in PBPK model development is estimating the parameters for absorption, distribution, metabolism, and excretion of the parent compound and metabolites of interest. One approach to reduce the number of parameters has been to simplify pharmacokinetic models by lumping p...
Perspectives on the Future of CFD
NASA Technical Reports Server (NTRS)
Kwak, Dochan
2000-01-01
This viewgraph presentation gives an overview of the future of computational fluid dynamics (CFD), which in the past has pioneered the field of flow simulation. Over time, CFD has progressed along with computing power, and numerical methods have advanced as CPU and memory capacity have increased. Complex configurations are routinely computed now, and direct numerical simulations (DNS) and large eddy simulations (LES) are used to study turbulence. As computing resources have shifted to parallel and distributed platforms, computer science aspects such as scalability (algorithmic and implementation), portability, and transparent coding have advanced. Examples of potential future (or current) challenges include risk assessment, limitations of heuristic models, and the development of CFD and information technology (IT) tools.
Air Pollution Monitoring and Mining Based on Sensor Grid in London
Ma, Yajie; Richards, Mark; Ghanem, Moustafa; Guo, Yike; Hassard, John
2008-01-01
In this paper, we present a distributed infrastructure based on wireless sensor networks and Grid computing technology for air pollution monitoring and mining, which aims to develop low-cost and ubiquitous sensor networks to collect real-time, large-scale and comprehensive environmental data from road traffic emissions for air pollution monitoring in urban environments. The main informatics challenges in constructing the high-throughput sensor Grid are discussed in this paper. We present a two-layer network framework, a P2P e-Science Grid architecture, and a distributed data mining algorithm as solutions to address these challenges. We simulated the system in TinyOS to examine the operation of each sensor as well as the networking performance. We also present the distributed data mining results to examine the effectiveness of the algorithm. PMID:27879895
Air Pollution Monitoring and Mining Based on Sensor Grid in London.
Ma, Yajie; Richards, Mark; Ghanem, Moustafa; Guo, Yike; Hassard, John
2008-06-01
In this paper, we present a distributed infrastructure based on wireless sensor networks and Grid computing technology for air pollution monitoring and mining, which aims to develop low-cost and ubiquitous sensor networks to collect real-time, large-scale and comprehensive environmental data from road traffic emissions for air pollution monitoring in urban environments. The main informatics challenges in constructing the high-throughput sensor Grid are discussed in this paper. We present a two-layer network framework, a P2P e-Science Grid architecture, and a distributed data mining algorithm as solutions to address these challenges. We simulated the system in TinyOS to examine the operation of each sensor as well as the networking performance. We also present the distributed data mining results to examine the effectiveness of the algorithm.
Using CFD Surface Solutions to Shape Sonic Boom Signatures Propagated from Off-Body Pressure
NASA Technical Reports Server (NTRS)
Ordaz, Irian; Li, Wu
2013-01-01
The conceptual design of a low-boom and low-drag supersonic aircraft remains a challenge despite significant progress in recent years. Inverse design using reversed equivalent area and adjoint methods have been demonstrated to be effective in shaping the ground signature propagated from computational fluid dynamics (CFD) off-body pressure distributions. However, there is still a need to reduce the computational cost in the early stages of design to obtain a baseline that is feasible for low-boom shaping, and in the search for a robust low-boom design over the entire sonic boom footprint. The proposed design method addresses the need to reduce the computational cost for robust low-boom design by using surface pressure distributions from CFD solutions to shape sonic boom ground signatures propagated from CFD off-body pressure.
Visualization, documentation, analysis, and communication of large scale gene regulatory networks
Longabaugh, William J.R.; Davidson, Eric H.; Bolouri, Hamid
2009-01-01
Summary Genetic regulatory networks (GRNs) are complex, large-scale, and spatially and temporally distributed. These characteristics impose challenging demands on computational GRN modeling tools, and there is a need for custom modeling tools. In this paper, we report on our ongoing development of BioTapestry, an open source, freely available computational tool designed specifically for GRN modeling. We also outline our future development plans, and give some examples of current applications of BioTapestry. PMID:18757046
Challenging Density Functional Theory Calculations with Hemes and Porphyrins
de Visser, Sam P.; Stillman, Martin J.
2016-01-01
In this paper we review recent advances in computational chemistry, focusing specifically on the chemical description of heme proteins and synthetic porphyrins that act both as mimics of natural processes and in technological applications. These are challenging biochemical systems involved in electron transfer as well as biocatalysis processes. In recent years computational tools have improved considerably and can now reproduce experimental spectroscopic and reactivity studies within a reasonable error margin (several kcal·mol⁻¹). This paper gives recent examples from our groups, where we investigated heme and synthetic metal-porphyrin systems. The four case studies highlight how computational modelling can correctly reproduce experimental product distributions, predict reactivity trends, and guide the interpretation of electronic structures of complex systems. The case studies focus on the calculations of a variety of spectroscopic features of porphyrins and show how computational modelling gives important insight that explains the experimental spectra and can lead to the design of porphyrins with tuned properties. PMID:27070578
Data management and analysis for the Earth System Grid
NASA Astrophysics Data System (ADS)
Williams, D. N.; Ananthakrishnan, R.; Bernholdt, D. E.; Bharathi, S.; Brown, D.; Chen, M.; Chervenak, A. L.; Cinquini, L.; Drach, R.; Foster, I. T.; Fox, P.; Hankin, S.; Henson, V. E.; Jones, P.; Middleton, D. E.; Schwidder, J.; Schweitzer, R.; Schuler, R.; Shoshani, A.; Siebenlist, F.; Sim, A.; Strand, W. G.; Wilhelmi, N.; Su, M.
2008-07-01
The international climate community is expected to generate hundreds of petabytes of simulation data within the next five to seven years. This data must be accessed and analyzed by thousands of analysts worldwide in order to provide accurate and timely estimates of the likely impact of climate change on physical, biological, and human systems. Climate change is thus not only a scientific challenge of the first order but also a major technological challenge. In order to address this technological challenge, the Earth System Grid Center for Enabling Technologies (ESG-CET) has been established within the U.S. Department of Energy's Scientific Discovery through Advanced Computing (SciDAC)-2 program, with support from the offices of Advanced Scientific Computing Research and Biological and Environmental Research. ESG-CET's mission is to provide climate researchers worldwide with access to the data, information, models, analysis tools, and computational capabilities required to make sense of enormous climate simulation datasets. Its specific goals are to (1) make data more useful to climate researchers by developing Grid technology that enhances data usability; (2) meet specific distributed database, data access, and data movement needs of national and international climate projects; (3) provide a universal and secure web-based data access portal for broad multi-model data collections; and (4) provide a wide-range of Grid-enabled climate data analysis tools and diagnostic methods to international climate centers and U.S. government agencies. Building on the successes of the previous Earth System Grid (ESG) project, which has enabled thousands of researchers to access tens of terabytes of data from a small number of ESG sites, ESG-CET is working to integrate a far larger number of distributed data providers, high-bandwidth wide-area networks, and remote computers in a highly collaborative problem-solving environment.
S-Cube: Enabling the Next Generation of Software Services
NASA Astrophysics Data System (ADS)
Metzger, Andreas; Pohl, Klaus
The Service Oriented Architecture (SOA) paradigm is increasingly adopted by industry for building distributed software systems. However, when designing, developing and operating innovative software services and service-based systems, several challenges exist. Those challenges include how to manage the complexity of those systems, how to establish, monitor and enforce Quality of Service (QoS) and Service Level Agreements (SLAs), as well as how to build those systems such that they can proactively adapt to dynamically changing requirements and context conditions. Developing foundational solutions for those challenges requires joint efforts of different research communities such as Business Process Management, Grid Computing, Service Oriented Computing and Software Engineering. This paper provides an overview of S-Cube, the European Network of Excellence on Software Services and Systems. S-Cube brings together researchers from leading research institutions across Europe, who join their competences to develop foundations, theories as well as methods and tools for future service-based systems.
ClimateSpark: An In-memory Distributed Computing Framework for Big Climate Data Analytics
NASA Astrophysics Data System (ADS)
Hu, F.; Yang, C. P.; Duffy, D.; Schnase, J. L.; Li, Z.
2016-12-01
Massive array-based climate data is being generated from global surveillance systems and model simulations. These data are widely used to analyze environmental problems such as climate change, natural hazards, and public health. However, extracting the underlying information from these big climate datasets is challenging due to both data- and computing-intensive issues in data processing and analysis. To tackle these challenges, this paper proposes ClimateSpark, an in-memory distributed computing framework to support big climate data processing. In ClimateSpark, a spatiotemporal index is developed to enable Apache Spark to treat array-based climate data (e.g. netCDF4, HDF4) as native formats, stored in the Hadoop Distributed File System (HDFS) without any preprocessing. Based on the index, spatiotemporal query services are provided to retrieve datasets according to a defined geospatial and temporal bounding box. The data subsets are read out, and a data partition strategy is applied to split the queried data equally across the computing nodes and store them in memory as climateRDDs for processing. By leveraging Spark SQL and user-defined functions (UDFs), climate data analysis operations can be conducted in intuitive SQL. ClimateSpark is evaluated in two use cases using the NASA Modern-Era Retrospective Analysis for Research and Applications (MERRA) climate reanalysis dataset: one conducts a spatiotemporal query and visualizes the subset results as an animation; the other compares different climate model outputs using a Taylor-diagram service. Experimental results show that ClimateSpark can significantly accelerate data query and processing, and enables complex analysis services to be served in a SQL-style fashion.
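A rough sketch of the style of query described above, assuming the spatiotemporal index has already exposed the climate array as a Spark DataFrame with time/lat/lon/value columns (hypothetical schema, file name, and UDF; not ClimateSpark's actual API):

from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.appName("climate-query-sketch").getOrCreate()

# Hypothetical DataFrame produced by the index: columns time, lat, lon, t2m
df = spark.read.parquet("merra_t2m.parquet")

# Spatiotemporal bounding-box query
subset = df.filter(
    (F.col("time").between("2000-01-01", "2000-12-31")) &
    (F.col("lat").between(30.0, 60.0)) &
    (F.col("lon").between(-130.0, -60.0)))

# A simple UDF (Kelvin -> Celsius) applied through Spark SQL
spark.udf.register("k_to_c", lambda k: k - 273.15, DoubleType())
subset.createOrReplaceTempView("t2m_subset")
monthly_mean = spark.sql(
    "SELECT month(time) AS month, avg(k_to_c(t2m)) AS mean_c "
    "FROM t2m_subset GROUP BY month(time) ORDER BY month")
monthly_mean.show()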
Probabilistic co-adaptive brain-computer interfacing
NASA Astrophysics Data System (ADS)
Bryan, Matthew J.; Martin, Stefan A.; Cheung, Willy; Rao, Rajesh P. N.
2013-12-01
Objective. Brain-computer interfaces (BCIs) are confronted with two fundamental challenges: (a) the uncertainty associated with decoding noisy brain signals, and (b) the need for co-adaptation between the brain and the interface so as to cooperatively achieve a common goal in a task. We seek to mitigate these challenges. Approach. We introduce a new approach to brain-computer interfacing based on partially observable Markov decision processes (POMDPs). POMDPs provide a principled approach to handling uncertainty and achieving co-adaptation in the following manner: (1) Bayesian inference is used to compute posterior probability distributions (‘beliefs’) over brain and environment state, and (2) actions are selected based on entire belief distributions in order to maximize total expected reward; by employing methods from reinforcement learning, the POMDP’s reward function can be updated over time to allow for co-adaptive behaviour. Main results. We illustrate our approach using a simple non-invasive BCI which optimizes the speed-accuracy trade-off for individual subjects based on the signal-to-noise characteristics of their brain signals. We additionally demonstrate that the POMDP BCI can automatically detect changes in the user’s control strategy and can co-adaptively switch control strategies on-the-fly to maximize expected reward. Significance. Our results suggest that the framework of POMDPs offers a promising approach for designing BCIs that can handle uncertainty in neural signals and co-adapt with the user on an ongoing basis. The fact that the POMDP BCI maintains a probability distribution over the user’s brain state allows a much more powerful form of decision making than traditional BCI approaches, which have typically been based on the output of classifiers or regression techniques. Furthermore, the co-adaptation of the system allows the BCI to make online improvements to its behaviour, adjusting itself automatically to the user’s changing circumstances.
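A toy sketch of the two core POMDP steps described above: Bayesian belief update over a discrete brain/environment state, and action selection by expected reward over the entire belief. The matrices are made up, and one-step greedy selection stands in for full POMDP planning; this is illustrative, not the paper's BCI:

import numpy as np

# Discrete POMDP pieces (toy numbers): states s, actions a, observations o
T = np.array([[[0.9, 0.1], [0.2, 0.8]],      # T[a][s][s'] transition probabilities
              [[0.8, 0.2], [0.1, 0.9]]])
O = np.array([[0.7, 0.3], [0.3, 0.7]])       # O[s'][o] observation likelihoods
R = np.array([[1.0, -1.0], [-1.0, 1.0]])     # R[s][a] immediate reward

def belief_update(belief, action, obs):
    """Bayes rule: b'(s') is proportional to O[s'][o] * sum_s T[a][s][s'] * b(s)."""
    predicted = T[action].T @ belief
    posterior = O[:, obs] * predicted
    return posterior / posterior.sum()

def greedy_action(belief):
    """Pick the action maximizing immediate expected reward under the belief."""
    return int(np.argmax(belief @ R))

b = np.array([0.5, 0.5])                      # uniform initial belief
a = greedy_action(b)
b = belief_update(b, action=a, obs=1)         # noisy decoded brain signal observed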
On joint subtree distributions under two evolutionary models.
Wu, Taoyang; Choi, Kwok Pui
2016-04-01
In population and evolutionary biology, hypotheses about micro-evolutionary and macro-evolutionary processes are commonly tested by comparing the shape indices of empirical evolutionary trees with those predicted by neutral models. A key ingredient in this approach is the ability to compute and quantify distributions of various tree shape indices under random models of interest. As a step to meet this challenge, in this paper we investigate the joint distribution of cherries and pitchforks (that is, subtrees with two and three leaves) under two widely used null models: the Yule-Harding-Kingman (YHK) model and the proportional to distinguishable arrangements (PDA) model. Based on two novel recursive formulae, we propose a dynamic approach to numerically compute the exact joint distribution (and hence the marginal distributions) for trees of any size. We also obtained insights into the statistical properties of trees generated under these two models, including a constant correlation between the cherry and the pitchfork distributions under the YHK model, and the log-concavity and unimodality of the cherry distributions under both models. In addition, we show that there exists a unique change point for the cherry distributions between these two models. Copyright © 2015 Elsevier Inc. All rights reserved.
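As a cross-check on such exact recursions, the joint counts are easy to estimate by simulation. Below is a sketch of Monte Carlo sampling under the YHK (Yule) model, counting cherries (internal nodes whose two children are leaves) and pitchforks (three-leaf subtrees); this is a simulation alternative, not the paper's recursive method:

import random
from collections import Counter

def yule_tree(n):
    """Grow a Yule tree with n leaves; children[v] lists v's two children."""
    children = {0: []}
    leaves = [0]
    nxt = 1
    while len(leaves) < n:
        v = leaves.pop(random.randrange(len(leaves)))  # split a uniform random leaf
        children[v] = [nxt, nxt + 1]
        children[nxt] = []
        children[nxt + 1] = []
        leaves += [nxt, nxt + 1]
        nxt += 2
    return children

def cherries_and_pitchforks(children):
    def leaf_count(v):
        return 1 if not children[v] else sum(leaf_count(c) for c in children[v])
    sizes = {v: leaf_count(v) for v in children}
    cherries = sum(1 for v in children if children[v] and sizes[v] == 2)
    pitchforks = sum(1 for v in children if children[v] and sizes[v] == 3)
    return cherries, pitchforks

# Empirical joint (cherry, pitchfork) distribution for trees with 20 leaves
counts = Counter(cherries_and_pitchforks(yule_tree(20)) for _ in range(10000))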
Responsive systems - The challenge for the nineties
NASA Technical Reports Server (NTRS)
Malek, Miroslaw
1990-01-01
A concept of responsive computer systems will be introduced. The emerging responsive systems demand fault-tolerant and real-time performance in parallel and distributed computing environments. The design methodologies for fault-tolerant, real time and responsive systems will be presented. Novel techniques of introducing redundancy for improved performance and dependability will be illustrated. The methods of system responsiveness evaluation will be proposed. The issues of determinism, closed and open systems will also be discussed from the perspective of responsive systems design.
NASA Astrophysics Data System (ADS)
Poudel, Joemini; Matthews, Thomas P.; Mitsuhashi, Kenji; Garcia-Uribe, Alejandro; Wang, Lihong V.; Anastasio, Mark A.
2017-03-01
Photoacoustic computed tomography (PACT) is an emerging computed imaging modality that exploits optical contrast and ultrasonic detection principles to form images of the photoacoustically induced initial pressure distribution within tissue. The PACT reconstruction problem corresponds to a time-domain inverse source problem, where the initial pressure distribution is recovered from the measurements recorded on an aperture outside the support of the source. A major challenge in transcranial PACT brain imaging is to compensate for aberrations in the measured data due to the propagation of the photoacoustic wavefields through the skull. To properly account for these effects, a wave equation-based inversion method should be employed that can model the heterogeneous elastic properties of the medium. In this study, an iterative image reconstruction method for 3D transcranial PACT is developed based on the elastic wave equation. To accomplish this, a forward model based on a finite-difference time-domain discretization of the elastic wave equation is established. Subsequently, gradient-based methods are employed for computing penalized least squares estimates of the initial source distribution that produced the measured photoacoustic data. The developed reconstruction algorithm is validated and investigated through computer-simulation studies.
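In the notation common for such inverse problems, with H the discretized elastic-wave forward operator mapping the initial pressure p_0 to simulated measurements and \hat{u} the measured photoacoustic data, the penalized least-squares estimate referred to above can be written (a generic sketch; the paper's exact penalty and constraints may differ) as

\hat{p}_0 \;=\; \underset{p_0 \ge 0}{\arg\min}\; \tfrac{1}{2}\,\bigl\| H p_0 - \hat{u} \bigr\|_2^2 \;+\; \lambda\, R(p_0),

which gradient-based methods solve with iterations of the form p_0^{(k+1)} = p_0^{(k)} - \alpha_k \bigl[ H^{*}\bigl(H p_0^{(k)} - \hat{u}\bigr) + \lambda \nabla R\bigl(p_0^{(k)}\bigr) \bigr], where H^{*} is the adjoint of the forward operator and R a regularizer such as total variation.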
MapReduce Based Parallel Bayesian Network for Manufacturing Quality Control
NASA Astrophysics Data System (ADS)
Zheng, Mao-Kuan; Ming, Xin-Guo; Zhang, Xian-Yu; Li, Guo-Ming
2017-09-01
The increasing complexity of industrial products and manufacturing processes has challenged conventional statistics-based quality management approaches under dynamic production conditions. A Bayesian network and big data analytics integrated approach for manufacturing process quality analysis and control is proposed. Based on the Hadoop distributed architecture and the MapReduce parallel computing model, the large-volume, high-variety quality-related data generated during the manufacturing process can be handled. Artificial intelligence algorithms, including Bayesian network learning, classification and reasoning, are embedded into the Reduce process. Relying on the ability of the Bayesian network to deal with dynamic and uncertain problems and the parallel computing power of MapReduce, a Bayesian network of the factors affecting quality is built from prior probability distributions and modified with posterior probability distributions. A case study on hull segment manufacturing precision management for ship and offshore platform building shows that computing speed accelerates almost in direct proportion to the number of computing nodes. It is also shown that the proposed model is feasible for locating and reasoning about root causes, forecasting manufacturing outcomes, and intelligent decision-making for precision problem solving. The integration of big data analytics and the BN method offers a whole new perspective on manufacturing quality control.
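A minimal sketch of the map/reduce split for one piece of such a pipeline, learning the conditional probability tables of a known-structure Bayesian network from partitioned quality records. This is a pure-Python stand-in for Hadoop MapReduce, and the variable names are illustrative, not taken from the paper:

from collections import Counter
from functools import reduce

# Known structure: each node maps to its parent list (toy quality-control example)
STRUCTURE = {"precision_ok": ["machine", "operator"], "machine": [], "operator": []}

def map_counts(records):
    """Map: emit counts of (node, parent_values, node_value) per data partition."""
    counts = Counter()
    for rec in records:
        for node, parents in STRUCTURE.items():
            key = (node, tuple(rec[p] for p in parents), rec[node])
            counts[key] += 1
    return counts

def reduce_counts(c1, c2):
    """Reduce: merge partial counts from different partitions."""
    return c1 + c2

def to_cpts(counts):
    """Normalize merged counts into conditional probability tables."""
    totals = Counter()
    for (node, pvals, _), n in counts.items():
        totals[(node, pvals)] += n
    return {k: n / totals[(k[0], k[1])] for k, n in counts.items()}

partitions = [[{"machine": "M1", "operator": "A", "precision_ok": 1}],
              [{"machine": "M1", "operator": "A", "precision_ok": 0}]]
cpts = to_cpts(reduce(reduce_counts, map(map_counts, partitions)))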
Virtual patient simulator for distributed collaborative medical education.
Caudell, Thomas P; Summers, Kenneth L; Holten, Jim; Hakamata, Takeshi; Mowafi, Moad; Jacobs, Joshua; Lozanoff, Beth K; Lozanoff, Scott; Wilks, David; Keep, Marcus F; Saiki, Stanley; Alverson, Dale
2003-01-01
Project TOUCH (Telehealth Outreach for Unified Community Health; http://hsc.unm.edu/touch) investigates the feasibility of using advanced technologies to enhance education in an innovative problem-based learning format currently being used in medical school curricula, applying specific clinical case models, and deploying to remote sites/workstations. The University of New Mexico's School of Medicine and the John A. Burns School of Medicine at the University of Hawai'i face similar health care challenges in providing and delivering services and training to remote and rural areas. Recognizing that health care needs are local and require local solutions, both states are committed to improving health care delivery to their unique populations by sharing information and experiences through emerging telehealth technologies by using high-performance computing and communications resources. The purpose of this study is to describe the deployment of a problem-based learning case distributed over the National Computational Science Alliance's Access Grid. Emphasis is placed on the underlying technical components of the TOUCH project, including the virtual reality development tool Flatland, the artificial intelligence-based simulation engine, the Access Grid, high-performance computing platforms, and the software that connects them all. In addition, educational and technical challenges for Project TOUCH are identified. Copyright 2003 Wiley-Liss, Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Singh, Ravindra; Reilly, James T.; Wang, Jianhui
Electric distribution utilities encounter many challenges to successful deployment of Distribution Management Systems (DMSs). The key challenges are documented in this report, along with suggestions for overcoming them. This report offers a recommended list of activities for implementing a DMS. It takes a strategic approach to implementing DMS from a project management perspective. The project management strategy covers DMS planning, procurement, design, building, testing, installation, commissioning, and system integration issues and solutions. It identifies the risks that are associated with implementation and suggests strategies for utilities to use to mitigate them or avoid them altogether. Attention is given to common barriers to successful DMS implementation. This report begins with an overview of the implementation strategy for a DMS and proceeds to put forward a basic approach for procuring hardware and software for a DMS; designing the interfaces with external corporate computing systems such as EMS, GIS, OMS, and AMI; and implementing a complete solution.
2014-01-01
Background Small RNAs are important regulators of genome function, yet their prediction in genomes is still a major computational challenge. Statistical analyses of pre-miRNA sequences indicated that their 2D structure tends to have a minimal free energy (MFE) significantly lower than MFE values of equivalently randomized sequences with the same nucleotide composition, in contrast to other classes of non-coding RNA. The computation of many MFEs is, however, too intensive to allow for genome-wide screenings. Results Using a local grid infrastructure, MFE distributions of random sequences were pre-calculated on a large scale. These distributions follow a normal distribution and can be used to determine the MFE distribution for any given sequence composition by interpolation. It allows on-the-fly calculation of the normal distribution for any candidate sequence composition. Conclusion The speedup achieved makes genome-wide screening with this characteristic of a pre-miRNA sequence practical. Although this particular property alone will not be sufficiently discriminative to distinguish miRNAs from other sequences, the MFE-based P-value should be added to the parameters of choice to be included in the selection of potential miRNA candidates for experimental verification. PMID:24418292
NASA Astrophysics Data System (ADS)
Zhang, L.; Eskin, D. G.; Miroux, A.; Subroto, T.; Katgerman, L.
2012-07-01
Controlling macrosegregation is one of the major challenges in direct-chill (DC) casting of aluminium alloys. In this paper, the effect of the inlet geometry (which influences the melt distribution) on macrosegregation during the DC casting of 7050 alloy billets was studied experimentally and by using 2D computer modelling. The ALSIM model was used to determine the temperature and flow patterns during DC casting. The results from the computer simulations show that the sump profiles and flow patterns in the billet are strongly influenced by the melt flow distribution determined by the inlet geometry. These observations were correlated to the actual macrosegregation patterns found in the as-cast billets produced by having two different inlet geometries. The macrosegregation analysis presented here may assist in determining the critical parameters to consider for improving the casting of 7XXX aluminium alloys.
NASA Astrophysics Data System (ADS)
Kiktenko, E. O.; Pozhar, N. O.; Anufriev, M. N.; Trushechkin, A. S.; Yunusov, R. R.; Kurochkin, Y. V.; Lvovsky, A. I.; Fedorov, A. K.
2018-07-01
Blockchain is a distributed database which is cryptographically protected against malicious modifications. While promising for a wide range of applications, current blockchain platforms rely on digital signatures, which are vulnerable to attacks by means of quantum computers. The same, albeit to a lesser extent, applies to cryptographic hash functions that are used in preparing new blocks, so parties with access to quantum computation would have unfair advantage in procuring mining rewards. Here we propose a possible solution to the quantum era blockchain challenge and report an experimental realization of a quantum-safe blockchain platform that utilizes quantum key distribution across an urban fiber network for information-theoretically secure authentication. These results address important questions about realizability and scalability of quantum-safe blockchains for commercial and governmental applications.
Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing
Klimentov, A.; Buncic, P.; De, K.; ...
2015-05-22
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Management System (WMS) for managing the workflow for all data processing on hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10^2) sites, O(10^5) cores, O(10^8) jobs per year, O(10^3) users, and ATLAS data volume is O(10^17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project titled 'Next Generation Workload Management and Analysis System for Big Data' (BigPanDA) is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to set up and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center "Kurchatov Institute" together with ALICE distributed computing and ORNL computing professionals. Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. Finally, we will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.
Next Generation Workload Management System For Big Data on Heterogeneous Distributed Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klimentov, A.; Buncic, P.; De, K.
The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS and ALICE are the largest collaborations ever assembled in the sciences and are at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, both experiments rely on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Management System (WMS) for managing the workflow for all data processing on hundreds of data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. The scale is demonstrated by the following numbers: PanDA manages O(10^2) sites, O(10^5) cores, O(10^8) jobs per year, O(10^3) users, and ATLAS data volume is O(10^17) bytes. In 2013 we started an ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF). The project titled 'Next Generation Workload Management and Analysis System for Big Data' (BigPanDA) is funded by DOE ASCR and HEP. Extending PanDA to clouds and LCF presents new challenges in managing heterogeneity and supporting workflow. The BigPanDA project is underway to set up and tailor PanDA at the Oak Ridge Leadership Computing Facility (OLCF) and at the National Research Center "Kurchatov Institute" together with ALICE distributed computing and ORNL computing professionals. Our approach to integration of HPC platforms at the OLCF and elsewhere is to reuse, as much as possible, existing components of the PanDA system. Finally, we will present our current accomplishments with running the PanDA WMS at OLCF and other supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facilities infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications.
Zhang, Lei; Zhang, Jing
2017-08-07
A Smart Grid (SG) facilitates bidirectional demand-response communication between individual users and power providers with high computation and communication performance but also brings about the risk of leaking users' private information. Therefore, improving the individual power requirement and distribution efficiency to ensure communication reliability while preserving user privacy is a new challenge for SG. To address this issue, we propose an efficient and privacy-preserving power requirement and distribution aggregation scheme (EPPRD) based on a hierarchical communication architecture. In the proposed scheme, an efficient encryption and authentication mechanism is proposed to better fit each individual demand-response situation. Through extensive analysis and experiment, we demonstrate how the EPPRD resists various security threats and preserves user privacy while satisfying the individual requirement in a semi-honest model; it involves less communication overhead and computation time than the existing competing schemes.
Zhang, Lei; Zhang, Jing
2017-01-01
A Smart Grid (SG) facilitates bidirectional demand-response communication between individual users and power providers with high computation and communication performance but also brings about the risk of leaking users’ private information. Therefore, improving the individual power requirement and distribution efficiency to ensure communication reliability while preserving user privacy is a new challenge for SG. To address this issue, we propose an efficient and privacy-preserving power requirement and distribution aggregation scheme (EPPRD) based on a hierarchical communication architecture. In the proposed scheme, an efficient encryption and authentication mechanism is proposed to better fit each individual demand-response situation. Through extensive analysis and experiment, we demonstrate how the EPPRD resists various security threats and preserves user privacy while satisfying the individual requirement in a semi-honest model; it involves less communication overhead and computation time than the existing competing schemes. PMID:28783122
Liu, Ren; Srivastava, Anurag K.; Bakken, David E.; ...
2017-08-17
Intermittency of wind energy poses a great challenge for power system operation and control. Wind curtailment might be necessary under certain operating conditions to keep line flows within limits. A Remedial Action Scheme (RAS) offers a quick control action mechanism to maintain the reliability and security of power system operation with high wind energy integration. In this paper, a new RAS is developed to maximize wind energy integration without compromising the security and reliability of the power system, based on specific utility requirements. A new Distributed Linear State Estimation (DLSE) is also developed to provide fast and accurate input data for the proposed RAS. A distributed computational architecture is designed to guarantee the robustness of the cyber system supporting the RAS and DLSE implementation. The proposed RAS and DLSE are validated using the modified IEEE 118-bus system. Simulation results demonstrate the satisfactory performance of the DLSE and the effectiveness of the RAS. A real-time cyber-physical testbed has been utilized to validate the cyber-resiliency of the developed RAS against computational node failure.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Ren; Srivastava, Anurag K.; Bakken, David E.
Intermittency of wind energy poses a great challenge for power system operation and control. Wind curtailment might be necessary under certain operating conditions to keep line flows within limits. A Remedial Action Scheme (RAS) offers a quick control action mechanism to maintain the reliability and security of power system operation with high wind energy integration. In this paper, a new RAS is developed to maximize wind energy integration without compromising the security and reliability of the power system, based on specific utility requirements. A new Distributed Linear State Estimation (DLSE) is also developed to provide fast and accurate input data for the proposed RAS. A distributed computational architecture is designed to guarantee the robustness of the cyber system supporting the RAS and DLSE implementation. The proposed RAS and DLSE are validated using the modified IEEE 118-bus system. Simulation results demonstrate the satisfactory performance of the DLSE and the effectiveness of the RAS. A real-time cyber-physical testbed has been utilized to validate the cyber-resiliency of the developed RAS against computational node failure.
StrAuto: automation and parallelization of STRUCTURE analysis.
Chhatre, Vikram E; Emerson, Kevin J
2017-03-24
Population structure inference using the software STRUCTURE has become an integral part of population genetic studies covering a broad spectrum of taxa, including humans. The ever-expanding size of genetic data sets poses computational challenges for this analysis. Although at least one tool currently implements parallel computing to reduce the computational load of this analysis, it does not fully automate the use of replicate STRUCTURE runs required for downstream inference of the optimal K. There is a pressing need for a tool that can deploy population structure analysis on high performance computing clusters. We present an updated version of the popular Python program StrAuto to streamline population structure analysis using parallel computing. StrAuto implements a pipeline that combines STRUCTURE analysis with the Evanno ΔK analysis and visualization of results using STRUCTURE HARVESTER. Using benchmarking tests, we demonstrate that StrAuto significantly reduces the computational time needed to perform iterative STRUCTURE analysis by distributing runs over two or more processors. StrAuto is the first tool to integrate STRUCTURE analysis with post-processing using a pipeline approach, in addition to implementing parallel computation - a setup ideal for deployment on computing clusters. StrAuto is distributed under the GNU GPL (General Public License) and is available to download from http://strauto.popgen.org.
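A stripped-down sketch of the parallelization idea: fan replicate STRUCTURE runs over a range of K values across local processors with multiprocessing and subprocess. The command-line flags, file names, and directory layout below are assumptions for illustration; the real tool adds Evanno ΔK post-processing and cluster deployment:

import itertools
import os
import subprocess
from multiprocessing import Pool

K_RANGE = range(1, 11)       # K values to test
REPLICATES = range(1, 6)     # replicate runs per K

def run_structure(args):
    """Launch one STRUCTURE run; flags are assumed, adjust to your installation."""
    k, rep = args
    out = f"results/K{k}_rep{rep}"
    cmd = ["structure", "-K", str(k), "-m", "mainparams",
           "-e", "extraparams", "-o", out]
    subprocess.run(cmd, check=True)
    return out

if __name__ == "__main__":
    os.makedirs("results", exist_ok=True)
    jobs = list(itertools.product(K_RANGE, REPLICATES))
    with Pool(processes=8) as pool:        # runs distributed over 8 processors
        outputs = pool.map(run_structure, jobs)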
NASA Astrophysics Data System (ADS)
Zhou, Jianfeng; Xu, Benda; Peng, Chuan; Yang, Yang; Huo, Zhuoxi
2015-08-01
AIRE-Linux is a dedicated Linux system for astronomers. Modern astronomy faces two big challenges: massive raw observational data covering the whole electromagnetic spectrum, and data-processing demands that exceed the skills of an individual or even a small team. AIRE-Linux, a specially designed Linux distribution that will be delivered to users as Virtual Machine (VM) images in Open Virtualization Format (OVF), aims to help astronomers confront these challenges. Most astronomical software packages, such as IRAF, MIDAS, CASA, and Heasoft, will be integrated into AIRE-Linux. It is easy for astronomers to configure and customize the system and use just what they need. When incorporated into cloud computing platforms, AIRE-Linux will be able to handle data-intensive and compute-intensive tasks for astronomers. Currently, a beta version of AIRE-Linux is ready for download and testing.
NASA Astrophysics Data System (ADS)
Dewhurst, A.; Legger, F.
2015-12-01
The ATLAS experiment accumulated more than 140 PB of data during the first run of the Large Hadron Collider (LHC) at CERN. The analysis of such an amount of data is a challenging task for the distributed physics community. The Distributed Analysis (DA) system of the ATLAS experiment is an established and stable component of the ATLAS distributed computing operations. About half a million user jobs are running daily on DA resources, submitted by more than 1500 ATLAS physicists. The reliability of the DA system during the first run of the LHC and the following shutdown period has been high thanks to the continuous automatic validation of the distributed analysis sites and the user support provided by a dedicated team of expert shifters. During the LHC shutdown, the ATLAS computing model has undergone several changes to improve the analysis workflows, including the re-design of the production system, a new analysis data format and event model, and the development of common reduction and analysis frameworks. We report on the impact such changes have on the DA infrastructure, describe the new DA components, and include recent performance measurements.
A communication efficient and scalable distributed data mining for the astronomical data
NASA Astrophysics Data System (ADS)
Govada, A.; Sahay, S. K.
2016-07-01
In 2020, ∼60 PB of archived data will be accessible to astronomers, but analyzing such a vast amount of data will be a challenging task. This is largely due to the prevailing computational model, in which data are downloaded from complex, geographically distributed archives to a central site and then analyzed on local systems. Because the data must be downloaded to a central site, network bandwidth limitations hinder scientific discovery, and analyzing PB-scale data on local machines in a centralized manner is itself challenging. The virtual observatory (VO) is a step toward addressing this problem; however, it does not provide a data mining model (Zhang et al., 2004). Adding a distributed data mining layer to the VO can be a solution: astronomers download the extracted knowledge instead of the raw data, and can then either reconstruct the data from the downloaded knowledge or use the knowledge directly for further analysis. Therefore, in this paper, we present Distributed Load Balancing Principal Component Analysis for optimally distributing the computation among the available nodes so as to minimize the transmission cost and the downloading cost for the end user. The experimental analysis is done with Fundamental Plane (FP) data, Gadotti data and the complex Mfeat data. In terms of transmission cost, our approach performs better than those of Qi et al. and Yue et al. The analysis shows that with the complex Mfeat data, ∼90% of the downloading cost can be reduced for the end user with negligible loss in accuracy.
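The specific load-balancing algorithm is not reproduced here, but the underlying idea of shipping low-dimensional summaries instead of raw data can be sketched generically: each node reduces its local block to sufficient statistics, and only those small matrices travel over the network. All function and variable names below are illustrative.

```python
# Generic sketch: each node summarizes its local data with sufficient
# statistics (row count, column sums, cross-products) so that only small
# matrices, not raw rows, are transmitted to the coordinating site. This
# illustrates "download the knowledge, not the data"; it is not the authors'
# specific load-balancing algorithm.
import numpy as np

def local_summary(x):
    """Return (n, column sums, X^T X) for one node's data block."""
    return x.shape[0], x.sum(axis=0), x.T @ x

def global_principal_components(summaries, n_components=2):
    n = sum(s[0] for s in summaries)
    total = sum(s[1] for s in summaries)
    xtx = sum(s[2] for s in summaries)
    mean = total / n
    cov = xtx / n - np.outer(mean, mean)       # global covariance from summaries
    vals, vecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    return vecs[:, ::-1][:, :n_components]     # top principal components

# Example with three simulated "nodes"
rng = np.random.default_rng(0)
nodes = [rng.normal(size=(1000, 5)) for _ in range(3)]
components = global_principal_components([local_summary(x) for x in nodes])
print(components.shape)   # (5, 2)
```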
Probabilistic brains: knowns and unknowns
Pouget, Alexandre; Beck, Jeffrey M; Ma, Wei Ji; Latham, Peter E
2015-01-01
There is strong behavioral and physiological evidence that the brain both represents probability distributions and performs probabilistic inference. Computational neuroscientists have started to shed light on how these probabilistic representations and computations might be implemented in neural circuits. One particularly appealing aspect of these theories is their generality: they can be used to model a wide range of tasks, from sensory processing to high-level cognition. To date, however, these theories have only been applied to very simple tasks. Here we discuss the challenges that will emerge as researchers start focusing their efforts on real-life computations, with a focus on probabilistic learning, structural learning and approximate inference. PMID:23955561
HammerCloud: A Stress Testing System for Distributed Analysis
NASA Astrophysics Data System (ADS)
van der Ster, Daniel C.; Elmsheuser, Johannes; Úbeda García, Mario; Paladin, Massimo
2011-12-01
Distributed analysis of LHC data is an I/O-intensive activity which places large demands on the internal network, storage, and local disks at remote computing facilities. Commissioning and maintaining a site to provide an efficient distributed analysis service is therefore a challenge which can be aided by tools to help evaluate a variety of infrastructure designs and configurations. HammerCloud is one such tool; it is a stress testing service which is used by central operations teams, regional coordinators, and local site admins to (a) submit an arbitrary number of analysis jobs to a number of sites, (b) maintain, at a steady state, a predefined number of jobs running at the sites under test, (c) produce web-based reports summarizing the efficiency and performance of the sites under test, and (d) present a web interface for historical test results to both evaluate progress and compare sites. HammerCloud was built around the distributed analysis framework Ganga, exploiting its API for grid job management. HammerCloud has been employed by the ATLAS experiment for continuous testing of many sites worldwide, and also during large-scale computing challenges such as STEP'09 and UAT'09, where the scale of the tests exceeded 10,000 concurrently running and 1,000,000 total jobs over multi-day periods. In addition, HammerCloud is being adopted by the CMS experiment; the plugin structure of HammerCloud allows the execution of CMS jobs using their official tool (CRAB).
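The steady-state behaviour in item (b) amounts to a top-up loop: periodically compare the number of running jobs at each site with the target and submit the deficit. The sketch below only illustrates that control loop; the `count_running_jobs` and `submit_jobs` functions are placeholders, not the HammerCloud or Ganga API.

```python
# Conceptual sketch of the steady-state top-up loop: keep a fixed number of
# test jobs running at each site by submitting new jobs as old ones finish.
# The query/submit functions are placeholders, not real HammerCloud or Ganga calls.
import random
import time

TARGET_RUNNING = 50          # predefined steady-state level per site

def count_running_jobs(site):
    return random.randint(0, TARGET_RUNNING)          # placeholder for a real query

def submit_jobs(site, n):
    print(f"submitting {n} analysis jobs to {site}")  # placeholder submission

def keep_steady_state(sites, duration_s=30, poll_s=5):
    end = time.time() + duration_s
    while time.time() < end:
        for site in sites:
            deficit = TARGET_RUNNING - count_running_jobs(site)
            if deficit > 0:
                submit_jobs(site, deficit)
        time.sleep(poll_s)

keep_steady_state(["SITE_A", "SITE_B"])
```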
Cloud based emergency health care information service in India.
Karthikeyan, N; Sukanesh, R
2012-12-01
A hospital is a health care organization providing patient treatment by expert physicians, surgeons and equipment. A report from a health care accreditation group says that miscommunication between patients and health care providers is the reason for the gap in providing emergency medical care to people in need. In developing countries, illiteracy is a major root cause of deaths resulting from diseases of uncertain nature and constitutes a serious public health problem. Mentally affected, differently abled and unconscious patients cannot communicate their medical history to medical practitioners, and practitioners cannot instantly view or edit DICOM images. Our aim is to provide a palm vein pattern recognition based medical record retrieval system, using cloud computing, for the people mentioned above. Distributed computing technology is emerging in new forms such as Grid computing and Cloud computing, which promise to deliver Information Technology (IT) as a service. In this paper, we describe how these new forms of distributed computing can help modern health care industries. Cloud computing is extending its benefits to industrial sectors, especially in medical scenarios: IT-related capabilities and resources are provided as services, on demand, via distributed computing. This paper is concerned with developing software as a service (SaaS) by means of cloud computing, with the aim of bringing the emergency health care sector under one umbrella with physically secured patient records. In emergency healthcare treatment, the crucial information needed to make decisions about patients is their previous health records, so ubiquitous access to the appropriate records is essential. Palm vein pattern recognition promises secure patient record access. Our paper likewise presents an efficient means to view, edit or transfer DICOM images instantly, which has been a challenging task for medical practitioners in past years. We have developed two services for health care: 1. a cloud-based palm vein recognition system, and 2. distributed medical image processing tools for medical practitioners.
A Logically Centralized Approach for Control and Management of Large Computer Networks
ERIC Educational Resources Information Center
Iqbal, Hammad A.
2012-01-01
Management of large enterprise and Internet service provider networks is a complex, error-prone, and costly challenge. It is widely accepted that the key contributors to this complexity are the bundling of control and data forwarding in traditional routers and the use of fully distributed protocols for network control. To address these…
QM/QM approach to model energy disorder in amorphous organic semiconductors.
Friederich, Pascal; Meded, Velimir; Symalla, Franz; Elstner, Marcus; Wenzel, Wolfgang
2015-02-10
It is an outstanding challenge to model the electronic properties of organic amorphous materials utilized in organic electronics. Computation of the charge carrier mobility is a challenging problem as it requires integration of morphological and electronic degrees of freedom in a coherent methodology and depends strongly on the distribution of polaron energies in the system. Here we present a QM/QM model to compute the polaron energies, combining density functional methods for molecules in the vicinity of the polaron with computationally efficient density functional based tight binding methods for the rest of the environment. For seven widely used amorphous organic semiconductor materials, we show that the calculations are accelerated by up to 1 order of magnitude without any loss in accuracy. Considering that the quantum chemical step is the efficiency bottleneck of a workflow to model the carrier mobility, these results are an important step toward accurate and efficient simulations of disordered organic semiconductors, a prerequisite for accelerated materials screening and consequent component optimization in the organic electronics industry.
Data distribution method of workflow in the cloud environment
NASA Astrophysics Data System (ADS)
Wang, Yong; Wu, Junjuan; Wang, Ying
2017-08-01
Cloud computing for workflow applications provides the required high-efficiency computation and large storage capacity, but it also brings challenges for the protection of trade secrets and other private data. Because protecting private data increases the data transmission time, this paper presents a new data allocation algorithm, based on the collaborative damage degree of the data, to improve the existing data allocation strategy. Confidential data are kept on the private cloud while the remaining data may use the public cloud; a static allocation method in the initial stage partitions only the non-confidential data, improving on the original allocation, and in the operational phase the data distribution scheme is dynamically adjusted as new data are continuously generated. The experimental results show that the improved method is effective in reducing the data transmission time.
Efficient calculation of luminance variation of a luminaire that uses LED light sources
NASA Astrophysics Data System (ADS)
Goldstein, Peter
2007-09-01
Many luminaires have an array of LEDs that illuminate a lenslet-array diffuser in order to create the appearance of a single, extended source with a smooth luminance distribution. Designing such a system is challenging because luminance calculations for a lenslet array generally involve tracing millions of rays per LED, which is computationally intensive and time-consuming. This paper presents a technique for calculating an on-axis luminance distribution by tracing only one ray per LED per lenslet. A multiple-LED system is simulated with this method, and with Monte Carlo ray-tracing software for comparison. Accuracy improves, and computation time decreases by at least five orders of magnitude with this technique, which has applications in LED-based signage, displays, and general illumination.
Use of Emerging Grid Computing Technologies for the Analysis of LIGO Data
NASA Astrophysics Data System (ADS)
Koranda, Scott
2004-03-01
The LIGO Scientific Collaboration (LSC) today faces the challenge of enabling analysis of terabytes of LIGO data by hundreds of scientists from institutions all around the world. To meet this challenge the LSC is developing tools, infrastructure, applications, and expertise leveraging Grid Computing technologies available today, and making available to LSC scientists compute resources at sites across the United States and Europe. We use digital credentials for strong and secure authentication and authorization to compute resources and data. Building on top of products from the Globus project for high-speed data transfer and information discovery we have created the Lightweight Data Replicator (LDR) to securely and robustly replicate data to resource sites. We have deployed at our computing sites the Virtual Data Toolkit (VDT) Server and Client packages, developed in collaboration with our partners in the GriPhyN and iVDGL projects, providing uniform access to distributed resources for users and their applications. Taken together these Grid Computing technologies and infrastructure have formed the LSC DataGrid--a coherent and uniform environment across two continents for the analysis of gravitational-wave detector data. Much work, however, remains in order to scale current analyses and recent lessons learned need to be integrated into the next generation of Grid middleware.
Virtual Observatory and Distributed Data Mining
NASA Astrophysics Data System (ADS)
Borne, Kirk D.
2012-03-01
New modes of discovery are enabled by the growth of data and computational resources (i.e., cyberinfrastructure) in the sciences. This cyberinfrastructure includes structured databases, virtual observatories (distributed data, as described in Section 20.2.1 of this chapter), high-performance computing (petascale machines), distributed computing (e.g., the Grid, the Cloud, and peer-to-peer networks), intelligent search and discovery tools, and innovative visualization environments. Data streams from experiments, sensors, and simulations are increasingly complex and growing in volume. This is true in most sciences, including astronomy, climate simulations, Earth observing systems, remote sensing data collections, and sensor networks. At the same time, we see an emerging confluence of new technologies and approaches to science, most clearly visible in the growing synergism of the four modes of scientific discovery: sensors-modeling-computing-data (Eastman et al. 2005). This has been driven by numerous developments, including the information explosion, development of large-array sensors, acceleration in high-performance computing (HPC) power, advances in algorithms, and efficient modeling techniques. Among these, the most extreme is the growth in new data. Specifically, the acquisition of data in all scientific disciplines is rapidly accelerating and causing a data glut (Bell et al. 2007). It has been estimated that data volumes double every year—for example, the NCSA (National Center for Supercomputing Applications) reported that their users cumulatively generated one petabyte of data over the first 19 years of NCSA operation, but they then generated their next one petabyte in the next year alone, and the data production has been growing by almost 100% each year after that (Butler 2008). The NCSA example is just one of many demonstrations of the exponential (annual data-doubling) growth in scientific data collections. In general, this putative data-doubling is an inevitable result of several compounding factors: the proliferation of data-generating devices, sensors, projects, and enterprises; the 18-month doubling of the digital capacity of these microprocessor-based sensors and devices (commonly referred to as "Moore’s law"); the move to digital for nearly all forms of information; the increase in human-generated data (both unstructured information on the web and structured data from experiments, models, and simulation); and the ever-expanding capability of higher density media to hold greater volumes of data (i.e., data production expands to fill the available storage space). These factors are consequently producing an exponential data growth rate, which will soon (if not already) become an insurmountable technical challenge even with the great advances in computation and algorithms. This technical challenge is compounded by the ever-increasing geographic dispersion of important data sources—the data collections are not stored uniformly at a single location, or with a single data model, or in uniform formats and modalities (e.g., images, databases, structured and unstructured files, and XML data sets)—the data are in fact large, distributed, heterogeneous, and complex. The greatest scientific research challenge with these massive distributed data collections is consequently extracting all of the rich information and knowledge content contained therein, thus requiring new approaches to scientific research. 
This emerging data-intensive and data-oriented approach to scientific research is sometimes called discovery informatics or X-informatics (where X can be any science, such as bio, geo, astro, chem, eco, or anything; Agresti 2003; Gray 2003; Borne 2010). This data-oriented approach to science is now recognized by some (e.g., Mahootian and Eastman 2009; Hey et al. 2009) as the fourth paradigm of research, following (historically) experiment/observation, modeling/analysis, and computational science.
Johnson, Timothy C.; Versteeg, Roelof J.; Ward, Andy; Day-Lewis, Frederick D.; Revil, André
2010-01-01
Electrical geophysical methods have found wide use in the growing discipline of hydrogeophysics for characterizing the electrical properties of the subsurface and for monitoring subsurface processes in terms of the spatiotemporal changes in subsurface conductivity, chargeability, and source currents they govern. Presently, multichannel and multielectrode data collection systems can collect large data sets in relatively short periods of time. Practitioners, however, often are unable to fully utilize these large data sets and the information they contain because of standard desktop-computer processing limitations. These limitations can be addressed by utilizing the storage and processing capabilities of parallel computing environments. We have developed a parallel distributed-memory forward and inverse modeling algorithm for analyzing resistivity and time-domain induced polarization (IP) data. The primary components of the parallel computations include distributed computation of the pole solutions in forward mode, distributed storage and computation of the Jacobian matrix in inverse mode, and parallel execution of the inverse equation solver. We have tested the corresponding parallel code in three efforts: (1) resistivity characterization of the Hanford 300 Area Integrated Field Research Challenge site in Hanford, Washington, U.S.A., (2) resistivity characterization of a volcanic island in the southern Tyrrhenian Sea in Italy, and (3) resistivity and IP monitoring of biostimulation at a Superfund site in Brandywine, Maryland, U.S.A. Inverse analysis of each of these data sets would be limited or impossible in a standard serial computing environment, which underscores the need for parallel high-performance computing to fully utilize the potential of electrical geophysical methods in hydrogeophysical applications.
Iterative updating of model error for Bayesian inversion
NASA Astrophysics Data System (ADS)
Calvetti, Daniela; Dunlop, Matthew; Somersalo, Erkki; Stuart, Andrew
2018-02-01
In computational inverse problems, it is common that a detailed and accurate forward model is approximated by a computationally less challenging substitute. The model reduction may be necessary to meet constraints in computing time when optimization algorithms are used to find a single estimate, or to speed up Markov chain Monte Carlo (MCMC) calculations in the Bayesian framework. The use of an approximate model introduces a discrepancy, or modeling error, that may have a detrimental effect on the solution of the ill-posed inverse problem, or it may severely distort the estimate of the posterior distribution. In the Bayesian paradigm, the modeling error can be considered as a random variable, and by using an estimate of the probability distribution of the unknown, one may estimate the probability distribution of the modeling error and incorporate it into the inversion. We introduce an algorithm which iterates this idea to update the distribution of the model error, leading to a sequence of posterior distributions that are demonstrated empirically to capture the underlying truth with increasing accuracy. Since the algorithm is not based on rejections, it requires only limited full model evaluations. We show analytically that, in the linear Gaussian case, the algorithm converges geometrically fast with respect to the number of iterations when the data is finite dimensional. For more general models, we introduce particle approximations of the iteratively generated sequence of distributions; we also prove that each element of the sequence converges in the large particle limit under a simplifying assumption. We show numerically that, as in the linear case, rapid convergence occurs with respect to the number of iterations. Additionally, we show through computed examples that point estimates obtained from this iterative algorithm are superior to those obtained by neglecting the model error.
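The general idea of treating the modeling error as a random variable and re-estimating its statistics can be sketched in the linear Gaussian setting: the accurate model A is replaced by a cheaper approximation B, the discrepancy (A - B)x is modelled as Gaussian with moments taken from the current posterior, and the posterior is recomputed. This is only a minimal sketch of that idea under arbitrary illustrative matrices, not necessarily the authors' exact update.

```python
# Minimal sketch of iterative model-error updating, linear Gaussian case.
# A is the "accurate" model, B a cheap approximation; the model error
# (A - B) x is approximated as Gaussian using the current posterior.
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 6
A = rng.normal(size=(m, n))                 # accurate forward model
B = A + 0.1 * rng.normal(size=(m, n))       # approximate (reduced) model
Gamma = 0.01 * np.eye(m)                    # observation noise covariance
C0 = np.eye(n)                              # prior covariance, zero prior mean

x_true = rng.normal(size=n)
y = A @ x_true + rng.multivariate_normal(np.zeros(m), Gamma)

mu, C = np.zeros(n), C0.copy()              # current posterior approximation
for it in range(5):
    D = A - B                               # model-error operator
    mu_err = D @ mu                         # mean of the modelling error
    C_err = D @ C @ D.T                     # covariance of the modelling error
    S = Gamma + C_err                       # effective noise covariance
    C = np.linalg.inv(np.linalg.inv(C0) + B.T @ np.linalg.solve(S, B))
    mu = C @ (B.T @ np.linalg.solve(S, y - mu_err))
    print(it, np.linalg.norm(mu - x_true))  # error shrinks over the iterations
```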
NASA Technical Reports Server (NTRS)
Young, William D.
1992-01-01
The application of formal methods to the analysis of computing systems promises to provide higher and higher levels of assurance as the sophistication of our tools and techniques increases. Improvements in tools and techniques come about as we pit the current state of the art against new and challenging problems. A promising area for the application of formal methods is in real-time and distributed computing. Some of the algorithms in this area are both subtle and important. In response to this challenge and as part of an ongoing attempt to verify an implementation of the Interactive Convergence Clock Synchronization Algorithm (ICCSA), we decided to undertake a proof of the correctness of the algorithm using the Boyer-Moore theorem prover. This paper describes our approach to proving the ICCSA using the Boyer-Moore prover.
ClimateSpark: An in-memory distributed computing framework for big climate data analytics
NASA Astrophysics Data System (ADS)
Hu, Fei; Yang, Chaowei; Schnase, John L.; Duffy, Daniel Q.; Xu, Mengchao; Bowen, Michael K.; Lee, Tsengdar; Song, Weiwei
2018-06-01
The unprecedented growth of climate data creates new opportunities for climate studies, and yet big climate data pose a grand challenge for climatologists to manage and analyze them efficiently. The complexity of climate data content and analytical algorithms increases the difficulty of implementing algorithms on high performance computing systems. This paper proposes an in-memory, distributed computing framework, ClimateSpark, to facilitate complex big data analytics and time-consuming computational tasks. A chunked data structure improves parallel I/O efficiency, while a spatiotemporal index is built over the chunks to avoid unnecessary data reading and preprocessing. An integrated, multi-dimensional, array-based data model (ClimateRDD) and ETL operations are developed to address big climate data variety by integrating the processing components of the climate data lifecycle. ClimateSpark utilizes Spark SQL and Apache Zeppelin to develop a web portal to facilitate the interaction among climatologists, climate data, analytic operations and computing resources (e.g., using SQL query and Scala/Python notebooks). Experimental results show that ClimateSpark conducts different spatiotemporal data queries/analytics with high efficiency and data locality. ClimateSpark is easily adaptable to other big multi-dimensional, array-based datasets in various geoscience domains.
Visualizing Spatially Varying Distribution Data
NASA Technical Reports Server (NTRS)
Kao, David; Luo, Alison; Dungan, Jennifer L.; Pang, Alex; Biegel, Bryan A. (Technical Monitor)
2002-01-01
Box plot is a compact representation that encodes the minimum, maximum, mean, median, and quartile information of a distribution. In practice, a single box plot is drawn for each variable of interest. With the advent of more accessible computing power, we are now facing the problem of visualizing data where there is a distribution at each 2D spatial location. Simply extending the box plot technique to distributions over a 2D domain is not straightforward. One challenge is reducing the visual clutter if a box plot is drawn over each grid location in the 2D domain. This paper presents and discusses two general approaches, using parametric statistics and shape descriptors, to present 2D distribution data sets. Both approaches provide additional insights compared to the traditional box plot technique.
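The first step shared by such approaches is reducing the distribution at every grid location to a handful of summary statistics; how those statistics are then rendered (glyphs, colour channels, shape descriptors) is the visualization question discussed above. The sketch below is purely illustrative, with an arbitrary synthetic ensemble.

```python
# Illustrative sketch: reduce a distribution at every 2D grid location to its
# box-plot summary statistics (min, quartiles, median, max). Rendering these
# per-cell summaries without clutter is the visualization challenge.
import numpy as np

rng = np.random.default_rng(0)
samples = rng.gamma(shape=2.0, scale=1.0, size=(200, 64, 64))  # ensemble x grid

percentiles = [0, 25, 50, 75, 100]
summary = np.percentile(samples, percentiles, axis=0)   # shape (5, 64, 64)

minimum, q1, median, q3, maximum = summary
iqr = q3 - q1                       # interquartile range per grid cell
print(median.shape, float(iqr.mean()))
```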
The topology of the cosmic web in terms of persistent Betti numbers
NASA Astrophysics Data System (ADS)
Pranav, Pratyush; Edelsbrunner, Herbert; van de Weygaert, Rien; Vegter, Gert; Kerber, Michael; Jones, Bernard J. T.; Wintraecken, Mathijs
2017-03-01
We introduce a multiscale topological description of the Megaparsec web-like cosmic matter distribution. Betti numbers and topological persistence offer a powerful means of describing the rich connectivity structure of the cosmic web and of its multiscale arrangement of matter and galaxies. Emanating from algebraic topology and Morse theory, Betti numbers and persistence diagrams represent an extension and deepening of the cosmologically familiar topological genus measure and the related geometric Minkowski functionals. In addition to a description of the mathematical background, this study presents the computational procedure for computing Betti numbers and persistence diagrams for density field filtrations. The field may be computed starting from a discrete spatial distribution of galaxies or simulation particles. The main emphasis of this study concerns an extensive and systematic exploration of the imprint of different web-like morphologies and different levels of multiscale clustering in the corresponding computed Betti numbers and persistence diagrams. To this end, we use Voronoi clustering models as templates for a rich variety of web-like configurations and the fractal-like Soneira-Peebles models exemplify a range of multiscale configurations. We have identified the clear imprint of cluster nodes, filaments, walls, and voids in persistence diagrams, along with that of the nested hierarchy of structures in multiscale point distributions. We conclude by outlining the potential of persistent topology for understanding the connectivity structure of the cosmic web, in large simulations of cosmic structure formation and in the challenging context of the observed galaxy distribution in large galaxy surveys.
NASA Astrophysics Data System (ADS)
Zhang, Ling; Nan, Zhuotong; Liang, Xu; Xu, Yi; Hernández, Felipe; Li, Lianxia
2018-03-01
Although process-based distributed hydrological models (PDHMs) have been evolving rapidly over the last few decades, their extensive application is still challenged by computational expense. This study attempted, for the first time, to apply the numerically efficient MacCormack algorithm to overland flow routing in a representative high-spatial-resolution PDHM, i.e., the distributed hydrology-soil-vegetation model (DHSVM), in order to improve its computational efficiency. The analytical verification indicates that both the semi and full versions of the MacCormack scheme exhibit robust numerical stability and are more computationally efficient than the conventional explicit linear scheme. The full version outperforms the semi version in terms of simulation accuracy when the same time step is adopted. The semi-MacCormack scheme was implemented into DHSVM (version 3.1.2) to solve the kinematic wave equations for overland flow routing. The performance and practicality of the enhanced DHSVM-MacCormack model were assessed by performing two groups of modeling experiments in the Mercer Creek watershed, a small urban catchment near Bellevue, Washington. The experiments show that DHSVM-MacCormack can considerably improve the computational efficiency without compromising the simulation accuracy of the original DHSVM model. More specifically, with the same computational environment and model settings, the computational time required by DHSVM-MacCormack can be reduced to several dozen minutes for a simulation period of three months (in contrast with a day and a half for the original DHSVM model) without noticeable sacrifice of accuracy. The MacCormack scheme proves to be applicable to overland flow routing in DHSVM, which implies that it can be coupled into other PDHMs to either significantly improve their computational efficiency or make kinematic wave routing computationally feasible for high-resolution modeling.
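The paper's exact discretization within DHSVM is not reproduced here; the following is a generic MacCormack predictor-corrector sketch for the 1D kinematic wave equation dA/dt + dQ/dx = q with Q = alpha*A^beta, shown only to illustrate the scheme's two-step structure. All grid sizes and parameter values are arbitrary.

```python
# Generic MacCormack predictor-corrector for the 1-D kinematic wave equation
#   dA/dt + dQ/dx = q,  with Q = alpha * A**beta.
# Illustrates the two-step structure only; not the DHSVM-MacCormack code.
import numpy as np

def maccormack_step(A, dt, dx, alpha, beta, q):
    Q = alpha * A**beta
    # Predictor: forward difference in space
    A_star = A.copy()
    A_star[:-1] = A[:-1] - dt / dx * (Q[1:] - Q[:-1]) + dt * q
    Q_star = alpha * A_star**beta
    # Corrector: backward difference on the predicted state, then average
    A_new = A.copy()
    A_new[1:] = 0.5 * (A[1:] + A_star[1:]
                       - dt / dx * (Q_star[1:] - Q_star[:-1]) + dt * q)
    return A_new

# Toy run: 100 cells, small lateral inflow q, fixed upstream boundary
A = np.full(100, 0.1)
for _ in range(500):
    A = maccormack_step(A, dt=1.0, dx=10.0, alpha=1.0, beta=5.0 / 3.0, q=1e-4)
    A[0] = 0.1
print(A.max())
```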
Strategic Leadership Challenges with the Joint Information Environment
2013-03-01
Approved for Public Release. Distribution is Unlimited. In the face of growing cyber attacks against the Department of Defense (DoD) ... million computers, and 250,000 mobile devices (i.e. Blackberry). All of this IT capability represented an investment of $37 Billion in the Fiscal Year ...
From an Executive Network to Executive Control: A Computational Model of the "n"-Back Task
ERIC Educational Resources Information Center
Chatham, Christopher H.; Herd, Seth A.; Brant, Angela M.; Hazy, Thomas E.; Miyake, Akira; O'Reilly, Randy; Friedman, Naomi P.
2011-01-01
A paradigmatic test of executive control, the n-back task, is known to recruit a widely distributed parietal, frontal, and striatal "executive network," and is thought to require an equally wide array of executive functions. The mapping of functions onto substrates in such a complex task presents a significant challenge to any theoretical…
Ogawa, S.; Komini Babu, S.; Chung, H. T.; ...
2016-08-22
The nano/micro-scale geometry of polymer electrolyte fuel cell (PEFC) catalyst layers critically affects cell performance. The small length scales and complex structure of these composite layers make it challenging to analyze cell performance and physics at the particle scale by experiment. We present a computational method to simulate transport and chemical reaction phenomena at the pore/particle-scale and apply it to a PEFC cathode with platinum group metal free (PGM-free) catalyst. Here, we numerically solve the governing equations for the physics with heterogeneous oxygen diffusion coefficient and proton conductivity evaluated using the actual electrode structure and ionomer distribution obtained using nano-scale resolution X-ray computed tomography (nano-CT). Using this approach, the oxygen concentration and electrolyte potential distributions imposed by the oxygen reduction reaction are solved and the impact of the catalyst layer structure on performance is evaluated.
Computation of Steady-State Probability Distributions in Stochastic Models of Cellular Networks
Hallen, Mark; Li, Bochong; Tanouchi, Yu; Tan, Cheemeng; West, Mike; You, Lingchong
2011-01-01
Cellular processes are “noisy”. In each cell, concentrations of molecules are subject to random fluctuations due to the small numbers of these molecules and to environmental perturbations. While noise varies with time, it is often measured at steady state, for example by flow cytometry. When interrogating aspects of a cellular network by such steady-state measurements of network components, a key need is to develop efficient methods to simulate and compute these distributions. We describe innovations in stochastic modeling coupled with approaches to this computational challenge: first, an approach to modeling intrinsic noise via solution of the chemical master equation, and second, a convolution technique to account for contributions of extrinsic noise. We show how these techniques can be combined in a streamlined procedure for evaluation of different sources of variability in a biochemical network. Evaluation and illustrations are given in analysis of two well-characterized synthetic gene circuits, as well as a signaling network underlying the mammalian cell cycle entry. PMID:22022252
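The paper's specific implementation is not reproduced here; the sketch below only illustrates the two ingredients named above: a steady-state solution of the chemical master equation (for a simple birth-death gene-expression model with production rate k and degradation rate g) and a convolution with a discretized extrinsic-noise distribution. The rates, truncation, and extrinsic distribution are arbitrary stand-ins.

```python
# Schematic illustration, not the paper's code: (1) steady state of the
# chemical master equation for a birth-death process, (2) convolution with a
# discretized extrinsic-noise distribution to get the observed distribution.
import numpy as np

k, g, N = 20.0, 1.0, 100          # birth rate, death rate, truncation 0..N
M = np.zeros((N + 1, N + 1))      # CME generator: dP/dt = M P
for n in range(N + 1):
    if n > 0:
        M[n, n - 1] = k                   # birth into state n
    if n < N:
        M[n, n + 1] = g * (n + 1)         # death into state n
    M[n, n] = -(g * n + (k if n < N else 0.0))

# Stationary distribution: solve M p = 0 subject to sum(p) = 1
A = np.vstack([M, np.ones(N + 1)])
b = np.zeros(N + 2); b[-1] = 1.0
p_intrinsic, *_ = np.linalg.lstsq(A, b, rcond=None)

# Extrinsic contribution modelled as an additive, discretized distribution
shift = np.arange(-10, 11)
p_extrinsic = np.exp(-0.5 * (shift / 4.0) ** 2)
p_extrinsic /= p_extrinsic.sum()
p_total = np.convolve(p_intrinsic, p_extrinsic)   # observed distribution

print(p_intrinsic.argmax(), round(p_total.sum(), 6))  # mode near k/g = 20, sums to ~1
```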
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ogawa, S.; Komini Babu, S.; Chung, H. T.
The nano/micro-scale geometry of polymer electrolyte fuel cell (PEFC) catalyst layers critically affects cell performance. The small length scales and complex structure of these composite layers make it challenging to analyze cell performance and physics at the particle scale by experiment. We present a computational method to simulate transport and chemical reaction phenomena at the pore/particle-scale and apply it to a PEFC cathode with platinum group metal free (PGM-free) catalyst. Here, we numerically solve the governing equations for the physics with heterogeneous oxygen diffusion coefficient and proton conductivity evaluated using the actual electrode structure and ionomer distribution obtained using nano-scale resolution X-ray computed tomography (nano-CT). Using this approach, the oxygen concentration and electrolyte potential distributions imposed by the oxygen reduction reaction are solved and the impact of the catalyst layer structure on performance is evaluated.
NASA Astrophysics Data System (ADS)
Wang, Xiaowo; Xu, Zhijie; Soulami, Ayoub; Hu, Xiaohua; Lavender, Curt; Joshi, Vineet
2017-12-01
Low-enriched uranium alloyed with 10 wt.% molybdenum (U-10Mo) has been identified as a promising alternative to high-enriched uranium. Manufacturing U-10Mo alloy involves multiple complex thermomechanical processes that pose challenges for computational modeling. This paper describes the application of integrated computational materials engineering (ICME) concepts to integrate three individual modeling components, viz. homogenization, microstructure-based finite element method for hot rolling, and carbide particle distribution, to simulate the early-stage processes of U-10Mo alloy manufacture. The resulting integrated model enables information to be passed between different model components and leads to improved understanding of the evolution of the microstructure. This ICME approach is then used to predict the variation in the thickness of the Zircaloy-2 barrier as a function of the degree of homogenization and to analyze the carbide distribution, which can affect the recrystallization, hardness, and fracture properties of U-10Mo in subsequent processes.
Cloud Compute for Global Climate Station Summaries
NASA Astrophysics Data System (ADS)
Baldwin, R.; May, B.; Cogbill, P.
2017-12-01
Global Climate Station Summaries are simple indicators of observational normals which include climatic data summarizations and frequency distributions. These typically are statistical analyses of station data over 5-, 10-, 20-, 30-year or longer time periods. The summaries are computed from the global surface hourly dataset. This dataset, totaling over 500 gigabytes, comprises 40 different types of weather observations from 20,000 stations worldwide. NCEI and the U.S. Navy developed these value-added products in the form of hourly summaries from many of these observations. Enabling this compute functionality in the cloud is the focus of the project. An overview of the approach and the challenges associated with transitioning the application to the cloud will be presented.
A world-wide databridge supported by a commercial cloud provider
NASA Astrophysics Data System (ADS)
Tat Cheung, Kwong; Field, Laurence; Furano, Fabrizio
2017-10-01
Volunteer computing has the potential to provide significant additional computing capacity for the LHC experiments. One of the challenges with exploiting volunteer computing is to support a global community of volunteers that provides heterogeneous resources. However, high energy physics applications require more data input and output than the CPU-intensive applications typically run by other volunteer computing projects. While the so-called databridge has already been successfully proposed as a method to span the untrusted and trusted domains of volunteer computing and Grid computing respectively, globally transferring data between potentially poor-performing residential networks and CERN could be unreliable, leading to wasted resource usage. The expectation is that by placing a storage endpoint that is part of a wider, flexible geographical databridge deployment closer to the volunteers, the transfer success rate and the overall performance can be improved. This contribution investigates the provision of a globally distributed databridge implemented upon a commercial cloud provider.
Center for Technology for Advanced Scientific Component Software (TASCS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kostadin, Damevski
A resounding success of the Scientific Discovery through Advanced Computing (SciDAC) program is that high-performance computational science is now universally recognized as a critical aspect of scientific discovery [71], complementing both theoretical and experimental research. As scientific communities prepare to exploit unprecedented computing capabilities of emerging leadership-class machines for multi-model simulations at the extreme scale [72], it is more important than ever to address the technical and social challenges of geographically distributed teams that combine expertise in domain science, applied mathematics, and computer science to build robust and flexible codes that can incorporate changes over time. The Center for Technology for Advanced Scientific Component Software (TASCS) tackles these issues by exploiting component-based software development to facilitate collaborative high-performance scientific computing.
a Hadoop-Based Distributed Framework for Efficient Managing and Processing Big Remote Sensing Images
NASA Astrophysics Data System (ADS)
Wang, C.; Hu, F.; Hu, X.; Zhao, S.; Wen, W.; Yang, C.
2015-07-01
Various sensors from airborne and satellite platforms are producing large volumes of remote sensing images for mapping, environmental monitoring, disaster management, military intelligence, and other applications. However, it is challenging to efficiently store, query and process such big data due to its data- and computing-intensive nature. In this paper, a Hadoop-based framework is proposed to manage and process big remote sensing data in a distributed and parallel manner. In particular, remote sensing data can be directly fetched from other data platforms into the Hadoop Distributed File System (HDFS). The Orfeo toolbox, a ready-to-use tool for large image processing, is integrated into MapReduce to provide a rich set of image processing operations. With the integration of HDFS, the Orfeo toolbox and MapReduce, these remote sensing images can be processed directly, in parallel, in a scalable computing environment. The experiment results show that the proposed framework can efficiently manage and process such big remote sensing data.
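The MapReduce integration can be pictured with a Hadoop Streaming style mapper: each input line is the HDFS path of one image tile, and the mapper shells out to an image processing command and emits a tab-separated key/value pair. The Orfeo Toolbox invocation below (otbcli_BandMath with an NDVI expression) is used only as an illustrative stand-in; the framework's actual integration code is not shown in the abstract.

```python
#!/usr/bin/env python
# Generic Hadoop Streaming mapper sketch: one tile path per input line, an
# external image processing command per tile, (tile, status) on stdout.
# The otbcli_BandMath call is an illustrative assumption, not the framework's code.
import subprocess
import sys

def process_tile(path):
    out = path.replace(".tif", "_ndvi.tif")
    cmd = ["otbcli_BandMath", "-il", path, "-out", out,
           "-exp", "(im1b4-im1b3)/(im1b4+im1b3)"]
    return subprocess.call(cmd)

if __name__ == "__main__":
    for line in sys.stdin:
        tile = line.strip()
        if not tile:
            continue
        status = process_tile(tile)
        # Hadoop Streaming expects tab-separated key/value pairs on stdout
        print(f"{tile}\t{'ok' if status == 0 else 'failed'}")
```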
Change Detection of Mobile LIDAR Data Using Cloud Computing
NASA Astrophysics Data System (ADS)
Liu, Kun; Boehm, Jan; Alis, Christian
2016-06-01
Change detection has long been a challenging problem although a lot of research has been conducted in different fields such as remote sensing and photogrammetry, computer vision, and robotics. In this paper, we blend a voxel grid and Apache Spark together to propose an efficient method to address the problem in the context of big data. A voxel grid is a regular geometric representation consisting of voxels of the same size, which suits parallel computation well. Apache Spark is a popular distributed parallel computing platform that provides fault tolerance and in-memory caching. These features can significantly enhance the performance of Apache Spark and result in an efficient and robust implementation. In our experiments, both synthetic and real point cloud data are employed to demonstrate the quality of our method.
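A minimal PySpark sketch of the voxel-grid idea follows: points from two epochs are mapped to voxel indices, occupancy is counted per voxel, and voxels occupied in only one epoch are flagged as changes (a count-difference threshold could be added the same way). This is a generic illustration with toy data, not the authors' implementation.

```python
# Minimal PySpark sketch: voxelize two epochs of points, count occupancy per
# voxel, and flag voxels occupied in only one epoch as changes.
from pyspark import SparkContext

VOXEL = 0.5   # voxel edge length in the point cloud's units

def to_voxel(p):
    x, y, z = p
    return (int(x // VOXEL), int(y // VOXEL), int(z // VOXEL))

def occupancy(sc, points):
    return sc.parallelize(points).map(lambda p: (to_voxel(p), 1)) \
             .reduceByKey(lambda a, b: a + b)

if __name__ == "__main__":
    sc = SparkContext(appName="voxel-change-detection")
    epoch1 = [(0.1, 0.2, 0.3), (0.2, 0.1, 0.4), (3.0, 3.0, 0.0)]   # toy data
    epoch2 = [(0.1, 0.2, 0.3), (5.0, 5.0, 0.0)]
    changed = occupancy(sc, epoch1).fullOuterJoin(occupancy(sc, epoch2)) \
        .filter(lambda kv: kv[1][0] is None or kv[1][1] is None) \
        .keys().collect()
    print("changed voxels:", changed)
    sc.stop()
```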
Electro-textile garments for power and data distribution
NASA Astrophysics Data System (ADS)
Slade, Jeremiah R.; Winterhalter, Carole
2015-05-01
U.S. troops are increasingly being equipped with various electronic assets including flexible displays, computers, and communications systems. While these systems can significantly enhance operational capabilities, forming reliable connections between them poses a number of challenges in terms of comfort, weight, ergonomics, and operational security. IST has addressed these challenges by developing the technologies needed to integrate large-scale cross-seam electrical functionality into virtually any textile product, including the various garments and vests that comprise the warfighter's ensemble. Using this technology IST is able to develop textile products that do not simply support or accommodate a network but are the network.
cOSPREY: A Cloud-Based Distributed Algorithm for Large-Scale Computational Protein Design
Pan, Yuchao; Dong, Yuxi; Zhou, Jingtian; Hallen, Mark; Donald, Bruce R.; Xu, Wei
2016-01-01
Finding the global minimum energy conformation (GMEC) of a huge combinatorial search space is the key challenge in computational protein design (CPD) problems. Traditional algorithms lack a scalable and efficient distributed design scheme, preventing researchers from taking full advantage of current cloud infrastructures. We design cloud OSPREY (cOSPREY), an extension to a widely used protein design software OSPREY, to allow the original design framework to scale to the commercial cloud infrastructures. We propose several novel designs to integrate both algorithm and system optimizations, such as GMEC-specific pruning, state search partitioning, asynchronous algorithm state sharing, and fault tolerance. We evaluate cOSPREY on three different cloud platforms using different technologies and show that it can solve a number of large-scale protein design problems that have not been possible with previous approaches. PMID:27154509
O'Donnell, Michael
2015-01-01
State-and-transition simulation modeling relies on knowledge of vegetation composition and structure (states) that describe community conditions, mechanistic feedbacks such as fire that can affect vegetation establishment, and ecological processes that drive community conditions as well as the transitions between these states. However, as the need for modeling larger and more complex landscapes increases, a more advanced awareness of computing resources becomes essential. The objectives of this study include identifying challenges of executing state-and-transition simulation models, identifying common bottlenecks of computing resources, developing a workflow and software that enable parallel processing of Monte Carlo simulations, and identifying the advantages and disadvantages of different computing resources. To address these objectives, this study used the ApexRMS® SyncroSim software and embarrassingly parallel tasks of Monte Carlo simulations on a single multicore computer and on distributed computing systems. The results demonstrated that state-and-transition simulation models scale best in distributed computing environments, such as high-throughput and high-performance computing, because these environments disseminate the workloads across many compute nodes, thereby supporting analysis of larger landscapes, higher spatial resolution vegetation products, and more complex models. Using a case study and five different computing environments, the top result (high-throughput computing versus serial computations) indicated an approximately 96.6% decrease in computing time. With a single multicore compute node (bottom result), the computing time decreased by 81.8% relative to serial computations. These results provide insight into the tradeoffs of using different computing resources when research necessitates advanced integration of ecoinformatics incorporating large and complicated data inputs and models.
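The "embarrassingly parallel" pattern referred to above can be illustrated with a short sketch: independent Monte Carlo replicates are farmed out to worker processes, each identified only by its seed. The `run_replicate` function is a placeholder; the study itself used SyncroSim with high-throughput and high-performance schedulers rather than this code.

```python
# Generic illustration of farming independent Monte Carlo replicates out to
# worker processes. run_replicate is a stand-in for one simulation run.
from concurrent.futures import ProcessPoolExecutor
import random

def run_replicate(seed):
    rng = random.Random(seed)
    # stand-in for one state-and-transition simulation replicate
    return sum(rng.random() for _ in range(100000))

if __name__ == "__main__":
    seeds = range(64)                        # one seed per Monte Carlo replicate
    with ProcessPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(run_replicate, seeds))
    print(len(results), sum(results) / len(results))
```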
Efficient Memory Access with NumPy Global Arrays using Local Memory Access
DOE Office of Scientific and Technical Information (OSTI.GOV)
Daily, Jeffrey A.; Berghofer, Dan C.
This paper discusses work on Global Arrays of data on distributed multi-computer systems and on improving their performance. The work was done at Pacific Northwest National Laboratory in the Science Undergraduate Laboratory Internship program in the summer of 2013 for the Data Intensive Computing Group in the Fundamental and Computational Sciences Directorate, using the Global Arrays Toolkit developed by this group. The toolkit is an interface that lets programmers more easily create arrays of data spanning networks of computers. This is useful because scientific computation often involves amounts of data so large that individual computers cannot hold all of it; the data are held in array form and are best processed on supercomputers, which often consist of a network of individual computers working in parallel. A major challenge in this kind of programming is that operations on arrays spread across multiple computers are very complex, so an interface is needed that makes these arrays appear to reside on a single computer, which is what Global Arrays provides. The work described here uses more efficient operations that require less copying of data, which saves time because copying data across many computers is expensive. The approach is as follows: when the operands of a binary operation reside on the same computer, they are not copied when accessed; when they reside on separate computers, only one operand set is copied. This saves time through reduced copying, at the cost of more data access operations.
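The copy-avoidance idea can be sketched generically: check whether the requested patch lies entirely inside the block owned by the local process and, if so, operate on that memory in place; otherwise fall back to fetching a copy. The classes and functions below are schematic, not the Global Arrays Toolkit API.

```python
# Schematic of the local-access optimization, not the Global Arrays API:
# a patch that is fully owned locally is returned as a view (no copy);
# anything else falls back to a remote fetch that returns a copy.
import numpy as np

class OwnedBlock:
    """One process's locally owned slice of a conceptual global 1-D array."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.data = np.zeros(hi - lo)

    def access(self, lo, hi):
        if self.lo <= lo and hi <= self.hi:
            return self.data[lo - self.lo:hi - self.lo]   # view, no copy
        return None                                        # not fully local

def get_patch(block, lo, hi, fetch_remote):
    local = block.access(lo, hi)
    return local if local is not None else fetch_remote(lo, hi)

# Toy usage: this "process" owns global indices [100, 200)
block = OwnedBlock(100, 200)
patch = get_patch(block, 120, 150, fetch_remote=lambda lo, hi: np.zeros(hi - lo))
patch += 1.0                      # in-place update hits the owned memory
print(block.data[20:25])          # reflects the update because no copy was made
```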
Spectral method for a kinetic swarming model
Gamba, Irene M.; Haack, Jeffrey R.; Motsch, Sebastien
2015-04-28
Here we present the first numerical method for a kinetic description of the Vicsek swarming model. The kinetic model poses a unique challenge, as there is a distribution dependent collision invariant to satisfy when computing the interaction term. We use a spectral representation linked with a discrete constrained optimization to compute these interactions. To test the numerical scheme we investigate the kinetic model at different scales and compare the solution with the microscopic and macroscopic descriptions of the Vicsek model. Lastly, we observe that the kinetic model captures key features such as vortex formation and traveling waves.
Dashboard Task Monitor for Managing ATLAS User Analysis on the Grid
NASA Astrophysics Data System (ADS)
Sargsyan, L.; Andreeva, J.; Jha, M.; Karavakis, E.; Kokoszkiewicz, L.; Saiz, P.; Schovancova, J.; Tuckett, D.; Atlas Collaboration
2014-06-01
The organization of the distributed user analysis on the Worldwide LHC Computing Grid (WLCG) infrastructure is one of the most challenging tasks among the computing activities at the Large Hadron Collider. The Experiment Dashboard offers a solution that not only monitors but also manages (kill, resubmit) user tasks and jobs via a web interface. The ATLAS Dashboard Task Monitor provides analysis users with a tool that is independent of the operating system and Grid environment. This contribution describes the functionality of the application and its implementation details, in particular authentication, authorization and audit of the management operations.
Insect vision as model for machine vision
NASA Astrophysics Data System (ADS)
Osorio, D.; Sobey, Peter J.
1992-11-01
The neural architecture, neurophysiology and behavioral abilities of insect vision are described, and compared with that of mammals. Insects have a hardwired neural architecture of highly differentiated neurons, quite different from the cerebral cortex, yet their behavioral abilities are in important respects similar to those of mammals. These observations challenge the view that the key to the power of biological neural computation is distributed processing by a plastic, highly interconnected, network of individually undifferentiated and unreliable neurons that has been a dominant picture of biological computation since Pitts and McCulloch's seminal work in the 1940's.
Separating Added Value from Hype: Some Experiences and Prognostications
NASA Astrophysics Data System (ADS)
Reed, Dan
2004-03-01
These are exciting times for the interplay of science and computing technology. As new data archives, instruments and computing facilities are connected nationally and internationally, a new model of distributed scientific collaboration is emerging. However, any new technology brings both opportunities and challenges -- Grids are no exception. In this talk, we will discuss some of the experiences deploying Grid software in production environments, illustrated with experiences from the NSF PACI Alliance, the NSF Extensible Terascale Facility (ETF) and other Grid projects. From these experiences, we derive some guidelines for deployment and some suggestions for community engagement, software development and infrastructure
An Object Oriented Extensible Architecture for Affordable Aerospace Propulsion Systems
NASA Technical Reports Server (NTRS)
Follen, Gregory J.; Lytle, John K. (Technical Monitor)
2002-01-01
Driven by a need to explore and develop propulsion systems that exceeded current computing capabilities, NASA Glenn embarked on a novel strategy leading to the development of an architecture that enables propulsion simulations never thought possible before. Full-engine, three-dimensional computational fluid dynamics (CFD) propulsion system simulations were deemed impossible due to the impracticality of the hardware and software computing systems required. However, with a software paradigm shift and an embracing of parallel and distributed processing, an architecture was designed to meet the needs of future propulsion system modeling. The author suggests that the architecture designed at the NASA Glenn Research Center for propulsion system modeling has potential for impacting the direction of development of affordable weapons systems currently under consideration by the Applied Vehicle Technology Panel (AVT). This paper discusses the salient features of the NPSS architecture, including its interface layer, object layer, implementation for accessing legacy codes, numerical zooming infrastructure and its computing layer. The computing layer focuses on the use and deployment of these propulsion simulations on parallel and distributed computing platforms, which has been the focus of NASA Ames. Additional features of the object-oriented architecture that support MultiDisciplinary (MD) coupling, computer-aided design (CAD) access and MD coupling objects will be discussed. Included will be a discussion of the successes, challenges and benefits of implementing this architecture.
An efficient distribution method for nonlinear transport problems in stochastic porous media
NASA Astrophysics Data System (ADS)
Ibrahima, F.; Tchelepi, H.; Meyer, D. W.
2015-12-01
Because geophysical data are inexorably sparse and incomplete, stochastic treatments of simulated responses are convenient to explore possible scenarios and assess risks in subsurface problems. In particular, understanding how uncertainties propagate in porous media with nonlinear two-phase flow is essential, yet challenging, in reservoir simulation and hydrology. We give a computationally efficient and numerically accurate method to estimate the one-point probability density function (PDF) and cumulative distribution function (CDF) of the water saturation for the stochastic Buckley-Leverett problem when the probability distributions of the permeability and porosity fields are available. The method draws inspiration from the streamline approach and expresses the distributions of interest essentially in terms of an analytically derived mapping and the distribution of the time of flight. In a large class of applications the latter can be estimated at low computational cost (even via conventional Monte Carlo). Once the water saturation distribution is determined, any one-point statistics thereof can be obtained, especially its average and standard deviation. Moreover, crucial information that is rarely available in other approaches, such as the probability of rare events and saturation quantiles (e.g. P10, P50 and P90), can be derived from the method. We provide various examples and comparisons with Monte Carlo simulations to illustrate the performance of the method.
Archive Management of NASA Earth Observation Data to Support Cloud Analysis
NASA Technical Reports Server (NTRS)
Lynnes, Christopher; Baynes, Kathleen; McInerney, Mark A.
2017-01-01
NASA collects, processes and distributes petabytes of Earth Observation (EO) data from satellites, aircraft, in situ instruments and model output, with an order of magnitude increase expected by 2024. Cloud-based web object storage (WOS) of these data can simplify handling such an increase. More importantly, it can also facilitate user analysis of those volumes by making the data available to the massively parallel computing power in the cloud. However, storing EO data in cloud WOS has a ripple effect throughout the NASA archive system, with unexpected challenges and opportunities. One challenge is modifying data servicing software (such as Web Coverage Service servers) to access and subset data that are no longer on a directly accessible file system but rather in cloud WOS. Opportunities include refactoring the archive software to a cloud-native architecture, virtualizing data products by computing on demand, and reorganizing data to be more analysis-friendly.
Russell A. Parsons; William Mell; Peter McCauley
2010-01-01
Crown fire poses challenges to fire managers and can endanger fire fighters. Understanding of how fire interacts with tree crowns is essential to informed decisions about crown fire. Current operational crown fire predictions in the United States assume homogeneous crown fuels. While a new class of research fire models, which model fire behavior with computational...
Physics-based distributed snow models in the operational arena: Current and future challenges
NASA Astrophysics Data System (ADS)
Winstral, A. H.; Jonas, T.; Schirmer, M.; Helbig, N.
2017-12-01
The demand for modeling tools robust to climate change and weather extremes along with coincident increases in computational capabilities have led to an increase in the use of physics-based snow models in operational applications. Current operational applications include the WSL-SLF's across Switzerland, ASO's in California, and USDA-ARS's in Idaho. While the physics-based approaches offer many advantages there remain limitations and modeling challenges. The most evident limitation remains computation times that often limit forecasters to a single, deterministic model run. Other limitations however remain less conspicuous amidst the assumptions that these models require little to no calibration based on their foundation on physical principles. Yet all energy balance snow models seemingly contain parameterizations or simplifications of processes where validation data are scarce or present understanding is limited. At the research-basin scale where many of these models were developed these modeling elements may prove adequate. However when applied over large areas, spatially invariable parameterizations of snow albedo, roughness lengths and atmospheric exchange coefficients - all vital to determining the snowcover energy balance - become problematic. Moreover as we apply models over larger grid cells, the representation of sub-grid variability such as the snow-covered fraction adds to the challenges. Here, we will demonstrate some of the major sensitivities of distributed energy balance snow models to particular model constructs, the need for advanced and spatially flexible methods and parameterizations, and prompt the community for open dialogue and future collaborations to further modeling capabilities.
A hydrological emulator for global applications - HE v1.0.0
NASA Astrophysics Data System (ADS)
Liu, Yaling; Hejazi, Mohamad; Li, Hongyi; Zhang, Xuesong; Leng, Guoyong
2018-03-01
While global hydrological models (GHMs) are very useful in exploring water resources and interactions between the Earth and human systems, their use often requires numerous model inputs, complex model calibration, and high computation costs. To overcome these challenges, we construct an efficient open-source and ready-to-use hydrological emulator (HE) that can mimic complex GHMs at a range of spatial scales (e.g., basin, region, globe). More specifically, we construct both a lumped and a distributed scheme of the HE based on the monthly abcd model to explore the tradeoff between computational cost and model fidelity. Model predictability and computational efficiency are evaluated in simulating global runoff from 1971 to 2010 with both the lumped and distributed schemes. The results are compared against the runoff product from the widely used Variable Infiltration Capacity (VIC) model. Our evaluation indicates that the lumped and distributed schemes present comparable results regarding annual total quantity, spatial pattern, and temporal variation of the major water fluxes (e.g., total runoff, evapotranspiration) across the global 235 basins (e.g., correlation coefficient r between the annual total runoff from either of these two schemes and the VIC is > 0.96), except for several cold (e.g., Arctic, interior Tibet), dry (e.g., North Africa) and mountainous (e.g., Argentina) regions. Compared against the monthly total runoff product from the VIC (aggregated from daily runoff), the global mean Kling-Gupta efficiencies are 0.75 and 0.79 for the lumped and distributed schemes, respectively, with the distributed scheme better capturing spatial heterogeneity. Notably, the computation efficiency of the lumped scheme is 2 orders of magnitude higher than the distributed one and 7 orders more efficient than the VIC model. A case study of uncertainty analysis for the world's 16 basins with top annual streamflow is conducted using 100 000 model simulations, and it demonstrates the lumped scheme's extraordinary advantage in computational efficiency. Our results suggest that the revised lumped abcd model can serve as an efficient and reasonable HE for complex GHMs and is suitable for broad practical use, and the distributed scheme is also an efficient alternative if spatial heterogeneity is of more interest.
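As context for the lumped scheme, the monthly abcd water balance can be sketched as it is commonly formulated (parameters a, b, c, d; soil moisture S and groundwater storage G). The emulator's revised formulation, calibration, and distributed scheme are not reproduced here, and the parameter values and synthetic forcing below are arbitrary.

```python
# Minimal sketch of the lumped monthly abcd water balance as commonly
# formulated; the HE's revised implementation and parameters may differ.
# Inputs: monthly precipitation P and potential ET (same units);
# parameters: a (0-1), b, c (0-1), d (0-1).
import numpy as np

def abcd(P, PET, a=0.98, b=250.0, c=0.5, d=0.1, S0=100.0, G0=10.0):
    S, G = S0, G0                        # soil moisture and groundwater storage
    Q = np.zeros_like(P, dtype=float)
    for t in range(len(P)):
        W = P[t] + S                                  # available water
        wb = (W + b) / (2.0 * a)
        Y = wb - np.sqrt(wb**2 - W * b / a)           # evapotranspiration opportunity
        S = Y * np.exp(-PET[t] / b)                   # end-of-month soil moisture
        recharge = c * (W - Y)                        # to groundwater
        direct = (1.0 - c) * (W - Y)                  # direct runoff
        G = (G + recharge) / (1.0 + d)
        Q[t] = direct + d * G                         # total monthly runoff
    return Q

# Toy usage with synthetic forcing (mm/month)
rng = np.random.default_rng(0)
P = rng.gamma(2.0, 40.0, size=120)
PET = 60.0 + 30.0 * np.sin(np.arange(120) * 2 * np.pi / 12)
print(abcd(P, PET)[:12].round(1))
```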
Toward Interactive Scenario Analysis and Exploration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gayle, Thomas R.; Summers, Kenneth Lee; Jungels, John
2015-01-01
As Modeling and Simulation (M&S) tools have matured, their applicability and importance have increased across many national security challenges. In particular, they provide a way to test how something may behave without the need to do real-world testing. However, current and future changes across several factors including capabilities, policy, and funding are driving a need for rapid response or evaluation in ways that many M&S tools cannot address. Issues around large data, computational requirements, delivery mechanisms, and analyst involvement already exist and pose significant challenges. Furthermore, rising expectations, rising input complexity, and increasing depth of analysis will only increase the difficulty of these challenges. In this study we examine whether innovations in M&S software coupled with advances in "cloud" computing and "big-data" methodologies can overcome many of these challenges. In particular, we propose a simple, horizontally-scalable distributed computing environment that could provide the foundation (i.e. "cloud") for next-generation M&S-based applications based on the notion of "parallel multi-simulation". In our context, the goal of parallel multi-simulation is to consider as many simultaneous paths of execution as possible. Therefore, with sufficient resources, the complexity is dominated by the cost of single scenario runs as opposed to the number of runs required. We show the feasibility of this architecture through a stable prototype implementation coupled with the Umbra Simulation Framework [6]. Finally, we highlight the utility through multiple novel analysis tools and by showing the performance improvement compared to existing tools.
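A minimal sketch of the parallel multi-simulation idea, assuming independent scenario runs: many execution paths are evaluated concurrently so that, with enough workers, elapsed time is governed by the cost of a single run rather than the number of runs. The function names are hypothetical stand-ins, not the Umbra Simulation Framework API.

```python
from concurrent.futures import ProcessPoolExecutor

def run_scenario(scenario_id, params):
    """Hypothetical stand-in for one simulation run of a single execution path."""
    # ... execute the scenario and return summary metrics ...
    return {"scenario": scenario_id, "score": sum(params) % 7}

def parallel_multi_simulation(scenarios, workers=8):
    """Evaluate many execution paths at once across a pool of workers."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_scenario, sid, p) for sid, p in scenarios.items()]
        return [f.result() for f in futures]

if __name__ == "__main__":
    paths = {i: [i, i + 1, i + 2] for i in range(100)}   # 100 candidate execution paths
    results = parallel_multi_simulation(paths)
```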
INDIGO-DataCloud solutions for Earth Sciences
NASA Astrophysics Data System (ADS)
Aguilar Gómez, Fernando; de Lucas, Jesús Marco; Fiore, Sandro; Monna, Stephen; Chen, Yin
2017-04-01
INDIGO-DataCloud (https://www.indigo-datacloud.eu/) is a European Commission funded project aiming to develop a data and computing platform targeting scientific communities, deployable on multiple hardware and provisioned over hybrid (private or public) e-infrastructures. The development of INDIGO solutions covers the different layers in cloud computing (IaaS, PaaS, SaaS), and provides tools to exploit resources like HPC or GPGPUs. INDIGO is oriented to support European scientific research communities, which are well represented in the project. Twelve different Case Studies have been analyzed in detail from different fields: Biological & Medical sciences, Social sciences & Humanities, Environmental and Earth sciences and Physics & Astrophysics. INDIGO-DataCloud provides solutions to emerging challenges in Earth Science like: -Enabling an easy deployment of community services at different cloud sites. Many Earth Science research infrastructures involve distributed observation stations across countries, and also have distributed data centers to support the corresponding data acquisition and curation. There is a need to easily deploy new data center services as the research infrastructure continues to expand. As an example: LifeWatch (ESFRI, Ecosystems and Biodiversity) uses INDIGO solutions to manage the deployment of services to perform complex hydrodynamics and water quality modelling over a Cloud Computing environment, predicting algae blooms, using the Docker technology: TOSCA requirement description, Docker repository, Orchestrator for deployment, AAI (AuthN, AuthZ) and OneData (Distributed Storage System). -Supporting Big Data Analysis. Nowadays, many Earth Science research communities produce large amounts of data and are challenged by the difficulties of processing and analysing it. A climate model intercomparison data analysis case study for the European Network for Earth System Modelling (ENES) community has been set up, based on the Ophidia big data analysis framework and the Kepler workflow management system. Such services normally involve a large and distributed set of data and computing resources. In this regard, this case study exploits the INDIGO PaaS for a flexible and dynamic allocation of the resources at the infrastructural level. -Providing Distributed Data Storage Solutions. In order to allow scientific communities to perform heavy computation on huge datasets, INDIGO provides global data access solutions allowing researchers to access data in a distributed environment, regardless of its location, and also to publish and share their research results with public or closed communities. INDIGO solutions that support access to distributed data storage (OneData) are being tested on EMSO infrastructure (Ocean Sciences and Geohazards) data. Another aspect of interest for the EMSO community is efficient data processing by exploiting INDIGO services like the PaaS Orchestrator. Further, for HPC exploitation, a new solution named udocker has been implemented, enabling users to execute Docker containers on supercomputers without requiring administration privileges. This presentation will overview INDIGO solutions that are interesting and useful for Earth science communities and will show how they can be applied to other Case Studies.
Multicore Challenges and Benefits for High Performance Scientific Computing
Nielsen, Ida M. B.; Janssen, Curtis L.
2008-01-01
Until recently, performance gains in processors were achieved largely by improvements in clock speeds and instruction level parallelism. Thus, applications could obtain performance increases with relatively minor changes by upgrading to the latest generation of computing hardware. Currently, however, processor performance improvements are realized by using multicore technology and hardware support for multiple threads within each core, and taking full advantage of this technology to improve the performance of applications requires exposure of extreme levels of software parallelism. We will here discuss the architecture of parallel computers constructed from many multicore chips as well as techniques for managing the complexity of programming such computers, including the hybrid message-passing/multi-threading programming model. We will illustrate these ideas with a hybrid distributed memory matrix multiply and a quantum chemistry algorithm for energy computation using Møller–Plesset perturbation theory.
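A minimal sketch of the hybrid message-passing/multi-threading model applied to a distributed-memory matrix multiply, assuming mpi4py for inter-process communication and a multithreaded BLAS underneath NumPy for on-node parallelism; it illustrates the programming model only and is not the authors' implementation.

```python
# Hybrid sketch: MPI distributes row blocks of A across ranks, while each rank's
# local multiply runs on a multithreaded BLAS through NumPy.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n = 2048                       # global problem size (assumed divisible by size)
rows = n // size
A_block = np.random.rand(rows, n)          # each rank owns a row block of A

if rank == 0:
    B = np.random.rand(n, n)
else:
    B = np.empty((n, n))
comm.Bcast(B, root=0)                      # replicate B on every rank

C_block = A_block @ B                      # local, multithreaded GEMM

C = np.empty((n, n)) if rank == 0 else None
comm.Gather(C_block, C, root=0)            # assemble the full product on rank 0
```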
HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation
Holzman, Burt; Bauerdick, Lothar A. T.; Bockelman, Brian; ...
2017-09-29
Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly deployed for a variety of computing tasks. There is a growing interest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized both local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. Additionally, we will discuss the economics and operational efficiencies when executing workflows both in the cloud and on dedicated resources.
HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Holzman, Burt; Bauerdick, Lothar A. T.; Bockelman, Brian
Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly deployed for a variety of computing tasks. There is a growing interest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized both local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. Additionally, we will discuss the economics and operational efficiencies when executing workflows both in the cloud and on dedicated resources.
NASA Astrophysics Data System (ADS)
Davis, A. D.; Heimbach, P.; Marzouk, Y.
2017-12-01
We develop a Bayesian inverse modeling framework for predicting future ice sheet volume with associated formal uncertainty estimates. Marine ice sheets are drained by fast-flowing ice streams, which we simulate using a flowline model. Flowline models depend on geometric parameters (e.g., basal topography), parameterized physical processes (e.g., calving laws and basal sliding), and climate parameters (e.g., surface mass balance), most of which are unknown or uncertain. Given observations of ice surface velocity and thickness, we define a Bayesian posterior distribution over static parameters, such as basal topography. We also define a parameterized distribution over variable parameters, such as future surface mass balance, which we assume are not informed by the data. Hyperparameters are used to represent climate change scenarios, and sampling their distributions mimics internal variation. For example, a warming climate corresponds to increasing mean surface mass balance but an individual sample may have periods of increasing or decreasing surface mass balance. We characterize the predictive distribution of ice volume by evaluating the flowline model given samples from the posterior distribution and the distribution over variable parameters. Finally, we determine the effect of climate change on future ice sheet volume by investigating how changing the hyperparameters affects the predictive distribution. We use state-of-the-art Bayesian computation to address computational feasibility. Characterizing the posterior distribution (using Markov chain Monte Carlo), sampling the full range of variable parameters and evaluating the predictive model is prohibitively expensive. Furthermore, the required resolution of the inferred basal topography may be very high, which is often challenging for sampling methods. Instead, we leverage regularity in the predictive distribution to build a computationally cheaper surrogate over the low dimensional quantity of interest (future ice sheet volume). Continual surrogate refinement guarantees asymptotic sampling from the predictive distribution. Directly characterizing the predictive distribution in this way allows us to assess the ice sheet's sensitivity to climate variability and change.
Li, Zhenlong; Yang, Chaowei; Jin, Baoxuan; Yu, Manzhu; Liu, Kai; Sun, Min; Zhan, Matthew
2015-01-01
Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data is essential for geoscience studies. However, the tasks are challenging for geoscientists because processing the massive amount of data is both computing and data intensive, in that data analytics requires complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics. In this framework, techniques leveraging cloud computing, MapReduce, and Service Oriented Architecture (SOA) are proposed. Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers. A MapReduce-based algorithm framework is developed to support parallel processing of geoscience data. And a service-oriented workflow architecture is built for supporting on-demand complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that this innovative framework significantly improves the efficiency of big geoscience data analytics by reducing the data processing time as well as simplifying data analytical procedures for geoscientists. PMID:25742012
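A minimal sketch of the MapReduce pattern applied to gridded geoscience data, here computing a per-cell temporal mean; the plain Python functions stand in for a Hadoop Mapper and Reducer and do not reflect the authors' HBase or SOA components.

```python
from collections import defaultdict

def map_phase(records):
    """records: iterable of (timestamp, grid_cell_id, value) tuples."""
    for _, cell, value in records:
        yield cell, value                      # key intermediate data on the spatial cell

def reduce_phase(pairs):
    """Aggregate all values sharing a cell key into a temporal mean."""
    sums, counts = defaultdict(float), defaultdict(int)
    for cell, value in pairs:
        sums[cell] += value
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

# In Hadoop these two functions would run as the Mapper and Reducer of one job,
# with HBase or HDFS supplying the input splits and storing the result table.
records = [(0, "c1", 2.0), (1, "c1", 4.0), (0, "c2", 1.0)]
means = reduce_phase(map_phase(records))       # {"c1": 3.0, "c2": 1.0}
```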
Parallel Processing of Images in Mobile Devices using BOINC
NASA Astrophysics Data System (ADS)
Curiel, Mariela; Calle, David F.; Santamaría, Alfredo S.; Suarez, David F.; Flórez, Leonardo
2018-04-01
Medical image processing helps health professionals make decisions for the diagnosis and treatment of patients. Since some algorithms for processing images require substantial amounts of resources, one could take advantage of distributed or parallel computing. A mobile grid can be an adequate computing infrastructure for this problem. A mobile grid is a grid that includes mobile devices as resource providers. In a previous step of this research, we selected BOINC as the infrastructure to build our mobile grid. However, parallel processing of images on mobile devices poses at least two important challenges: the execution of standard libraries for processing images and obtaining adequate performance compared to desktop computer grids. By the time we started our research, the use of BOINC on mobile devices also involved two issues: a) executing programs on mobile devices required modifying the code to insert calls to the BOINC API, and b) dividing the image among the mobile devices and merging the results required additional code in some BOINC components. This article presents answers to these four challenges.
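A minimal sketch, assuming NumPy arrays, of the image division and merging step described above: the image is split into tiles that can be distributed to devices as work units and reassembled afterwards. This is illustrative only and does not use the BOINC API.

```python
import numpy as np

def split_into_tiles(image, n_rows, n_cols):
    """Split a 2-D image into n_rows x n_cols tiles, one work unit per tile."""
    return [np.array_split(strip, n_cols, axis=1)
            for strip in np.array_split(image, n_rows, axis=0)]

def merge_tiles(tiles):
    """Reassemble processed tiles in their original layout."""
    return np.vstack([np.hstack(row) for row in tiles])

image = np.random.rand(1024, 1024)
tiles = split_into_tiles(image, 4, 4)                 # 16 work units for 16 devices
processed = [[np.clip(t * 1.2, 0, 1) for t in row]    # stand-in for the real filter
             for row in tiles]
result = merge_tiles(processed)
assert result.shape == image.shape
```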
Li, Zhenlong; Yang, Chaowei; Jin, Baoxuan; Yu, Manzhu; Liu, Kai; Sun, Min; Zhan, Matthew
2015-01-01
Geoscience observations and model simulations are generating vast amounts of multi-dimensional data. Effectively analyzing these data are essential for geoscience studies. However, the tasks are challenging for geoscientists because processing the massive amount of data is both computing and data intensive in that data analytics requires complex procedures and multiple tools. To tackle these challenges, a scientific workflow framework is proposed for big geoscience data analytics. In this framework techniques are proposed by leveraging cloud computing, MapReduce, and Service Oriented Architecture (SOA). Specifically, HBase is adopted for storing and managing big geoscience data across distributed computers. MapReduce-based algorithm framework is developed to support parallel processing of geoscience data. And service-oriented workflow architecture is built for supporting on-demand complex data analytics in the cloud environment. A proof-of-concept prototype tests the performance of the framework. Results show that this innovative framework significantly improves the efficiency of big geoscience data analytics by reducing the data processing time as well as simplifying data analytical procedures for geoscientists.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Jianhui
2015-09-01
Grid modernization is transforming the operation and management of electric distribution systems from manual, paper-driven business processes to electronic, computer-assisted decision-making. At the center of this business transformation is the distribution management system (DMS), which provides a foundation from which optimal levels of performance can be achieved in an increasingly complex business and operating environment. Electric distribution utilities are facing many new challenges that are dramatically increasing the complexity of operating and managing the electric distribution system: growing customer expectations for service reliability and power quality, pressure to achieve better efficiency and utilization of existing distribution system assets, and reduction of greenhouse gas emissions by accommodating high penetration levels of distributed generating resources powered by renewable energy sources (wind, solar, etc.). Recent “storm of the century” events in the northeastern United States and the lengthy power outages and customer hardships that followed have greatly elevated the need to make power delivery systems more resilient to major storm events and to provide a more effective electric utility response during such regional power grid emergencies. Despite these newly emerging challenges for electric distribution system operators, only a small percentage of electric utilities have actually implemented a DMS. This paper discusses reasons why a DMS is needed and why the DMS may emerge as a mission-critical system that will soon be considered essential as electric utilities roll out their grid modernization strategies.
Extreme-Scale De Novo Genome Assembly
DOE Office of Scientific and Technical Information (OSTI.GOV)
Georganas, Evangelos; Hofmeyr, Steven; Egan, Rob
De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.
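A minimal sketch of the distributed k-mer hash table idea that underlies such assemblers: k-mers extracted from reads are routed to owner processes by a stable hash so that each process holds a disjoint shard of the global table. This is a simplified illustration, not HipMer's actual one-sided-communication code.

```python
import zlib
from collections import Counter

def kmers(read, k=31):
    """Yield all overlapping k-mers of a read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))

def owner(kmer, n_procs):
    """Stable routing: the process that owns this k-mer's shard of the table."""
    return zlib.crc32(kmer.encode()) % n_procs

def build_shards(reads, n_procs, k=31):
    shards = [Counter() for _ in range(n_procs)]
    for read in reads:
        for km in kmers(read, k):
            shards[owner(km, n_procs)][km] += 1   # in a real assembler this is a remote update
    return shards

reads = ["ACGTACGTGGCTAGCTAGGATCCGATCGATCGTAGCTAGCTA"]
shards = build_shards(reads, n_procs=4, k=21)
```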
CBRAIN: a web-based, distributed computing platform for collaborative neuroimaging research
Sherif, Tarek; Rioux, Pierre; Rousseau, Marc-Etienne; Kassis, Nicolas; Beck, Natacha; Adalat, Reza; Das, Samir; Glatard, Tristan; Evans, Alan C.
2014-01-01
The Canadian Brain Imaging Research Platform (CBRAIN) is a web-based collaborative research platform developed in response to the challenges raised by data-heavy, compute-intensive neuroimaging research. CBRAIN offers transparent access to remote data sources, distributed computing sites, and an array of processing and visualization tools within a controlled, secure environment. Its web interface is accessible through any modern browser and uses graphical interface idioms to reduce the technical expertise required to perform large-scale computational analyses. CBRAIN's flexible meta-scheduling has allowed the incorporation of a wide range of heterogeneous computing sites, currently including nine national research High Performance Computing (HPC) centers in Canada, one in Korea, one in Germany, and several local research servers. CBRAIN leverages remote computing cycles and facilitates resource-interoperability in a transparent manner for the end-user. Compared with typical grid solutions available, our architecture was designed to be easily extendable and deployed on existing remote computing sites with no tool modification, administrative intervention, or special software/hardware configuration. As of October 2013, CBRAIN serves over 200 users spread across 53 cities in 17 countries. The platform is built as a generic framework that can accept data and analysis tools from any discipline. However, its current focus is primarily on neuroimaging research and studies of neurological diseases such as Autism, Parkinson's and Alzheimer's diseases, Multiple Sclerosis as well as on normal brain structure and development. This technical report presents the CBRAIN Platform, its current deployment and usage, and future directions. PMID:24904400
CBRAIN: a web-based, distributed computing platform for collaborative neuroimaging research.
Sherif, Tarek; Rioux, Pierre; Rousseau, Marc-Etienne; Kassis, Nicolas; Beck, Natacha; Adalat, Reza; Das, Samir; Glatard, Tristan; Evans, Alan C
2014-01-01
The Canadian Brain Imaging Research Platform (CBRAIN) is a web-based collaborative research platform developed in response to the challenges raised by data-heavy, compute-intensive neuroimaging research. CBRAIN offers transparent access to remote data sources, distributed computing sites, and an array of processing and visualization tools within a controlled, secure environment. Its web interface is accessible through any modern browser and uses graphical interface idioms to reduce the technical expertise required to perform large-scale computational analyses. CBRAIN's flexible meta-scheduling has allowed the incorporation of a wide range of heterogeneous computing sites, currently including nine national research High Performance Computing (HPC) centers in Canada, one in Korea, one in Germany, and several local research servers. CBRAIN leverages remote computing cycles and facilitates resource-interoperability in a transparent manner for the end-user. Compared with typical grid solutions available, our architecture was designed to be easily extendable and deployed on existing remote computing sites with no tool modification, administrative intervention, or special software/hardware configuration. As of October 2013, CBRAIN serves over 200 users spread across 53 cities in 17 countries. The platform is built as a generic framework that can accept data and analysis tools from any discipline. However, its current focus is primarily on neuroimaging research and studies of neurological diseases such as Autism, Parkinson's and Alzheimer's diseases, Multiple Sclerosis as well as on normal brain structure and development. This technical report presents the CBRAIN Platform, its current deployment and usage, and future directions.
Redefining Tactical Operations for MER Using Cloud Computing
NASA Technical Reports Server (NTRS)
Joswig, Joseph C.; Shams, Khawaja S.
2011-01-01
The Mars Exploration Rover Mission (MER) includes the twin rovers, Spirit and Opportunity, which have been performing geological research and surface exploration since early 2004. The rovers' durability well beyond their original prime mission (90 sols or Martian days) has allowed them to be a valuable platform for scientific research for well over 2000 sols, but as a by-product it has produced new challenges in providing efficient and cost-effective tactical operational planning. An early-stage process adaptation was the move to distributed operations as mission scientists returned to their places of work in the summer of 2004, but they would still come together via teleconference and connected software to plan rover activities a few times a week. This distributed model has worked well since, but it requires the purchase, operation, and maintenance of a dedicated infrastructure at the Jet Propulsion Laboratory. This server infrastructure is costly to operate, and the periodic nature of its usage (typically heavy usage for 8 hours every 2 days) has made moving to a cloud-based tactical infrastructure an extremely tempting proposition. In this paper we review both past and current implementations of the tactical planning application, focusing on remote plan saving, and discuss the unique challenges present with long-latency, distributed operations. We then detail the motivations behind our move to cloud-based computing services as well as our system design and implementation. We will discuss security and reliability concerns and how they were addressed.
The Osseus platform: a prototype for advanced web-based distributed simulation
NASA Astrophysics Data System (ADS)
Franceschini, Derrick; Riecken, Mark
2016-05-01
Recent technological advances in web-based distributed computing and database technology have made possible a deeper and more transparent integration of some modeling and simulation applications. Despite these advances towards true integration of capabilities, disparate systems, architectures, and protocols will remain in the inventory for some time to come. These disparities present interoperability challenges for distributed modeling and simulation whether the application is training, experimentation, or analysis. Traditional approaches call for building gateways to bridge between disparate protocols and retaining interoperability specialists. Challenges in reconciling data models also persist. These challenges and their traditional mitigation approaches directly contribute to higher costs, schedule delays, and frustration for the end users. Osseus is a prototype software platform originally funded as a research project by the Defense Modeling & Simulation Coordination Office (DMSCO) to examine interoperability alternatives using modern, web-based technology and taking inspiration from the commercial sector. Osseus provides tools and services for nonexpert users to connect simulations, targeting the time and skillset needed to successfully connect disparate systems. The Osseus platform presents a web services interface to allow simulation applications to exchange data using modern techniques efficiently over Local or Wide Area Networks. Further, it provides Service Oriented Architecture capabilities such that finer granularity components such as individual models can contribute to simulation with minimal effort.
Towards a Cloud Based Smart Traffic Management Framework
NASA Astrophysics Data System (ADS)
Rahimi, M. M.; Hakimpour, F.
2017-09-01
Traffic big data has brought many opportunities for traffic management applications. However, several challenges like heterogeneity, storage, management, processing and analysis of traffic big data may hinder their efficient and real-time applications. All these challenges call for a well-adapted distributed framework for smart traffic management that can efficiently handle big traffic data integration, indexing, query processing, mining and analysis. In this paper, we present a novel, distributed, scalable and efficient framework for traffic management applications. The proposed cloud computing based framework can answer the technical challenges of efficient and real-time storage, management, processing and analysis of traffic big data. For evaluation of the framework, we have used OpenStreetMap (OSM) real trajectories and road network on a distributed environment. Our evaluation results indicate that the speed of data importing into this framework exceeds 8000 records per second when the dataset size is near 5 million records. We also evaluate the performance of data retrieval in our proposed framework. The data retrieval speed exceeds 15000 records per second when the dataset size is near 5 million records. We have also evaluated the scalability and performance of our proposed framework using parallelisation of a critical pre-analysis in transportation applications. The results show that the proposed framework achieves considerable performance and efficiency in traffic management applications.
MRPack: Multi-Algorithm Execution Using Compute-Intensive Approach in MapReduce
2015-01-01
Large quantities of data have been generated from multiple sources at exponential rates in the last few years. These data are generated at high velocity, as real-time and streaming data, in a variety of formats. These characteristics give rise to challenges in their modeling, computation, and processing. Hadoop MapReduce (MR) is a well-known data-intensive distributed processing framework that uses the distributed file system (DFS) for Big Data. Current implementations of MR only support execution of a single algorithm in the entire Hadoop cluster. In this paper, we propose MapReducePack (MRPack), a variation of MR that supports execution of a set of related algorithms in a single MR job. We exploit the computational capability of a cluster by increasing the compute-intensiveness of MapReduce while maintaining its data-intensive approach. It uses the available computing resources by dynamically managing the task assignment and intermediate data. Intermediate data from multiple algorithms are managed using multi-key and skew mitigation strategies. The performance study of the proposed system shows that it is time, I/O, and memory efficient compared to the default MapReduce. The proposed approach reduces the execution time by 200% with an approximate 50% decrease in I/O cost. Complexity and qualitative results analysis shows significant performance improvement. PMID:26305223
MRPack: Multi-Algorithm Execution Using Compute-Intensive Approach in MapReduce.
Idris, Muhammad; Hussain, Shujaat; Siddiqi, Muhammad Hameed; Hassan, Waseem; Syed Muhammad Bilal, Hafiz; Lee, Sungyoung
2015-01-01
Large quantities of data have been generated from multiple sources at exponential rates in the last few years. These data are generated at high velocity, as real-time and streaming data, in a variety of formats. These characteristics give rise to challenges in their modeling, computation, and processing. Hadoop MapReduce (MR) is a well-known data-intensive distributed processing framework that uses the distributed file system (DFS) for Big Data. Current implementations of MR only support execution of a single algorithm in the entire Hadoop cluster. In this paper, we propose MapReducePack (MRPack), a variation of MR that supports execution of a set of related algorithms in a single MR job. We exploit the computational capability of a cluster by increasing the compute-intensiveness of MapReduce while maintaining its data-intensive approach. It uses the available computing resources by dynamically managing the task assignment and intermediate data. Intermediate data from multiple algorithms are managed using multi-key and skew mitigation strategies. The performance study of the proposed system shows that it is time, I/O, and memory efficient compared to the default MapReduce. The proposed approach reduces the execution time by 200% with an approximate 50% decrease in I/O cost. Complexity and qualitative results analysis shows significant performance improvement.
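A minimal sketch of the multi-key idea, assuming two toy algorithms: intermediate records are tagged with an algorithm identifier so a single map/shuffle/reduce pass serves several related algorithms at once. The code is illustrative and is not taken from MRPack.

```python
ALGORITHMS = {
    "wordcount": lambda token: (token, 1),
    "charcount": lambda token: ("chars", len(token)),
}

def mapper(line):
    """Emit one record per algorithm, keyed by the composite (algorithm_id, key)."""
    for token in line.split():
        for algo_id, fn in ALGORITHMS.items():
            key, value = fn(token)
            yield (algo_id, key), value

def reducer(composite_key, values):
    """A single reducer dispatches on the algorithm id embedded in the key."""
    algo_id, key = composite_key
    return (algo_id, key, sum(values))

pairs = list(mapper("big data needs big clusters"))
# e.g. (("wordcount", "big"), 1) and (("charcount", "chars"), 3) share one shuffle
```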
A General-purpose Framework for Parallel Processing of Large-scale LiDAR Data
NASA Astrophysics Data System (ADS)
Li, Z.; Hodgson, M.; Li, W.
2016-12-01
Light detection and ranging (LiDAR) technologies have proven efficient at quickly obtaining very detailed Earth surface data over large spatial extents. Such data are important for scientific discoveries in the Earth and ecological sciences and for natural disaster and environmental applications. However, handling LiDAR data poses grand geoprocessing challenges due to data intensity and computational intensity. Previous studies achieved notable success in parallel processing of LiDAR data to address these challenges. However, these studies either relied on high performance computers and specialized hardware (GPUs) or focused mostly on finding customized solutions for specific algorithms. We developed a general-purpose scalable framework coupled with sophisticated data decomposition and parallelization strategies to efficiently handle big LiDAR data. Specifically, 1) a tile-based spatial index is proposed to manage big LiDAR data in the scalable and fault-tolerant Hadoop distributed file system, 2) two spatial decomposition techniques are developed to enable efficient parallelization of different types of LiDAR processing tasks, and 3) by coupling existing LiDAR processing tools with Hadoop, this framework is able to conduct a variety of LiDAR data processing tasks in parallel in a highly scalable distributed computing environment. The performance and scalability of the framework are evaluated with a series of experiments conducted on a real LiDAR dataset using a proof-of-concept prototype system. The results show that the proposed framework 1) is able to handle massive LiDAR data more efficiently than standalone tools; and 2) provides almost linear scalability in terms of either increased workload (data volume) or increased computing nodes with both spatial decomposition strategies. We believe that the proposed framework provides valuable references for developing a collaborative cyberinfrastructure for processing big earth science data in a highly scalable environment.
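A minimal sketch of a tile-based spatial index of the kind described in point 1): each LiDAR point is mapped to a fixed-size tile key derived from its coordinates, and the points grouped under one key become a unit of parallel work. The names and the tile size are assumptions for illustration.

```python
def tile_key(x, y, tile_size=500.0):
    """Map a point to the (column, row) of the fixed-size tile containing it."""
    return (int(x // tile_size), int(y // tile_size))

def index_points(points, tile_size=500.0):
    """Group (x, y, z) points by tile; each group becomes one parallel work unit."""
    tiles = {}
    for x, y, z in points:
        tiles.setdefault(tile_key(x, y, tile_size), []).append((x, y, z))
    return tiles

points = [(1023.4, 250.1, 87.2), (1490.0, 260.7, 90.5), (10.0, 10.0, 5.0)]
tiles = index_points(points)          # e.g. {(2, 0): [...], (0, 0): [...]}
```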
Mitsuhashi, Kenji; Poudel, Joemini; Matthews, Thomas P.; Garcia-Uribe, Alejandro; Wang, Lihong V.; Anastasio, Mark A.
2017-01-01
Photoacoustic computed tomography (PACT) is an emerging imaging modality that exploits optical contrast and ultrasonic detection principles to form images of the photoacoustically induced initial pressure distribution within tissue. The PACT reconstruction problem corresponds to an inverse source problem in which the initial pressure distribution is recovered from measurements of the radiated wavefield. A major challenge in transcranial PACT brain imaging is compensation for aberrations in the measured data due to the presence of the skull. Ultrasonic waves undergo absorption, scattering and longitudinal-to-shear wave mode conversion as they propagate through the skull. To properly account for these effects, a wave-equation-based inversion method should be employed that can model the heterogeneous elastic properties of the skull. In this work, a forward model based on a finite-difference time-domain discretization of the three-dimensional elastic wave equation is established and a procedure for computing the corresponding adjoint of the forward operator is presented. Massively parallel implementations of these operators employing multiple graphics processing units (GPUs) are also developed. The developed numerical framework is validated and investigated in computer-simulation and experimental phantom studies whose designs are motivated by transcranial PACT applications. PMID:29387291
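For reference, the linear elastic wave equation that such a forward model discretizes can be written as below (standard isotropic form; the notation is generic, absorption terms are omitted, and it is not tied to the authors' finite-difference scheme).

```latex
\rho(\mathbf{r})\,\frac{\partial^{2}\mathbf{u}}{\partial t^{2}} = \nabla\cdot\boldsymbol{\sigma} + \mathbf{f},
\qquad
\boldsymbol{\sigma} = \lambda(\mathbf{r})\,(\nabla\cdot\mathbf{u})\,\mathbf{I} + \mu(\mathbf{r})\left(\nabla\mathbf{u} + \nabla\mathbf{u}^{\mathsf{T}}\right)
```

Here u is the displacement field, ρ the density, λ and μ the Lamé parameters (spatially varying through the skull, which is what produces the mode conversion noted above), and f the photoacoustic source term.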
Benkner, Siegfried; Arbona, Antonio; Berti, Guntram; Chiarini, Alessandro; Dunlop, Robert; Engelbrecht, Gerhard; Frangi, Alejandro F; Friedrich, Christoph M; Hanser, Susanne; Hasselmeyer, Peer; Hose, Rod D; Iavindrasana, Jimison; Köhler, Martin; Iacono, Luigi Lo; Lonsdale, Guy; Meyer, Rodolphe; Moore, Bob; Rajasekaran, Hariharan; Summers, Paul E; Wöhrer, Alexander; Wood, Steven
2010-11-01
The increasing volume of data describing human disease processes and the growing complexity of understanding, managing, and sharing such data present a huge challenge for clinicians and medical researchers. This paper presents the @neurIST system, which provides an infrastructure for biomedical research while aiding clinical care, by bringing together heterogeneous data and complex processing and computing services. Although @neurIST targets the investigation and treatment of cerebral aneurysms, the system's architecture is generic enough that it could be adapted to the treatment of other diseases. Innovations in @neurIST include confining the patient data pertaining to aneurysms inside a single environment that offers clinicians the tools to analyze and interpret patient data and make use of knowledge-based guidance in planning their treatment. Medical researchers gain access to a critical mass of aneurysm related data due to the system's ability to federate distributed information sources. A semantically mediated grid infrastructure ensures that both clinicians and researchers are able to seamlessly access and work on data that is distributed across multiple sites in a secure way in addition to providing computing resources on demand for performing computationally intensive simulations for treatment planning and research.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kumar, Jitendra; HargroveJr., William Walter; Norman, Steven P
Vegetation canopy structure is a critically important habitat characteristic for many threatened and endangered birds and other animal species, and it is key information needed by forest and wildlife managers for monitoring and managing forest resources, conservation planning and fostering biodiversity. Advances in Light Detection and Ranging (LiDAR) technologies have enabled remote sensing-based studies of vegetation canopies by capturing three-dimensional structures, yielding information not available in two-dimensional images of the landscape provided by traditional multi-spectral remote sensing platforms. However, the large volume data sets produced by airborne LiDAR instruments pose a significant computational challenge, requiring algorithms to identify and analyze patterns of interest buried within LiDAR point clouds in a computationally efficient manner, utilizing state-of-the-art computing infrastructure. We developed and applied a computationally efficient approach to analyze a large volume of LiDAR data and to characterize and map the vegetation canopy structures for 139,859 hectares (540 sq. miles) in the Great Smoky Mountains National Park. This study helps improve our understanding of the distribution of vegetation and animal habitats in this extremely diverse ecosystem.
Real-time modeling and simulation of distribution feeder and distributed resources
NASA Astrophysics Data System (ADS)
Singh, Pawan
The analysis of the electrical system dates back to the days when analog network analyzers were used. With the advent of digital computers, many programs were written for power-flow and short circuit analysis for the improvement of the electrical system. Real-time computer simulations can answer many what-if scenarios in the existing or the proposed power system. In this thesis, the standard IEEE 13-Node distribution feeder is developed and validated on the real-time platform OPAL-RT. The concept and the challenges of real-time simulation are studied and addressed. Distributed energy resources, including commonly used distributed generation and storage devices such as a diesel engine, a solar photovoltaic array, and a battery storage system, are modeled and simulated on the real-time platform. A microgrid encompasses a portion of an electric power distribution system located downstream of the distribution substation. Normally, the microgrid operates in parallel with the grid; however, scheduled or forced isolation can take place. In such conditions, the microgrid must have the ability to operate stably and autonomously. The microgrid can operate in grid-connected and islanded modes; both operating modes are studied in the last chapter. Towards the end, a simple microgrid controller, modeled and simulated on the real-time platform, is developed for energy management and protection of the microgrid.
Cormode, Graham; Dasgupta, Anirban; Goyal, Amit; Lee, Chi Hoon
2018-01-01
Many modern applications of AI such as web search, mobile browsing, image processing, and natural language processing rely on finding similar items from a large database of complex objects. Due to the very large scale of the data involved (e.g., users' queries from commercial search engines), computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (a.k.a. LSH) methods and evaluate four variants in a distributed computing environment (specifically, Hadoop). We identify several optimizations which improve performance, suitable for deployment in very large scale settings. The experimental results demonstrate that our variants of LSH achieve robust performance with better recall compared with "vanilla" LSH, even when using the same amount of space.
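A minimal single-machine sketch of one common LSH variant (random-hyperplane hashing for cosine similarity), assuming NumPy: items whose signatures agree on a band land in the same bucket and become candidate near neighbors. In a Hadoop deployment the band key would serve as the shuffle key; this code is illustrative and is not one of the paper's four variants specifically.

```python
import numpy as np

def signhash(vectors, n_bits=32, seed=0):
    """Random-hyperplane LSH: one n_bits signature per input vector."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((vectors.shape[1], n_bits))
    return (vectors @ planes > 0).astype(np.uint8)      # shape (n_items, n_bits)

def bucket(signatures, band=0, band_bits=8):
    """Group items by one band of their signature; same bucket => candidate pair."""
    buckets = {}
    for idx, sig in enumerate(signatures):
        key = tuple(sig[band * band_bits:(band + 1) * band_bits])
        buckets.setdefault(key, []).append(idx)
    return buckets

items = np.random.rand(1000, 128)            # 1000 items with 128-d features
sigs = signhash(items)
candidates = bucket(sigs)                    # candidate near-neighbor groups
```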
Materials science. Materials that couple sensing, actuation, computation, and communication.
McEvoy, M A; Correll, N
2015-03-20
Tightly integrating sensing, actuation, and computation into composites could enable a new generation of truly smart material systems that can change their appearance and shape autonomously. Applications for such materials include airfoils that change their aerodynamic profile, vehicles with camouflage abilities, bridges that detect and repair damage, or robotic skins and prosthetics with a realistic sense of touch. Although integrating sensors and actuators into composites is becoming increasingly common, the opportunities afforded by embedded computation have only been marginally explored. Here, the key challenge is the gap between the continuous physics of materials and the discrete mathematics of computation. Bridging this gap requires a fundamental understanding of the constituents of such robotic materials and the distributed algorithms and controls that make these structures smart. Copyright © 2015, American Association for the Advancement of Science.
A resilient and secure software platform and architecture for distributed spacecraft
NASA Astrophysics Data System (ADS)
Otte, William R.; Dubey, Abhishek; Karsai, Gabor
2014-06-01
A distributed spacecraft is a cluster of independent satellite modules flying in formation that communicate via ad-hoc wireless networks. This system in space is a cloud platform that facilitates sharing sensors and other computing and communication resources across multiple applications, potentially developed and maintained by different organizations. Effectively, such an architecture can realize the functions of monolithic satellites at a reduced cost and with improved adaptivity and robustness. The openness of these architectures poses special challenges because the distributed software platform has to support applications from different security domains and organizations, where information flows have to be carefully managed and compartmentalized. If the platform is used as a robust shared resource, its management, configuration, and resilience become a challenge in themselves. We have designed and prototyped a distributed software platform for such architectures. The core element of the platform is a new operating system whose services were designed to restrict access to the network and the file system, and to enforce resource management constraints for all non-privileged processes. Mixed-criticality applications operating at different security labels are deployed and controlled by a privileged management process that also pre-configures all information flows. This paper describes the design and objectives of this layer.
Efficient LIDAR Point Cloud Data Managing and Processing in a Hadoop-Based Distributed Framework
NASA Astrophysics Data System (ADS)
Wang, C.; Hu, F.; Sha, D.; Han, X.
2017-10-01
Light Detection and Ranging (LiDAR) is one of the most promising technologies in surveying and mapping, city management, forestry, object recognition, computer vision engineering and other fields. However, it is challenging to efficiently store, query and analyze the high-resolution 3D LiDAR data due to its volume and complexity. In order to improve the productivity of LiDAR data processing, this study proposes a Hadoop-based framework to efficiently manage and process LiDAR data in a distributed and parallel manner, which takes advantage of Hadoop's storage and computing ability. At the same time, the Point Cloud Library (PCL), an open-source project for 2D/3D image and point cloud processing, is integrated with HDFS and MapReduce to conduct the LiDAR data analysis algorithms provided by PCL in a parallel fashion. The experiment results show that the proposed framework can efficiently manage and process big LiDAR data.
NASA Astrophysics Data System (ADS)
Ibrahima, Fayadhoi; Meyer, Daniel; Tchelepi, Hamdi
2016-04-01
Because geophysical data are inexorably sparse and incomplete, stochastic treatments of simulated responses are crucial to explore possible scenarios and assess risks in subsurface problems. In particular, nonlinear two-phase flows in porous media are essential, yet challenging, in reservoir simulation and hydrology. Adding highly heterogeneous and uncertain input, such as the permeability and porosity fields, transforms the estimation of the flow response into a tough stochastic problem for which computationally expensive Monte Carlo (MC) simulations remain the preferred option. We propose an alternative approach to evaluate the probability distribution of the (water) saturation for the stochastic Buckley-Leverett problem when the probability distributions of the permeability and porosity fields are available. We give a computationally efficient and numerically accurate method to estimate the one-point probability density function (PDF) and cumulative distribution function (CDF) of the (water) saturation. The distribution method draws inspiration from a Lagrangian approach to the stochastic transport problem and expresses the saturation PDF and CDF essentially in terms of a deterministic mapping and the distribution and statistics of scalar random fields. In a large class of applications these random fields can be estimated at low computational costs (few MC runs), thus making the distribution method attractive. Even though the method relies on a key assumption of fixed streamlines, we show that it performs well for high input variances, which is the case of interest. Once the saturation distribution is determined, any one-point statistics thereof can be obtained, especially the saturation average and standard deviation. Moreover, the probability of rare events and saturation quantiles (e.g. P10, P50 and P90) can be efficiently derived from the distribution method. These statistics can then be used for risk assessment, as well as data assimilation and uncertainty reduction in the prior knowledge of input distributions. We provide various examples and comparisons with MC simulations to illustrate the performance of the method.
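For context, the deterministic Buckley-Leverett conservation law underlying the stochastic problem can be written as below; the quadratic (Corey-type) fractional-flow function is shown only as a common example and may differ from the authors' choice.

```latex
\phi(x)\,\frac{\partial S}{\partial t} + q\,\frac{\partial f(S)}{\partial x} = 0,
\qquad
f(S) = \frac{S^{2}}{S^{2} + M\,(1-S)^{2}}
```

Here S is the water saturation, φ the porosity, q the total flux, and M the water-to-oil viscosity ratio; the uncertainty in φ and in the permeability enters through the random coefficients of this equation.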
A Gaussian Approximation Approach for Value of Information Analysis.
Jalal, Hawre; Alarid-Escudero, Fernando
2018-02-01
Most decisions are associated with uncertainty. Value of information (VOI) analysis quantifies the opportunity loss associated with choosing a suboptimal intervention based on current imperfect information. VOI can inform the value of collecting additional information, resource allocation, research prioritization, and future research designs. However, in practice, VOI remains underused due to many conceptual and computational challenges associated with its application. Expected value of sample information (EVSI) is rooted in Bayesian statistical decision theory and measures the value of information from a finite sample. The past few years have witnessed a dramatic growth in computationally efficient methods to calculate EVSI, including metamodeling. However, little research has been done to simplify the experimental data collection step inherent to all EVSI computations, especially for correlated model parameters. This article proposes a general Gaussian approximation (GA) of the traditional Bayesian updating approach based on the original work by Raiffa and Schlaifer to compute EVSI. The proposed approach uses a single probabilistic sensitivity analysis (PSA) data set and involves 2 steps: 1) a linear metamodel step to compute the EVSI on the preposterior distributions and 2) a GA step to compute the preposterior distribution of the parameters of interest. The proposed approach is efficient and can be applied for a wide range of data collection designs involving multiple non-Gaussian parameters and unbalanced study designs. Our approach is particularly useful when the parameters of an economic evaluation are correlated or interact.
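A heavily simplified sketch of the two-step idea described above, assuming a single PSA data set: a linear metamodel relates net benefit to the parameter a new study would inform, and a Gaussian-style shrinkage of the PSA samples stands in for the preposterior step. The shrinkage factor and the prior effective sample size n0 are assumptions for illustration, not the article's exact formulas.

```python
import numpy as np

def evsi_gaussian_approx(theta, net_benefit, n_new, n0):
    """Illustrative two-step EVSI estimate from one PSA data set.

    theta       : (n_psa,) PSA samples of the parameter the new study would inform
    net_benefit : (n_psa, n_strategies) net benefit per PSA sample and strategy
    n_new       : proposed sample size of the new study
    n0          : assumed prior effective sample size (analyst judgement)
    """
    # Step 1: linear metamodel of each strategy's net benefit as a function of theta.
    X = np.column_stack([np.ones_like(theta), theta])
    coefs, *_ = np.linalg.lstsq(X, net_benefit, rcond=None)

    # Step 2: Gaussian-approximation preposterior: shrink theta toward its mean
    # in proportion to how informative the proposed study would be.
    shrink = np.sqrt(n_new / (n_new + n0))
    theta_pre = theta.mean() + (theta - theta.mean()) * shrink
    nb_pre = np.column_stack([np.ones_like(theta_pre), theta_pre]) @ coefs

    # EVSI = expected value of deciding after the study minus deciding now.
    return nb_pre.max(axis=1).mean() - nb_pre.mean(axis=0).max()
```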
Cloud-Based Perception and Control of Sensor Nets and Robot Swarms
2016-04-01
A distributed stream processing framework provides the necessary API and infrastructure to develop and execute such applications in a cluster of computation nodes. Streaming DDDAS applications are characterized by the challenges they present to the backend Cloud control system (Figure 2: Parallel SLAM Application). State-of-the-art deep learning-based object detectors can recognize among hundreds of object classes, a capability that would be very useful for mobile platforms.
Introduction: The SERENITY vision
NASA Astrophysics Data System (ADS)
Maña, Antonio; Spanoudakis, George; Kokolakis, Spyros
In this chapter we present an overview of the SERENITY approach. We describe the SERENITY model of secure and dependable applications and show how it addresses the challenge of developing, integrating and dynamically maintaining security and dependability mechanisms in open, dynamic, distributed and heterogeneous computing systems and in particular Ambient Intelligence scenarios. The chapter describes the basic concepts used in the approach and introduces the different processes supported by SERENITY, along with the tools provided.
NASA Astrophysics Data System (ADS)
McKee, Shawn; Kissel, Ezra; Meekhof, Benjeman; Swany, Martin; Miller, Charles; Gregorowicz, Michael
2017-10-01
We report on the first year of the OSiRIS project (NSF Award #1541335; UM, IU, MSU and WSU), which is targeting the creation of a distributed Ceph storage infrastructure coupled with software-defined networking to provide high-performance access for well-connected locations on any participating campus. The project's goal is to provide a single scalable, distributed storage infrastructure that allows researchers at each campus to read, write, manage and share data directly from their own computing locations. The NSF CC*DNI DIBBS program which funded OSiRIS is seeking solutions to the challenges of multi-institutional collaborations involving large amounts of data, and we are exploring the creative use of Ceph and networking to address those challenges. While OSiRIS will eventually serve a broad range of science domains, its first adopter is the LHC ATLAS detector project via the ATLAS Great Lakes Tier-2 (AGLT2), jointly located at the University of Michigan and Michigan State University. Part of our presentation will cover how ATLAS is using the OSiRIS infrastructure and our experiences integrating our first user community. The presentation will also review the motivations for and goals of the project, the technical details of the OSiRIS infrastructure, the challenges in providing such an infrastructure, and the technical choices made to address those challenges. We will conclude with our plans for the remaining 4 years of the project and our vision for what we hope to deliver by the project's end.
Using Approximate Bayesian Computation to Probe Multiple Transiting Planet Systems
NASA Astrophysics Data System (ADS)
Morehead, Robert C.
2015-08-01
The large number of multiple transiting planet systems (MTPSs) uncovered with Kepler suggests a population of well-aligned planetary systems. Previously, the distribution of transit duration ratios in MTPSs has been used to place constraints on the distributions of mutual orbital inclinations and orbital eccentricities in these systems. However, degeneracies with the underlying number of planets in these systems pose added challenges and make explicit likelihood functions intractable. Approximate Bayesian computation (ABC) offers an intriguing path forward. In its simplest form, ABC proposes from a prior on the population parameters to produce synthetic datasets via a physically-motivated model. Samples are accepted or rejected based on how close they come to reproducing the actual observed dataset to some tolerance. The accepted samples then form a robust and useful approximation of the true posterior distribution of the underlying population parameters. We will demonstrate the utility of ABC in exoplanet populations by presenting new constraints on the mutual inclination and eccentricity distributions in the Kepler MTPSs. We will also introduce Simple-ABC, a new open-source Python package designed for ease of use and rapid specification of general models, suitable for use in a wide variety of applications in both exoplanet science and astrophysics as a whole.
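A minimal sketch of the rejection-ABC scheme described above, with a toy simulator; the simulator, prior, distance and tolerance are placeholders, and this is not the Simple-ABC package API.

```python
import numpy as np

def abc_rejection(observed, simulate, draw_prior, distance, epsilon, n_draws=20000):
    """Keep prior draws whose simulated data fall within epsilon of the observation."""
    accepted = []
    for _ in range(n_draws):
        theta = draw_prior()                 # propose from the prior
        synthetic = simulate(theta)          # forward model / synthetic dataset
        if distance(synthetic, observed) <= epsilon:
            accepted.append(theta)
    return np.array(accepted)                # approximate posterior samples

# Toy usage: infer the mean of a Gaussian from its sample mean.
rng = np.random.default_rng(1)
posterior = abc_rejection(
    observed=0.3,
    simulate=lambda mu: rng.normal(mu, 1.0, size=50).mean(),
    draw_prior=lambda: rng.uniform(-2, 2),
    distance=lambda a, b: abs(a - b),
    epsilon=0.05,
)
```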
Fan-out Estimation in Spin-based Quantum Computer Scale-up.
Nguyen, Thien; Hill, Charles D; Hollenberg, Lloyd C L; James, Matthew R
2017-10-17
Solid-state spin-based qubits offer good prospects for scaling based on their long coherence times and nexus to large-scale electronic scale-up technologies. However, high-threshold quantum error correction requires a two-dimensional qubit array operating in parallel, posing significant challenges in fabrication and control. While architectures incorporating distributed quantum control meet this challenge head-on, most designs rely on individual control and readout of all qubits with high gate densities. We analysed the fan-out routing overhead of a dedicated control line architecture, basing the analysis on a generalised solid-state spin qubit platform parameterised to encompass Coulomb-confined (e.g. donor based spin qubits) or electrostatically confined (e.g. quantum dot based spin qubits) implementations. The spatial scalability under this model is estimated using standard electronic routing methods and present-day fabrication constraints. Based on reasonable assumptions for qubit control and readout we estimate that 10^2-10^5 physical qubits, depending on the quantum interconnect implementation, can be integrated and fanned out independently. Assuming relatively long control-free interconnects, the scalability can be extended. Ultimately, universal quantum computation may necessitate a much higher number of integrated qubits, indicating that higher dimensional electronics fabrication and/or multiplexed distributed control and readout schemes may be the preferred strategy for large-scale implementation.
Nonlinear Reduced-Order Analysis with Time-Varying Spatial Loading Distributions
NASA Technical Reports Server (NTRS)
Przekop, Adam
2008-01-01
Oscillating shocks acting in combination with high-intensity acoustic loadings present a challenge to the design of resilient hypersonic flight vehicle structures. This paper addresses some features of this loading condition and certain aspects of a nonlinear reduced-order analysis with emphasis on system identification leading to formation of a robust modal basis. The nonlinear dynamic response of a composite structure subject to the simultaneous action of locally strong oscillating pressure gradients and high-intensity acoustic loadings is considered. The reduced-order analysis used in this work has been previously demonstrated to be both computationally efficient and accurate for time-invariant spatial loading distributions, provided that an appropriate modal basis is used. The challenge of the present study is to identify a suitable basis for loadings with time-varying spatial distributions. Using a proper orthogonal decomposition and modal expansion, it is shown that such a basis can be developed. The basis is made more robust by incrementally expanding it to account for changes in the location, frequency and span of the oscillating pressure gradient.
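A minimal sketch, assuming NumPy, of forming a modal basis by proper orthogonal decomposition of response snapshots and then incrementally expanding it, in the spirit of the basis-enrichment step described above; it is illustrative and not the authors' procedure.

```python
import numpy as np

def pod_basis(snapshots, energy=0.999):
    """snapshots: (n_dof, n_snapshots) response fields sampled in time.
    Returns the leading POD modes capturing the requested fraction of energy."""
    centered = snapshots - snapshots.mean(axis=1, keepdims=True)
    U, s, _ = np.linalg.svd(centered, full_matrices=False)
    cum = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(cum, energy)) + 1
    return U[:, :r]                      # columns are the reduced-order basis vectors

def augment_basis(basis, new_modes):
    """Expand the basis with modes from a new loading condition;
    a QR factorization re-orthonormalizes the combined set."""
    Q, _ = np.linalg.qr(np.hstack([basis, new_modes]))
    return Q

snapshots = np.random.rand(500, 40)      # stand-in for structural response histories
basis = pod_basis(snapshots)
basis = augment_basis(basis, pod_basis(np.random.rand(500, 40)))
```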
Research in Parallel Algorithms and Software for Computational Aerosciences
NASA Technical Reports Server (NTRS)
Domel, Neal D.
1996-01-01
Phase I is complete for the development of a Computational Fluid Dynamics parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
Image-based metrology of porous tissue engineering scaffolds
NASA Astrophysics Data System (ADS)
Rajagopalan, Srinivasan; Robb, Richard A.
2006-03-01
Tissue engineering is an interdisciplinary effort aimed at the repair and regeneration of biological tissues through the application and control of cells, porous scaffolds and growth factors. The regeneration of specific tissues guided by tissue analogous substrates is dependent on diverse scaffold architectural indices that can be derived quantitatively from the microCT and microMR images of the scaffolds. However, the randomness of pore-solid distributions in conventional stochastic scaffolds presents unique computational challenges. As a result, image-based characterization of scaffolds has been predominantly qualitative. In this paper, we discuss quantitative image-based techniques that can be used to compute the metrological indices of porous tissue engineering scaffolds. While bulk averaged quantities such as porosity and surface area are derived directly from the optimal pore-solid delineations, the spatially distributed geometric indices are derived from the medial axis representations of the pore network. The computational framework proposed (to the best of our knowledge for the first time in tissue engineering) in this paper might have profound implications towards unraveling the symbiotic structure-function relationship of porous tissue engineering scaffolds.
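A minimal sketch of the bulk-averaged indices mentioned above, assuming a binary pore/solid segmentation of a microCT volume stored as a NumPy array: porosity by voxel counting and a crude surface-area estimate by counting pore/solid voxel faces. The medial-axis-based spatial indices are not shown.

```python
import numpy as np

def porosity(binary_volume):
    """Fraction of pore voxels (True = pore) in the segmented scaffold volume."""
    return binary_volume.mean()

def surface_area(binary_volume, voxel_size=1.0):
    """Crude surface estimate: count pore/solid voxel-face transitions per axis."""
    faces = 0
    for axis in range(3):
        diff = np.diff(binary_volume.astype(np.int8), axis=axis)
        faces += np.count_nonzero(diff)
    return faces * voxel_size**2

volume = np.random.rand(64, 64, 64) > 0.5     # stand-in for a segmented microCT block
print(porosity(volume), surface_area(volume, voxel_size=10e-6))
```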
Research in Parallel Algorithms and Software for Computational Aerosciences
NASA Technical Reports Server (NTRS)
Domel, Neal D.
1996-01-01
Phase 1 is complete for the development of a computational fluid dynamics (CFD) parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
Fasani, Rick A; Livi, Carolina B; Choudhury, Dipanwita R; Kleensang, Andre; Bouhifd, Mounir; Pendse, Salil N; McMullen, Patrick D; Andersen, Melvin E; Hartung, Thomas; Rosenberg, Michael
2015-01-01
The Human Toxome Project is part of a long-term vision to modernize toxicity testing for the 21st century. In the initial phase of the project, a consortium of six academic, commercial, and government organizations has partnered to map pathways of toxicity, using endocrine disruption as a model hazard. Experimental data is generated at multiple sites, and analyzed using a range of computational tools. While effectively gathering, managing, and analyzing the data for high-content experiments is a challenge in its own right, doing so for a growing number of -omics technologies, with larger data sets, across multiple institutions complicates the process. Interestingly, one of the most difficult, ongoing challenges has been the computational collaboration between the geographically separate institutions. Existing solutions cannot handle the growing heterogeneous data, provide a computational environment for consistent analysis, accommodate different workflows, and adapt to the constantly evolving methods and goals of a research project. To meet the needs of the project, we have created and managed The Human Toxome Collaboratorium, a shared computational environment hosted on third-party cloud services. The Collaboratorium provides a familiar virtual desktop, with a mix of commercial, open-source, and custom-built applications. It shares some of the challenges of traditional information technology, but with unique and unexpected constraints that emerge from the cloud. Here we describe the problems we faced, the current architecture of the solution, an example of its use, the major lessons we learned, and the future potential of the concept. In particular, the Collaboratorium represents a novel distribution method that could increase the reproducibility and reusability of results from similar large, multi-omic studies.
NASA Technical Reports Server (NTRS)
Saini, Subhash; Frumkin, Michael; Hribar, Michelle; Jin, Hao-Qiang; Waheed, Abdul; Yan, Jerry
1998-01-01
Porting applications to new high performance parallel and distributed computing platforms is a challenging task. Since writing parallel code by hand is extremely time consuming and costly, porting codes would ideally be automated by using parallelization tools and compilers. In this paper, we compare the performance of the hand-written NAS Parallel Benchmarks against three parallel versions generated with the help of tools and compilers: 1) CAPTools, an interactive computer-aided parallelization tool that generates message passing code; 2) the Portland Group's HPF compiler; and 3) compiler directives with the native FORTRAN77 compiler on the SGI Origin2000.
Thermoelastic damping in bilayered microbar resonators with circular cross-section
NASA Astrophysics Data System (ADS)
Liang, Xiaoyao; Li, Pu
2017-11-01
It is always a challenge to determine thermoelastic damping (TED) in bilayered microbars precisely. In this paper, a model for TED in a bilayered, cantilevered microbar is proposed, in which the total damping is derived by calculating the energy dissipated in each layer. The temperature distribution in a bilayered microbar with a thermodynamically ideal boundary subjected to a time-harmonic force is obtained. An infinite-summation expression for computing TED in bilayered slender microbars under axial loading is presented, and its convergence rate is discussed. The results computed with the proposed model differ little from those obtained by the finite element method (FEM).
NASA Astrophysics Data System (ADS)
Gregory, A. E.; Benedict, K. K.; Zhang, S.; Savickas, J.
2017-12-01
Large scale, high severity wildfires in forests have become increasingly prevalent in the western United States due to fire exclusion. Although past work has focused on the immediate consequences of wildfire (i.e., runoff magnitude and debris flow), little has been done to understand the post-wildfire hydrologic consequences of vegetation regrowth. Furthermore, vegetation is often characterized by static parameterizations within hydrological models. In order to understand the temporal relationship between hydrologic processes and revegetation, we modularized and partially automated the hydrologic modeling process to increase connectivity between remotely sensed data, the Virtual Watershed Platform (a data management resource, called the VWP), input meteorological data, and the Precipitation-Runoff Modeling System (PRMS). This process was used to run PRMS simulations in the Valles Caldera of NM, an area impacted by the 2011 Las Conchas Fire, before and after the fire to evaluate changes in hydrologic processes. The modeling environment addressed some of the existing challenges faced by hydrological modelers. At present, modelers are somewhat limited in their ability to push the boundaries of hydrologic understanding. Specific issues faced by modelers include limited computational resources to model processes at large spatial and temporal scales, data storage capacity and accessibility from the modeling platform, computational and time constraints for experimental modeling, and the skills to integrate modeling software in ways that have not been explored. By taking an interdisciplinary approach, we were able to address some of these challenges by leveraging the skills of hydrologic, data, and computer scientists, and the technical capabilities provided by a combination of on-demand/high-performance computing, distributed data, and cloud services. The hydrologic modeling process was modularized to include options for distributing meteorological data, parameter space experimentation, data format transformation, looping, validation of models, and containerization for enabling new analytic scenarios. The user interacts with the modules through Jupyter Notebooks, which can be connected to an on-demand computing and HPC environment and to data services built as part of the VWP.
NASA Astrophysics Data System (ADS)
Thomer, A.
2017-12-01
Data provenance - the record of the varied processes that went into the creation of a dataset, as well as the relationships between resulting data objects - is necessary to support the reusability, reproducibility and reliability of earth science data. In sUAS-based research, capturing provenance can be particularly challenging because of the breadth and distributed nature of the many platforms used to collect, process and analyze data. In any given project, multiple drones, controllers, computers, software systems, sensors, cameras, image processing algorithms and data processing workflows are used over sometimes long periods of time. These platforms and processing steps result in dozens - if not hundreds - of data products in varying stages of readiness for analysis and sharing. Provenance tracking mechanisms are needed to make the relationships between these many data products explicit, and therefore more reusable and shareable. In this talk, I discuss opportunities and challenges in tracking provenance in sUAS-based research, and identify gaps in current workflow-capture technologies. I draw on prior work conducted as part of the IMLS-funded Site-Based Data Curation project, in which we developed methods of documenting in silico and ex silico (that is, computational and non-computational) workflows, and demonstrate this approach's applicability to research with sUASes. I conclude with a discussion of ontologies and other semantic technologies that have potential application in sUAS research.
NASA Technical Reports Server (NTRS)
Krasteva, Denitza T.
1998-01-01
Multidisciplinary design optimization (MDO) for large-scale engineering problems poses many challenges (e.g., the design of an efficient concurrent paradigm for global optimization based on disciplinary analyses, expensive computations over vast data sets, etc.) This work focuses on the application of distributed schemes for massively parallel architectures to MDO problems, as a tool for reducing computation time and solving larger problems. The specific problem considered here is configuration optimization of a high speed civil transport (HSCT), and the efficient parallelization of the embedded paradigm for reasonable design space identification. Two distributed dynamic load balancing techniques (random polling and global round robin with message combining) and two necessary termination detection schemes (global task count and token passing) were implemented and evaluated in terms of effectiveness and scalability to large problem sizes and a thousand processors. The effect of certain parameters on execution time was also inspected. Empirical results demonstrated stable performance and effectiveness for all schemes, and the parametric study showed that the selected algorithmic parameters have a negligible effect on performance.
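The random-polling scheme evaluated in this study can be illustrated with a toy, single-process simulation. The sketch below is not the paper's distributed-memory implementation (which uses message passing across up to a thousand processors with explicit termination detection); it is only a minimal illustration of the idea, and all names and numbers in it are invented. Each worker keeps a local task queue and, when idle, polls a randomly chosen victim and steals half of its remaining tasks.

```python
import random
from collections import deque

def simulate_random_polling(num_workers=8,
                            initial_tasks=(100, 0, 0, 0, 0, 0, 0, 0),
                            seed=42):
    """Toy simulation of random-polling dynamic load balancing.

    Worker 0 starts with all the work; idle workers poll random victims and
    steal half of the victim's remaining tasks. Returns the number of polls.
    """
    rng = random.Random(seed)
    queues = [deque(range(sum(initial_tasks[:i]), sum(initial_tasks[:i + 1])))
              for i in range(num_workers)]
    total = sum(initial_tasks)
    completed, poll_count = 0, 0
    while completed < total:
        for w in range(num_workers):
            if queues[w]:
                queues[w].popleft()          # "execute" one task
                completed += 1
            else:
                victim = rng.randrange(num_workers)   # random polling
                poll_count += 1
                for _ in range(len(queues[victim]) // 2):
                    queues[w].append(queues[victim].pop())
    return poll_count

if __name__ == "__main__":
    print("polls issued:", simulate_random_polling())
```

In the paper's setting the polls are messages between processors, and a separate termination-detection protocol (global task count or token passing) decides when all queues are empty; the serial loop above sidesteps that entirely.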
WE-B-BRD-01: Innovation in Radiation Therapy Planning II: Cloud Computing in RT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moore, K; Kagadis, G; Xing, L
As defined by the National Institute of Standards and Technology, cloud computing is “a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.” Despite the omnipresent role of computers in radiotherapy, cloud computing has yet to achieve widespread adoption in clinical or research applications, though the transition to such “on-demand” access is underway. As this transition proceeds, new opportunities for aggregate studies and efficient use of computational resources are set against new challenges in patient privacy protection, data integrity, and management of clinical informatics systems. In this Session, current and future applications of cloud computing and distributed computational resources will be discussed in the context of medical imaging, radiotherapy research, and clinical radiation oncology applications. Learning Objectives: 1. Understand basic concepts of cloud computing. 2. Understand how cloud computing could be used for medical imaging applications. 3. Understand how cloud computing could be employed for radiotherapy research. 4. Understand how clinical radiotherapy software applications would function in the cloud.
Organization of the secure distributed computing based on multi-agent system
NASA Astrophysics Data System (ADS)
Khovanskov, Sergey; Rumyantsev, Konstantin; Khovanskova, Vera
2018-04-01
Methods for distributed computing are currently receiving much attention. One such method is the use of multi-agent systems. Distributed computing organized over conventional networked computers can be exposed to security threats posed by the computational processes themselves. The authors have developed a unified agent algorithm for a control system that manages the operation of computing network nodes, with ordinary networked PCs used as the computing nodes. The proposed multi-agent control system allows the processing power of the computers on any existing network to be harnessed in a short time to solve large tasks by creating a distributed computing system. Agents deployed across the computer network can configure the distributed computing system, distribute the computational load among the computers they operate, and optimize the system according to the computing power of the computers on the network. The number of computers connected to the network can be increased by adding computers to the system, which increases the overall processing power. Adding a central agent to the multi-agent system increases the security of the distributed computation. This organization of the distributed computing system reduces problem-solving time and increases the fault tolerance (vitality) of the computing processes in a changing computing environment (a dynamically changing number of computers on the network). The developed multi-agent system also detects falsification of the results of the distributed computation, which could otherwise lead to wrong decisions; in addition, the system checks and corrects erroneous results.
2018-01-01
Many modern applications of AI such as web search, mobile browsing, image processing, and natural language processing rely on finding similar items from a large database of complex objects. Due to the very large scale of data involved (e.g., users’ queries from commercial search engines), computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (a.k.a. LSH) methods and evaluate four variants in a distributed computing environment (specifically, Hadoop). We identify several optimizations which improve performance, suitable for deployment in very large scale settings. The experimental results demonstrate that our variants of LSH achieve robust performance with better recall than “vanilla” LSH, even when using the same amount of space. PMID:29346410
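The abstract does not spell out which four LSH variants were evaluated, so the following is only a generic sketch of one common family (random-hyperplane signatures with banding, suited to cosine similarity) written in plain Python/NumPy rather than Hadoop; function names and parameters are illustrative.

```python
import numpy as np
from collections import defaultdict

def signatures(X, n_bits=64, seed=0):
    """Random-hyperplane (sign) signatures: one LSH family for cosine similarity."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_bits))
    return (X @ planes > 0).astype(np.uint8)          # shape (n_items, n_bits)

def band_buckets(sigs, bands=8):
    """Split each signature into bands and hash each band to a bucket.

    Items sharing a bucket in any band become candidate neighbours, so the
    expensive exact comparison only runs inside (hopefully small) buckets.
    """
    n, n_bits = sigs.shape
    rows = n_bits // bands
    buckets = defaultdict(set)
    for b in range(bands):
        chunk = sigs[:, b * rows:(b + 1) * rows]
        for i in range(n):
            buckets[(b, chunk[i].tobytes())].add(i)
    return buckets

def candidate_pairs(buckets):
    pairs = set()
    for members in buckets.values():
        members = sorted(members)
        for i in range(len(members)):
            for j in range(i + 1, len(members)):
                pairs.add((members[i], members[j]))
    return pairs

if __name__ == "__main__":
    X = np.random.default_rng(1).standard_normal((200, 32))
    X[1] = X[0] + 0.01                                   # a planted near-duplicate
    cands = candidate_pairs(band_buckets(signatures(X)))
    print((0, 1) in cands)                               # usually True
```

In a Hadoop setting the banding step maps naturally onto the shuffle: mappers emit (band, band-hash) keys per item and reducers compare only the items that collide.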
A Responsive Client for Distributed Visualization
NASA Astrophysics Data System (ADS)
Bollig, E. F.; Jensen, P. A.; Erlebacher, G.; Yuen, D. A.; Momsen, A. R.
2006-12-01
As grids, web services and distributed computing continue to gain popularity in the scientific community, demand for virtual laboratories likewise increases. Today organizations such as the Virtual Laboratory for Earth and Planetary Sciences (VLab) are dedicated to developing web-based portals to perform various simulations remotely while abstracting away details of the underlying computation. Two of the biggest challenges in portal-based computing are fast visualization and smooth interrogation without overtaxing client resources. In response to this challenge, we have expanded on our previous data storage strategy and thick client visualization scheme [1] to develop a client-centric distributed application that utilizes remote visualization of large datasets and makes use of the local graphics processor for improved interactivity. Rather than waste precious client resources for visualization, a combination of 3D graphics and 2D server bitmaps is used to simulate the look and feel of local rendering. Java Web Start and Java Bindings for OpenGL enable install-on-demand functionality as well as low level access to client graphics for all platforms. Powerful visualization services based on VTK and auto-generated by the WATT compiler [2] are accessible through a standard web API. Data is permanently stored on compute nodes while separate visualization nodes fetch data requested by clients, caching it locally to prevent unnecessary transfers. We will demonstrate application capabilities in the context of simulated charge density visualization within the VLab portal. In addition, we will address generalizations of our application to interact with a wider range of WATT services, and discuss performance bottlenecks. [1] Ananthuni, R., Karki, B.B., Bollig, E.F., da Silva, C.R.S., Erlebacher, G., "A Web-Based Visualization and Reposition Scheme for Scientific Data," In Press, Proceedings of the 2006 International Conference on Modeling Simulation and Visualization Methods (MSV'06) (2006). [2] Jensen, P.A., Yuen, D.A., Erlebacher, G., Bollig, E.F., Kigelman, D.G., Shukh, E.A., Automated Generation of Web Services for Visualization Toolkits, Eos Trans. AGU, 86(52), Fall Meet. Suppl., Abstract IN42A-06, 2005.
Statistical mechanics of complex neural systems and high dimensional data
NASA Astrophysics Data System (ADS)
Advani, Madhu; Lahiri, Subhaneil; Ganguli, Surya
2013-03-01
Recent experimental advances in neuroscience have opened new vistas into the immense complexity of neuronal networks. This proliferation of data challenges us on two parallel fronts. First, how can we form adequate theoretical frameworks for understanding how dynamical network processes cooperate across widely disparate spatiotemporal scales to solve important computational problems? Second, how can we extract meaningful models of neuronal systems from high dimensional datasets? To aid in these challenges, we give a pedagogical review of a collection of ideas and theoretical methods arising at the intersection of statistical physics, computer science and neurobiology. We introduce the interrelated replica and cavity methods, which originated in statistical physics as powerful ways to quantitatively analyze large highly heterogeneous systems of many interacting degrees of freedom. We also introduce the closely related notion of message passing in graphical models, which originated in computer science as a distributed algorithm capable of solving large inference and optimization problems involving many coupled variables. We then show how both the statistical physics and computer science perspectives can be applied in a wide diversity of contexts to problems arising in theoretical neuroscience and data analysis. Along the way we discuss spin glasses, learning theory, illusions of structure in noise, random matrices, dimensionality reduction and compressed sensing, all within the unified formalism of the replica method. Moreover, we review recent conceptual connections between message passing in graphical models, and neural computation and learning. Overall, these ideas illustrate how statistical physics and computer science might provide a lens through which we can uncover emergent computational functions buried deep within the dynamical complexities of neuronal networks.
BCM: toolkit for Bayesian analysis of Computational Models using samplers.
Thijssen, Bram; Dijkstra, Tjeerd M H; Heskes, Tom; Wessels, Lodewyk F A
2016-10-21
Computational models in biology are characterized by a large degree of uncertainty. This uncertainty can be analyzed with Bayesian statistics; however, the sampling algorithms that are frequently used for calculating Bayesian statistical estimates are computationally demanding, and each algorithm has unique advantages and disadvantages. It is typically unclear, before starting an analysis, which algorithm will perform well on a given computational model. We present BCM, a toolkit for the Bayesian analysis of Computational Models using samplers. It provides efficient, multithreaded implementations of eleven algorithms for sampling from posterior probability distributions and for calculating marginal likelihoods. BCM includes tools to simplify the process of model specification and scripts for visualizing the results. The flexible architecture allows it to be used on diverse types of biological computational models. In an example inference task using a model of the cell cycle based on ordinary differential equations, BCM is significantly more efficient than existing software packages, allowing more challenging inference problems to be solved. BCM represents an efficient one-stop shop for computational modelers wishing to use sampler-based Bayesian statistics.
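BCM itself is a toolkit with its own model-specification format, and the sketch below is not its API. It is a minimal random-walk Metropolis sampler in Python, meant only to illustrate the kind of posterior sampling such toolkits implement; the toy log-posterior is invented for the example.

```python
import numpy as np

def metropolis_hastings(log_posterior, x0, n_samples=5000, step=0.5, seed=0):
    """Minimal random-walk Metropolis sampler.

    `log_posterior` is any function returning the unnormalized log posterior
    density of a parameter vector. Real toolkits such as BCM provide many
    more sophisticated, multithreaded samplers; this only shows the idea.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    logp = log_posterior(x)
    samples = np.empty((n_samples, x.size))
    for i in range(n_samples):
        proposal = x + step * rng.standard_normal(x.size)
        logp_prop = log_posterior(proposal)
        if np.log(rng.random()) < logp_prop - logp:   # accept/reject step
            x, logp = proposal, logp_prop
        samples[i] = x
    return samples

if __name__ == "__main__":
    # Toy example: Gaussian likelihood around "data" with a standard normal prior.
    data = np.array([0.8, 1.1, 0.9])
    log_post = lambda th: -0.5 * np.sum((data - th[0]) ** 2) - 0.5 * th[0] ** 2
    draws = metropolis_hastings(log_post, x0=[0.0])
    print("posterior mean ~", draws[:, 0].mean())
```

For an ODE-based model like the cell-cycle example, `log_posterior` would integrate the differential equations for a given parameter vector and compare the trajectory to measured data, which is exactly where the computational cost, and the value of efficient samplers, comes from.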
Extraction of drainage networks from large terrain datasets using high throughput computing
NASA Astrophysics Data System (ADS)
Gong, Jianya; Xie, Jibo
2009-02-01
Advanced digital photogrammetry and remote sensing technology produces large terrain datasets (LTD). How to process and use these LTD has become a big challenge for GIS users. Extracting drainage networks, which are basic for hydrological applications, from LTD is one of the typical applications of digital terrain analysis (DTA) in geographical information applications. Existing serial drainage algorithms cannot deal with large data volumes in a timely fashion, and few GIS platforms can process LTD beyond the GB size. High throughput computing (HTC), a distributed parallel computing mode, is proposed to improve the efficiency of drainage networks extraction from LTD. Drainage network extraction using HTC involves two key issues: (1) how to decompose the large DEM datasets into independent computing units and (2) how to merge the separate outputs into a final result. A new decomposition method is presented in which the large datasets are partitioned into independent computing units using natural watershed boundaries instead of using regular 1-dimensional (strip-wise) and 2-dimensional (block-wise) decomposition. Because the distribution of drainage networks is strongly related to watershed boundaries, the new decomposition method is more effective and natural. The method to extract natural watershed boundaries was improved by using multi-scale DEMs instead of single-scale DEMs. A HTC environment is employed to test the proposed methods with real datasets.
Neuroimaging Study Designs, Computational Analyses and Data Provenance Using the LONI Pipeline
Dinov, Ivo; Lozev, Kamen; Petrosyan, Petros; Liu, Zhizhong; Eggert, Paul; Pierce, Jonathan; Zamanyan, Alen; Chakrapani, Shruthi; Van Horn, John; Parker, D. Stott; Magsipoc, Rico; Leung, Kelvin; Gutman, Boris; Woods, Roger; Toga, Arthur
2010-01-01
Modern computational neuroscience employs diverse software tools and multidisciplinary expertise to analyze heterogeneous brain data. The classical problems of gathering meaningful data, fitting specific models, and discovering appropriate analysis and visualization tools give way to a new class of computational challenges—management of large and incongruous data, integration and interoperability of computational resources, and data provenance. We designed, implemented and validated a new paradigm for addressing these challenges in the neuroimaging field. Our solution is based on the LONI Pipeline environment [3], [4], a graphical workflow environment for constructing and executing complex data processing protocols. We developed study-design, database and visual language programming functionalities within the LONI Pipeline that enable the construction of complete, elaborate and robust graphical workflows for analyzing neuroimaging and other data. These workflows facilitate open sharing and communication of data and metadata, concrete processing protocols, result validation, and study replication among different investigators and research groups. The LONI Pipeline features include distributed grid-enabled infrastructure, virtualized execution environment, efficient integration, data provenance, validation and distribution of new computational tools, automated data format conversion, and an intuitive graphical user interface. We demonstrate the new LONI Pipeline features using large scale neuroimaging studies based on data from the International Consortium for Brain Mapping [5] and the Alzheimer's Disease Neuroimaging Initiative [6]. User guides, forums, instructions and downloads of the LONI Pipeline environment are available at http://pipeline.loni.ucla.edu. PMID:20927408
BigDebug: Debugging Primitives for Interactive Big Data Processing in Spark.
Gulzar, Muhammad Ali; Interlandi, Matteo; Yoo, Seunghyun; Tetali, Sai Deep; Condie, Tyson; Millstein, Todd; Kim, Miryung
2016-05-01
Developers use cloud computing platforms to process a large quantity of data in parallel when developing big data analytics. Debugging the massive parallel computations that run in today's data-centers is time consuming and error-prone. To address this challenge, we design a set of interactive, real-time debugging primitives for big data processing in Apache Spark, the next generation data-intensive scalable cloud computing platform. This requires re-thinking the notion of step-through debugging in a traditional debugger such as gdb, because pausing the entire computation across distributed worker nodes causes significant delay and naively inspecting millions of records using a watchpoint is too time consuming for an end user. First, BIGDEBUG's simulated breakpoints and on-demand watchpoints allow users to selectively examine distributed, intermediate data on the cloud with little overhead. Second, a user can also pinpoint a crash-inducing record and selectively resume relevant sub-computations after a quick fix. Third, a user can determine the root causes of errors (or delays) at the level of individual records through a fine-grained data provenance capability. Our evaluation shows that BIGDEBUG scales to terabytes and its record-level tracing incurs less than 25% overhead on average. It determines crash culprits orders of magnitude more accurately and provides up to 100% time saving compared to the baseline replay debugger. The results show that BIGDEBUG supports debugging at interactive speeds with minimal performance impact.
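BigDebug's primitives are implemented inside Spark itself; the sketch below is not BigDebug's API but a rough stand-in using stock PySpark (an accumulator plus a guarded parse) to show the flavor of record-level inspection of crash-inducing records. The input records and field layout are invented.

```python
from pyspark.sql import SparkSession

# Stock PySpark only (not BigDebug's actual primitives): guard a fragile
# parsing step, count crash-inducing records with an accumulator, and
# collect the offending inputs so they can be inspected interactively.
spark = SparkSession.builder.appName("record-inspection-sketch").getOrCreate()
sc = spark.sparkContext

bad_records = sc.accumulator(0)   # counts are approximate if tasks are re-run

def try_parse(line):
    """Return [(key, value)] for well-formed lines, [] for bad ones."""
    try:
        key, value = line.split(",")
        return [(key, float(value))]
    except ValueError:
        bad_records.add(1)
        return []

def is_bad(line):
    """Pure predicate (no accumulator) used to pull out the culprits."""
    try:
        key, value = line.split(",")
        float(value)
        return False
    except ValueError:
        return True

lines = sc.parallelize(["a,1.0", "b,oops", "c,3.5", "broken"])   # invented input
totals = lines.flatMap(try_parse).reduceByKey(lambda x, y: x + y).collect()
culprits = lines.filter(is_bad).collect()

print("totals:", totals)
print("bad records seen:", bad_records.value)
print("records to inspect:", culprits)
```

The point of BigDebug is that this kind of guarding, sampling and provenance tracking is provided by the framework at low overhead, rather than hand-rolled per job as above.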
A Columnar Storage Strategy with Spatiotemporal Index for Big Climate Data
NASA Astrophysics Data System (ADS)
Hu, F.; Bowen, M. K.; Li, Z.; Schnase, J. L.; Duffy, D.; Lee, T. J.; Yang, C. P.
2015-12-01
Large collections of observational, reanalysis, and climate model output data may grow to as large as 100 PB in the coming years, so climate datasets fall squarely in the Big Data domain, and various distributed computing frameworks have been utilized to address the challenges of big climate data analysis. However, due to the binary data formats (NetCDF, HDF) with high spatial and temporal dimensions, the computing frameworks in the Apache Hadoop ecosystem are not natively suited for big climate data. In order to make the computing frameworks in the Hadoop ecosystem directly support big climate data, we propose a columnar storage format with a spatiotemporal index to store climate data, which will support any project in the Apache Hadoop ecosystem (e.g. MapReduce, Spark, Hive, Impala). With this approach, the climate data are transferred into the binary Parquet format, a columnar storage format, and a spatial and temporal index is built and attached at the end of the Parquet files to enable real-time data query. Such climate data in Parquet format then become available to any computing framework in the Hadoop ecosystem. The proposed approach is evaluated using the NASA Modern-Era Retrospective Analysis for Research and Applications (MERRA) climate reanalysis dataset. Experimental results show that this approach can efficiently bridge the gap between big climate data and distributed computing frameworks, and that the spatiotemporal index significantly accelerates data querying and processing.
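As a rough illustration of the general idea, not the authors' implementation or index layout, one can flatten a gridded NetCDF variable into a table with time/lat/lon columns, write it as Parquet, and let the column statistics act as a coarse spatiotemporal filter at read time. The file and variable names below are hypothetical.

```python
import pandas as pd
import xarray as xr

# Hypothetical file and variable names; real MERRA reanalysis holdings would
# be written as many Parquet files partitioned by time, not a single file.
ds = xr.open_dataset("merra_sample.nc4")
df = ds["T2M"].to_dataframe().reset_index()          # columns: time, lat, lon, T2M

# Columnar write; per-row-group statistics on time/lat/lon serve as a coarse index.
df.to_parquet("t2m.parquet", engine="pyarrow", index=False)

# Predicate push-down: only row groups overlapping the query window are read.
subset = pd.read_parquet(
    "t2m.parquet",
    engine="pyarrow",
    filters=[("lat", ">=", 30.0), ("lat", "<=", 45.0),
             ("lon", ">=", -110.0), ("lon", "<=", -90.0)],
)
print(subset["T2M"].mean())
```

Once the data sit in Parquet, the same files can be queried from Hive, Impala or Spark SQL without format-specific readers, which is the interoperability benefit the abstract emphasizes.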
NASA Astrophysics Data System (ADS)
Shamugam, Veeramani; Murray, I.; Leong, J. A.; Sidhu, Amandeep S.
2016-03-01
Cloud computing provides services on demand instantly, such as access to network infrastructure consisting of computing hardware, operating systems, network storage, databases and applications. Network usage and demands are growing at a very fast rate, and to meet current requirements there is a need for automatic infrastructure scaling. Traditional networks are difficult to automate because their decision making for switching and routing is distributed and collocated with the forwarding hardware on each device. Managing complex environments using traditional networks is time-consuming and expensive, especially in the case of generating virtual machines, migration and network configuration. To mitigate these challenges, network operations require efficient, flexible, agile and scalable software defined networks (SDN). This paper discusses various issues in SDN and suggests how to mitigate network management related issues. A private cloud prototype test bed was set up to implement SDN on the OpenStack platform to test and evaluate the network performance provided by various configurations.
Archive Management of NASA Earth Observation Data to Support Cloud Analysis
NASA Technical Reports Server (NTRS)
Lynnes, Christopher; Baynes, Kathleen; McInerney, Mark
2017-01-01
NASA collects, processes and distributes petabytes of Earth Observation (EO) data from satellites, aircraft, in situ instruments and model output, with an order of magnitude increase expected by 2024. Cloud-based web object storage (WOS) of these data can simplify handling such an increase. More importantly, it can also facilitate user analysis of those volumes by making the data available to the massively parallel computing power in the cloud. However, storing EO data in cloud WOS has a ripple effect throughout the NASA archive system, with unexpected challenges and opportunities. One challenge is modifying data servicing software (such as Web Coverage Service servers) to access and subset data that are no longer on a directly accessible file system, but rather in cloud WOS. Opportunities include refactoring of the archive software to a cloud-native architecture; virtualizing data products by computing on demand; and reorganizing data to be more analysis-friendly. Reviewed by Mark McInerney, ESDIS Deputy Project Manager.
A hydrological emulator for global applications – HE v1.0.0
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Yaling; Hejazi, Mohamad; Li, Hongyi
While global hydrological models (GHMs) are very useful in exploring water resources and interactions between the Earth and human systems, their use often requires numerous model inputs, complex model calibration, and high computation costs. To overcome these challenges, we construct an efficient open-source and ready-to-use hydrological emulator (HE) that can mimic complex GHMs at a range of spatial scales (e.g., basin, region, globe). More specifically, we construct both a lumped and a distributed scheme of the HE based on the monthly abcd model to explore the tradeoff between computational cost and model fidelity. Model predictability and computational efficiency are evaluated in simulating global runoff from 1971 to 2010 with both the lumped and distributed schemes. The results are compared against the runoff product from the widely used Variable Infiltration Capacity (VIC) model. Our evaluation indicates that the lumped and distributed schemes present comparable results regarding annual total quantity, spatial pattern, and temporal variation of the major water fluxes (e.g., total runoff, evapotranspiration) across the global 235 basins (e.g., correlation coefficient r between the annual total runoff from either of these two schemes and the VIC is > 0.96), except for several cold (e.g., Arctic, interior Tibet), dry (e.g., North Africa) and mountainous (e.g., Argentina) regions. Compared against the monthly total runoff product from the VIC (aggregated from daily runoff), the global mean Kling–Gupta efficiencies are 0.75 and 0.79 for the lumped and distributed schemes, respectively, with the distributed scheme better capturing spatial heterogeneity. Notably, the computation efficiency of the lumped scheme is 2 orders of magnitude higher than the distributed one and 7 orders more efficient than the VIC model. A case study of uncertainty analysis for the world's 16 basins with top annual streamflow is conducted using 100 000 model simulations, and it demonstrates the lumped scheme's extraordinary advantage in computational efficiency. Lastly, our results suggest that the revised lumped abcd model can serve as an efficient and reasonable HE for complex GHMs and is suitable for broad practical use, and the distributed scheme is also an efficient alternative if spatial heterogeneity is of more interest.
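The Kling–Gupta efficiency quoted above has a standard definition (Gupta et al., 2009); a minimal sketch of computing it, not taken from the authors' code, is:

```python
import numpy as np

def kling_gupta_efficiency(sim, obs):
    """Standard Kling-Gupta efficiency:
    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2),
    with r the linear correlation, alpha the ratio of standard deviations,
    and beta the ratio of means between simulated and observed runoff.
    """
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

# Toy check: a perfect emulation gives KGE = 1; a biased one scores lower.
obs = np.array([10.0, 20.0, 15.0, 30.0, 25.0])
print(kling_gupta_efficiency(obs, obs))        # -> 1.0
print(kling_gupta_efficiency(obs * 0.8, obs))  # biased emulation, KGE < 1
```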
de la Iglesia, D; Cachau, R E; García-Remesal, M; Maojo, V
2013-11-27
Nanotechnology represents an area of particular promise and significant opportunity across multiple scientific disciplines. Ongoing nanotechnology research ranges from the characterization of nanoparticles and nanomaterials to the analysis and processing of experimental data seeking correlations between nanoparticles and their functionalities and side effects. Due to their special properties, nanoparticles are suitable for cellular-level diagnostics and therapy, offering numerous applications in medicine, e.g. development of biomedical devices, tissue repair, drug delivery systems and biosensors. In nanomedicine, recent studies are producing large amounts of structural and property data, highlighting the role for computational approaches in information management. While in vitro and in vivo assays are expensive, the cost of computing is falling. Furthermore, improvements in the accuracy of computational methods (e.g. data mining, knowledge discovery, modeling and simulation) have enabled effective tools to automate the extraction, management and storage of these vast data volumes. Since this information is widely distributed, one major issue is how to locate and access data where it resides (which also poses data-sharing limitations). The novel discipline of nanoinformatics addresses the information challenges related to nanotechnology research. In this paper, we summarize the needs and challenges in the field and present an overview of extant initiatives and efforts.
NASA Astrophysics Data System (ADS)
de la Iglesia, D.; Cachau, R. E.; García-Remesal, M.; Maojo, V.
2013-01-01
Nanotechnology represents an area of particular promise and significant opportunity across multiple scientific disciplines. Ongoing nanotechnology research ranges from the characterization of nanoparticles and nanomaterials to the analysis and processing of experimental data seeking correlations between nanoparticles and their functionalities and side effects. Due to their special properties, nanoparticles are suitable for cellular-level diagnostics and therapy, offering numerous applications in medicine, e.g. development of biomedical devices, tissue repair, drug delivery systems and biosensors. In nanomedicine, recent studies are producing large amounts of structural and property data, highlighting the role for computational approaches in information management. While in vitro and in vivo assays are expensive, the cost of computing is falling. Furthermore, improvements in the accuracy of computational methods (e.g. data mining, knowledge discovery, modeling and simulation) have enabled effective tools to automate the extraction, management and storage of these vast data volumes. Since this information is widely distributed, one major issue is how to locate and access data where it resides (which also poses data-sharing limitations). The novel discipline of nanoinformatics addresses the information challenges related to nanotechnology research. In this paper, we summarize the needs and challenges in the field and present an overview of extant initiatives and efforts.
Veksler, Vladislav D; Buchler, Norbou; Hoffman, Blaine E; Cassenti, Daniel N; Sample, Char; Sugrim, Shridat
2018-01-01
Computational models of cognitive processes may be employed in cyber-security tools, experiments, and simulations to address human agency and effective decision-making in keeping computational networks secure. Cognitive modeling can address multi-disciplinary cyber-security challenges requiring cross-cutting approaches over the human and computational sciences, such as the following: (a) adversarial reasoning and behavioral game theory to predict attacker subjective utilities and decision likelihood distributions, (b) human factors of cyber tools to address human-system integration challenges, estimation of defender cognitive states, and opportunities for automation, (c) dynamic simulations involving attacker, defender, and user models to enhance studies of cyber epidemiology and cyber hygiene, and (d) training effectiveness research and training scenarios to address human cyber-security performance, maturation of cyber-security skill sets, and effective decision-making. Models may be initially constructed at the group level based on mean tendencies of each subject's subgroup, using known statistics such as specific skill proficiencies, demographic characteristics, and cultural factors. For more precise and accurate predictions, cognitive models may be fine-tuned to each individual attacker, defender, or user profile, and updated over time (based on recorded behavior) via techniques such as model tracing and dynamic parameter fitting.
2014-01-01
The emergence of massive datasets in a clinical setting presents both challenges and opportunities in data storage and analysis. This so-called “big data” challenges traditional analytic tools and will increasingly require novel solutions adapted from other fields. Advances in information and communication technology present the most viable solutions to big data analysis in terms of efficiency and scalability. It is vital that big data solutions be multithreaded and that data access approaches be precisely tailored to large volumes of semi-structured/unstructured data. The MapReduce programming framework uses two tasks common in functional programming: Map and Reduce. MapReduce is a new parallel processing framework and Hadoop is its open-source implementation on a single computing node or on clusters. Compared with existing parallel processing paradigms (e.g. grid computing and graphical processing unit (GPU)), MapReduce and Hadoop have two advantages: 1) fault-tolerant storage resulting in reliable data processing by replicating the computing tasks, and cloning the data chunks on different computing nodes across the computing cluster; 2) high-throughput data processing via a batch processing framework and the Hadoop distributed file system (HDFS). Data are stored in the HDFS and made available to the slave nodes for computation. In this paper, we review the existing applications of the MapReduce programming framework and its implementation platform Hadoop in clinical big data and related medical health informatics fields. The usage of MapReduce and Hadoop on a distributed system represents a significant advance in clinical big data processing and utilization, and opens up new opportunities in the emerging era of big data analytics. The objective of this paper is to summarize the state-of-the-art efforts in clinical big data analytics and highlight what might be needed to enhance the outcomes of clinical big data analytics tools. This paper is concluded by summarizing the potential usage of the MapReduce programming framework and Hadoop platform to process huge volumes of clinical data in medical health informatics related fields. PMID:25383096
Mohammed, Emad A; Far, Behrouz H; Naugler, Christopher
2014-01-01
The emergence of massive datasets in a clinical setting presents both challenges and opportunities in data storage and analysis. This so-called "big data" challenges traditional analytic tools and will increasingly require novel solutions adapted from other fields. Advances in information and communication technology present the most viable solutions to big data analysis in terms of efficiency and scalability. It is vital that big data solutions be multithreaded and that data access approaches be precisely tailored to large volumes of semi-structured/unstructured data. The MapReduce programming framework uses two tasks common in functional programming: Map and Reduce. MapReduce is a new parallel processing framework and Hadoop is its open-source implementation on a single computing node or on clusters. Compared with existing parallel processing paradigms (e.g. grid computing and graphical processing unit (GPU)), MapReduce and Hadoop have two advantages: 1) fault-tolerant storage resulting in reliable data processing by replicating the computing tasks, and cloning the data chunks on different computing nodes across the computing cluster; 2) high-throughput data processing via a batch processing framework and the Hadoop distributed file system (HDFS). Data are stored in the HDFS and made available to the slave nodes for computation. In this paper, we review the existing applications of the MapReduce programming framework and its implementation platform Hadoop in clinical big data and related medical health informatics fields. The usage of MapReduce and Hadoop on a distributed system represents a significant advance in clinical big data processing and utilization, and opens up new opportunities in the emerging era of big data analytics. The objective of this paper is to summarize the state-of-the-art efforts in clinical big data analytics and highlight what might be needed to enhance the outcomes of clinical big data analytics tools. This paper is concluded by summarizing the potential usage of the MapReduce programming framework and Hadoop platform to process huge volumes of clinical data in medical health informatics related fields.
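As a concrete illustration of the Map and Reduce tasks described in these two entries, the following is a hedged Hadoop Streaming sketch in Python that counts clinical records per diagnosis code; the tab-separated field layout and file names are hypothetical and this is not code from any of the reviewed applications.

```python
#!/usr/bin/env python3
# mapper.py - Hadoop Streaming mapper (illustrative only; the tab-separated
# field layout is hypothetical). Emits "diagnosis_code<TAB>1" per record.
import sys

for line in sys.stdin:
    fields = line.rstrip("\n").split("\t")
    if len(fields) > 2:
        print(fields[2] + "\t1")   # assume column 3 holds a diagnosis code
```

```python
#!/usr/bin/env python3
# reducer.py - Hadoop Streaming reducer: sums the counts for each key.
# Hadoop sorts mapper output by key before the reducer sees it.
import sys

current_key, count = None, 0
for line in sys.stdin:
    key, value = line.rstrip("\n").split("\t")
    if key != current_key and current_key is not None:
        print(current_key + "\t" + str(count))
        count = 0
    current_key = key
    count += int(value)
if current_key is not None:
    print(current_key + "\t" + str(count))
```

These two scripts would typically be launched with the standard Hadoop Streaming jar, e.g. `hadoop jar hadoop-streaming-*.jar -files mapper.py,reducer.py -mapper mapper.py -reducer reducer.py -input <hdfs input> -output <hdfs output>`; the exact jar location varies by distribution. The fault tolerance and HDFS data locality described in the abstract come for free from the framework, not from the scripts.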
Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli; Brett, Bevin
2013-01-01
One of the key challenges in three-dimensional (3D) medical imaging is to enable a fast turn-around time, which is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. In this work, we have developed a software platform that is designed to support high-performance 3D medical image processing for a wide range of applications using increasingly available and affordable commodity computing systems: multi-core, clusters, and cloud computing systems. To achieve scalable, high-performance computing, our platform (1) employs size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D image processing algorithms; (2) supports task scheduling for efficient load distribution and balancing; and (3) consists of layered parallel software libraries that allow a wide range of medical applications to share the same functionalities. We evaluated the performance of our platform by applying it to an electronic cleansing system in virtual colonoscopy, with initial experimental results showing a 10 times performance improvement on an 8-core workstation over the original sequential implementation of the system. PMID:23366803
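A minimal sketch of the block-volume idea, with a local process pool standing in for the platform's far more capable scheduler: split a 3D volume into blocks, process each block independently, and reassemble the result. The per-block kernel here is a trivial threshold; real smoothing or segmentation filters would need overlapping halo regions, which this sketch omits.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor
from itertools import product

def process_block(args):
    """Per-block kernel: here a trivial thresholding 'processing' step."""
    index, block = args
    return index, (block > block.mean()).astype(np.uint8)

def process_volume(volume, block_shape=(64, 64, 64), workers=4):
    """Split a 3D volume into blocks, process them in parallel, reassemble."""
    out = np.empty(volume.shape, dtype=np.uint8)
    blocks = []
    for z, y, x in product(*(range(0, s, b)
                             for s, b in zip(volume.shape, block_shape))):
        sl = (slice(z, z + block_shape[0]),
              slice(y, y + block_shape[1]),
              slice(x, x + block_shape[2]))
        blocks.append((sl, volume[sl].copy()))
    with ProcessPoolExecutor(max_workers=workers) as pool:
        for sl, result in pool.map(process_block, blocks):
            out[sl] = result
    return out

if __name__ == "__main__":
    vol = np.random.default_rng(0).random((128, 128, 128)).astype(np.float32)
    mask = process_volume(vol)
    print(mask.shape, mask.dtype)
```

The "size-adaptive" aspect in the paper corresponds to choosing `block_shape` per algorithm and per machine so that each block fits in cache or per-node memory; the sketch keeps it fixed for brevity.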
HEP Computing Tools, Grid and Supercomputers for Genome Sequencing Studies
NASA Astrophysics Data System (ADS)
De, K.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Novikov, A.; Poyda, A.; Tertychnyy, I.; Wenaus, T.
2017-10-01
PanDA - the Production and Distributed Analysis Workload Management System - has been developed to address the data processing and analysis challenges of the ATLAS experiment at the LHC. Recently PanDA has been extended to run HEP scientific applications on Leadership Class Facilities and supercomputers. The success of the projects using PanDA beyond HEP and the Grid has drawn attention from other compute intensive sciences such as bioinformatics. Recent advances in Next Generation Genome Sequencing (NGS) technology have led to increasing streams of sequencing data that need to be processed, analysed and made available for bioinformaticians worldwide. Analysis of genome sequencing data using the popular software pipeline PALEOMIX can take a month even when running on a powerful computing resource. In this paper we describe the adaptation of the PALEOMIX pipeline to run on a distributed computing environment powered by PanDA. To run the pipeline we split the input files into chunks, which are processed separately on different nodes as independent PALEOMIX inputs, and finally merge the output files; this is very similar to how ATLAS processes and simulates data. We dramatically decreased the total wall time thanks to automated job (re)submission and brokering within PanDA. Using software tools developed initially for HEP and the Grid can reduce payload execution time for mammoth DNA samples from weeks to days.
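A minimal sketch of the split-process-merge pattern described above, with local processes standing in for PanDA job submission and a trivial stand-in for the per-chunk PALEOMIX run; the input file name and chunk size are hypothetical.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def split_fastq(path, reads_per_chunk=100_000):
    """Split a FASTQ file into chunks of whole records (4 lines per read)."""
    chunks, buf, idx = [], [], 0
    with open(path) as fh:
        for i, line in enumerate(fh):
            buf.append(line)
            if (i + 1) % (4 * reads_per_chunk) == 0:
                chunks.append(_flush(path, idx, buf))
                buf, idx = [], idx + 1
    if buf:
        chunks.append(_flush(path, idx, buf))
    return chunks

def _flush(path, idx, buf):
    out = Path(f"{path}.chunk{idx:04d}")
    out.write_text("".join(buf))
    return out

def run_chunk(chunk):
    """Stand-in for submitting one PALEOMIX job per chunk via PanDA."""
    with open(chunk) as fh:
        n_reads = sum(1 for _ in fh) // 4
    result = Path(str(chunk) + ".out")
    result.write_text(f"processed {n_reads} reads\n")
    return result

def merge(results, merged="merged.out"):
    with open(merged, "w") as out:
        for r in results:
            out.write(Path(r).read_text())
    return merged

if __name__ == "__main__":
    chunks = split_fastq("sample.fastq")          # hypothetical input file
    with ProcessPoolExecutor() as pool:
        outputs = list(pool.map(run_chunk, chunks))
    print("merged into", merge(outputs))
```

In the actual setup each chunk becomes an independent PanDA job brokered to a Grid site or supercomputer, and failed chunks are simply resubmitted, which is where the wall-time reduction comes from.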
Assembling Large, Multi-Sensor Climate Datasets Using the SciFlo Grid Workflow System
NASA Astrophysics Data System (ADS)
Wilson, B.; Manipon, G.; Xing, Z.; Fetzer, E.
2009-04-01
NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over periods of years to decades. However, moving from predominantly single-instrument studies to a multi-sensor, measurement-based model for long-duration analysis of important climate variables presents serious challenges for large-scale data mining and data fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another instrument (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over years of AIRS data. To perform such an analysis, one must discover & access multiple datasets from remote sites, find the space/time "matchups" between instruments swaths and model grids, understand the quality flags and uncertainties for retrieved physical variables, assemble merged datasets, and compute fused products for further scientific and statistical analysis. To meet these large-scale challenges, we are utilizing a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data query, access, subsetting, co-registration, mining, fusion, and advanced statistical analysis. SciFlo is a semantically-enabled ("smart") Grid Workflow system that ties together a peer-to-peer network of computers into an efficient engine for distributed computation. The SciFlo workflow engine enables scientists to do multi-instrument Earth Science by assembling remotely-invokable Web Services (SOAP or http GET URLs), native executables, command-line scripts, and Python codes into a distributed computing flow. A scientist visually authors the graph of operation in the VizFlow GUI, or uses a text editor to modify the simple XML workflow documents. The SciFlo client & server engines optimize the execution of such distributed workflows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. The engine transparently moves data to the operators, and moves operators to the data (on the dozen trusted SciFlo nodes). SciFlo also deploys a variety of Data Grid services to: query datasets in space and time, locate & retrieve on-line data granules, provide on-the-fly variable and spatial subsetting, perform pairwise instrument matchups for A-Train datasets, and compute fused products. These services are combined into efficient workflows to assemble the desired large-scale, merged climate datasets. SciFlo is currently being applied in several large climate studies: comparisons of aerosol optical depth between MODIS, MISR, AERONET ground network, and U. Michigan's IMPACT aerosol transport model; characterization of long-term biases in microwave and infrared instruments (AIRS, MLS) by comparisons to GPS temperature retrievals accurate to 0.1 degrees Kelvin; and construction of a decade-long, multi-sensor water vapor climatology stratified by classified cloud scene by bringing together datasets from AIRS/AMSU, AMSR-E, MLS, MODIS, and CloudSat (NASA MEASUREs grant, Fetzer PI). 
The presentation will discuss the SciFlo technologies, their application in these distributed workflows, and the many challenges encountered in assembling and analyzing these massive datasets.
Robotic insects: Manufacturing, actuation, and power considerations
NASA Astrophysics Data System (ADS)
Wood, Robert
2015-12-01
As the characteristic size of a flying robot decreases, the challenges for successful flight revert to basic questions of fabrication, actuation, fluid mechanics, stabilization, and power - whereas such questions have in general been answered for larger aircraft. When developing a robot on the scale of a housefly, all hardware must be developed from scratch as there is nothing "off-the-shelf" which can be used for mechanisms, sensors, or computation that would satisfy the extreme mass and power limitations. With these challenges in mind, this talk will present progress in the essential technologies for insect-like robots with an emphasis on multi-scale manufacturing methods, high power density actuation, and energy-efficient power distribution.
Time-Series INSAR: An Integer Least-Squares Approach For Distributed Scatterers
NASA Astrophysics Data System (ADS)
Samiei-Esfahany, Sami; Hanssen, Ramon F.
2012-01-01
The objective of this research is to extend the geodetic mathematical model which was developed for persistent scatterers to a model which can exploit distributed scatterers (DS). The main focus is on the integer least-squares framework, and the main challenge is to include the decorrelation effect in the mathematical model. In order to adapt the integer least-squares mathematical model for DS we altered the model from a single-master to a multi-master configuration and introduced the decorrelation effect stochastically. This effect is described in our model by a full covariance matrix. We propose to derive this covariance matrix by numerical integration of the (joint) probability distribution function (PDF) of interferometric phases. This PDF is a function of coherence values and can be directly computed from radar data. We show that the use of this model can improve the performance of temporal phase unwrapping of distributed scatterers.
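As the simplest building block of such a numerical integration, the marginal single-look interferometric phase PDF for a given coherence can be integrated to obtain a phase variance. The sketch below uses that standard marginal PDF; the covariance matrix in the paper is derived from the joint, multilooked PDF, so this is only illustrative.

```python
import numpy as np

def single_look_phase_pdf(phi, coherence, phi0=0.0):
    """Standard marginal PDF of the single-look interferometric phase.

    Only the simplest building block: the full covariance matrix would come
    from the joint, multilooked phase PDF rather than this marginal form.
    """
    beta = coherence * np.cos(phi - phi0)
    return ((1.0 - coherence**2) / (2.0 * np.pi * (1.0 - beta**2))
            * (1.0 + beta * np.arccos(-beta) / np.sqrt(1.0 - beta**2)))

def phase_variance(coherence, n=20001):
    """Numerically integrate phi^2 * pdf(phi) over (-pi, pi]."""
    phi = np.linspace(-np.pi, np.pi, n)
    pdf = single_look_phase_pdf(phi, coherence)
    dphi = phi[1] - phi[0]
    return np.sum(phi**2 * pdf) * dphi

for gamma in (0.3, 0.6, 0.9):
    print(f"coherence {gamma:.1f}: "
          f"phase std {np.sqrt(phase_variance(gamma)):.2f} rad")
```

Lower coherence yields a broader phase PDF and hence a larger variance, which is exactly the stochastic decorrelation information the full covariance matrix carries into the integer least-squares estimator.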
An Efficient Numerical Approach for Nonlinear Fokker-Planck equations
NASA Astrophysics Data System (ADS)
Otten, Dustin; Vedula, Prakash
2009-03-01
Fokker-Planck equations that are nonlinear with respect to their probability densities, which occur in many nonequilibrium systems relevant to mean-field interaction models, plasmas, and classical fermions and bosons, can be challenging to solve numerically. To address some of the underlying challenges in obtaining numerical solutions, we propose a quadrature-based moment method for efficient and accurate determination of transient (and stationary) solutions of nonlinear Fokker-Planck equations. In this approach the distribution function is represented as a collection of Dirac delta functions with corresponding quadrature weights and locations, which are in turn determined from constraints based on the evolution of generalized moments. Properties of the distribution function can be obtained by solving transport equations for the quadrature weights and locations. We will apply this computational approach to study a wide range of problems, including the Desai-Zwanzig model (for nonlinear muscular contraction) and multivariate nonlinear Fokker-Planck equations describing classical fermions and bosons, and will also demonstrate good agreement with results obtained from Monte Carlo and other standard numerical methods.
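In standard quadrature-method-of-moments notation (a sketch, not necessarily the authors' exact formulation), the delta-function representation and the moment constraints read:

```latex
% Quadrature-based representation of the distribution function (sketch):
% N weighted Dirac delta functions with weights w_i(t) and abscissas x_i(t).
\begin{align}
  f(x,t) \;\approx\; \sum_{i=1}^{N} w_i(t)\, \delta\!\bigl(x - x_i(t)\bigr), \\
  \mu_k(t) \;=\; \int x^k f(x,t)\, dx \;\approx\; \sum_{i=1}^{N} w_i(t)\, x_i(t)^k,
  \qquad k = 0,\dots,2N-1,
\end{align}
% Evolving the 2N generalized moments mu_k under the Fokker-Planck dynamics
% then yields transport equations for the weights and locations.
```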
Cloud-Based Computational Tools for Earth Science Applications
NASA Astrophysics Data System (ADS)
Arendt, A. A.; Fatland, R.; Howe, B.
2015-12-01
Earth scientists are increasingly required to think across disciplines and utilize a wide range of datasets in order to solve complex environmental challenges. Although significant progress has been made in distributing data, researchers must still invest heavily in developing computational tools to accommodate their specific domain. Here we document our development of lightweight computational data systems aimed at enabling rapid data distribution, analytics and problem solving tools for Earth science applications. Our goal is for these systems to be easily deployable, scalable and flexible to accommodate new research directions. As an example we describe "Ice2Ocean", a software system aimed at predicting runoff from snow and ice in the Gulf of Alaska region. Our backend components include relational database software to handle tabular and vector datasets, Python tools (NumPy, pandas and xray) for rapid querying of gridded climate data, and an energy and mass balance hydrological simulation model (SnowModel). These components are hosted in a cloud environment for direct access across research teams, and can also be accessed via API web services using a REST interface. This API is a vital component of our system architecture, as it enables quick integration of our analytical tools across disciplines, and can be accessed by any existing data distribution centers. We will showcase several data integration and visualization examples to illustrate how our system has expanded our ability to conduct cross-disciplinary research.
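As an illustration of the REST-style access pattern described, the snippet below issues a query with Python's requests library; the endpoint, route and parameters are hypothetical, not the actual Ice2Ocean API.

```python
import requests

# Hypothetical endpoint and parameters -- this only illustrates the
# REST-style access pattern, not a documented Ice2Ocean interface.
BASE = "https://example.org/ice2ocean/api/v1"

resp = requests.get(
    f"{BASE}/runoff",
    params={"basin": "gulf_of_alaska",
            "start": "2014-01-01", "end": "2014-12-31"},
    timeout=30,
)
resp.raise_for_status()
records = resp.json()    # e.g. a list of {"date": ..., "runoff_m3s": ...}
print(len(records), "daily records returned")
```

Because the same API fronts the database and the simulation outputs, downstream tools in other disciplines can consume the results without knowing anything about the backend data systems, which is the integration point the abstract highlights.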
Integration of end-user Cloud storage for CMS analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Riahi, Hassen; Aimar, Alberto; Ayllon, Alejandro Alvarez
End-user Cloud storage is increasing rapidly in popularity in research communities thanks to the collaboration capabilities it offers, namely synchronisation and sharing. CERN IT has implemented a model of such storage named CERNBox, integrated with the CERN AuthN and AuthZ services. To exploit the use of the end-user Cloud storage for the distributed data analysis activity, the CMS experiment has started the integration of CERNBox as a Grid resource. This will allow CMS users to make use of their own storage in the Cloud for their analysis activities as well as to benefit from synchronisation and sharing capabilities to achieve results faster and more effectively. It will provide an integration model of Cloud storages in the Grid, which is implemented and commissioned over the world’s largest computing Grid infrastructure, the Worldwide LHC Computing Grid (WLCG). In this paper, we present the integration strategy and infrastructure changes needed in order to transparently integrate end-user Cloud storage with the CMS distributed computing model. We describe the new challenges faced in data management between Grid and Cloud and how they were addressed, along with details of the support for Cloud storage recently introduced into the WLCG data movement middleware, FTS3. Finally, the commissioning experience of CERNBox for the distributed data analysis activity is also presented.
Integration of end-user Cloud storage for CMS analysis
Riahi, Hassen; Aimar, Alberto; Ayllon, Alejandro Alvarez; ...
2017-05-19
End-user Cloud storage is increasing rapidly in popularity in research communities thanks to the collaboration capabilities it offers, namely synchronisation and sharing. CERN IT has implemented a model of such storage named CERNBox, integrated with the CERN AuthN and AuthZ services. To exploit the use of the end-user Cloud storage for the distributed data analysis activity, the CMS experiment has started the integration of CERNBox as a Grid resource. This will allow CMS users to make use of their own storage in the Cloud for their analysis activities as well as to benefit from synchronisation and sharing capabilities to achieve results faster and more effectively. It will provide an integration model of Cloud storages in the Grid, which is implemented and commissioned over the world’s largest computing Grid infrastructure, the Worldwide LHC Computing Grid (WLCG). In this paper, we present the integration strategy and infrastructure changes needed in order to transparently integrate end-user Cloud storage with the CMS distributed computing model. We describe the new challenges faced in data management between Grid and Cloud and how they were addressed, along with details of the support for Cloud storage recently introduced into the WLCG data movement middleware, FTS3. Finally, the commissioning experience of CERNBox for the distributed data analysis activity is also presented.
Local storage federation through XRootD architecture for interactive distributed analysis
NASA Astrophysics Data System (ADS)
Colamaria, F.; Colella, D.; Donvito, G.; Elia, D.; Franco, A.; Luparello, G.; Maggi, G.; Miniello, G.; Vallero, S.; Vino, G.
2015-12-01
A cloud-based Virtual Analysis Facility (VAF) for the ALICE experiment at the LHC has been deployed in Bari. Similar facilities are currently running in other Italian sites with the aim to create a federation of interoperating farms able to provide their computing resources for interactive distributed analysis. The use of cloud technology, along with elastic provisioning of computing resources as an alternative to the grid for running data intensive analyses, is the main challenge of these facilities. One of the crucial aspects of the user-driven analysis execution is the data access. A local storage facility has the disadvantage that the stored data can be accessed only locally, i.e. from within the single VAF. To overcome such a limitation a federated infrastructure, which provides full access to all the data belonging to the federation independently from the site where they are stored, has been set up. The federation architecture exploits both cloud computing and XRootD technologies, in order to provide a dynamic, easy-to-use and well performing solution for data handling. It should allow the users to store the files and efficiently retrieve the data, since it implements a dynamic distributed cache among many datacenters in Italy connected to one another through the high-bandwidth national network. Details on the preliminary architecture implementation and performance studies are discussed.
Zao, John K.; Gan, Tchin-Tze; You, Chun-Kai; Chung, Cheng-En; Wang, Yu-Te; Rodríguez Méndez, Sergio José; Mullen, Tim; Yu, Chieh; Kothe, Christian; Hsiao, Ching-Teng; Chu, San-Liang; Shieh, Ce-Kuen; Jung, Tzyy-Ping
2014-01-01
EEG-based Brain-computer interfaces (BCI) are facing basic challenges in real-world applications. The technical difficulties in developing truly wearable BCI systems that are capable of making reliable real-time prediction of users' cognitive states in dynamic real-life situations may seem almost insurmountable at times. Fortunately, recent advances in miniature sensors, wireless communication and distributed computing technologies have offered promising ways to bridge these chasms. In this paper, we report an attempt to develop a pervasive on-line EEG-BCI system using state-of-the-art technologies including multi-tier Fog and Cloud Computing, semantic Linked Data search, and adaptive prediction/classification models. To verify our approach, we implement a pilot system by employing wireless dry-electrode EEG headsets and MEMS motion sensors as the front-end devices, Android mobile phones as the personal user interfaces, compact personal computers as the near-end Fog Servers and the computer clusters hosted by the Taiwan National Center for High-performance Computing (NCHC) as the far-end Cloud Servers. We succeeded in conducting synchronous multi-modal global data streaming in March 2013 and then running a multi-player on-line EEG-BCI game in September 2013. We are currently working with the ARL Translational Neuroscience Branch to use our system in real-life personal stress monitoring and the UCSD Movement Disorder Center to conduct in-home Parkinson's disease patient monitoring experiments. We shall proceed to develop the necessary BCI ontology and introduce automatic semantic annotation and progressive model refinement capability to our system. PMID:24917804
Haro, Alexander J.; Chelminski, Michael; Dudley, Robert W.
2015-01-01
We developed two-dimensional computational fluid hydraulics-habitat suitability index (CFD-HSI) models to identify and qualitatively assess potential zones of shallow water depth and high water velocity that may present passage challenges for five major anadromous fish species in a 2.63-km reach of the main stem Penobscot River, Maine, as a result of a dam removal downstream of the reach. Suitability parameters were based on distribution of fish lengths and body depths and transformed to cruising, maximum sustained and sprint swimming speeds. Zones of potential depth and velocity challenges were calculated based on the hydraulic models; ability of fish to pass a challenge zone was based on the percent of river channel that the contiguous zone spanned and its maximum along-current length. Three river flows (low: 99.1 m³ s⁻¹; normal: 344.9 m³ s⁻¹; and high: 792.9 m³ s⁻¹) were modelled to simulate existing hydraulic conditions and hydraulic conditions simulating removal of a dam at the downstream boundary of the reach. Potential depth challenge zones were nonexistent for all low-flow simulations of existing conditions for deeper-bodied fishes. Increasing flows for existing conditions and removal of the dam under all flow conditions increased the number and size of potential velocity challenge zones, with the effects of zones being more pronounced for smaller species. The two-dimensional CFD-HSI model has utility in demonstrating gross effects of flow and hydraulic alteration, but may not be as precise a predictive tool as a three-dimensional model. Passability of the potential challenge zones cannot be precisely quantified for two-dimensional or three-dimensional models due to untested assumptions and incomplete data on fish swimming performance and behaviours.
Distributed Peer-to-Peer Target Tracking in Wireless Sensor Networks
Wang, Xue; Wang, Sheng; Bi, Dao-Wei; Ma, Jun-Jie
2007-01-01
Target tracking is usually a challenging application for wireless sensor networks (WSNs) because it is always computation-intensive and requires real-time processing. This paper proposes a practical target tracking system based on the auto regressive moving average (ARMA) model in a distributed peer-to-peer (P2P) signal processing framework. In the proposed framework, wireless sensor nodes act as peers that perform target detection, feature extraction, classification and tracking, whereas target localization requires the collaboration between wireless sensor nodes for improving the accuracy and robustness. For carrying out target tracking under the constraints imposed by the limited capabilities of the wireless sensor nodes, some practically feasible algorithms, such as the ARMA model and the 2-D integer lifting wavelet transform, are adopted in single wireless sensor nodes due to their outstanding performance and light computational burden. Furthermore, a progressive multi-view localization algorithm is proposed in distributed P2P signal processing framework considering the tradeoff between the accuracy and energy consumption. Finally, a real world target tracking experiment is illustrated. Results from experimental implementations have demonstrated that the proposed target tracking system based on a distributed P2P signal processing framework can make efficient use of scarce energy and communication resources and achieve target tracking successfully.
2017-04-05
Front-matter excerpt from the report (no abstract provided): Information Technology at Nationwide; Abstract; 1 Business Imperatives (1.1 Deliver the Right Work; 1.2 Deliver the Right Way; 1.3 Deliver with an Engaged Workforce); 2 Challenges and Opportunities (2.1 Responding to Demand; 2.2 Standards and Capabilities; 2.3 Information Technology ...). Approved for public release and unlimited distribution. Nationwide Information Technology (IT) is comprised of seven offices
Evaluating Cloud Computing in the Proposed NASA DESDynI Ground Data System
NASA Technical Reports Server (NTRS)
Tran, John J.; Cinquini, Luca; Mattmann, Chris A.; Zimdars, Paul A.; Cuddy, David T.; Leung, Kon S.; Kwoun, Oh-Ig; Crichton, Dan; Freeborn, Dana
2011-01-01
The proposed NASA Deformation, Ecosystem Structure and Dynamics of Ice (DESDynI) mission would be a first-of-breed endeavor that would fundamentally change the paradigm by which Earth Science data systems at NASA are built. DESDynI is evaluating a distributed architecture where expert science nodes around the country all engage in some form of mission processing and data archiving. This is compared to the traditional NASA Earth Science missions where the science processing is typically centralized. What's more, DESDynI is poised to profoundly increase the amount of data collection and processing well into the 5 terabyte/day and tens of thousands of job range, both of which comprise a tremendous challenge to DESDynI's proposed distributed data system architecture. In this paper, we report on a set of architectural trade studies and benchmarks meant to inform the DESDynI mission and the broader community of the impacts of these unprecedented requirements. In particular, we evaluate the benefits of cloud computing and its integration with our existing NASA ground data system software called Apache Object Oriented Data Technology (OODT). The preliminary conclusions of our study suggest that the use of the cloud and OODT together synergistically form an effective, efficient and extensible combination that could meet the challenges of NASA science missions requiring DESDynI-like data collection and processing volumes at reduced costs.
WLCG Transfers Dashboard: a Unified Monitoring Tool for Heterogeneous Data Transfers
NASA Astrophysics Data System (ADS)
Andreeva, J.; Beche, A.; Belov, S.; Kadochnikov, I.; Saiz, P.; Tuckett, D.
2014-06-01
The Worldwide LHC Computing Grid provides resources for the four main virtual organizations. Along with data processing, data distribution is the key computing activity on the WLCG infrastructure. The scale of this activity is very large: the ATLAS virtual organization (VO) alone generates and distributes more than 40 PB of data in 100 million files per year. Another challenge is the heterogeneity of data transfer technologies. Currently there are two main alternatives for data transfers on the WLCG: the File Transfer Service and the XRootD protocol. Each LHC VO has its own monitoring system which is limited to the scope of that particular VO. There is a need for a global system which would provide a complete cross-VO and cross-technology picture of all WLCG data transfers. We present a unified monitoring tool - WLCG Transfers Dashboard - where all the VOs and technologies coexist and are monitored together. The scale of the activity and the heterogeneity of the system raise a number of technical challenges. Each technology comes with its own monitoring specificities and some of the VOs use several of these technologies. This paper describes the implementation of the system with particular focus on the design principles applied to ensure the necessary scalability and performance, and to easily integrate any new technology providing additional functionality which might be specific to that technology.
Parallelization of NAS Benchmarks for Shared Memory Multiprocessors
NASA Technical Reports Server (NTRS)
Waheed, Abdul; Yan, Jerry C.; Saini, Subhash (Technical Monitor)
1998-01-01
This paper presents our experiences of parallelizing the sequential implementation of NAS benchmarks using compiler directives on SGI Origin2000 distributed shared memory (DSM) system. Porting existing applications to new high performance parallel and distributed computing platforms is a challenging task. Ideally, a user develops a sequential version of the application, leaving the task of porting to new generations of high performance computing systems to parallelization tools and compilers. Due to the simplicity of programming shared-memory multiprocessors, compiler developers have provided various facilities to allow the users to exploit parallelism. Native compilers on SGI Origin2000 support multiprocessing directives to allow users to exploit loop-level parallelism in their programs. Additionally, supporting tools can accomplish this process automatically and present the results of parallelization to the users. We experimented with these compiler directives and supporting tools by parallelizing sequential implementation of NAS benchmarks. Results reported in this paper indicate that with minimal effort, the performance gain is comparable with the hand-parallelized, carefully optimized, message-passing implementations of the same benchmarks.
Large Scale Analysis of Geospatial Data with Dask and XArray
NASA Astrophysics Data System (ADS)
Zender, C. S.; Hamman, J.; Abernathey, R.; Evans, K. J.; Rocklin, M.; Zender, C. S.; Rocklin, M.
2017-12-01
The analysis of geospatial data with high level languages has accelerated innovation and the impact of existing data resources. However, as datasets grow beyond single-machine memory, data structures within these high level languages can become a bottleneck. New libraries like Dask and XArray resolve some of these scalability issues, providing interactive workflows that are both familiar to high-level-language researchers while also scaling out to much larger datasets. This broadens the access of researchers to larger datasets on high performance computers and, through interactive development, reduces time-to-insight when compared to traditional parallel programming techniques (MPI). This talk describes Dask, a distributed dynamic task scheduler, Dask.array, a multi-dimensional array that copies the popular NumPy interface, and XArray, a library that wraps NumPy/Dask.array with labeled and indexed axes, implementing the CF conventions. We discuss both the basic design of these libraries and how they change interactive analysis of geospatial data, and also recent benefits and challenges of distributed computing on clusters of machines.
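A minimal sketch of the workflow this abstract describes, using the public Dask and XArray APIs; the file pattern, variable name and chunk sizes below are assumptions rather than values from the talk.

# Dask.array mirrors the NumPy interface while operating lazily, chunk by chunk.
import dask.array as da
import xarray as xr

x = da.random.random((100_000, 100_000), chunks=(5_000, 5_000))
print(x.mean().compute())           # executed in parallel over the chunks

# XArray wraps Dask arrays with labelled dimensions and CF-style metadata.
ds = xr.open_mfdataset("temperature_*.nc", chunks={"time": 365})
climatology = ds["t2m"].groupby("time.month").mean("time")
climatology.compute()               # triggers the distributed computation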
A Grid Metadata Service for Earth and Environmental Sciences
NASA Astrophysics Data System (ADS)
Fiore, Sandro; Negro, Alessandro; Aloisio, Giovanni
2010-05-01
Critical challenges for climate modeling researchers are strongly connected with the increasingly complex simulation models and the huge quantities of produced datasets. Future trends in climate modeling will only increase computational and storage requirements. For this reason the ability to transparently access both computational and data resources for large-scale complex climate simulations must be considered a key requirement for Earth Science and Environmental distributed systems. From the data management perspective, (i) the quantity of data will continuously increase, (ii) data will become more and more distributed and widespread, (iii) data sharing/federation will represent a key challenging issue among different sites distributed worldwide, and (iv) the potential community of users (large and heterogeneous) will be interested in discovering experimental results, searching metadata, browsing collections of files, comparing different results, displaying output, etc. A key element for carrying out data search and discovery and for managing and accessing huge, distributed amounts of data is the metadata handling framework. What we propose for the management of distributed datasets is the GRelC service (a data grid solution focusing on metadata management). Unlike classical approaches, the proposed data-grid solution is able to address scalability, transparency, security, efficiency and interoperability. The GRelC service we propose is able to provide access to metadata stored in different and widespread data sources (relational databases running on top of MySQL, Oracle, DB2, etc., leveraging SQL as the query language, as well as XML databases - XIndice, eXist, and libxml2-based documents, adopting either XPath or XQuery), providing a strong data virtualization layer in a grid environment. Such a technological solution for distributed metadata management (i) leverages well-known, widely adopted standards (W3C, OASIS, etc.); (ii) supports role-based management (based on VOMS), which increases flexibility and scalability; (iii) provides full support for the Grid Security Infrastructure, which means authorization, mutual authentication, data integrity, data confidentiality and delegation; (iv) is compatible with existing grid middleware such as gLite and Globus; and finally (v) is currently adopted at the Euro-Mediterranean Centre for Climate Change (CMCC - Italy) to manage the entire CMCC data production activity as well as in the international Climate-G testbed.
Classification of kidney and liver tissue using ultrasound backscatter data
NASA Astrophysics Data System (ADS)
Aalamifar, Fereshteh; Rivaz, Hassan; Cerrolaza, Juan J.; Jago, James; Safdar, Nabile; Boctor, Emad M.; Linguraru, Marius G.
2015-03-01
Ultrasound (US) tissue characterization provides valuable information for the initialization of automatic segmentation algorithms, and can further provide complementary information for diagnosis of pathologies. US tissue characterization is challenging due to the presence of various types of image artifacts and dependence on the sonographer's skills. One way of overcoming this challenge is by characterizing images based on the distribution of the backscatter data derived from the interaction between US waves and tissue. The goal of this work is to classify liver versus kidney tissue in 3D volumetric US data using the distribution of backscatter US data recovered from the end-user-displayed B-mode images available in clinical systems. To this end, we first propose the computation of a large set of features based on the homodyned-K distribution of the speckle as well as the correlation coefficients between small patches in 3D images. We then utilize the random forests framework to select the most important features for classification. Experiments on in-vivo 3D US data from nine pediatric patients with hydronephrosis showed an average accuracy of 94% for the classification of liver and kidney tissues, showing the good potential of this work to assist in the classification and segmentation of abdominal soft tissue.
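An illustrative sketch of the random-forest step described above, with feature ranking via importances. The feature matrix here is synthetic; in the paper the features come from homodyned-K fits and patch correlation coefficients, and the labels from the clinical data.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(180, 40))          # 180 volumes x 40 speckle features (synthetic)
y = rng.integers(0, 2, size=180)        # 0 = kidney, 1 = liver (synthetic labels)

clf = RandomForestClassifier(n_estimators=500, random_state=0)
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())

# Rank features by importance to keep only the most informative ones.
clf.fit(X, y)
top = np.argsort(clf.feature_importances_)[::-1][:10]
print("Top feature indices:", top)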
Streamlined Genome Sequence Compression using Distributed Source Coding
Wang, Shuang; Jiang, Xiaoqian; Chen, Feng; Cui, Lijuan; Cheng, Samuel
2014-01-01
We aim at developing a streamlined genome sequence compression algorithm to support alternative miniaturized sequencing devices, which have limited communication, storage, and computation power. Existing techniques that require a heavy client (encoder side) cannot be applied. To tackle this challenge, we carefully examined distributed source coding theory and developed a customized reference-based genome compression protocol to meet the low-complexity need at the client side. Based on the variation between source and reference, our protocol adaptively picks either syndrome coding or hash coding to compress subsequences of changing code length. Our experimental results showed promising performance of the proposed method compared with the state-of-the-art algorithm (GRS). PMID:25520552
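A toy sketch of the hash-coding side of such a reference-based scheme only (syndrome coding is omitted): the low-power client sends short hashes of fixed-length blocks, and only blocks whose hashes differ from the reference are transmitted verbatim. The block size, hash length and function names are arbitrary choices for illustration, not the paper's protocol.

import hashlib

def block_hashes(seq, k=64):
    # 4-byte tags per k-length block; collisions are ignored in this toy.
    return [hashlib.sha1(seq[i:i + k].encode()).digest()[:4]
            for i in range(0, len(seq), k)]

def encode(source, reference, k=64):
    src, ref = block_hashes(source, k), block_hashes(reference, k)
    # Transmit only the indices and contents of blocks that differ.
    return [(i, source[i * k:(i + 1) * k])
            for i, (s, r) in enumerate(zip(src, ref)) if s != r]

reference = "ACGT" * 64
source = reference[:100] + "TTTT" + reference[104:]   # a small variation
print(encode(source, reference))                      # only the changed block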
3D reconstruction from non-uniform point clouds via local hierarchical clustering
NASA Astrophysics Data System (ADS)
Yang, Jiaqi; Li, Ruibo; Xiao, Yang; Cao, Zhiguo
2017-07-01
Raw scanned 3D point clouds are usually irregularly distributed due to the inherent shortcomings of laser sensors, which poses a great challenge for high-quality 3D surface reconstruction. This paper tackles this problem by proposing a local hierarchical clustering (LHC) method to improve the consistency of point distribution. Specifically, LHC consists of two steps: 1) adaptive octree-based decomposition of 3D space, and 2) hierarchical clustering. The former aims at reducing the computational complexity and the latter transforms the non-uniform point set into a uniform one. Experimental results on real-world scanned point clouds validate the effectiveness of our method from both qualitative and quantitative aspects.
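A simplified sketch of the idea: bucket points into coarse cells (a stand-in for the paper's adaptive octree), cluster hierarchically inside each cell, and replace each cluster by its centroid to even out point density. The cell size, distance threshold and function name are arbitrary parameters for the sketch, not values from the paper.

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def uniformize(points, cell=0.5, merge_dist=0.05):
    keys = np.floor(points / cell).astype(int)
    out = []
    for key in np.unique(keys, axis=0):
        cell_pts = points[np.all(keys == key, axis=1)]
        if len(cell_pts) == 1:
            out.append(cell_pts)
            continue
        # Agglomerative clustering within the cell, cut at a distance threshold.
        labels = fcluster(linkage(cell_pts, method="ward"),
                          t=merge_dist, criterion="distance")
        out.append(np.array([cell_pts[labels == l].mean(axis=0)
                             for l in np.unique(labels)]))
    return np.vstack(out)

cloud = np.random.default_rng(1).random((2000, 3))
print(uniformize(cloud).shape)      # fewer, more evenly distributed points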
A Blueprint for Demonstrating Quantum Supremacy with Superconducting Qubits
NASA Technical Reports Server (NTRS)
Kechedzhi, Kostyantyn
2018-01-01
Long coherence times and high-fidelity control recently achieved in scalable superconducting circuits have paved the way for a growing number of experimental studies of many-qubit quantum coherent phenomena in these devices. Although full implementation of quantum error correction and fault-tolerant quantum computation remains a challenge, near-term pre-error-correction devices could allow new fundamental experiments despite the inevitable accumulation of errors. One such open question, foundational for quantum computing, is achieving so-called quantum supremacy, an experimental demonstration of a computational task that takes polynomial time on the quantum computer whereas the best classical algorithm would require exponential time and/or resources. It is possible to formulate such a task for a quantum computer consisting of fewer than 100 qubits. The computational task we consider is to provide approximate samples from a non-trivial quantum distribution. This is a generalization, to the case of superconducting circuits, of the ideas behind the boson sampling protocol for quantum optics introduced by Arkhipov and Aaronson. In this presentation we discuss a proof-of-principle demonstration of such a sampling task on a 9-qubit chain of superconducting gmon qubits developed by Google. We discuss theoretical analysis of the driven evolution of the device resulting in output approximating samples from a uniform distribution in the Hilbert space, a quantum chaotic state. We analyze quantum chaotic characteristics of the output of the circuit and the time required to generate a sufficiently complex quantum distribution. We demonstrate that the classical simulation of the sampling output requires exponential resources by connecting the task of calculating the output amplitudes to the sign problem of the Quantum Monte Carlo method. We also discuss the detailed theoretical modeling required to achieve high-fidelity control and calibration of the multi-qubit unitary evolution in the device. We use a novel cross-entropy statistical metric as a figure of merit to verify the output and calibrate the device controls. Finally, we demonstrate the statistics of the wave function amplitudes generated on the 9-gmon chain and verify the quantum chaotic nature of the generated quantum distribution. This verifies the implementation of the quantum supremacy protocol.
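To make the cross-entropy idea concrete, the sketch below evaluates one common textbook form of the cross-entropy fidelity estimate for a sampling experiment with a Porter-Thomas (chaotic) output distribution: it is close to 1 for ideal samples and close to 0 for uniformly random samples. This is a generic illustration with synthetic numbers, not necessarily the exact metric used in the talk.

import numpy as np

def cross_entropy_fidelity(p_ideal, samples):
    # log(N) + Euler gamma + average log-probability of the observed bitstrings.
    n_states = len(p_ideal)
    return np.log(n_states) + np.euler_gamma + np.mean(np.log(p_ideal[samples]))

rng = np.random.default_rng(0)
n_qubits = 9
p = rng.exponential(size=2**n_qubits)          # Porter-Thomas-like probabilities
p /= p.sum()

good = rng.choice(2**n_qubits, size=50_000, p=p)      # "ideal" sampler
noise = rng.integers(0, 2**n_qubits, size=50_000)     # fully depolarized sampler
print(cross_entropy_fidelity(p, good))    # approximately 1
print(cross_entropy_fidelity(p, noise))   # approximately 0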
PanDA for ATLAS distributed computing in the next decade
NASA Astrophysics Data System (ADS)
Barreiro Megino, F. H.; De, K.; Klimentov, A.; Maeno, T.; Nilsson, P.; Oleynik, D.; Padolski, S.; Panitkin, S.; Wenaus, T.; ATLAS Collaboration
2017-10-01
The Production and Distributed Analysis (PanDA) system has been developed to meet ATLAS production and analysis requirements for a data-driven workload management system capable of operating at the Large Hadron Collider (LHC) data processing scale. Heterogeneous resources used by the ATLAS experiment are distributed worldwide at hundreds of sites, thousands of physicists analyse the data remotely, the volume of processed data is beyond the exabyte scale, dozens of scientific applications are supported, while data processing requires more than a few billion hours of computing usage per year. PanDA performed very well over the last decade including the LHC Run 1 data taking period. However, it was decided to upgrade the whole system concurrently with the LHC’s first long shutdown in order to cope with rapidly changing computing infrastructure. After two years of reengineering efforts, PanDA has embedded capabilities for fully dynamic and flexible workload management. The static batch job paradigm was discarded in favor of a more automated and scalable model. Workloads are dynamically tailored for optimal usage of resources, with the brokerage taking network traffic and forecasts into account. Computing resources are partitioned based on dynamic knowledge of their status and characteristics. The pilot has been re-factored around a plugin structure for easier development and deployment. Bookkeeping is handled with both coarse and fine granularities for efficient utilization of pledged or opportunistic resources. An in-house security mechanism authenticates the pilot and data management services in off-grid environments such as volunteer computing and private local clusters. The PanDA monitor has been extensively optimized for performance and extended with analytics to provide aggregated summaries of the system as well as drill-down to operational details. There are also many other new features, planned or recently implemented, as well as adoption by non-LHC experiments, such as bioinformatics groups successfully running the Paleomix (microbial genome and metagenomes) payload on supercomputers. In this paper we will focus on the new and planned features that are most important to the next decade of distributed computing workload management.
Back to the future: virtualization of the computing environment at the W. M. Keck Observatory
NASA Astrophysics Data System (ADS)
McCann, Kevin L.; Birch, Denny A.; Holt, Jennifer M.; Randolph, William B.; Ward, Josephine A.
2014-07-01
Over its two decades of science operations, the W.M. Keck Observatory computing environment has evolved to contain a distributed hybrid mix of hundreds of servers, desktops and laptops of multiple different hardware platforms, O/S versions and vintages. Supporting the growing computing capabilities to meet the observatory's diverse, evolving computing demands within fixed budget constraints presents many challenges. This paper describes the significant role that virtualization is playing in addressing these challenges while improving the level and quality of service as well as realizing significant savings across many cost areas. Starting in December 2012, the observatory embarked on an ambitious plan to incrementally test and deploy a migration to virtualized platforms to address a broad range of specific opportunities. Implementation to date has been surprisingly glitch-free, progressing well and yielding tangible benefits much faster than many expected. We describe here the general approach, starting with the initial identification of some low hanging fruit, which also provided an opportunity to gain experience and build confidence among both the implementation team and the user community. We describe the range of challenges, opportunities and cost savings potential. Very significant among these was the substantial power savings which resulted in strong broad support for moving forward. We go on to describe the phasing plan, the evolving scalable architecture, some of the specific technical choices, as well as some of the individual technical issues encountered along the way. The phased implementation spans Windows and Unix servers for scientific, engineering and business operations, virtualized desktops for typical office users as well as the more demanding graphics-intensive CAD users. Other areas discussed in this paper include staff training, load balancing, redundancy, scalability, remote access, disaster readiness and recovery.
Effects of a chirped bias voltage on ion energy distributions in inductively coupled plasma reactors
NASA Astrophysics Data System (ADS)
Lanham, Steven J.; Kushner, Mark J.
2017-08-01
The metrics for controlling reactive fluxes to wafers for microelectronics processing are becoming more stringent as feature sizes continue to shrink. Recent strategies for controlling ion energy distributions to the wafer involve using several different frequencies and/or pulsed powers. Although effective, these strategies are often costly or present challenges in impedance matching. With the advent of matching schemes for wide band amplifiers, other strategies to customize ion energy distributions become available. In this paper, we discuss results from a computational investigation of biasing substrates using chirped frequencies in high density, electronegative inductively coupled plasmas. Depending on the frequency range and chirp duration, the resulting ion energy distributions exhibit components sampled from the entire frequency range. However, the chirping process also produces transient shifts in the self-generated dc bias due to the reapportionment of displacement and conduction with frequency to balance the current in the system. The dynamics of the dc bias can also be leveraged towards customizing ion energy distributions.
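For concreteness, the short sketch below generates the kind of chirped bias waveform discussed above with scipy.signal.chirp. The frequency range, sweep duration, sample rate and amplitude are arbitrary choices for illustration, not the values used in the paper.

import numpy as np
from scipy.signal import chirp

fs = 200e6                         # sample rate (Hz), an assumed value
t = np.arange(0, 100e-6, 1 / fs)   # 100 microsecond sweep window
# Linear sweep of the bias frequency from 1 MHz to 10 MHz over the interval.
v_bias = 300.0 * chirp(t, f0=1e6, t1=t[-1], f1=10e6, method="linear")
print(v_bias[:5])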
Theoretical Framework for Integrating Distributed Energy Resources into Distribution Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lian, Jianming; Wu, Di; Kalsi, Karanjit
This paper focuses on developing a novel theoretical framework for effective coordination and control of a large number of distributed energy resources in distribution systems in order to more reliably manage the future U.S. electric power grid under the high penetration of renewable generation. The proposed framework provides a systematic view of the overall structure of the future distribution systems along with the underlying information flow, functional organization, and operational procedures. It is characterized by the features of being open, flexible and interoperable with the potential to support dynamic system configuration. Under the proposed framework, the energy consumption of various DERs is coordinated and controlled in a hierarchical way by using market-based approaches. The real-time voltage control is simultaneously considered to complement the real power control in order to keep nodal voltages stable within acceptable ranges during real time. In addition, computational challenges associated with the proposed framework are also discussed with recommended practices.
NASA Astrophysics Data System (ADS)
Hassan, A. H.; Fluke, C. J.; Barnes, D. G.
2012-09-01
Upcoming and future astronomy research facilities will systematically generate terabyte-sized data sets, moving astronomy into the Petascale data era. While such facilities will provide astronomers with unprecedented levels of accuracy and coverage, the increases in dataset size and dimensionality will pose serious computational challenges for many current astronomy data analysis and visualization tools. With such data sizes, even simple data analysis tasks (e.g. calculating a histogram or computing data minimum/maximum) may not be achievable without access to a supercomputing facility. To effectively handle such dataset sizes, which exceed today's single machine memory and processing limits, we present a framework that exploits the distributed power of GPUs and many-core CPUs, with a goal of providing data analysis and visualization tasks as a service for astronomers. By mixing shared and distributed memory architectures, our framework effectively utilizes the underlying hardware infrastructure handling both batched and real-time data analysis and visualization tasks. Offering such functionality as a service in a “software as a service” manner will reduce the total cost of ownership, provide an easy-to-use tool to the wider astronomical community, and enable a more optimized utilization of the underlying hardware infrastructure.
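As a toy illustration of the batched reductions mentioned above (min/max and a histogram over data larger than memory), the sketch below processes a memory-mapped file chunk by chunk with NumPy. It is a single-machine stand-in for the distributed GPU/CPU service described in the abstract, with all sizes and bin choices made up for the example.

import numpy as np

# Create a small example file standing in for a terabyte-scale data cube.
data = np.memmap("cube.dat", dtype=np.float32, mode="w+", shape=(10_000_000,))
data[:] = np.random.default_rng(0).normal(size=data.shape).astype(np.float32)
data.flush()

cube = np.memmap("cube.dat", dtype=np.float32, mode="r")
chunk = 1_000_000
vmin, vmax = np.inf, -np.inf
hist = np.zeros(64, dtype=np.int64)
edges = np.linspace(-5, 5, 65)
for start in range(0, cube.size, chunk):
    block = cube[start:start + chunk]
    vmin, vmax = min(vmin, block.min()), max(vmax, block.max())
    hist += np.histogram(block, bins=edges)[0]
print(vmin, vmax, hist.sum())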
NASA Astrophysics Data System (ADS)
Zhu, F.; Yu, H.; Rilee, M. L.; Kuo, K. S.; Yu, L.; Pan, Y.; Jiang, H.
2017-12-01
Since the establishment of data archive centers and the standardization of file formats, scientists have been required to search metadata catalogs for the data they need and download the data files to their local machines to carry out data analysis. This approach has facilitated data discovery and access for decades, but it inevitably leads to data transfer from data archive centers to scientists' computers through low-bandwidth Internet connections. Data transfer becomes a major performance bottleneck in such an approach. Combined with generally constrained local compute/storage resources, these factors limit the extent of scientists' studies and deprive them of timely outcomes. Thus, this conventional approach is not scalable with respect to both the volume and variety of geoscience data. A much more viable solution is to couple analysis and storage systems to minimize data transfer. In our study, we compare loosely coupled approaches (exemplified by Spark and Hadoop) and tightly coupled approaches (exemplified by parallel distributed database management systems, e.g., SciDB). In particular, we investigate the optimization of data placement and movement to effectively tackle the variety challenge, and boost the popularization of parallelization to address the volume challenge. Our goal is to enable high-performance interactive analysis for a good portion of geoscience data analysis exercises. We show that tightly coupled approaches can concentrate data traffic between local storage systems and compute units, thereby optimizing bandwidth utilization to achieve better throughput. Based on our observations, we develop a geoscience data analysis system that tightly couples analysis engines with storage systems and has direct access to the detailed map of data partition locations. Through an innovative data partitioning and distribution scheme, our system has demonstrated scalable and interactive performance in real-world geoscience data analysis applications.
CMS results in the Combined Computing Readiness Challenge CCRC'08
NASA Astrophysics Data System (ADS)
Bonacorsi, D.; Bauerdick, L.; CMS Collaboration
2009-12-01
During February and May 2008, CMS participated in the Combined Computing Readiness Challenge (CCRC'08) together with all other LHC experiments. The purpose of this worldwide exercise was to check the readiness of the Computing infrastructure for LHC data taking. Another set of major CMS tests called Computing, Software and Analysis challenge (CSA'08) - as well as CMS cosmic runs - were also running at the same time: CCRC augmented the load on computing with additional tests to validate and stress-test all CMS computing workflows at full data taking scale, also extending this to the global WLCG community. CMS exercised most aspects of the CMS computing model, with very comprehensive tests. During May 2008, CMS moved more than 3.6 Petabytes among more than 300 links in the complex Grid topology. CMS demonstrated that it is able to safely move data out of CERN to the Tier-1 sites, sustaining more than 600 MB/s as a daily average for more than seven days in a row, with enough headroom and with hourly peaks of up to 1.7 GB/s. CMS ran hundreds of simultaneous jobs at each Tier-1 site, re-reconstructing and skimming hundreds of millions of events. After re-reconstruction the fresh AOD (Analysis Object Data) has to be synchronized between Tier-1 centers: CMS demonstrated that the required inter-Tier-1 transfers are achievable within a few days. CMS also showed that skimmed analysis data sets can be transferred to Tier-2 sites for analysis at sufficient rate, regionally as well as inter-regionally, achieving all goals in about 90% of >200 links. Simultaneously, CMS also ran a large Tier-2 analysis exercise, where realistic analysis jobs were submitted to a large set of Tier-2 sites by a large number of people to produce a chaotic workload across the systems, and with more than 400 analysis users in May. Taken all together, CMS routinely achieved submissions of 100k jobs/day, with peaks up to 200k jobs/day. The achieved results in CCRC'08 - focussing on the distributed workflows - are presented and discussed.
Assessment of Computer Literacy of Nurses in Lesotho.
Mugomeri, Eltony; Chatanga, Peter; Maibvise, Charles; Masitha, Matseliso
2016-11-01
Health systems worldwide are moving toward use of information technology to improve healthcare delivery. However, this requires basic computer skills. This study assessed the computer literacy of nurses in Lesotho using a cross-sectional quantitative approach. A structured questionnaire with 32 standardized computer skills was distributed to 290 randomly selected nurses in Maseru District. Univariate and multivariate logistic regression analyses in Stata 13 were performed to identify factors associated with having inadequate computer skills. Overall, 177 (61%) nurses scored below 16 of the 32 skills assessed. Finding hyperlinks on Web pages (63%), use of advanced search parameters (60.2%), and downloading new software (60.1%) proved to be challenging to the highest proportions of nurses. Age, sex, year of obtaining latest qualification, computer experience, and work experience were significantly (P < .05) associated with inadequate computer skills in univariate analysis. However, in multivariate analyses, sex (P = .001), year of obtaining latest qualification (P = .011), and computer experience (P < .001) emerged as significant factors. The majority of nurses in Lesotho have inadequate computer skills, and this is significantly associated with having many years since obtaining their latest qualification, being female, and lack of exposure to computers. These factors should be considered during planning of training curriculum for nurses in Lesotho.
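The study's factor analysis was done with logistic regression in Stata 13; the hedged Python sketch below shows the same kind of multivariate logistic model with statsmodels, using a made-up data frame whose column names are illustrative only.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "inadequate_skills": rng.integers(0, 2, 290),          # synthetic outcome
    "female": rng.integers(0, 2, 290),
    "years_since_qualification": rng.integers(0, 30, 290),
    "computer_experience": rng.integers(0, 2, 290),
})

X = sm.add_constant(df[["female", "years_since_qualification", "computer_experience"]])
model = sm.Logit(df["inadequate_skills"], X).fit(disp=False)
print(model.summary())                 # odds ratios via np.exp(model.params)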
Human white matter and knowledge representation
Pestilli, Franco
2018-04-01
Understanding how knowledge is represented in the human brain is a fundamental challenge in neuroscience. To date, most of the work on this topic has focused on knowledge representation in cortical areas and debated whether knowledge is represented in a distributed or localized fashion. Fang and colleagues provide evidence that brain connections and the white matter supporting such connections might play a significant role. The work opens new avenues of investigation, breaking through disciplinary boundaries across network neuroscience, computational neuroscience, cognitive science, and classical lesion studies. PMID:29698391
2013-01-01
Reference-list excerpt (no abstract provided): Predicting the onset of atrial fibrillation: the Computers in Cardiology Challenge 2001. Comput Cardiol 2001;28:113-6. [22] Moody GB, Mark RG, Goldberger AL... Computers in Cardiology Challenge 2006: QT interval measurement. Comput Cardiol 2006;33:313-6. [18] Moody GB. Spontaneous termination of atrial fibrillation: a challenge from PhysioNet and Computers in Cardiology 2004. Comput Cardiol 2004;31:101-4. [19] Moody GB, Jager F. Distinguishing ischemic from non
WLCG Monitoring Consolidation and further evolution
NASA Astrophysics Data System (ADS)
Saiz, P.; Aimar, A.; Andreeva, J.; Babik, M.; Cons, L.; Dzhunov, I.; Forti, A.; di Girolamo, A.; Karavakis, E.; Litmaath, M.; Magini, N.; Magnoni, L.; de los Rios, H. Martin; Roiser, S.; Sciaba, A.; Schulz, M.; Tarragon, J.; Tuckett, D.
2015-12-01
The WLCG monitoring system solves a challenging task of keeping track of the LHC computing activities on the WLCG infrastructure, ensuring health and performance of the distributed services at more than 170 sites. The challenge consists of decreasing the effort needed to operate the monitoring service and to satisfy the constantly growing requirements for its scalability and performance. This contribution describes the recent consolidation work aimed to reduce the complexity of the system, and to ensure more effective operations, support and service management. This was done by unifying where possible the implementation of the monitoring components. The contribution also covers further steps like the evaluation of the new technologies for data storage, processing and visualization and migration to a new technology stack.
Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy
NASA Astrophysics Data System (ADS)
Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli
2014-03-01
One of the key challenges in three-dimensional (3D) medical imaging is to enable fast turn-around times, which are often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called the HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of layered parallel software libraries that allow image processing applications to share common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on Amdahl's law for symmetric multicore chips showed the potential for high performance scalability of the HPC 3D-MIP platform when a larger number of cores is available.
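A scalability estimate in the spirit of the paper's analysis can be written in a few lines. The classic form of Amdahl's law is shown below; the 95% parallel fraction is an assumed value for illustration, not the one measured for the HPC 3D-MIP platform.

def amdahl_speedup(parallel_fraction, n_cores):
    # Classic Amdahl's law: serial part runs at 1x, parallel part scales with cores.
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_cores)

for cores in (4, 12, 48, 256):
    print(cores, round(amdahl_speedup(0.95, cores), 2))
# With a 95% parallel fraction the speedup saturates near 20x; a measured
# 12-fold gain on 12 cores therefore implies a parallel fraction close to 1.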
HTSFinder: Powerful Pipeline of DNA Signature Discovery by Parallel and Distributed Computing
Karimi, Ramin; Hajdu, Andras
2016-01-01
Comprehensive effort for low-cost sequencing in the past few years has led to the growth of complete genome databases. In parallel with this effort, driven by a strong need, fast and cost-effective methods and applications have been developed to accelerate sequence analysis. Identification is the very first step of this task. Due to the difficulties, high costs, and computational challenges of alignment-based approaches, an alternative universal identification method is highly desirable. As an alignment-free approach, DNA signatures have provided new opportunities for the rapid identification of species. In this paper, we present an effective pipeline HTSFinder (high-throughput signature finder) with a corresponding k-mer generator GkmerG (genome k-mers generator). Using this pipeline, we determine the frequency of k-mers from the available complete genome databases for the detection of extensive DNA signatures in a reasonably short time. Our application can detect both unique and common signatures in the arbitrarily selected target and nontarget databases. Hadoop and MapReduce, as parallel and distributed computing tools running on commodity hardware, are used in this pipeline. This approach brings the power of high-performance computing to ordinary desktop personal computers for discovering DNA signatures in large databases such as bacterial genomes. The considerable number of detected unique and common DNA signatures of the target database brings opportunities to improve the identification process not only for polymerase chain reaction and microarray assays but also for more complex scenarios such as metagenomics and next-generation sequencing analysis. PMID:26884678
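A hedged sketch of k-mer counting in the Hadoop Streaming style on which MapReduce pipelines like this one are commonly built (this is not the HTSFinder code; the k-mer length and file names are assumptions).

# mapper.py -- emits one line "<kmer>\t1" per k-mer read from stdin sequences.
import sys

K = 18  # example k-mer length; HTSFinder's actual parameters may differ
for line in sys.stdin:
    seq = line.strip().upper()
    if not seq or seq.startswith(">"):      # skip FASTA headers
        continue
    for i in range(len(seq) - K + 1):
        print(f"{seq[i:i + K]}\t1")

# reducer.py -- sums the counts for each k-mer (input arrives sorted by key):
#   import sys
#   current, count = None, 0
#   for line in sys.stdin:
#       kmer, n = line.rsplit("\t", 1)
#       if kmer != current:
#           if current is not None:
#               print(f"{current}\t{count}")
#           current, count = kmer, 0
#       count += int(n)
#   if current is not None:
#       print(f"{current}\t{count}")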
BigDebug: Debugging Primitives for Interactive Big Data Processing in Spark
Gulzar, Muhammad Ali; Interlandi, Matteo; Yoo, Seunghyun; Tetali, Sai Deep; Condie, Tyson; Millstein, Todd; Kim, Miryung
2016-01-01
Developers use cloud computing platforms to process a large quantity of data in parallel when developing big data analytics. Debugging the massive parallel computations that run in today’s data-centers is time consuming and error-prone. To address this challenge, we design a set of interactive, real-time debugging primitives for big data processing in Apache Spark, the next generation data-intensive scalable cloud computing platform. This requires re-thinking the notion of step-through debugging in a traditional debugger such as gdb, because pausing the entire computation across distributed worker nodes causes significant delay and naively inspecting millions of records using a watchpoint is too time consuming for an end user. First, BIGDEBUG’s simulated breakpoints and on-demand watchpoints allow users to selectively examine distributed, intermediate data on the cloud with little overhead. Second, a user can also pinpoint a crash-inducing record and selectively resume relevant sub-computations after a quick fix. Third, a user can determine the root causes of errors (or delays) at the level of individual records through a fine-grained data provenance capability. Our evaluation shows that BIGDEBUG scales to terabytes and its record-level tracing incurs less than 25% overhead on average. It determines crash culprits orders of magnitude more accurately and provides up to 100% time saving compared to the baseline replay debugger. The results show that BIGDEBUG supports debugging at interactive speeds with minimal performance impact. PMID:27390389
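To make the record-level debugging idea concrete, the plain-PySpark sketch below isolates crash-inducing records by tagging them during the transformation, which is the effect that BIGDEBUG's crash culprit determination automates inside Spark itself. All names here are illustrative; this is not the BIGDEBUG API.

from pyspark import SparkContext

sc = SparkContext("local[*]", "culprit-isolation-sketch")

def parse(record):
    try:
        return [("ok", int(record))]
    except ValueError:
        return [("bad", record)]          # keep the crash-inducing record

tagged = sc.parallelize(["1", "2", "oops", "4"]).flatMap(parse).cache()
culprits = tagged.filter(lambda kv: kv[0] == "bad").map(lambda kv: kv[1])
values = tagged.filter(lambda kv: kv[0] == "ok").map(lambda kv: kv[1])

print("crash culprits:", culprits.collect())   # ['oops']
print("sum of good records:", values.sum())    # 7
sc.stop()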
Camerlengo, Terry; Ozer, Hatice Gulcin; Onti-Srinivasan, Raghuram; Yan, Pearlly; Huang, Tim; Parvin, Jeffrey; Huang, Kun
2012-01-01
Next Generation Sequencing is highly resource-intensive. NGS tasks related to data processing, management and analysis require high-end computing servers or even clusters. Additionally, processing NGS experiments requires suitable storage space and significant manual interaction. At The Ohio State University's Biomedical Informatics Shared Resource, we designed and implemented a scalable architecture to address the challenges associated with the resource-intensive nature of NGS secondary analysis, built around Illumina Genome Analyzer II sequencers and Illumina's Gerald data processing pipeline. The software infrastructure includes a distributed computing platform consisting of a LIMS called QUEST (http://bisr.osumc.edu), an Automation Server, a computer cluster for processing NGS pipelines, and a network-attached storage device expandable up to 40TB. The system has been architected to scale to multiple sequencers without requiring additional computing or labor resources. This platform demonstrates how to manage and automate NGS experiments in an institutional or core facility setting.
A new tool called DISSECT for analysing large genomic data sets using a Big Data approach
Canela-Xandri, Oriol; Law, Andy; Gray, Alan; Woolliams, John A.; Tenesa, Albert
2015-01-01
Large-scale genetic and genomic data are increasingly available and the major bottleneck in their analysis is a lack of sufficiently scalable computational tools. To address this problem in the context of complex traits analysis, we present DISSECT. DISSECT is a new and freely available software that is able to exploit the distributed-memory parallel computational architectures of compute clusters, to perform a wide range of genomic and epidemiologic analyses, which currently can only be carried out on reduced sample sizes or under restricted conditions. We demonstrate the usefulness of our new tool by addressing the challenge of predicting phenotypes from genotype data in human populations using mixed-linear model analysis. We analyse simulated traits from 470,000 individuals genotyped for 590,004 SNPs in ∼4 h using the combined computational power of 8,400 processor cores. We find that prediction accuracies in excess of 80% of the theoretical maximum could be achieved with large sample sizes. PMID:26657010
Parallel Rendering of Large Time-Varying Volume Data
NASA Technical Reports Server (NTRS)
Garbutt, Alexander E.
2005-01-01
Interactive visualization of large time-varying 3D volume datasets has been and still is a great challenge to the modern computational world. It stretches the limits of the memory capacity, the disk space, the network bandwidth and the CPU speed of a conventional computer. In this SURF project, we propose to develop a parallel volume rendering program on SGI's Prism, a cluster computer equipped with state-of-the-art graphic hardware. The proposed program combines both parallel computing and hardware rendering in order to achieve an interactive rendering rate. We use 3D texture mapping and a hardware shader to implement 3D volume rendering on each workstation. We use SGI's VisServer to enable remote rendering using Prism's graphic hardware. Lastly, we will integrate this new program with ParVox, a parallel distributed visualization system developed at JPL. At the end of the project, we will demonstrate remote interactive visualization using this new hardware volume renderer on JPL's Prism System using a time-varying dataset from selected JPL applications.
NASA Astrophysics Data System (ADS)
Nguyen, L.; Chee, T.; Minnis, P.; Palikonda, R.; Smith, W. L., Jr.; Spangenberg, D.
2016-12-01
The NASA LaRC Satellite ClOud and Radiative Property retrieval System (SatCORPS) processes and derives near real-time (NRT) global cloud products from operational geostationary satellite imager datasets. These products are being used in NRT to improve forecast models and aircraft icing warnings, and to support aircraft field campaigns. Next generation satellites, such as the Japanese Himawari-8 and the upcoming NOAA GOES-R, present challenges for NRT data processing and product dissemination due to the increase in temporal and spatial resolution. The volume of data is expected to increase approximately 10-fold. This increase in data volume will require additional IT resources to keep up with the processing demands to satisfy NRT requirements. In addition, these resources are not readily available due to cost and other technical limitations. To anticipate and meet these computing resource requirements, we have employed a hybrid cloud computing environment to augment the generation of SatCORPS products. This paper will describe the workflow to ingest, process, and distribute SatCORPS products and the technologies used. Lessons learned from working on both AWS Cloud and GovCloud will be discussed: benefits, similarities, and differences that could impact the decision to use cloud computing and storage. A detailed cost analysis will be presented. In addition, future cloud utilization, parallelization, and architecture layout will be discussed for GOES-R.
CombiMotif: A new algorithm for network motifs discovery in protein-protein interaction networks
NASA Astrophysics Data System (ADS)
Luo, Jiawei; Li, Guanghui; Song, Dan; Liang, Cheng
2014-12-01
Discovering motifs in protein-protein interaction networks is becoming a major challenge in computational biology, since the distribution of the number of network motifs can reveal significant systemic differences among species. However, this task can be computationally expensive because of the need for graph isomorphism detection. In this paper, we present a new algorithm (CombiMotif) that incorporates combinatorial techniques to count non-induced occurrences of subgraph topologies in the form of trees. The efficiency of our algorithm is demonstrated by comparing the obtained results with current state-of-the-art subgraph counting algorithms. We also show major differences between unicellular and multicellular organisms. The datasets and source code of CombiMotif are freely available upon request.
Hielscher, Andreas H; Bartel, Sebastian
2004-02-01
Optical tomography (OT) is a fast-developing novel imaging modality that uses near-infrared (NIR) light to obtain cross-sectional views of optical properties inside the human body. A major challenge remains the time-consuming, computationally intensive image reconstruction problem that converts NIR transmission measurements into cross-sectional images. To increase the speed of iterative image reconstruction schemes that are commonly applied for OT, we have developed and implemented several parallel algorithms on a cluster of workstations. Static process distribution as well as dynamic load balancing schemes suitable for heterogeneous clusters and varying machine performances are introduced and tested. The resulting algorithms are shown to accelerate the reconstruction process to various degrees, substantially reducing the computation times for clinically relevant problems.
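A toy single-machine illustration of static versus dynamic work distribution using Python's multiprocessing; the cluster-level schemes in the paper follow the same idea across heterogeneous workstations, and the task runtimes below are simulated stand-ins.

import time
from multiprocessing import Pool

def reconstruct_slice(task):
    time.sleep(task)          # stand-in for one iterative reconstruction step
    return task

if __name__ == "__main__":
    tasks = [0.05, 0.3, 0.05, 0.4, 0.05, 0.05, 0.3, 0.05]
    with Pool(4) as pool:
        t0 = time.time()
        # chunksize=1 lets idle workers pull the next task as soon as they
        # finish, approximating dynamic load balancing; larger fixed chunks
        # approximate a static process distribution.
        list(pool.imap_unordered(reconstruct_slice, tasks, chunksize=1))
        print("dynamic:", round(time.time() - t0, 2), "s")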
Electric dipole polarizability from first principles calculations
Miorelli, M.; Bacca, S.; Barnea, N.; ...
2016-09-19
The electric dipole polarizability quantifies the low-energy behavior of the dipole strength and is related to critical observables such as the radii of the proton and neutron distributions. Its computation is challenging because most of the dipole strength lies in the scattering continuum. In our paper we combine integral transforms with the coupled-cluster method and compute the dipole polarizability using bound-state techniques. Furthermore, employing different interactions from chiral effective field theory, we confirm the strong correlation between the dipole polarizability and the charge radius, and study its dependence on three-nucleon forces. Finally, we find good agreement with data for the 4He, 40Ca, and 16O nuclei, and predict the dipole polarizability for the rare nucleus 22O.
Computer vision in roadway transportation systems: a survey
NASA Astrophysics Data System (ADS)
Loce, Robert P.; Bernal, Edgar A.; Wu, Wencheng; Bala, Raja
2013-10-01
There is a worldwide effort to apply 21st century intelligence to evolving our transportation networks. The goals of smart transportation networks are quite noble and manifold, including safety, efficiency, law enforcement, energy conservation, and emission reduction. Computer vision is playing a key role in this transportation evolution. Video imaging scientists are providing intelligent sensing and processing technologies for a wide variety of applications and services. There are many interesting technical challenges including imaging under a variety of environmental and illumination conditions, data overload, recognition and tracking of objects at high speed, distributed network sensing and processing, energy sources, as well as legal concerns. This paper presents a survey of computer vision techniques related to three key problems in the transportation domain: safety, efficiency, and security and law enforcement. A broad review of the literature is complemented by detailed treatment of a few selected algorithms and systems that the authors believe represent the state-of-the-art.
Characterizing output bottlenecks in a supercomputer
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xie, Bing; Chase, Jeffrey; Dillow, David A
2012-01-01
Supercomputer I/O loads are often dominated by writes. HPC (High Performance Computing) file systems are designed to absorb these bursty outputs at high bandwidth through massive parallelism. However, the delivered write bandwidth often falls well below the peak. This paper characterizes the data absorption behavior of a center-wide shared Lustre parallel file system on the Jaguar supercomputer. We use a statistical methodology to address the challenges of accurately measuring a shared machine under production load and to obtain the distribution of bandwidth across samples of compute nodes, storage targets, and time intervals. We observe and quantify limitations from competing traffic, contention on storage servers and I/O routers, concurrency limitations in the client compute node operating systems, and the impact of variance (stragglers) on coupled output such as striping. We then examine the implications of our results for application performance and the design of I/O middleware systems on shared supercomputers.
A Multi-Level Parallelization Concept for High-Fidelity Multi-Block Solvers
NASA Technical Reports Server (NTRS)
Hatay, Ferhat F.; Jespersen, Dennis C.; Guruswamy, Guru P.; Rizk, Yehia M.; Byun, Chansup; Gee, Ken; VanDalsem, William R. (Technical Monitor)
1997-01-01
The integration of high-fidelity Computational Fluid Dynamics (CFD) analysis tools with the industrial design process benefits greatly from robust implementations that are transportable across a wide range of computer architectures. In the present work, a hybrid domain-decomposition and parallelization concept was developed and implemented into the widely-used NASA multi-block Computational Fluid Dynamics (CFD) packages ENSAERO and OVERFLOW. The new parallel solver concept, PENS (Parallel Euler Navier-Stokes Solver), employs both fine and coarse granularity in data partitioning as well as data coalescing to obtain the desired load-balance characteristics on the available computer platforms. This multi-level parallelism implementation itself introduces no changes to the numerical results, hence the original fidelity of the packages is identically preserved. The present implementation uses the Message Passing Interface (MPI) library for interprocessor message passing and memory accessing. By choosing an appropriate combination of the available partitioning and coalescing capabilities only during the execution stage, the PENS solver becomes adaptable to different computer architectures, from shared-memory to distributed-memory platforms with varying degrees of parallelism. The PENS implementation on the IBM SP2 distributed-memory environment at the NASA Ames Research Center obtains 85 percent scalable parallel performance using fine-grain partitioning of single-block CFD domains on up to 128 wide computational nodes. Multi-block CFD simulations of complete aircraft achieve 75 percent load-balanced execution using data coalescing and the two levels of parallelism. SGI PowerChallenge, SGI Origin 2000, and a cluster of workstations are the other platforms where the robustness of the implementation is tested. The performance behavior on the other computer platforms with a variety of realistic problems will be included as this on-going study progresses.
Heterogeneous scalable framework for multiphase flows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morris, Karla Vanessa
2013-09-01
Two categories of challenges confront the developer of computational spray models: those related to the computation and those related to the physics. Regarding the computation, the trend towards heterogeneous, multi- and many-core platforms will require considerable re-engineering of codes written for the current supercomputing platforms. Regarding the physics, accurate methods for transferring mass, momentum and energy from the dispersed phase onto the carrier fluid grid have so far eluded modelers. Significant challenges also lie at the intersection between these two categories. To be competitive, any physics model must be expressible in a parallel algorithm that performs well on evolving computer platforms. This work created an application based on a software architecture where the physics and software concerns are separated in a way that adds flexibility to both. The developed spray-tracking package includes an application programming interface (API) that abstracts away the platform-dependent parallelization concerns, enabling the scientific programmer to write serial code that the API resolves into parallel processes and threads of execution. The project also developed the infrastructure required to provide similar APIs to other applications. The API allows object-oriented Fortran applications to interact directly with Trilinos to support memory management of distributed objects in central processing unit (CPU) and graphics processing unit (GPU) nodes for applications using C++.
de la Iglesia, D; Cachau, R E; García-Remesal, M; Maojo, V
2014-01-01
Nanotechnology represents an area of particular promise and significant opportunity across multiple scientific disciplines. Ongoing nanotechnology research ranges from the characterization of nanoparticles and nanomaterials to the analysis and processing of experimental data seeking correlations between nanoparticles and their functionalities and side effects. Due to their special properties, nanoparticles are suitable for cellular-level diagnostics and therapy, offering numerous applications in medicine, e.g. development of biomedical devices, tissue repair, drug delivery systems and biosensors. In nanomedicine, recent studies are producing large amounts of structural and property data, highlighting the role for computational approaches in information management. While in vitro and in vivo assays are expensive, the cost of computing is falling. Furthermore, improvements in the accuracy of computational methods (e.g. data mining, knowledge discovery, modeling and simulation) have enabled effective tools to automate the extraction, management and storage of these vast data volumes. Since this information is widely distributed, one major issue is how to locate and access data where it resides (which also poses data-sharing limitations). The novel discipline of nanoinformatics addresses the information challenges related to nanotechnology research. In this paper, we summarize the needs and challenges in the field and present an overview of extant initiatives and efforts. PMID:24932210
Veksler, Vladislav D.; Buchler, Norbou; Hoffman, Blaine E.; Cassenti, Daniel N.; Sample, Char; Sugrim, Shridat
2018-01-01
Computational models of cognitive processes may be employed in cyber-security tools, experiments, and simulations to address human agency and effective decision-making in keeping computational networks secure. Cognitive modeling can address multi-disciplinary cyber-security challenges requiring cross-cutting approaches over the human and computational sciences such as the following: (a) adversarial reasoning and behavioral game theory to predict attacker subjective utilities and decision likelihood distributions, (b) human factors of cyber tools to address human system integration challenges, estimation of defender cognitive states, and opportunities for automation, (c) dynamic simulations involving attacker, defender, and user models to enhance studies of cyber epidemiology and cyber hygiene, and (d) training effectiveness research and training scenarios to address human cyber-security performance, maturation of cyber-security skill sets, and effective decision-making. Models may be initially constructed at the group level based on mean tendencies of each subject's subgroup, using known statistics such as specific skill proficiencies, demographic characteristics, and cultural factors. For more precise and accurate predictions, cognitive models may be fine-tuned to each individual attacker, defender, or user profile, and updated over time (based on recorded behavior) via techniques such as model tracing and dynamic parameter fitting. PMID:29867661
Rational approximations to rational models: alternative algorithms for category learning.
Sanborn, Adam N; Griffiths, Thomas L; Navarro, Daniel J
2010-10-01
Rational models of cognition typically consider the abstract computational problems posed by the environment, assuming that people are capable of optimally solving those problems. This differs from more traditional formal models of cognition, which focus on the psychological processes responsible for behavior. A basic challenge for rational models is thus explaining how optimal solutions can be approximated by psychological processes. We outline a general strategy for answering this question, namely to explore the psychological plausibility of approximation algorithms developed in computer science and statistics. In particular, we argue that Monte Carlo methods provide a source of rational process models that connect optimal solutions to psychological processes. We support this argument through a detailed example, applying this approach to Anderson's (1990, 1991) rational model of categorization (RMC), which involves a particularly challenging computational problem. Drawing on a connection between the RMC and ideas from nonparametric Bayesian statistics, we propose 2 alternative algorithms for approximate inference in this model. The algorithms we consider include Gibbs sampling, a procedure appropriate when all stimuli are presented simultaneously, and particle filters, which sequentially approximate the posterior distribution with a small number of samples that are updated as new data become available. Applying these algorithms to several existing datasets shows that a particle filter with a single particle provides a good description of human inferences.
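To make the "one particle" idea concrete, here is a minimal sketch of a single-particle filter for sequential category assignment under a Chinese-restaurant-process prior with a Beta-Bernoulli feature likelihood. The function names, the alpha and beta values, and the binary toy stimuli are illustrative assumptions, not details taken from the paper or from Anderson's original model fits.

```python
import random

def crp_prior(cluster_sizes, n, alpha=1.0):
    """Chinese-restaurant-process prior over existing clusters plus one new cluster."""
    weights = [size / (n + alpha) for size in cluster_sizes]
    weights.append(alpha / (n + alpha))
    return weights

def feature_likelihood(stim, cluster_stats, n_k, beta=1.0):
    """Beta-Bernoulli predictive likelihood of a binary stimulus under one cluster."""
    like = 1.0
    for d, x in enumerate(stim):
        p1 = (cluster_stats[d] + beta) / (n_k + 2 * beta)
        like *= p1 if x == 1 else (1.0 - p1)
    return like

def single_particle_filter(stimuli, alpha=1.0):
    """Approximate a rational categorization model with ONE particle:
    sample each stimulus's cluster from its local posterior and never revisit it."""
    assignments, sizes, stats = [], [], []       # stats[k][d] = count of 1s in dim d
    for n, stim in enumerate(stimuli):
        prior = crp_prior(sizes, n, alpha)
        likes = [feature_likelihood(stim, stats[k], sizes[k]) for k in range(len(sizes))]
        likes.append(feature_likelihood(stim, [0] * len(stim), 0))   # empty new cluster
        post = [p * l for p, l in zip(prior, likes)]
        z = random.choices(range(len(post)), weights=post)[0]
        if z == len(sizes):                      # open a new cluster
            sizes.append(0)
            stats.append([0] * len(stim))
        sizes[z] += 1
        stats[z] = [c + x for c, x in zip(stats[z], stim)]
        assignments.append(z)
    return assignments

print(single_particle_filter([(1, 1, 1), (1, 1, 0), (0, 0, 0), (0, 0, 1)]))
```

Because only one hypothesis about the partition is carried forward, the filter exhibits the order effects that make it a plausible process-level account, in contrast to full Gibbs sampling over all stimuli at once.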
NASA Technical Reports Server (NTRS)
Shooman, Martin L.
1991-01-01
Many of the most challenging reliability problems of our present decade involve complex distributed systems such as interconnected telephone switching computers, air traffic control centers, aircraft and space vehicles, and local area and wide area computer networks. In addition to the challenge of complexity, modern fault-tolerant computer systems require very high levels of reliability, e.g., avionic computers with MTTF goals of one billion hours. Most analysts find that it is too difficult to model such complex systems without computer aided design programs. In response to this need, NASA has developed a suite of computer aided reliability modeling programs beginning with CARE 3 and including a group of new programs such as: HARP, HARP-PC, Reliability Analysts Workbench (Combination of model solvers SURE, STEM, PAWS, and common front-end model ASSIST), and the Fault Tree Compiler. The HARP program is studied and how well the user can model systems using this program is investigated. One of the important objectives will be to study how user friendly this program is, e.g., how easy it is to model the system, provide the input information, and interpret the results. The experiences of the author and his graduate students who used HARP in two graduate courses are described. Some brief comparisons were made with the ARIES program which the students also used. Theoretical studies of the modeling techniques used in HARP are also included. Of course no answer can be any more accurate than the fidelity of the model, thus an Appendix is included which discusses modeling accuracy. A broad viewpoint is taken and all problems which occurred in the use of HARP are discussed. Such problems include: computer system problems, installation manual problems, user manual problems, program inconsistencies, program limitations, confusing notation, long run times, accuracy problems, etc.
Angiuoli, Samuel V; Matalka, Malcolm; Gussman, Aaron; Galens, Kevin; Vangala, Mahesh; Riley, David R; Arze, Cesar; White, James R; White, Owen; Fricke, W Florian
2011-08-30
Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines to distribute pre-packaged, pre-configured software. We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports the use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. The CloVR VM and associated architecture lowers the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high-throughput data processing.
Sensor Data Distribution With Robustness and Reliability: Toward Distributed Components Model
NASA Technical Reports Server (NTRS)
Alena, Richard L.; Lee, Charles
2005-01-01
In planetary surface exploration missions, sensor data distribution is required in many aspects, for example, in navigation, scheduling, planning, monitoring, diagnostics, and automation of field tasks. The challenge is to distribute such data in a robust and reliable way so that we can minimize errors caused by miscalculations and misjudgments based on erroneous data input during the mission. The ad-hoc wireless network on a planetary surface is not constantly connected because of the nature of the rough terrain and the lack of permanent establishments on the surface. There are disconnected moments during which the computation nodes re-associate with different repeaters or access points until connections are reestablished. This nature requires our sensor data distribution software to be robust and reliable, with the ability to tolerate disconnected moments. This paper presents a distributed components model as a framework to accomplish such tasks. The software is written in Java and utilizes the available Java Message Service schema and the Boss implementation. The results of field experimentation show that the model is very effective in completing the tasks.
Taking digital imaging to the next level: challenges and opportunities.
Hobbs, W Cecyl
2004-01-01
New medical imaging technologies, such as multi-detector computed tomography (CT) scanners and positron emission tomography (PET) scanners, are creating new possibilities for non-invasive diagnosis that are leading providers to invest heavily in these new technologies. The volume of data produced by such technology is so large that it cannot be "read" using traditional film-based methods, and once in digital form, it creates a massive data integration and archiving challenge. Despite the benefits of digital imaging and archiving, there are several key challenges that healthcare organizations should consider in planning, selecting, and implementing the information technology (IT) infrastructure to support digital imaging. Decisions about storage and image distribution are essentially questions of "where" and "how fast." When planning the digital archiving infrastructure, organizations should think about where they want to store and distribute their images. This is similar to decisions that organizations have to make in regard to physical film storage and distribution, except the portability of images is even greater in a digital environment. The principle of "network effects" seems like a simple concept, yet the effect is not always considered when implementing a technology plan. To fully realize the benefits of digital imaging, the radiology department must integrate the archiving solutions throughout the department and, ultimately, with applications across other departments and enterprises. Medical institutions can derive a number of benefits from implementing digital imaging and archiving solutions like PACS. Hospitals and imaging centers can use the transition from film-based imaging as a foundational opportunity to reduce costs, increase competitive advantage, attract talent, and improve service to patients. The key factors in achieving these goals include attention to the means of data storage, distribution and protection.
Service Migration from Cloud to Multi-tier Fog Nodes for Multimedia Dissemination with QoE Support
Camargo, João; Rochol, Juergen; Gerla, Mario
2018-01-01
A wide range of multimedia services is expected to be offered to mobile users via various wireless access networks. Even the integration of Cloud Computing in such networks does not support an adequate Quality of Experience (QoE) in areas with high demands for multimedia content. Fog computing has been conceptualized to facilitate the deployment of new services that cloud computing cannot provide, particularly those demanding QoE guarantees. These services are provided using fog nodes located at the network edge, which are capable of virtualizing their functions/applications. Service migration from the cloud to fog nodes can be triggered by request patterns and timing issues. To the best of our knowledge, existing works on fog computing focus on architecture and fog node deployment issues. In this article, we describe the operational impacts and benefits associated with service migration from the cloud to multi-tier fog computing for video distribution with QoE support. Besides that, we evaluate such a service migration for video services. Finally, we present potential research challenges and trends. PMID:29364172
Service Migration from Cloud to Multi-tier Fog Nodes for Multimedia Dissemination with QoE Support.
Rosário, Denis; Schimuneck, Matias; Camargo, João; Nobre, Jéferson; Both, Cristiano; Rochol, Juergen; Gerla, Mario
2018-01-24
A wide range of multimedia services is expected to be offered to mobile users via various wireless access networks. Even the integration of Cloud Computing in such networks does not support an adequate Quality of Experience (QoE) in areas with high demands for multimedia content. Fog computing has been conceptualized to facilitate the deployment of new services that cloud computing cannot provide, particularly those demanding QoE guarantees. These services are provided using fog nodes located at the network edge, which are capable of virtualizing their functions/applications. Service migration from the cloud to fog nodes can be triggered by request patterns and timing issues. To the best of our knowledge, existing works on fog computing focus on architecture and fog node deployment issues. In this article, we describe the operational impacts and benefits associated with service migration from the cloud to multi-tier fog computing for video distribution with QoE support. Besides that, we evaluate such a service migration for video services. Finally, we present potential research challenges and trends.
A quantum–quantum Metropolis algorithm
Yung, Man-Hong; Aspuru-Guzik, Alán
2012-01-01
The classical Metropolis sampling method is a cornerstone of many statistical modeling applications that range from physics, chemistry, and biology to economics. This method is particularly suitable for sampling the thermal distributions of classical systems. The challenge of extending this method to the simulation of arbitrary quantum systems is that, in general, eigenstates of quantum Hamiltonians cannot be obtained efficiently with a classical computer. However, this challenge can be overcome by quantum computers. Here, we present a quantum algorithm which fully generalizes the classical Metropolis algorithm to the quantum domain. The meaning of quantum generalization is twofold: The proposed algorithm is not only applicable to both classical and quantum systems, but also offers a quantum speedup relative to the classical counterpart. Furthermore, unlike the classical method of quantum Monte Carlo, this quantum algorithm does not suffer from the negative-sign problem associated with fermionic systems. Applications of this algorithm include the study of low-temperature properties of quantum systems, such as the Hubbard model, and preparing the thermal states of sizable molecules to simulate, for example, chemical reactions at an arbitrary temperature. PMID:22215584
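For readers unfamiliar with the classical method being generalized, the sketch below is a plain classical Metropolis sampler of the thermal (Boltzmann) distribution of a small 1D Ising chain. The quantum generalization described in the abstract requires a quantum computer and is not reproduced here; the chain length, coupling constant, and inverse temperature are arbitrary illustrative choices.

```python
import math
import random

def metropolis_ising_chain(n_spins=20, beta=0.5, n_steps=20000, J=1.0):
    """Classical Metropolis sampling of a 1D Ising chain at inverse temperature beta.
    Propose a single spin flip and accept it with probability min(1, exp(-beta * dE))."""
    spins = [random.choice((-1, 1)) for _ in range(n_spins)]
    samples = []
    for step in range(n_steps):
        i = random.randrange(n_spins)
        left, right = spins[i - 1], spins[(i + 1) % n_spins]   # periodic boundary
        dE = 2.0 * J * spins[i] * (left + right)               # energy change of flipping spin i
        if dE <= 0 or random.random() < math.exp(-beta * dE):
            spins[i] = -spins[i]
        if step % 10 == 0:
            samples.append(sum(spins) / n_spins)               # magnetisation per spin
    return samples

mags = metropolis_ising_chain()
print(sum(mags) / len(mags))
```

The acceptance rule is what guarantees that the chain's stationary distribution is the thermal distribution; the quantum algorithm's contribution is to achieve the analogous guarantee when the eigenstates being sampled cannot be written down classically.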
NASA Astrophysics Data System (ADS)
Vivoni, Enrique R.; Mascaro, Giuseppe; Mniszewski, Susan; Fasel, Patricia; Springer, Everett P.; Ivanov, Valeriy Y.; Bras, Rafael L.
2011-10-01
A major challenge in the use of fully-distributed hydrologic models has been the lack of computational capabilities for high-resolution, long-term simulations in large river basins. In this study, we present the parallel model implementation and real-world hydrologic assessment of the Triangulated Irregular Network (TIN)-based Real-time Integrated Basin Simulator (tRIBS). Our parallelization approach is based on the decomposition of a complex watershed using the channel network as a directed graph. The resulting sub-basin partitioning divides effort among processors and handles hydrologic exchanges across boundaries. Through numerical experiments in a set of nested basins, we quantify parallel performance relative to serial runs for a range of processors, simulation complexities and lengths, and sub-basin partitioning methods, while accounting for inter-run variability on a parallel computing system. In contrast to serial simulations, the parallel model speed-up depends on the variability of hydrologic processes. Load balancing significantly improves parallel speed-up with proportionally faster runs as simulation complexity (domain resolution and channel network extent) increases. The best strategy for large river basins is to combine a balanced partitioning with an extended channel network, with potential savings through a lower TIN resolution. Based on these advances, a wider range of applications for fully-distributed hydrologic models is now possible. This is illustrated through a set of ensemble forecasts that account for precipitation uncertainty derived from a statistical downscaling model.
Chen, Wen Hao; Yang, Sam Y. S.; Xiao, Ti Qiao; Mayo, Sherry C.; Wang, Yu Dan; Wang, Hai Peng
2014-01-01
Quantifying three-dimensional spatial distributions of pores and material compositions in samples is a key materials characterization challenge, particularly in samples where compositions are distributed across a range of length scales, and where such compositions have similar X-ray absorption properties, such as in coal. Consequently, obtaining detailed information within sub-regions of a multi-length-scale sample by conventional approaches may not provide the resolution and level of detail one might desire. Herein, an approach for quantitative high-definition determination of material compositions from X-ray local computed tomography combined with a data-constrained modelling method is proposed. The approach is capable of dramatically improving the spatial resolution and revealing far finer details within a region of interest of a sample larger than the field of view than conventional techniques can. A coal sample containing distributions of porosity and several mineral compositions is employed to demonstrate the approach. The optimal experimental parameters are pre-analyzed. The quantitative results demonstrated that the approach can reveal significantly finer details of compositional distributions in the sample region of interest. The elevated spatial resolution is crucial for coal-bed methane reservoir evaluation and understanding the transformation of the minerals during coal processing. The method is generic and can be applied for three-dimensional compositional characterization of other materials. PMID:24763649
Integration of PanDA workload management system with Titan supercomputer at OLCF
NASA Astrophysics Data System (ADS)
De, K.; Klimentov, A.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Schovancova, J.; Vaniachine, A.; Wenaus, T.
2015-12-01
The PanDA (Production and Distributed Analysis) workload management system (WMS) was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. While PanDA currently distributes jobs to more than 100,000 cores at well over 100 Grid sites, the future LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA pilot framework for job submission to Titan's batch queues and local data management, with light-weight MPI wrappers to run single threaded workloads in parallel on Titan's multicore worker nodes. It also gives PanDA new capability to collect, in real time, information about unused worker nodes on Titan, which allows precise definition of the size and duration of jobs submitted to Titan according to available free resources. This capability significantly reduces PanDA job wait time while improving Titan's utilization efficiency. This implementation was tested with a variety of Monte-Carlo workloads on Titan and is being tested on several other supercomputing platforms. Notice: This manuscript has been authored, by employees of Brookhaven Science Associates, LLC under Contract No. DE-AC02-98CH10886 with the U.S. Department of Energy. The publisher by accepting the manuscript for publication acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.
Characterizing the heterogeneity of tumor tissues from spatially resolved molecular measures
Zavodszky, Maria I.
2017-01-01
Background: Tumor heterogeneity can manifest itself by sub-populations of cells having distinct phenotypic profiles expressed as diverse molecular, morphological and spatial distributions. This inherent heterogeneity poses challenges in terms of diagnosis, prognosis and efficient treatment. Consequently, tools and techniques are being developed to properly characterize and quantify tumor heterogeneity. Multiplexed immunofluorescence (MxIF) is one such technology that offers molecular insight into both inter-individual and intratumor heterogeneity. It enables the quantification of both the concentration and spatial distribution of 60+ proteins across a tissue section. Upon bioimage processing, protein expression data can be generated for each cell from a tissue field of view. Results: The Multi-Omics Heterogeneity Analysis (MOHA) tool was developed to compute tissue heterogeneity metrics from MxIF spatially resolved tissue imaging data. This technique computes the molecular state of each cell in a sample based on a pathway or gene set. Spatial states are then computed based on the spatial arrangements of the cells as distinguished by their respective molecular states. MOHA computes tissue heterogeneity metrics from the distributions of these molecular and spatially defined states. A colorectal cancer cohort of approximately 700 subjects with MxIF data is presented to demonstrate the MOHA methodology. Within this dataset, statistically significant correlations were found between the intratumor AKT pathway state diversity and cancer stage and histological tumor grade. Furthermore, intratumor spatial diversity metrics were found to correlate with cancer recurrence. Conclusions: MOHA provides a simple and robust approach to characterize molecular and spatial heterogeneity of tissues. Research projects that generate spatially resolved tissue imaging data can take full advantage of this useful technique. The MOHA algorithm is implemented as a freely available R script (see supplementary information). PMID:29190747
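One common way to turn per-cell discrete states into a heterogeneity metric is a Shannon diversity index; the sketch below computes it for two hypothetical tumors. The three-level "AKT pathway state" labels and the cell counts are invented purely for illustration, and the actual MOHA metrics (implemented in R) may be defined differently.

```python
import math
from collections import Counter

def shannon_diversity(states):
    """Shannon entropy (in nats) of the distribution of discrete cell states;
    higher values indicate a more heterogeneous mixture of states."""
    counts = Counter(states)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# hypothetical per-cell pathway states derived from marker thresholds
cells_tumor_a = ["high"] * 80 + ["low"] * 20
cells_tumor_b = ["high"] * 40 + ["low"] * 35 + ["intermediate"] * 25
print(shannon_diversity(cells_tumor_a))   # less heterogeneous
print(shannon_diversity(cells_tumor_b))   # more heterogeneous
```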
Distributed computation: the new wave of synthetic biology devices.
Macía, Javier; Posas, Francesc; Solé, Ricard V
2012-06-01
Synthetic biology (SB) offers a unique opportunity for designing complex molecular circuits able to perform predefined functions. But the goal of achieving a flexible toolbox of reusable molecular components has been shown to be limited due to circuit unpredictability, incompatible parts or random fluctuations. Many of these problems arise from the challenges posed by engineering the molecular circuitry: multiple wires are usually difficult to implement reliably within one cell and the resulting systems cannot be reused in other modules. These problems are solved by means of a nonstandard approach to single cell devices, using cell consortia and allowing the output signal to be distributed among different cell types, which can be combined in multiple, reusable and scalable ways. Copyright © 2012 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Petrie, C.; Margaria, T.; Lausen, H.; Zaremba, M.
Explores trade-offs among existing approaches, reveals strengths and weaknesses of proposed approaches as well as which aspects of the problem are not yet covered, and introduces a software engineering approach to evaluating semantic web services. Service-Oriented Computing is one of the most promising software engineering trends because of the potential to reduce the programming effort for future distributed industrial systems. However, only a small part of this potential rests on the standardization of tools offered by the web services stack. The larger part of this potential rests upon the development of sufficient semantics to automate service orchestration. Currently there are many different approaches to semantic web service descriptions and many frameworks built around them. A common understanding, evaluation scheme, and test bed to compare and classify these frameworks in terms of their capabilities and shortcomings is necessary to make progress in developing the full potential of Service-Oriented Computing. The Semantic Web Services Challenge is an open source initiative that provides a public evaluation and certification of multiple frameworks on common industrially-relevant problem sets. This edited volume reports on the first results in developing a common understanding of the various technologies intended to facilitate the automation of mediation, choreography and discovery for Web Services using semantic annotations. Semantic Web Services Challenge: Results from the First Year is designed for a professional audience composed of practitioners and researchers in industry. Professionals can use this book to evaluate SWS technology for its potential practical use. The book is also suitable for advanced-level students in computer science.
The implementation of AI technologies in computer wargames
NASA Astrophysics Data System (ADS)
Tiller, John A.
2004-08-01
Computer wargames involve the most in-depth analysis of general game theory. The enumerated turns of a game like chess are dwarfed by the exponentially larger possibilities of even a simple computer wargame. Implementing challenging AI in computer wargames is an important goal in both the commercial and military environments. In the commercial marketplace, customers demand a challenging AI opponent when they play a computer wargame and are frustrated by a lack of competence on the part of the AI. In the military environment, challenging AI opponents are important for several reasons. A challenging AI opponent will force the military professional to avoid routine or set-piece approaches to situations and cause them to think much more deeply about military situations before taking action. A good AI opponent would also include national characteristics of the opponent being simulated, thus providing the military professional with even more of a challenge in planning and approach. Implementing current AI technologies in computer wargames is a technological challenge. The goal is to join the needs of AI in computer wargames with the solutions of current AI technologies. This talk will address several of those issues, possible solutions, and currently unsolved problems.
Biological Basis For Computer Vision: Some Perspectives
NASA Astrophysics Data System (ADS)
Gupta, Madan M.
1990-03-01
Using biology as a basis for the development of sensors, devices and computer vision systems is a challenge to systems and vision scientists. It is also a field of promising research for engineering applications. Biological sensory systems, such as vision, touch and hearing, sense different physical phenomena from our environment, yet they possess some common mathematical functions. These mathematical functions are cast into the neural layers which are distributed throughout our sensory regions, sensory information transmission channels and in the cortex, the centre of perception. In this paper, we are concerned with the study of the biological vision system and the emulation of some of its mathematical functions, both retinal and visual cortex, for the development of a robust computer vision system. This field of research is not only intriguing, but offers a great challenge to systems scientists in the development of functional algorithms. These functional algorithms can be generalized for further studies in such fields as signal processing, control systems and image processing. Our studies are heavily dependent on the use of fuzzy-neural layers and generalized receptive fields. Building blocks of such neural layers and receptive fields may lead to the design of better sensors and better computer vision systems. It is hoped that these studies will lead to the development of better artificial vision systems with various applications to vision prosthesis for the blind, robotic vision, medical imaging, medical sensors, industrial automation, remote sensing, space stations and ocean exploration.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karakaya, Mahmut; Qi, Hairong
This paper addresses communication and energy efficiency in collaborative visual sensor networks (VSNs) for people localization, a challenging computer vision problem in its own right. We focus on the design of a light-weight and energy-efficient solution where people are localized by distributed camera nodes integrating the so-called certainty map generated at each node, which records target non-existence information within the camera's field of view. We first present a dynamic itinerary for certainty map integration where not only does each sensor node transmit a very limited amount of data, but only a limited number of camera nodes is involved. Then, we perform a comprehensive analytical study to evaluate communication and energy efficiency between different integration schemes, i.e., centralized and distributed integration. Based on results obtained from the analytical study and real experiments, the distributed method shows effectiveness in detection accuracy as well as energy and bandwidth efficiency.
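A minimal sketch of the integration idea follows, assuming a shared ground-plane grid, boolean per-camera "certain-empty" maps, and a fixed two-camera itinerary; the real algorithm builds the itinerary dynamically and uses richer map contents, so everything below is an illustrative simplification.

```python
import numpy as np

def local_certainty_map(grid_shape, fov_cells, detected_cells):
    """Cells the camera is certain contain NO target: everything inside its
    field of view except the cells where it actually detected a person."""
    certain_empty = np.zeros(grid_shape, dtype=bool)
    certain_empty[fov_cells] = True
    certain_empty[detected_cells] = False
    return certain_empty

def integrate_along_itinerary(maps):
    """Distributed integration: each node ORs its map into the one received from the
    previous node and forwards the result, so only one small boolean map is transmitted."""
    combined = np.zeros_like(maps[0])
    for m in maps:                      # in a real VSN this loop is the node-to-node hop
        combined |= m
    return combined

grid = (4, 4)
cam1 = local_certainty_map(grid, (slice(0, 2), slice(0, 4)), (0, 1))
cam2 = local_certainty_map(grid, (slice(0, 4), slice(0, 2)), (0, 1))
possible = ~integrate_along_itinerary([cam1, cam2])   # cells where a person may still be
print(possible.astype(int))
```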
Distributed Optimal Power Flow of AC/DC Interconnected Power Grid Using Synchronous ADMM
NASA Astrophysics Data System (ADS)
Liang, Zijun; Lin, Shunjiang; Liu, Mingbo
2017-05-01
Distributed optimal power flow (OPF) is of great importance and a significant challenge for AC/DC interconnected power grids with different dispatching centres, considering the security and privacy of information transmission. In this paper, a fully distributed algorithm for the OPF problem of AC/DC interconnected power grids, called synchronous ADMM, is proposed; it requires no form of central controller. The algorithm is based on the alternating direction method of multipliers (ADMM), using the average value of the boundary variables of adjacent regions obtained from the current iteration as the reference values of both regions for the next iteration, which enables parallel computation among different regions. The algorithm is tested on the IEEE 11-bus AC/DC interconnected power grid, and comparing the results with a centralized algorithm shows nearly no differences, validating its correctness and effectiveness.
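The boundary-averaging step can be illustrated with a generic consensus-ADMM sketch in which two regions share a single tie-line variable and each solves a quadratic local problem in closed form. The cost coefficients, penalty parameter rho, and iteration count are arbitrary assumptions, and this is not the paper's AC/DC OPF formulation.

```python
def solve_region(a, b, z, u, rho):
    """Local update: minimise a*(x - b)**2 + (rho/2)*(x - z + u)**2 in closed form."""
    return (2 * a * b + rho * (z - u)) / (2 * a + rho)

def synchronous_admm(regions, rho=1.0, iters=50):
    """Consensus ADMM for regions coupled through one shared boundary variable.
    Each iteration the regions solve their local problems in parallel, then only the
    boundary values are exchanged and averaged -- no central dispatcher is needed."""
    x = [0.0 for _ in regions]
    u = [0.0 for _ in regions]           # scaled dual variables
    z = 0.0                              # consensus (averaged boundary) value
    for _ in range(iters):
        x = [solve_region(a, b, z, u_i, rho) for (a, b), u_i in zip(regions, u)]
        z = sum(x_i + u_i for x_i, u_i in zip(x, u)) / len(regions)
        u = [u_i + x_i - z for u_i, x_i in zip(u, x)]
    return z, x

# two hypothetical dispatch regions with quadratic costs a*(x - b)**2 on the tie-line flow
print(synchronous_admm([(1.0, 2.0), (3.0, -1.0)]))
```

Only z (and the local boundary values it is built from) crosses region boundaries, which is what preserves the security and privacy of each region's internal data.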
Versioned distributed arrays for resilience in scientific applications: Global view resilience
Chien, A.; Balaji, P.; Beckman, P.; ...
2015-06-01
Exascale studies project reliability challenges for future high-performance computing (HPC) systems. We propose the Global View Resilience (GVR) system, a library that enables applications to add resilience in a portable, application-controlled fashion using versioned distributed arrays. We describe GVR’s interfaces to distributed arrays, versioning, and cross-layer error recovery. Using several large applications (OpenMC, the preconditioned conjugate gradient solver PCG, ddcMD, and Chombo), we evaluate the programmer effort to add resilience. The required changes are small (<2% LOC), localized, and machine-independent, requiring no software architecture changes. We also measure the overhead of adding GVR versioning and show that overheads <2% are generally achieved. We conclude that GVR’s interfaces and implementation are flexible and portable and create a gentle-slope path to tolerate growing error rates in future systems.
Distributed Economic Dispatch in Microgrids Based on Cooperative Reinforcement Learning.
Liu, Weirong; Zhuang, Peng; Liang, Hao; Peng, Jun; Huang, Zhiwu; Weirong Liu; Peng Zhuang; Hao Liang; Jun Peng; Zhiwu Huang; Liu, Weirong; Liang, Hao; Peng, Jun; Zhuang, Peng; Huang, Zhiwu
2018-06-01
Microgrids incorporated with distributed generation (DG) units and energy storage (ES) devices are expected to play more and more important roles in the future power systems. Yet, achieving efficient distributed economic dispatch in microgrids is a challenging issue due to the randomness and nonlinear characteristics of DG units and loads. This paper proposes a cooperative reinforcement learning algorithm for distributed economic dispatch in microgrids. Utilizing the learning algorithm can avoid the difficulty of stochastic modeling and high computational complexity. In the cooperative reinforcement learning algorithm, the function approximation is leveraged to deal with the large and continuous state spaces. And a diffusion strategy is incorporated to coordinate the actions of DG units and ES devices. Based on the proposed algorithm, each node in microgrids only needs to communicate with its local neighbors, without relying on any centralized controllers. Algorithm convergence is analyzed, and simulations based on real-world meteorological and load data are conducted to validate the performance of the proposed algorithm.
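The diffusion strategy that coordinates neighbouring units can be sketched independently of the reinforcement-learning machinery: each node adapts its local estimate and then combines it with its neighbours' estimates. The scalar "incremental cost" parameter, the neighbour graph, and the gradients below are invented for illustration; the paper's actual algorithm uses function approximation over much richer state spaces.

```python
def diffusion_step(params, neighbours, gradients, step=0.1):
    """One adapt-then-combine diffusion step: every node takes a local gradient step,
    then replaces its estimate with the average over its neighbourhood (itself included)."""
    adapted = [p - step * g for p, g in zip(params, gradients)]
    combined = []
    for i in range(len(params)):
        hood = [i] + neighbours[i]
        combined.append(sum(adapted[j] for j in hood) / len(hood))
    return combined

# three DG-unit controllers estimating a common incremental cost (a single scalar)
params = [5.0, 7.0, 6.0]
neighbours = {0: [1], 1: [0, 2], 2: [1]}
gradients = [0.4, -0.2, 0.1]          # hypothetical local cost gradients
print(diffusion_step(params, neighbours, gradients))
```

Because each node only averages with its immediate neighbours, the scheme needs no centralized controller, which is the property the abstract emphasizes.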
Applied Distributed Model Predictive Control for Energy Efficient Buildings and Ramp Metering
NASA Astrophysics Data System (ADS)
Koehler, Sarah Muraoka
Industrial large-scale control problems present an interesting algorithmic design challenge. A number of controllers must cooperate in real-time on a network of embedded hardware with limited computing power in order to maximize system efficiency while respecting constraints and despite communication delays. Model predictive control (MPC) can automatically synthesize a centralized controller which optimizes an objective function subject to a system model, constraints, and predictions of disturbance. Unfortunately, the computations required by model predictive controllers for large-scale systems often limit its industrial implementation only to medium-scale slow processes. Distributed model predictive control (DMPC) enters the picture as a way to decentralize a large-scale model predictive control problem. The main idea of DMPC is to split the computations required by the MPC problem amongst distributed processors that can compute in parallel and communicate iteratively to find a solution. Some popularly proposed solutions are distributed optimization algorithms such as dual decomposition and the alternating direction method of multipliers (ADMM). However, these algorithms ignore two practical challenges: substantial communication delays present in control systems and also problem non-convexity. This thesis presents two novel and practically effective DMPC algorithms. The first DMPC algorithm is based on a primal-dual active-set method which achieves fast convergence, making it suitable for large-scale control applications which have a large communication delay across its communication network. In particular, this algorithm is suited for MPC problems with a quadratic cost, linear dynamics, forecasted demand, and box constraints. We measure the performance of this algorithm and show that it significantly outperforms both dual decomposition and ADMM in the presence of communication delay. The second DMPC algorithm is based on an inexact interior point method which is suited for nonlinear optimization problems. The parallel computation of the algorithm exploits iterative linear algebra methods for the main linear algebra computations in the algorithm. We show that the splitting of the algorithm is flexible and can thus be applied to various distributed platform configurations. The two proposed algorithms are applied to two main energy and transportation control problems. The first application is energy efficient building control. Buildings represent 40% of energy consumption in the United States. Thus, it is significant to improve the energy efficiency of buildings. The goal is to minimize energy consumption subject to the physics of the building (e.g. heat transfer laws), the constraints of the actuators as well as the desired operating constraints (thermal comfort of the occupants), and heat load on the system. In this thesis, we describe the control systems of forced air building systems in practice. We discuss the "Trim and Respond" algorithm which is a distributed control algorithm that is used in practice, and show that it performs similarly to a one-step explicit DMPC algorithm. Then, we apply the novel distributed primal-dual active-set method and provide extensive numerical results for the building MPC problem. The second main application is the control of ramp metering signals to optimize traffic flow through a freeway system. This application is particularly important since urban congestion has more than doubled in the past few decades. 
The ramp metering problem is to maximize freeway throughput subject to freeway dynamics (derived from mass conservation), actuation constraints, freeway capacity constraints, and predicted traffic demand. In this thesis, we develop a hybrid model predictive controller for ramp metering that is guaranteed to be persistently feasible and stable. This contrasts to previous work on MPC for ramp metering where such guarantees are absent. We apply a smoothing method to the hybrid model predictive controller and apply the inexact interior point method to this nonlinear non-convex ramp metering problem.
Infrastructure and the Virtual Observatory
NASA Astrophysics Data System (ADS)
Dowler, P.; Gaudet, S.; Schade, D.
2011-07-01
The modern data center is faced with architectural and software engineering challenges that grow along with the challenges facing observatories: massive data flow, distributed computing environments, and distributed teams collaborating on large and small projects. By using VO standards as key components of the infrastructure, projects can take advantage of a decade of intellectual investment by the IVOA community. By their nature, these standards are proven and tested designs that already exist. Adopting VO standards saves considerable design effort, allows projects to take advantage of open-source software and test suites to speed development, and enables the use of third party tools that understand the VO protocols. The evolving CADC architecture now makes heavy use of VO standards. We show examples of how these standards may be used directly, coupled with non-VO standards, or extended with custom capabilities to solve real problems and provide value to our users. In the end, we use VO services as major parts of the core infrastructure to reduce cost rather than as an extra layer with additional cost and we can deliver more general purpose and robust services to our user community.
Discrete stochastic analogs of Erlang epidemic models.
Getz, Wayne M; Dougherty, Eric R
2018-12-01
Erlang differential equation models of epidemic processes provide more realistic disease-class transition dynamics from susceptible (S) to exposed (E) to infectious (I) and removed (R) categories than the ubiquitous SEIR model. The latter is itself at one end of the spectrum of Erlang SE[Formula: see text]I[Formula: see text]R models with [Formula: see text] concatenated E compartments and [Formula: see text] concatenated I compartments. Discrete-time models, however, are computationally much simpler to simulate and fit to epidemic outbreak data than continuous-time differential equations, and are also much more readily extended to include demographic and other types of stochasticity. Here we formulate discrete-time deterministic analogs of the Erlang models, and their stochastic extension, based on a time-to-go distributional principle. Depending on which distributions are used (e.g. discretized Erlang, Gamma, Beta, or Uniform distributions), we demonstrate that our formulation represents both a discretization of Erlang epidemic models and generalizations thereof. We consider the challenges of fitting SE[Formula: see text]I[Formula: see text]R models and our discrete-time analog to data (the recent outbreak of Ebola in Liberia). We demonstrate that the latter performs much better than the former; confining fits to strict SEIR formulations reduces the numerical challenges, but sacrifices best-fit likelihood scores by at least 7%.
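A minimal sketch of a discrete-time stochastic SEIR-type model with multiple (Erlang-like) E and I sub-stages is given below. It uses geometric per-stage waiting times and binomial transitions rather than the paper's time-to-go distributions, and every parameter value (transmission rate, stage counts, progression probabilities, population size) is an arbitrary illustrative choice.

```python
import random

def binomial(n, p):
    """Sample a Binomial(n, p) count by summing Bernoulli draws (keeps the sketch dependency-free)."""
    return sum(random.random() < p for _ in range(n))

def discrete_staged_seir(beta=0.3, kE=3, kI=3, pE=0.5, pI=0.4, N=1000, I0=5, steps=150):
    """Discrete-time stochastic SEIR analog with kE latent and kI infectious sub-stages,
    so that total latent/infectious delays are sums of geometric stage times."""
    S = N - I0
    E = [0] * kE
    I = [I0] + [0] * (kI - 1)
    R = 0
    history = []
    for _ in range(steps):
        infectious = sum(I)
        p_inf = 1.0 - (1.0 - beta / N) ** infectious       # per-susceptible infection probability
        new_E = binomial(S, p_inf)
        moves_E = [binomial(e, pE) for e in E]              # stage-to-stage progression
        moves_I = [binomial(i, pI) for i in I]
        S -= new_E
        E = [E[j] - moves_E[j] + (new_E if j == 0 else moves_E[j - 1]) for j in range(kE)]
        I = [I[j] - moves_I[j] + (moves_E[-1] if j == 0 else moves_I[j - 1]) for j in range(kI)]
        R += moves_I[-1]
        history.append((S, sum(E), sum(I), R))
    return history

for t, row in enumerate(discrete_staged_seir()[:10]):
    print(t, row)
```

Concatenating sub-stages is what replaces the exponential (memoryless) dwell times of the plain SEIR model with more peaked, Erlang-like delay distributions.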
Vineyard, Craig M.; Verzi, Stephen J.; James, Conrad D.; ...
2015-08-10
Despite technological advances making computing devices faster, smaller, and more prevalent in today's age, data generation and collection has outpaced data processing capabilities. Simply having more compute platforms does not provide a means of addressing challenging problems in the big data era. Rather, alternative processing approaches are needed, and the application of machine learning to big data is hugely important. The MapReduce programming paradigm is an alternative to conventional supercomputing approaches, and requires less stringent data passing constrained problem decompositions. Rather, MapReduce relies upon defining a means of partitioning the desired problem so that subsets may be computed independently and recombined to yield the net desired result. However, not all machine learning algorithms are amenable to such an approach. Game-theoretic algorithms are often innately distributed, consisting of local interactions between players without requiring a central authority, and are iterative by nature rather than requiring extensive retraining. Effectively, a game-theoretic approach to machine learning is well suited for the MapReduce paradigm and provides a novel, alternative perspective on addressing the big data problem. In this paper we present a variant of our Support Vector Machine (SVM) Game classifier which may be used in a distributed manner, and show an illustrative example of applying this algorithm.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vineyard, Craig M.; Verzi, Stephen J.; James, Conrad D.
Despite technological advances making computing devices faster, smaller, and more prevalent in today's age, data generation and collection has outpaced data processing capabilities. Simply having more compute platforms does not provide a means of addressing challenging problems in the big data era. Rather, alternative processing approaches are needed, and the application of machine learning to big data is hugely important. The MapReduce programming paradigm is an alternative to conventional supercomputing approaches, and requires less stringent data passing constrained problem decompositions. Rather, MapReduce relies upon defining a means of partitioning the desired problem so that subsets may be computed independently and recombined to yield the net desired result. However, not all machine learning algorithms are amenable to such an approach. Game-theoretic algorithms are often innately distributed, consisting of local interactions between players without requiring a central authority, and are iterative by nature rather than requiring extensive retraining. Effectively, a game-theoretic approach to machine learning is well suited for the MapReduce paradigm and provides a novel, alternative perspective on addressing the big data problem. In this paper we present a variant of our Support Vector Machine (SVM) Game classifier which may be used in a distributed manner, and show an illustrative example of applying this algorithm.
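The partition-compute-recombine pattern that MapReduce relies on can be shown with a deliberately simple sketch: per-partition sufficient statistics are computed independently (map) and merged associatively (reduce) to recover a global mean and variance. This illustrates only the programming pattern, not the SVM Game classifier described in the abstract; the data partitions are invented for illustration.

```python
from functools import reduce

def map_chunk(chunk):
    """Per-partition statistics that can be computed independently (the 'map' step)."""
    return (len(chunk), sum(chunk), sum(x * x for x in chunk))

def reduce_stats(a, b):
    """Associative combination of partial results (the 'reduce' step)."""
    return tuple(x + y for x, y in zip(a, b))

data_partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0, 7.0, 8.0, 9.0]]
n, s, ss = reduce(reduce_stats, map(map_chunk, data_partitions))
mean = s / n
variance = ss / n - mean ** 2
print(mean, variance)
```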
Geographic Gossip: Efficient Averaging for Sensor Networks
NASA Astrophysics Data System (ADS)
Dimakis, Alexandros D. G.; Sarwate, Anand D.; Wainwright, Martin J.
Gossip algorithms for distributed computation are attractive due to their simplicity, distributed nature, and robustness in noisy and uncertain environments. However, using standard gossip algorithms can lead to a significant waste in energy by repeatedly recirculating redundant information. For realistic sensor network model topologies like grids and random geometric graphs, the inefficiency of gossip schemes is related to the slow mixing times of random walks on the communication graph. We propose and analyze an alternative gossiping scheme that exploits geographic information. By utilizing geographic routing combined with a simple resampling method, we demonstrate substantial gains over previously proposed gossip protocols. For regular graphs such as the ring or grid, our algorithm improves standard gossip by factors of $n$ and $\sqrt{n}$ respectively. For the more challenging case of random geometric graphs, our algorithm computes the true average to accuracy $\epsilon$ using $O(\frac{n^{1.5}}{\sqrt{\log n}} \log \epsilon^{-1})$ radio transmissions, which yields a $\sqrt{\frac{n}{\log n}}$ factor improvement over standard gossip algorithms. We illustrate these theoretical results with experimental comparisons between our algorithm and standard methods as applied to various classes of random fields.
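For contrast with the geographic scheme, the sketch below implements the standard pairwise gossip baseline whose slow mixing on rings and grids motivates the paper. The ring topology, sensor readings, and round count are illustrative assumptions, and geographic routing itself is not implemented here.

```python
import random

def gossip_average(values, neighbours, rounds=2000):
    """Standard pairwise gossip: at each step a random node averages its value with a
    random neighbour; every node converges to the global average of the initial values."""
    x = list(values)
    nodes = list(range(len(x)))
    for _ in range(rounds):
        i = random.choice(nodes)
        j = random.choice(neighbours[i])
        avg = 0.5 * (x[i] + x[j])
        x[i] = x[j] = avg
    return x

# ring of 8 sensor nodes with noisy readings; the true mean is 3.5
readings = [0, 1, 2, 3, 4, 5, 6, 7]
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
print(gossip_average(readings, ring))
```

Each pairwise exchange preserves the sum of the values, which is why the scheme converges to the true average on any connected topology; the paper's contribution is reducing how many such exchanges are needed.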
NASA Astrophysics Data System (ADS)
Volcke, P.; Pequegnat, C.; Grunberg, M.; Lecointre, A.; Bzeznik, B.; Wolyniec, D.; Engels, F.; Maron, C.; Cheze, J.; Pardo, C.; Saurel, J. M.; André, F.
2015-12-01
RESIF is a nationwide French project aimed at building a high-quality observation system to observe and understand the inner Earth. RESIF deals with permanent seismic network data as well as mobile network data, including dense/semi-dense arrays. The RESIF project is distributed among different nodes providing qualified data to the main datacentre at Université Grenoble Alpes, France. Data control and qualification are performed by each individual node; the poster will provide some insights into quality control of the RESIF broadband seismic component data. We will then present data that has recently been made publicly available. Data is distributed through the worldwide FDSN and European EIDA standard protocols. A new web portal is now open to explore and download seismic data and metadata. The RESIF datacentre is also now connected to the Grenoble University High Performance Computing (HPC) facility; a typical use case will be presented using iRODS technologies. The use of dense observation networks is increasing, bringing challenges in data growth and handling; we will present an example where the HDF5 data format was used as an alternative to the usual seismology data formats.
Sekulić, Vladislav; Skinner, Frances K
2017-01-01
Although biophysical details of inhibitory neurons are becoming known, it is challenging to map these details onto function. Oriens-lacunosum/moleculare (O-LM) cells are inhibitory cells in the hippocampus that gate information flow, firing while phase-locked to theta rhythms. We build on our existing computational model database of O-LM cells to link model with function. We place our models in high-conductance states and modulate inhibitory inputs at a wide range of frequencies. We find preferred spiking recruitment of models at high (4–9 Hz) or low (2–5 Hz) theta depending on, respectively, the presence or absence of h-channels on their dendrites. This also depends on slow delayed-rectifier potassium channels, and preferred theta ranges shift when h-channels are potentiated by cyclic AMP. Our results suggest that O-LM cells can be differentially recruited by frequency-modulated inputs depending on specific channel types and distributions. This work exposes a strategy for understanding how biophysical characteristics contribute to function. DOI: http://dx.doi.org/10.7554/eLife.22962.001 PMID:28318488
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tawhai, Merryn; Bischoff, Jeff; Einstein, Daniel R.
2009-05-01
In this article, we describe some current multiscale modeling issues in computational biomechanics from the perspective of the musculoskeletal and respiratory systems and mechanotransduction. First, we outline the necessity of multiscale simulations in these biological systems. Then we summarize challenges inherent to multiscale biomechanics modeling, regardless of the subdiscipline, followed by computational challenges that are system-specific. We discuss some of the current tools that have been utilized to aid research in multiscale mechanics simulations, and the priorities to further the field of multiscale biomechanics computation.
Uncertainty propagation for statistical impact prediction of space debris
NASA Astrophysics Data System (ADS)
Hoogendoorn, R.; Mooij, E.; Geul, J.
2018-01-01
Predictions of the impact time and location of space debris in a decaying trajectory are highly influenced by uncertainties. The traditional Monte Carlo (MC) method can be used to perform accurate statistical impact predictions, but requires a large computational effort. A method is investigated that directly propagates a Probability Density Function (PDF) in time, which has the potential to obtain more accurate results with less computational effort. The decaying trajectory of Delta-K rocket stages was used to test the methods using a six degrees-of-freedom state model. The PDF of the state of the body was propagated in time to obtain impact-time distributions. This Direct PDF Propagation (DPP) method results in a multi-dimensional scattered dataset of the PDF of the state, which is highly challenging to process. No accurate results could be obtained, because of the structure of the DPP data and the high dimensionality. Therefore, the DPP method is less suitable for practical uncontrolled entry problems and the traditional MC method remains superior. Additionally, the MC method was used with two improved uncertainty models to obtain impact-time distributions, which were validated using observations of true impacts. For one of the two uncertainty models, statistically more valid impact-time distributions were obtained than in previous research.
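The traditional MC approach that the paper retains as the reference method can be sketched with a deliberately simplified model: a 1-D point-mass fall through an exponential atmosphere, with Gaussian uncertainty on the initial altitude, vertical velocity, and ballistic coefficient. All numerical values and function names below are invented for illustration and bear no relation to the six degrees-of-freedom Delta-K model used in the paper.

```python
import math
import random

def impact_time(h0, v0, beta_c, dt=0.1, rho0=1.225, H=8500.0, g=9.81):
    """Integrate a 1-D point-mass fall with exponential-atmosphere drag until ground impact."""
    h, v, t = h0, v0, 0.0
    while h > 0.0:
        rho = rho0 * math.exp(-h / H)
        drag = 0.5 * rho * v * abs(v) / beta_c   # carries the sign of v; subtracting it opposes the motion
        v += (-g - drag) * dt
        h += v * dt
        t += dt
    return t

def monte_carlo_impact(n_samples=500):
    """Sample uncertain initial conditions and a ballistic coefficient, and build the
    empirical impact-time distribution the way a traditional MC prediction would."""
    times = []
    for _ in range(n_samples):
        h0 = random.gauss(120e3, 2e3)            # initial altitude [m]
        v0 = random.gauss(-200.0, 20.0)          # initial vertical velocity [m/s]
        beta_c = random.gauss(150.0, 15.0)       # ballistic coefficient [kg/m^2]
        times.append(impact_time(h0, v0, beta_c))
    times.sort()
    return times[int(0.05 * n_samples)], times[n_samples // 2], times[int(0.95 * n_samples)]

print(monte_carlo_impact())   # 5th, 50th and 95th percentile impact times [s]
```

The cost of the approach is visible even in this toy: every percentile estimate requires hundreds of full trajectory integrations, which is the computational burden the direct PDF propagation method was intended to avoid.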
Mobile Computing and Ubiquitous Networking: Concepts, Technologies and Challenges.
ERIC Educational Resources Information Center
Pierre, Samuel
2001-01-01
Analyzes concepts, technologies and challenges related to mobile computing and networking. Defines basic concepts of cellular systems. Describes the evolution of wireless technologies that constitute the foundations of mobile computing and ubiquitous networking. Presents characterization and issues of mobile computing. Analyzes economical and…
Software Systems for High-performance Quantum Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Humble, Travis S; Britt, Keith A
Quantum computing promises new opportunities for solving hard computational problems, but harnessing this novelty requires breakthrough concepts in the design, operation, and application of computing systems. We define some of the challenges facing the development of quantum computing systems as well as software-based approaches that can be used to overcome these challenges. Following a brief overview of the state of the art, we present quantum programming and execution models, the development of architectures for hybrid high-performance computing systems, and the realization of software stacks for quantum networking. This leads to a discussion of the role that conventional computing plays in the quantum paradigm and how some of the current challenges for exascale computing overlap with those facing quantum computing.
Flexible services for the support of research.
Turilli, Matteo; Wallom, David; Williams, Chris; Gough, Steve; Curran, Neal; Tarrant, Richard; Bretherton, Dan; Powell, Andy; Johnson, Matt; Harmer, Terry; Wright, Peter; Gordon, John
2013-01-28
Cloud computing has been increasingly adopted by users and providers to promote flexible, scalable and tailored access to computing resources. Nonetheless, the consolidation of this paradigm has uncovered some of its limitations. Initially devised by corporations with direct control over large amounts of computational resources, cloud computing is now being endorsed by organizations with limited resources or with a more articulated, less direct control over these resources. The challenge for these organizations is to leverage the benefits of cloud computing while dealing with limited and often widely distributed computing resources. This study focuses on the adoption of cloud computing by higher education institutions and addresses two main issues: flexible and on-demand access to a large amount of storage resources, and scalability across a heterogeneous set of cloud infrastructures. The proposed solutions leverage a federated approach to cloud resources in which users access multiple and largely independent cloud infrastructures through a highly customizable broker layer. This approach allows for a uniform authentication and authorization infrastructure, a fine-grained policy specification and the aggregation of accounting and monitoring. Within a loosely coupled federation of cloud infrastructures, users can access vast amounts of data without copying them across cloud infrastructures and can scale their resource provisions when the local cloud resources become insufficient.
Data Processing Factory for the Sloan Digital Sky Survey
NASA Astrophysics Data System (ADS)
Stoughton, Christopher; Adelman, Jennifer; Annis, James T.; Hendry, John; Inkmann, John; Jester, Sebastian; Kent, Steven M.; Kuropatkin, Nickolai; Lee, Brian; Lin, Huan; Peoples, John, Jr.; Sparks, Robert; Tucker, Douglas; Vanden Berk, Dan; Yanny, Brian; Yocum, Dan
2002-12-01
The Sloan Digital Sky Survey (SDSS) data handling presents two challenges: large data volume and timely production of spectroscopic plates from imaging data. A data processing factory, using technologies both old and new, handles this flow. Distribution to end users is via disk farms, to serve corrected images and calibrated spectra, and a database, to efficiently process catalog queries. For distribution of modest amounts of data from Apache Point Observatory to Fermilab, scripts use rsync to update files, while larger data transfers are accomplished by shipping magnetic tapes commercially. All data processing pipelines are wrapped in scripts to address consecutive phases: preparation, submission, checking, and quality control. We constructed the factory by chaining these pipelines together while using an operational database to hold processed imaging catalogs. The science database catalogs all imaging and spectroscopic objects, with pointers to the various external files associated with them. Diverse computing systems address particular processing phases. UNIX computers handle tape reading and writing, as well as calibration steps that require access to a large amount of data with relatively modest computational demands. Commodity CPUs process steps that require access to a limited amount of data with more demanding computational requirements. Disk servers optimized for cost per Gbyte serve terabytes of processed data, while servers optimized for disk read speed run SQLServer software to process queries on the catalogs. This factory produced data for the SDSS Early Data Release in June 2001, and it is currently producing Data Release One, scheduled for January 2003.
2011-01-01
Background Next-generation sequencing technologies have decentralized sequence acquisition, increasing the demand for new bioinformatics tools that are easy to use, portable across multiple platforms, and scalable for high-throughput applications. Cloud computing platforms provide on-demand access to computing infrastructure over the Internet and can be used in combination with custom-built virtual machines to distribute pre-packaged, pre-configured software. Results We describe the Cloud Virtual Resource, CloVR, a new desktop application for push-button automated sequence analysis that can utilize cloud computing resources. CloVR is implemented as a single portable virtual machine (VM) that provides several automated analysis pipelines for microbial genomics, including 16S, whole genome and metagenome sequence analysis. The CloVR VM runs on a personal computer, utilizes local computer resources and requires minimal installation, addressing key challenges in deploying bioinformatics workflows. In addition, CloVR supports use of remote cloud computing resources to improve performance for large-scale sequence processing. In a case study, we demonstrate the use of CloVR to automatically process next-generation sequencing data on multiple cloud computing platforms. Conclusion The CloVR VM and associated architecture lowers the barrier of entry for utilizing complex analysis protocols on both local single- and multi-core computers and cloud systems for high throughput data processing. PMID:21878105
The patient and the computer in the primary care consultation
Arnold, Michael; Phillips, Christine; Trumble, Stephen; Dwan, Kathryn
2011-01-01
Objective Studies of the doctor–patient relationship have focused on the elaboration of power and/or authority using a range of techniques to study the encounter between doctor and patient. The widespread adoption of computers by doctors brings a third party into the consultation. While there has been some research into the way doctors view and manage this new relationship, the behavior of patients in response to the computer is rarely studied. In this paper, the authors use Goffman's dramaturgy to explore patients' approaches to the doctor's computer in the consultation, and its influence on the patient–doctor relationship. Design Observational study of Australian general practice. 141 consultations from 20 general practitioners were videotaped and analyzed using a hermeneutic framework. Results Patients negotiated the relationship between themselves, the doctor, and the computer demonstrating two themes: dyadic (dealing primarily with the doctor) or triadic (dealing with both computer and doctor). Patients used three signaling behaviors in relation to the computer on the doctor's desk (screen watching, screen ignoring, and screen excluding) to influence the behavior of the doctor. Patients were able to draw the doctor to the computer, and used the computer to challenge doctor's statements. Conclusion This study demonstrates that in consultations where doctors use computers, the computer can legitimately be regarded as part of a triadic relationship. Routine use of computers in the consultation changes the doctor–patient relationship, and is altering the distribution of power and authority between doctor and patient. PMID:21262923
The patient and the computer in the primary care consultation.
Pearce, Christopher; Arnold, Michael; Phillips, Christine; Trumble, Stephen; Dwan, Kathryn
2011-01-01
Studies of the doctor-patient relationship have focused on the elaboration of power and/or authority using a range of techniques to study the encounter between doctor and patient. The widespread adoption of computers by doctors brings a third party into the consultation. While there has been some research into the way doctors view and manage this new relationship, the behavior of patients in response to the computer is rarely studied. In this paper, the authors use Goffman's dramaturgy to explore patients' approaches to the doctor's computer in the consultation, and its influence on the patient-doctor relationship. Observational study of Australian general practice. 141 consultations from 20 general practitioners were videotaped and analyzed using a hermeneutic framework. Patients negotiated the relationship between themselves, the doctor, and the computer demonstrating two themes: dyadic (dealing primarily with the doctor) or triadic (dealing with both computer and doctor). Patients used three signaling behaviors in relation to the computer on the doctor's desk (screen watching, screen ignoring, and screen excluding) to influence the behavior of the doctor. Patients were able to draw the doctor to the computer, and used the computer to challenge doctor's statements. This study demonstrates that in consultations where doctors use computers, the computer can legitimately be regarded as part of a triadic relationship. Routine use of computers in the consultation changes the doctor-patient relationship, and is altering the distribution of power and authority between doctor and patient.
Applications of the pipeline environment for visual informatics and genomics computations
2011-01-01
Background Contemporary informatics and genomics research require efficient, flexible and robust management of large heterogeneous data, advanced computational tools, powerful visualization, reliable hardware infrastructure, interoperability of computational resources, and detailed data and analysis-protocol provenance. The Pipeline is a client-server distributed computational environment that facilitates the visual graphical construction, execution, monitoring, validation and dissemination of advanced data analysis protocols. Results This paper reports on the applications of the LONI Pipeline environment to address two informatics challenges - graphical management of diverse genomics tools, and the interoperability of informatics software. Specifically, this manuscript presents the concrete details of deploying general informatics suites and individual software tools to new hardware infrastructures, the design, validation and execution of new visual analysis protocols via the Pipeline graphical interface, and integration of diverse informatics tools via the Pipeline eXtensible Markup Language syntax. We demonstrate each of these processes using several established informatics packages (e.g., miBLAST, EMBOSS, mrFAST, GWASS, MAQ, SAMtools, Bowtie) for basic local sequence alignment and search, molecular biology data analysis, and genome-wide association studies. These examples demonstrate the power of the Pipeline graphical workflow environment to enable integration of bioinformatics resources which provide a well-defined syntax for dynamic specification of the input/output parameters and the run-time execution controls. Conclusions The LONI Pipeline environment http://pipeline.loni.ucla.edu provides a flexible graphical infrastructure for efficient biomedical computing and distributed informatics research. The interactive Pipeline resource manager enables the utilization and interoperability of diverse types of informatics resources. The Pipeline client-server model provides computational power to a broad spectrum of informatics investigators - experienced developers and novice users, users with or without access to advanced computational resources (e.g., Grid, data), as well as basic and translational scientists. The open development, validation and dissemination of computational networks (pipeline workflows) facilitates the sharing of knowledge, tools, protocols and best practices, and enables the unbiased validation and replication of scientific findings by the entire community. PMID:21791102
NASA Astrophysics Data System (ADS)
Narayanaswami, Chandra; Raghunath, Mandayam T.
2004-09-01
We outline a collection of technological challenges in the design of wearable computers with a focus on one of the most desirable form-factors, the wrist watch. We describe our experience with building three generations of wrist watch computers. We built these research prototypes as platforms to investigate the fundamental limitations of wearable computing. Results of our investigations are presented in the form of challenges that have been overcome and those that still remain.
Magdoom, Kulam Najmudeen; Pishko, Gregory L.; Rice, Lori; Pampo, Chris; Siemann, Dietmar W.; Sarntinoranont, Malisa
2014-01-01
Systemic drug delivery to solid tumors involving macromolecular therapeutic agents is challenging for many reasons. Amongst them is their chaotic microvasculature which often leads to inadequate and uneven uptake of the drug. Localized drug delivery can circumvent such obstacles and convection-enhanced delivery (CED) - controlled infusion of the drug directly into the tissue - has emerged as a promising delivery method for distributing macromolecules over larger tissue volumes. In this study, a three-dimensional MR image-based computational porous media transport model accounting for realistic anatomical geometry and tumor leakiness was developed for predicting the interstitial flow field and distribution of albumin tracer following CED into the hind-limb tumor (KHT sarcoma) in a mouse. Sensitivity of the model to changes in infusion flow rate, catheter placement and tissue hydraulic conductivity were investigated. The model predictions suggest that 1) tracer distribution is asymmetric due to heterogeneous porosity; 2) tracer distribution volume varies linearly with infusion volume within the whole leg, and exponentially within the tumor reaching a maximum steady-state value; 3) infusion at the center of the tumor with high flow rates leads to maximum tracer coverage in the tumor with minimal leakage outside; and 4) increasing the tissue hydraulic conductivity lowers the tumor interstitial fluid pressure and decreases the tracer distribution volume within the whole leg and tumor. The model thus predicts that the interstitial fluid flow and drug transport is sensitive to porosity and changes in extracellular space. This image-based model thus serves as a potential tool for exploring the effects of transport heterogeneity in tumors. PMID:24619021
Large-scale 3D geoelectromagnetic modeling using parallel adaptive high-order finite element method
Grayver, Alexander V.; Kolev, Tzanio V.
2015-11-01
Here, we have investigated the use of the adaptive high-order finite-element method (FEM) for geoelectromagnetic modeling. Because high-order FEM is challenging from the numerical and computational points of view, most published finite-element studies in geoelectromagnetics use the lowest order formulation. Solution of the resulting large system of linear equations poses the main practical challenge. We have developed a fully parallel and distributed robust and scalable linear solver based on the optimal block-diagonal and auxiliary space preconditioners. The solver was found to be efficient for high finite element orders, unstructured and nonconforming locally refined meshes, a wide range of frequencies, large conductivity contrasts, and number of degrees of freedom (DoFs). Furthermore, the presented linear solver is in essence algebraic; i.e., it acts on the matrix-vector level and thus requires no information about the discretization, boundary conditions, or physical source used, making it readily efficient for a wide range of electromagnetic modeling problems. To get accurate solutions at reduced computational cost, we have also implemented goal-oriented adaptive mesh refinement. The numerical tests indicated that if highly accurate modeling results were required, the high-order FEM in combination with the goal-oriented local mesh refinement required less computational time and DoFs than the lowest order adaptive FEM.
Large-scale 3D geoelectromagnetic modeling using parallel adaptive high-order finite element method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grayver, Alexander V.; Kolev, Tzanio V.
Here, we have investigated the use of the adaptive high-order finite-element method (FEM) for geoelectromagnetic modeling. Because high-order FEM is challenging from the numerical and computational points of view, most published finite-element studies in geoelectromagnetics use the lowest order formulation. Solution of the resulting large system of linear equations poses the main practical challenge. We have developed a fully parallel and distributed robust and scalable linear solver based on the optimal block-diagonal and auxiliary space preconditioners. The solver was found to be efficient for high finite element orders, unstructured and nonconforming locally refined meshes, a wide range of frequencies, large conductivity contrasts, and number of degrees of freedom (DoFs). Furthermore, the presented linear solver is in essence algebraic; i.e., it acts on the matrix-vector level and thus requires no information about the discretization, boundary conditions, or physical source used, making it readily efficient for a wide range of electromagnetic modeling problems. To get accurate solutions at reduced computational cost, we have also implemented goal-oriented adaptive mesh refinement. The numerical tests indicated that if highly accurate modeling results were required, the high-order FEM in combination with the goal-oriented local mesh refinement required less computational time and DoFs than the lowest order adaptive FEM.
NASA Astrophysics Data System (ADS)
Frew, E.; Argrow, B. M.; Houston, A. L.; Weiss, C.
2014-12-01
The energy-aware airborne dynamic, data-driven application system (EA-DDDAS) performs persistent sampling in complex atmospheric conditions by exploiting wind energy using the dynamic data-driven application system paradigm. The main challenge for future airborne sampling missions is operation with tight integration of physical and computational resources over wireless communication networks, in complex atmospheric conditions. The physical resources considered here include sensor platforms, particularly mobile Doppler radar and unmanned aircraft, the complex conditions in which they operate, and the region of interest. Autonomous operation requires distributed computational effort connected by layered wireless communication. Onboard decision-making and coordination algorithms can be enhanced by atmospheric models that assimilate input from physics-based models and wind fields derived from multiple sources. These models are generally too complex to be run onboard the aircraft, so they need to be executed in ground vehicles in the field, and connected over broadband or other wireless links back to the field. Finally, the wind field environment drives strong interaction between the computational and physical systems, both as a challenge to autonomous path planning algorithms and as a novel energy source that can be exploited to improve system range and endurance. Implementation details of a complete EA-DDDAS will be provided, along with preliminary flight test results targeting coherent boundary-layer structures.
Expanding the user base beyond HEP for the Ganga distributed analysis user interface
NASA Astrophysics Data System (ADS)
Currie, R.; Egede, U.; Richards, A.; Slater, M.; Williams, M.
2017-10-01
This document presents the results of recent developments within the Ganga[1] project to support users from new communities outside of HEP. In particular, I will examine the case of users from the Large Synoptic Survey Telescope (LSST) group looking to use resources provided by the UK-based GridPP[2][3] DIRAC[4][5] instance. An example use case is work performed with users from the LSST Virtual Organisation (VO) to distribute the workflow used for galaxy shape identification analyses. This work highlighted some LSST-specific challenges which could be well solved by common tools within the HEP community. As a result of this work, the LSST community was able to take advantage of GridPP[2][3] resources to perform large computing tasks within the UK.
An economic model of friendship and enmity for measuring social balance in networks
NASA Astrophysics Data System (ADS)
Lee, Kyu-Min; Shin, Euncheol; You, Seungil
2017-12-01
We propose a dynamic economic model of networks where agents can be friends or enemies with one another. This is a decentralized relationship model in that agents decide whether to change their relationships so as to minimize their imbalanced triads. In this model, there is a single parameter, which we call social temperature, that captures the degree to which agents care about social balance in their relationships. We show that the global structure of relationship configuration converges to a unique stationary distribution. Using this stationary distribution, we characterize the maximum likelihood estimator of the social temperature parameter. Since the estimator is computationally challenging to calculate from real social network datasets, we provide a simple simulation algorithm and verify its performance with real social network datasets.
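The abstract does not spell out the exact update rule, so the sketch below is only a plausible illustration of sign-flip dynamics on a complete signed graph: a Metropolis-style acceptance step in which beta stands in for the inverse of the social-temperature parameter, with imbalance measured by the number of triads whose sign product is negative.

import itertools, math, random

def num_imbalanced_triads(S, n):
    # A triad (i, j, k) is imbalanced when the product of its three signs is -1.
    return sum(1 for i, j, k in itertools.combinations(range(n), 3)
               if S[i][j] * S[j][k] * S[i][k] < 0)

def step(S, n, beta, rng):
    i, j = rng.sample(range(n), 2)
    before = num_imbalanced_triads(S, n)
    S[i][j] = S[j][i] = -S[i][j]          # tentatively flip the relationship
    delta = num_imbalanced_triads(S, n) - before
    if delta > 0 and rng.random() >= math.exp(-beta * delta):
        S[i][j] = S[j][i] = -S[i][j]      # reject flips that add imbalance too often

rng = random.Random(1)
n = 8
S = [[0] * n for _ in range(n)]
for i, j in itertools.combinations(range(n), 2):
    S[i][j] = S[j][i] = rng.choice([-1, 1])   # random initial friend/enemy signs
for _ in range(5000):
    step(S, n, beta=1.5, rng=rng)
print("imbalanced triads after relaxation:", num_imbalanced_triads(S, n))

Lower beta (a higher social temperature) leaves more imbalanced triads in the long run, which is the qualitative behavior the estimator exploits.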
Authentication and Authorization of End User in Microservice Architecture
NASA Astrophysics Data System (ADS)
He, Xiuyu; Yang, Xudong
2017-10-01
As the market and business continue to expand, the traditional single monolithic architecture is facing more and more challenges. The development of cloud computing and container technology has made the microservice architecture more popular. While the low coupling, fine granularity, scalability, flexibility and independence of the microservice architecture bring convenience, the inherent complexity of the distributed system makes the security of the microservice architecture important and difficult. This paper aims to study the authentication and authorization of the end user under the microservice architecture. By comparing with traditional measures and researching existing technology, this paper puts forward a set of authentication and authorization strategies suitable for the microservice architecture, such as distributed sessions, SSO solutions, client-side JSON web tokens and JWT + API Gateway, and summarizes the advantages and disadvantages of each method.
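As a hedged illustration of the client-side JWT plus API Gateway strategy listed above, the sketch below hand-rolls an HS256-signed token with the Python standard library; the shared secret, claim names and forwarding behavior are assumptions for illustration, not details from the paper.

import base64, hashlib, hmac, json, time

SECRET = b"shared-gateway-secret"          # hypothetical signing key

def b64url(data: bytes) -> bytes:
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def issue_token(user, roles, ttl=3600):
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps({"sub": user, "roles": roles,
                                 "exp": int(time.time()) + ttl}).encode())
    sig = b64url(hmac.new(SECRET, header + b"." + payload, hashlib.sha256).digest())
    return b".".join([header, payload, sig])

def gateway_verify(token: bytes):
    # The API gateway checks the signature and expiry once, then forwards the
    # verified identity to downstream microservices (e.g. in a request header).
    header, payload, sig = token.split(b".")
    expected = b64url(hmac.new(SECRET, header + b"." + payload, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        raise PermissionError("bad signature")
    claims = json.loads(base64.urlsafe_b64decode(payload + b"=" * (-len(payload) % 4)))
    if claims["exp"] < time.time():
        raise PermissionError("token expired")
    return claims

token = issue_token("alice", ["orders:read"])
print(gateway_verify(token)["sub"])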
Integrated Hardware and Software for No-Loss Computing
NASA Technical Reports Server (NTRS)
James, Mark
2007-01-01
When an algorithm is distributed across multiple threads executing on many distinct processors, a loss of one of those threads or processors can potentially result in the total loss of all the incremental results up to that point. When the implementation is massively distributed across hardware, the probability of a hardware failure during the course of a long execution is potentially high. Traditionally, this problem has been addressed by establishing checkpoints where the current state of some or part of the execution is saved. Then in the event of a failure, this state information can be used to recompute that point in the execution and resume the computation from that point. A serious problem that arises when one distributes a problem across multiple threads and physical processors is that one increases the likelihood of the algorithm failing due to no fault of the scientist but as a result of hardware faults coupled with operating system problems. With good reason, scientists expect their computing tools to serve them and not the other way around. What is novel here is a unique combination of hardware and software that reformulates an application into a monolithic structure that can be monitored in real-time and dynamically reconfigured in the event of a failure. This unique reformulation of hardware and software will provide advanced aeronautical technologies to meet the challenges of next-generation systems in aviation, for civilian and scientific purposes, in our atmosphere and in atmospheres of other worlds. In particular, with respect to NASA's manned flight to Mars, this technology addresses the critical requirements for improving safety and increasing reliability of manned spacecraft.
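A minimal sketch of the traditional checkpoint-and-resume pattern that the abstract contrasts with its monitored, reconfigurable approach; the file name, checkpoint interval and unit of work are arbitrary placeholders.

import os, pickle

CHECKPOINT = "partial_sums.pkl"            # illustrative checkpoint file

def load_checkpoint():
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)          # (next_index, accumulated_result)
    return 0, 0.0

def save_checkpoint(i, acc):
    with open(CHECKPOINT + ".tmp", "wb") as f:
        pickle.dump((i, acc), f)
    os.replace(CHECKPOINT + ".tmp", CHECKPOINT)   # atomic rename avoids torn writes

start, acc = load_checkpoint()             # resume here after a crash
for i in range(start, 1_000_000):
    acc += i * i                           # stand-in for one unit of work
    if i % 100_000 == 0:
        save_checkpoint(i + 1, acc)        # resume point is the next iteration
save_checkpoint(1_000_000, acc)
print(acc)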
Computational imaging of light in flight
NASA Astrophysics Data System (ADS)
Hullin, Matthias B.
2014-10-01
Many computer vision tasks are hindered by image formation itself, a process that is governed by the so-called plenoptic integral. By averaging light falling into the lens over space, angle, wavelength and time, a great deal of information is irreversibly lost. The emerging idea of transient imaging operates on a time resolution fast enough to resolve non-stationary light distributions in real-world scenes. It enables the discrimination of light contributions by the optical path length from light source to receiver, a dimension unavailable in mainstream imaging to date. Until recently, such measurements used to require high-end optical equipment and could only be acquired under extremely restricted lab conditions. To address this challenge, we introduced a family of computational imaging techniques operating on standard time-of-flight image sensors, for the first time allowing the user to "film" light in flight in an affordable, practical and portable way. Just as impulse responses have proven a valuable tool in almost every branch of science and engineering, we expect light-in-flight analysis to impact a wide variety of applications in computer vision and beyond.
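One schematic way to write the plenoptic integral referred to above (the notation is chosen here, not taken from the paper): a pixel value is the light field L averaged over aperture area A, solid angle Omega, wavelength range Lambda and exposure interval T, weighted by a sensor spectral response s,

\[
  I(\mathbf{x}) \;=\; \int_{T}\!\int_{\Lambda}\!\int_{\Omega}\!\int_{A}
  L(\mathbf{p},\boldsymbol{\omega},\lambda,t)\, s(\lambda)\,
  \mathrm{d}A\,\mathrm{d}\boldsymbol{\omega}\,\mathrm{d}\lambda\,\mathrm{d}t ,
\]

and transient imaging amounts to retaining (part of) the temporal dimension instead of integrating it away.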
Randomized Dynamic Mode Decomposition
NASA Astrophysics Data System (ADS)
Erichson, N. Benjamin; Brunton, Steven L.; Kutz, J. Nathan
2017-11-01
The dynamic mode decomposition (DMD) is an equation-free, data-driven matrix decomposition that is capable of providing accurate reconstructions of spatio-temporal coherent structures arising in dynamical systems. We present randomized algorithms to compute the near-optimal low-rank dynamic mode decomposition for massive datasets. Randomized algorithms are simple, accurate and able to ease the computational challenges arising with `big data'. Moreover, randomized algorithms are amenable to modern parallel and distributed computing. The idea is to derive a smaller matrix from the high-dimensional input data matrix using randomness as a computational strategy. Then, the dynamic modes and eigenvalues are accurately learned from this smaller representation of the data, whereby the approximation quality can be controlled via oversampling and power iterations. Here, we present randomized DMD algorithms that are categorized by how many passes the algorithm takes through the data. Specifically, the single-pass randomized DMD does not require data to be stored for subsequent passes. Thus, it is possible to approximately decompose massive fluid flows (stored out of core memory, or not stored at all) using single-pass algorithms, which is infeasible with traditional DMD algorithms.
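A compact sketch of a randomized DMD in the spirit described above (a generic range-finder variant, not necessarily the authors' exact single-pass algorithm): sketch the snapshot matrix with a random test matrix, optionally sharpen the sketch with power iterations, and run exact DMD on the projected data.

import numpy as np

def randomized_dmd(X, rank, oversample=10, power_iters=2, seed=0):
    rng = np.random.default_rng(seed)
    m, n = X.shape
    k = rank + oversample

    # Randomized range finder: sketch the column space of X.
    Y = X @ rng.standard_normal((n, k))
    for _ in range(power_iters):          # power iterations sharpen the sketch
        Y = X @ (X.T @ Y)
    Q, _ = np.linalg.qr(Y)

    # Project the snapshots onto the low-dimensional subspace.
    B = Q.T @ X
    B1, B2 = B[:, :-1], B[:, 1:]

    # Exact DMD on the small projected matrices.
    U, s, Vh = np.linalg.svd(B1, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    Atilde = U.conj().T @ B2 @ Vh.conj().T @ np.diag(1.0 / s)
    eigvals, W = np.linalg.eig(Atilde)

    # Lift the modes back to the full state space.
    modes = Q @ (B2 @ Vh.conj().T @ np.diag(1.0 / s) @ W)
    return eigvals, modes

# Toy usage: a travelling wave sampled on 400 points over 200 time steps.
x = np.linspace(0, 2 * np.pi, 400)[:, None]
t = np.linspace(0, 4 * np.pi, 200)[None, :]
X = np.sin(x - 0.5 * t) + 0.2 * np.cos(3 * x + t)
eigvals, modes = randomized_dmd(X, rank=4)
print(np.abs(eigvals))                    # oscillatory modes sit near the unit circle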
Astronomy In The Cloud: Using Mapreduce For Image Coaddition
NASA Astrophysics Data System (ADS)
Wiley, Keith; Connolly, A.; Gardner, J.; Krughoff, S.; Balazinska, M.; Howe, B.; Kwon, Y.; Bu, Y.
2011-01-01
In the coming decade, astronomical surveys of the sky will generate tens of terabytes of images and detect hundreds of millions of sources every night. The study of these sources will involve computational challenges such as anomaly detection, classification, and moving object tracking. Since such studies require the highest quality data, methods such as image coaddition, i.e., registration, stacking, and mosaicing, will be critical to scientific investigation. With a requirement that these images be analyzed on a nightly basis to identify moving sources, e.g., asteroids, or transient objects, e.g., supernovae, these datastreams present many computational challenges. Given the quantity of data involved, the computational load of these problems can only be addressed by distributing the workload over a large number of nodes. However, the high data throughput demanded by these applications may present scalability challenges for certain storage architectures. One scalable data-processing method that has emerged in recent years is MapReduce, and in this paper we focus on its popular open-source implementation called Hadoop. In the Hadoop framework, the data is partitioned among storage attached directly to worker nodes, and the processing workload is scheduled in parallel on the nodes that contain the required input data. A further motivation for using Hadoop is that it allows us to exploit cloud computing resources, i.e., platforms where Hadoop is offered as a service. We report on our experience implementing a scalable image-processing pipeline for the SDSS imaging database using Hadoop. This multi-terabyte imaging dataset provides a good testbed for algorithm development since its scope and structure approximate future surveys. First, we describe MapReduce and how we adapted image coaddition to the MapReduce framework. Then we describe a number of optimizations to our basic approach and report experimental results comparing their performance. This work is funded by the NSF and by NASA.
Astronomy in the Cloud: Using MapReduce for Image Co-Addition
NASA Astrophysics Data System (ADS)
Wiley, K.; Connolly, A.; Gardner, J.; Krughoff, S.; Balazinska, M.; Howe, B.; Kwon, Y.; Bu, Y.
2011-03-01
In the coming decade, astronomical surveys of the sky will generate tens of terabytes of images and detect hundreds of millions of sources every night. The study of these sources will involve computation challenges such as anomaly detection and classification and moving-object tracking. Since such studies benefit from the highest-quality data, methods such as image co-addition, i.e., astrometric registration followed by per-pixel summation, will be a critical preprocessing step prior to scientific investigation. With a requirement that these images be analyzed on a nightly basis to identify moving sources such as potentially hazardous asteroids or transient objects such as supernovae, these data streams present many computational challenges. Given the quantity of data involved, the computational load of these problems can only be addressed by distributing the workload over a large number of nodes. However, the high data throughput demanded by these applications may present scalability challenges for certain storage architectures. One scalable data-processing method that has emerged in recent years is MapReduce, and in this article we focus on its popular open-source implementation called Hadoop. In the Hadoop framework, the data are partitioned among storage attached directly to worker nodes, and the processing workload is scheduled in parallel on the nodes that contain the required input data. A further motivation for using Hadoop is that it allows us to exploit cloud computing resources: i.e., platforms where Hadoop is offered as a service. We report on our experience of implementing a scalable image-processing pipeline for the SDSS imaging database using Hadoop. This multiterabyte imaging data set provides a good testbed for algorithm development, since its scope and structure approximate future surveys. First, we describe MapReduce and how we adapted image co-addition to the MapReduce framework. Then we describe a number of optimizations to our basic approach and report experimental results comparing their performance.
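A schematic Hadoop-Streaming-style mapper/reducer pair in the spirit of the co-addition approach above; the record layout, keying on an output sky pixel and the simple weighted average are illustrative assumptions, not the pipeline's actual implementation.

import sys

def mapper(lines):
    # Each input record is assumed to be: image_id sky_x sky_y flux weight
    for line in lines:
        image_id, x, y, flux, weight = line.split()
        # Key on the output sky pixel so all contributions meet in one reducer.
        print(f"{x},{y}\t{flux}\t{weight}")

def reducer(lines):
    # Hadoop Streaming delivers the mapper output sorted by key.
    current_key, flux_sum, weight_sum = None, 0.0, 0.0
    for line in lines:
        key, flux, weight = line.rstrip("\n").split("\t")
        if key != current_key and current_key is not None:
            print(f"{current_key}\t{flux_sum / weight_sum}")
            flux_sum = weight_sum = 0.0
        current_key = key
        flux_sum += float(flux) * float(weight)
        weight_sum += float(weight)
    if current_key is not None:
        print(f"{current_key}\t{flux_sum / weight_sum}")

if __name__ == "__main__":
    # Run as the mapper or reducer of a Hadoop Streaming job:
    # "coadd.py map" for the map phase, "coadd.py reduce" for the reduce phase.
    (mapper if sys.argv[1] == "map" else reducer)(sys.stdin)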
NREL, SolarCity Addressing Challenges of High Penetrations of Distributed Photovoltaics
NREL is working with SolarCity to address the interconnection, reliability, and stability challenges of high penetrations of distributed photovoltaics (PV), demonstrating to utilities across the country that distributed solar is not a liability for reliability and can even be an asset.
NASA Technical Reports Server (NTRS)
Rastaetter, Lutz; Kuznetsova, Maria; Hesse, Michael; Chulaki, Anna; Pulkkinen, Antti; Ridley, Aaron J.; Gombosi, Tamas; Vapirev, Alexander; Raeder, Joachim; Wiltberger, Michael James;
2010-01-01
The GEM 2008 modeling challenge efforts are expanding beyond comparing in-situ measurements in the magnetosphere and ionosphere to include the computation of indices to be compared. The Dst index measures the largest deviations of the horizontal magnetic field at 4 equatorial magnetometers from the quiet-time background field and is commonly used to track the strength of the magnetic disturbance of the magnetosphere during storms. Models can calculate a proxy Dst index in various ways, including using the Dessler-Parker-Sckopke relation and the energy of the ring current and Biot-Savart integration of electric currents in the magnetosphere. The GEM modeling challenge investigates 4 space weather events and we compare models available at CCMC against each other and the observed values of Dst. Models used include SWMF/BATSRUS, OpenGGCM, LFM, GUMICS (3D magnetosphere MHD models), Fok-RC, CRCM, RAM-SCB (kinetic drift models of the ring current), WINDMI (magnetosphere-ionosphere electric circuit model), and predictions based on an impulse response function (IRF) model and analytic coupling functions with inputs of solar wind data. In addition to the analysis of model-observation comparisons we look at the way Dst is computed in global magnetosphere models. The default value of Dst computed by the SWMF model is for Bz at the Earth's center. In addition to this, we present results obtained at different locations on the Earth's surface. We choose equatorial locations at local noon, dusk (18:00 hours), midnight and dawn (6:00 hours). The different virtual observatory locations reveal the variation around the Earth-centered Dst value resulting from the distribution of electric currents in the magnetosphere during different phases of a storm.
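For reference, the Dessler-Parker-Sckopke relation mentioned above is commonly quoted in the form

\[
  \Delta B(0) \;\simeq\; -\,\frac{\mu_0\,E_{RC}}{2\pi\,B_0\,R_E^{3}}
  \qquad\Longleftrightarrow\qquad
  \frac{\Delta B(0)}{B_0} \;\simeq\; -\,\frac{2\,E_{RC}}{3\,E_{\mathrm{dip}}},
  \qquad E_{\mathrm{dip}}=\frac{4\pi\,B_0^{2}\,R_E^{3}}{3\,\mu_0},
\]

where ΔB(0) is the ring-current depression of the equatorial surface field (the quantity a Dst proxy tracks), E_RC is the total ring-current particle energy, B_0 is the equatorial surface dipole field, and R_E is the Earth radius; numerically this corresponds to roughly -4 x 10^{-30} nT per keV of ring-current energy.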
Adaptive Core Simulation Employing Discrete Inverse Theory - Part I: Theory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abdel-Khalik, Hany S.; Turinsky, Paul J.
2005-07-15
Use of adaptive simulation is intended to improve the fidelity and robustness of important core attribute predictions such as core power distribution, thermal margins, and core reactivity. Adaptive simulation utilizes a selected set of past and current reactor measurements of reactor observables, i.e., in-core instrumentation readings, to adapt the simulation in a meaningful way. A meaningful adaption will result in high-fidelity and robust adapted core simulator models. To perform adaption, we propose an inverse theory approach in which the multitudes of input data to core simulators, i.e., reactor physics and thermal-hydraulic data, are to be adjusted to improve agreement with measured observables while keeping core simulator models unadapted. At first glance, devising such adaption for typical core simulators with millions of input and observables data would spawn not only several prohibitive challenges but also numerous disparaging concerns. The challenges include the computational burdens of the sensitivity-type calculations required to construct Jacobian operators for the core simulator models. Also, the computational burdens of the uncertainty-type calculations required to estimate the uncertainty information of core simulator input data present a demanding challenge. The concerns however are mainly related to the reliability of the adjusted input data. The methodologies of adaptive simulation are well established in the literature of data adjustment. We adopt the same general framework for data adjustment; however, we refrain from solving the fundamental adjustment equations in a conventional manner. We demonstrate the use of our so-called Efficient Subspace Methods (ESMs) to overcome the computational and storage burdens associated with the core adaption problem. We illustrate the successful use of ESM-based adaptive techniques for a typical boiling water reactor core simulator adaption problem.
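The generic data-adjustment problem underlying this kind of adaption, written here in standard generalized-least-squares form (not necessarily the exact ESM formulation of the paper), is

\[
  \hat{\mathbf p}=\arg\min_{\mathbf p}\;
  (\mathbf p-\mathbf p_0)^{\mathsf T}\mathbf C_p^{-1}(\mathbf p-\mathbf p_0)
  +\bigl(\mathbf d-\mathbf m(\mathbf p)\bigr)^{\mathsf T}\mathbf C_d^{-1}\bigl(\mathbf d-\mathbf m(\mathbf p)\bigr),
\]

with the linearized solution, using the sensitivity (Jacobian) operator \( \mathbf S = \partial\mathbf m/\partial\mathbf p \),

\[
  \hat{\mathbf p}\;\approx\;\mathbf p_0+\mathbf C_p\mathbf S^{\mathsf T}
  \bigl(\mathbf S\mathbf C_p\mathbf S^{\mathsf T}+\mathbf C_d\bigr)^{-1}
  \bigl(\mathbf d-\mathbf m(\mathbf p_0)\bigr),
\]

which makes explicit why constructing S and the covariances C_p and C_d dominates the computational and storage burden when there are millions of inputs and observables.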
NASA Astrophysics Data System (ADS)
Fritze, Matthew D.
Fluid-structure interaction (FSI) modeling of spacecraft parachutes involves a number of computational challenges. The canopy complexity created by the hundreds of gaps and slits and design-related modification of that geometric porosity by removal of some of the sails and panels are among the formidable challenges. Disreefing from one stage to another when the parachute is used in multiple stages is another formidable challenge. This thesis addresses the computational challenges involved in disreefing of spacecraft parachutes and fully-open and reefed stages of the parachutes with modified geometric porosity. The special techniques developed to address these challenges are described and the FSI computations are reported. The thesis also addresses the modeling and computational challenges involved in very early stages, where the sudden separation of a cover jettisoned into the spacecraft wake needs to be modeled. Higher-order temporal representations used in modeling the separation motion are described, and the computed separation and wake-induced forces acting on the cover are reported.
Space Object Collision Probability via Monte Carlo on the Graphics Processing Unit
NASA Astrophysics Data System (ADS)
Vittaldev, Vivek; Russell, Ryan P.
2017-09-01
Fast and accurate collision probability computations are essential for protecting space assets. Monte Carlo (MC) simulation is the most accurate but computationally intensive method. A Graphics Processing Unit (GPU) is used to parallelize the computation and reduce the overall runtime. Using MC techniques to compute the collision probability is common in the literature as the benchmark. An optimized implementation on the GPU, however, is a challenging problem and is the main focus of the current work. The MC simulation takes samples from the uncertainty distributions of the Resident Space Objects (RSOs) at any time during a time window of interest and outputs the separations at closest approach. Therefore, any uncertainty propagation method may be used and the collision probability is automatically computed as a function of RSO collision radii. Integration using a fixed time step and a quartic interpolation after every Runge-Kutta step ensures that no close approaches are missed. Two orders of magnitude speedups over a serial CPU implementation are shown, and speedups improve moderately with higher fidelity dynamics. The tool makes the MC approach tractable on a single workstation, and can be used as a final product, or for verifying surrogate and analytical collision probability methods.
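An illustrative, vectorized Monte Carlo collision-probability kernel; NumPy stands in for the GPU, straight-line relative motion replaces the full propagation-plus-interpolation scheme described above, and all numbers are made up.

import numpy as np

rng = np.random.default_rng(42)

def collision_probability(mu_rel, cov_rel, vel_rel, combined_radius,
                          t_window=600.0, n_samples=1_000_000):
    # Sample the relative position at the window start; keep the relative
    # velocity fixed (a simplifying assumption for this sketch).
    r0 = rng.multivariate_normal(mu_rel, cov_rel, size=n_samples)   # (N, 3)
    v2 = vel_rel @ vel_rel
    # Closest-approach time for straight-line relative motion, clipped to the window.
    t_star = np.clip(-(r0 @ vel_rel) / v2, 0.0, t_window)
    d_min = np.linalg.norm(r0 + t_star[:, None] * vel_rel, axis=1)
    # Collision probability as a function of the combined collision radius.
    return np.mean(d_min <= combined_radius)

mu_rel = np.array([150.0, 40.0, 0.0])        # metres
cov_rel = np.diag([80.0, 80.0, 30.0]) ** 2
vel_rel = np.array([-7000.0, 0.0, 0.0])      # m/s, head-on geometry
print(collision_probability(mu_rel, cov_rel, vel_rel, combined_radius=20.0))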
Storage and distribution of pathology digital images using integrated web-based viewing systems.
Marchevsky, Alberto M; Dulbandzhyan, Ronda; Seely, Kevin; Carey, Steve; Duncan, Raymond G
2002-05-01
Health care providers have expressed increasing interest in incorporating digital images of gross pathology specimens and photomicrographs in routine pathology reports. Our objective is to describe the multiple technical and logistical challenges involved in integrating the various components needed to develop a system for integrated Web-based viewing, storage, and distribution of digital images in a large health system. An Oracle version 8.1.6 database was developed to store, index, and deploy pathology digital photographs via our Intranet. The database allows for retrieval of images by patient demographics or by SNOMED code information. The setting is the Intranet of a large health system, accessible from multiple computers located within the medical center and at distant private physician offices. The images can be viewed using any of the workstations of the health system that have authorized access to our Intranet, using a standard browser or a browser configured with an external viewer or inexpensive plug-in software, such as Prizm 2.0. The images can be printed on paper or transferred to film using a digital film recorder. Digital images can also be displayed at pathology conferences by using wireless local area network (LAN) and secure remote technologies. The standardization of technologies and the adoption of a Web interface for all our computer systems allows us to distribute digital images from a pathology database to a potentially large group of users distributed in multiple locations throughout a large medical center.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blanford, M.
1997-12-31
Most commercially-available quasistatic finite element programs assemble element stiffnesses into a global stiffness matrix, then use a direct linear equation solver to obtain nodal displacements. However, for large problems (greater than a few hundred thousand degrees of freedom), the memory size and computation time required for this approach becomes prohibitive. Moreover, direct solution does not lend itself to the parallel processing needed for today's multiprocessor systems. This talk gives an overview of the iterative solution strategy of JAS3D, the nonlinear large-deformation quasistatic finite element program. Because its architecture is derived from an explicit transient-dynamics code, it does not ever assemble a global stiffness matrix. The author describes the approach he used to implement the solver on multiprocessor computers, and shows examples of problems run on hundreds of processors and more than a million degrees of freedom. Finally, he describes some of the work he is presently doing to address the challenges of iterative convergence for ill-conditioned problems.
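A toy sketch of the matrix-free idea on a 1D bar: element stiffness contributions are applied on the fly inside a conjugate-gradient iteration, so no global stiffness matrix is ever assembled (a generic illustration, not JAS3D's solver).

import numpy as np

n_elems = 100
k_elem = np.array([[1.0, -1.0], [-1.0, 1.0]])     # unit element stiffness

def apply_K(u):
    # y = K u accumulated element by element; node 0 is fixed (Dirichlet),
    # handled by keeping its residual component at zero.
    y = np.zeros_like(u)
    for e in range(n_elems):
        dofs = [e, e + 1]
        y[dofs] += k_elem @ u[dofs]
    y[0] = u[0]
    return y

def conjugate_gradient(apply_A, b, tol=1e-10, max_iter=1000):
    x = np.zeros_like(b)
    r = b - apply_A(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = apply_A(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

b = np.zeros(n_elems + 1)
b[-1] = 1.0                                        # unit end load
u = conjugate_gradient(apply_K, b)
print(u[-1])                                       # approx. n_elems for unit stiffness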
Bringing Federated Identity to Grid Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Teheran, Jeny
The Fermi National Accelerator Laboratory (FNAL) is facing the challenge of providing scientific data access and grid submission to scientific collaborations that span the globe but are hosted at FNAL. Users in these collaborations are currently required to register as an FNAL user and obtain FNAL credentials to access grid resources to perform their scientific computations. These requirements burden researchers with managing additional authentication credentials, and put additional load on FNAL for managing user identities. Our design integrates the existing InCommon federated identity infrastructure, CILogon Basic CA, and MyProxy with the FNAL grid submission system to provide secure access for users from diverse experiments and collaborations without requiring each user to have authentication credentials from FNAL. The design automates the handling of certificates so users do not need to manage them manually. Although the initial implementation is for FNAL's grid submission system, the design and the core of the implementation are general and could be applied to other distributed computing systems.
Recycling potential of neodymium: the case of computer hard disk drives.
Sprecher, Benjamin; Kleijn, Rene; Kramer, Gert Jan
2014-08-19
Neodymium, one of the more critically scarce rare earth metals, is often used in sustainable technologies. In this study, we investigate the potential contribution of neodymium recycling to reducing scarcity in supply, with a case study on computer hard disk drives (HDDs). We first review the literature on neodymium production and recycling potential. From this review, we find that recycling of computer HDDs is currently the most feasible pathway toward large-scale recycling of neodymium, even though HDDs do not represent the largest application of neodymium. We then use a combination of dynamic modeling and empirical experiments to conclude that within the application of NdFeB magnets for HDDs, the potential for loop-closing is significant: up to 57% in 2017. However, compared to the total NdFeB production capacity, the recovery potential from HDDs is relatively small (in the 1-3% range). The distributed nature of neodymium poses a significant challenge for recycling of neodymium.
Why Waveform Correlation Sometimes Fails
NASA Astrophysics Data System (ADS)
Carmichael, J.
2015-12-01
Waveform correlation detectors used in explosion monitoring scan noisy geophysical data to test two competing hypotheses: either (1) an amplitude-scaled version of a template waveform is present, or, (2) no signal is present at all. In reality, geophysical wavefields that are monitored for explosion signatures include waveforms produced by non-target sources that are partially correlated with the waveform template. Such signals can falsely trigger correlation detectors, particularly at low thresholds required to monitor for smaller target explosions. This challenge is particularly formidable when monitoring known test sites for seismic disturbances, since uncatalogued natural seismicity is (generally) more prevalent at lower magnitudes, and could be mistaken for small explosions. To address these challenges, we identify real examples in which correlation detectors targeting explosions falsely trigger on both site-proximal earthquakes (Figure 1, below) and microseismic "noise". Motivated by these examples, we quantify performance loss when applying these detectors, and re-evaluate the correlation-detector's hypothesis test. We thereby derive new detectors from more general hypotheses that admit unknown background seismicity, and apply these to real data. From our treatment, we derive "rules of thumb'' for proper template and threshold selection in heavily cluttered signal environments. Last, we answer the question "what is the probability of falsely detecting an earthquake collocated at a test site?", using correlation detectors that include explosion-triggered templates. Figure Top: An eight-channel data stream (black) recorded from an earthquake near a mine. Red markers indicate a detection. Middle: The correlation statistic computed by scanning the template against the data stream at top. The red line indicates the threshold for event declaration, determined by a false-alarm on noise probability constraint, as computed from the signal-absent distribution using the Neyman Pearson criteria. Bottom: The histogram of the correlation statistic time series (gray) superimposed on the theoretical null distribution (black curve). The line shows the threshold, consistent with a right-tail probability, computed from the black curve.
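A minimal sketch of the correlation-detector mechanics discussed above: a normalized template correlation is scanned along a synthetic trace and thresholded. Here the threshold is taken empirically from a noise-only segment, as a stand-in for the Neyman-Pearson threshold computed from the signal-absent distribution; the template, trace and injection are synthetic.

import numpy as np

def correlation_statistic(trace, template):
    n = len(template)
    t = template - template.mean()
    t /= np.linalg.norm(t)
    stats = np.empty(len(trace) - n + 1)
    for i in range(len(stats)):
        w = trace[i:i + n] - trace[i:i + n].mean()
        denom = np.linalg.norm(w)
        stats[i] = (w @ t) / denom if denom > 0 else 0.0
    return stats

rng = np.random.default_rng(3)
template = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 100)) * np.hanning(100)
trace = rng.standard_normal(5000)
trace[3200:3300] += 3.0 * template            # buried, amplitude-scaled copy

cc = correlation_statistic(trace, template)
# Empirical threshold for a 1e-3 false-alarm rate on the noise-only portion.
threshold = np.quantile(cc[:3000], 1 - 1e-3)
detections = np.flatnonzero(cc > threshold)
print("detections near sample:", detections[:5])

Lowering the threshold to catch smaller target events raises the false-alarm rate on partially correlated clutter, which is exactly the trade-off the abstract examines.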
NASA Astrophysics Data System (ADS)
Chung, Kee-Choo; Park, Hwangseo
2016-11-01
The performance of the extended solvent-contact model has been addressed in the SAMPL5 blind prediction challenge for distribution coefficient (LogD) of drug-like molecules with respect to the cyclohexane/water partitioning system. All the atomic parameters defined for 41 atom types in the solvation free energy function were optimized by operating a standard genetic algorithm with respect to water and cyclohexane solvents. In the parameterizations for cyclohexane, the experimental solvation free energy (ΔGsol) data of 15 molecules for 1-octanol were combined with those of 77 molecules for cyclohexane to construct a training set because ΔGsol values of the former were unavailable for cyclohexane in publicly accessible databases. Using this hybrid training set, we established the LogD prediction model with the correlation coefficient (R), average error (AE), and root mean square error (RMSE) of 0.55, 1.53, and 3.03, respectively, for the comparison of experimental and computational results for 53 SAMPL5 molecules. The modest accuracy in LogD prediction could be attributed to the incomplete optimization of atomic solvation parameters for cyclohexane. With respect to 31 SAMPL5 molecules containing the atom types for which experimental reference data for ΔGsol were available for both water and cyclohexane, the accuracy in LogD prediction increased remarkably with the R, AE, and RMSE values of 0.82, 0.89, and 1.60, respectively. This significant enhancement in performance stemmed from the better optimization of atomic solvation parameters by limiting the elements of the training set to the molecules with experimental ΔGsol data for cyclohexane. Due to the simplicity in model building and to low computational cost for parameterizations, the extended solvent-contact model is anticipated to serve as a valuable computational tool for LogD prediction upon the enrichment of experimental ΔGsol data for organic solvents.
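For context, the predicted quantity follows from the two solvation free energies through the transfer free energy; in the usual approximation where the cyclohexane/water distribution coefficient is taken as the partition coefficient of the neutral species,

\[
  \log D_{\mathrm{chx/w}} \;\approx\; \log P_{\mathrm{chx/w}}
  \;=\; \frac{\Delta G_{\mathrm{sol}}^{\mathrm{water}} - \Delta G_{\mathrm{sol}}^{\mathrm{cyclohexane}}}{2.303\,RT},
\]

so systematic errors in the cyclohexane ΔGsol parameters propagate directly into the predicted LogD values.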
NASA Technical Reports Server (NTRS)
Talbot, Bryan; Zhou, Shu-Jia; Higgins, Glenn; Zukor, Dorothy (Technical Monitor)
2002-01-01
One of the most significant challenges in large-scale climate modeling, as well as in high-performance computing in other scientific fields, is that of effectively integrating many software models from multiple contributors. A software framework facilitates the integration task, both in the development and runtime stages of the simulation. Effective software frameworks reduce the programming burden for the investigators, freeing them to focus more on the science and less on the parallel communication implementation, while maintaining high performance across numerous supercomputer and workstation architectures. This document surveys numerous software frameworks for potential use in Earth science modeling. Several frameworks are evaluated in depth, including Parallel Object-Oriented Methods and Applications (POOMA), Cactus (from the relativistic physics community), Overture, Goddard Earth Modeling System (GEMS), the National Center for Atmospheric Research Flux Coupler, and UCLA/UCB Distributed Data Broker (DDB). Frameworks evaluated in less detail include ROOT, Parallel Application Workspace (PAWS), and Advanced Large-Scale Integrated Computational Environment (ALICE). A host of other frameworks and related tools are referenced in this context. The frameworks are evaluated individually and also compared with each other.
Orion EFT-1 Cavity Heating Tile Experiments and Environment Reconstruction
NASA Technical Reports Server (NTRS)
Salazar, Giovanni; Amar, Adam; Oliver, Brandon; Hyatt, Andrew; Rezin, Marc
2016-01-01
Developing aerothermodynamic environments for deep cavities, such as those produced by micrometeoroids and orbital debris impacts, poses a great challenge for engineers. In order to assess existing cavity heating models, two one-inch diameter cavities were flown on the Orion Multi-Purpose Crew Vehicle during Exploration Flight Test 1 (EFT1). These cavities were manufactured with depths of 1.0 in and 1.4 in, and they were both instrumented. Instrumentation included surface thermocouples upstream, downstream and within the cavities, and additional thermocouples at the TPS-structure interface. This paper will present the data obtained, and comparisons with computational predictions will be shown. Additionally, the development of a 3D material thermal model will be described, which will be used to account for the three-dimensionality of the problem when interpreting the data. Furthermore, using a multi-dimensional inverse heat conduction approach, a reconstruction of a time- and space-dependent flight heating distribution during EFT1 will be presented. Additional discussions will focus on instrumentation challenges and calibration techniques specific to these experiments. The analysis shown will highlight the accuracies and/or deficiencies of current computational techniques to model cavity flows during hypersonic re-entry.
NASA Technical Reports Server (NTRS)
Biegel, Bryan A.
1999-01-01
We are on the path to meet the major challenges ahead for TCAD (technology computer aided design). The emerging computational grid will ultimately solve the challenge of limited computational power. The Modular TCAD Framework will solve the TCAD software challenge once TCAD software developers realize that there is no other way to meet industry's needs. The modular TCAD framework (MTF) also provides the ideal platform for solving the TCAD model challenge by rapid implementation of models in a partial differential solver.
A Web-based Distributed Voluntary Computing Platform for Large Scale Hydrological Computations
NASA Astrophysics Data System (ADS)
Demir, I.; Agliamzanov, R.
2014-12-01
Distributed volunteer computing can enable researchers and scientists to form large parallel computing environments that utilize the computing power of the millions of computers on the Internet, and use them towards running large-scale environmental simulations and models to serve the common good of local communities and the world. Recent developments in web technologies and standards allow client-side scripting languages to run at speeds close to native applications, and utilize the power of Graphics Processing Units (GPU). Using a client-side scripting language like JavaScript, we have developed an open distributed computing framework that makes it easy for researchers to write their own hydrologic models, and run them on volunteer computers. Users can easily enable their websites so that visitors can volunteer their computer resources to help run advanced hydrological models and simulations. Using a web-based system allows users to start volunteering their computational resources within seconds without installing any software. The framework distributes the model simulation to thousands of nodes as small spatial and computational work units. A relational database system is utilized for managing data connections and queue management for the distributed computing nodes. In this paper, we present a web-based distributed volunteer computing platform to enable large-scale hydrological simulations and model runs in an open and integrated environment.
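A minimal sketch of the relational task-queue idea described above, using SQLite for illustration; the table layout, lease timeout and work-unit naming are assumptions, not the platform's actual schema.

import sqlite3, time

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE tasks (
    id INTEGER PRIMARY KEY, cell TEXT, status TEXT DEFAULT 'pending',
    leased_at REAL, result REAL)""")
db.executemany("INSERT INTO tasks (cell) VALUES (?)",
               [(f"subbasin-{i}",) for i in range(100)])

def lease_task():
    # Hand one pending work unit to a volunteer node.
    row = db.execute("SELECT id, cell FROM tasks WHERE status='pending' LIMIT 1").fetchone()
    if row:
        db.execute("UPDATE tasks SET status='leased', leased_at=? WHERE id=?",
                   (time.time(), row[0]))
    return row

def submit_result(task_id, value):
    db.execute("UPDATE tasks SET status='done', result=? WHERE id=?", (value, task_id))

def requeue_stale(timeout=300.0):
    # Volunteer browsers may disappear at any time; re-issue expired leases.
    db.execute("UPDATE tasks SET status='pending' WHERE status='leased' AND leased_at<?",
               (time.time() - timeout,))

task = lease_task()
submit_result(task[0], 42.0)             # a volunteer returns its partial result
requeue_stale()
print(db.execute("SELECT COUNT(*) FROM tasks WHERE status='done'").fetchone())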
Exploring the challenges faced by polytechnic students
NASA Astrophysics Data System (ADS)
Matore, Mohd Effendi @ Ewan Mohd; Khairani, Ahmad Zamri
2015-02-01
This study aims to identify challenges beyond those already documented for students in seven polytechnics in Malaysia, as a continuation of previous research that had identified 52 main challenges faced by students using the Rasch Model. The explorative study focuses on the challenges that are not included in the Mooney Problem Checklist (MPCL). A total of 121 polytechnic students submitted 183 written responses through the open questions provided. Two hundred and fifty-two students responded, from a student's perspective, to the dichotomous questions regarding their view of the challenges faced. The data were analysed qualitatively using NVivo 8.0. The findings showed that students from Politeknik Seberang Perai (PSP) gave the highest response, which was 56 (30.6%), and Politeknik Metro Kuala Lumpur (PMKL) had the lowest response of 2 (1.09%). Five dominant challenges were identified: the English language (32, 17.5%), learning (14, 7.7%), vehicles (13, 7.1%), information and communication technology (ICT) (13, 7.1%), and peers (11, 6.0%). This article, however, focuses on three apparent challenges, namely the English language, vehicles, and computers and ICT, as the challenges of learning and peers had been analysed in the previous MPCL work. The English language challenge that was raised concerned weaknesses in command of speech and fluency. The computer and ICT challenge covered weaknesses in mastering ICT and computers, as well as computer breakdowns and low-performance computers. The challenge of vehicles emphasized the unavailability of vehicles to attend lectures and go elsewhere, the lack of a transportation service in the polytechnic, and not having a valid driving license. These challenges are very relevant and need to be discussed in an effort to prepare polytechnics for their transformation process.
Spiking network simulation code for petascale computers.
Kunkel, Susanne; Schmidt, Maximilian; Eppler, Jochen M; Plesser, Hans E; Masumoto, Gen; Igarashi, Jun; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus; Helias, Moritz
2014-01-01
Brain-scale networks exhibit a breathtaking heterogeneity in the dynamical properties and parameters of their constituents. At cellular resolution, the entities of theory are neurons and synapses and over the past decade researchers have learned to manage the heterogeneity of neurons and synapses with efficient data structures. Already early parallel simulation codes stored synapses in a distributed fashion such that a synapse solely consumes memory on the compute node harboring the target neuron. As petaflop computers with some 100,000 nodes become increasingly available for neuroscience, new challenges arise for neuronal network simulation software: Each neuron contacts on the order of 10,000 other neurons and thus has targets only on a fraction of all compute nodes; furthermore, for any given source neuron, at most a single synapse is typically created on any compute node. From the viewpoint of an individual compute node, the heterogeneity in the synaptic target lists thus collapses along two dimensions: the dimension of the types of synapses and the dimension of the number of synapses of a given type. Here we present a data structure taking advantage of this double collapse using metaprogramming techniques. After introducing the relevant scaling scenario for brain-scale simulations, we quantitatively discuss the performance on two supercomputers. We show that the novel architecture scales to the largest petascale supercomputers available today.
Spiking network simulation code for petascale computers
Kunkel, Susanne; Schmidt, Maximilian; Eppler, Jochen M.; Plesser, Hans E.; Masumoto, Gen; Igarashi, Jun; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus; Helias, Moritz
2014-01-01
Brain-scale networks exhibit a breathtaking heterogeneity in the dynamical properties and parameters of their constituents. At cellular resolution, the entities of theory are neurons and synapses and over the past decade researchers have learned to manage the heterogeneity of neurons and synapses with efficient data structures. Already early parallel simulation codes stored synapses in a distributed fashion such that a synapse solely consumes memory on the compute node harboring the target neuron. As petaflop computers with some 100,000 nodes become increasingly available for neuroscience, new challenges arise for neuronal network simulation software: Each neuron contacts on the order of 10,000 other neurons and thus has targets only on a fraction of all compute nodes; furthermore, for any given source neuron, at most a single synapse is typically created on any compute node. From the viewpoint of an individual compute node, the heterogeneity in the synaptic target lists thus collapses along two dimensions: the dimension of the types of synapses and the dimension of the number of synapses of a given type. Here we present a data structure taking advantage of this double collapse using metaprogramming techniques. After introducing the relevant scaling scenario for brain-scale simulations, we quantitatively discuss the performance on two supercomputers. We show that the novel architecture scales to the largest petascale supercomputers available today. PMID:25346682
Assembling Large, Multi-Sensor Climate Datasets Using the SciFlo Grid Workflow System
NASA Astrophysics Data System (ADS)
Wilson, B. D.; Manipon, G.; Xing, Z.; Fetzer, E.
2008-12-01
NASA's Earth Observing System (EOS) is the world's most ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the A-Train platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over periods of years to decades. However, moving from predominantly single-instrument studies to a multi-sensor, measurement-based model for long-duration analysis of important climate variables presents serious challenges for large-scale data mining and data fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another instrument (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the cloud scenes from CloudSat, and repeat the entire analysis over years of AIRS data. To perform such an analysis, one must discover & access multiple datasets from remote sites, find the space/time matchups between instrument swaths and model grids, understand the quality flags and uncertainties for retrieved physical variables, and assemble merged datasets for further scientific and statistical analysis. To meet these large-scale challenges, we are utilizing a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data query, access, subsetting, co-registration, mining, fusion, and advanced statistical analysis. SciFlo is a semantically-enabled ("smart") Grid Workflow system that ties together a peer-to-peer network of computers into an efficient engine for distributed computation. The SciFlo workflow engine enables scientists to do multi-instrument Earth Science by assembling remotely-invokable Web Services (SOAP or HTTP GET URLs), native executables, command-line scripts, and Python codes into a distributed computing flow. A scientist visually authors the graph of operations in the VizFlow GUI, or uses a text editor to modify the simple XML workflow documents. The SciFlo client & server engines optimize the execution of such distributed workflows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. The engine transparently moves data to the operators, and moves operators to the data (on the dozen trusted SciFlo nodes). SciFlo also deploys a variety of Data Grid services to: query datasets in space and time, locate & retrieve on-line data granules, provide on-the-fly variable and spatial subsetting, and perform pairwise instrument matchups for A-Train datasets. These services are combined into efficient workflows to assemble the desired large-scale, merged climate datasets. SciFlo is currently being applied in several large climate studies: comparisons of aerosol optical depth between MODIS, MISR, the AERONET ground network, and U. Michigan's IMPACT aerosol transport model; characterization of long-term biases in microwave and infrared instruments (AIRS, MLS) by comparisons to GPS temperature retrievals accurate to 0.1 K; and construction of a decade-long, multi-sensor water vapor climatology stratified by classified cloud scene by bringing together datasets from AIRS/AMSU, AMSR-E, MLS, MODIS, and CloudSat (NASA MEASUREs grant, Fetzer PI). The presentation will discuss the SciFlo technologies, their application in these distributed workflows, and the many challenges encountered in assembling and analyzing these massive datasets.
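As a hedged sketch of the space/time matchup step described above, the function below pairs each footprint of one instrument with the nearest footprint of another that lies within chosen time and distance tolerances; the array layout (times in seconds, positions in degrees) and the tolerance values are assumptions for illustration, not SciFlo's actual operators.

import numpy as np

def matchup(lat_a, lon_a, t_a, lat_b, lon_b, t_b, max_km=50.0, max_minutes=30.0):
    """Pair each A footprint with the closest B footprint within the tolerances."""
    pairs = []
    km_per_deg = 111.0                                 # rough conversion, illustration only
    for i in range(len(lat_a)):
        dt_min = np.abs(t_b - t_a[i]) / 60.0           # seconds -> minutes
        ok = dt_min <= max_minutes
        if not ok.any():
            continue
        dlat = (lat_b[ok] - lat_a[i]) * km_per_deg
        dlon = (lon_b[ok] - lon_a[i]) * km_per_deg * np.cos(np.radians(lat_a[i]))
        dist = np.hypot(dlat, dlon)
        j = np.argmin(dist)
        if dist[j] <= max_km:
            pairs.append((i, int(np.flatnonzero(ok)[j])))
    return pairs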
Applications of intelligent computer-aided training
NASA Technical Reports Server (NTRS)
Loftin, R. B.; Savely, Robert T.
1991-01-01
Intelligent computer-aided training (ICAT) systems simulate the behavior of an experienced instructor observing a trainee, responding to help requests, diagnosing and remedying trainee errors, and proposing challenging new training scenarios. This paper presents a generic ICAT architecture that supports the efficient development of ICAT systems for varied tasks. In addition, details of ICAT projects, built with this architecture, that deliver specific training for Space Shuttle crew members, ground support personnel, and flight controllers are presented. Concurrently with the creation of specific ICAT applications, a general-purpose software development environment for ICAT systems is being built. The widespread use of such systems for both ground-based and on-orbit training will serve to preserve task and training expertise, support the training of large numbers of personnel in a distributed manner, and ensure the uniformity and verifiability of training experiences.
Structure preserving parallel algorithms for solving the Bethe–Salpeter eigenvalue problem
Shao, Meiyue; da Jornada, Felipe H.; Yang, Chao; ...
2015-10-02
The Bethe–Salpeter eigenvalue problem is a dense structured eigenvalue problem arising from the discretized Bethe–Salpeter equation in the context of computing exciton energies and states. A computational challenge is that at least half of the eigenvalues and the associated eigenvectors are desired in practice. In this paper, we establish the equivalence between Bethe–Salpeter eigenvalue problems and real Hamiltonian eigenvalue problems. Based on theoretical analysis, structure preserving algorithms for a class of Bethe–Salpeter eigenvalue problems are proposed. We also show that for this class of problems all eigenvalues obtained from the Tamm–Dancoff approximation are overestimated. In order to solve large-scale problems of practical interest, we discuss parallel implementations of our algorithms targeting distributed memory systems. Finally, several numerical examples are presented to demonstrate the efficiency and accuracy of our algorithms.
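For context, the structure that such algorithms preserve is commonly written as follows (generic notation, not necessarily the paper's):

\[
H \;=\; \begin{pmatrix} A & B \\ -\overline{B} & -\overline{A} \end{pmatrix},
\qquad A = A^{\mathsf H}, \qquad B = B^{\mathsf T},
\]

so that, under the usual definiteness assumptions, the eigenvalues are real and occur in \(\pm\lambda\) pairs; the Tamm–Dancoff approximation drops the coupling block \(B\) and keeps only the Hermitian eigenproblem for \(A\).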
Kontopantelis, Evangelos; Stevens, Richard John; Helms, Peter J; Edwards, Duncan; Doran, Tim; Ashcroft, Darren M
2018-02-28
UK primary care databases (PCDs) are used by researchers worldwide to inform clinical practice. These databases have been primarily tied to single clinical computer systems, but little is known about the adoption of these systems by primary care practices or their geographical representativeness. We explore the spatial distribution of clinical computing systems and discuss the implications for the longevity and regional representativeness of these resources. Cross-sectional study. English primary care clinical computer systems. 7526 general practices in August 2016. Spatial mapping of family practices in England in 2016 by clinical computer system at two geographical levels, the lower Clinical Commissioning Group (CCG, 209 units) and the higher National Health Service regions (14 units). Data for practices included numbers of doctors, nurses and patients, and area deprivation. Of 7526 practices, Egton Medical Information Systems (EMIS) was used in 4199 (56%), SystmOne in 2552 (34%) and Vision in 636 (9%). Great regional variability was observed for all systems, with EMIS having a stronger presence in the West of England, London and the South; SystmOne in the East and some regions in the South; and Vision in London, the South, Greater Manchester and Birmingham. PCDs based on single clinical computer systems are geographically clustered in England. For example, Clinical Practice Research Datalink and The Health Improvement Network, the most popular primary care databases in terms of research outputs, are based on the Vision clinical computer system, used by <10% of practices and heavily concentrated in three major conurbations and the South. Researchers need to be aware of the analytical challenges posed by clustering, and barriers to accessing alternative PCDs need to be removed. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Dynamic Collaboration Infrastructure for Hydrologic Science
NASA Astrophysics Data System (ADS)
Tarboton, D. G.; Idaszak, R.; Castillo, C.; Yi, H.; Jiang, F.; Jones, N.; Goodall, J. L.
2016-12-01
Data and modeling infrastructure is becoming increasingly accessible to water scientists. HydroShare is a collaborative environment that currently offers water scientists the ability to access modeling and data infrastructure in support of data-intensive modeling and analysis. It supports the sharing of, and collaboration around, "resources", which are social objects defined to include both data and models in a structured, standardized format. Users collaborate around these objects via comments, ratings, and groups. HydroShare also supports web services and cloud-based computation for the execution of hydrologic models and the analysis and visualization of hydrologic data. However, the quantity and variety of data and modeling infrastructure that can be accessed from environments like HydroShare is increasing. Storage infrastructure can range from one's local PC to campus or organizational storage to storage in the cloud. Modeling or computing infrastructure can range from one's desktop to departmental clusters to national HPC resources to grid and cloud computing resources. How does one orchestrate this growing collection of data and computing infrastructure without having to learn each new system? A common limitation across these systems is the lack of efficient integration between data transport mechanisms and the corresponding high-level services to support large distributed data and compute operations. A scientist running a hydrology model from their desktop may need to process a large collection of files across the aforementioned storage and compute resources and various national databases. To address these community challenges, a proof-of-concept prototype was created integrating HydroShare with RADII (Resource Aware Data-centric collaboration Infrastructure) to provide software infrastructure that enables the comprehensive and rapid dynamic deployment of what we refer to as "collaborative infrastructure." In this presentation we discuss the results of this proof-of-concept prototype, which enabled HydroShare users to readily instantiate virtual infrastructure marshaling arbitrary combinations, varieties, and quantities of distributed data and computing infrastructure to address big problems in hydrology.
NASA Astrophysics Data System (ADS)
Fedorov, D.; Miller, R. J.; Kvilekval, K. G.; Doheny, B.; Sampson, S.; Manjunath, B. S.
2016-02-01
Logistical and financial limitations of underwater operations are inherent in marine science, including biodiversity observation. Imagery is a promising way to address these challenges, but the diversity of organisms thwarts simple automated analysis. Recent developments in computer vision methods, such as convolutional neural networks (CNNs), are promising for automated classification and detection tasks but are typically very computationally expensive and require extensive training on large datasets. Therefore, managing and connecting distributed computation, large storage and human annotations of diverse marine datasets is crucial for effective application of these methods. BisQue is a cloud-based system for management, annotation, visualization, analysis and data mining of underwater and remote sensing imagery and associated data. Designed to hide the complexity of distributed storage, large computational clusters, diverse data formats and inhomogeneous computational environments behind a user-friendly web-based interface, BisQue is built around the idea of flexible and hierarchical annotations defined by the user. Such textual and graphical annotations can describe captured attributes and the relationships between data elements. Annotations are powerful enough to describe cells in fluorescent 4D images, fish species in underwater videos and kelp beds in aerial imagery. Presently we are developing BisQue-based analysis modules for automated identification of benthic marine organisms. Recent experiments with dropout and CNN-based classification of several thousand annotated underwater images demonstrated an overall accuracy above 70% for the 15 best performing species and above 85% for the top 5 species. Based on these promising results, we have extended BisQue with a CNN-based classification system allowing continuous training on user-provided data.
Job optimization in ATLAS TAG-based distributed analysis
NASA Astrophysics Data System (ADS)
Mambelli, M.; Cranshaw, J.; Gardner, R.; Maeno, T.; Malon, D.; Novak, M.
2010-04-01
The ATLAS experiment is projected to collect over one billion events/year during the first few years of operation. The efficient selection of events for various physics analyses across all appropriate samples presents a significant technical challenge. ATLAS computing infrastructure leverages the Grid to tackle the analysis across large samples by organizing data into a hierarchical structure and exploiting distributed computing to churn through the computations. This includes events at different stages of processing: RAW, ESD (Event Summary Data), AOD (Analysis Object Data), DPD (Derived Physics Data). Event Level Metadata Tags (TAGs) contain information about each event stored using multiple technologies accessible by POOL and various web services. This allows users to apply selection cuts on quantities of interest across the entire sample to compile a subset of events that are appropriate for their analysis. This paper describes new methods for organizing jobs using the TAGs criteria to analyze ATLAS data. It further compares different access patterns to the event data and explores ways to partition the workload for event selection and analysis. Here analysis is defined as a broader set of event processing tasks including event selection and reduction operations ("skimming", "slimming" and "thinning") as well as DPD making. Specifically it compares analysis with direct access to the events (AOD and ESD data) to access mediated by different TAG-based event selections. We then compare different ways of splitting the processing to maximize performance.
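A minimal sketch of the TAG-mediated pattern discussed above: apply selection cuts to lightweight per-event metadata, then group the selected events by the file holding the full event record so that each group becomes one job. The attribute names, cut values and grouping policy are illustrative assumptions, not the ATLAS TAG schema.

from collections import defaultdict

def select_and_partition(tags, max_events_per_job=10000):
    """tags: iterable of dicts of per-event metadata (TAG-like records)."""
    selected = [t for t in tags
                if t["n_muons"] >= 2 and t["missing_et"] > 30.0]   # example cuts
    by_file = defaultdict(list)
    for t in selected:
        by_file[t["aod_file"]].append(t["event_number"])
    jobs = []
    for fname, events in by_file.items():
        for k in range(0, len(events), max_events_per_job):
            jobs.append({"input_file": fname,
                         "events": events[k:k + max_events_per_job]})
    return jobs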
Parallel Simulation of Unsteady Turbulent Flames
NASA Technical Reports Server (NTRS)
Menon, Suresh
1996-01-01
Time-accurate simulation of turbulent flames in high Reynolds number flows is a challenging task since both fluid dynamics and combustion must be modeled accurately. To numerically simulate this phenomenon, very large computer resources (both time and memory) are required. Although current vector supercomputers are capable of providing adequate resources for simulations of this nature, their high cost and limited availability make practical use of such machines less than satisfactory. At the same time, the explicit time integration algorithms used in unsteady flow simulations often possess a very high degree of parallelism, making them very amenable to efficient implementation on large-scale parallel computers. Under these circumstances, distributed memory parallel computers offer an excellent near-term solution for greatly increased computational speed and memory, at a cost that may render the unsteady simulations of the type discussed above more feasible and affordable. This paper discusses the study of unsteady turbulent flames using a simulation algorithm that is capable of retaining high parallel efficiency on distributed memory parallel architectures. Numerical studies are carried out using large-eddy simulation (LES). In LES, the scales larger than the grid are computed using a time- and space-accurate scheme, while the unresolved small scales are modeled using eddy viscosity based subgrid models. This is acceptable for the moment/energy closure since the small scales primarily provide a dissipative mechanism for the energy transferred from the large scales. However, for combustion to occur, the species must first undergo mixing at the small scales and then come into molecular contact. Therefore, global models cannot be used. Recently, a new model for turbulent combustion was developed, in which combustion is modeled within the subgrid (small scales) using a methodology that simulates the mixing, the molecular transport and the chemical kinetics within each LES grid cell. Finite-rate kinetics can be included without any closure and this approach actually provides a means to predict the turbulent rates and the turbulent flame speed. The subgrid combustion model requires resolution of the local time scales associated with small-scale mixing, molecular diffusion and chemical kinetics and, therefore, within each grid cell, a significant amount of computation must be carried out before the large-scale (LES resolved) effects are incorporated. Therefore, this approach is uniquely suited for parallel processing and has been implemented on various systems such as: Intel Paragon, IBM SP-2, Cray T3D and SGI Power Challenge (PC) using the system independent Message Passing Interface (MPI) compiler. In this paper, timing data on these machines is reported along with some characteristic results.
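The distributed-memory parallelism described above typically rests on domain decomposition with halo (ghost-cell) exchange between neighbouring MPI ranks; the mpi4py sketch below shows that idea for a one-dimensional decomposition and is not the paper's actual solver.

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_local = 100                           # interior cells owned by this rank
u = np.zeros(n_local + 2)               # one ghost cell on each side
u[1:-1] = rank                          # dummy initial data

left = rank - 1 if rank > 0 else MPI.PROC_NULL
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

# Exchange boundary cells with the neighbours once per time step.
comm.Sendrecv(u[1:2], dest=left, recvbuf=u[-1:], source=right)
comm.Sendrecv(u[-2:-1], dest=right, recvbuf=u[0:1], source=left)

# The interior update (e.g. an explicit finite-difference or LES step) can now
# use u[0] and u[-1] as the neighbouring ranks' boundary values.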
Mu, Guangyu; Liu, Ying; Wang, Limin
2015-01-01
The spatial pooling method, such as spatial pyramid matching (SPM), is crucial in the bag-of-features model used in image classification. SPM partitions the image into a set of regular grids and assumes that the spatial layout of all visual words obeys a uniform distribution over these regular grids. In practice, however, we consider that different visual words should obey different spatial layout distributions. To improve SPM, we develop a novel spatial pooling method, namely spatial distribution pooling (SDP). The proposed SDP method uses an extension of the Gaussian mixture model to estimate the spatial layout distributions of the visual vocabulary. For each visual word type, SDP can generate a set of flexible grids rather than the regular grids of traditional SPM. Furthermore, we can compute the grid weights for visual word tokens according to their spatial coordinates. The experimental results demonstrate that SDP outperforms the traditional spatial pooling methods, and is competitive with the state-of-the-art classification accuracy on several challenging image datasets.
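A simplified sketch of the idea, using a single 2-D Gaussian per visual word instead of the paper's mixture-model extension: estimate each word's spatial layout from the coordinates of its training tokens, then weight a test token by the likelihood of its position under that layout. The function names and the single-Gaussian simplification are assumptions for illustration.

import numpy as np

def fit_spatial_layouts(coords_per_word):
    """coords_per_word: dict word_id -> (N, 2) array of normalized (x, y) token positions."""
    layouts = {}
    for w, xy in coords_per_word.items():
        mu = xy.mean(axis=0)
        cov = np.cov(xy.T) + 1e-6 * np.eye(2)          # regularize small samples
        layouts[w] = (mu, np.linalg.inv(cov), np.linalg.det(cov))
    return layouts

def token_weight(word_id, xy, layouts):
    """Gaussian likelihood of a token's position under its word's spatial layout."""
    mu, inv_cov, det = layouts[word_id]
    d = xy - mu
    return float(np.exp(-0.5 * d @ inv_cov @ d) / (2.0 * np.pi * np.sqrt(det)))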
Advanced algorithms for distributed fusion
NASA Astrophysics Data System (ADS)
Gelfand, A.; Smith, C.; Colony, M.; Bowman, C.; Pei, R.; Huynh, T.; Brown, C.
2008-03-01
The US Military has been undergoing a radical transition from a traditional "platform-centric" force to one capable of performing in a "Network-Centric" environment. This transformation will place all of the data needed to efficiently meet tactical and strategic goals at the warfighter's fingertips. With access to this information, the challenge of fusing data from across the battlespace into an operational picture for real-time Situational Awareness emerges. In such an environment, centralized fusion approaches will have limited application due to the constraints of real-time communications networks and computational resources. To overcome these limitations, we are developing a formalized architecture for fusion and track adjudication that allows the distribution of fusion processes over a dynamically created and managed information network. This network will support the incorporation and utilization of low-level tracking information within the Army Distributed Common Ground System (DCGS-A) or Future Combat System (FCS). The framework is based on Bowman's Dual Node Network (DNN) architecture, which utilizes a distributed network of interlaced fusion and track adjudication nodes to build and maintain a globally consistent picture across all assets.
National Storage Laboratory: a collaborative research project
NASA Astrophysics Data System (ADS)
Coyne, Robert A.; Hulen, Harry; Watson, Richard W.
1993-01-01
The grand challenges of science and industry that are driving computing and communications have created corresponding challenges in information storage and retrieval. An industry-led collaborative project has been organized to investigate technology for storage systems that will be the future repositories of national information assets. Industry participants are IBM Federal Systems Company, Ampex Recording Systems Corporation, General Atomics DISCOS Division, IBM ADSTAR, Maximum Strategy Corporation, Network Systems Corporation, and Zitel Corporation. Industry members of the collaborative project are funding their own participation. Lawrence Livermore National Laboratory, through its National Energy Research Supercomputer Center (NERSC), will participate in the project as the operational site and provider of applications. The expected result is the creation of a National Storage Laboratory to serve as a prototype and demonstration facility. It is expected that this prototype will represent a significant advance in the technology for distributed storage systems capable of handling gigabyte-class files at gigabit-per-second data rates. Specifically, the collaboration expects to make significant advances in hardware, software, and systems technology in four areas of need: (1) network-attached high performance storage; (2) multiple, dynamic, distributed storage hierarchies; (3) layered access to storage system services; and (4) storage system management.
Comprehensive security framework for the communication and storage of medical images
NASA Astrophysics Data System (ADS)
Slik, David; Montour, Mike; Altman, Tym
2003-05-01
Confidentiality, integrity verification and access control of medical imagery and associated metadata is critical for the successful deployment of integrated healthcare networks that extend beyond the department level. As medical imagery continues to become widely accessed across multiple administrative domains and geographically distributed locations, image data should be able to travel and be stored on untrusted infrastructure, including public networks and server equipment operated by external entities. Given these challenges associated with protecting large-scale distributed networks, measures must be taken to protect patient identifiable information while guarding against tampering, denial of service attacks, and providing robust audit mechanisms. The proposed framework outlines a series of security practices for the protection of medical images, incorporating Transport Layer Security (TLS), public and secret key cryptography, certificate management and a token based trusted computing base. It outlines measures that can be utilized to protect information stored within databases, online and nearline storage, and during transport over trusted and untrusted networks. In addition, it provides a framework for ensuring end-to-end integrity of image data from acquisition to viewing, and presents a potential solution to the challenges associated with access control across multiple administrative domains and institution user bases.
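One of the integrity measures mentioned above, protecting image data that travels over or rests on untrusted infrastructure, can be illustrated with a keyed hash. This is a generic sketch using the Python standard library, not the framework's actual token-based trusted computing base.

import hashlib
import hmac

def image_tag(image_bytes: bytes, key: bytes) -> str:
    """Keyed integrity tag computed at acquisition and stored alongside the image."""
    return hmac.new(key, image_bytes, hashlib.sha256).hexdigest()

def verify_image(image_bytes: bytes, key: bytes, stored_tag: str) -> bool:
    """Recompute the tag at viewing time and compare in constant time."""
    return hmac.compare_digest(image_tag(image_bytes, key), stored_tag)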
A Semantic Big Data Platform for Integrating Heterogeneous Wearable Data in Healthcare.
Mezghani, Emna; Exposito, Ernesto; Drira, Khalil; Da Silveira, Marcos; Pruski, Cédric
2015-12-01
Advances supported by emerging wearable technologies in healthcare promise patients a provision of high quality of care. Wearable computing systems represent one of the main thrust areas for transforming traditional healthcare systems into active systems able to continuously monitor and control the patients' health in order to manage their care at an early stage. However, their proliferation creates challenges related to data management and integration. The diversity and variety of wearable data related to healthcare, their huge volume and their distribution make data processing and analytics more difficult. In this paper, we propose a generic semantic big data architecture based on the "Knowledge as a Service" approach to cope with heterogeneity and scalability challenges. Our main contribution focuses on enriching the NIST Big Data model with semantics in order to smartly understand the collected data, and to generate more accurate and valuable information by correlating scattered medical data stemming from multiple wearable devices and/or from other distributed data sources. We have implemented and evaluated a Wearable KaaS platform to smartly manage heterogeneous data coming from wearable devices in order to assist physicians in supervising the patient's health evolution and keep the patient up to date about his/her status.
DREAM: Distributed Resources for the Earth System Grid Federation (ESGF) Advanced Management
NASA Astrophysics Data System (ADS)
Williams, D. N.
2015-12-01
The data associated with climate research is often generated, accessed, stored, and analyzed on a mix of unique platforms. The volume, variety, velocity, and veracity of this data create unique challenges as climate research attempts to move beyond stand-alone platforms to a system that truly integrates dispersed resources. Today, sharing data across multiple facilities is often a challenge due to the large variance in supporting infrastructures. This results in data being accessed and downloaded many times, which requires significant amounts of resources, places a heavy analytic development burden on the end users, and leads to mismanaged resources. Working across U.S. federal agencies, international agencies, and multiple worldwide data centers, and spanning seven international network organizations, the Earth System Grid Federation (ESGF) has begun to solve this problem. Its architecture employs a system of geographically distributed peer nodes that are independently administered yet united by common federation protocols and application programming interfaces. However, significant challenges remain, including workflow provenance, modular and flexible deployment, scalability of a diverse set of computational resources, and more. Expanding on the existing ESGF, the Distributed Resources for the Earth System Grid Federation Advanced Management (DREAM) project will ensure that the access, storage, movement, and analysis of the large quantities of data that are processed and produced by diverse science projects can be dynamically distributed with proper resource management. This system will enable data from an essentially unlimited number of diverse sources to be organized and accessed from anywhere on any device (including mobile platforms). The approach offers a powerful roadmap for the creation and integration of a unified knowledge base of an entire ecosystem, including its many geophysical, geographical, social, political, agricultural, energy, transportation, and cyber aspects. The resulting aggregation of data combined with analytics services has the potential to generate an informational universe and knowledge system of unprecedented size and value to the scientific community, downstream applications, decision makers, and the public.
Scale effect challenges in urban hydrology highlighted with a distributed hydrological model
NASA Astrophysics Data System (ADS)
Ichiba, Abdellah; Gires, Auguste; Tchiguirinskaia, Ioulia; Schertzer, Daniel; Bompard, Philippe; Ten Veldhuis, Marie-Claire
2018-01-01
Hydrological models are extensively used in urban water management, in the development and evaluation of future scenarios, and in research activities. There is a growing interest in the development of fully distributed and grid-based models. However, some complex questions related to scale effects are not yet fully understood and remain open issues in urban hydrology. In this paper we propose a two-step investigation framework to illustrate the extent of scale effects in urban hydrology. First, fractal tools are used to highlight the scale dependence observed within the distributed data input into urban hydrological models. Then an intensive multi-scale modelling exercise is carried out to understand scale effects on hydrological model performance. Investigations are conducted using a fully distributed and physically based model, Multi-Hydro, developed at Ecole des Ponts ParisTech. The model is implemented at 17 spatial resolutions ranging from 100 to 5 m. Results clearly exhibit scale effect challenges in urban hydrology modelling. The applicability of fractal concepts highlights the scale dependence observed within the distributed data: patterns of geophysical data change when the size of the observation pixel changes. The multi-scale modelling investigation confirms scale effects on hydrological model performance. Results are analysed over three ranges of scales identified in the fractal analysis and confirmed through modelling. This work also discusses some remaining issues in urban hydrology modelling related to the availability of high-quality data at high resolutions, model numerical instabilities and computation time requirements. The main findings of this paper enable traditional methods of model calibration to be replaced by innovative methods of model resolution alteration based on the spatial data variability and scaling of flows in urban hydrology.
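The fractal analysis referred to above usually comes down to estimating how the number of occupied boxes scales with box size. The sketch below shows minimal box counting for a binary raster (for example, a rasterized land-use or imperviousness layer); it illustrates the concept only and is not the authors' exact procedure.

import numpy as np

def box_counting_dimension(mask, box_sizes=(1, 2, 4, 8, 16, 32)):
    """Estimate the box-counting (fractal) dimension of a 2-D binary mask."""
    counts = []
    for s in box_sizes:
        n0, n1 = (mask.shape[0] // s) * s, (mask.shape[1] // s) * s
        blocks = mask[:n0, :n1].reshape(n0 // s, s, n1 // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())
    # Slope of log(count) versus log(1/size) estimates the dimension.
    slope, _ = np.polyfit(np.log(1.0 / np.array(box_sizes)), np.log(counts), 1)
    return slope

# Example: a completely filled raster should give a dimension close to 2.
print(box_counting_dimension(np.ones((128, 128), dtype=bool)))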
Halimi, Abdelghafour; Batatia, Hadj; Le Digabel, Jimmy; Josse, Gwendal; Tourneret, Jean Yves
2017-01-01
Detecting skin lentigo in reflectance confocal microscopy images is an important and challenging problem. This imaging modality has not yet been widely investigated for this problem, and only a few automatic processing techniques exist. They are mostly based on machine learning approaches and rely on numerous classical image features that lead to high computational costs given the very large resolution of these images. This paper presents a detection method with very low computational complexity that is able to identify the skin depth at which the lentigo can be detected. The proposed method performs a multiresolution decomposition of the image obtained at each skin depth. The distribution of image pixels at a given depth can be approximated accurately by a generalized Gaussian distribution whose parameters depend on the decomposition scale, resulting in a very low-dimensional parameter space. SVM classifiers are then investigated to classify the scale parameter of this distribution, allowing real-time detection of lentigo. The method is applied to 45 healthy and lentigo patients from a clinical study, where a sensitivity of 81.4% and a specificity of 83.3% are achieved. Our results show that lentigo is identifiable at depths between 50μm and 60μm, corresponding to the average location of the dermoepidermal junction. This result is in agreement with clinical practice, which characterizes lentigo by assessing the disorganization of the dermoepidermal junction. PMID:29296480
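A hedged sketch of one standard way to estimate the generalized Gaussian parameters of a subband by moment matching; this is a textbook estimator, not necessarily the authors' implementation, and the SVM step on the resulting scale parameter is only indicated in a comment.

import numpy as np
from scipy.special import gamma
from scipy.optimize import brentq

def ggd_fit(x):
    """Moment-matching fit of a zero-mean generalized Gaussian p(x) ~ exp(-(|x|/alpha)**beta)."""
    m1 = np.mean(np.abs(x))
    m2 = np.mean(x ** 2)
    r = m1 ** 2 / m2
    ratio = lambda b: gamma(2.0 / b) ** 2 / (gamma(1.0 / b) * gamma(3.0 / b))
    # Bracket covers the peaked-to-near-Gaussian shapes typical of wavelet subbands.
    beta = brentq(lambda b: ratio(b) - r, 0.05, 20.0)
    alpha = m1 * gamma(1.0 / beta) / gamma(2.0 / beta)
    return alpha, beta

# The per-depth scale parameters alpha could then be fed to an SVM classifier
# (e.g. sklearn.svm.SVC) to separate lentigo from healthy tissue.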
The Awareness and Challenges of Cloud Computing Adoption on Tertiary Education in Malaysia
NASA Astrophysics Data System (ADS)
Hazreeni Hamzah, Nor; Mahmud, Maziah; Zukri, Shamsunarnie Mohamed; Yaacob, Wan Fairos Wan; Yacob, Jusoh
2017-09-01
This preliminary study aims to investigate the awareness of the adoption of cloud computing among academicians in tertiary education in Malaysia. In addition, this study also explores the possible challenges faced by academicians while adopting this new technology. The pilot study was conducted on 40 lecturers in Universiti Teknologi MARA Kampus Kota Bharu (UiTMKB) using a self-administered questionnaire. The results found that almost half (40 percent) were not aware of the existence of cloud computing in the teaching and learning (T&L) process. The challenges confronting the adoption of cloud computing are data insecurity, unsolicited advertisement, lock-in, reluctance to eliminate staff positions, privacy concerns, reliability challenges, regulatory compliance concerns/user control and institutional culture/resistance to change in technology. These possible challenges can be grouped into two major factors, namely a security and dependency factor and a user control and mentality factor.
Distributed computing environments for future space control systems
NASA Technical Reports Server (NTRS)
Viallefont, Pierre
1993-01-01
The aim of this paper is to present the results of a CNES research project on distributed computing systems. The purpose of this research was to study the impact of the use of new computer technologies in the design and development of future space applications. The first part of this study was a state-of-the-art review of distributed computing systems. One of the interesting ideas arising from this review is the concept of a 'virtual computer' allowing the distributed hardware architecture to be hidden from a software application. The 'virtual computer' can improve system performance by adapting the best architecture (addition of computers) to the software application without having to modify its source code. This concept can also decrease the cost and obsolescence of the hardware architecture. In order to verify the feasibility of the 'virtual computer' concept, a prototype representative of a distributed space application is being developed independently of the hardware architecture.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Livescu, Veronica; Bronkhorst, Curt Allan; Vander Wiel, Scott Alan
Many challenges exist with regard to understanding and representing complex physical processes involved with ductile damage and failure in polycrystalline metallic materials. Currently, the ability to accurately predict the macroscale ductile damage and failure response of metallic materials is lacking. Research at Los Alamos National Laboratory (LANL) is aimed at building a coupled experimental and computational methodology that supports the development of predictive damage capabilities by: capturing real distributions of microstructural features from real material and implementing them as digitally generated microstructures in damage model development; and distilling structure-property information to link microstructural details to damage evolution under a multitude of loading states.
Efficient and optimized identification of generalized Maxwell viscoelastic relaxation spectra
Babaei, Behzad; Davarian, Ali; Pryse, Kenneth M.; Elson, Elliot L.; Genin, Guy M.
2017-01-01
Viscoelastic relaxation spectra are essential for predicting and interpreting the mechanical responses of materials and structures. For biological tissues, these spectra must usually be estimated from viscoelastic relaxation tests. Interpreting viscoelastic relaxation tests is challenging because the inverse problem is expensive computationally. We present here an efficient algorithm that enables rapid identification of viscoelastic relaxation spectra. The algorithm was tested against trial data to characterize its robustness and identify its limitations and strengths. The algorithm was then applied to identify the viscoelastic response of reconstituted collagen, revealing an extensive distribution of viscoelastic time constants. PMID:26523785
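A common simplified route to the discrete form of this inverse problem, not the authors' optimized algorithm, is to fix a logarithmic grid of relaxation times and fit the non-negative Prony weights of a generalized Maxwell model, G(t) = g_inf + sum_i g_i exp(-t/tau_i), by non-negative least squares:

import numpy as np
from scipy.optimize import nnls

def fit_prony(t, g_data, n_modes=20):
    """Fit G(t) = g_inf + sum_i g_i * exp(-t / tau_i) with g_i, g_inf >= 0."""
    tau = np.logspace(np.log10(t[t > 0].min()), np.log10(t.max()), n_modes)
    # Design matrix: one exponential column per relaxation time plus a constant column.
    A = np.column_stack([np.exp(-np.outer(t, 1.0 / tau)), np.ones(len(t))])
    coeffs, _ = nnls(A, g_data)
    return tau, coeffs[:-1], coeffs[-1]    # time constants, weights, equilibrium modulus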
Bonawitz, Elizabeth; Denison, Stephanie; Griffiths, Thomas L; Gopnik, Alison
2014-10-01
Although probabilistic models of cognitive development have become increasingly prevalent, one challenge is to account for how children might cope with a potentially vast number of possible hypotheses. We propose that children might address this problem by 'sampling' hypotheses from a probability distribution. We discuss empirical results demonstrating signatures of sampling, which offer an explanation for the variability of children's responses. The sampling hypothesis provides an algorithmic account of how children might address computationally intractable problems and suggests a way to make sense of their 'noisy' behavior. Copyright © 2014 Elsevier Ltd. All rights reserved.
A Scalable Infrastructure for Lidar Topography Data Distribution, Processing, and Discovery
NASA Astrophysics Data System (ADS)
Crosby, C. J.; Nandigam, V.; Krishnan, S.; Phan, M.; Cowart, C. A.; Arrowsmith, R.; Baru, C.
2010-12-01
High-resolution topography data acquired with lidar (light detection and ranging) technology have emerged as a fundamental tool in the Earth sciences, and are also being widely utilized for ecological, planning, engineering, and environmental applications. Collected from airborne, terrestrial, and space-based platforms, these data are revolutionary because they permit analysis of geologic and biologic processes at resolutions essential for their appropriate representation. Public domain lidar data collected by federal, state, and local agencies are a valuable resource to the scientific community; however, the data pose significant distribution challenges because of the volume and complexity of data that must be stored, managed, and processed. Lidar data acquisition may generate terabytes of data in the form of point clouds, digital elevation models (DEMs), and derivative products. This massive volume of data is often challenging to host for resource-limited agencies. Furthermore, these data can be technically challenging for users who lack appropriate software, computing resources, and expertise. The National Science Foundation-funded OpenTopography Facility (www.opentopography.org) has developed a cyberinfrastructure-based solution to enable online access to Earth science-oriented high-resolution lidar topography data, online processing tools, and derivative products. OpenTopography provides access to terabytes of point cloud data, standard DEMs, and Google Earth image data, all co-located with computational resources for on-demand data processing. The OpenTopography portal is built upon a cyberinfrastructure platform that utilizes a Services Oriented Architecture (SOA) to provide a modular system that is highly scalable and flexible enough to support the growing needs of the Earth science lidar community. OpenTopography strives to host and provide access to datasets as soon as they become available, and also to expose greater application-level functionality to our end users (such as generation of custom DEMs via various gridding algorithms, and hydrological modeling algorithms). In the future, the SOA will enable direct authenticated access to back-end functionality through simple Web service Application Programming Interfaces (APIs), so that users may access our data and compute resources via clients other than Web browsers. In addition to an overview of the OpenTopography SOA, this presentation will discuss our recently developed lidar data ingestion and management system for point cloud data delivered in the binary LAS standard. This system complements our existing partitioned database approach for data delivered in ASCII format, and permits rapid ingestion of data. The system has significantly reduced data ingestion times and has implications for data distribution in emergency response situations. We will also address ongoing work to develop a community lidar metadata catalog based on the OGC Catalogue Service for the Web (CSW) standard, which will help to centralize discovery of public domain lidar data.
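The on-demand DEM generation mentioned above can, in its simplest "binning" form, be reduced to averaging point elevations per grid cell; the sketch below shows that minimal local gridding step and is not OpenTopography's service implementation.

import numpy as np

def points_to_dem(x, y, z, cell=1.0):
    """Grid lidar points into a DEM by averaging the elevations falling in each cell."""
    col = ((x - x.min()) / cell).astype(int)
    row = ((y - y.min()) / cell).astype(int)
    sums = np.zeros((row.max() + 1, col.max() + 1))
    counts = np.zeros_like(sums)
    np.add.at(sums, (row, col), z)
    np.add.at(counts, (row, col), 1)
    with np.errstate(invalid="ignore"):
        return sums / counts               # NaN where a cell received no points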
Global EOS: exploring the 300-ms-latency region
NASA Astrophysics Data System (ADS)
Mascetti, L.; Jericho, D.; Hsu, C.-Y.
2017-10-01
EOS, the CERN open-source distributed disk storage system, provides the high-performance storage solution for HEP analysis and the back-end for various workflows. Recently EOS became the back-end of CERNBox, the cloud synchronisation service for CERN users. EOS can be used to take advantage of wide-area distributed installations: for the last few years CERN EOS has used a common deployment across two computer centres (Geneva-Meyrin and Budapest-Wigner) about 1,000 km apart (∼20-ms latency) with about 200 PB of disk (JBOD). In late 2015, the CERN-IT Storage group and AARNET (Australia) set up a challenging R&D project: a single EOS instance between CERN and AARNET with more than 300 ms latency (16,500 km apart). This paper reports on the successful deployment and operation of a distributed storage system between Europe (Geneva, Budapest), Australia (Melbourne) and later Asia (ASGC Taipei), allowing different types of data placement and data access across these four sites.
Edge effect modeling of small tool polishing in planetary movement
NASA Astrophysics Data System (ADS)
Li, Qi-xin; Ma, Zhen; Jiang, Bo; Yao, Yong-sheng
2018-03-01
As one of the most challenging problems in Computer Controlled Optical Surfacing (CCOS), the edge effect greatly affects polishing accuracy and efficiency. CCOS relies on a stable tool influence function (TIF); however, at the edge of the mirror surface, with the grinding head partly off the mirror, the contact area and pressure distribution change, resulting in a non-linear change of the TIF that leads to tilting or sagging at the edge of the mirror. In order to reduce these adverse effects and improve polishing accuracy and efficiency, we used finite element simulation to analyze the pressure distribution at the mirror edge and, combined with an improved version of the traditional method, established a new model. The new method fully considers the non-uniformity of the pressure distribution. After modeling the TIFs at different locations, the description and prediction of the edge effects are realized, which has positive significance for the control and suppression of edge effects.
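A heavily simplified illustration of the quantity being modelled: under Preston's equation the local removal rate is proportional to pressure times relative speed, so when part of the tool overhangs the edge the load redistributes over the reduced contact area and the TIF changes shape. The uniform load-redistribution rule below is an assumption for illustration, not the paper's finite-element-derived pressure field.

import numpy as np

def edge_tif(tool_radius=10.0, overhang=0.0, k_preston=1.0, speed=1.0, n=101):
    """Removal-rate map of a circular tool whose footprint extends `overhang` past the edge."""
    x = np.linspace(-tool_radius, tool_radius, n)
    X, Y = np.meshgrid(x, x)
    on_tool = X ** 2 + Y ** 2 <= tool_radius ** 2
    on_mirror = X <= tool_radius - overhang     # the mirror edge truncates the footprint
    contact = on_tool & on_mirror
    # Assume the same total load is spread uniformly over the reduced contact area,
    # so the pressure rises as the tool moves over the edge.
    pressure = np.where(contact, on_tool.sum() / max(contact.sum(), 1), 0.0)
    return k_preston * pressure * speed         # Preston: removal rate = k * P * V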
Measurements of void fraction distribution in cavitating pipe flow using x-ray CT
NASA Astrophysics Data System (ADS)
Bauer, D.; Chaves, H.; Arcoumanis, C.
2012-05-01
Measuring the void fraction distribution is still one of the greatest challenges in cavitation research. In this paper, a measurement technique for the quantitative void fraction characterization in a cavitating pipe flow is presented. While it is almost impossible to visualize the inside of the cavitation region with visible light, it is shown that with x-ray computed tomography (CT) it is possible to capture the time-averaged void fraction distribution in a quasi-steady pipe flow. Different types of cavitation have been investigated, including cloud-like cavitation, bubble cavitation and film cavitation at very high flow rates. A specially designed nozzle was employed to induce very stable quasi-steady cavitation. The results obtained demonstrate the advantages of this measurement technique over others; for example, structures were observed inside the cavitation region that could not be visualized in photographic images. Furthermore, photographic images and pressure measurements were used to allow comparisons to be made and to prove the superiority of the CT measurement technique.
Mean, covariance, and effective dimension of stochastic distributed delay dynamics
NASA Astrophysics Data System (ADS)
René, Alexandre; Longtin, André
2017-11-01
Dynamical models are often required to incorporate both delays and noise. However, the inherently infinite-dimensional nature of delay equations makes formal solutions to stochastic delay differential equations (SDDEs) challenging. Here, we present an approach, similar in spirit to the analysis of functional differential equations, but based on finite-dimensional matrix operators. This results in a method for obtaining both transient and stationary solutions that is directly amenable to computation, and applicable to first order differential systems with either discrete or distributed delays. With fewer assumptions on the system's parameters than other current solution methods and no need to be near a bifurcation, we decompose the solution to a linear SDDE with arbitrary distributed delays into natural modes, in effect the eigenfunctions of the differential operator, and show that relatively few modes can suffice to approximate the probability density of solutions. Thus, we are led to conclude that noise makes these SDDEs effectively low dimensional, which opens the possibility of practical definitions of probability densities over their solution space.
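For concreteness, the class of equations addressed can be written, in generic notation that may differ from the paper's, as a linear stochastic differential equation with a distributed delay kernel:

\[
\mathrm{d}x(t) \;=\; \Big( a\,x(t) \;+\; b\int_{0}^{\infty} \kappa(s)\,x(t-s)\,\mathrm{d}s \Big)\,\mathrm{d}t \;+\; \sigma\,\mathrm{d}W(t),
\]

where \(\kappa\) is a normalized delay distribution (a discrete delay corresponds to \(\kappa(s)=\delta(s-\tau)\)); the mode decomposition then truncates the infinite-dimensional history dependence to a low-dimensional subspace.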
NASA Astrophysics Data System (ADS)
Vucinic, Dean; Deen, Danny; Oanta, Emil; Batarilo, Zvonimir; Lacor, Chris
This paper focuses on the visualization and manipulation of graphical content in distributed network environments. The developed graphical middleware and 3D desktop prototypes were specialized for situational awareness. This research was done in the LArge Scale COllaborative decision support Technology (LASCOT) project, which explored and combined software technologies to support a human-centred decision support system for crisis management (earthquake, tsunami, flooding, airplane or oil-tanker incidents, chemical, radio-active or other pollutants spreading, etc.). The state-of-the-art review performed did not identify any publicly available large-scale distributed application of this kind. Existing proprietary solutions rely on conventional technologies and 2D representations. Our challenge was to apply the "latest" available technologies, such as Java3D, X3D and SOAP, compatible with average computer graphics hardware. The selected technologies are integrated and we demonstrate: the flow of data, which originates from heterogeneous data sources; interoperability across different operating systems; and 3D visual representations to enhance end-users' interactions.
An Overview of Cloud Computing in Distributed Systems
NASA Astrophysics Data System (ADS)
Divakarla, Usha; Kumari, Geetha
2010-11-01
Cloud computing is the emerging trend in the field of distributed computing. Cloud computing evolved from grid computing and distributed computing. The cloud plays an important role in large organizations in maintaining huge volumes of data with limited resources. The cloud also helps in resource sharing through specific virtual machines provided by the cloud service provider. This paper gives an overview of cloud organization and some of the basic security issues pertaining to the cloud.
Distribution of man-machine controls in space teleoperation
NASA Technical Reports Server (NTRS)
Bejczy, A. K.
1982-01-01
The distribution of control between man and machine is dependent on the tasks, available technology, human performance characteristics and control goals. This dependency has very specific projections on systems designed for teleoperation in space. This paper gives a brief outline of the space-related issues and presents the results of advanced teleoperator research and development at the Jet Propulsion Laboratory (JPL). The research and development work includes smart sensors, flexible computer controls and intelligent man-machine interface devices in the area of visual displays and kinesthetic man-machine coupling in remote control of manipulators. Some of the development results have been tested at the Johnson Space Center (JSC) using the simulated full-scale Shuttle Remote Manipulator System (RMS). The research and development work for advanced space teleoperation is far from complete and poses many interdisciplinary challenges.
The NatCarb geoportal: Linking distributed data from the Carbon Sequestration Regional Partnerships
Carr, T.R.; Rich, P.M.; Bartley, J.D.
2007-01-01
The Department of Energy (DOE) Carbon Sequestration Regional Partnerships are generating the data for a "carbon atlas" of key geospatial data (carbon sources, potential sinks, etc.) required for rapid implementation of carbon sequestration on a broad scale. The NATional CARBon Sequestration Database and Geographic Information System (NatCarb) provides Web-based, nation-wide data access. Distributed computing solutions link partnerships and other publicly accessible repositories of geological, geophysical, natural resource, infrastructure, and environmental data. Data are maintained and enhanced locally, but assembled and accessed through a single geoportal. NatCarb, as a first attempt at a national carbon cyberinfrastructure (NCCI), assembles the data required to address technical and policy challenges of carbon capture and storage. We present a path forward to design and implement a comprehensive and successful NCCI. © 2007 The Haworth Press, Inc. All rights reserved.
Implicit Multibody Penalty-Based Distributed Contact.
Xu, Hongyi; Zhao, Yili; Barbic, Jernej
2014-09-01
The penalty method is a simple and popular approach to resolving contact in computer graphics and robotics. Penalty-based contact, however, suffers from stability problems due to the highly variable and unpredictable net stiffness, and this is particularly pronounced in simulations with time-varying distributed geometrically complex contact. We employ semi-implicit integration, exact analytical contact gradients, symbolic Gaussian elimination and a SVD solver to simulate stable penalty-based frictional contact with large, time-varying contact areas, involving many rigid objects and articulated rigid objects in complex conforming contact and self-contact. We also derive implicit proportional-derivative control forces for real-time control of articulated structures with loops. We present challenging contact scenarios such as screwing a hexbolt into a hole, bowls stacked in perfectly conforming configurations, and manipulating many objects using actively controlled articulated mechanisms in real time.
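A minimal sketch of the penalty idea in its simplest setting, a single particle against a plane with semi-implicit (symplectic Euler) integration; the paper's method handles distributed frictional contact over articulated rigid bodies and is far more elaborate. The stiffness and damping values below are arbitrary illustrative choices.

import numpy as np

def step(pos, vel, dt=1e-3, mass=1.0, k=1e4, c=10.0):
    """Semi-implicit Euler step with a penalty force against the plane z = 0."""
    force = np.array([0.0, 0.0, -9.81 * mass])     # gravity
    depth = -pos[2]                                # penetration depth below the plane
    if depth > 0.0:
        # Penalty force: spring on the penetration depth plus damping on the
        # normal velocity, pushing the particle back out of the plane.
        force[2] += k * depth - c * vel[2]
    vel = vel + dt * force / mass                  # update velocity first (semi-implicit)
    pos = pos + dt * vel                           # then position with the new velocity
    return pos, vel

# Usage: drop a particle from z = 1 m and let it settle on the plane.
p, v = np.array([0.0, 0.0, 1.0]), np.zeros(3)
for _ in range(5000):
    p, v = step(p, v)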
EDDA: An Efficient Distributed Data Replication Algorithm in VANETs.
Zhu, Junyu; Huang, Chuanhe; Fan, Xiying; Guo, Sipei; Fu, Bin
2018-02-10
Efficient data dissemination in vehicular ad hoc networks (VANETs) is a challenging issue due to the dynamic nature of the network. To improve the performance of data dissemination, we study distributed data replication algorithms in VANETs for exchanging information and computing in an arbitrarily-connected network of vehicle nodes. To achieve low dissemination delay and improve the network performance, we control the number of message copies that can be disseminated in the network and then propose an efficient distributed data replication algorithm (EDDA). The key idea is to let the data carrier distribute the data dissemination tasks to multiple nodes to speed up the dissemination process. We calculate the number of communication stages for the network to enter into a balanced status and show that the proposed distributed algorithm can converge to a consensus in a small number of communication stages. Most of the theoretical results described in this paper concern the complexity of network convergence. The lower bound and upper bound are also provided in the analysis of the algorithm. Simulation results show that the proposed EDDA can efficiently disseminate messages to vehicles in a specific area with low dissemination delay and system overhead.
EDDA: An Efficient Distributed Data Replication Algorithm in VANETs
Zhu, Junyu; Huang, Chuanhe; Fan, Xiying; Guo, Sipei; Fu, Bin
2018-01-01
Efficient data dissemination in vehicular ad hoc networks (VANETs) is a challenging issue due to the dynamic nature of the network. To improve the performance of data dissemination, we study distributed data replication algorithms in VANETs for exchanging information and computing in an arbitrarily-connected network of vehicle nodes. To achieve low dissemination delay and improve the network performance, we control the number of message copies that can be disseminated in the network and then propose an efficient distributed data replication algorithm (EDDA). The key idea is to let the data carrier distribute the data dissemination tasks to multiple nodes to speed up the dissemination process. We calculate the number of communication stages for the network to enter into a balanced status and show that the proposed distributed algorithm can converge to a consensus in a small number of communication stages. Most of the theoretical results described in this paper concern the complexity of network convergence. The lower bound and upper bound are also provided in the analysis of the algorithm. Simulation results show that the proposed EDDA can efficiently disseminate messages to vehicles in a specific area with low dissemination delay and system overhead. PMID:29439443
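The copy-control idea, letting a data carrier hand part of its replication budget to encountered nodes so that dissemination proceeds in parallel, can be sketched as below. This binary-split rule is a generic illustration in the spirit of spray-style replication, not the actual EDDA protocol, and deliver_copy is a hypothetical transmission primitive.

def on_encounter(carrier, neighbour, message_id, copies):
    """Split the carrier's remaining copy budget with an encountered node."""
    if copies.get(carrier, 0) <= 1:
        return                                   # keep the last copy for direct delivery
    handed_over = copies[carrier] // 2
    copies[carrier] -= handed_over
    copies[neighbour] = copies.get(neighbour, 0) + handed_over
    deliver_copy(neighbour, message_id)

def deliver_copy(node, message_id):
    # Hypothetical transmission primitive; a real system would send over the VANET link.
    print(f"sending {message_id} to {node}")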
Challenges and Security in Cloud Computing
NASA Astrophysics Data System (ADS)
Chang, Hyokyung; Choi, Euiin
People want to solve problems as soon as they arise. An IT technology called ubiquitous computing is intended to make such situations easier to handle, and cloud computing is a technology that makes it even more capable and powerful. Cloud computing, however, is still at an early stage of implementation and use, and it faces many challenges in technical matters and security issues. This paper looks at cloud computing security.
Giga-voxel computational morphogenesis for structural design
NASA Astrophysics Data System (ADS)
Aage, Niels; Andreassen, Erik; Lazarov, Boyan S.; Sigmund, Ole
2017-10-01
In the design of industrial products ranging from hearing aids to automobiles and aeroplanes, material is distributed so as to maximize the performance and minimize the cost. Historically, human intuition and insight have driven the evolution of mechanical design, recently assisted by computer-aided design approaches. The computer-aided approach known as topology optimization enables unrestricted design freedom and shows great promise with regard to weight savings, but its applicability has so far been limited to the design of single components or simple structures, owing to the resolution limits of current optimization methods. Here we report a computational morphogenesis tool, implemented on a supercomputer, that produces designs with giga-voxel resolution—more than two orders of magnitude higher than previously reported. Such resolution provides insights into the optimal distribution of material within a structure that were hitherto unachievable owing to the challenges of scaling up existing modelling and optimization frameworks. As an example, we apply the tool to the design of the internal structure of a full-scale aeroplane wing. The optimized full-wing design has unprecedented structural detail at length scales ranging from tens of metres to millimetres and, intriguingly, shows remarkable similarity to naturally occurring bone structures in, for example, bird beaks. We estimate that our optimized design corresponds to a reduction in mass of 2-5 per cent compared to currently used aeroplane wing designs, which translates into a reduction in fuel consumption of about 40-200 tonnes per year per aeroplane. Our morphogenesis process is generally applicable, not only to mechanical design, but also to flow systems, antennas, nano-optics and micro-systems.
Automating usability of ATLAS Distributed Computing resources
NASA Astrophysics Data System (ADS)
Tupputi, S. A.; Di Girolamo, A.; Kouba, T.; Schovancová, J.; Atlas Collaboration
2014-06-01
The automation of ATLAS Distributed Computing (ADC) operations is essential to reduce manpower costs and allow performance-enhancing actions, which improve the reliability of the system. In this perspective a crucial case is the automatic handling of outages of ATLAS computing sites' storage resources, which are continuously exploited at the edge of their capabilities. It is challenging to adopt unambiguous decision criteria for storage resources of non-homogeneous types, sizes and roles. The recently developed Storage Area Automatic Blacklisting (SAAB) tool has provided a suitable solution, by employing an inference algorithm which processes the history of storage monitoring test outcomes. SAAB accomplishes both the task of providing global monitoring and that of performing automatic operations on single sites. The implementation of the SAAB tool has been the first step in a comprehensive review of storage area monitoring and central management at all levels. This review has involved the reordering and optimization of SAM test deployment and the inclusion of SAAB results in the ATLAS Site Status Board with both dedicated metrics and views. The resulting structure allows the status of storage resources to be monitored with fine time granularity and automatic actions to be taken in foreseen cases, such as automatic outage handling and notifications to sites. Hence, human actions are restricted to reporting and following up problems, where and when needed. In this work we show SAAB's working principles and features. We also present the decrease in human interventions achieved within the ATLAS Computing Operation team. The automation results in a prompt reaction to failures, which leads to the optimization of resource exploitation.
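As a toy illustration of history-based blacklisting (not the actual SAAB inference algorithm), the sketch below keeps a sliding window of recent test outcomes per site and blacklists a site once the failure fraction in the window crosses a threshold; the window length and threshold are assumptions.

    from collections import deque

    def update_blacklist(history, site, test_ok, window=12, threshold=0.75):
        """Record one storage-test outcome for `site` and decide whether to
        blacklist it, based on the failure fraction in a sliding window of
        recent outcomes. Window length and threshold are illustrative."""
        outcomes = history.setdefault(site, deque(maxlen=window))
        outcomes.append(bool(test_ok))
        failures = sum(1 for ok in outcomes if not ok)
        return len(outcomes) == window and failures / window >= threshold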
Barrès, Victor; Lee, Jinyong
2014-01-01
How does the language system coordinate with our visual system to yield flexible integration of linguistic, perceptual, and world-knowledge information when we communicate about the world we perceive? Schema theory is a computational framework that allows the simulation of perceptuo-motor coordination programs on the basis of known brain operating principles such as cooperative computation and distributed processing. We present first its application to a model of language production, SemRep/TCG, which combines a semantic representation of visual scenes (SemRep) with Template Construction Grammar (TCG) as a means to generate verbal descriptions of a scene from its associated SemRep graph. SemRep/TCG combines the neurocomputational framework of schema theory with the representational format of construction grammar in a model linking eye-tracking data to visual scene descriptions. We then offer a conceptual extension of TCG to include language comprehension and address data on the role of both world knowledge and grammatical semantics in the comprehension performances of agrammatic aphasic patients. This extension introduces a distinction between heavy and light semantics. The TCG model of language comprehension offers a computational framework to quantitatively analyze the distributed dynamics of language processes, focusing on the interactions between grammatical, world knowledge, and visual information. In particular, it reveals interesting implications for the understanding of the various patterns of comprehension performances of agrammatic aphasics measured using sentence-picture matching tasks. This new step in the life cycle of the model serves as a basis for exploring the specific challenges that neurolinguistic computational modeling poses to the neuroinformatics community.
Giga-voxel computational morphogenesis for structural design.
Aage, Niels; Andreassen, Erik; Lazarov, Boyan S; Sigmund, Ole
2017-10-04
In the design of industrial products ranging from hearing aids to automobiles and aeroplanes, material is distributed so as to maximize the performance and minimize the cost. Historically, human intuition and insight have driven the evolution of mechanical design, recently assisted by computer-aided design approaches. The computer-aided approach known as topology optimization enables unrestricted design freedom and shows great promise with regard to weight savings, but its applicability has so far been limited to the design of single components or simple structures, owing to the resolution limits of current optimization methods. Here we report a computational morphogenesis tool, implemented on a supercomputer, that produces designs with giga-voxel resolution-more than two orders of magnitude higher than previously reported. Such resolution provides insights into the optimal distribution of material within a structure that were hitherto unachievable owing to the challenges of scaling up existing modelling and optimization frameworks. As an example, we apply the tool to the design of the internal structure of a full-scale aeroplane wing. The optimized full-wing design has unprecedented structural detail at length scales ranging from tens of metres to millimetres and, intriguingly, shows remarkable similarity to naturally occurring bone structures in, for example, bird beaks. We estimate that our optimized design corresponds to a reduction in mass of 2-5 per cent compared to currently used aeroplane wing designs, which translates into a reduction in fuel consumption of about 40-200 tonnes per year per aeroplane. Our morphogenesis process is generally applicable, not only to mechanical design, but also to flow systems, antennas, nano-optics and micro-systems.
Multi-scale Material Appearance
NASA Astrophysics Data System (ADS)
Wu, Hongzhi
Modeling and rendering the appearance of materials is important for a diverse range of applications of computer graphics - from automobile design to movies and cultural heritage. The appearance of materials varies considerably at different scales, posing significant challenges due to the sheer complexity of the data, as well as the need to maintain inter-scale consistency constraints. This thesis presents a series of studies around the modeling, rendering and editing of multi-scale material appearance. To efficiently render material appearance at multiple scales, we develop an object-space precomputed adaptive sampling method, which precomputes a hierarchy of view-independent points that preserve multi-level appearance. To support bi-scale material appearance design, we propose a novel reflectance filtering algorithm, which rapidly computes the large-scale appearance from small-scale details, by exploiting the low-rank structures of Bidirectional Visible Normal Distribution Functions and pre-rotated Bidirectional Reflectance Distribution Functions in the matrix formulation of the rendering algorithm. This approach can guide the physical realization of appearance, as well as the modeling of real-world materials using very sparse measurements. Finally, we present a bi-scale-inspired high-quality general representation for material appearance described by Bidirectional Texture Functions. Our representation is at once compact, easily editable, and amenable to efficient rendering.
3-D phononic crystals with ultra-wide band gaps
Lu, Yan; Yang, Yang; Guest, James K.; Srivastava, Ankit
2017-01-01
In this paper gradient based topology optimization (TO) is used to discover 3-D phononic structures that exhibit ultra-wide normalized all-angle all-mode band gaps. The challenging computational task of repeated 3-D phononic band-structure evaluations is accomplished by a combination of a fast mixed variational eigenvalue solver and distributed Graphic Processing Unit (GPU) parallel computations. The TO algorithm utilizes the material distribution-based approach and a gradient-based optimizer. The design sensitivity for the mixed variational eigenvalue problem is derived using the adjoint method and is implemented through highly efficient vectorization techniques. We present optimized results for two-material simple cubic (SC), body centered cubic (BCC), and face centered cubic (FCC) crystal structures and show that in each of these cases different initial designs converge to single inclusion network topologies within their corresponding primitive cells. The optimized results show that large phononic stop bands for bulk wave propagation can be achieved at lower than close packed spherical configurations leading to lighter unit cells. For tungsten carbide - epoxy crystals we identify all angle all mode normalized stop bands exceeding 100%, which is larger than what is possible with only spherical inclusions. PMID:28233812
3-D phononic crystals with ultra-wide band gaps.
Lu, Yan; Yang, Yang; Guest, James K; Srivastava, Ankit
2017-02-24
In this paper gradient based topology optimization (TO) is used to discover 3-D phononic structures that exhibit ultra-wide normalized all-angle all-mode band gaps. The challenging computational task of repeated 3-D phononic band-structure evaluations is accomplished by a combination of a fast mixed variational eigenvalue solver and distributed Graphic Processing Unit (GPU) parallel computations. The TO algorithm utilizes the material distribution-based approach and a gradient-based optimizer. The design sensitivity for the mixed variational eigenvalue problem is derived using the adjoint method and is implemented through highly efficient vectorization techniques. We present optimized results for two-material simple cubic (SC), body centered cubic (BCC), and face centered cubic (FCC) crystal structures and show that in each of these cases different initial designs converge to single inclusion network topologies within their corresponding primitive cells. The optimized results show that large phononic stop bands for bulk wave propagation can be achieved at lower than close packed spherical configurations leading to lighter unit cells. For tungsten carbide - epoxy crystals we identify all angle all mode normalized stop bands exceeding 100%, which is larger than what is possible with only spherical inclusions.
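The "normalized" gap figures quoted above follow the standard gap-to-midgap convention. A minimal helper for computing it from band-structure output is sketched below; the inputs (sampled frequencies of the bands bounding the gap) are assumptions, and this is only the reporting metric, not the paper's full optimization objective.

    import numpy as np

    def gap_to_midgap(bands_below, bands_above):
        """Normalized (gap-to-midgap) band gap from sampled band frequencies.

        bands_below : frequencies of the branches bounding the gap from below,
                      sampled over the wave vectors of the irreducible Brillouin zone
        bands_above : frequencies of the branches bounding the gap from above
        """
        w_low = np.max(bands_below)   # top of the lower bands over all wave vectors
        w_high = np.min(bands_above)  # bottom of the upper bands over all wave vectors
        if w_high <= w_low:
            return 0.0                # no complete gap
        return 2.0 * (w_high - w_low) / (w_high + w_low)

Values above 1.0 (100%) mean the gap is wider than its centre frequency, which is the sense in which the abstract reports stop bands exceeding 100%.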
Dynamic Distribution and Layouting of Model-Based User Interfaces in Smart Environments
NASA Astrophysics Data System (ADS)
Roscher, Dirk; Lehmann, Grzegorz; Schwartze, Veit; Blumendorf, Marco; Albayrak, Sahin
Developments in computer technology over the last decade have changed the ways computers are used. The emerging smart environments make it possible to build ubiquitous applications that assist users during their everyday life, at any time, in any context. But the variety of contexts-of-use (user, platform and environment) makes the development of such ubiquitous applications for smart environments, and especially of their user interfaces, a challenging and time-consuming task. We propose a model-based approach, which allows adapting the user interface at runtime to numerous (also unknown) contexts-of-use. Based on a user interface modelling language, defining the fundamentals and constraints of the user interface, a runtime architecture exploits the description to adapt the user interface to the current context-of-use. The architecture provides automatic distribution and layout algorithms for adapting the applications also to contexts unforeseen at design time. Designers do not specify predefined adaptations for each specific situation, but adaptation constraints and guidelines. Furthermore, users are provided with a meta user interface to influence the adaptations according to their needs. A smart home energy management system serves as a running example to illustrate the approach.
Christiansen, Morten H.; Onnis, Luca; Hockema, Stephen A.
2009-01-01
When learning language young children are faced with many seemingly formidable challenges, including discovering words embedded in a continuous stream of sounds and determining what role these words play in syntactic constructions. We suggest that knowledge of phoneme distributions may play a crucial part in helping children segment words and determine their lexical category, and propose an integrated model of how children might go from unsegmented speech to lexical categories. We corroborated this theoretical model using a two-stage computational analysis of a large corpus of English child-directed speech. First, we used transition probabilities between phonemes to find words in unsegmented speech. Second, we used distributional information about word edges—the beginning and ending phonemes of words—to predict whether the segmented words from the first stage were nouns, verbs, or something else. The results indicate that discovering lexical units and their associated syntactic category in child-directed speech is possible by attending to the statistics of single phoneme transitions and word-initial and final phonemes. Thus, we suggest that a core computational principle in language acquisition is that the same source of information is used to learn about different aspects of linguistic structure. PMID:19371361
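A simplified sketch of the first stage (segmentation by phoneme transition probabilities) is given below. The fixed boundary threshold and the data format are assumptions for illustration; the paper's actual criterion and corpus preprocessing differ in detail.

    from collections import defaultdict

    def train_transitions(utterances):
        """Estimate P(next phoneme | current phoneme) from utterances given as
        lists of phoneme symbols with word boundaries removed."""
        counts = defaultdict(lambda: defaultdict(int))
        for utt in utterances:
            for a, b in zip(utt, utt[1:]):
                counts[a][b] += 1
        return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
                for a, nxt in counts.items()}

    def segment(utt, probs, threshold=0.1):
        """Insert a word boundary wherever the phoneme-to-phoneme transition
        probability drops below a (hypothetical) threshold."""
        words, current = [], [utt[0]]
        for a, b in zip(utt, utt[1:]):
            if probs.get(a, {}).get(b, 0.0) < threshold:
                words.append(current)
                current = []
            current.append(b)
        words.append(current)
        return words

The second stage described in the abstract would then look only at the first and last phonemes of each segmented word to assign a lexical category.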
A Weibull distribution accrual failure detector for cloud computing.
Liu, Jiaxi; Wu, Zhibo; Wu, Jin; Dong, Jian; Zhao, Yao; Wen, Dongxin
2017-01-01
Failure detectors are used to build high availability distributed systems as the fundamental component. To meet the requirement of a complicated large-scale distributed system, accrual failure detectors that can adapt to multiple applications have been studied extensively. However, several implementations of accrual failure detectors do not adapt well to the cloud service environment. To solve this problem, a new accrual failure detector based on Weibull Distribution, called the Weibull Distribution Failure Detector, has been proposed specifically for cloud computing. It can adapt to the dynamic and unexpected network conditions in cloud computing. The performance of the Weibull Distribution Failure Detector is evaluated and compared based on public classical experiment data and cloud computing experiment data. The results show that the Weibull Distribution Failure Detector has better performance in terms of speed and accuracy in unstable scenarios, especially in cloud computing.
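The general accrual idea can be sketched as follows: fit a Weibull distribution to past heartbeat inter-arrival times and turn the time elapsed since the last heartbeat into a suspicion value. In the sketch below the shape and scale parameters are assumed to have been estimated elsewhere, and the exact output definition of the paper's detector may differ.

    import math

    def weibull_cdf(t, shape, scale):
        """CDF of the Weibull distribution: F(t) = 1 - exp(-(t/scale)**shape)."""
        return 1.0 - math.exp(-((t / scale) ** shape)) if t > 0 else 0.0

    def suspicion_level(time_since_last_heartbeat, shape, scale):
        """Accrual-style suspicion value: the more the elapsed time exceeds what
        the fitted inter-arrival distribution makes likely, the higher the value."""
        p_alive = 1.0 - weibull_cdf(time_since_last_heartbeat, shape, scale)
        return -math.log10(max(p_alive, 1e-12))

An application then compares the suspicion value against its own threshold, which is what lets a single accrual detector serve applications with different timeliness requirements.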
Potjans, Wiebke; Morrison, Abigail; Diesmann, Markus
2010-01-01
A major puzzle in the field of computational neuroscience is how to relate system-level learning in higher organisms to synaptic plasticity. Recently, plasticity rules depending not only on pre- and post-synaptic activity but also on a third, non-local neuromodulatory signal have emerged as key candidates to bridge the gap between the macroscopic and the microscopic level of learning. Crucial insights into this topic are expected to be gained from simulations of neural systems, as these allow the simultaneous study of the multiple spatial and temporal scales that are involved in the problem. In particular, synaptic plasticity can be studied during the whole learning process, i.e., on a time scale of minutes to hours and across multiple brain areas. Implementing neuromodulated plasticity in large-scale network simulations where the neuromodulatory signal is dynamically generated by the network itself is challenging, because the network structure is commonly defined purely by the connectivity graph without explicit reference to the embedding of the nodes in physical space. Furthermore, the simulation of networks with realistic connectivity entails the use of distributed computing. A neuromodulated synapse must therefore be informed in an efficient way about the neuromodulatory signal, which is typically generated by a population of neurons located on different machines than either the pre- or post-synaptic neuron. Here, we develop a general framework to solve the problem of implementing neuromodulated plasticity in a time-driven distributed simulation, without reference to a particular implementation language, neuromodulator, or neuromodulated plasticity mechanism. We implement our framework in the simulator NEST and demonstrate excellent scaling up to 1024 processors for simulations of a recurrent network incorporating neuromodulated spike-timing dependent plasticity. PMID:21151370
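As a generic sketch of the communication pattern (not NEST's actual mechanism or the framework of the paper), the code below uses an MPI allreduce so that every rank learns the total activity of the neuromodulatory population and can apply a three-factor update to its locally stored synapses; the variable names, learning rate and baseline subtraction are illustrative assumptions.

    # Run under MPI, e.g.: mpirun -np 4 python neuromod_sketch.py
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD

    def neuromodulated_update(local_nm_spikes, eligibility, weights,
                              lr=0.01, baseline=1.0):
        """local_nm_spikes : spike count of neuromodulatory neurons hosted on this rank
        eligibility : per-synapse eligibility traces stored on this rank
        weights     : per-synapse weights on this rank (NumPy array, updated in place)
        """
        global_nm = comm.allreduce(local_nm_spikes, op=MPI.SUM)  # signal known on every rank
        modulator = global_nm - baseline         # deviation from a (hypothetical) baseline
        weights += lr * modulator * eligibility  # three-factor learning rule
        return weights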
Geospatial Data as a Service: Towards planetary scale real-time analytics
NASA Astrophysics Data System (ADS)
Evans, B. J. K.; Larraondo, P. R.; Antony, J.; Richards, C. J.
2017-12-01
The rapid growth of earth systems, environmental and geophysical datasets poses a challenge to both end-users and infrastructure providers. For infrastructure and data providers, tasks like managing, indexing and storing large collections of geospatial data needs to take into consideration the various use cases by which consumers will want to access and use the data. Considerable investment has been made by the Earth Science community to produce suitable real-time analytics platforms for geospatial data. There are currently different interfaces that have been defined to provide data services. Unfortunately, there is considerable difference on the standards, protocols or data models which have been designed to target specific communities or working groups. The Australian National University's National Computational Infrastructure (NCI) is used for a wide range of activities in the geospatial community. Earth observations, climate and weather forecasting are examples of these communities which generate large amounts of geospatial data. The NCI has been carrying out significant effort to develop a data and services model that enables the cross-disciplinary use of data. Recent developments in cloud and distributed computing provide a publicly accessible platform where new infrastructures can be built. One of the key components these technologies offer is the possibility of having "limitless" compute power next to where the data is stored. This model is rapidly transforming data delivery from centralised monolithic services towards ubiquitous distributed services that scale up and down adapting to fluctuations in the demand. NCI has developed GSKY, a scalable, distributed server which presents a new approach for geospatial data discovery and delivery based on OGC standards. We will present the architecture and motivating use-cases that drove GSKY's collaborative design, development and production deployment. We show our approach offers the community valuable exploratory analysis capabilities, for dealing with petabyte-scale geospatial data collections.
Mapping Agricultural Fields in Sub-Saharan Africa with a Computer Vision Approach
NASA Astrophysics Data System (ADS)
Debats, S. R.; Luo, D.; Estes, L. D.; Fuchs, T.; Caylor, K. K.
2014-12-01
Sub-Saharan Africa is an important focus for food security research, because it is experiencing unprecedented population growth, agricultural activities are largely dominated by smallholder production, and the region is already home to 25% of the world's undernourished. One of the greatest challenges to monitoring and improving food security in this region is obtaining an accurate accounting of the spatial distribution of agriculture. Households are the primary units of agricultural production in smallholder communities and typically rely on small fields of less than 2 hectares. Field sizes are directly related to household crop productivity, management choices, and adoption of new technologies. As population and agriculture expand, it becomes increasingly important to understand both the distribution of field sizes as well as how agricultural communities are spatially embedded in the landscape. In addition, household surveys, a common tool for tracking agricultural productivity in Sub-Saharan Africa, would greatly benefit from spatially explicit accounting of fields. Current gridded land cover data sets do not provide information on individual agricultural fields or the distribution of field sizes. Therefore, we employ cutting edge approaches from the field of computer vision to map fields across Sub-Saharan Africa, including semantic segmentation, discriminative classifiers, and automatic feature selection. Our approach aims to not only improve the binary classification accuracy of cropland, but also to isolate distinct fields, thereby capturing crucial information on size and geometry. Our research focuses on the development of descriptive features across scales to increase the accuracy and geographic range of our computer vision algorithm. Relevant data sets include high-resolution remote sensing imagery and Landsat (30-m) multi-spectral imagery. Training data for field boundaries is derived from hand-digitized data sets as well as crowdsourcing.
A compound memristive synapse model for statistical learning through STDP in spiking neural networks
Bill, Johannes; Legenstein, Robert
2014-01-01
Memristors have recently emerged as promising circuit elements to mimic the function of biological synapses in neuromorphic computing. The fabrication of reliable nanoscale memristive synapses, that feature continuous conductance changes based on the timing of pre- and postsynaptic spikes, has however turned out to be challenging. In this article, we propose an alternative approach, the compound memristive synapse, that circumvents this problem by the use of memristors with binary memristive states. A compound memristive synapse employs multiple bistable memristors in parallel to jointly form one synapse, thereby providing a spectrum of synaptic efficacies. We investigate the computational implications of synaptic plasticity in the compound synapse by integrating the recently observed phenomenon of stochastic filament formation into an abstract model of stochastic switching. Using this abstract model, we first show how standard pulsing schemes give rise to spike-timing dependent plasticity (STDP) with a stabilizing weight dependence in compound synapses. In a next step, we study unsupervised learning with compound synapses in networks of spiking neurons organized in a winner-take-all architecture. Our theoretical analysis reveals that compound-synapse STDP implements generalized Expectation-Maximization in the spiking network. Specifically, the emergent synapse configuration represents the most salient features of the input distribution in a Mixture-of-Gaussians generative model. Furthermore, the network's spike response to spiking input streams approximates a well-defined Bayesian posterior distribution. We show in computer simulations how such networks learn to represent high-dimensional distributions over images of handwritten digits with high fidelity even in presence of substantial device variations and under severe noise conditions. Therefore, the compound memristive synapse may provide a synaptic design principle for future neuromorphic architectures. PMID:25565943
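A toy version of the abstract stochastic-switching model is sketched below: the compound synapse is an array of binary memristors, a pulse flips each eligible device with some probability, and the graded efficacy is the fraction of devices in the 'on' state. The switching probability and update scheme are simplified assumptions, not the device physics studied in the paper.

    import numpy as np

    def apply_pulse(states, potentiating, p_switch, rng):
        """Apply one plasticity pulse to a compound synapse of N binary memristors.

        states : boolean array, True = high-conductance ('on') state
        A potentiating pulse switches each 'off' device on with probability p_switch;
        a depressing pulse switches each 'on' device off with the same probability.
        Returns the new states and the effective weight (fraction of 'on' devices).
        """
        u = rng.random(states.shape)
        if potentiating:
            states = states | ((~states) & (u < p_switch))
        else:
            states = states & ~(u < p_switch)
        return states, float(states.mean())

    rng = np.random.default_rng(0)
    synapse = np.zeros(32, dtype=bool)                  # 32 bistable memristors, all off
    synapse, w = apply_pulse(synapse, True, 0.2, rng)   # one potentiating pulse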
Coupled basin-scale water resource models for arid and semiarid regions
NASA Astrophysics Data System (ADS)
Winter, C.; Springer, E.; Costigan, K.; Fasel, P.; Mniewski, S.; Zyvoloski, G.
2003-04-01
Managers of semi-arid and arid water resources must allocate increasingly variable surface sources and limited groundwater resources to growing demands. This challenge is leading to a new generation of detailed computational models that link multiple interacting sources and demands. We will discuss a new computational model of arid region hydrology that we are parameterizing for the upper Rio Grande Basin of the United States. The model consists of linked components for the atmosphere (the Regional Atmospheric Modeling System, RAMS), surface hydrology (the Los Alamos Distributed Hydrologic System, LADHS), and groundwater (the Finite Element Heat and Mass code, FEHM), and the couplings between them. The model runs under the Parallel Application WorkSpace software developed at Los Alamos for applications running on large distributed memory computers. RAMS simulates regional meteorology coupled to global climate data on the one hand and land surface hydrology on the other. LADHS generates runoff by infiltration or saturation excess mechanisms, as well as interception, evapotranspiration, and snow accumulation and melt. FEHM simulates variably saturated flow and heat transport in three dimensions. A key issue is to increase the components’ spatial and temporal resolution to account for changes in topography and other rapidly changing variables that affect results such as soil moisture distribution or groundwater recharge. Thus, RAMS’ smallest grid is 5 km on a side, LADHS uses 100 m spacing, while FEHM concentrates processing on key volumes by means of an unstructured grid. Couplings within our model are based on new scaling methods that link groundwater-groundwater systems and streams to aquifers and we are developing evapotranspiration methods based on detailed calculations of latent heat and vegetative cover. Simulations of precipitation and soil moisture for the 1992-93 El Nino year will be used to demonstrate the approach and suggest further needs.
TethysCluster: A comprehensive approach for harnessing cloud resources for hydrologic modeling
NASA Astrophysics Data System (ADS)
Nelson, J.; Jones, N.; Ames, D. P.
2015-12-01
Advances in water resources modeling are improving the information that can be supplied to support decisions affecting the safety and sustainability of society. However, as water resources models become more sophisticated and data-intensive they require more computational power to run. Purchasing and maintaining the computing facilities needed to support certain modeling tasks has been cost-prohibitive for many organizations. With the advent of the cloud, the computing resources needed to address this challenge are now available and cost-effective, yet there still remains a significant technical barrier to leverage these resources. This barrier inhibits many decision makers and even trained engineers from taking advantage of the best science and tools available. Here we present the Python tools TethysCluster and CondorPy, that have been developed to lower the barrier to model computation in the cloud by providing (1) programmatic access to dynamically scalable computing resources, (2) a batch scheduling system to queue and dispatch the jobs to the computing resources, (3) data management for job inputs and outputs, and (4) the ability to dynamically create, submit, and monitor computing jobs. These Python tools leverage the open source, computing-resource management, and job management software, HTCondor, to offer a flexible and scalable distributed-computing environment. While TethysCluster and CondorPy can be used independently to provision computing resources and perform large modeling tasks, they have also been integrated into Tethys Platform, a development platform for water resources web apps, to enable computing support for modeling workflows and decision-support systems deployed as web apps.
Strategies, Challenges and Prospects for Active Learning in the Computer-Based Classroom
ERIC Educational Resources Information Center
Holbert, K. E.; Karady, G. G.
2009-01-01
The introduction of computer-equipped classrooms into engineering education has brought with it a host of opportunities and issues. Herein, some of the challenges and successes for creating an environment for active learning within computer-based classrooms are described. The particular teaching approach developed for undergraduate electrical…
LaRC local area networks to support distributed computing
NASA Technical Reports Server (NTRS)
Riddle, E. P.
1984-01-01
The Langley Research Center's (LaRC) Local Area Network (LAN) effort is discussed. LaRC initiated the development of a LAN to support a growing distributed computing environment at the Center. The purpose of the network is to provide an improved capability (over interactive and RJE terminal access) for sharing multivendor computer resources. Specifically, the network will provide a data highway for the transfer of files between mainframe computers, minicomputers, work stations, and personal computers. An important influence on the overall network design was the vital need of LaRC researchers to efficiently utilize the large CDC mainframe computers in the central scientific computing facility. Although there was a steady migration from a centralized to a distributed computing environment at LaRC in recent years, the work load on the central resources increased. Major emphasis in the network design was on communication with the central resources within the distributed environment. The network to be implemented will allow researchers to utilize the central resources, distributed minicomputers, work stations, and personal computers to obtain the proper level of computing power to efficiently perform their jobs.
Multilinear Computing and Multilinear Algebraic Geometry
2016-08-10
SUBJECT TERMS: tensors, multilinearity, algebraic geometry, numerical computations, computational tractability. DISTRIBUTION A: Distribution approved for public release.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huang, Zhenyu Henry; Tate, Zeb; Abhyankar, Shrirang
The power grid has been evolving over the last 120 years, but it is seeing more changes in this decade and the next than it has seen over the past century. In particular, the widespread deployment of intermittent renewable generation, smart loads and devices, hierarchical and distributed control technologies, phasor measurement units, energy storage, and widespread usage of electric vehicles will require fundamental changes in methods and tools for the operation and planning of the power grid. The resulting new dynamic and stochastic behaviors will demand the inclusion of more complexity in modeling the power grid. Solving such complex models in the traditional computing environment will be a major challenge. Along with the increasing complexity of power system models, the increasing complexity of smart grid data further adds to the prevailing challenges. In this environment, as the myriad of smart sensors and meters in the power grid increases by multiple orders of magnitude, so do the volume and speed of the data. The information infrastructure will need to change drastically to support the exchange of enormous amounts of data, as smart grid applications will need the capability to collect, assimilate, analyze and process the data to meet real-time grid functions. High performance computing (HPC) holds the promise to enhance these functions, but it is a great resource that has not been fully explored and adopted for the power grid domain.
Quantum computing and probability.
Ferry, David K
2009-11-25
Over the past two decades, quantum computing has become a popular and promising approach to trying to solve computationally difficult problems. Missing in many descriptions of quantum computing is just how probability enters into the process. Here, we discuss some simple examples of how uncertainty and probability enter, and how this and the ideas of quantum computing challenge our interpretations of quantum mechanics. It is found that this uncertainty can lead to intrinsic decoherence, and this raises challenges for error correction.
Multi-Relational Characterization of Dynamic Social Network Communities
NASA Astrophysics Data System (ADS)
Lin, Yu-Ru; Sundaram, Hari; Kelliher, Aisling
The emergence of the mediated social web - a distributed network of participants creating rich media content and engaging in interactive conversations through Internet-based communication technologies - has contributed to the evolution of powerful social, economic and cultural change. Online social network sites and blogs, such as Facebook, Twitter, Flickr and LiveJournal, thrive due to their fundamental sense of "community". The growth of online communities offers both opportunities and challenges for researchers and practitioners. Participation in online communities has been observed to influence people's behavior in diverse ways ranging from financial decision-making to political choices, suggesting the rich potential for diverse applications. However, although studies on the social web have been extensive, discovering communities from online social media remains challenging, due to the interdisciplinary nature of this subject. In this article, we present our recent work on characterization of communities in online social media using computational approaches grounded on the observations from social science.
High-End Computing Challenges in Aerospace Design and Engineering
NASA Technical Reports Server (NTRS)
Bailey, F. Ronald
2004-01-01
High-End Computing (HEC) has had significant impact on aerospace design and engineering and is poised to make even more in the future. In this paper we describe four aerospace design and engineering challenges: Digital Flight, Launch Simulation, Rocket Fuel System and Digital Astronaut. The paper discusses modeling capabilities needed for each challenge and presents projections of future near and far-term HEC computing requirements. NASA's HEC Project Columbia is described and programming strategies presented that are necessary to achieve high real performance.
The National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium (CPTAC) is pleased to announce teams led by Jaewoo Kang (Korea University) and by Yuanfang Guan with Hongyang Li (University of Michigan) as the best performers of the NCI-CPTAC DREAM Proteogenomics Computational Challenge. Over 500 participants from 20 countries registered for the Challenge, which offered $25,000 in cash awards contributed by the NVIDIA Foundation through its Compute the Cure initiative.
A distributed computing model for telemetry data processing
NASA Astrophysics Data System (ADS)
Barry, Matthew R.; Scott, Kevin L.; Weismuller, Steven P.
1994-05-01
We present a new approach to distributing processed telemetry data among spacecraft flight controllers within the control centers at NASA's Johnson Space Center. This approach facilitates the development of application programs which integrate spacecraft-telemetered data and ground-based synthesized data, then distributes this information to flight controllers for analysis and decision-making. The new approach combines various distributed computing models into one hybrid distributed computing model. The model employs both client-server and peer-to-peer distributed computing models cooperating to provide users with information throughout a diverse operations environment. Specifically, it provides an attractive foundation upon which we are building critical real-time monitoring and control applications, while simultaneously lending itself to peripheral applications in playback operations, mission preparations, flight controller training, and program development and verification. We have realized the hybrid distributed computing model through an information sharing protocol. We shall describe the motivations that inspired us to create this protocol, along with a brief conceptual description of the distributed computing models it employs. We describe the protocol design in more detail, discussing many of the program design considerations and techniques we have adopted. Finally, we describe how this model is especially suitable for supporting the implementation of distributed expert system applications.
A distributed computing model for telemetry data processing
NASA Technical Reports Server (NTRS)
Barry, Matthew R.; Scott, Kevin L.; Weismuller, Steven P.
1994-01-01
We present a new approach to distributing processed telemetry data among spacecraft flight controllers within the control centers at NASA's Johnson Space Center. This approach facilitates the development of application programs which integrate spacecraft-telemetered data and ground-based synthesized data, then distributes this information to flight controllers for analysis and decision-making. The new approach combines various distributed computing models into one hybrid distributed computing model. The model employs both client-server and peer-to-peer distributed computing models cooperating to provide users with information throughout a diverse operations environment. Specifically, it provides an attractive foundation upon which we are building critical real-time monitoring and control applications, while simultaneously lending itself to peripheral applications in playback operations, mission preparations, flight controller training, and program development and verification. We have realized the hybrid distributed computing model through an information sharing protocol. We shall describe the motivations that inspired us to create this protocol, along with a brief conceptual description of the distributed computing models it employs. We describe the protocol design in more detail, discussing many of the program design considerations and techniques we have adopted. Finally, we describe how this model is especially suitable for supporting the implementation of distributed expert system applications.
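A minimal in-process illustration of such a hybrid pattern (purely hypothetical, not the information sharing protocol described in the paper) is sketched below: point queries are served on request, client-server style, while subscribers receive pushed updates as soon as new values are published.

    from dataclasses import dataclass, field
    from typing import Callable, Dict, List

    @dataclass
    class TelemetryHub:
        values: Dict[str, float] = field(default_factory=dict)
        subscribers: Dict[str, List[Callable[[str, float], None]]] = field(default_factory=dict)

        def query(self, parameter: str) -> float:
            """Client-server path: answer a point request for the latest value."""
            return self.values[parameter]

        def subscribe(self, parameter: str, callback: Callable[[str, float], None]) -> None:
            """Peer-to-peer-style path: register for pushed updates."""
            self.subscribers.setdefault(parameter, []).append(callback)

        def publish(self, parameter: str, value: float) -> None:
            """Store a new telemetry value and push it to all subscribers."""
            self.values[parameter] = value
            for cb in self.subscribers.get(parameter, []):
                cb(parameter, value)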
UBioLab: a web-LABoratory for Ubiquitous in-silico experiments.
Bartocci, E; Di Berardini, M R; Merelli, E; Vito, L
2012-03-01
The huge and dynamic amount of bioinformatic resources (e.g., data and tools) available nowadays on the Internet represents a big challenge for biologists, as regards their management and visualization, and for bioinformaticians, as regards the possibility of rapidly creating and executing in-silico experiments involving resources and activities spread over the WWW hyperspace. Any framework aiming at integrating such resources as in a physical laboratory must tackle, and possibly handle in a transparent and uniform way, aspects concerning physical distribution, semantic heterogeneity, and the co-existence of different computational paradigms and, as a consequence, of different invocation interfaces (i.e., OGSA for Grid nodes, SOAP for Web Services, Java RMI for Java objects, etc.). The UBioLab framework has been designed and developed as a prototype following the above objective. Several architectural features - such as being fully Web-based and combining domain ontologies, Semantic Web and workflow techniques - give evidence of an effort in this direction. The integration of a semantic knowledge management system for distributed (bioinformatic) resources, a semantic-driven graphic environment for defining and monitoring ubiquitous workflows and an intelligent agent-based technology for their distributed execution allows UBioLab to be a semantic guide for bioinformaticians and biologists, providing (i) a flexible environment for visualizing, organizing and inferring any (semantic and computational) "type" of domain knowledge (e.g., resources and activities, expressed in a declarative form), (ii) a powerful engine for defining and storing semantic-driven ubiquitous in-silico experiments on the domain hyperspace, as well as (iii) a transparent, automatic and distributed environment for correct experiment executions.
AGIS: Integration of new technologies used in ATLAS Distributed Computing
NASA Astrophysics Data System (ADS)
Anisenkov, Alexey; Di Girolamo, Alessandro; Alandes Pradillo, Maria
2017-10-01
The variety of the ATLAS Distributed Computing infrastructure requires a central information system to define the topology of computing resources and to store the different parameters and configuration data which are needed by various ATLAS software components. The ATLAS Grid Information System (AGIS) is the system designed to integrate configuration and status information about resources, services and topology of the computing infrastructure used by ATLAS Distributed Computing applications and services. Being an intermediate middleware system between clients and external information sources (like central BDII, GOCDB, MyOSG), AGIS defines the relations between experiment-specific resources and physical distributed computing capabilities. Having been in production during LHC Run1, AGIS became the central information system for Distributed Computing in ATLAS and it is continuously evolving to fulfil new user requests, enable enhanced operations and follow the extension of the ATLAS Computing model. The ATLAS Computing model and the data structures used by Distributed Computing applications and services are continuously evolving to fit newer requirements from the ADC community. In this note, we describe the evolution and the recent developments of AGIS functionalities related to the integration of new technologies that have recently become widely used in ATLAS Computing, like flexible computing utilization of opportunistic Cloud and HPC resources, ObjectStore service integration for the Distributed Data Management (Rucio) and ATLAS workload management (PanDA) systems, unified storage protocol declaration required for the PanDA Pilot site movers, and others. The improvements of the information model and general updates are also shown; in particular, we explain how other collaborations outside ATLAS could benefit from the system as a computing resources information catalogue. AGIS is evolving towards a common information system, not coupled to a specific experiment.
Modality-Driven Classification and Visualization of Ensemble Variance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bensema, Kevin; Gosink, Luke; Obermaier, Harald
Advances in computational power now enable domain scientists to address conceptual and parametric uncertainty by running simulations multiple times in order to sufficiently sample the uncertain input space. While this approach helps address conceptual and parametric uncertainties, the ensemble datasets produced by this technique present a special challenge to visualization researchers as the ensemble dataset records a distribution of possible values for each location in the domain. Contemporary visualization approaches that rely solely on summary statistics (e.g., mean and variance) cannot convey the detailed information encoded in ensemble distributions that are paramount to ensemble analysis; summary statistics provide no information about modality classification and modality persistence. To address this problem, we propose a novel technique that classifies high-variance locations based on the modality of the distribution of ensemble predictions. Additionally, we develop a set of confidence metrics to inform the end-user of the quality of fit between the distribution at a given location and its assigned class. We apply a similar method to time-varying ensembles to illustrate the relationship between peak variance and bimodal or multimodal behavior. These classification schemes enable a deeper understanding of the behavior of the ensemble members by distinguishing between distributions that can be described by a single tendency and distributions which reflect divergent trends in the ensemble.
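One generic way to classify per-location modality (a stand-in for, not a reproduction of, the paper's classification scheme) is to compare Gaussian-mixture fits of the ensemble samples with an information criterion, as sketched below; the component limit and the confidence proxy are assumptions.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def classify_modality(samples, max_components=3):
        """Classify the ensemble distribution at one grid location by comparing
        Gaussian-mixture fits with BIC.

        samples : 1-D array of ensemble-member values at this location
        Returns (number_of_modes, confidence_proxy).
        """
        x = np.asarray(samples, dtype=float).reshape(-1, 1)
        bics = []
        for k in range(1, max_components + 1):
            gm = GaussianMixture(n_components=k, n_init=3, random_state=0).fit(x)
            bics.append(gm.bic(x))
        best_k = int(np.argmin(bics)) + 1
        # Crude confidence proxy: how much the best fit improves on the
        # single-Gaussian fit (larger means more clearly multimodal).
        confidence = float(bics[0] - min(bics))
        return best_k, confidence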
Effects of Initial Particle Distribution on an Energetic Dispersal of Particles
NASA Astrophysics Data System (ADS)
Rollin, Bertrand; Ouellet, Frederick; Koneru, Rahul; Garno, Joshua; Durant, Bradford
2017-11-01
Accurate prediction of the late-time solid particle cloud distribution resulting from an explosive dispersal of particles is an extremely challenging problem for compressible multiphase flow simulations. The source of this difficulty is twofold: (i) the complex sequence of events taking place. Indeed, as the blast wave crosses the surrounding layer of particles, compaction occurs shortly before particles disperse radially at high speed. Then, during the dispersion phase, complex multiphase interactions occur between particles and detonation products. (ii) Precise characterization of the explosive and of the particle distribution is virtually impossible. In this numerical experiment, we focus on the sensitivity of late-time particle cloud distributions to carefully designed initial distributions, assuming the explosive is well described. Using point-particle simulations, we study the case of a bed of glass particles surrounding an explosive. Constraining our simulations to relatively low initial volume fractions to prevent reaching the close-packing limit, we seek to describe qualitatively and quantitatively the late-time dependence of a solid particle cloud on its distribution before the energy release of an explosive. This work was supported by the U.S. DoE, NNSA, Advanced Simulation and Computing Program, as a Cooperative Agreement under the Predictive Science Academic Alliance Program, under Contract No. DE-NA0002378.
A Weibull distribution accrual failure detector for cloud computing
Wu, Zhibo; Wu, Jin; Zhao, Yao; Wen, Dongxin
2017-01-01
Failure detectors are used to build high availability distributed systems as the fundamental component. To meet the requirement of a complicated large-scale distributed system, accrual failure detectors that can adapt to multiple applications have been studied extensively. However, several implementations of accrual failure detectors do not adapt well to the cloud service environment. To solve this problem, a new accrual failure detector based on Weibull Distribution, called the Weibull Distribution Failure Detector, has been proposed specifically for cloud computing. It can adapt to the dynamic and unexpected network conditions in cloud computing. The performance of the Weibull Distribution Failure Detector is evaluated and compared based on public classical experiment data and cloud computing experiment data. The results show that the Weibull Distribution Failure Detector has better performance in terms of speed and accuracy in unstable scenarios, especially in cloud computing. PMID:28278229
Consolidating WLCG topology and configuration in the Computing Resource Information Catalogue
Alandes, Maria; Andreeva, Julia; Anisenkov, Alexey; ...
2017-10-01
Here, the Worldwide LHC Computing Grid infrastructure links about 200 participating computing centres affiliated with several partner projects. It is built by integrating heterogeneous computer and storage resources in diverse data centres all over the world and provides CPU and storage capacity to the LHC experiments to perform data processing and physics analysis. In order to be used by the experiments, these distributed resources should be well described, which implies easy service discovery and detailed description of service configuration. Currently this information is scattered over multiple generic information sources like GOCDB, OIM, BDII and experiment-specific information systems. Such a model does not make it easy to validate topology and configuration information. Moreover, information in various sources is not always consistent. Finally, the evolution of computing technologies introduces new challenges. Experiments are more and more relying on opportunistic resources, which by their nature are more dynamic and should also be well described in the WLCG information system. This contribution describes the new WLCG configuration service CRIC (Computing Resource Information Catalogue) which collects information from various information providers, performs validation and provides a consistent set of UIs and APIs to the LHC VOs for service discovery and usage configuration. The main requirements for CRIC are simplicity, agility and robustness. CRIC should be able to be quickly adapted to new types of computing resources, new information sources, and allow for new data structures to be implemented easily following the evolution of the computing models and operations of the experiments.
Consolidating WLCG topology and configuration in the Computing Resource Information Catalogue
DOE Office of Scientific and Technical Information (OSTI.GOV)
Alandes, Maria; Andreeva, Julia; Anisenkov, Alexey
Here, the Worldwide LHC Computing Grid infrastructure links about 200 participating computing centres affiliated with several partner projects. It is built by integrating heterogeneous computer and storage resources in diverse data centres all over the world and provides CPU and storage capacity to the LHC experiments to perform data processing and physics analysis. In order to be used by the experiments, these distributed resources should be well described, which implies easy service discovery and detailed description of service configuration. Currently this information is scattered over multiple generic information sources like GOCDB, OIM, BDII and experiment-specific information systems. Such a model does not make it easy to validate topology and configuration information. Moreover, information in various sources is not always consistent. Finally, the evolution of computing technologies introduces new challenges. Experiments are more and more relying on opportunistic resources, which by their nature are more dynamic and should also be well described in the WLCG information system. This contribution describes the new WLCG configuration service CRIC (Computing Resource Information Catalogue) which collects information from various information providers, performs validation and provides a consistent set of UIs and APIs to the LHC VOs for service discovery and usage configuration. The main requirements for CRIC are simplicity, agility and robustness. CRIC should be able to be quickly adapted to new types of computing resources, new information sources, and allow for new data structures to be implemented easily following the evolution of the computing models and operations of the experiments.
Consolidating WLCG topology and configuration in the Computing Resource Information Catalogue
NASA Astrophysics Data System (ADS)
Alandes, Maria; Andreeva, Julia; Anisenkov, Alexey; Bagliesi, Giuseppe; Belforte, Stephano; Campana, Simone; Dimou, Maria; Flix, Jose; Forti, Alessandra; di Girolamo, A.; Karavakis, Edward; Lammel, Stephan; Litmaath, Maarten; Sciaba, Andrea; Valassi, Andrea
2017-10-01
The Worldwide LHC Computing Grid infrastructure links about 200 participating computing centres affiliated with several partner projects. It is built by integrating heterogeneous computer and storage resources in diverse data centres all over the world and provides CPU and storage capacity to the LHC experiments to perform data processing and physics analysis. In order to be used by the experiments, these distributed resources should be well described, which implies easy service discovery and detailed description of service configuration. Currently this information is scattered over multiple generic information sources like GOCDB, OIM, BDII and experiment-specific information systems. Such a model does not make it easy to validate topology and configuration information. Moreover, information in various sources is not always consistent. Finally, the evolution of computing technologies introduces new challenges. Experiments are more and more relying on opportunistic resources, which by their nature are more dynamic and should also be well described in the WLCG information system. This contribution describes the new WLCG configuration service CRIC (Computing Resource Information Catalogue) which collects information from various information providers, performs validation and provides a consistent set of UIs and APIs to the LHC VOs for service discovery and usage configuration. The main requirements for CRIC are simplicity, agility and robustness. CRIC should be able to be quickly adapted to new types of computing resources, new information sources, and allow for new data structures to be implemented easily following the evolution of the computing models and operations of the experiments.
Enhanced conformational sampling using enveloping distribution sampling.
Lin, Zhixiong; van Gunsteren, Wilfred F
2013-10-14
To lessen the problem of insufficient conformational sampling in biomolecular simulations is still a major challenge in computational biochemistry. In this article, an application of the method of enveloping distribution sampling (EDS) is proposed that addresses this challenge and its sampling efficiency is demonstrated in simulations of a hexa-β-peptide whose conformational equilibrium encompasses two different helical folds, i.e., a right-handed 2.7(10∕12)-helix and a left-handed 3(14)-helix, separated by a high energy barrier. Standard MD simulations of this peptide using the GROMOS 53A6 force field did not reach convergence of the free enthalpy difference between the two helices even after 500 ns of simulation time. The use of soft-core non-bonded interactions in the centre of the peptide did enhance the number of transitions between the helices, but at the same time led to neglect of relevant helical configurations. In the simulations of a two-state EDS reference Hamiltonian that envelops both the physical peptide and the soft-core peptide, sampling of the conformational space of the physical peptide ensures that physically relevant conformations can be visited, and sampling of the conformational space of the soft-core peptide helps to enhance the transitions between the two helices. The EDS simulations sampled many more transitions between the two helices and showed much faster convergence of the relative free enthalpy of the two helices compared with the standard MD simulations with only a slightly larger computational effort to determine optimized EDS parameters. Combined with various methods to smoothen the potential energy surface, the proposed EDS application will be a powerful technique to enhance the sampling efficiency in biomolecular simulations.
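For reference, the reference-state Hamiltonian used in EDS to envelop N end states (here the physical and the soft-core peptide, N = 2) is usually written as below; the smoothness parameter s and the energy offsets E_i^R are the quantities optimized in the procedure mentioned in the abstract, and the notation follows the general EDS literature rather than this specific paper.

    V_R(\mathbf{r}) = -\frac{1}{\beta s}\,\ln \sum_{i=1}^{N}
        \exp\!\left[-\beta s\,\bigl(V_i(\mathbf{r}) - E_i^{R}\bigr)\right],
    \qquad \beta = \frac{1}{k_B T},

where the V_i are the end-state potential energies. Smaller values of s smooth the barriers of the reference surface, which is what allows frequent transitions between the enveloped states, and hence between the two helical folds, within a single simulation.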
NASA Technical Reports Server (NTRS)
Keye, Stefan; Togiti, Vamish; Eisfeld, Bernhard; Brodersen, Olaf P.; Rivers, Melissa B.
2013-01-01
The accurate calculation of aerodynamic forces and moments is of significant importance during the design phase of an aircraft. Reynolds-averaged Navier-Stokes (RANS) based Computational Fluid Dynamics (CFD) has been strongly developed over the last two decades regarding robustness, efficiency, and capabilities for aerodynamically complex configurations. Incremental aerodynamic coefficients of different designs can be calculated with acceptable reliability at the cruise design point of transonic aircraft for non-separated flows. But for absolute values, as well as for increments at off-design conditions, significant challenges still exist in computing aerodynamic data and the underlying flow physics with the required accuracy. In addition to drag, pitching moments are difficult to predict because small deviations of the pressure distributions, e.g. due to neglecting wing bending and twisting caused by the aerodynamic loads, can result in large discrepancies compared to experimental data. Flow separations that start to develop at off-design conditions, e.g. in corner flows, at trailing edges, or shock induced, can also have a strong impact on the predictions of aerodynamic coefficients. Based on these challenges faced by the CFD community, a working group of the AIAA Applied Aerodynamics Technical Committee initiated the CFD Drag Prediction Workshop (DPW) series in 2001, resulting in five international workshops. The results of the participants and the committee are summarized in more than 120 papers. The latest, fifth workshop took place in June 2012 in conjunction with the 30th AIAA Applied Aerodynamics Conference. The results in this paper evaluate the influence of static aeroelastic wing deformations on pressure distributions and overall aerodynamic coefficients based on the NASA finite element structural model and the common grids.
Evolving the Technical Infrastructure of the Planetary Data System for the 21st Century
NASA Technical Reports Server (NTRS)
Beebe, Reta F.; Crichton, D.; Hughes, S.; Grayzeck, E.
2010-01-01
The Planetary Data System (PDS) was established in 1989 as a distributed system to assure scientific oversight. Initially the PDS followed guidelines recommended by the National Academies Committee on Data Management and Computation (CODMAC, 1982) and placed emphasis on archiving validated datasets. But over time, user demands, supported by increased computing capabilities and communication methods, have placed increasing demands on the PDS. The PDS must add additional services to better enable scientific analysis within distributed environments and to ensure that those services integrate with existing systems and data. To face these challenges, the Planetary Data System (PDS) must modernize its architecture and technical implementation. The PDS 2010 project addresses these challenges. As part of this project, the PDS has three fundamental project goals that include: (1) Providing more efficient delivery of data by data providers to the PDS; (2) Enabling a stable, long-term usable planetary science data archive; (3) Enabling services for the data consumer to find, access and use the data they require in contemporary data formats. In order to achieve these goals, the PDS 2010 project is upgrading both the technical infrastructure and the data standards to support increased efficiency in data delivery as well as usability of the PDS. Efforts are underway to interface with missions as early as possible and to streamline the preparation and delivery of data to the PDS. Likewise, the PDS is working to define and plan for data services that will help researchers to perform analysis in cost-constrained environments. This presentation will cover the PDS 2010 project including the goals, data standards and technical implementation plans that are underway within the Planetary Data System. It will discuss the plans for moving from the current system, version PDS 3, to version PDS 4.
Parallel computing method for simulating hydrological processes of large rivers under climate change
NASA Astrophysics Data System (ADS)
Wang, H.; Chen, Y.
2016-12-01
Climate change is one of the most widely recognized global environmental problems. It has altered watershed hydrological processes in their distribution over time and space, especially in the world's large rivers. Watershed hydrological process simulation based on physically based distributed hydrological models can give better results than lumped models. However, such simulation involves a large amount of computation, especially for large rivers, and therefore requires huge computing resources that may not be steadily available to researchers, or only at high expense; this has seriously restricted research and application. Current parallel methods mostly parallelize in the space and time dimensions: they process the natural features of the distributed hydrological model (grid cells, units, or sub-basins) in order from upstream to downstream. This article proposes a high-performance computing method for hydrological process simulation with a high speedup ratio and parallel efficiency. It combines the temporal and spatial runoff characteristics of the distributed hydrological model with distributed data storage, an in-memory database, distributed computing, and parallel computing based on computing power units. The method has strong adaptability and extensibility, which means it can make full use of the available computing and storage resources under the condition of limited computing resources, and the computing efficiency improves approximately linearly with the increase of computing resources. This method can satisfy the parallel computing requirements of hydrological process simulation for small, medium and large rivers.
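As an illustrative aside (not the method of this paper), the space-parallel strategy described above, processing sub-basins from upstream to downstream while running independent sub-basins concurrently, can be sketched in a few lines of Python; the toy river topology, the routing rule and the per-basin computation below are invented placeholders.

```python
# Minimal sketch (not the paper's implementation): process sub-basins of a
# river network in parallel, level by level, so that every upstream sub-basin
# is finished before its downstream neighbour starts.
from multiprocessing import Pool

# Hypothetical topology: sub-basin -> list of upstream sub-basins draining into it.
UPSTREAM = {
    "A": [], "B": [], "C": ["A", "B"],   # A and B drain into C
    "D": [], "E": ["C", "D"],            # C and D drain into E (outlet)
}

def dependency_levels(upstream):
    """Group sub-basins into levels; level k depends only on levels < k."""
    levels, placed = [], {}
    while len(placed) < len(upstream):
        level = [b for b in upstream
                 if b not in placed and all(p in placed for p in upstream[b])]
        for b in level:
            placed[b] = len(levels)
        levels.append(level)
    return levels

def simulate_subbasin(basin_id):
    """Stand-in for the per-sub-basin hydrological computation."""
    local_runoff = sum(ord(c) for c in basin_id) * 0.01  # toy number, m^3/s
    return basin_id, local_runoff

if __name__ == "__main__":
    runoff = {}
    with Pool(processes=4) as pool:
        for level in dependency_levels(UPSTREAM):
            # Sub-basins within one level are independent -> run them concurrently.
            for basin_id, q in pool.map(simulate_subbasin, level):
                # Toy routing: add the discharge of all upstream sub-basins.
                runoff[basin_id] = q + sum(runoff[p] for p in UPSTREAM[basin_id])
    print(runoff)  # the outlet "E" accumulates the whole network
```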
Patient specific CFD models of nasal airflow: overview of methods and challenges.
Kim, Sung Kyun; Na, Yang; Kim, Jee-In; Chung, Seung-Kyu
2013-01-18
Respiratory physiology and pathology are strongly dependent on the airflow inside the nasal cavity. However, the nasal anatomy, which is characterized by complex airway channels and significant individual differences, is difficult to analyze. Thus, commonly adopted diagnostic tools have yielded limited success. Nevertheless, with the rapid advances in computer resources, there have been more elaborate attempts to correlate airflow characteristics in human nasal airways with the symptoms and functions of the nose through computational fluid dynamics (CFD) studies. Furthermore, the computed nasal geometry can be virtually modified to reflect the predicted results of a proposed surgical technique. In this article, several CFD issues in patient-specific three-dimensional (3D) modeling of the nasal cavity and clinical applications are reviewed in relation to the cases of deviated nasal septum (decision for surgery), turbinectomy, and maxillary sinus ventilation (simulated- and post-surgery). The clinical relevance of fluid mechanical parameters, such as nasal resistance, flow allocation, wall shear stress, and heat/humidity/NO gas distributions, to the symptoms and surgical outcome is discussed. Absolute values of such parameters reported by different research groups differ from each other due to individual differences in nasal anatomy, the methodology for 3D modeling and numerical gridding, and the laminar/turbulent flow model in the CFD code. However, within each research group, the correlation of these parameters with symptoms and surgical outcome appears consistent across subject-specific models and their variations (virtual- and post-surgery models). To establish more reliable, patient-specific, and objective CFD-based tools for diagnosis and for assessing the outcomes of nasal surgery, the future challenges will be the standardization of the methodology for creating 3D airway models and of the CFD procedures. Copyright © 2012 Elsevier Ltd. All rights reserved.
Security screening via computational imaging using frequency-diverse metasurface apertures
NASA Astrophysics Data System (ADS)
Smith, David R.; Reynolds, Matthew S.; Gollub, Jonah N.; Marks, Daniel L.; Imani, Mohammadreza F.; Yurduseven, Okan; Arnitz, Daniel; Pedross-Engel, Andreas; Sleasman, Timothy; Trofatter, Parker; Boyarsky, Michael; Rose, Alec; Odabasi, Hayrettin; Lipworth, Guy
2017-05-01
Computational imaging is a proven strategy for obtaining high-quality images with fast acquisition rates and simpler hardware. Metasurfaces provide exquisite control over electromagnetic fields, enabling the radiated field to be molded into unique patterns. The fusion of these two concepts can bring about revolutionary advances in the design of imaging systems for security screening. In the context of computational imaging, each field pattern serves as a single measurement of a scene; imaging a scene can then be interpreted as estimating the reflectivity distribution of a target from a set of measurements. As with any computational imaging system, the key challenge is to arrive at a minimal set of measurements from which a diffraction-limited image can be resolved. Here, we show that the information content of a frequency-diverse metasurface aperture can be maximized by design, and used to construct a complete millimeter-wave imaging system spanning a 2 m by 2 m area, consisting of 96 metasurfaces, capable of producing diffraction-limited images of human-scale targets. The metasurface-based frequency-diverse system presented in this work represents an inexpensive, but tremendously flexible alternative to traditional hardware paradigms, offering the possibility of low-cost, real-time, and ubiquitous screening platforms.
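As a hedged illustration of the estimation step described in this abstract (not the authors' reconstruction code), the scene reflectivity can be recovered from a set of field-pattern measurements with a linear model y = Hx + noise and Tikhonov-regularized least squares; the matrix sizes, sparsity and noise level below are arbitrary assumptions.

```python
# Toy reconstruction for a frequency-diverse computational imager:
# each row of H stands for one field pattern; y holds the measurements.
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_measurements = 256, 180          # illustrative sizes only
x_true = np.zeros(n_pixels)
x_true[rng.choice(n_pixels, 8, replace=False)] = 1.0   # sparse reflectivity scene

H = rng.standard_normal((n_measurements, n_pixels))    # stand-in measurement matrix
y = H @ x_true + 0.05 * rng.standard_normal(n_measurements)

# Tikhonov-regularized least squares: x_hat = argmin ||y - Hx||^2 + alpha*||x||^2
alpha = 1.0
x_hat = np.linalg.solve(H.T @ H + alpha * np.eye(n_pixels), H.T @ y)

print("relative reconstruction error:",
      np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```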
Dinov, Ivo D.; Petrosyan, Petros; Liu, Zhizhong; Eggert, Paul; Zamanyan, Alen; Torri, Federica; Macciardi, Fabio; Hobel, Sam; Moon, Seok Woo; Sung, Young Hee; Jiang, Zhiguo; Labus, Jennifer; Kurth, Florian; Ashe-McNalley, Cody; Mayer, Emeran; Vespa, Paul M.; Van Horn, John D.; Toga, Arthur W.
2013-01-01
The volume, diversity and velocity of biomedical data are increasing exponentially, providing petabytes of new neuroimaging and genetics data every year. At the same time, tens of thousands of computational algorithms are developed and reported in the literature along with thousands of software tools and services. Users demand intuitive, quick and platform-agnostic access to data, software tools, and infrastructure from millions of hardware devices. This explosion of information, scientific techniques, computational models, and technological advances leads to enormous challenges in data analysis, evidence-based biomedical inference and reproducibility of findings. The Pipeline workflow environment provides a crowd-based distributed solution for consistent management of these heterogeneous resources. The Pipeline allows multiple (local) clients and (remote) servers to connect, exchange protocols, control the execution, monitor the states of different tools or hardware, and share complete protocols as portable XML workflows. In this paper, we demonstrate several advanced computational neuroimaging and genetics case-studies, and end-to-end pipeline solutions. These are implemented as graphical workflow protocols in the context of analyzing imaging (sMRI, fMRI, DTI), phenotypic (demographic, clinical), and genetic (SNP) data. PMID:23975276
Polytopol computing for multi-core and distributed systems
NASA Astrophysics Data System (ADS)
Spaanenburg, Henk; Spaanenburg, Lambert; Ranefors, Johan
2009-05-01
Multi-core computing presents new challenges to software engineering. The paper addresses such issues in the general setting of polytopol computing, which takes into account multi-core problems in such widely differing areas as ambient intelligence sensor networks and cloud computing. It argues that the essence lies in a suitable allocation of freely moving tasks. Where hardware is ubiquitous and pervasive, the network is virtualized into a connection of software snippets judiciously injected into the hardware so that a system function appears as a single whole again. The concept of polytopol computing provides a further formalization in terms of the partitioning of labor between collector and sensor nodes. Collectors provide functions such as knowledge integrator, awareness collector, situation displayer/reporter, communicator of clues, and inquiry-interface provider. Sensors provide functions such as anomaly detection (communicating only singularities, not continuous observations); they are generally powered or self-powered, amorphous (not on a grid) with generation-and-attrition, field re-programmable, and plug-and-play-able. Together the collector and the sensor are part of the skeleton injector mechanism, added to every node, which gives the network the ability to organize itself into one of many topologies. Finally, we discuss a number of applications and indicate how a multi-core architecture supports the security aspects of the skeleton injector.
Computational challenges in modeling gene regulatory events.
Pataskar, Abhijeet; Tiwari, Vijay K
2016-10-19
Cellular transcriptional programs driven by genetic and epigenetic mechanisms could be better understood by integrating "omics" data and subsequently modeling the gene-regulatory events. Toward this end, computational biology should keep pace with evolving experimental procedures and data availability. This article gives an exemplified account of the current computational challenges in molecular biology.
Computer-Assisted Diagnostic Decision Support: History, Challenges, and Possible Paths Forward
ERIC Educational Resources Information Center
Miller, Randolph A.
2009-01-01
This paper presents a brief history of computer-assisted diagnosis, including challenges and future directions. Some ideas presented in this article on computer-assisted diagnostic decision support systems (CDDSS) derive from prior work by the author and his colleagues (see list in Acknowledgments) on the INTERNIST-1 and QMR projects. References…
Designing and validating the joint battlespace infosphere
NASA Astrophysics Data System (ADS)
Peterson, Gregory D.; Alexander, W. Perry; Birdwell, J. Douglas
2001-08-01
Fielding and managing the dynamic, complex information systems infrastructure necessary for defense operations presents significant opportunities for revolutionary improvements in capabilities. An example of this technology trend is the creation and validation of the Joint Battlespace Infosphere (JBI) being developed by the Air Force Research Lab. The JBI is a system of systems that integrates, aggregates, and distributes information to users at all echelons, from the command center to the battlefield. The JBI is a key enabler of meeting the Air Force's Joint Vision 2010 core competencies such as Information Superiority, by providing increased situational awareness, planning capabilities, and dynamic execution. At the same time, creating this new operational environment introduces significant risk due to an increased dependency on computational and communications infrastructure combined with more sophisticated and frequent threats. Hence, the challenge facing the nation is the most effective means to exploit new computational and communications technologies while mitigating the impact of attacks, faults, and unanticipated usage patterns.
Activation pathway of Src kinase reveals intermediate states as novel targets for drug design
Shukla, Diwakar; Meng, Yilin; Roux, Benoît; Pande, Vijay S.
2014-01-01
Unregulated activation of Src kinases leads to aberrant signaling, uncontrolled growth, and differentiation of cancerous cells. Reaching a complete mechanistic understanding of large scale conformational transformations underlying the activation of kinases could greatly help in the development of therapeutic drugs for the treatment of these pathologies. In principle, the nature of conformational transition could be modeled in silico via atomistic molecular dynamics simulations, although this is very challenging due to the long activation timescales. Here, we employ a computational paradigm that couples transition pathway techniques and Markov state model-based massively distributed simulations for mapping the conformational landscape of c-src tyrosine kinase. The computations provide the thermodynamics and kinetics of kinase activation for the first time, and help identify key structural intermediates. Furthermore, the presence of a novel allosteric site in an intermediate state of c-src that could be potentially utilized for drug design is predicted. PMID:24584478
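A purely illustrative sketch of the Markov state model machinery mentioned above (not the authors' pipeline): estimate a row-stochastic transition matrix by counting lag-time transitions in discretized trajectories, then read the equilibrium populations off its leading left eigenvector. The three-state chain and trajectories below are synthetic.

```python
# Toy Markov state model: count lag-time transitions over discrete trajectories,
# normalize rows, and read off the stationary (equilibrium) distribution.
import numpy as np

def estimate_transition_matrix(trajs, n_states, lag=1):
    counts = np.zeros((n_states, n_states))
    for traj in trajs:
        for i, j in zip(traj[:-lag], traj[lag:]):
            counts[i, j] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def stationary_distribution(T):
    evals, evecs = np.linalg.eig(T.T)            # left eigenvectors of T
    pi = np.real(evecs[:, np.argmax(np.real(evals))])
    return pi / pi.sum()

rng = np.random.default_rng(1)
true_T = np.array([[0.90, 0.08, 0.02],
                   [0.05, 0.90, 0.05],
                   [0.02, 0.08, 0.90]])
# Generate a few synthetic discrete trajectories from the "true" chain.
trajs = []
for _ in range(5):
    s, traj = 0, [0]
    for _ in range(2000):
        s = rng.choice(3, p=true_T[s])
        traj.append(s)
    trajs.append(traj)

T_hat = estimate_transition_matrix(trajs, n_states=3, lag=1)
print("estimated transition matrix:\n", np.round(T_hat, 2))
print("stationary distribution:", np.round(stationary_distribution(T_hat), 3))
```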
NASA Astrophysics Data System (ADS)
Cao, Siqin; Zhu, Lizhe; Huang, Xuhui
2018-04-01
The 3D reference interaction site model (3DRISM) is a powerful tool to study the thermodynamic and structural properties of liquids. However, for hydrophobic solutes, the inhomogeneity of the solvent density around them poses a great challenge to the 3DRISM theory. To address this issue, we have previously introduced the hydrophobic-induced density inhomogeneity theory (HI) for purely hydrophobic solutes. To further consider the complex hydrophobic solutes containing partial charges, here we propose the D2MSA closure to incorporate the short-range and long-range interactions with the D2 closure and the mean spherical approximation, respectively. We demonstrate that our new theory can compute the solvent distributions around real hydrophobic solutes in water and complex organic solvents that agree well with the explicit solvent molecular dynamics simulations.
Technologies for imaging neural activity in large volumes
Ji, Na; Freeman, Jeremy; Smith, Spencer L.
2017-01-01
Neural circuitry has evolved to form distributed networks that act dynamically across large volumes. Collecting data from individual planes, conventional microscopy cannot sample circuitry across large volumes at the temporal resolution relevant to neural circuit function and behaviors. Here, we review emerging technologies for rapid volume imaging of neural circuitry. We focus on two critical challenges: the inertia of optical systems, which limits image speed, and aberrations, which restrict the image volume. Optical sampling time must be long enough to ensure high-fidelity measurements, but optimized sampling strategies and point spread function engineering can facilitate rapid volume imaging of neural activity within this constraint. We also discuss new computational strategies for the processing and analysis of volume imaging data of increasing size and complexity. Together, optical and computational advances are providing a broader view of neural circuit dynamics, and help elucidate how brain regions work in concert to support behavior. PMID:27571194
ASCR Cybersecurity for Scientific Computing Integrity
DOE Office of Scientific and Technical Information (OSTI.GOV)
Piesert, Sean
The Department of Energy (DOE) has the responsibility to address the energy, environmental, and nuclear security challenges that face our nation. Much of DOE’s enterprise involves distributed, collaborative teams; a significant fraction involves “open science,” which depends on multi-institutional, often international collaborations that must access or share significant amounts of information between institutions and over networks around the world. The mission of the Office of Science is the delivery of scientific discoveries and major scientific tools to transform our understanding of nature and to advance the energy, economic, and national security of the United States. The ability of DOE to execute its responsibilities depends critically on its ability to assure the integrity and availability of scientific facilities and computer systems, and of the scientific, engineering, and operational software and data that support its mission.
The Square Kilometre Array: a challenge for 2020 to which Spain can contribute in 2012
NASA Astrophysics Data System (ADS)
Verdes-Montenegro, L.; Dios Santander-Vela, J. D.
2013-05-01
The SKA, composed of several hundreds of antennas of three different types, with separations of up to 3,000 km and a field of view of up to 200 square degrees, will be the largest, most sensitive radio telescope ever built. It will be able to provide fundamental answers in areas such as the dark era, when gas in galaxies was first turned into stars and the first black holes formed; star formation in nearby galaxies from stellar birth to death; faint extragalactic emission; magnetism in galaxies; extrasolar planets; and the confrontation of Einstein's predictions with pulsar and black hole observations. The technological challenges involved offer an unprecedented opportunity to collaborate in the development of hardware and software technologies. The energy requirements of the SKA provide an opportunity to accelerate technology development in scalable renewable energy generation, distribution, storage, and demand monitoring and reduction. Data transport will reach over a hundred times the current global internet traffic data rates, delivering as much data as the full World Wide Web. Processing this data torrent in real time will require high-performance distributed computing as well as data storage and innovative retrieval technologies at the exascale. This way of doing science, based on data-intensive interdisciplinary cooperation, is the basis of the concept of e-Science, which necessarily includes outreach as an indissoluble part of knowledge-based human progress. The scientific and technological challenges and opportunities that the SKA can bring to the Spanish community will be described in this talk.
NASA's Earth Observing Data and Information System - Near-Term Challenges
NASA Technical Reports Server (NTRS)
Behnke, Jeanne; Mitchell, Andrew; Ramapriyan, Hampapuram
2018-01-01
NASA's Earth Observing System Data and Information System (EOSDIS) has been a central component of the NASA Earth observation program since the 1990s. EOSDIS manages data covering a wide range of Earth science disciplines including the cryosphere, land cover change, polar processes, field campaigns, ocean surface, digital elevation, atmospheric dynamics and composition, inter-disciplinary research, and many others. One of the key components of EOSDIS is a set of twelve discipline-based Distributed Active Archive Centers (DAACs) distributed across the United States. Managed by NASA's Earth Science Data and Information System (ESDIS) Project at Goddard Space Flight Center, these DAACs serve over 3 million users globally. The ESDIS Project provides the infrastructure support for EOSDIS, which includes other components such as the Science Investigator-led Processing Systems (SIPS), common metadata and metrics management systems, specialized network systems, standards management, and centralized support for use of commercial cloud capabilities. Given the long-term requirements, the rapid pace of information technology, and the changing expectations of the user community, EOSDIS has evolved continually over the past three decades. However, many challenges remain. Challenges addressed in this paper include: growing volume and variety, achieving consistency across a diverse set of data producers, managing information about a large number of datasets, migration to a cloud computing environment, optimizing data discovery and access, incorporating user feedback from a diverse community, keeping metadata updated as data collections grow and age, and ensuring that all the content needed for understanding datasets by future users is identified and preserved.
Perceiving emotion: towards a realistic understanding of the task.
Cowie, Roddy
2009-12-12
A decade ago, perceiving emotion was generally equated with taking a sample (a still photograph or a few seconds of speech) that unquestionably signified an archetypal emotional state, and attaching the appropriate label. Computational research has shifted that paradigm in multiple ways. Concern with realism is key. Emotion generally colours ongoing action and interaction: describing that colouring is a different problem from categorizing brief episodes of relatively pure emotion. Multiple challenges flow from that. Describing emotional colouring is a challenge in itself. One approach is to use everyday categories describing states that are partly emotional and partly cognitive. Another approach is to use dimensions. Both approaches need ways to deal with gradual changes over time and mixed emotions. Attaching target descriptions to a sample poses problems of both procedure and validation. Cues are likely to be distributed both in time and across modalities, and key decisions may depend heavily on context. The usefulness of acted data is limited because it tends not to reproduce these features. By engaging with these challenging issues, research is not only achieving impressive results, but also offering a much deeper understanding of the problem.
Multi-agent coordination algorithms for control of distributed energy resources in smart grids
NASA Astrophysics Data System (ADS)
Cortes, Andres
Sustainable energy is a top priority for researchers these days, since electricity and transportation are pillars of modern society. Integration of clean energy technologies such as wind, solar, and plug-in electric vehicles (PEVs) is a major engineering challenge in the operation and management of power systems. This is due to the uncertain nature of renewable energy technologies and the large amount of extra load that PEVs would add to the power grid. Given the networked structure of a power system, multi-agent control and optimization strategies are natural approaches to address the various problems of interest for the safe and reliable operation of the power grid. The distributed computation in multi-agent algorithms addresses three problems at the same time: i) it allows for the handling of problems with millions of variables that a single processor cannot compute, ii) it allows certain independence and privacy to electricity customers by not requiring any usage information, and iii) it is robust to localized failures in the communication network, being able to solve problems by simply neglecting the failing section of the system. We propose various algorithms to coordinate storage, generation, and demand resources in a power grid using multi-agent computation and decentralized decision making. First, we introduce a hierarchical vehicle-one-grid (V1G) algorithm for coordination of PEVs under usage constraints, where energy only flows from the grid into the batteries of PEVs. We then present a hierarchical vehicle-to-grid (V2G) algorithm for PEV coordination that takes into consideration line capacity constraints in the distribution grid, and where energy flows both ways, from the grid into the batteries, and from the batteries to the grid. Next, we develop a greedy-like hierarchical algorithm for management of demand response events with on/off loads. Finally, we introduce distributed algorithms for the optimal control of distributed energy resources, i.e., generation and storage in a microgrid. The algorithms we present are provably correct and tested in simulation. Each algorithm is assumed to work on a particular network topology, and simulation studies are carried out in order to demonstrate their convergence properties to a desired solution.
Computer Graphics Simulations of Sampling Distributions.
ERIC Educational Resources Information Center
Gordon, Florence S.; Gordon, Sheldon P.
1989-01-01
Describes the use of computer graphics simulations to enhance student understanding of sampling distributions that arise in introductory statistics. Highlights include the distribution of sample proportions, the distribution of the difference of sample means, the distribution of the difference of sample proportions, and the distribution of sample…
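A classroom-style simulation of the first of these sampling distributions is easy to reproduce; the sketch below (with an assumed population proportion p = 0.3 and sample size n = 50) builds the empirical sampling distribution of the sample proportion and compares its spread with the theoretical value sqrt(p(1-p)/n).

```python
# Simulate the sampling distribution of a sample proportion.
import numpy as np

rng = np.random.default_rng(42)
p, n, n_samples = 0.3, 50, 10_000          # illustrative values only

# Each row is one sample of n Bernoulli(p) observations.
samples = rng.random((n_samples, n)) < p
p_hats = samples.mean(axis=1)              # one sample proportion per sample

print("mean of p_hat:", p_hats.mean())                     # close to p
print("sd of p_hat:  ", p_hats.std(ddof=1))                # empirical spread
print("theory sqrt(p(1-p)/n):", (p * (1 - p) / n) ** 0.5)  # close to the same value
```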
Simulation study of entropy production in the one-dimensional Vlasov system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dai, Zongliang, E-mail: liangliang1223@gmail.com; Wang, Shaojie
2016-07-15
The coarse-grain averaged distribution function of the one-dimensional Vlasov system is obtained by numerical simulation. The entropy production in the cases of a random field, linear Landau damping, and the bump-on-tail instability is computed with the coarse-grain averaged distribution function. The computed entropy production converges as the coarse-graining length increases. When the distribution function differs only slightly from a Maxwellian distribution, the converged value agrees with the result computed using the definition of thermodynamic entropy. The choice of coarse-graining length used to compute the coarse-grain averaged distribution function is discussed.
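To illustrate the coarse-grain averaging used in this work (this is a stand-alone sketch, not the paper's Vlasov simulation), one can coarse-grain a filamented one-dimensional distribution f(v) over blocks of increasing length and evaluate S = -∫ f ln f dv for each block length; the grid and the perturbed-Maxwellian profile below are assumptions.

```python
# Coarse-grain a 1D distribution function over velocity bins of increasing length
# and compute the entropy S = -∫ f ln f dv of the coarse-grained profile.
import numpy as np

v = np.linspace(-6.0, 6.0, 1200)
dv = v[1] - v[0]
# A filamented (perturbed) Maxwellian standing in for the simulated f(v).
f = np.exp(-v**2 / 2) * (1 + 0.3 * np.sin(40 * v))
f /= np.sum(f) * dv                        # normalize to unit density

def coarse_grain(f, cells_per_bin):
    """Replace f by its average over blocks of `cells_per_bin` grid cells."""
    n = (len(f) // cells_per_bin) * cells_per_bin
    blocks = f[:n].reshape(-1, cells_per_bin).mean(axis=1)
    return np.repeat(blocks, cells_per_bin)

def entropy(f, dv):
    fpos = np.where(f > 0, f, 1.0)          # avoid log(0); empty cells contribute 0
    return -np.sum(f * np.log(fpos)) * dv

for cells in (1, 5, 20, 60):
    S = entropy(coarse_grain(f, cells), dv)
    print(f"coarse-grain length {cells:3d} cells: S = {S:.4f}")   # S grows with averaging
```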
A comparison of queueing, cluster and distributed computing systems
NASA Technical Reports Server (NTRS)
Kaplan, Joseph A.; Nelson, Michael L.
1993-01-01
Using workstation clusters for distributed computing has become popular with the proliferation of inexpensive, powerful workstations. Workstation clusters offer both a cost-effective alternative to batch processing and an easy entry into parallel computing. However, a number of workstations on a network does not constitute a cluster. Cluster management software is necessary to harness the collective computing power. A variety of cluster management and queuing systems are compared: Distributed Queueing Systems (DQS), Condor, Load Leveler, Load Balancer, Load Sharing Facility (LSF - formerly Utopia), Distributed Job Manager (DJM), Computing in Distributed Networked Environments (CODINE), and NQS/Exec. The systems differ in their design philosophy and implementation. Based on published reports on the different systems and conversations with the systems' developers and vendors, a comparison of the systems is made on the integral issues of clustered computing.
Fault Tolerant Software Technology for Distributed Computer Systems
1989-03-01
Final Technical Report: "Fault Tolerant Software Technology for Distributed Computing Systems," a two-year effort performed at Georgia Institute of Technology as part of the Clouds Project.
Challenges of Big Data Analysis.
Fan, Jianqing; Han, Fang; Liu, Han
2014-06-01
Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This article gives an overview of the salient features of Big Data and how these features drive a paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogeneity assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity; they can lead to wrong statistical inferences and, consequently, wrong scientific conclusions.
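One of the pitfalls named in this abstract, spurious correlation, is simple to demonstrate numerically: with many predictors that are completely independent of the response, the largest absolute sample correlation still grows with dimensionality. The sample size and dimensions below are illustrative.

```python
# Spurious correlation: even when every predictor is independent of y, the largest
# absolute sample correlation grows as the number of predictors increases.
import numpy as np

rng = np.random.default_rng(7)
n = 60                                   # sample size (illustrative)

for p in (10, 100, 1000, 10_000):        # number of irrelevant predictors
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)           # independent of every column of X
    # Sample correlation of y with each column of X.
    Xc = (X - X.mean(0)) / X.std(0)
    yc = (y - y.mean()) / y.std()
    corr = Xc.T @ yc / n
    print(f"p = {p:6d}: max |corr| = {np.abs(corr).max():.3f}")
```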
Challenges of Big Data Analysis
Fan, Jianqing; Han, Fang; Liu, Han
2014-01-01
Big Data bring new opportunities to modern society and challenges to data scientists. On one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This article gives an overview of the salient features of Big Data and how these features drive a paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogeneity assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity; they can lead to wrong statistical inferences and, consequently, wrong scientific conclusions. PMID:25419469
An Overview of High Performance Computing and Challenges for the Future
Google Tech Talks
2017-12-09
In this talk we examine how high performance computing has changed over the last 10 years and look toward the future in terms of trends. These changes have had and will continue to have a major impact on our software. A new generation of software libraries and algorithms is needed for the effective and reliable use of (wide area) dynamic, distributed and parallel environments. Some of the software and algorithm challenges have already been encountered, such as management of communication and memory hierarchies through a combination of compile-time and run-time techniques, but the increased scale of computation, depth of memory hierarchies, range of latencies, and increased run-time environment variability will make these problems much harder. We will focus on the redesign of software to fit multicore architectures. Speaker: Jack Dongarra University of Tennessee Oak Ridge National Laboratory University of Manchester Jack Dongarra received a Bachelor of Science in Mathematics from Chicago State University in 1972 and a Master of Science in Computer Science from the Illinois Institute of Technology in 1973. He received his Ph.D. in Applied Mathematics from the University of New Mexico in 1980. He worked at the Argonne National Laboratory until 1989, becoming a senior scientist. He now holds an appointment as University Distinguished Professor of Computer Science in the Electrical Engineering and Computer Science Department at the University of Tennessee, has the position of a Distinguished Research Staff member in the Computer Science and Mathematics Division at Oak Ridge National Laboratory (ORNL), Turing Fellow in the Computer Science and Mathematics Schools at the University of Manchester, and an Adjunct Professor in the Computer Science Department at Rice University. He specializes in numerical algorithms in linear algebra, parallel computing, the use of advanced-computer architectures, programming methodology, and tools for parallel computers. His research includes the development, testing and documentation of high quality mathematical software. He has contributed to the design and implementation of the following open source software packages and systems: EISPACK, LINPACK, the BLAS, LAPACK, ScaLAPACK, Netlib, PVM, MPI, NetSolve, Top500, ATLAS, and PAPI. He has published approximately 200 articles, papers, reports and technical memoranda and he is coauthor of several books. He was awarded the IEEE Sid Fernbach Award in 2004 for his contributions in the application of high performance computers using innovative approaches. He is a Fellow of the AAAS, ACM, and the IEEE and a member of the National Academy of Engineering.
An Overview of High Performance Computing and Challenges for the Future
DOE Office of Scientific and Technical Information (OSTI.GOV)
Google Tech Talks
In this talk we examine how high performance computing has changed over the last 10 years and look toward the future in terms of trends. These changes have had and will continue to have a major impact on our software. A new generation of software libraries and algorithms is needed for the effective and reliable use of (wide area) dynamic, distributed and parallel environments. Some of the software and algorithm challenges have already been encountered, such as management of communication and memory hierarchies through a combination of compile-time and run-time techniques, but the increased scale of computation, depth of memory hierarchies, range of latencies, and increased run-time environment variability will make these problems much harder. We will focus on the redesign of software to fit multicore architectures. Speaker: Jack Dongarra University of Tennessee Oak Ridge National Laboratory University of Manchester Jack Dongarra received a Bachelor of Science in Mathematics from Chicago State University in 1972 and a Master of Science in Computer Science from the Illinois Institute of Technology in 1973. He received his Ph.D. in Applied Mathematics from the University of New Mexico in 1980. He worked at the Argonne National Laboratory until 1989, becoming a senior scientist. He now holds an appointment as University Distinguished Professor of Computer Science in the Electrical Engineering and Computer Science Department at the University of Tennessee, has the position of a Distinguished Research Staff member in the Computer Science and Mathematics Division at Oak Ridge National Laboratory (ORNL), Turing Fellow in the Computer Science and Mathematics Schools at the University of Manchester, and an Adjunct Professor in the Computer Science Department at Rice University. He specializes in numerical algorithms in linear algebra, parallel computing, the use of advanced-computer architectures, programming methodology, and tools for parallel computers. His research includes the development, testing and documentation of high quality mathematical software. He has contributed to the design and implementation of the following open source software packages and systems: EISPACK, LINPACK, the BLAS, LAPACK, ScaLAPACK, Netlib, PVM, MPI, NetSolve, Top500, ATLAS, and PAPI. He has published approximately 200 articles, papers, reports and technical memoranda and he is coauthor of several books. He was awarded the IEEE Sid Fernbach Award in 2004 for his contributions in the application of high performance computers using innovative approaches. He is a Fellow of the AAAS, ACM, and the IEEE and a member of the National Academy of Engineering.
High-performance computing in image registration
NASA Astrophysics Data System (ADS)
Zanin, Michele; Remondino, Fabio; Dalla Mura, Mauro
2012-10-01
Thanks to recent technological advances, a large variety of image data is at our disposal with variable geometric, radiometric and temporal resolution. In many applications the processing of such images needs high performance computing techniques in order to deliver timely responses, e.g. for rapid decisions or real-time actions. Thus, parallel or distributed computing methods, Digital Signal Processor (DSP) architectures, Graphical Processing Unit (GPU) programming and Field-Programmable Gate Array (FPGA) devices have become essential tools for the challenging issue of processing large amounts of geo-data. The article focuses on the processing and registration of large datasets of terrestrial and aerial images for 3D reconstruction, diagnostic purposes and monitoring of the environment. For the image alignment procedure, sets of corresponding feature points need to be automatically extracted in order to subsequently compute the geometric transformation that aligns the data. Feature extraction and matching are among the most computationally demanding operations in the processing chain; thus, a great degree of automation and speed is mandatory. The details of the implemented operations (named LARES) exploiting parallel architectures and GPU are thus presented. The innovative aspects of the implementation are (i) the effectiveness on a large variety of unorganized and complex datasets, (ii) the capability to work with high-resolution images and (iii) the speed of the computations. Examples and comparisons with standard CPU processing are also reported and commented on.
NASA Technical Reports Server (NTRS)
Lee, A. Y.
1967-01-01
Computer program calculates the steady state fluid distribution, temperature rise, and pressure drop of a coolant, the material temperature distribution of a heat generating solid, and the heat flux distributions at the fluid-solid interfaces. It performs the necessary iterations automatically within the computer, in one machine run.
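The kind of coupled fluid-solid calculation summarized in this 1967 entry can be illustrated with a much-simplified one-dimensional marching scheme (my sketch, not the original program): the coolant temperature rises along the channel from an energy balance, and the local wall temperature follows from a convection balance. Geometry, properties and the heat-generation profile are assumed.

```python
# Simplified 1D coolant / heat-generating-wall balance (illustrative only).
import numpy as np

n = 50                       # axial nodes
L = 1.0                      # channel length, m
dx = L / n
mdot, cp = 0.02, 4180.0      # coolant mass flow (kg/s) and heat capacity (J/kg/K)
h, perimeter = 1500.0, 0.05  # convection coefficient (W/m^2/K), wetted perimeter (m)
q_gen = 2.0e4 * (1 + np.sin(np.pi * np.linspace(0, 1, n)))  # W/m, assumed profile

T_cool = np.empty(n)
T_wall = np.empty(n)
T_cool[0] = 300.0            # inlet temperature, K

for i in range(n):
    # Wall temperature from the local convection balance: q = h * P * (Tw - Tc).
    T_wall[i] = T_cool[i] + q_gen[i] / (h * perimeter)
    # Coolant temperature rise over the next cell from an energy balance.
    if i + 1 < n:
        T_cool[i + 1] = T_cool[i] + q_gen[i] * dx / (mdot * cp)

print(f"coolant outlet temperature: {T_cool[-1]:.1f} K")
print(f"peak wall temperature:      {T_wall.max():.1f} K")
```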
The National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) is pleased to announce the opening of the leaderboard to its Proteogenomics Computational DREAM Challenge. The leaderboard remains open for submissions from September 25, 2017, through October 8, 2017, with the Challenge expected to run until November 17, 2017.
Parallel Computational Fluid Dynamics: Current Status and Future Requirements
NASA Technical Reports Server (NTRS)
Simon, Horst D.; VanDalsem, William R.; Dagum, Leonardo; Kutler, Paul (Technical Monitor)
1994-01-01
One of the key objectives of the Applied Research Branch in the Numerical Aerodynamic Simulation (NAS) Systems Division at NASA Ames Research Center is the accelerated introduction of highly parallel machines into a full operational environment. In this report we discuss the performance results obtained from the implementation of some computational fluid dynamics (CFD) applications on the Connection Machine CM-2 and the Intel iPSC/860. We summarize some of the experiences gained so far with the parallel testbed machines at the NAS Applied Research Branch. Then we discuss the long-term computational requirements for accomplishing some of the grand challenge problems in computational aerosciences. We argue that only massively parallel machines will be able to meet these grand challenge requirements, and we outline the computer science and algorithm research challenges ahead.
NASA Astrophysics Data System (ADS)
Andreeva, J.; Dzhunov, I.; Karavakis, E.; Kokoszkiewicz, L.; Nowotka, M.; Saiz, P.; Tuckett, D.
2012-12-01
Improvements in web browser performance and web standards compliance, as well as the availability of comprehensive JavaScript libraries, provide an opportunity to develop functionally rich yet intuitive web applications that allow users to access, render and analyse data in novel ways. However, the development of such large-scale JavaScript web applications presents new challenges, in particular with regard to code sustainability and team-based work. We present an approach that meets the challenges of large-scale JavaScript web application design and development, including client-side model-view-controller architecture, design patterns, and JavaScript libraries. Furthermore, we show how the approach leads naturally to the encapsulation of the data source as a web API, allowing applications to be easily ported to new data sources. The Experiment Dashboard framework is used for the development of applications for monitoring the distributed computing activities of virtual organisations on the Worldwide LHC Computing Grid. We demonstrate the benefits of the approach for large-scale JavaScript web applications in this context by examining the design of several Experiment Dashboard applications for data processing, data transfer and site status monitoring, and by showing how they have been ported for different virtual organisations and technologies.
Efficient development and processing of thermal math models of very large space truss structures
NASA Technical Reports Server (NTRS)
Warren, Andrew H.; Arelt, Joseph E.; Lalicata, Anthony L.
1993-01-01
As the spacecraft moves along the orbit, the truss members are subjected to direct and reflected solar, albedo and planetary infra-red (IR) heating rates, as well as IR heating and shadowing from other spacecraft components. This is a transient process with continuously changing heating loads and shadowing effects. The resulting nonuniform temperature distribution may cause nonuniform thermal expansion, deflection and stress in the truss elements, truss warping and thermal distortions. There are three challenges in the thermal-structural analysis of large truss structures. The first is the development of the thermal and structural math models, the second is model processing, and the third is the data transfer between the models. All three tasks require considerable time and computer resources because of the very large number of components involved. To address these challenges, a series of techniques for automated thermal math modeling and efficient processing of very large space truss structures was developed. In the process the finite element and finite difference methods are interfaced. A very substantial reduction in the quantity of computations was achieved while assuring the desired accuracy of the results. The techniques are illustrated on the thermal analysis of a segment of the Space Station main truss.
ADDRESSING ENVIRONMENTAL ENGINEERING CHALLENGES WITH COMPUTATIONAL FLUID DYNAMICS
This paper discusses the status and application of Computational Fluid Dynamics (CFD) models to address environmental engineering challenges for more detailed understanding of air pollutant source emissions, atmospheric dispersion and resulting human exposure. CFD simulations ...
Beyond moore computing research challenge workshop report.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huey, Mark C.; Aidun, John Bahram
2013-10-01
We summarize the presentations and break out session discussions from the in-house workshop that was held on 11 July 2013 to acquaint a wider group of Sandians with the Beyond Moore Computing research challenge.
An approach for heterogeneous and loosely coupled geospatial data distributed computing
NASA Astrophysics Data System (ADS)
Chen, Bin; Huang, Fengru; Fang, Yu; Huang, Zhou; Lin, Hui
2010-07-01
Most GIS (Geographic Information System) applications tend to have heterogeneous and autonomous geospatial information resources, and the availability of these local resources is unpredictable and dynamic in a distributed computing environment. In order to use these local resources together to solve larger geospatial information processing problems related to an overall situation, in this paper, with the support of peer-to-peer computing technologies, we propose a geospatial data distributed computing mechanism that involves loosely coupled geospatial resource directories and a construct termed the Equivalent Distributed Program of global geospatial queries, to solve geospatial distributed computing problems in heterogeneous GIS environments. First, a geospatial query process schema for distributed computing is presented, together with a method for the equivalent transformation of a global geospatial query into distributed local queries at the SQL (Structured Query Language) level, which solves the coordination problem among heterogeneous resources. Second, peer-to-peer technologies are used to maintain a loosely coupled network environment consisting of autonomous geospatial information resources, to achieve decentralized and consistent synchronization among global geospatial resource directories, and to carry out distributed transaction management of local queries. Finally, based on the developed prototype system, example applications of simple and complex geospatial distributed data queries are presented to illustrate the procedure of global geospatial information processing.
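The idea of an equivalent distributed execution of a global query, running the same local SQL on each autonomous node and merging the partial results, can be sketched with two in-memory SQLite databases standing in for remote geospatial resources; the table layout and the bounding-box query below are invented for illustration and are not the paper's schema.

```python
# Sketch: a global "count features inside a bounding box" query is executed as
# the same local SQL on each autonomous node, and the partial results are merged.
import sqlite3

def make_node(features):
    """One in-memory database standing in for a remote geospatial resource."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE features (name TEXT, x REAL, y REAL)")
    conn.executemany("INSERT INTO features VALUES (?, ?, ?)", features)
    return conn

nodes = [
    make_node([("river_a", 1.0, 2.0), ("road_b", 5.5, 1.0)]),
    make_node([("lake_c", 2.5, 2.5), ("city_d", 9.0, 9.0)]),
]

# Global query: how many features fall inside the bounding box (0,0)-(6,6)?
local_sql = ("SELECT COUNT(*) FROM features "
             "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?")
bbox = (0.0, 6.0, 0.0, 6.0)

# Equivalent distributed execution: run the local query on every node, then merge.
partials = [conn.execute(local_sql, bbox).fetchone()[0] for conn in nodes]
print("partial counts:", partials)          # [2, 1]
print("global count:  ", sum(partials))     # merged result: 3
```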
Computer Programming in Middle School: How Pairs Respond to Challenges
ERIC Educational Resources Information Center
Denner, Jill; Werner, Linda
2007-01-01
Many believe that girls lack the confidence and motivation to persist with computers when they face a challenge. In order to increase the number of girls and women in information technology careers, we need a better understanding of how they think about and solve problems while working on the computer. In this article, we describe a qualitative…
Computational challenges in modeling gene regulatory events
Pataskar, Abhijeet; Tiwari, Vijay K.
2016-01-01
ABSTRACT Cellular transcriptional programs driven by genetic and epigenetic mechanisms could be better understood by integrating “omics” data and subsequently modeling the gene-regulatory events. Toward this end, computational biology should keep pace with evolving experimental procedures and data availability. This article gives an exemplified account of the current computational challenges in molecular biology. PMID:27390891
A K-6 Computational Thinking Curriculum Framework: Implications for Teacher Knowledge
ERIC Educational Resources Information Center
Angeli, Charoula; Voogt, Joke; Fluck, Andrew; Webb, Mary; Cox, Margaret; Malyn-Smith, Joyce; Zagami, Jason
2016-01-01
Adding computer science as a separate school subject to the core K-6 curriculum is a complex issue with educational challenges. The authors herein address two of these challenges: (1) the design of the curriculum based on a generic computational thinking framework, and (2) the knowledge teachers need to teach the curriculum. The first issue is…
Collaborative Science Using Web Services and the SciFlo Grid Dataflow Engine
NASA Astrophysics Data System (ADS)
Wilson, B. D.; Manipon, G.; Xing, Z.; Yunck, T.
2006-12-01
The General Earth Science Investigation Suite (GENESIS) project is a NASA-sponsored partnership between the Jet Propulsion Laboratory, academia, and NASA data centers to develop a new suite of Web Services tools to facilitate multi-sensor investigations in Earth System Science. The goal of GENESIS is to enable large-scale, multi-instrument atmospheric science using combined datasets from the AIRS, MODIS, MISR, and GPS sensors. Investigations include cross-comparison of spaceborne climate sensors, cloud spectral analysis, study of upper troposphere-stratosphere water transport, study of the aerosol indirect cloud effect, and global climate model validation. The challenges are to bring together very large datasets, reformat and understand the individual instrument retrievals, co-register or re-grid the retrieved physical parameters, perform computationally-intensive data fusion and data mining operations, and accumulate complex statistics over months to years of data. To meet these challenges, we have developed a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data access, subsetting, registration, mining, fusion, compression, and advanced statistical analysis. SciFlo leverages remote Web Services, called via Simple Object Access Protocol (SOAP) or REST (one-line) URLs, and the Grid Computing standards (WS-* & Globus Alliance toolkits), and enables scientists to do multi-instrument Earth Science by assembling reusable Web Services and native executables into a distributed computing flow (tree of operators). The SciFlo client & server engines optimize the execution of such distributed data flows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. In particular, SciFlo exploits the wealth of datasets accessible by OpenGIS Consortium (OGC) Web Mapping Servers & Web Coverage Servers (WMS/WCS), and by Open Data Access Protocol (OpenDAP) servers. The scientist injects a distributed computation into the Grid by simply filling out an HTML form or directly authoring the underlying XML dataflow document, and results are returned directly to the scientist's desktop. Once an analysis has been specified for a chunk or day of data, it can be easily repeated with different control parameters or over months of data. Recently, the Earth Science Information Partners (ESIP) Federation sponsored a collaborative activity in which several ESIP members advertised their respective WMS/WCS and SOAP services, developed some collaborative science scenarios for atmospheric and aerosol science, and then choreographed services from multiple groups into demonstration workflows using the SciFlo engine and a Business Process Execution Language (BPEL) workflow engine. For several scenarios, the same collaborative workflow was executed in three ways: using hand-coded scripts, by executing a SciFlo document, and by executing a BPEL workflow document. We will discuss the lessons learned from this activity, the need for standardized interfaces (like WMS/WCS), the difficulty in agreeing on even simple XML formats and interfaces, and further collaborations that are being pursued.
Antonietti, Alberto; Casellato, Claudia; Garrido, Jesús A; Luque, Niceto R; Naveros, Francisco; Ros, Eduardo; D' Angelo, Egidio; Pedrocchi, Alessandra
2016-01-01
In this study, we defined a realistic cerebellar model through the use of artificial spiking neural networks, testing it in computational simulations that reproduce associative motor tasks in multiple sessions of acquisition and extinction. Using evolutionary algorithms, we tuned the cerebellar microcircuit to find the near-optimal plasticity mechanism parameters that best reproduced human-like behavior in eye blink classical conditioning, one of the most extensively studied paradigms related to the cerebellum. We used two models: one with only the cortical plasticity and another including two additional plasticity sites at the nuclear level. First, both spiking cerebellar models were able to reproduce well the real human behaviors, in terms of both "timing" and "amplitude", expressing rapid acquisition, stable late acquisition, rapid extinction, and faster reacquisition of an associative motor task. Even though the model with only the cortical plasticity site showed good learning capabilities, the model with distributed plasticity produced faster and more stable acquisition of conditioned responses in the reacquisition phase. This behavior is explained by the effect of the nuclear plasticities, which have slow dynamics and can express memory consolidation and saving. We showed how the spiking dynamics of multiple interactive neural mechanisms implicitly drive multiple essential components of complex learning processes. This study presents a very advanced computational model, developed together by biomedical engineers, computer scientists, and neuroscientists. Given its realistic features, the proposed model can provide confirmations and suggestions about neurophysiological and pathological hypotheses and can be used in challenging clinical applications.
A Reliable and Real-Time Tracking Method with Color Distribution
Zhao, Zishu; Han, Yuqi; Xu, Tingfa; Li, Xiangmin; Song, Haiping; Luo, Jiqiang
2017-01-01
Occlusion is a challenging problem in visual tracking. Therefore, in recent years, many trackers have been explored to solve this problem, but most of them cannot track the target in real time because of the heavy computational cost. A spatio-temporal context (STC) tracker was proposed to accelerate the task by calculating context information in the Fourier domain, although its performance in handling occlusion remains limited. In this paper, we take advantage of the high efficiency of the STC tracker and employ salient prior model information based on color distribution to improve robustness. Furthermore, we exploit a scale pyramid for accurate scale estimation. In particular, a new high-confidence update strategy and a re-searching mechanism are used to avoid model corruption and to handle occlusion. Extensive experimental results demonstrate that our algorithm outperforms several state-of-the-art algorithms on the OTB2015 dataset. PMID:28994748
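The colour-distribution prior used above can be illustrated in a simplified form (not the authors' code): back-project a target colour histogram onto the frame to obtain a per-pixel confidence map. The histogram binning and the synthetic frame below are assumptions.

```python
# Sketch: per-pixel confidence from a target colour histogram ("back-projection").
import numpy as np

rng = np.random.default_rng(3)
BINS = 8                                     # bins per RGB channel (assumed)

def hist3d(patch):
    """Normalized 3D colour histogram of an (H, W, 3) uint8 patch."""
    idx = (patch // (256 // BINS)).reshape(-1, 3)
    h, _ = np.histogramdd(idx, bins=(BINS, BINS, BINS), range=[(0, BINS)] * 3)
    return h / h.sum()

def confidence_map(frame, target_hist):
    """Probability of each pixel's colour under the target histogram."""
    idx = frame // (256 // BINS)
    return target_hist[idx[..., 0], idx[..., 1], idx[..., 2]]

# Synthetic frame: dark background with a bright reddish target patch.
frame = rng.integers(0, 60, size=(120, 160, 3), dtype=np.uint8)
frame[40:70, 60:100] = [200, 40, 40]
target_hist = hist3d(frame[40:70, 60:100])   # colour model built from the target region

conf = confidence_map(frame, target_hist)
yx = np.unravel_index(conf.argmax(), conf.shape)
print("most target-like pixel:", yx)         # falls inside the reddish patch
```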
NASA Astrophysics Data System (ADS)
Mousavi, Monireh Sadat; Ashrafi, Khosro; Motlagh, Majid Shafie Pour; Niksokhan, Mohhamad Hosein; Vosoughifar, HamidReza
2018-02-01
In this study, a coupled method that combines simulation of the flow pattern based on computational fluid dynamics (CFD) with an optimization technique using genetic algorithms is presented to determine the optimal location and number of sensors in an enclosed residential complex parking garage in Tehran. The main objective of this research is cost reduction and maximum coverage with regard to the distribution of existing concentrations in different scenarios. Considering all the different scenarios for simulating the pollution distribution with CFD has been challenging due to the extent of the parking garage and the number of cars present. To solve this problem, a subset of scenarios was selected at random, and the maximum concentrations of these scenarios were chosen for the optimization. The CFD simulation outputs are used as input to the optimization model based on a genetic algorithm. The results give the optimal number and locations of sensors.
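A hedged sketch of the CFD-plus-genetic-algorithm coupling described above (all numbers and the random concentration field are stand-ins for CFD outputs, not the study's data): a small GA searches for sensor layouts that cover high-concentration cells within an assumed sensing radius while penalizing the number of sensors as a proxy for cost.

```python
# Toy genetic algorithm for sensor placement: maximize coverage of
# high-concentration cells while penalizing the number of sensors used.
import numpy as np

rng = np.random.default_rng(5)
GRID = 20                                   # candidate locations: GRID x GRID cells
conc = rng.random((GRID, GRID)) ** 3        # stand-in for CFD peak concentrations
hot = np.argwhere(conc > 0.5)               # cells that must be monitored
RADIUS, COST = 3.0, 2.0                     # assumed sensor range (cells) and cost weight

def fitness(genome):
    sensors = np.argwhere(genome.reshape(GRID, GRID))
    if len(sensors) == 0:
        return -1e9
    d = np.linalg.norm(hot[:, None, :] - sensors[None, :, :], axis=2)
    covered = (d.min(axis=1) <= RADIUS).sum()
    return covered - COST * len(sensors)

def evolve(pop_size=60, generations=150, p_mut=0.01):
    pop = rng.random((pop_size, GRID * GRID)) < 0.02        # sparse initial layouts
    for _ in range(generations):
        scores = np.array([fitness(g) for g in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]  # truncation selection
        cut = rng.integers(1, GRID * GRID, size=pop_size)   # one-point crossover
        mates = parents[rng.integers(len(parents), size=(pop_size, 2))]
        children = np.where(np.arange(GRID * GRID) < cut[:, None],
                            mates[:, 0], mates[:, 1])
        children ^= rng.random(children.shape) < p_mut      # bit-flip mutation
        pop = children
    best = max(pop, key=fitness)
    return best, fitness(best)

best, score = evolve()
print(f"sensors used: {best.sum()}, fitness: {score:.1f}, hot cells: {len(hot)}")
```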
MNE software for processing MEG and EEG data
Gramfort, A.; Luessi, M.; Larson, E.; Engemann, D.; Strohmeier, D.; Brodbeck, C.; Parkkonen, L.; Hämäläinen, M.
2013-01-01
Magnetoencephalography and electroencephalography (M/EEG) measure the weak electromagnetic signals originating from neural currents in the brain. Using these signals to characterize and locate brain activity is a challenging task, as evidenced by several decades of methodological contributions. MNE, whose name stems from its capability to compute cortically-constrained minimum-norm current estimates from M/EEG data, is a software package that provides comprehensive analysis tools and workflows including preprocessing, source estimation, time–frequency analysis, statistical analysis, and several methods to estimate functional connectivity between distributed brain regions. The present paper gives detailed information about the MNE package and describes typical use cases while also warning about potential caveats in analysis. The MNE package is a collaborative effort of multiple institutes striving to implement and share best methods and to facilitate distribution of analysis pipelines to advance reproducibility of research. Full documentation is available at http://martinos.org/mne. PMID:24161808
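The minimum-norm estimate that gives MNE its name reduces, at its core, to a regularized linear inverse. The sketch below shows only that linear algebra with random stand-in matrices; it is not the MNE package's API or a substitute for its full pipeline: x̂ = Gᵀ(GGᵀ + λ²C)⁻¹y for a gain (lead-field) matrix G, noise covariance C, and sensor measurement y.

```python
# Minimum-norm estimate in a few lines of numpy (illustrative, random matrices;
# this is the underlying linear-algebra idea, not the MNE package API).
import numpy as np

rng = np.random.default_rng(11)
n_sensors, n_sources = 64, 500
G = rng.standard_normal((n_sensors, n_sources))   # stand-in lead-field / gain matrix
C = np.eye(n_sensors)                             # noise covariance (already whitened)
lam2 = 1.0 / 9.0                                  # regularization (1/SNR^2-style choice)

x_true = np.zeros(n_sources)
x_true[rng.choice(n_sources, 3, replace=False)] = 1e-2   # a few active sources
y = G @ x_true + 1e-3 * rng.standard_normal(n_sensors)   # simulated sensor data

# x_hat = G^T (G G^T + lam2 * C)^(-1) y  -- the minimum-norm solution.
x_hat = G.T @ np.linalg.solve(G @ G.T + lam2 * C, y)

top = np.argsort(np.abs(x_hat))[-3:]
print("true active sources:      ", sorted(np.flatnonzero(x_true).tolist()))
print("largest estimated sources:", sorted(top.tolist()))
```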
Investigating effects of communications modulation technique on targeting performance
NASA Astrophysics Data System (ADS)
Blasch, Erik; Eusebio, Gerald; Huling, Edward
2006-05-01
One of the key challenges facing the global war on terrorism (GWOT) and urban operations is the increased need for rapid and diverse information from distributed sources. For users to get adequate information on target types and movements, they would need reliable data. In order to facilitate reliable computational intelligence, we seek to explore the communication modulation tradeoffs affecting information distribution and accumulation. In this analysis, we explore the modulation techniques of Orthogonal Frequency Division Multiplexing (OFDM), Direct Sequence Spread Spectrum (DSSS), and statistical time-division multiple access (TDMA) as a function of the bit error rate and jitter that affect targeting performance. In the analysis, we simulate a Link 16 link with a simple bandpass frequency-shift keying (FSK) technique at different signal-to-noise ratios. The communications transfer delay and accuracy tradeoffs are assessed in terms of their effects on targeting performance.
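The sort of modulation tradeoff study described here, bit error rate versus signal-to-noise ratio, can be sketched with a Monte-Carlo simulation; coherent BPSK over an AWGN channel is used below purely as a stand-in (the Link 16 waveform and the jitter model of the paper are not reproduced), and the result is checked against the theoretical curve 0.5·erfc(√(Eb/N0)).

```python
# Monte-Carlo bit error rate for coherent BPSK over AWGN, versus Eb/N0.
# A stand-in study, not the Link 16 waveform or jitter model from the paper.
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(2)
n_bits = 200_000

for ebn0_db in (0, 2, 4, 6, 8):
    ebn0 = 10 ** (ebn0_db / 10)
    bits = rng.integers(0, 2, n_bits)
    symbols = 2 * bits - 1                          # BPSK mapping: 0 -> -1, 1 -> +1
    noise_sigma = sqrt(1 / (2 * ebn0))              # unit-energy symbols
    received = symbols + noise_sigma * rng.standard_normal(n_bits)
    ber = np.mean((received > 0).astype(int) != bits)
    ber_theory = 0.5 * erfc(sqrt(ebn0))             # Q(sqrt(2 Eb/N0))
    print(f"Eb/N0 = {ebn0_db} dB: simulated BER = {ber:.4f}, theory = {ber_theory:.4f}")
```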
Focused Belief Measures for Uncertainty Quantification in High Performance Semantic Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Joslyn, Cliff A.; Weaver, Jesse R.
In web-scale semantic data analytics there is a great need for methods which aggregate uncertainty claims, on the one hand respecting the information provided as accurately as possible, while on the other still being tractable. Traditional statistical methods are more robust, but only represent distributional, additive uncertainty. Generalized information theory methods, including fuzzy systems and Dempster-Shafer (DS) evidence theory, represent multiple forms of uncertainty, but are computationally and methodologically difficult. We require methods which provide an effective balance between the complete representation of the full complexity of uncertainty claims in their interaction, while satisfying the needs of both computational complexity and human cognition. Here we build on Jøsang's subjective logic to posit methods in focused belief measures (FBMs), where a full DS structure is focused to a single event. The resulting ternary logical structure is posited to be able to capture the minimal amount of generalized complexity needed at a maximum of computational efficiency. We demonstrate the efficacy of this approach in a web ingest experiment over the 2012 Billion Triple dataset from the Semantic Web Challenge.
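The focusing operation described above, collapsing a Dempster-Shafer mass assignment onto a single event so that only belief, disbelief, and residual uncertainty remain (the ternary structure of Jøsang's subjective logic), can be illustrated with a small sketch; the frame of discernment and mass values below are invented.

```python
# Focus a Dempster-Shafer mass assignment on a single event A:
#   belief       = mass committed entirely to subsets of A,
#   plausibility = mass not contradicting A,
#   uncertainty  = plausibility - belief (the residual "don't know" mass).
frame = frozenset({"attack", "benign", "noise"})

# Hypothetical basic mass assignment over subsets of the frame.
mass = {
    frozenset({"attack"}): 0.5,
    frozenset({"attack", "noise"}): 0.2,
    frozenset({"benign"}): 0.1,
    frame: 0.2,                         # completely uncommitted mass
}

def focus(mass, event):
    belief = sum(m for s, m in mass.items() if s <= event)        # s subset of event
    plausibility = sum(m for s, m in mass.items() if s & event)   # s intersects event
    return belief, 1 - plausibility, plausibility - belief        # (b, d, u)

b, d, u = focus(mass, frozenset({"attack"}))
print(f"belief={b:.2f}  disbelief={d:.2f}  uncertainty={u:.2f}  (sum={b + d + u:.2f})")
```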
Stride, E.; Cheema, U.
2017-01-01
The growth of bubbles within the body is widely believed to be the cause of decompression sickness (DCS). Dive computer algorithms that aim to prevent DCS by mathematically modelling bubble dynamics and tissue gas kinetics are challenging to validate. This is due to lack of understanding regarding the mechanism(s) leading from bubble formation to DCS. In this work, a biomimetic in vitro tissue phantom and a three-dimensional computational model, comprising a hyperelastic strain-energy density function to model tissue elasticity, were combined to investigate key areas of bubble dynamics. A sensitivity analysis indicated that the diffusion coefficient was the most influential material parameter. Comparison of computational and experimental data revealed the bubble surface's diffusion coefficient to be 30 times smaller than that in the bulk tissue and dependent on the bubble's surface area. The initial size, size distribution and proximity of bubbles within the tissue phantom were also shown to influence their subsequent dynamics highlighting the importance of modelling bubble nucleation and bubble–bubble interactions in order to develop more accurate dive algorithms. PMID:29263127
Overview of the Aeroelastic Prediction Workshop
NASA Technical Reports Server (NTRS)
Heeg, Jennifer; Chwalowski, Pawel; Florance, Jennifer P.; Wieseman, Carol D.; Schuster, David M.; Perry, Raleigh B.
2013-01-01
The Aeroelastic Prediction Workshop brought together an international community of computational fluid dynamicists as a step in defining the state of the art in computational aeroelasticity. This workshop's technical focus was the prediction of unsteady pressure distributions resulting from forced motion, benchmarking the results first using unforced system data. The most challenging aspects of the physics were identified as capturing oscillatory shock behavior, dynamic shock-induced separated flow, and tunnel wall boundary layer influences. The majority of the participants used unsteady Reynolds-averaged Navier-Stokes codes. These codes were exercised at transonic Mach numbers for three configurations, and comparisons were made with existing experimental data. Substantial variations were observed among the computational solutions, as well as differences relative to the experimental data. Contributing issues to these differences include wall effects and wall modeling, non-standardized convergence criteria, inclusion of static aeroelastic deflection, methodology for oscillatory solutions, and post-processing methods. Contributing issues pertaining principally to the experimental data sets include the position of the model relative to the tunnel wall, splitter plate size, wind tunnel expansion slot configuration, spacing and location of pressure instrumentation, and data processing methods.
NASA Technical Reports Server (NTRS)
Schilling, D. L.; Oh, S. J.; Thau, F.
1975-01-01
Developments in communications systems, computer systems, and power distribution systems for the space shuttle are described. The use of high speed delta modulation for bit rate compression in the transmission of television signals is discussed. Simultaneous Multiprocessor Organization, an approach to computer organization, is presented. Methods of computer simulation and automatic malfunction detection for the shuttle power distribution system are also described.
Distributed intrusion detection system based on grid security model
NASA Astrophysics Data System (ADS)
Su, Jie; Liu, Yahui
2008-03-01
Grid computing has developed rapidly with the development of network technology, and it can solve the problem of large-scale complex computing by sharing large-scale computing resources. In a grid environment, we can realize a distributed and load-balanced intrusion detection system. This paper first discusses the security mechanism in grid computing and the function of PKI/CA in the grid security system, then describes how grid computing characteristics can be applied in a distributed intrusion detection system (IDS) based on an Artificial Immune System. Finally, it presents a distributed intrusion detection system based on the grid security model that can reduce the processing delay and maintain detection rates.
Ising Processing Units: Potential and Challenges for Discrete Optimization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Coffrin, Carleton James; Nagarajan, Harsha; Bent, Russell Whitford
The recent emergence of novel computational devices, such as adiabatic quantum computers, CMOS annealers, and optical parametric oscillators, presents new opportunities for hybrid-optimization algorithms that leverage these kinds of specialized hardware. In this work, we propose the idea of an Ising processing unit as a computational abstraction for these emerging tools. Challenges involved in using and benchmarking these devices are presented, and open-source software tools are proposed to address some of these challenges. The proposed benchmarking tools and methodology are demonstrated by conducting a baseline study of established solution methods applied to a D-Wave 2X adiabatic quantum computer, one example of a commercially available Ising processing unit.
A distributed programming environment for Ada
NASA Technical Reports Server (NTRS)
Brennan, Peter; Mcdonnell, Tom; Mcfarland, Gregory; Timmins, Lawrence J.; Litke, John D.
1986-01-01
Despite considerable commercial exploitation of fault tolerance systems, significant and difficult research problems remain in such areas as fault detection and correction. A research project is described which constructs a distributed computing test bed for loosely coupled computers. The project is constructing a tool kit to support research into distributed control algorithms, including a distributed Ada compiler, distributed debugger, test harnesses, and environment monitors. The Ada compiler is being written in Ada and will implement distributed computing at the subsystem level. The design goal is to provide a variety of control mechanisms for distributed programming while retaining total transparency at the code level.
The Challenges of Distributing Leadership in Irish Post-Primary Schools
ERIC Educational Resources Information Center
O'Donovan, Margaret
2015-01-01
This study explores the challenges and opportunities in relation to developing distributed leadership practice in Irish post-primary schools. It considers school leadership within the context of contemporary distributed leadership theory. Associated concepts such as distributed cognition and activity theory are used to frame the study. The study…
PLIF Temperature and Velocity Distributions in Laminar Hypersonic Flat-plate Flow
NASA Technical Reports Server (NTRS)
OByrne, S.; Danehy, P. M.; Houwing, A. F. P.
2003-01-01
Rotational temperature and velocity distributions have been measured across a hypersonic laminar flat-plate boundary layer using planar laser-induced fluorescence. The measurements are compared to a finite-volume computation and a first-order boundary layer computation assuming local similarity. Both computations produced similar temperature distributions and nearly identical velocity distributions. The small remaining disagreement between the calculations is ascribed to the similarity solution not accounting for leading-edge displacement effects. The velocity measurements agreed with both calculated distributions to within the measurement uncertainty of 2%. The peak measured temperature was 200 K lower than the computed values. This discrepancy is tentatively ascribed to vibrational relaxation in the boundary layer.
Distributed metadata in a high performance computing environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bent, John M.; Faibish, Sorin; Zhang, Zhenhua
A computer-executable method, system, and computer program product for managing metadata in a distributed storage system, wherein the distributed storage system includes one or more burst buffers enabled to operate with a distributed key-value store, the computer-executable method, system, and computer program product comprising receiving a request for metadata associated with a block of data stored in a first burst buffer of the one or more burst buffers in the distributed storage system, wherein the metadata is associated with a key-value, determining which of the one or more burst buffers stores the requested metadata, and upon determination that a first burst buffer of the one or more burst buffers stores the requested metadata, locating the key-value in a portion of the distributed key-value store accessible from the first burst buffer.
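As an illustration of the routing decision described in the claim, the sketch below hashes a data-block key to decide which burst buffer's key-value partition should be asked for the metadata. It is a generic consistent-hashing-style placement sketch, not the patented implementation, and every name in it is hypothetical.

    import hashlib

    class MetadataRouter:
        """Toy routing of metadata requests across burst-buffer key-value partitions."""

        def __init__(self, burst_buffers):
            self.burst_buffers = list(burst_buffers)             # e.g. ["bb0", "bb1", "bb2"]
            self.stores = {bb: {} for bb in self.burst_buffers}  # per-buffer key-value partition

        def _owner(self, block_key):
            digest = hashlib.sha1(block_key.encode()).hexdigest()
            return self.burst_buffers[int(digest, 16) % len(self.burst_buffers)]

        def put(self, block_key, metadata):
            self.stores[self._owner(block_key)][block_key] = metadata

        def get(self, block_key):
            # Determine which burst buffer stores the requested metadata, then look it up there.
            return self.stores[self._owner(block_key)].get(block_key)

    router = MetadataRouter(["bb0", "bb1", "bb2"])
    router.put("block-42", {"size": 4096, "owner": "job-7"})
    print(router.get("block-42"))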
Quantum cryptography: a view from classical cryptography
NASA Astrophysics Data System (ADS)
Buchmann, Johannes; Braun, Johannes; Demirel, Denise; Geihs, Matthias
2017-06-01
Much digital data requires long-term protection of confidentiality, for example, medical health records. Cryptography provides such protection. However, currently used cryptographic techniques such as Diffie-Hellman key exchange may not provide long-term security. Such techniques rely on certain computational assumptions, such as the hardness of the discrete logarithm problem, that may turn out to be incorrect. On the other hand, quantum cryptography, in particular quantum random number generation and quantum key distribution, offers information-theoretic protection. In this paper, we explore the challenge of providing long-term confidentiality and we argue that a combination of quantum cryptography and classical cryptography can provide such protection.
NASA Astrophysics Data System (ADS)
Graham, Matthew; Gray, N.; Burke, D.
2010-01-01
Many activities in the era of data-intensive astronomy are predicated upon some transference of domain knowledge and expertise from human to machine. The semantic infrastructure required to support this is no longer a pipe dream of computer science but a set of practical engineering challenges, more concerned with deployment and performance details than AI abstractions. The application of such ideas promises to help in such areas as contextual data access, exploiting distributed annotation and heterogeneous sources, and intelligent data dissemination and discovery. In this talk, we will review the status and use of semantic technologies in astronomy, particularly to address current problems in astroinformatics, with such projects as SKUA and AstroCollation.
3D Microstructures for Materials and Damage Models
Livescu, Veronica; Bronkhorst, Curt Allan; Vander Wiel, Scott Alan
2017-02-01
Many challenges exist with regard to understanding and representing complex physical processes involved with ductile damage and failure in polycrystalline metallic materials. Currently, the ability to accurately predict the macroscale ductile damage and failure response of metallic materials is lacking. Research at Los Alamos National Laboratory (LANL) is aimed at building a coupled experimental and computational methodology that supports the development of predictive damage capabilities by: capturing real distributions of microstructural features from real material and implementing them as digitally generated microstructures in damage model development; and distilling structure-property information to link microstructural details to damage evolution under a multitude of loading states.
Grid-wide neuroimaging data federation in the context of the NeuroLOG project
Michel, Franck; Gaignard, Alban; Ahmad, Farooq; Barillot, Christian; Batrancourt, Bénédicte; Dojat, Michel; Gibaud, Bernard; Girard, Pascal; Godard, David; Kassel, Gilles; Lingrand, Diane; Malandain, Grégoire; Montagnat, Johan; Pélégrini-Issac, Mélanie; Pennec, Xavier; Rojas Balderrama, Javier; Wali, Bacem
2010-01-01
Grid technologies are appealing to deal with the challenges raised by computational neurosciences and to support multi-centric brain studies. However, core grid middleware hardly copes with the complex neuroimaging data representation and multi-layer data federation needs. Moreover, legacy neuroscience environments need to be preserved and cannot simply be superseded by grid services. This paper describes the NeuroLOG platform design and implementation, shedding light on its Data Management Layer. It addresses the integration of brain image files, associated relational metadata and neuroscience semantic data in a heterogeneous distributed environment, integrating legacy data managers through a mediation layer. PMID:20543431
Efficient and optimized identification of generalized Maxwell viscoelastic relaxation spectra.
Babaei, Behzad; Davarian, Ali; Pryse, Kenneth M; Elson, Elliot L; Genin, Guy M
2015-03-01
Viscoelastic relaxation spectra are essential for predicting and interpreting the mechanical responses of materials and structures. For biological tissues, these spectra must usually be estimated from viscoelastic relaxation tests. Interpreting viscoelastic relaxation tests is challenging because the inverse problem is expensive computationally. We present here an efficient algorithm that enables rapid identification of viscoelastic relaxation spectra. The algorithm was tested against trial data to characterize its robustness and identify its limitations and strengths. The algorithm was then applied to identify the viscoelastic response of reconstituted collagen, revealing an extensive distribution of viscoelastic time constants. Copyright © 2015 Elsevier Ltd. All rights reserved.
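The generalized Maxwell model referred to in the title represents the relaxation modulus as a Prony series, G(t) = G_inf + sum_i G_i exp(-t/tau_i). The sketch below fits such a series to relaxation data by non-negative least squares on a fixed grid of candidate time constants; it is a baseline illustration of the inverse problem on synthetic data, not the efficient algorithm the authors propose.

    import numpy as np
    from scipy.optimize import nnls

    def fit_prony(t, g, taus):
        """Fit G(t) = G_inf + sum_i G_i * exp(-t / tau_i) with non-negative weights.

        `taus` is a user-chosen grid of candidate relaxation time constants
        (a simplifying assumption; the paper identifies the spectrum more efficiently).
        """
        A = np.column_stack([np.ones_like(t)] + [np.exp(-t / tau) for tau in taus])
        coeffs, residual = nnls(A, g)        # non-negative least squares keeps moduli physical
        return coeffs[0], coeffs[1:], residual

    # Synthetic relaxation test data, for demonstration only.
    t = np.linspace(0.01, 100.0, 400)
    g_true = 0.5 + 1.0 * np.exp(-t / 1.0) + 0.7 * np.exp(-t / 20.0)
    g_noisy = g_true + 0.01 * np.random.default_rng(0).standard_normal(t.size)

    taus = np.logspace(-1, 2, 10)            # candidate time constants, 0.1 s to 100 s
    g_inf, g_i, res = fit_prony(t, g_noisy, taus)
    print("G_inf ~", round(g_inf, 3), " spectrum weights:", np.round(g_i, 3))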
Gomez-Ramirez, Jaime; Sanz, Ricardo
2013-09-01
One of the most important scientific challenges today is the quantitative and predictive understanding of biological function. Classical mathematical and computational approaches have been enormously successful in modeling inert matter, but they may be inadequate to address inherent features of biological systems. We address the conceptual and methodological obstacles that lie in the inverse problem in biological systems modeling. We introduce a full Bayesian approach (FBA), a theoretical framework to study biological function, in which probability distributions are conditional on biophysical information that physically resides in the biological system that is studied by the scientist. Copyright © 2013 Elsevier Ltd. All rights reserved.
Exploratory visualization of astronomical data on ultra-high-resolution wall displays
NASA Astrophysics Data System (ADS)
Pietriga, Emmanuel; del Campo, Fernando; Ibsen, Amanda; Primet, Romain; Appert, Caroline; Chapuis, Olivier; Hempel, Maren; Muñoz, Roberto; Eyheramendy, Susana; Jordan, Andres; Dole, Hervé
2016-07-01
Ultra-high-resolution wall displays feature a very high pixel density over a large physical surface, which makes them well-suited to the collaborative, exploratory visualization of large datasets. We introduce FITS-OW, an application designed for such wall displays, that enables astronomers to navigate in large collections of FITS images, query astronomical databases, and display detailed, complementary data and documents about multiple sources simultaneously. We describe how astronomers interact with their data using both the wall's touch-sensitive surface and handheld devices. We also report on the technical challenges we addressed in terms of distributed graphics rendering and data sharing over the computer clusters that drive wall displays.
The computational challenges of Earth-system science.
O'Neill, Alan; Steenman-Clark, Lois
2002-06-15
The Earth system--comprising atmosphere, ocean, land, cryosphere and biosphere--is an immensely complex system, involving processes and interactions on a wide range of space- and time-scales. To understand and predict the evolution of the Earth system is one of the greatest challenges of modern science, with success likely to bring enormous societal benefits. High-performance computing, along with the wealth of new observational data, is revolutionizing our ability to simulate the Earth system with computer models that link the different components of the system together. There are, however, considerable scientific and technical challenges to be overcome. This paper will consider four of them: complexity, spatial resolution, inherent uncertainty and time-scales. Meeting these challenges requires a significant increase in the power of high-performance computers. The benefits of being able to make reliable predictions about the evolution of the Earth system should, on their own, amply repay this investment.
2006-04-01
and Scalability, (2) Sensors and Platforms, (3) Distributed Computing and Processing, (4) Information Management, (5) Fusion and Resource Management...use of the deployed system. 3.3 Distributed Computing and Processing Session The Distributed Computing and Processing Session consisted of three
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andrus, Jason P.; Pope, Chad; Toston, Mary
2016-12-01
Nonreactor nuclear facilities operating under the approval authority of the U.S. Department of Energy use unmitigated hazard evaluations to determine if potential radiological doses associated with design basis events challenge or exceed dose evaluation guidelines. Unmitigated design basis events that sufficiently challenge dose evaluation guidelines or exceed the guidelines for members of the public or workers, merit selection of safety structures, systems, or components or other controls to prevent or mitigate the hazard. Idaho State University, in collaboration with Idaho National Laboratory, has developed a portable and simple to use software application called SODA (Stochastic Objective Decision-Aide) that stochastically calculates the radiation dose distribution associated with hypothetical radiological material release scenarios. Rather than producing a point estimate of the dose, SODA produces a dose distribution result to allow a deeper understanding of the dose potential. SODA allows users to select the distribution type and parameter values for all of the input variables used to perform the dose calculation. Users can also specify custom distributions through a user defined distribution option. SODA then randomly samples each distribution input variable and calculates the overall resulting dose distribution. In cases where an input variable distribution is unknown, a traditional single point value can be used. SODA, developed using the MATLAB coding framework, has a graphical user interface and can be installed on both Windows and Mac computers. SODA is a standalone software application and does not require MATLAB to function. SODA provides improved risk understanding leading to better informed decision making associated with establishing nuclear facility material-at-risk limits and safety structure, system, or component selection. It is important to note that SODA does not replace or compete with codes such as MACCS or RSAC; rather it is viewed as an easy to use supplemental tool to help improve risk understanding and support better informed decisions. The SODA development project was funded through a grant from the DOE Nuclear Safety Research and Development Program.
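The core idea SODA implements, propagating input-parameter distributions through a dose calculation by random sampling, can be sketched in a few lines. The example below samples hypothetical distributions for a generic multiplicative dose model (material at risk × airborne release fraction × respirable fraction × dispersion factor × breathing rate × dose conversion factor) and reports percentiles of the resulting dose distribution. The model form, parameter names, values, and distributions are illustrative assumptions only, not SODA's actual inputs or code.

    import numpy as np

    rng = np.random.default_rng(42)
    n = 100_000                                     # number of Monte Carlo samples

    # Hypothetical input distributions (illustrative only).
    mar   = rng.triangular(50.0, 100.0, 150.0, n)   # material at risk [g]
    arf   = rng.uniform(1e-3, 1e-2, n)              # airborne release fraction
    rf    = rng.uniform(0.1, 1.0, n)                # respirable fraction
    chi_q = rng.lognormal(mean=np.log(1e-4), sigma=0.5, size=n)  # dispersion factor [s/m^3]
    br    = np.full(n, 3.3e-4)                      # breathing rate [m^3/s], point value
    dcf   = np.full(n, 1.0e8)                       # dose conversion factor [rem/g inhaled], point value

    dose = mar * arf * rf * chi_q * br * dcf        # rem, per the assumed multiplicative model

    for p in (5, 50, 95):
        print(f"{p:2d}th percentile dose: {np.percentile(dose, p):.2e} rem")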
Cortical Neural Computation by Discrete Results Hypothesis
Castejon, Carlos; Nuñez, Angel
2016-01-01
One of the most challenging problems we face in neuroscience is to understand how the cortex performs computations. There is increasing evidence that the power of the cortical processing is produced by populations of neurons forming dynamic neuronal ensembles. Theoretical proposals and multineuronal experimental studies have revealed that ensembles of neurons can form emergent functional units. However, how these ensembles are implicated in cortical computations is still a mystery. Although cell ensembles have been associated with brain rhythms, the functional interaction remains largely unclear. It is still unknown how spatially distributed neuronal activity can be temporally integrated to contribute to cortical computations. A theoretical explanation integrating spatial and temporal aspects of cortical processing is still lacking. In this Hypothesis and Theory article, we propose a new functional theoretical framework to explain the computational roles of these ensembles in cortical processing. We suggest that complex neural computations underlying cortical processing could be temporally discrete and that sensory information would need to be quantized to be computed by the cerebral cortex. Accordingly, we propose that cortical processing is produced by the computation of discrete spatio-temporal functional units that we have called “Discrete Results” (Discrete Results Hypothesis). This hypothesis represents a novel functional mechanism by which information processing is computed in the cortex. Furthermore, we propose that precise dynamic sequences of “Discrete Results” is the mechanism used by the cortex to extract, code, memorize and transmit neural information. The novel “Discrete Results” concept has the ability to match the spatial and temporal aspects of cortical processing. We discuss the possible neural underpinnings of these functional computational units and describe the empirical evidence supporting our hypothesis. We propose that fast-spiking (FS) interneuron may be a key element in our hypothesis providing the basis for this computation. PMID:27807408
Cortical Neural Computation by Discrete Results Hypothesis.
Castejon, Carlos; Nuñez, Angel
2016-01-01
One of the most challenging problems we face in neuroscience is to understand how the cortex performs computations. There is increasing evidence that the power of the cortical processing is produced by populations of neurons forming dynamic neuronal ensembles. Theoretical proposals and multineuronal experimental studies have revealed that ensembles of neurons can form emergent functional units. However, how these ensembles are implicated in cortical computations is still a mystery. Although cell ensembles have been associated with brain rhythms, the functional interaction remains largely unclear. It is still unknown how spatially distributed neuronal activity can be temporally integrated to contribute to cortical computations. A theoretical explanation integrating spatial and temporal aspects of cortical processing is still lacking. In this Hypothesis and Theory article, we propose a new functional theoretical framework to explain the computational roles of these ensembles in cortical processing. We suggest that complex neural computations underlying cortical processing could be temporally discrete and that sensory information would need to be quantized to be computed by the cerebral cortex. Accordingly, we propose that cortical processing is produced by the computation of discrete spatio-temporal functional units that we have called "Discrete Results" (Discrete Results Hypothesis). This hypothesis represents a novel functional mechanism by which information processing is computed in the cortex. Furthermore, we propose that precise dynamic sequences of "Discrete Results" is the mechanism used by the cortex to extract, code, memorize and transmit neural information. The novel "Discrete Results" concept has the ability to match the spatial and temporal aspects of cortical processing. We discuss the possible neural underpinnings of these functional computational units and describe the empirical evidence supporting our hypothesis. We propose that fast-spiking (FS) interneuron may be a key element in our hypothesis providing the basis for this computation.
Pandey, Parul; Lee, Eun Kyung; Pompili, Dario
2016-11-01
Stress is one of the key factors that impact the quality of our daily life, from productivity and efficiency in production processes to the ability of (civilian and military) individuals to make rational decisions. Stress can also propagate from one individual to others working in close proximity or toward a common goal, e.g., in a military operation or workforce. Real-time assessment of the stress of individuals alone is, however, not sufficient, as understanding its source and the direction in which it propagates in a group of people is equally, if not more, important. A continuous, near real-time, in situ personal stress monitoring system is envisioned to quantify the level of stress of individuals and its direction of propagation in a team. However, stress monitoring of an individual via his/her mobile device may not always be possible for extended periods of time due to the limited battery capacity of these devices. To overcome this challenge, a novel distributed mobile computing framework is proposed to organize the resources in the vicinity and form a mobile device cloud that enables offloading of computation tasks in the stress detection algorithm from resource-constrained devices (low residual battery, limited CPU cycles) to resource-rich devices. Our framework also supports computing parallelization and workflows, defining how data and tasks are divided and assigned among the entities of the framework. The direction of propagation and the magnitude of influence of stress in a group of individuals are studied by applying real-time, in situ analysis of Granger causality. Tangible benefits (in terms of energy expenditure and execution time) of the proposed framework in comparison to a centralized framework are presented via thorough simulations and real experiments.
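Granger causality between two individuals' stress time series can be tested with standard tools: series X is said to Granger-cause series Y if past values of X improve the prediction of Y beyond Y's own past. The sketch below runs the test on synthetic data with the statsmodels library; the data, the lag choice, and the use of statsmodels are illustrative assumptions, and the paper's real-time, in situ, distributed implementation is not shown.

    import numpy as np
    from statsmodels.tsa.stattools import grangercausalitytests

    rng = np.random.default_rng(0)
    n = 500

    # Synthetic stress signals: y lags x by one step, plus noise.
    x = rng.standard_normal(n)
    y = np.empty(n)
    y[0] = rng.standard_normal()
    for t in range(1, n):
        y[t] = 0.6 * x[t - 1] + 0.2 * y[t - 1] + 0.3 * rng.standard_normal()

    # grangercausalitytests expects a 2-column array and tests whether
    # the SECOND column Granger-causes the FIRST.
    data = np.column_stack([y, x])
    results = grangercausalitytests(data, maxlag=2, verbose=False)
    p_value = results[1][0]["ssr_ftest"][1]      # p-value of the F-test at lag 1
    print(f"p-value (x -> y, lag 1): {p_value:.3g}")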
NASA Astrophysics Data System (ADS)
Tysowski, Piotr K.; Ling, Xinhua; Lütkenhaus, Norbert; Mosca, Michele
2018-04-01
Quantum key distribution (QKD) is a means of generating keys between a pair of computing hosts that is theoretically secure against cryptanalysis, even by a quantum computer. Although there is much active research into improving the QKD technology itself, there is still significant work to be done to apply engineering methodology and determine how it can be practically built to scale within an enterprise IT environment. Significant challenges exist in building a practical key management service (KMS) for use in a metropolitan network. QKD is generally a point-to-point technique only and is subject to steep performance constraints. The integration of QKD into enterprise-level computing has been researched, to enable quantum-safe communication. A novel method for constructing a KMS is presented that allows arbitrary computing hosts on one site to establish multiple secure communication sessions with the hosts of another site. A key exchange protocol is proposed where symmetric private keys are granted to hosts while satisfying the scalability needs of an enterprise population of users. The KMS operates within a layered architectural style that is able to interoperate with various underlying QKD implementations. Variable levels of security for the host population are enforced through a policy engine. A network layer provides key generation across a network of nodes connected by quantum links. Scheduling and routing functionality allows quantum key material to be relayed across trusted nodes. Optimizations are performed to match the real-time host demand for key material with the capacity afforded by the infrastructure. The result is a flexible and scalable architecture that is suitable for enterprise use and independent of any specific QKD technology.
3D printing of novel osteochondral scaffolds with graded microstructure
NASA Astrophysics Data System (ADS)
Nowicki, Margaret A.; Castro, Nathan J.; Plesniak, Michael W.; Zhang, Lijie Grace
2016-10-01
Osteochondral tissue has a complex graded structure where biological, physiological, and mechanical properties vary significantly over the full thickness spanning from the subchondral bone region beneath the joint surface to the hyaline cartilage region at the joint surface. This presents a significant challenge for tissue-engineered structures addressing osteochondral defects. Fused deposition modeling (FDM) 3D bioprinters present a unique solution to this problem. The objective of this study is to use FDM-based 3D bioprinting and nanocrystalline hydroxyapatite for improved bone marrow human mesenchymal stem cell (hMSC) adhesion, growth, and osteochondral differentiation. FDM printing parameters can be tuned through computer aided design and computer numerical control software to manipulate scaffold geometries in ways that are beneficial to mechanical performance without hindering cellular behavior. Additionally, the ability to fine-tune 3D printed scaffolds increases further through our investment casting procedure which facilitates the inclusion of nanoparticles with biochemical factors to further elicit desired hMSC differentiation. For this study, FDM was used to print investment-casting molds innovatively designed with varied pore distribution over the full thickness of the scaffold. The mechanical and biological impacts of the varied pore distributions were compared and evaluated to determine the benefits of this physical manipulation. The results indicate that both mechanical properties and cell performance improve in the graded pore structures when compared to homogeneously distributed porous and non-porous structures. Differentiation results indicated successful osteogenic and chondrogenic manipulation in engineered scaffolds.
NASA Astrophysics Data System (ADS)
Liu, Bin
2014-07-01
We describe an algorithm that can adaptively provide mixture summaries of multimodal posterior distributions. The parameter space of the involved posteriors ranges in size from a few dimensions to dozens of dimensions. This work was motivated by an astrophysical problem called extrasolar planet (exoplanet) detection, wherein the computation of stochastic integrals that are required for Bayesian model comparison is challenging. The difficulty comes from the highly nonlinear models that lead to multimodal posterior distributions. We resort to importance sampling (IS) to estimate the integrals, and thus translate the problem to be how to find a parametric approximation of the posterior. To capture the multimodal structure in the posterior, we initialize a mixture proposal distribution and then tailor its parameters elaborately to make it resemble the posterior to the greatest extent possible. We use the effective sample size (ESS) calculated based on the IS draws to measure the degree of approximation. The bigger the ESS is, the better the proposal resembles the posterior. A difficulty within this tailoring operation lies in the adjustment of the number of mixing components in the mixture proposal. Brute force methods just preset it as a large constant, which leads to an increase in the required computational resources. We provide an iterative delete/merge/add process, which works in tandem with an expectation-maximization step to tailor such a number online. The efficiency of our proposed method is tested via both simulation studies and real exoplanet data analysis.
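The effective sample size used to score how well the mixture proposal resembles the posterior is computed from the normalized importance weights as ESS = 1 / sum(w_i^2), equivalently (sum w)^2 / sum(w^2) for unnormalized weights. A minimal sketch of that diagnostic is given below, with a toy Gaussian target and proposal standing in for the multimodal posterior and mixture proposal; it does not implement the paper's delete/merge/add adaptation.

    import numpy as np
    from scipy import stats

    def effective_sample_size(log_w):
        """ESS of importance-sampling draws from unnormalized log-weights."""
        w = np.exp(log_w - log_w.max())          # stabilize before normalizing
        w /= w.sum()
        return 1.0 / np.sum(w ** 2)

    rng = np.random.default_rng(1)
    n = 5_000

    # Toy setup: target N(0, 1), proposal N(0.5, 1.5^2), stand-ins for posterior and proposal.
    draws = rng.normal(0.5, 1.5, n)
    log_w = stats.norm.logpdf(draws, 0.0, 1.0) - stats.norm.logpdf(draws, 0.5, 1.5)

    print(f"ESS = {effective_sample_size(log_w):.0f} out of {n} draws")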
Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.
Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias
2011-01-01
The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
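The data structure at the heart of the design, a sorted k-mer list per sequence, can be illustrated compactly. The sketch below builds sorted k-mer lists for two toy sequences and intersects them to find shared seeds, the kind of operation the distributed implementation partitions across compute nodes; it is a serial toy on made-up strings, not the BG/P code.

    def sorted_kmers(seq, k):
        """Return the sorted list of (k-mer, position) pairs for one sequence."""
        return sorted((seq[i:i + k], i) for i in range(len(seq) - k + 1))

    def shared_kmers(seq_a, seq_b, k):
        """k-mers present in both sequences (candidate anchor seeds for alignment)."""
        kmers_a = {kmer for kmer, _ in sorted_kmers(seq_a, k)}
        kmers_b = {kmer for kmer, _ in sorted_kmers(seq_b, k)}
        return sorted(kmers_a & kmers_b)

    genome_a = "ACGTACGGACGTTACG"
    genome_b = "TTACGTACGGAC"
    print(shared_kmers(genome_a, genome_b, k=5))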
Distributed Prognostic Health Management with Gaussian Process Regression
NASA Technical Reports Server (NTRS)
Saha, Sankalita; Saha, Bhaskar; Saxena, Abhinav; Goebel, Kai Frank
2010-01-01
Distributed prognostics architecture design is an enabling step for efficient implementation of health management systems. A major challenge encountered in such design is the formulation of optimal distributed prognostics algorithms. In this paper, we present a distributed GPR-based prognostics algorithm whose target platform is a wireless sensor network. In addition to the challenges encountered in a distributed implementation, a wireless network poses constraints on communication patterns, thereby making the problem more challenging. The prognostics application used to demonstrate our new algorithms is battery prognostics. To illustrate trade-offs among different prognostic approaches, we present a comparison with a distributed implementation of particle-filter-based prognostics on the same battery data.
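At its core, GPR-based prognostics regresses a degradation indicator (for example, battery capacity) on time or cycle count and extrapolates the predictive distribution until it crosses an end-of-life threshold. The sketch below does this with scikit-learn on synthetic capacity data; the kernel, threshold, and data are illustrative assumptions, and the distributed, sensor-network aspects of the paper's algorithm are not represented.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

    rng = np.random.default_rng(3)

    # Synthetic capacity-fade data: capacity [Ah] vs. cycle number.
    cycles = np.arange(0, 200, 5, dtype=float).reshape(-1, 1)
    capacity = 2.0 - 0.004 * cycles.ravel() + 0.01 * rng.standard_normal(cycles.shape[0])

    kernel = ConstantKernel(1.0) * RBF(length_scale=50.0) + WhiteKernel(noise_level=1e-4)
    gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(cycles, capacity)

    # Extrapolate and find the first cycle where the mean prediction crosses end of life.
    future = np.arange(0, 400, 5, dtype=float).reshape(-1, 1)
    mean, std = gpr.predict(future, return_std=True)
    eol_threshold = 1.4                               # assumed end-of-life capacity [Ah]
    below = np.nonzero(mean < eol_threshold)[0]
    if below.size:
        print(f"Predicted end of life near cycle {int(future[below[0], 0])} "
              f"(+/- {2 * std[below[0]]:.3f} Ah at that point)")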
NASA Astrophysics Data System (ADS)
Ojaghi, Mobin; Martínez, Ignacio Lamata; Dietz, Matt S.; Williams, Martin S.; Blakeborough, Anthony; Crewe, Adam J.; Taylor, Colin A.; Madabhushi, S. P. Gopal; Haigh, Stuart K.
2018-01-01
Distributed Hybrid Testing (DHT) is an experimental technique designed to capitalise on advances in modern networking infrastructure to overcome traditional laboratory capacity limitations. By coupling the heterogeneous test apparatus and computational resources of geographically distributed laboratories, DHT provides the means to take on complex, multi-disciplinary challenges with new forms of communication and collaboration. To introduce the opportunity and practicability afforded by DHT, here an exemplar multi-site test is addressed in which a dedicated fibre network and suite of custom software is used to connect the geotechnical centrifuge at the University of Cambridge with a variety of structural dynamics loading apparatus at the University of Oxford and the University of Bristol. While centrifuge time-scaling prevents real-time rates of loading in this test, such experiments may be used to gain valuable insights into physical phenomena, test procedure and accuracy. These and other related experiments have led to the development of the real-time DHT technique and the creation of a flexible framework that aims to facilitate future distributed tests within the UK and beyond. As a further example, a real-time DHT experiment between structural labs using this framework for testing across the Internet is also presented.
NASA Astrophysics Data System (ADS)
Kardani, Arash; Mehrafrooz, Behzad; Montazeri, Abbas
2018-03-01
Porous nickel-based nanocatalysts have attracted great attention thanks to their high surface-to-volume ratio and desirable mechanical properties. One of the major challenges associated with their applications is the degradation of their shear properties caused by exposure to high fluid flow rates at elevated service temperatures. On the other hand, their shear behavior is dominantly influenced by the size and distribution of the pores present in their structure. In this study, different nickel samples containing periodically distributed surface pores of 2 nm diameter are examined via molecular dynamics simulation. Moreover, to explore the effects of pore distribution, the obtained results are compared with those of samples having concentrated pores at a larger size of 10 nm. Accordingly, shear loading conditions are imposed to capture the dependence of the shear characteristics of the samples on the location and geometrical factors of the aforementioned pores. Surprisingly, it is revealed that the existence of randomly distributed pores can lead to an enhancement of the yield strain compared to that of non-porous counterparts. The underlying mechanism governing this special behavior is thoroughly studied through several case studies.
Computer-assisted diagnostic decision support: history, challenges, and possible paths forward.
Miller, Randolph A
2009-09-01
This paper presents a brief history of computer-assisted diagnosis, including challenges and future directions. Some ideas presented in this article on computer-assisted diagnostic decision support systems (CDDSS) derive from prior work by the author and his colleagues (see list in Acknowledgments) on the INTERNIST-1 and QMR projects. References indicate the original sources of many of these ideas.
NASA Technical Reports Server (NTRS)
Talbot, Bryan; Zhou, Shu-Jia; Higgins, Glenn
2002-01-01
One of the most significant challenges in large-scale climate modeling, as well as in high-performance computing in other scientific fields, is that of effectively integrating many software models from multiple contributors. A software framework facilitates the integration task, both in the development and runtime stages of the simulation. Effective software frameworks reduce the programming burden for the investigators, freeing them to focus more on the science and less on the parallel communication implementation, while maintaining high performance across numerous supercomputer and workstation architectures. This document proposes a strawman framework design for the climate community based on the integration of Cactus, from the relativistic physics community, and the UCLA/UCB Distributed Data Broker (DDB) from the climate community. This design is the result of an extensive survey of climate models and frameworks in the climate community as well as frameworks from many other scientific communities. The design addresses fundamental development and runtime needs using Cactus, a framework with interfaces for FORTRAN and C-based languages, and high-performance model communication needs using DDB. This document also specifically explores object-oriented design issues in the context of climate modeling as well as climate modeling issues in terms of object-oriented design.
Assurance Technology Challenges of Advanced Space Systems
NASA Technical Reports Server (NTRS)
Chern, E. James
2004-01-01
The initiative to explore space and extend a human presence across our solar system to revisit the moon and Mars poses enormous technological challenges to the nation's space agency and aerospace industry. Key areas of technology development needed to enable the endeavor include advanced materials, structures and mechanisms; micro/nano sensors and detectors; power generation, storage and management; advanced thermal and cryogenic control; guidance, navigation and control; command and data handling; advanced propulsion; advanced communication; on-board processing; advanced information technology systems; modular and reconfigurable systems; precision formation flying; solar sails; distributed observing systems; and space robotics. Quality assurance concerns such as functional performance, structural integrity, radiation tolerance, health monitoring, diagnosis, maintenance, calibration, and initialization can affect the performance of systems and subsystems. It is thus imperative to employ innovative nondestructive evaluation methodologies to ensure the quality and integrity of advanced space systems. Advancements in integrated multi-functional sensor systems, autonomous inspection approaches, distributed embedded sensors, roaming inspectors, and shape-adaptive sensors are sought. Concepts in computational models for signal processing and data interpretation to establish quantitative characterization and event determination are also of interest. Prospective evaluation technologies include ultrasonics, laser ultrasonics, optics and fiber optics, shearography, video optics and metrology, thermography, electromagnetics, acoustic emission, x-ray, data management, biomimetics, and nano-scale sensing approaches for structural health monitoring.
Distributed cluster management techniques for unattended ground sensor networks
NASA Astrophysics Data System (ADS)
Essawy, Magdi A.; Stelzig, Chad A.; Bevington, James E.; Minor, Sharon
2005-05-01
Smart Sensor Networks are becoming important target detection and tracking tools. The challenging problems in such networks include sensor fusion, data management, and communication schemes. This work discusses techniques used to distribute sensor management and multi-target tracking responsibilities across an ad hoc, self-healing cluster of sensor nodes. Although miniaturized computing resources possess the ability to host complex tracking and data fusion algorithms, there still exist inherent bandwidth constraints on the RF channel. Therefore, special attention is placed on the reduction of node-to-node communications within the cluster by minimizing unsolicited messaging and distributing the sensor fusion and tracking tasks onto local portions of the network. Several challenging problems are addressed in this work, including track initialization and conflict resolution, track ownership handling, and communication control optimization. Emphasis is also placed on increasing the overall robustness of the sensor cluster through independent decision capabilities on all sensor nodes. Track initiation is performed using collaborative sensing within a neighborhood of sensor nodes, allowing each node to independently determine if initial track ownership should be assumed. This autonomous track initiation prevents the formation of duplicate tracks while eliminating the need for a central "management" node to assign tracking responsibilities. Track update is performed as an ownership node requests sensor reports from neighboring nodes based on track error covariance and the neighboring nodes' geo-positional locations. Track ownership is periodically recomputed using propagated track states to determine which sensing node provides the desired coverage characteristics. High-fidelity multi-target simulation results are presented, indicating that distributing sensor management and tracking capabilities not only reduces communication bandwidth consumption but also simplifies multi-target tracking within the cluster.
Vilas, Carlos; Balsa-Canto, Eva; García, Maria-Sonia G; Banga, Julio R; Alonso, Antonio A
2012-07-02
Systems biology allows the analysis of biological systems behavior under different conditions through in silico experimentation. The possibility of perturbing biological systems in different manners calls for the design of perturbations to achieve particular goals. Examples include the design of a chemical stimulation to maximize the amplitude of a given cellular signal or to achieve a desired pattern in pattern formation systems. Such design problems can be mathematically formulated as dynamic optimization problems, which are particularly challenging when the system is described by partial differential equations. This work addresses the numerical solution of such dynamic optimization problems for spatially distributed biological systems. The usual nonlinear and large-scale nature of the mathematical models related to this class of systems and the presence of constraints on the optimization problems impose a number of difficulties, such as the presence of suboptimal solutions, which call for robust and efficient numerical techniques. Here, the use of a control vector parameterization approach combined with efficient and robust hybrid global optimization methods and a reduced-order model methodology is proposed. The capabilities of this strategy are illustrated by solving two challenging problems: bacterial chemotaxis and the FitzHugh-Nagumo model. In the process of chemotaxis, the objective was to efficiently compute the time-varying optimal concentration of chemoattractant at one of the spatial boundaries in order to achieve predefined cell distribution profiles. Results are in agreement with those previously published in the literature. The FitzHugh-Nagumo problem is also efficiently solved and illustrates very well how dynamic optimization may be used to force a system to evolve from an undesired to a desired pattern with a reduced number of actuators. The presented methodology can be used for the efficient dynamic optimization of generic distributed biological systems.
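Control vector parameterization turns the infinite-dimensional control design into a finite nonlinear program: the control u(t) is approximated by a few piecewise-constant levels, the model is simulated for each candidate, and an outer optimizer adjusts the levels. The sketch below applies the idea to a simple scalar ODE driven toward a target state, using a local optimizer rather than the hybrid global methods the paper combines; the model, cost function, and bounds are illustrative assumptions only.

    import numpy as np
    from scipy.integrate import solve_ivp
    from scipy.optimize import minimize

    T, N_SEG = 10.0, 5                      # horizon and number of piecewise-constant segments
    edges = np.linspace(0.0, T, N_SEG + 1)

    def simulate(levels):
        """Integrate dx/dt = -x + u(t) with u piecewise constant on the segments."""
        def rhs(t, x):
            seg = min(np.searchsorted(edges, t, side="right") - 1, N_SEG - 1)
            return -x[0] + levels[seg]
        return solve_ivp(rhs, (0.0, T), [0.0], dense_output=True, max_step=0.1)

    def objective(levels):
        """Track x(t) -> 1 while penalizing control effort (illustrative cost)."""
        sol = simulate(levels)
        t = np.linspace(0.0, T, 200)
        x = sol.sol(t)[0]
        return np.trapz((x - 1.0) ** 2, t) + 0.01 * np.sum(np.asarray(levels) ** 2)

    res = minimize(objective, x0=np.zeros(N_SEG), bounds=[(0.0, 2.0)] * N_SEG)
    print("Optimal control levels per segment:", np.round(res.x, 3))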
OSG-GEM: Gene Expression Matrix Construction Using the Open Science Grid.
Poehlman, William L; Rynge, Mats; Branton, Chris; Balamurugan, D; Feltus, Frank A
2016-01-01
High-throughput DNA sequencing technology has revolutionized the study of gene expression while introducing significant computational challenges for biologists. These computational challenges include access to sufficient computer hardware and functional data processing workflows. Both these challenges are addressed with our scalable, open-source Pegasus workflow for processing high-throughput DNA sequence datasets into a gene expression matrix (GEM) using computational resources available to U.S.-based researchers on the Open Science Grid (OSG). We describe the usage of the workflow (OSG-GEM), discuss workflow design, inspect performance data, and assess accuracy in mapping paired-end sequencing reads to a reference genome. A target OSG-GEM user is proficient with the Linux command line and possesses basic bioinformatics experience. The user may run this workflow directly on the OSG or adapt it to novel computing environments.
OSG-GEM: Gene Expression Matrix Construction Using the Open Science Grid
Poehlman, William L.; Rynge, Mats; Branton, Chris; Balamurugan, D.; Feltus, Frank A.
2016-01-01
High-throughput DNA sequencing technology has revolutionized the study of gene expression while introducing significant computational challenges for biologists. These computational challenges include access to sufficient computer hardware and functional data processing workflows. Both these challenges are addressed with our scalable, open-source Pegasus workflow for processing high-throughput DNA sequence datasets into a gene expression matrix (GEM) using computational resources available to U.S.-based researchers on the Open Science Grid (OSG). We describe the usage of the workflow (OSG-GEM), discuss workflow design, inspect performance data, and assess accuracy in mapping paired-end sequencing reads to a reference genome. A target OSG-GEM user is proficient with the Linux command line and possesses basic bioinformatics experience. The user may run this workflow directly on the OSG or adapt it to novel computing environments. PMID:27499617