A Web-based Distributed Voluntary Computing Platform for Large Scale Hydrological Computations
NASA Astrophysics Data System (ADS)
Demir, I.; Agliamzanov, R.
2014-12-01
Distributed volunteer computing can enable researchers and scientists to form large parallel computing environments that harness the computing power of millions of computers on the Internet, and use them to run large-scale environmental simulations and models that serve the common good of local communities and the world. Recent developments in web technologies and standards allow client-side scripting languages to run at speeds close to native applications, and to utilize the power of Graphics Processing Units (GPUs). Using a client-side scripting language, JavaScript, we have developed an open distributed computing framework that makes it easy for researchers to write their own hydrologic models and run them on volunteer computers. Website owners can easily enable their sites so that visitors may volunteer their computer resources to help run advanced hydrological models and simulations. A web-based system allows users to start volunteering their computational resources within seconds, without installing any software. The framework distributes the model simulation to thousands of nodes in small spatial and computational units. A relational database system manages data connections and the task queue for the distributed computing nodes. In this paper, we present a web-based distributed volunteer computing platform that enables large-scale hydrological simulations and model runs in an open and integrated environment.
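The relational queue management described in the abstract above can be sketched with a small SQLite-backed task table. The schema, table, and column names here are illustrative assumptions, not details taken from the paper; a Python server stands in for the actual JavaScript framework.

```python
import sqlite3

# Hypothetical task-queue schema: one row per spatial/computational work unit.
def init_queue(conn):
    conn.execute("""CREATE TABLE IF NOT EXISTS tasks (
        id INTEGER PRIMARY KEY,
        payload TEXT NOT NULL,        -- model sub-domain parameters (e.g. JSON)
        status TEXT DEFAULT 'queued', -- queued / claimed / done
        result TEXT)""")
    conn.commit()

def claim_task(conn):
    """Hand one queued work unit to a volunteer node, marking it claimed."""
    row = conn.execute(
        "SELECT id, payload FROM tasks WHERE status = 'queued' LIMIT 1"
    ).fetchone()
    if row is None:
        return None                   # queue is empty
    conn.execute("UPDATE tasks SET status = 'claimed' WHERE id = ?", (row[0],))
    conn.commit()
    return row

def submit_result(conn, task_id, result):
    """Record the result a volunteer computed for its work unit."""
    conn.execute("UPDATE tasks SET status = 'done', result = ? WHERE id = ?",
                 (result, task_id))
    conn.commit()
```

A browser client would poll an HTTP endpoint wrapping `claim_task`, run the model tile, and post back through `submit_result`; the single-row claim keeps each work unit on exactly one volunteer.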
Distributed intrusion detection system based on grid security model
NASA Astrophysics Data System (ADS)
Su, Jie; Liu, Yahui
2008-03-01
Grid computing has developed rapidly alongside network technology, and it can solve large-scale complex computing problems by sharing large-scale computing resources. In a grid environment, we can realize a distributed, load-balanced intrusion detection system. This paper first discusses the security mechanisms of grid computing and the role of PKI/CA in the grid security system, then shows how the characteristics of grid computing can be applied to a distributed intrusion detection system (IDS) based on an Artificial Immune System. Finally, it presents a distributed intrusion detection system based on the grid security model that reduces processing delay while maintaining detection rates.
Bringing the CMS distributed computing system into scalable operations
NASA Astrophysics Data System (ADS)
Belforte, S.; Fanfani, A.; Fisk, I.; Flix, J.; Hernández, J. M.; Kress, T.; Letts, J.; Magini, N.; Miccio, V.; Sciabà, A.
2010-04-01
Establishing efficient and scalable operations of the CMS distributed computing system critically relies on the proper integration, commissioning and scale testing of the data and workload management tools, the various computing workflows and the underlying computing infrastructure, located at more than 50 computing centres worldwide and interconnected by the Worldwide LHC Computing Grid. Computing challenges periodically undertaken by CMS in the past years with increasing scale and complexity have revealed the need for a sustained effort on computing integration and commissioning activities. The Processing and Data Access (PADA) Task Force was established at the beginning of 2008 within the CMS Computing Program with the mandate of validating the infrastructure for organized processing and user analysis including the sites and the workload and data management tools, validating the distributed production system by performing functionality, reliability and scale tests, helping sites to commission, configure and optimize the networking and storage through scale testing data transfers and data processing, and improving the efficiency of accessing data across the CMS computing system from global transfers to local access. This contribution reports on the tools and procedures developed by CMS for computing commissioning and scale testing as well as the improvements accomplished towards efficient, reliable and scalable computing operations. The activities include the development and operation of load generators for job submission and data transfers with the aim of stressing the experiment and Grid data management and workload management systems, site commissioning procedures and tools to monitor and improve site availability and reliability, as well as activities targeted to the commissioning of the distributed production, user analysis and monitoring systems.
NASA Astrophysics Data System (ADS)
Manfredi, Sabato
2016-06-01
Large-scale dynamic systems are becoming increasingly pervasive, with applications ranging from systems biology and environmental monitoring to sensor networks and power systems. They are characterised by high dimensionality, complexity, and uncertainty in the node dynamics and interactions, and they demand ever more computationally intensive methods for analysis and control design as the network size and node/interaction complexity grow. Finding scalable computational methods for the distributed control design of large-scale networks is therefore a challenging problem. In this paper, we investigate the robust distributed stabilisation problem for large-scale nonlinear multi-agent systems (briefly, MASs) composed of non-identical (heterogeneous) linear dynamical systems coupled by uncertain nonlinear time-varying interconnections. By employing Lyapunov stability theory and the linear matrix inequality (LMI) technique, new conditions are given for the distributed control design of large-scale MASs that can be easily solved with the MATLAB LMI toolbox. Stabilisability of each node dynamic is a sufficient assumption for designing a globally stabilising distributed control. The proposed approach improves on existing LMI-based results for MASs by overcoming their computational limits and extending the applicable scenario to large-scale nonlinear heterogeneous MASs. Additionally, the proposed LMI conditions are further reduced in computational requirements in the case of weakly heterogeneous MASs, a common scenario in real applications where the network nodes and links are affected by parameter uncertainties.
One of the main advantages of the proposed approach is that it allows moving from a centralised towards a distributed computing architecture, so that the expensive computational workload of solving the LMIs may be shared among processors located at the network nodes, improving the scalability of the approach with respect to network size. Finally, a numerical example shows the applicability of the proposed method and its advantage in computational complexity when compared with existing approaches.
Computer programs for smoothing and scaling airfoil coordinates
NASA Technical Reports Server (NTRS)
Morgan, H. L., Jr.
1983-01-01
Detailed descriptions are given of the theoretical methods and associated computer codes of a program to smooth and a program to scale arbitrary airfoil coordinates. The smoothing program utilizes both least-squares polynomial and least-squares cubic-spline techniques to iteratively smooth the second derivatives of the y-axis airfoil coordinates with respect to a transformed x-axis system that unwraps the airfoil and stretches the nose and trailing-edge regions. The corresponding smooth airfoil coordinates are then determined by solving a tridiagonal matrix of simultaneous cubic-spline equations relating the y-axis coordinates and their corresponding second derivatives. A technique for computing the camber and thickness distribution of the smoothed airfoil is also discussed. The scaling program can then be used to scale the thickness distribution generated by the smoothing program to a specific maximum thickness, which is then combined with the camber distribution to obtain the final scaled airfoil contour. Computer listings of the smoothing and scaling programs are included.
Architecture and Programming Models for High Performance Intensive Computation
2016-06-29
NASA Technical Reports Server (NTRS)
Deardorff, Glenn; Djomehri, M. Jahed; Freeman, Ken; Gambrel, Dave; Green, Bryan; Henze, Chris; Hinke, Thomas; Hood, Robert; Kiris, Cetin; Moran, Patrick;
2001-01-01
A series of NASA presentations for the Supercomputing 2001 conference are summarized. The topics include: (1) Mars Surveyor Landing Sites "Collaboratory"; (2) Parallel and Distributed CFD for Unsteady Flows with Moving Overset Grids; (3) IP Multicast for Seamless Support of Remote Science; (4) Consolidated Supercomputing Management Office; (5) Growler: A Component-Based Framework for Distributed/Collaborative Scientific Visualization and Computational Steering; (6) Data Mining on the Information Power Grid (IPG); (7) Debugging on the IPG; (8) Debakey Heart Assist Device: (9) Unsteady Turbopump for Reusable Launch Vehicle; (10) Exploratory Computing Environments Component Framework; (11) OVERSET Computational Fluid Dynamics Tools; (12) Control and Observation in Distributed Environments; (13) Multi-Level Parallelism Scaling on NASA's Origin 1024 CPU System; (14) Computing, Information, & Communications Technology; (15) NAS Grid Benchmarks; (16) IPG: A Large-Scale Distributed Computing and Data Management System; and (17) ILab: Parameter Study Creation and Submission on the IPG.
Are X-rays the key to integrated computational materials engineering?
Ice, Gene E.
2015-11-01
The ultimate dream of materials science is to predict materials behavior from composition and processing history. Owing to the growing power of computers, this long-time dream has recently found expression in worldwide excitement over a number of computation-based thrusts: integrated computational materials engineering, materials by design, computational materials design, three-dimensional materials physics and mesoscale physics. However, real materials have important crystallographic structures at multiple length scales, which evolve during processing and in service. Moreover, real materials properties can depend on the extreme tails of their structural and chemical distributions. This makes it critical to map structural distributions with sufficient resolution to resolve small structures and with sufficient statistics to capture the tails of the distributions. For two-dimensional materials, there are high-resolution nondestructive probes of surface and near-surface structures with atomic or near-atomic resolution that can provide detailed structural, chemical and functional distributions over important length scales. However, there are no nondestructive three-dimensional probes with atomic resolution over the multiple length scales needed to understand most materials.
Information Power Grid Posters
NASA Technical Reports Server (NTRS)
Vaziri, Arsi
2003-01-01
This document is a summary of the accomplishments of the Information Power Grid (IPG). Grids are an emerging technology that provide seamless and uniform access to the geographically dispersed, computational, data storage, networking, instruments, and software resources needed for solving large-scale scientific and engineering problems. The goal of the NASA IPG is to use NASA's remotely located computing and data system resources to build distributed systems that can address problems that are too large or complex for a single site. The accomplishments outlined in this poster presentation are: access to distributed data, IPG heterogeneous computing, integration of large-scale computing node into distributed environment, remote access to high data rate instruments,and exploratory grid environment.
Performance of distributed multiscale simulations
Borgdorff, J.; Ben Belgacem, M.; Bona-Casas, C.; Fazendeiro, L.; Groen, D.; Hoenen, O.; Mizeranschi, A.; Suter, J. L.; Coster, D.; Coveney, P. V.; Dubitzky, W.; Hoekstra, A. G.; Strand, P.; Chopard, B.
2014-01-01
Multiscale simulations model phenomena across natural scales using monolithic or component-based code, running on local or distributed resources. In this work, we investigate the performance of distributed multiscale computing of component-based models, guided by six multiscale applications with different characteristics and from several disciplines. Three modes of distributed multiscale computing are identified: supplementing local dependencies with large-scale resources, load distribution over multiple resources, and load balancing of small- and large-scale resources. We find that the first mode has the apparent benefit of increasing simulation speed, and the second mode can increase simulation speed if local resources are limited. Depending on resource reservation and model coupling topology, the third mode may result in a reduction of resource consumption. PMID:24982258
NASA's Information Power Grid: Large Scale Distributed Computing and Data Management
NASA Technical Reports Server (NTRS)
Johnston, William E.; Vaziri, Arsi; Hinke, Tom; Tanner, Leigh Ann; Feiereisen, William J.; Thigpen, William; Tang, Harry (Technical Monitor)
2001-01-01
Large-scale science and engineering are done through the interaction of people, heterogeneous computing resources, information systems, and instruments, all of which are geographically and organizationally dispersed. The overall motivation for Grids is to facilitate the routine interactions of these resources in order to support large-scale science and engineering. Multi-disciplinary simulations provide a good example of a class of applications that are very likely to require aggregation of widely distributed computing, data, and intellectual resources. Such simulations - e.g. whole system aircraft simulation and whole system living cell simulation - require integrating applications and data that are developed by different teams of researchers, frequently in different locations. The research teams are the only ones with the expertise to maintain and improve the simulation codes and/or the bodies of experimental data that drive the simulations. This results in an inherently distributed computing and data management environment.
2006-10-01
Investigation of Item-Pair Presentation and Construct Validity of the Navy Computer Adaptive Personality Scales (NCAPS). Underhill, Christina M. NPRST-TN-06-9, October 2006. Approved for public release; distribution is unlimited. The report documents one of the steps in the development of NCAPS, a computer adaptive personality measure.
The future of PanDA in ATLAS distributed computing
NASA Astrophysics Data System (ADS)
De, K.; Klimentov, A.; Maeno, T.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Schovancova, J.; Vaniachine, A.; Wenaus, T.
2015-12-01
Experiments at the Large Hadron Collider (LHC) face unprecedented computing challenges. Heterogeneous resources are distributed worldwide at hundreds of sites, thousands of physicists analyse the data remotely, the volume of processed data is beyond the exabyte scale, while data processing requires more than a few billion hours of computing usage per year. The PanDA (Production and Distributed Analysis) system was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. In the process, the old batch job paradigm of locally managed computing in HEP was discarded in favour of a far more automated, flexible and scalable model. The success of PanDA in ATLAS is leading to widespread adoption and testing by other experiments. PanDA is the first exascale workload management system in HEP, already operating at more than a million computing jobs per day, and processing over an exabyte of data in 2013. There are many new challenges that PanDA will face in the near future, in addition to new challenges of scale, heterogeneity and increasing user base. PanDA will need to handle rapidly changing computing infrastructure, will require factorization of code for easier deployment, will need to incorporate additional information sources including network metrics in decision making, be able to control network circuits, handle dynamically sized workload processing, provide improved visualization, and face many other challenges. In this talk we will focus on the new features, planned or recently implemented, that are relevant to the next decade of distributed computing workload management using PanDA.
Task Assignment Heuristics for Distributed CFD Applications
NASA Technical Reports Server (NTRS)
Lopez-Benitez, N.; Djomehri, M. J.; Biswas, R.; Biegel, Bryan (Technical Monitor)
2001-01-01
CFD applications require high-performance computational platforms: 1. Complex physics and domain configuration demand strongly coupled solutions; 2. Applications are CPU and memory intensive; and 3. Huge resource requirements can only be satisfied by teraflop-scale machines or distributed computing.
Distributed weighted least-squares estimation with fast convergence for large-scale systems.
Marelli, Damián Edgardo; Fu, Minyue
2015-01-01
In this paper we study a distributed weighted least-squares estimation problem for a large-scale system consisting of a network of interconnected sub-systems. Each sub-system is concerned with a subset of the unknown parameters and has a measurement linear in the unknown parameters with additive noise. The distributed estimation task is for each sub-system to compute the globally optimal estimate of its own parameters using its own measurement and information shared with the network through neighborhood communication. We first provide a fully distributed iterative algorithm to asymptotically compute the global optimal estimate. The convergence rate of the algorithm is maximized using a scaling parameter and a preconditioning method. This algorithm works for a general network. For a network without loops, we also provide a different iterative algorithm to compute the global optimal estimate which converges in a finite number of steps. We include numerical experiments to illustrate the performance of the proposed methods.
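The convergence-maximizing scaling parameter mentioned above can be illustrated with a toy Richardson-style iteration on the global normal equations; this is a hedged sketch of the general idea, not the authors' exact algorithm, and all problem sizes and names below are illustrative.

```python
import numpy as np

# Each sub-system i holds a local linear measurement y_i = A_i x + noise with
# weight W_i. The global WLS estimate solves H x = b with
#   H = sum_i A_i' W_i A_i,   b = sum_i A_i' W_i y_i.
rng = np.random.default_rng(0)
n = 4                                              # unknown parameters
A = [rng.normal(size=(3, n)) for _ in range(5)]    # local measurement matrices
x_true = rng.normal(size=n)
y = [Ai @ x_true + 0.01 * rng.normal(size=3) for Ai in A]
W = [np.eye(3) for _ in A]                         # identity weights for the toy

H = sum(Ai.T @ Wi @ Ai for Ai, Wi, yi in zip(A, W, y))
b = sum(Ai.T @ Wi @ yi for Ai, Wi, yi in zip(A, W, y))

# Scalar scaling parameter chosen from the largest eigenvalue so the
# iteration matrix I - alpha*H is a contraction.
alpha = 1.0 / np.linalg.eigvalsh(H).max()
x = np.zeros(n)
for _ in range(5000):
    # each node contributes A_i' W_i (y_i - A_i x); summing those local terms
    # is what neighborhood communication would assemble in a distributed run
    x = x + alpha * (b - H @ x)

x_direct = np.linalg.solve(H, b)                   # centralized WLS solution
```

The iteration converges linearly at rate 1 - λmin(H)/λmax(H), which is why tuning the scaling parameter (and preconditioning, in the paper) matters for speed.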
A distributed computing approach to mission operations support. [for spacecraft
NASA Technical Reports Server (NTRS)
Larsen, R. L.
1975-01-01
Computing mission operation support includes orbit determination, attitude processing, maneuver computation, resource scheduling, etc. The large-scale third-generation distributed computer network discussed is capable of fulfilling these dynamic requirements. It is shown that distribution of resources and control leads to increased reliability, and exhibits potential for incremental growth. Through functional specialization, a distributed system may be tuned to very specific operational requirements. Fundamental to the approach is the notion of process-to-process communication, which is effected through a high-bandwidth communications network. Both resource-sharing and load-sharing may be realized in the system.
Multiscaling properties of coastal waters particle size distribution from LISST in situ measurements
NASA Astrophysics Data System (ADS)
Pannimpullath Remanan, R.; Schmitt, F. G.; Loisel, H.; Mériaux, X.
2013-12-01
A Eulerian high-frequency sampling of particle size distribution (PSD) was performed at 1 Hz during 5 tidal cycles (65 hours) in a coastal environment of the eastern English Channel. The particle data are recorded using a LISST-100x type C (Laser In Situ Scattering and Transmissometry, Sequoia Scientific), which records volume concentrations of particles with diameters ranging from 2.5 to 500 μm in 32 logarithmically spaced size classes. This enables the estimation, at each time step (every second), of the probability density function (pdf) of particle sizes. At every time step the pdf of the PSD is hyperbolic, so we can estimate a time series of PSD slopes. Power spectral analysis shows that the mean diameter of the suspended particles scales at high frequencies (from 1 s to 1000 s). The scaling properties of particle sizes are studied by computing the moment function from the pdf of the size distribution. Moment functions at many different time scales (from 1 s to 1000 s) are computed and their scaling properties considered. The Shannon entropy at each time scale is also estimated and related to the other parameters. The multiscaling properties of the turbidity (the coefficient cp computed from the LISST) are also considered on the same time scales, using Empirical Mode Decomposition.
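The moment-function analysis described above can be sketched generically: coarse-grain a positive series at a range of time scales, compute the moments at each scale, and read the scaling exponents off log-log slopes. The coarse-graining scheme and synthetic surrogate below are assumptions for illustration, standing in for the LISST mean-diameter series.

```python
import numpy as np

def moment_exponents(series, scales, qs):
    """Estimate scaling exponents K(q) from log M_q(tau) vs log tau slopes,
    where M_q(tau) is the q-th moment of the series averaged at scale tau."""
    logM = np.empty((len(qs), len(scales)))
    for j, tau in enumerate(scales):
        n = (len(series) // tau) * tau
        coarse = series[:n].reshape(-1, tau).mean(axis=1)  # block-average at tau
        for i, q in enumerate(qs):
            logM[i, j] = np.log(np.mean(coarse ** q))
    # slope of log M_q versus log tau gives the scaling exponent for each q
    return np.array([np.polyfit(np.log(scales), row, 1)[0] for row in logM])

rng = np.random.default_rng(1)
signal = np.exp(rng.normal(size=2 ** 14))   # positive lognormal surrogate data
K = moment_exponents(signal, scales=[1, 2, 4, 8, 16, 32], qs=[1, 2])
```

For an uncorrelated surrogate, K(1) is zero (block averaging preserves the mean) while K(2) is negative as fluctuations are smoothed out; a multiscaling signal shows a nonlinear dependence of K on q.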
A Weibull distribution accrual failure detector for cloud computing.
Liu, Jiaxi; Wu, Zhibo; Wu, Jin; Dong, Jian; Zhao, Yao; Wen, Dongxin
2017-01-01
Failure detectors are used to build high availability distributed systems as the fundamental component. To meet the requirement of a complicated large-scale distributed system, accrual failure detectors that can adapt to multiple applications have been studied extensively. However, several implementations of accrual failure detectors do not adapt well to the cloud service environment. To solve this problem, a new accrual failure detector based on Weibull Distribution, called the Weibull Distribution Failure Detector, has been proposed specifically for cloud computing. It can adapt to the dynamic and unexpected network conditions in cloud computing. The performance of the Weibull Distribution Failure Detector is evaluated and compared based on public classical experiment data and cloud computing experiment data. The results show that the Weibull Distribution Failure Detector has better performance in terms of speed and accuracy in unstable scenarios, especially in cloud computing.
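The accrual idea summarized above can be sketched with the standard suspicion-level construction; the estimator below (method-of-moments scale with an assumed shape) is a simplified stand-in, not the paper's exact Weibull fitting procedure.

```python
import math

def weibull_scale(mean_interval, shape):
    """Method-of-moments scale estimate: E[T] = scale * Gamma(1 + 1/shape)."""
    return mean_interval / math.gamma(1.0 + 1.0 / shape)

def suspicion(t_since_last, intervals, shape=1.5):
    """Accrual suspicion phi(t) = -log10 P(next heartbeat arrives later than t),
    with heartbeat inter-arrival times modelled as Weibull(shape, scale)."""
    scale = weibull_scale(sum(intervals) / len(intervals), shape)
    survival = math.exp(-((t_since_last / scale) ** shape))
    return -math.log10(survival) if survival > 0 else float("inf")
```

The monitor keeps a sliding window of observed inter-arrival times and declares a process suspect once the suspicion level crosses an application-chosen threshold, which is what lets one detector serve applications with different timeliness/accuracy trade-offs.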
DOE Office of Scientific and Technical Information (OSTI.GOV)
Voltolini, Marco; Kwon, Tae-Hyuk; Ajo-Franklin, Jonathan
2017-10-21
Pore-scale distribution of supercritical CO2 (scCO2) exerts significant control on a variety of key hydrologic and geochemical processes, including residual trapping and dissolution. Despite this importance, only a small number of experiments have directly characterized the three-dimensional distribution of scCO2 in geologic materials during the invasion (drainage) process. Here, we present a study that couples dynamic high-resolution synchrotron X-ray micro-computed tomography imaging of a scCO2/brine system at in situ pressure/temperature conditions with quantitative pore-scale modeling, allowing direct validation of a pore-scale description of scCO2 distribution. The experiment combines high-speed synchrotron radiography with tomography to characterize the brine-saturated sample, the scCO2 breakthrough process, and the partially saturated state of a sandstone sample from the Domengine Formation, a regionally extensive unit within the Sacramento Basin (California, USA). The availability of a 3D dataset allowed us to examine correlations between grain and pore morphometric parameters and the actual distribution of scCO2 in the sample, including the role of small-scale sedimentary structure in CO2 distribution. The segmented scCO2/brine volume was also used to validate a simple computational model based on the local thickness concept, able to accurately simulate the distribution of scCO2 after drainage. The same method was also used to simulate Hg capillary pressure curves, with satisfactory results when compared to the measured ones. Finally, this predictive approach, requiring only a tomographic scan of the dry sample, proved to be an effective route for studying processes related to CO2 invasion structure in geological samples at the pore scale.
2017-02-13
Atomistic- and Meso-Scale Computational Simulations for Developing Multi-Timescale Theory (AFRL-RV-PS-TR-2016-0161). Kirtland AFB, NM. Approved for public release; distribution is unlimited.
NASA Astrophysics Data System (ADS)
Senthilkumar, K.; Ruchika Mehra Vijayan, E.
2017-11-01
This paper aims to illustrate real-time analysis of large-scale data. As a practical implementation, we perform sentiment analysis on live Twitter feeds, scoring each individual tweet. To analyze sentiments, we train our data model on SentiWordNet, a polarity-annotated WordNet sample from Princeton University. Our main objective is to efficiently analyze large-scale data on the fly using distributed computation. The Apache Spark and Apache Hadoop ecosystem is used as the distributed computation platform, with Java as the development language.
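The per-tweet scoring step described above reduces to a lexicon lookup that is trivially parallelizable, which is why it maps well onto Spark. The sketch below is a minimal stand-in: a tiny hand-made polarity table replaces SentiWordNet, and a plain Python map replaces the distributed computation.

```python
# Illustrative polarity lexicon (a stand-in for SentiWordNet scores).
POLARITY = {"good": 0.6, "great": 0.8, "happy": 0.7,
            "bad": -0.6, "terrible": -0.9, "sad": -0.7}

def tweet_score(tweet):
    """Sum the polarity of each known word in the tweet."""
    return sum(POLARITY.get(w, 0.0) for w in tweet.lower().split())

def classify(tweet):
    s = tweet_score(tweet)
    return "positive" if s > 0 else "negative" if s < 0 else "neutral"

stream = ["Great match today, so happy",
          "terrible traffic, bad day",
          "just tea"]
labels = [classify(t) for t in stream]   # the map step Spark would distribute
```

In the Spark setting, the same `classify` would run inside a `map` over an RDD or DStream of tweets, so throughput scales with the number of executors rather than a single machine's speed.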
Scientific Services on the Cloud
NASA Astrophysics Data System (ADS)
Chapman, David; Joshi, Karuna P.; Yesha, Yelena; Halem, Milt; Yesha, Yaacov; Nguyen, Phuong
Scientific computing was one of the first applications of parallel and distributed computation. To this day, scientific applications remain some of the most compute-intensive, and have inspired the creation of petaflop compute infrastructure such as the Oak Ridge Jaguar and Los Alamos RoadRunner. Large dedicated hardware infrastructure has become both a blessing and a curse to the scientific community. Scientists are interested in cloud computing for much the same reasons as businesses and other professionals. The hardware is provided, maintained, and administered by a third party. Software abstraction and virtualization provide reliability and fault tolerance. Graduated fees allow for multi-scale prototyping and execution. Cloud computing resources are only a few clicks away, and are by far the easiest high-performance distributed platform to gain access to. There may still be dedicated infrastructure for ultra-scale science, but the cloud can easily play a major part in the scientific computing initiative.
Effect of Variable Spatial Scales on USLE-GIS Computations
NASA Astrophysics Data System (ADS)
Patil, R. J.; Sharma, S. K.
2017-12-01
Use of an appropriate spatial scale is very important in Universal Soil Loss Equation (USLE) based spatially distributed soil erosion modelling. This study aimed to assess annual rates of soil erosion at different spatial scales/grid sizes and to analyse how changes in spatial scale affect USLE-GIS computations, using simulation and statistical variability. Efforts have been made in this study to recommend an optimum spatial scale for further USLE-GIS computations for management and planning in the study area. The research was conducted in the Shakkar River watershed, situated in the Narsinghpur and Chhindwara districts of Madhya Pradesh, India. Remote sensing and GIS techniques were integrated with the USLE to predict the spatial distribution of soil erosion in the study area at four spatial scales: 30 m, 50 m, 100 m, and 200 m. Rainfall data, a soil map, a digital elevation model (DEM), an executable C++ program, and a satellite image of the area were used to prepare the thematic maps for the various USLE factors. Annual rates of soil erosion were estimated for 15 years (1992 to 2006) at the four grid sizes. Statistical analysis of the four estimated datasets showed that the sediment loss dataset at the 30 m spatial scale had the minimum standard deviation (2.16), variance (4.68), and percent deviation from observed values (2.68-18.91%), and the highest coefficient of determination (R2 = 0.874) among the four datasets. It is therefore recommended to adopt this spatial scale for USLE-GIS computations in the study area, given its minimum statistical variability and better agreement with the observed sediment loss data. This study also indicates considerable scope for the use of finer spatial scales in spatially distributed soil erosion modelling.
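The core USLE-GIS computation described above is a cell-by-cell product of co-registered factor rasters, A = R x K x LS x C x P, repeated at each grid size. The sketch below shows that step on a tiny grid; all factor values are illustrative, not the study's data.

```python
import numpy as np

def usle_soil_loss(R, K, LS, C, P):
    """Per-cell annual soil loss A = R*K*LS*C*P for co-registered factor grids."""
    return R * K * LS * C * P

# Illustrative 2x2 factor rasters (one value per grid cell):
R  = np.full((2, 2), 450.0)                   # rainfall erosivity
K  = np.array([[0.28, 0.30], [0.25, 0.27]])   # soil erodibility
LS = np.array([[1.2, 0.8], [2.1, 1.5]])       # slope length-steepness
C  = np.full((2, 2), 0.18)                    # cover management
P  = np.full((2, 2), 1.0)                     # support practice
A  = usle_soil_loss(R, K, LS, C, P)           # annual soil loss per cell
```

Changing the spatial scale amounts to resampling each factor raster to a coarser or finer cell size before this product, which is why the statistics of the resulting soil-loss map vary with grid size.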
Large-Scale Distributed Computational Fluid Dynamics on the Information Power Grid Using Globus
NASA Technical Reports Server (NTRS)
Barnard, Stephen; Biswas, Rupak; Saini, Subhash; VanderWijngaart, Robertus; Yarrow, Maurice; Zechtzer, Lou; Foster, Ian; Larsson, Olle
1999-01-01
This paper describes an experiment in which a large-scale scientific application developed for tightly-coupled parallel machines is adapted to the distributed execution environment of the Information Power Grid (IPG). A brief overview of the IPG and a description of the computational fluid dynamics (CFD) algorithm are given. The Globus metacomputing toolkit is used as the enabling device for the geographically-distributed computation. Modifications related to latency hiding and load balancing were required for an efficient implementation of the CFD application in the IPG environment. Performance results on a pair of SGI Origin 2000 machines indicate that real scientific applications can be effectively implemented on the IPG; however, a significant amount of continued effort is required to make such an environment useful and accessible to scientists and engineers.
GLAD: a system for developing and deploying large-scale bioinformatics grid.
Teo, Yong-Meng; Wang, Xianbing; Ng, Yew-Kwong
2005-03-01
Grid computing is used to solve large-scale bioinformatics problems with gigabyte-scale databases by distributing the computation across multiple platforms. Until now, in developing bioinformatics grid applications, it has been extremely tedious to design and implement the component algorithms and parallelization techniques for different classes of problems, and to access remotely located sequence database files of varying formats across the grid. In this study, we propose a grid programming toolkit, GLAD (Grid Life sciences Applications Developer), which facilitates the development and deployment of bioinformatics applications on a grid. GLAD has been developed using ALiCE (Adaptive scaLable Internet-based Computing Engine), a Java-based grid middleware that exploits task-based parallelism. Two benchmark bioinformatics applications, distributed sequence comparison and distributed progressive multiple sequence alignment, have been developed using GLAD.
Ogawa, S.; Komini Babu, S.; Chung, H. T.; ...
2016-08-22
The nano/micro-scale geometry of polymer electrolyte fuel cell (PEFC) catalyst layers critically affects cell performance. The small length scales and complex structure of these composite layers make it challenging to analyze cell performance and physics at the particle scale by experiment. We present a computational method to simulate transport and chemical reaction phenomena at the pore/particle scale and apply it to a PEFC cathode with a platinum group metal free (PGM-free) catalyst. Here, we numerically solve the governing equations for the physics with a heterogeneous oxygen diffusion coefficient and proton conductivity evaluated using the actual electrode structure and ionomer distribution obtained using nano-scale resolution X-ray computed tomography (nano-CT). Using this approach, the oxygen concentration and electrolyte potential distributions imposed by the oxygen reduction reaction are solved and the impact of the catalyst layer structure on performance is evaluated.
A compositional reservoir simulator on distributed memory parallel computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rame, M.; Delshad, M.
1995-12-31
This paper presents the application of distributed memory parallel computers to field scale reservoir simulations using a parallel version of UTCHEM, The University of Texas Chemical Flooding Simulator. The model is a general purpose, highly vectorized chemical compositional simulator that can simulate a wide range of displacement processes at both field and laboratory scales. The original simulator was modified to run on both distributed memory parallel machines (Intel iPSC/860 and Delta, Connection Machine 5, Kendall Square 1 and 2, and CRAY T3D) and a cluster of workstations. A domain decomposition approach has been taken towards parallelization of the code. A portion of the discrete reservoir model is assigned to each processor by a set-up routine that attempts a data layout as even as possible from the load-balance standpoint. Each of these subdomains is extended so that data can be shared between adjacent processors for stencil computation. The added routines that make parallel execution possible are written in a modular fashion that makes porting to new parallel platforms straightforward. Distributed memory computing performance results of the parallel simulator are presented for field scale applications such as a tracer flood and a polymer flood. A comparison of the wall-clock times for the same problems on a vector supercomputer is also presented.
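The decomposition strategy described above — an even data layout plus subdomain overlap for stencil sharing — can be sketched in one dimension. The routine names are illustrative, not UTCHEM's:

```python
def block_decompose(n_cells, n_procs):
    """Split n_cells into n_procs contiguous blocks whose sizes differ by
    at most one, for an even load-balance layout."""
    base, extra = divmod(n_cells, n_procs)
    bounds, start = [], 0
    for p in range(n_procs):
        size = base + (1 if p < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds

def with_ghosts(bounds, n_cells, width=1):
    """Extend each subdomain by `width` ghost cells on each side so adjacent
    processors can exchange the data needed for stencil computation."""
    return [(max(0, lo - width), min(n_cells, hi + width)) for lo, hi in bounds]
```

In the real simulator the same idea is applied to the 3-D reservoir grid, and the ghost regions are refreshed by message passing each time step.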
Supporting large scale applications on networks of workstations
NASA Technical Reports Server (NTRS)
Cooper, Robert; Birman, Kenneth P.
1989-01-01
Distributed applications on networks of workstations are an increasingly common way to satisfy computing needs. However, existing mechanisms for distributed programming exhibit poor performance and reliability as application size increases. Extension of the ISIS distributed programming system to support large scale distributed applications by providing hierarchical process groups is discussed. Incorporation of hierarchy in the program structure and exploitation of this to limit the communication and storage required in any one component of the distributed system is examined.
Wan, Shixiang; Zou, Quan
2017-01-01
Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme increase in next-generation sequencing output has created a shortage of efficient approaches for ultra-large biological sequence alignment that can cope with different sequence types. Distributed and parallel computing represents a crucial technique for accelerating ultra-large (e.g., files of more than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient tool, HAlign-II, to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. Experiments on large-scale DNA and protein data sets, with files larger than 1 GB, showed that HAlign-II saves both time and space and outperforms current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences, shows extremely high memory efficiency, and scales well with increases in computing resources. HAlign-II also provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II, with open-source code and datasets, is available at http://lab.malab.cn/soft/halign.
Next Generation Distributed Computing for Cancer Research
Agarwal, Pankaj; Owzar, Kouros
2014-01-01
Advances in next generation sequencing (NGS) and mass spectrometry (MS) technologies have provided many new opportunities and angles for extending the scope of translational cancer research while creating tremendous challenges in data management and analysis. The resulting informatics challenge is invariably not amenable to the use of traditional computing models. Recent advances in scalable computing and associated infrastructure, particularly distributed computing for Big Data, can provide solutions for addressing these challenges. In this review, the next generation of distributed computing technologies that can address these informatics problems is described from the perspective of three key components of a computational platform, namely computing, data storage and management, and networking. A broad overview of scalable computing is provided to set the context for a detailed description of Hadoop, a technology that is being rapidly adopted for large-scale distributed computing. A proof-of-concept Hadoop cluster, set up for performance benchmarking of NGS read alignment, is described as an example of how to work with Hadoop. Finally, Hadoop is compared with a number of other current technologies for distributed computing. PMID:25983539
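Hadoop's programming model splits work into a map phase and a reduce phase over key/value pairs. A minimal Hadoop-Streaming-style sketch in the spirit of the NGS benchmark — counting aligned reads per chromosome — where the tab-separated record layout (read id, chromosome, position) is an assumption for illustration:

```python
from collections import defaultdict

def mapper(lines):
    """Map phase: emit a (chromosome, 1) pair for each alignment record."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:  # assumed layout: read_id, chromosome, position
            yield fields[1], 1

def reducer(pairs):
    """Reduce phase: sum the counts for each chromosome key."""
    counts = defaultdict(int)
    for chrom, n in pairs:
        counts[chrom] += n
    return dict(counts)
```

In a real Hadoop Streaming job the mapper and reducer run as separate processes on different nodes, with the framework shuffling and sorting the intermediate pairs between them.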
Simple Kinematic Pathway Approach (KPA) to Catchment-scale Travel Time and Water Age Distributions
NASA Astrophysics Data System (ADS)
Soltani, S. S.; Cvetkovic, V.; Destouni, G.
2017-12-01
The distribution of catchment-scale water travel times is strongly influenced by morphological dispersion and is partitioned between hillslope and larger, regional scales. We explore whether hillslope travel times are predictable using a simple semi-analytical "kinematic pathway approach" (KPA) that accounts for two levels of dispersion: morphological dispersion and macro-dispersion. The study gives new insights into shallow (hillslope) and deep (regional) groundwater travel times by comparing numerical simulations of travel time distributions, referred to as the "dynamic model", with corresponding KPA computations for three real catchment case studies in Sweden. KPA uses basic structural and hydrological data to compute transient water travel time (forward mode) and age (backward mode) distributions at the catchment outlet. Longitudinal and morphological dispersion components are reflected in the KPA computations by assuming an effective Peclet number and topographically driven pathway length distributions, respectively. Numerical simulations of advective travel times are obtained by means of particle tracking using the fully integrated flow model MIKE SHE. The comparison of computed cumulative distribution functions of travel times shows a significant influence of morphological dispersion and groundwater recharge rate on the compatibility of the "kinematic pathway" and "dynamic" models. Zones of high recharge rate in the "dynamic" models are associated with topographically driven groundwater flow paths to adjacent discharge zones, e.g. rivers and lakes, through relatively shallow pathway compartments. These zones exhibit more compatible behavior between the "dynamic" and "kinematic pathway" models than zones of low recharge rate. Interestingly, the travel time distributions of hillslope compartments remain almost unchanged with increasing recharge rates in the "dynamic" models.
This robust "dynamic" model behavior suggests that flow path lengths and travel times in shallow hillslope compartments are controlled by topography, and therefore application and further development of the simple "kinematic pathway" approach is promising for their modeling.
Horsch, Karla; Pesce, Lorenzo L.; Giger, Maryellen L.; Metz, Charles E.; Jiang, Yulei
2012-01-01
Purpose: The authors developed scaling methods that monotonically transform the output of one classifier to the “scale” of another. Such transformations affect the distribution of classifier output while leaving the ROC curve unchanged. In particular, they investigated transformations between radiologists and computer classifiers, with the goal of addressing the problem of comparing and interpreting case-specific values of output from two classifiers. Methods: Using both simulated and radiologists’ rating data of breast imaging cases, the authors investigated a likelihood-ratio-scaling transformation, based on “matching” classifier likelihood ratios. For comparison, three other scaling transformations were investigated that were based on matching classifier true positive fraction, false positive fraction, or cumulative distribution function, respectively. The authors explored modifying the computer output to reflect the scale of the radiologist, as well as modifying the radiologist’s ratings to reflect the scale of the computer. They also evaluated how dataset size affects the transformations. Results: When ROC curves of two classifiers differed substantially, the four transformations were found to be quite different. The likelihood-ratio scaling transformation was found to vary widely from radiologist to radiologist. Similar results were found for the other transformations. Our simulations explored the effect of database sizes on the accuracy of the estimation of our scaling transformations. Conclusions: The likelihood-ratio-scaling transformation that the authors have developed and evaluated was shown to be capable of transforming computer and radiologist outputs to a common scale reliably, thereby allowing the comparison of the computer and radiologist outputs on the basis of a clinically relevant statistic. PMID:22559651
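One of the four transformations studied, matching cumulative distribution functions, amounts to empirical quantile mapping: a score x from classifier A is sent to the B-score occupying the same position in B's empirical distribution. A minimal sketch (not the authors' code; the tie-handling and interpolation conventions here are simplifications):

```python
import bisect

def cdf_match(x, scores_a, scores_b):
    """Map a score x from classifier A onto classifier B's scale by matching
    empirical CDFs. The mapping is monotone, so A's ROC curve is unchanged."""
    a, b = sorted(scores_a), sorted(scores_b)
    q = bisect.bisect_right(a, x) / len(a)   # empirical CDF of x under A
    idx = min(int(q * len(b)), len(b) - 1)   # matching quantile of B
    return b[idx]
```

Because any strictly monotone transformation of classifier output leaves the ROC curve invariant, such scalings change only how case-specific values are distributed and interpreted, which is exactly the comparison problem the paper addresses.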
NASA Astrophysics Data System (ADS)
Jang, W.; Engda, T. A.; Neff, J. C.; Herrick, J.
2017-12-01
Many crop models are increasingly used to evaluate crop yields at regional and global scales. However, implementation of these models across large areas using fine-scale grids is limited by computational time requirements. In order to facilitate global gridded crop modeling with various scenarios (i.e., different crops, management schedules, fertilizer, and irrigation) using the Environmental Policy Integrated Climate (EPIC) model, we developed a distributed parallel computing framework in Python. A local desktop with 14 cores (28 threads) was used to test the framework in Iringa, Tanzania, which has 406,839 grid cells. High-resolution soil data, SoilGrids (250 x 250 m), and climate data, AgMERRA (0.25 x 0.25 deg), were also used as input data for the gridded EPIC model. The framework includes a master file for parallel computing, an input database, input data formatters, EPIC model execution, and output analyzers. Through the master file, the user-defined number of CPU threads divides the EPIC simulation into jobs. The input data formatters then turn the raw database into formatted EPIC input data, which moves into the EPIC simulation jobs. The 28 EPIC jobs run simultaneously, and only the output files of interest are parsed and moved into the output analyzers. We applied various scenarios with seven different slopes and twenty-four fertilizer ranges. Parallelized input generators create the different scenarios as a list for distributed parallel computing. After all simulations are completed, parallelized output analyzers analyze all outputs according to the different scenarios. This saves significant computing time and resources, making it possible to conduct gridded modeling at regional to global scales with high-resolution data.
For example, serial processing for the Iringa test case would require 113 hours, while using the framework developed in this study requires only approximately 6 hours, a nearly 95% reduction in computing time.
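The job-splitting pattern described reduces to a standard Python worker-pool idiom. In this sketch `run_epic_cell` is a hypothetical stand-in for the actual model run, which the framework performs by formatting inputs, invoking the EPIC executable, and parsing selected outputs:

```python
from multiprocessing import Pool

def run_epic_cell(job):
    """Stand-in for one EPIC run over a single (grid cell, scenario) pair.
    The real framework formats inputs, calls the EPIC executable, and
    parses only the output files of interest."""
    cell_id, scenario = job
    return cell_id, scenario, cell_id * 0.1  # dummy result value

def run_grid(cells, scenarios, workers=28):
    """Fan (cell, scenario) jobs out across a pool of worker processes."""
    jobs = [(c, s) for c in cells for s in scenarios]
    with Pool(workers) as pool:
        return pool.map(run_epic_cell, jobs)
```

Because each grid cell and scenario is independent, the speedup is close to the number of workers until input formatting and output parsing start to dominate.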
DOE Office of Scientific and Technical Information (OSTI.GOV)
Katz, Daniel S; Jha, Shantenu; Weissman, Jon
2017-01-31
This is the final technical report for the AIMES project. Many important advances in science and engineering are due to large-scale distributed computing. Notwithstanding this reliance, we are still learning how to design and deploy large-scale production Distributed Computing Infrastructures (DCI). This is evidenced by missing design principles for DCI, and an absence of generally acceptable and usable distributed computing abstractions. The AIMES project was conceived against this backdrop, following on the heels of a comprehensive survey of scientific distributed applications. AIMES laid the foundations to address the tripartite challenge of dynamic resource management, integrating information, and portable and interoperable distributed applications. Four abstractions were defined and implemented: skeleton, resource bundle, pilot, and execution strategy. The four abstractions were implemented as software modules and then aggregated into the AIMES middleware. This middleware successfully integrates information across the application layer (skeletons) and resource layer (bundles), derives a suitable execution strategy for the given skeleton, and enacts its execution by means of pilots on one or more resources, depending on the application requirements and resource availabilities and capabilities.
Karthikeyan, M; Krishnan, S; Pandey, Anil Kumar; Bender, Andreas; Tropsha, Alexander
2008-04-01
We present the application of a Java remote method invocation (RMI) based open source architecture to distributed chemical computing. This architecture was previously employed for distributed data harvesting of chemical information from the Internet via the Google application programming interface (API; ChemXtreme). Due to its open source character and its flexibility, the underlying server/client framework can be quickly adapted to virtually every computational task that can be parallelized. Here, we present the server/client communication framework as well as an application to distributed computing of chemical properties on a large scale (currently the size of PubChem; about 18 million compounds), using both the Marvin toolkit and the open source JOELib package. As an application, the agreement of log P and TPSA values between the two packages was compared for this set of compounds. Outliers were found to be mostly non-druglike compounds, and differences could usually be explained by differences in the underlying algorithms. ChemStar is the first open source distributed chemical computing environment built on Java RMI, and it is easily adaptable to user demands due to its "plug-in architecture". The complete source code as well as calculated properties, along with links to PubChem resources, are available on the Internet via a graphical user interface at http://moltable.ncl.res.in/chemstar/.
Simulation Framework for Intelligent Transportation Systems
DOT National Transportation Integrated Search
1996-10-01
A simulation framework has been developed for a large-scale, comprehensive, scalable simulation of an Intelligent Transportation System. The simulator is designed for running on parallel computers and distributed (networked) computer systems, but ca...
Analysis of Delays in Transmitting Time Code Using an Automated Computer Time Distribution System
1999-12-01
jlevine@clock.bldrdoc.gov Abstract: An automated computer time distribution system broadcasts standard time to users using computers and modems via... contributed to delays - software platform (50% of the delay), transmission speed of time-codes (25%), telephone network (15%), modem and others (10%). The... modems, and telephone lines. Users dial the ACTS server to receive time traceable to the national time scale of Singapore, UTC(PSB). The users can in
The revolution in data gathering systems
NASA Technical Reports Server (NTRS)
Cambra, J. M.; Trover, W. F.
1975-01-01
Data acquisition systems used in NASA's wind tunnels from the 1950's through the present time are summarized as a baseline for assessing the impact of minicomputers and microcomputers on data acquisition and data processing. Emphasis is placed on the cyclic evolution in computer technology which transformed the central computer system, and finally the distributed computer system. Other developments discussed include: medium scale integration, large scale integration, combining the functions of data acquisition and control, and micro and minicomputers.
Singular Perturbations and Time-Scale Methods in Control Theory: Survey 1976-1982.
1982-12-01
established in the 1960s, when they first became a means for simplified computation of optimal trajectories. It was soon recognized that singular... null-space of P(a0). The asymptotic values of the invariant zeros and associated invariant-zero directions as ε → 0 are the values computed from the... 7. WEAK COUPLING AND TIME SCALES The need for model simplification with a reduction (or distribution) of computational effort is
Alternative Smoothing and Scaling Strategies for Weighted Composite Scores
ERIC Educational Resources Information Center
Moses, Tim
2014-01-01
In this study, smoothing and scaling approaches are compared for estimating subscore-to-composite scaling results involving composites computed as rounded and weighted combinations of subscores. The considered smoothing and scaling approaches included those based on raw data, on smoothing the bivariate distribution of the subscores, on smoothing…
SciSpark's SRDD : A Scientific Resilient Distributed Dataset for Multidimensional Data
NASA Astrophysics Data System (ADS)
Palamuttam, R. S.; Wilson, B. D.; Mogrovejo, R. M.; Whitehall, K. D.; Mattmann, C. A.; McGibbney, L. J.; Ramirez, P.
2015-12-01
Remote sensing data and climate model output are multi-dimensional arrays of massive sizes locked away in heterogeneous file formats (HDF5/4, NetCDF 3/4) and metadata models (HDF-EOS, CF) making it difficult to perform multi-stage, iterative science processing since each stage requires writing and reading data to and from disk. We have developed SciSpark, a robust Big Data framework, that extends Apache Spark for scaling scientific computations. Apache Spark improves the map-reduce implementation in Apache Hadoop for parallel computing on a cluster, by emphasizing in-memory computation, "spilling" to disk only as needed, and relying on lazy evaluation. Central to Spark is the Resilient Distributed Dataset (RDD), an in-memory distributed data structure that extends the functional paradigm provided by the Scala programming language. However, RDDs are ideal for tabular or unstructured data, and not for highly dimensional data. The SciSpark project introduces the Scientific Resilient Distributed Dataset (sRDD), a distributed-computing array structure which supports iterative scientific algorithms for multidimensional data. SciSpark processes data stored in NetCDF and HDF files by partitioning them across time or space and distributing the partitions among a cluster of compute nodes. We show usability and extensibility of SciSpark by implementing distributed algorithms for geospatial operations on large collections of multi-dimensional grids. In particular we address the problem of scaling an automated method for finding Mesoscale Convective Complexes. SciSpark provides a tensor interface to support the pluggability of different matrix libraries. We evaluate performance of the various matrix libraries in distributed pipelines, such as Nd4j and Breeze. We detail the architecture and design of SciSpark, our efforts to integrate climate science algorithms, parallel ingest and partitioning (sharding) of A-Train satellite observations from model grids.
These solutions are encompassed in SciSpark, an open-source software framework for distributed computing on scientific data.
Drawert, Brian; Trogdon, Michael; Toor, Salman; Petzold, Linda; Hellander, Andreas
2016-01-01
Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools and a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments.
Large scale cardiac modeling on the Blue Gene supercomputer.
Reumann, Matthias; Fitch, Blake G; Rayshubskiy, Aleksandr; Keller, David U; Weiss, Daniel L; Seemann, Gunnar; Dössel, Olaf; Pitman, Michael C; Rice, John J
2008-01-01
Multi-scale, multi-physical heart models have not yet been able to include a high degree of accuracy and resolution with respect to model detail and spatial resolution due to computational limitations of current systems. We propose a framework to compute large scale cardiac models. Decomposition of anatomical data in segments to be distributed on a parallel computer is carried out by optimal recursive bisection (ORB). The algorithm takes into account a computational load parameter which has to be adjusted according to the cell models used. The diffusion term is realized by the monodomain equations. The anatomical data-set was given by both ventricles of the Visible Female data-set in a 0.2 mm resolution. Heterogeneous anisotropy was included in the computation. Model weights as input for the decomposition and load balancing were set to (a) 1 for tissue and 0 for non-tissue elements; (b) 10 for tissue and 1 for non-tissue elements. Scaling results for 512, 1024, 2048, 4096 and 8192 computational nodes were obtained for 10 ms simulation time. The simulations were carried out on an IBM Blue Gene/L parallel computer. A 1 s simulation was then carried out on 2048 nodes for the optimal model load. Load balances did not differ significantly across computational nodes even if the number of data elements distributed to each node differed greatly. Since the ORB algorithm did not take into account computational load due to communication cycles, the speedup is close to optimal for the computation time but not optimal overall due to the communication overhead. However, the simulation times were reduced from 87 minutes on 512 to 11 minutes on 8192 nodes. This work demonstrates that it is possible to run simulations of the presented detailed cardiac model within hours for the simulation of a heartbeat.
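Recursive bisection repeatedly cuts the element set at the weighted median along alternating axes, so each part carries roughly equal computational load. A minimal 2-D sketch, assuming the number of parts is a power of two (the production ORB in the paper works on 3-D anatomical data and tuned load parameters):

```python
def orb(elements, n_parts, axis=0):
    """Recursive-bisection sketch: split ((x, y), weight) pairs into n_parts
    (assumed a power of two) of roughly equal total weight, cutting along
    alternating axes at the weighted median."""
    if n_parts == 1:
        return [elements]
    elements = sorted(elements, key=lambda e: e[0][axis])
    half = sum(w for _, w in elements) / 2.0
    acc, cut = 0.0, len(elements)
    for i, (_, w) in enumerate(elements):
        acc += w
        if acc >= half:
            cut = i + 1
            break
    left, right = elements[:cut], elements[cut:]
    return (orb(left, n_parts // 2, 1 - axis) +
            orb(right, n_parts // 2, 1 - axis))
```

Weighting tissue elements more heavily than non-tissue elements (e.g. 10 vs. 1) is what lets nodes receive very different element counts yet similar compute loads.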
A Latency-Tolerant Partitioner for Distributed Computing on the Information Power Grid
NASA Technical Reports Server (NTRS)
Das, Sajal K.; Harvey, Daniel J.; Biwas, Rupak; Kwak, Dochan (Technical Monitor)
2001-01-01
NASA's Information Power Grid (IPG) is an infrastructure designed to harness the power of geographically distributed computers, databases, and human expertise, in order to solve large-scale realistic computational problems. This type of meta-computing environment is necessary to present a unified virtual machine to application developers that hides the intricacies of a highly heterogeneous environment and yet maintains adequate security. In this paper, we present a novel partitioning scheme, called MinEX, that dynamically balances processor workloads while minimizing data movement and runtime communication, for applications that are executed in a parallel distributed fashion on the IPG. We also analyze the conditions required for the IPG to be an effective tool for such distributed computations. Our results show that MinEX is a viable load balancer provided the nodes of the IPG are connected by a high-speed asynchronous interconnection network.
Distributed and grid computing projects with research focus in human health.
Diomidous, Marianna; Zikos, Dimitrios
2012-01-01
Distributed systems and grid computing systems are used to connect several computers to obtain a higher level of performance in order to solve a problem. During the last decade, projects have used the World Wide Web to aggregate individuals' CPU power for research purposes. This paper presents the existing active large-scale distributed and grid computing projects with a research focus on human health. Eleven active projects with more than 2,000 Processing Units (PUs) each were found and are presented. The research focus for most of them is molecular biology, specifically understanding or predicting protein structure through simulation, comparing proteins, genomic analysis for disease-provoking genes, and drug design. Though not in all cases explicitly stated, common target diseases include HIV, dengue, Duchenne dystrophy, Parkinson's disease, various types of cancer, and influenza. Other diseases include malaria, anthrax, and Alzheimer's disease. The need for national initiatives and European collaboration for larger-scale projects is stressed, to raise citizens' awareness of participation and to create a culture of internet volunteering altruism.
High-Throughput Computing on High-Performance Platforms: A Case Study
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oleynik, D; Panitkin, S; Turilli, Matteo
The computing systems used by LHC experiments have historically consisted of the federation of hundreds to thousands of distributed resources, ranging from small to mid-size. In spite of the impressive scale of the existing distributed computing solutions, the federation of small to mid-size resources will be insufficient to meet projected future demands. This paper is a case study of how the ATLAS experiment has embraced Titan, a DOE leadership facility, in conjunction with traditional distributed high-throughput computing to reach sustained production scales of approximately 52M core-hours a year. The three main contributions of this paper are: (i) a critical evaluation of design and operational considerations to support the sustained, scalable and production usage of Titan; (ii) a preliminary characterization of a next-generation executor for PanDA to support new workloads and advanced execution modes; and (iii) early lessons for how current and future experimental and observational systems can be integrated with production supercomputers and other platforms in a general and extensible manner.
Scalable parallel distance field construction for large-scale applications
Yu, Hongfeng; Xie, Jinrong; Ma, Kwan-Liu; ...
2015-10-01
Computing distance fields is fundamental to many scientific and engineering applications. Distance fields can be used to direct analysis and reduce data. In this paper, we present a highly scalable method for computing 3D distance fields on massively parallel distributed-memory machines. A new distributed spatial data structure, named parallel distance tree, is introduced to manage the level sets of data and facilitate surface tracking over time, resulting in significantly reduced computation and communication costs for calculating the distance to the surface of interest from any spatial location. Our method supports several data types and distance metrics from real-world applications. We demonstrate its efficiency and scalability on state-of-the-art supercomputers using both large-scale volume datasets and surface models. We also demonstrate in-situ distance field computation on dynamic turbulent flame surfaces for a petascale combustion simulation. In conclusion, our work greatly extends the usability of distance fields for demanding applications.
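The parallel distance tree itself is not described in the abstract; the serial core of the task it accelerates, computing a distance field from a surface of interest, can be sketched as a multi-source breadth-first search over a voxel grid. This toy uses the Manhattan metric on integer grid steps, purely for illustration; the paper supports other metrics and distributes the computation.

```python
from collections import deque

def distance_field(surface, shape):
    """Manhattan-distance field to a set of surface voxels via multi-source BFS.

    surface: iterable of (x, y, z) voxel coordinates on the surface of interest.
    shape:   (nx, ny, nz) grid dimensions.
    Returns a dict mapping every voxel to its integer distance in grid steps.
    """
    dist = {v: 0 for v in surface}       # all surface voxels start at distance 0
    queue = deque(dist)
    nx, ny, nz = shape
    while queue:
        x, y, z = queue.popleft()
        d = dist[(x, y, z)]
        for dx, dy, dz in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                           (0, -1, 0), (0, 0, 1), (0, 0, -1)):
            n = (x + dx, y + dy, z + dz)
            if 0 <= n[0] < nx and 0 <= n[1] < ny and 0 <= n[2] < nz and n not in dist:
                dist[n] = d + 1
                queue.append(n)
    return dist
```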
Parallel and distributed computation for fault-tolerant object recognition
NASA Technical Reports Server (NTRS)
Wechsler, Harry
1988-01-01
The distributed associative memory (DAM) model is suggested for distributed and fault-tolerant computation as it relates to object recognition tasks. The fault tolerance is with respect to geometrical distortions (scale and rotation), noisy inputs, occlusion/overlap, and memory faults. An experimental system was developed for fault-tolerant structure recognition which shows the feasibility of such an approach. The approach is further extended to the problem of multisensory data integration and applied successfully to the recognition of colored polyhedral objects.
NASA Astrophysics Data System (ADS)
Carvalho, D.; Gavillet, Ph.; Delgado, V.; Albert, J. N.; Bellas, N.; Javello, J.; Miere, Y.; Ruffinoni, D.; Smith, G.
Large scientific equipment is controlled by computer systems whose complexity is growing, driven on the one hand by the volume and variety of the information, its distributed nature and the sophistication of its treatment, and on the other hand by the fast evolution of the computer and network market. Some people call them generically Large-Scale Distributed Data Intensive Information Systems, or Distributed Computer Control Systems (DCCS) for those systems dealing more with real-time control. Taking advantage of (or forced by) the distributed architecture, the tasks are more and more often implemented as client-server applications. In this framework the monitoring of the computer nodes, the communications network and the applications becomes of primary importance for ensuring the safe running and guaranteed performance of the system. With the future generation of HEP experiments in view, such as those at the LHC, it is proposed to integrate the various functions of DCCS monitoring into one general-purpose multi-layer system.
The coupling of fluids, dynamics, and controls on advanced architecture computers
NASA Technical Reports Server (NTRS)
Atwood, Christopher
1995-01-01
This grant provided for the demonstration of coupled controls, body dynamics, and fluids computations in a workstation cluster environment, and an investigation of the impact of peer-to-peer communication on flow solver performance and robustness. The findings of these investigations were documented in the conference articles. The attached publication, 'Towards Distributed Fluids/Controls Simulations', documents the solution and scaling of the coupled Navier-Stokes, Euler rigid-body dynamics, and state feedback control equations for a two-dimensional canard-wing. The poor scaling shown was due to serialized grid connectivity computation and Ethernet bandwidth limits. The scaling of a peer-to-peer communication flow code on an IBM SP-2 was also shown. The scaling of the code on the switched fabric-linked nodes was good, with a 2.4 percent loss due to communication of intergrid boundary point information. The code performance on 30 worker nodes was 1.7 μs/point/iteration, or a factor of three over a Cray C-90 head. The attached paper, 'Nonlinear Fluid Computations in a Distributed Environment', documents the effect of several computational rate enhancing methods on convergence. For the cases shown, the highest throughput was achieved using boundary updates at each step, with the manager process performing communication tasks only. Constrained domain decomposition of the implicit fluid equations did not degrade the convergence rate or final solution. The scaling of a coupled body/fluid dynamics problem on an Ethernet-linked cluster was also shown.
Rapid solution of large-scale systems of equations
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.
1994-01-01
The analysis and design of complex aerospace structures requires the rapid solution of large systems of linear and nonlinear equations, eigenvalue extraction for buckling, vibration and flutter modes, structural optimization and design sensitivity calculation. Computers with multiple processors and vector capabilities can offer substantial computational advantages over traditional scalar computers for these analyses. These computers fall into two categories: shared-memory computers and distributed-memory computers. This presentation covers general-purpose, highly efficient algorithms for the generation and assembly of element matrices, the solution of systems of linear and nonlinear equations, eigenvalue and design sensitivity analysis, and optimization. All algorithms are coded in FORTRAN for shared-memory computers and many are adapted to distributed-memory computers. The capability and numerical performance of these algorithms will be addressed.
Warris, Sven; Boymans, Sander; Muiser, Iwe; Noback, Michiel; Krijnen, Wim; Nap, Jan-Peter
2014-01-13
Small RNAs are important regulators of genome function, yet their prediction in genomes is still a major computational challenge. Statistical analyses of pre-miRNA sequences indicated that their 2D structure tends to have a minimal free energy (MFE) significantly lower than the MFE values of equivalently randomized sequences with the same nucleotide composition, in contrast to other classes of non-coding RNA. The computation of many MFEs is, however, too intensive to allow for genome-wide screenings. Using a local grid infrastructure, MFE distributions of random sequences were pre-calculated on a large scale. These distributions follow a normal distribution and can be used to determine the MFE distribution for any given sequence composition by interpolation, allowing on-the-fly calculation of the normal distribution for any candidate sequence composition. The speedup achieved makes genome-wide screening with this characteristic of a pre-miRNA sequence practical. Although this property alone will not be sufficiently discriminative to distinguish miRNAs from other sequences, the MFE-based P-value should be added to the parameters of choice to be included in the selection of potential miRNA candidates for experimental verification.
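As a sketch of the statistic the authors describe: assuming the MFE values of shuffled sequences of a given composition follow a normal distribution whose mean and standard deviation have been pre-calculated (or interpolated), the one-sided P-value of a candidate's MFE is just the normal CDF at its z-score. The function name and parameter values are illustrative, not from the paper.

```python
import math

def mfe_p_value(mfe, mean, sd):
    """One-sided P-value: probability that a random sequence of the same
    nucleotide composition attains an MFE at least as low as the candidate's,
    assuming the randomized MFEs are normally distributed.

    Lower MFE than expected -> negative z -> small P-value.
    """
    z = (mfe - mean) / sd
    # Normal CDF expressed via the complementary error function.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))
```

For example, a candidate whose MFE sits two standard deviations below the mean of the shuffled set gets a P-value of about 0.023.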
A Grid Infrastructure for Supporting Space-based Science Operations
NASA Technical Reports Server (NTRS)
Bradford, Robert N.; Redman, Sandra H.; McNair, Ann R. (Technical Monitor)
2002-01-01
Emerging technologies for computational grid infrastructures have the potential for revolutionizing the way computers are used in all aspects of our lives. Computational grids are currently being implemented to provide a large-scale, dynamic, and secure research and engineering environments based on standards and next-generation reusable software, enabling greater science and engineering productivity through shared resources and distributed computing for less cost than traditional architectures. Combined with the emerging technologies of high-performance networks, grids provide researchers, scientists and engineers the first real opportunity for an effective distributed collaborative environment with access to resources such as computational and storage systems, instruments, and software tools and services for the most computationally challenging applications.
Multiscale modeling of porous ceramics using movable cellular automaton method
NASA Astrophysics Data System (ADS)
Smolin, Alexey Yu.; Smolin, Igor Yu.; Smolina, Irina Yu.
2017-10-01
The paper presents a multiscale model for porous ceramics based on the movable cellular automaton method, a novel particle method in computational solid mechanics. The initial scale of the proposed approach corresponds to the characteristic size of the smallest pores in the ceramics. At this scale, we model uniaxial compression of several representative samples with an explicit account of pores of the same size but with unique positions in space. As a result, we get the average values of Young's modulus and strength, as well as the parameters of the Weibull distribution of these properties at the current scale level. These data allow us to describe the material behavior at the next scale level, where only the larger pores are considered explicitly, while the influence of small pores is included via effective properties determined earlier. If the pore size distribution function of the material has N maxima, we need to perform computations for N-1 levels in order to get the properties step by step from the lowest scale up to the macroscale. The proposed approach was applied to modeling zirconia ceramics with a bimodal pore size distribution. The obtained results show correct behavior of the model sample at the macroscale.
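The paper's fitted Weibull parameters are not given; a minimal sketch of the hand-off between scale levels is to draw effective small-scale strengths from a fitted Weibull distribution and feed them to the next, coarser level. Function name and parameter values below are illustrative only.

```python
import random

def sample_effective_strength(scale_param, shape_param, n_samples=1000, seed=1):
    """Draw effective strengths from a Weibull distribution fitted at the
    finer scale, to be assigned to material elements at the next coarser
    scale level. Parameters are illustrative, not from the paper.
    """
    rng = random.Random(seed)
    # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
    return [rng.weibullvariate(scale_param, shape_param)
            for _ in range(n_samples)]
```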
Job Superscheduler Architecture and Performance in Computational Grid Environments
NASA Technical Reports Server (NTRS)
Shan, Hongzhang; Oliker, Leonid; Biswas, Rupak
2003-01-01
Computational grids hold great promise in utilizing geographically separated heterogeneous resources to solve large-scale complex scientific problems. However, a number of major technical hurdles, including distributed resource management and effective job scheduling, stand in the way of realizing these gains. In this paper, we propose a novel grid superscheduler architecture and three distributed job migration algorithms. We also model the critical interaction between the superscheduler and autonomous local schedulers. Extensive performance comparisons with ideal, central, and local schemes using real workloads from leading computational centers are conducted in a simulation environment. Additionally, synthetic workloads are used to perform a detailed sensitivity analysis of our superscheduler. Several key metrics demonstrate that substantial performance gains can be achieved via smart superscheduling in distributed computational grids.
A uniform approach for programming distributed heterogeneous computing systems
Grasso, Ivan; Pellegrini, Simone; Cosenza, Biagio; Fahringer, Thomas
2014-01-01
Large-scale compute clusters of heterogeneous nodes equipped with multi-core CPUs and GPUs are becoming increasingly popular in the scientific community. However, such systems require a combination of different programming paradigms, making application development very challenging. In this article we introduce libWater, a library-based extension of the OpenCL programming model that simplifies the development of heterogeneous distributed applications. libWater consists of a simple interface, which is a transparent abstraction of the underlying distributed architecture, offering advanced features such as inter-context and inter-node device synchronization. It provides a runtime system which tracks dependency information enforced by event synchronization to dynamically build a DAG of commands, on which we automatically apply two optimizations: collective communication pattern detection and device-host-device copy removal. We assess libWater's performance on three compute clusters available from the Vienna Scientific Cluster, the Barcelona Supercomputing Center and the University of Innsbruck, demonstrating improved performance and scaling with different test applications and configurations. PMID:25844015
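libWater's own API is not shown in the abstract; the underlying idea of building a DAG of commands from event dependencies and then scheduling them in a dependency-respecting order can be sketched with Kahn's topological sort. The command names are hypothetical.

```python
from collections import defaultdict, deque

def topo_order(commands):
    """Kahn's algorithm over a command DAG.

    commands: dict mapping each command to the set of commands whose
    completion events it waits on. Returns an execution order that
    respects every dependency.
    """
    indeg = {c: len(deps) for c, deps in commands.items()}
    dependents = defaultdict(list)
    for c, deps in commands.items():
        for d in deps:
            dependents[d].append(c)
    ready = deque(c for c, n in indeg.items() if n == 0)
    order = []
    while ready:
        c = ready.popleft()
        order.append(c)
        for nxt in dependents[c]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                ready.append(nxt)
    return order
```

A runtime holding the full DAG, rather than issuing commands eagerly, is what makes whole-graph optimizations like redundant-copy removal possible.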
How Much Higher Can HTCondor Fly?
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fajardo, E. M.; Dost, J. M.; Holzman, B.
The HTCondor high-throughput computing system is heavily used in the high energy physics (HEP) community as the batch system for several Worldwide LHC Computing Grid (WLCG) resources. Moreover, it is the backbone of GlideinWMS, the pilot system used by the computing organization of the Compact Muon Solenoid (CMS) experiment. To prepare for LHC Run 2, we probed the scalability limits of new versions and configurations of HTCondor with a goal of reaching 200,000 simultaneously running jobs in a single internationally distributed dynamic pool. In this paper, we first describe how we created an opportunistic distributed testbed capable of exercising runs with 200,000 simultaneous jobs without impacting production. This testbed methodology is appropriate not only for scale testing HTCondor, but potentially for many other services. In addition to the test conditions and the testbed topology, we include the suggested configuration options used to obtain the scaling results, and describe some of the changes to HTCondor inspired by our testing that enabled sustained operations at scales well beyond previous limits.
A study of complex scaling transformation using the Wigner representation of wavefunctions.
Kaprálová-Ždánská, Petra Ruth
2011-05-28
The complex scaling operator exp(-θx̂p̂/ℏ), being the foundation of the complex scaling method for resonances, is studied in the Wigner phase-space representation. It is shown that the complex scaling operator behaves similarly to the squeezing operator, rotating and amplifying the Wigner quasi-probability distributions of the respective wavefunctions. It is further shown that the distorting effect of the complex scaling transformation is correlated with increased numerical errors in computed resonance energies and widths. The behavior of the numerical error is demonstrated for a computation of CO(2+) vibronic resonances. © 2011 American Institute of Physics
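For the ordinary real-θ dilation operator, the squeezing analogy the abstract draws takes a simple closed form in the Wigner representation. This is the standard textbook result for real dilations, not the paper's complex-θ analysis:

```latex
% Action of the real dilation (squeezing) operator on a wavefunction:
%   \psi(x) \;\mapsto\; e^{\theta/2}\,\psi\!\left(e^{\theta}x\right)
% Its effect on the Wigner quasi-probability distribution is area-preserving:
W(x,p) \;\longrightarrow\; W\!\left(e^{\theta}x,\; e^{-\theta}p\right)
```

For complex θ the transformation additionally rotates and amplifies the quasi-probability distribution, which is the distortion the abstract correlates with numerical error.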
Jade: using on-demand cloud analysis to give scientists back their flow
NASA Astrophysics Data System (ADS)
Robinson, N.; Tomlinson, J.; Hilson, A. J.; Arribas, A.; Powell, T.
2017-12-01
The UK's Met Office generates 400 TB of weather and climate data every day by running physical models on its Top 20 supercomputer. As data volumes explode, there is a danger that analysis workflows become dominated by watching progress bars, not by thinking about science. We have been researching how we can use distributed computing to allow analysts to process these large volumes of high-velocity data in a way that is easy, effective and cheap. Our prototype analysis stack, Jade, tries to encapsulate this. Functionality includes: an under-the-hood Dask engine which parallelises and distributes computations without the need to retrain analysts; hybrid compute clusters (AWS, Alibaba, and local compute) comprising many thousands of cores; clusters which autoscale up and down in response to calculation load using Kubernetes, balancing the cluster across providers based on the current price of compute; and lazy data access from cloud storage via containerised OPeNDAP. This technology stack allows us to perform calculations many orders of magnitude faster than is possible on local workstations. It can also outperform dedicated local compute clusters, as cloud compute can, in principle, scale much further. The use of ephemeral compute resources also makes this implementation cost-efficient.
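Jade's Dask configuration is not shown in the abstract; the split-apply-combine pattern that such an engine automates (partition the data, reduce each chunk in parallel, combine the partials) can be sketched with the standard library alone. The function name is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def distributed_mean(values, n_workers=4):
    """Split-apply-combine mean: partition the data, reduce each chunk
    in parallel, then combine the partial (sum, count) results.
    A distributed engine applies the same pattern across many nodes.
    """
    chunk = max(1, len(values) // n_workers)
    parts = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda p: (sum(p), len(p)), parts))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count
```

The point of an engine like Dask is that the analyst writes only the array expression; the chunking, scheduling and combining above happen under the hood.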
The MIT Alewife Machine: A Large-Scale Distributed-Memory Multiprocessor
1991-06-01
Multiscale Simulation of Porous Ceramics Based on Movable Cellular Automaton Method
NASA Astrophysics Data System (ADS)
Smolin, A.; Smolin, I.; Eremina, G.; Smolina, I.
2017-10-01
The paper presents a model for simulating the mechanical behaviour of multiscale porous ceramics based on the movable cellular automaton method, a novel particle method in computational solid mechanics. The initial scale of the proposed approach corresponds to the characteristic size of the smallest pores in the ceramics. At this scale, we model uniaxial compression of several representative samples with an explicit account of pores of the same size but with random unique positions in space. As a result, we get the average values of Young's modulus and strength, as well as the parameters of the Weibull distribution of these properties at the current scale level. These data allow us to describe the material behaviour at the next scale level, where only the larger pores are considered explicitly, while the influence of small pores is included via the effective properties determined at the previous scale level. If the pore size distribution function of the material has N maxima, we need to perform computations for N - 1 levels in order to get the properties from the lowest scale up to the macroscale step by step. The proposed approach was applied to modelling zirconia ceramics with a bimodal pore size distribution. The obtained results show correct behaviour of the model sample at the macroscale.
Multi-scale Material Appearance
NASA Astrophysics Data System (ADS)
Wu, Hongzhi
Modeling and rendering the appearance of materials is important for a diverse range of applications of computer graphics, from automobile design to movies and cultural heritage. The appearance of materials varies considerably at different scales, posing significant challenges due to the sheer complexity of the data, as well as the need to maintain inter-scale consistency constraints. This thesis presents a series of studies around the modeling, rendering and editing of multi-scale material appearance. To efficiently render material appearance at multiple scales, we develop an object-space precomputed adaptive sampling method, which precomputes a hierarchy of view-independent points that preserve multi-level appearance. To support bi-scale material appearance design, we propose a novel reflectance filtering algorithm, which rapidly computes the large-scale appearance from small-scale details, by exploiting the low-rank structures of Bidirectional Visible Normal Distribution Functions and pre-rotated Bidirectional Reflectance Distribution Functions in the matrix formulation of the rendering algorithm. This approach can guide the physical realization of appearance, as well as the modeling of real-world materials using very sparse measurements. Finally, we present a bi-scale-inspired high-quality general representation for material appearance described by Bidirectional Texture Functions. Our representation is at once compact, easily editable, and amenable to efficient rendering.
GISpark: A Geospatial Distributed Computing Platform for Spatiotemporal Big Data
NASA Astrophysics Data System (ADS)
Wang, S.; Zhong, E.; Wang, E.; Zhong, Y.; Cai, W.; Li, S.; Gao, S.
2016-12-01
Geospatial data are growing exponentially because of the proliferation of cost-effective and ubiquitous positioning technologies such as global remote-sensing satellites and location-based devices. Analyzing large amounts of geospatial data can provide great value for both industrial and scientific applications. Data- and compute-intensive characteristics inherent in geospatial big data increasingly pose great challenges to technologies for data storage, computing and analysis. Such challenges require a scalable and efficient architecture that can store, query, analyze, and visualize large-scale spatiotemporal data. Therefore, we developed GISpark, a geospatial distributed computing platform for processing large-scale vector, raster and stream data. GISpark is constructed on the latest virtualized computing infrastructures and distributed computing architecture. OpenStack and Docker are used to build a multi-user cloud computing infrastructure for GISpark. Virtual storage systems such as HDFS, Ceph and MongoDB are combined and adopted for spatiotemporal data storage management. A Spark-based algorithm framework is developed for efficient parallel computing. Within this framework, SuperMap GIScript and various open-source GIS libraries can be integrated into GISpark. GISpark can also be integrated with scientific computing environments (e.g., Anaconda), interactive computing web applications (e.g., Jupyter notebook), and machine learning tools (e.g., TensorFlow/Orange). The associated geospatial facilities of GISpark, in conjunction with the scientific computing environment, exploratory spatial data analysis tools, and temporal data management and analysis systems, make up a powerful geospatial computing tool. GISpark not only provides spatiotemporal big data processing capacity in the geospatial field, but also provides a spatiotemporal computational model and advanced geospatial visualization tools for other domains with a spatial dimension. We tested the performance of the platform with a taxi trajectory analysis. Results suggest that GISpark achieves excellent runtime performance in spatiotemporal big data applications.
A distribution model for the aerial application of granular agricultural particles
NASA Technical Reports Server (NTRS)
Fernandes, S. T.; Ormsbee, A. I.
1978-01-01
A model is developed to predict the shape of the distribution of granular agricultural particles applied by aircraft. The particle is assumed to have a random size and shape, and the model includes the effects of air resistance, distributor geometry and aircraft wake. General requirements for maintaining similarity of the distribution in scale-model tests are derived, addressing the problem of a non-general drag law. It is shown that if the mean and variance of the particle diameter and density are scaled according to the scaling laws governing the system, the shape of the distribution will be preserved. Distributions are calculated numerically and show the effect of a random initial lateral position, particle size and drag coefficient. A listing of the computer code is included.
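The paper's drag law, distributor geometry and wake model are not reproduced in the abstract; as a minimal sketch of the kind of trajectory integration involved, here is a spherical particle with quadratic drag released from an aircraft, integrated with explicit Euler. All names and parameter values are illustrative, not the paper's.

```python
import math

def drift_distance(diameter, density, height, airspeed,
                   dt=1e-3, rho_air=1.2, cd=0.44, g=9.81):
    """Horizontal distance travelled by a spherical particle released at
    `height` with initial horizontal velocity `airspeed`, under gravity
    and quadratic aerodynamic drag (SI units, constant drag coefficient).
    """
    mass = density * math.pi / 6.0 * diameter ** 3
    area = math.pi / 4.0 * diameter ** 2
    k = 0.5 * rho_air * cd * area / mass     # drag acceleration per speed^2
    x, y = 0.0, height
    vx, vy = airspeed, 0.0
    while y > 0.0:                            # integrate until ground impact
        speed = math.hypot(vx, vy)
        ax = -k * speed * vx
        ay = -g - k * speed * vy
        vx += ax * dt
        vy += ay * dt
        x += vx * dt
        y += vy * dt
    return x
```

Averaging such trajectories over random particle diameters and densities yields the deposition distribution whose shape the paper analyzes.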
Drawert, Brian; Trogdon, Michael; Toor, Salman; Petzold, Linda; Hellander, Andreas
2017-01-01
Computational experiments using spatial stochastic simulations have led to important new biological insights, but they require specialized tools and a complex software stack, as well as large and scalable compute and data analysis resources due to the large computational cost associated with Monte Carlo computational workflows. The complexity of setting up and managing a large-scale distributed computation environment to support productive and reproducible modeling can be prohibitive for practitioners in systems biology. This results in a barrier to the adoption of spatial stochastic simulation tools, effectively limiting the type of biological questions addressed by quantitative modeling. In this paper, we present PyURDME, a new, user-friendly spatial modeling and simulation package, and MOLNs, a cloud computing appliance for distributed simulation of stochastic reaction-diffusion models. MOLNs is based on IPython and provides an interactive programming platform for development of sharable and reproducible distributed parallel computational experiments. PMID:28190948
Computer-generated forces in distributed interactive simulation
NASA Astrophysics Data System (ADS)
Petty, Mikel D.
1995-04-01
Distributed Interactive Simulation (DIS) is an architecture for building large-scale simulation models from a set of independent simulator nodes communicating via a common network protocol. DIS is most often used to create a simulated battlefield for military training. Computer Generated Forces (CGF) systems control large numbers of autonomous battlefield entities in a DIS simulation using computer equipment and software rather than humans in simulators. CGF entities serve as both enemy forces and supplemental friendly forces in a DIS exercise. Research into various aspects of CGF systems is ongoing. Several CGF systems have been implemented.
Lagerlöf, Jakob H; Bernhardt, Peter
2016-01-01
To develop a general model that utilises a stochastic method to generate a vessel tree based on experimental data, and an associated irregular, macroscopic tumour, to be used to evaluate two different methods for computing oxygen distribution. A vessel tree structure and an associated tumour of 127 cm3 were generated using a stochastic method and Bresenham's line algorithm to develop trees on two different scales and fuse them together. The vessel dimensions were adjusted through convolution and thresholding, and each vessel voxel was assigned an oxygen value. Diffusion and consumption were modelled using a Green's function approach together with Michaelis-Menten kinetics. The computations were performed using a combined tree method (CTM) and an individual tree method (ITM). Five tumour sub-sections were compared to evaluate the methods. The oxygen distributions of the same tissue samples computed with the different methods were considerably less similar (root mean square deviation, RMSD ≈ 0.02) than the distributions of different samples using the CTM (0.001 < RMSD < 0.01). The deviations of the ITM from the CTM increase at lower oxygen values, so the ITM severely underestimates the level of hypoxia in the tumour. Kolmogorov-Smirnov (KS) tests showed that millimetre-scale samples may not represent the whole tumour. The stochastic model managed to capture the heterogeneous nature of hypoxic fractions and, even though the simplified computation did not considerably alter the overall oxygen distribution, it leads to an evident underestimation of tumour hypoxia, and thereby of radioresistance. For a trustworthy computation of tumour oxygenation, the interaction between adjacent microvessel trees must not be neglected, which is why evaluation should be made at high resolution with the CTM applied to the entire tumour.
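The Green's-function machinery is beyond a short sketch, but the coupling of diffusion to Michaelis-Menten consumption can be illustrated in 1D with an explicit finite-difference relaxation: oxygen is held fixed at a vessel wall and consumed in the tissue beyond it. All parameter values are illustrative, not the paper's.

```python
def oxygen_profile(n=50, dx=10e-6, d_coef=2e-9, vmax=15.0, km=2.5,
                   p_vessel=40.0, dt=0.02, steps=20000):
    """1D diffusion with Michaelis-Menten consumption, relaxed toward
    steady state. p[0] is a fixed vessel-wall boundary; the far end is
    zero-flux. Returns the oxygen tension profile (illustrative units).
    Stability: d_coef * dt / dx**2 = 0.4 <= 0.5 for the defaults.
    """
    p = [0.0] * n
    p[0] = p_vessel
    for _ in range(steps):
        new = p[:]
        for i in range(1, n - 1):
            lap = (p[i - 1] - 2.0 * p[i] + p[i + 1]) / dx ** 2
            consumption = vmax * p[i] / (km + p[i])     # Michaelis-Menten
            new[i] = max(0.0, p[i] + dt * (d_coef * lap - consumption))
        new[-1] = new[-2]                               # zero-flux boundary
        p = new
    return p
```

The profile decays within roughly 100 µm of the vessel, which is why voxels served only by a distant tree become hypoxic when neighbouring trees are ignored.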
The Computing and Data Grid Approach: Infrastructure for Distributed Science Applications
NASA Technical Reports Server (NTRS)
Johnston, William E.
2002-01-01
With the advent of Grids - infrastructure for using and managing widely distributed computing and data resources in the science environment - there is now an opportunity to provide a standard, large-scale, computing, data, instrument, and collaboration environment for science that spans many different projects and provides the required infrastructure and services in a relatively uniform and supportable way. Grid technology has evolved over the past several years to provide the services and infrastructure needed for building 'virtual' systems and organizations. We argue that Grid technology provides an excellent basis for the creation of the integrated environments that can combine the resources needed to support the large-scale science projects located at multiple laboratories and universities. We present some science case studies that indicate that a paradigm shift in the process of science will come about as a result of Grids providing transparent and secure access to advanced and integrated information and technologies infrastructure: powerful computing systems, large-scale data archives, scientific instruments, and collaboration tools. These changes will be in the form of services that can be integrated with the user's work environment, and that enable uniform and highly capable access to these computers, data, and instruments, regardless of the location or exact nature of these resources. These services will integrate transient-use resources like computing systems, scientific instruments, and data caches (e.g., as they are needed to perform a simulation or analyze data from a single experiment); persistent-use resources, such as databases, data catalogues, and archives; and collaborators, whose involvement will continue for the lifetime of a project or longer. While we largely address large-scale science in this paper, Grids, particularly when combined with Web Services, will address a broad spectrum of science scenarios, both large and small scale.
Non-Gaussian Nature of Fracture and the Survival of Fat-Tail Exponents
NASA Astrophysics Data System (ADS)
Tallakstad, Ken Tore; Toussaint, Renaud; Santucci, Stephane; Måløy, Knut Jørgen
2013-04-01
We study the fluctuations of the global velocity Vl(t), computed at various length scales l, during the intermittent mode-I propagation of a crack front. The statistics converge to a non-Gaussian distribution, with an asymmetric shape and a fat tail. This breakdown of the central limit theorem (CLT) is due to the diverging variance of the underlying local crack front velocity distribution, displaying a power law tail. Indeed, by the application of a generalized CLT, the full shape of our experimental velocity distribution at large scale is shown to follow the stable Levy distribution, which preserves the power law tail exponent under upscaling. This study aims to demonstrate in general for crackling noise systems how one can infer the complete scale dependence of the activity—and extreme event distributions—by measuring only at a global scale.
Large-scale anisotropy of the cosmic microwave background radiation
NASA Technical Reports Server (NTRS)
Silk, J.; Wilson, M. L.
1981-01-01
Inhomogeneities in the large-scale distribution of matter inevitably lead to the generation of large-scale anisotropy in the cosmic background radiation. The dipole, quadrupole, and higher order fluctuations expected in an Einstein-de Sitter cosmological model have been computed. The dipole and quadrupole anisotropies are comparable to the measured values, and impose important constraints on the allowable spectrum of large-scale matter density fluctuations. A significant dipole anisotropy is generated by the matter distribution on scales greater than approximately 100 Mpc. The large-scale anisotropy is insensitive to the ionization history of the universe since decoupling, and cannot easily be reconciled with a galaxy formation theory that is based on primordial adiabatic density fluctuations.
Universal distribution of component frequencies in biological and technological systems
Pang, Tin Yau; Maslov, Sergei
2013-01-01
Bacterial genomes and large-scale computer software projects both consist of a large number of components (genes or software packages) connected via a network of mutual dependencies. Components can be easily added or removed from individual systems, and their use frequencies vary over many orders of magnitude. We study this frequency distribution in genomes of ∼500 bacterial species and in over 2 million Linux computers and find that in both cases it is described by the same scale-free power-law distribution with an additional peak near the tail of the distribution corresponding to nearly universal components. We argue that the existence of a power law distribution of frequencies of components is a general property of any modular system with a multilayered dependency network. We demonstrate that the frequency of a component is positively correlated with its dependency degree given by the total number of upstream components whose operation directly or indirectly depends on the selected component. The observed frequency/dependency degree distributions are reproduced in a simple mathematically tractable model introduced and analyzed in this study. PMID:23530195
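A common first check for such a distribution is a log-log fit of the rank-frequency curve. The minimal sketch below runs on exact synthetic Zipf data, where the slope is -1 by construction; real frequency data would call for more careful tail estimators.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# Synthetic Zipf-like usage frequencies: the component of rank r is used C / r times.
ranks = list(range(1, 501))
freqs = [10_000.0 / r for r in ranks]

slope = loglog_slope(ranks, freqs)
print(f"fitted power-law exponent: {slope:.3f}")  # exactly -1 for this data
```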
cOSPREY: A Cloud-Based Distributed Algorithm for Large-Scale Computational Protein Design
Pan, Yuchao; Dong, Yuxi; Zhou, Jingtian; Hallen, Mark; Donald, Bruce R.; Xu, Wei
2016-01-01
Abstract Finding the global minimum energy conformation (GMEC) of a huge combinatorial search space is the key challenge in computational protein design (CPD) problems. Traditional algorithms lack a scalable and efficient distributed design scheme, preventing researchers from taking full advantage of current cloud infrastructures. We design cloud OSPREY (cOSPREY), an extension of the widely used protein design software OSPREY, to allow the original design framework to scale to commercial cloud infrastructures. We propose several novel designs to integrate both algorithm and system optimizations, such as GMEC-specific pruning, state search partitioning, asynchronous algorithm state sharing, and fault tolerance. We evaluate cOSPREY on three different cloud platforms using different technologies and show that it can solve a number of large-scale protein design problems that have not been possible with previous approaches. PMID:27154509
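GMEC-specific pruning in the OSPREY family builds on dead-end elimination (DEE) criteria. Below is a minimal, self-contained sketch of the classic Goldstein DEE test on a toy two-position energy table; the energies are hypothetical and this is not cOSPREY's distributed implementation.

```python
def goldstein_prune(E1, E2, positions):
    """Prune rotamers by the Goldstein dead-end elimination test:
    rotamer r at position i is pruned if some competitor t always beats it,
    i.e. E1(i,r) - E1(i,t) + sum_j min_s [E2(i,r,j,s) - E2(i,t,j,s)] > 0.

    E1[(i, r)]       : self energy of rotamer r at position i
    E2[(i, r, j, s)] : pairwise energy (stored in both orders)
    positions[i]     : rotamers available at position i
    """
    pruned = set()
    for i, rots in positions.items():
        for r in rots:
            for t in rots:
                if t == r:
                    continue
                gap = E1[(i, r)] - E1[(i, t)]
                for j, jrots in positions.items():
                    if j != i:
                        gap += min(E2[(i, r, j, s)] - E2[(i, t, j, s)]
                                   for s in jrots)
                if gap > 0:   # r can never be part of the GMEC
                    pruned.add((i, r))
                    break
    return pruned

# Toy two-position design: rotamer 0 at position 0 is strictly worse.
positions = {0: [0, 1], 1: [0, 1]}
E1 = {(0, 0): 5.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.0}
E2 = {(i, r, j, s): 0.0
      for i, j in [(0, 1), (1, 0)] for r in [0, 1] for s in [0, 1]}
print(goldstein_prune(E1, E2, positions))  # {(0, 0)}
```

Because pruning decisions for different rotamers are independent, criteria like this one are natural candidates for distribution across cloud workers.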
Semantics-based distributed I/O with the ParaMEDIC framework.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Balaji, P.; Feng, W.; Lin, H.
2008-01-01
Many large-scale applications simultaneously rely on multiple resources for efficient execution. For example, such applications may require both large compute and storage resources; however, very few supercomputing centers can provide large quantities of both. Thus, data generated at the compute site oftentimes has to be moved to a remote storage site for either storage or visualization and analysis. Clearly, this is not an efficient model, especially when the two sites are distributed over a wide-area network. Thus, we present a framework called 'ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing' which uses application-specific semantic information to convert the generated data to orders-of-magnitude smaller metadata at the compute site, transfer the metadata to the storage site, and re-process the metadata at the storage site to regenerate the output. Specifically, ParaMEDIC trades a small amount of additional computation (in the form of data post-processing) for a potentially significant reduction in data that needs to be transferred in distributed environments.
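The ParaMEDIC trade-off (a little extra computation in exchange for far less wide-area data movement) can be caricatured in a few lines. In this toy, the "semantic metadata" is simply the parameters needed to regenerate a deterministic output; it stands in for the application-specific metadata the real framework extracts.

```python
import random

def generate_data(seed, n):
    """Stand-in for an expensive compute-site job producing a large output."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

# Compute site: run the job, but ship only tiny semantic metadata.
seed, n = 1234, 100_000
data = generate_data(seed, n)
metadata = {"seed": seed, "n": n}   # orders of magnitude smaller than `data`

# Storage site: spend a little extra computation to regenerate the full output.
regenerated = generate_data(metadata["seed"], metadata["n"])
assert regenerated == data
print(f"shipped {len(metadata)} metadata fields instead of {n:,} values")
```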
Scheduling based on a dynamic resource connection
NASA Astrophysics Data System (ADS)
Nagiyev, A. E.; Botygin, I. A.; Shersntneva, A. I.; Konyaev, P. A.
2017-02-01
The practical use of distributed computing systems is associated with many problems, including the organization of effective interaction between agents located at the nodes of the system, the configuration of each node to perform a specific task, the effective distribution of the system's available information and computational resources, and the control of the multithreading that implements the logic of the research problems being solved. The article describes a method of computational load balancing in distributed automatic systems oriented towards multi-agent and multi-threaded data processing. A scheme for controlling the processing of requests from terminal devices is offered, providing effective dynamic scaling of computing power under peak load. The results of model experiments on the developed load scheduling algorithm are set out. These results show the effectiveness of the algorithm even with a significant expansion in the number of connected nodes and growth in the architecture of the distributed computing system.
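A minimal sketch of the idea (not the article's algorithm): dispatch each incoming request to the currently least-loaded node via a heap, and bring a new node online when even the least-loaded node is past a threshold, a toy stand-in for dynamic scaling under peak load.

```python
import heapq

def schedule(requests, n_nodes, scale_threshold):
    """Greedy least-loaded dispatch with a toy dynamic scale-out rule.

    requests: iterable of request costs. Returns {node_id: total_load}.
    A fresh node is added whenever even the least-loaded node is
    already above scale_threshold.
    """
    heap = [(0.0, node) for node in range(n_nodes)]  # (load, node_id)
    heapq.heapify(heap)
    loads = {node: 0.0 for node in range(n_nodes)}
    next_node = n_nodes
    for cost in requests:
        load, node = heapq.heappop(heap)
        if load > scale_threshold:        # every node is past the threshold
            heapq.heappush(heap, (load, node))
            load, node = 0.0, next_node   # bring a new node online
            next_node += 1
        loads[node] = load + cost
        heapq.heappush(heap, (loads[node], node))
    return loads

loads = schedule([5, 3, 8, 2, 7, 4, 6, 1], n_nodes=2, scale_threshold=10)
print(loads)  # a third node is spun up once both initial nodes exceed 10
```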
A rapid local singularity analysis algorithm with applications
NASA Astrophysics Data System (ADS)
Chen, Zhijun; Cheng, Qiuming; Agterberg, Frits
2015-04-01
The local singularity model developed by Cheng is fast gaining popularity in characterizing mineralization and detecting anomalies in geochemical, geophysical and remote sensing data. However, the conventional algorithm, which involves computing moving-average values at different scales, is time-consuming, especially when analyzing large datasets. The summed area table (SAT), also called an integral image, is a fast algorithm used within the Viola-Jones object detection framework in computer vision. Historically, the principle of the SAT is well known in the study of multi-dimensional probability distribution functions, namely in computing 2D (or ND) probabilities (the area under the probability distribution) from the respective cumulative distribution functions. In this study we introduce the SAT and its variant, the rotated summed area table (RSAT), into isotropic, anisotropic and directional local singularity mapping. Once the SAT has been computed, any rectangular sum can be evaluated at any scale or location in constant time. The sum over any rectangular region in the image can be computed using only four array accesses, in constant time independent of the size of the region, effectively reducing the time complexity from O(n) to O(1). New programs in Python, Julia, MATLAB and C++ are implemented to serve different applications, especially big data analysis. Several large geochemical and remote sensing datasets are tested. A wide variety of scale changes (linear or logarithmic spacing) for the non-iterative and iterative approaches are adopted to calculate the singularity index values and compare the results. The results indicate that local singularity analysis with the SAT is more robust than the traditional approach in identifying anomalies.
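The constant-time rectangle sum can be sketched as follows: a minimal 2-D SAT in pure Python, illustrating the four-access lookup rather than reproducing the paper's implementations.

```python
def summed_area_table(img):
    """S[i][j] = sum of img[0..i-1][0..j-1]; built in one O(rows * cols) pass."""
    rows, cols = len(img), len(img[0])
    S = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        row_sum = 0
        for j in range(cols):
            row_sum += img[i][j]
            S[i + 1][j + 1] = S[i][j + 1] + row_sum
    return S

def rect_sum(S, r0, c0, r1, c1):
    """Sum over img[r0..r1][c0..c1] with exactly four table accesses, O(1)."""
    return S[r1 + 1][c1 + 1] - S[r0][c1 + 1] - S[r1 + 1][c0] + S[r0][c0]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
S = summed_area_table(img)
print(rect_sum(S, 1, 1, 2, 2))  # 5 + 6 + 8 + 9 = 28
```

Because every window sum costs the same regardless of window size, moving averages at all scales, and hence the singularity index, can be evaluated without re-scanning the data.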
2014-01-01
Background Small RNAs are important regulators of genome function, yet their prediction in genomes is still a major computational challenge. Statistical analyses of pre-miRNA sequences indicated that their 2D structure tends to have a minimal free energy (MFE) significantly lower than the MFE values of equivalently randomized sequences with the same nucleotide composition, in contrast to other classes of non-coding RNA. The computation of many MFEs is, however, too intensive to allow for genome-wide screenings. Results Using a local grid infrastructure, MFE distributions of random sequences were pre-calculated on a large scale. These distributions follow a normal distribution and can be used to determine the MFE distribution for any given sequence composition by interpolation, allowing on-the-fly calculation of the normal distribution for any candidate sequence composition. Conclusion The speedup achieved makes genome-wide screening with this characteristic of a pre-miRNA sequence practical. Although this property alone will not be sufficiently discriminative to distinguish miRNAs from other sequences, the MFE-based P-value should be added to the parameters of choice to be included in the selection of potential miRNA candidates for experimental verification. PMID:24418292
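Once the normal distribution for a candidate's composition is available (by interpolation from the pre-computed table), the P-value is a standard normal CDF evaluation. A sketch with hypothetical numbers, since the actual interpolated means and standard deviations come from the pre-calculated grid results:

```python
import math

def mfe_p_value(mfe, mu, sigma):
    """P(MFE_random <= mfe) under the pre-computed normal N(mu, sigma^2):
    the fraction of randomized sequences with an MFE at least this low."""
    z = (mfe - mu) / sigma
    return 0.5 * math.erfc(-z / math.sqrt(2.0))  # standard normal CDF at z

# Hypothetical candidate: MFE of -40 kcal/mol, while randomized sequences
# of the same composition have mu = -25 and sigma = 5 (i.e. z = -3).
p = mfe_p_value(-40.0, mu=-25.0, sigma=5.0)
print(f"P-value: {p:.2e}")
```

A small P-value flags a candidate whose structure is unusually stable for its composition, the property the screening exploits.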
Standard Model parton distributions at very high energies
Bauer, Christian W.; Ferland, Nicolas; Webber, Bryan R.
2017-08-09
We compute the leading-order evolution of parton distribution functions for all the Standard Model fermions and bosons up to energy scales far above the electroweak scale, where electroweak symmetry is restored. Our results include the 52 PDFs of the unpolarized proton, evolving according to the SU(3), SU(2), U(1), mixed SU(2)×U(1) and Yukawa interactions. We illustrate the numerical effects on parton distributions at large energies, and show that this can lead to important corrections to parton luminosities at a future 100 TeV collider.
NASA Astrophysics Data System (ADS)
Dednam, W.; Botha, A. E.
2015-01-01
Solvation of bio-molecules in water is severely affected by the presence of co-solvent within the hydration shell of the solute structure. Furthermore, since solute molecules can range from small molecules, such as methane, to very large protein structures, it is imperative to understand the detailed structure-function relationship on the microscopic level. For example, it is useful to know the conformational transitions that occur in protein structures. Although such an understanding can be obtained through large-scale molecular dynamics simulations, such simulations often require excessively long simulation times. In this context, Kirkwood-Buff theory, which connects the microscopic pair-wise molecular distributions to global thermodynamic properties, together with the recently developed technique of finite size scaling, may provide a better method to reduce system sizes, and hence also the computational times. In this paper, we present molecular dynamics trial simulations of biologically relevant low-concentration solvents, solvated by aqueous co-solvent solutions. In particular we compare two different methods of calculating the relevant Kirkwood-Buff integrals. The first (traditional) method computes running integrals over the radial distribution functions, which must be obtained from large system-size NVT or NpT simulations. The second, newer method employs finite size scaling to obtain the Kirkwood-Buff integrals directly by counting the particle number fluctuations in small, open sub-volumes embedded within a larger reservoir that can be well approximated by a much smaller simulation cell.
In agreement with previous studies, which made a similar comparison for aqueous co-solvent solutions, without the additional solvent, we conclude that the finite size scaling method is also applicable to the present case, since it can produce computationally more efficient results which are equivalent to the more costly radial distribution function method.
Probabilistic simulation of multi-scale composite behavior
NASA Technical Reports Server (NTRS)
Liaw, D. G.; Shiao, M. C.; Singhal, S. N.; Chamis, Christos C.
1993-01-01
A methodology is developed to computationally assess the probabilistic composite material properties at all composite scale levels due to the uncertainties in the constituent (fiber and matrix) properties and in the fabrication process variables. The methodology is computationally efficient for simulating the probability distributions of material properties. The sensitivity of the probabilistic composite material property to each random variable is determined. This information can be used to reduce undesirable uncertainties in material properties at the macro scale of the composite by reducing the uncertainties in the most influential random variables at the micro scale. This methodology was implemented into the computer code PICAN (Probabilistic Integrated Composite ANalyzer). The accuracy and efficiency of this methodology are demonstrated by simulating the uncertainties in the material properties of a typical laminate and comparing the results with the Monte Carlo simulation method. The experimental data of composite material properties at all scales fall within the scatters predicted by PICAN.
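As a hedged illustration of this kind of probabilistic simulation, the sketch below propagates constituent-level uncertainty to a macro-scale property by Monte Carlo, using a simple rule-of-mixtures micromechanics model and made-up property statistics; it is not the PICAN formulation, which the paper compares against Monte Carlo.

```python
import random

def composite_modulus(Ef, Em, Vf):
    """Rule of mixtures for the longitudinal modulus of a ply (GPa)."""
    return Vf * Ef + (1.0 - Vf) * Em

random.seed(7)
n = 50_000
samples = [
    composite_modulus(
        Ef=random.gauss(230.0, 10.0),   # fiber modulus: hypothetical scatter
        Em=random.gauss(3.5, 0.3),      # matrix modulus: hypothetical scatter
        Vf=random.gauss(0.60, 0.02),    # fiber volume fraction: process variable
    )
    for _ in range(n)
]
mean = sum(samples) / n
std = (sum((x - mean) ** 2 for x in samples) / (n - 1)) ** 0.5
print(f"E_composite ~ {mean:.1f} +/- {std:.1f} GPa")
```

Repeating the run with one input held fixed shows that input's share of the output scatter, which is the sensitivity information the abstract describes using to target the most influential micro-scale variables.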
Scale-free Graphs for General Aviation Flight Schedules
NASA Technical Reports Server (NTRS)
Alexandov, Natalia M. (Technical Monitor); Kincaid, Rex K.
2003-01-01
In the late 1990s a number of researchers noticed that networks in biology, sociology, and telecommunications exhibited similar characteristics unlike standard random networks. In particular, they found that the cumulative degree distributions of these graphs followed a power law rather than a binomial distribution, and that their clustering coefficients tended to a nonzero constant as the number of nodes, n, became large, rather than O(1/n). Moreover, these networks shared an important property with traditional random graphs: as n becomes large, the average shortest path length scales with log n. This latter property has been coined the small-world property. Taken together, these three properties (small-world, power law, and constant clustering coefficient) describe what are now most commonly referred to as scale-free networks. Since 1997 at least six books and over 400 articles have been written about scale-free networks. This manuscript gives an overview of the salient characteristics of scale-free networks. Computational experience will be provided for two mechanisms that grow (dynamic) scale-free graphs. Additional computational experience will be given for constructing (static) scale-free graphs via a tabu search optimization approach. Finally, a discussion of potential applications to general aviation networks is given.
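One standard mechanism for growing such graphs is preferential attachment. A minimal Barabási-Albert-style sketch (not necessarily either of the manuscript's two growth mechanisms, and not its tabu-search construction): each new node attaches to existing nodes with probability proportional to their current degree, producing the heavy-tailed degree distribution.

```python
import random

def preferential_attachment(n, m, seed=0):
    """Grow a scale-free graph: each new node attaches m edges to existing
    nodes chosen proportionally to their current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]        # seed graph: a single edge
    targets = [0, 1]        # each node appears once per unit of degree
    for new in range(2, n):
        chosen = set()
        while len(chosen) < min(m, new):
            chosen.add(rng.choice(targets))   # degree-proportional sampling
        for old in chosen:
            edges.append((new, old))
            targets += [new, old]
    return edges

edges = preferential_attachment(2000, m=2)
deg = {}
for u, v in edges:
    deg[u] = deg.get(u, 0) + 1
    deg[v] = deg.get(v, 0) + 1
degrees = sorted(deg.values())
print(f"median degree: {degrees[len(degrees) // 2]}, max degree: {degrees[-1]}")
```

The hub-dominated degree sequence (max degree far above the median) is the signature of the power-law tail, in contrast to an Erdős-Rényi graph of the same size.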
Visualization, documentation, analysis, and communication of large scale gene regulatory networks
Longabaugh, William J.R.; Davidson, Eric H.; Bolouri, Hamid
2009-01-01
Summary Genetic regulatory networks (GRNs) are complex, large-scale, and spatially and temporally distributed. These characteristics impose challenging demands on computational GRN modeling tools, and there is a need for custom modeling tools. In this paper, we report on our ongoing development of BioTapestry, an open source, freely available computational tool designed specifically for GRN modeling. We also outline our future development plans, and give some examples of current applications of BioTapestry. PMID:18757046
Extracting Useful Semantic Information from Large Scale Corpora of Text
ERIC Educational Resources Information Center
Mendoza, Ray Padilla, Jr.
2012-01-01
Extracting and representing semantic information from large scale corpora is at the crux of computer-assisted knowledge generation. Semantic information depends on collocation extraction methods, mathematical models used to represent distributional information, and weighting functions which transform the space. This dissertation provides a…
Offdiagonal complexity: A computationally quick complexity measure for graphs and networks
NASA Astrophysics Data System (ADS)
Claussen, Jens Christian
2007-02-01
A vast variety of biological, social, and economical networks shows topologies drastically differing from random graphs; yet the quantitative characterization remains unsatisfactory from a conceptual point of view. Motivated from the discussion of small scale-free networks, a biased link distribution entropy is defined, which takes an extremum for a power-law distribution. This approach is extended to the node-node link cross-distribution, whose nondiagonal elements characterize the graph structure beyond link distribution, cluster coefficient and average path length. From here a simple (and computationally cheap) complexity measure can be defined. This offdiagonal complexity (OdC) is proposed as a novel measure to characterize the complexity of an undirected graph, or network. While both for regular lattices and fully connected networks OdC is zero, it takes a moderately low value for a random graph and shows high values for apparently complex structures as scale-free networks and hierarchical trees. The OdC approach is applied to the Helicobacter pylori protein interaction network and randomly rewired surrogates.
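A minimal OdC-style measure can be sketched as follows; this simplified variant takes the entropy of the diagonal-offset distribution of the degree-degree edge matrix (here reduced to |deg(u) - deg(v)| per edge), which, as in the abstract, vanishes for regular graphs and grows with structural heterogeneity. It is an illustrative reduction, not the paper's exact definition.

```python
import math
from collections import Counter

def offdiagonal_complexity(edges):
    """Entropy of the distribution of |deg(u) - deg(v)| over edges,
    i.e. of the diagonal offsets of the node-node degree
    cross-distribution. Zero for regular graphs."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    offsets = Counter(abs(deg[u] - deg[v]) for u, v in edges)
    total = sum(offsets.values())
    return -sum((c / total) * math.log(c / total)
                for c in offsets.values()) + 0.0  # + 0.0 avoids printing -0.0

ring = [(i, (i + 1) % 6) for i in range(6)]   # 6-cycle: every degree is 2
chain = [(0, 1), (1, 2), (2, 3)]              # path: degrees 1, 2, 2, 1
print(offdiagonal_complexity(ring))    # 0.0 for a regular lattice
print(offdiagonal_complexity(chain))   # positive for mixed degrees
```

The computation is a single pass over the edge list, which is what makes measures of this family computationally cheap compared with path-based complexity measures.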
Growth models and the expected distribution of fluctuating asymmetry
Graham, John H.; Shimizu, Kunio; Emlen, John M.; Freeman, D. Carl; Merkel, John
2003-01-01
Multiplicative error accounts for much of the size-scaling and leptokurtosis in fluctuating asymmetry. It arises when growth involves the addition of tissue to that which is already present. Such errors are lognormally distributed. The distribution of the difference between two lognormal variates is leptokurtic. If those two variates are correlated, then the asymmetry variance will scale with size. Inert tissues typically exhibit additive error and have a gamma distribution. Although their asymmetry variance does not exhibit size-scaling, the distribution of the difference between two gamma variates is nevertheless leptokurtic. Measurement error is also additive, but has a normal distribution. Thus, the measurement of fluctuating asymmetry may involve the mixing of additive and multiplicative error. When errors are multiplicative, we recommend computing log E(l) − log E(r), the difference between the logarithms of the expected values of left and right sides, even when size-scaling is not obvious. If l and r are lognormally distributed, and measurement error is nil, the resulting distribution will be normal, and multiplicative error will not confound size-related changes in asymmetry. When errors are additive, such a transformation to remove size-scaling is unnecessary. Nevertheless, the distribution of l − r may still be leptokurtic.
Coalescence computations for large samples drawn from populations of time-varying sizes
Polanski, Andrzej; Szczesna, Agnieszka; Garbulowski, Mateusz; Kimmel, Marek
2017-01-01
We present new results concerning probability distributions of times in the coalescence tree and expected allele frequencies for coalescent with large sample size. The obtained results are based on computational methodologies, which involve combining coalescence time scale changes with techniques of integral transformations and using analytical formulae for infinite products. We show applications of the proposed methodologies for computing probability distributions of times in the coalescence tree and their limits, for evaluation of accuracy of approximate expressions for times in the coalescence tree and expected allele frequencies, and for analysis of large human mitochondrial DNA dataset. PMID:28170404
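For the baseline constant-size Kingman coalescent, the building-block quantities have simple closed forms; the paper's contribution is the much harder time-varying-size, large-sample case. A small exact sketch of the constant-size formulas:

```python
from fractions import Fraction

def expected_coalescence_times(n):
    """E[T_k]: expected time (in units of 2N generations) during which the
    Kingman coalescent tree has exactly k lineages: E[T_k] = 2 / (k (k - 1))."""
    return {k: Fraction(2, k * (k - 1)) for k in range(2, n + 1)}

n = 10
times = expected_coalescence_times(n)
tmrca = sum(times.values())
# The sum telescopes: E[T_MRCA] = 2 (1 - 1/n), approaching 2 as n grows.
print(f"E[T_MRCA] for n={n}: {tmrca} = {float(tmrca):.3f}")
```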
Distributed-Memory Computing With the Langley Aerothermodynamic Upwind Relaxation Algorithm (LAURA)
NASA Technical Reports Server (NTRS)
Riley, Christopher J.; Cheatwood, F. McNeil
1997-01-01
The Langley Aerothermodynamic Upwind Relaxation Algorithm (LAURA), a Navier-Stokes solver, has been modified for use in a parallel, distributed-memory environment using the Message-Passing Interface (MPI) standard. A standard domain decomposition strategy is used in which the computational domain is divided into subdomains with each subdomain assigned to a processor. Performance is examined on dedicated parallel machines and a network of desktop workstations. The effect of domain decomposition and frequency of boundary updates on performance and convergence is also examined for several realistic configurations and conditions typical of large-scale computational fluid dynamic analysis.
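The decomposition bookkeeping can be sketched in a few lines; this is a generic balanced 1-D partition with remainder distribution, not LAURA's actual multi-block MPI code.

```python
def decompose(n_cells, n_procs):
    """Split n_cells contiguous cells into n_procs balanced subdomains,
    handing the remainder out one cell at a time to the first ranks."""
    base, extra = divmod(n_cells, n_procs)
    bounds, start = [], 0
    for rank in range(n_procs):
        size = base + (1 if rank < extra else 0)
        bounds.append((start, start + size))  # half-open [start, end)
        start += size
    return bounds

bounds = decompose(n_cells=1000, n_procs=7)
print(bounds)  # first 6 ranks get 143 cells, the last gets 142
```

In the full solver, each subdomain would additionally carry halo (ghost) cells, and the frequency with which those boundary values are exchanged is exactly the update-frequency knob whose effect on convergence the abstract examines.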
NASA Astrophysics Data System (ADS)
Ajami, H.; Sharma, A.; Lakshmi, V.
2017-12-01
Application of semi-distributed hydrologic modeling frameworks is a viable alternative to fully distributed hyper-resolution hydrologic models due to their computational efficiency, while still resolving the fine-scale spatial structure of hydrologic fluxes and states. However, the fidelity of semi-distributed model simulations is impacted by (1) the formulation of hydrologic response units (HRUs), and (2) the aggregation of catchment properties for formulating simulation elements. Here, we evaluate the performance of a recently developed Soil Moisture and Runoff simulation Toolkit (SMART) for large catchment scale simulations. In SMART, topologically connected HRUs are delineated using thresholds obtained from topographic and geomorphic analysis of a catchment, and simulation elements are equivalent cross sections (ECS) representative of a hillslope in first order sub-basins. Earlier investigations have shown that formulation of ECSs at the scale of a first order sub-basin reduces computational time significantly without compromising simulation accuracy. However, the implementation of this approach has not been fully explored for catchment scale simulations. To assess SMART performance, we set up the model over the Little Washita watershed in Oklahoma. Model evaluations using in-situ soil moisture observations show satisfactory model performance. In addition, we evaluated the performance of a number of soil moisture disaggregation schemes recently developed to provide spatially explicit soil moisture outputs at fine scale resolution. Our results illustrate that the statistical disaggregation scheme performs significantly better than the methods based on topographic data. Future work is focused on assessing the performance of SMART using remotely sensed soil moisture observations using spatially based model evaluation metrics.
Estimation of the vortex length scale and intensity from two-dimensional samples
NASA Technical Reports Server (NTRS)
Reuss, D. L.; Cheng, W. P.
1992-01-01
A method is proposed for estimating flow features that influence flame wrinkling in reciprocating internal combustion engines, where traditional statistical measures of turbulence are suspect. Candidate methods were tested in a computed channel flow where traditional turbulence measures are valid and performance can be rationally evaluated. Two concepts are tested. First, spatial filtering is applied to the two-dimensional velocity distribution and found to reveal structures corresponding to the vorticity field. Decreasing the spatial-frequency cutoff of the filter locally changes the character and size of the flow structures that are revealed by the filter. Second, the vortex length scale and intensity are estimated by computing the ensemble-average velocity distribution conditionally sampled on the vorticity peaks. The resulting conditionally sampled 'average vortex' has a peak velocity less than half the rms velocity and a size approximately equal to the two-point-correlation integral length scale.
Cormode, Graham; Dasgupta, Anirban; Goyal, Amit; Lee, Chi Hoon
2018-01-01
Many modern applications of AI such as web search, mobile browsing, image processing, and natural language processing rely on finding similar items from a large database of complex objects. Due to the very large scale of data involved (e.g., users' queries from commercial search engines), computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (a.k.a. LSH) methods and evaluate four variants in a distributed computing environment (specifically, Hadoop). We identify several optimizations which improve performance, suitable for deployment in very large scale settings. The experimental results demonstrate that our variants of LSH achieve robust performance, with better recall compared with "vanilla" LSH, even when using the same amount of space.
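One classic LSH family for cosine similarity is random hyperplane hashing (SimHash): each random hyperplane contributes one signature bit, so similar vectors collide on most bits. This is a generic sketch and not necessarily one of the four variants the paper evaluates.

```python
import random

def random_planes(dim, n_bits, seed=0):
    """n_bits random hyperplanes in dim dimensions (Gaussian normals)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

def signature(vec, planes):
    """LSH for cosine similarity: one bit per hyperplane (side of the plane)."""
    return tuple(1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
                 for plane in planes)

planes = random_planes(dim=4, n_bits=16)
a = [1.0, 2.0, 3.0, 4.0]
b = [1.1, 2.0, 2.9, 4.2]      # nearly parallel to a: few bits differ
c = [-1.0, -2.0, -3.0, -4.0]  # antipodal to a: every bit flips

ham = lambda s, t: sum(x != y for x, y in zip(s, t))
print(ham(signature(a, planes), signature(b, planes)))  # small
print(ham(signature(a, planes), signature(c, planes)))  # 16
```

In a distributed setting such as Hadoop, signatures (or bands of them) serve as shuffle keys, so candidate near neighbors land on the same reducer and only colliding items are compared exactly.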
Spatially explicit spectral analysis of point clouds and geospatial data
Buscombe, Daniel D.
2015-01-01
The increasing use of spatially explicit analyses of high-resolution spatially distributed data (imagery and point clouds) for the purposes of characterising spatial heterogeneity in geophysical phenomena necessitates the development of custom analytical and computational tools. In recent years, such analyses have become the basis of, for example, automated texture characterisation and segmentation, roughness and grain size calculation, and feature detection and classification, from a variety of data types. In this work, much use has been made of statistical descriptors of localised spatial variations in amplitude variance (roughness); however, the horizontal scale (wavelength) and spacing of roughness elements are rarely considered. This is despite the fact that the ratio of characteristic vertical to horizontal scales is not constant and can yield important information about physical scaling relationships. Spectral analysis is a hitherto under-utilised but powerful means to acquire statistical information about relevant amplitude and wavelength scales, simultaneously and with computational efficiency. Further, quantifying spatially distributed data in the frequency domain lends itself to the development of stochastic models for probing the underlying mechanisms which govern the spatial distribution of geological and geophysical phenomena. The software package PySESA (Python program for Spatially Explicit Spectral Analysis) has been developed for generic analyses of spatially distributed data in both the spatial and frequency domains. Developed predominantly in Python, it accesses libraries written in Cython and C++ for efficiency. It is open source and modular, therefore readily incorporated into, and combined with, other data analysis tools and frameworks with particular utility for supporting research in the fields of geomorphology, geophysics, hydrography, photogrammetry and remote sensing.
The analytical and computational structure of the toolbox is described, and its functionality illustrated with an example of a high-resolution bathymetric point cloud data collected with multibeam echosounder.
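The frequency-domain idea can be sketched generically: a 1-D periodogram via a naive DFT on a synthetic roughness profile, recovering both amplitude (via Parseval) and wavelength information. PySESA's actual implementation operates on detrended 2-D point-cloud patches, which this toy does not reproduce.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform, O(n^2): fine for a short profile."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]

# A synthetic 1-D roughness profile: two wavelengths plus a mean offset.
n = 64
profile = [1.0 + 0.5 * math.sin(2 * math.pi * 4 * t / n)
               + 0.2 * math.sin(2 * math.pi * 13 * t / n) for t in range(n)]

X = dft(profile)
power = [abs(c) ** 2 / n for c in X]

# Parseval: total spectral power equals the sum of squared samples.
assert abs(sum(power) - sum(v * v for v in profile)) < 1e-6

# The two dominant non-zero frequency bins recover the input wavelengths.
peaks = sorted(range(1, n // 2), key=lambda f: power[f], reverse=True)[:2]
print(f"dominant frequency bins: {sorted(peaks)}")  # bins 4 and 13
```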
A Development of Lightweight Grid Interface
NASA Astrophysics Data System (ADS)
Iwai, G.; Kawai, Y.; Sasaki, T.; Watase, Y.
2011-12-01
In order to help the rapid development of Grid/Cloud aware applications, we have developed an API to abstract distributed computing infrastructures based on SAGA (A Simple API for Grid Applications). SAGA, which is standardized in the OGF (Open Grid Forum), defines API specifications to access distributed computing infrastructures, such as Grid, Cloud and local computing resources. The Universal Grid API (UGAPI), which is a set of command line interfaces (CLI) and APIs, aims to offer a simpler API combining several SAGA interfaces with richer functionality. The CLIs of the UGAPI offer typical functionalities required by end users for job management and file access to the different distributed computing infrastructures as well as local computing resources. We have also built a web interface for particle therapy simulation and demonstrated large-scale calculation using the different infrastructures at the same time. In this paper, we present how the web interface based on UGAPI and SAGA achieves more efficient utilization of computing resources over the different infrastructures, with technical details and practical experiences.
Beyond Scale-Free Small-World Networks: Cortical Columns for Quick Brains
NASA Astrophysics Data System (ADS)
Stoop, Ralph; Saase, Victor; Wagner, Clemens; Stoop, Britta; Stoop, Ruedi
2013-03-01
We study to what extent cortical columns with their particular wiring boost neural computation. Upon a vast survey of columnar networks performing various real-world cognitive tasks, we detect no signs of enhancement. It is on a mesoscopic—intercolumnar—scale that the existence of columns, largely irrespective of their inner organization, enhances the speed of information transfer and minimizes the total wiring length required to bind distributed columnar computations towards spatiotemporally coherent results. We suggest that brain efficiency may be related to a doubly fractal connectivity law, resulting in networks with efficiency properties beyond those by scale-free networks.
TeleMed: Wide-area, secure, collaborative object computing with Java and CORBA for healthcare
DOE Office of Scientific and Technical Information (OSTI.GOV)
Forslund, D.W.; George, J.E.; Gavrilov, E.M.
1998-12-31
Distributed computing is becoming commonplace in a variety of industries, with healthcare being a particularly important one for society. The authors describe the development and deployment of TeleMed in a few healthcare domains. TeleMed is a 100% Java distributed application built on CORBA and OMG standards, enabling collaboration on the treatment of chronically ill patients in a secure manner over the Internet. These standards enable other systems to work interoperably with TeleMed and provide transparent access to high performance distributed computing to the healthcare domain. The goal of wide scale integration of electronic medical records is a grand-challenge scale problem of global proportions with far-reaching social benefits.
Connecting Performance Analysis and Visualization to Advance Extreme Scale Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bremer, Peer-Timo; Mohr, Bernd; Schulz, Martin
2015-07-29
The characterization, modeling, analysis, and tuning of software performance has been a central topic in High Performance Computing (HPC) since its early beginnings. The overall goal is to make HPC software run faster on particular hardware, either through better scheduling, on-node resource utilization, or more efficient distributed communication.
Reproducible Large-Scale Neuroimaging Studies with the OpenMOLE Workflow Management System.
Passerat-Palmbach, Jonathan; Reuillon, Romain; Leclaire, Mathieu; Makropoulos, Antonios; Robinson, Emma C; Parisot, Sarah; Rueckert, Daniel
2017-01-01
OpenMOLE is a scientific workflow engine with a strong emphasis on workload distribution. Workflows are designed using a high level Domain Specific Language (DSL) built on top of Scala. It exposes natural parallelism constructs to easily delegate the workload resulting from a workflow to a wide range of distributed computing environments. OpenMOLE hides the complexity of designing complex experiments thanks to its DSL. Users can embed their own applications and scale their pipelines from a small prototype running on their desktop computer to a large-scale study harnessing distributed computing infrastructures, simply by changing a single line in the pipeline definition. The construction of the pipeline itself is decoupled from the execution context. The high-level DSL abstracts the underlying execution environment, contrary to classic shell-script based pipelines. These two aspects allow pipelines to be shared and studies to be replicated across different computing environments. Workflows can be run as traditional batch pipelines or coupled with OpenMOLE's advanced exploration methods in order to study the behavior of an application, or perform automatic parameter tuning. In this work, we briefly present the strong assets of OpenMOLE and detail recent improvements targeting re-executability of workflows across various Linux platforms. We have tightly coupled OpenMOLE with CARE, a standalone containerization solution that allows re-executing on a Linux host any application that has been packaged on another Linux host previously. The solution is evaluated against a Python-based pipeline involving packages such as scikit-learn as well as binary dependencies. All were packaged and re-executed successfully on various HPC environments, with identical numerical results (here prediction scores) obtained on each environment. Our results show that the pair formed by OpenMOLE and CARE is a reliable solution to generate reproducible results and re-executable pipelines. 
A demonstration of the flexibility of our solution showcases three neuroimaging pipelines harnessing distributed computing environments as heterogeneous as local clusters or the European Grid Infrastructure (EGI).
PMID:28381997
NASA Technical Reports Server (NTRS)
Kato, S.; Smith, G. L.; Barker, H. W.
2001-01-01
An algorithm is developed for the gamma-weighted discrete ordinate two-stream approximation that computes profiles of domain-averaged shortwave irradiances for horizontally inhomogeneous cloudy atmospheres. The algorithm assumes that frequency distributions of cloud optical depth at unresolved scales can be represented by a gamma distribution, though it neglects net horizontal transport of radiation. This algorithm is an alternative to the one used in earlier studies that adopted the adding method. At present, only overcast cloudy layers are permitted.
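The key assumption above, that unresolved cloud optical depth follows a gamma distribution, can be illustrated with a minimal sketch. The code below is not the paper's algorithm: it only computes a domain-averaged direct-beam transmittance by weighting exp(-tau) with a gamma distribution and compares it with the plane-parallel value obtained from the mean optical depth alone. The shape parameter nu, the integration limit, and the midpoint-rule quadrature are assumptions for the demo.

```python
import math

def gamma_pdf(tau, mean_tau, nu):
    """Gamma distribution of optical depth with mean `mean_tau` and shape `nu`."""
    beta = nu / mean_tau  # rate parameter chosen so the mean equals mean_tau
    return beta**nu * tau**(nu - 1) * math.exp(-beta * tau) / math.gamma(nu)

def gamma_weighted_transmittance(mean_tau, nu, n=20000, tau_max=60.0):
    """Domain-averaged direct transmittance: integrate exp(-tau) over the gamma pdf."""
    dt = tau_max / n
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * dt  # midpoint rule
        total += math.exp(-tau) * gamma_pdf(tau, mean_tau, nu) * dt
    return total

# Plane-parallel homogeneous estimate uses the mean optical depth directly.
mean_tau, nu = 5.0, 1.0
t_gamma = gamma_weighted_transmittance(mean_tau, nu)
t_pph = math.exp(-mean_tau)
# The gamma-weighted average transmits more than the homogeneous estimate,
# because transmittance is convex in optical depth (Jensen's inequality).
```

For this weighting the closed form is (1 + mean_tau/nu)^(-nu), so with mean_tau = 5 and nu = 1 the averaged transmittance is 1/6, far larger than exp(-5); this is exactly the plane-parallel bias that gamma-weighted schemes are built to remove.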
Workflow management in large distributed systems
NASA Astrophysics Data System (ADS)
Legrand, I.; Newman, H.; Voicu, R.; Dobre, C.; Grigoras, C.
2011-12-01
The MonALISA (Monitoring Agents using a Large Integrated Services Architecture) framework provides a distributed service system capable of controlling and optimizing large-scale, data-intensive applications. An essential part of managing large-scale, distributed data-processing facilities is a monitoring system for computing facilities, storage, networks, and the very large number of applications running on these systems in near real time. All this monitoring information gathered for all the subsystems is essential for developing the required higher-level services—the components that provide decision support and some degree of automated decisions—and for maintaining and optimizing workflow in large-scale distributed systems. These management and global optimization functions are performed by higher-level agent-based services. We present several applications of MonALISA's higher-level services including optimized dynamic routing, control, data-transfer scheduling, distributed job scheduling, dynamic allocation of storage resource to running jobs and automated management of remote services among a large set of grid facilities.
NASA Technical Reports Server (NTRS)
Johnston, William E.; Gannon, Dennis; Nitzberg, Bill; Feiereisen, William (Technical Monitor)
2000-01-01
The term "Grid" refers to distributed, high performance computing and data handling infrastructure that incorporates geographically and organizationally dispersed, heterogeneous resources that are persistent and supported. The vision for NASA's Information Power Grid - a computing and data Grid - is that it will provide significant new capabilities to scientists and engineers by facilitating routine construction of information-based problem solving environments / frameworks that will knit together widely distributed computing, data, instrument, and human resources into just-in-time systems that can address complex and large-scale computing and data analysis problems. IPG development and deployment is addressing requirements obtained by analyzing a number of different application areas, in particular from the NASA Aero-Space Technology Enterprise. This analysis has focused primarily on two types of users. The first is the scientist / design engineer whose primary interest is problem solving (e.g., determining wing aerodynamic characteristics in many different operating environments), and whose primary interface to IPG will be through various sorts of problem solving frameworks. The second type of user is the tool designer: the computational scientist who converts physics and mathematics into code that can simulate the physical world. These are the two primary users of IPG, and they have rather different requirements. This paper describes the current state of IPG (the operational testbed), the set of capabilities being put into place for the operational prototype IPG, as well as some of the longer term R&D tasks.
Bernhardt, Peter
2016-01-01
Purpose To develop a general model that utilises a stochastic method to generate a vessel tree based on experimental data, and an associated irregular, macroscopic tumour. These will be used to evaluate two different methods for computing oxygen distribution. Methods A vessel tree structure, and an associated tumour of 127 cm3, were generated, using a stochastic method and Bresenham's line algorithm to develop trees on two different scales and fusing them together. The vessel dimensions were adjusted through convolution and thresholding and each vessel voxel was assigned an oxygen value. Diffusion and consumption were modelled using a Green's function approach together with Michaelis-Menten kinetics. The computations were performed using a combined tree method (CTM) and an individual tree method (ITM). Five tumour sub-sections were compared, to evaluate the methods. Results The oxygen distributions of the same tissue samples, using different methods of computation, were considerably less similar (root mean square deviation, RMSD ≈ 0.02) than the distributions of different samples using the CTM (0.001 < RMSD < 0.01). The deviations of the ITM from the CTM increase with lower oxygen values, resulting in the ITM severely underestimating the level of hypoxia in the tumour. Kolmogorov-Smirnov (KS) tests showed that millimetre-scale samples may not represent the whole. Conclusions The stochastic model managed to capture the heterogeneous nature of hypoxic fractions and, even though the simplified computation did not considerably alter the oxygen distribution, it leads to an evident underestimation of tumour hypoxia, and thereby radioresistance. For a trustworthy computation of tumour oxygenation, the interaction between adjacent microvessel trees must not be neglected, which is why evaluation should be made using high resolution and the CTM, applied to the entire tumour. PMID:27861529
Scalable and fault tolerant orthogonalization based on randomized distributed data aggregation
Gansterer, Wilfried N.; Niederbrucker, Gerhard; Straková, Hana; Schulze Grotthoff, Stefan
2013-01-01
The construction of distributed algorithms for matrix computations built on top of distributed data aggregation algorithms with randomized communication schedules is investigated. For this purpose, a new aggregation algorithm for summing or averaging distributed values, the push-flow algorithm, is developed, which achieves superior resilience properties with respect to failures compared to existing aggregation methods. It is illustrated that on a hypercube topology it asymptotically requires the same number of iterations as the optimal all-to-all reduction operation and that it scales well with the number of nodes. Orthogonalization is studied as a prototypical matrix computation task. A new fault tolerant distributed orthogonalization method rdmGS, which can produce accurate results even in the presence of node failures, is built on top of distributed data aggregation algorithms. PMID:24748902
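The push-flow algorithm itself is not specified in the abstract, so as a hedged sketch of the family of randomized aggregation schemes it improves on, here is the classic push-sum gossip protocol for distributed averaging, written in plain Python with a synchronous round model. The node count, round count, and random peer selection are assumptions for the demo, not details from the paper.

```python
import random

def push_sum_average(values, rounds=60, seed=0):
    """Gossip-based distributed averaging (push-sum): each node keeps a
    (sum, weight) pair and each round sends half of both to a random peer.
    Every node's ratio sum/weight converges to the global average."""
    rng = random.Random(seed)
    n = len(values)
    s = list(values)   # running sums, one per node
    w = [1.0] * n      # running weights, one per node
    for _ in range(rounds):
        inbox = [(0.0, 0.0)] * n
        for i in range(n):
            j = rng.randrange(n)  # random peer (possibly itself)
            half_s, half_w = s[i] / 2, w[i] / 2
            s[i], w[i] = half_s, half_w       # keep half
            si, wi = inbox[j]
            inbox[j] = (si + half_s, wi + half_w)  # send half
        for i in range(n):  # deliver all messages at the end of the round
            s[i] += inbox[i][0]
            w[i] += inbox[i][1]
    return [s[i] / w[i] for i in range(n)]

estimates = push_sum_average([3.0, 7.0, 8.0, 2.0])
# every node's estimate converges toward the true average, 5.0
```

Because total sum and total weight are conserved each round, the scheme is mass-conserving; the resilience improvements claimed for push-flow concern exactly the failure cases (lost messages, dead nodes) where this conservation breaks down.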
Performance issues for domain-oriented time-driven distributed simulations
NASA Technical Reports Server (NTRS)
Nicol, David M.
1987-01-01
It has long been recognized that simulations form an interesting and important class of computations that may benefit from distributed or parallel processing. Since the point of parallel processing is improved performance, the recent proliferation of multiprocessors requires that we consider the performance issues that naturally arise when attempting to implement a distributed simulation. Three such issues are: (1) the problem of mapping the simulation onto the architecture, (2) the possibilities for performing redundant computation in order to reduce communication, and (3) the avoidance of deadlock due to distributed contention for message-buffer space. These issues are discussed in the context of a battlefield simulation implemented on a medium-scale multiprocessor message-passing architecture.
GEANT4 distributed computing for compact clusters
NASA Astrophysics Data System (ADS)
Harrawood, Brian P.; Agasthya, Greeshma A.; Lakshmanan, Manu N.; Raterman, Gretchen; Kapadia, Anuj J.
2014-11-01
A new technique for distribution of GEANT4 processes is introduced to simplify running a simulation in a parallel environment such as a tightly coupled computer cluster. Using a new C++ class derived from the GEANT4 toolkit, multiple runs forming a single simulation are managed across a local network of computers with a simple inter-node communication protocol. The class is integrated with the GEANT4 toolkit and is designed to scale from a single symmetric multiprocessing (SMP) machine to compact clusters ranging in size from tens to thousands of nodes. User designed 'work tickets' are distributed to clients using a client-server work flow model to specify the parameters for each individual run of the simulation. The new g4DistributedRunManager class was developed and well tested in the course of our Neutron Stimulated Emission Computed Tomography (NSECT) experiments. It will be useful for anyone running GEANT4 for large discrete data sets, such as covering a range of angles in computed tomography, calculating dose delivery with multiple fractions, or simply increasing the throughput of a single model.
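The internals of g4DistributedRunManager are not given here, so the following is only a schematic of the client-server 'work ticket' flow it describes, with Python threads standing in for cluster nodes and a trivial function standing in for a GEANT4 run. The ticket fields (angle, events) are invented for the tomography example.

```python
import queue
import threading

def run_simulation(ticket):
    """Stand-in for one GEANT4 run; here it just echoes the ticket parameters."""
    return (ticket["angle"], ticket["events"])

def serve_tickets(tickets, n_workers=4):
    """Distribute 'work tickets' to workers, client-server style: each worker
    pulls the next available ticket, runs it, and reports its result back."""
    work = queue.Queue()
    for t in tickets:
        work.put(t)
    results, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                ticket = work.get_nowait()
            except queue.Empty:
                return  # no tickets left: this client is done
            r = run_simulation(ticket)
            with lock:
                results.append(r)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results

# e.g. one ticket per tomography projection angle, as in the NSECT use case
tickets = [{"angle": a, "events": 10000} for a in range(0, 180, 45)]
out = serve_tickets(tickets)
```

The pull model means no scheduler needs to know run times in advance: faster nodes simply take more tickets, which is the load-balancing property a cluster-wide run manager needs.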
Scaling Bulk Data Analysis with Mapreduce
2017-09-01
Master's thesis, Naval Postgraduate School, September 2017. Thesis co-advisors: Michael McCarrin and Marcus S. Stefanou; department chair: Peter J. Denning (Computer Science).
Scaling the Poisson Distribution
ERIC Educational Resources Information Center
Farnsworth, David L.
2014-01-01
We derive the additive property of Poisson random variables directly from the probability mass function. An important application of the additive property to quality testing of computer chips is presented.
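The additive property can be checked numerically: convolving the probability mass functions of two independent Poisson variables reproduces, term by term, the Poisson pmf with the summed mean. Below is a small self-contained sketch; the truncation point kmax and the example rates are assumptions for the demo.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def convolve_pmfs(lam1, lam2, kmax=60):
    """pmf of X + Y for independent Poisson(lam1), Poisson(lam2), by direct convolution."""
    p1 = [poisson_pmf(k, lam1) for k in range(kmax + 1)]
    p2 = [poisson_pmf(k, lam2) for k in range(kmax + 1)]
    return [sum(p1[j] * p2[k - j] for j in range(k + 1)) for k in range(kmax + 1)]

lam1, lam2 = 2.0, 3.5
conv = convolve_pmfs(lam1, lam2)
direct = [poisson_pmf(k, lam1 + lam2) for k in range(61)]
# the convolved pmf matches Poisson(lam1 + lam2) term by term
```

This is the property behind pooling in quality testing: defect counts from independent chip lots, each Poisson, can be combined into a single Poisson model with the summed rate.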
Harrigan, Robert L; Yvernault, Benjamin C; Boyd, Brian D; Damon, Stephen M; Gibney, Kyla David; Conrad, Benjamin N; Phillips, Nicholas S; Rogers, Baxter P; Gao, Yurui; Landman, Bennett A
2016-01-01
The Vanderbilt University Institute for Imaging Science (VUIIS) Center for Computational Imaging (CCI) has developed a database built on XNAT housing over a quarter of a million scans. The database provides a framework for (1) rapid prototyping, (2) large scale batch processing of images and (3) scalable project management. The system uses the web-based interfaces of XNAT and REDCap to allow for graphical interaction. A Python middleware layer, the Distributed Automation for XNAT (DAX) package, distributes computation across the Vanderbilt Advanced Computing Center for Research and Education high performance computing center. All software is made available as open source for use in combining Portable Batch System (PBS) grids and XNAT servers.
AGIS: Evolution of Distributed Computing information system for ATLAS
NASA Astrophysics Data System (ADS)
Anisenkov, A.; Di Girolamo, A.; Alandes, M.; Karavakis, E.
2015-12-01
ATLAS, a particle physics experiment at the Large Hadron Collider at CERN, produces petabytes of data annually through simulation production and tens of petabytes of data per year from the detector itself. The ATLAS computing model embraces the Grid paradigm and a high degree of decentralization of computing resources in order to meet the ATLAS requirements of petabyte-scale data operations. It has evolved after the first period of LHC data taking (Run-1) in order to cope with the new challenges of the upcoming Run-2. In this paper we describe the evolution and recent developments of the ATLAS Grid Information System (AGIS), developed in order to integrate configuration and status information about resources, services and topology of the computing infrastructure used by the ATLAS Distributed Computing applications and services.
2018-01-01
Many modern applications of AI such as web search, mobile browsing, image processing, and natural language processing rely on finding similar items from a large database of complex objects. Due to the very large scale of data involved (e.g., users’ queries from commercial search engines), computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (a.k.a. LSH) methods and evaluate four variants in a distributed computing environment (specifically, Hadoop). We identify several optimizations which improve performance and are suitable for deployment in very large scale settings. The experimental results demonstrate that our variants of LSH achieve robust performance with better recall compared with “vanilla” LSH, even when using the same amount of space. PMID:29346410
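The paper's specific LSH variants are not reproduced here; as a hedged illustration of the underlying idea, the sketch below implements random-hyperplane LSH for cosine similarity and buckets items by signature, so that only items sharing a bucket need to be compared. The dimension, bit count, and example vectors are invented for the demo.

```python
import random

def random_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals (Gaussian entries)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]

def lsh_signature(vec, planes):
    """Random-hyperplane LSH: one bit per hyperplane, set by which side of the
    plane the vector falls on. Vectors with high cosine similarity agree on
    most bits with high probability."""
    return tuple(1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
                 for plane in planes)

def bucket(items, planes):
    """Group items by signature; only items sharing a bucket are compared."""
    table = {}
    for name, vec in items:
        table.setdefault(lsh_signature(vec, planes), []).append(name)
    return table

planes = random_hyperplanes(dim=3, n_bits=8)
items = [("a", [1.0, 0.0, 0.0]),
         ("b", [2.0, 0.0, 0.0]),    # same direction as "a": identical signature
         ("c", [-1.0, 0.0, 0.0])]   # opposite direction: different signature
table = bucket(items, planes)
```

In a distributed setting such as Hadoop, the signature is a natural shuffle key: mappers emit (signature, item) pairs and reducers compare only items that collide, which is what keeps the cost sublinear in the database size.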
Portable parallel stochastic optimization for the design of aeropropulsion components
NASA Technical Reports Server (NTRS)
Sues, Robert H.; Rhodes, G. S.
1994-01-01
This report presents the results of Phase 1 research to develop a methodology for performing large-scale Multi-disciplinary Stochastic Optimization (MSO) for the design of aerospace systems ranging from aeropropulsion components to complete aircraft configurations. The current research recognizes that such design optimization problems are computationally expensive, and require the use of either massively parallel or multiple-processor computers. The methodology also recognizes that many operational and performance parameters are uncertain, and that uncertainty must be considered explicitly to achieve optimum performance and cost. The objective of this Phase 1 research was to initialize the development of an MSO methodology that is portable to a wide variety of hardware platforms, while achieving efficient, large-scale parallelism when multiple processors are available. The first effort in the project was a literature review of available computer hardware, as well as of portable, parallel programming environments. The second effort was to implement the MSO methodology for an example problem using the portable parallel programming environment, Parallel Virtual Machine (PVM). The third and final effort was to demonstrate the example on a variety of computers, including a distributed-memory multiprocessor, a distributed-memory network of workstations, and a single-processor workstation. Results indicate the MSO methodology can be well applied towards large-scale aerospace design problems. Nearly perfect linear speedup was demonstrated for computation of optimization sensitivity coefficients on both a 128-node distributed-memory multiprocessor (the Intel iPSC/860) and a network of workstations (speedups of almost 19 times achieved for 20 workstations). Very high parallel efficiencies (75 percent for 31 processors and 60 percent for 50 processors) were also achieved for computation of aerodynamic influence coefficients on the Intel.
Finally, the multi-level parallelization strategy that will be needed for large-scale MSO problems was demonstrated to be highly efficient. The same parallel code instructions were used on both platforms, demonstrating portability. There are many applications for which MSO can be applied, including NASA's High-Speed-Civil Transport, and advanced propulsion systems. The use of MSO will reduce design and development time and testing costs dramatically.
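The speedup and efficiency figures quoted above follow from the standard definitions, and Amdahl's law lets one back out the serial fraction they imply. The sketch below is illustrative arithmetic, not code from the report.

```python
def speedup(t_serial, t_parallel):
    """Classic speedup: serial runtime divided by parallel runtime."""
    return t_serial / t_parallel

def efficiency(s, n_procs):
    """Parallel efficiency: achieved speedup divided by ideal linear speedup."""
    return s / n_procs

def amdahl_serial_fraction(s, p):
    """Serial fraction f implied by Amdahl's law, s = 1 / (f + (1 - f) / p)."""
    return (p / s - 1.0) / (p - 1.0)

# The workstation-network result: ~19x speedup on 20 workstations
eff_workstations = efficiency(19.0, 20)           # 0.95, i.e. 95% efficiency
# 75% efficiency on 31 processors corresponds to a serial fraction of ~1.1%
f_implied = amdahl_serial_fraction(0.75 * 31, 31)
```

Even a percent-level serial fraction caps scaling well below linear at higher processor counts, which is consistent with efficiency dropping from 75% at 31 processors to 60% at 50.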
The Value Proposition in Institutional Repositories
ERIC Educational Resources Information Center
Blythe, Erv; Chachra, Vinod
2005-01-01
In the education and research arena of the late 1970s and early 1980s, a struggle developed between those who advocated centralized, mainframe-based computing and those who advocated distributed computing. Ultimately, the debate reduced to whether economies of scale or economies of scope are more important to the effectiveness and efficiency of…
Differential pencil beam dose computation model for photons.
Mohan, R; Chui, C; Lidofsky, L
1986-01-01
Differential pencil beam (DPB) is defined as the dose distribution relative to the position of the first collision, per unit collision density, for a monoenergetic pencil beam of photons in an infinite homogeneous medium of unit density. We have generated DPB dose distribution tables for a number of photon energies in water using the Monte Carlo method. The three-dimensional (3D) nature of the transport of photons and electrons is automatically incorporated in DPB dose distributions. Dose is computed by evaluating 3D integrals of DPB dose. The DPB dose computation model has been applied to calculate dose distributions for 60Co and accelerator beams. Calculations for the latter are performed using energy spectra generated with the Monte Carlo program. To predict dose distributions near the beam boundaries defined by the collimation system as well as blocks, we utilize the angular distribution of incident photons. Inhomogeneities are taken into account by attenuating the primary photon fluence exponentially utilizing the average total linear attenuation coefficient of intervening tissue, by multiplying photon fluence by the linear attenuation coefficient to yield the number of collisions in the scattering volume, and by scaling the path between the scattering volume element and the computation point by an effective density.
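As a rough illustration of the superposition principle the DPB model describes (dose as an integral of collision density weighted by a first-collision kernel), the sketch below works in one dimension with an invented kernel shape and attenuation coefficient; the real model is three-dimensional and uses Monte Carlo-generated kernels.

```python
import math

def primary_fluence(depth, mu):
    """Primary photon fluence attenuates exponentially with depth."""
    return math.exp(-mu * depth)

def collision_density(depth, mu):
    """Collisions per unit volume: fluence times the linear attenuation coefficient."""
    return mu * primary_fluence(depth, mu)

def dpb_kernel(r, k=2.0):
    """Toy stand-in for a Monte Carlo DPB kernel: dose deposited per unit
    collision density, falling off with distance from the first-collision site."""
    return math.exp(-k * abs(r))

def dose_at(z, mu=0.07, z_max=30.0, n=3000):
    """Superposition: integrate collision density over depth, weighted by the kernel."""
    dz = z_max / n
    total = 0.0
    for i in range(n):
        zp = (i + 0.5) * dz
        total += collision_density(zp, mu) * dpb_kernel(z - zp) * dz
    return total

depths = [0.5 * i for i in range(1, 21)]
dose = [dose_at(z) for z in depths]
# beyond a shallow buildup region, the dose profile follows the primary attenuation
```

Inhomogeneity corrections in the abstract correspond to replacing the constant mu with depth-dependent effective coefficients and scaling the kernel argument by effective density.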
Van de Kamer, J B; Lagendijk, J J W
2002-05-21
SAR distributions in a healthy female adult head as a result of a radiating vertical dipole antenna (frequency 915 MHz) representing a hand-held mobile phone have been computed for three different resolutions: 2 mm, 1 mm and 0.4 mm. The extremely high resolution of 0.4 mm was obtained with our quasistatic zooming technique, which is briefly described in this paper. For an effectively transmitted power of 0.25 W, the maximum averaged SAR values in both cubic- and arbitrary-shaped volumes are, respectively, about 1.72 and 2.55 W kg(-1) for 1 g and 0.98 and 1.73 W kg(-1) for 10 g of tissue. These numbers do not vary much (<8%) for the different resolutions, indicating that SAR computations at a resolution of 2 mm are sufficiently accurate to describe the large-scale distribution. However, considering the detailed SAR pattern in the head, large differences may occur if high-resolution computations are performed rather than low-resolution ones. These deviations are caused by both increased modelling accuracy and improved anatomical description in higher resolution simulations. For example, the SAR profile across a boundary between tissues with high dielectric contrast is much more accurately described at higher resolutions. Furthermore, low-resolution dielectric geometries may suffer from loss of anatomical detail, which greatly affects small-scale SAR distributions. Thus, for strongly inhomogeneous regions, high-resolution SAR modelling is an absolute necessity.
Infrastructures for Distributed Computing: the case of BESIII
NASA Astrophysics Data System (ADS)
Pellegrino, J.
2018-05-01
The BESIII is an electron-positron collision experiment hosted at BEPCII in Beijing and aimed at investigating Tau-Charm physics. BESIII has now been running for several years and has gathered more than 1 PB of raw data. In order to analyze these data and perform massive Monte Carlo simulations, a large amount of computing and storage resources is needed. The distributed computing system is based upon DIRAC and has been in production since 2012. It integrates computing and storage resources from different institutes and a variety of resource types such as cluster, grid, cloud or volunteer computing. About 15 sites from the BESIII Collaboration all over the world have joined this distributed computing infrastructure, giving a significant contribution to the IHEP computing facility. Nowadays cloud computing is playing a key role in the HEP computing field, due to its scalability and elasticity. Cloud infrastructures take advantage of several tools, such as VMDirac, to manage virtual machines through cloud managers according to the job requirements. With the virtually unlimited resources from commercial clouds, the computing capacity could scale accordingly in order to deal with any burst demands. General computing models are addressed herewith, with particular focus on the BESIII infrastructure. Moreover, new computing tools and upcoming infrastructures will be addressed.
DOE Office of Scientific and Technical Information (OSTI.GOV)
2012-09-25
The Megatux platform enables the emulation of large scale (multi-million node) distributed systems. In particular, it allows for the emulation of large-scale networks interconnecting a very large number of emulated computer systems. It does this by leveraging virtualization and associated technologies to allow hundreds of virtual computers to be hosted on a single moderately sized server or workstation. Virtualization technology provided by modern processors allows for multiple guest OSs to run at the same time, sharing the hardware resources. The Megatux platform can be deployed on a single PC, a small cluster of a few boxes or a large cluster of computers. With a modest cluster, the Megatux platform can emulate complex organizational networks. By using virtualization, we emulate the hardware, but run actual software, enabling large scale without sacrificing fidelity.
NASA Technical Reports Server (NTRS)
Montag, Bruce C.; Bishop, Alfred M.; Redfield, Joe B.
1989-01-01
The findings of a preliminary investigation by Southwest Research Institute (SwRI) in simulation host computer concepts is presented. It is designed to aid NASA in evaluating simulation technologies for use in spaceflight training. The focus of the investigation is on the next generation of space simulation systems that will be utilized in training personnel for Space Station Freedom operations. SwRI concludes that NASA should pursue a distributed simulation host computer system architecture for the Space Station Training Facility (SSTF) rather than a centralized mainframe based arrangement. A distributed system offers many advantages and is seen by SwRI as the only architecture that will allow NASA to achieve established functional goals and operational objectives over the life of the Space Station Freedom program. Several distributed, parallel computing systems are available today that offer real-time capabilities for time critical, man-in-the-loop simulation. These systems are flexible in terms of connectivity and configurability, and are easily scaled to meet increasing demands for more computing power.
Wormlike Chain Theory and Bending of Short DNA
NASA Astrophysics Data System (ADS)
Mazur, Alexey K.
2007-05-01
The probability distributions for bending angles in double helical DNA obtained in all-atom molecular dynamics simulations are compared with theoretical predictions. The computed distributions remarkably agree with the wormlike chain theory and qualitatively differ from predictions of the subelastic chain model. The computed data exhibit only small anomalies in the apparent flexibility of short DNA and cannot account for the recently reported AFM data. It is possible that the current atomistic DNA models miss some essential mechanisms of DNA bending on intermediate length scales. Analysis of bent DNA structures reveals, however, that the bending motion is structurally heterogeneous and directionally anisotropic on the length scales where the experimental anomalies were detected. These effects are essential for interpretation of the experimental data and they also can be responsible for the apparent discrepancy.
Computational vibrational study on coordinated nicotinamide
NASA Astrophysics Data System (ADS)
Bolukbasi, Olcay; Akyuz, Sevim
2005-06-01
The molecular structure and vibrational spectra of zinc (II) halide complexes of nicotinamide (ZnX 2(NIA) 2; X=Cl or Br; NIA=Nicotinamide) were investigated by computational vibrational study and scaled quantum mechanical (SQM) analysis. The geometry optimisation and vibrational wavenumber calculations of zinc halide complexes of nicotinamide were carried out by using the DFT/RB3LYP level of theory with the 6-31G(d,p) basis set. The calculated wavenumbers were scaled by using the SQM force-field method. The fundamental vibrational modes were characterised by their total energy distribution. The coordination effects on nicotinamide through the ring nitrogen were discussed.
Improving Distributed Diagnosis Through Structural Model Decomposition
NASA Technical Reports Server (NTRS)
Bregon, Anibal; Daigle, Matthew John; Roychoudhury, Indranil; Biswas, Gautam; Koutsoukos, Xenofon; Pulido, Belarmino
2011-01-01
Complex engineering systems require efficient fault diagnosis methodologies, but centralized approaches do not scale well, and this motivates the development of distributed solutions. This work presents an event-based approach for distributed diagnosis of abrupt parametric faults in continuous systems, by using the structural model decomposition capabilities provided by Possible Conflicts. We develop a distributed diagnosis algorithm that uses residuals computed by extending Possible Conflicts to build local event-based diagnosers based on global diagnosability analysis. The proposed approach is applied to a multitank system, and results demonstrate an improvement in the design of local diagnosers. Since local diagnosers use only a subset of the residuals, and use subsystem models to compute residuals (instead of the global system model), the local diagnosers are more efficient than previously developed distributed approaches.
Iterative Importance Sampling Algorithms for Parameter Estimation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grout, Ray W; Morzfeld, Matthias; Day, Marcus S.
In parameter estimation problems one computes a posterior distribution over uncertain parameters defined jointly by a prior distribution, a model, and noisy data. Markov chain Monte Carlo (MCMC) is often used for the numerical solution of such problems. An alternative to MCMC is importance sampling, which can exhibit near perfect scaling with the number of cores on high performance computing systems because samples are drawn independently. However, finding a suitable proposal distribution is a challenging task. Several sampling algorithms have been proposed over the past years that take an iterative approach to constructing a proposal distribution. We investigate the applicability of such algorithms by applying them to two realistic and challenging test problems, one in subsurface flow, and one in combustion modeling. More specifically, we implement importance sampling algorithms that iterate over the mean and covariance matrix of Gaussian or multivariate t-proposal distributions. Our implementation leverages massively parallel computers, and we present strategies to initialize the iterations using 'coarse' MCMC runs or Gaussian mixture models.
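A minimal sketch of the iterative idea, assuming a one-dimensional Gaussian proposal whose mean and variance are refit to the self-normalized importance weights each iteration; the target density, sample counts, and initial proposal are invented for the demo and are much simpler than the paper's subsurface-flow and combustion test problems.

```python
import math
import random

def target_logpdf(x):
    """Unnormalized log-posterior to approximate (here a Gaussian with
    mean 3 and sd 0.5, standing in for a prior-times-likelihood)."""
    return -0.5 * ((x - 3.0) / 0.5) ** 2

def gaussian_logpdf(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def adaptive_is(n_iters=8, n_samples=4000, seed=1):
    """Iterative importance sampling: draw from a Gaussian proposal, weight
    by target/proposal, then refit the proposal's mean and variance to the
    weighted samples. Each iteration's samples are independent, which is why
    the draws parallelize nearly perfectly across cores."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 4.0  # deliberately poor initial proposal
    for _ in range(n_iters):
        xs = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        logw = [target_logpdf(x) - gaussian_logpdf(x, mu, sigma) for x in xs]
        m = max(logw)
        ws = [math.exp(l - m) for l in logw]  # stabilized, self-normalized weights
        wsum = sum(ws)
        mu = sum(w * x for w, x in zip(ws, xs)) / wsum
        var = sum(w * (x - mu) ** 2 for w, x in zip(ws, xs)) / wsum
        sigma = max(math.sqrt(var), 1e-6)
    return mu, sigma

mu, sigma = adaptive_is()
# the proposal's moments converge toward the target's (mean 3, sd 0.5)
```

The multivariate version iterates a mean vector and covariance matrix in the same way; initializing from a coarse MCMC run or a Gaussian mixture, as the abstract describes, simply replaces the poor starting proposal here.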
Energy and time determine scaling in biological and computer designs
Moses, Melanie; Bezerra, George; Edwards, Benjamin; Brown, James; Forrest, Stephanie
2016-08-19
Metabolic rate in animals and power consumption in computers are analogous quantities that scale similarly with size. We analyse vascular systems of mammals and on-chip networks of microprocessors, where natural selection and human engineering, respectively, have produced systems that minimize both energy dissipation and delivery times. Using a simple network model that simultaneously minimizes energy and time, our analysis explains empirically observed trends in the scaling of metabolic rate in mammals and power consumption and performance in microprocessors across several orders of magnitude in size. Just as the evolutionary transitions from unicellular to multicellular animals in biology are associated with shifts in metabolic scaling, our model suggests that the scaling of power and performance will change as computer designs transition to decentralized multi-core and distributed cyber-physical systems. More generally, a single energy-time minimization principle may govern the design of many complex systems that process energy, materials and information. This article is part of the themed issue 'The major synthetic evolutionary transitions'. PMID:27431524
Investigation of low-latitude hydrogen emission in terms of a two-component interstellar gas model
NASA Technical Reports Server (NTRS)
Baker, P. L.; Burton, W. B.
1975-01-01
High-resolution 21-cm hydrogen line observations at low galactic latitude are analyzed to determine the large-scale distribution of galactic hydrogen. Distribution parameters are found by model fitting, optical depth effects are computed using a two-component gas model suggested by the observations, and calculations are made for a one-component uniform spin-temperature gas model to show the systematic departures between this model and data obtained by incorrect treatment of the optical depth effects. Synthetic 21-cm line profiles are computed from the two-component model, and the large-scale trends of the observed emission profiles are reproduced together with the magnitude of the small-scale emission irregularities. Values are determined for the thickness of the galactic hydrogen disk between half density points, the total observed neutral hydrogen mass of the galaxy, and the central number density of the intercloud hydrogen atoms. It is shown that typical hydrogen clouds must be between 1 and 13 pc in diameter and that optical thinness exists on a large scale despite the presence of optically thin gas.
Foust, Thomas D.; Ziegler, Jack L.; Pannala, Sreekanth; ...
2017-02-28
Here in this computational study, we model the mixing of biomass pyrolysis vapor with solid catalyst in circulating riser reactors, with a focus on the determination of solid catalyst residence time distributions (RTDs). A comprehensive set of 2D and 3D simulations were conducted for a pilot-scale riser using the Eulerian-Eulerian two-fluid modeling framework with and without sub-grid-scale models for the gas-solids interaction. A validation test case was also simulated and compared to experiments, showing agreement in the pressure gradient and RTD mean and spread. For the simulation cases, it was found that for accurate RTD prediction, the Johnson and Jackson partial slip solids boundary condition was required for all models, and a sub-grid model is useful so that ultra-high-resolution grids that are very computationally intensive are not required. Finally, we discovered a 2/3 scaling relation for the RTD mean and spread when comparing resolved 2D simulations to validated unresolved 3D sub-grid-scale model simulations.
A Hybrid Method for Accelerated Simulation of Coulomb Collisions in a Plasma
DOE Office of Scientific and Technical Information (OSTI.GOV)
Caflisch, R; Wang, C; Dimarco, G
2007-10-09
If the collisional time scale for Coulomb collisions is comparable to the characteristic time scales for a plasma, then simulation of Coulomb collisions may be important for computation of kinetic plasma dynamics. This can be a computational bottleneck because of the large number of simulated particles and collisions (or phase-space resolution requirements in continuum algorithms), as well as the wide range of collision rates over the velocity distribution function. This paper considers Monte Carlo simulation of Coulomb collisions using the binary collision models of Takizuka & Abe and of Nanbu. It presents a hybrid method for accelerating the computation of Coulomb collisions. The hybrid method represents the velocity distribution function as a combination of a thermal component (a Maxwellian distribution) and a kinetic component (a set of discrete particles). Collisions between particles from the thermal component preserve the Maxwellian; collisions between particles from the kinetic component are performed using the method of Takizuka & Abe or of Nanbu. Collisions between the kinetic and thermal components are performed by sampling a particle from the thermal component and selecting a particle from the kinetic component. Particles are also transferred between the two components according to thermalization and dethermalization probabilities, which are functions of phase space.
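The bookkeeping behind the two-component representation can be illustrated with a one-dimensional toy sketch. The thermalization criterion and the fixed probabilities below are illustrative placeholders, not the paper's actual phase-space-dependent functions, and the collision operators themselves are omitted:

```python
import math
import random

def hybrid_step(kinetic, thermal_n, T, p_therm=0.5, p_detherm=0.1, rng=None):
    """One bookkeeping step of a hybrid thermal/kinetic representation.

    `kinetic` is a list of particle velocities; `thermal_n` is the number
    of particles carried implicitly by the Maxwellian of temperature T.
    Particles near the thermal bulk may be absorbed into the Maxwellian
    (thermalization); the Maxwellian may re-emit sampled particles
    (dethermalization). Probabilities here are toy constants.
    """
    rng = rng or random.Random(0)
    vth = math.sqrt(T)                      # thermal speed (mass, k_B = 1)
    still_kinetic = []
    for v in kinetic:
        if abs(v) < 2.0 * vth and rng.random() < p_therm:
            thermal_n += 1                  # absorb into the Maxwellian
        else:
            still_kinetic.append(v)
    n_emit = sum(1 for _ in range(thermal_n) if rng.random() < p_detherm)
    for _ in range(n_emit):                 # sample back out of the Maxwellian
        still_kinetic.append(rng.gauss(0.0, vth))
    thermal_n -= n_emit
    return still_kinetic, thermal_n
```

The transfer conserves particle number by construction, which is the invariant any such scheme must maintain.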
Feng, Huan; Tappero, Ryan; Zhang, Weiguo; ...
2015-07-26
This study is focused on micro-scale measurement of metal (Ca, Cl, Fe, K, Mn, Cu, Pb, and Zn) distributions in the Spartina alterniflora root system. The root samples were collected in the Yangtze River intertidal zone in July 2013. Synchrotron X-ray fluorescence (XRF), computed microtomography (CMT), and X-ray absorption near-edge structure (XANES) techniques, which provide micrometer-scale analytical resolution, were applied to this study. Although it was found that the metals of interest were distributed in both the epidermis and the vascular tissue with varying concentrations, the results showed that Fe plaque was mainly distributed in the root epidermis. Other metals (e.g., Cu, Mn, Pb, and Zn) were correlated with Fe in the epidermis, possibly due to scavenging by Fe plaque. Relatively high metal concentrations were observed in the root hair tip. As a result, this micro-scale investigation provides insight into the metal uptake and spatial distribution, as well as the function of Fe plaque governing metal transport in the root system.
O'Donnell, Michael
2015-01-01
State-and-transition simulation modeling relies on knowledge of vegetation composition and structure (states) that describe community conditions, mechanistic feedbacks such as fire that can affect vegetation establishment, and ecological processes that drive community conditions as well as the transitions between these states. However, as the need for modeling larger and more complex landscapes increases, a more advanced awareness of computing resources becomes essential. The objectives of this study include identifying challenges of executing state-and-transition simulation models, identifying common bottlenecks of computing resources, developing a workflow and software that enable parallel processing of Monte Carlo simulations, and identifying the advantages and disadvantages of different computing resources. To address these objectives, this study used the ApexRMS® SyncroSim software and embarrassingly parallel tasks of Monte Carlo simulations on a single multicore computer and on distributed computing systems. The results demonstrated that state-and-transition simulation models scale best in distributed computing environments, such as high-throughput and high-performance computing, because these environments disseminate the workloads across many compute nodes, thereby supporting analysis of larger landscapes, higher-spatial-resolution vegetation products, and more complex models. Using a case study and five different computing environments, the top result (high-throughput computing versus serial computations) indicated a decrease in computing time of approximately 96.6%. With a single multicore compute node (bottom result), computing time decreased by 81.8% relative to serial computations. These results provide insight into the tradeoffs of using different computing resources when research necessitates advanced integration of ecoinformatics incorporating large and complicated data inputs and models.
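The embarrassingly parallel structure of the Monte Carlo replicates is what makes the reported speed-ups possible: each replicate is independent, so they can be farmed out to any pool of workers. A minimal sketch, using a hypothetical three-state transition matrix and a thread pool as a stand-in for SyncroSim's distributed execution (all states and probabilities are invented for illustration):

```python
import random
from concurrent.futures import ThreadPoolExecutor

STATES = ("grass", "shrub", "tree")
# Hypothetical annual transition probabilities (each row sums to 1); a real
# state-and-transition model would be parameterized from field data.
TRANSITION = {
    "grass": [0.80, 0.15, 0.05],
    "shrub": [0.10, 0.70, 0.20],
    "tree":  [0.05, 0.05, 0.90],
}

def one_replicate(seed, years=50):
    """A single Monte Carlo replicate: evolve one cell for `years` steps."""
    rng = random.Random(seed)
    state = "grass"
    for _ in range(years):
        state = rng.choices(STATES, weights=TRANSITION[state])[0]
    return state

def run_replicates(n=200, workers=8):
    # Replicates share no state, so the same map() call could dispatch to
    # processes or to HTC cluster nodes instead of threads.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        finals = list(pool.map(one_replicate, range(n)))
    return {s: finals.count(s) for s in STATES}
```

Seeding each replicate by its index keeps the ensemble reproducible regardless of how the work is distributed.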
NASA Astrophysics Data System (ADS)
Miller, M.; Miller, E.; Liu, J.; Lund, R. M.; McKinley, J. P.
2012-12-01
X-ray computed tomography (CT), scanning electron microscopy (SEM), electron microprobe analysis (EMP), and computational image analysis are mature technologies used in many disciplines. Cross-discipline combination of these imaging and image-analysis technologies is the focus of this research, which uses laboratory and light-source resources in an iterative approach. The objective is to produce images across length scales, taking advantage of instrumentation that is optimized for each scale, and to unify them into a single compositional reconstruction. Initially, CT images will be collected using both x-ray absorption and differential phase contrast modes. The imaged sample will then be physically sectioned and the exposed surfaces imaged and characterized via SEM/EMP. The voxel slice corresponding to the physical sample surface will be isolated computationally, and the volumetric data will be combined with two-dimensional SEM images along CT image planes. This registration step will take advantage of the similarity between the X-ray absorption (CT) and backscattered electron (SEM) coefficients (both proportional to average atomic number in the interrogated volume) as well as the images' mutual information. Elemental and solid-phase distributions on the exposed surfaces, co-registered with SEM images, will be mapped using EMP. The solid-phase distribution will be propagated into three-dimensional space using computational methods relying on the estimation of compositional distributions derived from the CT data. If necessary, solid-phase and pore-space boundaries will be resolved using X-ray differential phase contrast tomography, x-ray fluorescence tomography, and absorption-edge microtomography at a light-source facility. Computational methods will be developed to register and model images collected over varying scales and data types. Image resolution, physically and dynamically, is qualitatively different for the electron microscopy and CT methodologies. 
Routine CT images are resolved at 10-20 μm, while SEM images are resolved at 10-20 nm; grayscale values vary according to collection time and instrument sensitivity; and compositional sensitivities via EMP vary in interrogation volume and scale. We have so far successfully registered SEM imagery within a multimode tomographic volume and have used standard methods to isolate pore space within the volume. We are developing a three-dimensional solid-phase identification and registration method that is constrained by bulk-sample X-ray diffraction Rietveld refinements. The results of this project will prove useful in fields that require the fine-scale definition of solid-phase distributions and relationships, and could replace more inefficient methods for making these estimations.
Investigating Darcy-scale assumptions by means of a multiphysics algorithm
NASA Astrophysics Data System (ADS)
Tomin, Pavel; Lunati, Ivan
2016-09-01
Multiphysics (or hybrid) algorithms, which couple Darcy and pore-scale descriptions of flow through porous media in a single numerical framework, are usually employed to decrease the computational cost of full pore-scale simulations or to increase the accuracy of pure Darcy-scale simulations when a simple macroscopic description breaks down. Despite the massive increase in available computational power, the application of these techniques remains limited to core-size problems and upscaling remains crucial for practical large-scale applications. In this context, the Hybrid Multiscale Finite Volume (HMsFV) method, which constructs the macroscopic (Darcy-scale) problem directly by numerical averaging of pore-scale flow, offers not only a flexible framework to efficiently deal with multiphysics problems, but also a tool to investigate the assumptions used to derive macroscopic models and to better understand the relationship between pore-scale quantities and the corresponding macroscale variables. Indeed, by direct comparison of the multiphysics solution with a reference pore-scale simulation, we can assess the validity of the closure assumptions inherent to the multiphysics algorithm and infer the consequences for macroscopic models at the Darcy scale. We show that the definition of the scale ratio based on the geometric properties of the porous medium is well justified only for single-phase flow, whereas in case of unstable multiphase flow the nonlinear interplay between different forces creates complex fluid patterns characterized by new spatial scales, which emerge dynamically and weaken the scale-separation assumption. In general, the multiphysics solution proves very robust even when the characteristic size of the fluid-distribution patterns is comparable with the observation length, provided that all relevant physical processes affecting the fluid distribution are considered. 
This suggests that macroscopic constitutive relationships (e.g., the relative permeability) should account for the fact that they depend not only on the saturation but also on the actual characteristics of the fluid distribution.
Distributed sensor networks: a cellular nonlinear network perspective.
Haenggi, Martin
2003-12-01
Large-scale networks of integrated wireless sensors are becoming increasingly tractable. Advances in hardware technology and engineering design have led to dramatic reductions in size, power consumption, and cost of digital circuitry and wireless communications. Networking, self-organization, and distributed operation are crucial ingredients to harness the sensing, computing, and communication capabilities of the nodes into a complete system. This article shows that such networks can be considered as cellular nonlinear networks (CNNs), and that their analysis and design may greatly benefit from the rich theoretical results available for CNNs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dmitriy Morozov, Tom Peterka
2014-07-29
Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets. As the scale of simulations and observations surpasses billions of particles, a distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this software is a distributed-memory parallel Delaunay and Voronoi tessellation algorithm based on existing serial computational geometry libraries that automatically determines which neighbor points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include the addition of periodic and wall boundary conditions.
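The core decision in such a distributed tessellation is which local points each subdomain must share with its neighbors before the serial geometry library runs on each piece. A simplified sketch with a fixed ghost-layer width on a rectangular 2D decomposition (the actual software determines the exchange set adaptively from the tessellation itself, so the fixed width here is an assumption for illustration):

```python
def neighbor_exchange(points, bounds, ghost):
    """Decide which local points to send to each face neighbor of a
    rectangular subdomain.

    points: iterable of (x, y) tuples owned by this subdomain
    bounds: ((xmin, xmax), (ymin, ymax)) of the subdomain
    ghost:  width of the boundary layer whose points are shared
    """
    (xmin, xmax), (ymin, ymax) = bounds
    outgoing = {"left": [], "right": [], "down": [], "up": []}
    for p in points:
        x, y = p
        # A point near a face may affect Delaunay cells across that face,
        # so it is copied to the neighbor on the other side.
        if x - xmin < ghost:
            outgoing["left"].append(p)
        if xmax - x < ghost:
            outgoing["right"].append(p)
        if y - ymin < ghost:
            outgoing["down"].append(p)
        if ymax - y < ghost:
            outgoing["up"].append(p)
    return outgoing
```

A point near a corner lands in two lists, which is the desired behavior: it must reach both face neighbors (and, via them or directly, the diagonal neighbor).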
Towards Portable Large-Scale Image Processing with High-Performance Computing.
Huo, Yuankai; Blaber, Justin; Damon, Stephen M; Boyd, Brian D; Bao, Shunxing; Parvathaneni, Prasanna; Noguera, Camilo Bermudez; Chaganti, Shikha; Nath, Vishwesh; Greer, Jasmine M; Lyu, Ilwoo; French, William R; Newton, Allen T; Rogers, Baxter P; Landman, Bennett A
2018-05-03
High-throughput, large-scale medical image computing demands tight integration of high-performance computing (HPC) infrastructure for data storage, job distribution, and image processing. The Vanderbilt University Institute for Imaging Science (VUIIS) Center for Computational Imaging (CCI) has constructed a large-scale image storage and processing infrastructure that is composed of (1) a large-scale image database using the eXtensible Neuroimaging Archive Toolkit (XNAT), (2) a content-aware job scheduling platform using the Distributed Automation for XNAT pipeline automation tool (DAX), and (3) a wide variety of encapsulated image processing pipelines called "spiders." The VUIIS CCI medical image data storage and processing infrastructure has housed and processed nearly half a million medical image volumes with the Vanderbilt Advanced Computing Center for Research and Education (ACCRE), the HPC facility at Vanderbilt University. The initial deployment was native (i.e., direct installations on a bare-metal server) within the ACCRE hardware and software environments, which led to issues of portability and sustainability. First, it could be laborious to deploy the entire VUIIS CCI medical image data storage and processing infrastructure to another HPC center with varying hardware infrastructure, library availability, and software permission policies. Second, the spiders were not developed in an isolated manner, which has led to software dependency issues during system upgrades or remote software installation. To address such issues, herein, we describe recent innovations using containerization techniques with XNAT/DAX which are used to isolate the VUIIS CCI medical image data storage and processing infrastructure from the underlying hardware and software environments.
The newly presented XNAT/DAX solution has the following new features: (1) multi-level portability from system level to the application level, (2) flexible and dynamic software development and expansion, and (3) scalable spider deployment compatible with HPC clusters and local workstations.
Parallel Simulation of Unsteady Turbulent Flames
NASA Technical Reports Server (NTRS)
Menon, Suresh
1996-01-01
Time-accurate simulation of turbulent flames in high Reynolds number flows is a challenging task, since both fluid dynamics and combustion must be modeled accurately. To numerically simulate this phenomenon, very large computer resources (both time and memory) are required. Although current vector supercomputers are capable of providing adequate resources for simulations of this nature, their high cost and limited availability make practical use of such machines less than satisfactory. At the same time, the explicit time integration algorithms used in unsteady flow simulations often possess a very high degree of parallelism, making them very amenable to efficient implementation on large-scale parallel computers. Under these circumstances, distributed memory parallel computers offer an excellent near-term solution for greatly increased computational speed and memory, at a cost that may render the unsteady simulations of the type discussed above more feasible and affordable. This paper discusses the study of unsteady turbulent flames using a simulation algorithm that is capable of retaining high parallel efficiency on distributed memory parallel architectures. Numerical studies are carried out using large-eddy simulation (LES). In LES, the scales larger than the grid are computed using a time- and space-accurate scheme, while the unresolved small scales are modeled using eddy viscosity based subgrid models. This is acceptable for the moment/energy closure since the small scales primarily provide a dissipative mechanism for the energy transferred from the large scales. However, for combustion to occur, the species must first undergo mixing at the small scales and then come into molecular contact. Therefore, global models cannot be used.
Recently, a new model for turbulent combustion was developed, in which the combustion is modeled, within the subgrid (small-scales) using a methodology that simulates the mixing and the molecular transport and the chemical kinetics within each LES grid cell. Finite-rate kinetics can be included without any closure and this approach actually provides a means to predict the turbulent rates and the turbulent flame speed. The subgrid combustion model requires resolution of the local time scales associated with small-scale mixing, molecular diffusion and chemical kinetics and, therefore, within each grid cell, a significant amount of computations must be carried out before the large-scale (LES resolved) effects are incorporated. Therefore, this approach is uniquely suited for parallel processing and has been implemented on various systems such as: Intel Paragon, IBM SP-2, Cray T3D and SGI Power Challenge (PC) using the system independent Message Passing Interface (MPI) compiler. In this paper, timing data on these machines is reported along with some characteristic results.
NASA Technical Reports Server (NTRS)
Wentz, F. J.
1977-01-01
The general problem of bistatic scattering from a two scale surface was evaluated. The treatment was entirely two-dimensional and in a vector formulation independent of any particular coordinate system. The two scale scattering model was then applied to backscattering from the sea surface. In particular, the model was used in conjunction with the JONSWAP 1975 aircraft scatterometer measurements to determine the sea surface's two scale roughness distributions, namely the probability density of the large scale surface slope and the capillary wavenumber spectrum. Best fits yield, on the average, a 0.7 dB rms difference between the model computations and the vertical polarization measurements of the normalized radar cross section. Correlations between the distribution parameters and the wind speed were established from linear, least squares regressions.
A mixed parallel strategy for the solution of coupled multi-scale problems at finite strains
NASA Astrophysics Data System (ADS)
Lopes, I. A. Rodrigues; Pires, F. M. Andrade; Reis, F. J. P.
2018-02-01
A mixed parallel strategy for the solution of homogenization-based multi-scale constitutive problems undergoing finite strains is proposed. The approach aims to reduce the computational time and memory requirements of non-linear coupled simulations that use finite element discretization at both scales (FE^2). In the first level of the algorithm, a non-conforming domain decomposition technique, based on the FETI method combined with a mortar discretization at the interface of macroscopic subdomains, is employed. A master-slave scheme, which distributes tasks by macroscopic element and adopts dynamic scheduling, is then used for each macroscopic subdomain composing the second level of the algorithm. This strategy allows the parallelization of FE^2 simulations in computers with either shared memory or distributed memory architectures. The proposed strategy preserves the quadratic rates of asymptotic convergence that characterize the Newton-Raphson scheme. Several examples are presented to demonstrate the robustness and efficiency of the proposed parallel strategy.
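The second-level master-slave scheme with dynamic scheduling can be sketched with a shared work queue: each worker pulls the next macroscopic element as soon as it finishes, which balances load when per-element solve costs vary. In the sketch below, `solve` is a placeholder for the microscopic (RVE) equilibrium solve of one macroscopic element, and the thread pool stands in for MPI ranks; both are simplifying assumptions:

```python
import queue
import threading

def dynamic_master_slave(tasks, n_workers, solve):
    """Distribute `tasks` over `n_workers` with dynamic scheduling.

    Workers repeatedly take one task from a shared queue until it is
    empty, so slow tasks do not hold up pre-assigned partners as a
    static block distribution would.
    """
    work = queue.Queue()
    for t in tasks:
        work.put(t)
    results, lock = {}, threading.Lock()

    def worker():
        while True:
            try:
                elem = work.get_nowait()   # dynamic scheduling: pull on demand
            except queue.Empty:
                return
            out = solve(elem)              # stand-in for the RVE solve
            with lock:
                results[elem] = out

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results
```

The same pull-based pattern applies per macroscopic subdomain in the two-level strategy, with the FETI/mortar coupling handled at the outer level.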
al3c: high-performance software for parameter inference using Approximate Bayesian Computation.
Stram, Alexander H; Marjoram, Paul; Chen, Gary K
2015-11-01
The development of Approximate Bayesian Computation (ABC) algorithms for parameter inference which are both computationally efficient and scalable in parallel computing environments is an important area of research. Monte Carlo rejection sampling, a fundamental component of ABC algorithms, is trivial to distribute over multiple processors but is inherently inefficient. While development of algorithms such as ABC Sequential Monte Carlo (ABC-SMC) helps address the inherent inefficiencies of rejection sampling, such approaches are not as easily scaled on multiple processors. As a result, current Bayesian inference software offerings that use ABC-SMC lack the ability to scale in parallel computing environments. We present al3c, a C++ framework for implementing ABC-SMC in parallel. By requiring only that users define essential functions such as the simulation model and prior distribution function, al3c abstracts the user from both the complexities of parallel programming and the details of the ABC-SMC algorithm. By using the al3c framework, the user is able to scale the ABC-SMC algorithm in parallel computing environments for his or her specific application, with minimal programming overhead. al3c is offered as a static binary for Linux and OS-X computing environments. The user completes an XML configuration file and C++ plug-in template for the specific application, which are used by al3c to obtain the desired results. Users can download the static binaries, source code, reference documentation and examples (including those in this article) by visiting https://github.com/ahstram/al3c. Contact: astram@usc.edu. Supplementary data are available at Bioinformatics online.
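The rejection-sampling kernel that ABC-SMC refines is simple to state: draw a parameter from the prior, simulate data under it, and keep the draw when a summary of the simulated data falls within a tolerance of the observed summary. A toy sketch with a hypothetical normal model and uniform prior (this is the generic algorithm, not al3c's C++ API):

```python
import random
import statistics

def abc_rejection(observed, simulate, prior_sample, eps, n_accept, seed=0):
    """Plain ABC rejection sampling with the sample mean as summary.

    Each trial is independent, which is why this kernel is trivial to
    distribute over processors, and also why it is inefficient: most
    prior draws are rejected.
    """
    rng = random.Random(seed)
    s_obs = statistics.mean(observed)
    accepted = []
    while len(accepted) < n_accept:
        theta = prior_sample(rng)                 # draw from the prior
        sim = simulate(theta, rng)                # forward-simulate data
        if abs(statistics.mean(sim) - s_obs) < eps:
            accepted.append(theta)                # keep near-matching draws
    return accepted

# Hypothetical toy model: data ~ Normal(theta, 1), true theta = 3.
rng0 = random.Random(42)
obs = [rng0.gauss(3.0, 1.0) for _ in range(100)]
post = abc_rejection(
    obs,
    simulate=lambda th, r: [r.gauss(th, 1.0) for _ in range(100)],
    prior_sample=lambda r: r.uniform(0.0, 10.0),
    eps=0.2,
    n_accept=50,
)
```

ABC-SMC improves on this by propagating a weighted particle population through a decreasing tolerance schedule instead of sampling the prior from scratch at a tight tolerance.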
Exact posterior computation in non-conjugate Gaussian location-scale parameters models
NASA Astrophysics Data System (ADS)
Andrade, J. A. A.; Rathie, P. N.
2017-12-01
In Bayesian analysis the class of conjugate models allows one to obtain exact posterior distributions; however, this class is quite restrictive in the sense that it involves only a few distributions. In fact, most practical applications involve non-conjugate models, so approximate methods, such as MCMC algorithms, are required. Although these methods can deal with quite complex structures, some practical problems can make their application quite time-demanding: for example, when we use heavy-tailed distributions, convergence may be difficult and the Metropolis-Hastings algorithm can become very slow, in addition to the extra work inevitably required in choosing efficient candidate generator distributions. In this work, we draw attention to special functions as tools for Bayesian computation, and we propose an alternative method for obtaining the posterior distribution in Gaussian non-conjugate models in exact form. We use complex integration methods based on the H-function in order to obtain the posterior distribution and some of its posterior quantities in an explicitly computable form. Two examples are provided in order to illustrate the theory.
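The difficulty the authors address can be seen in a toy non-conjugate location model: with a Gaussian likelihood and a heavy-tailed Cauchy prior there is no closed-form posterior. The sketch below falls back on brute-force quadrature for the normalizing constant and posterior mean, a numerical stand-in for the exact H-function expressions the paper derives; the data and prior are invented for illustration:

```python
import math

def posterior_moments(data, prior_pdf, sigma=1.0, lo=-20.0, hi=20.0, n=4001):
    """Non-conjugate location model x_i ~ N(theta, sigma^2) with an
    arbitrary prior: compute the normalizing constant z and posterior
    mean by the trapezoid rule on a uniform grid."""
    h = (hi - lo) / (n - 1)

    def unnorm(theta):
        loglik = sum(-0.5 * ((x - theta) / sigma) ** 2 for x in data)
        return math.exp(loglik) * prior_pdf(theta)

    grid = [lo + i * h for i in range(n)]
    vals = [unnorm(t) for t in grid]
    z = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    mean = h * (sum(t * v for t, v in zip(grid, vals))
                - 0.5 * (grid[0] * vals[0] + grid[-1] * vals[-1])) / z
    return z, mean

# Heavy-tailed (Cauchy) prior centered at 0 -- non-conjugate with the
# Gaussian likelihood, so no conjugate update exists.
cauchy = lambda t: 1.0 / (math.pi * (1.0 + t * t))
_, post_mean = posterior_moments([1.8, 2.2, 2.0, 2.4, 1.6], cauchy)
```

The prior pulls the posterior mean slightly below the sample mean of 2.0, and the pull would be much weaker for an outlying sample, which is exactly the heavy-tail robustness behavior that makes such models attractive despite the computational cost.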
Spontaneous Movements of a Computer Mouse Reveal Egoism and In-group Favoritism.
Maliszewski, Norbert; Wojciechowski, Łukasz; Suszek, Hubert
2017-01-01
The purpose of the project was to assess whether the first spontaneous movements of a computer mouse, when making an assessment on a scale presented on the screen, may express a respondent's implicit attitudes. In Study 1, the altruistic behaviors of 66 students were assessed. The students were led to believe that the task they were performing was also being performed by another person and they were asked to distribute earnings between themselves and the partner. The participants performed the tasks under conditions with and without distractors. With the distractors, in the first few seconds spontaneous mouse movements on the scale expressed a selfish distribution of money, while later the movements gravitated toward more altruism. In Study 2, 77 Polish students evaluated a painting by a Polish/Jewish painter on a scale. They evaluated it under conditions of full or distracted cognitive abilities. Spontaneous movements of the mouse on the scale were analyzed. In addition, implicit attitudes toward both Poles and Jews were measured with the Implicit Association Test (IAT). A significant association between implicit attitudes (IAT) and spontaneous evaluation of images using a computer mouse was observed in the group with the distractor. The participants with strong implicit in-group favoritism of Poles revealed stronger preference for the Polish painter's work in the first few seconds of mouse movement. Taken together, these results suggest that spontaneous mouse movements may reveal egoism (in-group favoritism), i.e., processes that were not observed in the participants' final decisions (clicking on the scale).
Fractal analysis of urban environment: land use and sewer system
NASA Astrophysics Data System (ADS)
Gires, A.; Ochoa Rodriguez, S.; Van Assel, J.; Bruni, G.; Murla Tulys, D.; Wang, L.; Pina, R.; Richard, J.; Ichiba, A.; Willems, P.; Tchiguirinskaia, I.; ten Veldhuis, M. C.; Schertzer, D. J. M.
2014-12-01
Land-use distributions are usually obtained by automatic processing of satellite and airborne pictures. The complexity of the obtained patterns, which are furthermore scale dependent, is enhanced in urban environments. This scale dependency is even more visible in a rasterized representation, where a single class is assigned to each pixel. A parameter commonly analysed in urban hydrology is the coefficient of imperviousness, which reflects the proportion of rainfall that will be immediately active in the catchment response. This coefficient is strongly scale dependent in a rasterized representation. This complex behaviour is well grasped with the help of the scale-invariant notion of fractal dimension, which makes it possible to quantify the space occupied by a geometrical set (here the impervious areas) not only at a single scale but across all scales. This fractal dimension is also compared to the ones computed on the representation of the catchments with the help of operational semi-distributed models. Fractal dimensions of the corresponding sewer systems are also computed and compared with values found in the literature for natural river networks. This methodology is tested on 7 pilot sites of the European NWE Interreg IV RainGain project located in France, Belgium, the Netherlands, the United Kingdom and Portugal. Results are compared across all the case studies, which exhibit different physical features (slope, level of urbanisation, population density, etc.).
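The fractal dimension of such a raster can be estimated by box counting: count the boxes that contain at least one occupied pixel at several box sizes, then fit the slope of log N against log(1/box size). The sketch below uses synthetic occupied-pixel sets (a filled square and a line) rather than real impervious-area rasters:

```python
import math

def box_count_dimension(cells, size, scales=(1, 2, 4, 8, 16)):
    """Estimate the box-counting dimension of a set of occupied pixels
    (e.g. impervious areas on a rasterized land-use map).

    cells: iterable of (x, y) occupied-pixel coordinates
    size:  edge length of the square raster in pixels (power of two)
    """
    xs, ys = [], []
    for k in scales:
        box = size // k                    # box edge length at this scale
        occupied = {(x // box, y // box) for x, y in cells}
        xs.append(math.log(k))             # log(1 / box size), up to a shift
        ys.append(math.log(len(occupied))) # log of occupied-box count
    # Least-squares slope of log N vs log k is the dimension estimate.
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    return (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
            / sum((x - xbar) ** 2 for x in xs))

# Sanity checks: a filled square has dimension 2, a straight line 1.
SIZE = 64
square = [(x, y) for x in range(SIZE) for y in range(SIZE)]
line = [(x, 10) for x in range(SIZE)]
```

Impervious-area patterns typically land strictly between these two extremes, and the departure of the fit from a straight line is itself a diagnostic of imperfect scale invariance.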
Design for Run-Time Monitor on Cloud Computing
NASA Astrophysics Data System (ADS)
Kang, Mikyung; Kang, Dong-In; Yun, Mira; Park, Gyung-Leen; Lee, Junghoon
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications, as well as infrastructure, as services over the Internet. A cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. Large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring the system status change, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design a Run-Time Monitor (RTM), which is system software to monitor application behavior at run-time, analyze the collected information, and optimize resources on cloud computing. RTM monitors application software through library instrumentation, as well as the underlying hardware through performance counters, optimizing its computing configuration based on the analyzed data.
Evaluating two process scale chromatography column header designs using CFD.
Johnson, Chris; Natarajan, Venkatesh; Antoniou, Chris
2014-01-01
Chromatography is an indispensable unit operation in the downstream processing of biomolecules. Scaling of chromatographic operations typically involves a significant increase in the column diameter. At this scale, the flow distribution within a packed bed could be severely affected by the distributor design in process scale columns. Different vendors offer process scale columns with varying design features. The effect of these design features on the flow distribution in packed beds, and the resultant effect on column efficiency and cleanability, needs to be properly understood in order to prevent unpleasant surprises on scale-up. Computational Fluid Dynamics (CFD) provides a cost-effective means to explore the effect of various distributor designs on process scale performance. In this work, we present a CFD tool that was developed and validated against experimental dye traces and tracer injections. Subsequently, the tool was employed to compare and contrast two commercially available header designs.
Access control and privacy in large distributed systems
NASA Technical Reports Server (NTRS)
Leiner, B. M.; Bishop, M.
1986-01-01
Large scale distributed systems consist of workstations, mainframe computers, supercomputers and other types of servers, all connected by a computer network. These systems are being used in a variety of applications, including the support of collaborative scientific research. In such an environment, issues of access control and privacy arise. Access control is required for several reasons, including the protection of sensitive resources and cost control. Privacy is also required for similar reasons, including the protection of a researcher's proprietary results. A possible architecture for integrating available computer and communications security technologies into a system that meets these requirements is described. This architecture is meant as a starting point for discussion, rather than the final answer.
Afshar, Yaser; Sbalzarini, Ivo F.
2016-01-01
Modern fluorescence microscopy modalities, such as light-sheet microscopy, are capable of acquiring large three-dimensional images at high data rates. This creates a bottleneck in computational processing and analysis of the acquired images, as the rate of acquisition outpaces the speed of processing. Moreover, images can be so large that they do not fit the main memory of a single computer. We address both issues by developing a distributed parallel algorithm for segmentation of large fluorescence microscopy images. The method is based on the versatile Discrete Region Competition algorithm, which has previously proven useful in microscopy image segmentation. The present distributed implementation decomposes the input image into smaller sub-images that are distributed across multiple computers. Using network communication, the computers collectively solve the global segmentation problem. This not only enables segmentation of large images (we test images of up to 10^10 pixels), but also accelerates segmentation to match the time scale of image acquisition. Such acquisition-rate image segmentation is a prerequisite for the smart microscopes of the future and enables online data compression and interactive experiments. PMID:27046144
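As a hedged illustration of the decomposition step described above (not the authors' Discrete Region Competition implementation), the following Python sketch splits an image into tiles with a one-pixel halo, segments each tile independently with a placeholder global threshold, and stitches the interiors back together; the tile size, halo width, and thresholding stand-in are all assumptions for illustration:

```python
import numpy as np

# Sketch of the image-decomposition step of a distributed segmentation:
# the input is split into sub-images with a one-pixel halo (ghost layer) so
# neighbouring tiles can see across their boundary, each tile is "segmented"
# independently, and the interiors are stitched back together. The per-tile
# thresholding below is a stand-in for the actual solver.

TILE, HALO = 16, 1

def tiles(shape, tile=TILE, halo=HALO):
    """Yield (interior slices, halo slices) covering a 2-D image."""
    h, w = shape
    for y0 in range(0, h, tile):
        for x0 in range(0, w, tile):
            inner = (slice(y0, min(y0 + tile, h)),
                     slice(x0, min(x0 + tile, w)))
            outer = (slice(max(y0 - halo, 0), min(y0 + tile + halo, h)),
                     slice(max(x0 - halo, 0), min(x0 + tile + halo, w)))
            yield inner, outer

def segment_tile(sub):
    return (sub > 0.5).astype(np.uint8)   # placeholder segmentation

rng = np.random.default_rng(3)
img = rng.random((64, 64))

out = np.zeros(img.shape, dtype=np.uint8)
for inner, outer in tiles(img.shape):
    seg = segment_tile(img[outer])        # each tile could run on its own node
    # copy back only the interior, dropping the halo
    iy = slice(inner[0].start - outer[0].start,
               inner[0].start - outer[0].start + inner[0].stop - inner[0].start)
    ix = slice(inner[1].start - outer[1].start,
               inner[1].start - outer[1].start + inner[1].stop - inner[1].start)
    out[inner] = seg[iy, ix]

# With a global threshold, tiled and untiled segmentations agree exactly
assert np.array_equal(out, (img > 0.5).astype(np.uint8))
```

Because the placeholder segmentation uses a global threshold, the stitched result matches the single-machine result exactly; the real algorithm additionally requires inter-tile communication to reconcile region labels across halos.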
Description of a MIL-STD-1553B Data Bus Ada Driver for the LeRC EPS Testbed
NASA Technical Reports Server (NTRS)
Mackin, Michael A.
1995-01-01
This document describes the software designed to provide communication between control computers in the NASA Lewis Research Center Electrical Power System Testbed using MIL-STD-1553B. The software drivers are coded in the Ada programming language and were developed on an MS-DOS-based computer workstation. The Electrical Power System (EPS) Testbed is a reduced-scale prototype space station electrical power system. The power system manages and distributes electrical power from the sources (batteries or photovoltaic arrays) to the end-user loads. The primary electrical system operates at 120 volts DC, and the secondary system operates at 28 volts DC. The devices which direct the flow of electrical power are controlled by a network of six control computers. Data and control messages are passed between the computers using the MIL-STD-1553B network. One of the computers, the Power Management Controller (PMC), controls the primary power distribution and another, the Load Management Controller (LMC), controls the secondary power distribution. Each of these computers communicates with two other computers which act as subsidiary controllers. These subsidiary controllers are, in turn, connected to the devices which directly control the flow of electrical power.
Jafari, G Reza; Sahimi, Muhammad; Rasaei, M Reza; Tabar, M Reza Rahimi
2011-02-01
Several methods have been developed in the past for analyzing the porosity and other types of well logs for large-scale porous media, such as oil reservoirs, as well as their permeability distributions. We developed a method for analyzing the porosity logs ϕ(h) (where h is the depth) and similar data that are often nonstationary stochastic series. In this method one first generates a new stationary series based on the original data, and then analyzes the resulting series. It is shown that the series based on the successive increments of the log y(h) = ϕ(h+δh) - ϕ(h) is a stationary and Markov process, characterized by a Markov length scale h_M. The coefficients of the Kramers-Moyal expansion for the conditional probability density function (PDF) P(y,h|y_0,h_0) are then computed. The resulting PDFs satisfy a Fokker-Planck (FP) equation, which is equivalent to a Langevin equation for y(h) that provides probabilistic predictions for the porosity logs. We also show that the Hurst exponent H of the self-affine distributions, which have been used in the past to describe the porosity logs, is directly linked to the drift and diffusion coefficients that we compute for the FP equation. Also computed are the level-crossing probabilities that provide insight into identifying the high or low values of the porosity beyond the depth interval in which the data have been measured. ©2011 American Physical Society
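The drift and diffusion coefficients mentioned above can be estimated from data via the first two conditional moments of the increments. The following sketch applies this standard Kramers-Moyal estimation to a synthetic Ornstein-Uhlenbeck series (an assumption standing in for the porosity-log increments; the parameters, bin counts, and step size are arbitrary choices, not values from the paper):

```python
import numpy as np

# Sketch of Kramers-Moyal coefficient estimation from an increment series,
# as used in a Fokker-Planck analysis of stationary series. The synthetic
# Ornstein-Uhlenbeck process below stands in for y(h); theta, sigma, and dh
# are arbitrary illustration values.

rng = np.random.default_rng(0)
n, dh = 200_000, 0.01
theta, sigma = 1.0, 0.5

# Euler-Maruyama simulation of dy = -theta*y*dh + sigma*dW
noise = sigma * np.sqrt(dh) * rng.standard_normal(n)
y = np.empty(n)
y[0] = 0.0
for i in range(1, n):
    y[i] = y[i - 1] * (1.0 - theta * dh) + noise[i]

def kramers_moyal(series, dh, nbins=20, min_count=100):
    """Estimate drift D1(y) and diffusion D2(y) from conditional moments."""
    dy = np.diff(series)
    edges = np.linspace(series.min(), series.max(), nbins + 1)
    centers, d1, d2 = [], [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (series[:-1] >= lo) & (series[:-1] < hi)
        if mask.sum() < min_count:
            continue  # skip poorly populated bins
        centers.append(0.5 * (lo + hi))
        d1.append(dy[mask].mean() / dh)               # D1 = <dy | y> / dh
        d2.append((dy[mask] ** 2).mean() / (2 * dh))  # D2 = <dy^2 | y> / (2 dh)
    return np.array(centers), np.array(d1), np.array(d2)

centers, d1, d2 = kramers_moyal(y, dh)
# For an OU process the drift is linear, D1(y) = -theta*y
slope = np.polyfit(centers, d1, 1)[0]
```

For the OU test process the estimated drift should recover a slope near -theta, and the diffusion estimate should be approximately sigma^2/2 across bins.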
Spiking network simulation code for petascale computers.
Kunkel, Susanne; Schmidt, Maximilian; Eppler, Jochen M; Plesser, Hans E; Masumoto, Gen; Igarashi, Jun; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus; Helias, Moritz
2014-01-01
Brain-scale networks exhibit a breathtaking heterogeneity in the dynamical properties and parameters of their constituents. At cellular resolution, the entities of theory are neurons and synapses and over the past decade researchers have learned to manage the heterogeneity of neurons and synapses with efficient data structures. Already early parallel simulation codes stored synapses in a distributed fashion such that a synapse solely consumes memory on the compute node harboring the target neuron. As petaflop computers with some 100,000 nodes become increasingly available for neuroscience, new challenges arise for neuronal network simulation software: Each neuron contacts on the order of 10,000 other neurons and thus has targets only on a fraction of all compute nodes; furthermore, for any given source neuron, at most a single synapse is typically created on any compute node. From the viewpoint of an individual compute node, the heterogeneity in the synaptic target lists thus collapses along two dimensions: the dimension of the types of synapses and the dimension of the number of synapses of a given type. Here we present a data structure taking advantage of this double collapse using metaprogramming techniques. After introducing the relevant scaling scenario for brain-scale simulations, we quantitatively discuss the performance on two supercomputers. We show that the novel architecture scales to the largest petascale supercomputers available today.
Reduced-Order Biogeochemical Flux Model for High-Resolution Multi-Scale Biophysical Simulations
NASA Astrophysics Data System (ADS)
Smith, Katherine; Hamlington, Peter; Pinardi, Nadia; Zavatarelli, Marco
2017-04-01
Biogeochemical tracers and their interactions with upper ocean physical processes such as submesoscale circulations and small-scale turbulence are critical for understanding the role of the ocean in the global carbon cycle. These interactions can cause small-scale spatial and temporal heterogeneity in tracer distributions that can, in turn, greatly affect carbon exchange rates between the atmosphere and interior ocean. For this reason, it is important to take into account small-scale biophysical interactions when modeling the global carbon cycle. However, explicitly resolving these interactions in an earth system model (ESM) is currently infeasible due to the enormous associated computational cost. As a result, understanding and subsequently parameterizing how these small-scale heterogeneous distributions develop and how they relate to larger resolved scales is critical for obtaining improved predictions of carbon exchange rates in ESMs. In order to address this need, we have developed the reduced-order, 17 state variable Biogeochemical Flux Model (BFM-17) that follows the chemical functional group approach, which allows for non-Redfield stoichiometric ratios and the exchange of matter through units of carbon, nitrate, and phosphate. This model captures the behavior of open-ocean biogeochemical systems without substantially increasing computational cost, thus allowing the model to be combined with computationally-intensive, fully three-dimensional, non-hydrostatic large eddy simulations (LES). In this talk, we couple BFM-17 with the Princeton Ocean Model and show good agreement between predicted monthly-averaged results and Bermuda testbed area field data (including the Bermuda-Atlantic Time-series Study and Bermuda Testbed Mooring). Through these tests, we demonstrate the capability of BFM-17 to accurately model open-ocean biochemistry. Additionally, we discuss the use of BFM-17 within a multi-scale LES framework and outline how this will further our understanding of turbulent biophysical interactions in the upper ocean.
Reduced-Order Biogeochemical Flux Model for High-Resolution Multi-Scale Biophysical Simulations
NASA Astrophysics Data System (ADS)
Smith, K.; Hamlington, P.; Pinardi, N.; Zavatarelli, M.; Milliff, R. F.
2016-12-01
Biogeochemical tracers and their interactions with upper ocean physical processes such as submesoscale circulations and small-scale turbulence are critical for understanding the role of the ocean in the global carbon cycle. These interactions can cause small-scale spatial and temporal heterogeneity in tracer distributions which can, in turn, greatly affect carbon exchange rates between the atmosphere and interior ocean. For this reason, it is important to take into account small-scale biophysical interactions when modeling the global carbon cycle. However, explicitly resolving these interactions in an earth system model (ESM) is currently infeasible due to the enormous associated computational cost. As a result, understanding and subsequently parametrizing how these small-scale heterogeneous distributions develop and how they relate to larger resolved scales is critical for obtaining improved predictions of carbon exchange rates in ESMs. In order to address this need, we have developed the reduced-order, 17 state variable Biogeochemical Flux Model (BFM-17). This model captures the behavior of open-ocean biogeochemical systems without substantially increasing computational cost, thus allowing the model to be combined with computationally-intensive, fully three-dimensional, non-hydrostatic large eddy simulations (LES). In this talk, we couple BFM-17 with the Princeton Ocean Model and show good agreement between predicted monthly-averaged results and Bermuda testbed area field data (including the Bermuda-Atlantic Time Series and Bermuda Testbed Mooring). Through these tests, we demonstrate the capability of BFM-17 to accurately model open-ocean biochemistry. Additionally, we discuss the use of BFM-17 within a multi-scale LES framework and outline how this will further our understanding of turbulent biophysical interactions in the upper ocean.
Exact Extremal Statistics in the Classical 1D Coulomb Gas
NASA Astrophysics Data System (ADS)
Dhar, Abhishek; Kundu, Anupam; Majumdar, Satya N.; Sabhapandit, Sanjib; Schehr, Grégory
2017-08-01
We consider a one-dimensional classical Coulomb gas of N like charges in a harmonic potential, also known as the one-dimensional one-component plasma. We compute, analytically, the probability distribution of the position x_max of the rightmost charge in the limit of large N. We show that the typical fluctuations of x_max around its mean are described by a nontrivial scaling function, with asymmetric tails. This distribution is different from the Tracy-Widom distribution of x_max for Dyson's log gas. We also compute the large deviation functions of x_max explicitly and show that the system exhibits a third-order phase transition, as in the log gas. Our theoretical predictions are verified numerically.
NASA Astrophysics Data System (ADS)
Forrester, Peter J.; Trinh, Allan K.
2018-05-01
The neighbourhood of the largest eigenvalue λ_max in the Gaussian unitary ensemble (GUE) and Laguerre unitary ensemble (LUE) is referred to as the soft edge. It is known that there exists a particular centring and scaling such that the distribution of λ_max tends to a universal form, with an error term bounded by 1/N^(2/3). We take up the problem of computing the exact functional form of the leading error term in a large N asymptotic expansion for both the GUE and LUE; two versions of the LUE are considered, one with the parameter a fixed and the other with a proportional to N. Both settings in the LUE case allow for an interpretation in terms of the distribution of a particular weighted path length in a model involving exponential variables on a rectangular grid, as the grid size gets large. We give operator theoretic forms of the corrections, which are corollaries of knowledge of the first two terms in the large N expansion of the scaled kernel and are readily computed using a method due to Bornemann. We also give expressions in terms of the solutions of particular systems of coupled differential equations, which provide an alternative method of computation. Both characterisations are well suited to a thinned generalisation of the original ensemble, whereby each eigenvalue is deleted independently with probability (1 - ξ). In Sec. V, we investigate using simulation the question of whether, upon an appropriate centring and scaling, a wider class of complex Hermitian random matrix ensembles have their leading correction to the distribution of λ_max proportional to 1/N^(2/3).
Potjans, Wiebke; Morrison, Abigail; Diesmann, Markus
2010-01-01
A major puzzle in the field of computational neuroscience is how to relate system-level learning in higher organisms to synaptic plasticity. Recently, plasticity rules depending not only on pre- and post-synaptic activity but also on a third, non-local neuromodulatory signal have emerged as key candidates to bridge the gap between the macroscopic and the microscopic level of learning. Crucial insights into this topic are expected to be gained from simulations of neural systems, as these allow the simultaneous study of the multiple spatial and temporal scales that are involved in the problem. In particular, synaptic plasticity can be studied during the whole learning process, i.e., on a time scale of minutes to hours and across multiple brain areas. Implementing neuromodulated plasticity in large-scale network simulations where the neuromodulatory signal is dynamically generated by the network itself is challenging, because the network structure is commonly defined purely by the connectivity graph without explicit reference to the embedding of the nodes in physical space. Furthermore, the simulation of networks with realistic connectivity entails the use of distributed computing. A neuromodulated synapse must therefore be informed in an efficient way about the neuromodulatory signal, which is typically generated by a population of neurons located on different machines than either the pre- or post-synaptic neuron. Here, we develop a general framework to solve the problem of implementing neuromodulated plasticity in a time-driven distributed simulation, without reference to a particular implementation language, neuromodulator, or neuromodulated plasticity mechanism. We implement our framework in the simulator NEST and demonstrate excellent scaling up to 1024 processors for simulations of a recurrent network incorporating neuromodulated spike-timing dependent plasticity. PMID:21151370
MultiPhyl: a high-throughput phylogenomics webserver using distributed computing
Keane, Thomas M.; Naughton, Thomas J.; McInerney, James O.
2007-01-01
With the number of fully sequenced genomes increasing steadily, there is greater interest in performing large-scale phylogenomic analyses from large numbers of individual gene families. Maximum likelihood (ML) has been shown repeatedly to be one of the most accurate methods for phylogenetic construction. Recently, there have been a number of algorithmic improvements in maximum-likelihood-based tree search methods. However, it can still take a long time to analyse the evolutionary history of many gene families using a single computer. Distributed computing refers to a method of combining the computing power of multiple computers in order to perform some larger overall calculation. In this article, we present the first high-throughput implementation of a distributed phylogenetics platform, MultiPhyl, capable of using the idle computational resources of many heterogeneous non-dedicated machines to form a phylogenetics supercomputer. MultiPhyl allows a user to upload hundreds or thousands of amino acid or nucleotide alignments simultaneously and perform computationally intensive tasks such as model selection, tree searching and bootstrapping of each of the alignments using many desktop machines. The program implements a set of 88 amino acid models and 56 nucleotide maximum likelihood models and a variety of statistical methods for choosing between alternative models. A MultiPhyl webserver is available for public use at: http://www.cs.nuim.ie/distributed/multiphyl.php. PMID:17553837
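The per-alignment model-selection step can be illustrated with a minimal information-criterion comparison. The log-likelihoods and parameter counts below are invented placeholders, not real model fits, and the AIC rule shown is only one of the several statistical criteria such a pipeline might use:

```python
# Hedged sketch of likelihood-based model selection as performed per-alignment
# by a phylogenetics pipeline: among candidate substitution models, pick the
# one minimising AIC = 2k - 2 ln L. The numbers are invented placeholders.

def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood

candidates = {
    "JC69":  (-1523.4, 0),   # (ln L, free rate/frequency parameters)
    "HKY85": (-1490.1, 4),
    "GTR+G": (-1481.7, 9),
}

scores = {name: aic(lnL, k) for name, (lnL, k) in candidates.items()}
best = min(scores, key=scores.get)
```

Richer models always raise the likelihood; the penalty term 2k is what lets a criterion like AIC reject over-parameterised models when the likelihood gain is too small.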
HEPCloud, a New Paradigm for HEP Facilities: CMS Amazon Web Services Investigation
Holzman, Burt; Bauerdick, Lothar A. T.; Bockelman, Brian; ...
2017-09-29
Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be able to be flexibly deployed for a variety of computing tasks. There is a growing interest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized both local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. Additionally, we will discuss the economics and operational efficiencies when executing workflows both in the cloud and on dedicated resources.
NASA Technical Reports Server (NTRS)
Blair, M. F.
1991-01-01
A combined experimental and computational program was conducted to examine the heat transfer distribution in a turbine rotor passage geometrically similar to the Space Shuttle Main Engine (SSME) High Pressure Fuel Turbopump (HPFTP). Heat transfer was measured and computed for both the full-span suction and pressure surfaces of the rotor airfoil as well as for the hub endwall surface. The objective of the program was to provide a benchmark-quality database for the assessment of rotor heat transfer computational techniques. The experimental portion of the study was conducted in a large scale, ambient temperature, rotating turbine model. The computational portion consisted of the application of a well-posed parabolized Navier-Stokes analysis to the calculation of the three-dimensional viscous flow through ducts simulating a gas turbine passage. The results of this assessment indicate that the procedure has the potential to predict the aerodynamics and the heat transfer in a gas turbine passage and can be used to develop detailed three-dimensional turbulence models for the prediction of skin friction and heat transfer in complex three-dimensional flow passages.
Multiscale modeling and distributed computing to predict cosmesis outcome after a lumpectomy
NASA Astrophysics Data System (ADS)
Garbey, M.; Salmon, R.; Thanoon, D.; Bass, B. L.
2013-07-01
Surgery for early stage breast carcinoma is either total mastectomy (complete breast removal) or surgical lumpectomy (only tumor removal). The lumpectomy or partial mastectomy is intended to preserve a breast that satisfies the woman's cosmetic, emotional and physical needs. But in a fairly large number of cases the cosmetic outcome is not satisfactory. Today, predicting that surgery outcome is essentially based on heuristics. Modeling such a complex process must encompass multiple scales, in space from cells to tissue, as well as in time, from minutes for the tissue mechanics to months for healing. The goal of this paper is to present a first step in multiscale modeling of the long time scale prediction of breast shape after tumor resection. This task requires coupling very different mechanical and biological models with very different computing needs. We provide a simple illustration of the application of heterogeneous distributed computing and modular software design to speed up the model development. Our computational framework currently serves to test hypotheses on breast tissue healing in a pilot study with women who have elected to undergo breast conservation therapy (BCT) and are being treated at the Methodist Hospital in Houston, TX.
Parallel Tensor Compression for Large-Scale Scientific Data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kolda, Tamara G.; Ballard, Grey; Austin, Woody Nathan
As parallel computing trends towards the exascale, scientific data produced by high-fidelity simulations are growing increasingly massive. For instance, a simulation on a three-dimensional spatial grid with 512 points per dimension that tracks 64 variables per grid point for 128 time steps yields 8 TB of data. By viewing the data as a dense five-way tensor, we can compute a Tucker decomposition to find inherent low-dimensional multilinear structure, achieving compression ratios of up to 10000 on real-world data sets with negligible loss in accuracy. So that we can operate on such massive data, we present the first-ever distributed memory parallel implementation for the Tucker decomposition, whose key computations correspond to parallel linear algebra operations, albeit with nonstandard data layouts. Our approach specifies a data distribution for tensors that avoids any tensor data redistribution, either locally or in parallel. We provide accompanying analysis of the computation and communication costs of the algorithms. To demonstrate the compression and accuracy of the method, we apply our approach to real-world data sets from combustion science simulations. We also provide detailed performance results, including parallel performance in both weak and strong scaling experiments.
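The Tucker decomposition at the heart of the method can be sketched on a single node with a truncated higher-order SVD (HOSVD); this is not the paper's distributed-memory implementation, and the tensor sizes and ranks below are arbitrary assumptions for illustration:

```python
import numpy as np

# Minimal HOSVD-style Tucker compression sketch: factor matrices come from
# truncated SVDs of the mode unfoldings, and the core is the projection of
# the tensor onto those factors. Single-node illustration only.

def unfold(t, mode):
    """Mode-n unfolding of a tensor into a matrix."""
    return np.moveaxis(t, mode, 0).reshape(t.shape[mode], -1)

def tucker_hosvd(t, ranks):
    """Truncated HOSVD: per-mode factor matrices, then the core tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(t, mode), full_matrices=False)
        factors.append(u[:, :r])
    core = t
    for mode, u in enumerate(factors):
        core = np.moveaxis(
            np.tensordot(u.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

def reconstruct(core, factors):
    t = core
    for mode, u in enumerate(factors):
        t = np.moveaxis(
            np.tensordot(u, np.moveaxis(t, mode, 0), axes=1), 0, mode)
    return t

rng = np.random.default_rng(1)
# Low-multilinear-rank test tensor: exactly representable at ranks (3, 3, 3)
g = rng.standard_normal((3, 3, 3))
a, b, c = (rng.standard_normal((20, 3)) for _ in range(3))
t = np.einsum('ijk,ai,bj,ck->abc', g, a, b, c)

core, factors = tucker_hosvd(t, (3, 3, 3))
err = np.linalg.norm(t - reconstruct(core, factors)) / np.linalg.norm(t)
```

For this 20x20x20 tensor, storage drops from 8000 values to a 3x3x3 core plus three 20x3 factors (207 values), with machine-precision reconstruction error because the test tensor has exact multilinear rank (3, 3, 3).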
Performance of the Heavy Flavor Tracker (HFT) detector in star experiment at RHIC
NASA Astrophysics Data System (ADS)
Alruwaili, Manal
With the growing technology, the number of processors is becoming massive. Current supercomputer processing will be available on desktops in the next decade. For mass scale application software development on the massive parallel computing available on desktops, existing popular languages with large libraries have to be augmented with new constructs and paradigms that exploit massive parallel computing and distributed memory models while retaining user-friendliness. Currently available object oriented languages for massive parallel computing, such as Chapel, X10 and UPC++, exploit distributed computing, data parallel computing and thread-parallelism at the process level in the PGAS (Partitioned Global Address Space) memory model. However, they do not incorporate: 1) extensions for object distribution to exploit the PGAS model; 2) the flexibility of migrating or cloning an object between places for load balancing; and 3) the programming paradigms that result from integrating data- and thread-level parallelism with object distribution. In the proposed thesis, I compare different languages in the PGAS model; propose new constructs that extend C++ with object distribution, object cloning, and object migration; and integrate PGAS-based process constructs with these extensions on distributed objects. A new paradigm, MIDD (Multiple Invocation Distributed Data), is also presented, in which different copies of the same class can be invoked and work on different elements of distributed data concurrently using remote method invocations. I present the new constructs, their grammar and their behavior.
NASA Astrophysics Data System (ADS)
Iwasawa, Masaki; Tanikawa, Ataru; Hosono, Natsuki; Nitadori, Keigo; Muranushi, Takayuki; Makino, Junichiro
2016-08-01
We present the basic idea, implementation, measured performance, and performance model of FDPS (Framework for Developing Particle Simulators). FDPS is an application-development framework which helps researchers to develop simulation programs using particle methods for large-scale distributed-memory parallel supercomputers. A particle-based simulation program for distributed-memory parallel computers needs to perform domain decomposition, exchange of particles which are not in the domain of each computing node, and gathering of the particle information from other nodes which is necessary for the interaction calculation. Also, even if distributed-memory parallel computers are not used, in order to reduce the amount of computation, algorithms such as the Barnes-Hut tree algorithm or the Fast Multipole Method should be used in the case of long-range interactions. For short-range interactions, some methods to limit the calculation to neighbor particles are required. FDPS provides all of these functions, which are necessary for efficient parallel execution of particle-based simulations, as "templates," which are independent of the actual data structure of particles and the functional form of the particle-particle interaction. By using FDPS, researchers can write their programs with the amount of work necessary to write a simple, sequential and unoptimized program of O(N^2) calculation cost, and yet the program, once compiled with FDPS, will run efficiently on large-scale parallel supercomputers. A simple gravitational N-body program can be written in around 120 lines. We report the actual performance of these programs and the performance model. The weak scaling performance is very good, and almost linear speed-up was obtained for up to the full system of the K computer. The minimum calculation time per timestep is in the range of 30 ms (N = 10^7) to 300 ms (N = 10^9). These are currently limited by the time for the calculation of the domain decomposition and the communication necessary for the interaction calculation. We discuss how we can overcome these bottlenecks.
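The simple, unoptimized O(N^2) baseline that a framework like FDPS lets researchers start from can be sketched as a direct-summation gravity kernel; the softening length, units (G = 1), and particle count are assumptions for illustration, and the framework would supply the domain decomposition and tree algorithms around it:

```python
import numpy as np

# A deliberately simple O(N^2) direct-summation gravity kernel: every
# particle interacts with every other. A softening length eps avoids the
# singularity at zero separation.

def accelerations(pos, mass, eps=1e-3):
    """Pairwise gravitational accelerations, G = 1."""
    # pos: (N, 3), mass: (N,)
    diff = pos[None, :, :] - pos[:, None, :]          # r_j - r_i
    dist2 = (diff ** 2).sum(-1) + eps ** 2
    inv_r3 = dist2 ** -1.5
    np.fill_diagonal(inv_r3, 0.0)                     # no self-interaction
    return (diff * (mass[None, :, None] * inv_r3[:, :, None])).sum(axis=1)

rng = np.random.default_rng(2)
n = 64
pos = rng.standard_normal((n, 3))
mass = np.full(n, 1.0 / n)

acc = accelerations(pos, mass)
# Newton's third law: the total force (sum of m_i a_i) should vanish
total_force = (mass[:, None] * acc).sum(axis=0)
```

Pairwise antisymmetry of the softened force makes the total force cancel to floating-point precision, a cheap sanity check that also holds for tree-code approximations only approximately.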
NASA Astrophysics Data System (ADS)
Karhula, Sakari S.; Finnilä, Mikko A.; Freedman, Jonathan D.; Kauppinen, Sami; Valkealahti, Maarit; Lehenkari, Petri; Pritzker, Kenneth P. H.; Nieminen, Heikki J.; Snyder, Brian D.; Grinstaff, Mark W.; Saarakkala, Simo
2017-08-01
Contrast-enhanced micro-computed tomography (CEµCT) with cationic and anionic contrast agents reveals glycosaminoglycan (GAG) content and distribution in articular cartilage (AC). The advantage of using cationic stains (e.g. CA4+) compared to anionic stains (e.g. Hexabrix®) is that the former distribute proportionally with GAGs, while anionic stain distribution in AC is inversely proportional to the GAG content. To date, studies using cationic stains have been conducted with sufficient resolution to study their distributions on the macro-scale, but with insufficient resolution to study their distributions on the micro-scale. Therefore, it is not known whether the cationic contrast agents accumulate in the extra/pericellular matrix and whether they interact with chondrocytes. The insufficient resolution has also prevented answering the question of whether CA4+ accumulation in chondrons could lead to an erroneous quantification of GAG distribution with low-resolution µCT setups. In this study, we use high-resolution µCT to investigate whether CA4+ accumulates in chondrocytes and, further, to determine whether it affects the low-resolution ex vivo µCT studies of CA4+-stained human AC with varying degrees of osteoarthritis. Human osteochondral samples were immersed in three different concentrations of CA4+ (3 mgI/ml, 6 mgI/ml, and 24 mgI/ml) and imaged with high-resolution µCT at several time points. Different uptake diffusion profiles of CA4+ were observed between the segmented chondrons and the rest of the tissue. While the X-ray-detected CA4+ concentration in chondrons was greater than in the rest of the AC, its contribution to the uptake into the whole tissue was negligible and in line with the macro-scale GAG content detected from histology. The efficient uptake of CA4+ into chondrons and the surrounding territorial matrix can be explained by the micro-scale distribution of GAG content. CA4+ uptake in chondrons occurred regardless of the progression stage of osteoarthritis in the samples, and the relative difference between the interterritorial matrix and the segmented chondron area was less than 4%. To conclude, our results suggest that GAG quantification with CEµCT is not affected by the chondron uptake of CA4+. This further confirms the use of CA4+ for macro-scale assessment of GAG throughout the AC and highlights the capability of studying chondron properties in 3D at the micro scale.
Squid - a simple bioinformatics grid.
Carvalho, Paulo C; Glória, Rafael V; de Miranda, Antonio B; Degrave, Wim M
2005-08-03
BLAST is a widely used genetic research tool for analyzing similarity between nucleotide and protein sequences. This paper presents a software application entitled "Squid" that makes use of grid technology. The current version, as an example, is configured for BLAST applications, but adaptation to other computing-intensive repetitive tasks can be easily accomplished in the open-source version. This enables the allocation of remote resources to perform distributed computing, making large BLAST queries viable without the need for high-end computers. Most distributed computing / grid solutions have complex installation procedures requiring a computer specialist, or have limitations regarding operating systems. Squid is a multi-platform, open-source program designed to "keep things simple" while offering high-end computing power for large-scale applications. Squid also has an efficient fault-tolerance and crash-recovery system against data loss, being able to re-route jobs upon node failure and recover even if the master machine fails. Our results show that a Squid application, working with N nodes and proper network resources, can process BLAST queries almost N times faster than a single computer. Squid offers high-end computing, even for the non-specialist, and is freely available at the project web site. Its open-source and binary Windows distributions contain detailed instructions and a "plug-and-play" installation with a pre-configured example.
Scale effect challenges in urban hydrology highlighted with a distributed hydrological model
NASA Astrophysics Data System (ADS)
Ichiba, Abdellah; Gires, Auguste; Tchiguirinskaia, Ioulia; Schertzer, Daniel; Bompard, Philippe; Ten Veldhuis, Marie-Claire
2018-01-01
Hydrological models are extensively used in urban water management, in the development and evaluation of future scenarios, and in research activities. There is growing interest in the development of fully distributed, grid-based models. However, some complex questions related to scale effects are not yet fully understood and remain open issues in urban hydrology. In this paper we propose a two-step investigation framework to illustrate the extent of scale effects in urban hydrology. First, fractal tools are used to highlight the scale dependence observed within distributed data input into urban hydrological models. Then an intensive multi-scale modelling exercise is carried out to understand scale effects on hydrological model performance. Investigations are conducted using a fully distributed and physically based model, Multi-Hydro, developed at Ecole des Ponts ParisTech. The model is implemented at 17 spatial resolutions ranging from 100 m to 5 m. Results clearly exhibit scale-effect challenges in urban hydrological modelling. The applicability of fractal concepts highlights the scale dependence observed within distributed data: patterns of geophysical data change when the size of the observation pixel changes. The multi-scale modelling investigation confirms scale effects on hydrological model performance. Results are analysed over three ranges of scales identified in the fractal analysis and confirmed through modelling. This work also discusses remaining issues in urban hydrological modelling related to the availability of high-quality data at high resolutions, model numerical instabilities, and computation-time requirements. The main findings of this paper support replacing traditional methods of model calibration with innovative methods of model resolution alteration based on spatial data variability and the scaling of flows in urban hydrology.
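The scale dependence that the fractal analysis above detects in distributed input data can be sketched with a box-counting dimension estimate. The `box_counting_dimension` helper and the synthetic grid data below are illustrative assumptions, not the Multi-Hydro tooling:

```python
import math

def box_counting_dimension(points, sizes):
    """Estimate the fractal (box-counting) dimension of 2-D point data.

    For each box size, count occupied boxes; the dimension is the slope of
    log(count) versus log(1/size), fitted by least squares.
    """
    logs = []
    for s in sizes:
        occupied = {(int(x // s), int(y // s)) for x, y in points}
        logs.append((math.log(1.0 / s), math.log(len(occupied))))
    n = len(logs)
    mx = sum(x for x, _ in logs) / n
    my = sum(y for _, y in logs) / n
    num = sum((x - mx) * (y - my) for x, y in logs)
    den = sum((x - mx) ** 2 for x, _ in logs)
    return num / den  # least-squares slope

# A filled unit square sampled on a fine grid has dimension ~2;
# fractal land-use patterns would give a non-integer slope.
grid = [(i / 100.0, j / 100.0) for i in range(100) for j in range(100)]
d = box_counting_dimension(grid, sizes=[0.5, 0.25, 0.125, 0.0625])
print(round(d, 2))  # → 2.0
```

A pixel map of, say, impervious surfaces would be analyzed the same way, with the slope deviating from 2 when the pattern is scale-dependent.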
Singh, Dadabhai T; Trehan, Rahul; Schmidt, Bertil; Bretschneider, Timo
2008-01-01
Preparedness for a possible global pandemic caused by viruses such as the highly pathogenic influenza A subtype H5N1 has become a global priority. In particular, it is critical to monitor the appearance of any new emerging subtypes. Comparative phyloinformatics can be used to monitor, analyze, and possibly predict the evolution of viruses. However, in order to utilize the full functionality of available analysis packages for large-scale phyloinformatics studies, a team of computer scientists, biostatisticians, and virologists is needed--a requirement which cannot be fulfilled in many cases. Furthermore, the time complexities of many of the algorithms involved lead to prohibitive runtimes on sequential computer platforms. This has so far hindered the use of comparative phyloinformatics as a commonly applied tool in this area. In this paper the graphical workflow design system called Quascade and its efficient usage for comparative phyloinformatics are presented. In particular, we focus on how this task can be performed effectively in a distributed computing environment. As a proof of concept, the designed workflows are used for the phylogenetic analysis of the neuraminidase of H5N1 isolates (micro level) and of influenza viruses (macro level). The results of this paper are hence twofold. First, it demonstrates the usefulness of a graphical user interface system for designing and executing complex distributed workflows for large-scale phyloinformatics studies of virus genes. Second, the analysis of neuraminidase at different levels of complexity provides valuable insights into this virus's tendency toward geographically based clustering in the phylogenetic tree, and shows the importance of glycan sites in its molecular evolution. The current study demonstrates the efficiency and utility of workflow systems in providing a biologist-friendly approach to complex biological dataset analysis using high-performance computing.
In particular, the utility of the Quascade platform for deploying distributed and parallelized versions of a variety of computationally intensive phylogenetic algorithms has been shown. Moreover, the analysis of the H5N1 neuraminidase datasets at macro and micro levels has clearly indicated a pattern of spatial clustering of the H5N1 viral isolates based on geographical distribution rather than temporal or host-range-based clustering.
NASA Astrophysics Data System (ADS)
Lynch, Amanda H.; Abramson, David; Görgen, Klaus; Beringer, Jason; Uotila, Petteri
2007-10-01
Fires in the Australian savanna have been hypothesized to affect monsoon evolution, but the hypothesis is controversial and the effects have not been quantified. A distributed computing approach allows the development of a challenging experimental design that permits simultaneous variation of all fire attributes. The climate model simulations are distributed around multiple independent computer clusters in six countries, an approach that has potential for a range of other large simulation applications in the earth sciences. The experiment clarifies that savanna burning can shape the monsoon through two mechanisms. Boundary-layer circulation and large-scale convergence is intensified monotonically through increasing fire intensity and area burned. However, thresholds of fire timing and area are evident in the consequent influence on monsoon rainfall. In the optimal band of late, high intensity fires with a somewhat limited extent, it is possible for the wet season to be significantly enhanced.
Incorporation of the TIP4P water model into a continuum solvent for computing solvation free energy
NASA Astrophysics Data System (ADS)
Yang, Pei-Kun
2014-10-01
The continuum solvent model is one of the commonly used strategies for computing solvation free energy, especially for large-scale conformational transitions such as protein folding or for calculating the binding affinity of protein-protein/ligand interactions. However, the dielectric polarization used for computing solvation free energy from the continuum solvent differs from that obtained in molecular dynamics simulations. To mimic the dielectric polarization surrounding a solute in molecular dynamics simulations, the first-shell water molecules were modeled using the charge distribution of TIP4P in a hard sphere; the time-averaged charge distribution of the first-shell water molecules was estimated based on the coordination number of the solute, and the orientation distributions of the first-shell and intermediate water molecules were treated as those of a bulk solvent. Based on this strategy, an equation describing the solvation free energy of ions was derived.
Multi-Scale Computational Modeling of Two-Phased Metal Using GMC Method
NASA Technical Reports Server (NTRS)
Moghaddam, Masoud Ghorbani; Achuthan, A.; Bednarcyk, B. A.; Arnold, S. M.; Pineda, E. J.
2014-01-01
A multi-scale computational model for determining plastic behavior in two-phased CMSX-4 Ni-based superalloys is developed in a finite element analysis (FEA) framework employing a crystal plasticity constitutive model that can capture the microstructural-scale stress field. The generalized method of cells (GMC) micromechanics model is used for homogenizing the local field quantities. First, stand-alone GMC is validated by analyzing a repeating unit cell (RUC) as a two-phased sample with a 72.9% volume fraction of gamma'-precipitate in the gamma-matrix phase and comparing the results with those predicted by FEA models incorporating the same crystal plasticity constitutive model. The global stress-strain behavior and the local field quantity distributions predicted by GMC demonstrate good agreement with FEA. Large computational savings, at the expense of some accuracy in the components of the local tensor field quantities, are obtained with GMC. Finally, the capability of the developed multi-scale model linking FEA and GMC to solve structures of realistic size is demonstrated by analyzing an engine disc component and determining the microstructural-scale details of the field quantities.
Fast Decentralized Averaging via Multi-scale Gossip
NASA Astrophysics Data System (ADS)
Tsianos, Konstantinos I.; Rabbat, Michael G.
We are interested in the problem of computing the average consensus in a distributed fashion on random geometric graphs. We describe a new algorithm called Multi-scale Gossip which employs a hierarchical decomposition of the graph to partition the computation into tractable sub-problems. Using only pairwise messages of fixed size that travel at most O(n^{1/3}) hops, our algorithm is robust and has a communication cost of O(n log log n log ε^{-1}) transmissions, which is order-optimal up to the logarithmic factor in n. Simulated experiments verify the good expected performance on graphs of many thousands of nodes.
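The pairwise-averaging primitive that gossip algorithms build on can be sketched in a few lines; this omits the paper's hierarchical decomposition, and `pairwise_gossip` is an illustrative helper, not the Multi-scale Gossip algorithm itself:

```python
import random

def pairwise_gossip(values, rounds, seed=0):
    """Randomized gossip: at each step two random nodes replace their
    values with the pair's mean. The global average is conserved and
    all node values converge toward it."""
    rng = random.Random(seed)
    x = list(values)
    n = len(x)
    for _ in range(rounds):
        i, j = rng.sample(range(n), 2)   # a random communicating pair
        m = (x[i] + x[j]) / 2.0
        x[i] = x[j] = m
    return x

vals = [float(k) for k in range(10)]    # true average is 4.5
out = pairwise_gossip(vals, rounds=500)
print(round(sum(out) / len(out), 6))    # → 4.5 (average is conserved)
```

Multi-scale Gossip improves on this plain scheme by running such exchanges within a hierarchy of subgraphs, which is what yields the near-linear message complexity quoted above.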
Cosmic Reionization On Computers III. The Clumping Factor
Kaurov, Alexander A.; Gnedin, Nickolay Y.
2015-09-09
We use fully self-consistent numerical simulations of cosmic reionization, completed under the Cosmic Reionization On Computers project, to explore how well the recombinations in the ionized intergalactic medium (IGM) can be quantified by the effective "clumping factor." The density distribution in the simulations (and, presumably, in a real universe) is highly inhomogeneous and more-or-less smoothly varying in space. However, even in highly complex and dynamic environments, the concept of the IGM remains reasonably well-defined; the largest ambiguity comes from the unvirialized regions around galaxies that are over-ionized by the local enhancement in the radiation field ("proximity zones"). This ambiguity precludes computing the IGM clumping factor to better than about 20%. Furthermore, we discuss a "local clumping factor," defined over a particular spatial scale, and quantify its scatter on a given scale and its variation as a function of scale.
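The effective clumping factor discussed above is conventionally the ratio C = ⟨ρ²⟩/⟨ρ⟩² over the IGM; a minimal sketch on a sampled density field (the values are illustrative, not simulation data):

```python
def clumping_factor(density):
    """Effective clumping factor C = <rho^2> / <rho>^2 of a sampled
    density field: C = 1 for a uniform medium and grows with
    inhomogeneity, boosting the effective recombination rate."""
    n = len(density)
    mean = sum(density) / n
    mean_sq = sum(d * d for d in density) / n
    return mean_sq / (mean * mean)

uniform = [1.0] * 8
clumpy = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 7.3]   # same mean density
print(clumping_factor(uniform))          # → 1.0
print(round(clumping_factor(clumpy), 2))
```

The ~20% ambiguity quoted in the abstract corresponds to the freedom in deciding which cells (e.g., proximity zones) count as "IGM" in the averages.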
Enterprise Cloud Architecture for Chinese Ministry of Railway
NASA Astrophysics Data System (ADS)
Shan, Xumei; Liu, Hefeng
Enterprises like the PRC Ministry of Railways (MOR) face challenges ranging from a highly distributed computing environment to low legacy-system utilization, and cloud computing is increasingly regarded as a workable solution. This article describes a full-scale cloud solution with Intel Tashi as the virtual-machine infrastructure layer, Hadoop HDFS as the computing platform, and a self-developed SaaS interface, gluing the virtual machines and HDFS together with the Xen hypervisor. As a result, on-demand computing task application and deployment are addressed for MOR's real working scenarios at the end of the article.
Dynamic VM Provisioning for TORQUE in a Cloud Environment
NASA Astrophysics Data System (ADS)
Zhang, S.; Boland, L.; Coddington, P.; Sevior, M.
2014-06-01
Cloud computing delivered as Infrastructure-as-a-Service (IaaS) is attracting growing interest from the commercial and educational sectors as a way to provide cost-effective computational infrastructure. It is an ideal platform for researchers who must share common resources but need to be able to scale up to massive computational requirements for specific periods of time. This paper presents the tools and techniques developed to allow the open-source TORQUE distributed resource manager and Maui cluster scheduler to dynamically integrate OpenStack cloud resources into existing high-throughput computing clusters.
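A provisioning decision of the kind described can be sketched as a simple sizing rule; `vms_to_provision` and its parameters are illustrative assumptions, not the actual TORQUE/Maui/OpenStack integration:

```python
def vms_to_provision(queued_jobs, idle_slots, slots_per_vm=8, max_vms=50):
    """Decide how many cloud VMs to request so that queued single-slot
    jobs have somewhere to run, capped by a tenant quota."""
    deficit = max(0, queued_jobs - idle_slots)   # jobs with no free slot
    need = -(-deficit // slots_per_vm)           # ceiling division
    return min(need, max_vms)

# 100 queued jobs, 20 idle slots: 80 jobs need coverage, 8 slots per VM.
print(vms_to_provision(100, 20))  # → 10
```

In a real deployment this rule would run in the scheduler's monitoring loop, with the VM boot request issued through the cloud API and the new node registered back into TORQUE.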
AGIS: The ATLAS Grid Information System
NASA Astrophysics Data System (ADS)
Anisenkov, A.; Di Girolamo, A.; Klimentov, A.; Oleynik, D.; Petrosyan, A.; Atlas Collaboration
2014-06-01
ATLAS, a particle physics experiment at the Large Hadron Collider at CERN, produces petabytes of data annually through simulation production and tens of petabytes per year from the detector itself. The ATLAS computing model embraces the Grid paradigm with a high degree of decentralization, drawing on computing resources able to meet ATLAS requirements for petabyte-scale data operations. In this paper we describe the ATLAS Grid Information System (AGIS), designed to integrate configuration and status information about the resources, services, and topology of the computing infrastructure used by ATLAS Distributed Computing applications and services.
Prediction of resource volumes at untested locations using simple local prediction models
Attanasi, E.D.; Coburn, T.C.; Freeman, P.A.
2006-01-01
This paper shows how local spatial nonparametric prediction models can be applied to estimate volumes of recoverable gas resources at individual undrilled sites, at multiple sites on a regional scale, and to compute confidence bounds for regional volumes based on the distribution of those estimates. An approach that combines cross-validation, the jackknife, and bootstrap procedures is used to accomplish this task. Simulation experiments show that cross-validation can be applied beneficially to select an appropriate prediction model. The cross-validation procedure worked well for a wide range of different states of nature and levels of information. Jackknife procedures are used to compute individual prediction estimation errors at undrilled locations. The jackknife replicates also are used with a bootstrap resampling procedure to compute confidence bounds for the total volume. The method was applied to data (partitioned into a training set and target set) from the Devonian Antrim Shale continuous-type gas play in the Michigan Basin in Otsego County, Michigan. The analysis showed that the model estimate of total recoverable volumes at prediction sites is within 4 percent of the total observed volume. The model predictions also provide frequency distributions of the cell volumes at the production unit scale. Such distributions are the basis for subsequent economic analyses. © Springer Science+Business Media, LLC 2007.
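The combined jackknife/bootstrap procedure for bounding a regional total can be sketched as follows, using the sample mean as a stand-in estimator; all names and data here are illustrative, not the paper's nonparametric prediction model:

```python
import random

def jackknife_replicates(values, estimator):
    """Leave-one-out (jackknife) replicates of an estimator."""
    return [estimator(values[:i] + values[i + 1:]) for i in range(len(values))]

def bootstrap_bounds(replicates, total_fn, n_boot=2000, alpha=0.10, seed=1):
    """Percentile-bootstrap bounds on a total, resampling the replicates."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(replicates) for _ in replicates]
        stats.append(total_fn(sample))
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

mean = lambda xs: sum(xs) / len(xs)
volumes = [3.1, 2.4, 4.0, 3.6, 2.9, 3.3, 2.2, 3.8]   # illustrative cell volumes
reps = jackknife_replicates(volumes, mean)
lo, hi = bootstrap_bounds(reps, lambda s: mean(s) * len(volumes))
print(lo < sum(volumes) < hi)   # total volume lies inside the 90% bounds
```

The paper's workflow replaces the sample mean with a locally fitted spatial predictor and applies cross-validation first to choose that predictor; the resampling logic is analogous.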
Incorporating linguistic knowledge for learning distributed word representations.
Wang, Yan; Liu, Zhiyuan; Sun, Maosong
2015-01-01
Combined with neural language models, distributed word representations achieve significant advantages in computational linguistics and text mining. Most existing models estimate distributed word vectors from large-scale data in an unsupervised fashion, which, however, does not take rich linguistic knowledge into consideration. Linguistic knowledge can be represented as either link-based knowledge or preference-based knowledge, and we propose knowledge-regularized word representation models (KRWR) to incorporate this prior knowledge into learning distributed word representations. Experimental results demonstrate that our estimated word representations achieve better performance in the task of semantic relatedness ranking. This indicates that our methods can efficiently encode both prior knowledge from knowledge bases and statistical knowledge from large-scale text corpora into a unified word representation model, which will benefit many tasks in text mining.
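The link-based regularization idea can be sketched as a penalty that pulls the vectors of linked words (e.g., synonyms) together, added to the usual corpus loss. This toy gradient step is an illustrative assumption, not the KRWR objective itself:

```python
def knowledge_regularizer(emb, links):
    """Sum of squared distances between embeddings of linked words."""
    return sum(
        sum((emb[a][k] - emb[b][k]) ** 2 for k in range(len(emb[a])))
        for a, b in links
    )

def regularize_step(emb, links, lr=0.1):
    """One gradient-descent step on the regularizer alone: linked word
    vectors move toward each other; unlinked words are untouched."""
    grad = {w: [0.0] * len(v) for w, v in emb.items()}
    for a, b in links:
        for k in range(len(emb[a])):
            d = emb[a][k] - emb[b][k]
            grad[a][k] += 2 * d
            grad[b][k] -= 2 * d
    return {w: [v[k] - lr * grad[w][k] for k in range(len(v))]
            for w, v in emb.items()}

emb = {"car": [1.0, 0.0], "automobile": [0.0, 1.0], "bank": [3.0, 3.0]}
links = [("car", "automobile")]          # link-based prior knowledge
before = knowledge_regularizer(emb, links)
emb = regularize_step(emb, links)
after = knowledge_regularizer(emb, links)
print(after < before)  # → True
```

In the full model this penalty would be weighted and optimized jointly with the neural language-model loss, so statistical and prior knowledge shape the same vectors.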
NASA Astrophysics Data System (ADS)
Cetinbas, Firat C.; Ahluwalia, Rajesh K.; Kariuki, Nancy; De Andrade, Vincent; Fongalland, Dash; Smith, Linda; Sharman, Jonathan; Ferreira, Paulo; Rasouli, Somaye; Myers, Deborah J.
2017-03-01
The cost and performance of proton exchange membrane fuel cells strongly depend on the cathode electrode due to usage of expensive platinum (Pt) group metal catalyst and sluggish reaction kinetics. Development of low Pt content high performance cathodes requires comprehensive understanding of the electrode microstructure. In this study, a new approach is presented to characterize the detailed cathode electrode microstructure from nm to μm length scales by combining information from different experimental techniques. In this context, nano-scale X-ray computed tomography (nano-CT) is performed to extract the secondary pore space of the electrode. Transmission electron microscopy (TEM) is employed to determine primary C particle and Pt particle size distributions. X-ray scattering, with its ability to provide size distributions of orders of magnitude more particles than TEM, is used to confirm the TEM-determined size distributions. The number of primary pores that cannot be resolved by nano-CT is approximated using mercury intrusion porosimetry. An algorithm is developed to incorporate all these experimental data in one geometric representation. Upon validation of pore size distribution against gas adsorption and mercury intrusion porosimetry data, reconstructed ionomer size distribution is reported. In addition, transport related characteristics and effective properties are computed by performing simulations on the hybrid microstructure.
Fan-out Estimation in Spin-based Quantum Computer Scale-up.
Nguyen, Thien; Hill, Charles D; Hollenberg, Lloyd C L; James, Matthew R
2017-10-17
Solid-state spin-based qubits offer good prospects for scaling, based on their long coherence times and nexus to large-scale electronics scale-up technologies. However, high-threshold quantum error correction requires a two-dimensional qubit array operating in parallel, posing significant challenges in fabrication and control. While architectures incorporating distributed quantum control meet this challenge head-on, most designs rely on individual control and readout of all qubits with high gate densities. We analysed the fan-out routing overhead of a dedicated control line architecture, basing the analysis on a generalised solid-state spin qubit platform parameterised to encompass Coulomb-confined (e.g. donor-based spin qubits) or electrostatically confined (e.g. quantum-dot-based spin qubits) implementations. The spatial scalability under this model is estimated using standard electronic routing methods and present-day fabrication constraints. Based on reasonable assumptions for qubit control and readout, we estimate that 10^2-10^5 physical qubits, depending on the quantum interconnect implementation, can be integrated and fanned out independently. Assuming relatively long control-free interconnects, the scalability can be extended. Ultimately, universal quantum computation may necessitate a much higher number of integrated qubits, indicating that higher-dimensional electronics fabrication and/or multiplexed distributed control and readout schemes may be the preferred strategy for large-scale implementation.
A scalable parallel black oil simulator on distributed memory parallel computers
NASA Astrophysics Data System (ADS)
Wang, Kun; Liu, Hui; Chen, Zhangxin
2015-11-01
This paper presents our work on developing a parallel black oil simulator for distributed memory computers based on our in-house parallel platform. The parallel simulator is designed to overcome the performance issues of common simulators that are implemented for personal computers and workstations. The finite difference method is applied to discretize the black oil model. In addition, some advanced techniques are employed to strengthen the robustness and parallel scalability of the simulator, including an inexact Newton method, matrix decoupling methods, and algebraic multigrid methods. A new multi-stage preconditioner is proposed to accelerate the solution of linear systems from the Newton methods. Numerical experiments show that our simulator is scalable and efficient, and is capable of simulating extremely large-scale black oil problems with tens of millions of grid blocks using thousands of MPI processes on parallel computers.
Hyperswitch communication network
NASA Technical Reports Server (NTRS)
Peterson, J.; Pniel, M.; Upchurch, E.
1991-01-01
The Hyperswitch Communication Network (HCN) is a large scale parallel computer prototype being developed at JPL. Commercial versions of the HCN computer are planned. The HCN computer being designed is a message passing multiple instruction multiple data (MIMD) computer, and offers many advantages in price-performance ratio, reliability and availability, and manufacturing over traditional uniprocessors and bus based multiprocessors. The design of the HCN operating system is a uniquely flexible environment that combines both parallel processing and distributed processing. This programming paradigm can achieve a balance among the following competing factors: performance in processing and communications, user friendliness, and fault tolerance. The prototype is being designed to accommodate a maximum of 64 state of the art microprocessors. The HCN is classified as a distributed supercomputer. The HCN system is described, and the performance/cost analysis and other competing factors within the system design are reviewed.
Extreme-Scale De Novo Genome Assembly
DOE Office of Scientific and Technical Information (OSTI.GOV)
Georganas, Evangelos; Hofmeyr, Steven; Egan, Rob
De novo whole-genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme-scale analysis via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different parts of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and the communication costs in detail. We present performance results for assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.
Critical branching neural networks.
Kello, Christopher T
2013-01-01
It is now well-established that intrinsic variations in human neural and behavioral activity tend to exhibit scaling laws in their fluctuations and distributions. The meaning of these scaling laws is an ongoing matter of debate between isolable causes versus pervasive causes. A spiking neural network model is presented that self-tunes to critical branching and, in doing so, simulates observed scaling laws as pervasive to neural and behavioral activity. These scaling laws are related to neural and cognitive functions, in that critical branching is shown to yield spiking activity with maximal memory and encoding capacities when analyzed using reservoir computing techniques. The model is also shown to account for findings of pervasive 1/f scaling in speech and cued response behaviors that are difficult to explain by isolable causes. Issues and questions raised by the model and its results are discussed from the perspectives of physics, neuroscience, computer and information sciences, and psychological and cognitive sciences.
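The notion of critical branching can be sketched with a toy branching process whose mean offspring number sigma plays the role of the branching ratio: subcritical activity (sigma < 1) dies out and supercritical activity explodes, which is why the model self-tunes toward sigma = 1. The helper below is an illustrative assumption, not the paper's spiking network:

```python
import random

def branching_trajectory(sigma, steps, n0=100, seed=2):
    """Population of active units over time, where each unit activates
    0, 1, or 2 descendants drawn as binomial(2, sigma/2), so the mean
    branching ratio is sigma."""
    rng = random.Random(seed)
    pop = [n0]
    for _ in range(steps):
        children = sum(1 for _ in range(2 * pop[-1])
                       if rng.random() < sigma / 2)
        pop.append(children)
        if children == 0:      # activity has died out
            break
    return pop

sub = branching_trajectory(0.5, 30)    # subcritical: activity extinguishes
crit = branching_trajectory(1.0, 30)   # critical: marginal persistence
print(sub[-1])
```

At the critical point, avalanche sizes of such processes follow the heavy-tailed scaling laws that the abstract relates to neural and behavioral fluctuations.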
Spontaneous Movements of a Computer Mouse Reveal Egoism and In-group Favoritism
Maliszewski, Norbert; Wojciechowski, Łukasz; Suszek, Hubert
2017-01-01
The purpose of the project was to assess whether the first spontaneous movements of a computer mouse, when making an assessment on a scale presented on the screen, may express a respondent’s implicit attitudes. In Study 1, the altruistic behaviors of 66 students were assessed. The students were led to believe that the task they were performing was also being performed by another person and they were asked to distribute earnings between themselves and the partner. The participants performed the tasks under conditions with and without distractors. With the distractors, in the first few seconds spontaneous mouse movements on the scale expressed a selfish distribution of money, while later the movements gravitated toward more altruism. In Study 2, 77 Polish students evaluated a painting by a Polish/Jewish painter on a scale. They evaluated it under conditions of full or distracted cognitive abilities. Spontaneous movements of the mouse on the scale were analyzed. In addition, implicit attitudes toward both Poles and Jews were measured with the Implicit Association Test (IAT). A significant association between implicit attitudes (IAT) and spontaneous evaluation of images using a computer mouse was observed in the group with the distractor. The participants with strong implicit in-group favoritism of Poles revealed stronger preference for the Polish painter’s work in the first few seconds of mouse movement. Taken together, these results suggest that spontaneous mouse movements may reveal egoism (in-group favoritism), i.e., processes that were not observed in the participants’ final decisions (clicking on the scale). PMID:28163689
NASA Astrophysics Data System (ADS)
Machalek, P.; Kim, S. M.; Berry, R. D.; Liang, A.; Small, T.; Brevdo, E.; Kuznetsova, A.
2012-12-01
We describe how the Climate Corporation uses Python and Clojure, a language implemented on top of Java, to generate climatological forecasts for precipitation based on the Advanced Hydrologic Prediction Service (AHPS) radar-based daily precipitation measurements. A 2-year-long forecast is generated on each of the ~650,000 CONUS land-based 4-km AHPS grids by constructing 10,000 ensembles sampled from a 30-year reconstructed AHPS history for each grid. The spatial and temporal correlations between neighboring AHPS grids and the sampling of the analogues are handled by Python. Parallelization across all 650,000 CONUS grids is achieved with the MapReduce framework (http://code.google.com/edu/parallel/mapreduce-tutorial.html). Each full-scale computational run requires hundreds of nodes with up to 8 processors each on the Amazon Elastic MapReduce (http://aws.amazon.com/elasticmapreduce/) distributed computing service, resulting in 3-terabyte datasets. We further describe how we have put into production a monthly run of the simulation process at the full scale of the 4-km AHPS grids and how the resulting terabyte-sized datasets are handled.
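The per-grid ensemble sampling and its map/reduce decomposition can be sketched as follows; `map_grid` and `reduce_quantiles` are illustrative stand-ins for the production MapReduce jobs, and the history values are made up:

```python
import random

def map_grid(grid_id, history, n_ensembles=1000, horizon=24, seed=None):
    """Map step: for one grid cell, draw ensemble precipitation totals
    by resampling analogue days from the cell's reconstructed history."""
    rng = random.Random(seed if seed is not None else grid_id)
    totals = []
    for _ in range(n_ensembles):
        totals.append(sum(rng.choice(history) for _ in range(horizon)))
    return grid_id, sorted(totals)

def reduce_quantiles(mapped, q=0.5):
    """Reduce step: collect a per-grid quantile of the ensemble totals."""
    return {gid: totals[int(q * (len(totals) - 1))] for gid, totals in mapped}

history = [0.0, 0.0, 1.2, 0.0, 5.5, 0.3, 0.0, 2.1]   # mm/day, illustrative
mapped = [map_grid(g, history) for g in range(4)]
medians = reduce_quantiles(mapped)
print(len(medians))  # → 4 (one median forecast per grid cell)
```

On Elastic MapReduce the map calls run on separate nodes keyed by grid id, which is what makes the 650,000-cell computation embarrassingly parallel apart from the spatial-correlation handling.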
Understanding I/O workload characteristics of a Peta-scale storage system
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Youngjae; Gunasekaran, Raghul
2015-01-01
Understanding workload characteristics is critical for optimizing and improving the performance of current systems and software, and for architecting new storage systems based on observed workload patterns. In this paper, we characterize the I/O workloads of scientific applications on one of the world's fastest high performance computing (HPC) storage clusters, Spider, at the Oak Ridge Leadership Computing Facility (OLCF). OLCF's flagship petascale simulation platform, Titan, and other large HPC clusters, in total over 250 thousand compute cores, depend on Spider for their I/O needs. We characterize the system utilization, the demands of reads and writes, idle time, storage space utilization, and the distribution of read requests to write requests for this peta-scale storage system. From this study, we develop synthesized workloads, and we show that the read and write I/O bandwidth usage as well as the inter-arrival time of requests can be modeled as a Pareto distribution. We also study I/O load imbalance problems using I/O performance data collected from the Spider storage system.
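A Pareto model for quantities such as request inter-arrival times can be checked with the standard maximum-likelihood shape estimate; the sampling and estimator below are textbook formulas, not the paper's fitting code:

```python
import math
import random

def pareto_mle_alpha(data, xm):
    """Maximum-likelihood shape estimate for a Pareto(xm, alpha) sample:
    alpha_hat = n / sum(log(x / xm))."""
    return len(data) / sum(math.log(x / xm) for x in data)

# Draw synthetic inter-arrival times from Pareto(xm=1, alpha=1.5)
# by inverse-CDF sampling: x = xm * (1 - U)^(-1/alpha).
rng = random.Random(3)
alpha_true, xm = 1.5, 1.0
sample = [xm / (1.0 - rng.random()) ** (1.0 / alpha_true)
          for _ in range(20000)]
est = pareto_mle_alpha(sample, xm)
print(round(est, 1))
```

Applied to measured inter-arrival times, a fitted alpha close to or below 2 signals the heavy-tailed burstiness that synthetic workloads need to reproduce.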
A distributed parallel storage architecture and its potential application within EOSDIS
NASA Technical Reports Server (NTRS)
Johnston, William E.; Tierney, Brian; Feuquay, Jay; Butzer, Tony
1994-01-01
We describe the architecture, implementation, and use of a scalable, high-performance, distributed-parallel data storage system developed in the ARPA-funded MAGIC gigabit testbed. A collection of wide-area distributed disk servers operates in parallel to provide logical block-level access to large data sets. Operated primarily as a network-based cache, the architecture supports cooperation among independently owned resources to provide fast, large-scale, on-demand storage to support data handling, simulation, and computation.
Finite-Size Scaling for the Baxter-Wu Model Using Block Distribution Functions
NASA Astrophysics Data System (ADS)
Velonakis, Ioannis N.; Hadjiagapiou, Ioannis A.
2018-05-01
In this work, we present an alternative way of applying the well-known finite-size scaling (FSS) theory to the Baxter-Wu model using Binder-like blocks. Binder's ideas are extended to estimate phase transition points and the corresponding scaling exponents not only for magnetic but also for energy properties, saving computational time and effort. The vast majority of our conclusions can be easily generalized to other models.
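The central quantity in Binder's block analysis is the fourth-order cumulant of the (block) magnetization; a minimal sketch of its computation follows, with toy data rather than Baxter-Wu samples.

```python
def binder_cumulant(magnetizations):
    """Fourth-order Binder cumulant U_L = 1 - <m^4> / (3 <m^2>^2).

    Curves of U_L for different block/lattice sizes L cross near the
    transition point, which is how FSS estimates it without first
    fitting the critical temperature."""
    n = len(magnetizations)
    m2 = sum(m ** 2 for m in magnetizations) / n
    m4 = sum(m ** 4 for m in magnetizations) / n
    return 1.0 - m4 / (3.0 * m2 ** 2)

# deep in the ordered phase m = ±1, so <m^4> = <m^2>^2 = 1 and U -> 2/3
u_ordered = binder_cumulant([1.0, -1.0, 1.0, 1.0, -1.0, -1.0])
```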
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rajbhandari, Samyam; NIkam, Akshay; Lai, Pai-Wei
Tensor contractions represent the most compute-intensive core kernels in ab initio computational quantum chemistry and nuclear physics. Symmetries in these tensor contractions make them difficult to load balance and scale to large distributed systems. In this paper, we develop an efficient and scalable algorithm to contract symmetric tensors. We introduce a novel approach that avoids data redistribution in contracting symmetric tensors while also avoiding redundant storage and maintaining load balance. We present experimental results on two parallel supercomputers for several symmetric contractions that appear in the CCSD quantum chemistry method. We also present a novel approach to tensor redistribution that can take advantage of parallel hyperplanes when the initial distribution has replicated dimensions, and use collective broadcast when the final distribution has replicated dimensions, making the algorithm very efficient.
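The storage saving that symmetry enables can be seen already in the rank-2 case. The sketch below (not the paper's algorithm, which targets distributed higher-rank tensors) contracts a symmetric matrix with a vector while storing only the upper triangle: each stored element serves both symmetric positions, so the redundant half is never materialized.

```python
def contract_symmetric(upper, x, n):
    """Compute y = A @ x for a symmetric n x n matrix A stored as its
    upper triangle only: upper[(i, j)] = A[i][j] for i <= j."""
    y = [0.0] * n
    for (i, j), a in upper.items():
        y[i] += a * x[j]
        if i != j:
            y[j] += a * x[i]   # symmetric partner A[j][i] == A[i][j]
    return y

# A = [[2, 1], [1, 3]] stored as its upper triangle
upper = {(0, 0): 2.0, (0, 1): 1.0, (1, 1): 3.0}
y = contract_symmetric(upper, [1.0, 1.0], 2)   # expect [3.0, 4.0]
```

In the distributed setting, the difficulty the abstract points to is that the packed (non-redundant) layout makes a balanced partitioning of work across nodes non-trivial.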
High Performance Computing for Modeling Wind Farms and Their Impact
NASA Astrophysics Data System (ADS)
Mavriplis, D.; Naughton, J. W.; Stoellinger, M. K.
2016-12-01
As energy generated by wind penetrates further into our electrical system, modeling of power production, power distribution, and the economic impact of wind-generated electricity is growing in importance. The models used for this work can range in fidelity from simple codes that run on a single computer to those that require high performance computing capabilities. Over the past several years, high fidelity models have been developed and deployed on the NCAR-Wyoming Supercomputing Center's Yellowstone machine. One of the primary modeling efforts focuses on developing the capability to compute the behavior of a wind farm in complex terrain under realistic atmospheric conditions. Fully modeling this system requires simulating everything from continental flows down to the flow over a wind turbine blade, including the blade boundary layer, spanning fully 10 orders of magnitude in scale. To accomplish this, the simulations are broken up by scale, with information from the larger scales being passed to the smaller-scale models. In the code being developed, four scale levels are included: the continental weather scale, the local atmospheric flow in complex terrain, the wind plant scale, and the turbine scale. The current state of the models in the latter three scales will be discussed. These simulations are based on a high-order accurate dynamic overset and adaptive mesh approach, which runs at large scale on the NWSC Yellowstone machine. A second effort on modeling the economic impact of new wind development, as well as improvements in wind plant performance and enhancements to the transmission infrastructure, will also be discussed.
Two-dimensional analysis of coupled heat and moisture transport in masonry structures
NASA Astrophysics Data System (ADS)
Krejčí, Tomáš
2016-06-01
Reconstruction and maintenance of historical buildings and bridges require good knowledge of the temperature and moisture distribution, since sharp changes in temperature and moisture can lead to damage. This paper describes an analysis of coupled heat and moisture transfer in masonry based on a two-level approach: the macro-scale level describes the whole structure, while the meso-scale level takes into account the detailed composition of the masonry. The two-level approach is computationally very demanding, and it was therefore implemented in parallel. It was used in an analysis of the temperature and moisture distribution in the Charles Bridge in Prague, Czech Republic.
Gallicchio, Emilio; Deng, Nanjie; He, Peng; Wickstrom, Lauren; Perryman, Alexander L.; Santiago, Daniel N.; Forli, Stefano; Olson, Arthur J.; Levy, Ronald M.
2014-01-01
As part of the SAMPL4 blind challenge, filtered AutoDock Vina ligand docking predictions and large-scale binding free energy calculations with the binding energy distribution analysis method have been applied to the virtual screening of a focused library of candidate binders to the LEDGF site of the HIV integrase protein. The computational protocol leveraged docking and high-level atomistic models to improve enrichment. The enrichment factor of our blind predictions ranked best among all of the computational submissions, and second best overall. This work represents, to our knowledge, the first example of the application of an all-atom physics-based binding free energy model to large-scale virtual screening. A total of 285 parallel Hamiltonian replica exchange molecular dynamics absolute protein-ligand binding free energy simulations were conducted starting from docked poses. The setup of the simulations was fully automated, calculations were distributed on multiple computing resources, and all were completed in a 6-week period. The accuracy of the docked poses and the inclusion of intramolecular strain and entropic losses in the binding free energy estimates were the major factors behind the success of the method. Lack of sufficient time and computing resources to investigate additional protonation states of the ligands was a major cause of mispredictions. The experiment demonstrated the applicability of binding free energy modeling to improve hit rates in challenging virtual screening of focused ligand libraries during lead optimization. PMID:24504704
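The exchange step at the heart of Hamiltonian replica exchange can be sketched with a Metropolis acceptance test. This is a generic textbook form, not the authors' production code; the energy arguments and function name are illustrative.

```python
import math
import random

def swap_accepted(u_i_xi, u_i_xj, u_j_xi, u_j_xj, beta, rng):
    """Metropolis criterion for exchanging configurations between two
    Hamiltonian replicas i and j at the same inverse temperature beta.

    u_a_xb is the potential energy of configuration x_b evaluated under
    Hamiltonian a; the swap is accepted with probability
    min(1, exp(-beta * delta))."""
    delta = (u_i_xj + u_j_xi) - (u_i_xi + u_j_xj)
    return delta <= 0 or rng.random() < math.exp(-beta * delta)

rng = random.Random(7)
# a swap that lowers the combined energy is always accepted
accepted = swap_accepted(1.0, 0.5, 0.2, 0.8, beta=1.0, rng=rng)
```

Running hundreds of such replica pairs independently is what makes the method embarrassingly parallel and easy to distribute across heterogeneous computing resources.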
Halligan, Brian D.; Geiger, Joey F.; Vallejos, Andrew K.; Greene, Andrew S.; Twigger, Simon N.
2009-01-01
One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step-by-step instructions on how to implement the virtual proteomics analysis clusters, as well as a list of currently available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases, on the Medical College of Wisconsin Proteomics Center website (http://proteomics.mcw.edu/vipdac). PMID:19358578
A Neural Information Field Approach to Computational Cognition
2016-11-18
We have extended our perceptual decision making model to account for the effects of context; developed a new perceptual decision making model; and demonstrated adaptive motor control in a large-scale cognitive simulation with spiking neurons (Spaun). Keywords: EOARD, computational cognition, mixed-initiative decision making.
Trace: a high-throughput tomographic reconstruction engine for large-scale datasets
Bicer, Tekin; Gursoy, Doga; Andrade, Vincent De; ...
2017-01-28
Here, synchrotron light source and detector technologies enable scientists to perform advanced experiments. These scientific instruments and experiments produce data at such scale and complexity that large-scale computation is required to unleash their full power. One of the widely used data acquisition techniques at light sources is computed tomography, which can generate tens of GB/s depending on the x-ray range. A large-scale tomographic dataset, such as a mouse brain scan, may require hours of computation time with a medium-sized workstation. In this paper, we present Trace, a data-intensive computing middleware we developed for the implementation and parallelization of iterative tomographic reconstruction algorithms. Trace provides fine-grained reconstruction of tomography datasets using both (thread-level) shared memory and (process-level) distributed memory parallelization. Trace utilizes a special data structure called the replicated reconstruction object to maximize application performance. We also present the optimizations we have done on the replicated reconstruction objects and evaluate them using a shale and a mouse brain sinogram. Our experimental evaluations show that the applied optimizations and parallelization techniques can provide 158x speedup (using 32 compute nodes) over a single-core configuration, which decreases the reconstruction time of a sinogram (with 4501 projections and 22400 detector resolution) from 12.5 hours to less than 5 minutes per iteration.
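The replicated-object idea can be illustrated with a toy reduction: each thread accumulates its assigned sinogram rows into a private replica of the reconstruction grid (so no locking is needed), and the replicas are summed at the end. This is a simplified sketch, not Trace's actual data structure; the row format and function names are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def backproject(sinogram_rows, replica):
    """Accumulate each assigned row into a thread-private replica."""
    for row in sinogram_rows:
        for pixel, value in row:
            replica[pixel] += value
    return replica

def reconstruct(rows, n_pixels, n_threads=4):
    """Shared-memory pass with one replicated reconstruction object
    per thread, merged in a final reduction step."""
    chunks = [rows[t::n_threads] for t in range(n_threads)]
    replicas = [[0.0] * n_pixels for _ in range(n_threads)]
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(backproject, chunks, replicas))
    return [sum(r[p] for r in replicas) for p in range(n_pixels)]

# each row lists (pixel, contribution) pairs
rows = [[(0, 1.0), (1, 2.0)], [(1, 1.0)], [(2, 4.0)], [(0, 0.5)]]
image = reconstruct(rows, n_pixels=3)   # expect [1.5, 3.0, 4.0]
```

The same pattern extends to the process level: each MPI rank holds its own replica and the final merge becomes an all-reduce.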
Network placement optimization for large-scale distributed system
NASA Astrophysics Data System (ADS)
Ren, Yu; Liu, Fangfang; Fu, Yunxia; Zhou, Zheng
2018-01-01
The network geometry strongly influences the performance of a distributed system, i.e., its coverage capability, measurement accuracy, and overall cost. Network placement optimization is therefore an urgent issue in distributed measurement, even in large-scale metrology. This paper presents an effective computer-assisted network placement optimization procedure for large-scale distributed systems and illustrates it with the example of a multi-tracker system. To obtain an optimal placement, the coverage capability and the coordinate uncertainty of the network are quantified, and a placement optimization objective function is developed in terms of coverage capability, measurement accuracy, and overall cost. A novel grid-based encoding approach for the genetic algorithm is proposed, and the network placement is optimized by a global rough search followed by a local detailed search. An obvious advantage is that no specific initial placement is needed. Finally, a specific application illustrates that this placement optimization procedure can simulate the measurement results of a specific network and design the optimal placement efficiently.
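A grid-encoded genetic search of the kind described can be sketched as follows. The chromosome is a list of grid-cell indices holding one tracker each; the fitness function, mutation scheme, and all parameters below are simplified illustrations, not the paper's actual objective (which also weighs coordinate uncertainty).

```python
import random

def fitness(placement, demand_cells, cost_per_tracker):
    """Score a placement: reward covered demand cells, penalize cost."""
    covered = len(demand_cells & set(placement))
    return covered - cost_per_tracker * len(placement)

def evolve(grid_cells, demand_cells, n_trackers=3, pop=20, gens=50, seed=1):
    """Minimal genetic search over grid-encoded placements; no specific
    initial placement is needed, only a random population."""
    rng = random.Random(seed)
    population = [rng.sample(grid_cells, n_trackers) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=lambda p: -fitness(p, demand_cells, 0.1))
        survivors = population[: pop // 2]
        children = []
        for parent in survivors:
            child = list(parent)
            # mutation: move one tracker to a random grid cell
            child[rng.randrange(n_trackers)] = rng.choice(grid_cells)
            children.append(child)
        population = survivors + children
    return max(population, key=lambda p: fitness(p, demand_cells, 0.1))

best = evolve(list(range(100)), demand_cells={3, 42, 77})
```

The global rough search corresponds to evolving over a coarse grid; the local detailed search then refines the winning cells at finer resolution.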
NASA Astrophysics Data System (ADS)
Ajami, H.; Sharma, A.
2016-12-01
A computationally efficient, semi-distributed hydrologic modeling framework is developed to simulate water balance at a catchment scale. The Soil Moisture and Runoff simulation Toolkit (SMART) is based upon the delineation of contiguous and topologically connected Hydrologic Response Units (HRUs). In SMART, HRUs are delineated using thresholds obtained from topographic and geomorphic analysis of a catchment, and simulation elements are distributed cross sections or equivalent cross sections (ECSs) delineated in first-order sub-basins. ECSs are formulated by aggregating topographic and physiographic properties of part or all of the first-order sub-basins to further reduce computational time in SMART. Previous investigations using SMART have shown that temporal dynamics of soil moisture are well captured at the HRU level using the ECS delineation approach. However, spatial variability of soil moisture within a given HRU is ignored. Here, we examine a number of disaggregation schemes for soil moisture distribution in each HRU. The disaggregation schemes are based either on topography-based indices or on a covariance matrix obtained from distributed soil moisture simulations. To assess the performance of the disaggregation schemes, soil moisture simulations from an integrated land surface-groundwater model, ParFlow.CLM, in the Baldry sub-catchment, Australia are used. ParFlow is a variably saturated sub-surface flow model that is coupled to the Common Land Model (CLM). Our results illustrate that the statistical disaggregation scheme performs better than the methods based on topographic data in approximating soil moisture distribution at a 60 m scale. Moreover, the statistical disaggregation scheme maintains the temporal correlation of simulated daily soil moisture while preserving the mean sub-basin soil moisture. Future work will focus on assessing the performance of this scheme in catchments with various topographic and climate settings.
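One common form of covariance-based statistical disaggregation is the conditional-Gaussian best linear estimate; the abstract does not specify the exact scheme used, so the sketch below is a generic stand-in with illustrative numbers, not the authors' method.

```python
def disaggregate(coarse_value, fine_means, coarse_mean, cov_fc, var_c):
    """Downscale an HRU-mean soil moisture value to fine-scale pixels:

        fine_i = mu_i + cov(fine_i, coarse) / var(coarse) * (coarse - mu_c)

    cov_fc[i] holds cov(fine_i, coarse), which would be estimated from
    archived distributed-model output (here, ParFlow.CLM simulations)
    rather than from topographic indices."""
    anomaly = coarse_value - coarse_mean
    return [mu + c / var_c * anomaly for mu, c in zip(fine_means, cov_fc)]

# two fine pixels; the pixel with larger covariance responds more strongly
fine = disaggregate(0.30, fine_means=[0.20, 0.35], coarse_mean=0.25,
                    cov_fc=[0.5, 1.5], var_c=1.0)
```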
Distributed-Memory Fast Maximal Independent Set
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kanewala Appuhamilage, Thejaka Amila J.; Zalewski, Marcin J.; Lumsdaine, Andrew
The Maximal Independent Set (MIS) graph problem arises in many applications such as computer vision, information theory, molecular biology, and process scheduling. The growing scale of MIS problems suggests the use of distributed-memory hardware as a cost-effective approach to providing the necessary compute and memory resources. Luby proposed four randomized algorithms to solve the MIS problem. All those algorithms are designed for shared-memory machines and are analyzed using the PRAM model; they do not have direct, efficient distributed-memory implementations. In this paper, we extend two of Luby's seminal MIS algorithms, "Luby(A)" and "Luby(B)," to distributed-memory execution, and we evaluate their performance. We compare our results with the "Filtered MIS" implementation in the Combinatorial BLAS library for two types of synthetic graph inputs.
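The round structure of Luby's randomized algorithm (the "Luby(A)" variant) can be sketched in a few lines: each round, every live vertex draws a random priority; a vertex joins the MIS if its priority beats all of its live neighbors, after which it and its neighbors are removed. This is the shared-memory formulation, before the distributed-memory extensions the paper develops.

```python
import random

def luby_mis(adj, seed=0):
    """Luby's randomized maximal independent set.

    adj: dict mapping vertex -> set of neighbor vertices.
    Returns a set of vertices that is independent and maximal."""
    rng = random.Random(seed)
    live = set(adj)
    mis = set()
    while live:
        priority = {v: rng.random() for v in live}
        winners = {v for v in live
                   if all(priority[v] > priority[u]
                          for u in adj[v] if u in live)}
        mis |= winners
        # remove winners and their neighborhoods from further rounds
        live -= winners | {u for v in winners for u in adj[v]}
    return mis

# 4-cycle 0-1-2-3-0: any MIS is a pair of opposite vertices
adj = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
mis = luby_mis(adj)
```

In a distributed-memory setting, the priority comparison against neighbors becomes a message exchange along cut edges, which is precisely the step the paper restructures for efficiency.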
Design and Development of a Run-Time Monitor for Multi-Core Architectures in Cloud Computing
Kang, Mikyung; Kang, Dong-In; Crago, Stephen P.; Park, Gyung-Leen; Lee, Junghoon
2011-01-01
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications, as well as infrastructure, as services over the Internet. A cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring system status changes, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design and develop a Run-Time Monitor (RTM), which is system software that monitors application behavior at run-time, analyzes the collected information, and optimizes cloud computing resources for multi-core architectures. RTM monitors application software through library instrumentation, as well as underlying hardware through a performance counter, optimizing its computing configuration based on the analyzed data. PMID:22163811
Strong scaling and speedup to 16,384 processors in cardiac electro-mechanical simulations.
Reumann, Matthias; Fitch, Blake G; Rayshubskiy, Aleksandr; Keller, David U J; Seemann, Gunnar; Dossel, Olaf; Pitman, Michael C; Rice, John J
2009-01-01
High performance computing is required to make feasible simulations of whole-organ models of the heart with biophysically detailed cellular models in a clinical setting. Increasing model detail by simulating both electrophysiology and mechanics increases computational demands. We present scaling results of an electro-mechanical cardiac model of two ventricles and compare them to our previously published results using an electrophysiological model only. The anatomical data-set was given by both ventricles of the Visible Female data-set in a 0.2 mm resolution. Fiber orientation was included. Data decomposition for the distribution onto the distributed memory system was carried out by orthogonal recursive bisection. Load weight ratios for non-tissue vs. tissue elements used in the data decomposition were 1:1, 1:2, 1:5, 1:10, 1:25, 1:38.85, 1:50 and 1:100. The ten Tusscher et al. (2004) electrophysiological cell model was used, and the Rice et al. (1999) model for the computation of the calcium-transient-dependent force. Scaling results for 512, 1024, 2048, 4096, 8192 and 16,384 processors were obtained for 1 ms simulation time. The simulations were carried out on an IBM Blue Gene/L supercomputer. The results show linear scaling from 512 to 16,384 processors, with speedup factors between 1.82 and 2.14 between partitions. The optimal load ratio was 1:25 on all partitions. However, a shift towards load ratios with higher weight for the tissue elements can be recognized, as can be expected when adding computational complexity to the model while keeping the same communication setup. This work demonstrates that it is potentially possible to run simulations of 0.5 s using the presented electro-mechanical cardiac model within 1.5 hours.
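Orthogonal recursive bisection with load weights can be sketched as follows: elements are recursively split along alternating axes at the weighted median, so each partition carries roughly equal load. This is a generic textbook sketch (the function names and element format are invented); in the study above, `weight` would return, e.g., 1 for non-tissue and 25 for tissue elements to realize the 1:25 ratio.

```python
def orb_split(elements, weight, depth):
    """Orthogonal recursive bisection into 2**depth partitions.

    elements: list of (x, y, z) points; weight(e) gives the load of e.
    Each split cuts along one axis at the weighted median."""
    if depth == 0:
        return [elements]
    axis = depth % 3
    ordered = sorted(elements, key=lambda e: e[axis])
    total = sum(weight(e) for e in ordered)
    acc, cut = 0, 0
    for i, e in enumerate(ordered):     # find the weighted median
        acc += weight(e)
        if acc * 2 >= total:
            cut = i + 1
            break
    return (orb_split(ordered[:cut], weight, depth - 1)
            + orb_split(ordered[cut:], weight, depth - 1))

pts = [(x, y, 0) for x in range(4) for y in range(4)]
parts = orb_split(pts, weight=lambda e: 1, depth=2)   # 4 equal partitions
```

Raising the tissue weight shifts the cut planes so that processors owning many tissue elements receive fewer elements overall, balancing compute time rather than element count.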
NASA Astrophysics Data System (ADS)
Gorelick, Noel
2013-04-01
The Google Earth Engine platform is a system designed to enable petabyte-scale, scientific analysis and visualization of geospatial datasets. Earth Engine provides a consolidated environment including a massive data catalog co-located with thousands of computers for analysis. The user-friendly front-end provides a workbench environment to allow interactive data and algorithm development and exploration and provides a convenient mechanism for scientists to share data, visualizations and analytic algorithms via URLs. The Earth Engine data catalog contains a wide variety of popular, curated datasets, including the world's largest online collection of Landsat scenes (> 2.0M), numerous MODIS collections, and many vector-based data sets. The platform provides a uniform access mechanism to a variety of data types, independent of their bands, projection, bit-depth, resolution, etc., facilitating easy multi-sensor analysis. Additionally, a user is able to add and curate their own data and collections. Using a just-in-time, distributed computation model, Earth Engine can rapidly process enormous quantities of geo-spatial data. All computation is performed lazily; nothing is computed until it's required either for output or as input to another step. This model allows real-time feedback and preview during algorithm development, supporting a rapid algorithm development, test, and improvement cycle that scales seamlessly to large-scale production data processing.
Through integration with a variety of other services, Earth Engine is able to bring to bear considerable analytic and technical firepower in a transparent fashion, including: AI-based classification via integration with Google's machine learning infrastructure, publishing and distribution at Google scale through integration with the Google Maps API, Maps Engine and Google Earth, and support for in-the-field activities such as validation, ground-truthing, crowd-sourcing and citizen science through the Android Open Data Kit.
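The deferred-evaluation model described above can be illustrated with a minimal lazy computation node: operations build a dependency graph, and nothing executes until a result is actually requested. This toy class is only an analogy for the idea, not Earth Engine's API; all names here are invented.

```python
class Lazy:
    """Minimal just-in-time computation node: building the graph is
    cheap, and work happens only when .compute() is called."""
    def __init__(self, fn, *deps):
        self.fn, self.deps = fn, deps
        self._cache = None

    def compute(self):
        if self._cache is None:   # memoize so shared nodes run once
            self._cache = self.fn(*(d.compute() for d in self.deps))
        return self._cache

    def map(self, fn):
        """Defer an elementwise operation; nothing runs yet."""
        return Lazy(lambda v: [fn(x) for x in v], self)

def source(values):
    return Lazy(lambda: list(values))

scene = source([0.1, 0.5, 0.9])          # e.g. per-pixel reflectance
doubled = scene.map(lambda x: 2 * x)     # builds the graph, runs nothing
result = doubled.compute()               # evaluation happens here
```

Because only the nodes reachable from the requested output are evaluated, interactive previews can compute just the visible tiles while the same graph scales to full production runs.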
NASA Technical Reports Server (NTRS)
Johnston, William E.; Gannon, Dennis; Nitzberg, Bill
2000-01-01
We use the term "Grid" to refer to distributed, high performance computing and data handling infrastructure that incorporates geographically and organizationally dispersed, heterogeneous resources that are persistent and supported. This infrastructure includes: (1) Tools for constructing collaborative, application oriented Problem Solving Environments / Frameworks (the primary user interfaces for Grids); (2) Programming environments, tools, and services providing various approaches for building applications that use aggregated computing and storage resources, and federated data sources; (3) Comprehensive and consistent set of location independent tools and services for accessing and managing dynamic collections of widely distributed resources: heterogeneous computing systems, storage systems, real-time data sources and instruments, human collaborators, and communications systems; (4) Operational infrastructure including management tools for distributed systems and distributed resources, user services, accounting and auditing, strong and location independent user authentication and authorization, and overall system security services. The vision for NASA's Information Power Grid - a computing and data Grid - is that it will provide significant new capabilities to scientists and engineers by facilitating routine construction of information based problem solving environments / frameworks. Such Grids will knit together widely distributed computing, data, instrument, and human resources into just-in-time systems that can address complex and large-scale computing and data analysis problems.
Examples of these problems include: (1) Coupled, multidisciplinary simulations too large for single systems (e.g., multi-component NPSS turbomachine simulation); (2) Use of widely distributed, federated data archives (e.g., simultaneous access to meteorological, topological, aircraft performance, and flight path scheduling databases supporting a National Air Space Simulation system); (3) Coupling large-scale computing and data systems to scientific and engineering instruments (e.g., real-time interaction with experiments through real-time data analysis and interpretation presented to the experimentalist in ways that allow direct interaction with the experiment, instead of just with instrument control); (4) Highly interactive, augmented reality and virtual reality remote collaborations (e.g., Ames / Boeing Remote Help Desk providing field maintenance use of coupled video and NDI to a remote, on-line airframe structures expert who uses this data to index into detailed design databases, and returns 3D internal aircraft geometry to the field); (5) Single computational problems too large for any single system (e.g., the rotorcraft reference calculation). Grids also have the potential to provide pools of resources that could be called on in extraordinary / rapid-response situations (such as disaster response) because they can provide common interfaces and access mechanisms, standardized management, and uniform user authentication and authorization for large collections of distributed resources (whether or not they normally function in concert). IPG development and deployment is addressing requirements obtained by analyzing a number of different application areas, in particular from the NASA Aero-Space Technology Enterprise. This analysis has focused primarily on two types of users: the scientist / design engineer whose primary interest is problem solving (e.g.
determining wing aerodynamic characteristics in many different operating environments), and whose primary interface to IPG will be through various sorts of problem solving frameworks. The second type of user is the tool designer: the computational scientists who convert physics and mathematics into code that can simulate the physical world. These are the two primary users of IPG, and they have rather different requirements. The results of the analysis of the needs of these two types of users provide a broad set of requirements that gives rise to a general set of required capabilities. The IPG project is intended to address all of these requirements. In some cases the required computing technology exists, and in some cases it must be researched and developed. The project is using available technology to provide a prototype set of capabilities in a persistent distributed computing testbed. Beyond this, there are required capabilities that are not immediately available, and whose development spans the range from near-term engineering development (one to two years) to much longer-term R&D (three to six years). Additional information is contained in the original.
Zipf's law from scale-free geometry.
Lin, Henry W; Loeb, Abraham
2016-03-01
The spatial distribution of people exhibits clustering across a wide range of scales, from household (∼10^-2 km) to continental (∼10^4 km) scales. Empirical data indicate simple power-law scalings for the size distribution of cities (known as Zipf's law) and the population density fluctuations as a function of scale. Using techniques from random field theory and statistical physics, we show that these power laws are fundamentally a consequence of the scale-free spatial clustering of human populations and the fact that humans inhabit a two-dimensional surface. In this sense, the symmetries of scale invariance in two spatial dimensions are intimately connected to urban sociology. We test our theory by empirically measuring the power spectrum of population density fluctuations and show that the logarithmic slope α = 2.04 ± 0.09, in excellent agreement with our theoretical prediction α = 2. The model enables the analytic computation of many new predictions by importing the mathematical formalism of random fields.
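The Zipf's-law side of this picture is easy to check numerically: for an ideal Zipfian set of city sizes, the log-log slope of the rank-size relation is -1. The least-squares fit below is a generic sketch (the abstract's own measurement concerns the power spectrum of density fluctuations, not this rank-size fit).

```python
import math

def rank_size_slope(sizes):
    """Least-squares log-log slope of the rank-size relation.
    Zipf's law predicts size ~ rank**-1, i.e. a slope near -1."""
    sizes = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(sizes) + 1)]
    ys = [math.log(s) for s in sizes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

# ideal Zipfian city sizes: the rank-r city has size proportional to 1/r
cities = [1e6 / r for r in range(1, 1001)]
slope = rank_size_slope(cities)   # expect a slope very close to -1
```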
A Disciplined Architectural Approach to Scaling Data Analysis for Massive, Scientific Data
NASA Astrophysics Data System (ADS)
Crichton, D. J.; Braverman, A. J.; Cinquini, L.; Turmon, M.; Lee, H.; Law, E.
2014-12-01
Data collections across remote sensing and ground-based instruments in astronomy, Earth science, and planetary science are outpacing scientists' ability to analyze them. Furthermore, the distribution, structure, and heterogeneity of the measurements themselves pose challenges that limit the scalability of data analysis using traditional approaches. Methods for developing science data processing pipelines, distribution of scientific datasets, and performing analysis will require innovative approaches that integrate cyber-infrastructure, algorithms, and data into more systematic approaches that can more efficiently compute and reduce data, particularly distributed data. This requires the integration of computer science, machine learning, statistics and domain expertise to identify scalable architectures for data analysis. The size of data returned from Earth science observing satellites and the magnitude of data from climate model output are predicted to grow into the tens of petabytes, challenging current data analysis paradigms. This same kind of growth is present in astronomy and planetary science data. One of the major challenges in data science and related disciplines is defining new approaches to scaling systems and analysis in order to increase scientific productivity and yield. Specific needs include: 1) identification of optimized system architectures for analyzing massive, distributed data sets; 2) algorithms for systematic analysis of massive data sets in distributed environments; and 3) the development of software infrastructures that are capable of performing massive, distributed data analysis across a comprehensive data science framework. NASA/JPL has begun an initiative in data science to address these challenges.
Our goal is to evaluate how scientific productivity can be improved through optimized architectural topologies that identify how to deploy and manage the access, distribution, computation, and reduction of massive, distributed data, while managing the uncertainties of scientific conclusions derived from such capabilities. This talk will provide an overview of JPL's efforts in developing a comprehensive architectural approach to data science.
Maximum entropy approach to H -theory: Statistical mechanics of hierarchical systems
NASA Astrophysics Data System (ADS)
Vasconcelos, Giovani L.; Salazar, Domingos S. P.; Macêdo, A. M. S.
2018-02-01
A formalism, called H-theory, is applied to the problem of statistical equilibrium of a hierarchical complex system with multiple time and length scales. In this approach, the system is formally treated as being composed of a small subsystem—representing the region where the measurements are made—in contact with a set of "nested heat reservoirs" corresponding to the hierarchical structure of the system, where the temperatures of the reservoirs are allowed to fluctuate owing to the complex interactions between degrees of freedom at different scales. The probability distribution function (pdf) of the temperature of the reservoir at a given scale, conditioned on the temperature of the reservoir at the next largest scale in the hierarchy, is determined from a maximum entropy principle subject to appropriate constraints that describe the thermal equilibrium properties of the system. The marginal temperature distribution of the innermost reservoir is obtained by integrating over the conditional distributions of all larger scales, and the resulting pdf is written in analytical form in terms of certain special transcendental functions, known as the Fox H functions. The distribution of states of the small subsystem is then computed by averaging the quasiequilibrium Boltzmann distribution over the temperature of the innermost reservoir. This distribution can also be written in terms of H functions. The general family of distributions reported here recovers, as particular cases, the stationary distributions recently obtained by Macêdo et al. [Phys. Rev. E 95, 032315 (2017), 10.1103/PhysRevE.95.032315] from a stochastic dynamical approach to the problem.
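The nested-reservoir construction described in this abstract can be summarized in two equations (the notation below is inferred from the description, not copied from the paper): the marginal temperature pdf of the innermost reservoir is obtained by integrating the chain of conditional pdfs over all larger scales, and the subsystem's distribution of states is the Boltzmann factor averaged over that marginal.

```latex
% marginal pdf of the innermost inverse temperature \beta_0:
% integrate the chain of conditional pdfs f_i(\beta_i \mid \beta_{i+1})
% over the reservoirs at all larger scales
f(\beta_0) = \int_0^\infty \!\cdots\! \int_0^\infty
  \left[ \prod_{i=0}^{N-1} f_i(\beta_i \mid \beta_{i+1}) \right]
  f_N(\beta_N)\, d\beta_1 \cdots d\beta_N ,

% distribution of states of the small subsystem:
% quasiequilibrium Boltzmann distribution averaged over \beta_0
P(E) = \int_0^\infty \frac{e^{-\beta_0 E}}{Z(\beta_0)}\, f(\beta_0)\, d\beta_0 .
```

In the paper, each conditional pdf is fixed by a maximum entropy principle, and the resulting f(β₀) and P(E) are expressed in closed form through Fox H functions.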
NASA Astrophysics Data System (ADS)
Lin, Y.; O'Malley, D.; Vesselinov, V. V.
2015-12-01
Inverse modeling seeks model parameters given a set of observed state variables. However, for many practical problems, because the observed data sets are often large and the model parameters are often numerous, conventional methods for solving inverse problems can be computationally expensive. We have developed a new, computationally efficient Levenberg-Marquardt method for large-scale inverse modeling. Levenberg-Marquardt methods require the solution of a dense linear system of equations, which can be prohibitively expensive to compute for large-scale inverse problems. Our novel method projects the original large-scale linear problem down to a Krylov subspace, such that the dimensionality of the measurements can be significantly reduced. Furthermore, instead of solving the linear system for every Levenberg-Marquardt damping parameter, we store the Krylov subspace computed when solving for the first damping parameter and recycle it for all the following damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved by these computational techniques. We apply this new inverse modeling method to invert for a random transmissivity field. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) at each computational node in the model domain. The inversion is also aided by the use of regularization techniques. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Julia is an advanced high-level scientific programming language that allows for efficient memory management and utilization of high-performance computational resources. Compared with a Levenberg-Marquardt method using standard linear inversion techniques, our Levenberg-Marquardt method yields a speed-up ratio of 15 in a multi-core computational environment and a speed-up ratio of 45 in a single-core computational environment.
Therefore, our new inverse modeling method is a powerful tool for large-scale applications.
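The subspace-recycling idea can be sketched in a few lines: build one Krylov (Lanczos) basis for the Gauss-Newton system, then reuse the small projected tridiagonal system for every damping parameter. This is an illustrative sketch, not the MADS/Julia implementation; the Jacobian J, residual r, and subspace size are made up.

```python
import numpy as np

def lanczos(A, b, m):
    """m-step Lanczos: orthonormal basis V (n x m) and tridiagonal T = V^T A V,
    with b/||b|| as the first basis vector (full reorthogonalization for stability)."""
    n = b.size
    V = np.zeros((n, m))
    alpha, beta = np.zeros(m), np.zeros(m - 1)
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(m):
        w = A @ V[:, j]
        alpha[j] = V[:, j] @ w
        w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)   # orthogonalize against all previous
        if j + 1 < m:
            beta[j] = np.linalg.norm(w)
            V[:, j + 1] = w / beta[j]
    return V, np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

# hypothetical dense Jacobian J and residual r of a least-squares problem
rng = np.random.default_rng(1)
J = rng.standard_normal((200, 50))
r = rng.standard_normal(200)
g = J.T @ r                        # gradient J^T r
A = J.T @ J                        # Gauss-Newton approximation of the Hessian
m = 20
V, T = lanczos(A, g, m)            # Krylov basis built once...
rhs = np.zeros(m); rhs[0] = np.linalg.norm(g)
for lam in (1.0, 0.1, 0.01):       # ...and recycled for every damping parameter
    y = np.linalg.solve(T + lam * np.eye(m), rhs)   # small m x m solve
    step = V @ y                   # approximates (J^T J + lam I)^{-1} J^T r
```

Each new damping parameter costs only an m × m tridiagonal solve instead of a fresh n × n dense factorization, which is where the reported speed-ups come from.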
Trace: a high-throughput tomographic reconstruction engine for large-scale datasets.
Bicer, Tekin; Gürsoy, Doğa; Andrade, Vincent De; Kettimuthu, Rajkumar; Scullin, William; Carlo, Francesco De; Foster, Ian T
2017-01-01
Modern synchrotron light sources and detectors produce data at such scale and complexity that large-scale computation is required to unleash their full power. One of the widely used imaging techniques that generates data at tens of gigabytes per second is computed tomography (CT). Although CT experiments result in rapid data generation, the analysis and reconstruction of the collected data may require hours or even days of computation time with a medium-sized workstation, which hinders the scientific progress that relies on the results of analysis. We present Trace, a data-intensive computing engine that we have developed to enable high-performance implementation of iterative tomographic reconstruction algorithms for parallel computers. Trace provides fine-grained reconstruction of tomography datasets using both (thread-level) shared memory and (process-level) distributed memory parallelization. Trace utilizes a special data structure called replicated reconstruction object to maximize application performance. We also present the optimizations that we apply to the replicated reconstruction objects and evaluate them using tomography datasets collected at the Advanced Photon Source. Our experimental evaluations show that our optimizations and parallelization techniques can provide 158× speedup using 32 compute nodes (384 cores) over a single-core configuration and decrease the end-to-end processing time of a large sinogram (with 4501 × 1 × 22,400 dimensions) from 12.5 h to <5 min per iteration. The proposed tomographic reconstruction engine can efficiently process large-scale tomographic data using many compute nodes and minimize reconstruction times.
A Distributed Representation of Remembered Time
2015-11-19
…The hippocampus, time, and memory across scales. Journal of Experimental Psychology: General, 142(4), 1211-30. doi: 10.1037/a0033621 Howard, M. W. …accomplished this goal by developing a computational framework that describes a wide range of functional cellular correlates in the hippocampus and
Globus | Informatics Technology for Cancer Research (ITCR)
Globus software services provide secure cancer research data transfer, synchronization, and sharing in distributed environments at large scale. These services can be integrated into applications and research data gateways, leveraging Globus identity management, single sign-on, search, and authorization capabilities. Globus Genomics integrates Globus with the Galaxy genomics workflow engine and Amazon Web Services to enable cancer genomics analysis that can elastically scale compute resources with demand.
Temporal coding of reward-guided choice in the posterior parietal cortex
Hawellek, David J.; Wong, Yan T.; Pesaran, Bijan
2016-01-01
Making a decision involves computations across distributed cortical and subcortical networks. How such distributed processing is performed remains unclear. We test how the encoding of choice in a key decision-making node, the posterior parietal cortex (PPC), depends on the temporal structure of the surrounding population activity. We recorded spiking and local field potential (LFP) activity in the PPC while two rhesus macaques performed a decision-making task. We quantified the mutual information that neurons carried about an upcoming choice and its dependence on LFP activity. The spiking of PPC neurons was correlated with LFP phases at three distinct time scales in the theta, beta, and gamma frequency bands. Importantly, activity at these time scales encoded upcoming decisions differently. Choice information contained in neural firing varied with the phase of beta and gamma activity. For gamma activity, maximum choice information occurred at the same phase as the maximum spike count. However, for beta activity, choice information and spike count were greatest at different phases. In contrast, theta activity did not modulate the encoding properties of PPC units directly but was correlated with beta and gamma activity through cross-frequency coupling. We propose that the relative timing of local spiking and choice information reveals temporal reference frames for computations in either local or large-scale decision networks. Differences between the timing of task information and activity patterns may be a general signature of distributed processing across large-scale networks. PMID:27821752
Davis, Joe M
2011-10-28
General equations are derived for the distribution of minimum resolution between two chromatographic peaks, when peak heights in a multi-component chromatogram follow a continuous statistical distribution. The derivation draws on published theory by relating the area under the distribution of minimum resolution to the area under the distribution of the ratio of peak heights, which in turn is derived from the peak-height distribution. Two procedures are proposed for the equations' numerical solution. The procedures are applied to the log-normal distribution, which recently was reported to describe the distribution of component concentrations in three complex natural mixtures. For published statistical parameters of these mixtures, the distribution of minimum resolution is similar to that for the commonly assumed exponential distribution of peak heights used in statistical-overlap theory. However, these two distributions of minimum resolution can differ markedly, depending on the scale parameter of the log-normal distribution. Theory for the computation of the distribution of minimum resolution is extended to other cases of interest. With the log-normal distribution of peak heights as an example, the distribution of minimum resolution is computed when small peaks are lost due to noise or detection limits, and when the height of at least one peak is less than an upper limit. The distribution of minimum resolution shifts slightly to lower resolution values in the first case and to markedly larger resolution values in the second one. The theory and numerical procedure are confirmed by Monte Carlo simulation. Copyright © 2011 Elsevier B.V. All rights reserved.
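The Monte Carlo confirmation mentioned in this abstract can be illustrated with a simplified sketch: sample pairs of log-normal peak heights and, for each pair, find by bisection the smallest resolution at which two Gaussian peaks of those heights show two distinguishable maxima. The two-maxima criterion and all parameter values below are illustrative assumptions, not the paper's actual resolution criterion or statistics.

```python
import numpy as np

def n_maxima(d, h1, h2, sigma=1.0):
    """Count local maxima of two Gaussian peaks (heights h1, h2) separated by d."""
    x = np.linspace(-d / 2 - 4 * sigma, d / 2 + 4 * sigma, 4001)
    f = (h1 * np.exp(-((x + d / 2) ** 2) / (2 * sigma ** 2))
         + h2 * np.exp(-((x - d / 2) ** 2) / (2 * sigma ** 2)))
    interior = (f[1:-1] > f[:-2]) & (f[1:-1] > f[2:])
    return int(interior.sum())

def min_resolution(h1, h2, sigma=1.0):
    """Smallest resolution Rs = d / (4 sigma) at which the pair shows two maxima,
    found by bisection on the separation d."""
    lo, hi = 0.1 * sigma, 4.0 * sigma
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if n_maxima(mid, h1, h2, sigma) >= 2:
            hi = mid
        else:
            lo = mid
    return hi / (4 * sigma)

# Monte Carlo over log-normal peak heights (scale parameter chosen arbitrarily)
rng = np.random.default_rng(2)
heights = rng.lognormal(mean=0.0, sigma=1.0, size=(2, 200))
rs = np.array([min_resolution(h1, h2) for h1, h2 in heights.T])
```

For equal heights this recovers the classical result that two equal Gaussians merge when their separation falls below 2σ, i.e. a minimum resolution of about 0.5; unequal heights shift the required resolution upward, producing the distribution studied in the paper.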
Simulation framework for intelligent transportation systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ewing, T.; Doss, E.; Hanebutte, U.
1996-10-01
A simulation framework has been developed for a large-scale, comprehensive, scalable simulation of an Intelligent Transportation System (ITS). The simulator is designed for running on parallel computers and distributed (networked) computer systems, but can run on standalone workstations for smaller simulations. The simulator currently models instrumented smart vehicles with in-vehicle navigation units capable of optimal route planning and Traffic Management Centers (TMC). The TMC has probe vehicle tracking capabilities (display position and attributes of instrumented vehicles), and can provide two-way interaction with traffic to provide advisories and link times. Both the in-vehicle navigation module and the TMC feature detailed graphical user interfaces to support human-factors studies. Realistic modeling of variations of the posted driving speed is based on human factors studies that take into consideration weather, road conditions, driver personality and behavior, and vehicle type. The prototype has been developed on a distributed system of networked UNIX computers but is designed to run on parallel computers, such as ANL's IBM SP-2, for large-scale problems. A novel feature of the approach is that vehicles are represented by autonomous computer processes which exchange messages with other processes. The vehicles have a behavior model which governs route selection and driving behavior, and can react to external traffic events much like real vehicles. With this approach, the simulation is scalable to take advantage of emerging massively parallel processor (MPP) systems.
Extraction of drainage networks from large terrain datasets using high throughput computing
NASA Astrophysics Data System (ADS)
Gong, Jianya; Xie, Jibo
2009-02-01
Advanced digital photogrammetry and remote sensing technology produces large terrain datasets (LTD). How to process and use these LTD has become a big challenge for GIS users. Extracting drainage networks, which are basic for hydrological applications, from LTD is one of the typical applications of digital terrain analysis (DTA) in geographical information applications. Existing serial drainage algorithms cannot deal with large data volumes in a timely fashion, and few GIS platforms can process LTD beyond the GB size. High throughput computing (HTC), a distributed parallel computing mode, is proposed to improve the efficiency of drainage networks extraction from LTD. Drainage network extraction using HTC involves two key issues: (1) how to decompose the large DEM datasets into independent computing units and (2) how to merge the separate outputs into a final result. A new decomposition method is presented in which the large datasets are partitioned into independent computing units using natural watershed boundaries instead of using regular 1-dimensional (strip-wise) and 2-dimensional (block-wise) decomposition. Because the distribution of drainage networks is strongly related to watershed boundaries, the new decomposition method is more effective and natural. The method to extract natural watershed boundaries was improved by using multi-scale DEMs instead of single-scale DEMs. A HTC environment is employed to test the proposed methods with real datasets.
NASA Technical Reports Server (NTRS)
Avizienis, A.; Gunningberg, P.; Kelly, J. P. J.; Strigini, L.; Traverse, P. J.; Tso, K. S.; Voges, U.
1986-01-01
To establish a long-term research facility for experimental investigations of design diversity as a means of achieving fault-tolerant systems, a distributed testbed for multiple-version software was designed. It is part of a local network, which utilizes the Locus distributed operating system to operate a set of 20 VAX 11/750 computers. It is used in experiments to measure the efficacy of design diversity and to investigate reliability increases under large-scale, controlled experimental conditions.
Dependence of Snowmelt Simulations on Scaling of the Forcing Processes (Invited)
NASA Astrophysics Data System (ADS)
Winstral, A. H.; Marks, D. G.; Gurney, R. J.
2009-12-01
The spatial organization and scaling relationships of snow distribution in mountain environs is ultimately dependent on the controlling processes. These processes include interactions between weather, topography, vegetation, snow state, and seasonally-dependent radiation inputs. In large scale snow modeling it is vital to know these dependencies to obtain accurate predictions while reducing computational costs. This study examined the scaling characteristics of the forcing processes and the dependency of distributed snowmelt simulations to their scaling. A base model simulation characterized these processes with 10 m resolution over a 14.0 km² basin with an elevation range of 1474-2244 m a.s.l. Each of the major processes affecting snow accumulation and melt - precipitation, wind speed, solar radiation, thermal radiation, temperature, and vapor pressure - were independently degraded to 1 km resolution. Seasonal and event-specific results were analyzed. Results indicated that scale effects on melt vary by process and weather conditions. The dependence of melt simulations on the scaling of solar radiation fluxes also had a seasonal component. These process-based scaling characteristics should remain static through time as they are based on physical considerations. As such, these results not only provide guidance for current modeling efforts, but are also well suited to predicting how potential climate changes will affect the heterogeneity of mountain snow distributions.
Solar potential scaling and the urban road network topology
NASA Astrophysics Data System (ADS)
Najem, Sara
2017-01-01
We explore the scaling of cities' solar potentials with their number of buildings and reveal a latent dependence between the solar potential and the length of the corresponding city's road network. This scaling is shown to be valid at the grid and block levels and is attributed to a common street length distribution. Additionally, we compute the buildings' solar potential correlation function and length in order to determine the set of critical exponents typifying the urban solar potential universality class.
Job Scheduling in a Heterogeneous Grid Environment
NASA Technical Reports Server (NTRS)
Shan, Hong-Zhang; Smith, Warren; Oliker, Leonid; Biswas, Rupak
2004-01-01
Computational grids have the potential for solving large-scale scientific problems using heterogeneous and geographically distributed resources. However, a number of major technical hurdles must be overcome before this potential can be realized. One problem that is critical to effective utilization of computational grids is the efficient scheduling of jobs. This work addresses this problem by describing and evaluating a grid scheduling architecture and three job migration algorithms. The architecture is scalable and does not assume control of local site resources. The job migration policies use the availability and performance of computer systems, the network bandwidth available between systems, and the volume of input and output data associated with each job. An extensive performance comparison is presented using real workloads from leading computational centers. The results, based on several key metrics, demonstrate that the performance of our distributed migration algorithms is significantly greater than that of a local scheduling framework and comparable to a non-scalable global scheduling approach.
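The migration policy described in this abstract weighs system availability and performance, network bandwidth, and job data volume. A hypothetical sketch of that kind of cost model (the site attributes and numbers below are invented for illustration, not taken from the paper):

```python
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    queue_wait: float   # estimated seconds before the job would start
    speed: float        # relative compute speed (1.0 = reference system)
    bandwidth: float    # MB/s available from the submitting site

def best_site(sites, base_runtime, data_mb):
    """Pick the site minimizing estimated turnaround:
    queue wait + speed-scaled runtime + input/output transfer time."""
    def turnaround(s):
        return s.queue_wait + base_runtime / s.speed + data_mb / s.bandwidth
    return min(sites, key=turnaround)

sites = [
    Site("local",  queue_wait=3600, speed=1.0, bandwidth=1000.0),
    Site("remote", queue_wait=60,   speed=2.0, bandwidth=10.0),
]
# a 2-hour job with 5 GB of data: the slow transfer still beats the long local queue
choice = best_site(sites, base_runtime=7200, data_mb=5000)
```

A real grid scheduler would refresh these estimates continuously and, as the abstract notes, avoid assuming control of local site resources.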
Belle II grid computing: An overview of the distributed data management system.
NASA Astrophysics Data System (ADS)
Bansal, Vikas; Schram, Malachi; Belle II Collaboration
2017-01-01
The Belle II experiment at the SuperKEKB collider in Tsukuba, Japan, will start physics data taking in 2018 and will accumulate 50 ab⁻¹ of e⁺e⁻ collision data, about 50 times larger than the data set of the Belle experiment. The computing requirements of Belle II are comparable to those of a Run I LHC experiment. Computing at this scale requires efficient use of the compute grids in North America, Asia and Europe and will take advantage of upgrades to the high-speed global network. We present the architecture of data flow and data handling as a part of the Belle II computing infrastructure.
The Quantitative Analysis of User Behavior Online - Data, Models and Algorithms
NASA Astrophysics Data System (ADS)
Raghavan, Prabhakar
By blending principles from mechanism design, algorithms, machine learning and massive distributed computing, the search industry has become good at optimizing monetization on sound scientific principles. This represents a successful and growing partnership between computer science and microeconomics. When it comes to understanding how online users respond to the content and experiences presented to them, we have more of a lacuna in the collaboration between computer science and certain social sciences. We will use a concrete technical example from image search results presentation, developing in the process some algorithmic and machine learning problems of interest in their own right. We then use this example to motivate the kinds of studies that need to grow between computer science and the social sciences; a critical element of this is the need to blend large-scale data analysis with smaller-scale eye-tracking and "individualized" lab studies.
Continuous-Variable Instantaneous Quantum Computing is Hard to Sample.
Douce, T; Markham, D; Kashefi, E; Diamanti, E; Coudreau, T; Milman, P; van Loock, P; Ferrini, G
2017-02-17
Instantaneous quantum computing is a subuniversal quantum complexity class, whose circuits have proven to be hard to simulate classically in the discrete-variable realm. We extend this proof to the continuous-variable (CV) domain by using squeezed states and homodyne detection, and by exploring the properties of postselected circuits. In order to treat postselection in CVs, we consider finitely resolved homodyne detectors, corresponding to a realistic scheme based on discrete probability distributions of the measurement outcomes. The unavoidable errors stemming from the use of finitely squeezed states are suppressed through a qubit-into-oscillator Gottesman-Kitaev-Preskill encoding of quantum information, which was previously shown to enable fault-tolerant CV quantum computation. Finally, we show that, in order to render postselected computational classes in CVs meaningful, a logarithmic scaling of the squeezing parameter with the circuit size is necessary, translating into a polynomial scaling of the input energy.
Robert E. Keane; Stacy A. Drury; Eva C. Karau; Paul F. Hessburg; Keith M. Reynolds
2010-01-01
This paper presents modeling methods for mapping fire hazard and fire risk using a research model called FIREHARM (FIRE Hazard and Risk Model) that computes common measures of fire behavior, fire danger, and fire effects to spatially portray fire hazard over space. FIREHARM can compute a measure of risk associated with the distribution of these measures over time using...
Scalable Automated Model Search
2014-05-20
…machines. Categories and Subject Descriptors: Big Data [Distributed Computing]: Large-scale optimization. 1. INTRODUCTION Modern scientific and… from Continuum Analytics [1], and Apache Spark 0.8.1. Additionally, we made use of Hadoop 1.0.4 configured on local disks as our data store for the large… Borkar et al. Hyracks: A flexible and extensible foundation for data-intensive computing. In ICDE, 2011. [16] J. Canny and H. Zhao. Big data…
Towards a model of pion generalized parton distributions from Dyson-Schwinger equations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moutarde, H.
2015-04-10
We compute the pion quark Generalized Parton Distribution H^q and Double Distributions F^q and G^q in a coupled Bethe-Salpeter and Dyson-Schwinger approach. We use simple algebraic expressions inspired by the numerical resolution of Dyson-Schwinger and Bethe-Salpeter equations. We explicitly check the support and polynomiality properties, and the behavior under charge conjugation or time invariance of our model. We derive analytic expressions for the pion Double Distributions and Generalized Parton Distribution at vanishing pion momentum transfer at a low scale. Our model compares very well to experimental pion form factor or parton distribution function data.
Multiple Instance Fuzzy Inference
2015-12-02
very small probabilities. To compute Pr(t | B_i) for a given bag B_i, a conjunction measure of all its instances B_ij, j = 1, …, M, is computed using the noisy-or operator Pr(t | B_i) = 1 − ∏_{1≤j≤M} (1 − Pr(B_ij ∈ t)), (2.5) where Pr(B_ij ∈ t) is computed from a Gaussian distribution centred at the concept… X_nk to target concept C_i, and is computed using Pr(X_nk ∈ C_i) = exp(−∑_{j=1}^{D} s_ij (x_nkj − c_ij)²) (2.9). In (4.5), s_ij is a scaling parameter that weights the
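The two equations in this fragment — a Gaussian instance-to-concept similarity (Eq. 2.9 style) combined across a bag with the noisy-or operator (Eq. 2.5 style) — translate directly into code. The bag, concept, and scaling values below are made up for illustration:

```python
import numpy as np

def instance_prob(instances, concept, scales):
    """Gaussian similarity of each instance to a target concept:
    Pr(X in C) = exp(-sum_j s_j * (x_j - c_j)^2)."""
    d2 = ((instances - concept) ** 2 * scales).sum(axis=1)
    return np.exp(-d2)

def noisy_or(instances, concept, scales):
    """Bag-level probability via the noisy-or combination:
    Pr(t | B_i) = 1 - prod_j (1 - Pr(B_ij in t))."""
    p = instance_prob(instances, concept, scales)
    return 1.0 - np.prod(1.0 - p)

concept = np.array([0.0, 0.0])
scales = np.array([1.0, 1.0])
bag = np.array([[3.0, 3.0], [0.1, -0.1], [5.0, 0.0]])
prob = noisy_or(bag, concept, scales)   # dominated by the instance nearest the concept
```

The noisy-or form captures the multiple-instance assumption: a single instance close to the concept is enough to make the whole bag score high, while bags of only distant instances score near zero.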
Design and Implement of Astronomical Cloud Computing Environment In China-VO
NASA Astrophysics Data System (ADS)
Li, Changhua; Cui, Chenzhou; Mi, Linying; He, Boliang; Fan, Dongwei; Li, Shanshan; Yang, Sisi; Xu, Yunfei; Han, Jun; Chen, Junyi; Zhang, Hailong; Yu, Ce; Xiao, Jian; Wang, Chuanjun; Cao, Zihuang; Fan, Yufeng; Liu, Liang; Chen, Xiao; Song, Wenming; Du, Kangyu
2017-06-01
Astronomy cloud computing environment is a cyber-infrastructure for astronomy research initiated by the Chinese Virtual Observatory (China-VO) under funding support from the NDRC (National Development and Reform Commission) and CAS (Chinese Academy of Sciences). Based on virtualization technology, the astronomy cloud computing environment was designed and implemented by the China-VO team. It consists of five distributed nodes across the mainland of China. Astronomers can obtain computing and storage resources in this cloud computing environment. Through this environment, astronomers can easily search and analyze astronomical data collected by different telescopes and data centers, and avoid large-scale dataset transportation.
Decentralized Optimal Dispatch of Photovoltaic Inverters in Residential Distribution Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dall'Anese, Emiliano; Dhople, Sairaj V.; Johnson, Brian B.
Summary form only given. Decentralized methods for computing optimal real and reactive power setpoints for residential photovoltaic (PV) inverters are developed in this paper. It is known that conventional PV inverter controllers, which are designed to extract maximum power at unity power factor, cannot address secondary performance objectives such as voltage regulation and network loss minimization. Optimal power flow techniques can be utilized to select which inverters will provide ancillary services, and to compute their optimal real and reactive power setpoints according to well-defined performance criteria and economic objectives. Leveraging advances in sparsity-promoting regularization techniques and semidefinite relaxation, this paper shows how such problems can be solved with reduced computational burden and optimality guarantees. To enable large-scale implementation, a novel algorithmic framework is introduced - based on the so-called alternating direction method of multipliers - by which optimal power flow-type problems in this setting can be systematically decomposed into sub-problems that can be solved in a decentralized fashion by the utility and customer-owned PV systems with limited exchanges of information. Since the computational burden is shared among multiple devices and the requirement of all-to-all communication can be circumvented, the proposed optimization approach scales favorably to large distribution networks.
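The decomposition principle behind the alternating direction method of multipliers (ADMM) mentioned in this abstract can be shown on a toy consensus problem: each agent holds a private quadratic cost, solves only its own subproblem, and exchanges a single shared variable with a coordinator. This is a generic ADMM illustration, not the paper's optimal power flow formulation:

```python
import numpy as np

def consensus_admm(local_targets, rho=1.0, iters=100):
    """Each agent i minimizes (x_i - a_i)^2 / 2 subject to x_i = z (consensus).
    ADMM alternates local prox steps, a global averaging step, and dual updates."""
    a = np.asarray(local_targets, dtype=float)
    x = np.zeros(a.size)   # local copies, updated independently by each agent
    z = 0.0                # shared consensus variable, held by the coordinator
    u = np.zeros(a.size)   # scaled dual variables
    for _ in range(iters):
        x = (a + rho * (z - u)) / (1.0 + rho)   # local closed-form prox step
        z = np.mean(x + u)                      # only x_i + u_i is communicated
        u = u + x - z                           # dual ascent
    return z

z = consensus_admm([2.0, 4.0, 9.0])   # converges to the average of the targets
```

In the paper's setting the local subproblems belong to customer-owned PV systems and the utility, and the limited information exchange is exactly what makes the method decentralized.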
Divide and Conquer (DC) BLAST: fast and easy BLAST execution within HPC environments
Yim, Won Cheol; Cushman, John C.
2017-07-22
Bioinformatics is currently faced with very large-scale data sets that lead to computational jobs, especially sequence similarity searches, that can take absurdly long times to run. For example, the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST and BLAST+) suite, which is by far the most widely used tool for rapid similarity searching among nucleic acid or amino acid sequences, is highly central processing unit (CPU) intensive. While the BLAST suite of programs perform searches very rapidly, they have the potential to be accelerated. In recent years, distributed computing environments have become more widely accessible and used due to the increasing availability of high-performance computing (HPC) systems. Therefore, simple solutions for data parallelization are needed to expedite BLAST and other sequence analysis tools. However, existing software for parallel sequence similarity searches often requires extensive computational experience and skill on the part of the user. In order to accelerate BLAST and other sequence analysis tools, Divide and Conquer BLAST (DCBLAST) was developed to perform NCBI BLAST searches within a cluster, grid, or HPC environment by using a query sequence distribution approach. Scaling from one (1) to 256 CPU cores resulted in significant improvements in processing speed. Thus, DCBLAST dramatically accelerates the execution of BLAST searches using a simple, accessible, robust, and parallel approach. DCBLAST works across multiple nodes automatically and it overcomes the speed limitation of single-node BLAST programs. DCBLAST can be used on any HPC system, can take advantage of hundreds of nodes, and has no output limitations. Thus, this freely available tool simplifies distributed computation pipelines to facilitate the rapid discovery of sequence similarities between very large data sets.
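The core of the query-distribution approach is simple: split the multi-record query file into chunks of whole records and run each chunk as an independent BLAST job. A minimal sketch of the splitting step (this is not DCBLAST's code; the round-robin assignment and sample records are illustrative):

```python
def split_fasta(text, n_chunks):
    """Split multi-record FASTA text into n_chunks groups of whole records,
    never breaking a record across chunks."""
    records = [">" + r for r in text.split(">") if r.strip()]
    chunks = [[] for _ in range(n_chunks)]
    for i, rec in enumerate(records):       # round-robin keeps chunk counts balanced
        chunks[i % n_chunks].append(rec)
    return ["".join(c) for c in chunks if c]

fasta = ">s1\nACGT\n>s2\nGGCC\n>s3\nTTAA\n"
parts = split_fasta(fasta, 2)
# each part could then be written to a file and submitted as an independent job, e.g.
#   blastn -query chunk_0.fa -db nt -out chunk_0.out
```

Because BLAST scores each query independently, the per-chunk outputs can simply be concatenated afterwards, which is what makes this kind of parallelization embarrassingly easy.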
Applied Distributed Model Predictive Control for Energy Efficient Buildings and Ramp Metering
NASA Astrophysics Data System (ADS)
Koehler, Sarah Muraoka
Industrial large-scale control problems present an interesting algorithmic design challenge. A number of controllers must cooperate in real-time on a network of embedded hardware with limited computing power in order to maximize system efficiency while respecting constraints and despite communication delays. Model predictive control (MPC) can automatically synthesize a centralized controller which optimizes an objective function subject to a system model, constraints, and predictions of disturbances. Unfortunately, the computations required by model predictive controllers for large-scale systems often limit their industrial implementation to medium-scale, slow processes. Distributed model predictive control (DMPC) enters the picture as a way to decentralize a large-scale model predictive control problem. The main idea of DMPC is to split the computations required by the MPC problem amongst distributed processors that can compute in parallel and communicate iteratively to find a solution. Popular proposed solutions include distributed optimization algorithms such as dual decomposition and the alternating direction method of multipliers (ADMM). However, these algorithms ignore two practical challenges: substantial communication delays present in control systems, and problem non-convexity. This thesis presents two novel and practically effective DMPC algorithms. The first DMPC algorithm is based on a primal-dual active-set method which achieves fast convergence, making it suitable for large-scale control applications that have a large communication delay across the communication network. In particular, this algorithm is suited for MPC problems with a quadratic cost, linear dynamics, forecasted demand, and box constraints. We measure the performance of this algorithm and show that it significantly outperforms both dual decomposition and ADMM in the presence of communication delay. 
The second DMPC algorithm is based on an inexact interior point method which is suited for nonlinear optimization problems. The parallel computation of the algorithm exploits iterative linear algebra methods for the main linear algebra computations in the algorithm. We show that the splitting of the algorithm is flexible and can thus be applied to various distributed platform configurations. The two proposed algorithms are applied to two main energy and transportation control problems. The first application is energy efficient building control. Buildings represent 40% of energy consumption in the United States. Thus, it is significant to improve the energy efficiency of buildings. The goal is to minimize energy consumption subject to the physics of the building (e.g. heat transfer laws), the constraints of the actuators as well as the desired operating constraints (thermal comfort of the occupants), and heat load on the system. In this thesis, we describe the control systems of forced air building systems in practice. We discuss the "Trim and Respond" algorithm which is a distributed control algorithm that is used in practice, and show that it performs similarly to a one-step explicit DMPC algorithm. Then, we apply the novel distributed primal-dual active-set method and provide extensive numerical results for the building MPC problem. The second main application is the control of ramp metering signals to optimize traffic flow through a freeway system. This application is particularly important since urban congestion has more than doubled in the past few decades. The ramp metering problem is to maximize freeway throughput subject to freeway dynamics (derived from mass conservation), actuation constraints, freeway capacity constraints, and predicted traffic demand. In this thesis, we develop a hybrid model predictive controller for ramp metering that is guaranteed to be persistently feasible and stable. 
This contrasts with previous work on MPC for ramp metering, where such guarantees are absent. We apply a smoothing method to the hybrid model predictive controller and apply the inexact interior point method to this nonlinear, non-convex ramp metering problem.
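The distributed-optimization baseline the thesis compares against can be sketched on a toy problem (this is generic consensus ADMM on a separable quadratic, not the thesis's primal-dual active-set or inexact interior point methods): each "processor" minimizes its local term in parallel, then the iterates are averaged and dual variables updated.

```python
import numpy as np

def consensus_admm(a, rho=1.0, iters=100):
    """Consensus ADMM for min_x sum_i 0.5*(x - a_i)^2.
    Each term i is handled by its own local update (parallelizable);
    z is the shared consensus variable, u the scaled duals."""
    a = np.asarray(a, dtype=float)
    x = np.zeros(len(a))
    u = np.zeros(len(a))
    z = 0.0
    for _ in range(iters):
        x = (a + rho * (z - u)) / (1.0 + rho)   # local (parallel) updates
        z = np.mean(x + u)                       # gather / averaging step
        u = u + x - z                            # dual ascent updates
    return z

z = consensus_admm([1.0, 3.0])   # optimum of the sum is x = 2
```

The averaging step is the only communication each iteration requires, which is exactly why iteration count (and hence communication delay) dominates the practical cost of such schemes.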
Developing science gateways for drug discovery in a grid environment.
Pérez-Sánchez, Horacio; Rezaei, Vahid; Mezhuyev, Vitaliy; Man, Duhu; Peña-García, Jorge; den-Haan, Helena; Gesing, Sandra
2016-01-01
Methods for in silico screening of large databases of molecules increasingly complement and replace experimental techniques to discover novel compounds to combat diseases. As these techniques become more complex and computationally costly, we face a growing challenge in providing the life sciences research community with a convenient tool for high-throughput virtual screening on distributed computing resources. To this end, we recently integrated the biophysics-based drug-screening program FlexScreen into a service, applicable for large-scale parallel screening and reusable in the context of scientific workflows. Our implementation is based on Pipeline Pilot and the Simple Object Access Protocol and provides an easy-to-use graphical user interface to construct complex workflows, which can be executed on distributed computing resources, thus accelerating the throughput by several orders of magnitude.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Millett, Paul; McDeavitt, Sean; Deo, Chaitanya
This proposal will investigate the stability of bimodal pore size distributions in metallic uranium and uranium-zirconium alloys during sintering and re-sintering annealing treatments. The project will utilize both computational and experimental approaches. The computational approach includes both Molecular Dynamics simulations to determine the self-diffusion coefficients in pure U and U-Zr alloys in single crystals, grain boundaries, and free surfaces, as well as calculations of grain boundary and free surface interfacial energies. Phase-field simulations using MOOSE will be conducted to study pore and grain structure evolution in microstructures with bimodal pore size distributions. Experiments will also be performed to validate the simulations, and measure the time-dependent densification of bimodal porous compacts.
NASA Astrophysics Data System (ADS)
Komini Babu, Siddharth; Mohamed, Alexander I.; Whitacre, Jay F.; Litster, Shawn
2015-06-01
This paper presents the use of nanometer scale resolution X-ray computed tomography (nano-CT) in the three-dimensional (3D) imaging of a Li-ion battery cathode, including the separate volumes of active material, binder plus conductive additive, and pore. The different high and low atomic number (Z) materials are distinguished by sequentially imaging the lithium cobalt oxide electrode in absorption and then Zernike phase contrast modes. Morphological parameters of the active material and the additives are extracted from the 3D reconstructions, including the distribution of contact areas between the additives and the active material. This method could provide a better understanding of the electric current distribution and structural integrity of battery electrodes, as well as provide detailed geometries for computational models.
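The contact-area measurement described above reduces, on a segmented voxel grid, to counting faces shared by two phase labels; a simplified stand-in for the paper's 3D analysis (face count times voxel face area approximates interfacial area):

```python
import numpy as np

def contact_faces(vol, a, b):
    """Count voxel faces where phase `a` touches phase `b` in a labeled
    3D volume. Multiplying by the physical face area gives contact area."""
    n = 0
    for axis in range(3):
        u = np.swapaxes(vol, 0, axis)
        # adjacent voxel pairs along this axis, in both orderings
        n += np.sum((u[:-1] == a) & (u[1:] == b))
        n += np.sum((u[:-1] == b) & (u[1:] == a))
    return int(n)

# Toy 2x2x2 volume: one active-material voxel (1) above one additive voxel (2).
vol = np.zeros((2, 2, 2), dtype=int)
vol[0, 0, 0] = 1
vol[1, 0, 0] = 2
```

Running this over every pair of active-material particles and additive clusters would yield the distribution of contact areas discussed in the abstract.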
3D chemical imaging in the laboratory by hyperspectral X-ray computed tomography
Egan, C. K.; Jacques, S. D. M.; Wilson, M. D.; Veale, M. C.; Seller, P.; Beale, A. M.; Pattrick, R. A. D.; Withers, P. J.; Cernik, R. J.
2015-01-01
We report the development of laboratory-based hyperspectral X-ray computed tomography, which allows the internal elemental chemistry of an object to be reconstructed and visualised in three dimensions. The method employs a spectroscopic X-ray imaging detector with sufficient energy resolution to distinguish individual elemental absorption edges. Elemental distributions can then be made by K-edge subtraction, or alternatively by voxel-wise spectral fitting to give relative atomic concentrations. We demonstrate its application to two material systems: studying the distribution of catalyst material on porous substrates for industrial-scale chemical processing; and mapping of minerals and inclusion phases inside a mineralised ore sample. The method makes use of a standard laboratory X-ray source with measurement times similar to those required for conventional computed tomography. PMID:26514938
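The K-edge subtraction step can be illustrated schematically (toy 2x2 "volumes" and an assumed threshold value, not the authors' processing chain): voxels containing the target element show a jump in attenuation between reconstructions just below and just above its K edge.

```python
import numpy as np

# Hypothetical reconstructed attenuation maps at energies just below and
# just above an element's K absorption edge (arbitrary units).
below = np.array([[0.10, 0.12],
                  [0.11, 0.50]])
above = np.array([[0.11, 0.13],
                  [0.12, 0.90]])

# K-edge subtraction: the element's voxels jump across the edge,
# everything else changes only slightly.
edge_map = above - below
element_mask = edge_map > 0.1   # threshold is an assumed value
```

In the real pipeline the hyperspectral detector gives many energy bins at once, so the below/above pair comes from a single scan rather than two exposures.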
Anisotropic Galaxy-Galaxy Lensing in the Illustris-1 Simulation
NASA Astrophysics Data System (ADS)
Brainerd, Tereasa G.
2017-06-01
In Cold Dark Matter universes, the dark matter halos of galaxies are expected to be triaxial, leading to a surface mass density that is not circularly symmetric. In principle, this "flattening" of the dark matter halos of galaxies should be observable as an anisotropy in the weak galaxy-galaxy lensing signal. The degree to which the weak lensing signal is observed to be anisotropic, however, will depend strongly on the degree to which mass (i.e., the dark matter) is aligned with light in the lensing galaxies. That is, the anisotropy will be maximized when the major axis of the projected mass distribution is well aligned with the projected light distribution of the lens galaxies. Observational studies of anisotropic galaxy-galaxy lensing have found an anisotropic weak lensing signal around massive, red galaxies. Detecting the signal around blue, disky galaxies has, however, been more elusive. A possible explanation for this is that mass and light are well aligned within red galaxies and poorly aligned within blue galaxies (an explanation that is supported by studies of the locations of satellites of large, relatively isolated galaxies). Here we compute the weak lensing signal of isolated central galaxies in the Illustris-1 simulation. We compute the anisotropy of the weak lensing signal using two definitions of the geometry: [1] the major axis of the projected dark matter mass distribution and [2] the major axis of the projected stellar mass. On projected scales less than 15% of the virial radius, an anisotropy of order 10% is found for both definitions of the geometry. On larger scales, the anisotropy computed relative to the major axis of the projected light distribution is less than the anisotropy computed relative to the major axis of the projected dark matter. 
On projected scales of order the virial radius, the anisotropy obtained when using the major axis of the light is an order of magnitude less than the anisotropy obtained when using the major axis of the dark matter. The suppression of the anisotropy when using the major axis of the light to define the geometry is indicative of a significant misalignment of mass and light in the Illustris-1 galaxies at large physical radii.
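Defining the lensing geometry requires the projected major axis of the mass (or light), which follows from weighted second moments; a minimal sketch with hypothetical particle data (not the Illustris-1 analysis code):

```python
import numpy as np

def major_axis_angle(x, y, w):
    """Position angle (radians) of the projected major axis from
    weighted second moments; w is the per-particle mass (stellar mass
    or dark matter mass, depending on the chosen definition)."""
    w = np.asarray(w, dtype=float)
    xc = np.average(x, weights=w)
    yc = np.average(y, weights=w)
    dx, dy = x - xc, y - yc
    qxx = np.average(dx * dx, weights=w)
    qyy = np.average(dy * dy, weights=w)
    qxy = np.average(dx * dy, weights=w)
    # principal-axis orientation of the 2x2 moment tensor
    return 0.5 * np.arctan2(2.0 * qxy, qxx - qyy)

# Particles stretched along the x-axis -> angle near 0.
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([0.1, -0.1, 0.1, -0.1])
angle = major_axis_angle(x, y, np.ones(4))
```

Computing this angle separately from the dark matter particles and from the stellar particles gives the two geometries whose misalignment the abstract discusses.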
NWChem: A comprehensive and scalable open-source solution for large scale molecular simulations
NASA Astrophysics Data System (ADS)
Valiev, M.; Bylaska, E. J.; Govind, N.; Kowalski, K.; Straatsma, T. P.; Van Dam, H. J. J.; Wang, D.; Nieplocha, J.; Apra, E.; Windus, T. L.; de Jong, W. A.
2010-09-01
The latest release of NWChem delivers an open-source computational chemistry package with extensive capabilities for large scale simulations of chemical and biological systems. Utilizing a common computational framework, diverse theoretical descriptions can be used to provide the best solution for a given scientific problem. Scalable parallel implementations and modular software design enable efficient utilization of current computational architectures. This paper provides an overview of NWChem focusing primarily on the core theoretical modules provided by the code and their parallel performance. Program summary: Program title: NWChem. Catalogue identifier: AEGI_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEGI_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: Open Source Educational Community License. No. of lines in distributed program, including test data, etc.: 11 709 543. No. of bytes in distributed program, including test data, etc.: 680 696 106. Distribution format: tar.gz. Programming language: Fortran 77, C. Computer: all Linux-based workstations and parallel supercomputers, Windows and Apple machines. Operating system: Linux, OS X, Windows. Has the code been vectorised or parallelized?: Code is parallelized. Classification: 2.1, 2.2, 3, 7.3, 7.7, 16.1, 16.2, 16.3, 16.10, 16.13. Nature of problem: Large-scale atomistic simulations of chemical and biological systems require efficient and reliable methods for ground- and excited-state solutions of the many-electron Hamiltonian, analysis of the potential energy surface, and dynamics. Solution method: Ground- and excited-state solutions of the many-electron Hamiltonian are obtained utilizing density-functional theory, the many-body perturbation approach, and the coupled cluster expansion. These solutions, or a combination thereof with classical descriptions, are then used to analyze the potential energy surface and perform dynamical simulations. 
Additional comments: Full documentation is provided in the distribution file. This includes an INSTALL file giving details of how to build the package. A set of test runs is provided in the examples directory. The distribution file for this program is over 90 Mbytes and therefore is not delivered directly when download or Email is requested. Instead a html file giving details of how the program can be obtained is sent. Running time: Running time depends on the size of the chemical system, complexity of the method, number of cpu's and the computational task. It ranges from several seconds for serial DFT energy calculations on a few atoms to several hours for parallel coupled cluster energy calculations on tens of atoms or ab-initio molecular dynamics simulation on hundreds of atoms.
Topological analysis of the CfA redshift survey
NASA Technical Reports Server (NTRS)
Vogeley, Michael S.; Park, Changbom; Geller, Margaret J.; Huchra, John P.; Gott, J. Richard, III
1994-01-01
We study the topology of large-scale structure in the Center for Astrophysics Redshift Survey, which now includes approximately 12,000 galaxies with limiting magnitude m(sub B) less than or equal to 15.5. The dense sampling and large volume of this survey allow us to compute the topology on smoothing scales from 6 to 20/h Mpc; we thus examine the topology of structure in both 'nonlinear' and 'linear' regimes. On smoothing scales less than or equal to 10/h Mpc this sample has 3 times the number of resolution elements of samples examined in previous studies. Isodensity surfaces of the smoothed galaxy density field demonstrate that coherent high-density structures and large voids dominate the galaxy distribution. We compute the genus-threshold density relation for isodensity surfaces of the CfA survey. To quantify phase correlation in these data, we compare the CfA genus with the genus of realizations of Gaussian random fields with the power spectrum measured for the CfA survey. On scales less than or equal to 10/h Mpc the observed genus amplitude is smaller than random phase (96% confidence level). This decrement reflects the degree of phase coherence in the observed galaxy distribution. In other words, the genus amplitude on these scales is not a good measure of the power spectrum slope. On scales greater than 10/h Mpc, where the galaxy distribution is roughly in the 'linear' regime, the genus amplitude is consistent with the random phase amplitude. The shape of the genus curve reflects the strong coherence in the observed structure; the observed genus curve appears broader than random phase (94% confidence level for smoothing scales less than or equal to 10/h Mpc) because the topology is spongelike over a very large range of density threshold. This departure from random phase is consistent with a distribution like a filamentary net of 'walls with holes.' 
On smoothing scales approaching approximately 20/h Mpc the shape of the CfA genus curve is consistent with random phase. There is very weak evidence for a shift of the genus toward a 'bubble-like' topology. To test cosmological models, we compute the genus for mock CfA surveys drawn from large (L greater than or approximately 400/h Mpc) N-body simulations of three variants of the cold dark matter (CDM) cosmogony. The genus amplitude of the 'standard' CDM model (omega h = 0.5, b = 1.5) differs from the observations (96% confidence level) on smoothing scales less than or approximately 10/h Mpc. An open CDM model (omega h = 0.2) and a CDM model with nonzero cosmological constant (omega h = 0.24, lambda (sub 0) = 0.6) are consistent with the observed genus amplitude over the full range of smoothing scales. All of these models fail (97% confidence level) to match the broadness of the observed genus curve on smoothing scales less than or equal to 10/h Mpc.
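The random-phase benchmark used in such comparisons has a closed form: for a Gaussian random field the genus-threshold relation is g(nu) proportional to (1 - nu^2) exp(-nu^2/2), with only the amplitude depending on the power spectrum. A small sketch of this reference curve:

```python
import numpy as np

def random_phase_genus(nu, amplitude=1.0):
    """Genus-threshold relation of a Gaussian (random-phase) field:
    g(nu) = A * (1 - nu**2) * exp(-nu**2 / 2). Only the amplitude A
    depends on the power spectrum; the shape is universal."""
    nu = np.asarray(nu, dtype=float)
    return amplitude * (1.0 - nu ** 2) * np.exp(-nu ** 2 / 2.0)

nu = np.linspace(-3, 3, 61)
g = random_phase_genus(nu)
# positive (spongelike) for |nu| < 1, negative (isolated clusters or
# voids) for |nu| > 1, with zero crossings exactly at nu = +/- 1
```

A measured genus curve broader than this shape, as reported for the CfA data, therefore directly signals non-Gaussian phase coherence.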
Efficient 3D inversions using the Richards equation
NASA Astrophysics Data System (ADS)
Cockett, Rowan; Heagy, Lindsey J.; Haber, Eldad
2018-07-01
Fluid flow in the vadose zone is governed by the Richards equation; it is parameterized by hydraulic conductivity, which is a nonlinear function of pressure head. Investigations in the vadose zone typically require characterizing distributed hydraulic properties. Water content or pressure head data may include direct measurements made from boreholes. Increasingly, proxy measurements from hydrogeophysics are being used to supply more spatially and temporally dense data sets. Inferring hydraulic parameters from such data sets requires the ability to efficiently solve and optimize the nonlinear time-domain Richards equation. This is particularly important as the number of parameters to be estimated in a vadose zone inversion continues to grow. In this paper, we describe an efficient technique to invert for distributed hydraulic properties in 1D, 2D, and 3D. Our technique does not store the Jacobian matrix, but rather computes its product with a vector. Existing literature on Richards equation inversion explicitly calculates the sensitivity matrix using finite differences or automatic differentiation; however, for large-scale problems these methods are constrained by computation and/or memory. Using an implicit sensitivity algorithm enables large-scale inversion problems for any distributed hydraulic parameters in the Richards equation to become tractable on modest computational resources. We provide an open source implementation of our technique based on the SimPEG framework, and show it in practice for a 3D inversion of saturated hydraulic conductivity using water content data through time.
DISCRN: A Distributed Storytelling Framework for Intelligence Analysis.
Shukla, Manu; Dos Santos, Raimundo; Chen, Feng; Lu, Chang-Tien
2017-09-01
Storytelling connects entities (people, organizations) using their observed relationships to establish meaningful storylines. This can be extended to spatiotemporal storytelling that incorporates locations, time, and graph computations to enhance coherence and meaning. But when performed sequentially these computations become a bottleneck because the massive number of entities make space and time complexity untenable. This article presents DISCRN, or distributed spatiotemporal ConceptSearch-based storytelling, a distributed framework for performing spatiotemporal storytelling. The framework extracts entities from microblogs and event data, and links these entities using a novel ConceptSearch to derive storylines in a distributed fashion utilizing key-value pair paradigm. Performing these operations at scale allows deeper and broader analysis of storylines. The novel parallelization techniques speed up the generation and filtering of storylines on massive datasets. Experiments with microblog posts such as Twitter data and Global Database of Events, Language, and Tone events show the efficiency of the techniques in DISCRN.
NASA Astrophysics Data System (ADS)
Garousi Nejad, I.; He, S.; Tang, Q.; Ogden, F. L.; Steinke, R. C.; Frazier, N.; Tarboton, D. G.; Ohara, N.; Lin, H.
2017-12-01
Spatial scale is one of the main considerations in hydrological modeling of snowmelt in mountainous areas. The size of model elements controls the degree to which variability can be explicitly represented versus what needs to be parameterized using effective properties such as averages or other subgrid variability parameterizations that may degrade the quality of model simulations. For snowmelt modeling, terrain parameters such as slope, aspect, vegetation, and elevation play an important role in the timing and quantity of snowmelt that serves as an input to hydrologic runoff generation processes. In general, higher resolution enhances the accuracy of the simulation, since fine meshes represent and preserve the spatial variability of atmospheric and surface characteristics better than coarse resolution. However, this increases computational cost, and there may be a scale beyond which the model response does not improve due to diminishing sensitivity to variability and irreducible uncertainty associated with the spatial interpolation of inputs. This paper examines the influence of spatial resolution on the snowmelt process using simulations of, and data from, the Animas River watershed, an alpine mountainous area in Colorado, USA, using an unstructured distributed physically based hydrological model developed for a parallel computing environment, ADHydro. Five spatial resolutions (30 m, 100 m, 250 m, 500 m, and 1 km) were used to investigate the variations in hydrologic response. This study demonstrated the importance of choosing the appropriate spatial scale in the implementation of ADHydro to obtain a balance between representing spatial variability and the computational cost. According to the results, variation in the input variables and parameters due to using different spatial resolutions resulted in changes in the obtained hydrological variables, especially snowmelt, both at the basin scale and distributed across the model mesh.
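The resolution trade-off described here can be illustrated with a toy block-averaging step: aggregating a fine grid onto a coarser model mesh removes subgrid variability (hypothetical data, not ADHydro's mesh machinery).

```python
import numpy as np

def coarsen(field, factor):
    """Block-average a fine grid (e.g. 30 m terrain or forcing input)
    onto a coarser model mesh with cells `factor` times larger."""
    n0, n1 = field.shape[0] // factor, field.shape[1] // factor
    trimmed = field[:n0 * factor, :n1 * factor]
    return trimmed.reshape(n0, factor, n1, factor).mean(axis=(1, 3))

rng = np.random.default_rng(2)
fine = rng.standard_normal((120, 120))   # hypothetical fine-scale input
coarse = coarsen(fine, 4)
# block averaging removes subgrid variability: the coarse field has a
# markedly smaller variance than the fine field
```

The variability lost in this averaging is exactly what must then be reintroduced through effective parameters or subgrid parameterizations.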
Evolution of the ATLAS distributed computing system during the LHC long shutdown
NASA Astrophysics Data System (ADS)
Campana, S.; Atlas Collaboration
2014-06-01
The ATLAS Distributed Computing project (ADC) was established in 2007 to develop and operate a framework, following the ATLAS computing model, to enable data storage, processing and bookkeeping on top of the Worldwide LHC Computing Grid (WLCG) distributed infrastructure. ADC development has always been driven by operations and this contributed to its success. The system has fulfilled the demanding requirements of ATLAS, daily consolidating worldwide up to 1 PB of data and running more than 1.5 million payloads distributed globally, supporting almost one thousand concurrent distributed analysis users. Comprehensive automation and monitoring minimized the operational manpower required. The flexibility of the system to adjust to operational needs has been important to the success of the ATLAS physics program. The LHC shutdown in 2013-2015 affords an opportunity to improve the system in light of operational experience and scale it to cope with the demanding requirements of 2015 and beyond, most notably a much higher trigger rate and event pileup. We will describe the evolution of the ADC software foreseen during this period. This includes consolidating the existing Production and Distributed Analysis framework (PanDA) and ATLAS Grid Information System (AGIS), together with the development and commissioning of next generation systems for distributed data management (DDM/Rucio) and production (Prodsys-2). We will explain how new technologies such as Cloud Computing and NoSQL databases, which ATLAS investigated as R&D projects in past years, will be integrated in production. Finally, we will describe more fundamental developments such as breaking job-to-data locality by exploiting storage federations and caches, and event level (rather than file or dataset level) workload engines.
Multi-scaling modelling in financial markets
NASA Astrophysics Data System (ADS)
Liu, Ruipeng; Aste, Tomaso; Di Matteo, T.
2007-12-01
In recent years, a new wave of interest has spurred the application of complexity science to finance, which may provide a guideline for understanding the mechanisms of financial markets, and researchers with different backgrounds have made increasing contributions, introducing new techniques and methodologies. In this paper, Markov-switching multifractal (MSM) models are briefly reviewed and the multi-scaling properties of different financial data are analyzed by computing the scaling exponents by means of the generalized Hurst exponent H(q). In particular, we have considered H(q) for price data, absolute returns and squared returns of different empirical financial time series. We have computed H(q) for simulated data based on the MSM models with Binomial and Lognormal distributions of the volatility components. The results demonstrate the capacity of the multifractal (MF) models to capture the stylized facts in finance, and the ability of the generalized Hurst exponent approach to detect the scaling features of financial time series.
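The generalized Hurst exponent H(q) is estimated from the scaling of the q-th order structure function, E|X(t + tau) - X(t)|^q ~ tau^(q H(q)); a minimal estimator (simplified relative to the paper's procedure, with assumed lag range) is:

```python
import numpy as np

def generalized_hurst(x, q=2, taus=range(1, 20)):
    """Estimate H(q) from the q-th order structure function
    K_q(tau) = E|X(t + tau) - X(t)|**q ~ tau**(q * H(q))."""
    x = np.asarray(x, dtype=float)
    taus = np.asarray(list(taus))
    kq = np.array([np.mean(np.abs(x[t:] - x[:-t]) ** q) for t in taus])
    # slope of log K_q versus log tau equals q * H(q)
    slope = np.polyfit(np.log(taus), np.log(kq), 1)[0]
    return slope / q

rng = np.random.default_rng(1)
bm = np.cumsum(rng.standard_normal(20000))   # Brownian motion, H = 0.5
h2 = generalized_hurst(bm, q=2)
```

For a uniscaling process H(q) is constant in q; multifractal series such as the MSM simulations show a q-dependent H(q), which is the signature the paper exploits.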
NASA Astrophysics Data System (ADS)
Ichiba, Abdellah; Gires, Auguste; Tchiguirinskaia, Ioulia; Schertzer, Daniel; Bompard, Philippe; Ten Veldhuis, Marie-Claire
2017-04-01
Nowadays, there is growing interest in small-scale rainfall information, provided by weather radars, for use in urban water management and decision-making. In parallel, increasing interest is devoted to the development of fully distributed, grid-based models, following the growth of computational capabilities and the availability of the high-resolution GIS information needed to implement such models. However, the choice of an appropriate implementation scale that integrates the catchment heterogeneity and the full rainfall variability measured by high-resolution radar technologies remains an open issue. This work proposes a two-step investigation of scale effects in urban hydrology and their impact on modeling. In the first step, fractal tools are used to highlight the scale dependency observed within the distributed data used to describe the catchment heterogeneity; both the structure of the sewer network and the distribution of impervious areas are analyzed. Then an intensive multi-scale modeling exercise is carried out to understand scaling effects on hydrological model performance. Investigations were conducted using a fully distributed and physically based model, Multi-Hydro, developed at Ecole des Ponts ParisTech. The model was implemented at 17 spatial resolutions ranging from 100 m to 5 m, and modeling investigations were performed using both rain gauge rainfall information and high-resolution X-band radar data in order to assess the sensitivity of the model to small-scale rainfall variability. Results from this work demonstrate the challenges that scale effects pose for urban hydrological modeling. Fractal concepts highlight the scale dependency observed within the distributed data used to implement hydrological models: patterns of geophysical data change with the observation pixel size. 
The multi-scale modeling investigation performed with the Multi-Hydro model at 17 spatial resolutions confirms the scaling effect on hydrological model performance. Results were analyzed at three ranges of scales identified in the fractal analysis and confirmed in the modeling work. The sensitivity of the model to small-scale rainfall variability is discussed as well.
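The fractal analysis of distributed data such as an impervious-area mask is typically a box-counting measurement; a minimal sketch (a filled plane recovers dimension 2 as a sanity check, with assumed box sizes):

```python
import numpy as np

def box_counting_dimension(mask, sizes=(1, 2, 4, 8, 16)):
    """Box-counting dimension of a binary map (e.g. impervious areas):
    N(s) ~ s**(-D), so D is minus the slope of log N versus log s."""
    counts = []
    for s in sizes:
        n0, n1 = mask.shape[0] // s, mask.shape[1] // s
        blocks = mask[:n0 * s, :n1 * s].reshape(n0, s, n1, s)
        # count boxes of side s containing at least one occupied pixel
        counts.append(np.count_nonzero(blocks.any(axis=(1, 3))))
    return -np.polyfit(np.log(sizes), np.log(counts), 1)[0]

full = np.ones((64, 64), dtype=bool)   # a completely filled plane: D = 2
d = box_counting_dimension(full)
```

A dimension below 2 for a real impervious-area or sewer-network map quantifies the scale dependency that the paper's fractal step detects.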
R&D100: Lightweight Distributed Metric Service
Gentile, Ann; Brandt, Jim; Tucker, Tom; Showerman, Mike
2018-06-12
On today's High Performance Computing platforms, the complexity of applications and configurations makes efficient use of resources difficult. The Lightweight Distributed Metric Service (LDMS) is monitoring software developed by Sandia National Laboratories to provide detailed metrics of system performance. LDMS provides collection, transport, and storage of data from extreme-scale systems at fidelities and timescales to provide understanding of application and system performance with no statistically significant impact on application performance.
R&D100: Lightweight Distributed Metric Service
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gentile, Ann; Brandt, Jim; Tucker, Tom
2015-11-19
On today's High Performance Computing platforms, the complexity of applications and configurations makes efficient use of resources difficult. The Lightweight Distributed Metric Service (LDMS) is monitoring software developed by Sandia National Laboratories to provide detailed metrics of system performance. LDMS provides collection, transport, and storage of data from extreme-scale systems at fidelities and timescales to provide understanding of application and system performance with no statistically significant impact on application performance.
NASA Astrophysics Data System (ADS)
Georgiadis, A.; Berg, S.; Makurat, A.; Maitland, G.; Ott, H.
2013-09-01
We investigated the cluster-size distribution of the residual nonwetting phase in a sintered glass-bead porous medium at two-phase flow conditions, by means of micro-computed-tomography (μCT) imaging with pore-scale resolution. Cluster-size distribution functions and cluster volumes were obtained by image analysis for a range of injected pore volumes under both imbibition and drainage conditions; the field of view was larger than the porosity-based representative elementary volume (REV). We did not attempt to make a definition for a two-phase REV but used the nonwetting-phase cluster-size distribution as an indicator. Most of the nonwetting-phase total volume was found to be contained in clusters that were one to two orders of magnitude larger than the porosity-based REV. The largest observed clusters in fact ranged in volume from 65% to 99% of the entire nonwetting phase in the field of view. As a consequence, the largest clusters observed were statistically not represented and were found to be smaller than the estimated maximum cluster length. The results indicate that the two-phase REV is larger than the field of view attainable by μCT scanning, at a resolution which allows for the accurate determination of cluster connectivity.
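Extracting a cluster-size distribution from a segmented image amounts to connected-component labeling; a 2D toy example with SciPy (the study works with 3D μCT volumes, but the operation is identical):

```python
import numpy as np
from scipy import ndimage

# Toy 2D binary image of the nonwetting phase (1 = nonwetting fluid).
phase = np.array([[1, 1, 0, 0],
                  [1, 0, 0, 1],
                  [0, 0, 1, 1],
                  [0, 0, 0, 1]])

# Label face-connected clusters and measure their sizes in pixels.
labels, n = ndimage.label(phase)
sizes = np.sort(ndimage.sum_labels(phase, labels, range(1, n + 1)))[::-1]
largest_fraction = sizes[0] / phase.sum()
```

The fraction of the phase held by the largest cluster is the quantity the abstract reports (65% to 99%), and the sorted `sizes` array gives the cluster-size distribution function after binning.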
AUTOMATED GEOSPATIAL WATERSHED ASSESSMENT: A GIS-BASED HYDROLOGIC MODELING TOOL
Planning and assessment in land and water resource management are evolving toward complex, spatially explicit regional assessments. These problems have to be addressed with distributed models that can compute runoff and erosion at different spatial and temporal scales. The extens...
Gibbs sampling on large lattice with GMRF
NASA Astrophysics Data System (ADS)
Marcotte, Denis; Allard, Denis
2018-02-01
Gibbs sampling is routinely used to sample truncated Gaussian distributions. These distributions naturally occur when associating latent Gaussian fields to category fields obtained by discrete simulation methods such as multipoint, sequential indicator simulation, and object-based simulation. The latent Gaussians are often used in data assimilation and history-matching algorithms. When Gibbs sampling is applied on a large lattice, the computing cost can become prohibitive. The usual practice of using local neighborhoods is unsatisfying, as it can diverge and does not exactly reproduce the desired covariance. A better approach is to use Gaussian Markov Random Fields (GMRF), which enable the conditional distributions at any point to be computed without computing and inverting the full covariance matrix. As the GMRF is locally defined, it allows simultaneous updating of all points that do not share neighbors (coding sets). We propose a new simultaneous Gibbs updating strategy on coding sets that can be efficiently computed by convolution and applied with an acceptance/rejection method in the truncated case. We study empirically the speed of convergence and the effects of the choice of boundary conditions, the correlation range, and GMRF smoothness. We show that convergence is slower in the Gaussian case on the torus than for the finite case studied in the literature. However, in the truncated Gaussian case, we show that short-scale correlation is quickly restored and the conditioning categories at each lattice point imprint the long-scale correlation. Hence our approach makes it practical to apply Gibbs sampling on large 2D or 3D lattices with the desired GMRF covariance.
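The coding-set idea can be sketched for a simple first-order GMRF on a torus, where the black and white cells of a checkerboard form two coding sets: no two cells of the same color are neighbors, so each color can be updated simultaneously. This is a minimal sketch assuming full conditionals of the form N(β · neighbor sum, σ²); the paper's actual model, truncation, and boundary handling may differ:

```python
import numpy as np

rng = np.random.default_rng(0)

def gibbs_sweep(x, beta=0.24, sigma=1.0):
    """One Gibbs sweep over a torus lattice for a first-order GMRF whose
    full conditionals are x_ij | neighbors ~ N(beta * sum of 4 neighbors,
    sigma^2). Requires 4*|beta| < 1 for a proper stationary field."""
    ii, jj = np.indices(x.shape)
    for color in (0, 1):
        mask = (ii + jj) % 2 == color
        # Neighbor sum via periodic shifts: a convolution with the "+" stencil.
        nb = (np.roll(x, 1, 0) + np.roll(x, -1, 0) +
              np.roll(x, 1, 1) + np.roll(x, -1, 1))
        # All cells of one color have neighbors only of the other color,
        # so they can be drawn simultaneously.
        x[mask] = beta * nb[mask] + sigma * rng.standard_normal(mask.sum())
    return x

x = rng.standard_normal((32, 32))
for _ in range(200):
    x = gibbs_sweep(x)
```

In the truncated case described in the abstract, each simultaneous draw would additionally pass through an acceptance/rejection step against the category constraints.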
AGIS: The ATLAS Grid Information System
NASA Astrophysics Data System (ADS)
Anisenkov, Alexey; Belov, Sergey; Di Girolamo, Alessandro; Gayazov, Stavro; Klimentov, Alexei; Oleynik, Danila; Senchenko, Alexander
2012-12-01
ATLAS is a particle physics experiment at the Large Hadron Collider at CERN. The experiment produces petabytes of data annually through simulation production and tens of petabytes of data per year from the detector itself. The ATLAS computing model embraces the Grid paradigm and a high degree of decentralization, with computing resources able to meet ATLAS requirements for petabyte-scale data operations. In this paper we present the ATLAS Grid Information System (AGIS), designed to integrate configuration and status information about the resources, services, and topology of the whole ATLAS Grid needed by ATLAS Distributed Computing applications and services.
NASA Technical Reports Server (NTRS)
Mckay, Charles W.; Feagin, Terry; Bishop, Peter C.; Hallum, Cecil R.; Freedman, Glenn B.
1987-01-01
The principal focus of one of the RICIS (Research Institute for Computing and Information Systems) components is computer systems and software engineering in-the-large across the lifecycle of large, complex, distributed systems which: (1) evolve incrementally over a long time; (2) contain non-stop components; and (3) must simultaneously satisfy a prioritized balance of mission- and safety-critical requirements at run time. This focus is extremely important because of the contribution of the scaling-direction problem to the current software crisis. The Computer Systems and Software Engineering (CSSE) component addresses the lifecycle issues of three environments: host, integration, and target.
Giga-voxel computational morphogenesis for structural design
NASA Astrophysics Data System (ADS)
Aage, Niels; Andreassen, Erik; Lazarov, Boyan S.; Sigmund, Ole
2017-10-01
In the design of industrial products ranging from hearing aids to automobiles and aeroplanes, material is distributed so as to maximize the performance and minimize the cost. Historically, human intuition and insight have driven the evolution of mechanical design, recently assisted by computer-aided design approaches. The computer-aided approach known as topology optimization enables unrestricted design freedom and shows great promise with regard to weight savings, but its applicability has so far been limited to the design of single components or simple structures, owing to the resolution limits of current optimization methods. Here we report a computational morphogenesis tool, implemented on a supercomputer, that produces designs with giga-voxel resolution—more than two orders of magnitude higher than previously reported. Such resolution provides insights into the optimal distribution of material within a structure that were hitherto unachievable owing to the challenges of scaling up existing modelling and optimization frameworks. As an example, we apply the tool to the design of the internal structure of a full-scale aeroplane wing. The optimized full-wing design has unprecedented structural detail at length scales ranging from tens of metres to millimetres and, intriguingly, shows remarkable similarity to naturally occurring bone structures in, for example, bird beaks. We estimate that our optimized design corresponds to a reduction in mass of 2-5 per cent compared to currently used aeroplane wing designs, which translates into a reduction in fuel consumption of about 40-200 tonnes per year per aeroplane. Our morphogenesis process is generally applicable, not only to mechanical design, but also to flow systems, antennas, nano-optics and micro-systems.
NASA Astrophysics Data System (ADS)
Marsh, C.; Pomeroy, J. W.; Wheater, H. S.
2016-12-01
There is a need for hydrological land surface schemes that can link to atmospheric models, provide hydrological prediction at multiple scales, and guide the development of multi-objective water prediction systems. Distributed raster-based models suffer from an overrepresentation of topography, leading to wasted computational effort and increased uncertainty due to greater numbers of parameters and initial conditions. The Canadian Hydrological Model (CHM) is a modular, multiphysics, spatially distributed modelling framework designed for representing hydrological processes, including those that operate in cold regions. Unstructured meshes permit variable spatial resolution, allowing coarse resolution where spatial variability is low and fine resolution where required. Model uncertainty is reduced by decreasing the number of computational elements relative to high-resolution rasters. CHM uses a novel multi-objective approach to unstructured triangular mesh generation that fulfills hydrologically important constraints (e.g., basin boundaries, water bodies, soil classification, land cover, elevation, and slope/aspect). This provides an efficient spatial representation of parameters and initial conditions, as well as well-formed and well-graded triangles that are suitable for numerical discretization. CHM uses high-quality open-source libraries and high-performance computing paradigms to provide a framework for integrating current state-of-the-art process algorithms. The impact of changes to model structure, including individual algorithms, parameters, initial conditions, driving meteorology, and spatial/temporal discretization, can be easily tested. Initial testing of CHM compared spatial scales and model complexity for a spring melt period in a sub-arctic mountain basin. The meshing algorithm reduced the total number of computational elements while preserving the spatial heterogeneity of predictions.
PanDA for ATLAS distributed computing in the next decade
NASA Astrophysics Data System (ADS)
Barreiro Megino, F. H.; De, K.; Klimentov, A.; Maeno, T.; Nilsson, P.; Oleynik, D.; Padolski, S.; Panitkin, S.; Wenaus, T.; ATLAS Collaboration
2017-10-01
The Production and Distributed Analysis (PanDA) system has been developed to meet ATLAS production and analysis requirements for a data-driven workload management system capable of operating at the Large Hadron Collider (LHC) data processing scale. Heterogeneous resources used by the ATLAS experiment are distributed worldwide at hundreds of sites, thousands of physicists analyse the data remotely, the volume of processed data is beyond the exabyte scale, dozens of scientific applications are supported, while data processing requires more than a few billion hours of computing usage per year. PanDA performed very well over the last decade including the LHC Run 1 data taking period. However, it was decided to upgrade the whole system concurrently with the LHC’s first long shutdown in order to cope with rapidly changing computing infrastructure. After two years of reengineering efforts, PanDA has embedded capabilities for fully dynamic and flexible workload management. The static batch job paradigm was discarded in favor of a more automated and scalable model. Workloads are dynamically tailored for optimal usage of resources, with the brokerage taking network traffic and forecasts into account. Computing resources are partitioned based on dynamic knowledge of their status and characteristics. The pilot has been re-factored around a plugin structure for easier development and deployment. Bookkeeping is handled with both coarse and fine granularities for efficient utilization of pledged or opportunistic resources. An in-house security mechanism authenticates the pilot and data management services in off-grid environments such as volunteer computing and private local clusters. The PanDA monitor has been extensively optimized for performance and extended with analytics to provide aggregated summaries of the system as well as drill-down to operational details. 
Many other improvements are planned or have recently been implemented, along with adoption by non-LHC experiments; for example, bioinformatics groups have successfully run the Paleomix (microbial genome and metagenome) payload on supercomputers. In this paper we focus on the new and planned features that are most important to the next decade of distributed computing workload management.
Belitz, Kenneth; Jurgens, Bryant C.; Landon, Matthew K.; Fram, Miranda S.; Johnson, Tyler D.
2010-01-01
The proportion of an aquifer with constituent concentrations above a specified threshold (high concentrations) is taken as a nondimensional measure of regional scale water quality. If computed on the basis of area, it can be referred to as the aquifer scale proportion. A spatially unbiased estimate of aquifer scale proportion and a confidence interval for that estimate are obtained through the use of equal area grids and the binomial distribution. Traditionally, the confidence interval for a binomial proportion is computed using either the standard interval or the exact interval. Research from the statistics literature has shown that the standard interval should not be used and that the exact interval is overly conservative. On the basis of coverage probability and interval width, the Jeffreys interval is preferred. If more than one sample per cell is available, cell declustering is used to estimate the aquifer scale proportion, and Kish's design effect may be useful for estimating an effective number of samples. The binomial distribution is also used to quantify the adequacy of a grid with a given number of cells for identifying a small target, defined as a constituent that is present at high concentrations in a small proportion of the aquifer. Case studies illustrate a consistency between approaches that use one well per grid cell and many wells per cell. The methods presented in this paper provide a quantitative basis for designing a sampling program and for utilizing existing data.
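The Jeffreys interval preferred in this abstract is the central credible interval of the Beta(k + 1/2, n − k + 1/2) posterior that results from a binomial likelihood under the Jeffreys prior. A minimal sketch (the grid-cell counts in the example are hypothetical):

```python
from scipy.stats import beta

def jeffreys_interval(k, n, conf=0.95):
    """Jeffreys interval for a binomial proportion: central credible
    interval of the Beta(k + 1/2, n - k + 1/2) posterior. The endpoints
    are pinned to 0 and 1 when k = 0 or k = n, as is conventional."""
    alpha = 1.0 - conf
    lo = beta.ppf(alpha / 2, k + 0.5, n - k + 0.5) if k > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, k + 0.5, n - k + 0.5) if k < n else 1.0
    return lo, hi

# Hypothetical example: 12 of 80 equal-area grid cells show a high
# concentration, so the point estimate of aquifer scale proportion is 0.15.
lo, hi = jeffreys_interval(12, 80)
```

The same function applied with k = 0 returns a one-sided interval, which is what quantifies the adequacy of a grid for detecting a small target constituent.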
Comparison of commonly used orthopaedic outcome measures using palm-top computers and paper surveys.
Saleh, Khaled J; Radosevich, David M; Kassim, Rida A; Moussa, Mohamed; Dykes, Darrell; Bottolfson, Helena; Gioe, Terence J; Robinson, Harry
2002-11-01
Measuring patient-perceived outcomes following orthopaedic procedures has become an important component of clinical research and patient care. General and disease-specific outcome measures have been developed and applied in orthopaedics to assess patients' perceived health status. Unfortunately, paper-based, self-administered instruments remain inefficient for collecting data because of (a) missing data, (b) respondent error, and (c) the costs to administer and enter data. The purpose of this study was to assess the comparability of palm-top computer devices and paper-and-pencil self-administered questionnaires for collecting health-related quality of life (HRQL) information from patients. The comparability of administering HRQL questionnaires using a palm-top computer and traditional paper-based forms was tested in a sample of 96 patients with complaints of hip and/or knee pain. Each patient completed mailed versions of the Medical Outcomes Study (MOS) 36-item Health Survey (SF-36) and the Western Ontario and McMaster Universities Arthritis Index (WOMAC) three weeks prior to presenting to clinic. At the clinic they were asked to complete the same outcome measures using the palm-top computer or a paper-and-pencil version. In the analysis, scale distributions, floor and ceiling effects, internal consistency, and retest reliability of the scales were compared across the two data collection methods. Because the baseline characteristics of the groups were not strictly comparable according to age, the data were analyzed for the entire sample and stratified according to age. Few statistically significant differences were found for the means, variances, and intra-class correlation coefficients between the methods of administration. While the scale distributions between the two methods were comparable, the internal consistency of the scales was dissimilar. Administration of HRQL questionnaires using portable palm-top computer devices has the potential advantages of decreased cost and convenience. 
These data lend some support for the comparability of palm-top computers and paper surveys for outcomes measures widely used in the field of orthopaedic surgery. The present study identified the lack of reliability across modes of administration that requires further study in a randomized comparability trial. These mode effects are important for orthopaedic surgeons to appreciate before implementing innovative data-capture technologies in their practices.
NASA Astrophysics Data System (ADS)
Schruff, T.; Liang, R.; Rüde, U.; Schüttrumpf, H.; Frings, R. M.
2018-01-01
The knowledge of structural properties of granular materials such as porosity is highly important in many application-oriented and scientific fields. In this paper we present new results of computer-based packing simulations where we use the non-smooth granular dynamics (NSGD) method to simulate gravitational random dense packing of spherical particles with various particle size distributions and two types of depositional conditions. A bin packing scenario was used to compare simulation results to laboratory porosity measurements and to quantify the sensitivity of the NSGD regarding critical simulation parameters such as time step size. The results of the bin packing simulations agree well with laboratory measurements across all particle size distributions with all absolute errors below 1%. A large-scale packing scenario with periodic side walls was used to simulate the packing of up to 855,600 spherical particles with various particle size distributions (PSD). Simulation outcomes are used to quantify the effect of particle-domain-size ratio on the packing compaction. A simple correction model, based on the coordination number, is employed to compensate for this effect on the porosity and to determine the relationship between PSD and porosity. Promising accuracy and stability results paired with excellent computational performance recommend the application of NSGD for large-scale packing simulations, e.g. to further enhance the generation of representative granular deposits.
Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework
2012-01-01
Background For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. Results We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. Conclusion The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources. PMID:23216909
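The map/reduce decomposition Hydra relies on can be illustrated in pure Python: the map phase scores spectra against database partitions independently (and so parallelizes across a cluster), and the reduce phase keeps the best match per spectrum. The `toy_score` function below is a hypothetical stand-in for the K-score algorithm, which actually compares fragment-ion peaks; this sketch just counts shared masses:

```python
from collections import defaultdict

def toy_score(spectrum, peptide_masses):
    # Hypothetical scorer: number of masses shared between the observed
    # spectrum and the candidate peptide's theoretical masses.
    return len(set(spectrum) & set(peptide_masses))

def map_phase(spectra, database):
    """Map: score every (spectrum, candidate peptide) pair independently,
    emitting (spectrum_id, (score, peptide)) records. Each database
    partition could be processed on a different node."""
    for sid, spectrum in spectra.items():
        for peptide, masses in database.items():
            yield sid, (toy_score(spectrum, masses), peptide)

def reduce_phase(pairs):
    """Reduce: keep the best-scoring peptide for each spectrum id."""
    best = defaultdict(lambda: (-1, None))
    for sid, scored in pairs:
        if scored > best[sid]:
            best[sid] = scored
    return {sid: pep for sid, (score, pep) in best.items()}

# Toy inputs with made-up spectrum ids, peptide names, and masses.
spectra = {"s1": [101, 202, 303], "s2": [404, 505]}
database = {"PEPTIDE_A": [101, 303, 700], "PEPTIDE_B": [404, 505, 606]}
matches = reduce_phase(map_phase(spectra, database))
```

Because every map record is independent, throughput scales with the number of mappers, which is the property the abstract reports for the Hadoop implementation.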
Hydra: a scalable proteomic search engine which utilizes the Hadoop distributed computing framework.
Lewis, Steven; Csordas, Attila; Killcoyne, Sarah; Hermjakob, Henning; Hoopmann, Michael R; Moritz, Robert L; Deutsch, Eric W; Boyle, John
2012-12-05
For shotgun mass spectrometry based proteomics the most computationally expensive step is in matching the spectra against an increasingly large database of sequences and their post-translational modifications with known masses. Each mass spectrometer can generate data at an astonishingly high rate, and the scope of what is searched for is continually increasing. Therefore solutions for improving our ability to perform these searches are needed. We present a sequence database search engine that is specifically designed to run efficiently on the Hadoop MapReduce distributed computing framework. The search engine implements the K-score algorithm, generating comparable output for the same input files as the original implementation. The scalability of the system is shown, and the architecture required for the development of such distributed processing is discussed. The software is scalable in its ability to handle a large peptide database, numerous modifications and large numbers of spectra. Performance scales with the number of processors in the cluster, allowing throughput to expand with the available resources.
OpenCluster: A Flexible Distributed Computing Framework for Astronomical Data Processing
NASA Astrophysics Data System (ADS)
Wei, Shoulin; Wang, Feng; Deng, Hui; Liu, Cuiyin; Dai, Wei; Liang, Bo; Mei, Ying; Shi, Congming; Liu, Yingbo; Wu, Jingping
2017-02-01
The volume of data generated by modern astronomical telescopes is extremely large and rapidly growing. However, current high-performance data processing architectures/frameworks are not well suited for astronomers because of their limitations and programming difficulties. In this paper, we therefore present OpenCluster, an open-source distributed computing framework to support rapidly developing high-performance processing pipelines of astronomical big data. We first detail the OpenCluster design principles and implementations and present the APIs facilitated by the framework. We then demonstrate a case in which OpenCluster is used to resolve complex data processing problems for developing a pipeline for the Mingantu Ultrawide Spectral Radioheliograph. Finally, we present our OpenCluster performance evaluation. Overall, OpenCluster provides not only high fault tolerance and simple programming interfaces, but also a flexible means of scaling up the number of interacting entities. OpenCluster thereby provides an easily integrated distributed computing framework for quickly developing a high-performance data processing system of astronomical telescopes and for significantly reducing software development expenses.
NASA Astrophysics Data System (ADS)
Noh, S.; Tachikawa, Y.; Shiiba, M.; Kim, S.
2011-12-01
Applications of sequential data assimilation methods have been increasing in hydrology to reduce uncertainty in model prediction. In a distributed hydrologic model, there are many types of state variables, and the variables interact with each other on different time scales. However, a framework to deal with the delayed response that originates from the different time scales of hydrologic processes has not been thoroughly addressed in hydrologic data assimilation. In this study, we propose a lagged filtering scheme to account for the lagged response of internal states in a distributed hydrologic model using two filtering schemes: particle filtering (PF) and ensemble Kalman filtering (EnKF). The EnKF is one of the most widely used sub-optimal filters, implementing an efficient computation with a limited number of ensemble members, but it is still based on a Gaussian approximation. PF can be an alternative, in which the propagation of all uncertainties is carried out by a suitable selection of randomly generated particles without any assumptions about the nature of the distributions involved. In the case of PF, an advanced particle regularization scheme is implemented to preserve the diversity of the particle system. In the case of EnKF, the ensemble square root filter (EnSRF) is implemented. Each filtering method is parallelized and implemented on a high-performance computing system. A distributed hydrologic model, the water and energy transfer processes (WEP) model, is applied to the Katsura River catchment, Japan, to demonstrate the applicability of the proposed approaches. Forecast results from PF and EnKF are compared and analyzed in terms of prediction accuracy and probabilistic adequacy. Discussion focuses on the prospects and limitations of each data assimilation method.
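The EnKF analysis step at the core of such schemes can be sketched in a few lines. Note the paper uses the deterministic EnSRF variant; the sketch below is the simpler stochastic (perturbed-observation) EnKF for a single scalar observation, with a made-up two-variable state, purely to illustrate the ensemble update:

```python
import numpy as np

rng = np.random.default_rng(1)

def enkf_update(ensemble, obs, H, obs_var):
    """Stochastic EnKF analysis step for one scalar observation: each
    ensemble member (one column) is nudged toward a perturbed copy of
    the observation using the ensemble sample covariance."""
    n_ens = ensemble.shape[1]
    X = ensemble - ensemble.mean(axis=1, keepdims=True)
    P = X @ X.T / (n_ens - 1)          # sample state covariance
    S = H @ P @ H.T + obs_var          # innovation covariance (1x1 here)
    K = P @ H.T / S                    # Kalman gain, shape (n_state, 1)
    perturbed = obs + np.sqrt(obs_var) * rng.standard_normal(n_ens)
    return ensemble + K * (perturbed - H @ ensemble)

# Toy setup: a 2-variable state where only the first variable is observed.
H = np.array([[1.0, 0.0]])
ens = rng.standard_normal((2, 100)) + np.array([[5.0], [2.0]])
updated = enkf_update(ens, obs=4.0, H=H, obs_var=0.1)
```

The unobserved second state variable is updated only through its sample cross-covariance with the observed one, which is how the filter propagates information into internal model states.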
Compressed digital holography: from micro towards macro
NASA Astrophysics Data System (ADS)
Schretter, Colas; Bettens, Stijn; Blinder, David; Pesquet-Popescu, Béatrice; Cagnazzo, Marco; Dufaux, Frédéric; Schelkens, Peter
2016-09-01
signal processing methods from software-driven computer engineering and applied mathematics. The compressed sensing theory in particular established a practical framework for reconstructing the scene content using few linear combinations of complex measurements and a sparse prior for regularizing the solution. Compressed sensing found direct applications in digital holography for microscopy. Indeed, the wave propagation phenomenon in free space mixes in a natural way the spatial distribution of point sources from the 3-dimensional scene. As the 3-dimensional scene is mapped to a 2-dimensional hologram, the hologram samples form a compressed representation of the scene as well. This overview paper discusses contributions in the field of compressed digital holography at the micro scale. An outlook on future extensions towards the real-size macro scale is then given. Thanks to advances in sensor technologies, increasing computing power, and recent improvements in sparse digital signal processing, holographic modalities are on the verge of practical high-quality visualization at a macroscopic scale, where much higher-resolution holograms must be acquired and processed on the computer.
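The sparse-prior reconstruction described here can be illustrated with ISTA (iterative shrinkage-thresholding), one standard solver for the compressed sensing problem min ½‖Ax − y‖² + λ‖x‖₁. The abstract does not specify an algorithm, so this real-valued toy (the measurement matrix `A` and scene `x_true` are made up) is only a sketch of the recovery-from-few-measurements idea:

```python
import numpy as np

rng = np.random.default_rng(2)

def ista(A, y, lam=0.01, iters=2000):
    """ISTA: alternate a gradient step on the data-fit term with
    soft-thresholding, which enforces the sparse prior."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x - step * A.T @ (A @ x - y)                      # gradient step
        x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)  # shrinkage
    return x

# Toy scene: 100 unknowns, 5 nonzeros, observed via 40 random projections.
n, m, k = 100, 40, 5
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k) + 2.0
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true
x_hat = ista(A, y)
```

With far fewer measurements than unknowns (40 versus 100), the sparse prior is what makes the reconstruction well-posed, mirroring how hologram samples compressively encode the 3-dimensional scene.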
NASA Astrophysics Data System (ADS)
Jougnot, D.; Jimenez-Martinez, J.; Legendre, R.; Le Borgne, T.; Meheust, Y.; Linde, N.
2017-12-01
Time-lapse electrical resistivity tomography (ERT) is widely used in environmental studies to remotely monitor water saturation and the migration of contaminant plumes. However, subsurface heterogeneities, and the corresponding preferential transport paths, yield a potentially large anisotropy in the electrical properties of the subsurface. To study this effect, we used a newly developed geoelectrical milli-fluidic experimental set-up with a flow cell that contains a 2D porous medium consisting of a single layer of cylindrical solid grains. We performed saline tracer tests under full and partial water saturation in that cell by jointly injecting air and aqueous solutions with different salinities. The flow cell is equipped with four electrodes to measure the bulk electrical resistivity at the cell scale. The spatial distribution of the water/air phases and the saline solute concentration field in the water phase are captured simultaneously with a high-resolution camera by combining a fluorescent tracer with the saline solute. These data are used to compute the longitudinal and transverse effective electrical resistivity numerically from the measured spatial distributions of the fluid phases and the salinity field. This approach is validated by the good agreement between the computed longitudinal effective resistivities and the laboratory measurements. The anisotropy in electrical resistivity is then inferred from the computed longitudinal and transverse effective resistivities. We find that the spatial distributions of the saline tracer, and potentially of the air phase, drive temporal changes in the effective resistivity through preferential paths or barriers for electrical current at the pore scale. The resulting heterogeneities in solute concentration lead to strong anisotropy of the effective bulk electrical resistivity, especially under partially saturated conditions. 
Therefore, considering the electrical resistivity as a tensor could improve our understanding of transport properties from field-scale time-lapse ERT.
Geospatial Data as a Service: Towards planetary scale real-time analytics
NASA Astrophysics Data System (ADS)
Evans, B. J. K.; Larraondo, P. R.; Antony, J.; Richards, C. J.
2017-12-01
The rapid growth of earth systems, environmental, and geophysical datasets poses a challenge to both end-users and infrastructure providers. For infrastructure and data providers, tasks like managing, indexing, and storing large collections of geospatial data need to take into consideration the various use cases by which consumers will want to access and use the data. Considerable investment has been made by the Earth Science community to produce suitable real-time analytics platforms for geospatial data, and different interfaces have been defined to provide data services. Unfortunately, there are considerable differences among the standards, protocols, and data models, which have been designed to target specific communities or working groups. The Australian National University's National Computational Infrastructure (NCI) is used for a wide range of activities in the geospatial community. Earth observation, climate, and weather forecasting are examples of communities that generate large amounts of geospatial data. The NCI has invested significant effort in developing a data and services model that enables the cross-disciplinary use of data. Recent developments in cloud and distributed computing provide a publicly accessible platform on which new infrastructures can be built. One of the key capabilities these technologies offer is the possibility of having "limitless" compute power next to where the data is stored. This model is rapidly transforming data delivery from centralised monolithic services towards ubiquitous distributed services that scale up and down, adapting to fluctuations in demand. NCI has developed GSKY, a scalable, distributed server which presents a new approach for geospatial data discovery and delivery based on OGC standards. We present the architecture and motivating use-cases that drove GSKY's collaborative design, development, and production deployment. 
We show our approach offers the community valuable exploratory analysis capabilities, for dealing with petabyte-scale geospatial data collections.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sewell, Christopher Meyer
This is a set of slides from a guest lecture for a class at the University of Texas, El Paso on visualization and data analysis for high-performance computing. The topics covered are the following: trends in high-performance computing; scientific visualization, such as OpenGL, ray tracing and volume rendering, VTK, and ParaView; data science at scale, such as in-situ visualization, image databases, distributed memory parallelism, shared memory parallelism, VTK-m, "big data", and then an analysis example.
Open source tools for large-scale neuroscience.
Freeman, Jeremy
2015-06-01
New technologies for monitoring and manipulating the nervous system promise exciting biology but pose challenges for analysis and computation. Solutions can be found in the form of modern approaches to distributed computing, machine learning, and interactive visualization. But embracing these new technologies will require a cultural shift: away from independent efforts and proprietary methods and toward an open source and collaborative neuroscience.
Calculation of absolute protein-ligand binding free energy using distributed replica sampling.
Rodinger, Tomas; Howell, P Lynne; Pomès, Régis
2008-10-21
Distributed replica sampling [T. Rodinger et al., J. Chem. Theory Comput. 2, 725 (2006)] is a simple and general scheme for Boltzmann sampling of conformational space by computer simulation in which multiple replicas of the system undergo a random walk in reaction coordinate or temperature space. Individual replicas are linked through a generalized Hamiltonian containing an extra potential energy term or bias which depends on the distribution of all replicas, thus enforcing the desired sampling distribution along the coordinate or parameter of interest regardless of free energy barriers. In contrast to replica exchange methods, efficient implementation of the algorithm does not require synchronicity of the individual simulations. The algorithm is inherently suited for large-scale simulations using shared or heterogeneous computing platforms such as a distributed network. In this work, we build on our original algorithm by introducing Boltzmann-weighted jumping, which allows moves of a larger magnitude and thus enhances sampling efficiency along the reaction coordinate. The approach is demonstrated using a realistic and biologically relevant application; we calculate the standard binding free energy of benzene to the L99A mutant of T4 lysozyme. Distributed replica sampling is used in conjunction with thermodynamic integration to compute the potential of mean force for extracting the ligand from protein and solvent along a nonphysical spatial coordinate. Dynamic treatment of the reaction coordinate leads to faster statistical convergence of the potential of mean force than a conventional static coordinate, which suffers from slow transitions on a rugged potential energy surface.
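The core idea of distributed replica sampling, a bias that depends on the distribution of all replicas and enforces a desired sampling distribution along the coordinate of interest, can be caricatured with a toy occupancy-biased walk. This is a hypothetical illustration, not the authors' Hamiltonian: replicas hop among temperature rungs, and a made-up bias proportional to each rung's current occupancy discourages crowding, pushing the replica distribution toward uniform:

```python
import numpy as np

rng = np.random.default_rng(3)

K, R = 8, 16                      # rungs (parameter values) and replicas
rungs = rng.integers(0, K, R)     # current rung of each replica
occupancy_counts = np.zeros(K)
strength = 2.0                    # bias strength (arbitrary units)

for step in range(5000):
    i = rng.integers(R)                       # pick one replica
    new = rungs[i] + rng.choice([-1, 1])      # propose a neighboring rung
    if 0 <= new < K:
        counts = np.bincount(rungs, minlength=K)
        # Bias energy favors moves into less-occupied rungs; the -1 accounts
        # for the moving replica leaving its current rung.
        dE = strength * (counts[new] - (counts[rungs[i]] - 1))
        if rng.random() < np.exp(-dE):
            rungs[i] = new
    occupancy_counts += np.bincount(rungs, minlength=K)

frequencies = occupancy_counts / occupancy_counts.sum()
```

In the real method the moves also carry the physical energy difference and, per this abstract, Boltzmann-weighted jumping allows larger moves; the sketch shows only how a distribution-dependent bias flattens the replica occupancy. Updates here are sequential for simplicity, whereas the published algorithm needs no synchronization between replicas.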
Distributed Optimization of Multi-Agent Systems: Framework, Local Optimizer, and Applications
NASA Astrophysics Data System (ADS)
Zu, Yue
Convex optimization problems can be solved in a centralized or distributed manner. Compared with centralized methods based on a single-agent system, distributed algorithms rely on multi-agent systems in which information is exchanged among connected neighbors, which greatly improves system fault tolerance: a task within a multi-agent system can be completed even in the presence of partial agent failures. By problem decomposition, a large-scale problem can be divided into a set of small-scale sub-problems that can be solved in sequence or in parallel, so the computational complexity is greatly reduced. Moreover, distributed algorithms allow data to be collected and stored in a distributed fashion, which overcomes the bandwidth limitations of multicast. Distributed algorithms have been applied to a variety of real-world problems; our research focuses on framework and local-optimizer design in practical engineering applications. In the first project, we propose a multi-sensor, multi-agent scheme for spatial motion estimation of a rigid body, improving estimation accuracy and convergence speed. In the second, we develop a cyber-physical system and implement distributed computation devices to optimize in-building evacuation paths when a hazard occurs; the proposed Bellman-Ford dual-subgradient path planning method relieves congestion in corridor and exit areas. In the third, highway traffic flow is managed by adjusting speed limits to minimize fuel consumption and travel time; the optimal control strategy is designed through both centralized and distributed algorithms based on a convex problem formulation. Moreover, a hybrid control scheme is presented for highway-network travel time minimization. Compared with the uncontrolled case and a conventional highway traffic control strategy, the proposed hybrid control strategy greatly reduces total travel time on the test highway network.
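The graph-search core of such evacuation-path planning can be sketched with a plain Bellman-Ford relaxation (a generic single-source shortest-path sketch, not the dissertation's dual-subgradient method; the example graph used to exercise it is hypothetical):

```python
def bellman_ford(edges, n_nodes, source):
    """Single-source shortest paths by repeated edge relaxation.
    edges: list of (u, v, weight) for a directed graph on nodes 0..n_nodes-1."""
    inf = float("inf")
    dist = [inf] * n_nodes
    pred = [None] * n_nodes
    dist[source] = 0.0
    for _ in range(n_nodes - 1):       # at most n-1 relaxation rounds suffice
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                pred[v] = u
                changed = True
        if not changed:                # early exit once distances stabilize
            break
    return dist, pred
```

In a distributed setting each node can run its own relaxation using only neighbors' distance estimates, which is what makes this family of methods attractive for multi-agent deployments.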
Modeling and comparative study of fluid velocities in heterogeneous rocks
NASA Astrophysics Data System (ADS)
Hingerl, Ferdinand F.; Romanenko, Konstantin; Pini, Ronny; Balcom, Bruce; Benson, Sally
2013-04-01
Detailed knowledge of the distribution of effective porosity and fluid velocities in heterogeneous rock samples is crucial for understanding and predicting spatially resolved fluid residence times and kinetic reaction rates of fluid-rock interactions. The applicability of conventional MRI techniques to sedimentary rocks is limited by internal magnetic field gradients and short spin relaxation times. The approach developed at the UNB MRI Centre combines the 13-interval Alternating-Pulsed-Gradient Stimulated-Echo (APGSTE) scheme and three-dimensional Single Point Ramped Imaging with T1 Enhancement (SPRITE). These methods were designed to reduce the errors due to effects of background gradients and fast transverse relaxation. SPRITE is largely immune to time-evolution effects resulting from background gradients, paramagnetic impurities and chemical shift. Using these techniques, quantitative 3D porosity maps as well as single-phase fluid velocity fields in sandstone core samples were measured. Using a new Magnetic Resonance Imaging technique developed at the MRI Centre at UNB, we created 3D maps of porosity distributions as well as single-phase fluid velocity distributions of sandstone rock samples. Then, we evaluated the applicability of the Kozeny-Carman relationship for modeling measured fluid velocity distributions in sandstone samples showing meso-scale heterogeneities using two different modeling approaches. The MRI maps were used as reference points for the modeling approaches. For the first modeling approach, we applied the Kozeny-Carman relationship to the porosity distributions and computed respective permeability maps, which in turn provided input for a CFD simulation - using the Stanford CFD code GPRS - to compute averaged velocity maps. The latter were then compared to the measured velocity maps. For the second approach, the measured velocity distributions were used as input for inversely computing permeabilities using the GPRS CFD code. The computed permeabilities were then correlated with the ones based on the porosity maps and the Kozeny-Carman relationship. The findings of the comparative modeling study are discussed, and its potential impact on the modeling of fluid residence times and kinetic reaction rates of fluid-rock interactions in rocks containing meso-scale heterogeneities is reviewed.
NASA Technical Reports Server (NTRS)
Gott, J. Richard, III; Weinberg, David H.; Melott, Adrian L.
1987-01-01
A quantitative measure of the topology of large-scale structure, the genus of density contours in a smoothed density distribution, is described and applied. For random phase (Gaussian) density fields, the mean genus per unit volume exhibits a universal dependence on threshold density, with a normalizing factor that can be calculated from the power spectrum. If large-scale structure formed from the gravitational instability of small-amplitude density fluctuations, the topology observed today on suitable scales should follow the topology in the initial conditions. The technique is illustrated by applying it to simulations of galaxy clustering in a flat universe dominated by cold dark matter. The technique is also applied to a volume-limited sample of the CfA redshift survey and to a model in which galaxies reside on the surfaces of polyhedral 'bubbles'. The topology of the evolved mass distribution and 'biased' galaxy distribution in the cold dark matter models closely matches the topology of the density fluctuations in the initial conditions. The topology of the observational sample is consistent with the random phase, cold dark matter model.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCaskey, Alexander J.
There is a lack of state-of-the-art quantum computing simulation software that scales on heterogeneous systems like Titan. Tensor Network Quantum Virtual Machine (TNQVM) provides a quantum simulator that uses a distributed network of GPUs to simulate quantum circuits, leveraging recent results from tensor network theory.
GIS-BASED HYDROLOGIC MODELING: THE AUTOMATED GEOSPATIAL WATERSHED ASSESSMENT TOOL
Planning and assessment in land and water resource management are evolving from simple, local scale problems toward complex, spatially explicit regional ones. Such problems have to be
addressed with distributed models that can compute runoff and erosion at different spatial a...
Computer-Assisted Analysis of Near-Bottom Photos for Benthic Habitat Studies
2006-09-01
navigated survey platform greatly increases the efficiency of image analysis and provides new insight about the relationships between benthic organisms... increase in the efficiency of image analysis for benthic habitat studies, and provides the opportunity to assess small-scale spatial distribution of
NASA Technical Reports Server (NTRS)
Garcia, Rolando R.; Stordal, Frode; Solomon, Susan; Kiehl, Jeffrey T.
1992-01-01
Attention is given to a new model of the middle atmosphere which includes, in addition to the equations governing the zonal mean state, a potential vorticity equation for a single planetary-scale Rossby wave, and an IR radiative transfer code for the stratosphere and lower mesosphere, which replaces the Newtonian cooling parameterization used previously. It is shown that explicit computation of the planetary-scale wave field yields a more realistic representation of the zonal mean dynamics and the distribution of trace chemical species. Wave breaking produces a well-mixed 'surf zone' equatorward of the polar night vortex and drives a meridional circulation with downwelling on the poleward side of the vortex. This combination of mixing and downwelling produces shallow meridional gradients of trace gases in the subtropics and middle latitudes, and very steep gradients at the edge of the polar vortex. Computed distributions of methane and nitrous oxide are shown to agree well with observations.
NASA Astrophysics Data System (ADS)
Septiani, Eka Lutfi; Widiyastuti, W.; Winardi, Sugeng; Machmudah, Siti; Nurtono, Tantular; Kusdianto
2016-02-01
Flame-assisted spray dryers are widely used for large-scale production of nanoparticles because of their capability. A numerical approach is needed to predict combustion and particle production during scale-up and optimization, since experimental observation is difficult and relatively costly. Computational Fluid Dynamics (CFD) can provide the momentum, energy, and mass transfer, making CFD more efficient than experiment in both time and cost. Here, two turbulence models, k-ɛ and Large Eddy Simulation, were compared and applied to a flame-assisted spray dryer system. The energy source for particle drying was combustion of LPG as fuel with air as oxidizer and carrier gas, modelled as non-premixed combustion in the simulation. Silica particles formed from a silica sol precursor were used for the particle modelling. Across several points of comparison, i.e., flame contour, temperature distribution, and particle size distribution, the Large Eddy Simulation turbulence model provides the closest agreement with the experimental results.
Large Scale Analysis of Geospatial Data with Dask and XArray
NASA Astrophysics Data System (ADS)
Zender, C. S.; Hamman, J.; Abernathey, R.; Evans, K. J.; Rocklin, M.; Zender, C. S.; Rocklin, M.
2017-12-01
The analysis of geospatial data with high-level languages has accelerated innovation and the impact of existing data resources. However, as datasets grow beyond single-machine memory, data structures within these high-level languages can become a bottleneck. New libraries like Dask and XArray resolve some of these scalability issues, providing interactive workflows that are familiar to high-level-language researchers while also scaling out to much larger datasets. This broadens researchers' access to larger datasets on high performance computers and, through interactive development, reduces time-to-insight when compared to traditional parallel programming techniques (MPI). This talk describes Dask, a distributed dynamic task scheduler; Dask.array, a multi-dimensional array that copies the popular NumPy interface; and XArray, a library that wraps NumPy/Dask.array with labeled and indexed axes, implementing the CF conventions. We discuss both the basic design of these libraries and how they change interactive analysis of geospatial data, as well as recent benefits and challenges of distributed computing on clusters of machines.
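The blocked, one-task-per-chunk execution model that Dask.array automates can be shown in miniature with only the standard library (a toy analogue of the idea, not Dask's API):

```python
def chunks(seq, size):
    """Split a sequence into contiguous blocks, as Dask.array chunks an array."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

def blocked_mean(data, chunk=4):
    """One (sum, count) 'task' per chunk, then a combine step; no single
    chunk needs to hold the whole dataset, which is the point of blocking."""
    partials = [(sum(c), len(c)) for c in chunks(data, chunk)]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count
```

Dask generalizes this pattern: each chunk computation becomes a node in a dynamic task graph that the distributed scheduler executes across the cores of a cluster, while XArray layers labeled dimensions on top of the same chunked arrays.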
Contention Modeling for Multithreaded Distributed Shared Memory Machines: The Cray XMT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Secchi, Simone; Tumeo, Antonino; Villa, Oreste
Distributed Shared Memory (DSM) machines are a wide class of multi-processor computing systems where a large virtually-shared address space is mapped on a network of physically distributed memories. High memory latency and network contention are two of the main factors that limit performance scaling of such architectures. Modern high-performance computing DSM systems have evolved toward exploitation of massive hardware multi-threading and fine-grained memory hashing to tolerate irregular latencies, avoid network hot-spots and enable high scaling. In order to model the performance of such large-scale machines, parallel simulation has been proved to be a promising approach to achieve good accuracy in reasonable times. One of the most critical factors in solving the simulation speed-accuracy trade-off is network modeling. The Cray XMT is a massively multi-threaded supercomputing architecture that belongs to the DSM class, since it implements a globally-shared address space abstraction on top of a physically distributed memory substrate. In this paper, we discuss the development of a contention-aware network model intended to be integrated in a full-system XMT simulator. We start by measuring the effects of network contention in a 128-processor XMT machine and then investigate the trade-off that exists between simulation accuracy and speed, by comparing three network models which operate at different levels of accuracy. The comparison and model validation is performed by executing a string-matching algorithm on the full-system simulator and on the XMT, using three datasets that generate noticeably different contention patterns.
NASA Astrophysics Data System (ADS)
Colaïtis, A.; Chapman, T.; Strozzi, D.; Divol, L.; Michel, P.
2018-03-01
A three-dimensional laser propagation model for computation of laser-plasma interactions is presented. It is focused on indirect drive geometries in inertial confinement fusion and formulated for use at large temporal and spatial scales. A modified tessellation-based estimator and a relaxation scheme are used to estimate the intensity distribution in plasma from geometrical optics rays. Comparisons with reference solutions show that this approach is well-suited to reproduce realistic 3D intensity field distributions of beams smoothed by phase plates. It is shown that the method requires a reduced number of rays compared to traditional rigid-scale intensity estimation. Using this field estimator, we have implemented laser refraction, inverse-bremsstrahlung absorption, and steady-state crossed-beam energy transfer with a linear kinetic model in the numerical code Vampire. Probe beam amplification and laser spot shapes are compared with experimental results and pf3d paraxial simulations. These results are promising for the efficient and accurate computation of laser intensity distributions in hohlraums, which is of importance for determining the capsule implosion shape and risks of laser-plasma instabilities such as hot electron generation and backscatter in multi-beam configurations.
Remote maintenance monitoring system
NASA Technical Reports Server (NTRS)
Simpkins, Lorenz G. (Inventor); Owens, Richard C. (Inventor); Rochette, Donn A. (Inventor)
1992-01-01
A remote maintenance monitoring system retrofits to a given hardware device with a sensor implant which gathers and captures failure data from the hardware device, without interfering with its operation. Failure data is continuously obtained from predetermined critical points within the hardware device, and is analyzed with a diagnostic expert system, which isolates failure origin to a particular component within the hardware device. For example, monitoring of a computer-based device may include monitoring of parity error data therefrom, as well as monitoring power supply fluctuations therein, so that parity error and power supply anomaly data may be used to trace the failure origin to a particular plane or power supply within the computer-based device. A plurality of sensor implants may be retrofit to corresponding plural devices comprising a distributed large-scale system. Transparent interface of the sensors to the devices precludes operative interference with the distributed network. Retrofit capability of the sensors permits monitoring of even older devices having no built-in testing technology. Continuous real time monitoring of a distributed network of such devices, coupled with diagnostic expert system analysis thereof, permits capture and analysis of even intermittent failures, thereby facilitating maintenance of the monitored large-scale system.
Distributed multiple path routing in complex networks
NASA Astrophysics Data System (ADS)
Chen, Guang; Wang, San-Xiu; Wu, Ling-Wei; Mei, Pan; Yang, Xu-Hua; Wen, Guang-Hui
2016-12-01
Routing in complex transmission networks is an important problem that has garnered extensive research interest in recent years. In this paper, we propose a novel routing strategy called the distributed multiple path (DMP) routing strategy. For each of the origin-destination (O-D) node pairs in a given network, the DMP routing strategy computes and stores, in advance, multiple short-length paths that overlap little with each other. During the transmission stage, it rapidly selects from the pre-computed paths an actual routing path with low transmission cost for each transmission task, according to the real-time network transmission status information. Computer simulation results obtained for the lattice, ER random, and scale-free networks indicate that the strategy can significantly improve the anti-congestion ability of transmission networks, as well as provide favorable routing robustness against partial network failures.
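The precompute-then-select idea can be sketched by running Dijkstra repeatedly while penalizing edges already used by earlier paths, so later paths overlap less (a simplified sketch under an assumed adjacency-dict representation and penalty factor, not the published DMP algorithm):

```python
import heapq

def dijkstra(adj, src, dst, penalized=frozenset(), factor=5.0):
    """Shortest path on adj = {u: [(v, w), ...]}; edges in `penalized`
    cost `factor` times more, discouraging overlap with earlier paths."""
    dist = {src: 0.0}
    pred = {}
    pq = [(0.0, src)]
    done = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, w in adj.get(u, ()):
            cost = w * factor if (u, v) in penalized else w
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                pred[v] = u
                heapq.heappush(pq, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(pred[path[-1]])
    return path[::-1]

def multiple_paths(adj, src, dst, k=2, factor=5.0):
    """Pre-compute up to k short paths that overlap little with each other."""
    paths, used = [], set()
    for _ in range(k):
        p = dijkstra(adj, src, dst, frozenset(used), factor)
        if p is None or p in paths:
            break
        paths.append(p)
        used.update(zip(p, p[1:]))
    return paths
```

At transmission time a router would then pick, among the stored paths, the one whose links currently report the lightest load.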
Butler, Samuel D; Nauyoks, Stephen E; Marciniak, Michael A
2015-06-01
Of the many classes of bidirectional reflectance distribution function (BRDF) models, two popular classes of models are the microfacet model and the linear systems diffraction model. The microfacet model has the benefit of speed and simplicity, as it uses geometric optics approximations, while linear systems theory uses a diffraction approach to compute the BRDF, at the expense of greater computational complexity. In this Letter, nongrazing BRDF measurements of rough and polished surface-reflecting materials at multiple incident angles are scaled by the microfacet cross section conversion term, but in the linear systems direction cosine space, resulting in great alignment of BRDF data at various incident angles in this space. This results in a predictive BRDF model for surface-reflecting materials at nongrazing angles, while avoiding some of the computational complexities in the linear systems diffraction model.
BlueSNP: R package for highly scalable genome-wide association studies using Hadoop clusters.
Huang, Hailiang; Tata, Sandeep; Prill, Robert J
2013-01-01
Computational workloads for genome-wide association studies (GWAS) are growing in scale and complexity outpacing the capabilities of single-threaded software designed for personal computers. The BlueSNP R package implements GWAS statistical tests in the R programming language and executes the calculations across computer clusters configured with Apache Hadoop, a de facto standard framework for distributed data processing using the MapReduce formalism. BlueSNP makes computationally intensive analyses, such as estimating empirical p-values via data permutation, and searching for expression quantitative trait loci over thousands of genes, feasible for large genotype-phenotype datasets. http://github.com/ibm-bioinformatics/bluesnp
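The permutation-based empirical p-value that BlueSNP distributes across a Hadoop cluster can be sketched, for a single phenotype, in a few lines (a generic difference-of-means permutation test in Python, not BlueSNP's R/MapReduce API):

```python
import random

def empirical_pvalue(case, control, n_perm=2000, seed=7):
    """Empirical p-value for a difference in group means via label permutation.
    The +1 correction keeps the estimate away from an impossible p = 0."""
    rng = random.Random(seed)
    n = len(case)
    observed = abs(sum(case) / n - sum(control) / len(control))
    pooled = list(case) + list(control)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                 # one random relabeling of samples
        d = abs(sum(pooled[:n]) / n - sum(pooled[n:]) / len(control))
        if d >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

BlueSNP's contribution is scale: each batch of permutations is an independent map task, so millions of them can run in parallel across a Hadoop cluster.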
A redshift survey of IRAS galaxies. V - The acceleration on the Local Group
NASA Technical Reports Server (NTRS)
Strauss, Michael A.; Yahil, Amos; Davis, Marc; Huchra, John P.; Fisher, Karl
1992-01-01
The acceleration on the Local Group is calculated based on a full-sky redshift survey of 5288 galaxies detected by IRAS. A formalism is developed to compute the distribution function of the IRAS acceleration for a given power spectrum of initial perturbations. The computed acceleration on the Local Group points 18-28 deg from the direction of the Local Group peculiar velocity vector. The data suggest that the CMB dipole is indeed due to the motion of the Local Group, that this motion is gravitationally induced, and that the distribution of IRAS galaxies on large scales is related to that of dark matter by a simple linear biasing model.
Large-scale expensive black-box function optimization
NASA Astrophysics Data System (ADS)
Rashid, Kashif; Bailey, William; Couët, Benoît
2012-09-01
This paper presents the application of an adaptive radial basis function method to a computationally expensive black-box reservoir simulation model of many variables. An iterative proxy-based scheme is used to tune the control variables, distributed for finer control over a varying number of intervals covering the total simulation period, to maximize asset NPV. The method shows that large-scale simulation-based function optimization of several hundred variables is practical and effective.
NASA Astrophysics Data System (ADS)
Wisniewski, Nicholas Andrew
This dissertation is divided into two parts. First we present an exact solution to a generalization of the Behrens-Fisher problem by embedding the problem in the Riemannian manifold of Normal distributions. From this we construct a geometric hypothesis testing scheme. Secondly we investigate the most commonly used geometric methods employed in tensor field interpolation for DT-MRI analysis and cardiac computer modeling. We computationally investigate a class of physiologically motivated orthogonal tensor invariants, both at the full tensor field scale and at the scale of a single interpolation by doing a decimation/interpolation experiment. We show that Riemannian-based methods give the best results in preserving desirable physiological features.
Shared versus distributed memory multiprocessors
NASA Technical Reports Server (NTRS)
Jordan, Harry F.
1991-01-01
The question of whether multiprocessors should have shared or distributed memory has attracted a great deal of attention. Some researchers argue strongly for building distributed memory machines, while others argue just as strongly for programming shared memory multiprocessors. A great deal of research is underway on both types of parallel systems. Special emphasis is placed on systems with a very large number of processors for computation-intensive tasks, and research and implementation trends are considered. It appears that the two types of systems will likely converge to a common form for large scale multiprocessors.
Multivariate quadrature for representing cloud condensation nuclei activity of aerosol populations
Fierce, Laura; McGraw, Robert L.
2017-07-26
Sparse representations of atmospheric aerosols are needed for efficient regional- and global-scale chemical transport models. Here we introduce a new framework for representing aerosol distributions, based on the quadrature method of moments. Given a set of moment constraints, we show how linear programming, combined with an entropy-inspired cost function, can be used to construct optimized quadrature representations of aerosol distributions. The sparse representations derived from this approach accurately reproduce cloud condensation nuclei (CCN) activity for realistically complex distributions simulated by a particle-resolved model. Additionally, the linear programming techniques described in this study can be used to bound key aerosol properties, such as the number concentration of CCN. Unlike commonly used sparse representations, such as modal and sectional schemes, the maximum-entropy approach described here is not constrained to pre-determined size bins or assumed distribution shapes. This study is a first step toward a particle-based aerosol scheme that will track multivariate aerosol distributions with sufficient computational efficiency for large-scale simulations.
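The flavor of a quadrature-of-moments representation can be shown with the simplest case: a two-node, equal-weight quadrature that exactly reproduces a distribution's first three raw moments (a textbook sketch, not the paper's linear-programming construction):

```python
import math

def two_point_quadrature(m0, m1, m2):
    """Return nodes and weights of a 2-point quadrature matching the raw
    moments m0 (number), m1, and m2 of a univariate distribution."""
    mean = m1 / m0
    var = m2 / m0 - mean * mean          # assumes m2/m0 >= mean**2
    s = math.sqrt(var)
    nodes = [mean - s, mean + s]         # one node each side of the mean
    weights = [m0 / 2.0, m0 / 2.0]
    return nodes, weights
```

A property given by a smooth function f of particle size is then approximated by the weighted sum of f over the nodes, which is how a handful of quadrature points can stand in for a full size distribution.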
Magnetic pattern at supergranulation scale: the void size distribution
NASA Astrophysics Data System (ADS)
Berrilli, F.; Scardigli, S.; Del Moro, D.
2014-08-01
The large-scale magnetic pattern observed in the photosphere of the quiet Sun is dominated by the magnetic network. This network, created by photospheric magnetic fields swept into convective downflows, delineates the boundaries of large-scale cells of overturning plasma and exhibits "voids" in magnetic organization. These voids include internetwork fields, which are mixed-polarity sparse magnetic fields that populate the inner part of network cells. To single out voids and to quantify their intrinsic pattern we applied a fast circle-packing-based algorithm to 511 SOHO/MDI high-resolution magnetograms acquired during the unusually long solar activity minimum between cycles 23 and 24. The computed void distribution function shows a quasi-exponential decay behavior in the range 10-60 Mm. The lack of distinct flow scales in this range corroborates the hypothesis of multi-scale motion flows at the solar surface. In addition to the quasi-exponential decay, we have found that the voids depart from a simple exponential decay at about 35 Mm.
Optimal Information Processing in Biochemical Networks
NASA Astrophysics Data System (ADS)
Wiggins, Chris
2012-02-01
A variety of experimental results over the past decades provide examples of near-optimal information processing in biological networks, including in biochemical and transcriptional regulatory networks. Computing information-theoretic quantities requires first choosing or computing the joint probability distribution describing multiple nodes in such a network --- for example, representing the probability distribution of finding an integer copy number of each of two interacting reactants or gene products while respecting the `intrinsic' small copy number noise constraining information transmission at the scale of the cell. I'll give an overview of some recent analytic and numerical work facilitating calculation of such joint distributions and the associated information, which in turn makes possible numerical optimization of information flow in models of noisy regulatory and biochemical networks. Illustrating cases include quantification of form-function relations, ideal design of regulatory cascades, and response to oscillatory driving.
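Once such a joint distribution is in hand, the mutual information itself is a short computation (a minimal sketch over a discrete joint table; the example channels in the usage are hypothetical):

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits from a table joint[x][y] = P(X=x, Y=y)."""
    px = [sum(row) for row in joint]            # marginal over Y
    py = [sum(col) for col in zip(*joint)]      # marginal over X
    info = 0.0
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            if p > 0.0:
                info += p * math.log2(p / (px[x] * py[y]))
    return info
```

An independent joint gives 0 bits; a perfectly correlated binary joint gives 1 bit, and optimizing information flow amounts to searching over such joint distributions subject to the network's noise constraints.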
A common stochastic accumulator with effector-dependent noise can explain eye-hand coordination
Gopal, Atul; Viswanathan, Pooja
2015-01-01
The computational architecture that enables the flexible coupling between otherwise independent eye and hand effector systems is not understood. By using a drift diffusion framework, in which variability of the reaction time (RT) distribution scales with mean RT, we tested the ability of a common stochastic accumulator to explain eye-hand coordination. Using a combination of behavior, computational modeling and electromyography, we show how a single stochastic accumulator to threshold, followed by noisy effector-dependent delays, explains eye-hand RT distributions and their correlation, while an alternate independent, interactive eye and hand accumulator model does not. Interestingly, the common accumulator model did not explain the RT distributions of the same subjects when they made eye and hand movements in isolation. Taken together, these data suggest that a dedicated circuit underlies coordinated eye-hand planning. PMID:25568161
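The common-accumulator hypothesis can be caricatured in a few lines: one shared noisy drift to threshold per trial, followed by independent, effector-specific delays (all parameter values are illustrative assumptions, not values fitted to the paper's data):

```python
import random

def simulate_rts(n_trials=2000, drift=0.12, threshold=30.0, noise=1.0,
                 eye_delay=(50.0, 5.0), hand_delay=(90.0, 12.0), seed=3):
    """One drift-diffusion race to threshold per trial (shared by both
    effectors), plus independent Gaussian effector delays (mean, sd)."""
    rng = random.Random(seed)
    eye, hand = [], []
    for _ in range(n_trials):
        x, t = 0.0, 0
        while x < threshold:                  # shared stochastic accumulator
            x += drift + rng.gauss(0.0, noise)
            t += 1
        eye.append(t + rng.gauss(*eye_delay))
        hand.append(t + rng.gauss(*hand_delay))
    return eye, hand
```

Because the decision time is shared, eye and hand RTs come out strongly correlated while the independent delays keep the correlation below one, which is the qualitative signature distinguishing this model from independent accumulators.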
Large-Scale Distributed Coalition Formation
2009-09-01
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ibrahim, Khaled Z.; Epifanovsky, Evgeny; Williams, Samuel
Coupled-cluster methods provide highly accurate models of molecular structure through explicit numerical calculation of tensors representing the correlation between electrons. These calculations are dominated by a sequence of tensor contractions, motivating the development of numerical libraries for such operations. While based on matrix–matrix multiplication, these libraries are specialized to exploit symmetries in the molecular structure and in electronic interactions, and thus reduce the size of the tensor representation and the complexity of contractions. The resulting algorithms are irregular and their parallelization has been previously achieved via the use of dynamic scheduling or specialized data decompositions. We introduce our efforts to extend the Libtensor framework to work in the distributed memory environment in a scalable and energy-efficient manner. We achieve up to 240× speedup compared with the optimized shared memory implementation of Libtensor. We attain scalability to hundreds of thousands of compute cores on three distributed-memory architectures (Cray XC30 and XC40, and IBM Blue Gene/Q), and on a heterogeneous GPU-CPU system (Cray XK7). As the bottlenecks shift from being compute-bound DGEMM's to communication-bound collectives as the size of the molecular system scales, we adopt two radically different parallelization approaches for handling load-imbalance, tasking and bulk synchronous models. Nevertheless, we preserve a unified interface to both programming models to maintain the productivity of computational quantum chemists.
Classical boson sampling algorithms with superior performance to near-term experiments
NASA Astrophysics Data System (ADS)
Neville, Alex; Sparrow, Chris; Clifford, Raphaël; Johnston, Eric; Birchall, Patrick M.; Montanaro, Ashley; Laing, Anthony
2017-12-01
It is predicted that quantum computers will dramatically outperform their conventional counterparts. However, large-scale universal quantum computers are yet to be built. Boson sampling is a rudimentary quantum algorithm tailored to the platform of linear optics, which has sparked interest as a rapid way to demonstrate such quantum supremacy. Photon statistics are governed by intractable matrix functions, which suggests that sampling from the distribution obtained by injecting photons into a linear optical network could be solved more quickly by a photonic experiment than by a classical computer. The apparently low resource requirements for large boson sampling experiments have raised expectations of a near-term demonstration of quantum supremacy by boson sampling. Here we present classical boson sampling algorithms and theoretical analyses of prospects for scaling boson sampling experiments, showing that near-term quantum supremacy via boson sampling is unlikely. Our classical algorithm, based on Metropolised independence sampling, allowed the boson sampling problem to be solved for 30 photons with standard computing hardware. Compared to current experiments, a demonstration of quantum supremacy over a successful implementation of these classical methods on a supercomputer would require the number of photons and experimental components to increase by orders of magnitude, while tackling exponentially scaling photon loss.
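The "intractable matrix functions" governing photon statistics are matrix permanents: the probability of each boson sampling outcome is proportional to |Perm(A)|² for a submatrix A of the interferometer unitary. As an illustrative sketch only (not the paper's Metropolised independence sampler), the permanent of a small matrix can be computed exactly with Ryser's formula in O(2^n · n²) time:

```python
from itertools import combinations

def permanent(A):
    """Exact matrix permanent via Ryser's formula, O(2^n * n^2)."""
    n = len(A)
    total = 0.0
    # Sum over all non-empty column subsets S, with sign (-1)^|S|
    for k in range(1, n + 1):
        for S in combinations(range(n), k):
            prod = 1.0
            for row in A:
                prod *= sum(row[j] for j in S)
            total += (-1) ** k * prod
    return (-1) ** n * total

# e.g. perm([[1, 2], [3, 4]]) = 1*4 + 2*3 = 10
```

The exponential cost of exact evaluation is precisely what boson sampling experiments hope to sidestep; the classical algorithms discussed in the abstract avoid enumerating the full output distribution by sampling from it instead.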
NASA Astrophysics Data System (ADS)
Strand, T. E.; Wang, H. F.
2003-12-01
Immiscible displacement protocols have long been used to infer the geometric properties of the void space in granular porous media. The three most commonly used experimental techniques are the measurement of soil-moisture retention curves and relative permeability-capillary pressure-saturation relations, as well as mercury intrusion porosimetry experiments. A coupled theoretical and computational investigation was performed that provides insight into the limitations associated with each technique and quantifies the relationship between experimental observations and the geometric properties of the void space. It is demonstrated that the inference of the pore space geometry from both mercury porosimetry experiments and measurements of capillary pressure curves is influenced by trapping/mobilization phenomena and subject to scaling behavior. In addition, both techniques also assume that the capillary pressure at a location on the meniscus can be approximated by a pressure difference across a region or sample. For example, when performing capillary pressure measurements, the capillary pressure, taken to be the difference between the injected fluid pressure at the inlet and the defending fluid pressure at the outlet, is increased in a series of small steps and the fluid saturation is measured each time the system reaches steady state. Regions of defending fluid that become entrapped by the invading fluid can be subsequently mobilized at higher flow rates (capillary pressures), contributing to a scale-dependence of the capillary pressure-saturation curve that complicates the determination of the properties of the pore space. This scale-dependence is particularly problematic for measurements performed at the core scale. Mercury porosimetry experiments are subject to similar limitations.
Trapped regions of defending fluid are also present during the measurement of soil-moisture retention curves, but the effects of scaling behavior on the evaluation of the pore space properties from the immiscible displacement structure are much simpler to account for due to the control of mobilization phenomena. Some mobilization may occur due to film flow, but this can be limited by keeping time scales relatively small or exploited at longer time scales in order to quantify the rate of film flow. Computer simulations of gradient-stabilized drainage and imbibition to the (respective) equilibrium positions were performed using a pore-scale modified invasion percolation (MIP) model in order to quantify the relationship between the saturation profile and the geometric properties of the void space. These simulations are similar to the experimental measurement of soil-moisture retention curves. Results show that the equilibrium height and the width of the equilibrium fringe depend on two length scale distributions, one controlling the imbibition equilibrium structure and the other controlling the drainage structure. The equilibrium height is related to the mean value of the appropriate distribution as described by Jurin's law, and the width of the equilibrium fringe scales as a function of a combined parameter, the Bond number, Bo, divided by the coefficient of variation (cov). Simulations also demonstrate that the apparent radius distribution obtained from saturation profiles using direct inversion by Jurin's law is a subset of the actual distribution in the porous medium. The relationship between the apparent and actual radius distributions is quantified in terms of the combined parameter, Bo/cov, and the mean coordination number of the porous medium.
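The direct inversion by Jurin's law mentioned above can be sketched as follows; the fluid parameters (water against air at roughly room temperature, zero contact angle) are illustrative assumptions, not values from the study:

```python
import math

WATER_SURFACE_TENSION = 0.0728  # N/m at ~20 C (assumed illustrative value)

def jurin_height(radius, surface_tension=WATER_SURFACE_TENSION,
                 contact_angle=0.0, density=1000.0, g=9.81):
    """Equilibrium capillary rise: h = 2*gamma*cos(theta) / (rho*g*r)."""
    return 2.0 * surface_tension * math.cos(contact_angle) / (density * g * radius)

def apparent_radius(height, **kwargs):
    """Direct inversion: r = 2*gamma*cos(theta) / (rho*g*h).
    The relation is its own inverse in r <-> h, so we reuse jurin_height."""
    return jurin_height(height, **kwargs)
```

A 1 mm capillary gives a rise of roughly 15 mm for water; inverting a measured rise height recovers the apparent pore radius, which, as the abstract notes, is a subset of the actual radius distribution in the medium.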
Scaling theory for information networks.
Moses, Melanie E; Forrest, Stephanie; Davis, Alan L; Lodder, Mike A; Brown, James H
2008-12-06
Networks distribute energy, materials and information to the components of a variety of natural and human-engineered systems, including organisms, brains, the Internet and microprocessors. Distribution networks enable the integrated and coordinated functioning of these systems, and they also constrain their design. The similar hierarchical branching networks observed in organisms and microprocessors are striking, given that the structure of organisms has evolved via natural selection, while microprocessors are designed by engineers. Metabolic scaling theory (MST) shows that the rate at which networks deliver energy to an organism is proportional to its mass raised to the 3/4 power. We show that computational systems are also characterized by nonlinear network scaling and use MST principles to characterize how information networks scale, focusing on how MST predicts properties of clock distribution networks in microprocessors. The MST equations are modified to account for variation in the size and density of transistors and terminal wires in microprocessors. Based on the scaling of the clock distribution network, we predict a set of trade-offs and performance properties that scale with chip size and the number of transistors. However, there are systematic deviations between power requirements on microprocessors and predictions derived directly from MST. These deviations are addressed by augmenting the model to account for decentralized flow in some microprocessor networks (e.g. in logic networks). More generally, we hypothesize a set of constraints between the size, power and performance of networked information systems including transistors on chips, hosts on the Internet and neurons in the brain.
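The 3/4-power (Kleiber) relationship at the core of MST can be illustrated with a minimal sketch; the normalization constant b0 is an arbitrary placeholder, so only ratios of rates are meaningful:

```python
def metabolic_rate(mass, b0=1.0, exponent=0.75):
    """MST prediction B = b0 * M**(3/4); b0 is a placeholder normalization."""
    return b0 * mass ** exponent

# Doubling mass raises the delivery rate by only 2**0.75 ~ 1.68x, so the
# mass-specific rate *falls* as M**(-1/4) -- the nonlinear network scaling
# that the paper carries over to clock distribution networks on chips.
```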
Distributed computing testbed for a remote experimental environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Butner, D.N.; Casper, T.A.; Howard, B.C.
1995-09-18
Collaboration is increasing as physics research becomes concentrated on a few large, expensive facilities, particularly in magnetic fusion energy research, with national and international participation. These facilities are designed for steady state operation and interactive, real-time experimentation. We are developing tools to provide for the establishment of geographically distant centers for interactive operations; such centers would allow scientists to participate in experiments from their home institutions. A testbed is being developed for a Remote Experimental Environment (REE), a "Collaboratory." The testbed will be used to evaluate the ability of a remotely located group of scientists to conduct research on the DIII-D Tokamak at General Atomics. The REE will serve as a testing environment for advanced control and collaboration concepts applicable to future experiments. Process-to-process communications over high speed wide area networks provide real-time synchronization and exchange of data among multiple computer networks, while the ability to conduct research is enhanced by adding audio/video communication capabilities. The Open Software Foundation's Distributed Computing Environment is being used to test concepts in distributed control, security, naming, remote procedure calls and distributed file access using the Distributed File Services. We are exploring the technology and sociology of remotely participating in the operation of a large scale experimental facility.
Sketching the pion's valence-quark generalised parton distribution
Mezrag, C.; Chang, L.; Moutarde, H.; ...
2015-02-01
In order to learn effectively from measurements of generalised parton distributions (GPDs), it is desirable to compute them using a framework that can potentially connect empirical information with basic features of the Standard Model. We sketch an approach to such computations, based upon a rainbow-ladder (RL) truncation of QCD's Dyson–Schwinger equations and exemplified via the pion's valence dressed-quark GPD, H^π_v(x, ξ, t). Our analysis focuses primarily on ξ = 0, although we also capitalise on the symmetry-preserving nature of the RL truncation by connecting H^π_v(x, ξ = ±1, t) with the pion's valence-quark parton distribution amplitude. We explain that the impulse approximation used hitherto to define the pion's valence dressed-quark GPD is generally invalid owing to omission of contributions from the gluons which bind dressed-quarks into the pion. A simple correction enables us to identify a practicable improvement to the approximation for H^π_v(x, 0, t), expressed as the Radon transform of a single amplitude. Therewith we obtain results for H^π_v(x, 0, t) and the associated impact-parameter dependent distribution, q^π_v(x, |b⊥|), which provide a qualitatively sound picture of the pion's dressed-quark structure at a hadronic scale. We evolve the distributions to a scale ζ = 2 GeV, so as to facilitate comparisons in future with results from experiment or other nonperturbative methods.
Distributed Coordinated Control of Large-Scale Nonlinear Networks
Kundu, Soumya; Anghel, Marian
2015-11-08
We provide a distributed coordinated approach to the stability analysis and control design of large-scale nonlinear dynamical systems by using a vector Lyapunov functions approach. In this formulation the large-scale system is decomposed into a network of interacting subsystems and the stability of the system is analyzed through a comparison system. However, finding such a comparison system is not trivial. In this work, we propose a sum-of-squares based completely decentralized approach for computing the comparison systems for networks of nonlinear systems. Moreover, based on the comparison systems, we introduce a distributed optimal control strategy in which the individual subsystems (agents) coordinate with their immediate neighbors to design local control policies that can exponentially stabilize the full system under initial disturbances. We illustrate the control algorithm on a network of interacting Van der Pol systems.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Palmintier, Bryan; Hale, Elaine; Hodge, Bri-Mathias
2016-08-11
This paper discusses the development of, approaches for, experiences with, and some results from a large-scale, high-performance-computer-based (HPC-based) co-simulation of electric power transmission and distribution systems using the Integrated Grid Modeling System (IGMS). IGMS was developed at the National Renewable Energy Laboratory (NREL) as a novel Independent System Operator (ISO)-to-appliance scale electric power system modeling platform that combines off-the-shelf tools to simultaneously model 100s to 1000s of distribution systems in co-simulation with detailed ISO markets, transmission power flows, and AGC-level reserve deployment. Lessons learned from the co-simulation architecture development are shared, along with a case study that explores the reactive power impacts of PV inverter voltage support on the bulk power system.
A Virtual Hosting Environment for Distributed Online Gaming
NASA Astrophysics Data System (ADS)
Brossard, David; Prieto Martinez, Juan Luis
With enterprise boundaries becoming fuzzier, it has become clear that businesses need to share resources, expose services, and interact in many different ways. In order to achieve such distribution in a dynamic, flexible, and secure way, we have designed and implemented a virtual hosting environment (VHE) which aims at integrating business services across enterprise boundaries and virtualising the ICT environment within which these services operate, in order to exploit economies of scale for the businesses as well as achieve shorter concept-to-market time scales. To illustrate the relevance of the VHE, we have applied it to the online gaming world. Online gaming is an early adopter of distributed computing, and more than 30% of game development companies, aware of this shift, are focusing on high-performance platforms for the new online trend.
Parallel Computation of the Regional Ocean Modeling System (ROMS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, P; Song, Y T; Chao, Y
2005-04-05
The Regional Ocean Modeling System (ROMS) is a regional ocean general circulation modeling system solving the free surface, hydrostatic, primitive equations over varying topography. It is free software distributed world-wide for studying both complex coastal ocean problems and the basin-to-global scale ocean circulation. The original ROMS code could only be run on shared-memory systems. With the increasing need to simulate larger model domains with finer resolutions and on a variety of computer platforms, there is a need in the ocean-modeling community to have a ROMS code that can be run on any parallel computer ranging from 10 to hundreds of processors. Recently, we have explored parallelization for ROMS using the MPI programming model. In this paper, an efficient parallelization strategy for such a large-scale scientific software package, based on an existing shared-memory computing model, is presented. In addition, scientific applications and data-performance issues on a couple of SGI systems, including Columbia, the world's third-fastest supercomputer, are discussed.
Extending the length and time scales of Gram-Schmidt Lyapunov vector computations
NASA Astrophysics Data System (ADS)
Costa, Anthony B.; Green, Jason R.
2013-08-01
Lyapunov vectors have found growing interest recently due to their ability to characterize systems out of thermodynamic equilibrium. The computation of orthogonal Gram-Schmidt vectors requires multiplication and QR decomposition of large matrices, which grow as N² (with the particle count N). This expense has limited such calculations to relatively small systems and short time scales. Here, we detail two implementations of an algorithm for computing Gram-Schmidt vectors. The first is a distributed-memory message-passing method using ScaLAPACK. The second uses the newly released MAGMA library for GPUs. We compare the performance of both codes for Lennard-Jones fluids from N = 100 to 1300 between Intel Nehalem/InfiniBand DDR and NVIDIA C2050 architectures. To the best of our knowledge, these are the largest systems for which the Gram-Schmidt Lyapunov vectors have been computed, and the first time their calculation has been GPU-accelerated. We conclude that Lyapunov vector calculations can be significantly extended in length and time by leveraging the power of GPU-accelerated linear algebra.
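The repeated QR re-orthonormalization underlying such Gram-Schmidt Lyapunov vector computations can be sketched with a toy Benettin-style iteration. This dense NumPy version (not the paper's ScaLAPACK or MAGMA implementations) uses a constant tangent map, for which the exponents are known analytically:

```python
import numpy as np

def lyapunov_exponents(jacobian, n_steps=400):
    """Benettin-style iteration: push an orthonormal frame through a constant
    tangent map and re-orthonormalize with QR; the running average of
    log|diag(R)| converges to the Lyapunov spectrum."""
    n = jacobian.shape[0]
    Q = np.eye(n)
    log_sums = np.zeros(n)
    for _ in range(n_steps):
        Q, R = np.linalg.qr(jacobian @ Q)
        log_sums += np.log(np.abs(np.diag(R)))
    return log_sums / n_steps
```

For a diagonal map diag(2, 0.5), the spectrum is exactly (ln 2, -ln 2); in a molecular-dynamics setting the Jacobian changes every step, and the QR of the growing N² tangent matrix becomes the distributed-memory bottleneck the abstract describes.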
PARVMEC: An Efficient, Scalable Implementation of the Variational Moments Equilibrium Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seal, Sudip K; Hirshman, Steven Paul; Wingen, Andreas
The ability to sustain magnetically confined plasma in a state of stable equilibrium is crucial for optimal and cost-effective operations of fusion devices like tokamaks and stellarators. The Variational Moments Equilibrium Code (VMEC) is the de facto serial application used by fusion scientists to compute magnetohydrodynamics (MHD) equilibria and study the physics of three-dimensional plasmas in confined configurations. Modern fusion energy experiments have larger system scales with more interactive experimental workflows, both demanding faster analysis turnaround times on computational workloads that are stressing the capabilities of sequential VMEC. In this paper, we present PARVMEC, an efficient, parallel version of its sequential counterpart, capable of scaling to thousands of processors on distributed memory machines. PARVMEC is a non-linear code, with multiple numerical physics modules, each with its own computational complexity. A detailed speedup analysis supported by scaling results on 1,024 cores of a Cray XC30 supercomputer is presented. Depending on the mode of PARVMEC execution, speedup improvements of one to two orders of magnitude are reported. PARVMEC equips fusion scientists for the first time with a state-of-the-art capability for rapid, high fidelity analyses of magnetically confined plasmas at unprecedented scales.
Transport on percolation clusters with power-law distributed bond strengths.
Alava, Mikko; Moukarzel, Cristian F
2003-05-01
The simplest transport problem, namely finding the maximum flow of current, or maxflow, is investigated on critical percolation clusters in two and three dimensions, using a combination of extremal statistics arguments and exact numerical computations, for power-law distributed bond strengths of the type P(σ) ~ σ^(-α). Assuming that only cutting bonds determine the flow, the maxflow critical exponent v is found to be v(α) = (d-1)ν + 1/(1-α). This prediction is confirmed with excellent accuracy using large-scale numerical simulation in two and three dimensions. However, in the region of anomalous bond capacity distributions (0 ≤ α ≤ 1) we demonstrate that, due to cluster-structure fluctuations, it is not the cutting bonds but the blobs that set the transport properties of the backbone. This "blob dominance" avoids a crossover to a regime where structural details, the distribution of the number of red or cutting bonds, would set the scaling. The restored scaling exponents, however, still follow the simplistic red bond estimate. This is argued to be due to the existence of a hierarchy of so-called minimum cut configurations, for which cutting bonds form the lowest level, and whose transport properties all scale in the same way. We point out the relevance of our findings to other scalar transport problems (i.e., conductivity).
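The red-bond prediction quoted above is simple enough to evaluate directly; with the standard 2D correlation-length exponent ν = 4/3, a sketch:

```python
def maxflow_exponent(d, nu, alpha):
    """Cutting-bond ("red bond") estimate v(alpha) = (d-1)*nu + 1/(1-alpha)
    for bond strengths distributed as P(sigma) ~ sigma**(-alpha), alpha < 1."""
    return (d - 1) * nu + 1.0 / (1.0 - alpha)

# In 2D (nu = 4/3) with a flat strength distribution (alpha = 0):
# v = 4/3 + 1 = 7/3
```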
NASA Astrophysics Data System (ADS)
Liu, Q.; Chiu, L. S.; Hao, X.
2017-10-01
The abundance or lack of rainfall affects people's lives and activities. As a major component of the global hydrological cycle (Chokngamwong & Chiu, 2007), accurate representations of rainfall at various spatial and temporal scales are crucial for many decision-making processes. Climate models show a warmer and wetter climate due to increases of greenhouse gases (GHG). However, the models' resolutions are often too coarse to be directly applicable to local scales that are useful for mitigation purposes. Hence disaggregation (downscaling) procedures are needed to transfer the coarse scale products to higher spatial and temporal resolutions. The aim of this paper is to examine the changes in the statistical parameters of rainfall at various spatial and temporal resolutions. The TRMM Multi-satellite Precipitation Analysis (TMPA) 0.25 degree, 3 hourly gridded rainfall data for one summer are aggregated to 0.5, 1.0, 2.0 and 2.5 degree and to 6, 12, 24 hourly, pentad (five days) and monthly resolutions. The probability density functions (PDF) and cumulative distribution functions (CDF) of rain amount at these resolutions are computed and modeled as a mixed distribution. Parameters of the PDFs are compared using the Kolmogorov-Smirnov (KS) test, both for the mixed and the marginal distribution. These distributions are shown to be distinct. The marginal distributions are fitted with Lognormal and Gamma distributions, and it is found that the Gamma distributions fit much better than the Lognormal.
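A hedged sketch of the fitting procedure described above, using synthetic data rather than TMPA rainfall: zero amounts form a point mass (the dry part of the mixed distribution), and the positive amounts are fitted with Gamma and Lognormal marginals, compared via the KS statistic. All parameter values are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic "rain" sample as a mixed distribution: a point mass at zero
# (dry intervals) plus Gamma-distributed positive amounts (wet intervals)
dry = np.zeros(3000)
wet = rng.gamma(shape=0.8, scale=5.0, size=2000)
rain = np.concatenate([dry, wet])

p_dry = float(np.mean(rain == 0.0))   # mixture weight of the dry point mass

# Fit candidate marginal distributions to the positive amounts only
gamma_params = stats.gamma.fit(wet, floc=0)
lognorm_params = stats.lognorm.fit(wet, floc=0)

# One-sample Kolmogorov-Smirnov statistics: smaller means a better fit
ks_gamma = stats.kstest(wet, "gamma", args=gamma_params).statistic
ks_lognorm = stats.kstest(wet, "lognorm", args=lognorm_params).statistic
```

For data actually drawn from a Gamma, the fitted Gamma marginal yields the smaller KS statistic, mirroring the paper's finding for the TMPA rain amounts.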
Dynamics of motile phytoplankton in turbulence: Laboratory investigation of microscale patchiness
NASA Astrophysics Data System (ADS)
Crimaldi, J. P.; True, A.; Stocker, R.
2016-02-01
Phytoplankton represent the basis of oceanic life and play a critical role in biogeochemical cycles. While phytoplankton are traditionally studied in bulk, their collective impact stems from cell-level processes and interactions at the microscale. A fundamental element that determines these interactions is the small-scale spatial distribution of individual cells: this directly determines the local cell concentration and the probability that two cells contact or interact with each other. The traditional, bulk perspective on phytoplankton distributions is that turbulence acts to smear out patchiness and locally homogenizes the distributions. However, recent numerical simulations suggest that the action of turbulence on motile phytoplankton may be precisely the opposite: by biasing the swimming direction of cells through the action of viscous torques, turbulence is predicted to generate strong patchiness at small scales. Flow-mediated patch formation has been demonstrated experimentally in simple laminar flows, but has never been tested experimentally in turbulence. In this talk we report on preliminary laboratory experiments performed in a purpose-built flow facility that uses a pair of computer-controlled oscillating grids to generate approximately homogeneous, isotropic 3D turbulence. Turbulent flow characteristics and dissipation rates are first quantified using particle image velocimetry (PIV). Then, 2D distributions of the motile dinoflagellate Heterosigma akashiwo are imaged using planar laser-induced fluorescence (PLIF). Analysis of imaged phytoplankton distributions for patchiness is performed using a Voronoi tessellation approach. Results suggest that motile phytoplankton distributions differ from those of passive particles. Furthermore, computed values for the patch enhancement factor are shown to be roughly consistent with those of previous DNS predictions.
USDA-ARS?s Scientific Manuscript database
Transcription initiation, essential to gene expression regulation, involves recruitment of basal transcription factors to the core promoter elements (CPEs). The distribution of currently known CPEs across plant genomes is largely unknown. This is the first large scale genome-wide report on the compu...
2012-03-01
NASA Technical Reports Server (NTRS)
Klumpar, D. M. (Principal Investigator)
1981-01-01
Progress is reported on reading MAGSAT tapes and on a modeling procedure developed to compute the magnetic fields at satellite orbit due to current distributions in the ionosphere. The modeling technique utilizes a linear current element representation of the large-scale space-current system.
A Rich Metadata Filesystem for Scientific Data
ERIC Educational Resources Information Center
Bui, Hoang
2012-01-01
As scientific research becomes more data intensive, there is an increasing need for scalable, reliable, and high performance storage systems. Such data repositories must provide both data archival services and rich metadata, and cleanly integrate with large scale computing resources. ROARS is a hybrid approach to distributed storage that provides…
CASTAG - A Computer Assisted Interactive Naval Wargame.
1980-03-01
Disaggregated Effects of Device on Score Comparability
ERIC Educational Resources Information Center
Davis, Laurie; Morrison, Kristin; Kong, Xiaojing; McBride, Yuanyuan
2017-01-01
The use of tablets for large-scale testing programs has transitioned from concept to reality for many state testing programs. This study extended previous research on score comparability between tablets and computers with high school students to compare score distributions across devices for reading, math, and science and to evaluate device…
Extreme Scale Computing Studies
2010-12-01
Development of a Renormalization Group Approach to Multi-Scale Plasma Physics Computation
2012-03-28
Chen, Wen Hao; Yang, Sam Y. S.; Xiao, Ti Qiao; Mayo, Sherry C.; Wang, Yu Dan; Wang, Hai Peng
2014-01-01
Quantifying three-dimensional spatial distributions of pores and material compositions in samples is a key materials characterization challenge, particularly in samples where compositions are distributed across a range of length scales, and where such compositions have similar X-ray absorption properties, such as in coal. Consequently, obtaining detailed information within sub-regions of a multi-length-scale sample by conventional approaches may not provide the resolution and level of detail one might desire. Herein, an approach for quantitative high-definition determination of material compositions from X-ray local computed tomography combined with a data-constrained modelling method is proposed. The approach is capable of dramatically improving the spatial resolution and enabling finer details within a region of interest of a sample larger than the field of view to be revealed than by using conventional techniques. A coal sample containing distributions of porosity and several mineral compositions is employed to demonstrate the approach. The optimal experimental parameters are pre-analyzed. The quantitative results demonstrated that the approach can reveal significantly finer details of compositional distributions in the sample region of interest. The elevated spatial resolution is crucial for coal-bed methane reservoir evaluation and understanding the transformation of the minerals during coal processing. The method is generic and can be applied for three-dimensional compositional characterization of other materials. PMID:24763649
Modeling chloride transport using travel time distributions at Plynlimon, Wales
NASA Astrophysics Data System (ADS)
Benettin, Paolo; Kirchner, James W.; Rinaldo, Andrea; Botter, Gianluca
2015-05-01
Here we present a theoretical interpretation of high-frequency, high-quality tracer time series from the Hafren catchment at Plynlimon in mid-Wales. We make use of the formulation of transport by travel time distributions to model chloride transport originating from atmospheric deposition and compute catchment-scale travel time distributions. The relevance of the approach lies in the explanatory power of the chosen tools, particularly to highlight hydrologic processes otherwise clouded by the integrated nature of the measured outflux signal. The analysis reveals the key role of residual storages that are poorly visible in the hydrological response, but are shown to strongly affect water quality dynamics. Our calibrated model reproduces the data with significant accuracy. A detailed representation of catchment-scale travel time distributions has been derived, including the time evolution of the overall dispersion processes (which can be expressed in terms of time-varying storage sampling functions). Mean computed travel times span a broad range of values (from 80 to 800 days) depending on the catchment state. Results also suggest that, on average, discharge waters are younger than storage waters. The model proves able to capture high-frequency fluctuations in the measured chloride concentrations, which are broadly explained by the sharp transition between groundwaters and faster flows originating from topsoil layers. This article was corrected on 22 JUN 2015. See the end of the full text for details.
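In its simplest time-invariant form, the travel time formulation reduces to a convolution of the input concentration with a fixed travel time distribution; the sketch below uses an exponential distribution with an 80-day mean (one end of the reported range) purely for illustration, whereas the paper's model uses time-varying storage sampling functions:

```python
import numpy as np

def outflow_concentration(c_in, p_tt, dt):
    """Time-invariant sketch of transport by travel time distributions:
    C_out(t) = integral of C_in(t - tau) * p(tau) d(tau), discretized as a
    convolution on a regular time grid with step dt."""
    return (np.convolve(c_in, p_tt) * dt)[: len(c_in)]

# Exponential travel time distribution with an 80-day mean (illustrative)
dt, mean_tt = 1.0, 80.0
tau = np.arange(0.0, 2000.0, dt)
p_tt = np.exp(-tau / mean_tt) / mean_tt

# A constant unit chloride input should be recovered at the outlet once the
# simulated window spans essentially all travel times
c_in = np.ones_like(tau)
c_out = outflow_concentration(c_in, p_tt, dt)
```

Fluctuating inputs are damped and lagged by the same convolution, which is the sense in which the measured outflux signal "integrates" the deposition history.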
Staghorn: An Automated Large-Scale Distributed System Analysis Platform
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gabert, Kasimir; Burns, Ian; Elliott, Steven
2016-09-01
Conducting experiments on large-scale distributed computing systems is becoming significantly easier with the assistance of emulation. Researchers can now create a model of a distributed computing environment and then generate a virtual, laboratory copy of the entire system composed of potentially thousands of virtual machines, switches, and software. The use of real software, running at clock rate in full virtual machines, allows experiments to produce meaningful results without necessitating a full understanding of all model components. However, the ability to inspect and modify elements within these models is bound by the limitation that such modifications must compete with the model, either running in or alongside it. This inhibits entire classes of analyses from being conducted upon these models. We developed a mechanism to snapshot an entire emulation-based model as it is running. This allows us to "freeze time" and subsequently fork execution, replay execution, modify arbitrary parts of the model, or deeply explore the model. This snapshot includes capturing packets in transit and other input/output state along with the running virtual machines. We were able to build this system in Linux using Open vSwitch and Kernel Virtual Machines on top of Sandia's emulation platform Firewheel. This primitive opens the door to numerous subsequent analyses on models, including state space exploration, debugging distributed systems, performance optimizations, improved training environments, and improved experiment repeatability.
CD-ROM technology at the EROS data center
Madigan, Michael E.; Weinheimer, Mary C.
1993-01-01
The vast amount of digital spatial data often required by a single user has created a demand for media alternatives to 1/2" magnetic tape. One such medium that has been recently adopted at the U.S. Geological Survey's EROS Data Center is the compact disc (CD). CD's are a versatile, dynamic, and low-cost method for providing a variety of data on a single media device and are compatible with various computer platforms. CD drives are available for personal computers, UNIX workstations, and mainframe systems, either directly connected, or through a network. This medium furnishes a quick method of reproducing and distributing large amounts of data on a single CD. Several data sets are already available on CD's, including collections of historical Landsat multispectral scanner data and biweekly composites of Advanced Very High Resolution Radiometer data for the conterminous United States. The EROS Data Center intends to provide even more data sets on CD's. Plans include specific data sets on a customized disc to fulfill individual requests, and mass production of unique data sets for large-scale distribution. Requests for a single compact disc-read only memory (CD-ROM) containing a large volume of data either for archiving or for one-time distribution can be addressed with a CD-write once (CD-WO) unit. Mass production and large-scale distribution will require CD-ROM replication and mastering.
NASA Technical Reports Server (NTRS)
Halpern, D.; Zlotnicki, V.; Newman, J.; Brown, O.; Wentz, F.
1991-01-01
Monthly mean global distributions for 1988 are presented with a common color scale and geographical map. Distributions are included for sea surface height variation estimated from GEOSAT; surface wind speed estimated from the Special Sensor Microwave Imager on the Defense Meteorological Satellite Program spacecraft; sea surface temperature estimated from the Advanced Very High Resolution Radiometer on NOAA spacecraft; and the Cartesian components of the 10-m height wind vector computed by the European Centre for Medium-Range Weather Forecasts. Charts of monthly mean value, sampling distribution, and standard deviation value are displayed. Annual mean distributions are displayed.
NASA Technical Reports Server (NTRS)
Birman, Kenneth; Cooper, Robert; Marzullo, Keith
1990-01-01
The ISIS project has developed a new methodology, virtual synchrony, for writing robust distributed software. High performance multicast, large scale applications, and wide area networks are the focus of interest. Several interesting applications that exploit the strengths of ISIS, including an NFS-compatible replicated file system, are being developed. The META project addresses distributed control in a soft real-time environment incorporating feedback. This domain encompasses examples as diverse as monitoring inventory and consumption on a factory floor, and performing load-balancing on a distributed computing system. One of the first uses of META is for distributed application management: the tasks of configuring a distributed program, dynamically adapting to failures, and monitoring its performance. Recent progress and current plans are reported.
NASA Astrophysics Data System (ADS)
Ahluwalia, Arti
2017-02-01
About two decades ago, West and coworkers established a model which predicts that metabolic rate follows a three-quarter power relationship with the mass of an organism, based on the premise that tissues are supplied nutrients through a fractal distribution network. Quarter-power scaling is widely considered a universal law of biology, and it is generally accepted that were in-vitro cultures to obey allometric metabolic scaling, they would have more predictive potential and could, for instance, provide a viable substitute for animals in research. This paper outlines a theoretical and computational framework for establishing quarter-power scaling in three-dimensional spherical constructs in-vitro, starting where fractal distribution ends. Allometric scaling in non-vascular spherical tissue constructs was assessed using models of Michaelis-Menten oxygen consumption and diffusion. The models demonstrate that physiological scaling is maintained when about 5 to 60% of the construct is exposed to oxygen concentrations less than the Michaelis-Menten constant, with a significant concentration gradient in the sphere. The results have important implications for the design of downscaled in-vitro systems with physiological relevance.
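The reaction-diffusion problem described in this abstract can be sketched numerically. Below is a minimal, illustrative finite-difference solver for steady-state Michaelis-Menten oxygen consumption and diffusion in a sphere, plus a helper reporting the volume fraction below the Michaelis-Menten constant; all parameter values and function names are placeholder assumptions, not taken from the paper.

```python
def oxygen_profile(R=1e-3, C0=0.2, Km=0.01, Vmax=5e-3, D=2e-9, n=50, iters=8000):
    """Steady-state O2 profile C(r) in a spherical construct with
    Michaelis-Menten consumption, via Gauss-Seidel relaxation of
        d2C/dr2 + (2/r) dC/dr = (Vmax/D) * C/(Km + C),
    with C(R) = C0 at the surface and dC/dr = 0 at the center."""
    dr = R / n
    C = [C0] * (n + 1)
    for _ in range(iters):
        for i in range(1, n):
            r = i * dr
            a = 1.0 / dr ** 2 + 1.0 / (r * dr)   # weight of C[i+1]
            b = 1.0 / dr ** 2 - 1.0 / (r * dr)   # weight of C[i-1]
            s = (Vmax / D) / (Km + C[i])         # linearized M-M sink
            C[i] = (a * C[i + 1] + b * C[i - 1]) / (a + b + s)
        C[0] = C[1]   # zero-flux symmetry condition at r = 0
    return C

def hypoxic_fraction(C, Km=0.01):
    """Shell-weighted volume fraction of the sphere where C < Km."""
    n = len(C) - 1
    w = [3 * i * i for i in range(1, n + 1)]     # ~ spherical shell volumes
    below = sum(wi for i, wi in zip(range(1, n + 1), w) if C[i] < Km)
    return below / sum(w)
```

Sweeping the construct radius `R` and checking where `hypoxic_fraction` falls in the 5-60% window mentioned in the abstract would reproduce the kind of scaling analysis the paper performs.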
Harvey, Benjamin Simeon; Ji, Soo-Yeon
2017-01-01
As the microarray data available to scientists continues to grow in size and complexity, it has become critically important to find ways to bring oncological inference to the bioinformatics community through the analysis of large-scale cancer genomic (LSCG) DNA and mRNA microarray data. Although there have been many attempts to support biological interpretation by means of wavelet preprocessing and classification, no prior research effort has focused on a cloud-scale distributed parallel (CSDP) separable 1-D wavelet decomposition technique for denoising through differential expression thresholding and classification of LSCG microarray data. This research presents a novel methodology that utilizes a CSDP separable 1-D method for wavelet-based transformation to initialize a threshold that retains significantly expressed genes through the denoising process, enabling robust classification of cancer patients. The overall study was implemented within a CSDP environment. Cloud computing and wavelet-based thresholding for denoising were used for the classification of samples within the Global Cancer Map, the Cancer Cell Line Encyclopedia, and The Cancer Genome Atlas. The results showed that separable 1-D parallel distributed wavelet denoising in the cloud, combined with differential expression thresholding, increased computational performance and enabled the generation of higher-quality LSCG microarray datasets, which led to more accurate classification results.
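The core of the denoising step described above is a 1-D wavelet decomposition with thresholding of the detail coefficients. As a minimal serial sketch (the paper's method is distributed and uses its own wavelet and threshold choices; the Haar wavelet and soft thresholding here are illustrative assumptions):

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar transform: (approx, detail)."""
    s = math.sqrt(2.0)
    a = [(x[2 * i] + x[2 * i + 1]) / s for i in range(len(x) // 2)]
    d = [(x[2 * i] - x[2 * i + 1]) / s for i in range(len(x) // 2)]
    return a, d

def haar_inverse(a, d):
    """Invert one Haar level."""
    s = math.sqrt(2.0)
    x = []
    for ai, di in zip(a, d):
        x.extend([(ai + di) / s, (ai - di) / s])
    return x

def soft(coeffs, t):
    """Soft thresholding: shrink each coefficient toward zero by t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise(x, threshold, levels=3):
    """Multi-level Haar decomposition, soft-threshold the detail
    coefficients, reconstruct. len(x) must be divisible by 2**levels."""
    details = []
    a = list(x)
    for _ in range(levels):
        a, d = haar_step(a)
        details.append(soft(d, threshold))
    for d in reversed(details):
        a = haar_inverse(a, d)
    return a
```

With `threshold=0` the transform reconstructs the input exactly; raising the threshold suppresses small fluctuations while retaining strongly expressed features, which is the role the differential-expression threshold plays in the study.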
An Adaptive Priority Tuning System for Optimized Local CPU Scheduling using BOINC Clients
NASA Astrophysics Data System (ADS)
Mnaouer, Adel B.; Ragoonath, Colin
2010-11-01
Volunteer Computing (VC) is a Distributed Computing model which utilizes idle CPU cycles from computing resources donated by volunteers who are connected through the Internet to form a very large-scale, loosely coupled High Performance Computing environment. Distributed Volunteer Computing environments such as the BOINC framework are concerned mainly with the efficient scheduling of the available resources to the applications which require them. The BOINC framework thus contains a number of scheduling policies/algorithms, both on the server side and on the client, which work together to maximize the available resources and to provide a degree of QoS in an environment which is highly volatile. This paper focuses on the BOINC client and introduces an adaptive priority tuning client-side middleware application which improves the execution times of Work Units (WUs) while maintaining an acceptable Maximum Response Time (MRT) for the end user. We have conducted extensive experimentation of the proposed system, and the results show clear speedup of BOINC applications using our optimized middleware compared with the original BOINC client.
Tortuosity of lightning return stroke channels
NASA Technical Reports Server (NTRS)
Levine, D. M.; Gilson, B.
1984-01-01
Data obtained from photographs of lightning are presented on the tortuosity of return stroke channels. The data were obtained by making piecewise linear fits to the channels, and recording the cartesian coordinates of the ends of each linear segment. The mean change between ends of the segments was nearly zero in the horizontal direction and was about eight meters in the vertical direction. Histograms of these changes are presented. These data were used to create model lightning channels and to predict the electric fields radiated during return strokes. This was done using a computer-generated random walk in which linear segments were placed end-to-end to form a piecewise linear representation of the channel. The computer selected random numbers for the ends of the segments assuming a normal distribution with the measured statistics. Once the channels were simulated, the electric fields radiated during a return stroke were predicted using a transmission line model on each segment. It was found that realistic channels are obtained with this procedure, but only if the model includes two scales of tortuosity: fine scale irregularities corresponding to the local channel tortuosity which are superimposed on large scale horizontal drifts. The two scales of tortuosity are also necessary to obtain agreement between the electric fields computed mathematically from the simulated channels and the electric fields radiated from real return strokes. Without large scale drifts, the computed electric fields do not have the undulations characteristic of the data.
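The two-scale random walk described above can be sketched in a few lines. The mean vertical step (~8 m) and near-zero mean horizontal step follow the measurements quoted in the abstract; the standard deviations and the drift parameters are illustrative assumptions, not the paper's fitted values.

```python
import random

def simulate_channel(n_segments=400, seed=1):
    """Piecewise-linear return-stroke channel with two scales of tortuosity:
    fine-scale segment jitter superimposed on a slowly varying large-scale
    horizontal drift (drift magnitudes are illustrative assumptions)."""
    rng = random.Random(seed)
    x = z = 0.0
    drift = 0.0
    pts = [(x, z)]
    for _ in range(n_segments):
        drift += rng.gauss(0.0, 0.5)                  # slow large-scale drift
        x += drift * 0.05 + rng.gauss(0.0, 4.0)       # fine-scale tortuosity
        z += rng.gauss(8.0, 4.0)                      # mean vertical step ~8 m
        pts.append((x, z))
    return pts
```

Feeding each segment of such a channel into a transmission-line radiator, as the paper does, would then give the simulated return-stroke fields; omitting the `drift` term reproduces the single-scale channels that failed to match the observed field undulations.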
Evidence of common and separate eye and hand accumulators underlying flexible eye-hand coordination
Jana, Sumitash; Gopal, Atul
2016-01-01
Eye and hand movements are initiated by anatomically separate regions in the brain, and yet these movements can be flexibly coupled and decoupled, depending on the need. The computational architecture that enables this flexible coupling of independent effectors is not understood. Here, we studied the computational architecture that enables flexible eye-hand coordination using a drift diffusion framework, which predicts that the variability of the reaction time (RT) distribution scales with its mean. We show that a common stochastic accumulator to threshold, followed by a noisy effector-dependent delay, explains eye-hand RT distributions and their correlation in a visual search task that required decision-making, while an interactive eye and hand accumulator model did not. In contrast, in an eye-hand dual task, an interactive model better predicted the observed correlations and RT distributions than a common accumulator model. Notably, these two models could only be distinguished on the basis of the variability and not the means of the predicted RT distributions. Additionally, signatures of separate initiation signals were also observed in a small fraction of trials in the visual search task, implying that these distinct computational architectures were not a manifestation of the task design per se. Taken together, our results suggest two unique computational architectures for eye-hand coordination, with task context biasing the brain toward instantiating one of the two architectures. NEW & NOTEWORTHY Previous studies on eye-hand coordination have considered mainly the means of eye and hand reaction time (RT) distributions. Here, we leverage the approximately linear relationship between the mean and standard deviation of RT distributions, as predicted by the drift-diffusion model, to propose the existence of two distinct computational architectures underlying coordinated eye-hand movements. 
These architectures, for the first time, provide a computational basis for the flexible coupling between eye and hand movements. PMID:27784809
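The key diagnostic in this study, that drift-diffusion first-passage times have a standard deviation that scales with their mean, is easy to demonstrate by simulation. A minimal Euler-Maruyama sketch with illustrative parameters (threshold, noise, and step size are assumptions, not the paper's fits):

```python
import random

def first_passage_times(drift, threshold=1.0, dt=0.002, noise=1.0,
                        n_trials=200, seed=7):
    """Reaction times simulated as first-passage times of a stochastic
    accumulator to a fixed threshold (Euler-Maruyama discretization)."""
    rng = random.Random(seed)
    sqdt = dt ** 0.5
    rts = []
    for _ in range(n_trials):
        x = t = 0.0
        while x < threshold:
            x += drift * dt + noise * sqdt * rng.gauss(0.0, 1.0)
            t += dt
        rts.append(t)
    return rts

def mean_sd(v):
    m = sum(v) / len(v)
    return m, (sum((x - m) ** 2 for x in v) / len(v)) ** 0.5
```

Running this for several drift rates shows that conditions with larger mean RT also have larger RT standard deviation, which is the mean-variability relationship the authors leverage to distinguish the common-accumulator and interactive architectures.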
A Decade-Long European-Scale Convection-Resolving Climate Simulation on GPUs
NASA Astrophysics Data System (ADS)
Leutwyler, D.; Fuhrer, O.; Ban, N.; Lapillonne, X.; Lüthi, D.; Schar, C.
2016-12-01
Convection-resolving models have proven to be very useful tools in numerical weather prediction and in climate research. However, due to their extremely demanding computational requirements, they have so far been limited to short simulations and/or small computational domains. Innovations in the supercomputing domain have led to new supercomputer designs that involve conventional multi-core CPUs and accelerators such as graphics processing units (GPUs). One of the first atmospheric models that has been fully ported to GPUs is the Consortium for Small-Scale Modeling weather and climate model COSMO. This new version allows us to expand the size of the simulation domain to areas spanning continents and the time period up to one decade. We present results from a decade-long, convection-resolving climate simulation over Europe using the GPU-enabled COSMO version on a computational domain with 1536x1536x60 gridpoints. The simulation is driven by the ERA-Interim reanalysis. The results illustrate how the approach allows for the representation of interactions between synoptic-scale and meso-scale atmospheric circulations at scales ranging from 1000 to 10 km. We discuss some of the advantages and prospects from using GPUs, and focus on the performance of the convection-resolving modeling approach on the European scale. Specifically, we investigate the organization of convective clouds and validate hourly rainfall distributions against various high-resolution data sets.
A new parallel-vector finite element analysis software on distributed-memory computers
NASA Technical Reports Server (NTRS)
Qin, Jiangning; Nguyen, Duc T.
1993-01-01
A new parallel-vector finite element analysis software package MPFEA (Massively Parallel-vector Finite Element Analysis) is developed for large-scale structural analysis on massively parallel computers with distributed memory. MPFEA is designed for parallel generation and assembly of the global finite element stiffness matrices as well as parallel solution of the simultaneous linear equations, since these are often the major time-consuming parts of a finite element analysis. A block-skyline storage scheme along with vector-unrolling techniques is used to enhance the vector performance. Communications among processors are carried out concurrently with arithmetic operations to reduce the total execution time. Numerical results on the Intel iPSC/860 computers (such as the Intel Gamma with 128 processors and the Intel Touchstone Delta with 512 processors) are presented, including an aircraft structure and some very large truss structures, to demonstrate the efficiency and accuracy of MPFEA.
High-Performance Computation of Distributed-Memory Parallel 3D Voronoi and Delaunay Tessellation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peterka, Tom; Morozov, Dmitriy; Phillips, Carolyn
2014-11-14
Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets: N-body simulations, molecular dynamics codes, and LIDAR point clouds are just a few examples. Such computational geometry methods are common in data analysis and visualization; but as the scale of simulations and observations surpasses billions of particles, the existing serial and shared-memory algorithms no longer suffice. A distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this paper is a new parallel Delaunay and Voronoi tessellation algorithm that automatically determines which neighbor points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include periodic and wall boundary conditions, comparison of our method with two popular serial libraries, and application to numerous science datasets.
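To fix intuition for what the tessellation computes, here is a tiny serial stand-in: a brute-force discrete Voronoi labeling of a grid, processed in column strips to mimic the spatial decomposition. It deliberately omits the paper's actual contribution (the automatic neighbor-point exchange between subdomains) and all names here are illustrative.

```python
def discrete_voronoi(sites, nx=40, ny=40, blocks=4):
    """Label each cell of an nx-by-ny grid with the index of its nearest
    site, sweeping the grid in `blocks` column strips as a serial stand-in
    for a strip-wise spatial decomposition."""
    labels = [[0] * ny for _ in range(nx)]
    cols_per = (nx + blocks - 1) // blocks
    for b in range(blocks):
        for i in range(b * cols_per, min((b + 1) * cols_per, nx)):
            for j in range(ny):
                labels[i][j] = min(
                    range(len(sites)),
                    key=lambda k: (sites[k][0] - i) ** 2
                                  + (sites[k][1] - j) ** 2)
    return labels
```

In a true distributed run, each strip would hold only its local sites, and cells near strip boundaries are exactly the ones whose nearest site may live in a neighboring subdomain, which is why neighbor-point exchange is the crux of the parallel algorithm.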
Dynamic Load Balancing for Adaptive Computations on Distributed-Memory Machines
NASA Technical Reports Server (NTRS)
1999-01-01
Dynamic load balancing is central to adaptive mesh-based computations on large-scale parallel computers. The principal investigator has investigated various issues on the dynamic load balancing problem under NASA JOVE and JAG grants. The major accomplishments of the project are two graph partitioning algorithms and a load balancing framework. The S-HARP dynamic graph partitioner is known to be the fastest among the known dynamic graph partitioners to date. It can partition a graph of over 100,000 vertices in 0.25 seconds on a 64-processor Cray T3E distributed-memory multiprocessor while maintaining the scalability of over 16-fold speedup. Other known and widely used dynamic graph partitioners take over a second or two while giving low scalability of a few fold speedup on 64 processors. These results have been published in journals and peer-reviewed flagship conferences.
Atmospheric solar heating rate in the water vapor bands
NASA Technical Reports Server (NTRS)
Chou, Ming-Dah
1986-01-01
The total absorption of solar radiation by water vapor in clear atmospheres is parameterized as a simple function of the scaled water vapor amount. For applications to cloudy and hazy atmospheres, the flux-weighted k-distribution functions are computed for individual absorption bands and for the total near-infrared region. The parameterization is based upon monochromatic calculations and follows essentially the scaling approximation of Chou and Arking, but the effect of temperature variation with height is taken into account in order to enhance the accuracy. Furthermore, the spectral range is extended to cover the two weak bands centered at 0.72 and 0.82 micron. Comparisons with monochromatic calculations show that the atmospheric heating rate and the surface radiation can be accurately computed from the parameterization. Comparisons are also made with other parameterizations. It is found that the absorption of solar radiation can be computed reasonably well using the Goody band model and the Curtis-Godson approximation.
Deterministically estimated fission source distributions for Monte Carlo k-eigenvalue problems
Biondo, Elliott D.; Davidson, Gregory G.; Pandya, Tara M.; ...
2018-04-30
The standard Monte Carlo (MC) k-eigenvalue algorithm involves iteratively converging the fission source distribution using a series of potentially time-consuming inactive cycles before quantities of interest can be tallied. One strategy for reducing the computational time requirements of these inactive cycles is the Sourcerer method, in which a deterministic eigenvalue calculation is performed to obtain an improved initial guess for the fission source distribution. This method has been implemented in the Exnihilo software suite within SCALE using the SPN or SN solvers in Denovo and the Shift MC code. The efficacy of this method is assessed with different Denovo solution parameters for a series of typical k-eigenvalue problems including small criticality benchmarks, full-core reactors, and a fuel cask. Here it is found that, in most cases, when a large number of histories per cycle are required to obtain a detailed flux distribution, the Sourcerer method can be used to reduce the computational time requirements of the inactive cycles.
NASA Astrophysics Data System (ADS)
Prasad, Guru; Jayaram, Sanjay; Ward, Jami; Gupta, Pankaj
2004-08-01
In this paper, Aximetric proposes a decentralized Command and Control (C2) architecture for a distributed control of a cluster of on-board health monitoring and software enabled control systems called SimBOX that will use some of the real-time infrastructure (RTI) functionality from the current military real-time simulation architecture. The uniqueness of the approach is to provide a "plug and play environment" for various system components that run at various data rates (Hz) and the ability to replicate or transfer C2 operations to various subsystems in a scalable manner. This is possible by providing a communication bus called "Distributed Shared Data Bus" and a distributed computing environment used to scale the control needs by providing a self-contained computing, data logging and control function module that can be rapidly reconfigured to perform different functions. This kind of software-enabled control is very much needed to meet the needs of future aerospace command and control functions.
NASA Astrophysics Data System (ADS)
Chaney, N.; Wood, E. F.
2014-12-01
The increasing accessibility of high-resolution land data (< 100 m) and high performance computing allows improved parameterizations of subgrid hydrologic processes in macroscale land surface models. Continental scale fully distributed modeling at these spatial scales is possible; however, its practicality for operational use is still unknown due to uncertainties in input data, model parameters, and storage requirements. To address these concerns, we propose a modeling framework that provides the spatial detail of a fully distributed model yet maintains the benefits of a semi-distributed model. In this presentation we will introduce DTOPLATS-MP, a coupling between the NOAH-MP land surface model and the Dynamic TOPMODEL hydrologic model. This new model captures a catchment's spatial heterogeneity by clustering high-resolution land datasets (soil, topography, and land cover) into hundreds of hydrologically similar units (HSUs). A prior DEM analysis defines the connections between the HSUs. At each time step, the 1D land surface model updates each HSU; the HSUs then interact laterally via the subsurface and surface. When compared to the fully distributed form of the model, this framework allows a significant decrease in computation and storage while providing most of the same information and enabling parameter transferability. As a proof of concept, we will show how this new modeling framework can be run over CONUS at a 30-meter spatial resolution. For each catchment in the WBD HUC-12 dataset, the model is run between 2002 and 2012 using available high-resolution continental scale land and meteorological datasets over CONUS (dSSURGO, NLCD, NED, and NCEP Stage IV). For each catchment, the model is run with 1000 model parameter sets obtained from a Latin hypercube sample. This exercise will illustrate the feasibility of running the model operationally at continental scales while accounting for model parameter uncertainty.
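The HSU construction step, clustering high-resolution land attributes into a few representative units, can be sketched with plain k-means. This is a generic stand-in: the actual clustering method, attributes, and number of clusters used for DTOPLATS-MP are not specified here, so everything below is an illustrative assumption.

```python
import random

def kmeans(points, k=3, iters=30, seed=0):
    """Plain k-means over attribute tuples (e.g. slope, soil index,
    land-cover index), a minimal stand-in for clustering grid cells
    into hydrologically similar units (HSUs)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            groups[j].append(p)
        # recompute each center as its group's mean; keep old center if empty
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g
                   else centers[j] for j, g in enumerate(groups)]
    return centers, groups
```

Each resulting group plays the role of one HSU: the land surface model is run once per group rather than once per 30-m grid cell, which is where the computation and storage savings come from.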
Estimating Skin Cancer Risk: Evaluating Mobile Computer-Adaptive Testing.
Djaja, Ngadiman; Janda, Monika; Olsen, Catherine M; Whiteman, David C; Chien, Tsair-Wei
2016-01-22
Response burden is a major detriment to questionnaire completion rates. Computer adaptive testing may offer advantages over non-adaptive testing, including reduction of the number of items required for precise measurement. Our aim was to compare the efficiency of non-adaptive testing (NAT) and computer adaptive testing (CAT) facilitated by Partial Credit Model (PCM)-derived calibration to estimate skin cancer risk. We used a random sample from a population-based Australian cohort study of skin cancer risk (N=43,794). All 30 items of the skin cancer risk scale were calibrated with the Rasch PCM. A total of 1000 cases, generated following a normal distribution (mean [SD] 0 [1]), were simulated using three Rasch models with three fixed-item (dichotomous, rating scale, and partial credit) scenarios, respectively. We calculated the comparative efficiency and precision of CAT and NAT (shortening of questionnaire length and a count-difference ratio of less than 5%, using independent t tests). We found that use of CAT led to a smaller person standard error of the estimated measure than NAT, with substantially higher efficiency but no loss of precision, reducing response burden by 48%, 66%, and 66% for the dichotomous, Rating Scale Model, and PCM models, respectively. CAT-based administrations of the skin cancer risk scale could substantially reduce participant burden without compromising measurement precision. A mobile computer adaptive test was developed to help people efficiently assess their skin cancer risk.
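The efficiency gain of CAT comes from picking, at each step, the unanswered item that is most informative at the current ability estimate. A minimal sketch for the dichotomous Rasch case (the study itself uses the Partial Credit Model; item difficulties and the Newton update below are illustrative assumptions):

```python
import math

def prob(theta, b):
    """Dichotomous Rasch model: P(correct | ability theta, difficulty b)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def next_item(theta, difficulties, used):
    """Pick the unused item with maximum Fisher information p(1-p) at theta."""
    best, best_info = None, -1.0
    for j, b in enumerate(difficulties):
        if j in used:
            continue
        p = prob(theta, b)
        if p * (1.0 - p) > best_info:
            best, best_info = j, p * (1.0 - p)
    return best

def update_theta(theta, responses, difficulties, steps=20):
    """Newton-Raphson ML update of theta from (item_index, score) pairs."""
    for _ in range(steps):
        g = sum(x - prob(theta, difficulties[j]) for j, x in responses)
        h = -sum(prob(theta, difficulties[j])
                 * (1 - prob(theta, difficulties[j])) for j, x in responses)
        if abs(h) < 1e-12:
            break
        theta -= g / h
    return theta
```

Because information `p(1-p)` peaks where item difficulty matches ability, the adaptive test converges on a respondent's level with far fewer items than administering the full 30-item scale, which is the response-burden reduction the study quantifies.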
Computational study of single-expansion-ramp nozzles with external burning
NASA Astrophysics Data System (ADS)
Yungster, Shaye; Trefny, Charles J.
1992-04-01
A computational investigation of the effects of external burning on the performance of single expansion ramp nozzles (SERN) operating at transonic speeds is presented. The study focuses on the effects of external heat addition and introduces a simplified injection and mixing model based on a control volume analysis. This simplified model permits parametric and scaling studies that would have been impossible to conduct with a detailed CFD analysis. The CFD model is validated by comparing the computed pressure distribution and thrust forces, for several nozzle configurations, with experimental data. Specific impulse calculations are also presented which indicate that external burning performance can be superior to other methods of thrust augmentation at transonic speeds. The effects of injection fuel pressure and nozzle pressure ratio on the performance of SERN nozzles with external burning are described. The results show trends similar to those reported in the experimental study, and provide additional information that complements the experimental data, improving our understanding of external burning flowfields. A study of the effect of scale is also presented. The results indicate that combustion kinetics do not make the flowfield sensitive to scale.
Material and Thickness Grading for Aeroelastic Tailoring of the Common Research Model Wing Box
NASA Technical Reports Server (NTRS)
Stanford, Bret K.; Jutte, Christine V.
2014-01-01
This work quantifies the potential aeroelastic benefits of tailoring a full-scale wing box structure using tailored thickness distributions, material distributions, or both simultaneously. These tailoring schemes are considered for the wing skins, the spars, and the ribs. Material grading utilizes a spatially-continuous blend of two metals: Al and Al+SiC. Thicknesses and material fraction variables are specified at the 4 corners of the wing box, and a bilinear interpolation is used to compute these parameters for the interior of the planform. Pareto fronts detailing the conflict between static aeroelastic stresses and dynamic flutter boundaries are computed with a genetic algorithm. In some cases, a true material grading is found to be superior to a single-material structure.
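The corner-based parameterization described above is a standard bilinear interpolation; a minimal sketch (function and variable names are illustrative, not from the paper):

```python
def bilinear(corner_vals, u, v):
    """Interpolate a design variable (skin thickness or material fraction)
    from its 4 wing-box corner values to normalized planform coordinates
    (u, v) in [0, 1] x [0, 1].
    corner_vals = (f00, f10, f01, f11) at (u,v) = (0,0), (1,0), (0,1), (1,1)."""
    f00, f10, f01, f11 = corner_vals
    return (f00 * (1 - u) * (1 - v) + f10 * u * (1 - v)
            + f01 * (1 - u) * v + f11 * u * v)
```

Only the 4 corner values are design variables for the optimizer; the interpolation fills in a spatially continuous thickness or material-fraction field over the whole planform, which keeps the genetic algorithm's search space small.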
NASA Technical Reports Server (NTRS)
Wehrbein, W. M.; Leovy, C. B.
1981-01-01
A Curtis matrix is used to compute cooling by the 15 micron and 10 micron bands of carbon dioxide. Escape of radiation to space and exchange with the lower boundary are used for the 9.6 micron band of ozone. Voigt line shape, vibrational relaxation, line overlap, and the temperature dependence of line strength distributions and transmission functions are incorporated into the Curtis matrices. The distributions of the atmospheric constituents included in the algorithm and the method used to compute the Curtis matrices are discussed, as well as cooling or heating by the 9.6 micron band of ozone. The FORTRAN programs and subroutines that were developed are described and listed.
Human dynamics scaling characteristics for aerial inbound logistics operation
NASA Astrophysics Data System (ADS)
Wang, Qing; Guo, Jin-Li
2010-05-01
In recent years, the study of power-law scaling characteristics of real-life networks, which deviate from the Poisson process, has attracted much interest from scholars. In this paper, we take the whole process of aerial inbound operation in a logistics company as the empirical object. The main aim of this work is to study the statistical scaling characteristics of the task-restricted work patterns. We found that the statistical variables have the scaling characteristics of a unimodal distribution with a power-law tail in five statistical distributions - that is to say, there obviously exists a peak in each distribution, the shape of the left part is close to a Poisson distribution, and the right part has heavy-tailed scaling statistics. Furthermore, to our surprise, there is only one distribution whose right part can be approximated by the power-law form with exponent α=1.50. The others are bigger than 1.50 (three of four are about 2.50, one of four is about 3.00). We then obtain two inferences based on these empirical results: first, human behaviors are probably close to both Poisson statistics and power-law distributions at certain levels, and human-computer interaction behaviors may be the most common in the logistics operational areas, even in the whole task-restricted work pattern areas. Second, the hypothesis in Vázquez et al. (2006) [A. Vázquez, J. G. Oliveira, Z. Dezsö, K.-I. Goh, I. Kondor, A.-L. Barabási. Modeling burst and heavy tails in human dynamics, Phys. Rev. E 73 (2006) 036127] is probably not sufficient; it claimed that human dynamics can be classified into two discrete universality classes. There may be a new human dynamics mechanism that is different from the classical Barabási models.
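Tail exponents like the α=1.50-3.00 values reported above are typically obtained by maximum-likelihood (Hill) estimation over the heavy tail. A minimal sketch with a synthetic-data check (the paper does not state its fitting procedure, so the MLE approach here is an assumption):

```python
import math
import random

def sample_power_law(alpha, xmin, n, seed=3):
    """Inverse-transform samples from p(x) ∝ x^(-alpha) for x >= xmin."""
    rng = random.Random(seed)
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
            for _ in range(n)]

def hill_alpha(xs, xmin):
    """Maximum-likelihood (Hill) estimate of the tail exponent:
    alpha = 1 + n / sum(ln(x_i / xmin)) over the tail x_i >= xmin."""
    tail = [x for x in xs if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)
```

Applied to the right (heavy-tailed) part of each empirical distribution, with `xmin` set at the peak, this yields the tail exponents being compared against the universality classes of the Barabási-type models.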
DOE Office of Scientific and Technical Information (OSTI.GOV)
Livny, Miron; Shank, James; Ernst, Michael
Under this SciDAC-2 grant the project's goal was to stimulate new discoveries by providing scientists with effective and dependable access to an unprecedented national distributed computational facility: the Open Science Grid (OSG). We proposed to achieve this through the work of the Open Science Grid Consortium: a unique hands-on multi-disciplinary collaboration of scientists, software developers and providers of computing resources. Together the stakeholders in this consortium sustain and use a shared distributed computing environment that transforms simulation and experimental science in the US. The OSG consortium is an open collaboration that actively engages new research communities. We operate an open facility that brings together a broad spectrum of compute, storage, and networking resources and interfaces to other cyberinfrastructures, including the US XSEDE (previously TeraGrid) and the European Enabling Grids for E-sciencE (EGEE), as well as campus and regional grids. We leverage middleware provided by computer science groups, facility IT support organizations, and computing programs of application communities for the benefit of consortium members and the US national CI.
A parallel implementation of an off-lattice individual-based model of multicellular populations
NASA Astrophysics Data System (ADS)
Harvey, Daniel G.; Fletcher, Alexander G.; Osborne, James M.; Pitt-Francis, Joe
2015-07-01
As computational models of multicellular populations include ever more detailed descriptions of biophysical and biochemical processes, the computational cost of simulating such models limits their ability to generate novel scientific hypotheses and testable predictions. While developments in microchip technology continue to increase the power of individual processors, parallel computing offers an immediate increase in available processing power. To make full use of parallel computing technology, it is necessary to develop specialised algorithms. To this end, we present a parallel algorithm for a class of off-lattice individual-based models of multicellular populations. The algorithm divides the spatial domain between computing processes and comprises communication routines that ensure the model is correctly simulated on multiple processors. The parallel algorithm is shown to accurately reproduce the results of a deterministic simulation performed using a pre-existing serial implementation. We test the scaling of computation time, memory use and load balancing as more processes are used to simulate a cell population of fixed size. We find approximate linear scaling of both speed-up and memory consumption on up to 32 processor cores. Dynamic load balancing is shown to provide speed-up for non-regular spatial distributions of cells in the case of a growing population.
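The domain decomposition at the heart of the parallel algorithm described above can be illustrated in serial: split the cells among processes by spatial strips, and flag the cells close enough to a strip boundary that they would have to be communicated to the neighboring process. The strip scheme and cutoff are illustrative assumptions; the actual implementation partitions in more dimensions and handles moving, dividing cells.

```python
def decompose(cells, n_procs, cutoff, xmax=1.0):
    """Assign cell centers to equal-width strips in x, and list per process
    the cells within `cutoff` of a strip boundary (the 'halo' cells that
    must be exchanged with the neighboring process each step)."""
    width = xmax / n_procs
    owned = [[] for _ in range(n_procs)]
    halo = [[] for _ in range(n_procs)]
    for c in cells:
        p = min(int(c[0] / width), n_procs - 1)
        owned[p].append(c)
        left, right = p * width, (p + 1) * width
        if (c[0] - left < cutoff and p > 0) or \
           (right - c[0] < cutoff and p < n_procs - 1):
            halo[p].append(c)
    return owned, halo
```

Because cells interact only within a finite range, each process needs just this thin halo layer from its neighbors, which is what keeps communication cost low and allows the near-linear scaling the paper reports.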
Improving flow distribution in influent channels using computational fluid dynamics.
Park, No-Suk; Yoon, Sukmin; Jeong, Woochang; Lee, Seungjae
2016-10-01
The flow distribution in an influent channel, where the inflow is split between the treatment processes of a wastewater treatment plant, greatly affects process efficiency, and a weir is the typical structure used for this flow distribution; to the authors' knowledge, however, there is a paucity of research on flow distribution in an open channel with a weir. In this study, the influent channel of a full-scale wastewater treatment plant was considered, with a suppressed rectangular weir whose horizontal crest spans the full channel width. The flow distribution in the influent channel was analyzed using a validated computational fluid dynamics model to investigate (1) the comparison of single-phase and two-phase simulation, (2) an improved configuration of the prototype channel, and (3) the effect of the inflow rate on flow distribution. The results show that two-phase simulation is more reliable because it captures the free-surface fluctuations. To improve flow distribution, preventing short-circuit flow should be considered first, and differences in kinetic energy with inflow rate lead to different flow distribution trends. The authors believe this case study is helpful for improving flow distribution in influent channels.
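For context, the discharge over a suppressed rectangular weir of the kind installed in this study is commonly estimated with the standard rectangular-weir equation Q = Cd (2/3) sqrt(2g) b h^(3/2). A minimal sketch follows; the discharge coefficient is an assumed typical value, not a value taken from this study.

```python
import math

def rectangular_weir_discharge(b, h, cd=0.62, g=9.81):
    """Discharge (m^3/s) over a suppressed rectangular weir with crest
    width b (m) and head h (m). cd = 0.62 is an assumed typical discharge
    coefficient; real values depend on geometry and approach flow."""
    return cd * (2.0 / 3.0) * math.sqrt(2.0 * g) * b * h ** 1.5
```

In practice CFD is needed precisely because such 1-D formulas cannot capture the uneven approach flow and free-surface fluctuations discussed in the abstract.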
IP-Based Video Modem Extender Requirements
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pierson, L G; Boorman, T M; Howe, R E
2003-12-16
Visualization is one of the keys to understanding large complex data sets such as those generated by the large computing resources purchased and developed by the Advanced Simulation and Computing program (ASCI). To be convenient to researchers, visualization data must be distributed to offices and large visualization theaters. Currently, local distribution of the visual data is accomplished by distance-limited modems and RGB switches that simply do not scale to hundreds of users across local, metropolitan, and WAN distances without incurring large costs in fiber plant installation and maintenance. Wide-area application over the DOE Complex is infeasible using these limited-distance RGB extenders. On the other hand, Internet Protocol (IP) over Ethernet is a scalable, well-proven technology that can distribute large volumes of data over these distances. Visual data has been distributed at lower resolutions over IP in industrial applications. This document describes the ASCI program's requirements for visual signal distribution, for the purpose of identifying industrial partners willing to develop products to meet ASCI's needs.
Evaluating the performance of distributed approaches for modal identification
NASA Astrophysics Data System (ADS)
Krishnan, Sriram S.; Sun, Zhuoxiong; Irfanoglu, Ayhan; Dyke, Shirley J.; Yan, Guirong
2011-04-01
In this paper two modal identification approaches appropriate for use in a distributed computing environment are applied to a full-scale, complex structure. The natural excitation technique (NExT) used in conjunction with a condensed eigensystem realization algorithm (ERA), and frequency domain decomposition with peak-picking (FDD-PP), are both applied to sensor data acquired from a 57.5-ft, 10-bay highway sign truss structure. Monte Carlo simulations are performed on a numerical example to investigate the statistical properties and noise sensitivity of the two distributed algorithms. Experimental results are provided and discussed.
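The FDD-PP step can be sketched as follows: estimate the cross-spectral matrix at each frequency, take its singular value decomposition, and pick peaks of the first singular value. The sketch below uses a single-block periodogram estimate for brevity rather than the windowed, averaged spectra a production implementation would use.

```python
import numpy as np

def fdd_first_singular_value(signals, fs):
    """Frequency domain decomposition, peak-picking stage: build the
    cross-spectral matrix G(f) from sensor signals (n_sensors, n_samples)
    and return the first singular value at each frequency. A single-block
    periodogram estimate of G is used here for simplicity."""
    X = np.fft.rfft(signals, axis=1)                 # spectrum, one row per sensor
    f = np.fft.rfftfreq(signals.shape[1], d=1.0 / fs)
    G = np.einsum('ik,jk->kij', X, X.conj())         # G[k] = X[:, k] X[:, k]^H
    s1 = np.linalg.svd(G, compute_uv=False)[:, 0]    # batched SVD over frequencies
    return f, s1
```

Peaks of `s1` mark candidate natural frequencies; the corresponding singular vectors approximate the mode shapes.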
A scaling procedure for the response of an isolated system with high modal overlap factor
NASA Astrophysics Data System (ADS)
De Rosa, S.; Franco, F.
2008-10-01
The paper deals with a numerical approach that rescales some physical dimensions of the solution domain to compute the dynamic response of an isolated system; it has been named Asymptotical Scaled Modal Analysis (ASMA). The proposed numerical procedure alters the input data needed to obtain the classic modal responses, increasing the frequency band of validity of the discrete or continuous coordinate model through the definition of a proper scaling coefficient. It is demonstrated that the computational cost remains acceptable while the frequency range of analysis increases. Moreover, with reference to the flexural vibrations of a rectangular plate, the paper compares ASMA with statistical energy analysis and the energy distribution approach. Some insights are also given into the limits of the scaling coefficient. Finally, it is shown that the linear dynamic response predicted with the scaling procedure has the same quality and characteristics as that of statistical energy analysis, but it can be useful when the system cannot be solved appropriately by standard Statistical Energy Analysis (SEA).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gong, Jian; Stewart, Mark L.; Zelenyuk, Alla
The state-of-the-art multiscale modeling of gasoline particulate filters (GPFs), including channel scale, wall scale, and pore scale, is described. The microstructures of two GPFs were experimentally characterized. The pore size distributions of the GPFs were determined by mercury porosimetry. The porosity was measured by X-ray computed tomography (CT) and found to be inhomogeneous across the substrate wall. The significance of pore size distribution with respect to filtration performance was analyzed. The predictions of filtration efficiency were improved by including the pore size distribution in the filtration model. A dynamic heterogeneous multiscale filtration (HMF) model was utilized to simulate particulate filtration on a single channel particulate filter with realistic particulate emissions from a spark-ignition direct-injection (SIDI) gasoline engine. The dynamic evolution of the filter's microstructure and macroscopic filtration characteristics, including mass- and number-based filtration efficiencies and pressure drop, were predicted and discussed. The microstructure of the GPF substrate, including inhomogeneous porosity and pore size distribution, is found to significantly influence local particulate deposition inside the substrate and macroscopic filtration performance, and is recommended to be resolved in the filtration model to simulate and evaluate the filtration performance of GPFs.
Gong, Jian; Stewart, Mark L.; Zelenyuk, Alla; ...
2018-01-03
The state-of-the-art multiscale modeling of gasoline particulate filters (GPFs), including channel scale, wall scale, and pore scale, is described. The microstructures of two GPFs were experimentally characterized. The pore size distributions of the GPFs were determined by mercury porosimetry. The porosity was measured by X-ray computed tomography (CT) and found to be inhomogeneous across the substrate wall. The significance of pore size distribution with respect to filtration performance was analyzed. The predictions of filtration efficiency were improved by including the pore size distribution in the filtration model. A dynamic heterogeneous multiscale filtration (HMF) model was utilized to simulate particulate filtration on a single channel particulate filter with realistic particulate emissions from a spark-ignition direct-injection (SIDI) gasoline engine. The dynamic evolution of the filter's microstructure and macroscopic filtration characteristics, including mass- and number-based filtration efficiencies and pressure drop, were predicted and discussed. In conclusion, the microstructure of the GPF substrate, including inhomogeneous porosity and pore size distribution, is found to significantly influence local particulate deposition inside the substrate and macroscopic filtration performance, and is recommended to be resolved in the filtration model to simulate and evaluate the filtration performance of GPFs.
Spectral fingerprints of large-scale neuronal interactions.
Siegel, Markus; Donner, Tobias H; Engel, Andreas K
2012-01-11
Cognition results from interactions among functionally specialized but widely distributed brain regions; however, neuroscience has so far largely focused on characterizing the function of individual brain regions and neurons therein. Here we discuss recent studies that have instead investigated the interactions between brain regions during cognitive processes by assessing correlations between neuronal oscillations in different regions of the primate cerebral cortex. These studies have opened a new window onto the large-scale circuit mechanisms underlying sensorimotor decision-making and top-down attention. We propose that frequency-specific neuronal correlations in large-scale cortical networks may be 'fingerprints' of canonical neuronal computations underlying cognitive processes.
Dynamical scales for multi-TeV top-pair production at the LHC
NASA Astrophysics Data System (ADS)
Czakon, Michał; Heymes, David; Mitov, Alexander
2017-04-01
We calculate all major differential distributions with stable top-quarks at the LHC. The calculation covers the multi-TeV range that will be explored during LHC Run II and beyond. Our results are in the form of high-quality binned distributions. We offer predictions based on three different parton distribution function (pdf) sets. In the near future we will make our results available also in the more flexible fastNLO format that allows fast re-computation with any other pdf set. In order to be able to extend our calculation into the multi-TeV range we have had to derive a set of dynamic scales. Such scales are selected based on the principle of fastest perturbative convergence applied to the differential and inclusive cross-section. Many observations from our study are likely to be applicable and useful to other precision processes at the LHC. With scale uncertainty now under good control, pdfs arise as the leading source of uncertainty for TeV top production. Based on our findings, true precision in the boosted regime will likely only be possible after new and improved pdf sets appear. We expect that LHC top-quark data will play an important role in this process.
The International Symposium on Grids and Clouds
NASA Astrophysics Data System (ADS)
The International Symposium on Grids and Clouds (ISGC) 2012 will be held at Academia Sinica in Taipei from 26 February to 2 March 2012, with co-located events and workshops. The conference is hosted by the Academia Sinica Grid Computing Centre (ASGC). 2012 is the decennium anniversary of the ISGC which over the last decade has tracked the convergence, collaboration and innovation of individual researchers across the Asia Pacific region to a coherent community. With the continuous support and dedication from the delegates, ISGC has provided the primary international distributed computing platform where distinguished researchers and collaboration partners from around the world share their knowledge and experiences. The last decade has seen the wide-scale emergence of e-Infrastructure as a critical asset for the modern e-Scientist. The emergence of large-scale research infrastructures and instruments that has produced a torrent of electronic data is forcing a generational change in the scientific process and the mechanisms used to analyse the resulting data deluge. No longer can the processing of these vast amounts of data and production of relevant scientific results be undertaken by a single scientist. Virtual Research Communities that span organisations around the world, through an integrated digital infrastructure that connects the trust and administrative domains of multiple resource providers, have become critical in supporting these analyses. Topics covered in ISGC 2012 include: High Energy Physics, Biomedicine & Life Sciences, Earth Science, Environmental Changes and Natural Disaster Mitigation, Humanities & Social Sciences, Operations & Management, Middleware & Interoperability, Security and Networking, Infrastructure Clouds & Virtualisation, Business Models & Sustainability, Data Management, Distributed Volunteer & Desktop Grid Computing, High Throughput Computing, and High Performance, Manycore & GPU Computing.
NASA Astrophysics Data System (ADS)
Singh, Surya P. N.; Thayer, Scott M.
2002-02-01
This paper presents a novel algorithmic architecture for the coordination and control of large-scale distributed robot teams, derived from constructs found within the human immune system. Using this as a guide, the Immunology-derived Distributed Autonomous Robotics Architecture (IDARA) distributes tasks so that broad, all-purpose actions are refined and followed by specific, mediated responses based on each unit's utility and capability to address the system's perceived needs in a timely manner. This method improves on initial developments in this area by including the often-overlooked interactions of the innate immune system, resulting in a stronger first-order, general response mechanism. This allows rapid reactions in dynamic environments, especially those lacking significant a priori information. As characterized via computer simulation of a self-healing mobile minefield having up to 7,500 mines and 2,750 robots, IDARA provides an efficient, communications-light, and scalable architecture that yields significant operational and performance improvements for large-scale multi-robot coordination and control.
Spatio-temporal assessment of food safety risks in Canadian food distribution systems using GIS.
Hashemi Beni, Leila; Villeneuve, Sébastien; LeBlanc, Denyse I; Côté, Kevin; Fazil, Aamir; Otten, Ainsley; McKellar, Robin; Delaquis, Pascal
2012-09-01
While geographic information systems (GIS) are widely applied in public health, there have been comparatively few applications that extend to the assessment of risks in food distribution systems. GIS can provide decision makers with strong computing platforms for spatial data management, integration, analysis, querying and visualization. The present report addresses spatial analyses in a complex food distribution system and defines influence areas as travel time zones generated through road network analysis on a national scale rather than on a community scale. In addition, a dynamic risk index is defined to translate a contamination event into a public health risk as time progresses. More specifically, in this research, GIS is used to map the Canadian produce distribution system, analyze consumers' accessibility to contaminated product, and estimate the level of risk associated with a contamination event over time, as illustrated in a scenario. Crown Copyright © 2012. Published by Elsevier Ltd. All rights reserved.
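The travel-time-zone construction can be illustrated with a plain shortest-path computation on a toy road network; the node names, travel times, and zone bounds below are hypothetical, not the study's data.

```python
import heapq

def travel_time_zones(graph, source, zone_bounds):
    """graph: {node: [(neighbour, minutes), ...]}. Returns {node: zone},
    where zone i means travel time <= zone_bounds[i] minutes from the
    source (None if unreachable within the largest bound). Dijkstra's
    algorithm stands in for a full road-network analysis."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                       # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return {node: next((i for i, b in enumerate(zone_bounds) if d <= b), None)
            for node, d in dist.items()}
```

A dynamic risk index would then weight each zone's population by how far the contaminated product can have travelled at a given time after the event.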
NASA Technical Reports Server (NTRS)
Mcclelland, J.; Silk, J.
1979-01-01
The evolution of the two-point correlation function for the large-scale distribution of galaxies in an expanding universe is studied on the assumption that the perturbation densities lie in a Gaussian distribution centered on any given mass scale. The perturbations are evolved according to the Friedmann equation, and the correlation function for the resulting distribution of perturbations at the present epoch is calculated. It is found that: (1) the computed correlation function gives a satisfactory fit to the observed function in cosmological models with a density parameter (Omega) of approximately unity, provided that a certain free parameter is suitably adjusted; (2) the power-law slope in the nonlinear regime reflects the initial fluctuation spectrum, provided that the density profile of individual perturbations declines more rapidly than the -2.4 power of distance; and (3) both positive and negative contributions to the correlation function are predicted for cosmological models with Omega less than unity.
Self-organization of the magnetization in ferromagnetic nanowires
NASA Astrophysics Data System (ADS)
Ivanov, A. A.; Orlov, V. A.
2017-10-01
Using computer simulation of the magnetization in a polycrystalline ferromagnetic nanowire, we demonstrate the occurrence of a characteristic spatial scale in the magnetization distribution that is unrelated to the domain wall or crystallite size: the stochastic domain size. We show that this length enters the spectral density of the pinning force exerted on a domain wall by inhomogeneities of the crystallographic anisotropy. The anisotropy constant and the distribution of easy-axis directions of the effective anisotropy of a stochastic domain are calculated analytically.
Transverse momentum dependent parton distributions at small-x
Xiao, Bo-Wen; Yuan, Feng; Zhou, Jian
2017-05-23
We study the transverse momentum dependent (TMD) parton distributions at small-x in a consistent framework that takes into account the TMD evolution and small-x evolution simultaneously. The small-x evolution effects are included by computing the TMDs at appropriate scales in terms of the dipole scattering amplitudes, which obey the relevant Balitsky–Kovchegov equation. Meanwhile, the TMD evolution is obtained by resumming the Collins–Soper type large logarithms that emerge from the calculations in the small-x formalism into Sudakov factors.
Transverse momentum dependent parton distributions at small-x
NASA Astrophysics Data System (ADS)
Xiao, Bo-Wen; Yuan, Feng; Zhou, Jian
2017-08-01
We study the transverse momentum dependent (TMD) parton distributions at small-x in a consistent framework that takes into account the TMD evolution and small-x evolution simultaneously. The small-x evolution effects are included by computing the TMDs at appropriate scales in terms of the dipole scattering amplitudes, which obey the relevant Balitsky-Kovchegov equation. Meanwhile, the TMD evolution is obtained by resumming the Collins-Soper type large logarithms that emerge from the calculations in the small-x formalism into Sudakov factors.
On the Computation of Sound by Large-Eddy Simulations
NASA Technical Reports Server (NTRS)
Piomelli, Ugo; Streett, Craig L.; Sarkar, Sutanu
1997-01-01
The effect of the small scales on the source term in Lighthill's acoustic analogy is investigated, with the objective of determining the accuracy of large-eddy simulations when applied to studies of flow-generated sound. The distribution of the turbulent quadrupole is predicted accurately if models that take into account the trace of the SGS stresses are used. Its spatial distribution is also correct, indicating that the low-wave-number (or frequency) part of the sound spectrum can be predicted well by LES. Filtering, however, removes the small-scale fluctuations that contribute significantly to the higher derivatives in space and time of Lighthill's stress tensor T_ij. The rms fluctuations of the filtered derivatives are substantially lower than those of the unfiltered quantities. The small scales, however, are not strongly correlated and are not expected to contribute significantly to the far-field sound; separate modeling of the subgrid-scale density fluctuations might, however, be required in some configurations.
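The filtering effect described above, where low-pass filtering mainly suppresses the small scales that dominate higher derivatives, can be demonstrated on a synthetic broadband signal; a simple moving-average kernel stands in for an LES filter here, and the numbers are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
dt = 1e-3
u = rng.standard_normal(10000)             # broadband "unfiltered" signal
kernel = np.ones(11) / 11.0                # top-hat (moving-average) filter
u_f = np.convolve(u, kernel, mode="same")  # "filtered" signal

def rms(x):
    return float(np.sqrt(np.mean(x ** 2)))

# derivatives weight the high wavenumbers, so filtering cuts their rms
# much more strongly than it cuts the rms of the signal itself
dudt = np.gradient(u, dt)
dufdt = np.gradient(u_f, dt)
ratio = rms(dufdt) / rms(dudt)
```

The same mechanism is why the filtered Lighthill source underestimates the higher space-time derivatives while leaving the low-wavenumber content largely intact.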
Azad, Ariful; Ouzounis, Christos A; Kyrpides, Nikos C; Buluç, Aydin
2018-01-01
Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. Here, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ∼70 million nodes with ∼68 billion edges in ∼2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license. PMID:29315405
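The expansion and inflation iteration at the heart of MCL, which HipMCL parallelizes over distributed memory, can be sketched serially with NumPy. This illustrates the basic algorithm only, not HipMCL's distributed implementation.

```python
import numpy as np

def mcl(adj, inflation=2.0, iters=50, tol=1e-9):
    """Serial Markov Clustering: alternate expansion (matrix squaring)
    and inflation (elementwise power followed by column renormalization)
    until the flow matrix converges; the nonzero rows of the limit
    (attractors) define the clusters."""
    M = adj.astype(float) + np.eye(adj.shape[0])   # add self-loops
    M /= M.sum(axis=0, keepdims=True)              # column-stochastic
    for _ in range(iters):
        prev = M
        M = M @ M                                  # expansion: flow along longer paths
        M = M ** inflation                         # inflation: strengthen strong flows
        M /= M.sum(axis=0, keepdims=True)
        if np.abs(M - prev).max() < tol:
            break
    return {frozenset(np.flatnonzero(M[i] > 1e-8).tolist())
            for i in range(M.shape[0]) if (M[i] > 1e-8).any()}
```

HipMCL's contribution is performing the expensive `M @ M` step as a distributed sparse matrix-matrix multiplication across thousands of nodes.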
Azad, Ariful; Pavlopoulos, Georgios A.; Ouzounis, Christos A.; ...
2018-01-05
Biological networks capture structural or functional properties of relevant entities such as molecules, proteins or genes. Characteristic examples are gene expression networks or protein–protein interaction networks, which hold information about functional affinities or structural similarities. Such networks have been expanding in size due to increasing scale and abundance of biological data. While various clustering algorithms have been proposed to find highly connected regions, Markov Clustering (MCL) has been one of the most successful approaches to cluster sequence similarity or expression networks. Despite its popularity, MCL’s scalability to cluster large datasets still remains a bottleneck due to high running times and memory demands. In this paper, we present High-performance MCL (HipMCL), a parallel implementation of the original MCL algorithm that can run on distributed-memory computers. We show that HipMCL can efficiently utilize 2000 compute nodes and cluster a network of ~70 million nodes with ~68 billion edges in ~2.4 h. By exploiting distributed-memory environments, HipMCL clusters large-scale networks several orders of magnitude faster than MCL and enables clustering of even bigger networks. Finally, HipMCL is based on MPI and OpenMP and is freely available under a modified BSD license.
The emerging role of cloud computing in molecular modelling.
Ebejer, Jean-Paul; Fulle, Simone; Morris, Garrett M; Finn, Paul W
2013-07-01
There is a growing recognition of the importance of cloud computing for large-scale and data-intensive applications. The distinguishing features of cloud computing and their relationship to other distributed computing paradigms are described, as are the strengths and weaknesses of the approach. We review the use made to date of cloud computing for molecular modelling projects and the availability of front ends for molecular modelling applications. Although the use of cloud computing technologies for molecular modelling is still in its infancy, we demonstrate its potential by presenting several case studies. Rapid growth can be expected as more applications become available and costs continue to fall; cloud computing can make a major contribution not just in terms of the availability of on-demand computing power, but could also spur innovation in the development of novel approaches that utilize that capacity in more effective ways. Copyright © 2013 Elsevier Inc. All rights reserved.
Integrating Data Distribution and Data Assimilation Between the OOI CI and the NOAA DIF
NASA Astrophysics Data System (ADS)
Meisinger, M.; Arrott, M.; Clemesha, A.; Farcas, C.; Farcas, E.; Im, T.; Schofield, O.; Krueger, I.; Klacansky, I.; Orcutt, J.; Peach, C.; Chave, A.; Raymer, D.; Vernon, F.
2008-12-01
The Ocean Observatories Initiative (OOI) is an NSF funded program to establish the ocean observing infrastructure of the 21st century benefiting research and education. It is currently approaching final design and promises to deliver cyber and physical observatory infrastructure components as well as substantial core instrumentation to study environmental processes of the ocean at various scales, from coastal shelf-slope exchange processes to the deep ocean. The OOI's data distribution network lies at the heart of its cyber- infrastructure, which enables a multitude of science and education applications, ranging from data analysis, to processing, visualization and ontology supported query and mediation. In addition, it fundamentally supports a class of applications exploiting the knowledge gained from analyzing observational data for objective-driven ocean observing applications, such as automatically triggered response to episodic environmental events and interactive instrument tasking and control. The U.S. Department of Commerce through NOAA operates the Integrated Ocean Observing System (IOOS) providing continuous data in various formats, rates and scales on open oceans and coastal waters to scientists, managers, businesses, governments, and the public to support research and inform decision-making. The NOAA IOOS program initiated development of the Data Integration Framework (DIF) to improve management and delivery of an initial subset of ocean observations with the expectation of achieving improvements in a select set of NOAA's decision-support tools. Both OOI and NOAA through DIF collaborate on an effort to integrate the data distribution, access and analysis needs of both programs. 
We present details and early findings from this collaboration; one part of it is the development of a demonstrator combining web-based user access to oceanographic data through ERDDAP, efficient science data distribution, and scalable, self-healing deployment in a cloud computing environment. ERDDAP is a web-based front-end application integrating oceanographic data sources of various formats, for instance CDF data files as aggregated through NcML or presented using a THREDDS server. The OOI-designed data distribution network provides global traffic management and computational load balancing for observatory resources; it makes use of the OpenDAP Data Access Protocol (DAP) for efficient canonical science data distribution over the network. A cloud computing strategy is the basis for scalable, self-healing organization of an observatory's computing and storage resources, independent of the physical location and technical implementation of these resources.
Vibrational analysis and quantum chemical calculations of 2,2'-bipyridine Zinc(II) halide complexes
NASA Astrophysics Data System (ADS)
Ozel, Aysen E.; Kecel, Serda; Akyuz, Sevim
2007-05-01
In this study the molecular structure and vibrational spectra of Zn(2,2'-bipyridine)X2 (X = Cl, Br) complexes were studied in their ground states by computational vibrational study and scaled quantum mechanical (SQM) analysis. The geometry optimization, vibrational wavenumber and intensity calculations of free and coordinated 2,2'-bipyridine were carried out with the Gaussian03 program package using Hartree-Fock (HF) and Density Functional Theory (DFT) with the B3LYP functional and the 6-31G(d,p) basis set. The total energy distributions (TED) of the vibrational modes were calculated using SQM analysis, and the fundamentals were characterized by their TED. Coordination-sensitive modes of 2,2'-bipyridine were determined.
NASA Technical Reports Server (NTRS)
Klumpar, D. M. (Principal Investigator)
1982-01-01
The status of the initial testing of the modeling procedure developed to compute the magnetic fields at satellite orbit due to current distributions in the ionosphere and magnetosphere is reported. The modeling technique utilizes a linear current element representation of the large scale space-current system.
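A linear-current-element model of this kind evaluates the field by summing Biot-Savart contributions from short straight segments. A minimal sketch follows, checked against the infinite straight-wire limit B = mu0*I/(2*pi*d); the segment discretization is an illustrative assumption, not the model described in the report.

```python
import numpy as np

MU0 = 4e-7 * np.pi  # vacuum permeability (T*m/A)

def biot_savart(starts, ends, current, r_obs):
    """Magnetic field at r_obs from straight current elements given by
    (N, 3) segment start/end points, using dB = mu0 I dl x r / (4 pi r^3)
    with each segment treated as a short element at its midpoint."""
    dl = ends - starts
    mid = 0.5 * (starts + ends)
    r = r_obs - mid
    rnorm = np.linalg.norm(r, axis=1, keepdims=True)
    dB = MU0 * current * np.cross(dl, r) / (4.0 * np.pi * rnorm ** 3)
    return dB.sum(axis=0)

# validation: a long straight wire along z, observed 1 m away on the x-axis,
# should reproduce B = mu0 I / (2 pi d) = 2e-7 T for I = 1 A, d = 1 m
z = np.linspace(-100.0, 100.0, 20001)
zeros = np.zeros(z.size - 1)
starts = np.column_stack([zeros, zeros, z[:-1]])
ends = np.column_stack([zeros, zeros, z[1:]])
B = biot_savart(starts, ends, current=1.0, r_obs=np.array([1.0, 0.0, 0.0]))
```

A space-current model would replace the straight wire with current elements tracing ionospheric and magnetospheric current paths and evaluate B at the satellite position.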
Planning and assessment in land and water resource management are evolving from simple, local-scale problems toward complex, spatially explicit regional ones. Such problems have to be addressed with distributed models that can compute runoff and erosion at different spatial and t...
NASA Astrophysics Data System (ADS)
Lucas, Charles E.; Walters, Eric A.; Jatskevich, Juri; Wasynczuk, Oleg; Lamm, Peter T.
2003-09-01
In this paper, a new technique useful for the numerical simulation of large-scale systems is presented. This approach enables the overall system simulation to be formed by the dynamic interconnection of the various interdependent simulations, each representing a specific component or subsystem such as control, electrical, mechanical, hydraulic, or thermal. Each simulation may be developed separately using possibly different commercial-off-the-shelf simulation programs thereby allowing the most suitable language or tool to be used based on the design/analysis needs. These subsystems communicate the required interface variables at specific time intervals. A discussion concerning the selection of appropriate communication intervals is presented herein. For the purpose of demonstration, this technique is applied to a detailed simulation of a representative aircraft power system, such as that found on the Joint Strike Fighter (JSF). This system is comprised of ten component models each developed using MATLAB/Simulink, EASY5, or ACSL. When the ten component simulations were distributed across just four personal computers (PCs), a greater than 15-fold improvement in simulation speed (compared to the single-computer implementation) was achieved.
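The interconnection scheme, in which each subsystem integrates independently and interface variables are exchanged only at fixed communication intervals, can be sketched with two coupled first-order subsystems. This is a toy stand-in for the detailed component models named above, with made-up dynamics chosen so the exact answer is known.

```python
def cosimulate(x0=1.0, y0=0.0, t_end=10.0, h_comm=0.1, h_int=0.01):
    """Jacobi-type co-simulation of dx/dt = -x + y and dy/dt = -y + x.
    Within each communication interval a subsystem sees the other's
    interface variable frozen at the last exchange and takes its own
    Euler micro-steps. The exact coupled solution tends to (x0 + y0)/2."""
    x, y = x0, y0
    t = 0.0
    while t < t_end - 1e-12:
        x_held, y_held = x, y            # exchange interface variables
        tau = 0.0
        while tau < h_comm - 1e-12:      # independent micro-steps
            x += h_int * (-x + y_held)
            y += h_int * (-y + x_held)
            tau += h_int
        t += h_comm
    return x, y
```

The coupling error shrinks with the communication interval h_comm, which is the trade-off (accuracy versus messaging overhead) behind the interval-selection discussion in the paper.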
Anishaparvin, A; Chhanwal, N; Indrani, D; Raghavarao, K S M S; Anandharamakrishnan, C
2010-01-01
A computational fluid dynamics (CFD) model was developed for the bread-baking process in a pilot-scale baking oven to determine the effect of hot-air distribution and bread placement on the temperature and starch gelatinization index of bread. In this study, product (bread) simulation was carried out for different placements of bread, and simulation results were validated against experimental measurements of bread temperature. This study showed that the nonuniform air flow pattern inside the oven cavity leads to uneven temperature distribution. With respect to placement, bread baked in the upper trays required a shorter baking time and reached complete starch gelatinization sooner than bread in the bottom tray. The center of the upper-tray bread reached 100 °C at 1200 s, whereas starch gelatinization was complete within 900 s, which set the minimum baking time. Moreover, heat penetration and starch gelatinization were greater along the sides of the bread than at its top and bottom. © 2010 Institute of Food Technologists®
Grating-based X-ray Dark-field Computed Tomography of Living Mice.
Velroyen, A; Yaroshenko, A; Hahn, D; Fehringer, A; Tapfer, A; Müller, M; Noël, P B; Pauwels, B; Sasov, A; Yildirim, A Ö; Eickelberg, O; Hellbach, K; Auweter, S D; Meinel, F G; Reiser, M F; Bech, M; Pfeiffer, F
2015-10-01
Changes in x-ray attenuating tissue caused by lung disorders like emphysema or fibrosis are subtle and thus resolved only by high-resolution computed tomography (CT). The structural reorganization, however, strongly influences lung function. Dark-field CT (DFCT), based on small-angle scattering of x-rays, reveals such structural changes even at resolutions coarser than the pulmonary network and thus provides access to their anatomical distribution. In this proof-of-concept study we present in vivo x-ray DFCTs of the lungs of a healthy, an emphysematous, and a fibrotic mouse. The tomographies show excellent depiction of the distribution of structural, and thus indirectly functional, changes in lung parenchyma, on single-modality dark-field slices as well as on multimodal fusion images. We therefore anticipate numerous applications of DFCT in diagnostic lung imaging. We introduce a scatter-based Hounsfield unit (sHU) scale to facilitate comparability of scans. On this newly defined sHU scale, the pathophysiological changes caused by emphysema and fibrosis shift the values toward lower numbers compared to healthy lung tissue.
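The abstract does not give the exact normalization of the sHU scale, so the sketch below assumes a water/air reference pair analogous to the conventional attenuation-based Hounsfield unit, with the dark-field scattering coefficient taking the place of the attenuation coefficient; the reference and tissue values used are illustrative, not the paper's.

```python
def scatter_hounsfield(eps, eps_water=1.0, eps_air=0.0):
    """Map a dark-field (scattering) coefficient `eps` onto a
    Hounsfield-like scale: water -> 0, air -> -1000.

    This mirrors the conventional HU definition and is an illustrative
    assumption, not the paper's exact formula.
    """
    return 1000.0 * (eps - eps_water) / (eps_water - eps_air)

# Made-up coefficients: healthy lung scatters strongly at its many
# air-tissue interfaces; emphysema destroys interfaces and scatters
# less, so its sHU shifts toward lower numbers, as the abstract reports.
sHU_water = scatter_hounsfield(1.0)
sHU_air = scatter_hounsfield(0.0)
sHU_healthy = scatter_hounsfield(2.5)
sHU_emphysema = scatter_hounsfield(1.8)
```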
Cognitive biases, linguistic universals, and constraint-based grammar learning.
Culbertson, Jennifer; Smolensky, Paul; Wilson, Colin
2013-07-01
According to classical arguments, language learning is both facilitated and constrained by cognitive biases. These biases are reflected in linguistic typology-the distribution of linguistic patterns across the world's languages-and can be probed with artificial grammar experiments on child and adult learners. Beginning with a widely successful approach to typology (Optimality Theory), and adapting techniques from computational approaches to statistical learning, we develop a Bayesian model of cognitive biases and show that it accounts for the detailed pattern of results of artificial grammar experiments on noun-phrase word order (Culbertson, Smolensky, & Legendre, 2012). Our proposal has several novel properties that distinguish it from prior work in the domains of linguistic theory, computational cognitive science, and machine learning. This study illustrates how ideas from these domains can be synthesized into a model of language learning in which biases range in strength from hard (absolute) to soft (statistical), and in which language-specific and domain-general biases combine to account for data from the macro-level scale of typological distribution to the micro-level scale of learning by individuals. Copyright © 2013 Cognitive Science Society, Inc.
Developing eThread pipeline using SAGA-pilot abstraction for large-scale structural bioinformatics.
Ragothaman, Anjani; Boddu, Sairam Chowdary; Kim, Nayong; Feinstein, Wei; Brylinski, Michal; Jha, Shantenu; Kim, Joohyun
2014-01-01
While most computational annotation approaches are sequence-based, threading methods are becoming increasingly attractive because the predicted structural information can uncover the underlying function. However, threading tools are generally compute-intensive, and the number of protein sequences from even small genomes such as prokaryotes is large, typically many thousands, prohibiting their application as a genome-wide structural systems biology tool. To make this practical, we have developed a pipeline for eThread, a meta-threading protein structure modeling tool, which uses computational resources efficiently and effectively. We employ a pilot-based approach that supports seamless data- and task-level parallelism and manages large variation in workload and computational requirements. Our scalable pipeline is deployed on Amazon EC2 and can efficiently select resources based upon task requirements. We present a runtime analysis to characterize the computational complexity of eThread and the EC2 infrastructure. Based on the results, we suggest a pathway to an optimized solution with respect to metrics such as time-to-solution or cost-to-solution. Our eThread pipeline can scale to support a large number of sequences and is expected to be a viable solution for genome-scale structural bioinformatics and structure-based annotation, particularly for small genomes such as prokaryotes. The developed pipeline is easily extensible to other types of distributed cyberinfrastructure.
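The pilot abstraction amounts to decoupling task submission from resource binding: tasks enter a shared queue, and whichever worker (pilot slot) is free pulls the next one, which absorbs the large per-sequence variation in workload. A minimal stand-in, with a dummy scoring function in place of an actual eThread run, might look like:

```python
from concurrent.futures import ThreadPoolExecutor

def thread_sequence(seq):
    """Placeholder for a compute-intensive threading run on one sequence.

    The 'score' here is a trivial function of the sequence content;
    a real pipeline would invoke the modeling tool instead.
    """
    return seq, sum(ord(c) for c in seq) % 97

sequences = ["MKTAYIAKQR", "GAVLIPF", "MSTNPKPQRK", "AQ"]

# The executor plays the role of the pilot: a fixed pool of workers
# pulls tasks from a queue, so long and short tasks balance automatically.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(thread_sequence, sequences))
```

The same pattern generalizes to process pools or remote workers; the pipeline-level concern is only that task granularity is fine enough for the queue to even out the workload imbalance.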
Prospects of Detecting HI using Redshifted 21-cm Radiation at z ∼ 3
NASA Astrophysics Data System (ADS)
Gehlot, Bharat Kumar; Bagla, J. S.
2017-03-01
Distribution of cold gas in the post-reionization era provides an important link between the distribution of galaxies and the process of star formation. Redshifted 21-cm radiation from the hyperfine transition of neutral hydrogen allows us to probe the neutral component of cold gas, most of which is found in the interstellar medium of galaxies. Existing and upcoming radio telescopes can probe the large scale distribution of neutral hydrogen via HI intensity mapping. In this paper, we use an estimate of the HI power spectrum derived using an ansatz to compute the expected signal from the large scale HI distribution at z ∼ 3. We find that the scale dependence of bias at small scales makes a significant difference to the expected signal even at large angular scales. We compare the predicted signal strength with the sensitivity of radio telescopes that can observe such radiation and calculate the observation time required for detecting neutral hydrogen at these redshifts. We find that OWFA (Ooty Wide Field Array) offers the best possibility to detect neutral hydrogen at z ∼ 3 before the SKA (Square Kilometer Array) becomes operational. We find that OWFA should be able to make a 3σ or more significant detection in 2000 hours of observations at several angular scales. Calculations done using the Fisher matrix approach indicate that a 5σ detection of the binned HI power spectrum via measurement of the amplitude of the HI power spectrum is possible in 1000 hours (Sarkar et al. 2017).
Siragusa, Enrico; Haiminen, Niina; Utro, Filippo; Parida, Laxmi
2017-10-09
Computer simulations can be used to study population genetic methods, models and parameters, as well as to predict potential outcomes; for example, in plant populations, the outcome of breeding operations can be studied using simulations. In-silico construction of populations with pre-specified characteristics is an important task in breeding optimization and other population genetic studies. We present two linear-time Simulation using Best-fit Algorithms (SimBA) for two classes of problems, each of which jointly fits two distributions: SimBA-LD fits linkage disequilibrium and minimum allele frequency distributions, while SimBA-hap fits founder-haplotype and polyploid allele dosage distributions. An incremental gap-filling version of the previously introduced SimBA-LD is here demonstrated to accurately fit the target distributions, allowing efficient large-scale simulations. SimBA-hap accuracy and efficiency are demonstrated by simulating tetraploid populations with varying numbers of founder haplotypes; we evaluate both a linear-time greedy algorithm and an optimal solution based on mixed-integer programming. SimBA is available on http://researcher.watson.ibm.com/project/5669.
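The abstract does not spell out the greedy algorithm, but a linear-time greedy fit of a single target distribution can be sketched as a largest-deficit rule: at each step, emit the category whose running count lags its target proportion the most. This illustrates the idea of best-fit construction, not SimBA's actual procedure.

```python
def greedy_fit(target, n):
    """Build a sample of size n whose empirical distribution tracks
    `target` (a dict of category -> probability) as closely as possible.

    Greedy largest-deficit rule; runs in O(n * len(target)) time.
    """
    counts = {c: 0 for c in target}
    sample = []
    for step in range(1, n + 1):
        # deficit: how far each category's count lags its target share
        c = max(target, key=lambda cat: target[cat] * step - counts[cat])
        counts[c] += 1
        sample.append(c)
    return sample, counts

sample, counts = greedy_fit({"A": 0.5, "B": 0.3, "C": 0.2}, 10)
```

An exact (but slower) alternative for small instances is a mixed-integer program over the counts, which is the trade-off the abstract evaluates for SimBA-hap.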
NASA Technical Reports Server (NTRS)
Kramer, Williams T. C.; Simon, Horst D.
1994-01-01
This tutorial is intended as a practical guide for the uninitiated to the main topics and themes of high-performance computing (HPC), with particular emphasis on distributed computing. The intent is first to provide some guidance and direction in the rapidly growing field of scientific computing using both massively parallel and traditional supercomputers. Because of their considerable potential computational power, loosely or tightly coupled clusters of workstations are increasingly considered a third alternative to both conventional supercomputers based on a small number of powerful vector processors and massively parallel processors. Even though many research issues concerning the effective use of workstation clusters and their integration into a large-scale production facility are still unresolved, such clusters are already used for production computing. In this tutorial we draw on the unique experience gained at the NAS facility at NASA Ames Research Center. Over the last five years at NAS, massively parallel supercomputers such as the Connection Machines CM-2 and CM-5 from Thinking Machines Corporation and the iPSC/860 (Touchstone Gamma Machine) and Paragon machines from Intel were used in a production supercomputer center alongside traditional vector supercomputers such as the Cray Y-MP and C90.
NASA Astrophysics Data System (ADS)
Ketcham, Richard A.
2017-04-01
Anisotropy in three-dimensional quantities such as geometric shape and orientation is commonly quantified using principal components analysis, in which a second-order tensor determines the orientations of orthogonal components and their relative magnitudes. This approach has many advantages, such as simplicity, the ability to accommodate many forms of data, and resilience to data sparsity. However, when data are sufficiently plentiful and precise, they sometimes show that aspects of the principal components approach are oversimplifications that may affect how the data are interpreted or extrapolated for mathematical or physical modeling. High-resolution X-ray computed tomography (CT) can effectively extract thousands of measurements from a single sample, providing a data density sufficient to examine the ways in which anisotropy at the hand-sample scale and smaller can be quantified, and the extent to which these simplifications remain faithful to the underlying distributions. Features within CT data can be considered as discrete objects or continuum fabrics; the latter can be characterized using a variety of metrics, such as the most commonly used mean intercept length, and the more specialized star length and star volume distributions. Each method posits a different scaling among components that affects the measured degree of anisotropy. The star volume distribution is the most sensitive to anisotropy, and commonly distinguishes strong fabric components that are not orthogonal. Although these data are well presented using a stereoplot, 3D rose diagrams are another visualization option that can often help identify these components.
This talk presents examples from a number of cases, starting with trabecular bone and extending to geological features such as fractures and brittle and ductile fabrics, in which non-orthogonal principal components identified using CT provide some insight into the origin of the underlying structures, and how they should be interpreted and potentially up-scaled.
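Of the fabric metrics mentioned, the mean intercept length is the simplest to illustrate: along a given direction it is the total solid length divided by the number of solid intercepts, so an anisotropic structure yields different values in different directions. A 2D sketch on a binary image, restricted to row and column directions:

```python
def mean_intercept_length(lines):
    """Mean intercept length over a set of scan lines, each a sequence of
    0/1 values: total solid (1) length divided by the number of runs."""
    solid = runs = 0
    for line in lines:
        prev = 0
        for v in line:
            solid += v
            if v and not prev:
                runs += 1          # a new solid intercept starts
            prev = v
    return solid / runs if runs else 0.0

# horizontal stripes: a strongly anisotropic test fabric
img = [[1, 1, 1, 1],
       [0, 0, 0, 0],
       [1, 1, 1, 1],
       [0, 0, 0, 0]]

mil_x = mean_intercept_length(img)              # scan along rows
mil_y = mean_intercept_length(list(zip(*img)))  # scan along columns
```

The ratio mil_x/mil_y (here 4) is one scalar measure of the fabric's anisotropy; the star length and star volume distributions generalize this by weighting intercepts differently, which is why they can resolve non-orthogonal fabric components that a second-order tensor fit averages away.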
Estimating Bias Error Distributions
NASA Technical Reports Server (NTRS)
Liu, Tian-Shu; Finley, Tom D.
2001-01-01
This paper formulates a general methodology for estimating the bias error distribution of a device in a measuring domain from less accurate measurements when a minimal number of standard values (typically two) are available. A new perspective is that the bias error distribution can be found as the solution of an intrinsic functional equation in a domain. Based on this theory, scaling- and translation-based methods for determining the bias error distribution are developed. These methods are applicable to virtually any device as long as its bias error distribution can be sufficiently described by a power series (a polynomial) or a Fourier series in a domain. These methods have been validated through computational simulations and laboratory calibration experiments for a number of different devices.
Accounting for small scale heterogeneity in ecohydrologic watershed models
NASA Astrophysics Data System (ADS)
Bhaskar, A.; Fleming, B.; Hogan, D. M.
2016-12-01
Spatially distributed ecohydrologic models are inherently constrained by the spatial resolution of their smallest units, below which land and processes are assumed to be homogeneous. At coarse scales, heterogeneity is often accounted for by computing stores and fluxes of interest over a distribution of land cover types (or other sources of heterogeneity) within spatially explicit modeling units. However, this approach ignores spatial organization and the lateral transfer of water and materials downslope. The challenge is to account both for the role of flow network topology and for fine-scale heterogeneity. We present a new approach that defines two levels of spatial aggregation and integrates a spatially explicit network approach with a flexible representation of finer-scale aspatial heterogeneity. Critically, this solution does not simply increase the resolution of the smallest spatial unit, and so, by comparison, results in improved computational efficiency. The approach is demonstrated by adapting the Regional Hydro-Ecologic Simulation System (RHESSys), an ecohydrologic model widely used to simulate climate, land use, and land management impacts. We illustrate the utility of our approach by showing how the model can be used to better characterize forest thinning impacts on ecohydrology. Forest thinning is typically done at the scale of individual trees, yet the management responses of interest include impacts on watershed-scale hydrology and on downslope riparian vegetation. Our approach allows us to characterize the variability in tree size/carbon reduction and water transfers between neighboring trees while still capturing hillslope- to watershed-scale effects. Our illustrative example demonstrates that accounting for these fine-scale effects can substantially alter model estimates, in some cases shifting the impacts of thinning on downslope water availability from increases to decreases.
We conclude by describing other use cases that may benefit from this approach including characterizing urban vegetation and storm water management features and their impact on watershed scale hydrology and biogeochemical cycling.
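The two levels of aggregation can be sketched abstractly: each spatially explicit unit carries an aspatial distribution of cover types for its vertical fluxes, while lateral transfers follow the explicit downslope topology between units. This is an illustrative bucket model under assumed coefficients, not RHESSys code.

```python
def route_watershed(units, topology, precip):
    """Two-level aggregation sketch.

    units:    dict unit_id -> {cover_type: (area_fraction, et_fraction)}
    topology: list of unit_ids ordered from upslope to downslope
    precip:   water input per unit

    Vertical fluxes (ET) are aggregated over each unit's aspatial cover
    distribution; the remainder is routed laterally to the next unit.
    """
    et_total, inflow = 0.0, 0.0
    for uid in topology:
        water = precip + inflow
        # aspatial level: aggregate ET over cover types within the unit
        et_frac = sum(frac * et for frac, et in units[uid].values())
        et = water * et_frac
        et_total += et
        inflow = water - et      # explicit level: downslope transfer
    return et_total, inflow      # water leaving the last unit = outlet flow

units = {
    "upslope":  {"forest": (0.7, 0.5), "thinned": (0.3, 0.3)},
    "riparian": {"forest": (1.0, 0.5)},
}
et, outlet = route_watershed(units, ["upslope", "riparian"], precip=100.0)
```

Changing the thinned fraction in the upslope unit changes both its own ET and the water delivered downslope, which is exactly the fine-scale-to-hillslope coupling the abstract describes.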
Accounting for small scale heterogeneity in ecohydrologic watershed models
NASA Astrophysics Data System (ADS)
Burke, W.; Tague, C.
2017-12-01
Spatially distributed ecohydrologic models are inherently constrained by the spatial resolution of their smallest units, below which land and processes are assumed to be homogeneous. At coarse scales, heterogeneity is often accounted for by computing stores and fluxes of interest over a distribution of land cover types (or other sources of heterogeneity) within spatially explicit modeling units. However, this approach ignores spatial organization and the lateral transfer of water and materials downslope. The challenge is to account both for the role of flow network topology and for fine-scale heterogeneity. We present a new approach that defines two levels of spatial aggregation and integrates a spatially explicit network approach with a flexible representation of finer-scale aspatial heterogeneity. Critically, this solution does not simply increase the resolution of the smallest spatial unit, and so, by comparison, results in improved computational efficiency. The approach is demonstrated by adapting the Regional Hydro-Ecologic Simulation System (RHESSys), an ecohydrologic model widely used to simulate climate, land use, and land management impacts. We illustrate the utility of our approach by showing how the model can be used to better characterize forest thinning impacts on ecohydrology. Forest thinning is typically done at the scale of individual trees, yet the management responses of interest include impacts on watershed-scale hydrology and on downslope riparian vegetation. Our approach allows us to characterize the variability in tree size/carbon reduction and water transfers between neighboring trees while still capturing hillslope- to watershed-scale effects. Our illustrative example demonstrates that accounting for these fine-scale effects can substantially alter model estimates, in some cases shifting the impacts of thinning on downslope water availability from increases to decreases.
We conclude by describing other use cases that may benefit from this approach including characterizing urban vegetation and storm water management features and their impact on watershed scale hydrology and biogeochemical cycling.
Baity-Jesi, Marco; Calore, Enrico; Cruz, Andres; Fernandez, Luis Antonio; Gil-Narvión, José Miguel; Gordillo-Guerrero, Antonio; Iñiguez, David; Maiorano, Andrea; Marinari, Enzo; Martin-Mayor, Victor; Monforte-Garcia, Jorge; Muñoz Sudupe, Antonio; Navarro, Denis; Parisi, Giorgio; Perez-Gaviro, Sergio; Ricci-Tersenghi, Federico; Ruiz-Lorenzo, Juan Jesus; Schifano, Sebastiano Fabio; Tarancón, Alfonso; Tripiccione, Raffaele; Yllanes, David
2017-01-01
We have performed a very accurate computation of the nonequilibrium fluctuation–dissipation ratio for the 3D Edwards–Anderson Ising spin glass, by means of large-scale simulations on the special-purpose computers Janus and Janus II. This ratio (computed for finite times on very large, effectively infinite, systems) is compared with the equilibrium probability distribution of the spin overlap for finite sizes. Our main result is a quantitative statics-dynamics dictionary, which could allow the experimental exploration of important features of the spin-glass phase without requiring uncontrollable extrapolations to infinite times or system sizes. PMID:28174274
Evolution of the ATLAS PanDA workload management system for exascale computational science
NASA Astrophysics Data System (ADS)
Maeno, T.; De, K.; Klimentov, A.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Schovancova, J.; Vaniachine, A.; Wenaus, T.; Yu, D.; Atlas Collaboration
2014-06-01
An important foundation underlying the impressive success of data processing and analysis in the ATLAS experiment [1] at the LHC [2] is the Production and Distributed Analysis (PanDA) workload management system [3]. PanDA was designed specifically for ATLAS and proved to be highly successful in meeting all the distributed computing needs of the experiment. However, the core design of PanDA is not experiment specific. The PanDA workload management system is capable of meeting the needs of other data intensive scientific applications. Alpha-Magnetic Spectrometer [4], an astro-particle experiment on the International Space Station, and the Compact Muon Solenoid [5], an LHC experiment, have successfully evaluated PanDA and are pursuing its adoption. In this paper, a description of the new program of work to develop a generic version of PanDA will be given, as well as the progress in extending PanDA's capabilities to support supercomputers and clouds and to leverage intelligent networking. PanDA has demonstrated at a very large scale the value of automated dynamic brokering of diverse workloads across distributed computing resources. The next generation of PanDA will allow other data-intensive sciences and a wider exascale community employing a variety of computing platforms to benefit from ATLAS' experience and proven tools.
NASA Astrophysics Data System (ADS)
Murga, Alicia; Sano, Yusuke; Kawamoto, Yoichi; Ito, Kazuhide
2017-10-01
Mechanical and passive ventilation strategies directly impact indoor air quality. Passive ventilation has recently become widespread owing to its ability to reduce energy demand in buildings, as in the case of natural or cross ventilation. To understand the effect of natural ventilation on indoor environmental quality, outdoor-indoor flow paths need to be analyzed as functions of urban atmospheric conditions, the topology of the built environment, and indoor conditions. Wind-driven natural ventilation (e.g., cross ventilation) can be calculated from the wind pressure coefficient distributions over a building's outdoor wall surfaces and openings, allowing the study of indoor air parameters and airborne contaminant concentrations. Variations in outside parameters directly impact indoor air quality and residents' health. Numerical modeling can help in understanding these various parameters because it allows full control of boundary conditions and sampling points. In this study, numerical weather prediction modeling was used to calculate wind profiles/distributions at the atmospheric scale, and computational fluid dynamics was used to model detailed urban and indoor flows; these were then integrated into a dynamic downscaling analysis to predict specific urban wind parameters from the atmospheric to the built-environment scale. Wind velocity and contaminant concentration distributions inside a factory building were analyzed to assess the quality of the human working environment by using a computer-simulated person. The impact of cross-ventilation flows and their variations on the local average contaminant concentration around a factory worker, and on the inhaled contaminant dose, are then discussed.
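The chain from wall-surface pressure coefficients to worker exposure can be illustrated with the standard orifice model for wind-driven cross ventilation; these are textbook relations, and the coefficients below are made-up values, not results from this study.

```python
import math

def cross_ventilation_flow(cp_windward, cp_leeward, wind_speed,
                           opening_area, discharge_coeff=0.6, rho=1.2):
    """Volumetric flow rate (m^3/s) through openings driven by the
    wind-pressure difference across the building envelope."""
    # pressure difference from wall-surface pressure coefficients (Pa)
    dp = 0.5 * rho * wind_speed**2 * (cp_windward - cp_leeward)
    # orifice equation for flow through the openings
    return discharge_coeff * opening_area * math.sqrt(2.0 * dp / rho)

# illustrative values: Cp = +0.7 windward, -0.3 leeward, 3 m/s wind, 2 m^2 opening
Q = cross_ventilation_flow(0.7, -0.3, 3.0, 2.0)

# steady-state well-mixed indoor concentration for a contaminant source S (kg/s);
# a CFD model resolves the spatial variation around the worker that this
# well-mixed estimate cannot
S = 0.0036
C = S / Q
```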
NASA Astrophysics Data System (ADS)
Plewa, Tomasz; Handy, Timothy; Odrzywolek, Andrzej
2014-09-01
We compute and discuss the process of nucleosynthesis in a series of core-collapse explosion models of a 15 solar mass, blue supergiant progenitor. We obtain nucleosynthetic yields and study the evolution of the chemical element distribution from the moment of core bounce until young supernova remnant phase. Our models show how the process of energy deposition due to radioactive decay modifies the dynamics and the core ejecta structure on small and intermediate scales. The results are compared against observations of young supernova remnants including Cas A and the recent data obtained for SN 1987A. The work has been supported by the NSF grant AST-1109113 and DOE grant DE-FG52-09NA29548. This research used resources of the National Energy Research Scientific Computing Center, which is supported by the U.S. DoE under Contract No. DE-AC02-05CH11231.
NASA Astrophysics Data System (ADS)
Negrut, Dan; Lamb, David; Gorsich, David
2011-06-01
This paper describes a software infrastructure made up of tools and libraries designed to assist developers in implementing computational dynamics applications running on heterogeneous and distributed computing environments. Together, these tools and libraries compose a so-called Heterogeneous Computing Template (HCT). The heterogeneous and distributed computing hardware infrastructure is assumed herein to be made up of a combination of CPUs and Graphics Processing Units (GPUs). The computational dynamics applications targeted to execute on such a hardware topology include many-body dynamics, smoothed-particle hydrodynamics (SPH) fluid simulation, and fluid-solid interaction analysis. The underlying theme of the solution approach embraced by HCT is that of partitioning the domain of interest into a number of subdomains that are each managed by a separate core/accelerator (CPU/GPU) pair. Four components at the core of HCT enable the envisioned distributed computing approach to large-scale dynamical system simulation: (a) the ability to partition the problem according to the one-to-one mapping, i.e., spatial subdivision, discussed above (pre-processing); (b) a protocol for passing data between any two co-processors; (c) algorithms for element proximity computation; and (d) the ability to carry out post-processing in a distributed fashion. In this contribution, components (a) and (b) of the HCT are demonstrated via the example of the Discrete Element Method (DEM) for rigid body dynamics with friction and contact. The collision detection task required in frictional-contact dynamics (task (c) above) is shown to gain two orders of magnitude in efficiency on the GPU compared to traditional sequential implementations. Note: Reference herein to any specific commercial products, process, or service by trade name, trademark, manufacturer, or otherwise, does not imply its endorsement, recommendation, or favoring by the United States Army.
The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Army, and shall not be used for advertising or product endorsement purposes.
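Component (c), element proximity computation, is typically accelerated with uniform spatial subdivision: bin elements into cells the size of the interaction distance, then test only pairs in the same or adjacent cells. A serial 2D sketch of the idea follows; the HCT itself runs this kind of computation on GPUs.

```python
from itertools import product
from math import floor, hypot

def proximity_pairs(points, cutoff):
    """Return index pairs of points closer than `cutoff`, found via
    uniform-grid binning with cell size equal to the cutoff distance."""
    grid = {}
    for i, (x, y) in enumerate(points):
        grid.setdefault((floor(x / cutoff), floor(y / cutoff)), []).append(i)
    pairs = set()
    for (cx, cy), members in grid.items():
        # candidates live in this cell or one of its 8 neighbors
        for dx, dy in product((-1, 0, 1), repeat=2):
            for j in grid.get((cx + dx, cy + dy), ()):
                for i in members:
                    if i < j and hypot(points[i][0] - points[j][0],
                                       points[i][1] - points[j][1]) < cutoff:
                        pairs.add((i, j))
    return pairs

points = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.4, 5.0), (9.0, 0.0)]
pairs = proximity_pairs(points, cutoff=0.6)
```

For bounded element density this runs in near-linear time, versus the quadratic all-pairs check, which is the head room that the reported GPU speedups exploit at scale.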
Bayesian approach for three-dimensional aquifer characterization at the Hanford 300 Area
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murakami, Haruko; Chen, X.; Hahn, Melanie S.
2010-10-21
This study presents a stochastic, three-dimensional characterization of a heterogeneous hydraulic conductivity field within DOE's Hanford 300 Area site, Washington, by assimilating large-scale, constant-rate injection test data with small-scale, three-dimensional electromagnetic borehole flowmeter (EBF) measurement data. We first inverted the injection test data to estimate the transmissivity field, using zeroth-order temporal moments of pressure buildup curves. We applied a newly developed Bayesian geostatistical inversion framework, the method of anchored distributions (MAD), to obtain a joint posterior distribution of geostatistical parameters and local log-transmissivities at multiple locations. The unique aspects of MAD that make it suitable for this purpose are its ability to integrate multi-scale, multi-type data within a Bayesian framework and to compute a nonparametric posterior distribution. After we combined the distribution of transmissivities with the depth-discrete relative-conductivity profile from EBF data, we inferred the three-dimensional geostatistical parameters of the log-conductivity field, using Bayesian model-based geostatistics. Such consistent use of the Bayesian approach throughout the procedure enabled us to systematically incorporate data uncertainty into the final posterior distribution. The method was tested in a synthetic study and validated using actual data that were not part of the estimation. Results showed broader and skewed posterior distributions of geostatistical parameters except for the mean, which suggests the importance of inferring the entire distribution to quantify the parameter uncertainty.
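The zeroth-order temporal moment used in the first inversion step is simply the time integral of the pressure-buildup curve, computed numerically from the monitored record, e.g. by the trapezoid rule. This is a generic sketch on a synthetic curve; the mapping from moments to transmissivity is the model-specific part of the paper.

```python
def zeroth_moment(times, buildup):
    """Zeroth-order temporal moment: the integral of the pressure-buildup
    curve s(t) over the record, estimated by the trapezoid rule."""
    m0 = 0.0
    for k in range(1, len(times)):
        m0 += 0.5 * (buildup[k] + buildup[k - 1]) * (times[k] - times[k - 1])
    return m0

# synthetic buildup curve s(t) = t * (2 - t) on [0, 2]; exact integral is 4/3
n = 2000
times = [2.0 * k / n for k in range(n + 1)]
buildup = [t * (2.0 - t) for t in times]
m0 = zeroth_moment(times, buildup)
```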
A numerical differentiation library exploiting parallel architectures
NASA Astrophysics Data System (ADS)
Voglis, C.; Hadjidoukas, P. E.; Lagaris, I. E.; Papageorgiou, D. G.
2009-08-01
We present a software library for numerically estimating first and second order partial derivatives of a function by finite differencing. Various truncation schemes are offered, resulting in corresponding formulas that are accurate to order O(h), O(h²), and O(h⁴), h being the differencing step. The derivatives are calculated via forward, backward and central differences. Care has been taken that only feasible points are used in the case where bound constraints are imposed on the variables. The Hessian may be approximated either from function or from gradient values. There are three versions of the software: a sequential version, an OpenMP version for shared memory architectures and an MPI version for distributed systems (clusters). The parallel versions exploit the multiprocessing capability offered by computer clusters as well as modern multi-core systems, and due to the independent character of the derivative computation, the speedup scales almost linearly with the number of available processors/cores. Program summary: Program title: NDL (Numerical Differentiation Library). Catalogue identifier: AEDG_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDG_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html. No. of lines in distributed program, including test data, etc.: 73 030. No. of bytes in distributed program, including test data, etc.: 630 876. Distribution format: tar.gz. Programming language: ANSI FORTRAN-77, ANSI C, MPI, OpenMP. Computer: Distributed systems (clusters), shared memory systems. Operating system: Linux, Solaris. Has the code been vectorised or parallelized?: Yes. RAM: The library uses O(N) internal storage, N being the dimension of the problem. Classification: 4.9, 4.14, 6.5. Nature of problem: The numerical estimation of derivatives at several accuracy levels is a common requirement in many computational tasks, such as optimization, solution of nonlinear systems, etc. A parallel implementation that exploits systems with multiple CPUs is very important for large-scale and computationally expensive problems. Solution method: Finite differencing is used with a carefully chosen step that minimizes the sum of the truncation and round-off errors. The parallel versions employ both OpenMP and MPI libraries. Restrictions: The library uses only double precision arithmetic. Unusual features: The software takes into account bound constraints, in the sense that only feasible points are used to evaluate the derivatives, and given the level of the desired accuracy, the proper formula is automatically employed. Running time: Running time depends on the function's complexity. The test run took 15 ms for the serial distribution, 0.6 s for the OpenMP and 4.2 s for the MPI parallel distribution on 2 processors.
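NDL's own step selection and formula switching are more elaborate, but the core idea of bound-respecting finite differencing can be sketched as follows (the function name and the simple h ~ sqrt(eps) step rule are our illustrative choices, not NDL's):

```python
import numpy as np

def gradient_fd(f, x, lower, upper):
    """Finite-difference gradient that only evaluates f at feasible points.
    Central differences (O(h^2)) are used when both neighbors are feasible;
    otherwise the scheme falls back to forward or backward differences (O(h))."""
    x = np.asarray(x, dtype=float)
    eps = np.finfo(float).eps
    g = np.empty_like(x)
    for i in range(x.size):
        # Step near the round-off/truncation trade-off optimum.
        h = np.sqrt(eps) * max(1.0, abs(x[i]))
        e = np.zeros_like(x)
        e[i] = h
        can_up = x[i] + h <= upper[i]
        can_dn = x[i] - h >= lower[i]
        if can_up and can_dn:
            g[i] = (f(x + e) - f(x - e)) / (2.0 * h)   # central
        elif can_up:
            g[i] = (f(x + e) - f(x)) / h               # forward
        else:
            g[i] = (f(x) - f(x - e)) / h               # backward
    return g
```

At a point sitting on its upper bound, only the backward formula is applied, mirroring the library's feasible-points-only policy.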
NASA Astrophysics Data System (ADS)
Guenther, A. B.; Duhl, T.
2011-12-01
Increasing computational resources have enabled a steady improvement in the spatial resolution used for earth system models. Land surface models and landcover distributions have kept ahead by providing higher spatial resolution than typically used in these models. Satellite observations have played a major role in providing high resolution landcover distributions over large regions or the entire earth surface but ground observations are needed to calibrate these data and provide accurate inputs for models. As our ability to resolve individual landscape components improves, it is important to consider what scale is sufficient for providing inputs to earth system models. The required spatial scale is dependent on the processes being represented and the scientific questions being addressed. This presentation will describe the development a contiguous U.S. landcover database using high resolution imagery (1 to 1000 meters) and surface observations of species composition and other landcover characteristics. The database includes plant functional types and species composition and is suitable for driving land surface models (CLM and MEGAN) that predict land surface exchange of carbon, water, energy and biogenic reactive gases (e.g., isoprene, sesquiterpenes, and NO). We investigate the sensitivity of model results to landcover distributions with spatial scales ranging over six orders of magnitude (1 meter to 1000000 meters). The implications for predictions of regional climate and air quality will be discussed along with recommendations for regional and global earth system modeling.
NASA Astrophysics Data System (ADS)
Ying, Shen; Li, Lin; Gao, Yurong
2009-10-01
Spatial visibility analysis is an important direction in the study of pedestrian behavior, because visual perception of space is the most direct way to acquire environmental information and navigate one's actions. Based on agent modeling and a top-down method, this paper develops a framework for analyzing visibility-dependent pedestrian flow. We use viewsheds in the visibility analysis and impose the resulting parameters on agent simulation to direct agent motion in urban space. We analyze pedestrian behavior at the micro-scale and macro-scale of urban open space. An individual agent uses visual affordance to determine its direction of motion in a micro-scale urban street or district. We then compare the distribution of pedestrian flow with spatial configuration in the macro-scale urban environment, and mine the relationship between pedestrian flow and the distribution of urban facilities and urban functions. The paper first computes the visibility conditions at vantage points in urban open space, such as a street network, and quantifies the visibility parameters. The multiple agents use these visibility parameters to decide their directions of motion, and pedestrian flow finally reaches a stable state in the urban environment through multi-agent simulation. The paper compares the morphology of the visibility parameters and the pedestrian distribution with urban function and facility layout to confirm the consistency between them, which can be used for decision support in urban design.
Future in biomolecular computation
NASA Astrophysics Data System (ADS)
Wimmer, E.
1988-01-01
Large-scale computations for biomolecules are dominated by three levels of theory: rigorous quantum mechanical calculations for molecules with up to about 30 atoms, semi-empirical quantum mechanical calculations for systems with up to several hundred atoms, and force-field molecular dynamics studies of biomacromolecules with 10,000 atoms and more including surrounding solvent molecules. It can be anticipated that increased computational power will allow the treatment of larger systems of ever growing complexity. Due to the scaling of the computational requirements with increasing number of atoms, the force-field approaches will benefit the most from increased computational power. On the other hand, progress in methodologies such as density functional theory will enable us to treat larger systems on a fully quantum mechanical level and a combination of molecular dynamics and quantum mechanics can be envisioned. One of the greatest challenges in biomolecular computation is the protein folding problem. It is unclear at this point, if an approach with current methodologies will lead to a satisfactory answer or if unconventional, new approaches will be necessary. In any event, due to the complexity of biomolecular systems, a hierarchy of approaches will have to be established and used in order to capture the wide ranges of length-scales and time-scales involved in biological processes. In terms of hardware development, speed and power of computers will increase while the price/performance ratio will become more and more favorable. Parallelism can be anticipated to become an integral architectural feature in a range of computers. It is unclear at this point, how fast massively parallel systems will become easy enough to use so that new methodological developments can be pursued on such computers. 
Current trends show that distributed processing, such as the combination of convenient graphics workstations and powerful general-purpose supercomputers, will lead to a new style of computing in which calculations are monitored and manipulated as they proceed. The combination of a numeric approach with artificial-intelligence approaches can be expected to open up entirely new possibilities. Ultimately, the most exciting aspect of the future in biomolecular computing will be the unexpected discoveries.
Scale-dependent coupling of hysteretic capillary pressure, trapping, and fluid mobilities
NASA Astrophysics Data System (ADS)
Doster, F.; Celia, M. A.; Nordbotten, J. M.
2012-12-01
Many applications of multiphase flow in porous media, including CO2 storage and enhanced oil recovery, require mathematical models that span a large range of length scales. In the context of numerical simulations, practical grid sizes are often on the order of tens of meters, thereby de facto defining a coarse model scale. Under particular conditions, it is possible to approximate the sub-grid-scale distribution of the fluid saturation within a grid cell; that reconstructed saturation can then be used to compute effective properties at the coarse scale. If both the density difference between the fluids and the vertical extent of the grid cell are large, and buoyant segregation within the cell occurs on a sufficiently short time scale, then the phase pressure distributions are essentially hydrostatic and the saturation profile can be reconstructed from the inferred capillary pressures. However, the saturation reconstruction may not be unique, because the parameters and parameter functions of classical formulations of two-phase flow in porous media - the relative permeability functions, the capillary pressure-saturation relationship, and the residual saturations - show path dependence, i.e. their values depend not only on the state variables but also on their drainage and imbibition histories. In this study we focus on capillary pressure hysteresis and trapping and show that the contribution of hysteresis to effective quantities depends on the vertical length scale. By studying the transition between the two extreme cases - the homogeneous saturation distribution for small vertical extents and the completely segregated distribution for large extents - we identify how hysteretic capillary pressure at the local scale induces hysteresis in all coarse-scale quantities for medium vertical extents and finally vanishes for large vertical extents.
Our results allow for more accurate vertically integrated modeling while improving our understanding of the coupling of capillary pressure and relative permeabilities over larger length scales.
Coupling DAEM and CFD for simulating biomass fast pyrolysis in fluidized beds
Xiong, Qingang; Zhang, Jingchao; Wiggins, Gavin; ...
2015-12-03
We report results from computational simulations of an experimental, lab-scale bubbling bed biomass pyrolysis reactor that include a distributed activation energy model (DAEM) for the kinetics. In this study, we utilized multiphase computational fluid dynamics (CFD) to account for the turbulent hydrodynamics, and this was combined with the DAEM kinetics in a multi-component, multi-step reaction network. Our results indicate that it is possible to numerically integrate the coupled CFD-DAEM system without significantly increasing computational overhead. It is also clear, however, that reactor operating conditions, reaction kinetics, and multiphase flow dynamics all have major impacts on the pyrolysis products exiting the reactor. We find that, with the same pre-exponential factors and mean activation energies, inclusion of distributed activation energies in the kinetics can shift the predicted average value of the exit vapor-phase tar flux and its statistical distribution, compared to single-valued activation-energy kinetics. Perhaps the most interesting observed trend is that increasing the diversity of the DAEM activation energies appears to increase the mean tar yield, all else being equal. As a result, these findings imply that accurate resolution of the reaction activation energy distributions will be important for optimizing biomass pyrolysis processes.
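The contrast between single-valued and distributed activation energies can be illustrated with a toy first-order DAEM (a sketch only; all parameter values below are hypothetical, not the paper's calibrated biomass kinetics):

```python
import numpy as np

R = 8.314  # gas constant, J/(mol K)

def daem_conversion(t, T, A, E_mean, E_sigma, n_samples=2000, seed=0):
    """Distributed activation energy model for a first-order reaction:
    the converted fraction is one minus the expectation of
    exp(-A t exp(-E/RT)) over a Gaussian distribution of activation
    energies E. E_sigma = 0 recovers single-valued Arrhenius kinetics."""
    rng = np.random.default_rng(seed)
    E = rng.normal(E_mean, E_sigma, n_samples)
    k = A * np.exp(-E / (R * T))        # per-sample rate constants
    return 1.0 - float(np.mean(np.exp(-k * t)))
```

Because exp(-E/RT) is nonlinear in E, broadening the distribution shifts the predicted conversion away from the single-valued result even at the same mean activation energy, which is the qualitative effect the abstract reports for tar flux.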
NASA Astrophysics Data System (ADS)
Zhang, Ning; Du, Yunsong; Miao, Shiguang; Fang, Xiaoyi
2016-08-01
The simulation performance over complex building clusters of a wind simulation model (Wind Information Field Fast Analysis model, WIFFA) in a micro-scale air pollutant dispersion model system (Urban Microscale Air Pollution dispersion Simulation model, UMAPS) is evaluated using various wind tunnel experimental data, including the CEDVAL (Compilation of Experimental Data for Validation of Micro-Scale Dispersion Models) wind tunnel experiment data and the NJU-FZ experiment data (Nanjing University-Fang Zhuang neighborhood wind tunnel experiment data). The results show that the wind model can reproduce the vortexes triggered by urban buildings well, and the flow patterns in urban street canyons and building clusters can also be represented. Due to the complex shapes of buildings and their distributions, the simulation deviations/discrepancies from the measurements are usually caused by the simplification of the building shapes and the determination of the key zone sizes. The computational efficiencies of different cases are also discussed in this paper. The model has a high computational efficiency compared to traditional numerical models that solve the Navier-Stokes equations, and can produce very high-resolution (1-5 m) wind fields of a complex neighborhood-scale urban building canopy (~1 km × 1 km) in less than 3 min when run on a personal computer.
HammerCloud: A Stress Testing System for Distributed Analysis
NASA Astrophysics Data System (ADS)
van der Ster, Daniel C.; Elmsheuser, Johannes; Úbeda García, Mario; Paladin, Massimo
2011-12-01
Distributed analysis of LHC data is an I/O-intensive activity which places large demands on the internal network, storage, and local disks at remote computing facilities. Commissioning and maintaining a site to provide an efficient distributed analysis service is therefore a challenge which can be aided by tools to help evaluate a variety of infrastructure designs and configurations. HammerCloud is one such tool; it is a stress-testing service which is used by central operations teams, regional coordinators, and local site admins to (a) submit an arbitrary number of analysis jobs to a number of sites, (b) maintain at steady state a predefined number of jobs running at the sites under test, (c) produce web-based reports summarizing the efficiency and performance of the sites under test, and (d) present a web interface for historical test results to both evaluate progress and compare sites. HammerCloud was built around the distributed analysis framework Ganga, exploiting its API for grid job management. HammerCloud has been employed by the ATLAS experiment for continuous testing of many sites worldwide, and also during large-scale computing challenges such as STEP'09 and UAT'09, where the scale of the tests exceeded 10,000 concurrently running jobs and 1,000,000 total jobs over multi-day periods. In addition, HammerCloud is being adopted by the CMS experiment; the plugin structure of HammerCloud allows the execution of CMS jobs using their official tool (CRAB).
Distribution of shortest cycle lengths in random networks
NASA Astrophysics Data System (ADS)
Bonneau, Haggai; Hassid, Aviv; Biham, Ofer; Kühn, Reimer; Katzav, Eytan
2017-12-01
We present analytical results for the distribution of shortest cycle lengths (DSCL) in random networks. The approach is based on the relation between the DSCL and the distribution of shortest path lengths (DSPL). We apply this approach to configuration model networks, for which analytical results for the DSPL were obtained before. We first calculate the fraction of nodes in the network which reside on at least one cycle. Conditioning on being on a cycle, we provide the DSCL over ensembles of configuration model networks with degree distributions which follow a Poisson distribution (Erdős-Rényi network), a degenerate distribution (random regular graph), and a power-law distribution (scale-free network). The mean and variance of the DSCL are calculated. The analytical results are found to be in very good agreement with the results of computer simulations.
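For intuition about the quantity being distributed, the shortest cycle through a given node of an unweighted graph can be computed directly, as in the simulations the analytical DSCL is compared against. A simple per-node approach (our own illustrative implementation, not the paper's code) deletes each incident edge in turn and BFS-searches for the shortest surviving path back:

```python
from collections import deque

def shortest_cycle_length(adj, v):
    """Length of the shortest cycle through node v in an undirected graph
    given as an adjacency dict, or None if v lies on no cycle. For each
    edge (v, u), delete it and BFS for the shortest remaining v-u path;
    that path plus the deleted edge closes a cycle of length dist + 1."""
    best = None
    for u in adj[v]:
        dist = {v: 0}
        queue = deque([v])
        while queue:
            w = queue.popleft()
            for x in adj[w]:
                if (w, x) in ((v, u), (u, v)):   # skip the deleted edge
                    continue
                if x not in dist:
                    dist[x] = dist[w] + 1
                    queue.append(x)
        if u in dist:
            cycle = dist[u] + 1
            best = cycle if best is None or cycle < best else best
    return best
```

Collecting this value over all cycle-bearing nodes of a sampled configuration model network yields an empirical DSCL to check against the analytical one.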
Proceedings of the 14th International Conference on the Numerical Simulation of Plasmas
NASA Astrophysics Data System (ADS)
Partial Contents are as follows: Numerical Simulations of the Vlasov-Maxwell Equations by Coupled Particle-Finite Element Methods on Unstructured Meshes; Electromagnetic PIC Simulations Using Finite Elements on Unstructured Grids; Modelling Travelling Wave Output Structures with the Particle-in-Cell Code CONDOR; SST--A Single-Slice Particle Simulation Code; Graphical Display and Animation of Data Produced by Electromagnetic, Particle-in-Cell Codes; A Post-Processor for the PEST Code; Gray Scale Rendering of Beam Profile Data; A 2D Electromagnetic PIC Code for Distributed Memory Parallel Computers; 3-D Electromagnetic PIC Simulation on the NRL Connection Machine; Plasma PIC Simulations on MIMD Computers; Vlasov-Maxwell Algorithm for Electromagnetic Plasma Simulation on Distributed Architectures; MHD Boundary Layer Calculation Using the Vortex Method; and Eulerian Codes for Plasma Simulations.
NASA Astrophysics Data System (ADS)
Shi, X.
2015-12-01
As NSF indicated, "Theory and experimentation have for centuries been regarded as two fundamental pillars of science. It is now widely recognized that computational and data-enabled science forms a critical third pillar." Geocomputation is the third pillar of GIScience and the geosciences. With the exponential growth of geodata, the challenge of scalable, high-performance computing for big data analytics has become urgent, because many research activities are constrained by software and tools that cannot even complete the computation process. Heterogeneous geodata integration and analytics obviously magnify the complexity and the operational time frame. Many large-scale geospatial problems may not be processable at all if the computer system does not have sufficient memory or computational power. Emerging computer architectures, such as Intel's Many Integrated Core (MIC) architecture and the Graphics Processing Unit (GPU), and advanced computing technologies provide promising solutions that employ massive parallelism and hardware resources to achieve scalability and high performance for data-intensive computing over large spatiotemporal and social media data. Exploring novel algorithms and deploying the solutions in massively parallel computing environments to achieve scalable data processing and analytics over large-scale, complex, and heterogeneous geodata with consistent quality and high performance has been the central theme of our research team in the Department of Geosciences at the University of Arkansas (UARK). New multi-core architectures combined with application accelerators hold the promise of achieving scalability and high performance by exploiting task- and data-level parallelism that is not supported by conventional computing systems.
Such a parallel or distributed computing environment is particularly suitable for large-scale geocomputation over big data as proved by our prior works, while the potential of such advanced infrastructure remains unexplored in this domain. Within this presentation, our prior and on-going initiatives will be summarized to exemplify how we exploit multicore CPUs, GPUs, and MICs, and clusters of CPUs, GPUs and MICs, to accelerate geocomputation in different applications.
Extending the length and time scales of Gram–Schmidt Lyapunov vector computations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Costa, Anthony B., E-mail: acosta@northwestern.edu; Green, Jason R., E-mail: jason.green@umb.edu; Department of Chemistry, University of Massachusetts Boston, Boston, MA 02125
Lyapunov vectors have found growing interest recently due to their ability to characterize systems out of thermodynamic equilibrium. The computation of orthogonal Gram–Schmidt vectors requires multiplication and QR decomposition of large matrices, which grow as N² (with the particle count). This expense has limited such calculations to relatively small systems and short time scales. Here, we detail two implementations of an algorithm for computing Gram–Schmidt vectors. The first is a distributed-memory message-passing method using ScaLAPACK. The second uses the newly released MAGMA library for GPUs. We compare the performance of both codes for Lennard–Jones fluids from N=100 to 1300 between Intel Nehalem/InfiniBand DDR and NVIDIA C2050 architectures. To the best of our knowledge, these are the largest systems for which the Gram–Schmidt Lyapunov vectors have been computed, and the first time their calculation has been GPU-accelerated. We conclude that Lyapunov vector calculations can be significantly extended in length and time by leveraging the power of GPU-accelerated linear algebra.
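The Gram–Schmidt (Benettin-style) scheme being accelerated amounts to repeatedly pushing an orthonormal tangent-space frame through the dynamics and re-orthonormalizing it by QR; the averaged logs of the R diagonal give the Lyapunov exponents. A serial sketch for a discrete map (illustrative only; the paper's ScaLAPACK and MAGMA codes distribute exactly this linear algebra for molecular dynamics):

```python
import numpy as np

def lyapunov_qr(jacobian, x0, step, n_steps):
    """Estimate the Lyapunov spectrum of a map x -> step(x) with Jacobian
    jacobian(x), via repeated QR re-orthonormalization of a tangent frame."""
    x = np.asarray(x0, dtype=float)
    dim = x.size
    Q = np.eye(dim)
    sums = np.zeros(dim)
    for _ in range(n_steps):
        J = jacobian(x)
        Q, R = np.linalg.qr(J @ Q)
        sign = np.sign(np.diag(R))
        sign[sign == 0] = 1.0
        Q = Q * sign                       # keep column orientation consistent
        sums += np.log(np.abs(np.diag(R))) # stretching factors along the frame
        x = step(x)
    return sums / n_steps
```

For the Hénon map the exponents must sum to log|det J| = log 0.3 per step, which gives a cheap correctness check.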
Dynamic Load Balancing for Grid Partitioning on a SP-2 Multiprocessor: A Framework
NASA Technical Reports Server (NTRS)
Sohn, Andrew; Simon, Horst; Lasinski, T. A. (Technical Monitor)
1994-01-01
Computational requirements of full-scale computational fluid dynamics change as computation progresses on a parallel machine. The change in computational intensity causes workload imbalance across processors, which in turn requires a large amount of data movement at runtime. If parallel CFD is to be successful on a parallel or massively parallel machine, balancing of the runtime load is indispensable. Here a framework for dynamic load balancing for CFD applications, called Jove, is presented. One processor is designated as the decision maker, Jove, while the others are assigned to computational fluid dynamics. Processors running CFD send flags to Jove at a predetermined number of iterations to initiate load balancing. Jove starts working on load balancing while the other processors continue working with the current data and load distribution. Jove goes through several steps to decide whether the new data should be taken, including preliminary evaluation, partitioning, processor reassignment, cost evaluation, and decision. Jove running on a single IBM SP2 node has been completely implemented. Preliminary experimental results show that the Jove approach to dynamic load balancing can be effective for full-scale grid partitioning on the target machine, IBM SP2.
The European computer driving licence and the use of computers by dental students.
Antonarakis, G S
2009-02-01
The use of computers within the dental curriculum is vital for many aspects of students' studies. The aim of this study was to assess how dental students who had obtained the European computer driving licence (ECDL) qualification (an internationally recognised standard of competence) through taught courses felt about the qualification, and how it changed their habits vis-à-vis computers and information and communication technology. This study was carried out as a descriptive, one-off, cross-sectional survey. A questionnaire was distributed to 100 students who had successfully completed the course, with questions pertaining to the use of email, word processing and the Internet for coursework, Medline for research, computer-based learning, online lecture notes, and online communication with members of staff, both before and after ECDL qualification. Scaled responses were given. The attitudes of students towards the course were also assessed. The frequencies and percentage distributions of the responses to each question were analysed. It was found that dental students who follow ECDL teaching and successfully complete its requirements seem to increase the frequency with which they use email, word processing and the Internet for coursework, Medline for research purposes, computer-based learning, online lecture notes, and online communication with staff. Opinions about the ECDL course varied, with many dental students finding the course easy, enjoying it only a little, but admitting that it improved their computer skills.
Parallel Computing for Probabilistic Response Analysis of High Temperature Composites
NASA Technical Reports Server (NTRS)
Sues, R. H.; Lua, Y. J.; Smith, M. D.
1994-01-01
The objective of this Phase I research was to establish the required software and hardware strategies to achieve large scale parallelism in solving PCM problems. To meet this objective, several investigations were conducted. First, we identified the multiple levels of parallelism in PCM and the computational strategies to exploit these parallelisms. Next, several software and hardware efficiency investigations were conducted. These involved the use of three different parallel programming paradigms and solution of two example problems on both a shared-memory multiprocessor and a distributed-memory network of workstations.
Exploring Contextual Models in Chemical Patent Search
NASA Astrophysics Data System (ADS)
Urbain, Jay; Frieder, Ophir
We explore the development of probabilistic retrieval models for integrating term statistics with entity search using multiple levels of document context to improve the performance of chemical patent search. A distributed indexing model was developed to enable efficient named entity search and aggregation of term statistics at multiple levels of patent structure including individual words, sentences, claims, descriptions, abstracts, and titles. The system can be scaled to an arbitrary number of compute instances in a cloud computing environment to support concurrent indexing and query processing operations on large patent collections.
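The multi-level aggregation described above can be sketched as separate term-frequency tables per structural context, so a retrieval model can weight a match by where in the patent it occurs (the field names and tokenization below are illustrative assumptions, not the paper's actual index schema):

```python
from collections import Counter

# Structural levels of a patent, finest to coarsest (illustrative schema).
LEVELS = ("title", "abstract", "claims", "description")

def index_term_stats(patent):
    """Aggregate term frequencies separately at each structural level of a
    patent, given as a dict mapping level name -> list of sentences."""
    stats = {}
    for level in LEVELS:
        counts = Counter()
        for sentence in patent.get(level, ()):
            counts.update(sentence.lower().split())
        stats[level] = counts
    return stats
```

A query-time scorer could then combine these per-level statistics, e.g. giving claim matches more weight than description matches.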
DOE Office of Scientific and Technical Information (OSTI.GOV)
De Supinski, B.; Caliga, D.
2017-09-28
The primary objective of this project was to develop memory optimization technology to efficiently deliver data to, and distribute data within, the SRC-6's Field Programmable Gate Array ("FPGA")-based Multi-Adaptive Processors (MAPs). The hardware/software approach was to explore efficient MAP configurations and generate the compiler technology to exploit those configurations. This memory accessing technology represents an important step towards making reconfigurable symmetric multi-processor (SMP) architectures a cost-effective solution for large-scale scientific computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wu, Chase Qishi; Zhu, Michelle Mengxia
The advent of large-scale collaborative scientific applications has demonstrated the potential for broad scientific communities to pool globally distributed resources to produce unprecedented data acquisition, movement, and analysis. System resources including supercomputers, data repositories, computing facilities, network infrastructures, storage systems, and display devices have been increasingly deployed at national laboratories and academic institutes. These resources are typically shared by large communities of users over the Internet or dedicated networks and hence exhibit an inherent dynamic nature in their availability, accessibility, capacity, and stability. Scientific applications using either experimental facilities or computation-based simulations with various physical, chemical, climatic, and biological models feature diverse scientific workflows, as simple as linear pipelines or as complex as directed acyclic graphs, which must be executed and supported over wide-area networks with massively distributed resources. Application users oftentimes need to manually configure their computing tasks over networks in an ad hoc manner, hence significantly limiting the productivity of scientists and constraining the utilization of resources. The success of these large-scale distributed applications requires a highly adaptive and massively scalable workflow platform that provides automated and optimized computing and networking services. This project is to design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a web-based user interface specially tailored for a target application, a set of user libraries, and several easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in heterogeneous high-performance network environments.
SWAMP will enable the automation and management of the entire process of scientific workflows with the convenience of a few mouse clicks while hiding the implementation and technical details from end users. Particularly, we will consider two types of applications with distinct performance requirements: data-centric and service-centric applications. For data-centric applications, the main workflow task involves large-volume data generation, catalog, storage, and movement, typically from supercomputers or experimental facilities to a team of geographically distributed users; while for service-centric applications, the main focus of the workflow is on data archiving, preprocessing, filtering, synthesis, visualization, and other application-specific analysis. We will conduct a comprehensive comparison of existing workflow systems and choose the best-suited one with open-source code, a flexible system structure, and a large user base as the starting point for our development. Based on the chosen system, we will develop and integrate new components including a black-box design of computing modules, performance monitoring and prediction, and workflow optimization and reconfiguration, which are missing from existing workflow systems. A modular design for separating specification, execution, and monitoring aspects will be adopted to establish a common generic infrastructure suited for a wide spectrum of science applications. We will further design and develop efficient workflow mapping and scheduling algorithms to optimize the workflow performance in terms of minimum end-to-end delay, maximum frame rate, and highest reliability. We will develop and demonstrate the SWAMP system in a local environment, the grid network, and the 100 Gbps Advanced Network Initiative (ANI) testbed. The demonstration will target scientific applications in climate modeling and high energy physics, and the functions to be demonstrated include workflow deployment, execution, steering, and reconfiguration.
Throughout the project period, we will work closely with the science communities in the fields of climate modeling and high energy physics, including the Spallation Neutron Source (SNS) and Large Hadron Collider (LHC) projects, to mature the system for production use.
NASA Technical Reports Server (NTRS)
2001-01-01
This document presents the full-scale analyses of the CFD RSRM. The RSRM model was developed with a 20 second burn time. The following are presented as part of the full-scale analyses: (1) RSRM embedded inclusion analysis; (2) RSRM igniter nozzle design analysis; (3) Nozzle Joint 4 erosion anomaly; (4) RSRM full motor port slag accumulation analysis; (5) RSRM motor analysis of two-phase flow in the aft segment/submerged nozzle region; (6) Completion of 3-D Analysis of the hot air nozzle manifold; (7) Bates Motor distributed combustion test case; and (8) Three Dimensional Polysulfide Bump Analysis.
Investigations of grain size dependent sediment transport phenomena on multiple scales
NASA Astrophysics Data System (ADS)
Thaxton, Christopher S.
Sediment transport processes in coastal and fluvial environments resulting from disturbances such as urbanization, mining, agriculture, military operations, and climatic change have significant impact on local, regional, and global environments. Primarily, these impacts include the erosion and deposition of sediment, channel network modification, reduction in downstream water quality, and the delivery of chemical contaminants. The scale and spatial distribution of these effects are largely attributable to the size distribution of the sediment grains that become eligible for transport. An improved understanding of advective and diffusive grain-size-dependent sediment transport phenomena will lead to the development of more accurate predictive models and more effective control measures. To this end, three studies were performed that investigated grain-size-dependent sediment transport on three different scales. Discrete particle computer simulations of sheet flow bedload transport on the scale of 0.1-100 millimeters were performed on a heterogeneous population of grains of various grain sizes. The relative transport rates and diffusivities of grains under both oscillatory and uniform, steady flow conditions were quantified. These findings suggest that boundary layer formalisms should describe surface roughness through a representative grain size that is functionally dependent on the applied flow parameters. On the scale of 1-10 m, experiments were performed to quantify the hydrodynamics and sediment capture efficiency of various baffles installed in a sediment retention pond, a commonly used sedimentation control measure in watershed applications. Analysis indicates that an optimum sediment capture effectiveness may be achieved based on baffle permeability, pond geometry and flow rate. Finally, on the scale of 10-1,000 m, a distributed, bivariate watershed terrain evolution module was developed within GRASS GIS.
Simulation results for variable grain sizes and for distributed rainfall infiltration and land cover matched observations. Although a unique set of governing equations applies to each scale, an improved physics-based understanding of small and medium scale behavior may yield more accurate parameterization of key variables used in large scale predictive models.
Argonne simulation framework for intelligent transportation systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ewing, T.; Doss, E.; Hanebutte, U.
1996-04-01
A simulation framework has been developed which defines a high-level architecture for a large-scale, comprehensive, scalable simulation of an Intelligent Transportation System (ITS). The simulator is designed to run on parallel computers and distributed (networked) computer systems; however, a version for a stand-alone workstation is also available. The ITS simulator includes an Expert Driver Model (EDM) of instrumented "smart" vehicles with in-vehicle navigation units. The EDM is capable of performing optimal route planning and communicating with Traffic Management Centers (TMC). A dynamic road map database is used for optimum route planning, where the data is updated periodically to reflect any changes in road or weather conditions. The TMC has probe vehicle tracking capabilities (display position and attributes of instrumented vehicles), and can provide 2-way interaction with traffic to provide advisories and link times. Both the in-vehicle navigation module and the TMC feature detailed graphical user interfaces that include human-factors studies to support safety and operational research. Realistic modeling of variations of the posted driving speed is based on human-factors studies that take into consideration weather, road conditions, driver's personality and behavior, and vehicle type. The simulator has been developed on a distributed system of networked UNIX computers, but is designed to run on ANL's IBM SP-X parallel computer system for large-scale problems. A novel feature of the developed simulator is that vehicles will be represented by autonomous computer processes, each with a behavior model which performs independent route selection and reacts to external traffic events much like real vehicles. Vehicle processes interact with each other and with ITS components by exchanging messages. With this approach, one will be able to take advantage of emerging massively parallel processor (MPP) systems.
Comparison of sampling techniques for Bayesian parameter estimation
NASA Astrophysics Data System (ADS)
Allison, Rupert; Dunkley, Joanna
2014-02-01
The posterior probability distribution for a set of model parameters encodes all that the data have to tell us in the context of a given model; it is the fundamental quantity for Bayesian parameter estimation. In order to infer the posterior probability distribution we have to decide how to explore parameter space. Here we compare three prescriptions for how parameter space is navigated, discussing their relative merits. We consider Metropolis-Hastings sampling, nested sampling and affine-invariant ensemble Markov chain Monte Carlo (MCMC) sampling. We focus on their performance on toy-model Gaussian likelihoods and on a real-world cosmological data set. We outline the sampling algorithms themselves and elaborate on performance diagnostics such as convergence time, scope for parallelization, dimensional scaling, requisite tunings and suitability for non-Gaussian distributions. We find that nested sampling delivers high-fidelity estimates for posterior statistics at low computational cost, and should be adopted in favour of Metropolis-Hastings in many cases. Affine-invariant MCMC is competitive when computing clusters can be utilized for massive parallelization. Affine-invariant MCMC and existing extensions to nested sampling naturally probe multimodal and curving distributions.
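The random-walk Metropolis-Hastings scheme the abstract compares against can be sketched in a few lines. This is a minimal illustration on a toy Gaussian likelihood of the kind the paper uses as a test case; the step size, chain length, and target function below are illustrative choices, not values from the paper.

```python
import numpy as np

def metropolis_hastings(log_post, x0, n_steps, step_size, rng):
    """Random-walk Metropolis-Hastings: propose Gaussian steps and
    accept with probability min(1, posterior ratio)."""
    x = np.asarray(x0, dtype=float)
    samples = np.empty((n_steps, x.size))
    logp = log_post(x)
    accepted = 0
    for i in range(n_steps):
        prop = x + step_size * rng.standard_normal(x.size)
        logp_prop = log_post(prop)
        if np.log(rng.random()) < logp_prop - logp:  # accept/reject
            x, logp = prop, logp_prop
            accepted += 1
        samples[i] = x
    return samples, accepted / n_steps

# Toy 2D standard-Gaussian log-posterior (illustrative)
log_post = lambda x: -0.5 * np.sum(x**2)
rng = np.random.default_rng(0)
samples, acc_rate = metropolis_hastings(log_post, [3.0, -3.0], 20000, 1.0, rng)
```

The convergence-time and tuning diagnostics the paper discusses (e.g. the sensitivity to `step_size`) can be probed directly by monitoring `acc_rate` and the sample autocorrelation.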
Study of Solid State Drives performance in PROOF distributed analysis system
NASA Astrophysics Data System (ADS)
Panitkin, S. Y.; Ernst, M.; Petkus, R.; Rind, O.; Wenaus, T.
2010-04-01
Solid State Drives (SSDs) are a promising storage technology for High Energy Physics parallel analysis farms. Their combination of low random access time and relatively high read speed is very well suited for situations where multiple jobs concurrently access data located on the same drive. SSDs also have lower energy consumption and higher vibration tolerance than Hard Disk Drives (HDDs), which makes them an attractive choice in many applications ranging from personal laptops to large analysis farms. The Parallel ROOT Facility (PROOF) is a distributed analysis system which makes it possible to exploit the inherent event-level parallelism of high energy physics data. PROOF is especially efficient together with distributed local storage systems like Xrootd, when data are distributed over computing nodes. In such an architecture the local disk subsystem I/O performance becomes a critical factor, especially when computing nodes use multi-core CPUs. We will discuss our experience with SSDs in a PROOF environment. We will compare the performance of HDDs with SSDs in I/O-intensive analysis scenarios. In particular we will discuss how PROOF system performance scales with the number of simultaneously running analysis jobs.
Pion and kaon valence-quark parton quasidistributions
NASA Astrophysics Data System (ADS)
Xu, Shu-Sheng; Chang, Lei; Roberts, Craig D.; Zong, Hong-Shi
2018-05-01
Algebraic Ansätze for the Poincaré-covariant Bethe-Salpeter wave functions of the pion and kaon are used to calculate their light-front wave functions, parton distribution amplitudes, parton quasidistribution amplitudes, valence parton distribution functions, and parton quasidistribution functions (PqDFs). The light-front wave functions are broad, concave functions, and the scale of flavor-symmetry violation in the kaon is roughly 15%, being set by the ratio of emergent masses in the s- and u-quark sectors. Parton quasidistribution amplitudes computed with longitudinal momentum Pz=1.75 GeV provide a semiquantitatively accurate representation of the objective parton distribution amplitude, but even with Pz=3 GeV, they cannot provide information about this amplitude's end-point behavior. On the valence-quark domain, similar outcomes characterize PqDFs. In this connection, however, the ratio of kaon-to-pion u-quark PqDFs is found to provide a good approximation to the true parton distribution function ratio on 0.4 ≲ x ≲ 0.8, suggesting that with existing resources computations of ratios of parton quasidistributions can yield results that support empirical comparison.
NASA Astrophysics Data System (ADS)
Loring, B.; Karimabadi, H.; Rortershteyn, V.
2015-10-01
The surface line integral convolution (LIC) visualization technique produces dense visualizations of vector fields on arbitrary surfaces. We present a screen space surface LIC algorithm for use in distributed memory data parallel sort-last rendering infrastructures. The motivations for our work are to support analysis of datasets that are too large to fit in the main memory of a single computer and compatibility with prevalent parallel scientific visualization tools such as ParaView and VisIt. By working in screen space using OpenGL we can leverage the computational power of GPUs when they are available and run without them when they are not. We address efficiency and performance issues that arise from the transformation of data from physical to screen space by selecting an alternate screen space domain decomposition. We analyze the algorithm's scaling behavior with and without GPUs on two high performance computing systems using data from turbulent plasma simulations.
Design and Verification of Remote Sensing Image Data Center Storage Architecture Based on Hadoop
NASA Astrophysics Data System (ADS)
Tang, D.; Zhou, X.; Jing, Y.; Cong, W.; Li, C.
2018-04-01
The data center is a new concept of data processing and application proposed in recent years. It is a new approach to data-based processing technologies, parallel computing, and compatibility with different hardware clusters. While optimizing the data storage management structure, it fully utilizes cluster resource computing nodes and improves the efficiency of data-parallel applications. This paper used mature Hadoop technology to build a large-scale distributed image management architecture for remote sensing imagery. Using MapReduce parallel processing technology, it called on many computing nodes to process image storage blocks and pyramids in the background to improve the efficiency of image reading and application, and solved the need for concurrent multi-user high-speed access to remotely sensed data. It verified the rationality, reliability and superiority of the system design by testing the storage efficiency of different image data and multiple users, and by analyzing how the distributed storage architecture improves the application efficiency of remote sensing images, through building an actual Hadoop service system.
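The image pyramids mentioned above can be illustrated with a minimal in-memory sketch: each pyramid level halves the resolution by 2x2 block averaging. This is a hedged illustration of the general technique, not the paper's Hadoop/MapReduce implementation; the function name and sizes are invented for the example.

```python
import numpy as np

def build_pyramid(image, n_levels):
    """Build a resolution pyramid by repeated 2x2 block averaging."""
    levels = [image]
    for _ in range(n_levels - 1):
        im = levels[-1]
        # trim to even dimensions, then average non-overlapping 2x2 blocks
        h, w = im.shape[0] // 2 * 2, im.shape[1] // 2 * 2
        im = im[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        levels.append(im)
    return levels

img = np.arange(64 * 64, dtype=float).reshape(64, 64)
pyr = build_pyramid(img, 4)  # levels: 64x64, 32x32, 16x16, 8x8
```

In a distributed setting such as the one the paper describes, each level would be tiled into storage blocks so that coarse levels serve fast overviews and fine levels serve detailed reads.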
DOE Office of Scientific and Technical Information (OSTI.GOV)
Loring, Burlen; Karimabadi, Homa; Rortershteyn, Vadim
2014-07-01
The surface line integral convolution (LIC) visualization technique produces dense visualizations of vector fields on arbitrary surfaces. We present a screen space surface LIC algorithm for use in distributed memory data parallel sort-last rendering infrastructures. The motivations for our work are to support analysis of datasets that are too large to fit in the main memory of a single computer and compatibility with prevalent parallel scientific visualization tools such as ParaView and VisIt. By working in screen space using OpenGL we can leverage the computational power of GPUs when they are available and run without them when they are not. We address efficiency and performance issues that arise from the transformation of data from physical to screen space by selecting an alternate screen space domain decomposition. We analyze the algorithm's scaling behavior with and without GPUs on two high performance computing systems using data from turbulent plasma simulations.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guerrier, C.; Holcman, D., E-mail: david.holcman@ens.fr; Mathematical Institute, Oxford OX2 6GG, Newton Institute
The main difficulty in simulating diffusion processes at a molecular level in cell microdomains is due to the multiple scales involved, from nano- to micrometers. Few to many particles have to be simulated and simultaneously tracked while they are exploring a large portion of the space to bind small targets, such as buffers or active sites. Bridging the small and large spatial scales is achieved by rare events representing Brownian particles finding small targets, characterized by long-time distributions. These rare events are the bottleneck of numerical simulations. A naive stochastic simulation requires running many Brownian particles together, which is computationally greedy and inefficient. Solving the associated partial differential equations is also difficult due to the time-dependent boundary conditions, narrow passages and mixed boundary conditions at small windows. We present here two reduced modeling approaches for a fast computation of diffusing fluxes in microdomains. The first approach is based on Markov mass-action law equations coupled to a Markov chain. The second is a Gillespie method based on the narrow escape theory for coarse-graining the geometry of the domain into Poissonian rates. The main application concerns diffusion in cellular biology, where we compute as an example the distribution of arrival times of calcium ions to small hidden targets to trigger vesicular release.
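The second approach above, coarse-graining the narrow-escape geometry into Poissonian binding rates, can be sketched with a minimal Gillespie-style simulation: with m free particles each binding at rate k, the next binding time is exponentially distributed with rate m*k. The particle count and rate constant below are illustrative assumptions, not values derived from any particular domain geometry.

```python
import numpy as np

def gillespie_arrivals(n_particles, k, rng):
    """Successive binding times of n independent particles, each with
    Poissonian narrow-escape rate k: with m particles still free, the
    waiting time to the next binding is Exp(m * k)."""
    t, times = 0.0, []
    for m in range(n_particles, 0, -1):
        t += rng.exponential(1.0 / (m * k))
        times.append(t)
    return np.array(times)

rng = np.random.default_rng(1)
# 1000 particles, illustrative coarse-grained rate k = 0.5 per unit time
arrivals = gillespie_arrivals(1000, 0.5, rng)
```

Histogramming `arrivals` gives the arrival-time distribution of the kind the paper computes for calcium ions reaching small hidden targets, without simulating any Brownian trajectories.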
The Numerical Propulsion System Simulation: An Overview
NASA Technical Reports Server (NTRS)
Lytle, John K.
2000-01-01
Advances in computational technology and in physics-based modeling are making large-scale, detailed simulations of complex systems possible within the design environment. For example, the integration of computing, communications, and aerodynamics has reduced the time required to analyze major propulsion system components from days and weeks to minutes and hours. This breakthrough has enabled the detailed simulation of major propulsion system components to become a routine part of designing systems, providing the designer with critical information about the components early in the design process. This paper describes the development of the numerical propulsion system simulation (NPSS), a modular and extensible framework for the integration of multicomponent and multidisciplinary analysis tools using geographically distributed resources such as computing platforms, data bases, and people. The analysis is currently focused on large-scale modeling of complete aircraft engines. This will provide the product developer with a "virtual wind tunnel" that will reduce the number of hardware builds and tests required during the development of advanced aerospace propulsion systems.
NASA Technical Reports Server (NTRS)
Blair, Michael F.; Anderson, Olof L.
1989-01-01
A combined experimental and computational program was conducted to examine the heat transfer distribution in a turbine rotor passage geometrically similar to the Space Shuttle Main Engine (SSME) High Pressure Fuel Turbopump (HPFTP). Heat transfer was measured and computed for both the full-span suction and pressure surfaces of the rotor airfoil as well as for the hub endwall surface. The primary objective of the program was to provide a benchmark-quality data base for the assessment of rotor passage heat transfer computational procedures. The experimental portion of the study was conducted in a large-scale, ambient temperature, rotating turbine model. Heat transfer data were obtained using thermocouple and liquid-crystal techniques to measure temperature distributions on the thin, electrically-heated skin of the rotor passage model. Test data were obtained for various combinations of Reynolds number, rotor incidence angle and model surface roughness. The data are reported in the form of contour maps of Stanton number. These heat distribution maps revealed numerous local effects produced by the three-dimensional flows within the rotor passage. Of particular importance were regions of local enhancement produced on the airfoil suction surface by the main-passage and tip-leakage vortices and on the hub endwall by the leading-edge horseshoe vortex system. The computational portion consisted of the application of a well-posed parabolized Navier-Stokes analysis to the calculation of the three-dimensional viscous flow through ducts simulating a gas turbine passage. These cases include a 90 deg turning duct, a gas turbine cascade simulating a stator passage, and a gas turbine rotor passage including Coriolis forces. The calculated results were evaluated using experimental data of the three-dimensional velocity fields, wall static pressures, and wall heat transfer on the suction surface of the turbine airfoil and on the end wall. Particular attention was paid to accurate modeling of the passage vortex and to the development of the wall boundary layers including crossflow.
Learning from Massive Distributed Data Sets (Invited)
NASA Astrophysics Data System (ADS)
Kang, E. L.; Braverman, A. J.
2013-12-01
Technologies for remote sensing and ever-expanding computer experiments in climate science are generating massive data sets. Meanwhile, it has become common in all areas of large-scale science to have these 'big data' distributed over multiple physical locations, and moving large amounts of data can be impractical. In this talk, we will discuss efficient ways to summarize and learn from distributed data. We formulate a graphical model to mimic the main characteristics of a distributed-data network, including the size of the data sets and the speed of moving data. With this nominal model, we investigate the trade-off between prediction accuracy and the cost of data movement, theoretically and through simulation experiments. We will also discuss new implementations of spatial and spatio-temporal statistical methods optimized for distributed data.
Precise QCD Predictions for the Production of a Z Boson in Association with a Hadronic Jet.
Gehrmann-De Ridder, A; Gehrmann, T; Glover, E W N; Huss, A; Morgan, T A
2016-07-08
We compute the cross section and differential distributions for the production of a Z boson in association with a hadronic jet to next-to-next-to-leading order (NNLO) in perturbative QCD, including the leptonic decay of the Z boson. We present numerical results for the transverse momentum and rapidity distributions of both the Z boson and the associated jet at the LHC. We find that the NNLO corrections increase the NLO predictions by approximately 1% and significantly reduce the scale variation uncertainty.
Center for Technology for Advanced Scientific Component Software (TASCS)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kostadin, Damevski
A resounding success of the Scientific Discovery through Advanced Computing (SciDAC) program is that high-performance computational science is now universally recognized as a critical aspect of scientific discovery [71], complementing both theoretical and experimental research. As scientific communities prepare to exploit unprecedented computing capabilities of emerging leadership-class machines for multi-model simulations at the extreme scale [72], it is more important than ever to address the technical and social challenges of geographically distributed teams that combine expertise in domain science, applied mathematics, and computer science to build robust and flexible codes that can incorporate changes over time. The Center for Technology for Advanced Scientific Component Software (TASCS) tackles these issues by exploiting component-based software development to facilitate collaborative high-performance scientific computing.
NASA Astrophysics Data System (ADS)
Georgiev, K.; Zlatev, Z.
2010-11-01
The Danish Eulerian Model (DEM) is an Eulerian model for studying the transport of air pollutants on a large scale. Originally, the model was developed at the National Environmental Research Institute of Denmark. The model's computational domain covers Europe and some neighbouring parts of the Atlantic Ocean, Asia and Africa. If the DEM is to be applied on fine grids, its discretization leads to a huge computational problem, which implies that such a model must be run on high-performance computer architectures. The implementation and tuning of such a complex large-scale model on each different computer is a non-trivial task. Here, we present comparison results from running this model on different kinds of vector computers (CRAY C92A, Fujitsu, etc.), parallel computers with distributed memory (IBM SP, CRAY T3E, Beowulf clusters, Macintosh G4 clusters, etc.), parallel computers with shared memory (SGI Origin, SUN, etc.) and parallel computers with two levels of parallelism (IBM SMP, IBM BlueGene/P, clusters of multiprocessor nodes, etc.). The main idea in the parallel version of DEM is a domain-partitioning approach. We discuss the effective use of the caches and hierarchical memories of modern computers, as well as the performance, speed-ups and efficiency achieved. The parallel code of DEM, created using the MPI standard library, appears to be highly portable and shows good efficiency and scalability on different kinds of vector and parallel computers. Some important applications of the computer model output are briefly presented.
Organization of the secure distributed computing based on multi-agent system
NASA Astrophysics Data System (ADS)
Khovanskov, Sergey; Rumyantsev, Konstantin; Khovanskova, Vera
2018-04-01
Nowadays, developing methods for distributed computing receives much attention. One such method is the use of multi-agent systems. The organization of distributed computing based on conventional networked computers can experience security threats from computational processes. The authors have developed a unified agent algorithm for controlling the operation of computing network nodes, with networked PCs used as computing nodes. The proposed multi-agent control system for distributed computing allows the processing power of the computers of any existing network to be harnessed in a short time to solve large tasks by creating a distributed computing system. Agents on a computer network can configure a distributed computing system, distribute the computational load among the computers operated by agents, and optimize the distributed computing system according to the computing power of the computers on the network. The number of computers connected to the network can be increased by connecting new computers to the system, which increases the overall processing power. Adding a central agent to the multi-agent system increases the security of distributed computing. This organization of the distributed computing system reduces the problem-solving time and increases the fault tolerance (vitality) of computing processes in a changing computing environment (dynamic change of the number of computers on the network). The developed multi-agent system detects cases of falsification of the results in the distributed system, which may otherwise lead to wrong decisions. In addition, the system checks and corrects wrong results.
Overview of ATLAS PanDA Workload Management
NASA Astrophysics Data System (ADS)
Maeno, T.; De, K.; Wenaus, T.; Nilsson, P.; Stewart, G. A.; Walker, R.; Stradling, A.; Caballero, J.; Potekhin, M.; Smith, D.; ATLAS Collaboration
2011-12-01
The Production and Distributed Analysis System (PanDA) plays a key role in the ATLAS distributed computing infrastructure. All ATLAS Monte-Carlo simulation and data reprocessing jobs pass through the PanDA system. We will describe how PanDA manages job execution on the grid using dynamic resource estimation and data replication together with intelligent brokerage in order to meet the scaling and automation requirements of ATLAS distributed computing. PanDA is also the primary ATLAS system for processing user and group analysis jobs, bringing further requirements for quick, flexible adaptation to the rapidly evolving analysis use cases of the early datataking phase, in addition to the high reliability, robustness and usability needed to provide efficient and transparent utilization of the grid for analysis users. We will describe how PanDA meets ATLAS requirements, the evolution of the system in light of operational experience, how the system has performed during the first LHC data-taking phase and plans for the future.
Overview of ATLAS PanDA Workload Management
DOE Office of Scientific and Technical Information (OSTI.GOV)
Maeno T.; De K.; Wenaus T.
2011-01-01
The Production and Distributed Analysis System (PanDA) plays a key role in the ATLAS distributed computing infrastructure. All ATLAS Monte-Carlo simulation and data reprocessing jobs pass through the PanDA system. We will describe how PanDA manages job execution on the grid using dynamic resource estimation and data replication together with intelligent brokerage in order to meet the scaling and automation requirements of ATLAS distributed computing. PanDA is also the primary ATLAS system for processing user and group analysis jobs, bringing further requirements for quick, flexible adaptation to the rapidly evolving analysis use cases of the early datataking phase, in addition to the high reliability, robustness and usability needed to provide efficient and transparent utilization of the grid for analysis users. We will describe how PanDA meets ATLAS requirements, the evolution of the system in light of operational experience, how the system has performed during the first LHC data-taking phase and plans for the future.
Estimation Accuracy on Execution Time of Run-Time Tasks in a Heterogeneous Distributed Environment.
Liu, Qi; Cai, Weidong; Jin, Dandan; Shen, Jian; Fu, Zhangjie; Liu, Xiaodong; Linge, Nigel
2016-08-30
Distributed computing has achieved tremendous development since cloud computing was proposed in 2006, and has played a vital role in promoting the rapid growth of data collecting and analysis models, e.g., Internet of Things, Cyber-Physical Systems, Big Data analytics, etc. Hadoop has become a data convergence platform for sensor networks. As one of the core components, MapReduce facilitates allocating, processing and mining of collected large-scale data, where speculative execution strategies help solve straggler problems. However, there is still no efficient solution for accurate estimation of the execution time of run-time tasks, which can affect task allocation and distribution in MapReduce. In this paper, task execution data have been collected and employed for the estimation. A two-phase regression (TPR) method is proposed to predict the finishing time of each task accurately. Detailed data of each task have drawn interest, with a detailed analysis report being made. According to the results, the prediction accuracy of concurrent tasks' execution time can be improved, in particular for some regular jobs.
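The general idea of a two-phase regression can be sketched as a piecewise-linear fit with one breakpoint chosen to minimize squared error, then extrapolating the second phase to 100% progress to predict the finishing time. This is a hedged illustration on synthetic progress data, not the authors' TPR implementation; the breakpoint search, data, and variable names are all assumptions for the example.

```python
import numpy as np

def two_phase_fit(x, y):
    """Fit y as two least-squares line segments in x, choosing the
    breakpoint index that minimizes the total squared error."""
    best = None
    for b in range(2, len(x) - 2):
        p1 = np.polyfit(x[:b], y[:b], 1)
        p2 = np.polyfit(x[b:], y[b:], 1)
        sse = (np.sum((np.polyval(p1, x[:b]) - y[:b]) ** 2)
               + np.sum((np.polyval(p2, x[b:]) - y[b:]) ** 2))
        if best is None or sse < best[0]:
            best = (sse, b, p1, p2)
    return best[1:]

# Synthetic task trace: slow start-up phase, then a faster steady phase
x = np.linspace(0, 1, 40)                      # fraction of task progress
y = np.where(x < 0.3, 10 * x, 3 + 4 * (x - 0.3))  # elapsed time
b, p1, p2 = two_phase_fit(x, y)
est_finish = np.polyval(p2, 1.0)  # extrapolate phase 2 to 100% progress
```

In a MapReduce setting, such a predicted finishing time per task would feed the speculative-execution decision of whether a straggler is worth re-launching.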
Approaches for scalable modeling and emulation of cyber systems : LDRD final report.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mayo, Jackson R.; Minnich, Ronald G.; Armstrong, Robert C.
2009-09-01
The goal of this research was to combine theoretical and computational approaches to better understand the potential emergent behaviors of large-scale cyber systems, such as networks of ~10^6 computers. The scale and sophistication of modern computer software, hardware, and deployed networked systems have significantly exceeded the computational research community's ability to understand, model, and predict current and future behaviors. This predictive understanding, however, is critical to the development of new approaches for proactively designing new systems or enhancing existing systems with robustness to current and future cyber threats, including distributed malware such as botnets. We have developed preliminary theoretical and modeling capabilities that can ultimately answer questions such as: How would we reboot the Internet if it were taken down? Can we change network protocols to make them more secure without disrupting existing Internet connectivity and traffic flow? We have begun to address these issues by developing new capabilities for understanding and modeling Internet systems at scale. Specifically, we have addressed the need for scalable network simulation by carrying out emulations of a network with ~10^6 virtualized operating system instances on a high-performance computing cluster - a 'virtual Internet'. We have also explored mappings between previously studied emergent behaviors of complex systems and their potential cyber counterparts. Our results provide foundational capabilities for further research toward understanding the effects of complexity in cyber systems, to allow anticipating and thwarting hackers.
High-precision QCD at hadron colliders:electroweak gauge boson rapidity distributions at NNLO
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anastasiou, C.
2004-01-05
We compute the rapidity distributions of W and Z bosons produced at the Tevatron and the LHC through next-to-next-to-leading order in QCD. Our results demonstrate remarkable stability with respect to variations of the factorization and renormalization scales for all values of rapidity accessible in current and future experiments. These processes are therefore "gold-plated": current theoretical knowledge yields QCD predictions accurate to better than one percent. These results strengthen the proposal to use W and Z production to determine parton-parton luminosities and constrain parton distribution functions at the LHC. For example, LHC data should easily be able to distinguish the central parton distribution fit obtained by MRST from that obtained by Alekhin.
Halimi, Abdelghafour; Batatia, Hadj; Le Digabel, Jimmy; Josse, Gwendal; Tourneret, Jean Yves
2017-01-01
Detecting skin lentigo in reflectance confocal microscopy images is an important and challenging problem. This imaging modality has not yet been widely investigated for this problem, and there are few automatic processing techniques. They are mostly based on machine learning approaches and rely on numerous classical image features that lead to high computational costs given the very large resolution of these images. This paper presents a detection method with very low computational complexity that is able to identify the skin depth at which the lentigo can be detected. The proposed method performs multiresolution decomposition of the image obtained at each skin depth. The distribution of image pixels at a given depth can be approximated accurately by a generalized Gaussian distribution whose parameters depend on the decomposition scale, resulting in a very-low-dimension parameter space. SVM classifiers are then investigated to classify the scale parameter of this distribution, allowing real-time detection of lentigo. The method is applied to 45 healthy and lentigo patients from a clinical study, where a sensitivity of 81.4% and a specificity of 83.3% are achieved. Our results show that lentigo is identifiable at depths between 50μm and 60μm, corresponding to the average location of the dermoepidermal junction. This result is in agreement with the clinical practices that characterize lentigo by assessing the disorganization of the dermoepidermal junction. PMID:29296480
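Fitting a generalized Gaussian distribution to pixel data, the core modeling step above, can be sketched with standard moment matching: invert the known moment ratio r(β) = Γ(2/β)² / (Γ(1/β)Γ(3/β)) to get the shape β, then recover the scale α from E|x|. This is an illustrative estimator on synthetic data, not the paper's pipeline; the grid search and seed are assumptions for the example.

```python
import numpy as np
from math import gamma

def fit_ggd(x):
    """Moment-matching fit of a generalized Gaussian exp(-(|x|/alpha)^beta):
    invert r(beta) over a grid, then recover alpha from the first absolute
    moment E|x| = alpha * Gamma(2/beta) / Gamma(1/beta)."""
    m1 = np.mean(np.abs(x))
    m2 = np.mean(x**2)
    r = m1 * m1 / m2
    betas = np.linspace(0.2, 5.0, 2000)
    rb = np.array([gamma(2 / b) ** 2 / (gamma(1 / b) * gamma(3 / b))
                   for b in betas])
    beta = betas[np.argmin(np.abs(rb - r))]
    alpha = m1 * gamma(1 / beta) / gamma(2 / beta)
    return alpha, beta

rng = np.random.default_rng(2)
x = rng.standard_normal(200000)  # a Gaussian is a GGD with beta = 2
alpha, beta = fit_ggd(x)
```

In the paper's setting, the scale parameter estimated per depth and decomposition level would then be the low-dimensional feature fed to the SVM classifier.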
GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Simmhan, Yogesh; Kumbhare, Alok; Wickramaarachchi, Charith
2014-08-25
Large scale graph processing is a major research area for Big Data exploration. Vertex centric programming models like Pregel are gaining traction due to their simple abstraction that naturally allows for scalable execution on distributed systems. However, there are limitations to this approach which cause vertex centric algorithms to under-perform due to a poor compute to communication overhead ratio and slow convergence of iterative supersteps. In this paper we introduce GoFFish, a scalable sub-graph centric framework co-designed with a distributed persistent graph storage for large scale graph analytics on commodity clusters. We introduce a sub-graph centric programming abstraction that combines the scalability of a vertex centric approach with the flexibility of shared memory sub-graph computation. We map Connected Components, SSSP and PageRank algorithms to this model to illustrate its flexibility. Further, we empirically analyze GoFFish using several real world graphs and demonstrate its significant performance improvement, orders of magnitude in some cases, compared to Apache Giraph, the leading open source vertex centric implementation.
Large Scale EOF Analysis of Climate Data
NASA Astrophysics Data System (ADS)
Prabhat, M.; Gittens, A.; Kashinath, K.; Cavanaugh, N. R.; Mahoney, M.
2016-12-01
We present a distributed approach towards extracting EOFs from 3D climate data. We implement the method in Apache Spark, and process multi-TB sized datasets on O(1000-10,000) cores. We apply this method to latitude-weighted ocean temperature data from CFSR, a 2.2 terabyte-sized data set comprising ocean and subsurface reanalysis measurements collected at 41 levels in the ocean, at 6 hour intervals over 31 years. We extract the first 100 EOFs of this full data set and compare to the EOFs computed simply on the surface temperature field. Our analyses provide evidence of Kelvin and Rossby waves and components of large-scale modes of oscillation including the ENSO and PDO that are not visible in the usual SST EOFs. Further, they provide information on the most influential parts of the ocean, such as the thermocline, that exist below the surface. Work is ongoing to understand the factors determining the depth-varying spatial patterns observed in the EOFs. We will experiment with weighting schemes to appropriately account for the differing depths of the observations. We also plan to apply the same distributed approach to the analysis of 3D atmospheric climate data sets, including multiple variables. Because the atmosphere changes on a quicker time-scale than the ocean, we expect that the results will demonstrate an even greater advantage to computing 3D EOFs in lieu of 2D EOFs.
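In-memory, the EOF computation that the abstract distributes with Spark reduces to a singular value decomposition of the mean-removed space-time matrix. The small synthetic sketch below shows that linear algebra; the field sizes, noise level, and test pattern are illustrative, not the CFSR data.

```python
import numpy as np

def compute_eofs(field, n_modes):
    """field: (n_times, n_points) space-time matrix. Returns the leading
    EOFs (rows of V^T) and their explained-variance fractions."""
    anom = field - field.mean(axis=0)            # remove the temporal mean
    u, s, vt = np.linalg.svd(anom, full_matrices=False)
    var = s**2 / np.sum(s**2)                    # variance per mode
    return vt[:n_modes], var[:n_modes]

rng = np.random.default_rng(3)
t = np.linspace(0, 20, 500)
pattern = np.sin(np.linspace(0, np.pi, 80))      # one dominant spatial mode
field = np.outer(np.sin(t), pattern) + 0.05 * rng.standard_normal((500, 80))
eofs, var = compute_eofs(field, 3)
```

At the multi-TB scale of the paper, the same decomposition is obtained by distributing the matrix across executors and computing the SVD (or the eigendecomposition of the covariance) without ever materializing the data on one node.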
Advanced information processing system
NASA Technical Reports Server (NTRS)
Lala, J. H.
1984-01-01
Design and performance details of the advanced information processing system (AIPS) for fault and damage tolerant data processing on aircraft and spacecraft are presented. AIPS comprises several computers distributed throughout the vehicle and linked by a damage tolerant data bus. Most I/O functions are available to all the computers, which run in a TDMA mode. Each computer performs separate specific tasks in normal operation and assumes other tasks in degraded modes. Redundant software assures that all fault monitoring, logging and reporting are automated, together with control functions. Redundant duplex links and damage-spread limitation provide the fault tolerance. Details of an advanced design of a laboratory-scale proof-of-concept system are described, including functional operations.
NASA Technical Reports Server (NTRS)
Leonard, A.
1980-01-01
Three recent simulations of turbulent shear flow bounded by a wall using the Illiac computer are reported. These are: (1) vibrating-ribbon experiments; (2) study of the evolution of a spot-like disturbance in a laminar boundary layer; and (3) investigation of turbulent channel flow. A number of persistent flow structures were observed, including streamwise and vertical vorticity distributions near the wall, low-speed and high-speed streaks, and local regions of intense vertical velocity. The role of these structures in, for example, the growth or maintenance of turbulence is discussed. The problem of representing the large range of turbulent scales in a computer simulation is also discussed.
Diffraction scattering computed tomography: a window into the structures of complex nanomaterials
Birkbak, M. E.; Leemreize, H.; Frølich, S.; Stock, S. R.
2015-01-01
Modern functional nanomaterials and devices are increasingly composed of multiple phases arranged in three dimensions over several length scales. Therefore there is a pressing demand for improved methods for structural characterization of such complex materials. An excellent emerging technique that addresses this problem is diffraction/scattering computed tomography (DSCT). DSCT combines the merits of diffraction and/or small angle scattering with computed tomography to allow imaging the interior of materials based on the diffraction or small angle scattering signals. This allows, e.g., one to distinguish the distributions of polymorphs in complex mixtures. Here we review this technique and give examples of how it can shed light on modern nanoscale materials. PMID:26505175
NASA Astrophysics Data System (ADS)
Huang, Dong; Liu, Yangang
2014-12-01
Subgrid-scale variability is one of the main reasons why parameterizations are needed in large-scale models. Although some parameterizations started to address the issue of subgrid variability by introducing a subgrid probability distribution function for relevant quantities, the spatial structure has been typically ignored and thus the subgrid-scale interactions cannot be accounted for physically. Here we present a new statistical-physics-like approach whereby the spatial autocorrelation function can be used to physically capture the net effects of subgrid cloud interaction with radiation. The new approach is able to faithfully reproduce the Monte Carlo 3D simulation results with several orders less computational cost, allowing for more realistic representation of cloud radiation interactions in large-scale models.
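The statistic at the heart of this approach, the spatial autocorrelation function, is simple to state. The 1D transect version below is a minimal stand-in to show what is being computed, not the paper's implementation (which applies it to 2D/3D cloud fields coupled to radiation).

```python
import numpy as np

def autocorrelation(field, max_lag):
    """Normalised spatial autocorrelation of a 1D transect at lags 0..max_lag.
    Assumes the field is not constant (nonzero variance)."""
    x = field - field.mean()
    var = np.mean(x * x)
    return np.array([
        np.mean(x[:len(x) - lag] * x[lag:]) / var if lag else 1.0
        for lag in range(max_lag + 1)
    ])
```

The decay of this function with lag encodes the spatial structure of subgrid cloud variability that a probability distribution function alone cannot capture.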
A secure distributed logistic regression protocol for the detection of rare adverse drug events
El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat
2013-01-01
Background: There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. Objective: To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. Methods: We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. Results: The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. Conclusion: The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs.
We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models. PMID:22871397
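A flavour of the kind of primitive such multi-party protocols build on can be given with an additive secret-sharing secure sum: each site learns nothing about the others' inputs, yet the aggregate is exact. This toy sketch is not the paper's protocol (which secures the full iteratively reweighted least-squares computation of logistic regression), only an illustration of the building block.

```python
import random

MODULUS = 2**61 - 1  # arithmetic is done modulo a large prime

def share(value, n_parties, modulus=MODULUS):
    """Split a non-negative integer into n additive shares summing to value mod modulus.
    Any n-1 shares are uniformly random and reveal nothing about the value."""
    shares = [random.randrange(modulus) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares

def secure_sum(site_values, modulus=MODULUS):
    """Each site shares its value among all parties; only the total is revealed.
    Assumes all values are non-negative and the true total is below the modulus."""
    n = len(site_values)
    all_shares = [share(v, n, modulus) for v in site_values]
    # Party i sums the i-th share from every site; the partials combine to the total.
    partials = [sum(all_shares[s][i] for s in range(n)) % modulus for i in range(n)]
    return sum(partials) % modulus
```

Statistics needed per regression iteration (e.g. sums of gradient contributions) can be aggregated this way without pooling patient-level records.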
Architectural Strategies for Enabling Data-Driven Science at Scale
NASA Astrophysics Data System (ADS)
Crichton, D. J.; Law, E. S.; Doyle, R. J.; Little, M. M.
2017-12-01
The analysis of large data collections from NASA or other agencies is often executed through traditional computational and data analysis approaches, which require users to bring data to their desktops and perform local data analysis. Alternatively, data are hauled to large computational environments that provide centralized data analysis via traditional High Performance Computing (HPC). Scientific data archives, however, are not only growing massive, but are also becoming highly distributed. Neither traditional approach provides a good solution for optimizing analysis into the future. Assumptions across the NASA mission and science data lifecycle, which historically assume that all data can be collected, transmitted, processed, and archived, will not scale as more capable instruments stress legacy-based systems. New paradigms are needed to increase the productivity and effectiveness of scientific data analysis. This paradigm must recognize that architectural and analytical choices are interrelated, and must be carefully coordinated in any system that aims to allow efficient, interactive scientific exploration and discovery to exploit massive data collections, from point of collection (e.g., onboard) to analysis and decision support. The most effective approach to analyzing a distributed set of massive data may involve some exploration and iteration, putting a premium on the flexibility afforded by the architectural framework. The framework should enable scientist users to assemble workflows efficiently, manage the uncertainties related to data analysis and inference, and optimize deep-dive analytics to enhance scalability. In many cases, this "data ecosystem" needs to be able to integrate multiple observing assets, ground environments, archives, and analytics, evolving from stewardship of measurements of data to using computational methodologies to better derive insight from the data that may be fused with other sets of data. 
This presentation will discuss architectural strategies, including a 2015-2016 NASA AIST Study on Big Data, for evolving scientific research towards massively distributed data-driven discovery. It will include example use cases across earth science, planetary science, and other disciplines.
A hydrological emulator for global applications - HE v1.0.0
NASA Astrophysics Data System (ADS)
Liu, Yaling; Hejazi, Mohamad; Li, Hongyi; Zhang, Xuesong; Leng, Guoyong
2018-03-01
While global hydrological models (GHMs) are very useful in exploring water resources and interactions between the Earth and human systems, their use often requires numerous model inputs, complex model calibration, and high computation costs. To overcome these challenges, we construct an efficient open-source and ready-to-use hydrological emulator (HE) that can mimic complex GHMs at a range of spatial scales (e.g., basin, region, globe). More specifically, we construct both a lumped and a distributed scheme of the HE based on the monthly abcd model to explore the tradeoff between computational cost and model fidelity. Model predictability and computational efficiency are evaluated in simulating global runoff from 1971 to 2010 with both the lumped and distributed schemes. The results are compared against the runoff product from the widely used Variable Infiltration Capacity (VIC) model. Our evaluation indicates that the lumped and distributed schemes present comparable results regarding annual total quantity, spatial pattern, and temporal variation of the major water fluxes (e.g., total runoff, evapotranspiration) across the global 235 basins (e.g., correlation coefficient r between the annual total runoff from either of these two schemes and the VIC is > 0.96), except for several cold (e.g., Arctic, interior Tibet), dry (e.g., North Africa) and mountainous (e.g., Argentina) regions. Compared against the monthly total runoff product from the VIC (aggregated from daily runoff), the global mean Kling-Gupta efficiencies are 0.75 and 0.79 for the lumped and distributed schemes, respectively, with the distributed scheme better capturing spatial heterogeneity. Notably, the computation efficiency of the lumped scheme is 2 orders of magnitude higher than the distributed one and 7 orders more efficient than the VIC model. 
A case study of uncertainty analysis for the world's 16 basins with top annual streamflow is conducted using 100 000 model simulations, and it demonstrates the lumped scheme's extraordinary advantage in computational efficiency. Our results suggest that the revised lumped abcd model can serve as an efficient and reasonable HE for complex GHMs and is suitable for broad practical use, and the distributed scheme is also an efficient alternative if spatial heterogeneity is of more interest.
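The Kling-Gupta efficiency used to score the emulator against the VIC runoff has a standard closed form; a minimal numpy version (illustrative, not the authors' evaluation code) is:

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta efficiency: 1 is a perfect match; it combines correlation,
    variability ratio, and bias ratio into a single skill score."""
    r = np.corrcoef(sim, obs)[0, 1]        # linear correlation
    alpha = np.std(sim) / np.std(obs)      # variability ratio
    beta = np.mean(sim) / np.mean(obs)     # bias ratio
    return 1.0 - np.sqrt((r - 1)**2 + (alpha - 1)**2 + (beta - 1)**2)
```

The reported global means of 0.75 (lumped) and 0.79 (distributed) are averages of this score over monthly runoff series per basin.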
Assembling Large, Multi-Sensor Climate Datasets Using the SciFlo Grid Workflow System
NASA Astrophysics Data System (ADS)
Wilson, B.; Manipon, G.; Xing, Z.; Fetzer, E.
2009-04-01
NASA's Earth Observing System (EOS) is an ambitious facility for studying global climate change. The mandate now is to combine measurements from the instruments on the "A-Train" platforms (AIRS, AMSR-E, MODIS, MISR, MLS, and CloudSat) and other Earth probes to enable large-scale studies of climate change over periods of years to decades. However, moving from predominantly single-instrument studies to a multi-sensor, measurement-based model for long-duration analysis of important climate variables presents serious challenges for large-scale data mining and data fusion. For example, one might want to compare temperature and water vapor retrievals from one instrument (AIRS) to another instrument (MODIS), and to a model (ECMWF), stratify the comparisons using a classification of the "cloud scenes" from CloudSat, and repeat the entire analysis over years of AIRS data. To perform such an analysis, one must discover & access multiple datasets from remote sites, find the space/time "matchups" between instrument swaths and model grids, understand the quality flags and uncertainties for retrieved physical variables, assemble merged datasets, and compute fused products for further scientific and statistical analysis. To meet these large-scale challenges, we are utilizing a Grid computing and dataflow framework, named SciFlo, in which we are deploying a set of versatile and reusable operators for data query, access, subsetting, co-registration, mining, fusion, and advanced statistical analysis. SciFlo is a semantically-enabled ("smart") Grid Workflow system that ties together a peer-to-peer network of computers into an efficient engine for distributed computation. The SciFlo workflow engine enables scientists to do multi-instrument Earth Science by assembling remotely-invokable Web Services (SOAP or http GET URLs), native executables, command-line scripts, and Python codes into a distributed computing flow.
A scientist visually authors the graph of operations in the VizFlow GUI, or uses a text editor to modify the simple XML workflow documents. The SciFlo client & server engines optimize the execution of such distributed workflows and allow the user to transparently find and use datasets and operators without worrying about the actual location of the Grid resources. The engine transparently moves data to the operators, and moves operators to the data (on the dozen trusted SciFlo nodes). SciFlo also deploys a variety of Data Grid services to: query datasets in space and time, locate & retrieve on-line data granules, provide on-the-fly variable and spatial subsetting, perform pairwise instrument matchups for A-Train datasets, and compute fused products. These services are combined into efficient workflows to assemble the desired large-scale, merged climate datasets. SciFlo is currently being applied in several large climate studies: comparisons of aerosol optical depth between MODIS, MISR, AERONET ground network, and U. Michigan's IMPACT aerosol transport model; characterization of long-term biases in microwave and infrared instruments (AIRS, MLS) by comparisons to GPS temperature retrievals accurate to 0.1 degrees Kelvin; and construction of a decade-long, multi-sensor water vapor climatology stratified by classified cloud scene by bringing together datasets from AIRS/AMSU, AMSR-E, MLS, MODIS, and CloudSat (NASA MEASUREs grant, Fetzer PI). The presentation will discuss the SciFlo technologies, their application in these distributed workflows, and the many challenges encountered in assembling and analyzing these massive datasets.
Context-aware distributed cloud computing using CloudScheduler
NASA Astrophysics Data System (ADS)
Seuster, R.; Leavett-Brown, CR; Casteels, K.; Driemel, C.; Paterson, M.; Ring, D.; Sobie, RJ; Taylor, RP; Weldon, J.
2017-10-01
The distributed cloud using the CloudScheduler VM provisioning service is one of the longest running systems for HEP workloads. It has run millions of jobs for ATLAS and Belle II over the past few years using private and commercial clouds around the world. Our goal is to scale the distributed cloud to the 10,000-core level, with the ability to run any type of application (low I/O, high I/O and high memory) on any cloud. To achieve this goal, we have been implementing changes that utilize context-aware computing designs that are currently employed in the mobile communication industry. Context-awareness makes use of real-time and archived data to respond to user or system requirements. In our distributed cloud, we have many opportunistic clouds with no local HEP services, software or storage repositories. A context-aware design significantly improves the reliability and performance of our system by locating the nearest location of the required services. We describe how we are collecting and managing contextual information from our workload management systems, the clouds, the virtual machines and our services. This information is used not only to monitor the system but also to carry out automated corrective actions. We are incrementally adding new alerting and response services to our distributed cloud. This will enable us to scale the number of clouds and virtual machines. Further, a context-aware design will enable us to run analysis or high I/O applications on opportunistic clouds. We envisage an open-source HTTP data federation (for example, the DynaFed system at CERN) as a service that would provide us access to existing storage elements used by the HEP experiments.
Sketching the pion's valence-quark generalised parton distribution
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mezrag, C.; Chang, L.; Moutarde, H.
2015-02-01
In order to learn effectively from measurements of generalised parton distributions (GPDs), it is desirable to compute them using a framework that can potentially connect empirical information with basic features of the Standard Model. We sketch an approach to such computations, based upon a rainbow-ladder (RL) truncation of QCD's Dyson-Schwinger equations and exemplified via the pion's valence dressed-quark GPD, H_pi^v(x, xi, t). Our analysis focuses primarily on xi = 0, although we also capitalise on the symmetry-preserving nature of the RL truncation by connecting H_pi^v(x, xi = +/-1, t) with the pion's valence-quark parton distribution amplitude. We explain that the impulse approximation used hitherto to define the pion's valence dressed-quark GPD is generally invalid owing to omission of contributions from the gluons which bind dressed quarks into the pion. A simple correction enables us to identify a practicable improvement to the approximation for H_pi^v(x, 0, t), expressed as the Radon transform of a single amplitude. Therewith we obtain results for H_pi^v(x, 0, t) and the associated impact-parameter dependent distribution, q_pi^v(x, |b_perp|), which provide a qualitatively sound picture of the pion's dressed-quark structure at a hadronic scale. We evolve the distributions to a scale zeta = 2 GeV, so as to facilitate comparisons in future with results from experiment or other nonperturbative methods.
NASA Technical Reports Server (NTRS)
Ameri, Ali A.; Rigby, David L.; Steinthorsson, Erlendur; Heidmann, James D.; Fabian, John C.
2008-01-01
The effect of the upstream wake on the blade heat transfer has been numerically examined. The geometry and the flow conditions of the first stage turbine blade of GE's E3 engine with a tip clearance equal to 2 percent of the span was utilized. Based on numerical calculations of the vane, a set of wake boundary conditions were approximated, which were subsequently imposed upon the downstream blade. This set consisted of the momentum and thermal wakes as well as the variation in modeled turbulence quantities of turbulence intensity and the length scale. Using a one-blade periodic domain, the distributions of unsteady heat transfer rate on the turbine blade and its tip, as affected by the wake, were determined. Such heat transfer coefficient distribution was computed using the wall heat flux and the adiabatic wall temperature to desensitize the heat transfer coefficient to the wall temperature. For the determination of the wall heat flux and the adiabatic wall temperatures, two sets of computations were required. The results were used in a phase-locked manner to compute the unsteady or steady heat transfer coefficients. It has been found that the unsteady wake has some effect on the distribution of the time averaged heat transfer coefficient on the blade and that this distribution is different from the distribution that is obtainable from a steady computation. This difference was found to be as large as 20 percent of the average heat transfer on the blade surface. On the tip surface, this difference is comparatively smaller and can be as large as four percent of the average.
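The desensitized heat transfer coefficient described above follows the standard definition h = q_w / (T_w - T_aw): referencing the adiabatic wall temperature rather than a bulk temperature makes h nearly independent of the imposed wall temperature. A one-line sketch of that post-processing step (illustrative, with hypothetical sample values, not the paper's solver code):

```python
import numpy as np

def heat_transfer_coefficient(q_wall, t_wall, t_adiabatic):
    """h = q_w / (T_w - T_aw), elementwise over a surface distribution.
    Requires T_w != T_aw at every point."""
    return np.asarray(q_wall) / (np.asarray(t_wall) - np.asarray(t_adiabatic))
```

This is why two sets of computations are required: one (with an adiabatic wall) supplies T_aw, the other (with a prescribed wall temperature) supplies q_w and T_w.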
Large-Scale Optimization for Bayesian Inference in Complex Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Willcox, Karen; Marzouk, Youssef
2013-11-12
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT--Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas--Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as ``reduce then sample'' and ``sample then reduce.'' In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.
Final Report: Large-Scale Optimization for Bayesian Inference in Complex Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ghattas, Omar
2013-10-15
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focuses on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. Our research is directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. Our efforts are integrated in the context of a challenging testbed problem that considers subsurface reacting flow and transport. The MIT component of the SAGUARO Project addresses the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.
Power spectrum, correlation function, and tests for luminosity bias in the CfA redshift survey
NASA Astrophysics Data System (ADS)
Park, Changbom; Vogeley, Michael S.; Geller, Margaret J.; Huchra, John P.
1994-08-01
We describe and apply a method for directly computing the power spectrum for the galaxy distribution in the extension of the Center for Astrophysics Redshift Survey. Tests show that our technique accurately reproduces the true power spectrum for k greater than 0.03 h Mpc-1. The dense sampling and large spatial coverage of this survey allow accurate measurement of the redshift-space power spectrum on scales from 5 to approximately 200 h-1 Mpc. The power spectrum has slope n approximately equal -2.1 on small scales (lambda less than or equal 25 h-1 Mpc) and n approximately -1.1 on scales 30 less than lambda less than 120 h-1 Mpc. On larger scales the power spectrum flattens somewhat, but we do not detect a turnover. Comparison with N-body simulations of cosmological models shows that an unbiased, open universe CDM model (OMEGA h = 0.2) and a nonzero cosmological constant (LCDM) model (OMEGA h = 0.24, lambda_0 = 0.6, b = 1.3) match the CfA power spectrum over the wavelength range we explore. The standard biased CDM model (OMEGA h = 0.5, b = 1.5) fails (99% significance level) because it has insufficient power on scales lambda greater than 30 h-1 Mpc. Biased CDM with a normalization that matches the Cosmic Microwave Background (CMB) anisotropy (OMEGA h = 0.5, b = 1.4, sigma8 (mass) = 1) has too much power on small scales to match the observed galaxy power spectrum. This model with b = 1 matches both Cosmic Background Explorer Satellite (COBE) and the small-scale power spectrum but has insufficient power on scales lambda approximately 100 h-1 Mpc. We derive a formula for the effect of small-scale peculiar velocities on the power spectrum and combine this formula with the linear-regime amplification described by Kaiser to compute an estimate of the real-space power spectrum.
Two tests reveal luminosity bias in the galaxy distribution: First, the amplitude of the power spectrum is approximately 40% larger for the brightest 50% of galaxies in volume-limited samples that have Mlim greater than M*. This bias in the power spectrum is independent of scale, consistent with the peaks-bias paradigm for galaxy formation. Second, the distribution of local density around galaxies shows that regions of moderate and high density contain both very bright (M less than M* = -19.2 + 5 log h) and fainter galaxies, but that voids preferentially harbor fainter galaxies (approximately 2 sigma significance level).
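The direct power spectrum estimation described above can be illustrated, in greatly simplified form, with a spherically averaged FFT estimate on a periodic grid. This toy sketch ignores everything that makes the survey case hard (the survey window, selection function, and shot-noise corrections) and is not the authors' estimator:

```python
import numpy as np

def power_spectrum(delta, box_size, n_bins=16):
    """Spherically averaged P(k) of a 3D overdensity field on a periodic cube."""
    n = delta.shape[0]
    volume = box_size**3
    delta_k = np.fft.fftn(delta) * (volume / n**3)      # physical Fourier amplitude
    power_3d = np.abs(delta_k)**2 / volume              # P(k) on the full grid
    k = 2 * np.pi * np.fft.fftfreq(n, d=box_size / n)   # wavenumbers along one axis
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k_mag = np.sqrt(kx**2 + ky**2 + kz**2).ravel()
    # Average |delta_k|^2 in spherical shells of |k| (DC mode is excluded).
    bins = np.linspace(k_mag[k_mag > 0].min(), k_mag.max(), n_bins + 1)
    which = np.digitize(k_mag, bins)
    p = power_3d.ravel()
    keep = [i for i in range(1, n_bins + 1) if np.any(which == i)]
    pk = np.array([p[which == i].mean() for i in keep])
    centers = np.array([0.5 * (bins[i - 1] + bins[i]) for i in keep])
    return centers, pk
```

A slope of n = -2.1 on small scales then corresponds to pk falling roughly as centers**(-2.1) over that range.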
RATIO_TOOL - SOFTWARE FOR COMPUTING IMAGE RATIOS
NASA Technical Reports Server (NTRS)
Yates, G. L.
1994-01-01
Geological studies analyze spectral data in order to gain information on surface materials. RATIO_TOOL is an interactive program for viewing and analyzing large multispectral image data sets that have been created by an imaging spectrometer. While the standard approach to classification of multispectral data is to match the spectrum for each input pixel against a library of known mineral spectra, RATIO_TOOL uses ratios of spectral bands in order to spot significant areas of interest within a multispectral image. Each image band can be viewed iteratively, or a selected image band of the data set can be requested and displayed. When the image ratios are computed, the result is displayed as a gray scale image. At this point a histogram option helps in viewing the distribution of values. A thresholding option can then be used to segment the ratio image result into two to four classes. The segmented image is then color coded to indicate threshold classes and displayed alongside the gray scale image. RATIO_TOOL is written in C language for Sun series computers running SunOS 4.0 and later. It requires the XView toolkit and the OpenWindows window manager (version 2.0 or 3.0). The XView toolkit is distributed with OpenWindows. A color monitor is also required. The standard distribution medium for RATIO_TOOL is a .25 inch streaming magnetic tape cartridge in UNIX tar format. An electronic copy of the documentation is included on the program media. RATIO_TOOL was developed in 1992 and is a copyrighted work with all copyright vested in NASA. Sun, SunOS, and OpenWindows are trademarks of Sun Microsystems, Inc. UNIX is a registered trademark of AT&T Bell Laboratories.
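The ratio-then-threshold workflow described above (compute a band ratio, inspect its distribution, then segment into a few classes) can be sketched in numpy. This is an illustrative analogue, not RATIO_TOOL's C implementation; the threshold values would come from the histogram inspection step.

```python
import numpy as np

def ratio_classify(band_a, band_b, thresholds):
    """Band ratio followed by threshold segmentation.
    thresholds: sorted cut points; pixels fall into len(thresholds)+1 classes."""
    band_a = np.asarray(band_a, dtype=float)
    band_b = np.asarray(band_b, dtype=float)
    # Guard against divide-by-zero: zero-denominator pixels become NaN.
    ratio = band_a / np.where(band_b == 0, np.nan, band_b)
    classes = np.digitize(ratio, thresholds)   # class index 0..len(thresholds)
    return ratio, classes
```

The `classes` array plays the role of the color-coded segmented image displayed alongside the gray scale ratio image.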
Image-Based Macro-Micro Finite Element Models of a Canine Femur with Implant Design Implications
NASA Astrophysics Data System (ADS)
Ghosh, Somnath; Krishnan, Ganapathi; Dyce, Jonathan
2006-06-01
In this paper, a comprehensive model of a bone-cement-implant assembly is developed for a canine cemented femoral prosthesis system. Various steps in this development entail profiling the canine femur contours by computed tomography (CT) scanning, computer aided design (CAD) reconstruction of the canine femur from CT images, CAD modeling of the implant from implant blueprints and CAD modeling of the interface cement. Finite element analysis of the macroscopic assembly is conducted for stress analysis in individual components of the system, accounting for variation in density and material properties in the porous bone material. A sensitivity analysis is conducted with the macroscopic model to investigate the effect of implant design variables on the stress distribution in the assembly. Subsequently, rigorous microstructural analysis of the bone incorporating the morphological intricacies is conducted. Various steps in this development include acquisition of the bone microstructural data from histological serial sectioning, stacking of sections to obtain 3D renderings of void distributions, microstructural characterization and determination of properties and, finally, microstructural stress analysis using a 3D Voronoi cell finite element method. Generation of the simulated microstructure and analysis by the 3D Voronoi cell finite element model provides a new way of modeling complex microstructures and correlating to morphological characteristics. An inverse calculation of the material parameters of bone by combining macroscopic experiments with microstructural characterization and analysis provides a new approach to evaluating properties without having to do experiments at this scale. Finally, the microstructural stresses in the femur are computed using the 3D VCFEM to study the stress distribution at the scale of the bone porosity. Significant difference is observed between the macroscopic stresses and the peak microscopic stresses at different locations.
Coupled basin-scale water resource models for arid and semiarid regions
NASA Astrophysics Data System (ADS)
Winter, C.; Springer, E.; Costigan, K.; Fasel, P.; Mniewski, S.; Zyvoloski, G.
2003-04-01
Managers of semi-arid and arid water resources must allocate increasingly variable surface sources and limited groundwater resources to growing demands. This challenge is leading to a new generation of detailed computational models that link multiple interacting sources and demands. We will discuss a new computational model of arid region hydrology that we are parameterizing for the upper Rio Grande Basin of the United States. The model consists of linked components for the atmosphere (the Regional Atmospheric Modeling System, RAMS), surface hydrology (the Los Alamos Distributed Hydrologic System, LADHS), and groundwater (the Finite Element Heat and Mass code, FEHM), and the couplings between them. The model runs under the Parallel Application WorkSpace software developed at Los Alamos for applications running on large distributed memory computers. RAMS simulates regional meteorology coupled to global climate data on the one hand and land surface hydrology on the other. LADHS generates runoff by infiltration or saturation excess mechanisms, as well as interception, evapotranspiration, and snow accumulation and melt. FEHM simulates variably saturated flow and heat transport in three dimensions. A key issue is to increase the components’ spatial and temporal resolution to account for changes in topography and other rapidly changing variables that affect results such as soil moisture distribution or groundwater recharge. Thus, RAMS’ smallest grid is 5 km on a side, LADHS uses 100 m spacing, while FEHM concentrates processing on key volumes by means of an unstructured grid. Couplings within our model are based on new scaling methods that link groundwater-groundwater systems and streams to aquifers, and we are developing evapotranspiration methods based on detailed calculations of latent heat and vegetative cover. Simulations of precipitation and soil moisture for the 1992-93 El Niño year will be used to demonstrate the approach and suggest further needs.
NGScloud: RNA-seq analysis of non-model species using cloud computing.
Mora-Márquez, Fernando; Vázquez-Poletti, José Luis; López de Heredia, Unai
2018-05-03
RNA-seq analysis usually requires large computing infrastructures. NGScloud is a bioinformatic system developed to analyze RNA-seq data using Amazon's cloud computing services, which provide access to ad hoc computing infrastructure scaled to the complexity of the experiment, so that its cost and runtime can be optimized. The application provides a user-friendly front-end to operate Amazon's hardware resources and to control a workflow of RNA-seq analysis oriented to non-model species, incorporating the cluster concept, which allows parallel runs of common RNA-seq analysis programs in several virtual machines for faster analysis. NGScloud is freely available at https://github.com/GGFHF/NGScloud/. A manual detailing installation and how-to-use instructions is available with the distribution. unai.lopezdeheredia@upm.es.
NASA Astrophysics Data System (ADS)
Acedo, L.; Villanueva-Oller, J.; Moraño, J. A.; Villanueva, R.-J.
2013-01-01
The Berkeley Open Infrastructure for Network Computing (BOINC) has become the standard open source solution for grid computing on the Internet. Volunteers use their computers to complete a small part of the task assigned by a dedicated server. We have developed a BOINC project called Neurona@Home whose objective is to simulate a cellular automata random network with, at least, one million neurons. We consider a cellular automata version of the integrate-and-fire model in which excitatory and inhibitory nodes can activate or deactivate neighbor nodes according to a set of probabilistic rules. Our aim is to determine the phase diagram of the model and its behaviour, and to compare it with the electroencephalographic signals measured in real brains.
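The abstract does not give the exact update rule; a minimal sketch of one probabilistic excitatory/inhibitory rule of this kind (network size, neighbor count, threshold, and probabilities all hypothetical) could look like:

```python
import random

def step(states, adj, sign, p_fire=0.9, threshold=1.0):
    """One synchronous update of a probabilistic integrate-and-fire
    cellular automaton: each active node sends +1 (excitatory) or -1
    (inhibitory) to its neighbors; nodes whose summed input reaches
    the threshold fire with probability p_fire."""
    n = len(states)
    inputs = [0.0] * n
    for i in range(n):
        if states[i] == 1:                      # active node
            for j in adj[i]:
                inputs[j] += sign[i]            # +1 excitatory, -1 inhibitory
    return [1 if inputs[j] >= threshold and random.random() < p_fire else 0
            for j in range(n)]

# Tiny random network, far below the project's million-neuron target.
random.seed(0)
n = 100
adj = [[random.randrange(n) for _ in range(4)] for _ in range(n)]  # 4 random neighbors
sign = [1 if random.random() < 0.8 else -1 for _ in range(n)]      # 80% excitatory
states = [1 if random.random() < 0.1 else 0 for _ in range(n)]     # sparse initial activity
for _ in range(10):
    states = step(states, adj, sign)
activity = sum(states) / n   # fraction of active nodes, one point of a phase diagram
```

Sweeping the excitatory fraction and firing probability over many such runs is what yields a phase diagram, and is exactly the kind of embarrassingly parallel workload BOINC distributes well.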
Mathias, Susan D; Gao, Sue K; Rutstein, Mark; Snyder, Claire F; Wu, Albert W; Cella, David
2009-02-01
Interpretation of data from health-related quality of life (HRQoL) questionnaires can be enhanced with the availability of minimally important difference (MID) estimates. This information will aid clinicians in interpreting HRQoL differences within patients over time and between treatment groups. The Immune Thrombocytopenic Purpura (ITP)-Patient Assessment Questionnaire (PAQ) is the only comprehensive HRQoL questionnaire available for adults with ITP. Forty centers from within the US and Europe enrolled ITP patients into one of two multicenter, randomized, placebo-controlled, double-blind, 6-month, phase III clinical trials of romiplostim. Patients enrolled in these studies self-administered the ITP-PAQ and two items assessing global change (anchors) at baseline and weeks 4, 12, and 24. Using data from the ITP-PAQ and these two anchors, an anchor-based estimate was computed and combined with the standard error of measurement and standard deviation to compute a distribution-based estimate in order to provide an MID range for each of the 11 scales of the ITP-PAQ. A total of 125 patients participated in these clinical trials and provided data for use in these analyses. Combining results from anchor- and distribution-based approaches, MID values were computed for 9 of the 11 scales. MIDs ranged from 8 to 12 points for Symptoms, Bother, Psychological, Overall QOL, Social Activity, Menstrual Symptoms, and Fertility, while the range was 10 to 15 points for the Fatigue and Activity scales of the ITP-PAQ. These estimates, while slightly higher than other published MID estimates, were consistent with moderate effect sizes. These MID estimates will serve as a useful tool to researchers and clinicians using the ITP-PAQ, providing guidance for interpretation of baseline scores as well as changes in ITP-PAQ scores over time. Additional work should be done to finalize these initial estimates using more appropriate anchors that correlate more highly with the ITP-PAQ scales.
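As a rough illustration of how anchor-based and distribution-based MID estimates are combined, here is a sketch with entirely invented scores and a hypothetical reliability value; the actual trial analysis was more involved:

```python
import statistics

def mid_estimates(baseline, followup, anchor_change, reliability=0.85):
    """Anchor-based MID: mean score change among patients reporting
    minimal improvement on the global-change anchor. Distribution-based
    MIDs: half the baseline SD, and one standard error of measurement
    (SEM = SD * sqrt(1 - reliability), reliability value hypothetical)."""
    changes = [f - b for b, f in zip(baseline, followup)]
    minimal = [c for c, a in zip(changes, anchor_change) if a == "a little better"]
    anchor_mid = statistics.mean(minimal) if minimal else None
    sd = statistics.stdev(baseline)
    sem = sd * (1 - reliability) ** 0.5
    return anchor_mid, 0.5 * sd, sem

# Invented scale scores for six patients (0-100 scale).
baseline = [42.0, 55.0, 61.0, 48.0, 66.0, 53.0]
followup = [51.0, 57.0, 72.0, 56.0, 69.0, 64.0]
anchor = ["a little better", "no change", "a little better",
          "a little better", "no change", "a little better"]
anchor_mid, half_sd, sem = mid_estimates(baseline, followup, anchor)
```

The range spanned by these three numbers is then reported as the MID range for the scale.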
Latency Hiding in Dynamic Partitioning and Load Balancing of Grid Computing Applications
NASA Technical Reports Server (NTRS)
Das, Sajal K.; Harvey, Daniel J.; Biswas, Rupak
2001-01-01
The Information Power Grid (IPG) concept developed by NASA is aimed at providing a metacomputing platform for large-scale distributed computations, hiding the intricacies of a highly heterogeneous environment while maintaining adequate security. In this paper, we propose a latency-tolerant partitioning scheme that dynamically balances processor workloads on the IPG, and minimizes data movement and runtime communication. By simulating an unsteady adaptive mesh application on a wide area network, we study the performance of our load balancer under the Globus environment. The number of IPG nodes, the number of processors per node, and the interconnect speeds are parameterized to derive conditions under which the IPG would be suitable for parallel distributed processing of such applications. Experimental results demonstrate that effective solutions are achieved when the IPG nodes are connected by a high-speed asynchronous interconnection network.
A Hybrid Soft-computing Method for Image Analysis of Digital Plantar Scanners.
Razjouyan, Javad; Khayat, Omid; Siahi, Mehdi; Mansouri, Ali Alizadeh
2013-01-01
Digital foot scanners have been developed in recent years to provide anthropometrists with a digital image of the insole together with pressure distribution and anthropometric information. In this paper, a hybrid algorithm combining the gray level spatial correlation (GLSC) histogram and Shanbag entropy is presented for the analysis of scanned foot images. An evolutionary algorithm is also employed to find the optimum parameters of GLSC and the transform function of the membership values. The resulting thresholded binary images then undergo anthropometric measurements, taking into account the scale factor from pixel size to metric scale. The proposed method is finally applied to plantar images obtained by scanning the feet of randomly selected subjects with a foot scanner system, the experimental setup described in the paper. Computation time and the effects of the GLSC parameters are investigated in the simulation results.
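The GLSC/Shanbag formulation itself is not reproduced here; as a stand-in, a sketch of a closely related entropy-maximizing threshold (Kapur's criterion) applied to a gray-level histogram shows the general idea of entropy-based foot-image segmentation:

```python
import math

def kapur_threshold(hist):
    """Pick the gray level that maximizes the sum of the Shannon
    entropies of the background and foreground gray-level
    distributions (Kapur's criterion, a simple stand-in for the
    paper's GLSC/Shanbag formulation)."""
    total = sum(hist)
    p = [h / total for h in hist]
    best_t, best_h = 0, -1.0
    for t in range(1, len(p)):
        w0 = sum(p[:t])
        w1 = 1.0 - w0
        if w0 == 0 or w1 == 0:
            continue
        h0 = -sum(q / w0 * math.log(q / w0) for q in p[:t] if q > 0)
        h1 = -sum(q / w1 * math.log(q / w1) for q in p[t:] if q > 0)
        if h0 + h1 > best_h:
            best_h, best_t = h0 + h1, t
    return best_t

# Synthetic bimodal histogram: dark background around level 50,
# bright insole region around level 200.
hist = [0] * 256
for g in range(45, 55):
    hist[g] = 10
for g in range(195, 205):
    hist[g] = 10
t = kapur_threshold(hist)   # lands between the two modes
```

Pixels above the threshold form the binary insole mask on which the anthropometric measurements are then made.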
High performance computation of radiative transfer equation using the finite element method
NASA Astrophysics Data System (ADS)
Badri, M. A.; Jolivet, P.; Rousseau, B.; Favennec, Y.
2018-05-01
This article deals with an efficient strategy for numerically simulating radiative transfer phenomena using distributed computing. The finite element method alongside the discrete ordinate method is used for spatio-angular discretization of the monochromatic steady-state radiative transfer equation in anisotropically scattering media. Two very different methods of parallelization, angular and spatial decomposition methods, are presented. To do so, the finite element method is used in a vectorial way. A detailed comparison of scalability, performance, and efficiency on thousands of processors is established for two- and three-dimensional heterogeneous test cases. Timings show that both algorithms scale well when using proper preconditioners. It is also observed that our angular decomposition scheme outperforms our domain decomposition method. Overall, we perform numerical simulations at scales that were previously unattainable by standard radiative transfer equation solvers.
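Angular decomposition is attractive because, within a source iteration, the sweep for each discrete ordinate is independent of the others. A minimal 1-D slab sketch (isotropic scattering, uniform source, vacuum boundaries; all parameters hypothetical, and 1-D finite differences standing in for the paper's finite elements):

```python
import numpy as np

def source_iteration(nx=50, n_ang=8, sigma_t=1.0, sigma_s=0.5,
                     L=1.0, q=1.0, iters=200):
    """1-D discrete-ordinates transport solved by source iteration:
    lag the scattering source, then sweep each ordinate direction
    independently (the loop over `mus` is what angular decomposition
    parallelizes), and close with an angular quadrature."""
    dx = L / nx
    mus, wts = np.polynomial.legendre.leggauss(n_ang)   # S_N quadrature set
    psi = np.zeros((n_ang, nx))                          # angular flux
    phi = np.zeros(nx)                                   # scalar flux
    for _ in range(iters):
        src = 0.5 * (sigma_s * phi + q)                  # isotropic source per unit mu
        for a, mu in enumerate(mus):                     # independent per-angle sweeps
            inc = 0.0                                    # vacuum boundary
            rng = range(nx) if mu > 0 else range(nx - 1, -1, -1)
            for i in rng:                                # implicit upwind step
                inc = (src[i] * dx + abs(mu) * inc) / (abs(mu) + sigma_t * dx)
                psi[a, i] = inc
        phi = wts @ psi                                  # quadrature over angle
    return phi

phi = source_iteration()   # scalar flux peaks at the slab center
```

With scattering weaker than total extinction (sigma_s < sigma_t) the iteration converges geometrically, and the per-angle sweeps can be farmed out to separate processes with only the scalar flux exchanged between iterations.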
Pronk, Sander; Pouya, Iman; Lundborg, Magnus; Rotskoff, Grant; Wesén, Björn; Kasson, Peter M; Lindahl, Erik
2015-06-09
Computational chemistry and other simulation fields are critically dependent on computing resources, but few problems scale efficiently to the hundreds of thousands of processors available in current supercomputers-particularly for molecular dynamics. This has turned into a bottleneck as new hardware generations primarily provide more processing units rather than making individual units much faster, which simulation applications are addressing by increasingly focusing on sampling with algorithms such as free-energy perturbation, Markov state modeling, metadynamics, or milestoning. All these rely on combining results from multiple simulations into a single observation. They are potentially powerful approaches that aim to predict experimental observables directly, but this comes at the expense of added complexity in selecting sampling strategies and keeping track of dozens to thousands of simulations and their dependencies. Here, we describe how the distributed execution framework Copernicus allows the expression of such algorithms in generic workflows: dataflow programs. Because dataflow algorithms explicitly state dependencies of each constituent part, algorithms only need to be described on conceptual level, after which the execution is maximally parallel. The fully automated execution facilitates the optimization of these algorithms with adaptive sampling, where undersampled regions are automatically detected and targeted without user intervention. We show how several such algorithms can be formulated for computational chemistry problems, and how they are executed efficiently with many loosely coupled simulations using either distributed or parallel resources with Copernicus.
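The key property of a dataflow program is that explicitly stated dependencies let the runtime execute all independent parts in parallel. A toy executor illustrating that idea (this is an illustration of the concept, not the Copernicus API):

```python
from concurrent.futures import ThreadPoolExecutor

def run_dataflow(tasks, deps):
    """Execute a dataflow graph: a task runs as soon as all of its
    dependencies have produced results, and tasks with no mutual
    dependency run concurrently. `tasks` maps a name to a function
    taking the list of its dependencies' results; `deps` maps a name
    to the list of names it depends on."""
    results, pending = {}, dict(deps)
    with ThreadPoolExecutor() as pool:
        while pending:
            ready = [t for t, d in pending.items() if all(x in results for x in d)]
            if not ready:
                raise ValueError("cyclic dependencies")
            futures = {t: pool.submit(tasks[t], [results[x] for x in pending[t]])
                       for t in ready}
            for t, f in futures.items():
                results[t] = f.result()
                del pending[t]
    return results

# Two independent "simulations" feeding one combining step,
# e.g. averaging observables from loosely coupled runs.
tasks = {
    "sim_a": lambda _: 1.0,
    "sim_b": lambda _: 2.0,
    "combine": lambda xs: sum(xs) / len(xs),
}
deps = {"sim_a": [], "sim_b": [], "combine": ["sim_a", "sim_b"]}
out = run_dataflow(tasks, deps)
```

In an adaptive-sampling setting, the combining step would inspect the results and emit new simulation tasks into the graph, which is the pattern the paper automates.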
Jung, Yousung; Shao, Yihan; Head-Gordon, Martin
2007-09-01
The scaled opposite spin Møller-Plesset method (SOS-MP2) is an economical way of obtaining correlation energies that are computationally cheaper, and yet, in a statistical sense, of higher quality than standard MP2 theory, by introducing one empirical parameter. But SOS-MP2 still has a fourth-order scaling step that makes the method inapplicable to very large molecular systems. We reduce the scaling of SOS-MP2 by exploiting the sparsity of expansion coefficients and local integral matrices, by performing local auxiliary basis expansions for the occupied-virtual product distributions. To exploit sparsity of 3-index local quantities, we use a blocking scheme in which entire zero-rows and columns, for a given third global index, are deleted by comparison against a numerical threshold. This approach minimizes sparse matrix book-keeping overhead, and also provides sufficiently large submatrices after blocking, to allow efficient matrix-matrix multiplies. The resulting algorithm is formally cubic scaling, and requires only moderate computational resources (quadratic memory and disk space) and, in favorable cases, is shown to yield effective quadratic scaling behavior in the size regime we can apply it to. Errors associated with local fitting using the attenuated Coulomb metric and numerical thresholds in the blocking procedure are found to be insignificant in terms of the predicted relative energies. A diverse set of test calculations shows that the size of system where significant computational savings can be achieved depends strongly on the dimensionality of the system, and the extent of localizability of the molecular orbitals. Copyright 2007 Wiley Periodicals, Inc.
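The blocking idea, deleting entire zero rows and columns of a 3-index quantity for each value of the third index so that dense submatrices remain for efficient matrix multiplies, can be sketched as follows (array names, shapes, and the threshold are hypothetical):

```python
import numpy as np

def block_by_sparsity(B, tol=1e-8):
    """For a 3-index quantity B[P, i, a] (auxiliary index P, local
    occupied/virtual indices i, a), delete every row and column whose
    entries all fall below `tol` for that P, keeping the surviving
    index lists so results can be scattered back. This minimizes
    sparse-matrix bookkeeping while leaving dense blocks for BLAS."""
    blocks = []
    for P in range(B.shape[0]):
        M = B[P]
        rows = np.where(np.abs(M).max(axis=1) > tol)[0]
        cols = np.where(np.abs(M).max(axis=0) > tol)[0]
        blocks.append((rows, cols, M[np.ix_(rows, cols)]))
    return blocks

# A toy sparse 3-index array: only two nonzero entries for P = 0.
B = np.zeros((2, 4, 5))
B[0, 1, 2] = 1.0
B[0, 3, 4] = 2.0
blocks = block_by_sparsity(B)
rows, cols, sub = blocks[0]   # dense 2x2 submatrix survives blocking
```

Downstream contractions then run on `sub` alone, which is where the reduction from fourth-order to effectively quadratic or cubic cost comes from when localization makes the blocks small.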
NASA Astrophysics Data System (ADS)
Harfst, S.; Portegies Zwart, S.; McMillan, S.
2008-12-01
We present MUSE, a software framework for combining existing computational tools from different astrophysical domains into a single multi-physics, multi-scale application. MUSE facilitates the coupling of existing codes written in different languages by providing inter-language tools and by specifying an interface between each module and the framework that represents a balance between generality and computational efficiency. This approach allows scientists to use combinations of codes to solve highly-coupled problems without the need to write new codes for other domains or significantly alter their existing codes. MUSE currently incorporates the domains of stellar dynamics, stellar evolution and stellar hydrodynamics for studying generalized stellar systems. We have now reached a "Noah's Ark" milestone, with (at least) two available numerical solvers for each domain. MUSE can treat multi-scale and multi-physics systems in which the time- and size-scales are well separated, like simulating the evolution of planetary systems, small stellar associations, dense stellar clusters, galaxies and galactic nuclei. In this paper we describe two examples calculated using MUSE: the merger of two galaxies and an N-body simulation with live stellar evolution. In addition, we demonstrate an implementation of MUSE on a distributed computer which may also include special-purpose hardware, such as GRAPEs or GPUs, to accelerate computations. The current MUSE code base is publicly available as open source at http://muse.li.
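A MUSE-style module interface can be illustrated with a toy operator-split coupler; the class names and the physics below are placeholders to show the pattern, not the actual MUSE API:

```python
class Module:
    """Minimal stand-in for a framework module interface: every domain
    code, whatever its implementation language, is wrapped so the
    framework only needs a common evolve(t_end) entry point."""
    def evolve(self, t_end):
        raise NotImplementedError

class StellarDynamics(Module):
    def __init__(self):
        self.t = 0.0
    def evolve(self, t_end):
        self.t = t_end              # placeholder for an N-body integration step

class StellarEvolution(Module):
    def __init__(self):
        self.t = 0.0
        self.mass = 1.0             # solar masses
    def evolve(self, t_end):
        # Toy mass-loss law standing in for a stellar evolution code.
        self.mass *= 0.99 ** (t_end - self.t)
        self.t = t_end

def couple(modules, t_final, dt):
    """Operator-split coupling: advance every module over the same
    interval, then (in a real framework) exchange state between them,
    e.g. feed updated stellar masses back into the dynamics."""
    t = 0.0
    while t < t_final:
        t = min(t + dt, t_final)
        for m in modules:
            m.evolve(t)

dyn, evo = StellarDynamics(), StellarEvolution()
couple([dyn, evo], t_final=10.0, dt=1.0)
```

The coupling interval `dt` is the knob that the "well separated time-scales" assumption justifies: it can be much larger than either module's internal timestep.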
ERIC Educational Resources Information Center
Dowell, Nia M. M.; Graesser, Arthur C.; Cai, Zhiqiang
2016-01-01
The goal of this article is to preserve and distribute the information presented at the LASI (2014) workshop on Coh-Metrix, a theoretically grounded, computational linguistics facility that analyzes texts on multiple levels of language and discourse. The workshop focused on the utility of Coh-Metrix in discourse theory and educational practice. We…
Bibliography--Unclassified Technical Reports, Special Reports, and Technical Notes: FY 1982.
1982-11-01
in each category are listed in chronological order under seven areas: manpower management, personnel administration, organization management, education...7633). Technical reports listed that have unlimited distribution can also be obtained from the National Technical Information Service, 5285 Port Royal...simulations of manpower systems. This research exploits the technology of computer-managed large-scale data bases. PERSONNEL ADMINISTRATION The personnel
NASA Astrophysics Data System (ADS)
Harrison, T. W.; Polagye, B. L.
2016-02-01
Coastal ecosystems are characterized by spatially and temporally varying hydrodynamics. In marine renewable energy applications, these variations strongly influence project economics, and in oceanographic studies they impact the accuracy of biological transport and pollutant dispersion models. While stationary point or profile measurements are relatively straightforward, the spatial representativeness of point measurements can be poor due to strong gradients. Moving platforms, such as AUVs or surface vessels, offer better coverage, but suffer from energetic constraints (AUVs) and resolvable scales (vessels). A system of sub-surface, drifting sensor packages is being developed to provide spatially distributed, synoptic data sets of coastal hydrodynamics with meter-scale resolution over a regional extent of a kilometer. Computational investigation has informed system parameters such as drifter size and shape, necessary position accuracy, number of drifters, and deployment methods. A hydrodynamic domain with complex flow features was created using a computational fluid dynamics code. A simple model of drifter dynamics propagates the drifters through the domain in post-processing. System parameters are evaluated relative to their ability to accurately recreate the domain hydrodynamics. Implications of these results for an inexpensive, depth-controlled Lagrangian drifter system are presented.
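Propagating drifters through a precomputed flow field in post-processing reduces to numerical integration of passive tracers. A sketch with a hypothetical analytic vortex standing in for the CFD velocity field:

```python
import math

def velocity(x, y, t):
    """Hypothetical stand-in for the CFD flow field: a steady solid-body
    vortex about the origin (angular speed 1)."""
    return -y, x

def advect(drifters, t_end, dt=0.01):
    """Propagate passive drifters with a midpoint (RK2) step, the kind
    of simple dynamics model used to move drifters through a stored
    CFD domain in post-processing."""
    t = 0.0
    while t < t_end:
        new = []
        for x, y in drifters:
            u1, v1 = velocity(x, y, t)
            u2, v2 = velocity(x + 0.5 * dt * u1, y + 0.5 * dt * v1, t + 0.5 * dt)
            new.append((x + dt * u2, y + dt * v2))
        drifters = new
        t += dt
    return drifters

# One drifter started at (1, 0): after half a revolution of the vortex
# it should sit near (-1, 0), still close to the unit circle.
final = advect([(1.0, 0.0)], t_end=math.pi)
```

Comparing statistics of many such trajectories against the known domain hydrodynamics is how the drifter count, position accuracy, and deployment pattern are evaluated.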
Siretskiy, Alexey; Sundqvist, Tore; Voznesenskiy, Mikhail; Spjuth, Ola
2015-01-01
New high-throughput technologies, such as massively parallel sequencing, have transformed the life sciences into a data-intensive field. The most common e-infrastructure for analyzing this data consists of batch systems that are based on high-performance computing resources; however, the bioinformatics software that is built on this platform does not scale well in the general case. Recently, the Hadoop platform has emerged as an interesting option to address the challenges of increasingly large datasets with distributed storage, distributed processing, built-in data locality, fault tolerance, and an appealing programming methodology. In this work we introduce metrics and report on a quantitative comparison between Hadoop and a single node of conventional high-performance computing resources for the tasks of short read mapping and variant calling. We calculate efficiency as a function of data size and observe that the Hadoop platform is more efficient for biologically relevant data sizes in terms of computing hours for both split and un-split data files. We also quantify the advantages of the data locality provided by Hadoop for NGS problems, and show that a classical architecture with network-attached storage will not scale when computing resources increase in numbers. Measurements were performed using ten datasets of different sizes, up to 100 gigabases, using the pipeline implemented in Crossbow. To make a fair comparison, we implemented an improved preprocessor for Hadoop with better performance for splittable data files. For improved usability, we implemented a graphical user interface for Crossbow in a private cloud environment using the CloudGene platform. All of the code and data in this study are freely available as open source in public repositories. 
From our experiments we can conclude that the improved Hadoop pipeline scales better than the same pipeline on high-performance computing resources, we also conclude that Hadoop is an economically viable option for the common data sizes that are currently used in massively parallel sequencing. Given that datasets are expected to increase over time, Hadoop is a framework that we envision will have an increasingly important role in future biological data analysis.
Komini Babu, S.; Chung, H. T.; Zelenay, P.; ...
2015-09-14
This manuscript presents micro-scale experimental diagnostics and nano-scale resolution X-ray imaging applied to the study of proton conduction in non-precious metal catalyst (NPMC) fuel cell cathodes. NPMCs have the potential to reduce the cost of the fuel cell for multiple applications. However, NPMC electrodes are inherently thick compared to conventional Pt/C electrodes due to their lower volumetric activity. Thus, the electric potential drop through the Nafion across the electrode thickness can yield significant performance loss. Ionomer distributions in the NPMC electrodes with different ionomer loading are extracted from morphological data using nanoscale X-ray computed tomography (nano-XCT) imaging of the cathode. Microstructured electrode scaffold (MES) diagnostics are used to measure the electrolyte potential at discrete points across the thickness of the catalyst layer. Using this apparatus, the electrolyte potential drop, the through-thickness reaction distribution, and the proton conductivity are measured and correlated with the corresponding Nafion morphology and cell performance.
Structural, vibrational spectroscopic and quantum chemical studies on indole-3-carboxaldehyde
NASA Astrophysics Data System (ADS)
Premkumar, R.; Asath, R. Mohamed; Mathavan, T.; Benial, A. Milton Franklin
2017-05-01
The potential energy surface (PES) scan was performed for indole-3-carboxaldehyde (ICA) and the most stable optimized conformer was predicted using the DFT/B3LYP method with the 6-31G basis set. The vibrational frequencies of ICA were theoretically calculated by the DFT/B3LYP method with the cc-pVTZ basis set using the Gaussian 09 program. The vibrational spectra were experimentally recorded by Fourier transform-infrared (FT-IR) and Fourier transform-Raman (FT-Raman) spectrometers. The computed vibrational frequencies were scaled by scaling factors to yield good agreement with the observed vibrational frequencies. The theoretically calculated and experimentally observed vibrational frequencies were assigned on the basis of a potential energy distribution (PED) calculation using the VEDA 4.0 program. The molecular interactions, stability and intramolecular charge transfer of ICA were studied using frontier molecular orbital (FMO) analysis, and the Mulliken atomic charge distribution was computed to examine the distribution of atomic charges. The presence of intramolecular charge transfer was further confirmed using natural bond orbital (NBO) analysis.
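Uniform frequency scaling is a one-line operation; the sketch below uses an invented scale factor of 0.965 (a value typical in magnitude for B3LYP harmonic frequencies, not the one used in the paper) and invented wavenumbers:

```python
def scale_frequencies(computed, factor=0.965):
    """Apply a uniform scaling factor to computed harmonic wavenumbers
    to bring them closer to observed (anharmonic) fundamentals."""
    return [f * factor for f in computed]

def rms_deviation(calc, obs):
    """Root-mean-square deviation between calculated and observed bands."""
    return (sum((c - o) ** 2 for c, o in zip(calc, obs)) / len(calc)) ** 0.5

computed = [1750.0, 1620.0, 3200.0]   # hypothetical harmonic wavenumbers, cm^-1
observed = [1688.0, 1565.0, 3090.0]   # hypothetical FT-IR band positions, cm^-1
scaled = scale_frequencies(computed)
# Scaling should reduce the RMS deviation from experiment.
```

In practice the scale factor is either taken from benchmark compilations for the functional/basis combination or fitted to the observed spectrum.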
NASA Astrophysics Data System (ADS)
Vucinic, Dean; Deen, Danny; Oanta, Emil; Batarilo, Zvonimir; Lacor, Chris
This paper focuses on visualization and manipulation of graphical content in distributed network environments. The developed graphical middleware and 3D desktop prototypes were specialized for situational awareness. This research was done in the LArge Scale COllaborative decision support Technology (LASCOT) project, which explored and combined software technologies to support a human-centred decision support system for crisis management (earthquake, tsunami, flooding, airplane or oil-tanker incidents, chemical, radio-active or other pollutants spreading, etc.). The performed state-of-the-art review did not identify any publicly available large scale distributed application of this kind. Existing proprietary solutions rely on conventional technologies and 2D representations. Our challenge was to apply the "latest" available technologies, such as Java3D, X3D and SOAP, compatible with average computer graphics hardware. The selected technologies are integrated and we demonstrate: the flow of data, which originates from heterogeneous data sources; interoperability across different operating systems; and 3D visual representations to enhance the end-users' interactions.
Radiation breakage of DNA: a model based on random-walk chromatin structure
NASA Technical Reports Server (NTRS)
Ponomarev, A. L.; Sachs, R. K.
2001-01-01
Monte Carlo computer software, called DNAbreak, has recently been developed to analyze observed non-random clustering of DNA double strand breaks in chromatin after exposure to densely ionizing radiation. The software models coarse-grained configurations of chromatin and radiation tracks, small-scale details being suppressed in order to obtain statistical results for larger scales, up to the size of a whole chromosome. We here give an analytic counterpart of the numerical model, useful for benchmarks, for elucidating the numerical results, for analyzing the assumptions of a more general but less mechanistic "randomly-located-clusters" formalism, and, potentially, for speeding up the calculations. The equations characterize multi-track DNA fragment-size distributions in terms of one-track action; an important step in extrapolating high-dose laboratory results to the much lower doses of main interest in environmental or occupational risk estimation. The approach can utilize the experimental information on DNA fragment-size distributions to draw inferences about large-scale chromatin geometry during cell-cycle interphase.
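As a baseline against which non-random clustering of breaks is detected, the simplest random-breakage model places breaks uniformly along the chromosome and reads off the fragment-size distribution; a sketch (lengths and break counts arbitrary):

```python
import random

def fragment_sizes(genome_length, n_breaks, rng):
    """Place double-strand breaks uniformly at random along a chromosome
    (the 'random breakage' null model) and return the resulting fragment
    sizes; n breaks always yield n + 1 fragments."""
    breaks = sorted(rng.uniform(0, genome_length) for _ in range(n_breaks))
    edges = [0.0] + breaks + [genome_length]
    return [b - a for a, b in zip(edges, edges[1:])]

rng = random.Random(1)
sizes = fragment_sizes(100.0, n_breaks=9, rng=rng)   # 10 fragments summing to 100
```

Densely ionizing radiation deposits breaks in correlated clusters along each track rather than uniformly, so the measured fragment-size distribution is enriched in small fragments relative to this null model; quantifying that excess is what connects the data to chromatin geometry.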
NASA Technical Reports Server (NTRS)
Siemers, P. M., III; Henry, M. W.
1986-01-01
Pressure distribution test data obtained on a 0.10-scale model of the forward fuselage of the Space Shuttle Orbiter are presented without analysis. The tests were completed in the AEDC 16T Propulsion Wind Tunnel. The 0.10-scale model was tested at angles of attack from -2 deg to 18 deg and angles of sideslip from -6 deg to 6 deg at Mach numbers from 0.25 to 1.5. The tests were conducted in support of the development of the Shuttle Entry Air Data System (SEADS). In addition to modeling the 20 SEADS orifices, the wind-tunnel model was also instrumented with orifices to match Development Flight Instrumentation (DFI) port locations that existed on the Space Shuttle Orbiter Columbia (OV-102) during the Orbiter Flight Test program. This DFI simulation has provided a means of comparisons between reentry flight pressure data and wind-tunnel and computational data.
Fluctuations of the gluon distribution from the small- x effective action
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dumitru, Adrian; Skokov, Vladimir
2017-09-29
The computation of observables in high-energy QCD involves an average over stochastic semiclassical small-x gluon fields. The weight of various configurations is determined by the effective action. We introduce a method to study fluctuations of observables, functionals of the small-x fields, which does not explicitly involve dipoles. We integrate out those fluctuations of the semiclassical gluon field under which a given observable is invariant. Thereby we obtain the effective potential for that observable describing its fluctuations about the average. Here, we determine explicitly the effective potential for the covariant gauge gluon distribution both for the McLerran-Venugopalan (MV) model and for a (nonlocal) Gaussian approximation for the small-x effective action. This provides insight into the correlation of fluctuations of the number of hard gluons versus their typical transverse momentum. We find that the spectral shape of the fluctuations of the gluon distribution is fundamentally different in the MV model, where there is a pileup of gluons near the saturation scale, versus the solution of the small-x JIMWLK renormalization group, which generates essentially scale-invariant fluctuations above the absorptive boundary set by the saturation scale.
Probabilistic Simulation of Multi-Scale Composite Behavior
NASA Technical Reports Server (NTRS)
Chamis, Christos C.
2012-01-01
A methodology is developed to computationally assess the non-deterministic composite response at all composite scales (from micro to structural) due to the uncertainties in the constituent (fiber and matrix) properties, in the fabrication process and in structural variables (primitive variables). The methodology is computationally efficient for simulating the probability distributions of composite behavior, such as material properties, laminate and structural responses. By-products of the methodology are probabilistic sensitivities of the composite primitive variables. The methodology has been implemented into the computer codes PICAN (Probabilistic Integrated Composite ANalyzer) and IPACS (Integrated Probabilistic Assessment of Composite Structures). The accuracy and efficiency of this methodology are demonstrated by simulating the uncertainties in typical composite laminates and comparing the results with the Monte Carlo simulation method. Available experimental data of composite laminate behavior at all scales fall within the scatters predicted by PICAN. Multi-scaling is extended to simulate probabilistic thermo-mechanical fatigue and to simulate the probabilistic design of a composite redome in order to illustrate its versatility. Results show that probabilistic fatigue can be simulated for different temperature amplitudes and for different cyclic stress magnitudes. Results also show that laminate configurations can be selected to increase the redome reliability by several orders of magnitude without increasing the laminate thickness--a unique feature of structural composites. The age of the cited reference indicates that nothing fundamental has been done in this area since that time.
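Probabilistic simulation of a composite property can be sketched as a Monte Carlo loop over uncertain constituent properties; the rule-of-mixtures model and all distributions below are deliberate simplifications, not the PICAN/IPACS micromechanics:

```python
import random
import statistics

def laminate_modulus(E_fiber, E_matrix, vf):
    """Rule-of-mixtures longitudinal modulus: a simple micromechanics
    stand-in mapping constituent (primitive) variables to a laminate
    response."""
    return vf * E_fiber + (1 - vf) * E_matrix

# Hypothetical uncertainty in the primitive variables (means/SDs invented).
rng = random.Random(42)
samples = [laminate_modulus(rng.gauss(230.0, 10.0),   # fiber modulus, GPa
                            rng.gauss(3.5, 0.2),      # matrix modulus, GPa
                            rng.gauss(0.6, 0.02))     # fiber volume fraction
           for _ in range(10000)]
mean_E = statistics.mean(samples)   # center of the simulated distribution
sd_E = statistics.stdev(samples)    # scatter propagated from the inputs
```

The point of the paper's fast probability integration is to obtain these distributions and their sensitivities far more cheaply than such a brute-force Monte Carlo loop, which serves here only as the reference method it is validated against.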
Implementing fluid dynamics obtained from GeoPET in reactive transport models
NASA Astrophysics Data System (ADS)
Lippmann-Pipke, Johanna; Eichelbaum, Sebastian; Kulenkampff, Johannes
2016-04-01
Flow and transport simulations in geomaterials are commonly conducted on high-resolution tomograms (μCT) of the pore structure or stochastic models that are calibrated with measured integral quantities, like break through curves (BTC). Yet, there existed virtually no method for experimental verification of the simulated velocity distribution results. Positron emission tomography (PET) has unrivaled sensitivity and robustness for non-destructive, quantitative, spatio-temporal measurement of tracer concentrations in body tissue. In the past decade, we empowered PET for its applicability in opaque/geological media - GeoPET (Kulenkampff et al.; Kulenkampff et al., 2008; Zakhnini et al., 2013) and have developed detailed correction schemes to bring the images into sharp focus. Thereby it is the appropriate method for experimental verification and calibration of computer simulations of pore-scale transport by means of the observed propagation of a tracer pulse, c_PET(x,y,z,t). In parallel, we aimed at deriving velocity and porosity distributions directly from our concentration time series of fluid flow processes in geomaterials. This would allow us to directly benefit from lab scale observations and to parameterize respective numerical transport models. For this we have developed a robust spatiotemporal (3D+t) parameter extraction algorithm. Here, we will present its functionality, and demonstrate the use of obtained velocity distributions in finite element simulations of reactive transport processes on drill core scale. Kulenkampff, J., Gruendig, M., Zakhnini, A., Gerasch, R., and Lippmann-Pipke, J.: Process tomography of diffusion with PET for evaluating anisotropy and heterogeneity, Clay Minerals, in press. Kulenkampff, J., Gründig, M., Richter, M., and Enzmann, F.: Evaluation of positron emission tomography for visualisation of migration processes in geomaterials, Physics and Chemistry of the Earth, 33, 937-942, 2008. 
Zakhnini, A., Kulenkampff, J., Sauerzapf, S., Pietrzyk, U., and Lippmann-Pipke, J.: Monte Carlo simulations of GeoPET experiments: 3D images of tracer distributions (18-F, 124-I and 58-Co) in Opalinus Clay, anhydrite and quartz, Computers and Geosciences, 57 183-196, 2013.
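The core of such a parameter extraction is recovering a velocity from the propagation of the tracer pulse c(x, t). A minimal one-dimensional sketch of that idea (not the authors' algorithm; the function names and synthetic data are illustrative): take the temporal first moment of the concentration history at each position as an arrival time, then fit position against arrival time to obtain a mean velocity.

```python
import numpy as np

def arrival_times(c_xt, times):
    """Temporal first moment of the tracer pulse at each position."""
    return (c_xt * times).sum(axis=1) / c_xt.sum(axis=1)

def mean_velocity(x, t_arrival):
    """Least-squares slope of position vs. arrival time = mean velocity."""
    slope, _ = np.polyfit(t_arrival, x, 1)
    return slope

# Synthetic advecting Gaussian pulse with true velocity 2.0 mm/h
x = np.linspace(0.0, 100.0, 50)      # positions along the core (mm)
t = np.linspace(0.0, 60.0, 200)      # observation times (h)
c = np.exp(-((x[:, None] - 2.0 * t[None, :]) ** 2) / (2 * 5.0 ** 2))

v_est = mean_velocity(x, arrival_times(c, t))
```

A full 3D+t extraction would apply this voxel-wise along streamlines, with the corrections described above; the first-moment estimator is slightly biased where the pulse is truncated by the observation window.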
Numerical Simulation of Dispersion from Urban Greenhouse Gas Sources
NASA Astrophysics Data System (ADS)
Nottrott, Anders; Tan, Sze; He, Yonggang; Winkler, Renato
2017-04-01
Cities are characterized by complex topography, inhomogeneous turbulence, and variable pollutant source distributions. These features create a scale separation between local sources and urban-scale emissions estimates known as the Grey-Zone. Modern computational fluid dynamics (CFD) techniques provide a quasi-deterministic, physically based toolset to bridge the scale-separation gap between source-level dynamics, local measurements, and urban-scale emissions inventories. CFD has the capability to represent complex building topography and capture detailed 3D turbulence fields in the urban boundary layer. This presentation discusses the application of OpenFOAM to urban CFD simulations of natural gas leaks in cities. OpenFOAM is open-source software for advanced numerical simulation of engineering and environmental fluid flows. When combined with free or low-cost computer-aided drawing and GIS tools, OpenFOAM generates a detailed 3D representation of urban wind fields. OpenFOAM was applied to model scalar emissions from various components of the natural gas distribution system, to study the impact of urban meteorology on mobile greenhouse gas measurements. The numerical experiments demonstrate that CH4 concentration profiles are highly sensitive to the relative location of emission sources and buildings. Sources separated by distances of 5-10 meters showed significant differences in vertical dispersion of plumes, due to building wake effects. The OpenFOAM flow fields were combined with an inverse, stochastic dispersion model to quantify and visualize the sensitivity of point sensors to upwind sources in various built environments. The Boussinesq approximation was applied to investigate the effects of canopy-layer temperature gradients and convection on sensor footprints.
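The sensitivity of point sensors to source position can be illustrated with a far simpler model than the CFD described above. A hedged sketch using the textbook Gaussian plume (all parameter values are assumed, not from the study): shifting a source 10 m laterally changes the concentration at a fixed sensor by more than an order of magnitude.

```python
import numpy as np

def gaussian_plume(q, u, y, z, h, sigma_y, sigma_z):
    """Steady-state concentration downwind of a point source.
    q: emission rate (kg/s), u: wind speed (m/s), h: source height (m),
    sigma_y/sigma_z: dispersion widths evaluated at the sensor's downwind
    distance. Includes the standard ground-reflection term."""
    lateral = np.exp(-y**2 / (2 * sigma_y**2))
    vertical = (np.exp(-(z - h)**2 / (2 * sigma_z**2))
                + np.exp(-(z + h)**2 / (2 * sigma_z**2)))
    return q / (2 * np.pi * u * sigma_y * sigma_z) * lateral * vertical

# Same sensor (z = 2 m), two sources offset laterally by 10 m
c_near = gaussian_plume(q=1e-3, u=3.0, y=0.0, z=2.0, h=1.0, sigma_y=4.0, sigma_z=2.5)
c_far = gaussian_plume(q=1e-3, u=3.0, y=10.0, z=2.0, h=1.0, sigma_y=4.0, sigma_z=2.5)
```

The Gaussian plume cannot capture building wakes, which is precisely why the study needed CFD; the sketch only shows the magnitude of positional sensitivity even without buildings.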
NASA Technical Reports Server (NTRS)
Cen, Renyue
1994-01-01
The mass and velocity distributions in the outskirts (0.5-3.0/h Mpc) of simulated clusters of galaxies are examined for a suite of cosmogonic models (two Omega(sub 0) = 1 and two Omega(sub 0) = 0.2 models) utilizing large-scale particle-mesh (PM) simulations. Through a series of model computations designed to isolate the different effects, we find that both Omega(sub 0) and P(sub k) (lambda less than or = 16/h Mpc) are important to the mass distributions in clusters of galaxies. There is a correlation between power, P(sub k), and the density profiles of massive clusters; more power tends toward a stronger correlation between alpha and M(r less than 1.5/h Mpc), i.e., massive clusters being relatively extended and small-mass clusters being relatively concentrated. A lower Omega(sub 0) universe tends to produce relatively concentrated massive clusters and relatively extended small-mass clusters compared to their counterparts in a higher Omega(sub 0) model with the same power. Models with little (initial) small-scale power, such as the hot dark matter (HDM) model, produce mass distributions more extended than the isothermal distribution for most of the clusters, whereas the cold dark matter (CDM) models show mass distributions for most clusters that are more concentrated than the isothermal distribution. X-ray and gravitational lensing observations are beginning to provide useful information on the mass distribution in and around clusters; some interesting constraints on Omega(sub 0) and/or the (initial) power of the density fluctuations on scales lambda less than or = 16/h Mpc (where linear extrapolation is invalid) can be obtained when larger observational data sets, such as the Sloan Digital Sky Survey, become available.
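The profile slope alpha referenced above (rho ~ r^-alpha, with alpha = 2 the isothermal case) is typically measured by binning simulated particles into spherical shells and fitting a log-log slope. A minimal sketch of that measurement on synthetic particle radii (the sampling trick and all numbers are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# For an isothermal profile rho ~ r^-2, the radial count n(r) ~ r^2 * r^-2
# is uniform in r, so uniform radii sample alpha_true = 2 exactly.
r = rng.uniform(0.5, 3.0, 200_000)   # radii in the cluster outskirts (/h Mpc)

# Density in spherical shells, then the log-log slope
edges = np.linspace(0.5, 3.0, 26)
counts, _ = np.histogram(r, bins=edges)
r_mid = 0.5 * (edges[:-1] + edges[1:])
shell_vol = 4.0 / 3.0 * np.pi * (edges[1:]**3 - edges[:-1]**3)
rho = counts / shell_vol

slope, _ = np.polyfit(np.log(r_mid), np.log(rho), 1)
alpha_est = -slope
```

Profiles steeper than alpha = 2 would be "concentrated" and shallower ones "extended" in the paper's terminology.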
CBRAIN: a web-based, distributed computing platform for collaborative neuroimaging research
Sherif, Tarek; Rioux, Pierre; Rousseau, Marc-Etienne; Kassis, Nicolas; Beck, Natacha; Adalat, Reza; Das, Samir; Glatard, Tristan; Evans, Alan C.
2014-01-01
The Canadian Brain Imaging Research Platform (CBRAIN) is a web-based collaborative research platform developed in response to the challenges raised by data-heavy, compute-intensive neuroimaging research. CBRAIN offers transparent access to remote data sources, distributed computing sites, and an array of processing and visualization tools within a controlled, secure environment. Its web interface is accessible through any modern browser and uses graphical interface idioms to reduce the technical expertise required to perform large-scale computational analyses. CBRAIN's flexible meta-scheduling has allowed the incorporation of a wide range of heterogeneous computing sites, currently including nine national research High Performance Computing (HPC) centers in Canada, one in Korea, one in Germany, and several local research servers. CBRAIN leverages remote computing cycles and facilitates resource interoperability in a manner transparent to the end user. Compared with typical grid solutions, our architecture was designed to be easily extendable and deployed on existing remote computing sites with no tool modification, administrative intervention, or special software/hardware configuration. As of October 2013, CBRAIN serves over 200 users spread across 53 cities in 17 countries. The platform is built as a generic framework that can accept data and analysis tools from any discipline. However, its current focus is primarily on neuroimaging research and studies of neurological diseases such as Autism, Parkinson's and Alzheimer's diseases, and Multiple Sclerosis, as well as on normal brain structure and development. This technical report presents the CBRAIN Platform, its current deployment and usage, and future directions. PMID:24904400
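The meta-scheduling idea mentioned above, dispatching tasks across heterogeneous sites with different capacities, can be sketched with a simple greedy earliest-available-site policy. This is a hypothetical toy, not CBRAIN's actual scheduler; site names and speed factors are invented for illustration.

```python
import heapq

def schedule(tasks, sites):
    """Greedy dispatch: each task goes to the site projected to be free
    soonest. tasks: list of (name, cost); sites: dict name -> speed factor.
    Returns {site: [task names]}."""
    heap = [(0.0, s) for s in sites]   # (projected free time, site)
    heapq.heapify(heap)
    assignment = {s: [] for s in sites}
    for name, cost in sorted(tasks, key=lambda t: -t[1]):  # largest first
        free_at, site = heapq.heappop(heap)
        assignment[site].append(name)
        heapq.heappush(heap, (free_at + cost / sites[site], site))
    return assignment

# Hypothetical sites with relative speed factors
sites = {"hpc-ca": 4.0, "hpc-kr": 2.0, "lab-server": 1.0}
tasks = [("subj-%02d" % i, 10.0) for i in range(7)]
plan = schedule(tasks, sites)
```

A real meta-scheduler must additionally handle queue-wait estimates, data locality, and site outages; the sketch only shows how speed-weighted load balancing concentrates work on faster sites.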
NASA Astrophysics Data System (ADS)
Konishi, Yoshihiro; Tanaka, Fumihiko; Uchino, Toshitaka; Hamanaka, Daisuke
During transport in refrigerated trucks, the recommended conditions must be maintained throughout the cargo to preserve the quality of fresh fruit and vegetables. The temperature distribution within a refrigerated container is governed by the airflow pattern coupled with thermal transport. In this study, Computational Fluid Dynamics (CFD) predictions were used to investigate the temperature distribution within a typical refrigerated truck filled with cardboard-packed eggplants. Numerical modeling of heat and mass transfer was performed using the CFX code. In order to verify the developed CFD model, full-scale measurements were carried out within a load of eggplants during transport. The CFD predictions show reasonable agreement with the measured data.
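The coupling of airflow and thermal transport can be reduced, for illustration only, to a one-dimensional advection-diffusion equation for temperature along the container, solved with explicit upwind finite differences. This is a drastic simplification of the 3D CFX model above; all dimensions and material parameters are assumed values.

```python
import numpy as np

L, n = 10.0, 100           # container length (m), grid points
dx = L / (n - 1)
u, kappa = 0.5, 0.01       # air speed (m/s), effective diffusivity (m^2/s)
dt = 0.4 * min(dx / u, dx**2 / (2 * kappa))   # CFL-limited time step

T = np.full(n, 20.0)       # initial load temperature (deg C)
T_inlet = 2.0              # refrigerated supply-air temperature
for _ in range(5000):
    dTdx = (T[1:-1] - T[:-2]) / dx                    # upwind advection
    d2Tdx2 = (T[2:] - 2 * T[1:-1] + T[:-2]) / dx**2   # diffusion
    T[1:-1] += dt * (-u * dTdx + kappa * d2Tdx2)
    T[0] = T_inlet          # inlet: Dirichlet boundary
    T[-1] = T[-2]           # outlet: zero-gradient boundary
```

The real problem requires 3D turbulent flow around the cardboard packages and respiration heat from the produce, which is what motivates a full CFD treatment; the 1D sketch only shows the numerical skeleton of such transport models.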
NASA Astrophysics Data System (ADS)
Pressel, K. G.; Collins, W.; Desai, A. R.
2011-12-01
Deficiencies in the parameterization of boundary layer clouds in global climate models (GCMs) remain one of the greatest sources of uncertainty in climate change predictions. Many GCM cloud parameterizations, which seek to include some representation of subgrid-scale cloud variability, do so by making assumptions regarding the subgrid-scale spatial probability density function (PDF) of total water content. Properly specifying the form and parameters of the total water PDF is an essential step in the formulation of PDF-based cloud parameterizations. In the cloud-free boundary layer, the PDF of total water mixing ratio is equivalent to the PDF of water vapor mixing ratio. Understanding the PDF of water vapor mixing ratio in the cloud-free atmosphere is a necessary step towards understanding the PDF of water vapor in the cloudy atmosphere. A primary challenge in empirically constraining the PDF of water vapor mixing ratio is a distinct lack of spatially distributed observational datasets at or near cloud scale. However, at meso-beta (20-50 km) and larger scales, there is a wealth of information on the spatial distribution of water vapor contained in the physically retrieved water vapor profiles from the Atmospheric Infrared Sounder onboard NASA's Aqua satellite. The scaling (scale-invariance) of the observed water vapor field has been suggested as a means of using observations at satellite-observed (meso-beta) scales to derive information about cloud-scale PDFs. However, doing so requires the derivation of a robust climatology of water vapor scaling from in-situ observations across the meso-gamma (2-20 km) and meso-beta scales. In this work, we present the results of the scaling of high-frequency (10 Hz) time series of water vapor mixing ratio as observed from the 447 m WLEF tower located near Park Falls, Wisconsin.
Observations from a tall tower offer an ideal set of observations with which to investigate scaling at meso-gamma and meso-beta scales, requiring only the assumption of Taylor's Hypothesis to convert observed time scales to spatial scales. Furthermore, the WLEF tower holds an instrument suite offering a diverse set of variables at the 396 m, 122 m, and 30 m levels with which to characterize the state of the boundary layer. Three methods are used to compute scaling exponents for the observed time series: poor man's variance spectra, first-order structure functions, and detrended fluctuation analysis. In each case, scaling exponents are computed by linear regression. The results for each method are compared and used to build a climatology of scaling exponents. In particular, the results for June 2007 are presented, and it is shown that the scaling of water vapor time series at the 396 m level is characterized by two regimes that are determined by the state of the boundary layer. Finally, the results are compared to, and shown to be roughly consistent with, scaling exponents computed from AIRS observations.
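One of the three estimators named above, the first-order structure function, fits the power law S1(tau) = <|q(t+tau) - q(t)|> ~ tau^H by linear regression in log-log space. A self-contained sketch on a synthetic Brownian-type series (illustrative only; real water vapor series would replace `q`, and the expected exponent for uncorrelated increments is H ~ 0.5):

```python
import numpy as np

def structure_exponent(series, lags):
    """Scaling exponent H from first-order structure functions:
    S1(lag) = mean |q(t+lag) - q(t)|, fitted as S1 ~ lag^H in log-log."""
    s1 = [np.mean(np.abs(series[lag:] - series[:-lag])) for lag in lags]
    slope, _ = np.polyfit(np.log(lags), np.log(s1), 1)
    return slope

# Brownian-type series: cumulative sum of iid increments -> H ~ 0.5
rng = np.random.default_rng(1)
q = np.cumsum(rng.standard_normal(200_000))
lags = np.unique(np.logspace(0, 3, 20).astype(int))
H = structure_exponent(q, lags)
```

Under Taylor's Hypothesis the lag axis converts to spatial separation via the mean wind speed, which is how the temporal exponents map to the meso-gamma spatial scales discussed above.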
Spectral method for a kinetic swarming model
Gamba, Irene M.; Haack, Jeffrey R.; Motsch, Sebastien
2015-04-28
Here we present the first numerical method for a kinetic description of the Vicsek swarming model. The kinetic model poses a unique challenge, as there is a distribution dependent collision invariant to satisfy when computing the interaction term. We use a spectral representation linked with a discrete constrained optimization to compute these interactions. To test the numerical scheme we investigate the kinetic model at different scales and compare the solution with the microscopic and macroscopic descriptions of the Vicsek model. Lastly, we observe that the kinetic model captures key features such as vortex formation and traveling waves.
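For reference, the microscopic Vicsek model whose kinetic description the paper discretizes can be written in a few lines: self-propelled particles at unit speed align with the mean heading of their neighbors, perturbed by angular noise. A sketch under assumed parameters (periodic box, interaction radius r, noise amplitude eta; all values illustrative):

```python
import numpy as np

def vicsek_step(pos, theta, box, r, eta, dt, rng):
    """One update of the microscopic Vicsek model in a periodic 2D box."""
    dx = pos[:, None, :] - pos[None, :, :]
    dx -= box * np.round(dx / box)                 # periodic minimum image
    neigh = (dx**2).sum(-1) < r**2                 # includes self
    mean_sin = (neigh * np.sin(theta)[None, :]).sum(1)
    mean_cos = (neigh * np.cos(theta)[None, :]).sum(1)
    theta = np.arctan2(mean_sin, mean_cos) \
        + eta * rng.uniform(-np.pi, np.pi, len(theta))
    pos = (pos + dt * np.c_[np.cos(theta), np.sin(theta)]) % box
    return pos, theta

rng = np.random.default_rng(2)
n, box = 300, 5.0
pos = rng.uniform(0, box, (n, 2))
theta = rng.uniform(-np.pi, np.pi, n)
for _ in range(200):
    pos, theta = vicsek_step(pos, theta, box, r=1.0, eta=0.05, dt=0.1, rng=rng)

# Global polarization near 1 indicates the ordered (flocking) state
polar = np.hypot(np.cos(theta).mean(), np.sin(theta).mean())
```

The kinetic model replaces this particle ensemble with a distribution f(x, v, t); the distribution-dependent collision invariant mentioned in the abstract arises because the alignment direction is itself a functional of f.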
Computer Anxiety: How to Measure It?
ERIC Educational Resources Information Center
McPherson, Bill
1997-01-01
Provides an overview of five scales that are used to measure computer anxiety: Computer Anxiety Index, Computer Anxiety Scale, Computer Attitude Scale, Attitudes toward Computers, and Blombert-Erickson-Lowrey Computer Attitude Task. Includes background information and scale specifics. (JOW)
xPerm: fast index canonicalization for tensor computer algebra
NASA Astrophysics Data System (ADS)
Martín-García, José M.
2008-10-01
We present a very fast implementation of the Butler-Portugal algorithm for index canonicalization with respect to permutation symmetries. It is called xPerm, and has been written as a combination of a Mathematica package and a C subroutine. The latter performs the most demanding parts of the computations and can be linked from any other program or computer algebra system. We demonstrate with tests and timings the effectively polynomial performance of the Butler-Portugal algorithm with respect to the number of indices, though we also show a case in which it is exponential. Our implementation handles generic tensorial expressions with several dozen indices in hundredths of a second, or one hundred indices in a few seconds, clearly outperforming all other current canonicalizers. The code has already undergone several years of intensive testing and has been essential in recent investigations in large-scale tensor computer algebra. Program summary. Program title: xPerm. Catalogue identifier: AEBH_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEBH_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html. No. of lines in distributed program, including test data, etc.: 93 582. No. of bytes in distributed program, including test data, etc.: 1 537 832. Distribution format: tar.gz. Programming language: C and Mathematica (version 5.0 or higher). Computer: Any computer running C and Mathematica (version 5.0 or higher). Operating system: Linux, Unix, Windows XP, MacOS. RAM: 20 Mbyte. Word size: 64 or 32 bits. Classification: 1.5, 5. Nature of problem: Canonicalization of indexed expressions with respect to permutation symmetries. Solution method: The Butler-Portugal algorithm. Restrictions: Multiterm symmetries are not considered. Running time: A few seconds with generic expressions of up to 100 indices. The xPermDoc.nb notebook supplied with the distribution takes approximately one and a half hours to execute in full.
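To make the canonicalization problem concrete, here is a toy canonicalizer, a drastic simplification of Butler-Portugal, which handles arbitrary permutation groups rather than the independent slot groups assumed here. It sorts the indices within each symmetric or antisymmetric slot group, tracking the sign of the permutation for antisymmetric groups; all names are illustrative.

```python
def perm_sign(order):
    """Sign of the permutation i -> order[i], via cycle decomposition."""
    seen = [False] * len(order)
    sign = 1
    for i in range(len(order)):
        if seen[i]:
            continue
        j, length = i, 0
        while not seen[j]:
            seen[j] = True
            j = order[j]
            length += 1
        if length % 2 == 0:    # even-length cycle flips the sign
            sign = -sign
    return sign

def canonicalize(indices, groups):
    """indices: tuple of index names; groups: list of (slots, 'sym'|'antisym').
    Returns (sign, canonical index tuple); sign 0 means the term vanishes."""
    idx = list(indices)
    sign = 1
    for slots, kind in groups:
        vals = [idx[s] for s in slots]
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        if kind == "antisym":
            if len(set(vals)) < len(vals):
                return 0, tuple(idx)   # repeated index in antisymmetric slots
            sign *= perm_sign(order)
        for s, i in zip(slots, order):
            idx[s] = vals[i]
    return sign, tuple(idx)

# Riemann-like pair antisymmetries: R_{badc} = (-1)(-1) R_{abcd}
sign, canon = canonicalize(("b", "a", "d", "c"),
                           [((0, 1), "antisym"), ((2, 3), "antisym")])
```

The hard part that Butler-Portugal solves, and this toy does not, is canonicalizing under symmetries that mix slot groups (such as the Riemann pair exchange) together with symmetries of dummy-index relabeling, which requires searching cosets of the full symmetry group.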
Grammatical Analysis as a Distributed Neurobiological Function
Bozic, Mirjana; Fonteneau, Elisabeth; Su, Li; Marslen-Wilson, William D
2015-01-01
Language processing engages large-scale functional networks in both hemispheres. Although it is widely accepted that left perisylvian regions have a key role in supporting complex grammatical computations, patient data suggest that some aspects of grammatical processing could be supported bilaterally. We investigated the distribution and the nature of grammatical computations across language processing networks by comparing two types of combinatorial grammatical sequences—inflectionally complex words and minimal phrases—and contrasting them with grammatically simple words. Novel multivariate analyses revealed that they engage a coalition of separable subsystems: inflected forms triggered left-lateralized activation, dissociable into dorsal processes supporting morphophonological parsing and ventral, lexically driven morphosyntactic processes. In contrast, simple phrases activated a consistently bilateral pattern of temporal regions, overlapping with inflectional activations in the left middle temporal gyrus. These data confirm the role of the left-lateralized frontotemporal network in supporting complex grammatical computations. Critically, they also point to the capacity of bilateral temporal regions to support simple, linear grammatical computations. This is consistent with a dual neurobiological framework where phylogenetically older bihemispheric systems form part of the network that supports language function in the modern human, and where significant capacities for language comprehension remain intact even following severe left hemisphere damage. PMID:25421880
NASA Astrophysics Data System (ADS)
DeBeer, Chris M.; Pomeroy, John W.
2017-10-01
The spatial heterogeneity of mountain snow cover and ablation is important in controlling patterns of snow cover depletion (SCD), meltwater production, and runoff, yet is not well-represented in most large-scale hydrological models and land surface schemes. Analyses were conducted in this study to examine the influence of various representations of snow cover and melt energy heterogeneity on both simulated SCD and stream discharge from a small alpine basin in the Canadian Rocky Mountains. Simulations were performed using the Cold Regions Hydrological Model (CRHM), where point-scale snowmelt computations were made using a snowpack energy balance formulation and applied to spatial frequency distributions of snow water equivalent (SWE) on individual slope-, aspect-, and landcover-based hydrological response units (HRUs) in the basin. Hydrological routines were added to represent the vertical and lateral transfers of water through the basin and channel system. From previous studies it is understood that the heterogeneity of late-winter SWE is a primary control on patterns of SCD. The analyses here showed that spatial variation in applied melt energy, mainly due to differences in net radiation, has an important influence on SCD at multiple scales and on basin discharge, and cannot be neglected without serious error in the prediction of these variables. A single basin SWE distribution using the basin-wide mean SWE and coefficient of variation (CV; standard deviation/mean) was found to represent the fine-scale spatial heterogeneity of SWE sufficiently well. Simulations that accounted for differences in mean SWE among HRUs but neglected the sub-HRU heterogeneity of SWE were found to yield discharge results similar to simulations that included this heterogeneity, while SCD was poorly represented, even at the basin level.
Finally, applying point-scale snowmelt computations based on a single SWE depth for each HRU (thereby neglecting spatial differences in internal snowpack energetics over the distributions) was found to yield similar SCD and discharge results as simulations that resolved internal energy differences. Spatial/internal snowpack melt energy effects are more pronounced at times earlier in spring before the main period of snowmelt and SCD, as shown in previously published work. The paper discusses the importance of these findings as they apply to the warranted complexity of snowmelt process simulation in cold mountain environments, and shows how the end-of-winter SWE distribution represents an effective means of resolving snow cover heterogeneity at multiple scales for modelling, even in steep and complex terrain.
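The link between an SWE frequency distribution and snow cover depletion can be sketched analytically. Assuming, for illustration, that end-of-winter SWE follows a lognormal distribution parameterized by the basin mean SWE and CV (a common choice in such models, though the paper does not prescribe it), the snow-covered fraction after a cumulative melt depth M is simply the probability that SWE exceeds M:

```python
from math import erf, log, sqrt

def snow_covered_fraction(M, swe_mean, cv):
    """Fraction of area still snow-covered after cumulative melt depth M,
    for lognormally distributed SWE with given mean and CV."""
    sigma2 = log(1 + cv**2)            # lognormal parameters from mean and CV
    mu = log(swe_mean) - sigma2 / 2
    if M <= 0:
        return 1.0
    z = (log(M) - mu) / sqrt(sigma2)
    return 0.5 * (1 - erf(z / sqrt(2)))   # 1 - lognormal CDF at M

# Example SCD curve: mean SWE 300 mm, CV 0.4 (illustrative values)
sca = [snow_covered_fraction(M, 300.0, 0.4) for M in (0, 150, 300, 600)]
```

This is why the end-of-winter SWE distribution is an effective summary for modelling: once its mean and CV are known, the SCD curve follows directly from the applied melt, with spatial variation in melt energy entering through M.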