Computational Approaches to Simulation and Optimization of Global Aircraft Trajectories
NASA Technical Reports Server (NTRS)
Ng, Hok Kwan; Sridhar, Banavar
2016-01-01
This study examines three possible approaches to improving the speed of generating wind-optimal routes for air traffic at the national or global level: (a) using the resources of a supercomputer, (b) running the computations on multiple commercially available computers, and (c) implementing those same algorithms in NASA's Future ATM Concepts Evaluation Tool (FACET); each is compared with a standard implementation run on a single CPU. Wind-optimal aircraft trajectories are computed using global air traffic schedules. The run time and wait time on the supercomputer for trajectory optimization, using various numbers of CPUs ranging from 80 to 10,240 units, are compared with the total computational time for running the same computation on a single desktop computer and on multiple commercially available computers, to assess the potential computational enhancement from parallel processing on the computer clusters. This study also re-implements the trajectory optimization algorithm for further reduction of computational time through algorithm modifications and integrates it with FACET, so that the new features, which calculate time-optimal routes between worldwide airport pairs in a wind field, can be used with existing FACET applications. The implementations of the trajectory optimization algorithms use the MATLAB, Python, and Java programming languages. The performance evaluations compare computational efficiency and consider the potential applications of the optimized trajectories. The paper shows that, in the absence of special privileges on a supercomputer, a cluster of commercially available computers provides a feasible approach for national and global air traffic system studies.
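The abstract above does not spell out the cluster implementation, but the core of the approach is that each origin-destination route optimization is independent and can be farmed out to separate workers. Below is a minimal sketch of that pattern in Python using the standard multiprocessing module; the function body, the airport pairs, and the wind-field handling are placeholders, not the FACET or NASA code:

```python
from multiprocessing import Pool

def optimize_route(pair):
    """Hypothetical stand-in for a wind-optimal trajectory solver
    applied to one origin-destination airport pair."""
    origin, destination = pair
    # ... solve the optimal-control problem in the given wind field ...
    return (origin, destination, "optimal trajectory placeholder")

if __name__ == "__main__":
    airport_pairs = [("KSFO", "RJAA"), ("KJFK", "EGLL"), ("KLAX", "YSSY")]
    # Each pair is independent, so a flight schedule parallelizes trivially
    # across however many local cores (or cluster nodes) are available.
    with Pool(processes=4) as pool:
        trajectories = pool.map(optimize_route, airport_pairs)
    for origin, dest, traj in trajectories:
        print(origin, "->", dest, ":", traj)
```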
Nonlinear Analysis of a Bolted Marine Riser Connector Using NASTRAN Substructuring
NASA Technical Reports Server (NTRS)
Fox, G. L.
1984-01-01
Results of an investigation of the behavior of a bolted, flange-type marine riser connector are reported. The method used to account for the nonlinear effect of connector separation due to bolt preload and axial tension load is described. The automated multilevel substructuring capability of COSMIC/NASTRAN was employed at considerable savings in computer run time. Simplified formulas for computer resources, i.e., computer run times for the modules SDCOMP, FBS, and MPYAD, as well as disk storage space, are presented. Actual run time data on a VAX-11/780 are compared with the formulas presented.
Simulating three dimensional wave run-up over breakwaters covered by antifer units
NASA Astrophysics Data System (ADS)
Najafi-Jilani, A.; Niri, M. Zakiri; Naderi, Nader
2014-06-01
The paper presents the numerical analysis of wave run-up over rubble-mound breakwaters covered by antifer units using a technique integrating Computer-Aided Design (CAD) and Computational Fluid Dynamics (CFD) software. Direct application of the Navier-Stokes equations within the armour blocks is used to provide a more reliable approach to simulating wave run-up over breakwaters. A well-tested Reynolds-averaged Navier-Stokes (RANS) Volume of Fluid (VOF) code (Flow-3D) was adopted for the CFD computations. The computed results were compared with experimental data to check the validity of the model. Numerical results showed that the direct three-dimensional (3D) simulation method can deliver accurate results for wave run-up over rubble-mound breakwaters. The results also showed that the placement pattern of the antifer units had a great impact on wave run-up values, such that changing the placement pattern from regular to double pyramid can reduce the wave run-up by approximately 30%. Analysis was done to investigate the influences of surface roughness, energy dissipation in the pores of the armour layer, and the reduction in wave run-up due to inflow into the armour and stone layers.
A Quantum Computing Approach to Model Checking for Advanced Manufacturing Problems
2014-07-01
amount of time. In summary, the tool we developed succeeded in allowing us to produce good solutions for optimization problems that did not fit ...We compared the value of the objective obtained in each run with the known optimal value, and used this information to compute the probability of ...success for each given instance. Then we used this information to compute the expected number of repetitions (or runs) needed to obtain the optimal
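The snippet's "expected number of repetitions" can be related to the per-run success probability by a standard formula; the following is my gloss on that relation, not an equation quoted from the report:

```latex
% p: estimated probability that a single run returns the known optimum.
% Expected number of runs until the first success (geometric distribution):
E[\text{runs}] = \frac{1}{p}
% Number of runs needed to hit the optimum at least once with confidence \delta:
N(\delta) = \left\lceil \frac{\ln(1-\delta)}{\ln(1-p)} \right\rceil
```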
Multidisciplinary Simulation Acceleration using Multiple Shared-Memory Graphical Processing Units
NASA Astrophysics Data System (ADS)
Kemal, Jonathan Yashar
For purposes of optimizing and analyzing turbomachinery and other designs, the unsteady Favre-averaged flow-field differential equations for an ideal compressible gas can be solved in conjunction with the heat conduction equation. We solve all equations using the finite-volume multiple-grid numerical technique, with the dual time-step scheme used for unsteady simulations. Our numerical solver code targets CUDA-capable Graphical Processing Units (GPUs) produced by NVIDIA. Making use of MPI, our solver can run across networked compute nodes, where each MPI process can use either a GPU or a Central Processing Unit (CPU) core for primary solver calculations. We use NVIDIA Tesla C2050/C2070 GPUs based on the Fermi architecture, and compare our resulting performance against Intel Xeon X5690 CPUs. Solver routines converted to CUDA typically run about 10 times faster on a GPU for sufficiently dense computational grids. We used a conjugate cylinder computational grid and ran a turbulent steady flow simulation using 4 increasingly dense computational grids. Our densest computational grid is divided into 13 blocks each containing 1033x1033 grid points, for a total of 13.87 million grid points or 1.07 million grid points per domain block. To obtain overall speedups, we compare the execution time of the solver's iteration loop, including all resource-intensive GPU-related memory copies. Comparing the performance of 8 GPUs to that of 8 CPUs, we obtain an overall speedup of about 6.0 when using our densest computational grid. This amounts to an 8-GPU simulation running about 39.5 times faster than a single-CPU simulation.
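The hybrid scheme described above, in which each MPI process drives either a GPU or a CPU core, can be sketched with mpi4py as follows; the ranks-per-node figure, the device-assignment rule, and the per-rank work function are illustrative assumptions, not the author's CUDA solver:

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

GPUS_PER_NODE = 2   # assumed hardware layout, e.g. two Tesla-class cards per node
RANKS_PER_NODE = 8  # assumed MPI launch configuration

# Illustrative policy: the first GPUS_PER_NODE ranks on each node drive a GPU,
# the remaining ranks fall back to CPU-only solver routines.
local_rank = rank % RANKS_PER_NODE
use_gpu = local_rank < GPUS_PER_NODE

def advance_block(block_id, on_gpu):
    """Hypothetical per-rank work unit: advance one grid block by one iteration."""
    backend = "GPU" if on_gpu else "CPU"
    return f"block {block_id} advanced on {backend}"

print(f"rank {rank}: {advance_block(block_id=rank, on_gpu=use_gpu)}")
```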
Simulation of LHC events on a million threads
NASA Astrophysics Data System (ADS)
Childers, J. T.; Uram, T. D.; LeCompte, T. J.; Papka, M. E.; Benjamin, D. P.
2015-12-01
Demand for Grid resources is expected to double during LHC Run II as compared to Run I; the capacity of the Grid, however, will not double. The HEP community must consider how to bridge this computing gap by targeting larger compute resources and using the available compute resources as efficiently as possible. Argonne's Mira, the fifth fastest supercomputer in the world, can run roughly five times the number of parallel processes that the ATLAS experiment typically uses on the Grid. We ported Alpgen, a serial x86 code, to run as a parallel application under MPI on the Blue Gene/Q architecture. By analysis of the Alpgen code, we reduced the memory footprint to allow running 64 threads per node, utilizing the four hardware threads available per core on the PowerPC A2 processor. Event generation and unweighting, typically run as independent serial phases, are coupled together in a single job in this scenario, reducing intermediate writes to the filesystem. By these optimizations, we have successfully run LHC proton-proton physics event generation at the scale of a million threads, filling two-thirds of Mira.
Colt: an experiment in wormhole run-time reconfiguration
NASA Astrophysics Data System (ADS)
Bittner, Ray; Athanas, Peter M.; Musgrove, Mark
1996-10-01
Wormhole run-time reconfiguration (RTR) is an attempt to create a refined computing paradigm for high performance computational tasks. By combining concepts from field programmable gate array (FPGA) technologies with data flow computing, the Colt/Stallion architecture achieves high utilization of hardware resources, and facilitates rapid run-time reconfiguration. Targeted mainly at DSP-type operations, the Colt integrated circuit -- a prototype wormhole RTR device -- compares favorably to contemporary DSP alternatives in terms of silicon area consumed per unit computation and in computing performance. Although emphasis has been placed on signal processing applications, general purpose computation has not been overlooked. Colt is a prototype that defines an architecture not only at the chip level but also in terms of an overall system design. As this system is realized, the concept of wormhole RTR will be applied to numerical computation and DSP applications including those common to image processing, communications systems, digital filters, acoustic processing, real-time control systems and simulation acceleration.
CloVR-Comparative: automated, cloud-enabled comparative microbial genome sequence analysis pipeline.
Agrawal, Sonia; Arze, Cesar; Adkins, Ricky S; Crabtree, Jonathan; Riley, David; Vangala, Mahesh; Galens, Kevin; Fraser, Claire M; Tettelin, Hervé; White, Owen; Angiuoli, Samuel V; Mahurkar, Anup; Fricke, W Florian
2017-04-27
The benefit of increasing genomic sequence data to the scientific community depends on easy-to-use, scalable bioinformatics support. CloVR-Comparative combines commonly used bioinformatics tools into an intuitive, automated, and cloud-enabled analysis pipeline for comparative microbial genomics. CloVR-Comparative runs on annotated complete or draft genome sequences that are uploaded by the user or selected via a taxonomic tree-based user interface and downloaded from NCBI. CloVR-Comparative runs reference-free multiple whole-genome alignments to determine unique, shared and core coding sequences (CDSs) and single nucleotide polymorphisms (SNPs). Output includes short summary reports and detailed text-based results files, graphical visualizations (phylogenetic trees, circular figures), and a database file linked to the Sybil comparative genome browser. Data up- and download, pipeline configuration and monitoring, and access to Sybil are managed through the CloVR-Comparative web interface. CloVR-Comparative and Sybil are distributed as part of the CloVR virtual appliance, which runs on local computers or the Amazon EC2 cloud. Representative datasets (e.g. 40 draft and complete Escherichia coli genomes) are processed in <36 h on a local desktop or at a cost of <$20 on EC2. CloVR-Comparative allows anybody with Internet access to run comparative genomics projects, while eliminating the need for on-site computational resources and expertise.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hong, Tianzhen; Buhl, Fred; Haves, Philip
2008-09-20
EnergyPlus is a new generation building performance simulation program offering many new modeling capabilities and more accurate performance calculations integrating building components in sub-hourly time steps. However, EnergyPlus runs much slower than the current generation simulation programs. This has become a major barrier to its widespread adoption by the industry. This paper analyzed EnergyPlus run time from comprehensive perspectives to identify key issues and challenges of speeding up EnergyPlus: studying the historical trends of EnergyPlus run time based on the advancement of computers and code improvements to EnergyPlus, comparing EnergyPlus with DOE-2 to understand and quantify the run time differences, identifying key simulation settings and model features that have significant impacts on run time, and performing code profiling to identify which EnergyPlus subroutines consume the most amount of run time. This paper provides recommendations to improve EnergyPlus run time from the modeler's perspective and adequate computing platforms. Suggestions of software code and architecture changes to improve EnergyPlus run time based on the code profiling results are also discussed.
Building Computer-Based Experiments in Psychology without Programming Skills.
Ruisoto, Pablo; Bellido, Alberto; Ruiz, Javier; Juanes, Juan A
2016-06-01
Research in Psychology usually requires building and running experiments. Although this task has traditionally required scripting, recent computer tools based on graphical interfaces offer new opportunities in this field for researchers without programming skills. The purpose of this study is to illustrate and provide a comparative overview of two of the main free open source "point and click" software packages for building and running experiments in Psychology: PsychoPy and OpenSesame. Recommendations for their potential use are further discussed.
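Although both packages are presented above as point-and-click builders, each one generates or accepts ordinary Python scripts underneath. As a rough illustration of that scripting layer, here is a minimal PsychoPy-style trial that shows a text stimulus and records a keypress with its reaction time (assumes a working PsychoPy installation; the stimulus text, window size, and colors are arbitrary):

```python
from psychopy import visual, core, event

# One window, one text stimulus, wait for a keypress, record reaction time.
win = visual.Window(size=(800, 600), color="black")
stim = visual.TextStim(win, text="Press any key", color="white")

stim.draw()
win.flip()                # show the stimulus
clock = core.Clock()      # start timing at stimulus onset
keys = event.waitKeys(timeStamped=clock)

print("response:", keys[0][0], "RT:", round(keys[0][1], 3), "s")
win.close()
core.quit()
```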
Vectorization of transport and diffusion computations on the CDC Cyber 205
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abu-Shumays, I.K.
1986-01-01
The development and testing of alternative numerical methods and computational algorithms specifically designed for the vectorization of transport and diffusion computations on a Control Data Corporation (CDC) Cyber 205 vector computer are described. Two solution methods for the discrete ordinates approximation to the transport equation are summarized and compared. Factors of 4 to 7 reduction in run times for certain large transport problems were achieved on a Cyber 205 as compared with run times on a CDC-7600. The solution of tridiagonal systems of linear equations, central to several efficient numerical methods for multidimensional diffusion computations and essential for fluid flow and other physics and engineering problems, is also dealt with. Among the methods tested, a combined odd-even cyclic reduction and modified Cholesky factorization algorithm for solving linear symmetric positive definite tridiagonal systems is found to be the most effective for these systems on a Cyber 205. For large tridiagonal systems, computation with this algorithm is an order of magnitude faster on a Cyber 205 than computation with the best algorithm for tridiagonal systems on a CDC-7600.
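For context on why cyclic reduction pays off on a vector machine, the standard serial alternative is the Thomas algorithm, whose forward sweep carries a recurrence that resists vectorization. A generic Python sketch of that baseline (my illustration, not the Cyber 205 code):

```python
def solve_tridiagonal(a, b, c, d):
    """Thomas algorithm for a tridiagonal system.

    a: sub-diagonal (length n, a[0] unused)
    b: main diagonal (length n)
    c: super-diagonal (length n, c[n-1] unused)
    d: right-hand side (length n)
    The forward sweep has a loop-carried dependence, which is exactly what
    makes it hard to vectorize and motivates odd-even cyclic reduction on
    vector machines such as the Cyber 205.
    """
    n = len(b)
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Example: a small SPD tridiagonal system (2 on the diagonal, -1 off it); solution is [1, 1, 1, 1].
print(solve_tridiagonal([0, -1, -1, -1], [2, 2, 2, 2], [-1, -1, -1, 0], [1, 0, 0, 1]))
```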
Long-Term Marathon Running Is Associated with Low Coronary Plaque Formation in Women.
Roberts, William O; Schwartz, Robert S; Kraus, Stacia Merkel; Schwartz, Jonathan G; Peichel, Gretchen; Garberich, Ross F; Lesser, John R; Oesterle, Stephen N; Wickstrom, Kelly K; Knickelbine, Thomas; Harris, Kevin M
2017-04-01
Marathon running is presumed to improve cardiovascular risk, but health benefits of high volume running are unknown. High-resolution coronary computed tomography angiography and cardiac risk factor assessment were completed in women with long-term marathon running histories to compare to sedentary women with similar risk factors. Women who had run at least one marathon per year for 10-25 yr underwent coronary computed tomography angiography, 12-lead ECG, blood pressure and heart rate measurement, lipid panel, and a demographic/health risk factor survey. Sedentary matched controls were derived from a contemporaneous clinical study database. CT scans were analyzed for calcified and noncalcified plaque prevalence, volume, stenosis severity, and calcium score. Women marathon runners (n = 26), age 42-82 yr, with combined 1217 marathons (average 47) exhibited significantly lower coronary plaque prevalence and less calcific plaque volume. The marathon runners also had less risk factors (smoking, hypertension, and hyperlipidemia); significantly lower resting heart rate, body weight, body mass index, and triglyceride levels; and higher high-density lipoprotein cholesterol levels compared with controls (n = 28). The five women runners with coronary plaque had run marathons for more years and were on average 12 yr older (65 vs 53) than the runners without plaque. Women marathon runners had minimal coronary artery calcium counts, lower coronary artery plaque prevalence, and less calcified plaque volume compared with sedentary women. Developing coronary artery plaque in long-term women marathon runners appears related to older age and more cardiac risk factors, although the runners with coronary artery plaque had accumulated significantly more years running marathons.
NASA Astrophysics Data System (ADS)
Hill, M. C.; Jakeman, J.; Razavi, S.; Tolson, B.
2015-12-01
For many environmental systems, model run times have remained very long as more capable computers have been used to add more processes and finer time and space discretization. Scientists have also added more parameters and kinds of observations, and many model runs are needed to explore the models. Computational demand equals run time multiplied by the number of model runs divided by parallelization opportunities. Model exploration is conducted using sensitivity analysis, optimization, and uncertainty quantification. Sensitivity analysis is used to reveal the consequences of what may be very complex simulated relations, optimization is used to identify parameter values that fit the data best, or at least better, and uncertainty quantification is used to evaluate the precision of simulated results. The long execution times make such analyses a challenge. Methods for addressing this challenge include computationally frugal analysis of the demanding original model and a number of ingenious surrogate modeling methods. Both commonly use about 50-100 runs of the demanding original model. In this talk we consider the tradeoffs between (1) original model development decisions, (2) computationally frugal analysis of the original model, and (3) using many model runs of the fast surrogate model. Some questions of interest are as follows. If the added processes and discretization invested in (1) are compared with the restrictions and approximations in model analysis produced by long model execution times, is there a net benefit relative to the goals of the model? Are there changes to the numerical methods that could reduce the computational demands while giving up less fidelity than is compromised by using computationally frugal methods or surrogate models for model analysis? Both the computationally frugal methods and the surrogate models require that the solution of interest be a smooth function of the parameters of interest. How does the information obtained from the local methods typical of (2) compare with that from the globally averaged methods typical of (3) for typical systems? The discussion will use examples of the response of the Greenland glacier to global warming and surface- and groundwater modeling.
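The demand relation stated in the abstract can be written compactly as follows (the symbols are my own shorthand, not the authors' notation):

```latex
\text{computational demand} \;=\; \frac{t_{\text{run}} \times N_{\text{runs}}}{P}
% t_run:   execution time of one model run
% N_runs:  number of model runs the analysis requires
% P:       parallelization opportunities (concurrent runs available)
```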
PC vs. Mac--Which Way Should You Go?
ERIC Educational Resources Information Center
Wodarz, Nan
1997-01-01
Outlines the factors in hardware, software, and administration to consider in developing specifications for choosing a computer operating system. Compares Microsoft Windows 95/NT that runs on PC/Intel-based systems and System 7.5 that runs on the Apple-based systems. Lists reasons why the Microsoft platform clearly stands above the Apple platform.…
Cane Toad or Computer Mouse? Real and Computer-Simulated Laboratory Exercises in Physiology Classes
ERIC Educational Resources Information Center
West, Jan; Veenstra, Anneke
2012-01-01
Traditional practical classes in many countries are being rationalised to reduce costs. The challenge for university educators is to provide students with the opportunity to reinforce theoretical concepts by running something other than a traditional practical program. One alternative is to replace wet labs with comparable computer simulations.…
NASA Astrophysics Data System (ADS)
Chen, Xiuhong; Huang, Xianglei; Jiao, Chaoyi; Flanner, Mark G.; Raeker, Todd; Palen, Brock
2017-01-01
The suites of numerical models used for simulating the climate of our planet are usually run on dedicated high-performance computing (HPC) resources. This study investigates an alternative to the usual approach, i.e., carrying out climate model simulations in a commercially available cloud computing environment. We test the performance and reliability of running the CESM (Community Earth System Model), a flagship climate model in the United States developed by the National Center for Atmospheric Research (NCAR), on Amazon Web Service (AWS) EC2, the cloud computing environment by Amazon.com, Inc. StarCluster is used to create a virtual computing cluster on the AWS EC2 for the CESM simulations. The wall-clock time for one year of CESM simulation on the AWS EC2 virtual cluster is comparable to the time spent for the same simulation on a local dedicated high-performance computing cluster with InfiniBand connections. The CESM simulation can be efficiently scaled with the number of CPU cores on the AWS EC2 virtual cluster environment up to 64 cores. For the standard configuration of the CESM at a spatial resolution of 1.9° latitude by 2.5° longitude, increasing the number of cores from 16 to 64 reduces the wall-clock running time by more than 50% and the scaling is nearly linear. Beyond 64 cores, the communication latency starts to outweigh the benefit of distributed computing and the parallel speedup becomes nearly unchanged.
DualSPHysics: A numerical tool to simulate real breakwaters
NASA Astrophysics Data System (ADS)
Zhang, Feng; Crespo, Alejandro; Altomare, Corrado; Domínguez, José; Marzeddu, Andrea; Shang, Shao-ping; Gómez-Gesteira, Moncho
2018-02-01
The open-source code DualSPHysics is used in this work to compute the wave run-up on an existing dike on the Chinese coast using realistic dimensions, bathymetry and wave conditions. The GPU computing power of DualSPHysics allows simulating real-engineering problems that involve complex geometries with a high resolution in a reasonable computational time. The code is first validated by comparing the numerical free-surface elevation, the wave orbital velocities and the time series of the run-up with physical data in a wave flume. Those experiments include a smooth dike and an armored dike with two layers of cubic blocks. After validation, the code is applied to a real case to obtain the wave run-up under different incident wave conditions. In order to simulate the real open sea, the spurious reflections from the wavemaker are removed by using an active wave absorption technique.
NASA Astrophysics Data System (ADS)
Řidký, V.; Šidlof, P.; Vlček, V.
2013-04-01
The work is devoted to comparing measured data with the results of numerical simulations. The mathematical model used was a model without turbulence for incompressible flow. The experiment observed the behavior of the designed NACA0015 airfoil in an airflow. For the numerical solution the OpenFOAM computational package was used; this is open-source software based on the finite volume method. In the numerical solution the displacement of the airfoil is prescribed, corresponding to the experiment. The velocity at a point close to the airfoil surface is compared with the experimental data obtained from interferographic measurements of the velocity field. The numerical solution is computed on a 3D mesh composed of about 1 million orthogonal hexahedron elements. The time step is limited by the Courant number. Parallel computations are run on supercomputers of the CIV at the Technical University in Prague (HAL and FOX) and on a computer cluster of the Faculty of Mechatronics in Liberec (HYDRA). The run time is fixed at five periods; the results from the fifth period and the averaged values over all periods are then compared with the experiment.
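The Courant-number limit on the time step mentioned above is the usual CFL restriction; in its standard form (the limiting value Co_max is a solver setting, not a number given in the abstract):

```latex
\Delta t \;\le\; Co_{\max}\,\frac{\Delta x}{\lvert u \rvert}
% \Delta x: local cell size, u: local flow velocity, Co_max: prescribed Courant limit
```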
Validation of CBCT for the computation of textural biomarkers
NASA Astrophysics Data System (ADS)
Paniagua, Beatriz; Ruellas, Antonio C.; Benavides, Erika; Marron, Steve; Wolford, Larry; Cevidanes, Lucia
2015-03-01
Osteoarthritis (OA) is associated with significant pain and 42.6% of patients with TMJ disorders present with evidence of TMJ OA. However, OA diagnosis and treatment remain controversial, since there are no clear symptoms of the disease. The subchondral bone in the TMJ is believed to play a major role in the progression of OA. We hypothesize that the textural imaging biomarkers computed in high resolution Conebeam CT (hr-CBCT) and μCT scans are comparable. The purpose of this study is to test the feasibility of computing textural imaging biomarkers in-vivo using hr-CBCT, compared to those computed in μCT scans as our Gold Standard. Specimens of condylar bones obtained from condylectomies were scanned using μCT and hr-CBCT. Nine different textural imaging biomarkers (four co-occurrence features and five run-length features) from each pair of μCT and hr-CBCT were computed and compared. Pearson correlation coefficients were computed to compare textural biomarkers values of μCT and hr-CBCT. Four of the nine computed textural biomarkers showed a strong positive correlation between biomarkers computed in μCT and hr-CBCT. Higher correlations in Energy and Contrast, and in GLN (grey-level non-uniformity) and RLN (run length non-uniformity) indicate quantitative texture features can be computed reliably in hr-CBCT, when compared with μCT. The textural imaging biomarkers computed in-vivo hr-CBCT have captured the structure, patterns, contrast between neighboring regions and uniformity of healthy and/or pathologic subchondral bone. The ability to quantify bone texture non-invasively now makes it possible to evaluate the progression of subchondral bone alterations, in TMJ OA.
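Co-occurrence features such as the Energy and Contrast named above are derived from a grey-level co-occurrence matrix (GLCM). A minimal sketch of that computation on a toy 2-D patch using scikit-image (version 0.19 or later for these function names); the study's actual pipeline and its run-length features are not reproduced here:

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# Toy 8-bit "bone texture" patch standing in for a CBCT/micro-CT slice.
rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# Grey-level co-occurrence matrix for one pixel offset and four directions.
glcm = graycomatrix(patch, distances=[1],
                    angles=[0, np.pi/4, np.pi/2, 3*np.pi/4],
                    levels=256, symmetric=True, normed=True)

# Two of the co-occurrence biomarkers named in the abstract, averaged over directions.
energy = graycoprops(glcm, "energy").mean()
contrast = graycoprops(glcm, "contrast").mean()
print(f"energy={energy:.4f}  contrast={contrast:.1f}")
```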
Validation of CBCT for the computation of textural biomarkers
Paniagua, Beatriz; Ruellas, Antonio Carlos; Benavides, Erika; Marron, Steve; Woldford, Larry; Cevidanes, Lucia
2015-01-01
Osteoarthritis (OA) is associated with significant pain and 42.6% of patients with TMJ disorders present with evidence of TMJ OA. However, OA diagnosis and treatment remain controversial, since there are no clear symptoms of the disease. The subchondral bone in the TMJ is believed to play a major role in the progression of OA. We hypothesize that the textural imaging biomarkers computed in high resolution Conebeam CT (hr-CBCT) and μCT scans are comparable. The purpose of this study is to test the feasibility of computing textural imaging biomarkers in-vivo using hr-CBCT, compared to those computed in μCT scans as our Gold Standard. Specimens of condylar bones obtained from condylectomies were scanned using μCT and hr-CBCT. Nine different textural imaging biomarkers (four co-occurrence features and five run-length features) from each pair of μCT and hr-CBCT were computed and compared. Pearson correlation coefficients were computed to compare textural biomarkers values of μCT and hr-CBCT. Four of the nine computed textural biomarkers showed a strong positive correlation between biomarkers computed in μCT and hr-CBCT. Higher correlations in Energy and Contrast, and in GLN (grey-level non-uniformity) and RLN (run length non-uniformity) indicate quantitative texture features can be computed reliably in hr-CBCT, when compared with μCT. The textural imaging biomarkers computed in-vivo hr-CBCT have captured the structure, patterns, contrast between neighboring regions and uniformity of healthy and/or pathologic subchondral bone. The ability to quantify bone texture non-invasively now makes it possible to evaluate the progression of subchondral bone alterations, in TMJ OA. PMID:26085710
Validation of CBCT for the computation of textural biomarkers.
Paniagua, Beatriz; Ruellas, Antonio Carlos; Benavides, Erika; Marron, Steve; Woldford, Larry; Cevidanes, Lucia
2015-03-17
Osteoarthritis (OA) is associated with significant pain and 42.6% of patients with TMJ disorders present with evidence of TMJ OA. However, OA diagnosis and treatment remain controversial, since there are no clear symptoms of the disease. The subchondral bone in the TMJ is believed to play a major role in the progression of OA. We hypothesize that the textural imaging biomarkers computed in high resolution Conebeam CT (hr-CBCT) and μCT scans are comparable. The purpose of this study is to test the feasibility of computing textural imaging biomarkers in-vivo using hr-CBCT, compared to those computed in μCT scans as our Gold Standard. Specimens of condylar bones obtained from condylectomies were scanned using μCT and hr-CBCT. Nine different textural imaging biomarkers (four co-occurrence features and five run-length features) from each pair of μCT and hr-CBCT were computed and compared. Pearson correlation coefficients were computed to compare textural biomarkers values of μCT and hr-CBCT. Four of the nine computed textural biomarkers showed a strong positive correlation between biomarkers computed in μCT and hr-CBCT. Higher correlations in Energy and Contrast, and in GLN (grey-level non-uniformity) and RLN (run length non-uniformity) indicate quantitative texture features can be computed reliably in hr-CBCT, when compared with μCT. The textural imaging biomarkers computed in-vivo hr-CBCT have captured the structure, patterns, contrast between neighboring regions and uniformity of healthy and/or pathologic subchondral bone. The ability to quantify bone texture non-invasively now makes it possible to evaluate the progression of subchondral bone alterations, in TMJ OA.
Fast methods to numerically integrate the Reynolds equation for gas fluid films
NASA Technical Reports Server (NTRS)
Dimofte, Florin
1992-01-01
The alternating direction implicit (ADI) method is adopted, modified, and applied to the Reynolds equation for thin, gas fluid films. An efficient code is developed to predict both the steady-state and dynamic performance of an aerodynamic journal bearing. An alternative approach is shown for hybrid journal gas bearings by using Liebmann's iterative solution (LIS) for elliptic partial differential equations. The results are compared with known design criteria from experimental data. The developed methods show good accuracy and very short computer running time in comparison with methods based on inverting a matrix. The computer codes need a small amount of memory and can be run on either personal computers or on mainframe systems.
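Liebmann's iterative solution referred to above is the classic Gauss-Seidel point iteration for elliptic equations. A generic sketch on a Laplace model problem (not the bearing code itself, whose governing equation and boundary conditions are more involved):

```python
import numpy as np

def liebmann_laplace(u, tol=1e-6, max_iter=10_000):
    """Gauss-Seidel (Liebmann) iteration for Laplace's equation on a
    rectangular grid; boundary values of u are held fixed."""
    for _ in range(max_iter):
        max_change = 0.0
        for i in range(1, u.shape[0] - 1):
            for j in range(1, u.shape[1] - 1):
                new = 0.25 * (u[i+1, j] + u[i-1, j] + u[i, j+1] + u[i, j-1])
                max_change = max(max_change, abs(new - u[i, j]))
                u[i, j] = new  # updated value used immediately (in-place sweep)
        if max_change < tol:
            break
    return u

# Toy example: unit square, one hot edge held at 1, three cold edges held at 0.
grid = np.zeros((21, 21))
grid[0, :] = 1.0
print(liebmann_laplace(grid)[10, 10])   # potential at the centre of the domain
```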
New insights into faster computation of uncertainties
NASA Astrophysics Data System (ADS)
Bhattacharya, Atreyee
2012-11-01
Heavy computation power, lengthy simulations, and an exhaustive number of model runs—often these seem like the only statistical tools that scientists have at their disposal when computing uncertainties associated with predictions, particularly in cases of environmental processes such as groundwater movement. However, calculation of uncertainties need not be as lengthy, a new study shows. Comparing two approaches—the classical Bayesian “credible interval” and a less commonly used regression-based “confidence interval” method—Lu et al. show that for many practical purposes both methods provide similar estimates of uncertainties. The advantage of the regression method is that it demands 10-1000 model runs, whereas the classical Bayesian approach requires 10,000 to millions of model runs.
First International Diagnosis Competition - DXC'09
NASA Technical Reports Server (NTRS)
Kurtoglu, Tolga; Narasimhan, Sriram; Poll, Scott; Garcia, David; Kuhn, Lukas; deKleer, Johan; vanGemund, Arjan; Feldman, Alexander
2009-01-01
A framework to compare and evaluate diagnosis algorithms (DAs) has been created jointly by NASA Ames Research Center and PARC. In this paper, we present the first concrete implementation of this framework as a competition called DXC 09. The goal of this competition was to evaluate and compare DAs in a common platform and to determine a winner based on diagnosis results. 12 DAs (model-based and otherwise) competed in this first year of the competition in 3 tracks that included industrial and synthetic systems. Specifically, the participants provided algorithms that communicated with the run-time architecture to receive scenario data and return diagnostic results. These algorithms were run on extended scenario data sets (different from sample set) to compute a set of pre-defined metrics. A ranking scheme based on weighted metrics was used to declare winners. This paper presents the systems used in DXC 09, description of faults and data sets, a listing of participating DAs, the metrics and results computed from running the DAs, and a superficial analysis of the results.
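The abstract does not list the metric weights, so the following is only a generic illustration of a weighted-metric ranking; the metric names, weights, and scores are invented, not the DXC'09 values:

```python
# Hypothetical weighted ranking of diagnosis algorithms (DAs).
weights = {"detection_accuracy": 0.4, "isolation_accuracy": 0.4, "cpu_time": -0.2}

scores = {
    "DA_1": {"detection_accuracy": 0.92, "isolation_accuracy": 0.80, "cpu_time": 0.30},
    "DA_2": {"detection_accuracy": 0.88, "isolation_accuracy": 0.86, "cpu_time": 0.10},
}

def weighted_score(metrics):
    # Sum of weight * metric value; negative weights penalize costly metrics.
    return sum(weights[name] * value for name, value in metrics.items())

ranking = sorted(scores, key=lambda da: weighted_score(scores[da]), reverse=True)
print(ranking)  # best-ranked DA first
```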
DOE Office of Scientific and Technical Information (OSTI.GOV)
Greenough, Jeffrey A.; de Supinski, Bronis R.; Yates, Robert K.
2005-04-25
We describe the performance of the block-structured Adaptive Mesh Refinement (AMR) code Raptor on the 32k node IBM BlueGene/L computer. This machine represents a significant step forward towards petascale computing. As such, it presents Raptor with many challenges for utilizing the hardware efficiently. In terms of performance, Raptor shows excellent weak and strong scaling when running in single level mode (no adaptivity). Hardware performance monitors show Raptor achieves an aggregate performance of 3.0 Tflops in the main integration kernel on the 32k system. Results from preliminary AMR runs on a prototype astrophysical problem demonstrate the efficiency of the current software when running at large scale. The BG/L system is enabling a physics problem to be considered that represents a factor of 64 increase in overall size compared to the largest ones of this type computed to date. Finally, we provide a description of the development work currently underway to address our inefficiencies.
Bonacci, Jason; Saunders, Philo U; Hicks, Amy; Rantalainen, Timo; Vicenzino, Bill Guglielmo T; Spratford, Wayne
2013-04-01
The purpose of this study was to determine the changes in running mechanics that occur when highly trained runners run barefoot and in a minimalist shoe, and specifically if running in a minimalist shoe replicates barefoot running. Ground reaction force data and kinematics were collected from 22 highly trained runners during overground running while barefoot and in three shod conditions (minimalist shoe, racing flat and the athlete's regular shoe). Three-dimensional net joint moments and subsequent net powers and work were computed using Newton-Euler inverse dynamics. Joint kinematic and kinetic variables were statistically compared between barefoot and shod conditions using a multivariate analysis of variance for repeated measures and standardised mean differences calculated. There were significant differences between barefoot and shod conditions for kinematic and kinetic variables at the knee and ankle, with no differences between shod conditions. Barefoot running demonstrated less knee flexion during midstance, an 11% decrease in the peak internal knee extension and abduction moments and a 24% decrease in negative work done at the knee compared with shod conditions. The ankle demonstrated less dorsiflexion at initial contact, a 14% increase in peak power generation and a 19% increase in the positive work done during barefoot running compared with shod conditions. Barefoot running was different to all shod conditions. Barefoot running changes the amount of work done at the knee and ankle joints and this may have therapeutic and performance implications for runners.
McEwan, Phil; Bergenheim, Klas; Yuan, Yong; Tetlow, Anthony P; Gordon, Jason P
2010-01-01
Simulation techniques are well suited to modelling diseases yet can be computationally intensive. This study explores the relationship between modelled effect size, statistical precision, and efficiency gains achieved using variance reduction and an executable programming language. A published simulation model designed to model a population with type 2 diabetes mellitus based on the UKPDS 68 outcomes equations was coded in both Visual Basic for Applications (VBA) and C++. Efficiency gains due to the programming language were evaluated, as was the impact of antithetic variates to reduce variance, using predicted QALYs over a 40-year time horizon. The use of C++ provided a 75- and 90-fold reduction in simulation run time when using mean and sampled input values, respectively. For a series of 50 one-way sensitivity analyses, this would yield a total run time of 2 minutes when using C++, compared with 155 minutes for VBA when using mean input values. The use of antithetic variates typically resulted in a 53% reduction in the number of simulation replications and run time required. When drawing all input values to the model from distributions, the use of C++ and variance reduction resulted in a 246-fold improvement in computation time compared with VBA - for which the evaluation of 50 scenarios would correspondingly require 3.8 hours (C++) and approximately 14.5 days (VBA). The choice of programming language used in an economic model, as well as the methods for improving precision of model output can have profound effects on computation time. When constructing complex models, more computationally efficient approaches such as C++ and variance reduction should be considered; concerns regarding model transparency using compiled languages are best addressed via thorough documentation and model validation.
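The antithetic-variates technique used above pairs each uniform draw u with its mirror 1 - u, so the two runs in a pair are negatively correlated and their average has lower variance. A minimal sketch on a toy expectation (Python here rather than VBA or C++, and not the UKPDS 68 model):

```python
import math
import random

def model_output(u):
    """Toy stand-in for one simulated outcome driven by a uniform draw."""
    return math.exp(u)          # E[exp(U)] = e - 1 for U ~ Uniform(0, 1)

def plain_mc(n):
    # Ordinary Monte Carlo: n independent draws.
    return sum(model_output(random.random()) for _ in range(n)) / n

def antithetic_mc(n):
    # n/2 antithetic pairs (u, 1-u) give n evaluations with reduced variance.
    total = 0.0
    for _ in range(n // 2):
        u = random.random()
        total += model_output(u) + model_output(1.0 - u)
    return total / (2 * (n // 2))

random.seed(1)
print(plain_mc(10_000), antithetic_mc(10_000), math.e - 1)
```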
Another Program For Generating Interactive Graphics
NASA Technical Reports Server (NTRS)
Costenbader, Jay; Moleski, Walt; Szczur, Martha; Howell, David; Engelberg, Norm; Li, Tin P.; Misra, Dharitri; Miller, Philip; Neve, Leif; Wolf, Karl;
1991-01-01
VAX/Ultrix version of Transportable Applications Environment Plus (TAE+) computer program provides integrated, portable software environment for developing and running interactive window, text, and graphical-object-based application software systems. Enables programmer or nonprogrammer to construct easily custom software interface between user and application program and to move resulting interface program and its application program to different computers. When used throughout company for wide range of applications, makes both application program and computer seem transparent, with noticeable improvements in learning curve. Available in form suitable for following six different groups of computers: DEC VAX station and other VMS VAX computers, Macintosh II computers running AUX, Apollo Domain Series 3000, DEC VAX and reduced-instruction-set-computer workstations running Ultrix, Sun 3- and 4-series workstations running Sun OS and IBM RT/PC's and PS/2 computers running AIX, and HP 9000 S
Grace: A cross-platform micromagnetic simulator on graphics processing units
NASA Astrophysics Data System (ADS)
Zhu, Ru
2015-12-01
A micromagnetic simulator running on graphics processing units (GPUs) is presented. Different from GPU implementations of other research groups, which predominantly run on NVidia's CUDA platform, this simulator is developed with C++ Accelerated Massive Parallelism (C++ AMP) and is hardware platform independent. It runs on GPUs from vendors including NVidia, AMD and Intel, and achieves a significant performance boost compared to previous central processing unit (CPU) simulators, up to two orders of magnitude. The simulator paves the way for running large micromagnetic simulations on both high-end workstations with dedicated graphics cards and low-end personal computers with integrated graphics cards, and is freely available to download.
Visualization of Octree Adaptive Mesh Refinement (AMR) in Astrophysical Simulations
NASA Astrophysics Data System (ADS)
Labadens, M.; Chapon, D.; Pomaréde, D.; Teyssier, R.
2012-09-01
Computer simulations are important in current cosmological research. These simulations run in parallel on thousands of processors and produce huge amounts of data. Adaptive mesh refinement is used to reduce the computing cost while keeping good numerical accuracy in regions of interest. RAMSES is a cosmological code developed by the Commissariat à l'énergie atomique et aux énergies alternatives (English: Atomic Energy and Alternative Energies Commission) which uses octree adaptive mesh refinement. Compared to grid-based AMR, octree AMR has the advantage of fitting the adaptive resolution of the grid very precisely to the local problem complexity. However, this specific octree data type needs specific software to be visualized, as generic visualization tools work on Cartesian grid data types. This is why the PYMSES software has also been developed by our team. It relies on the Python scripting language to ensure modular and easy access for exploring these specific data. In order to take advantage of the high-performance computer which runs the RAMSES simulation, it also uses MPI and multiprocessing to run some parallel code. We would like to present our PYMSES software in more detail, with some performance benchmarks. PYMSES currently has two visualization techniques which work directly on the AMR. The first one is a splatting technique, and the second one is a custom ray-tracing technique. Both have their own advantages and drawbacks. We have also compared two parallel programming techniques: the Python multiprocessing library versus the use of MPI. The load-balancing strategy has to be defined smartly in order to achieve a good speedup in our computation. Results obtained with this software are illustrated in the context of a massive, 9000-processor parallel simulation of a Milky Way-like galaxy.
Williams, Paul T
2012-01-01
Current physical activity recommendations assume that different activities can be exchanged to produce the same weight-control benefits so long as total energy expended remains the same (exchangeability premise). To this end, they recommend calculating energy expenditure as the product of the time spent performing each activity and the activity's metabolic equivalents (MET), which may be summed to achieve target levels. The validity of the exchangeability premise was assessed using data from the National Runners' Health Study. Physical activity dose was compared to body mass index (BMI) and body circumferences in 33,374 runners who reported usual distance run and pace, and usual times spent running and other exercises per week. MET hours per day (METhr/d) from running was computed from: a) time and intensity, and b) reported distance run (1.02 MET·hours per km). When computed from time and intensity, the declines (slope ± SE) per METhr/d were significantly greater (P < 10^-15) for running than non-running exercise for BMI (slopes ± SE, male: -0.12 ± 0.00 vs. 0.00 ± 0.00; female: -0.12 ± 0.00 vs. -0.01 ± 0.01 kg/m^2 per METhr/d) and waist circumference (male: -0.28 ± 0.01 vs. -0.07 ± 0.01; female: -0.31 ± 0.01 vs. -0.05 ± 0.01 cm per METhr/d). Reported METhr/d of running was 38% to 43% greater when calculated from time and intensity than from distance. Moreover, the declines per METhr/d run were significantly greater when estimated from reported distance for BMI (males: -0.29 ± 0.01; females: -0.27 ± 0.01 kg/m^2 per METhr/d) and waist circumference (males: -0.67 ± 0.02; females: -0.69 ± 0.02 cm per METhr/d) than when computed from time and intensity (cited above). The exchangeability premise was not supported for running vs. non-running exercise. Moreover, distance-based running prescriptions may provide better weight control than time-based prescriptions for running or other activities. Additional longitudinal studies and randomized clinical trials are required to verify these results prospectively.
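As a worked example of the distance-based conversion quoted above (the 8 km/day figure is illustrative, not a value from the study):

```latex
\text{MET·h/d} \;=\; 1.02\ \text{MET·h/km} \times \text{km/d},
\qquad
\text{e.g. } 8\ \text{km/d} \;\Rightarrow\; 1.02 \times 8 \approx 8.2\ \text{MET·h/d}
```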
RSTensorFlow: GPU Enabled TensorFlow for Deep Learning on Commodity Android Devices
Alzantot, Moustafa; Wang, Yingnan; Ren, Zhengshuang; Srivastava, Mani B.
2018-01-01
Mobile devices have become an essential part of our daily lives. By virtue of both their increasing computing power and the recent progress made in AI, mobile devices have evolved to act as intelligent assistants in many tasks rather than a mere way of making phone calls. However, popular and commonly used tools and frameworks for machine intelligence are still lacking the ability to make proper use of the available heterogeneous computing resources on mobile devices. In this paper, we study the benefits of utilizing the heterogeneous (CPU and GPU) computing resources available on commodity Android devices while running deep learning models. We leveraged the heterogeneous computing framework RenderScript to accelerate the execution of deep learning models on commodity Android devices. Our system is implemented as an extension to the popular open-source framework TensorFlow. By integrating our acceleration framework tightly into TensorFlow, machine learning engineers can now easily benefit from the heterogeneous computing resources on mobile devices without the need for any extra tools. We evaluate our system on different Android phone models to study the trade-offs of running different neural network operations on the GPU. We also compare the performance of running different model architectures, such as convolutional and recurrent neural networks, on the CPU only versus using heterogeneous computing resources. Our results show that the GPUs on the phones are capable of offering substantial performance gains in matrix multiplication on mobile devices. Consequently, models that involve multiplication of large matrices can run much faster (approx. 3 times faster in our experiments) due to GPU support. PMID:29629431
RSTensorFlow: GPU Enabled TensorFlow for Deep Learning on Commodity Android Devices.
Alzantot, Moustafa; Wang, Yingnan; Ren, Zhengshuang; Srivastava, Mani B
2017-06-01
Mobile devices have become an essential part of our daily lives. By virtue of both their increasing computing power and the recent progress made in AI, mobile devices have evolved to act as intelligent assistants in many tasks rather than a mere way of making phone calls. However, popular and commonly used tools and frameworks for machine intelligence are still lacking the ability to make proper use of the available heterogeneous computing resources on mobile devices. In this paper, we study the benefits of utilizing the heterogeneous (CPU and GPU) computing resources available on commodity Android devices while running deep learning models. We leveraged the heterogeneous computing framework RenderScript to accelerate the execution of deep learning models on commodity Android devices. Our system is implemented as an extension to the popular open-source framework TensorFlow. By integrating our acceleration framework tightly into TensorFlow, machine learning engineers can now easily benefit from the heterogeneous computing resources on mobile devices without the need for any extra tools. We evaluate our system on different Android phone models to study the trade-offs of running different neural network operations on the GPU. We also compare the performance of running different model architectures, such as convolutional and recurrent neural networks, on the CPU only versus using heterogeneous computing resources. Our results show that the GPUs on the phones are capable of offering substantial performance gains in matrix multiplication on mobile devices. Consequently, models that involve multiplication of large matrices can run much faster (approx. 3 times faster in our experiments) due to GPU support.
Comparison of scientific computing platforms for MCNP4A Monte Carlo calculations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hendricks, J.S.; Brockhoff, R.C.
1994-04-01
The performance of seven computer platforms is evaluated with the widely used and internationally available MCNP4A Monte Carlo radiation transport code. All results are reproducible and are presented in such a way as to enable comparison with computer platforms not in the study. The authors observed that the HP/9000-735 workstation runs MCNP 50% faster than the Cray YMP 8/64. Compared with the Cray YMP 8/64, the IBM RS/6000-560 is 68% as fast, the Sun Sparc10 is 66% as fast, the Silicon Graphics ONYX is 90% as fast, the Gateway 2000 model 4DX2-66V personal computer is 27% as fast, and the Sun Sparc2 is 24% as fast. In addition to comparing the timing performance of the seven platforms, the authors observe that changes in compilers and software over the past 2 yr have resulted in only modest performance improvements, hardware improvements have enhanced performance by less than a factor of approximately 3, timing studies are very problem dependent, and MCNP4A runs about as fast as MCNP4.
Computer-based testing of the modified essay question: the Singapore experience.
Lim, Erle Chuen-Hian; Seet, Raymond Chee-Seong; Oh, Vernon M S; Chia, Boon-Lock; Aw, Marion; Quak, Seng-Hock; Ong, Benjamin K C
2007-11-01
The modified essay question (MEQ), featuring an evolving case scenario, tests a candidate's problem-solving and reasoning ability, rather than mere factual recall. Although it is traditionally conducted as a pen-and-paper examination, our university has run the MEQ using computer-based testing (CBT) since 2003. We describe our experience with running the MEQ examination using the IVLE, or integrated virtual learning environment (https://ivle.nus.edu.sg), provide a blueprint for universities intending to conduct computer-based testing of the MEQ, and detail how our MEQ examination has evolved since its inception. An MEQ committee, comprising specialists in key disciplines from the departments of Medicine and Paediatrics, was formed. We utilized the IVLE, developed for our university in 1998, as the online platform on which we ran the MEQ. We calculated the number of man-hours (academic and support staff) required to run the MEQ examination, using either a computer-based or pen-and-paper format. With the support of our university's information technology (IT) specialists, we have successfully run the MEQ examination online, twice a year, since 2003. Initially, we conducted the examination with short-answer questions only, but have since expanded the MEQ examination to include multiple-choice and extended matching questions. A total of 1268 man-hours was spent in preparing for, and running, the MEQ examination using CBT, compared to 236.5 man-hours to run it using a pen-and-paper format. Despite being more labour-intensive, our students and staff prefer CBT to the pen-and-paper format. The MEQ can be conducted using a computer-based testing scenario, which offers several advantages over a pen-and-paper format. We hope to increase the number of questions and incorporate audio and video files, featuring clinical vignettes, to the MEQ examination in the near future.
Distributed run of a one-dimensional model in a regional application using SOAP-based web services
NASA Astrophysics Data System (ADS)
Smiatek, Gerhard
This article describes the setup of a distributed computing system in Perl. It facilitates the parallel run of a one-dimensional environmental model on a number of simple network PC hosts. The system uses Simple Object Access Protocol (SOAP) driven web services offering the model run on remote hosts and a multi-thread environment distributing the work and accessing the web services. Its application is demonstrated in a regional run of a process-oriented biogenic emission model for the area of Germany. Within a network consisting of up to seven web services implemented on Linux and MS-Windows hosts, a performance increase of approximately 400% has been reached compared to a model run on the fastest single host.
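The dispatch pattern described above, where a pool of worker threads distributes independent model runs across remote hosts, looks roughly like the sketch below; it uses Python's concurrent.futures in place of the article's Perl SOAP client, and the endpoint URLs, grid-cell identifiers, and run_remote function are invented placeholders:

```python
from concurrent.futures import ThreadPoolExecutor

HOSTS = ["http://host1/model", "http://host2/model", "http://host3/model"]  # placeholder endpoints

def run_remote(host, grid_cell):
    """Placeholder for the remote call that runs the 1-D emission model for
    one grid cell on a remote host and returns its result."""
    return f"{grid_cell} computed on {host}"

grid_cells = [f"cell_{i:04d}" for i in range(12)]

# Up to len(HOSTS) calls run concurrently; cells are assigned to hosts round-robin.
with ThreadPoolExecutor(max_workers=len(HOSTS)) as pool:
    futures = [pool.submit(run_remote, HOSTS[i % len(HOSTS)], cell)
               for i, cell in enumerate(grid_cells)]
    for f in futures:
        print(f.result())
```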
Boundary-layer transition and global skin friction measurement with an oil-fringe imaging technique
NASA Technical Reports Server (NTRS)
Monson, Daryl J.; Mateer, George G.; Menter, Florian R.
1993-01-01
A new oil-fringe imaging system skin friction (FISF) technique to measure skin friction on wind tunnel models is presented. In the method used to demonstrate the technique, lines of oil are applied on surfaces that connect the intended sets of measurement points, and then a wind tunnel is run so that the oil thins and forms interference fringes that are spaced in proportion to local skin friction. After a run the fringe spacings are imaged with a CCD-array digital camera and measured on a computer. Skin friction and transition measurements on a two-dimensional wing are presented and compared with computational predictions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hasenkamp, Daren; Sim, Alexander; Wehner, Michael
Extensive computing power has been used to tackle issues such as climate change, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently run on only a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows the user to keep their familiar data analysis environment in the VMs, while we provide the coordination and data transfer services to ensure the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup that is comparable to running the same analysis task using MPI. However, compared to MPI-based parallelization, the cloud-based approach has a number of advantages. The cloud-based approach is more flexible because the VMs can capture arbitrary software dependencies without requiring the user to rewrite their programs. The cloud-based approach is also more resilient to failure: as long as a single VM is running, it can make progress, whereas as soon as one MPI node fails, the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.
NASA Technical Reports Server (NTRS)
Chawner, David M.; Gomez, Ray J.
2010-01-01
In the Applied Aerosciences and CFD branch at Johnson Space Center, computational simulations are run that face many challenges, two of which are the ability to customize software for specialized needs and the need to run simulations as fast as possible. There are many different tools that are used for running these simulations and each one has its own pros and cons. Once these simulations are run, there needs to be software capable of visualizing the results in an appealing manner. Some of this software is called open source, meaning that anyone can edit the source code to make modifications and distribute it to all other users in a future release. This is very useful, especially in this branch where many different tools are being used. File readers can be written to load any file format into a program, to ease the bridging from one tool to another. Programming such a reader requires knowledge of the file format that is being read as well as the equations necessary to obtain the derived values after loading. When running these CFD simulations, extremely large files are being loaded and values are being calculated. These simulations usually take a few hours to complete, even on the fastest machines. Graphics processing units (GPUs) are usually used to handle graphics rendering on computers; however, in recent years, GPUs have been used for more generic applications because of the speed of these processors. Applications run on GPUs have been known to run up to forty times faster than they would on normal central processing units (CPUs). If these CFD programs are extended to run on GPUs, the amount of time they require to complete would be much less. This would allow more simulations to be run in the same amount of time and possibly allow more complex computations to be performed.
The Weekly Fab Five: Things You Should Do Every Week To Keep Your Computer Running in Tip-Top Shape.
ERIC Educational Resources Information Center
Crispen, Patrick
2001-01-01
Describes five steps that school librarians should follow every week to keep their computers running at top efficiency. Explains how to update virus definitions; run Windows update; run ScanDisk to repair errors on the hard drive; run a disk defragmenter; and backup all data. (LRW)
Besnier, Francois; Glover, Kevin A.
2013-01-01
This software package provides an R-based framework to make use of multi-core computers when running analyses in the population genetics program STRUCTURE. It is especially addressed to those users of STRUCTURE dealing with numerous and repeated data analyses, and who could take advantage of an efficient script to automatically distribute STRUCTURE jobs among multiple processors. It also consists of additional functions to divide analyses among combinations of populations within a single data set without the need to manually produce multiple projects, as it is currently the case in STRUCTURE. The package consists of two main functions: MPI_structure() and parallel_structure() as well as an example data file. We compared the performance in computing time for this example data on two computer architectures and showed that the use of the present functions can result in several-fold improvements in terms of computation time. ParallelStructure is freely available at https://r-forge.r-project.org/projects/parallstructure/. PMID:23923012
Investigation of Storage Options for Scientific Computing on Grid and Cloud Facilities
NASA Astrophysics Data System (ADS)
Garzoglio, Gabriele
2012-12-01
In recent years, several new storage technologies, such as Lustre, Hadoop, OrangeFS, and BlueArc, have emerged. While several groups have run benchmarks to characterize them under a variety of configurations, more work is needed to evaluate these technologies for the use cases of scientific computing on Grid clusters and Cloud facilities. This paper discusses our evaluation of the technologies as deployed on a test bed at FermiCloud, one of the Fermilab infrastructure-as-a-service Cloud facilities. The test bed consists of 4 server-class nodes with 40 TB of disk space and up to 50 virtual machine clients, some running on the storage server nodes themselves. With this configuration, the evaluation compares the performance of some of these technologies when deployed on virtual machines and on “bare metal” nodes. In addition to running standard benchmarks such as IOZone to check the sanity of our installation, we have run I/O intensive tests using physics-analysis applications. This paper presents how the storage solutions perform in a variety of realistic use cases of scientific computing. One interesting difference among the storage systems tested is found in a decrease in total read throughput with increasing number of client processes, which occurs in some implementations but not others.
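The throughput effect noted at the end of the abstract can be probed with a much simpler harness than IOZone. The following Python sketch (an illustration, not the benchmark actually used in the paper) measures aggregate read throughput as the number of concurrent client processes grows; the test-file path is a placeholder, and page-cache effects would need to be controlled in a real measurement.

```python
"""Sketch: aggregate read throughput vs. number of concurrent reader processes."""
import time
from multiprocessing import Pool

TEST_FILE = "/mnt/storage/testfile.bin"   # placeholder path on the storage under test
BLOCK = 1024 * 1024                        # 1 MiB reads


def read_all(_):
    """Sequentially read the whole test file and return the bytes read."""
    total = 0
    with open(TEST_FILE, "rb", buffering=0) as f:
        while True:
            chunk = f.read(BLOCK)
            if not chunk:
                return total
            total += len(chunk)


if __name__ == "__main__":
    # In a serious benchmark, drop the OS page cache between runs so that
    # repeated reads actually hit the storage system.
    for nclients in (1, 2, 4, 8, 16, 32):
        start = time.time()
        with Pool(nclients) as pool:
            total_bytes = sum(pool.map(read_all, range(nclients)))
        elapsed = time.time() - start
        print(f"{nclients:3d} clients: {total_bytes / elapsed / 1e6:.1f} MB/s aggregate")
```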
Program For Generating Interactive Displays
NASA Technical Reports Server (NTRS)
Costenbader, Jay; Moleski, Walt; Szczur, Martha; Howell, David; Engelberg, Norm; Li, Tin P.; Misra, Dharitri; Miller, Philip; Neve, Leif; Wolf, Karl;
1991-01-01
Sun/Unix version of Transportable Applications Environment Plus (TAE+) computer program provides integrated, portable software environment for developing and running interactive window, text, and graphical-object-based application software systems. Enables programmer or nonprogrammer to easily construct custom software interfaces between user and application program and to move resulting interface program and its application program to different computers. TAE+ is viewed as productivity tool for application developers and application end users, who benefit from resultant consistent and well-designed user interface sheltering them from intricacies of computer. Available in forms suitable for following six different groups of computers: DEC VAXstation and other VMS VAX computers, Macintosh II computers running A/UX, Apollo Domain Series 3000, DEC VAX and reduced-instruction-set-computer workstations running Ultrix, Sun 3- and 4-series workstations running SunOS, and IBM RT/PC and PS/2 computers.
A Web-based Distributed Voluntary Computing Platform for Large Scale Hydrological Computations
NASA Astrophysics Data System (ADS)
Demir, I.; Agliamzanov, R.
2014-12-01
Distributed volunteer computing can enable researchers and scientists to form large parallel computing environments that utilize the computing power of the millions of computers on the Internet, and use them to run large-scale environmental simulations and models that serve the common good of local communities and the world. Recent developments in web technologies and standards allow client-side scripting languages to run at speeds close to native applications and to utilize the power of Graphics Processing Units (GPUs). Using a client-side scripting language like JavaScript, we have developed an open distributed computing framework that makes it easy for researchers to write their own hydrologic models and run them on volunteer computers. Website owners can easily enable visitors to volunteer their computing resources to help run advanced hydrological models and simulations. Using a web-based system allows users to start volunteering their computational resources within seconds without installing any software. The framework distributes the model simulation to thousands of nodes in small spatial and computational pieces. A relational database system is utilized for managing data connections and queue management for the distributed computing nodes. In this paper, we present a web-based distributed volunteer computing platform to enable large-scale hydrological simulations and model runs in an open and integrated environment.
Is midsole thickness a key parameter for the running pattern?
Chambon, Nicolas; Delattre, Nicolas; Guéguen, Nils; Berton, Eric; Rao, Guillaume
2014-01-01
Many studies have highlighted differences in foot strike pattern when comparing habitually shod runners running barefoot and with running shoes. Barefoot running results in a flatter foot landing and in a decreased vertical ground reaction force compared to shod running. The aim of this study was to investigate one possible parameter influencing running pattern: midsole thickness. Fifteen participants ran overground at 3.3 m s(-1) barefoot and with five shoes of different midsole thickness (0 mm, 2 mm, 4 mm, 8 mm, 16 mm) with no height difference between rearfoot and forefoot. Impact magnitude was evaluated using the transient peak of vertical ground reaction force, loading rate, tibial acceleration peak and rate. Hip, knee and ankle flexion angles were computed at touch-down and during the stance phase (range of motion and maximum values). External net joint moments and stiffness for hip, knee and ankle joints were also observed, as well as global leg stiffness. No significant effect of midsole thickness was observed on ground reaction force or tibial acceleration. However, contact time increased with midsole thickness. Barefoot running compared to shod running induced ankle plantar flexion at touch-down, higher ankle dorsiflexion and lower knee flexion during the stance phase. These adjustments are suspected to explain the absence of difference in ground reaction force and tibial acceleration. This study showed that the presence of a very thin footwear upper and sole was sufficient to significantly influence the running pattern. Copyright © 2014 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Richen; Guo, Hanqi; Yuan, Xiaoru
Most of the existing approaches to visualize vector field ensembles reveal the uncertainty of individual variables, for example, statistics, variability, etc. However, a user-defined derived feature like a vortex or an air mass is also quite significant, since such features make more sense to domain scientists. In this paper, we present a new framework to extract user-defined derived features from different simulation runs. Specifically, we use a detail-to-overview searching scheme to help extract vortices with a user-defined shape. We further compute geometric information, including the size and geospatial location of the extracted vortices. We also design linked views to compare them between different runs. Finally, temporal information such as the occurrence time of the feature is estimated and compared. Results show that our method is capable of extracting the features across different runs and comparing them spatially and temporally.
CUDA Fortran acceleration for the finite-difference time-domain method
NASA Astrophysics Data System (ADS)
Hadi, Mohammed F.; Esmaeili, Seyed A.
2013-05-01
A detailed description of programming the three-dimensional finite-difference time-domain (FDTD) method to run on graphical processing units (GPUs) using CUDA Fortran is presented. Two FDTD-to-CUDA thread-block mapping designs are investigated and their performances compared. Comparative assessment of trade-offs between GPU's shared memory and L1 cache is also discussed. This presentation is for the benefit of FDTD programmers who work exclusively with Fortran and are reluctant to port their codes to C in order to utilize GPU computing. The derived CUDA Fortran code is compared with an optimized CPU version that runs on a workstation-class CPU to present a realistic GPU to CPU run time comparison and thus help in making better informed investment decisions on FDTD code redesigns and equipment upgrades. All analyses are mirrored with CUDA C simulations to put in perspective the present state of CUDA Fortran development.
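For readers less familiar with the update pattern that such GPU ports accelerate, here is a minimal one-dimensional Yee-scheme FDTD sketch in Python/NumPy. It is illustrative only (the code described above is three-dimensional CUDA Fortran), but the two staggered update loops are exactly the kernels that get mapped onto GPU thread blocks.

```python
"""Sketch: 1-D FDTD (Yee scheme) in normalized units, illustrating the
update loops that a CUDA implementation maps onto GPU threads."""
import numpy as np

nx, nsteps = 400, 1000
ez = np.zeros(nx)          # electric field
hy = np.zeros(nx - 1)      # magnetic field, staggered half a cell

for n in range(nsteps):
    # H update (conceptually one GPU thread per cell)
    hy += 0.5 * (ez[1:] - ez[:-1])
    # E update for interior cells
    ez[1:-1] += 0.5 * (hy[1:] - hy[:-1])
    # simple hard source in the middle of the grid
    ez[nx // 2] += np.exp(-((n - 30) / 10.0) ** 2)

print("peak |Ez| after", nsteps, "steps:", np.abs(ez).max())
```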
Bartneck, Christoph; Duenser, Andreas; Moltchanova, Elena; Zawieska, Karolina
2015-01-01
Computer and internet based questionnaires have become a standard tool in Human-Computer Interaction research and other related fields, such as psychology and sociology. Amazon’s Mechanical Turk (AMT) service is a new method of recruiting participants and conducting certain types of experiments. This study compares whether participants recruited through AMT give different responses than participants recruited through an online forum or recruited directly on a university campus. Moreover, we compare whether a study conducted within AMT results in different responses compared to a study for which participants are recruited through AMT but which is conducted using an external online questionnaire service. The results of this study show that there is a statistical difference between results obtained from participants recruited through AMT compared to the results from the participant recruited on campus or through online forums. We do, however, argue that this difference is so small that it has no practical consequence. There was no significant difference between running the study within AMT compared to running it with an online questionnaire service. There was no significant difference between results obtained directly from within AMT compared to results obtained in the campus and online forum condition. This may suggest that AMT is a viable and economical option for recruiting participants and for conducting studies as setting up and running a study with AMT generally requires less effort and time compared to other frequently used methods. We discuss our findings as well as limitations of using AMT for empirical studies. PMID:25876027
Irausquin, Roelof A.; Scavelli, Thomas D.; Corti, Lisa; Stefanacci, Joseph D.; DeMarco, Joann; Flood, Shannon; Rohrbach, Barton W.
2008-01-01
Evaluation of dogs with splenic masses to better educate owners as to the extent of the disease is a goal of many research studies. We compared the use of ultrasonography (US) and contrast-enhanced computed tomography (CT) to evaluate the accuracy of detecting hepatic neoplasia in dogs with splenic masses, independently, in series, or in parallel. No significant difference was found between US and CT. If the presence or absence of ascites, as detected with US, was used as a pretest probability of disease in our population, the positive predictive value increased to 94% if the tests were run in series, and the negative predictive value increased to 95% if the tests were run in parallel. The study showed that CT combined with US could be a valuable tool in evaluation of dogs with splenic masses. PMID:18320977
Experiences using OpenMP based on Compiler Directed Software DSM on a PC Cluster
NASA Technical Reports Server (NTRS)
Hess, Matthias; Jost, Gabriele; Mueller, Matthias; Ruehle, Roland
2003-01-01
In this work we report on our experiences running OpenMP programs on a commodity cluster of PCs running a software distributed shared memory (DSM) system. We describe our test environment and report on the performance of a subset of the NAS Parallel Benchmarks that have been automatically parallelized for OpenMP. We compare the performance of the OpenMP implementations with that of their message passing counterparts and discuss performance differences.
Plancton: an opportunistic distributed computing project based on Docker containers
NASA Astrophysics Data System (ADS)
Concas, Matteo; Berzano, Dario; Bagnasco, Stefano; Lusso, Stefano; Masera, Massimo; Puccio, Maximiliano; Vallero, Sara
2017-10-01
The computing power of most modern commodity computers is far from being fully exploited by standard usage patterns. In this work we describe the development and setup of a virtual computing cluster based on Docker containers used as worker nodes. The facility is based on Plancton: a lightweight fire-and-forget background service. Plancton spawns and controls a local pool of Docker containers on a host with free resources, by constantly monitoring its CPU utilisation. It is designed to release the resources allocated opportunistically, whenever another demanding task is run by the host user, according to configurable policies. This is attained by killing a number of running containers. One of the advantages of a thin virtualization layer such as Linux containers is that they can be started almost instantly upon request. We will show how fast the start-up and disposal of containers eventually enables us to implement an opportunistic cluster based on Plancton daemons without a central control node, where the spawned Docker containers behave as job pilots. Finally, we will show how Plancton was configured to run up to 10 000 concurrent opportunistic jobs on the ALICE High-Level Trigger facility, by giving a considerable advantage in terms of management compared to virtual machines.
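A rough impression of the Plancton control loop can be given in a few lines of Python using the Docker SDK and psutil. The image name, thresholds, and container limit below are illustrative assumptions, not Plancton's actual policies or code.

```python
"""Sketch of a Plancton-like opportunistic loop: spawn Docker containers while
the host CPU is idle and kill them when the load rises. The image name,
thresholds and limits are illustrative assumptions, not Plancton's actual
configuration."""
import time

import docker   # pip install docker
import psutil   # pip install psutil

IMAGE = "myexperiment/pilot:latest"    # hypothetical pilot-job image
SPAWN_BELOW, KILL_ABOVE = 30.0, 80.0   # CPU utilisation thresholds (percent)
MAX_CONTAINERS = 4

client = docker.from_env()
pool = []

while True:
    cpu = psutil.cpu_percent(interval=5)

    # drop containers that have already exited or been removed
    alive = []
    for c in pool:
        try:
            c.reload()
            if c.status == "running":
                alive.append(c)
        except docker.errors.NotFound:
            pass
    pool = alive

    if cpu < SPAWN_BELOW and len(pool) < MAX_CONTAINERS:
        # host looks idle: add one opportunistic worker container
        pool.append(client.containers.run(IMAGE, detach=True))
    elif cpu > KILL_ABOVE and pool:
        # the host user needs the machine back: release resources
        pool.pop().kill()
    time.sleep(5)
```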
Comparative Implementation of High Performance Computing for Power System Dynamic Simulations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Shuangshuang; Huang, Zhenyu; Diao, Ruisheng
Dynamic simulation for transient stability assessment is one of the most important, but intensive, computations for power system planning and operation. Present commercial software is mainly designed for sequential computation to run a single simulation, which is very time consuming on a single processor. The application of High Performance Computing (HPC) to dynamic simulations is very promising for accelerating the computing process by parallelizing its kernel algorithms while maintaining the same level of computation accuracy. This paper describes the comparative implementation of four parallel dynamic simulation schemes in two state-of-the-art HPC environments: Message Passing Interface (MPI) and Open Multi-Processing (OpenMP). These implementations serve to match the application with dedicated multi-processor computing hardware and maximize the utilization and benefits of HPC during the development process.
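As a hedged illustration of the MPI flavour of such a parallel scheme (not the paper's implementation), the sketch below distributes independent simulation cases over ranks with mpi4py; `simulate_contingency` is a placeholder for a real transient-stability solver.

```python
"""Sketch: distributing independent dynamic-simulation cases over MPI ranks
with mpi4py (run with e.g. `mpiexec -n 8 python sim_mpi.py`). The
`simulate_contingency` function is a stand-in for the real solver."""
from mpi4py import MPI


def simulate_contingency(case_id: int) -> float:
    """Placeholder for a transient-stability simulation of one contingency."""
    # ... integrate the system's differential-algebraic equations here ...
    return float(case_id)  # pretend result, e.g. a stability margin


comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

cases = list(range(100))                 # 100 hypothetical contingencies
my_cases = cases[rank::size]             # simple static round-robin split
my_results = {c: simulate_contingency(c) for c in my_cases}

all_results = comm.gather(my_results, root=0)
if rank == 0:
    merged = {k: v for part in all_results for k, v in part.items()}
    print(f"collected {len(merged)} contingency results on rank 0")
```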
BrightStat.com: free statistics online.
Stricker, Daniel
2008-10-01
Powerful software for statistical analysis is expensive. Here I present BrightStat, statistical software running on the Internet which is free of charge. BrightStat's goals and its main capabilities and functionalities are outlined. Three different sample runs, a Friedman test, a chi-square test, and a step-wise multiple regression, are presented. The results obtained by BrightStat are compared with results computed by SPSS, one of the global leaders in statistical software, and VassarStats, a collection of scripts for data analysis running on the Internet. Elementary statistics is an inherent part of academic education and BrightStat is an alternative to commercial products.
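The same three example analyses can also be reproduced with free, scriptable tools. The Python sketch below (using made-up data, not BrightStat's or SPSS's code) runs a Friedman test, a chi-square test, and an ordinary least-squares fit, the building block of a stepwise multiple regression.

```python
"""Sketch: a Friedman test, a chi-square test, and an OLS fit on made-up data."""
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Friedman test: three repeated measurements on 12 subjects
a, b, c = rng.normal(size=(3, 12))
print("Friedman:", stats.friedmanchisquare(a, b + 0.5, c))

# Chi-square goodness-of-fit test on observed counts
print("Chi-square:", stats.chisquare([18, 22, 30, 30], f_exp=[25, 25, 25, 25]))

# Ordinary least squares, the building block of a stepwise multiple regression
X = np.column_stack([np.ones(50), rng.normal(size=(50, 3))])
y = X @ np.array([1.0, 2.0, 0.0, -1.0]) + rng.normal(scale=0.5, size=50)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("OLS coefficients:", coef)
```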
Simulation procedure for modeling transient water table and artesian stress and response
Reed, J.E.; Bedinger, M.S.; Terry, J.E.
1976-01-01
The series of computer programs described in this report were designed specifically to model the ground-water regime in sufficient detail to determine the effects of the imposition of various types of stress upon the system, and to display the results in a convenient manner during calibration and when presenting projected data. SUPERMOCK simulates the ground-water system and DATE and HYDROG aid in the display of computed data. During calibration, DATE is especially useful because it has the optional feature of comparing computed data with observed data. Although the programs can be run independently, experience dictates that for best results the three should be run as steps in the same job. English units of inches, feet, and days are used in each of the programs. The units for any parameters not given in the text are clearly specified in the instructions for input to the individual programs. (Woodard-USGS)
Queueing Network Models for Parallel Processing of Task Systems: an Operational Approach
NASA Technical Reports Server (NTRS)
Mak, Victor W. K.
1986-01-01
Computer performance modeling of possibly complex computations running on highly concurrent systems is considered. Earlier works in this area either dealt with a very simple program structure or resulted in methods with exponential complexity. An efficient procedure is developed to compute the performance measures for series-parallel-reducible task systems using queueing network models. The procedure is based on the concept of hierarchical decomposition and a new operational approach. Numerical results for three test cases are presented and compared to those of simulations.
Implementation of the force decomposition machine for molecular dynamics simulations.
Borštnik, Urban; Miller, Benjamin T; Brooks, Bernard R; Janežič, Dušanka
2012-09-01
We present the design and implementation of the force decomposition machine (FDM), a cluster of personal computers (PCs) that is tailored to running molecular dynamics (MD) simulations using the distributed diagonal force decomposition (DDFD) parallelization method. The cluster interconnect architecture is optimized for the communication pattern of the DDFD method. Our implementation of the FDM relies on standard commodity components even for networking. Although the cluster is meant for DDFD MD simulations, it remains general enough for other parallel computations. An analysis of several MD simulation runs on both the FDM and a standard PC cluster demonstrates that the FDM's interconnect architecture provides a greater performance compared to a more general cluster interconnect. Copyright © 2012 Elsevier Inc. All rights reserved.
Experiences Using OpenMP Based on Compiler Directed Software DSM on a PC Cluster
NASA Technical Reports Server (NTRS)
Hess, Matthias; Jost, Gabriele; Mueller, Matthias; Ruehle, Roland; Biegel, Bryan (Technical Monitor)
2002-01-01
In this work we report on our experiences running OpenMP (shared-memory parallel) programs on a commodity cluster of PCs (personal computers) running a software distributed shared memory (DSM) system. We describe our test environment and report on the performance of a subset of the NAS (NASA Advanced Supercomputing) Parallel Benchmarks that have been automatically parallelized for OpenMP. We compare the performance of the OpenMP implementations with that of their message passing counterparts and discuss performance differences.
System and method for controlling power consumption in a computer system based on user satisfaction
Yang, Lei; Dick, Robert P; Chen, Xi; Memik, Gokhan; Dinda, Peter A; Shy, Alex; Ozisikyilmaz, Berkin; Mallik, Arindam; Choudhary, Alok
2014-04-22
Systems and methods for controlling power consumption in a computer system. For each of a plurality of interactive applications, the method changes a frequency at which a processor of the computer system runs, receives an indication of user satisfaction, determines a relationship between the changed frequency and the user satisfaction of the interactive application, and stores the determined relationship information. The determined relationship can distinguish between different users and different interactive applications. A frequency may be selected from the discrete frequencies at which the processor of the computer system runs based on the determined relationship information for a particular user and a particular interactive application running on the processor of the computer system. The processor may be adapted to run at the selected frequency.
Simulation of ozone production in a complex circulation region using nested grids
NASA Astrophysics Data System (ADS)
Taghavi, M.; Cautenet, S.; Foret, G.
2003-07-01
During the ESCOMPTE precampaign (15 June to 10 July 2000), three days of intensive pollution (IOP0) were observed and simulated. The comprehensive RAMS model, version 4.3, coupled online with a chemical module including 29 species, has been used to follow the chemistry of the polluted zone over southern France. This online method can be used because the code is parallelized and the SGI 3800 computer is very powerful. Two runs have been performed: run 1 with one grid and run 2 with two nested grids. The redistribution of simulated chemical species (ozone, carbon monoxide, sulphur dioxide and nitrogen oxides) was compared with aircraft measurements and surface stations. The 2-grid run gave substantially better results than the one-grid run because the former takes the outer pollutants into account. This online method helps to explain the dynamics and to retrieve the chemical species redistribution with good agreement.
Simulation of ozone production in a complex circulation region using nested grids
NASA Astrophysics Data System (ADS)
Taghavi, M.; Cautenet, S.; Foret, G.
2004-06-01
During the ESCOMPTE precampaign (summer 2000, over Southern France), a 3-day period of intensive observation (IOP0), associated with ozone peaks, has been simulated. The comprehensive RAMS model, version 4.3, coupled on-line with a chemical module including 29 species, is used to follow the chemistry of the polluted zone. This efficient but time consuming method can be used because the code is installed on a parallel computer, the SGI 3800. Two runs are performed: run 1 with a single grid and run 2 with two nested grids. The simulated fields of ozone, carbon monoxide, nitrogen oxides and sulfur dioxide are compared with aircraft and surface station measurements. The 2-grid run looks substantially better than the run with one grid because the former takes the outer pollutants into account. This on-line method helps to satisfactorily retrieve the chemical species redistribution and to explain the impact of dynamics on this redistribution.
The diverse use of clouds by CMS
Andronis, Anastasios; Bauer, Daniela; Chaze, Olivier; ...
2015-12-23
The resources CMS is using are increasingly being offered as clouds. In Run 2 of the LHC the majority of CMS CERN resources, both in Meyrin and at the Wigner Computing Centre, will be presented as cloud resources on which CMS will have to build its own infrastructure. This infrastructure will need to run all of the CMS workflows including: Tier 0, production and user analysis. In addition, the CMS High Level Trigger will provide a compute resource comparable in scale to the total offered by the CMS Tier 1 sites, when it is not running as part of the trigger system. During these periods a cloud infrastructure will be overlaid on this resource, making it accessible for general CMS use. Finally, CMS is starting to utilise cloud resources offered by individual institutes and is gaining experience to facilitate the use of opportunistically available cloud resources. We present a snapshot of this infrastructure and its operation at the time of the CHEP2015 conference.
Melcher, Daniel A; Paquette, Max R; Schilling, Brian K; Bloomer, Richard J
2017-12-01
Research has focused on the effects of acute strike pattern modifications on lower extremity joint stiffness and running economy (RE). The effects of strike pattern modifications on running biomechanics have mostly been studied while runners complete short running bouts. This study examined the effects of an imposed forefoot strike (FFS) on RE and on ankle and knee joint stiffness before and after a long run in habitual rearfoot strike (RFS) runners. Joint kinetics and RE were collected before and after a long run. Sagittal joint kinetics were computed from kinematic and ground reaction force data collected during over-ground running trials in 13 male runners. RE was measured during treadmill running. Knee flexion range of motion, knee extensor moment and ankle joint stiffness were lower, while plantarflexor moment and knee joint stiffness were greater, during imposed FFS compared with RFS. The long run did not influence the difference in ankle and knee joint stiffness between strike patterns. Runners were more economical during RFS than during imposed FFS, and RE was not influenced by the long run. These findings suggest that using a FFS pattern towards the end of a long run may not be mechanically or metabolically beneficial for well-trained male RFS runners.
BelleII@home: Integrate volunteer computing resources into DIRAC in a secure way
NASA Astrophysics Data System (ADS)
Wu, Wenjing; Hara, Takanori; Miyake, Hideki; Ueda, Ikuo; Kan, Wenxiao; Urquijo, Phillip
2017-10-01
The exploitation of volunteer computing resources has become a popular practice in the HEP computing community because of the huge amount of potential computing power it provides. In recent HEP experiments, grid middleware has been used to organize the services and the resources; however, it relies heavily on X.509 authentication, which is contradictory to the untrusted nature of volunteer computing resources. One big challenge in utilizing volunteer computing resources is therefore how to integrate them into the grid middleware in a secure way. The DIRAC interware, which is commonly used as the major component of the grid computing infrastructure for several HEP experiments, poses an even bigger challenge to this paradox, as its pilot is more closely coupled with operations requiring X.509 authentication compared to the pilot implementations of its peer grid interware. The Belle II experiment is a B-factory experiment at KEK, and it uses DIRAC for its distributed computing. In the BelleII@home project, in order to integrate volunteer computing resources into the Belle II distributed computing platform in a secure way, we adopted a new approach which detaches the payload from the Belle II DIRAC pilot, a customized pilot that pulls and processes jobs from the Belle II distributed computing platform, so that the payload can run on volunteer computers without requiring any X.509 authentication. In this approach we developed a gateway service, running on a trusted server, which handles all the operations requiring X.509 authentication. So far, we have developed and deployed the prototype of BelleII@home and tested its full workflow, which proves the feasibility of this approach. This approach can also be applied to HPC systems whose worker nodes do not have the outbound connectivity needed to interact with the DIRAC system directly.
Nadkarni, P M; Miller, P L
1991-01-01
A parallel program for inter-database sequence comparison was developed on the Intel Hypercube using two models of parallel programming. One version was built using machine-specific Hypercube parallel programming commands. The other version was built using Linda, a machine-independent parallel programming language. The two versions of the program provide a case study comparing these two approaches to parallelization in an important biological application area. Benchmark tests with both programs gave comparable results with a small number of processors. As the number of processors was increased, the Linda version was somewhat less efficient. The Linda version was also run without change on Network Linda, a virtual parallel machine running on a network of desktop workstations.
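The structure of such an inter-database comparison is easy to sketch with a present-day process pool. The Python example below is only an illustration (a toy identity score stands in for a real alignment score, and the two "databases" are hard-coded), not the Hypercube or Linda code described in the abstract.

```python
"""Sketch: inter-database sequence comparison parallelized with a process
pool. The identity score is a toy stand-in for a real alignment score."""
from itertools import product
from multiprocessing import Pool

DB_A = {"seqA1": "ACGTACGTAC", "seqA2": "TTGACCGTAA"}
DB_B = {"seqB1": "ACGTTCGTAC", "seqB2": "GGGACCGTAA"}


def identity(pair):
    """Fraction of identical positions between two sequences (toy score)."""
    (name_a, sa), (name_b, sb) = pair
    matches = sum(x == y for x, y in zip(sa, sb))
    return name_a, name_b, matches / min(len(sa), len(sb))


if __name__ == "__main__":
    pairs = list(product(DB_A.items(), DB_B.items()))
    with Pool() as pool:
        for name_a, name_b, score in pool.map(identity, pairs):
            print(f"{name_a} vs {name_b}: {score:.2f}")
```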
Simulating Microbial Community Patterning Using Biocellion
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kang, Seung-Hwa; Kahan, Simon H.; Momeni, Babak
2014-04-17
Mathematical modeling and computer simulation are important tools for understanding complex interactions between cells and their biotic and abiotic environment: similarities and differences between modeled and observed behavior provide the basis for hypothesis formation. Momeni et al. [5] investigated pattern formation in communities of yeast strains engaging in different types of ecological interactions, comparing the predictions of mathematical modeling and simulation to actual patterns observed in wet-lab experiments. However, simulations of millions of cells in a three-dimensional community are extremely time-consuming. One simulation run in MATLAB may take a week or longer, inhibiting exploration of the vast space of parameter combinations and assumptions. Improving the speed, scale, and accuracy of such simulations facilitates hypothesis formation and expedites discovery. Biocellion is a high performance software framework for accelerating discrete agent-based simulation of biological systems with millions to trillions of cells. Simulations of comparable scale and accuracy to those taking a week of computer time using MATLAB require just hours using Biocellion on a multicore workstation. Biocellion further accelerates large scale, high resolution simulations using cluster computers by partitioning the work to run on multiple compute nodes. Biocellion targets computational biologists who have mathematical modeling backgrounds and basic C++ programming skills. This chapter describes the necessary steps to adapt the original Momeni et al. model to the Biocellion framework as a case study.
Far-field radiation patterns of aperture antennas by the Winograd Fourier transform algorithm
NASA Technical Reports Server (NTRS)
Heisler, R.
1978-01-01
A more time-efficient algorithm for computing the discrete Fourier transform, the Winograd Fourier transform (WFT), is described. The WFT algorithm is compared with other transform algorithms. Results indicate that the WFT algorithm appears to be a very successful application in antenna analysis. Significant savings in CPU time will improve computer turnaround time and circumvent the need to resort to weekend runs.
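The WFT itself is not available in standard numerical libraries, but the scale of saving that a faster transform buys can be illustrated by timing a naive O(N^2) DFT against NumPy's FFT. This is only an analogy for the CPU-time argument above, not the WFT algorithm.

```python
"""Sketch: why a faster transform algorithm matters. Compares a naive O(N^2)
DFT with numpy.fft.fft to illustrate the scale of CPU-time savings a better
algorithm can buy (the Winograd FFT itself is not implemented here)."""
import time

import numpy as np


def naive_dft(x):
    n = len(x)
    k = np.arange(n)
    w = np.exp(-2j * np.pi * np.outer(k, k) / n)   # full DFT matrix
    return w @ x


x = np.random.default_rng(0).normal(size=2048) + 0j

t0 = time.perf_counter()
slow = naive_dft(x)
t1 = time.perf_counter()
fast = np.fft.fft(x)
t2 = time.perf_counter()

print(f"naive DFT : {t1 - t0:.4f} s")
print(f"numpy FFT : {t2 - t1:.6f} s")
print("max abs difference:", np.abs(slow - fast).max())
```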
ERIC Educational Resources Information Center
Reggini, Horacio C.
The first article, "LOGO and von Neumann Ideas," deals with the creation of new procedures based on procedures defined and stored in memory as LOGO lists of lists. This representation, which enables LOGO procedures to construct, modify, and run other LOGO procedures, is compared with basic computer concepts first formulated by John von…
"Grinding" cavities in polyurethane foam
NASA Technical Reports Server (NTRS)
Brower, J. R.; Davey, R. E.; Dixon, W. F.; Robb, P. H.; Zebus, P. P.
1980-01-01
Grinding tool installed on conventional milling machine cuts precise cavities in foam blocks. Method is well suited for prototype or midsize production runs and can be adapted to computer control for mass production. Method saves time and materials compared to bonding or hot wire techniques.
Schache, Anthony G.; Brown, Nicholas A. T.; Pandy, Marcus G.
2016-01-01
Tendon elastic strain energy is the dominant contributor to muscle–tendon work during steady-state running. Does this behaviour also occur for sprint accelerations? We used experimental data and computational modelling to quantify muscle fascicle work and tendon elastic strain energy for the human ankle plantar flexors (specifically soleus and medial gastrocnemius) for multiple foot contacts of a maximal sprint as well as for running at a steady-state speed. Positive work done by the soleus and medial gastrocnemius muscle fascicles decreased incrementally throughout the maximal sprint and both muscles performed more work for the first foot contact of the maximal sprint (FC1) compared with steady-state running at 5 m s−1 (SS5). However, the differences in tendon strain energy for both muscles were negligible throughout the maximal sprint and when comparing FC1 to SS5. Consequently, the contribution of muscle fascicle work to stored tendon elastic strain energy was greater for FC1 compared with subsequent foot contacts of the maximal sprint and compared with SS5. We conclude that tendon elastic strain energy in the ankle plantar flexors is just as vital at the start of a maximal sprint as it is at the end, and as it is for running at a constant speed. PMID:27581481
AMITIS: A 3D GPU-Based Hybrid-PIC Model for Space and Plasma Physics
NASA Astrophysics Data System (ADS)
Fatemi, Shahab; Poppe, Andrew R.; Delory, Gregory T.; Farrell, William M.
2017-05-01
We have developed, for the first time, an advanced modeling infrastructure in space simulations (AMITIS) with an embedded three-dimensional self-consistent grid-based hybrid model of plasma (kinetic ions and fluid electrons) that runs entirely on graphics processing units (GPUs). The model uses NVIDIA GPUs and their associated parallel computing platform, CUDA, developed for general purpose processing on GPUs. The model uses a single CPU-GPU pair, where the CPU transfers data between the system and GPU memory, executes CUDA kernels, and writes simulation outputs on the disk. All computations, including moving particles, calculating macroscopic properties of particles on a grid, and solving hybrid model equations are processed on a single GPU. We explain various computing kernels within AMITIS and compare their performance with an already existing well-tested hybrid model of plasma that runs in parallel using multi-CPU platforms. We show that AMITIS runs ∼10 times faster than the parallel CPU-based hybrid model. We also introduce an implicit solver for computation of Faraday’s Equation, resulting in an explicit-implicit scheme for the hybrid model equation. We show that the proposed scheme is stable and accurate. We examine the AMITIS energy conservation and show that the energy is conserved with an error < 0.2% after 500,000 timesteps, even when a very low number of particles per cell is used.
Belle II grid computing: An overview of the distributed data management system.
NASA Astrophysics Data System (ADS)
Bansal, Vikas; Schram, Malachi; Belle Collaboration, II
2017-01-01
The Belle II experiment at the SuperKEKB collider in Tsukuba, Japan, will start physics data taking in 2018 and will accumulate 50 ab-1 of e+e- collision data, about 50 times larger than the data set of the Belle experiment. The computing requirements of Belle II are comparable to those of a Run I LHC experiment. Computing at this scale requires efficient use of the compute grids in North America, Asia and Europe and will take advantage of upgrades to the high-speed global network. We present the architecture of data flow and data handling as a part of the Belle II computing infrastructure.
A robust algorithm for automated target recognition using precomputed radar cross sections
NASA Astrophysics Data System (ADS)
Ehrman, Lisa M.; Lanterman, Aaron D.
2004-09-01
Passive radar is an emerging technology that offers a number of unique benefits, including covert operation. Many such systems are already capable of detecting and tracking aircraft. The goal of this work is to develop a robust algorithm for adding automated target recognition (ATR) capabilities to existing passive radar systems. In previous papers, we proposed conducting ATR by comparing the precomputed RCS of known targets to that of detected targets. To make the precomputed RCS as accurate as possible, a coordinated flight model is used to estimate aircraft orientation. Once the aircraft's position and orientation are known, it is possible to determine the incident and observed angles on the aircraft, relative to the transmitter and receiver. This makes it possible to extract the appropriate radar cross section (RCS) from our simulated database. This RCS is then scaled to account for propagation losses and the receiver's antenna gain. A Rician likelihood model compares these expected signals from different targets to the received target profile. We have previously employed Monte Carlo runs to gauge the probability of error in the ATR algorithm; however, generation of a statistically significant set of Monte Carlo runs is computationally intensive. As an alternative to Monte Carlo runs, we derive the relative entropy (also known as the Kullback-Leibler distance) between two Rician distributions. Since the probability of Type II error in our hypothesis testing problem can be expressed as a function of the relative entropy via Stein's Lemma, this provides us with a computationally efficient method for determining an upper bound on our algorithm's performance. It also provides great insight into the types of classification errors we can expect from our algorithm. This paper compares the numerically approximated probability of Type II error with the results obtained from a set of Monte Carlo runs.
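As an illustration of the quantity being derived (not the paper's code or parameters), the sketch below evaluates the Kullback-Leibler divergence between two Rician amplitude distributions numerically with SciPy and applies a Stein's-lemma style bound exp(-nD) on the Type II error probability. The shape parameters and sample count are made up.

```python
"""Sketch: relative entropy (Kullback-Leibler divergence) between two Rician
distributions, evaluated numerically, and a Stein's-lemma style bound
P(type II error) <~ exp(-n * D). Parameters are illustrative only."""
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Rician amplitude models for two candidate targets (made-up parameters)
p = stats.rice(b=2.0, scale=1.0)   # hypothesized target
q = stats.rice(b=3.5, scale=1.0)   # alternative target


def kl_divergence(p, q):
    """D(p || q) = integral of p(x) * log(p(x)/q(x)) dx, computed numerically."""
    def integrand(x):
        px = p.pdf(x)
        return px * (np.log(px) - q.logpdf(x)) if px > 0 else 0.0
    value, _ = quad(integrand, 0.0, np.inf, limit=200)
    return value


d = kl_divergence(p, q)
n = 20   # number of independent RCS samples (assumed)
print(f"D(p||q) = {d:.3f} nats; Stein-style bound on Type II error ~ {np.exp(-n * d):.2e}")
```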
WinSCP for Windows File Transfers | High-Performance Computing | NREL
WinSCP can be used to securely transfer files between your local computer running Microsoft Windows and a remote computer running Linux.
Cloud computing for comparative genomics
2010-01-01
Background Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. Results We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. Conclusions The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems. PMID:20482786
Cloud computing for comparative genomics.
Wall, Dennis P; Kudtarkar, Parul; Fusaro, Vincent A; Pivovarov, Rimma; Patil, Prasad; Tonellato, Peter J
2010-05-18
Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems.
NASA Astrophysics Data System (ADS)
Arendt, V.; Shalchi, A.
2018-06-01
We explore numerically the transport of energetic particles in a turbulent magnetic field configuration. A test-particle code is employed to compute running diffusion coefficients as well as particle distribution functions in the different directions of space. Our numerical findings are compared with models commonly used in diffusion theory such as Gaussian distribution functions and solutions of the cosmic ray Fokker-Planck equation. Furthermore, we compare the running diffusion coefficients across the mean magnetic field with solutions obtained from the time-dependent version of the unified non-linear transport theory. In most cases we find that particle distribution functions are indeed of Gaussian form as long as a two-component turbulence model is employed. For turbulence setups with reduced dimensionality, however, the Gaussian distribution can no longer be obtained. It is also shown that the unified non-linear transport theory agrees with simulated perpendicular diffusion coefficients as long as the pure two-dimensional model is excluded.
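The running diffusion coefficient referred to above is typically estimated from an ensemble of trajectories as d_xx(t) = <(Δx)^2> / (2t). The NumPy sketch below shows that estimator on synthetic random-walk trajectories, which merely stand in for the test-particle orbits integrated in the turbulent field.

```python
"""Sketch: a running diffusion coefficient d_xx(t) = <(dx)^2> / (2 t) computed
from an ensemble of test-particle trajectories. Synthetic random walks stand
in for trajectories integrated in a turbulent magnetic field."""
import numpy as np

rng = np.random.default_rng(1)
n_particles, n_steps, dt = 5000, 2000, 0.01

# synthetic trajectories: cumulative sum of random increments (stand-in data)
steps = rng.normal(scale=np.sqrt(dt), size=(n_particles, n_steps))
x = np.cumsum(steps, axis=1)                      # displacement from start

t = dt * np.arange(1, n_steps + 1)
d_running = np.mean(x**2, axis=0) / (2.0 * t)     # running diffusion coefficient

print("late-time d_xx ~", d_running[-100:].mean())  # approaches 0.5 for this toy walk
```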
RAPPORT: running scientific high-performance computing applications on the cloud.
Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt
2013-01-28
Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.
GEANT4 distributed computing for compact clusters
NASA Astrophysics Data System (ADS)
Harrawood, Brian P.; Agasthya, Greeshma A.; Lakshmanan, Manu N.; Raterman, Gretchen; Kapadia, Anuj J.
2014-11-01
A new technique for distribution of GEANT4 processes is introduced to simplify running a simulation in a parallel environment such as a tightly coupled computer cluster. Using a new C++ class derived from the GEANT4 toolkit, multiple runs forming a single simulation are managed across a local network of computers with a simple inter-node communication protocol. The class is integrated with the GEANT4 toolkit and is designed to scale from a single symmetric multiprocessing (SMP) machine to compact clusters ranging in size from tens to thousands of nodes. User-designed 'work tickets' are distributed to clients using a client-server work flow model to specify the parameters for each individual run of the simulation. The new g4DistributedRunManager class was developed and well tested in the course of our Neutron Stimulated Emission Computed Tomography (NSECT) experiments. It will be useful for anyone running GEANT4 for large discrete data sets, such as covering a range of angles in computed tomography, calculating dose delivery with multiple fractions, or simply speeding the throughput of a single model.
Computational Methods for Feedback Controllers for Aerodynamics Flow Applications
2007-08-15
Iteration #, and y-translation by:
>> Fy = [unf(:,8); runA(:,8); runB(:,8); runC(:,8); runD(:,8); runE(:,8)];
>> Oy = [unf(:,23); runA(:,23); runB(:,23); runC(:,23); runD(:,23); runE(:,23)];
>> Iter = [unf(:,1); runA(:,1); runB(:,1); runC(:,1); runD(:,1); runE(:,1)];
>> plot(Fy)
Proposal for grid computing for nuclear applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Idris, Faridah Mohamad; Ismail, Saaidi; Haris, Mohd Fauzi B.
2014-02-12
The use of computer clusters for computational sciences, including computational physics, is vital as it provides the computing power to crunch big numbers at a faster rate. In compute-intensive applications that require high resolution, such as Monte Carlo simulation, the use of computer clusters in a grid configuration, which supplies computational power to any node within the grid that needs it, has become a necessity. In this paper, we describe how clusters running a specific application can use resources within the grid to run the application and speed up the computing process.
DeepX: Deep Learning Accelerator for Restricted Boltzmann Machine Artificial Neural Networks.
Kim, Lok-Won
2018-05-01
Although there have been many decades of research and commercial development of high performance general purpose processors, there are still many applications that require fully customized hardware architectures for further computational acceleration. Recently, deep learning has been used successfully in a wide variety of applications, but its heavy computational demand has considerably limited practical deployment. This paper proposes a fully pipelined acceleration architecture to alleviate the high computational demand of one class of artificial neural network (ANN), the restricted Boltzmann machine (RBM). The implemented RBM ANN accelerator (integrating network size, using 128 input cases per batch, and running at a 303-MHz clock frequency) integrated in a state-of-the-art field-programmable gate array (FPGA) (Xilinx Virtex 7 XC7V-2000T) provides a computational performance of 301 billion connection-updates-per-second, about 193 times higher than a software solution running on general purpose processors. Most importantly, the architecture enables over 4 times (12 times in batch learning) higher performance compared with a previous work when both are implemented in an FPGA device (XC2VP70).
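For orientation, the connection update that such an accelerator pipelines is the contrastive-divergence (CD-1) step of RBM training. The NumPy sketch below shows one such update on toy layer sizes and random data; it is illustrative only and unrelated to the paper's hardware implementation.

```python
"""Sketch: one CD-1 (contrastive divergence) weight update for a restricted
Boltzmann machine in NumPy -- the connection-update kernel that a hardware
accelerator pipelines. Layer sizes and learning rate are toy values."""
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, batch, lr = 64, 32, 128, 0.05

W = rng.normal(scale=0.01, size=(n_visible, n_hidden))
v0 = (rng.random((batch, n_visible)) < 0.5).astype(float)   # fake binary data


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# positive phase: sample hidden units given the data
ph0 = sigmoid(v0 @ W)
h0 = (rng.random(ph0.shape) < ph0).astype(float)

# negative phase: one Gibbs step back to visible, then to hidden probabilities
pv1 = sigmoid(h0 @ W.T)
ph1 = sigmoid(pv1 @ W)

# CD-1 update of the connections (biases omitted for brevity)
dW = (v0.T @ ph0 - pv1.T @ ph1) / batch
W += lr * dW
print("mean |dW| per connection:", np.abs(dW).mean())
```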
NASA Astrophysics Data System (ADS)
Gupta, V.; Gupta, N.; Gupta, S.; Field, E.; Maechling, P.
2003-12-01
Modern laptop computers, and personal computers, can provide capabilities that are, in many ways, comparable to workstations or departmental servers. However, this doesn't mean we should run all computations on our local computers. We have identified several situations in which it is preferable to implement our seismological application programs in a distributed, server-based, computing model. In this model, application programs on the user's laptop, or local computer, invoke programs that run on an organizational server, and the results are returned to the invoking system. Situations in which a server-based architecture may be preferred include: (a) a program is written in a language, or written for an operating environment, that is unsupported on the local computer, (b) software libraries or utilities required to execute a program are not available on the user's computer, (c) a computational program is physically too large, or computationally too expensive, to run on a user's computer, (d) a user community wants to enforce a consistent method of performing a computation by standardizing on a single implementation of a program, and (e) the computational program may require current information that is not available to all client computers. Until recently, distributed, server-based, computational capabilities were implemented using client/server architectures. In these architectures, client programs were often written in the same language, and they executed in the same computing environment, as the servers. Recently, a new distributed computational model, called Web Services, has been developed. Web Services are based on Internet standards such as XML, SOAP, WSDL, and UDDI. Web Services offer the promise of platform, and language, independent distributed computing. To investigate this new computational model, and to provide useful services to the SCEC Community, we have implemented several computational and utility programs using a Web Service architecture. We have hosted these Web Services as a part of the SCEC Community Modeling Environment (SCEC/CME) ITR Project (http://www.scec.org/cme). We have implemented Web Services for several of the reasons cited previously. For example, we implemented a FORTRAN-based Earthquake Rupture Forecast (ERF) as a Web Service for use by client computers that don't support a FORTRAN runtime environment. We implemented a Generic Mapping Tool (GMT) Web Service for use by systems that don't have local access to GMT. We implemented a Hazard Map Calculator Web Service to execute hazard calculations that are too computationally intensive to run on a local system. We implemented a Coordinate Conversion Web Service to enforce a standard and consistent method for converting between UTM and Lat/Lon. Our experience developing these services indicates both strengths and weaknesses in current Web Service technology. Client programs that utilize Web Services typically need network access, a significant disadvantage at times. Programs with simple input and output parameters were the easiest to implement as Web Services, while programs with complex parameter types required a significant amount of additional development. We also noted that Web Services are very data-oriented, and adapting object-oriented software into the Web Service model proved problematic. Also, the Web Service approach of converting data types into XML format for network transmission has significant inefficiencies for some data sets.
SSL - THE SIMPLE SOCKETS LIBRARY
NASA Technical Reports Server (NTRS)
Campbell, C. E.
1994-01-01
The Simple Sockets Library (SSL) allows C programmers to develop systems of cooperating programs using Berkeley streaming Sockets running under the TCP/IP protocol over Ethernet. The SSL provides a simple way to move information between programs running on the same or different machines and does so with little overhead. The SSL can create three types of Sockets: namely a server, a client, and an accept Socket. The SSL's Sockets are designed to be used in a fashion reminiscent of the use of FILE pointers so that a C programmer who is familiar with reading and writing files will immediately feel comfortable with reading and writing with Sockets. The SSL consists of three parts: the library, PortMaster, and utilities. The user of the SSL accesses it by linking programs to the SSL library. The PortMaster initializes connections between clients and servers. The PortMaster also supports a "firewall" facility to keep out socket requests from unapproved machines. The "firewall" is a file which contains Internet addresses for all approved machines. There are three utilities provided with the SSL. SKTDBG can be used to debug programs that make use of the SSL. SPMTABLE lists the servers and port numbers on requested machine(s). SRMSRVR tells the PortMaster to forcibly remove a server name from its list. The package also includes two example programs: multiskt.c, which makes multiple accepts on one server, and sktpoll.c, which repeatedly attempts to connect a client to some server at one second intervals. SSL is a machine independent library written in the C-language for computers connected via Ethernet using the TCP/IP protocol. It has been successfully compiled and implemented on a variety of platforms, including Sun series computers running SunOS, DEC VAX series computers running VMS, SGI computers running IRIX, DECstations running ULTRIX, DEC alpha AXPs running OSF/1, IBM RS/6000 computers running AIX, IBM PC and compatibles running BSD/386 UNIX and HP Apollo 3000/4000/9000/400T computers running HP-UX. SSL requires 45K of RAM to run under SunOS and 80K of RAM to run under VMS. For use on IBM PC series computers and compatibles running DOS, SSL requires Microsoft C 6.0 and the Wollongong TCP/IP package. Source code for sample programs and debugging tools are provided. The documentation is available on the distribution medium in TeX and PostScript formats. The standard distribution medium for SSL is a .25 inch streaming magnetic tape cartridge (QIC-24) in UNIX tar format. It is also available on a 3.5 inch diskette in UNIX tar format and a 5.25 inch 360K MS-DOS format diskette. The SSL was developed in 1992 and was updated in 1993.
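The flavour of the library is easy to convey with a present-day analogue: the Python sketch below sets up a minimal TCP server and client with the standard socket module. It illustrates the stream-socket pattern the SSL wraps for C programmers, but it does not use the SSL API itself, and the port number is arbitrary.

```python
"""Sketch: a minimal TCP server/client pair using Python's standard socket
module, illustrating the stream-socket, FILE-like read/write pattern that the
Simple Sockets Library wraps for C programs (not the SSL API itself)."""
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 5050


def server():
    """Accept one connection, read a message, and echo it back."""
    with socket.create_server((HOST, PORT)) as srv:
        conn, _addr = srv.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(b"echo: " + data)


threading.Thread(target=server, daemon=True).start()
time.sleep(0.2)  # give the server a moment to start listening

# client side: connect, write, and read the reply
with socket.create_connection((HOST, PORT)) as client:
    client.sendall(b"hello")
    print(client.recv(1024).decode())
```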
Additional extensions to the NASCAP computer code, volume 3
NASA Technical Reports Server (NTRS)
Mandell, M. J.; Cooke, D. L.
1981-01-01
The ION computer code is designed to calculate charge exchange ion densities, electric potentials, plasma temperatures, and current densities external to a neutralized ion engine in R-Z geometry. The present version assumes the beam ion current and density to be known and specified, and the neutralizing electrons to originate from a hot-wire ring surrounding the beam orifice. The plasma is treated as being resistive, with an electron relaxation time comparable to the plasma frequency. Together with the thermal and electrical boundary conditions described below and other straightforward engine parameters, these assumptions suffice to determine the required quantities. The ION code, written in ASCII FORTRAN for UNIVAC 1100 series computers, is designed to be run interactively, although it can also be run in batch mode. The input is free-format, and the output is mainly graphical, using the machine-independent graphics developed for the NASCAP code. The executive routine calls the code's major subroutines in user-specified order, and the code allows great latitude for restart and parameter change.
PC as Physics Computer for LHC?
NASA Astrophysics Data System (ADS)
Jarp, Sverre; Simmins, Antony; Tang, Hong; Yaari, R.
In the last five years, we have seen RISC workstations take over the computing scene that was once controlled by mainframes and supercomputers. In this paper we will argue that the same phenomenon might happen again. A project, active since March of this year in the Physics Data Processing group of CERN's CN division, is described in which ordinary desktop PCs running Windows (NT and 3.11) have been used to create an environment for running large LHC batch jobs (initially the DICE simulation job of Atlas). The problems encountered in porting both the CERN library and the specific Atlas codes are described, together with some encouraging benchmark results obtained when comparing with existing RISC workstations in use by the Atlas collaboration. The issues of establishing the batch environment (batch monitor, staging software, etc.) are also covered. Finally, a quick extrapolation of the commodity computing power available in the future is touched upon to indicate what kind of cost envelope could be sufficient for the simulation farms required by the LHC experiments.
Adaptive Grid Refinement for Atmospheric Boundary Layer Simulations
NASA Astrophysics Data System (ADS)
van Hooft, Antoon; van Heerwaarden, Chiel; Popinet, Stephane; van der linden, Steven; de Roode, Stephan; van de Wiel, Bas
2017-04-01
We validate and benchmark an adaptive mesh refinement (AMR) algorithm for numerical simulations of the atmospheric boundary layer (ABL). The AMR technique aims to distribute the computational resources efficiently over a domain by refining and coarsening the numerical grid locally and in time. This can be beneficial for studying cases in which length scales vary significantly in time and space. We present the results for a case describing the growth and decay of a convective boundary layer. The AMR results are benchmarked against two runs using a fixed, finely meshed grid: the first with the same numerical formulation as the AMR code, and the second with a code dedicated to ABL studies. Compared to the fixed and isotropic grid runs, the AMR algorithm can coarsen and refine the grid such that accurate results are obtained whilst using only a fraction of the grid cells. Performance-wise, the AMR run was cheaper than the fixed and isotropic grid run with a similar numerical formulation. However, for this specific case, the dedicated code outperformed both aforementioned runs.
Fixed-interval matching-to-sample: intermatching time and intermatching error runs
Nelson, Thomas D.
1978-01-01
Four pigeons were trained on a matching-to-sample task in which reinforcers followed either the first matching response (fixed interval) or the fifth matching response (tandem fixed-interval fixed-ratio) that occurred 80 seconds or longer after the last reinforcement. Relative frequency distributions of the matching-to-sample responses that concluded intermatching times and runs of mismatches (intermatching error runs) were computed for the final matching responses directly followed by grain access and also for the three matching responses immediately preceding the final match. Comparison of these two distributions showed that the fixed-interval schedule arranged for the preferential reinforcement of matches concluding relatively extended intermatching times and runs of mismatches. Differences in matching accuracy and rate during the fixed interval, compared to the tandem fixed-interval fixed-ratio, suggested that reinforcers following matches concluding various intermatching times and runs of mismatches influenced the rate and accuracy of the last few matches before grain access, but did not control rate and accuracy throughout the entire fixed-interval period. PMID:16812032
VERIFICATION OF THE HYDROLOGIC EVALUATION OF LANDFILL PERFORMANCE (HELP) MODEL USING FIELD DATA
The report describes a study conducted to verify the Hydrologic Evaluation of Landfill Performance (HELP) computer model using existing field data from a total of 20 landfill cells at 7 sites in the United States. Simulations using the HELP model were run to compare the predicted...
Fast-response free-running dc-to-dc converter employing a state-trajectory control law
NASA Technical Reports Server (NTRS)
Huffman, S. D.; Burns, W. W., III; Wilson, T. G.; Owen, H. A., Jr.
1977-01-01
A recently proposed state-trajectory control law for a family of energy-storage dc-to-dc converters has been implemented for the voltage step-up configuration. Two methods of realization are discussed; one employs a digital processor and the other uses analog computational circuits. Performance characteristics of experimental voltage step-up converters operating under the control of each of these implementations are reported and compared to theoretical predictions and computer simulations.
Accelerating 3D Hall MHD Magnetosphere Simulations with Graphics Processing Units
NASA Astrophysics Data System (ADS)
Bard, C.; Dorelli, J.
2017-12-01
The resolution required to simulate planetary magnetospheres with Hall magnetohydrodynamics results in program sizes approaching several hundred million grid cells. These would take years to run on a single computational core and require hundreds or thousands of computational cores to complete in a reasonable time. However, this requires access to the largest supercomputers. Graphics processing units (GPUs) provide a viable alternative: one GPU can do the work of roughly 100 cores, bringing Hall MHD simulations of Ganymede within reach of modest GPU clusters (~8 GPUs). We report our progress in developing a GPU-accelerated, three-dimensional Hall magnetohydrodynamic code and present Hall MHD simulation results for both Ganymede (run on 8 GPUs) and Mercury (56 GPUs). We benchmark our Ganymede simulation against previous results for the Galileo G8 flyby, namely that adding the Hall term to ideal MHD simulations changes the global convection pattern within the magnetosphere. Additionally, we present new results for the G1 flyby as well as initial results from Hall MHD simulations of Mercury and compare them with the corresponding ideal MHD runs.
Energy consumption program: A computer model simulating energy loads in buildings
NASA Technical Reports Server (NTRS)
Stoller, F. W.; Lansing, F. L.; Chai, V. W.; Higgins, S.
1978-01-01
The JPL energy consumption computer program, developed as a useful tool in the ongoing building modification studies in the DSN energy conservation project, is described. The program simulates building heating and cooling loads and computes thermal and electric energy consumption and cost. The accuracy of the computations is not sacrificed, however, since the results lie within a + or - 10 percent margin of those read from energy meters. The program is carefully structured to reduce both the user's time and running cost by requiring minimal information from the user and by reducing many internal time-consuming computational loops. Many unique features not found in any other program were added to handle two-level electronics control rooms.
Long-distance running, bone density, and osteoarthritis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lane, N.E.; Bloch, D.A.; Jones, H.H.
Forty-one long-distance runners aged 50 to 72 years were compared with 41 matched community controls to examine associations of repetitive, long-term physical impact (running) with osteoarthritis and osteoporosis. Roentgenograms of hands, lateral lumbar spine, and knees were assessed without knowledge of running status. A computed tomographic scan of the first lumbar vertebra was performed to quantitate bone mineral content. Runners, both male and female, have approximately 40% more bone mineral than matched controls. Female runners, but not male runners, appear to have somewhat more sclerosis and spur formation in spine and weight-bearing knee x-ray films, but not in hand x-ray films. There were no differences between groups in joint space narrowing, crepitation, joint stability, or symptomatic osteoarthritis. Running is associated with increased bone mineral but not, in this cross-sectional study, with clinical osteoarthritis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Look, Nicole; Arellano, Christopher J.; Grabowski, Alena M.
2013-12-15
In this paper, we study dynamic stability during running, focusing on the effects of speed, and the use of a leg prosthesis. We compute and compare the maximal Lyapunov exponents of kinematic time-series data from subjects with and without unilateral transtibial amputations running at a wide range of speeds. We find that the dynamics of the affected leg with the running-specific prosthesis are less stable than the dynamics of the unaffected leg and also less stable than the biological legs of the non-amputee runners. Surprisingly, we find that the center-of-mass dynamics of runners with two intact biological legs are slightly less stable than those of runners with amputations. Our results suggest that while leg asymmetries may be associated with instability, runners may compensate for this effect by increased control of their center-of-mass dynamics.
Running Jobs on the Peregrine System | High-Performance Computing | NREL
Documentation for running jobs on the Peregrine high-performance computing (HPC) system, covering: running different types of jobs; batch job scheduling policies (queue names, limits, etc.); requesting different node types; and sample batch scripts.
ATLAS Distributed Computing Experience and Performance During the LHC Run-2
NASA Astrophysics Data System (ADS)
Filipčič, A.
2017-10-01
ATLAS Distributed Computing during LHC Run-1 was challenged by steadily increasing computing, storage and network requirements. In addition, the complexity of processing task workflows and their associated data management requirements led to a new paradigm in the ATLAS computing model for Run-2, accompanied by extensive evolution and redesign of the workflow and data management systems. The new systems were put into production at the end of 2014, and gained robustness and maturity during 2015 data taking. ProdSys2, the new request and task interface; JEDI, the dynamic job execution engine developed as an extension to PanDA; and Rucio, the new data management system, form the core of the Run-2 ATLAS distributed computing engine. One of the big changes for Run-2 was the adoption of the Derivation Framework, which moves the chaotic CPU- and data-intensive part of the user analysis into the centrally organized train production, delivering derived AOD datasets to user groups for final analysis. The effectiveness of the new model was demonstrated through the delivery of analysis datasets to users just one week after data taking, by completing the calibration loop, Tier-0 processing and train production steps promptly. The great flexibility of the new system also makes it possible to execute part of the Tier-0 processing on the grid when Tier-0 resources experience a backlog during high data-taking periods. The introduction of the data lifetime model, where each dataset is assigned a finite lifetime (with extensions possible for frequently accessed data), was made possible by Rucio. Thanks to this, the storage crises experienced in Run-1 have not reappeared during Run-2. In addition, the distinction between Tier-1 and Tier-2 disk storage, now largely artificial given the quality of Tier-2 resources and their networking, has been removed through the introduction of dynamic ATLAS clouds that group the storage endpoint nucleus and its close-by execution satellite sites. All stable ATLAS sites are now able to store unique or primary copies of the datasets. ATLAS Distributed Computing is further evolving to speed up request processing by introducing network awareness, using machine learning and optimisation of the latencies during the execution of the full chain of tasks. The Event Service, a new workflow and job execution engine, is designed around check-pointing at the level of event processing to use opportunistic resources more efficiently. ATLAS has been extensively exploring possibilities of using computing resources extending beyond conventional grid sites in the WLCG fabric to deliver as many computing cycles as possible and thereby enhance the significance of the Monte-Carlo samples to deliver better physics results. The exploitation of opportunistic resources was at an early stage throughout 2015, at the level of 10% of the total ATLAS computing power, but in the next few years it is expected to deliver much more. In addition, demonstrating the ability to use an opportunistic resource can lead to securing ATLAS allocations on the facility, hence the importance of this work goes beyond merely the initial CPU cycles gained. In this paper, we give an overview and compare the performance, development effort, flexibility and robustness of the various approaches.
WinHPC System | High-Performance Computing | NREL
NREL's WinHPC system is a computing cluster running the Microsoft Windows operating system. It allows users to run jobs requiring a Windows environment, such as ANSYS and MATLAB.
Performance of the Widely-Used CFD Code OVERFLOW on the Pleiades Supercomputer
NASA Technical Reports Server (NTRS)
Guruswamy, Guru P.
2017-01-01
Computational performance studies were made for NASA's widely used Computational Fluid Dynamics code OVERFLOW on the Pleiades Supercomputer. Two test cases were considered: a full launch vehicle with a grid of 286 million points and a full rotorcraft model with a grid of 614 million points. Computations using up to 8000 cores were run on Sandy Bridge and Ivy Bridge nodes. Performance was monitored using times reported in the day files from the Portable Batch System utility. Results for two grid topologies are presented and compared in detail. Observations and suggestions for future work are made.
Processing Shotgun Proteomics Data on the Amazon Cloud with the Trans-Proteomic Pipeline*
Slagel, Joseph; Mendoza, Luis; Shteynberg, David; Deutsch, Eric W.; Moritz, Robert L.
2015-01-01
Cloud computing, where scalable, on-demand compute cycles and storage are available as a service, has the potential to accelerate mass spectrometry-based proteomics research by providing simple, expandable, and affordable large-scale computing to all laboratories regardless of location or information technology expertise. We present new cloud computing functionality for the Trans-Proteomic Pipeline, a free and open-source suite of tools for the processing and analysis of tandem mass spectrometry datasets. Enabled with Amazon Web Services cloud computing, the Trans-Proteomic Pipeline now accesses large scale computing resources, limited only by the available Amazon Web Services infrastructure, for all users. The Trans-Proteomic Pipeline runs in an environment fully hosted on Amazon Web Services, where all software and data reside on cloud resources to tackle large search studies. In addition, it can also be run on a local computer with computationally intensive tasks launched onto the Amazon Elastic Compute Cloud service to greatly decrease analysis times. We describe the new Trans-Proteomic Pipeline cloud service components, compare the relative performance and costs of various Elastic Compute Cloud service instance types, and present on-line tutorials that enable users to learn how to deploy cloud computing technology rapidly with the Trans-Proteomic Pipeline. We provide tools for estimating the necessary computing resources and costs given the scale of a job and demonstrate the use of the cloud-enabled Trans-Proteomic Pipeline by running over 1100 tandem mass spectrometry files through four proteomic search engines in 9 h and at a very low cost. PMID:25418363
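As a rough illustration of the pattern described above, launching computationally intensive work onto the Amazon Elastic Compute Cloud, the sketch below starts a single EC2 instance with boto3. The AMI ID, instance type, region, and start-up command are placeholders, and this is not the Trans-Proteomic Pipeline's own launch tooling.

# Hypothetical sketch: launch one EC2 instance to run a compute-intensive search task.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")    # region is an assumption
response = ec2.run_instances(
    ImageId="ami-00000000",                           # placeholder machine image
    InstanceType="c5.4xlarge",                        # compute-optimized type (assumption)
    MinCount=1,
    MaxCount=1,
    # Hypothetical start-up script standing in for a search-engine job.
    UserData="#!/bin/bash\n/opt/tpp/run_search.sh /data/spectra.mzML\n",
)
print("launched:", response["Instances"][0]["InstanceId"])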
Processing shotgun proteomics data on the Amazon cloud with the trans-proteomic pipeline.
Slagel, Joseph; Mendoza, Luis; Shteynberg, David; Deutsch, Eric W; Moritz, Robert L
2015-02-01
Cloud computing, where scalable, on-demand compute cycles and storage are available as a service, has the potential to accelerate mass spectrometry-based proteomics research by providing simple, expandable, and affordable large-scale computing to all laboratories regardless of location or information technology expertise. We present new cloud computing functionality for the Trans-Proteomic Pipeline, a free and open-source suite of tools for the processing and analysis of tandem mass spectrometry datasets. Enabled with Amazon Web Services cloud computing, the Trans-Proteomic Pipeline now accesses large scale computing resources, limited only by the available Amazon Web Services infrastructure, for all users. The Trans-Proteomic Pipeline runs in an environment fully hosted on Amazon Web Services, where all software and data reside on cloud resources to tackle large search studies. In addition, it can also be run on a local computer with computationally intensive tasks launched onto the Amazon Elastic Compute Cloud service to greatly decrease analysis times. We describe the new Trans-Proteomic Pipeline cloud service components, compare the relative performance and costs of various Elastic Compute Cloud service instance types, and present on-line tutorials that enable users to learn how to deploy cloud computing technology rapidly with the Trans-Proteomic Pipeline. We provide tools for estimating the necessary computing resources and costs given the scale of a job and demonstrate the use of the cloud-enabled Trans-Proteomic Pipeline by running over 1100 tandem mass spectrometry files through four proteomic search engines in 9 h and at a very low cost. © 2015 by The American Society for Biochemistry and Molecular Biology, Inc.
Analyzing Spacecraft Telecommunication Systems
NASA Technical Reports Server (NTRS)
Kordon, Mark; Hanks, David; Gladden, Roy; Wood, Eric
2004-01-01
Multi-Mission Telecom Analysis Tool (MMTAT) is a C-language computer program for analyzing proposed spacecraft telecommunication systems. MMTAT utilizes parameterized input and computational models that can be run on standard desktop computers to perform fast and accurate analyses of telecommunication links. MMTAT is easy to use and can easily be integrated with other software applications and run as part of almost any computational simulation. It is distributed as either a stand-alone application program with a graphical user interface or a linkable library with a well-defined set of application programming interface (API) calls. As a stand-alone program, MMTAT provides both textual and graphical output. The graphs make it possible to understand, quickly and easily, how telecommunication performance varies with variations in input parameters. A delimited text file that can be read by any spreadsheet program is generated at the end of each run. The API in the linkable-library form of MMTAT enables the user to control simulation software and to change parameters during a simulation run. Results can be retrieved either at the end of a run or by use of a function call at any time step.
Head direction cells in the postsubiculum do not show replay of prior waking sequences during sleep
Brandon, Mark P.; Bogaard, Andrew; Andrews, Chris M.; Hasselmo, Michael E.
2011-01-01
During slow-wave sleep and REM sleep, hippocampal place cells in the rat show replay of sequences previously observed during waking. We tested the hypothesis from computational modelling that the temporal structure of REM sleep replay could arise from an interplay of place cells with head direction cells in the postsubiculum. Physiological single-unit recording was performed simultaneously from five or more head direction or place by head direction cells in the postsubiculum during running on a circular track allowing sampling of a full range of head directions, and during sleep periods before and after running on the circular track. Data analysis compared the spiking activity during individual REM periods with waking as in previous analysis procedures for REM sleep. We also used a new procedure comparing groups of similar runs during waking with REM sleep periods. There was no consistent evidence for a statistically significant correlation of the temporal structure of spiking during REM sleep with spiking during waking running periods. Thus, the spiking activity of head direction cells during REM sleep does not show replay of head direction cell activity occurring during a previous waking period of running on the task. In addition, we compared the spiking of postsubiculum neurons during hippocampal sharp wave ripple events. We show that head direction cells are not activated during sharp wave ripples, while neurons responsive to place in the postsubiculum show reliable spiking at ripple events. PMID:21509854
Cable Connected Spinning Spacecraft, 1. the Canonical Equations, 2. Urban Mass Transportation, 3
NASA Technical Reports Server (NTRS)
Sitchin, A.
1972-01-01
Work on the dynamics of cable-connected spinning spacecraft was completed by formulating the equations of motion using both the canonical equations and Lagrange's equations and by programming them for numerical solution on a digital computer. These energy-based formulations will permit the future addition of the effect of cable mass. Comparative runs indicate that the canonical formulation requires less computer time. Available literature on urban mass transportation was surveyed. Areas of the private rapid transit concept of urban transportation are also studied.
Organization and use of a Software/Hardware Avionics Research Program (SHARP)
NASA Technical Reports Server (NTRS)
Karmarkar, J. S.; Kareemi, M. N.
1975-01-01
The organization and use of the software/hardware avionics research program (SHARP), developed to duplicate the automatic portion of the STOLAND simulator system on a general-purpose computer system (i.e., an IBM 360), are described. The program's uses are: (1) to conduct comparative evaluation studies of current and proposed airborne and ground system concepts via single-run or Monte Carlo simulation techniques, and (2) to provide a software tool for efficient algorithm evaluation and development for the STOLAND avionics computer.
2011-08-01
Figure 4: Architectural diagram of running Blender on Amazon EC2 through Nimbis. Other figure captions: example input images (top left); all digit prototypes (cluster centers) found in the classification of streaming data, with size proportional to frequency.
Performance Analysis of and Tool Support for Transactional Memory on BG/Q
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schindewolf, M
2011-12-08
Martin Schindewolf worked during his internship at the Lawrence Livermore National Laboratory (LLNL) under the guidance of Martin Schulz at the Computer Science Group of the Center for Applied Scientific Computing. We studied the performance of the TM subsystem of BG/Q as well as researched the possibilities for tool support for TM. To study the performance, we run CLOMP-TM, a benchmark designed to quantify the overhead of OpenMP and to compare different synchronization primitives. To advance CLOMP-TM, we added Message Passing Interface (MPI) routines for a hybrid parallelization. This enables running multiple MPI tasks, each running OpenMP, on one node. With these enhancements, a beneficial MPI task to OpenMP thread ratio is determined. Further, the synchronization primitives are ranked as a function of the application characteristics. To demonstrate the usefulness of these results, we investigate a real Monte Carlo simulation called the Monte Carlo Benchmark (MCB). Applying the lessons learned yields the best task to thread ratio, and we were able to tune the synchronization by transactifying the MCB. In addition, we develop tools that capture the performance of the TM run time system and present it to the application's developer. Assessing the performance of the TM run time system relies on its built-in statistics. These tools use the Blue Gene Performance Monitoring (BGPM) interface to correlate the statistics from the TM run time system with performance counter values. This combination provides detailed insights into the run time behavior of the application and enables tracking down the cause of degraded performance. One tool has been implemented that separates the performance counters into three categories: Successful Speculation, Unsuccessful Speculation, and No Speculation. All of the tools are crafted around IBM's xlc compiler for C and C++ and have been run and tested on a Q32 early access system.
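The hybrid layout explored with CLOMP-TM, several MPI tasks per node, each running a pool of threads, can be sketched at a high level as below. mpi4py and a Python thread pool stand in for the C/OpenMP original, the per-item work is a trivial placeholder rather than the benchmark's synchronized updates, and the task-to-thread ratio is simply a constant to be tuned.

# Hedged sketch of a hybrid MPI-tasks-times-threads layout; not the CLOMP-TM kernel.
from concurrent.futures import ThreadPoolExecutor
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
threads_per_task = 4                      # tuning knob: threads per MPI task

def work(i):
    return i * i                          # placeholder for the real per-item work

with ThreadPoolExecutor(max_workers=threads_per_task) as pool:
    local = sum(pool.map(work, range(1000)))   # thread-level parallel section

total = comm.reduce(local, op=MPI.SUM, root=0)  # combine across MPI tasks
if rank == 0:
    print("global result:", total)

Launched with, for example, mpiexec -n 4 python hybrid_sketch.py, the number of MPI tasks times threads_per_task is the ratio being tuned.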
Overcoming Microsoft Excel's Weaknesses for Crop Model Building and Simulations
ERIC Educational Resources Information Center
Sung, Christopher Teh Boon
2011-01-01
Using spreadsheets such as Microsoft Excel for building crop models and running simulations can be beneficial. Excel is easy to use, powerful, and versatile, and it requires the least proficiency in computer programming compared to other programming platforms. Excel, however, has several weaknesses: it does not directly support loops for iterative…
Design for Run-Time Monitor on Cloud Computing
NASA Astrophysics Data System (ADS)
Kang, Mikyung; Kang, Dong-In; Yun, Mira; Park, Gyung-Leen; Lee, Junghoon
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications as services over the Internet as well as infrastructure. A cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring the system status change, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design the Run-Time Monitor (RTM), which is system software to monitor application behavior at run-time, analyze the collected information, and optimize resources on cloud computing. RTM monitors application software through library instrumentation, as well as the underlying hardware through performance counters, optimizing its computing configuration based on the analyzed data.
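A generic monitor-analyze-adapt loop of the kind the RTM design describes might look like the sketch below. psutil stands in for the library and hardware instrumentation, and the threshold rule and "request more resources" step are placeholder decisions, not the RTM's actual analysis or reconfiguration policy.

# Hedged sketch of a monitor-analyze-adapt loop; not the RTM implementation.
import time
import psutil

def monitor_loop(cycles=10, cpu_threshold=85.0):
    for _ in range(cycles):
        cpu = psutil.cpu_percent(interval=1.0)      # monitor: sample hardware status
        mem = psutil.virtual_memory().percent
        overloaded = cpu > cpu_threshold            # analyze: simple threshold rule
        if overloaded:
            # adapt: a real RTM would reconfigure the service or its resources here
            print(f"cpu={cpu:.0f}% mem={mem:.0f}% -> request more resources")
        else:
            print(f"cpu={cpu:.0f}% mem={mem:.0f}% -> configuration unchanged")
        time.sleep(0.5)

if __name__ == "__main__":
    monitor_loop()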
Cloud Computing for Complex Performance Codes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Appel, Gordon John; Hadgu, Teklu; Klein, Brandon Thorin
This report describes the use of cloud computing services for running complex public domain performance assessment problems. The work consisted of two phases: Phase 1 was to demonstrate that complex codes, on several differently configured servers, could run and compute trivial small-scale problems in a commercial cloud infrastructure. Phase 2 focused on proving that non-trivial large-scale problems could be computed in the commercial cloud environment. The cloud computing effort was successfully applied using codes of interest to the geohydrology and nuclear waste disposal modeling community.
NASA Astrophysics Data System (ADS)
Myre, Joseph M.
Heterogeneous computing systems have recently come to the forefront of the High-Performance Computing (HPC) community's interest. HPC computer systems that incorporate special purpose accelerators, such as Graphics Processing Units (GPUs), are said to be heterogeneous. Large scale heterogeneous computing systems have consistently ranked highly on the Top500 list since the beginning of the heterogeneous computing trend. By using heterogeneous computing systems that consist of both general purpose processors and special-purpose accelerators, the speed and problem size of many simulations could be dramatically increased. Ultimately this results in enhanced simulation capabilities that allow, in some cases for the first time, the execution of parameter space and uncertainty analyses, model optimizations, and other inverse modeling techniques that are critical for scientific discovery and engineering analysis. However, simplifying the usage and optimization of codes for heterogeneous computing systems remains a challenge. This is particularly true for scientists and engineers for whom understanding HPC architectures and undertaking performance analysis may not be primary research objectives. To enable scientists and engineers to remain focused on their primary research objectives, a modular environment for geophysical inversion and run-time autotuning on heterogeneous computing systems is presented. This environment is composed of three major components: 1) CUSH, a framework for reducing the complexity of programming heterogeneous computer systems; 2) geophysical inversion routines, which can be used to characterize physical systems; and 3) run-time autotuning routines designed to determine configurations of heterogeneous computing systems in an attempt to maximize the performance of scientific and engineering codes. Using three case studies, a lattice-Boltzmann method, a non-negative least squares inversion, and a finite-difference fluid flow method, it is shown that this environment provides scientists and engineers with means to reduce the programmatic complexity of their applications, to perform geophysical inversions for characterizing physical systems, and to determine high-performing run-time configurations of heterogeneous computing systems using a run-time autotuner.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xie, Yu; Sengupta, Manajit
Solar radiation can be computed using radiative transfer models, such as the Rapid Radiation Transfer Model (RRTM) and its general circulation model applications, and used for various energy applications. Due to the complexity of computing radiation fields in aerosol and cloudy atmospheres, simulating solar radiation can be extremely time-consuming, but many approximations, e.g., the two-stream approach and the delta-M truncation scheme, can be utilized. To provide a new fast option for computing solar radiation, we developed the Fast All-sky Radiation Model for Solar applications (FARMS) by parameterizing the simulated diffuse horizontal irradiance and direct normal irradiance for cloudy conditions from the RRTM runs using a 16-stream discrete ordinates radiative transfer method. The solar irradiance at the surface was simulated by combining the cloud irradiance parameterizations with a fast clear-sky model, REST2. To understand the accuracy and efficiency of the newly developed fast model, we analyzed FARMS runs using cloud optical and microphysical properties retrieved using GOES data from 2009-2012. The global horizontal irradiance for cloudy conditions was simulated using FARMS and RRTM for global circulation modeling with a two-stream approximation and compared to measurements taken from the U.S. Department of Energy's Atmospheric Radiation Measurement Climate Research Facility Southern Great Plains site. Our results indicate that the accuracy of FARMS is comparable to or better than the two-stream approach; however, FARMS is approximately 400 times more efficient because it does not explicitly solve the radiative transfer equation for each individual cloud condition. Radiative transfer model runs are computationally expensive, but this model is promising for broad applications in solar resource assessment and forecasting. It is currently being used in the National Solar Radiation Database, which is publicly available from the National Renewable Energy Laboratory at http://nsrdb.nrel.gov.
Scalable computing for evolutionary genomics.
Prins, Pjotr; Belhachemi, Dominique; Möller, Steffen; Smant, Geert
2012-01-01
Genomic data analysis in evolutionary biology is becoming so computationally intensive that analysis of multiple hypotheses and scenarios takes too long on a single desktop computer. In this chapter, we discuss techniques for scaling computations through parallelization of calculations, after giving a quick overview of advanced programming techniques. Unfortunately, parallel programming is difficult and requires special software design. The alternative, especially attractive for legacy software, is to introduce poor man's parallelization by running whole programs in parallel as separate processes, using job schedulers. Such pipelines are often deployed on bioinformatics computer clusters. Recent advances in PC virtualization have made it possible to run a full computer operating system, with all of its installed software, on top of another operating system, inside a "box," or virtual machine (VM). Such a VM can flexibly be deployed on multiple computers, in a local network, e.g., on existing desktop PCs, and even in the Cloud, to create a "virtual" computer cluster. Many bioinformatics applications in evolutionary biology can be run in parallel, running processes in one or more VMs. Here, we show how a ready-made bioinformatics VM image, named BioNode, effectively creates a computing cluster, and pipeline, in a few steps. This allows researchers to scale-up computations from their desktop, using available hardware, anytime it is required. BioNode is based on Debian Linux and can run on networked PCs and in the Cloud. Over 200 bioinformatics and statistical software packages, of interest to evolutionary biology, are included, such as PAML, Muscle, MAFFT, MrBayes, and BLAST. Most of these software packages are maintained through the Debian Med project. In addition, BioNode contains convenient configuration scripts for parallelizing bioinformatics software. Where Debian Med encourages packaging free and open source bioinformatics software through one central project, BioNode encourages creating free and open source VM images, for multiple targets, through one central project. BioNode can be deployed on Windows, OSX, Linux, and in the Cloud. Next to the downloadable BioNode images, we provide tutorials online, which empower bioinformaticians to install and run BioNode in different environments, as well as information for future initiatives, on creating and building such images.
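The "poor man's parallelization" described above, running whole programs in parallel as separate processes with no inter-process communication, can be sketched as follows. The blastn command line is only illustrative of an external bioinformatics tool and the input file names are hypothetical; any of the packaged programs could be substituted.

# Minimal sketch: run independent external programs in parallel as separate processes.
import subprocess
from concurrent.futures import ProcessPoolExecutor

def run_job(query_file):
    # Each job is an independent external program; no inter-process communication.
    cmd = ["blastn", "-query", query_file, "-db", "nt", "-out", query_file + ".out"]
    return subprocess.run(cmd, check=False).returncode

if __name__ == "__main__":
    queries = ["chunk_%03d.fasta" % i for i in range(8)]   # hypothetical input chunks
    with ProcessPoolExecutor(max_workers=4) as pool:        # four jobs at a time
        exit_codes = list(pool.map(run_job, queries))
    print(exit_codes)

A job scheduler on a cluster or a set of BioNode virtual machines would play the same role as the process pool here, farming each command out to whatever compute is available.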
Heterogeneous computing architecture for fast detection of SNP-SNP interactions.
Sluga, Davor; Curk, Tomaz; Zupan, Blaz; Lotric, Uros
2014-06-25
The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphic Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems.
Heterogeneous computing architecture for fast detection of SNP-SNP interactions
2014-01-01
Background: The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphic Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. Results: We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. Conclusions: General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems. PMID:24964802
Computer-based, Jeopardy™-like game in general chemistry for engineering majors
NASA Astrophysics Data System (ADS)
Ling, S. S.; Saffre, F.; Kadadha, M.; Gater, D. L.; Isakovic, A. F.
2013-03-01
We report on the design of a Jeopardy™-like computer game for enhancement of learning of general chemistry for engineering majors. While we examine several parameters of student achievement and attitude, our primary concern is addressing the motivation of students, which tends to be low in traditionally run chemistry lectures. The effect of game-playing is tested by comparing a paper-based game quiz, which constitutes the control group, with a computer-based game quiz, constituting the treatment group. Computer-based game quizzes are Java™-based applications that students run once a week in the second part of the last lecture of the week. Overall effectiveness of the semester-long program is measured through pretest-posttest conceptual testing of general chemistry. The objective of this research is to determine to what extent this "gamification" of the course delivery and course evaluation processes may be beneficial to undergraduates' learning of science in general, and chemistry in particular. We present data addressing gender-specific differences in performance, as well as background (pre-college) level of general science and chemistry preparation. We outline a plan to extend this approach to general physics courses and to modern science-driven electives, and we offer live, in-lecture examples of our computer gaming experience. We acknowledge support from Khalifa University, Abu Dhabi.
NASA Astrophysics Data System (ADS)
Fasel, Markus
2016-10-01
High-Performance Computing Systems are powerful tools tailored to support large-scale applications that rely on low-latency inter-process communications to run efficiently. By design, these systems often impose constraints on application workflows, such as limited external network connectivity and whole node scheduling, that make more general-purpose computing tasks, such as those commonly found in high-energy nuclear physics applications, more difficult to carry out. In this work, we present a tool designed to simplify access to such complicated environments by handling the common tasks of job submission, software management, and local data management, in a framework that is easily adaptable to the specific requirements of various computing systems. The tool, initially constructed to process stand-alone ALICE simulations for detector and software development, was successfully deployed on the NERSC computing systems, Carver, Hopper and Edison, and is being configured to provide access to the next generation NERSC system, Cori. In this report, we describe the tool and discuss our experience running ALICE applications on NERSC HPC systems. The discussion will include our initial benchmarks of Cori compared to other systems and our attempts to leverage the new capabilities offered with Cori to support data-intensive applications, with a future goal of full integration of such systems into ALICE grid operations.
Aerodynamic optimization of supersonic compressor cascade using differential evolution on GPU
NASA Astrophysics Data System (ADS)
Aissa, Mohamed Hasanine; Verstraete, Tom; Vuik, Cornelis
2016-06-01
Differential Evolution (DE) is a powerful stochastic optimization method. Compared to gradient-based algorithms, DE is able to avoid local minima but requires at the same time more function evaluations. In turbomachinery applications, function evaluations are performed with time-consuming CFD simulations, which results in a long, unaffordable design cycle. Modern High Performance Computing systems, especially Graphic Processing Units (GPUs), are able to alleviate this inconvenience by accelerating the design evaluation itself. In this work we present a validated CFD solver running on GPUs, able to accelerate the design evaluation and thus the entire design process. An achieved speedup of 20x to 30x enabled the DE algorithm to run on a high-end computer instead of a costly large cluster. The GPU-enhanced DE was used to optimize the aerodynamics of a supersonic compressor cascade, achieving an aerodynamic loss minimization of 20%.
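The abstract does not spell out which DE variant was used, so the sketch below implements the common DE/rand/1 scheme with binomial crossover to make the method concrete; the expensive CFD evaluation is replaced by a cheap analytic test function standing in for the aerodynamic loss.

# Sketch of DE/rand/1/bin; the objective is a stand-in for the CFD loss evaluation.
import numpy as np

def differential_evolution(f, bounds, pop_size=20, F=0.8, CR=0.9, generations=100, seed=0):
    rng = np.random.default_rng(seed)
    dim = len(bounds)
    lo, hi = np.array(bounds, dtype=float).T
    pop = lo + rng.random((pop_size, dim)) * (hi - lo)       # random initial population
    cost = np.array([f(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)         # mutation
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True                   # ensure one component crosses
            trial = np.where(cross, mutant, pop[i])           # binomial crossover
            trial_cost = f(trial)
            if trial_cost <= cost[i]:                         # greedy selection
                pop[i], cost[i] = trial, trial_cost
    best = int(np.argmin(cost))
    return pop[best], cost[best]

# Example: minimize a 5-D sphere function in place of the cascade loss.
x_best, f_best = differential_evolution(lambda x: float(np.sum(x**2)), [(-5, 5)] * 5)
print(x_best, f_best)

In the GPU-accelerated setting described above, the call to f is where the CFD solver would be invoked, which is why accelerating that single evaluation shortens the whole design cycle.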
Increasing thermal efficiency of solar flat plate collectors
NASA Astrophysics Data System (ADS)
Pona, J.
A study of methods to increase the efficiency of heat transfer in flat plate solar collectors is presented. In order to increase the heat transfer from the absorber plate to the working fluid inside the tubes, turbulent flow was induced by installing baffles within the tubes. The installation of the baffles resulted in a 7 to 12% increase in collector efficiency. Experiments were run on both 1 sq ft and 2 sq ft collectors each fitted with either slotted baffles or tubular baffles. A computer program was run comparing the baffled collector to the standard collector. The results obtained from the computer show that the baffled collectors have a 2.7% increase in life cycle cost (LCC) savings and a 3.6% increase in net cash flow for use in domestic hot water systems, and even greater increases when used in solar heating systems.
Aerodynamic optimization of supersonic compressor cascade using differential evolution on GPU
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aissa, Mohamed Hasanine; Verstraete, Tom; Vuik, Cornelis
Differential Evolution (DE) is a powerful stochastic optimization method. Compared to gradient-based algorithms, DE is able to avoid local minima but requires at the same time more function evaluations. In turbomachinery applications, function evaluations are performed with time-consuming CFD simulations, which results in a long, unaffordable design cycle. Modern High Performance Computing systems, especially Graphic Processing Units (GPUs), are able to alleviate this inconvenience by accelerating the design evaluation itself. In this work we present a validated CFD solver running on GPUs, able to accelerate the design evaluation and thus the entire design process. An achieved speedup of 20x to 30x enabled the DE algorithm to run on a high-end computer instead of a costly large cluster. The GPU-enhanced DE was used to optimize the aerodynamics of a supersonic compressor cascade, achieving an aerodynamic loss minimization of 20%.
Fingerprinting Communication and Computation on HPC Machines
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peisert, Sean
2010-06-02
How do we identify what is actually running on high-performance computing systems? Names of binaries, dynamic libraries loaded, or other elements in a submission to a batch queue can give clues, but binary names can be changed, and libraries provide limited insight and resolution on the code being run. In this paper, we present a method for "fingerprinting" code running on HPC machines using elements of communication and computation. We then discuss how that fingerprint can be used to determine if the code is consistent with certain other types of codes, what a user usually runs, or what the user requested an allocation to do. In some cases, our techniques enable us to fingerprint HPC codes using runtime MPI data with a high degree of accuracy.
Eigensolver for a Sparse, Large Hermitian Matrix
NASA Technical Reports Server (NTRS)
Tisdale, E. Robert; Oyafuso, Fabiano; Klimeck, Gerhard; Brown, R. Chris
2003-01-01
A parallel-processing computer program finds a few eigenvalues in a sparse Hermitian matrix that contains as many as 100 million diagonal elements. This program finds the eigenvalues faster, using less memory, than do other, comparable eigensolver programs. This program implements a Lanczos algorithm in the American National Standards Institute/ International Organization for Standardization (ANSI/ISO) C computing language, using the Message Passing Interface (MPI) standard to complement an eigensolver in PARPACK. [PARPACK (Parallel Arnoldi Package) is an extension, to parallel-processing computer architectures, of ARPACK (Arnoldi Package), which is a collection of Fortran 77 subroutines that solve large-scale eigenvalue problems.] The eigensolver runs on Beowulf clusters of computers at the Jet Propulsion Laboratory (JPL).
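A desktop-scale sketch of the same task, finding a few eigenvalues of a large sparse Hermitian matrix with a Lanczos-type solver, is shown below. SciPy's eigsh wraps ARPACK, the serial library that PARPACK extends; the tridiagonal test matrix and its size are arbitrary stand-ins for the hundred-million-element problem.

# Sketch: a few eigenvalues of a sparse Hermitian matrix via ARPACK (Lanczos-type solver).
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 100_000                                    # stand-in for a much larger problem
diag = np.linspace(0.0, 1.0, n)                # main diagonal
off = 0.01 * np.ones(n - 1)                    # sub/super-diagonals
H = sp.diags([off, diag, off], offsets=[-1, 0, 1], format="csr")  # sparse Hermitian matrix

vals = eigsh(H, k=5, which="SA", return_eigenvectors=False)  # five smallest eigenvalues
print(np.sort(vals))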
New reflective symmetry design capability in the JPL-IDEAS Structure Optimization Program
NASA Technical Reports Server (NTRS)
Strain, D.; Levy, R.
1986-01-01
The JPL-IDEAS antenna structure analysis and design optimization computer program was modified to process half structure models of symmetric structures subjected to arbitrary external static loads, synthesize the performance, and optimize the design of the full structure. Significant savings in computation time and cost (more than 50%) were achieved compared to the cost of full model computer runs. The addition of the new reflective symmetry analysis design capabilities to the IDEAS program allows processing of structure models whose size would otherwise prevent automated design optimization. The new program produced synthesized full model iterative design results identical to those of actual full model program executions at substantially reduced cost, time, and computer storage.
Robot computer problem solving system
NASA Technical Reports Server (NTRS)
Becker, J. D.; Merriam, E. W.
1974-01-01
The conceptual, experimental, and practical phases of developing a robot computer problem solving system are outlined. Robot intelligence, conversion of the programming language SAIL to run under the TENEX monitor, and the use of the network to run several cooperating jobs at different sites are discussed.
Tibiofemoral Contact Forces in the Anterior Cruciate Ligament-Reconstructed Knee.
Saxby, David John; Bryant, Adam L; Modenese, Luca; Gerus, Pauline; Killen, Bryce A; Konrath, Jason; Fortin, Karine; Wrigley, Tim V; Bennell, Kim L; Cicuttini, Flavia M; Vertullo, Christopher; Feller, Julian A; Whitehead, Tim; Gallie, Price; Lloyd, David G
2016-11-01
To investigate differences between anterior cruciate ligament-reconstructed (ACLR) and healthy individuals in terms of the magnitude of the tibiofemoral contact forces, as well as the relative muscle and external load contributions to those contact forces, during walking, running, and sidestepping gait tasks. A computational EMG-driven neuromusculoskeletal model was used to estimate the muscle and tibiofemoral contact forces in those with single-bundle combined semitendinosus and gracilis tendon autograft ACLR (n = 104, 29.7 ± 6.5 yr, 78.1 ± 14.4 kg) and healthy controls (n = 60, 27.5 ± 5.4 yr, 67.8 ± 14.0 kg) during walking (1.4 ± 0.2 m·s^-1), running (4.5 ± 0.5 m·s^-1) and sidestepping (3.7 ± 0.6 m·s^-1). Within the computational model, the semitendinosus of ACLR participants was adjusted to account for literature-reported strength deficits and morphological changes subsequent to autograft harvesting. ACLR had smaller maximum total and medial tibiofemoral contact forces (~80% of control values, scaled to bodyweight) during the different gait tasks. Compared with controls, ACLR were found to have a smaller maximum knee flexion moment, which explained the smaller tibiofemoral contact forces. Similarly, compared with controls, ACLR had both a smaller maximum knee flexion angle and knee flexion excursion during running and sidestepping, which may have concentrated the articular contact forces to smaller areas within the tibiofemoral joint. Mean relative muscle and external load contributions to the tibiofemoral contact forces were not significantly different between ACLR and controls. ACLR had lower bodyweight-scaled tibiofemoral contact forces during walking, running, and sidestepping, likely due to lower knee flexion moments and a straighter knee during the different gait tasks. The relative contributions of muscles and external loads to the contact forces were equivalent between groups.
Active Nodal Task Seeking for High-Performance, Ultra-Dependable Computing
1994-07-01
Figure 1 shows a hardware organization of ANTS: stand-alone computing nodes interconnected by buses. 2.1 Run Time Partitioning ... nodes respond to changing loads [27] or system reconfiguration [26]. Existing techniques are all source-initiated or server-initiated [27]. ... short-running task segments: the task segments must be short-running in order that processors will become available often enough to satisfy changing ...
Parallel computing for automated model calibration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Burke, John S.; Danielson, Gary R.; Schulz, Douglas A.
2002-07-29
Natural resources model calibration is a significant burden on computing and staff resources in modeling efforts. Most assessments must consider multiple calibration objectives (for example, magnitude and timing of stream flow peak). An automated calibration process that allows real-time updating of data/models, freeing scientists to focus effort on improving models, is needed. We are in the process of building a fully featured multi-objective calibration tool capable of processing multiple models cheaply and efficiently using null cycle computing. Our parallel processing and calibration software routines have been written generically, but our focus has been on natural resources model calibration. So far, the natural resources models have been friendly to parallel calibration efforts in that they require no inter-process communication, need only a small amount of input data, and output only a small amount of statistical information for each calibration run. A typical auto calibration run might involve running a model 10,000 times with a variety of input parameters and summary statistical output. In the past, model calibration has been done against individual models for each data set. The individual model runs are relatively fast, ranging from seconds to minutes. The process was run on a single computer using a simple iterative process. We have completed two auto calibration prototypes and are currently designing a more feature-rich tool. Our prototypes have focused on running the calibration in a distributed computing cross-platform environment. They allow incorporation of "smart" calibration parameter generation (using artificial intelligence processing techniques). Null cycle computing similar to SETI@Home has also been a focus of our efforts. This paper details the design of the latest prototype and discusses our plans for the next revision of the software.
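The run-many-models-and-summarize pattern described above can be sketched with standard Python multiprocessing, as below. The "model" is a toy stand-in for a natural-resources simulation, and the single error statistic replaces the multiple calibration objectives a real run would report.

# Sketch: 10,000 independent model runs with sampled parameters, summarized in parallel.
import numpy as np
from multiprocessing import Pool

def run_model(params):
    # Stand-in for one model execution (seconds to minutes in the real case);
    # returns a small summary statistic, e.g. error against an observed peak.
    k, s = params
    simulated_peak = k * np.exp(-s)          # toy response surface
    observed_peak = 0.35
    return (k, s, abs(simulated_peak - observed_peak))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    samples = [(rng.uniform(0.1, 1.0), rng.uniform(0.0, 3.0)) for _ in range(10_000)]
    with Pool() as pool:                      # spread runs over local cores
        results = pool.map(run_model, samples)
    best = min(results, key=lambda r: r[2])   # parameter set with smallest error
    print("best (k, s, error):", best)

A null-cycle or cluster deployment would distribute the same independent runs over idle machines instead of local processes; because each run needs little input and returns only summary statistics, the pattern maps onto either setting.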
The Impact and Promise of Open-Source Computational Material for Physics Teaching
NASA Astrophysics Data System (ADS)
Christian, Wolfgang
2017-01-01
A computer-based modeling approach to teaching must be flexible because students and teachers have different skills and varying levels of preparation. Learning how to run the "software du jour" is not the objective for integrating computational physics material into the curriculum. Learning computational thinking, how to use computation and computer-based visualization to communicate ideas, how to design and build models, and how to use ready-to-run models to foster critical thinking is the objective. Our computational modeling approach to teaching is a research-proven pedagogy that predates computers. It attempts to enhance student achievement through the Modeling Cycle. This approach was pioneered by Robert Karplus and the SCIS Project in the 1960s and 70s and later extended by the Modeling Instruction Program led by Jane Jackson and David Hestenes at Arizona State University. This talk describes a no-cost open-source computational approach aligned with a Modeling Cycle pedagogy. Our tools, curricular material, and ready-to-run examples are freely available from the Open Source Physics Collection hosted on the AAPT-ComPADRE digital library. Examples will be presented.
NASA Technical Reports Server (NTRS)
Parker, R. J.; Zaretsky, E. V.
1974-01-01
The five-ball fatigue tester was used to evaluate silicon nitride as a rolling-element bearing material. Results indicate that hot-pressed silicon nitride running against steel may be expected to yield fatigue lives comparable to or greater than those of bearing-quality steel running against steel at stress levels typical of rolling-element bearing applications. The fatigue life of hot-pressed silicon nitride is considerably greater than that of any ceramic or cermet tested. Computer analysis indicates that there is no improvement in the lives of 120-mm-bore angular-contact ball bearings of the same geometry operating at DN values from 2 to 4 million when hot-pressed silicon nitride balls are used in place of steel balls.
The effect of foot strike pattern on achilles tendon load during running.
Almonroeder, Thomas; Willson, John D; Kernozek, Thomas W
2013-08-01
In this study we compared Achilles tendon loading parameters during barefoot running among females with different foot strike patterns using open-source computer muscle modeling software to provide dynamic simulations of running. Muscle forces of the gastrocnemius and soleus were estimated from experimental data collected in a motion capture laboratory during barefoot running for 11 runners utilizing a rearfoot strike (RFS) and 8 runners utilizing a non-RFS (NRFS) pattern. Our results show that peak Achilles tendon force occurred earlier in stance phase (p = 0.007), which contributed to a 15% increase in average Achilles tendon loading rate among participants adopting a NRFS pattern (p = 0.06). Stance time, step length, and the estimated number of steps per mile were similar between groups. However, runners with a NRFS pattern experienced 11% greater Achilles tendon impulse each step (p = 0.05) and nearly significantly greater Achilles tendon impulse per mile run (p = 0.06). This difference equates to an additional 47.7 body weights for each mile run with a NRFS pattern. Runners considering a NRFS pattern may want to account for these novel stressors and adapt training programs accordingly.
NASA Astrophysics Data System (ADS)
Tan, K. L.; Chong, Z. L.; Khoo, M. B. C.; Teoh, W. L.; Teh, S. Y.
2017-09-01
Quality control is crucial in a wide variety of fields, as it can help to satisfy customers' needs and requirements by enhancing and improving products and services to a superior quality level. The EWMA median chart was proposed as a useful alternative to the EWMA \bar{X} chart because the median-type chart is robust against contamination, outliers, or small deviations from the normality assumption compared to the traditional \bar{X}-type chart. To provide a complete understanding of the run-length distribution, the percentiles of the run-length distribution should be investigated rather than depending solely on the average run length (ARL) performance measure. This is because interpretation depending on the ARL alone can be misleading, as the skewness and shape of the run-length distribution change with the magnitude of the process mean shift, varying from almost symmetric when the shift is large to highly right-skewed when the process is in-control (IC) or only slightly out-of-control (OOC). Before computing the percentiles of the run-length distribution, optimal parameters of the EWMA median chart will be obtained by minimizing the OOC ARL, while retaining the IC ARL at a desired value.
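A Monte Carlo sketch of the quantities discussed above, the ARL and the percentiles of the run-length distribution of an EWMA chart on subgroup medians, is given below. The smoothing constant, control limit, and subgroup size are illustrative assumptions, not the optimal chart parameters obtained in the paper.

# Sketch: simulate run lengths of an EWMA median chart and summarize ARL and percentiles.
import numpy as np

def run_length(lam=0.1, limit=0.25, n=5, shift=0.0, max_len=100_000, rng=None):
    rng = rng or np.random.default_rng()
    z = 0.0                                          # EWMA statistic, in-control target = 0
    for t in range(1, max_len + 1):
        m = np.median(rng.normal(shift, 1.0, n))     # subgroup median
        z = lam * m + (1.0 - lam) * z                # EWMA update
        if abs(z) > limit:                           # out-of-control signal
            return t
    return max_len

rng = np.random.default_rng(0)
rls = np.array([run_length(shift=0.0, rng=rng) for _ in range(2_000)])   # in-control case
print("estimated IC ARL:", rls.mean())
print("5th/50th/95th run-length percentiles:", np.percentile(rls, [5, 50, 95]))

Re-running the simulation with a nonzero shift shows how the run-length distribution moves from highly right-skewed toward nearly symmetric as the shift grows, which is why percentiles carry information the ARL alone does not.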
Virtualization and cloud computing in dentistry.
Chow, Frank; Muftu, Ali; Shorter, Richard
2014-01-01
The use of virtualization and cloud computing has changed the way we use computers. Virtualization is a method of placing software called a hypervisor on the hardware of a computer or a host operating system. It allows a guest operating system to run on top of the physical computer with a virtual machine (i.e., virtual computer). Virtualization allows multiple virtual computers to run on top of one physical computer and to share its hardware resources, such as printers, scanners, and modems. This increases the efficient use of the computer by decreasing costs (e.g., hardware, electricity, administration, and management) since only one physical computer is needed and running. This virtualization platform is the basis for cloud computing. It has expanded into areas of server and storage virtualization. One of the commonly used dental storage systems is cloud storage. Patient information is encrypted as required by the Health Insurance Portability and Accountability Act (HIPAA) and stored on off-site private cloud services for a monthly service fee. As computer costs continue to increase, so too will the need for more storage and processing power. Virtual and cloud computing will be a method for dentists to minimize costs and maximize computer efficiency in the near future. This article will provide some useful information on current uses of cloud computing.
Kim, Seung-Cheol; Kim, Eun-Soo
2009-02-20
In this paper we propose a new approach for fast generation of computer-generated holograms (CGHs) of a 3D object by using the run-length encoding (RLE) and the novel look-up table (N-LUT) methods. With the RLE method, spatially redundant data of a 3D object are extracted and regrouped into the N-point redundancy map according to the number of adjacent object points having the same 3D value. Based on this redundancy map, N-point principal fringe patterns (PFPs) are newly calculated by using the 1-point PFP of the N-LUT, and the CGH pattern for the 3D object is generated with these N-point PFPs. In this approach, the number of object points involved in the calculation of the CGH pattern can be dramatically reduced and, as a result, an increase in computational speed can be obtained. Some experiments with a test 3D object are carried out and the results are compared to those of the conventional methods.
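The redundancy-map step can be pictured with a short sketch: adjacent object points that carry the same value are grouped into runs, and the runs are tallied by their length N. This is only a schematic reading of the RLE stage using made-up point data; the paper's actual data layout and the N-point PFP lookup are not reproduced here.

```python
from itertools import groupby
from collections import defaultdict

def redundancy_map(object_points):
    """Group runs of adjacent object points with identical (depth, intensity)
    values and tally them by run length N (an 'N-point redundancy map')."""
    runs = defaultdict(list)           # run length N -> list of (value, start index)
    idx = 0
    for value, group in groupby(object_points):
        n = len(list(group))
        runs[n].append((value, idx))
        idx += n
    return runs

# toy scan line of object points: (depth, intensity) pairs
points = [(5, 1.0), (5, 1.0), (5, 1.0), (7, 0.5), (7, 0.5), (3, 0.8)]
for n, entries in sorted(redundancy_map(points).items()):
    print(f"{n}-point runs:", entries)
```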
Cosmic Reionization On Computers: Numerical and Physical Convergence
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gnedin, Nickolay Y.
2016-04-01
In this paper I show that simulations of reionization performed under the Cosmic Reionization On Computers (CROC) project do converge in space and mass, albeit rather slowly. A fully converged solution (for a given star formation and feedback model) can be determined at a level of precision of about 20%, but such a solution is useless in practice, since achieving it in production-grade simulations would require a large set of runs at various mass and spatial resolutions, and computational resources for such an undertaking are not yet readily available. In order to make progress in the interim, I introduce a weak convergence correction factor in the star formation recipe, which allows one to approximate the fully converged solution with finite resolution simulations. The accuracy of weakly converged simulations approaches a comparable, ~20% level of precision for star formation histories of individual galactic halos and other galactic properties that are directly related to star formation rates, like stellar masses and metallicities. Yet other properties of model galaxies, for example, their HI masses, are recovered in the weakly converged runs only within a factor of two.
Framework for architecture-independent run-time reconfigurable applications
NASA Astrophysics Data System (ADS)
Lehn, David I.; Hudson, Rhett D.; Athanas, Peter M.
2000-10-01
Configurable Computing Machines (CCMs) have emerged as a technology with the computational benefits of custom ASICs as well as the flexibility and reconfigurability of general-purpose microprocessors. Significant effort from the research community has focused on techniques to move this reconfigurability from a rapid application development tool to a run-time tool. This requires the ability to change the hardware design while the application is executing and is known as Run-Time Reconfiguration (RTR). Widespread acceptance of run-time reconfigurable custom computing depends upon the existence of high-level automated design tools. Such tools must reduce the designer's effort to port applications between different platforms as the architecture, hardware, and software evolve. A Java implementation of a high-level application framework, called Janus, is presented here. In this environment, developers create Java classes that describe the structural behavior of an application. The framework allows hardware and software modules to be freely mixed and interchanged. A compilation phase of the development process analyzes the structure of the application and adapts it to the target platform. Janus is capable of structuring the run-time behavior of an application to take advantage of the memory and computational resources available.
Progress in Machine Learning Studies for the CMS Computing Infrastructure
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bonacorsi, Daniele; Kuznetsov, Valentin; Magini, Nicolo
2017-12-06
Here, computing systems for LHC experiments developed together with Grids worldwide. While a complete description of the original Grid-based infrastructure and services for LHC experiments and its recent evolutions can be found elsewhere, it is worth mentioning here the scale of the computing resources needed to fulfill the needs of LHC experiments in Run-1 and Run-2 so far.
Accelerating epistasis analysis in human genetics with consumer graphics hardware.
Sinnott-Armstrong, Nicholas A; Greene, Casey S; Cancare, Fabio; Moore, Jason H
2009-07-24
Human geneticists are now capable of measuring more than one million DNA sequence variations from across the human genome. The new challenge is to develop computationally feasible methods capable of analyzing these data for associations with common human disease, particularly in the context of epistasis. Epistasis describes the situation where multiple genes interact in a complex non-linear manner to determine an individual's disease risk and is thought to be ubiquitous for common diseases. Multifactor Dimensionality Reduction (MDR) is an algorithm capable of detecting epistasis. An exhaustive analysis with MDR is often computationally expensive, particularly for high order interactions. This challenge has previously been met with parallel computation and expensive hardware. The option we examine here exploits commodity hardware designed for computer graphics. In modern computers Graphics Processing Units (GPUs) have more memory bandwidth and computational capability than Central Processing Units (CPUs) and are well suited to this problem. Advances in the video game industry have led to an economy of scale creating a situation where these powerful components are readily available at very low cost. Here we implement and evaluate the performance of the MDR algorithm on GPUs. Of primary interest are the time required for an epistasis analysis and the price to performance ratio of available solutions. We found that using MDR on GPUs consistently increased performance per machine over both a feature rich Java software package and a C++ cluster implementation. The performance of a GPU workstation running a GPU implementation reduces computation time by a factor of 160 compared to an 8-core workstation running the Java implementation on CPUs. This GPU workstation performs similarly to 150 cores running an optimized C++ implementation on a Beowulf cluster. Furthermore this GPU system provides extremely cost effective performance while leaving the CPU available for other tasks. The GPU workstation containing three GPUs costs $2000 while obtaining similar performance on a Beowulf cluster requires 150 CPU cores which, including the added infrastructure and support cost of the cluster system, cost approximately $82,500. Graphics hardware based computing provides a cost effective means to perform genetic analysis of epistasis using MDR on large datasets without the infrastructure of a computing cluster.
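The core MDR step for a single pair of SNPs can be sketched in a few lines: count cases and controls in each of the nine two-locus genotype cells, label each cell high- or low-risk by its case:control ratio, and score the resulting rule. The GPU implementation described above parallelizes this evaluation across very many SNP pairs; the sketch below is a CPU-only illustration with random toy genotypes, not the authors' code.

```python
import numpy as np

def mdr_two_locus(geno_a, geno_b, status, threshold=1.0):
    """Classify each 3x3 genotype cell as high/low risk by its case:control
    ratio (the core MDR step) and return the balanced accuracy of that rule."""
    cases = np.zeros((3, 3))
    controls = np.zeros((3, 3))
    np.add.at(cases, (geno_a[status == 1], geno_b[status == 1]), 1)
    np.add.at(controls, (geno_a[status == 0], geno_b[status == 0]), 1)
    high_risk = cases > threshold * controls          # cell-wise risk label
    predicted = high_risk[geno_a, geno_b].astype(int)
    tp = np.sum((predicted == 1) & (status == 1))
    tn = np.sum((predicted == 0) & (status == 0))
    sens = tp / max(np.sum(status == 1), 1)
    spec = tn / max(np.sum(status == 0), 1)
    return 0.5 * (sens + spec)

rng = np.random.default_rng(0)
a, b = rng.integers(0, 3, 1000), rng.integers(0, 3, 1000)
y = rng.integers(0, 2, 1000)
print("balanced accuracy:", round(mdr_two_locus(a, b, y), 3))
```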
Multitasking the code ARC3D. [for computational fluid dynamics
NASA Technical Reports Server (NTRS)
Barton, John T.; Hsiung, Christopher C.
1986-01-01
The CRAY multitasking system was developed in order to utilize all four processors and sharply reduce the wall clock run time. This paper describes the techniques used to modify the computational fluid dynamics code ARC3D for this run and analyzes the achieved speedup. The ARC3D code solves either the Euler or thin-layer N-S equations using an implicit approximate factorization scheme. Results indicate that multitask processing can be used to achieve wall clock speedup factors of over three times, depending on the nature of the program code being used. Multitasking appears to be particularly advantageous for large-memory problems running on multiple CPU computers.
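A wall-clock speedup of just over three on a four-processor CRAY is consistent with a highly parallel code under Amdahl's law; the back-of-the-envelope sketch below (not taken from the paper) shows the relationship between a code's parallel fraction and the achievable speedup.

```python
def amdahl_speedup(parallel_fraction, n_cpus):
    """Ideal speedup for a code whose parallel fraction runs on n_cpus."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_cpus)

def required_fraction(target_speedup, n_cpus):
    """Parallel fraction needed to reach target_speedup on n_cpus."""
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / n_cpus)

print(amdahl_speedup(0.95, 4))        # ~3.48x on four CPUs
print(required_fraction(3.0, 4))      # ~0.89 parallel fraction for 3x
```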
NASA Technical Reports Server (NTRS)
Eberhardt, D. S.; Baganoff, D.; Stevens, K.
1984-01-01
Implicit approximate-factored algorithms have certain properties that are suitable for parallel processing. A particular computational fluid dynamics (CFD) code, using this algorithm, is mapped onto a multiple-instruction/multiple-data-stream (MIMD) computer architecture. An explanation of this mapping procedure is presented, as well as some of the difficulties encountered when trying to run the code concurrently. Timing results are given for runs on the Ames Research Center's MIMD test facility which consists of two VAX 11/780's with a common MA780 multi-ported memory. Speedups exceeding 1.9 for characteristic CFD runs were indicated by the timing results.
Determinant Computation on the GPU using the Condensation Method
NASA Astrophysics Data System (ADS)
Anisul Haque, Sardar; Moreno Maza, Marc
2012-02-01
We report on a GPU implementation of the condensation method designed by Abdelmalek Salem and Kouachi Said for computing the determinant of a matrix. We consider two types of coefficients: modular integers and floating-point numbers. We evaluate the performance of our code by measuring its effective bandwidth and argue that it is numerically stable in the floating-point case. In addition, we compare our code with serial implementations of determinant computation from well-known mathematical packages. Our results suggest that a GPU implementation of the condensation method has a large potential for improving those packages in terms of running time and numerical stability.
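For readers unfamiliar with the condensation idea, the following CPU-only sketch implements a Chiò-style condensation for floating-point matrices, with a row swap to dodge zero pivots. It omits the modular-integer case and all GPU-specific details of the implementation reported above, so treat it as a schematic reference rather than the authors' kernel.

```python
import numpy as np

def condensation_det(a, eps=1e-12):
    """Determinant via repeated Chiò condensation (floating point)."""
    a = np.array(a, dtype=float)
    sign = 1.0
    det_scale = 1.0
    while a.shape[0] > 1:
        n = a.shape[0]
        # if the pivot is (near) zero, swap in a row with a nonzero leading entry
        if abs(a[0, 0]) < eps:
            nz = np.flatnonzero(np.abs(a[:, 0]) > eps)
            if nz.size == 0:
                return 0.0              # whole first column is zero
            a[[0, nz[0]]] = a[[nz[0], 0]]
            sign = -sign
        pivot = a[0, 0]
        # condensation step: (n-1)x(n-1) matrix of 2x2 minors against the pivot
        b = pivot * a[1:, 1:] - np.outer(a[1:, 0], a[0, 1:])
        det_scale *= pivot ** (n - 2)
        a = b
    return sign * a[0, 0] / det_scale

m = np.array([[2.0, 1.0, 3.0], [0.0, 4.0, 5.0], [1.0, 0.0, 6.0]])
print(condensation_det(m), np.linalg.det(m))   # both should give 41
```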
Design and Development of a Run-Time Monitor for Multi-Core Architectures in Cloud Computing
Kang, Mikyung; Kang, Dong-In; Crago, Stephen P.; Park, Gyung-Leen; Lee, Junghoon
2011-01-01
Cloud computing is a new information technology trend that moves computing and data away from desktops and portable PCs into large data centers. The basic principle of cloud computing is to deliver applications as services over the Internet as well as infrastructure. A cloud is a type of parallel and distributed system consisting of a collection of inter-connected and virtualized computers that are dynamically provisioned and presented as one or more unified computing resources. The large-scale distributed applications on a cloud require adaptive service-based software, which has the capability of monitoring system status changes, analyzing the monitored information, and adapting its service configuration while considering tradeoffs among multiple QoS features simultaneously. In this paper, we design and develop a Run-Time Monitor (RTM) which is a system software to monitor the application behavior at run-time, analyze the collected information, and optimize cloud computing resources for multi-core architectures. RTM monitors application software through library instrumentation as well as the underlying hardware through a performance counter, optimizing its computing configuration based on the analyzed data. PMID:22163811
Compressed quantum computation using a remote five-qubit quantum computer
NASA Astrophysics Data System (ADS)
Hebenstreit, M.; Alsina, D.; Latorre, J. I.; Kraus, B.
2017-05-01
The notion of compressed quantum computation is employed to simulate the Ising interaction of a one-dimensional chain consisting of n qubits using the universal IBM cloud quantum computer running on log2(n ) qubits. The external field parameter that controls the quantum phase transition of this model translates into particular settings of the quantum gates that generate the circuit. We measure the magnetization, which displays the quantum phase transition, on a two-qubit system, which simulates a four-qubit Ising chain, and show its agreement with the theoretical prediction within a certain error. We also discuss the relevant point of how to assess errors when using a cloud quantum computer with a limited amount of runs. As a solution, we propose to use validating circuits, that is, to run independent controlled quantum circuits of similar complexity to the circuit of interest.
The viability of ADVANTG deterministic method for synthetic radiography generation
NASA Astrophysics Data System (ADS)
Bingham, Andrew; Lee, Hyoung K.
2018-07-01
Fast simulation techniques for generating synthetic radiographic images at high resolution are helpful when new radiation imaging systems are designed. However, the standard stochastic approach requires lengthy run times and yields poorer statistics at higher resolution. The viability of a deterministic approach to synthetic radiography image generation was therefore investigated, with the aim of quantifying the reduction in computational time relative to the stochastic method. ADVANTG was compared to MCNP in multiple scenarios, including a small radiography system prototype, to simulate high-resolution radiography images. Using the ADVANTG deterministic code to simulate radiography images, the computational time was found to decrease by a factor of 10 to 13 compared to the MCNP stochastic approach while retaining image quality.
Improved Gaussian Beam-Scattering Algorithm
NASA Technical Reports Server (NTRS)
Lock, James A.
1995-01-01
The localized model of the beam-shape coefficients for Gaussian beam-scattering theory by a spherical particle provides a great simplification in the numerical implementation of the theory. We derive an alternative form for the localized coefficients that is more convenient for computer computations and that provides physical insight into the details of the scattering process. We construct a FORTRAN program for Gaussian beam scattering with the localized model and compare its computer run time on a personal computer with that of a traditional Mie scattering program and with three other published methods for computing Gaussian beam scattering. We show that the analytical form of the beam-shape coefficients makes evident the fact that the excitation rate of morphology-dependent resonances is greatly enhanced for far off-axis incidence of the Gaussian beam.
Self-Scheduling Parallel Methods for Multiple Serial Codes with Application to WOPWOP
NASA Technical Reports Server (NTRS)
Long, Lyle N.; Brentner, Kenneth S.
2000-01-01
This paper presents a scheme for efficiently running a large number of serial jobs on parallel computers. Two examples are given of computer programs that run relatively quickly, but often they must be run numerous times to obtain all the results needed. It is very common in science and engineering to have codes that are not massive computing challenges in themselves, but due to the number of instances that must be run, they do become large-scale computing problems. The two examples given here represent common problems in aerospace engineering: aerodynamic panel methods and aeroacoustic integral methods. The first example simply solves many systems of linear equations. This is representative of an aerodynamic panel code where someone would like to solve for numerous angles of attack. The complete code for this first example is included in the appendix so that it can be readily used by others as a template. The second example is an aeroacoustics code (WOPWOP) that solves the Ffowcs Williams Hawkings equation to predict the far-field sound due to rotating blades. In this example, one quite often needs to compute the sound at numerous observer locations, hence parallelization is utilized to automate the noise computation for a large number of observers.
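A minimal self-scheduling pattern can be sketched with Python's multiprocessing: idle workers pull the next independent serial job as they finish, here many right-hand sides of a linear system standing in for different angles of attack. The paper's template is Fortran/MPI-based, so this is only an analogous illustration with invented problem sizes.

```python
import numpy as np
from multiprocessing import Pool

def solve_case(args):
    """One serial job: solve A x = b for a single 'angle of attack' case."""
    case_id, a, b = args
    return case_id, np.linalg.solve(a, b)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((200, 200)) + 200 * np.eye(200)   # well-conditioned system
    cases = [(i, a, rng.standard_normal(200)) for i in range(32)]
    with Pool(processes=4) as pool:
        # imap_unordered hands the next case to whichever worker becomes free
        for case_id, x in pool.imap_unordered(solve_case, cases):
            print(f"case {case_id}: |x| = {np.linalg.norm(x):.3f}")
```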
Automated selection of brain regions for real-time fMRI brain-computer interfaces
NASA Astrophysics Data System (ADS)
Lührs, Michael; Sorger, Bettina; Goebel, Rainer; Esposito, Fabrizio
2017-02-01
Objective. Brain-computer interfaces (BCIs) implemented with real-time functional magnetic resonance imaging (rt-fMRI) use fMRI time-courses from predefined regions of interest (ROIs). To reach best performances, localizer experiments and on-site expert supervision are required for ROI definition. To automate this step, we developed two unsupervised computational techniques based on the general linear model (GLM) and independent component analysis (ICA) of rt-fMRI data, and compared their performances on a communication BCI. Approach. 3 T fMRI data of six volunteers were re-analyzed in simulated real-time. During a localizer run, participants performed three mental tasks following visual cues. During two communication runs, a letter-spelling display guided the subjects to freely encode letters by performing one of the mental tasks with a specific timing. GLM- and ICA-based procedures were used to decode each letter, respectively using compact ROIs and whole-brain distributed spatio-temporal patterns of fMRI activity, automatically defined from subject-specific or group-level maps. Main results. Letter-decoding performances were comparable to supervised methods. In combination with a similarity-based criterion, GLM- and ICA-based approaches successfully decoded more than 80% (average) of the letters. Subject-specific maps yielded optimal performances. Significance. Automated solutions for ROI selection may help accelerating the translation of rt-fMRI BCIs from research to clinical applications.
Study of the TRAC Airfoil Table Computational System
NASA Technical Reports Server (NTRS)
Hu, Hong
1999-01-01
The report documents the study of the application of the TRAC airfoil table computational package (TRACFOIL) to the prediction of 2D airfoil force and moment data over a wide range of angle of attack and Mach number. TRACFOIL generates the standard C-81 airfoil table for input into rotorcraft comprehensive codes such as CAMRAD. The existing TRACFOIL computer package is successfully modified to run on Digital Alpha workstations and on Cray C90 supercomputers. Step-by-step instructions for using the package on both computer platforms are provided. The newer version of TRACFOIL is applied to two airfoil sections. The C-81 data obtained using the TRACFOIL method are compared with wind-tunnel data, and results are presented.
Counterfactual quantum computation through quantum interrogation
NASA Astrophysics Data System (ADS)
Hosten, Onur; Rakher, Matthew T.; Barreiro, Julio T.; Peters, Nicholas A.; Kwiat, Paul G.
2006-02-01
The logic underlying the coherent nature of quantum information processing often deviates from intuitive reasoning, leading to surprising effects. Counterfactual computation constitutes a striking example: the potential outcome of a quantum computation can be inferred, even if the computer is not run. Relying on similar arguments to interaction-free measurements (or quantum interrogation), counterfactual computation is accomplished by putting the computer in a superposition of `running' and `not running' states, and then interfering the two histories. Conditional on the as-yet-unknown outcome of the computation, it is sometimes possible to counterfactually infer information about the solution. Here we demonstrate counterfactual computation, implementing Grover's search algorithm with an all-optical approach. It was believed that the overall probability of such counterfactual inference is intrinsically limited, so that it could not perform better on average than random guesses. However, using a novel `chained' version of the quantum Zeno effect, we show how to boost the counterfactual inference probability to unity, thereby beating the random guessing limit. Our methods are general and apply to any physical system, as illustrated by a discussion of trapped-ion systems. Finally, we briefly show that, in certain circumstances, counterfactual computation can eliminate errors induced by decoherence.
4273π: Bioinformatics education on low cost ARM hardware
Barker, Daniel; Ferrier, David Ek; Holland, Peter Wh; Mitchell, John Bo; Plaisier, Heleen; Ritchie, Michael G; Smart, Steven D
2013-08-12
Teaching bioinformatics at universities is complicated by typical computer classroom settings. As well as running software locally and online, students should gain experience of systems administration. For a future career in biology or bioinformatics, the installation of software is a useful skill. We propose that this may be taught by running the course on GNU/Linux running on inexpensive Raspberry Pi computer hardware, for which students may be granted full administrator access. We release 4273π, an operating system image for Raspberry Pi based on Raspbian Linux. This includes minor customisations for classroom use and includes our Open Access bioinformatics course, 4273π Bioinformatics for Biologists. This is based on the final-year undergraduate module BL4273, run on Raspberry Pi computers at the University of St Andrews, Semester 1, academic year 2012-2013. 4273π is a means to teach bioinformatics, including systems administration tasks, to undergraduates at low cost. PMID:23937194
NASA Astrophysics Data System (ADS)
Wahl, N.; Hennig, P.; Wieser, H. P.; Bangert, M.
2017-07-01
The sensitivity of intensity-modulated proton therapy (IMPT) treatment plans to uncertainties can be quantified and mitigated with robust/min-max and stochastic/probabilistic treatment analysis and optimization techniques. Those methods usually rely on sparse random, importance, or worst-case sampling. Inevitably, this imposes a trade-off between computational speed and accuracy of the uncertainty propagation. Here, we investigate analytical probabilistic modeling (APM) as an alternative for uncertainty propagation and minimization in IMPT that does not rely on scenario sampling. APM propagates probability distributions over range and setup uncertainties via a Gaussian pencil-beam approximation into moments of the probability distributions over the resulting dose in closed form. It supports arbitrary correlation models and allows for efficient incorporation of fractionation effects regarding random and systematic errors. We evaluate the trade-off between run-time and accuracy of APM uncertainty computations on three patient datasets. Results are compared against reference computations facilitating importance and random sampling. Two approximation techniques to accelerate uncertainty propagation and minimization based on probabilistic treatment plan optimization are presented. Runtimes are measured on CPU and GPU platforms, dosimetric accuracy is quantified in comparison to a sampling-based benchmark (5000 random samples). APM accurately propagates range and setup uncertainties into dose uncertainties at competitive run-times (GPU ≤ 5 min). The resulting standard deviation (expectation value) of dose shows average global γ(3%/3 mm) pass rates between 94.2% and 99.9% (98.4% and 100.0%). All investigated importance sampling strategies provided less accuracy at higher run-times considering only a single fraction. Considering fractionation, APM uncertainty propagation and treatment plan optimization was proven to be possible at constant time complexity, while run-times of sampling-based computations are linear in the number of fractions. Using sum sampling within APM, uncertainty propagation can only be accelerated at the cost of reduced accuracy in variance calculations. For probabilistic plan optimization, we were able to approximate the necessary pre-computations within seconds, yielding treatment plans of similar quality as gained from exact uncertainty propagation. APM is suited to enhance the trade-off between speed and accuracy in uncertainty propagation and probabilistic treatment plan optimization, especially in the context of fractionation. This brings fully-fledged APM computations within reach of clinical application.
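The closed-form flavour of APM can be seen in one dimension: if the pencil-beam dose profile is Gaussian and the setup error is Gaussian, the expected dose is again a Gaussian whose width adds the two in quadrature. The sketch below (widths, grid, and sample count are invented) checks that closed form against plain random sampling; it is a toy of the underlying idea, not of the full APM machinery for range errors, correlations, or fractionation.

```python
import numpy as np

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-30.0, 30.0, 601)       # lateral position (mm)
sigma_dose, sigma_setup = 5.0, 3.0      # pencil-beam width, setup uncertainty (mm)

# closed-form expectation: convolution of two Gaussians -> widths add in quadrature
expected_analytic = gauss(x, 0.0, np.hypot(sigma_dose, sigma_setup))

# Monte Carlo check: average the shifted profile over sampled setup errors
rng = np.random.default_rng(0)
shifts = rng.normal(0.0, sigma_setup, size=5000)
expected_mc = np.mean([gauss(x, s, sigma_dose) for s in shifts], axis=0)

print("max |analytic - MC|:", float(np.max(np.abs(expected_analytic - expected_mc))))
```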
Computational Investigation of Fluidic Counterflow Thrust Vectoring
NASA Technical Reports Server (NTRS)
Hunter, Craig A.; Deere, Karen A.
1999-01-01
A computational study of fluidic counterflow thrust vectoring has been conducted. Two-dimensional numerical simulations were run using the computational fluid dynamics code PAB3D with two-equation turbulence closure and linear Reynolds stress modeling. For validation, computational results were compared to experimental data obtained at the NASA Langley Jet Exit Test Facility. In general, computational results were in good agreement with experimental performance data, indicating that efficient thrust vectoring can be obtained with low secondary flow requirements (less than 1% of the primary flow). An examination of the computational flowfield has revealed new details about the generation of a countercurrent shear layer, its relation to secondary suction, and its role in thrust vectoring. In addition to providing new information about the physics of counterflow thrust vectoring, this work appears to be the first documented attempt to simulate the counterflow thrust vectoring problem using computational fluid dynamics.
Ó Conchúir, Shane; Barlow, Kyle A; Pache, Roland A; Ollikainen, Noah; Kundert, Kale; O'Meara, Matthew J; Smith, Colin A; Kortemme, Tanja
2015-01-01
The development and validation of computational macromolecular modeling and design methods depend on suitable benchmark datasets and informative metrics for comparing protocols. In addition, if a method is intended to be adopted broadly in diverse biological applications, there needs to be information on appropriate parameters for each protocol, as well as metrics describing the expected accuracy compared to experimental data. In certain disciplines, there exist established benchmarks and public resources where experts in a particular methodology are encouraged to supply their most efficient implementation of each particular benchmark. We aim to provide such a resource for protocols in macromolecular modeling and design. We present a freely accessible web resource (https://kortemmelab.ucsf.edu/benchmarks) to guide the development of protocols for protein modeling and design. The site provides benchmark datasets and metrics to compare the performance of a variety of modeling protocols using different computational sampling methods and energy functions, providing a "best practice" set of parameters for each method. Each benchmark has an associated downloadable benchmark capture archive containing the input files, analysis scripts, and tutorials for running the benchmark. The captures may be run with any suitable modeling method; we supply command lines for running the benchmarks using the Rosetta software suite. We have compiled initial benchmarks for the resource spanning three key areas: prediction of energetic effects of mutations, protein design, and protein structure prediction, each with associated state-of-the-art modeling protocols. With the help of the wider macromolecular modeling community, we hope to expand the variety of benchmarks included on the website and continue to evaluate new iterations of current methods as they become available.
Statistical fingerprinting for malware detection and classification
Prowell, Stacy J.; Rathgeb, Christopher T.
2015-09-15
A system detects malware in a computing architecture with an unknown pedigree. The system includes a first computing device having a known pedigree and operating free of malware. The first computing device executes a series of instrumented functions that, when executed, provide a statistical baseline that is representative of the time it takes the software application to run on a computing device having a known pedigree. A second computing device executes a second series of instrumented functions that, when executed, provides an actual time that is representative of the time the known software application runs on the second computing device. The system detects malware when there is a difference in execution times between the first and the second computing devices.
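The timing-baseline idea can be sketched as follows: run a set of instrumented functions repeatedly on a trusted machine to obtain a mean and spread, then flag a machine whose observed execution times deviate strongly from that baseline. The workload, repeat counts, and z-score threshold below are placeholders chosen for illustration, not the patented system's actual instrumentation.

```python
import time
import statistics

def instrumented_workload():
    """Stand-in for an instrumented function with a repeatable cost."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total

def time_runs(func, repeats=20):
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        times.append(time.perf_counter() - start)
    return times

baseline = time_runs(instrumented_workload)        # trusted, known-pedigree machine
mu, sigma = statistics.mean(baseline), statistics.stdev(baseline)

observed = statistics.mean(time_runs(instrumented_workload, repeats=5))  # target machine
z = (observed - mu) / sigma if sigma > 0 else 0.0
print("suspicious timing deviation" if abs(z) > 3.0 else "within baseline", round(z, 2))
```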
Kopelman, Naama M; Mayzel, Jonathan; Jakobsson, Mattias; Rosenberg, Noah A; Mayrose, Itay
2015-09-01
The identification of the genetic structure of populations from multilocus genotype data has become a central component of modern population-genetic data analysis. Application of model-based clustering programs often entails a number of steps, in which the user considers different modelling assumptions, compares results across different predetermined values of the number of assumed clusters (a parameter typically denoted K), examines multiple independent runs for each fixed value of K, and distinguishes among runs belonging to substantially distinct clustering solutions. Here, we present Clumpak (Cluster Markov Packager Across K), a method that automates the postprocessing of results of model-based population structure analyses. For analysing multiple independent runs at a single K value, Clumpak identifies sets of highly similar runs, separating distinct groups of runs that represent distinct modes in the space of possible solutions. This procedure, which generates a consensus solution for each distinct mode, is performed by the use of a Markov clustering algorithm that relies on a similarity matrix between replicate runs, as computed by the software Clumpp. Next, Clumpak identifies an optimal alignment of inferred clusters across different values of K, extending a similar approach implemented for a fixed K in Clumpp and simplifying the comparison of clustering results across different K values. Clumpak incorporates additional features, such as implementations of methods for choosing K and comparing solutions obtained by different programs, models, or data subsets. Clumpak, available at http://clumpak.tau.ac.il, simplifies the use of model-based analyses of population structure in population genetics and molecular ecology. © 2015 John Wiley & Sons Ltd.
HEP Computing Tools, Grid and Supercomputers for Genome Sequencing Studies
NASA Astrophysics Data System (ADS)
De, K.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Novikov, A.; Poyda, A.; Tertychnyy, I.; Wenaus, T.
2017-10-01
PanDA - Production and Distributed Analysis Workload Management System has been developed to address the data processing and analysis challenges of the ATLAS experiment at the LHC. Recently PanDA has been extended to run HEP scientific applications on Leadership Class Facilities and supercomputers. The success of the projects to use PanDA beyond HEP and Grid has drawn attention from other compute-intensive sciences such as bioinformatics. Recent advances in Next Generation Genome Sequencing (NGS) technology have led to increasing streams of sequencing data that need to be processed, analysed and made available for bioinformaticians worldwide. Analysis of genome sequencing data using the popular software pipeline PALEOMIX can take a month even when running on a powerful computer resource. In this paper we describe the adaptation of the PALEOMIX pipeline to run on a distributed computing environment powered by PanDA. To run the pipeline we split input files into chunks, which are processed separately on different nodes as separate inputs for PALEOMIX, and finally merge the output files; this is very similar to what ATLAS does to process and simulate data. We dramatically decreased the total walltime thanks to automated job (re)submission and brokering within PanDA. Using software tools developed initially for HEP and Grid can reduce payload execution time for mammoth DNA samples from weeks to days.
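The split-process-merge pattern described above is generic scatter-gather; the sketch below chunks a record-oriented input, processes the chunks independently, and merges the partial outputs. It uses local multiprocessing with placeholder records and a trivial per-chunk transformation, whereas the production setup distributes the chunks as PanDA jobs running PALEOMIX.

```python
from multiprocessing import Pool

def split_records(lines, chunk_size):
    """Split an input dataset (one record per entry) into independent chunks."""
    for i in range(0, len(lines), chunk_size):
        yield lines[i:i + chunk_size]

def process_chunk(chunk):
    """Placeholder for running the pipeline on one chunk on one node."""
    return [record.upper() for record in chunk]      # stand-in transformation

if __name__ == "__main__":
    records = [f"read_{i}" for i in range(1000)]     # stand-in for sequencing reads
    chunks = list(split_records(records, chunk_size=100))
    with Pool(processes=4) as pool:
        partial_outputs = pool.map(process_chunk, chunks)
    merged = [row for part in partial_outputs for row in part]   # merge step
    print(len(chunks), "chunks ->", len(merged), "merged records")
```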
Fuller, Joel T; Buckley, Jonathan D; Tsiros, Margarita D; Brown, Nicholas A T; Thewlis, Dominic
2016-10-01
Minimalist shoes have been suggested as a way to alter running biomechanics to improve running performance and reduce injuries. However, to date, researchers have only considered the effect of minimalist shoes at slow running speeds. To determine if runners change foot-strike pattern and alter the distribution of mechanical work at the knee and ankle joints when running at a fast speed in minimalist shoes compared with conventional running shoes. Crossover study. Research laboratory. Twenty-six trained runners (age = 30.0 ± 7.9 years [age range, 18-40 years], height = 1.79 ± 0.06 m, mass = 75.3 ± 8.2 kg, weekly training distance = 27 ± 15 km) who ran with a habitual rearfoot foot-strike pattern and had no experience running in minimalist shoes. Participants completed overground running trials at 18 km/h in minimalist and conventional shoes. Sagittal-plane kinematics and joint work at the knee and ankle joints were computed using 3-dimensional kinematic and ground reaction force data. Foot-strike pattern was classified as rearfoot, midfoot, or forefoot strike based on strike index and ankle angle at initial contact. We observed no difference in foot-strike classification between shoes (χ²(1) = 2.29, P = .13). Ankle angle at initial contact was less (2.46° versus 7.43°; t(25) = 3.34, P = .003) and strike index was greater (35.97% versus 29.04%; t(25) = 2.38, P = .03) when running in minimalist shoes compared with conventional shoes. We observed greater negative (52.87 J versus 42.46 J; t(24) = 2.29, P = .03) and positive work (68.91 J versus 59.08 J; t(24) = 2.65, P = .01) at the ankle but less negative (59.01 J versus 67.02 J; t(24) = 2.25, P = .03) and positive work (40.37 J versus 47.09 J; t(24) = 2.11, P = .046) at the knee with minimalist shoes compared with conventional shoes. Running in minimalist shoes at a fast speed caused a redistribution of work from the knee to the ankle joint. This finding suggests that runners changing from conventional to minimalist shoes for short-distance races could be at an increased risk of ankle and calf injuries but a reduced risk of knee injuries.
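Joint work of the kind reported here is normally obtained by integrating joint power, the product of the net joint moment and the joint angular velocity, over stance and splitting it into its positive and negative parts. The sketch below applies that definition to synthetic moment and angular-velocity traces; the numbers are invented and the study's actual inverse-dynamics pipeline is not shown in the abstract.

```python
import numpy as np

def joint_work(moment, ang_vel, dt):
    """Positive and negative joint work (J) from joint moment (N·m) and
    angular velocity (rad/s) sampled at a fixed time step dt (s)."""
    power = moment * ang_vel                       # instantaneous joint power (W)
    positive = float(np.sum(power[power > 0]) * dt)
    negative = float(np.sum(power[power < 0]) * dt)
    return positive, negative

t = np.linspace(0.0, 0.2, 201)                     # ~200 ms of stance phase
moment = 150.0 * np.sin(np.pi * t / 0.2)           # synthetic ankle moment
ang_vel = -4.0 * np.cos(np.pi * t / 0.2)           # synthetic angular velocity
w_pos, w_neg = joint_work(moment, ang_vel, dt=t[1] - t[0])
print(f"positive work {w_pos:.1f} J, negative work {w_neg:.1f} J")
```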
NASA Technical Reports Server (NTRS)
White, P. R.; Little, R. R.
1985-01-01
A research effort was undertaken to develop personal computer based software for vibrational analysis. The software was developed to analytically determine the natural frequencies and mode shapes for the uncoupled lateral vibrations of the blade and counterweight assemblies used in a single bladed wind turbine. The uncoupled vibration analysis was performed in both the flapwise and chordwise directions for static rotor conditions. The effects of rotation on the uncoupled flapwise vibration of the blade and counterweight assemblies were evaluated for various rotor speeds up to 90 rpm. The theory, used in the vibration analysis codes, is based on a lumped mass formulation for the blade and counterweight assemblies. The codes are general so that other designs can be readily analyzed. The input for the codes is generally interactive to facilitate usage. The output of the codes is both tabular and graphical. Listings of the codes are provided. Predicted natural frequencies of the first several modes show reasonable agreement with experimental results. The analysis codes were originally developed on a DEC PDP 11/34 minicomputer and then downloaded and modified to run on an ITT XTRA personal computer. Studies conducted to evaluate the efficiency of running the programs on a personal computer as compared with the minicomputer indicated that, with the proper combination of hardware and software options, the efficiency of using a personal computer exceeds that of a minicomputer.
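In a lumped-mass formulation, the uncoupled natural frequencies follow from the generalized eigenvalue problem K φ = ω² M φ for the assembled stiffness and mass matrices. A minimal scipy sketch for a generic three-mass cantilever chain is shown below; the masses and stiffnesses are illustrative values, not the blade and counterweight properties from the report.

```python
import numpy as np
from scipy.linalg import eigh

# three lumped masses on a cantilevered spring chain (illustrative values)
m = np.diag([12.0, 8.0, 5.0])                 # kg
k1, k2, k3 = 4.0e5, 3.0e5, 2.0e5              # N/m, springs from root to tip
k = np.array([[k1 + k2, -k2,      0.0],
              [-k2,      k2 + k3, -k3],
              [0.0,     -k3,       k3]])

# generalized eigenproblem K phi = w^2 M phi
w2, modes = eigh(k, m)
freqs_hz = np.sqrt(w2) / (2.0 * np.pi)
print("natural frequencies (Hz):", np.round(freqs_hz, 2))
print("first mode shape:", np.round(modes[:, 0], 3))
```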
Running R Statistical Computing Environment Software on the Peregrine
R is a collaborative project for statistical computing that supports the development of new statistical methodologies and enjoys a large user base. The CRAN task view for High Performance Computing describes packages and programming paradigms that better leverage modern HPC systems.
High Resolution Nature Runs and the Big Data Challenge
NASA Technical Reports Server (NTRS)
Webster, W. Phillip; Duffy, Daniel Q.
2015-01-01
NASA's Global Modeling and Assimilation Office at Goddard Space Flight Center is undertaking a series of very computationally intensive Nature Runs and a downscaled reanalysis. The nature runs use the GEOS-5 as an Atmospheric General Circulation Model (AGCM) while the reanalysis uses the GEOS-5 in Data Assimilation mode. This paper will present computational challenges from three runs, two of which are AGCM and one is downscaled reanalysis using the full DAS. The nature runs will be completed at two surface grid resolutions, 7 and 3 kilometers, and 72 vertical levels. The 7 km run spanned 2 years (2005-2006) and produced 4 PB of data, while the 3 km run will span one year and generate 4 PB of data. The downscaled reanalysis (MERRA-2, the Modern-Era Retrospective analysis for Research and Applications, Version 2) will cover 15 years and generate 1 PB of data. In our efforts to address the big data challenges of climate science, we are moving toward a notion of Climate Analytics-as-a-Service (CAaaS), a specialization of the concept of business process-as-a-service that is an evolving extension of IaaS, PaaS, and SaaS enabled by cloud computing. In this presentation, we will describe two projects that demonstrate this shift. MERRA Analytic Services (MERRA/AS) is an example of cloud-enabled CAaaS. MERRA/AS enables MapReduce analytics over the MERRA reanalysis data collection by bringing together high-performance computing, scalable data management, and a domain-specific climate data services API. NASA's High-Performance Science Cloud (HPSC) is an example of the type of compute-storage fabric required to support CAaaS. The HPSC comprises a high-speed InfiniBand network, high-performance file systems and object storage, and virtual system environments tailored for data-intensive science applications. These technologies are providing a new tier in the data and analytic services stack that helps connect earthbound, enterprise-level data and computational resources to new customers and new mobility-driven applications and modes of work. In our experience, CAaaS lowers the barriers and risk to organizational change, fosters innovation and experimentation, and provides the agility required to meet our customers' increasing and changing needs.
Mount, D W; Conrad, B
1986-01-01
We have previously described programs for a variety of types of sequence analysis (1-4). These programs have now been integrated into a single package. They are written in the standard C programming language and run on virtually any computer system with a C compiler, such as the IBM/PC and other computers running under the MS/DOS and UNIX operating systems. The programs are widely distributed and may be obtained from the authors as described below. PMID:3753780
NASA Astrophysics Data System (ADS)
Decyk, Viktor K.; Dauger, Dean E.
We have constructed a parallel cluster consisting of Apple Macintosh G4 computers running both Classic Mac OS as well as the Unix-based Mac OS X, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. Unlike other Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.
Proteinortho: detection of (co-)orthologs in large-scale analysis.
Lechner, Marcus; Findeiss, Sven; Steiner, Lydia; Marz, Manja; Stadler, Peter F; Prohaska, Sonja J
2011-04-28
Orthology analysis is an important part of data analysis in many areas of bioinformatics such as comparative genomics and molecular phylogenetics. The ever-increasing flood of sequence data, and hence the rapidly increasing number of genomes that can be compared simultaneously, calls for efficient software tools as brute-force approaches with quadratic memory requirements become infeasible in practice. The rapid pace at which new data become available, furthermore, makes it desirable to compute genome-wide orthology relations for a given dataset rather than relying on relations listed in databases. The program Proteinortho described here is a stand-alone tool that is geared towards large datasets and makes use of distributed computing techniques when run on multi-core hardware. It implements an extended version of the reciprocal best alignment heuristic. We apply Proteinortho to compute orthologous proteins in the complete set of all 717 eubacterial genomes available at NCBI at the beginning of 2009. We identified thirty proteins present in 99% of all bacterial proteomes. Proteinortho significantly reduces the required amount of memory for orthology analysis compared to existing tools, allowing such computations to be performed on off-the-shelf hardware.
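The reciprocal best alignment idea at the heart of the tool can be sketched directly: given best-hit tables in both directions between two proteomes, a pair is reported when each protein is the other's best hit. The hit tables below are invented toy data; Proteinortho's extended heuristic, its handling of co-orthologs, and its distributed execution are not reproduced.

```python
def reciprocal_best_hits(hits_ab, hits_ba):
    """hits_ab[a] = (best hit in B, score); hits_ba[b] = (best hit in A, score).
    Return pairs (a, b) where each protein is the other's best hit."""
    pairs = []
    for a, (b, _) in hits_ab.items():
        if b in hits_ba and hits_ba[b][0] == a:
            pairs.append((a, b))
    return pairs

# toy best-hit tables between proteomes A and B (bit scores are arbitrary)
hits_ab = {"A1": ("B1", 350.0), "A2": ("B2", 210.0), "A3": ("B9", 95.0)}
hits_ba = {"B1": ("A1", 348.0), "B2": ("A7", 180.0), "B9": ("A3", 90.0)}
print(reciprocal_best_hits(hits_ab, hits_ba))   # [('A1', 'B1'), ('A3', 'B9')]
```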
Analytical investigation of critical phenomena in MHD power generators
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1980-07-31
Critical phenomena in the Arnold Engineering Development Center (AEDC) High Performance Demonstration Experiment (HPDE) and the US U-25 Experiment are analyzed. Also analyzed are the performance of a NASA-specified 500 MW(th) flow train and computations concerning critical issues for the scale-up of MHD generators. The HPDE is characterized by computational simulations of both the nominal conditions and the conditions during the experimental runs. The steady-state performance is discussed along with the Hall voltage overshoots during the start-up and shutdown transients. The results of simulations of the HPDE runs with codes from the Q3D and TRANSIENT code families are compared to the experimental results. The results of the simulations are in good agreement with the experimental data. Additional critical phenomena analyzed in the AEDC/HPDE are the optimal load schedules, parametric variations, the parametric dependence of the electrode voltage drops, the boundary layer behavior, near-electrode phenomena with finite electrode segmentation, and current distribution in the end regions. The US U-25 experiment is characterized by computational simulations of the nominal operating conditions. The steady-state performance for the nominal design of the US U-25 experiment is analyzed, as is the dependence of performance on the mass flow rate. A NASA-specified 500 MW(th) MHD flow train is characterized for computer simulation and the electrical, transport, and thermodynamic properties at the inlet plane are analyzed. Issues for the scale-up of MHD power trains are discussed. The AEDC/HPDE performance is analyzed to compare these experimental results to scale-up rules.
PanDA: Exascale Federation of Resources for the ATLAS Experiment at the LHC
NASA Astrophysics Data System (ADS)
Barreiro Megino, Fernando; Caballero Bejar, Jose; De, Kaushik; Hover, John; Klimentov, Alexei; Maeno, Tadashi; Nilsson, Paul; Oleynik, Danila; Padolski, Siarhei; Panitkin, Sergey; Petrosyan, Artem; Wenaus, Torre
2016-02-01
After a scheduled maintenance and upgrade period, the world's largest and most powerful machine - the Large Hadron Collider (LHC) - is about to enter its second run at unprecedented energies. In order to exploit the scientific potential of the machine, the experiments at the LHC face computational challenges with enormous data volumes that need to be analysed by thousands of physics users and compared to simulated data. Given diverse funding constraints, the computational resources for the LHC have been deployed in a worldwide mesh of data centres, connected to each other through Grid technologies. The PanDA (Production and Distributed Analysis) system was developed in 2005 for the ATLAS experiment on top of this heterogeneous infrastructure to seamlessly integrate the computational resources and give the users the feeling of a single system. Since its origins, PanDA has evolved together with upcoming computing paradigms in and outside HEP, such as changes in the networking model, Cloud Computing and HPC. It currently runs steadily on up to 200 thousand simultaneous cores (limited by the available resources for ATLAS), handles up to two million aggregated jobs per day, and processes over an exabyte of data per year. The success of PanDA in ATLAS is triggering the widespread adoption and testing by other experiments. In this contribution we will give an overview of the PanDA components and focus on the new features and upcoming challenges that are relevant to the next decade of distributed computing workload management using PanDA.
NASA Astrophysics Data System (ADS)
Moon, Hongsik
What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research, and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited by the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization, and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared using benchmark software, and the metric was FLoating-point Operations Per Second (FLOPS), which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore systems? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to the type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPS, and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. This code was developed to address the needs of the changing computer hardware platforms in order to provide fast, accurate and efficient solutions to large, complex electromagnetic problems. The research in this dissertation proves that the performance of parallel code is intimately related to the configuration of the computer hardware and can be maximized for different hardware platforms. To benchmark and optimize the performance of parallel CEM software, a variety of large, complex projects are created and executed on a variety of computer platforms. The computer platforms used in this research are detailed in this dissertation. The projects run as benchmarks are also described in detail and results are presented. The parameters that affect parallel CEM software on High Performance Computing Clusters (HPCC) are investigated. This research demonstrates methods to maximize the performance of parallel CEM software code.
Coalescent: an open-science framework for importance sampling in coalescent theory.
Tewari, Susanta; Spouge, John L
2015-01-01
Background. In coalescent theory, computer programs often use importance sampling to calculate likelihoods and other statistical quantities. An importance sampling scheme can exploit human intuition to improve statistical efficiency of computations, but unfortunately, in the absence of general computer frameworks on importance sampling, researchers often struggle to translate new sampling schemes computationally or benchmark against different schemes, in a manner that is reliable and maintainable. Moreover, most studies use computer programs lacking a convenient user interface or the flexibility to meet the current demands of open science. In particular, current computer frameworks can only evaluate the efficiency of a single importance sampling scheme or compare the efficiencies of different schemes in an ad hoc manner. Results. We have designed a general framework (http://coalescent.sourceforge.net; language: Java; License: GPLv3) for importance sampling that computes likelihoods under the standard neutral coalescent model of a single, well-mixed population of constant size over time following the infinite-sites model of mutation. The framework models the necessary core concepts, comes integrated with several data sets of varying size, implements the standard competing proposals, and integrates tightly with our previous framework for calculating exact probabilities. For a given dataset, it computes the likelihood and provides the maximum likelihood estimate of the mutation parameter. Well-known benchmarks in the coalescent literature validate the accuracy of the framework. The framework provides an intuitive user interface with minimal clutter. For performance, the framework switches automatically to modern multicore hardware, if available. It runs on three major platforms (Windows, Mac and Linux). Extensive tests and coverage make the framework reliable and maintainable. Conclusions. In coalescent theory, many studies of computational efficiency consider only effective sample size. Here, we evaluate proposals in the coalescent literature, to discover that the order of efficiency among the three importance sampling schemes changes when one considers running time as well as effective sample size. We also describe a computational technique called "just-in-time delegation" available to improve the trade-off between running time and precision by constructing improved importance sampling schemes from existing ones. Thus, our systems approach is a potential solution to the "2^8 programs problem" highlighted by Felsenstein, because it provides the flexibility to include or exclude various features of similar coalescent models or importance sampling schemes.
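The quantities the framework reports, the importance-sampling estimate and its effective sample size, can be illustrated with a generic example unrelated to coalescent genealogies: estimate a normal tail probability using a shifted exponential proposal and compute the effective sample size from the weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# target quantity: P(X > 2) for X ~ N(0, 1); proposal: Exp(1) shifted to start at 2
n = 100_000
x = 2.0 + rng.exponential(scale=1.0, size=n)
log_target = -0.5 * x**2 - 0.5 * np.log(2 * np.pi)   # log standard-normal density
log_proposal = -(x - 2.0)                            # log density of the shifted Exp(1)
w = np.exp(log_target - log_proposal)                # importance weights

estimate = np.mean(w)                 # should be close to 0.02275
ess = w.sum() ** 2 / np.sum(w ** 2)   # effective sample size of this scheme
print(f"estimate {estimate:.5f}, ESS {ess:.0f} of {n}")
```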
Graph run-length matrices for histopathological image segmentation.
Tosun, Akif Burak; Gunduz-Demir, Cigdem
2011-03-01
The histopathological examination of tissue specimens is essential for cancer diagnosis and grading. However, this examination is subject to a considerable amount of observer variability as it mainly relies on visual interpretation of pathologists. To alleviate this problem, it is very important to develop computational quantitative tools, for which image segmentation constitutes the core step. In this paper, we introduce an effective and robust algorithm for the segmentation of histopathological tissue images. This algorithm incorporates the background knowledge of the tissue organization into segmentation. For this purpose, it quantifies spatial relations of cytological tissue components by constructing a graph and uses this graph to define new texture features for image segmentation. This new texture definition makes use of the idea of gray-level run-length matrices. However, it considers the runs of cytological components on a graph to form a matrix, instead of considering the runs of pixel intensities. Working with colon tissue images, our experiments demonstrate that the texture features extracted from "graph run-length matrices" lead to high segmentation accuracies, also providing a reasonable number of segmented regions. Compared with four other segmentation algorithms, the results show that the proposed algorithm is more effective in histopathological image segmentation.
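One simplified reading of "runs of cytological components on a graph" is to treat each maximal connected group of same-labelled nodes as a run and tally run lengths per label. The data structures below and the choice of connected components as runs are illustrative assumptions; the paper's actual matrix construction may differ.

```python
from collections import defaultdict, deque

def graph_run_length_counts(adjacency, labels):
    """Count run lengths per label, where a run is a maximal connected set of
    nodes sharing the same label (a simplified graph analogue of gray-level
    run-length matrices)."""
    counts = defaultdict(lambda: defaultdict(int))   # label -> run length -> count
    visited = set()
    for start in adjacency:
        if start in visited:
            continue
        label = labels[start]
        queue, run = deque([start]), 0
        visited.add(start)
        while queue:                                  # BFS restricted to same-label nodes
            node = queue.popleft()
            run += 1
            for nbr in adjacency[node]:
                if nbr not in visited and labels[nbr] == label:
                    visited.add(nbr)
                    queue.append(nbr)
        counts[label][run] += 1
    return counts

adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
labels = {0: "nucleus", 1: "nucleus", 2: "stroma", 3: "stroma"}
print(graph_run_length_counts(adjacency, labels))   # one run of length 2 per label
```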
Parallel Computational Protein Design.
Zhou, Yichao; Donald, Bruce R; Zeng, Jianyang
2017-01-01
Computational structure-based protein design (CSPD) is an important problem in computational biology, which aims to design or improve a prescribed protein function based on a protein structure template. It provides a practical tool for real-world protein engineering applications. A popular CSPD method that guarantees to find the global minimum energy solution (GMEC) is to combine both dead-end elimination (DEE) and A* tree search algorithms. However, in this framework, the A* search algorithm can run in exponential time in the worst case, which may become the computation bottleneck of large-scale computational protein design process. To address this issue, we extend and add a new module to the OSPREY program that was previously developed in the Donald lab (Gainza et al., Methods Enzymol 523:87, 2013) to implement a GPU-based massively parallel A* algorithm for improving protein design pipeline. By exploiting the modern GPU computational framework and optimizing the computation of the heuristic function for A* search, our new program, called gOSPREY, can provide up to four orders of magnitude speedups in large protein design cases with a small memory overhead comparing to the traditional A* search algorithm implementation, while still guaranteeing the optimality. In addition, gOSPREY can be configured to run in a bounded-memory mode to tackle the problems in which the conformation space is too large and the global optimal solution cannot be computed previously. Furthermore, the GPU-based A* algorithm implemented in the gOSPREY program can be combined with the state-of-the-art rotamer pruning algorithms such as iMinDEE (Gainza et al., PLoS Comput Biol 8:e1002335, 2012) and DEEPer (Hallen et al., Proteins 81:18-39, 2013) to also consider continuous backbone and side-chain flexibility.
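Since the bottleneck discussed above is the A* search itself, a minimal serial A* sketch may help fix ideas. The graph, edge costs, and zero heuristic are toy placeholders; gOSPREY's GPU-parallel, bounded-memory version over conformation trees is of course far more involved.

```python
import heapq

def a_star(start, goal, neighbors, heuristic):
    """Textbook A*: repeatedly expand the node with the smallest g + h."""
    open_heap = [(heuristic(start), 0.0, start, [start])]
    best_g = {start: 0.0}
    while open_heap:
        f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path, g
        for nbr, cost in neighbors(node):
            new_g = g + cost
            if new_g < best_g.get(nbr, float("inf")):
                best_g[nbr] = new_g
                heapq.heappush(open_heap, (new_g + heuristic(nbr), new_g, nbr, path + [nbr]))
    return None, float("inf")

# Toy example: integer nodes, edges in a dictionary, zero (admissible) heuristic.
edges = {0: [(1, 1.0), (2, 4.0)], 1: [(3, 2.0)], 2: [(3, 1.0)], 3: []}
path, cost = a_star(0, 3, lambda n: edges[n], heuristic=lambda n: 0.0)
print(path, cost)   # [0, 1, 3] with cost 3.0
```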
Metrics for comparing dynamic earthquake rupture simulations
Barall, Michael; Harris, Ruth A.
2014-01-01
Earthquakes are complex events that involve a myriad of interactions among multiple geologic features and processes. One of the tools that is available to assist with their study is computer simulation, particularly dynamic rupture simulation. A dynamic rupture simulation is a numerical model of the physical processes that occur during an earthquake. Starting with the fault geometry, friction constitutive law, initial stress conditions, and assumptions about the condition and response of the near‐fault rocks, a dynamic earthquake rupture simulation calculates the evolution of fault slip and stress over time as part of the elastodynamic numerical solution (Ⓔ see the simulation description in the electronic supplement to this article). The complexity of the computations in a dynamic rupture simulation make it challenging to verify that the computer code is operating as intended, because there are no exact analytic solutions against which these codes’ results can be directly compared. One approach for checking if dynamic rupture computer codes are working satisfactorily is to compare each code’s results with the results of other dynamic rupture codes running the same earthquake simulation benchmark. To perform such a comparison consistently, it is necessary to have quantitative metrics. In this paper, we present a new method for quantitatively comparing the results of dynamic earthquake rupture computer simulation codes.
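The kind of quantitative metric meant here can be as simple as an RMS misfit between two codes' time series after interpolation onto a common time axis. The sketch below is only an illustration of that idea and is not the specific metric defined in the paper.

```python
import numpy as np

def rms_misfit(t1, y1, t2, y2, n_samples=1000):
    """RMS difference between two time series (e.g., fault-slip histories from
    two rupture codes) resampled onto a common time axis."""
    t_common = np.linspace(max(t1[0], t2[0]), min(t1[-1], t2[-1]), n_samples)
    y1_i = np.interp(t_common, t1, y1)
    y2_i = np.interp(t_common, t2, y2)
    return float(np.sqrt(np.mean((y1_i - y2_i) ** 2)))

t = np.linspace(0.0, 10.0, 200)
print(rms_misfit(t, np.sin(t), t, np.sin(t) + 0.05))   # roughly 0.05
```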
User's instructions for the cardiovascular Walters model
NASA Technical Reports Server (NTRS)
Croston, R. C.
1973-01-01
The model is a combined, steady-state cardiovascular and thermal model. It was originally developed for interactive use, but was converted to batch-mode simulation for the Sigma 3 computer. The purpose of the model is to compute steady-state circulatory and thermal variables in response to exercise work loads and environmental factors. During a computer simulation run, several selected variables are printed at each time step. End conditions are also printed at the completion of the run.
Nosik, Melissa R; Williams, W Larry; Garrido, Natalia; Lee, Sarah
2013-01-01
In the current study, behavior skills training (BST) is compared to a computer-based training package for teaching discrete trial instruction to staff who were teaching an adult with autism. The computer-based training package consisted of instructions, video modeling and feedback. BST consisted of instructions, modeling, rehearsal and feedback. Following training, participants were evaluated in terms of their accuracy in completing critical skills for running a discrete trial program. Six participants completed training; three received behavior skills training and three received the computer-based training. Participants in the BST group performed better overall after training and during six-week probes than those in the computer-based training group. There were differences across both groups between research assistant and natural environment competency levels. Copyright © 2012 Elsevier Ltd. All rights reserved.
Multitasking for flows about multiple body configurations using the chimera grid scheme
NASA Technical Reports Server (NTRS)
Dougherty, F. C.; Morgan, R. L.
1987-01-01
The multitasking of a finite-difference scheme using multiple overset meshes is described. In this chimera, or multiple overset mesh approach, a multiple body configuration is mapped using a major grid about the main component of the configuration, with minor overset meshes used to map each additional component. This type of code is well suited to multitasking. Both steady and unsteady two dimensional computations are run on parallel processors on a CRAY-X/MP 48, usually with one mesh per processor. Flow field results are compared with single processor results to demonstrate the feasibility of running multiple mesh codes on parallel processors and to show the increase in efficiency.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Almasi, Gheorghe; Blumrich, Matthias Augustin; Chen, Dong
Methods and apparatus perform fault isolation in multiple node computing systems using commutative error detection values (for example, checksums) to identify and to isolate faulty nodes. When information associated with a reproducible portion of a computer program is injected into a network by a node, a commutative error detection value is calculated. At intervals, node fault detection apparatus associated with the multiple node computer system retrieves commutative error detection values associated with the node and stores them in memory. When the computer program is executed again by the multiple node computer system, new commutative error detection values are created and stored in memory. The node fault detection apparatus identifies faulty nodes by comparing commutative error detection values associated with reproducible portions of the application program generated by a particular node from different runs of the application program. Differences in values indicate a possible faulty node.
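A toy sketch of the comparison step: hash the reproducible traffic each node injects, then diff the per-node values across two runs. The CRC32/XOR choice (which makes the value order-independent, hence commutative) and the data layout are illustrative assumptions, not the specific apparatus described in this record.

```python
import zlib

def node_checksums(messages_by_node):
    """Commutative error-detection value per node: XOR of CRC32s of the
    reproducible messages the node injected (independent of message order)."""
    sums = {}
    for node, messages in messages_by_node.items():
        value = 0
        for payload in messages:
            value ^= zlib.crc32(payload)
        sums[node] = value
    return sums

def suspect_nodes(run_a, run_b):
    """Nodes whose checksums differ between two runs of the same program."""
    return [node for node in run_a if run_a[node] != run_b.get(node)]

run1 = node_checksums({0: [b"alpha", b"beta"], 1: [b"gamma"]})
run2 = node_checksums({0: [b"beta", b"alpha"], 1: [b"gamma-corrupted"]})
print(suspect_nodes(run1, run2))   # [1] -- node 1 produced different traffic
```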
CBESW: sequence alignment on the Playstation 3.
Wirawan, Adrianto; Kwoh, Chee Keong; Hieu, Nim Tri; Schmidt, Bertil
2008-09-17
The exponential growth of available biological data has caused bioinformatics to be rapidly moving towards a data-intensive, computational science. As a result, the computational power needed by bioinformatics applications is growing exponentially as well. The recent emergence of accelerator technologies has made it possible to achieve an excellent improvement in execution time for many bioinformatics applications, compared to current general-purpose platforms. In this paper, we demonstrate how the PlayStation 3, powered by the Cell Broadband Engine, can be used as a computational platform to accelerate the Smith-Waterman algorithm. For large datasets, our implementation on the PlayStation 3 provides a significant improvement in running time compared to other implementations such as SSEARCH, Striped Smith-Waterman and CUDA. Our implementation achieves a peak performance of up to 3,646 MCUPS. The results from our experiments demonstrate that the PlayStation 3 console can be used as an efficient low cost computational platform for high performance sequence alignment applications.
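For reference, a plain (unaccelerated) Smith-Waterman scorer with linear gap penalties is sketched below; the scoring values are arbitrary, and the Cell and FPGA implementations discussed in this and later records parallelise the same recurrence.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return the optimal local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            h[i][j] = max(0, diag, h[i - 1][j] + gap, h[i][j - 1] + gap)
            best = max(best, h[i][j])
    return best

print(smith_waterman("ACACACTA", "AGCACACA"))   # a small positive local-alignment score
```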
CBESW: Sequence Alignment on the Playstation 3
Wirawan, Adrianto; Kwoh, Chee Keong; Hieu, Nim Tri; Schmidt, Bertil
2008-01-01
Background The exponential growth of available biological data has caused bioinformatics to be rapidly moving towards a data-intensive, computational science. As a result, the computational power needed by bioinformatics applications is growing exponentially as well. The recent emergence of accelerator technologies has made it possible to achieve an excellent improvement in execution time for many bioinformatics applications, compared to current general-purpose platforms. In this paper, we demonstrate how the PlayStation® 3, powered by the Cell Broadband Engine, can be used as a computational platform to accelerate the Smith-Waterman algorithm. Results For large datasets, our implementation on the PlayStation® 3 provides a significant improvement in running time compared to other implementations such as SSEARCH, Striped Smith-Waterman and CUDA. Our implementation achieves a peak performance of up to 3,646 MCUPS. Conclusion The results from our experiments demonstrate that the PlayStation® 3 console can be used as an efficient low cost computational platform for high performance sequence alignment applications. PMID:18798993
Face Recognition in Humans and Machines
NASA Astrophysics Data System (ADS)
O'Toole, Alice; Tistarelli, Massimo
The study of human face recognition by psychologists and neuroscientists has run parallel to the development of automatic face recognition technologies by computer scientists and engineers. In both cases, there are analogous steps of data acquisition, image processing, and the formation of representations that can support the complex and diverse tasks we accomplish with faces. These processes can be understood and compared in the context of their neural and computational implementations. In this chapter, we present the essential elements of face recognition by humans and machines, taking a perspective that spans psychological, neural, and computational approaches. From the human side, we overview the methods and techniques used in the neurobiology of face recognition, the underlying neural architecture of the system, the role of visual attention, and the nature of the representations that emerges. From the computational side, we discuss face recognition technologies and the strategies they use to overcome challenges to robust operation over viewing parameters. Finally, we conclude the chapter with a look at some recent studies that compare human and machine performances at face recognition.
Running Batch Jobs on Peregrine | High-Performance Computing | NREL
[Web-page excerpt] Peregrine has several types of compute nodes; the resource feature can be used to request a particular node type, avoid node incompatibilities, and get a job running. More information about requesting different node types on Peregrine is available. Queues: in order to meet the needs of different types of jobs, nodes on Peregrine are made available through several queues.
Host-Nation Operations: Soldier Training on Governance (HOST-G) Training Support Package
2011-07-01
[Excerpt from the package's software-setup instructions: if the browser warns that it has "restricted this webpage from running scripts or ActiveX controls that could access your computer," the user should use the options prompt to allow the blocked content; scripts and ActiveX controls can be useful, but active content might also harm your computer, so the browser asks for confirmation before letting the file run active content.]
24 CFR 15.110 - What fees will HUD charge?
Code of Federal Regulations, 2013 CFR
2013-04-01
... duplicating machinery. The computer run time includes the cost of operating a central processing unit for that... Applies. (6) Computer run time (includes only mainframe search time not printing) The direct cost of... estimated fee is more than $250.00 or you have a history of failing to pay FOIA fees to HUD in a timely...
NASA Technical Reports Server (NTRS)
Roberts, Floyd E., III
1994-01-01
Software provides for control and acquisition of data from optical pyrometer. There are six individual programs in PYROLASER package. Provides quick and easy way to set up, control, and program standard Pyrolaser. Temperature and emissivity measurements are either collected as if the Pyrolaser were in manual operating mode or displayed on real-time strip charts and stored in standard spreadsheet format for posttest analysis. Shell supplied to allow test-specific macros to be added to system easily. Written using Labview software for use on Macintosh-series computers running System 6.0.3 or later, Sun Sparc-series computers running Open-Windows 3.0 or MIT's X Window System (X11R4 or X11R5), and IBM PC or compatible computers running Microsoft Windows 3.1 or later.
NASA Technical Reports Server (NTRS)
1972-01-01
The IDAPS (Image Data Processing System) is a user-oriented, computer-based, language and control system, which provides a framework or standard for implementing image data processing applications, simplifies set-up of image processing runs so that the system may be used without a working knowledge of computer programming or operation, streamlines operation of the image processing facility, and allows multiple applications to be run in sequence without operator interaction. The control system loads the operators, interprets the input, constructs the necessary parameters for each application, and calls the application. The overlay feature of the IBSYS loader (IBLDR) provides the means of running multiple operators which would otherwise overflow core storage.
Identification of Program Signatures from Cloud Computing System Telemetry Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nichols, Nicole M.; Greaves, Mark T.; Smith, William P.
Malicious cloud computing activity can take many forms, including running unauthorized programs in a virtual environment. Detection of these malicious activities while preserving the privacy of the user is an important research challenge. Prior work has shown the potential viability of using cloud service billing metrics as a mechanism for proxy identification of malicious programs. Previously this novel detection method has been evaluated in a synthetic and isolated computational environment. In this paper we demonstrate the ability of billing metrics to identify programs, in an active cloud computing environment, including multiple virtual machines running on the same hypervisor. The open source cloud computing platform OpenStack is used for private cloud management at Pacific Northwest National Laboratory. OpenStack provides a billing tool (Ceilometer) to collect system telemetry measurements. We identify four different programs running on four virtual machines under the same cloud user account. Programs were identified with up to 95% accuracy. This accuracy is dependent on the distinctiveness of telemetry measurements for the specific programs we tested. Future work will examine the scalability of this approach for a larger selection of programs to better understand the uniqueness needed to identify a program. Additionally, future work should address the separation of signatures when multiple programs are running on the same virtual machine.
Habegger, Lukas; Balasubramanian, Suganthi; Chen, David Z; Khurana, Ekta; Sboner, Andrea; Harmanci, Arif; Rozowsky, Joel; Clarke, Declan; Snyder, Michael; Gerstein, Mark
2012-09-01
The functional annotation of variants obtained through sequencing projects is generally assumed to be a simple intersection of genomic coordinates with genomic features. However, complexities arise for several reasons, including the differential effects of a variant on alternatively spliced transcripts, as well as the difficulty in assessing the impact of small insertions/deletions and large structural variants. Taking these factors into consideration, we developed the Variant Annotation Tool (VAT) to functionally annotate variants from multiple personal genomes at the transcript level as well as obtain summary statistics across genes and individuals. VAT also allows visualization of the effects of different variants, integrates allele frequencies and genotype data from the underlying individuals and facilitates comparative analysis between different groups of individuals. VAT can either be run through a command-line interface or as a web application. Finally, in order to enable on-demand access and to minimize unnecessary transfers of large data files, VAT can be run as a virtual machine in a cloud-computing environment. VAT is implemented in C and PHP. The VAT web service, Amazon Machine Image, source code and detailed documentation are available at vat.gersteinlab.org.
COSMIC REIONIZATION ON COMPUTERS: NUMERICAL AND PHYSICAL CONVERGENCE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gnedin, Nickolay Y., E-mail: gnedin@fnal.gov; Kavli Institute for Cosmological Physics, University of Chicago, Chicago, IL 60637; Department of Astronomy and Astrophysics, University of Chicago, Chicago, IL 60637
In this paper I show that simulations of reionization performed under the Cosmic Reionization On Computers project do converge in space and mass, albeit rather slowly. A fully converged solution (for a given star formation and feedback model) can be determined at a level of precision of about 20%, but such a solution is useless in practice, since achieving it in production-grade simulations would require a large set of runs at various mass and spatial resolutions, and computational resources for such an undertaking are not yet readily available. In order to make progress in the interim, I introduce a weak convergence correction factor in the star formation recipe, which allows one to approximate the fully converged solution with finite-resolution simulations. The accuracy of weakly converged simulations approaches a comparable, ∼20% level of precision for star formation histories of individual galactic halos and other galactic properties that are directly related to star formation rates, such as stellar masses and metallicities. Yet other properties of model galaxies, for example, their H i masses, are recovered in the weakly converged runs only within a factor of 2.
Providing Assistive Technology Applications as a Service Through Cloud Computing.
Mulfari, Davide; Celesti, Antonio; Villari, Massimo; Puliafito, Antonio
2015-01-01
Users with disabilities interact with Personal Computers (PCs) using Assistive Technology (AT) software solutions. Such applications run on a PC that a person with a disability commonly uses. However the configuration of AT applications is not trivial at all, especially whenever the user needs to work on a PC that does not allow him/her to rely on his / her AT tools (e.g., at work, at university, in an Internet point). In this paper, we discuss how cloud computing provides a valid technological solution to enhance such a scenario.With the emergence of cloud computing, many applications are executed on top of virtual machines (VMs). Virtualization allows us to achieve a software implementation of a real computer able to execute a standard operating system and any kind of application. In this paper we propose to build personalized VMs running AT programs and settings. By using the remote desktop technology, our solution enables users to control their customized virtual desktop environment by means of an HTML5-based web interface running on any computer equipped with a browser, whenever they are.
MATH77 - A LIBRARY OF MATHEMATICAL SUBPROGRAMS FOR FORTRAN 77, RELEASE 4.0
NASA Technical Reports Server (NTRS)
Lawson, C. L.
1994-01-01
MATH77 is a high quality library of ANSI FORTRAN 77 subprograms implementing contemporary algorithms for the basic computational processes of science and engineering. The portability of MATH77 meets the needs of present-day scientists and engineers who typically use a variety of computing environments. Release 4.0 of MATH77 contains 454 user-callable and 136 lower-level subprograms. Usage of the user-callable subprograms is described in 69 sections of the 416 page users' manual. The topics covered by MATH77 are indicated by the following list of chapter titles in the users' manual: Mathematical Functions, Pseudo-random Number Generation, Linear Systems of Equations and Linear Least Squares, Matrix Eigenvalues and Eigenvectors, Matrix Vector Utilities, Nonlinear Equation Solving, Curve Fitting, Table Look-Up and Interpolation, Definite Integrals (Quadrature), Ordinary Differential Equations, Minimization, Polynomial Rootfinding, Finite Fourier Transforms, Special Arithmetic , Sorting, Library Utilities, Character-based Graphics, and Statistics. Besides subprograms that are adaptations of public domain software, MATH77 contains a number of unique packages developed by the authors of MATH77. Instances of the latter type include (1) adaptive quadrature, allowing for exceptional generality in multidimensional cases, (2) the ordinary differential equations solver used in spacecraft trajectory computation for JPL missions, (3) univariate and multivariate table look-up and interpolation, allowing for "ragged" tables, and providing error estimates, and (4) univariate and multivariate derivative-propagation arithmetic. MATH77 release 4.0 is a subroutine library which has been carefully designed to be usable on any computer system that supports the full ANSI standard FORTRAN 77 language. It has been successfully implemented on a CRAY Y/MP computer running UNICOS, a UNISYS 1100 computer running EXEC 8, a DEC VAX series computer running VMS, a Sun4 series computer running SunOS, a Hewlett-Packard 720 computer running HP-UX, a Macintosh computer running MacOS, and an IBM PC compatible computer running MS-DOS. Accompanying the library is a set of 196 "demo" drivers that exercise all of the user-callable subprograms. The FORTRAN source code for MATH77 comprises 109K lines of code in 375 files with a total size of 4.5Mb. The demo drivers comprise 11K lines of code and 418K. Forty-four percent of the lines of the library code and 29% of those in the demo code are comment lines. The standard distribution medium for MATH77 is a .25 inch streaming magnetic tape cartridge in UNIX tar format. It is also available on a 9track 1600 BPI magnetic tape in VAX BACKUP format and a TK50 tape cartridge in VAX BACKUP format. An electronic copy of the documentation is included on the distribution media. Previous releases of MATH77 have been used over a number of years in a variety of JPL applications. MATH77 Release 4.0 was completed in 1992. MATH77 is a copyrighted work with all copyright vested in NASA.
Computer Simulation of Great Lakes-St. Lawrence Seaway Icebreaker Requirements.
1980-01-01
[Front-matter fragment (list of exhibits): predicted icebreaker fleets by home port and period, and results of simulation Runs No. 1 through 3 for the Taconite and Oil Can task commands.]
Computer modeling of heat pipe performance
NASA Technical Reports Server (NTRS)
Peterson, G. P.
1983-01-01
A parametric study of the defining equations which govern the steady state operational characteristics of the Grumman monogroove dual passage heat pipe is presented. These defining equations are combined to develop a mathematical model which describes and predicts the operational and performance capabilities of a specific heat pipe given the necessary physical characteristics and working fluid. Included is a brief review of the current literature, a discussion of the governing equations, and a description of both the mathematical and computer model. Final results of preliminary test runs of the model are presented and compared with experimental tests on actual prototypes.
Study to design and develop remote manipulator system
NASA Technical Reports Server (NTRS)
Hill, J. W.; Sword, A. J.
1973-01-01
Human performance measurement techniques for remote manipulation tasks and remote sensing techniques for manipulators are described. For common manipulation tasks, performance is monitored by means of an on-line computer capable of measuring the joint angles of both master and slave arms as a function of time. The computer programs allow measurements of the operator's strategy and physical quantities such as task time and power consumed. The results are printed out after a test run to compare different experimental conditions. For tracking tasks, we describe a method of displaying errors in three dimensions and measuring the end-effector position in three dimensions.
Wind-US Unstructured Flow Solutions for a Transonic Diffuser
NASA Technical Reports Server (NTRS)
Mohler, Stanley R., Jr.
2005-01-01
The Wind-US Computational Fluid Dynamics flow solver computed flow solutions for a transonic diffusing duct. The calculations used an unstructured (hexahedral) grid. The Spalart-Allmaras turbulence model was used. Static pressures along the upper and lower wall agreed well with experiment, as did velocity profiles. The effect of the smoothing input parameters on convergence and solution accuracy was investigated. The meaning and proper use of these parameters are discussed for the benefit of Wind-US users. Finally, the unstructured solver is compared to the structured solver in terms of run times and solution accuracy.
The rid-redundant procedure in C-Prolog
NASA Technical Reports Server (NTRS)
Chen, Huo-Yan; Wah, Benjamin W.
1987-01-01
C-Prolog can conveniently be used for logical inferences on knowledge bases. However, as with many search methods that use backward chaining, a large amount of redundant computation may be produced in recursive calls. To overcome this problem, the 'rid-redundant' procedure was designed to remove all redundant computations when running multi-recursive procedures. Experimental results obtained for C-Prolog on the Vax 11/780 computer show that there is an order-of-magnitude improvement in the running time and solvable problem size.
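The same redundancy problem appears in any naive recursive query; in Python the analogous fix is memoisation, sketched below on a toy genealogy-style knowledge base. This is only an illustration of the idea, not the rid-redundant procedure itself.

```python
from functools import lru_cache

def make_memoised(parents):
    """Ancestor query over a parent table, caching each sub-query so that
    repeated recursive calls are answered once instead of recomputed."""
    @lru_cache(maxsize=None)                 # the cache rids redundant recursive calls
    def ancestors(person):
        result = set(parents.get(person, ()))
        for p in parents.get(person, ()):
            result |= ancestors(p)
        return frozenset(result)
    return ancestors

parents = {"c": ("a", "b"), "d": ("c",), "e": ("c", "d")}
ancestors = make_memoised(parents)
print(sorted(ancestors("e")))   # ['a', 'b', 'c', 'd']; ancestors('c') is computed only once
```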
An Upgrade of the Aeroheating Software ''MINIVER''
NASA Technical Reports Server (NTRS)
Louderback, Pierce
2013-01-01
Detailed computational modeling: CFD is often used to create and execute computational domains. Complexity increases when moving from 2D to 3D geometries. Computational time increases as finer grids are used (accuracy). Strong tool, but takes time to set up and run. MINIVER: Uses theoretical and empirical correlations. Orders of magnitude faster to set up and run. Not as accurate as CFD, but gives reasonable estimations. MINIVER's drawbacks: Rigid command-line interface. Lackluster, unorganized documentation. No central control; multiple versions exist and have diverged.
A Functional Description of the Geophysical Data Acquisition System
1990-08-10
[Document excerpt: the sample rate must be not less than 50 SPS nor greater than 250 SPS. Section 3.0, Sensors/Transducers, 3.1 Chapter Overview: most of the research supported by GDAS has primarily involved two ... The SRUN signal from the computer is fed to a retriggerable one-shot multivibrator on the board; SRUN consists of a pulse train that is present when the computer is running, and the one-shot output drives the RUN lamp on the front panel. Finally, one pin on the board edge connector is ...]
Network support for system initiated checkpoints
Chen, Dong; Heidelberger, Philip
2013-01-29
A system, method and computer program product for supporting system initiated checkpoints in parallel computing systems. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity.
Parallel Computations in Insect and Mammalian Visual Motion Processing
Clark, Damon A.; Demb, Jonathan B.
2016-01-01
Sensory systems use receptors to extract information from the environment and neural circuits to perform subsequent computations. These computations may be described as algorithms composed of sequential mathematical operations. Comparing these operations across taxa reveals how different neural circuits have evolved to solve the same problem, even when using different mechanisms to implement the underlying math. In this review, we compare how insect and mammalian neural circuits have solved the problem of motion estimation, focusing on the fruit fly Drosophila and the mouse retina. Although the two systems implement computations with grossly different anatomy and molecular mechanisms, the underlying circuits transform light into motion signals with strikingly similar processing steps. These similarities run from photoreceptor gain control and spatiotemporal tuning to ON and OFF pathway structures, motion detection, and computed motion signals. The parallels between the two systems suggest that a limited set of algorithms for estimating motion satisfies both the needs of sighted creatures and the constraints imposed on them by metabolism, anatomy, and the structure and regularities of the visual world. PMID:27780048
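One classical instance of the correlation-based motion-detection algorithms compared in this review is the Hassenstein-Reichardt correlator; the discrete-delay version below is a schematic illustration only, not the specific circuit models analysed by the authors.

```python
import numpy as np

def reichardt_correlator(s_left, s_right, delay=5):
    """Opponent correlator: delayed left input times right input, minus the
    mirror-symmetric term. Positive output signals rightward motion."""
    d_left = np.roll(s_left, delay)
    d_right = np.roll(s_right, delay)
    out = d_left * s_right - d_right * s_left
    return out[delay:]                       # drop samples affected by the roll wrap-around

# Stimulus drifting rightward: the right photoreceptor sees the same signal slightly later.
t = np.arange(400)
s_left = np.sin(2 * np.pi * t / 50)
s_right = np.roll(s_left, 5)                 # arrives 5 samples later at the right receptor
print(np.mean(reichardt_correlator(s_left, s_right, delay=5)))   # positive -> rightward
```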
Parallel Computations in Insect and Mammalian Visual Motion Processing.
Clark, Damon A; Demb, Jonathan B
2016-10-24
Sensory systems use receptors to extract information from the environment and neural circuits to perform subsequent computations. These computations may be described as algorithms composed of sequential mathematical operations. Comparing these operations across taxa reveals how different neural circuits have evolved to solve the same problem, even when using different mechanisms to implement the underlying math. In this review, we compare how insect and mammalian neural circuits have solved the problem of motion estimation, focusing on the fruit fly Drosophila and the mouse retina. Although the two systems implement computations with grossly different anatomy and molecular mechanisms, the underlying circuits transform light into motion signals with strikingly similar processing steps. These similarities run from photoreceptor gain control and spatiotemporal tuning to ON and OFF pathway structures, motion detection, and computed motion signals. The parallels between the two systems suggest that a limited set of algorithms for estimating motion satisfies both the needs of sighted creatures and the constraints imposed on them by metabolism, anatomy, and the structure and regularities of the visual world. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Hamilton, George S.; Williams, Jermaine C.
1998-01-01
This paper describes the methods, rationale, and comparative results of the conversion of an intravehicular (IVA) 3D human computer model (HCM) to extravehicular (EVA) use and compares the converted model to an existing model on another computer platform. The task of accurately modeling a spacesuited human figure in software is daunting: the suit restricts the human's joint range of motion (ROM) and does not have joints collocated with human joints. The modeling of the variety of materials needed to construct a space suit (e. g. metal bearings, rigid fiberglass torso, flexible cloth limbs and rubber coated gloves) attached to a human figure is currently out of reach of desktop computer hardware and software. Therefore a simplified approach was taken. The HCM's body parts were enlarged and the joint ROM was restricted to match the existing spacesuit model. This basic approach could be used to model other restrictive environments in industry such as chemical or fire protective clothing. In summary, the approach provides a moderate fidelity, usable tool which will run on current notebook computers.
Convergence properties of simple genetic algorithms
NASA Technical Reports Server (NTRS)
Bethke, A. D.; Zeigler, B. P.; Strauss, D. M.
1974-01-01
The essential parameters determining the behaviour of genetic algorithms were investigated. Computer runs were made while systematically varying the parameter values. Results based on the progress curves obtained from these runs are presented along with results based on the variability of the population as the run progresses.
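A minimal sketch of the kind of simple genetic algorithm whose parameters (population size, mutation rate, crossover) such runs would vary; the bit-counting "one-max" fitness function is only a stand-in for a real objective.

```python
import random

def run_ga(fitness, n_bits=30, pop_size=40, p_mut=0.01, generations=100, seed=1):
    """Truncation selection, one-point crossover, and bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        pop = parents[:]
        while len(pop) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)                       # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            pop.append(child)
    return max(pop, key=fitness)

best = run_ga(fitness=sum)       # "one-max": maximise the number of 1 bits
print(sum(best), "of 30 bits set in the best individual")
```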
Modeling Subsurface Reactive Flows Using Leadership-Class Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mills, Richard T; Hammond, Glenn; Lichtner, Peter
2009-01-01
We describe our experiences running PFLOTRAN - a code for simulation of coupled hydro-thermal-chemical processes in variably saturated, non-isothermal, porous media - on leadership-class supercomputers, including initial experiences running on the petaflop incarnation of Jaguar, the Cray XT5 at the National Center for Computational Sciences at Oak Ridge National Laboratory. PFLOTRAN utilizes fully implicit time-stepping and is built on top of the Portable, Extensible Toolkit for Scientific Computation (PETSc). We discuss some of the hurdles to 'at scale' performance with PFLOTRAN and the progress we have made in overcoming them on leadership-class computer architectures.
GEO2D - Two-Dimensional Computer Model of a Ground Source Heat Pump System
James Menart
2013-06-07
This file contains a zipped file that contains many files required to run GEO2D. GEO2D is a computer code for simulating ground source heat pump (GSHP) systems in two dimensions. GEO2D performs a detailed finite difference simulation of the heat transfer occurring within the working fluid, the tube wall, the grout, and the ground. Both horizontal and vertical wells can be simulated with this program, but it should be noted that the vertical well is modeled as a single tube. This program also models the heat pump in conjunction with the heat transfer occurring. GEO2D simulates the heat pump and ground loop as a system. Many results are produced by GEO2D as a function of time and position, such as heat transfer rates, temperatures and heat pump performance. On top of this, GEO2D produces an economic comparison between the simulated geothermal system and a comparable air heat pump system or comparable gas, oil or propane heating systems with a vapor compression air conditioner. The version of GEO2D in the attached file has been coupled to the DOE heating and cooling load software called ENERGYPLUS. This is a great convenience for the user because heating and cooling loads are an input to GEO2D. GEO2D is a user friendly program that uses a graphical user interface for inputs and outputs. These make entering data simple and they produce many plotted results that are easy to understand. In order to run GEO2D, access to MATLAB is required. If this program is not available on your computer, you can download the program MCRInstaller.exe, the 64 bit version, from the MATLAB website or from this geothermal depository. This is a free download which will enable you to run GEO2D.
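For orientation, a minimal one-dimensional explicit finite-difference conduction step of the general kind GEO2D's ground model performs in two dimensions; the grid, material properties, and fixed-temperature boundaries here are simplified assumptions, not GEO2D's actual formulation.

```python
import numpy as np

def heat_conduction_1d(t_init, alpha, dx, dt, steps):
    """Explicit FTCS update for 1-D conduction; stable when alpha*dt/dx**2 <= 0.5."""
    r = alpha * dt / dx ** 2
    assert r <= 0.5, "time step too large for explicit stability"
    t = np.array(t_init, dtype=float)
    for _ in range(steps):
        t[1:-1] = t[1:-1] + r * (t[2:] - 2 * t[1:-1] + t[:-2])   # interior nodes
        # end nodes held fixed (Dirichlet boundaries), e.g. far-field ground temperature
    return t

profile = heat_conduction_1d(t_init=[10] + [15] * 48 + [10],
                             alpha=1e-6, dx=0.1, dt=1000.0, steps=500)
print(profile[:5])   # temperatures relaxing toward the boundary values
```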
Chang, Yu-Kai; Tsai, Jack Han-Chao; Wang, Chun-Chih; Chang, Erik Chihhung
2015-07-01
The aim of this study was to use diffusion tensor imaging (DTI) to characterize and compare microscopic differences in white matter integrity in the basal ganglia between elite professional athletes specializing in running and martial arts. Thirty-three young adults with sport-related skills as elite professional runners (n = 11) or elite professional martial artists (n = 11) were recruited and compared with non-athletic and healthy controls (n = 11). All participants underwent health- and skill-related physical fitness assessments. Fractional anisotropy (FA) and mean diffusivity (MD), the primary indices derived from DTI, were computed for five regions of interest in the bilateral basal ganglia, including the caudate nucleus, putamen, globus pallidus internal segment (GPi), globus pallidus external segment (GPe), and subthalamic nucleus. Results revealed that both athletic groups demonstrated better physical fitness indices compared with their control counterparts, with the running group exhibiting the highest cardiovascular fitness and the martial arts group exhibiting the highest muscular endurance and flexibility. With respect to the basal ganglia, both athletic groups showed significantly lower FA and marginally higher MD values in the GPi compared with the healthy control group. These findings suggest that professional sport or motor skill training is associated with changes in white matter integrity in specific regions of the basal ganglia, although these positive changes did not appear to depend on the type of sport-related motor skill being practiced.
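For reference, the two DTI indices used here are the standard functions of the diffusion tensor eigenvalues λ1, λ2, λ3 (these are the textbook definitions, not formulas specific to this study):

```latex
\mathrm{MD} = \bar{\lambda} = \frac{\lambda_1 + \lambda_2 + \lambda_3}{3},
\qquad
\mathrm{FA} = \sqrt{\tfrac{3}{2}}\,
\frac{\sqrt{(\lambda_1-\bar{\lambda})^2 + (\lambda_2-\bar{\lambda})^2 + (\lambda_3-\bar{\lambda})^2}}
     {\sqrt{\lambda_1^{2} + \lambda_2^{2} + \lambda_3^{2}}}
```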
Microcomputer pollution model for civilian airports and Air Force bases. Model description
DOE Office of Scientific and Technical Information (OSTI.GOV)
Segal, H.M.; Hamilton, P.L.
1988-08-01
This is one of three reports describing the Emissions and Dispersion Modeling System (EDMS). EDMS is a complex source emissions/dispersion model for use at civilian airports and Air Force bases. It operates in both a refined and a screening mode and is programmed for an IBM-XT (or compatible) computer. This report--MODEL DESCRIPTION--provides the technical description of the model. It first identifies the key design features of both the emissions (EMISSMOD) and dispersion (GIMM) portions of EDMS. It then describes the type of meteorological information the dispersion model can accept and identifies the manner in which it preprocesses National Climatic Center (NCC) data prior to a refined-model run. The report presents the results of running EDMS on a number of different microcomputers and compares EDMS results with those of comparable models. The appendices elaborate on the information noted above and list the source code.
Consequence modeling using the fire dynamics simulator.
Ryder, Noah L; Sutula, Jason A; Schemel, Christopher F; Hamer, Andrew J; Van Brunt, Vincent
2004-11-11
The use of Computational Fluid Dynamics (CFD) and in particular Large Eddy Simulation (LES) codes to model fires provides an efficient tool for the prediction of large-scale effects that include plume characteristics, combustion product dispersion, and heat effects to adjacent objects. This paper illustrates the strengths of the Fire Dynamics Simulator (FDS), an LES code developed by the National Institute of Standards and Technology (NIST), through several small and large-scale validation runs and process safety applications. The paper presents two fire experiments--a small room fire and a large (15 m diameter) pool fire. The model results are compared to experimental data and demonstrate good agreement between the models and data. The validation work is then extended to demonstrate applicability to process safety concerns by detailing a model of a tank farm fire and a model of the ignition of a gaseous fuel in a confined space. In this simulation, a room was filled with propane, given time to disperse, and was then ignited. The model yields accurate results of the dispersion of the gas throughout the space. This information can be used to determine flammability and explosive limits in a space and can be used in subsequent models to determine the pressure and temperature waves that would result from an explosion. The model dispersion results were compared to an experiment performed by Factory Mutual. Using the above examples, this paper will demonstrate that FDS is ideally suited to build realistic models of process geometries in which large scale explosion and fire failure risks can be evaluated with several distinct advantages over more traditional CFD codes. Namely, transient solutions to fire and explosion growth can be produced with less sophisticated hardware (lower cost) than needed for traditional CFD codes (PC-type computer versus UNIX workstation) and can be solved for longer time histories (on the order of hundreds of seconds of computed time) with minimal computer resources and length of model run. Additionally, results that are produced can be analyzed, viewed, and tabulated during and following a model run within a PC environment. There are some tradeoffs, however, as rapid computations in PCs may require a sacrifice in the grid resolution or in the sub-grid modeling, depending on the size of the geometry modeled.
A PICKSC Science Gateway for enabling the common plasma physicist to run kinetic software
NASA Astrophysics Data System (ADS)
Hu, Q.; Winjum, B. J.; Zonca, A.; Youn, C.; Tsung, F. S.; Mori, W. B.
2017-10-01
Computer simulations offer tremendous opportunities for studying plasmas, ranging from simulations for students that illuminate fundamental educational concepts to research-level simulations that advance scientific knowledge. Nevertheless, there is a significant hurdle to using simulation tools. Users must navigate codes and software libraries, determine how to wrangle output into meaningful plots, and oftentimes confront a significant cyberinfrastructure with powerful computational resources. Science gateways offer a Web-based environment to run simulations without needing to learn or manage the underlying software and computing cyberinfrastructure. We discuss our progress on creating a Science Gateway for the Particle-in-Cell and Kinetic Simulation Software Center that enables users to easily run and analyze kinetic simulations with our software. We envision that this technology could benefit a wide range of plasma physicists, both in the use of our simulation tools as well as in its adaptation for running other plasma simulation software. Supported by NSF under Grant ACI-1339893 and by the UCLA Institute for Digital Research and Education.
Creating a Parallel Version of VisIt for Microsoft Windows
DOE Office of Scientific and Technical Information (OSTI.GOV)
Whitlock, B J; Biagas, K S; Rawson, P L
2011-12-07
VisIt is a popular, free interactive parallel visualization and analysis tool for scientific data. Users can quickly generate visualizations from their data, animate them through time, manipulate them, and save the resulting images or movies for presentations. VisIt was designed from the ground up to work on many scales of computers from modest desktops up to massively parallel clusters. VisIt is comprised of a set of cooperating programs. All programs can be run locally or in client/server mode in which some run locally and some run remotely on compute clusters. The VisIt program most able to harness today's computing power is the VisIt compute engine. The compute engine is responsible for reading simulation data from disk, processing it, and sending results or images back to the VisIt viewer program. In a parallel environment, the compute engine runs several processes, coordinating using the Message Passing Interface (MPI) library. Each MPI process reads some subset of the scientific data and filters the data in various ways to create useful visualizations. By using MPI, VisIt has been able to scale well into the thousands of processors on large computers such as dawn and graph at LLNL. The advent of multicore CPU's has made parallelism the 'new' way to achieve increasing performance. With today's computers having at least 2 cores and in many cases up to 8 and beyond, it is more important than ever to deploy parallel software that can use that computing power not only on clusters but also on the desktop. We have created a parallel version of VisIt for Windows that uses Microsoft's MPI implementation (MSMPI) to process data in parallel on the Windows desktop as well as on a Windows HPC cluster running Microsoft Windows Server 2008. Initial desktop parallel support for Windows was deployed in VisIt 2.4.0. Windows HPC cluster support has been completed and will appear in the VisIt 2.5.0 release. We plan to continue supporting parallel VisIt on Windows so our users will be able to take full advantage of their multicore resources.
NASA Astrophysics Data System (ADS)
Varela Rodriguez, F.
2011-12-01
The control system of each of the four major Experiments at the CERN Large Hadron Collider (LHC) is distributed over up to 160 computers running either Linux or Microsoft Windows. A quick response to abnormal situations of the computer infrastructure is crucial to maximize the physics usage. For this reason, a tool was developed to supervise, identify errors and troubleshoot such a large system. Although the monitoring of the performance of the Linux computers and their processes has been available since the first versions of the tool, it is only recently that the software package has been extended to provide similar functionality for the nodes running Microsoft Windows, as this platform is the most commonly used in the LHC detector control systems. In this paper, the architecture and the functionality of the Windows Management Instrumentation (WMI) client developed to provide centralized monitoring of the nodes running different flavours of the Microsoft platform, as well as the interface to the SCADA software of the control systems, are presented. The tool is currently being commissioned by the Experiments and it has already proven to be very efficient at optimizing the running systems and detecting misbehaving processes or nodes.
Computational steering of GEM based detector simulations
NASA Astrophysics Data System (ADS)
Sheharyar, Ali; Bouhali, Othmane
2017-10-01
Gas-based detector R&D relies heavily on full simulation of detectors and their optimization before final prototypes can be built and tested. These simulations, in particular those with complex scenarios such as high detector voltages or gases with larger gains, are computationally intensive and may take several days or weeks to complete. These long-running simulations usually run on high-performance computers in batch mode. If the results lead to unexpected behavior, then the simulation might be rerun with different parameters. However, the simulations (or jobs) may have to wait in a queue until they get a chance to run again because the supercomputer is a shared resource that maintains a queue of other user programs as well and executes them as time and priorities permit. This may result in inefficient resource utilization and increased turnaround time for the scientific experiment. To overcome this issue, the monitoring of the behavior of a simulation, while it is running (or live), is essential. In this work, we employ the computational steering technique by coupling the detector simulations with a visualization package named VisIt to enable the exploration of the live data as it is produced by the simulation.
CERN openlab: Engaging industry for innovation in the LHC Run 3-4 R&D programme
NASA Astrophysics Data System (ADS)
Girone, M.; Purcell, A.; Di Meglio, A.; Rademakers, F.; Gunne, K.; Pachou, M.; Pavlou, S.
2017-10-01
LHC Run3 and Run4 represent an unprecedented challenge for HEP computing in terms of both data volume and complexity. New approaches are needed for how data is collected and filtered, processed, moved, stored and analysed if these challenges are to be met with a realistic budget. To develop innovative techniques we are fostering relationships with industry leaders. CERN openlab is a unique resource for public-private partnership between CERN and leading Information Communication and Technology (ICT) companies. Its mission is to accelerate the development of cutting-edge solutions to be used by the worldwide HEP community. In 2015, CERN openlab started its phase V with a strong focus on tackling the upcoming LHC challenges. Several R&D programs are ongoing in the areas of data acquisition, networks and connectivity, data storage architectures, computing provisioning, computing platforms and code optimisation and data analytics. This paper gives an overview of the various innovative technologies that are currently being explored by CERN openlab V and discusses the long-term strategies that are pursued by the LHC communities with the help of industry in closing the technological gap in processing and storage needs expected in Run3 and Run4.
NASA Technical Reports Server (NTRS)
Yang, Guowei; Pasareanu, Corina S.; Khurshid, Sarfraz
2012-01-01
This paper introduces memoized symbolic execution (Memoise), a novel approach for more efficient application of forward symbolic execution, which is a well-studied technique for systematic exploration of program behaviors based on bounded execution paths. Our key insight is that application of symbolic execution often requires several successive runs of the technique on largely similar underlying problems, e.g., running it once to check a program to find a bug, fixing the bug, and running it again to check the modified program. Memoise introduces a trie-based data structure that stores the key elements of a run of symbolic execution. Maintenance of the trie during successive runs allows re-use of previously computed results of symbolic execution without the need for re-computing them as is traditionally done. Experiments using our prototype embodiment of Memoise show the benefits it holds in various standard scenarios of using symbolic execution, e.g., with iterative deepening of exploration depth, to perform regression analysis, or to enhance coverage.
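A bare-bones sketch of the kind of trie Memoise maintains, keyed on the branch decisions along each explored path; the node contents and the notion of "reusable" below are simplifications of what the tool actually stores.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # branch decision (e.g., "x>0:true") -> TrieNode
        self.explored = False   # a previous run's result for this path can be reused

class PathTrie:
    def __init__(self):
        self.root = TrieNode()

    def record(self, decisions):
        """Store one fully explored path of branch decisions."""
        node = self.root
        for d in decisions:
            node = node.children.setdefault(d, TrieNode())
        node.explored = True

    def already_explored(self, decisions):
        """True if an identical path was explored in an earlier run."""
        node = self.root
        for d in decisions:
            if d not in node.children:
                return False
            node = node.children[d]
        return node.explored

trie = PathTrie()
trie.record(["x>0:true", "y<5:false"])
print(trie.already_explored(["x>0:true", "y<5:false"]))   # True  -> skip re-exploration
print(trie.already_explored(["x>0:false"]))               # False -> must explore
```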
Simple, efficient allocation of modelling runs on heterogeneous clusters with MPI
Donato, David I.
2017-01-01
In scientific modelling and computation, the choice of an appropriate method for allocating tasks for parallel processing depends on the computational setting and on the nature of the computation. The allocation of independent but similar computational tasks, such as modelling runs or Monte Carlo trials, among the nodes of a heterogeneous computational cluster is a special case that has not been specifically evaluated previously. A simulation study shows that a method of on-demand (that is, worker-initiated) pulling from a bag of tasks in this case leads to reliably short makespans for computational jobs despite heterogeneity both within and between cluster nodes. A simple reference implementation in the C programming language with the Message Passing Interface (MPI) is provided.
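The reference implementation described here is in C with MPI; the sketch below shows the same worker-initiated ("pull from a bag of tasks") pattern using the mpi4py bindings, with toy tasks standing in for modelling runs.

```python
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
TASK_TAG, STOP_TAG = 1, 2

def run_task(params):
    return sum(params)                      # placeholder for one modelling run

if rank == 0:                               # coordinator holds the bag of tasks
    tasks = [[i, i + 1] for i in range(100)]
    results, active = [], size - 1
    status = MPI.Status()
    while active > 0:
        result = comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
        if result is not None:
            results.append(result)
        worker = status.Get_source()
        if tasks:
            comm.send(tasks.pop(), dest=worker, tag=TASK_TAG)
        else:
            comm.send(None, dest=worker, tag=STOP_TAG)
            active -= 1
    print(len(results), "results collected")
else:                                       # workers pull tasks on demand
    comm.send(None, dest=0, tag=TASK_TAG)   # initial work request
    while True:
        status = MPI.Status()
        task = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
        if status.Get_tag() == STOP_TAG:
            break
        comm.send(run_task(task), dest=0, tag=TASK_TAG)
```

Run under something like `mpiexec -n 8 python bag_of_tasks.py`; because the coordinator only hands out work when a worker asks, faster nodes naturally take more tasks, which is what keeps makespans short on heterogeneous clusters.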
Evaluating the Efficacy of the Cloud for Cluster Computation
NASA Technical Reports Server (NTRS)
Knight, David; Shams, Khawaja; Chang, George; Soderstrom, Tom
2012-01-01
Computing requirements vary by industry, and it follows that NASA and other research organizations have computing demands that fall outside the mainstream. While cloud computing made rapid inroads for tasks such as powering web applications, performance issues on highly distributed tasks hindered early adoption for scientific computation. One venture to address this problem is Nebula, NASA's homegrown cloud project tasked with delivering science-quality cloud computing resources. However, another industry development is Amazon's high-performance computing (HPC) instances on Elastic Cloud Compute (EC2) that promises improved performance for cluster computation. This paper presents results from a series of benchmarks run on Amazon EC2 and discusses the efficacy of current commercial cloud technology for running scientific applications across a cluster. In particular, a 240-core cluster of cloud instances achieved 2 TFLOPS on High-Performance Linpack (HPL) at 70% of theoretical computational performance. The cluster's local network also demonstrated sub-100 μs inter-process latency with sustained inter-node throughput in excess of 8 Gbps. Beyond HPL, a real-world Hadoop image processing task from NASA's Lunar Mapping and Modeling Project (LMMP) was run on a 29 instance cluster to process lunar and Martian surface images with sizes on the order of tens of gigapixels. These results demonstrate that while not a rival of dedicated supercomputing clusters, commercial cloud technology is now a feasible option for moderately demanding scientific workloads.
Controlling Laboratory Processes From A Personal Computer
NASA Technical Reports Server (NTRS)
Will, H.; Mackin, M. A.
1991-01-01
Computer program provides natural-language process control from IBM PC or compatible computer. Sets up process-control system that either runs without an operator or is run by workers who have limited programming skills. Includes three smaller programs. Two of them, written in FORTRAN 77, record data and control research processes. Third program, written in Pascal, generates FORTRAN subroutines used by other two programs to identify user commands with device-driving routines written by user. Also includes set of input data allowing user to define user commands to be executed by computer. Requires personal computer operating under MS-DOS with suitable hardware interfaces to all controlled devices. Also requires FORTRAN 77 compiler and device drivers written by user.
Shi, L; van Meijgaard, J; Simon, P
2012-08-01
Physical inactivity like recreational computer use is a likely factor in the rising obesity prevalence among Latino adolescents. Using the data from California Health Interview Survey, we test the hypothesis whether acculturation is associated with recreational computer use among Latino adolescents. We run linear regressions of the weekly time spent on recreational computer use among Latino adolescents, stratified first by gender and then by age group (12-14 and 15-17 years). Years living in the United States and language at home are used as key variables for acculturation. For all four sub-populations, living in the United States for less than 5 years is significantly associated with fewer hours on recreational computer use, compared with those US-born. Among female adolescents, those who lived in the United States for 10 years or more spent fewer hours on recreational computer use than those US-born. Among adolescents under 15, speaking English only and speaking English plus another language are both significantly associated with more hours on recreational computer use, compared with those who speak a non-English language at home. Educators and health professionals should heed the Latino adolescents' possible increase in recreational computer use. © 2012 The Authors. Pediatric Obesity © 2012 International Association for the Study of Obesity.
160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA)
Li, Isaac TS; Shum, Warren; Truong, Kevin
2007-01-01
Background To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. Results In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160 folds compared to a pure software implementation running on the same FPGA with an Altera Nios II softprocessor. Conclusion This design of FPGA accelerated hardware offers a new promising direction to seeking computation improvement of genomic database searching. PMID:17555593
160-fold acceleration of the Smith-Waterman algorithm using a field programmable gate array (FPGA).
Li, Isaac T S; Shum, Warren; Truong, Kevin
2007-06-07
To infer homology and subsequently gene function, the Smith-Waterman (SW) algorithm is used to find the optimal local alignment between two sequences. When searching sequence databases that may contain hundreds of millions of sequences, this algorithm becomes computationally expensive. In this paper, we focused on accelerating the Smith-Waterman algorithm by using FPGA-based hardware that implemented a module for computing the score of a single cell of the SW matrix. Then, using a grid of this module, the entire SW matrix was computed at the speed of field propagation through the FPGA circuit. These modifications dramatically accelerated the algorithm's computation time by up to 160-fold compared to a pure software implementation running on the same FPGA with an Altera Nios II soft processor. This design of FPGA-accelerated hardware offers a promising new direction for improving the computational performance of genomic database searching.
Implementation of Parallel Dynamic Simulation on Shared-Memory vs. Distributed-Memory Environments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Shuangshuang; Chen, Yousu; Wu, Di
2015-12-09
Power system dynamic simulation computes the system response to a sequence of large disturbances, such as sudden changes in generation or load, or a network short circuit followed by protective branch switching operations. It consists of a large set of differential and algebraic equations, which is computationally intensive and challenging to solve using a single-processor-based dynamic simulation solution. High-performance computing (HPC) based parallel computing is a very promising technology to speed up the computation and facilitate the simulation process. This paper presents two different parallel implementations of power grid dynamic simulation using Open Multi-Processing (OpenMP) on a shared-memory platform, and Message Passing Interface (MPI) on distributed-memory clusters, respectively. The differences between the parallel simulation algorithms and architectures of the two HPC technologies are illustrated, and their performances for running parallel dynamic simulation are compared and demonstrated.
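For readers unfamiliar with the distributed-memory pattern described above, the following is a minimal, hedged sketch using mpi4py: each rank advances its own share of simplified generator states and the results are gathered at the end. The toy swing dynamics and the omission of the network coupling are simplifications made purely to keep the illustration self-contained; they do not reflect the paper's solver:

```python
# Hedged sketch of the distributed-memory (MPI) pattern: each rank integrates its
# share of generator states, then results are gathered on rank 0. Real dynamic
# simulation couples generators through the network solution; that coupling is
# omitted here. Run with e.g.:  mpiexec -n 4 python dynsim_mpi.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_gen, dt, steps = 64, 0.01, 500
local = np.array_split(np.arange(n_gen), size)[rank]   # generators owned by this rank
delta = np.zeros(len(local))                            # rotor angles
omega = np.zeros(len(local))                            # speed deviations
p_m = 0.8 + 0.01 * local                                # assumed mechanical power inputs

for _ in range(steps):
    # toy swing dynamics: domega/dt = Pm - D*omega - sin(delta), ddelta/dt = omega
    omega += dt * (p_m - 0.1 * omega - np.sin(delta))
    delta += dt * omega

all_delta = comm.gather(delta, root=0)                  # collect the distributed state
if rank == 0:
    print("final angles:", np.concatenate(all_delta)[:5])
```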
WinHPC System Programming | High-Performance Computing | NREL
Programming the WinHPC system: learn how to build and run an MPI application, including where the message passing interface header (mpi.h) and library (msmpi.lib) are located. To build from the command line, open the Intel C++ build environment (Start > Intel Software Development Tools > Intel C++ Compiler Professional > C++ Build Environment) for the applications you intend to run.
On Computing Breakpoint Distances for Genomes with Duplicate Genes.
Shao, Mingfu; Moret, Bernard M E
2017-06-01
A fundamental problem in comparative genomics is to compute the distance between two genomes in terms of their higher-level organization (given by genes or syntenic blocks). For two genomes without duplicate genes, we can easily define (and almost always efficiently compute) a variety of distance measures, but the problem is NP-hard under most models when genomes contain duplicate genes. To tackle duplicate genes, three formulations (exemplar, maximum matching, and any matching) have been proposed, all of which aim to build a matching between homologous genes so as to minimize some distance measure. Of the many distance measures, the breakpoint distance (the number of nonconserved adjacencies) was the first one to be studied and remains of significant interest because of its simplicity and model-free property. The three breakpoint distance problems corresponding to the three formulations have been widely studied. Although we provided a solution last year for the exemplar problem that runs very fast on full genomes, computing optimal solutions for the other two problems has remained challenging. In this article, we describe very fast, exact algorithms for these two problems. Our algorithms rely on a compact integer-linear program that we further simplify by developing an algorithm to remove variables, based on new results on the structure of adjacencies and matchings. Through extensive experiments using both simulations and biological data sets, we show that our algorithms run very fast (in seconds) on mammalian genomes and scale well beyond. We also apply these algorithms (as well as the classic orthology tool MSOAR) to create orthology assignments, then compare their quality in terms of both accuracy and coverage. We find that our algorithm for the "any matching" formulation significantly outperforms other methods in terms of accuracy while achieving nearly maximum coverage.
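For genomes without duplicate genes, the breakpoint distance reduces to counting nonconserved adjacencies, which the sketch below illustrates for signed linear chromosomes. Handling duplicate genes, as the article does, additionally requires choosing a matching between homologs and is not attempted here:

```python
# Hedged sketch: breakpoint distance for two genomes *without* duplicate genes,
# i.e. the number of adjacencies of the first genome not conserved in the second.
# Signed genes are written as integers (sign = strand).

def canonical(a, b):
    # an adjacency read on the reverse strand is (-b, -a); keep one canonical form
    return min((a, b), (-b, -a))

def adjacencies(genome):
    return {canonical(a, b) for a, b in zip(genome, genome[1:])}

def breakpoint_distance(g1, g2):
    # adjacencies of g1 that are not conserved in g2
    return len(adjacencies(g1) - adjacencies(g2))

if __name__ == "__main__":
    A = [1, 2, 3, 4, 5]
    B = [1, -3, -2, 4, 5]              # inversion of the segment (2, 3)
    print(breakpoint_distance(A, B))   # 2 breakpoints flank the inverted segment
```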
Computational simulation and aerodynamic sensitivity analysis of film-cooled turbines
NASA Astrophysics Data System (ADS)
Massa, Luca
A computational tool is developed for the time-accurate sensitivity analysis of the stage performance of hot-gas, unsteady turbine components. An existing turbomachinery internal flow solver is adapted to the high-temperature environment typical of the hot section of jet engines. A real gas model and film cooling capabilities are successfully incorporated in the software. The modifications to the existing algorithm are described; both the theoretical model and the numerical implementation are validated. The accuracy of the code in evaluating turbine stage performance is tested using a turbine geometry typical of the last stage of aeronautical jet engines. The results of the performance analysis show that the predictions differ from the experimental data by less than 3%. A reliable grid generator, applicable to the domain discretization of the internal flow field of axial-flow turbines, is developed. A sensitivity analysis capability is added to the flow solver, by rendering it able to accurately evaluate the derivatives of the time-varying output functions. The complex Taylor series expansion (CTSE) technique is reviewed. Two variants of it are used to demonstrate the accuracy and time dependency of the differentiation process. The results are compared with finite difference (FD) approximations. The CTSE is more accurate than the FD, but less efficient. A "black box" differentiation of the source code, resulting from the automated application of the CTSE, generates high-fidelity sensitivity algorithms, but with low computational efficiency and high memory requirements. New formulations of the CTSE are proposed and applied. Selective differentiation of the method for solving the non-linear implicit residual equation leads to sensitivity algorithms with the same accuracy but improved run time. The time-dependent sensitivity derivatives are computed in run times comparable to the ones required by the FD approach.
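The core idea behind the complex Taylor series expansion is the complex-step derivative, which can be illustrated independently of the flow solver. The test function and step sizes below are generic assumptions, not quantities from the turbine analysis:

```python
# Minimal illustration of the complex-step (complex Taylor series) derivative:
# f'(x) ≈ Im(f(x + ih)) / h, which avoids the subtractive cancellation that limits
# finite differences. The test function is an arbitrary stand-in, not the flow solver.
import cmath

def complex_step(f, x, h=1e-30):
    return (f(x + 1j * h)).imag / h

def central_diff(f, x, h=1e-6):
    return ((f(x + h) - f(x - h)) / (2.0 * h)).real

f = lambda x: cmath.exp(x) / cmath.sqrt(cmath.sin(x) ** 3 + cmath.cos(x) ** 3)
x0 = 1.5
print("complex step :", complex_step(f, x0))
print("central diff :", central_diff(f, x0))
```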
Fuller, Joel T.; Buckley, Jonathan D.; Tsiros, Margarita D.; Brown, Nicholas A. T.; Thewlis, Dominic
2016-01-01
Context: Minimalist shoes have been suggested as a way to alter running biomechanics to improve running performance and reduce injuries. However, to date, researchers have only considered the effect of minimalist shoes at slow running speeds. Objective: To determine if runners change foot-strike pattern and alter the distribution of mechanical work at the knee and ankle joints when running at a fast speed in minimalist shoes compared with conventional running shoes. Design: Crossover study. Setting: Research laboratory. Patients or Other Participants: Twenty-six trained runners (age = 30.0 ± 7.9 years [age range, 18−40 years], height = 1.79 ± 0.06 m, mass = 75.3 ± 8.2 kg, weekly training distance = 27 ± 15 km) who ran with a habitual rearfoot foot-strike pattern and had no experience running in minimalist shoes. Intervention(s): Participants completed overground running trials at 18 km/h in minimalist and conventional shoes. Main Outcome Measure(s): Sagittal-plane kinematics and joint work at the knee and ankle joints were computed using 3-dimensional kinematic and ground reaction force data. Foot-strike pattern was classified as rearfoot, midfoot, or forefoot strike based on strike index and ankle angle at initial contact. Results: We observed no difference in foot-strike classification between shoes (χ²(1) = 2.29, P = .13). Ankle angle at initial contact was less (2.46° versus 7.43°; t(25) = 3.34, P = .003) and strike index was greater (35.97% versus 29.04%; t(25) = 2.38, P = .03) when running in minimalist shoes compared with conventional shoes. We observed greater negative (52.87 J versus 42.46 J; t(24) = 2.29, P = .03) and positive work (68.91 J versus 59.08 J; t(24) = 2.65, P = .01) at the ankle but less negative (59.01 J versus 67.02 J; t(24) = 2.25, P = .03) and positive work (40.37 J versus 47.09 J; t(24) = 2.11, P = .046) at the knee with minimalist shoes compared with conventional shoes. Conclusions: Running in minimalist shoes at a fast speed caused a redistribution of work from the knee to the ankle joint. This finding suggests that runners changing from conventional to minimalist shoes for short-distance races could be at an increased risk of ankle and calf injuries but a reduced risk of knee injuries. PMID:27834504
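As a rough illustration of how the joint-work quantities reported above are obtained, the sketch below integrates joint power (moment times angular velocity) over stance and splits it into positive and negative work. The moment and angular-velocity traces are synthetic placeholders, not data or methods taken from the study:

```python
# Hedged sketch: positive and negative joint work from sagittal-plane joint moment
# and angular velocity (power = moment * angular velocity, integrated over stance).
import numpy as np

def joint_work(moment, ang_vel, dt):
    power = moment * ang_vel                       # instantaneous joint power (W)
    positive = float(np.sum(power[power > 0]) * dt)    # rectangle-rule integration
    negative = float(np.sum(power[power < 0]) * dt)
    return positive, negative                      # J (negative value = absorption)

t = np.linspace(0, 0.25, 251)                      # an assumed 250 ms stance phase
moment = 150 * np.sin(np.pi * t / 0.25)            # synthetic ankle moment (N*m)
ang_vel = 4 * np.cos(2 * np.pi * t / 0.25)         # synthetic angular velocity (rad/s)
pos, neg = joint_work(moment, ang_vel, dt=t[1] - t[0])
print(f"positive work {pos:.1f} J, negative work {neg:.1f} J")
```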
Accelerating calculations of RNA secondary structure partition functions using GPUs
2013-01-01
Background: RNA performs many diverse functions in the cell in addition to its role as a messenger of genetic information. These functions depend on its ability to fold to a unique three-dimensional structure determined by the sequence. The conformation of RNA is in part determined by its secondary structure, or the particular set of contacts between pairs of complementary bases. Prediction of the secondary structure of RNA from its sequence is therefore of great interest, but can be computationally expensive. In this work we accelerate computations of base-pair probabilities using parallel graphics processing units (GPUs). Results: Calculation of the probabilities of base pairs in RNA secondary structures using nearest-neighbor standard free energy change parameters has been implemented using CUDA to run on hardware with multiprocessor GPUs. A modified set of recursions was introduced, which reduces memory usage by about 25%. GPUs are fastest in single precision, and for some hardware, restricted to single precision. This may introduce significant roundoff error. However, deviations in base-pair probabilities calculated using single precision were found to be negligible compared to those resulting from shifting the nearest-neighbor parameters by a random amount of magnitude similar to their experimental uncertainties. For large sequences running on our particular hardware, the GPU implementation reduces execution time by a factor of close to 60 compared with an optimized serial implementation, and by a factor of 116 compared with the original code. Conclusions: Using GPUs can greatly accelerate computation of RNA secondary structure partition functions, allowing calculation of base-pair probabilities for large sequences in a reasonable amount of time, with a negligible compromise in accuracy due to working in single precision. The source code is integrated into the RNAstructure software package and available for download at http://rna.urmc.rochester.edu. PMID:24180434
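The partition-function recursions accelerated in this work use nearest-neighbor free energies; as a much-simplified stand-in that shares the same O(n³) loop structure, the Nussinov-style recursion below merely maximizes the number of base pairs rather than computing pairing probabilities:

```python
# Simplified stand-in: a Nussinov-style dynamic program that maximizes the number
# of base pairs. The paper's GPU code implements partition-function recursions with
# nearest-neighbor energies; only the triple-loop structure is analogous.

def pairs(a, b):
    return {a, b} in ({"A", "U"}, {"G", "C"}, {"G", "U"})

def nussinov(seq, min_loop=3):
    n = len(seq)
    M = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):            # increasing subsequence length
        for i in range(n - span):
            j = i + span
            best = M[i][j - 1]                     # j unpaired
            for k in range(i, j - min_loop):       # j paired with some k
                if pairs(seq[k], seq[j]):
                    left = M[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + M[k + 1][j - 1])
            M[i][j] = best
    return M[0][n - 1]

print(nussinov("GGGAAAUCC"))   # maximum base pairs for a short hairpin
```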
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kopp, H.J.; Mortensen, G.A.
1978-04-01
Approximately 60% of the full CDC 6600/7600 DATATRAN 2.0 capability was made operational on IBM 360/370 equipment. Sufficient capability was made operational to demonstrate adequate performance for modular program linking applications. Also demonstrated were the basic capabilities and performance required to support moderate-sized data base applications and moderately active scratch input/output applications. Approximately one to two calendar years are required to develop DATATRAN 2.0 capabilities fully for the entire spectrum of applications proposed. Included in the next stage of conversion should be syntax checking and syntax conversion features that would foster greater FORTRAN compatibility between IBM- and CDC-developed modules. The batch portion of the JOSHUA Modular System, which was developed by Savannah River Laboratory to run on an IBM computer, was examined for the feasibility of conversion to run on a Control Data Corporation (CDC) computer. Portions of the JOSHUA Precompiler were changed so as to be operable on the CDC computer. The Data Manager and Batch Monitor were also examined for conversion feasibility, but no changes were made in them. It appears to be feasible to convert the batch portion of the JOSHUA Modular System to run on a CDC computer with an estimated additional two to three man-years of effort. 9 tables.
Control mechanism of double-rotator-structure ternary optical computer
NASA Astrophysics Data System (ADS)
Kai, SONG; Liping, YAN
2017-03-01
The double-rotator-structure ternary optical processor (DRSTOP) has two key characteristics, namely giant data-bit parallel computing and a reconfigurable processor, which allow it to handle thousands of data bits in parallel and to run much faster than electronic computers and other optical computing systems reported so far. In order to put DRSTOP into practical application, this paper establishes a series of methods, namely a task classification method, a data-bit allocation method, a control-information generation method, a control-information formatting and sending method, and a method for obtaining decoded results. These methods form the control mechanism of DRSTOP. This control mechanism turns DRSTOP into an automated computing platform. Compared with traditional calculation tools, the DRSTOP computing platform can ease the contradiction between high energy consumption and big-data computing because it greatly reduces the cost of communications and I/O. Finally, the paper designed a set of experiments for the DRSTOP control mechanism to verify its feasibility and correctness. Experimental results showed that the control mechanism is correct, feasible and efficient.
Identifying the impact of G-quadruplexes on Affymetrix 3' arrays using cloud computing.
Memon, Farhat N; Owen, Anne M; Sanchez-Graillet, Olivia; Upton, Graham J G; Harrison, Andrew P
2010-01-15
A tetramer quadruplex structure is formed by four parallel strands of DNA/RNA containing runs of guanine. These quadruplexes are able to form because guanine can Hoogsteen hydrogen bond to other guanines, and a tetrad of guanines can form a stable arrangement. Recently we have discovered that probes on Affymetrix GeneChips that contain runs of guanine do not measure gene expression reliably. We associate this finding with the likelihood that quadruplexes are forming on the surface of GeneChips. In order to cope with the rapidly expanding size of GeneChip array datasets in the public domain, we are exploring the use of cloud computing to replicate our experiments on 3' arrays to look at the effect of the location of G-spots (runs of guanines). Cloud computing is a recently introduced high-performance solution that takes advantage of the computational infrastructure of large organisations such as Amazon and Google. We expect that cloud computing will become widely adopted because it enables bioinformaticians to avoid capital expenditure on expensive computing resources and to only pay a cloud computing provider for what is used. Moreover, as well as financial efficiency, cloud computing is an ecologically friendly technology; it enables efficient data-sharing and we expect it to be faster for development purposes. Here we propose the advantageous use of cloud computing to perform a large data-mining analysis of public domain 3' arrays.
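A minimal sketch of the G-spot screening idea is shown below: probes are flagged when they contain a run of consecutive guanines. The threshold of four G's is a common quadruplex-motif choice and is an assumption here, not necessarily the criterion used in the study:

```python
# Hedged sketch: flag probe sequences that contain a run of guanines ("G-spots").
# The run length of 4 is an assumed threshold for illustration only.
import re

G_RUN = re.compile(r"G{4,}")

def has_g_run(probe_seq):
    return bool(G_RUN.search(probe_seq.upper()))

probes = ["ATCGGGGTACTGATCGAATCGATCG", "ATCGATCGTACGATCGAATCGATCG"]
for p in probes:
    print(p, "->", "contains G-run" if has_g_run(p) else "no G-run")
```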
Katz, Jonathan E
2017-01-01
Laboratories tend to be amenable environments for long-term reliable operation of scientific measurement equipment. Indeed, it is not uncommon to find equipment 5, 10, or even 20+ years old still being routinely used in labs. Unfortunately, the Achilles heel for many of these devices is the control/data acquisition computer. Often these computers run older operating systems (e.g., Windows XP) and, while they might only use standard network, USB or serial ports, they require proprietary software to be installed. Even if the original installation disks can be found, it is a burdensome process to reinstall and is fraught with "gotchas" that can derail the process: lost license keys, incompatible hardware, forgotten configuration settings, etc. If you have legacy instrumentation running, the computer is a ticking time bomb waiting to put a halt to your operation. In this chapter, I describe how to virtualize your currently running control computer. This virtualized computer "image" is easy to maintain, easy to back up and easy to redeploy. I have used this multiple times in my own lab to greatly improve the robustness of my legacy devices. After completing the steps in this chapter, you will have your original control computer as well as a virtual instance of that computer with all the software installed, ready to control your hardware should your original computer ever be decommissioned.
Automatic mathematical modeling for real time simulation system
NASA Technical Reports Server (NTRS)
Wang, Caroline; Purinton, Steve
1988-01-01
A methodology for automatic mathematical modeling and generating simulation models is described. The models will be verified by running in a test environment using standard profiles, with the results compared against known results. The major objective is to create a user-friendly environment for engineers to design, maintain, and verify their model and also automatically convert the mathematical model into conventional code for conventional computation. A demonstration program was designed for modeling the Space Shuttle Main Engine Simulation. It is written in LISP and MACSYMA and runs on a Symbolics 3670 Lisp Machine. The program provides a very friendly and well organized environment for engineers to build a knowledge base for base equations and general information. It contains an initial set of component process elements for the Space Shuttle Main Engine Simulation and a questionnaire that allows the engineer to answer a set of questions to specify a particular model. The system is then able to automatically generate the model and FORTRAN code. The future goal, which is under construction, is to download the FORTRAN code to a VAX/VMS system for conventional computation. The SSME mathematical model will be verified in a test environment and the solution compared with the real data profile. The use of artificial intelligence techniques has shown that the simulation modeling process can be simplified.
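A present-day analog of the automatic code-generation step can be sketched with SymPy in place of MACSYMA: a symbolic model equation is converted into Fortran source that a conventional solver could compile. The equation below is a generic placeholder, not an SSME model equation:

```python
# Hedged sketch of symbolic-to-Fortran code generation using SymPy (a modern analog
# of the MACSYMA-based step described above). The relation is a placeholder.
import sympy as sp

mdot, p_c, A_t, c_star = sp.symbols("mdot p_c A_t c_star", positive=True)
model_eq = sp.Eq(mdot, p_c * A_t / c_star)        # a simple nozzle mass-flow relation

fortran_src = sp.fcode(model_eq.rhs, assign_to="mdot",
                       source_format="free", standard=95)
print(fortran_src)                                # e.g.  mdot = A_t*p_c/c_star
```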
Resource Efficient Hardware Architecture for Fast Computation of Running Max/Min Filters
Torres-Huitzil, Cesar
2013-01-01
Running max/min filters on rectangular kernels are widely used in many digital signal and image processing applications. Filtering with a k × k kernel requires k² − 1 comparisons per sample for a direct implementation; thus, performance scales expensively with the kernel size k. Faster computations can be achieved by kernel decomposition and using constant-time one-dimensional algorithms on custom hardware. This paper presents a hardware architecture for real-time computation of running max/min filters based on the van Herk/Gil-Werman (HGW) algorithm. The proposed architecture design uses less computation and memory resources than previously reported architectures when targeted to Field Programmable Gate Array (FPGA) devices. Implementation results show that the architecture is able to compute max/min filters, on 1024 × 1024 images with up to 255 × 255 kernels, in around 8.4 milliseconds, 120 frames per second, at a clock frequency of 250 MHz. The implementation is highly scalable for the kernel size with good performance/area tradeoff suitable for embedded applications. The applicability of the architecture is shown for local adaptive image thresholding. PMID:24288456
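The one-dimensional heart of the HGW algorithm can be sketched as follows: the signal is split into blocks of the kernel length, per-block prefix and suffix maxima are accumulated, and each window maximum is then the maximum of two table lookups, independent of kernel size. A 2-D k × k filter decomposes into a row pass followed by a column pass. This software sketch only illustrates the recurrence that the hardware architecture parallelizes:

```python
# Hedged software sketch of the van Herk/Gil-Werman running maximum in 1-D.
import numpy as np

def hgw_running_max(x, k):
    """Running maximum over windows of length k (output length n - k + 1)."""
    x = np.asarray(x)
    n = len(x)
    pad = (-n) % k
    xp = np.concatenate([x, np.full(pad, -np.inf)])          # pad to a multiple of k
    blocks = xp.reshape(-1, k)
    R = np.maximum.accumulate(blocks, axis=1).ravel()                    # prefix max per block
    L = np.maximum.accumulate(blocks[:, ::-1], axis=1)[:, ::-1].ravel()  # suffix max per block
    i = np.arange(n - k + 1)
    return np.maximum(L[i], R[i + k - 1])        # two lookups per output sample

x = np.array([3, 1, 4, 1, 5, 9, 2, 6], dtype=float)
print(hgw_running_max(x, 3))                     # [4. 4. 5. 9. 9. 9.]
```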
Energy Frontier Research With ATLAS: Final Report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Butler, John; Black, Kevin; Ahlen, Steve
2016-06-14
The Boston University (BU) group is playing key roles across the ATLAS experiment: in detector operations, the online trigger, the upgrade, computing, and physics analysis. Our team has been critical to the maintenance and operations of the muon system since its installation. During Run 1 we led the muon trigger group and that responsibility continues into Run 2. BU maintains and operates the ATLAS Northeast Tier 2 computing center. We are actively engaged in the analysis of ATLAS data from Run 1 and Run 2. Physics analyses we have contributed to include Standard Model measurements (W and Z cross sections, tt̄ differential cross sections, WWW* production), evidence for the Higgs decaying to τ⁺τ⁻, and searches for new phenomena (technicolor, Z' and W', vector-like quarks, dark matter).
Contributions to Leg Stiffness in High- Compared with Low-Arched Athletes.
Powell, Douglas W; Paquette, Max R; Williams, D S Blaise
2017-08-01
High-arched (HA) athletes exhibit greater lower extremity stiffness during functional tasks than low-arched (LA) athletes. The contributions of skeletal and muscular structures to stiffness may underlie the distinct injury patterns observed in these athletes. The purpose of this study was to compare skeletal and muscular contributions to leg stiffness in HA and LA athletes during running and landing tasks. Ten HA and 10 LA female athletes performed five overground running trials at a self-selected pace and five step-off bilateral landing trials from a height of 30 cm. Three-dimensional kinematics and kinetics were collected using a motion capture system and a force platform. Leg stiffness and its skeletal and muscular contributions were calculated. Independent t-tests were used to compare variable means between arch-type groups, and Cohen's d values were computed to assess effect sizes of mean differences. In running, HA athletes had greater leg stiffness (P = 0.010, d = 1.03) and skeletal stiffness (P = 0.016, d = 0.81), although there were no differences in muscular stiffness (P = 0.134). During landing, HA had greater leg stiffness (P = 0.015, d = 1.06) and skeletal stiffness (P < 0.001, d = 1.84), whereas LA athletes had greater muscular stiffness (P = 0.025, d = 0.96). These findings demonstrate that HA athletes place a greater reliance on skeletal structures for load attenuation during running and landing, whereas LA athletes rely more heavily on muscle contributions during landing only. These findings may provide insight into the distinct injury patterns observed in HA and LA athletes.
Automatic Data Filter Customization Using a Genetic Algorithm
NASA Technical Reports Server (NTRS)
Mandrake, Lukas
2013-01-01
This work predicts whether a retrieval algorithm will usefully determine CO2 concentration from an input spectrum of GOSAT (Greenhouse Gases Observing Satellite). This was done to eliminate needless runtime on atmospheric soundings that would never yield useful results. A space of 50 dimensions was examined for predictive power on the final CO2 results. Retrieval algorithms are frequently expensive to run, and wasted effort defeats requirements and expends needless resources. This algorithm could be used to help predict and filter unneeded runs in any computationally expensive regime. Traditional methods such as Fisher discriminant analysis and decision trees can attempt to predict whether a sounding will be properly processed. However, this work sought to detect a subsection of the dimensional space that can be simply filtered out to eliminate unwanted runs. LDAs (linear discriminant analyses) and other systems examine the entire data and judge a "best fit," giving equal weight to complex and problematic regions as well as simple, clear-cut regions. In this implementation, a genetic space of "left" and "right" thresholds outside of which all data are rejected was defined. These left/right pairs are created for each of the 50 input dimensions. A genetic algorithm then runs through countless potential filter settings using a JPL computer cluster, optimizing the tossed-out data's yield (proper vs. improper run removal) and the number of points tossed. This solution is robust to an arbitrary decision boundary within the data and avoids the global optimization problem of whole-dataset fitting using LDA or decision trees. It filters out runs that would not have produced useful CO2 values to save needless computation. This would be an algorithmic preprocessing improvement to any computationally expensive system.
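A scaled-down sketch of the thresholding idea is given below: a genetic algorithm evolves per-dimension (left, right) cut-offs and rejects samples falling outside any interval. The synthetic data, fitness weights, and GA settings are assumptions for illustration, not the GOSAT/JPL configuration:

```python
# Hedged, scaled-down sketch: evolve per-dimension (left, right) thresholds so that
# samples likely to produce improper runs are filtered out before expensive processing.
import numpy as np

rng = np.random.default_rng(0)
n, dims = 2000, 4
X = rng.normal(size=(n, dims))                       # synthetic "sounding" features
bad = (np.abs(X) > 1.8).any(axis=1)                  # pretend these runs would fail

def fitness(th):
    left, right = th[:dims], th[dims:]
    rejected = ((X < left) | (X > right)).any(axis=1)
    # reward rejecting bad runs, penalize rejecting good ones (weight is assumed)
    return np.sum(rejected & bad) - 4.0 * np.sum(rejected & ~bad)

def evolve(pop_size=60, gens=80, sigma=0.2):
    pop = rng.uniform(-3, 3, size=(pop_size, 2 * dims))
    for _ in range(gens):
        scores = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]   # truncation selection
        children = parents[rng.integers(len(parents), size=pop_size // 2)].copy()
        children += rng.normal(scale=sigma, size=children.shape)   # mutation
        pop = np.vstack([parents, children])
    return pop[np.argmax([fitness(ind) for ind in pop])]

best = evolve()
print("best fitness:", fitness(best))
```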
Open-source meteor detection software for low-cost single-board computers
NASA Astrophysics Data System (ADS)
Vida, D.; Zubović, D.; Šegon, D.; Gural, P.; Cupec, R.
2016-01-01
This work aims to overcome the current price threshold of meteor stations, which can sometimes deter meteor enthusiasts from owning one. In recent years, small card-sized computers have become widely available and are used for numerous applications. To utilize such computers for meteor work, software which can run on them is needed. In this paper we present a detailed description of newly developed open-source software for fireball and meteor detection optimized for running on low-cost single-board computers. Furthermore, an update on the development of automated open-source software which will handle video capture, fireball and meteor detection, astrometry and photometry is given.
How to Build an AppleSeed: A Parallel Macintosh Cluster for Numerically Intensive Computing
NASA Astrophysics Data System (ADS)
Decyk, V. K.; Dauger, D. E.
We have constructed a parallel cluster consisting of a mixture of Apple Macintosh G3 and G4 computers running the Mac OS, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Venkata, Manjunath Gorentla; Aderholdt, William F
The pre-exascale systems are expected to have a significant amount of hierarchical and heterogeneous on-node memory, and this trend of system architecture in extreme-scale systems is expected to continue into the exascale era. Along with hierarchical-heterogeneous memory, the system typically has a high-performing network and a compute accelerator. This system architecture is not only effective for running traditional High Performance Computing (HPC) applications (Big-Compute), but also for running data-intensive HPC applications and Big-Data applications. As a consequence, there is a growing desire to have a single system serve the needs of both Big-Compute and Big-Data applications. Though the system architecture supports the convergence of Big-Compute and Big-Data, the programming models and software layer have yet to evolve to support either hierarchical-heterogeneous memory systems or the convergence. This work provides a programming abstraction to address this problem. The programming abstraction is implemented as a software library and runs on pre-exascale and exascale systems supporting current and emerging system architectures. Using distributed data structures as a central concept, it provides (1) a simple, usable, and portable abstraction for hierarchical-heterogeneous memory and (2) a unified programming abstraction for Big-Compute and Big-Data applications.
PconsD: ultra rapid, accurate model quality assessment for protein structure prediction.
Skwark, Marcin J; Elofsson, Arne
2013-07-15
Clustering methods are often needed for accurately assessing the quality of modeled protein structures. Recent blind evaluation of quality assessment methods in CASP10 showed that there is little difference between many different methods as far as ranking models and selecting the best model are concerned. When comparing many models, the computational cost of the model comparison can become significant. Here, we present PconsD, a fast, stream-computing method for distance-driven model quality assessment that runs on consumer hardware. PconsD is at least one order of magnitude faster than other methods of comparable accuracy. The source code for PconsD is freely available at http://d.pcons.net/. Supplementary benchmarking data are also available there. arne@bioinfo.se Supplementary data are available at Bioinformatics online.
User's guide to the Octopus computer network (the SHOC manual)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schneider, C.; Thompson, D.; Whitten, G.
1977-07-18
This guide explains how to enter, run, and debug programs on the Octopus network. It briefly describes the network's operation, and directs the reader to other documents for further information. It stresses those service programs that will be most useful in the long run; ''quick'' methods that have little flexibility are not discussed. The Octopus timesharing network gives the user access to four CDC 7600 computers, two CDC STAR computers, and a broad array of peripheral equipment, from any of 800 or so remote terminals. 16 figures, 7 tables.
User's guide to the Octopus computer network (the SHOC manual)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schneider, C.; Thompson, D.; Whitten, G.
1976-10-07
This guide explains how to enter, run, and debug programs on the Octopus network. It briefly describes the network's operation, and directs the reader to other documents for further information. It stresses those service programs that will be most useful in the long run; ''quick'' methods that have little flexibility are not discussed. The Octopus timesharing network gives the user access to four CDC 7600 computers, two CDC STAR computers, and a broad array of peripheral equipment, from any of 800 or so remote terminals. 8 figures, 4 tables.
User's guide to the Octopus computer network (the SHOC manual)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schneider, C.; Thompson, D.; Whitten, G.
1975-06-02
This guide explains how to enter, run, and debug programs on the Octopus network. It briefly describes the network's operation, and directs the reader to other documents for further information. It stresses those service programs that will be most useful in the long run; ''quick'' methods that have little flexibility are not discussed. The Octopus timesharing network gives the user access to four CDC 7600 computers and a broad array of peripheral equipment, from any of 800 remote terminals. Octopus will soon include the Laboratory's STAR-100 computers. 9 figures, 5 tables. (auth)
Proteinortho: Detection of (Co-)orthologs in large-scale analysis
2011-01-01
Background: Orthology analysis is an important part of data analysis in many areas of bioinformatics such as comparative genomics and molecular phylogenetics. The ever-increasing flood of sequence data, and hence the rapidly increasing number of genomes that can be compared simultaneously, calls for efficient software tools as brute-force approaches with quadratic memory requirements become infeasible in practice. The rapid pace at which new data become available, furthermore, makes it desirable to compute genome-wide orthology relations for a given dataset rather than relying on relations listed in databases. Results: The program Proteinortho described here is a stand-alone tool that is geared towards large datasets and makes use of distributed computing techniques when run on multi-core hardware. It implements an extended version of the reciprocal best alignment heuristic. We apply Proteinortho to compute orthologous proteins in the complete set of all 717 eubacterial genomes available at NCBI at the beginning of 2009. We identified thirty proteins present in 99% of all bacterial proteomes. Conclusions: Proteinortho significantly reduces the required amount of memory for orthology analysis compared to existing tools, allowing such computations to be performed on off-the-shelf hardware. PMID:21526987
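The reciprocal best alignment heuristic at the core of Proteinortho can be illustrated in miniature: two proteins are linked only if each is the other's best-scoring hit. The score dictionaries below are invented; the real tool derives scores from all-against-all alignments and applies an extended version of this heuristic:

```python
# Hedged sketch of reciprocal best hits between two proteomes A and B.
# Scores are given directly as dictionaries for illustration only.

def best_hit(scores):
    """scores: {query: {subject: score}} -> {query: best-scoring subject}"""
    return {q: max(hits, key=hits.get) for q, hits in scores.items() if hits}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    best_ab, best_ba = best_hit(a_vs_b), best_hit(b_vs_a)
    return {(a, b) for a, b in best_ab.items() if best_ba.get(b) == a}

a_vs_b = {"a1": {"b1": 250.0, "b2": 90.0}, "a2": {"b2": 180.0}}
b_vs_a = {"b1": {"a1": 240.0}, "b2": {"a2": 175.0, "a1": 88.0}}
print(reciprocal_best_hits(a_vs_b, b_vs_a))   # both pairs are reciprocal best hits
```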
Massively parallel quantum computer simulator
NASA Astrophysics Data System (ADS)
De Raedt, K.; Michielsen, K.; De Raedt, H.; Trieu, B.; Arnold, G.; Richter, M.; Lippert, Th.; Watanabe, H.; Ito, N.
2007-01-01
We describe portable software to simulate universal quantum computers on massively parallel computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as an IBM BlueGene/L, an IBM Regatta p690+, a Hitachi SR11000/J1, a Cray X1E, an SGI Altix 3700 and clusters of PCs running Windows XP. We study the performance of the software by simulating quantum computers containing up to 36 qubits, using up to 4096 processors and up to 1 TB of memory. Our results demonstrate that the simulator exhibits nearly ideal scaling as a function of the number of processors and suggest that the simulation software described in this paper may also serve as a benchmark for testing high-end parallel computers.
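What such a simulator does at its core can be sketched with a small state-vector implementation: the 2^n amplitudes are held in memory and single-qubit gates are applied by reshaping. Distributing the amplitude vector across thousands of processors, as the paper describes, is not shown:

```python
# Minimal single-node state-vector sketch; the memory footprint of the 2^n complex
# amplitudes is what forces the distributed implementation described above.
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)      # Hadamard gate

def apply_1q_gate(state, gate, qubit, n_qubits):
    """Apply a 2x2 gate to one qubit of an n-qubit state vector."""
    psi = state.reshape([2] * n_qubits)
    psi = np.moveaxis(psi, qubit, 0)              # bring the target axis to the front
    psi = np.tensordot(gate, psi, axes=([1], [0]))
    psi = np.moveaxis(psi, 0, qubit)
    return psi.reshape(-1)

n = 3
state = np.zeros(2 ** n, dtype=complex)
state[0] = 1.0                                    # |000>
for q in range(n):                                # Hadamard on every qubit
    state = apply_1q_gate(state, H, q, n)
print(np.round(np.abs(state) ** 2, 3))            # uniform superposition: all 0.125
```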
JAX Colony Management System (JCMS): an extensible colony and phenotype data management system.
Donnelly, Chuck J; McFarland, Mike; Ames, Abigail; Sundberg, Beth; Springer, Dave; Blauth, Peter; Bult, Carol J
2010-04-01
The Jackson Laboratory Colony Management System (JCMS) is a software application for managing data and information related to research mouse colonies, associated biospecimens, and experimental protocols. JCMS runs directly on computers that run one of the PC Windows operating systems, but can be accessed via web browser interfaces from any computer running a Windows, Macintosh, or Linux operating system. JCMS can be configured for a single user or multiple users in small- to medium-size work groups. The target audience for JCMS includes laboratory technicians, animal colony managers, and principal investigators. The application provides operational support for colony management and experimental workflows, sample and data tracking through transaction-based data entry forms, and date-driven work reports. Flexible query forms allow researchers to retrieve database records based on user-defined criteria. Recent advances in handheld computers with integrated barcode readers, middleware technologies, web browsers, and wireless networks add to the utility of JCMS by allowing real-time access to the database from any networked computer.
The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code.
Kunkel, Susanne; Schenck, Wolfram
2017-01-01
NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it was part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling.
The NEST Dry-Run Mode: Efficient Dynamic Analysis of Neuronal Network Simulation Code
Kunkel, Susanne; Schenck, Wolfram
2017-01-01
NEST is a simulator for spiking neuronal networks that commits to a general purpose approach: It allows for high flexibility in the design of network models, and its applications range from small-scale simulations on laptops to brain-scale simulations on supercomputers. Hence, developers need to test their code for various use cases and ensure that changes to code do not impair scalability. However, running a full set of benchmarks on a supercomputer takes up precious compute-time resources and can entail long queuing times. Here, we present the NEST dry-run mode, which enables comprehensive dynamic code analysis without requiring access to high-performance computing facilities. A dry-run simulation is carried out by a single process, which performs all simulation steps except communication as if it was part of a parallel environment with many processes. We show that measurements of memory usage and runtime of neuronal network simulations closely match the corresponding dry-run data. Furthermore, we demonstrate the successful application of the dry-run mode in the areas of profiling and performance modeling. PMID:28701946
ATLAS@Home: Harnessing Volunteer Computing for HEP
NASA Astrophysics Data System (ADS)
Adam-Bourdarios, C.; Cameron, D.; Filipčič, A.; Lancon, E.; Wu, W.; ATLAS Collaboration
2015-12-01
A recent common theme in HEP computing is the exploitation of opportunistic resources in order to provide the maximum statistics possible for Monte Carlo simulation. Volunteer computing has been used over the last few years in many other scientific fields and by CERN itself to run simulations of the LHC beams. The ATLAS@Home project was started to allow volunteers to run simulations of collisions in the ATLAS detector. So far many thousands of members of the public have signed up to contribute their spare CPU cycles for ATLAS, and there is potential for volunteer computing to provide a significant fraction of ATLAS computing resources. Here we describe the design of the project, the lessons learned so far and the future plans.
Understanding the Performance and Potential of Cloud Computing for Scientific Applications
Sadooghi, Iman; Martin, Jesus Hernandez; Li, Tonglin; ...
2015-02-19
Commercial clouds bring a great opportunity to the scientific computing area. Scientific applications usually require significant resources; however, not all scientists have access to sufficient high-end computing systems, many of which can be found in the Top500 list. Cloud computing has gained the attention of scientists as a competitive resource to run HPC applications at a potentially lower cost. But as a different infrastructure, it is unclear whether clouds are capable of running scientific applications with a reasonable performance per money spent. This work studies the performance of public clouds and places this performance in context to price. We evaluate the raw performance of different services of the AWS cloud in terms of the basic resources, such as compute, memory, network and I/O. We also evaluate the performance of scientific applications running in the cloud. This paper aims to assess the ability of the cloud to perform well, as well as to evaluate the cost of the cloud for running scientific applications. We developed a full set of metrics and conducted a comprehensive performance evaluation over the Amazon cloud. We evaluated EC2, S3, EBS and DynamoDB among the many Amazon AWS services. We evaluated the memory sub-system performance with CacheBench, the network performance with iperf, processor and network performance with the HPL benchmark application, and shared storage with NFS and PVFS in addition to S3. We also evaluated a real scientific computing application through the Swift parallel scripting system at scale. Armed with both detailed benchmarks to gauge expected performance and a detailed monetary cost analysis, we expect this paper will be a recipe cookbook for scientists to help them decide where to deploy and run their scientific applications between public clouds, private clouds, or hybrid clouds.
Understanding the Performance and Potential of Cloud Computing for Scientific Applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sadooghi, Iman; Martin, Jesus Hernandez; Li, Tonglin
Commercial clouds bring a great opportunity to the scientific computing area. Scientific applications usually require significant resources; however, not all scientists have access to sufficient high-end computing systems, many of which can be found in the Top500 list. Cloud computing has gained the attention of scientists as a competitive resource to run HPC applications at a potentially lower cost. But as a different infrastructure, it is unclear whether clouds are capable of running scientific applications with a reasonable performance per money spent. This work studies the performance of public clouds and places this performance in context to price. We evaluate the raw performance of different services of the AWS cloud in terms of the basic resources, such as compute, memory, network and I/O. We also evaluate the performance of scientific applications running in the cloud. This paper aims to assess the ability of the cloud to perform well, as well as to evaluate the cost of the cloud for running scientific applications. We developed a full set of metrics and conducted a comprehensive performance evaluation over the Amazon cloud. We evaluated EC2, S3, EBS and DynamoDB among the many Amazon AWS services. We evaluated the memory sub-system performance with CacheBench, the network performance with iperf, processor and network performance with the HPL benchmark application, and shared storage with NFS and PVFS in addition to S3. We also evaluated a real scientific computing application through the Swift parallel scripting system at scale. Armed with both detailed benchmarks to gauge expected performance and a detailed monetary cost analysis, we expect this paper will be a recipe cookbook for scientists to help them decide where to deploy and run their scientific applications between public clouds, private clouds, or hybrid clouds.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Faillace, E.R.; Cheng, J.J.; Yu, C.
A series of benchmarking runs were conducted so that results obtained with the RESRAD code could be compared against those obtained with six pathway analysis models used to determine the radiation dose to an individual living on a radiologically contaminated site. The RESRAD computer code was benchmarked against five other computer codes - GENII-S, GENII, DECOM, PRESTO-EPA-CPG, and PATHRAE-EPA - and the uncodified methodology presented in the NUREG/CR-5512 report. Estimated doses for the external gamma pathway; the dust inhalation pathway; and the soil, food, and water ingestion pathways were calculated for each methodology by matching, to the extent possible, input parameters such as occupancy, shielding, and consumption factors.
Pocket computers: a new aid to nutritional support.
Colley, C M; Fleck, A; Howard, J P
1985-01-01
A program has been written to run on a pocket computer (Sharp PC-1500) that can be used at the bedside to predict the nutritional requirements of patients with a wide range of clinical conditions. The predictions of the program showed good correlation with measured values for energy and nitrogen requirements. The program was used, with good results, in the management of over 100 patients needing nutritional support. The calculation of nutritional requirements for each patient individually facilitates more appropriate treatment and may also produce financial savings when compared with administration of a standard feeding regimen to all patients. PMID:3922512
Computer simulation of multigrid body dynamics and control
NASA Technical Reports Server (NTRS)
Swaminadham, M.; Moon, Young I.; Venkayya, V. B.
1990-01-01
The objective is to set up and analyze benchmark problems on multibody dynamics and to verify the predictions of two multibody computer simulation codes. TREETOPS and DISCOS have been used to run three example problems - a one-degree-of-freedom spring-mass-dashpot system, an inverted pendulum system, and a triple pendulum. To study the dynamics and control interaction, an inverted planar pendulum with an external body force and a torsional control spring was modeled as a hinge-connected two-rigid-body system. TREETOPS and DISCOS were used to carry out the time-history simulation of this problem. System state-space variables and their time derivatives from the two simulation codes were compared.
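The first benchmark, a one-degree-of-freedom spring-mass-dashpot system, is easy to reproduce and check against its closed-form solution, which is presumably why it serves as a baseline. The parameter values in the sketch below are illustrative assumptions, not those of the benchmark:

```python
# Hedged sketch of a spring-mass-dashpot benchmark: numerical integration compared
# against the analytical under-damped solution as a sanity check.
import numpy as np
from scipy.integrate import solve_ivp

m, c, k = 1.0, 0.4, 25.0                    # mass, damping, stiffness (assumed)

def rhs(t, y):
    x, v = y
    return [v, (-c * v - k * x) / m]        # m*x'' + c*x' + k*x = 0

sol = solve_ivp(rhs, (0.0, 5.0), [0.1, 0.0], max_step=0.01)

wn, zeta = np.sqrt(k / m), c / (2 * np.sqrt(k * m))
wd = wn * np.sqrt(1 - zeta ** 2)
x_exact = 0.1 * np.exp(-zeta * wn * sol.t) * (np.cos(wd * sol.t)
            + zeta * wn / wd * np.sin(wd * sol.t))
print("max abs error vs analytical:", np.max(np.abs(sol.y[0] - x_exact)))
```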
Computing shifts to monitor ATLAS distributed computing infrastructure and operations
NASA Astrophysics Data System (ADS)
Adam, C.; Barberis, D.; Crépé-Renaudin, S.; De, K.; Fassi, F.; Stradling, A.; Svatos, M.; Vartapetian, A.; Wolters, H.
2017-10-01
The ATLAS Distributed Computing (ADC) group established a new Computing Run Coordinator (CRC) shift at the start of LHC Run 2 in 2015. The main goal was to rely on a person with a good overview of the ADC activities to ease the ADC experts' workload. The CRC shifter keeps track of ADC tasks related to their fields of expertise and responsibility. At the same time, the shifter maintains a global view of the day-to-day operations of the ADC system. During Run 1, this task was accomplished by a member of the expert team called the ADC Manager on Duty (AMOD), a position that was removed during the shutdown period due to the reduced number and availability of ADC experts foreseen for Run 2. The CRC position was proposed to cover some of the AMOD's former functions, while allowing more people involved in computing to participate. In this way, CRC shifters help with the training of future ADC experts. The CRC shifters coordinate daily ADC shift operations, including tracking open issues, reporting, and representing ADC in relevant meetings. The CRC also facilitates communication between the ADC experts team and the other ADC shifters. These include the Distributed Analysis Support Team (DAST), which is the first point of contact for addressing all distributed analysis questions, and the ATLAS Distributed Computing Shifters (ADCoS), which check and report problems in central services, sites, Tier-0 export, data transfers and production tasks. Finally, the CRC looks at the level of ADC activities on a weekly or monthly timescale to ensure that ADC resources are used efficiently.
Universe creation on a computer
NASA Astrophysics Data System (ADS)
McCabe, Gordon
The purpose of this paper is to provide an account of the epistemology and metaphysics of universe creation on a computer. The paper begins with F.J. Tipler's argument that our experience is indistinguishable from the experience of someone embedded in a perfect computer simulation of our own universe, hence we cannot know whether or not we are part of such a computer program ourselves. Tipler's argument is treated as a special case of epistemological scepticism, in a similar vein to 'brain-in-a-vat' arguments. It is argued that Tipler's hypothesis that our universe is a program running on a digital computer in another universe, generates empirical predictions, and is therefore a falsifiable hypothesis. The computer program hypothesis is also treated as a hypothesis about what exists beyond the physical world, and is compared with Kant's metaphysics of noumena. It is argued that if our universe is a program running on a digital computer, then our universe must have compact spatial topology, and the possibilities of observationally testing this prediction are considered. The possibility of testing the computer program hypothesis with the value of the density parameter Ω0 is also analysed. The informational requirements for a computer to represent a universe exactly and completely are considered. Consequent doubt is thrown upon Tipler's claim that if a hierarchy of computer universes exists, we would not be able to know which 'level of implementation' our universe exists at. It is then argued that a digital computer simulation of a universe, or any other physical system, does not provide a realisation of that universe or system. It is argued that a digital computer simulation of a physical system is not objectively related to that physical system, and therefore cannot exist as anything else other than a physical process occurring upon the components of the computer. It is concluded that Tipler's sceptical hypothesis, and a related hypothesis from Bostrom, cannot be true: it is impossible that our own experience is indistinguishable from the experience of somebody embedded in a digital computer simulation because it is impossible for anybody to be embedded in a digital computer simulation.
Parallel approach to identifying the well-test interpretation model using a neurocomputer
NASA Astrophysics Data System (ADS)
May, Edward A., Jr.; Dagli, Cihan H.
1996-03-01
The well test is one of the primary diagnostic and predictive tools used in the analysis of oil and gas wells. In these tests, a pressure recording device is placed in the well and the pressure response is recorded over time under controlled flow conditions. The interpreted results are indicators of the well's ability to flow and the damage done to the formation surrounding the wellbore during drilling and completion. The results are used for many purposes, including reservoir modeling (simulation) and economic forecasting. The first step in the analysis is the identification of the Well-Test Interpretation (WTI) model, which determines the appropriate solution method. Mis-identification of the WTI model occurs due to noise and non-ideal reservoir conditions. Previous studies have shown that a feed-forward neural network using the backpropagation algorithm can be used to identify the WTI model. One of the drawbacks to this approach is, however, training time, which can run into days of CPU time on personal computers. In this paper a similar neural network is applied using both a personal computer and a neurocomputer. Input data processing, network design, and performance are discussed and compared. The results show that the neurocomputer greatly eases the burden of training and allows the network to outperform a similar network running on a personal computer.
Habegger, Lukas; Balasubramanian, Suganthi; Chen, David Z.; Khurana, Ekta; Sboner, Andrea; Harmanci, Arif; Rozowsky, Joel; Clarke, Declan; Snyder, Michael; Gerstein, Mark
2012-01-01
Summary: The functional annotation of variants obtained through sequencing projects is generally assumed to be a simple intersection of genomic coordinates with genomic features. However, complexities arise for several reasons, including the differential effects of a variant on alternatively spliced transcripts, as well as the difficulty in assessing the impact of small insertions/deletions and large structural variants. Taking these factors into consideration, we developed the Variant Annotation Tool (VAT) to functionally annotate variants from multiple personal genomes at the transcript level as well as obtain summary statistics across genes and individuals. VAT also allows visualization of the effects of different variants, integrates allele frequencies and genotype data from the underlying individuals and facilitates comparative analysis between different groups of individuals. VAT can either be run through a command-line interface or as a web application. Finally, in order to enable on-demand access and to minimize unnecessary transfers of large data files, VAT can be run as a virtual machine in a cloud-computing environment. Availability and Implementation: VAT is implemented in C and PHP. The VAT web service, Amazon Machine Image, source code and detailed documentation are available at vat.gersteinlab.org. Contact: lukas.habegger@yale.edu or mark.gerstein@yale.edu Supplementary Information: Supplementary data are available at Bioinformatics online. PMID:22743228
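A toy version of the core annotation step, intersecting a variant coordinate with transcript features, is sketched below. The transcripts and coordinates are invented; the example only illustrates the point made above that one variant can be annotated differently on alternatively spliced transcripts:

```python
# Hedged toy sketch of transcript-level variant annotation by coordinate intersection.
# Feature coordinates and names are invented for illustration.

transcripts = {
    "GENE1-tx1": [("exon", 100, 200), ("exon", 300, 420)],
    "GENE1-tx2": [("exon", 100, 200), ("exon", 350, 420)],   # alternative splice form
}

def annotate(variant_pos):
    calls = {}
    for tx, features in transcripts.items():
        hit = next((kind for kind, start, end in features
                    if start <= variant_pos <= end),
                   "intronic/intergenic")
        calls[tx] = hit
    return calls

print(annotate(320))   # exonic in tx1 but intronic in tx2 -> per-transcript annotation
```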
Real-time machine vision system using FPGA and soft-core processor
NASA Astrophysics Data System (ADS)
Malik, Abdul Waheed; Thörnberg, Benny; Meng, Xiaozhou; Imran, Muhammad
2012-06-01
This paper presents a machine vision system for real-time computation of distance and angle of a camera from reference points in the environment. Image pre-processing, component labeling and feature extraction modules were modeled at Register Transfer (RT) level and synthesized for implementation on field programmable gate arrays (FPGA). The extracted image component features were sent from the hardware modules to a soft-core processor, MicroBlaze, for computation of distance and angle. A CMOS imaging sensor operating at a clock frequency of 27 MHz was used in our experiments to produce a video stream at the rate of 75 frames per second. Image component labeling and feature extraction modules were running in parallel, having a total latency of 13 ms. The MicroBlaze was interfaced with the component labeling and feature extraction modules through a Fast Simplex Link (FSL). The latency for computing distance and angle of the camera from the reference points was measured to be 2 ms on the MicroBlaze, running at a 100 MHz clock frequency. In this paper, we present the performance analysis, device utilization and power consumption for the designed system. The FPGA-based machine vision system that we propose has high frame speed, low latency and a power consumption that is much lower compared to commercially available smart camera solutions.
High-Lift Optimization Design Using Neural Networks on a Multi-Element Airfoil
NASA Technical Reports Server (NTRS)
Greenman, Roxana M.; Roth, Karlin R.; Smith, Charles A. (Technical Monitor)
1998-01-01
The high-lift performance of a multi-element airfoil was optimized by using neural-net predictions that were trained using a computational data set. The numerical data was generated using a two-dimensional, incompressible, Navier-Stokes algorithm with the Spalart-Allmaras turbulence model. Because it is difficult to predict maximum lift for high-lift systems, an empirically based maximum-lift criterion was used in this study to determine both the maximum lift and the angle at which it occurs. Multiple-input, single-output networks were trained using the NASA Ames variation of the Levenberg-Marquardt algorithm for each of the aerodynamic coefficients (lift, drag, and moment). The artificial neural networks were integrated with a gradient-based optimizer. Using independent numerical simulations and experimental data for this high-lift configuration, it was shown that this design process successfully optimized flap deflection, gap, overlap, and angle of attack to maximize lift. Once the neural networks were trained and integrated with the optimizer, minimal additional computer resources were required to perform optimization runs with different initial conditions and parameters. Applying the neural networks within the high-lift rigging optimization process reduced the amount of computational time and resources by 83% compared with traditional gradient-based optimization procedures for multiple optimization runs.
NASA Astrophysics Data System (ADS)
Dorband, J. E.; Tilak, N.; Radov, A.
2016-12-01
In this paper, a classical computer implementation of a restricted Boltzmann machine (RBM) is compared to a quantum-annealing-based RBM running on a D-Wave 2X (an adiabatic quantum computer). The codes for both are essentially identical. Only a flag is set to change the activation function from a classically computed logistic function to the D-Wave. To obtain greater understanding of the behavior of the D-Wave, a study of the stochastic properties of a virtual qubit (a 12-qubit chain) and a cell of qubits (an 8-qubit cell) was performed. We will present the results of comparing the D-Wave implementation with a theoretically errorless adiabatic quantum computer. The main purpose of this study is to develop a generic RBM regression tool in order to infer CO2 fluxes from the CO2 concentrations observed by the NASA OCO-2 satellite and predicted atmospheric states using regression models. The carbon fluxes will then be assimilated into a land surface model to predict the Net Ecosystem Exchange at globally distributed regional sites.
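For readers unfamiliar with the classical baseline, the sketch below shows one contrastive-divergence update of a restricted Boltzmann machine with a logistic activation; in the study it is the hidden-unit sampling step that gets swapped out for the D-Wave annealer, a path not shown here. All sizes and data are synthetic assumptions:

```python
# Hedged, classical-only sketch of one contrastive-divergence (CD-1) update for an RBM
# with logistic (sigmoid) activations. Sizes, data, and learning rate are assumed.
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.05):
    """One contrastive-divergence update on a batch of visible vectors v0 (in place)."""
    h0_prob = sigmoid(v0 @ W + c)                       # logistic activation (classical path)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    v1_prob = sigmoid(h0 @ W.T + b)                     # reconstruction
    h1_prob = sigmoid(v1_prob @ W + c)
    W += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / len(v0)
    b += lr * (v0 - v1_prob).mean(axis=0)
    c += lr * (h0_prob - h1_prob).mean(axis=0)
    return np.mean((v0 - v1_prob) ** 2)                 # reconstruction error

n_vis, n_hid = 16, 8
W = 0.01 * rng.standard_normal((n_vis, n_hid))
b, c = np.zeros(n_vis), np.zeros(n_hid)
data = (rng.random((200, n_vis)) < 0.3).astype(float)   # synthetic binary "observations"
for epoch in range(20):
    err = cd1_step(data, W, b, c)
print("final reconstruction error:", round(float(err), 4))
```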
Soule, Pat LeRoy
1978-01-01
Water-surface profiles of the 25-, 50-, and 100-year recurrence interval discharges have been computed for all streams and reaches of channels in Fairfax County, Virginia, having a drainage area greater than 1 square mile, except for Dogue Creek, Little Hunting Creek, and that portion of Cameron Run above Lake Barcroft. Maps having a 2-foot contour interval and a horizontal scale of 1 inch equals 100 feet were used as the base on which flood boundaries were delineated for the 25-, 50-, and 100-year floods to be expected in each basin under ultimate development conditions. This report is one of a series and presents a discussion of techniques employed in computing discharges and profiles as well as the flood profiles and maps on which flood boundaries have been delineated for the Occoquan River and its tributaries within Fairfax County and those streams on Mason Neck within Fairfax County tributary to the Potomac River. (Woodard-USGS)
ACON: a multipurpose production controller for plasma physics codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Snell, C.
1983-01-01
ACON is a BCON controller designed to run large production codes on the CTSS Cray-1 or the LTSS 7600 computers. ACON can also be operated interactively, with input from the user's terminal. The controller can run one code or a sequence of up to ten codes during the same job. Options are available to get and save Mass storage files, to perform Historian file updating operations, to compile and load source files, and to send out print and film files. Special features include ability to retry after Mass failures, backup options for saving files, startup messages for the various codes, and ability to reserve specified amounts of computer time after successive code runs. ACON's flexibility and power make it useful for running a number of different production codes.
Application of Intel Many Integrated Core (MIC) accelerators to the Pleim-Xiu land surface scheme
NASA Astrophysics Data System (ADS)
Huang, Melin; Huang, Bormin; Huang, Allen H.
2015-10-01
The land-surface model (LSM) is one physics process in the weather research and forecast (WRF) model. The LSM includes atmospheric information from the surface layer scheme, radiative forcing from the radiation scheme, and precipitation forcing from the microphysics and convective schemes, together with internal information on the land's state variables and land-surface properties. The LSM provides heat and moisture fluxes over land points and sea-ice points. The Pleim-Xiu (PX) scheme is one such LSM. The PX LSM features three pathways for moisture fluxes: evapotranspiration, soil evaporation, and evaporation from wet canopies. To accelerate this scheme, we employ the Intel Xeon Phi Many Integrated Core (MIC) architecture, a many-core coprocessor design well suited to efficient parallelization and vectorization. Our results show that the MIC-based optimization of this scheme running on a Xeon Phi coprocessor 7120P improves the performance by 2.3x and 11.7x as compared to the original code running, respectively, on one CPU socket (eight cores) and on one CPU core of an Intel Xeon E5-2670.
Enhanced conformational sampling of carbohydrates by Hamiltonian replica-exchange simulation.
Mishra, Sushil Kumar; Kara, Mahmut; Zacharias, Martin; Koca, Jaroslav
2014-01-01
Knowledge of the structure and conformational flexibility of carbohydrates in an aqueous solvent is important to improving our understanding of how carbohydrates function in biological systems. In this study, we extend a variant of the Hamiltonian replica-exchange molecular dynamics (MD) simulation to improve the conformational sampling of saccharides in an explicit solvent. During the simulations, a biasing potential along the glycosidic-dihedral linkage between the saccharide monomer units in an oligomer is applied at various levels along the replica runs to enable effective transitions between various conformations. One reference replica runs under the control of the original force field. The method was tested on disaccharide structures and further validated on biologically relevant blood group B, Lewis X and Lewis A trisaccharides. The biasing potential-based replica-exchange molecular dynamics (BP-REMD) method provided a significantly improved sampling of relevant conformational states compared with standard continuous MD simulations, with modest computational costs. Thus, the proposed BP-REMD approach adds a new dimension to existing carbohydrate conformational sampling approaches by enhancing conformational sampling in the presence of solvent molecules explicitly at relatively low computational cost.
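The exchange step at the heart of this biasing-potential replica-exchange scheme can be illustrated with a toy Metropolis test between two replicas that differ only in the strength of a bias on one glycosidic dihedral. The bias function, temperature, and values below are placeholders, not the force field or parameters used in the study.

```python
# Minimal sketch of a Hamiltonian replica-exchange acceptance test between two
# replicas i and j with different biasing-potential levels (illustrative only).
import math, random

BETA = 1.0 / (0.0019872 * 300.0)   # 1/(kB*T) in mol/kcal at 300 K

def bias(phi, level):
    """Hypothetical biasing potential on a glycosidic dihedral; level=0 is unbiased."""
    return -level * math.cos(phi) ** 2

def attempt_exchange(phi_i, phi_j, level_i, level_j):
    """Metropolis criterion for swapping configurations between replicas i and j."""
    delta = (bias(phi_i, level_j) + bias(phi_j, level_i)
             - bias(phi_i, level_i) - bias(phi_j, level_j))
    return random.random() < min(1.0, math.exp(-BETA * delta))

print(attempt_exchange(phi_i=1.0, phi_j=-2.5, level_i=0.0, level_j=2.0))
```

Because only the biasing terms differ between replicas, the unbiased force-field energies cancel in the exchange criterion, which keeps the reference replica sampling the original ensemble.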
Automatic Fitting of Spiking Neuron Models to Electrophysiological Recordings
Rossant, Cyrille; Goodman, Dan F. M.; Platkiewicz, Jonathan; Brette, Romain
2010-01-01
Spiking models can accurately predict the spike trains produced by cortical neurons in response to somatically injected currents. Since the specific characteristics of the model depend on the neuron, a computational method is required to fit models to electrophysiological recordings. The fitting procedure can be very time consuming both in terms of computer simulations and in terms of code writing. We present algorithms to fit spiking models to electrophysiological data (time-varying input and spike trains) that can run in parallel on graphics processing units (GPUs). The model fitting library is interfaced with Brian, a neural network simulator in Python. If a GPU is present it uses just-in-time compilation to translate model equations into optimized code. Arbitrary models can then be defined at script level and run on the graphics card. This tool can be used to obtain empirically validated spiking models of neurons in various systems. We demonstrate its use on public data from the INCF Quantitative Single-Neuron Modeling 2009 competition by comparing the performance of a number of neuron spiking models. PMID:20224819
Experiences running NASTRAN on the Microvax 2 computer
NASA Technical Reports Server (NTRS)
Butler, Thomas G.; Mitchell, Reginald S.
1987-01-01
The MicroVAX operates NASTRAN so well that the only detectable difference in its operation compared to an 11/780 VAX is in the execution time. On the modest installation described here, the engineer has all of the tools he needs to do an excellent job of analysis. System configuration decisions, system sizing, preparation of the system disk, definition of user quotas, installation, monitoring of system errors, and operation policies are discussed.
A handheld computer as part of a portable in vivo knee joint load monitoring system
Szivek, JA; Nandakumar, VS; Geffre, CP; Townsend, CP
2009-01-01
In vivo measurement of loads and pressures acting on articular cartilage in the knee joint during various activities and rehabilitative therapies following focal defect repair will provide a means of designing activities that encourage faster and more complete healing of focal defects. It was the goal of this study to develop a totally portable monitoring system that could be used during various activities and allow continuous monitoring of forces acting on the knee. In order to make the monitoring system portable, a handheld computer with custom software, a USB powered miniature wireless receiver and a battery-powered coil were developed to replace a currently used computer, AC powered bench top receiver and power supply. A Dell handheld running the Windows Mobile operating system (OS), programmed using LabVIEW, was used to collect strain measurements. Measurements collected by the handheld based system connected to the miniature wireless receiver were compared with the measurements collected by a hardwired system and a computer based system during bench top testing and in vivo testing. The newly developed handheld based system had a maximum accuracy of 99% when compared to the computer based system. PMID:19789715
Experimental Realization of High-Efficiency Counterfactual Computation.
Kong, Fei; Ju, Chenyong; Huang, Pu; Wang, Pengfei; Kong, Xi; Shi, Fazhan; Jiang, Liang; Du, Jiangfeng
2015-08-21
Counterfactual computation (CFC) exemplifies the fascinating quantum process by which the result of a computation may be learned without actually running the computer. In previous experimental studies, the counterfactual efficiency is limited to below 50%. Here we report an experimental realization of the generalized CFC protocol, in which the counterfactual efficiency can break the 50% limit and even approach unity in principle. The experiment is performed with the spins of a negatively charged nitrogen-vacancy color center in diamond. Taking advantage of the quantum Zeno effect, the computer can remain in the not-running subspace due to the frequent projection by the environment, while the computation result can be revealed by final detection. The counterfactual efficiency up to 85% has been demonstrated in our experiment, which opens the possibility of many exciting applications of CFC, such as high-efficiency quantum integration and imaging.
Experimental Realization of High-Efficiency Counterfactual Computation
NASA Astrophysics Data System (ADS)
Kong, Fei; Ju, Chenyong; Huang, Pu; Wang, Pengfei; Kong, Xi; Shi, Fazhan; Jiang, Liang; Du, Jiangfeng
2015-08-01
Counterfactual computation (CFC) exemplifies the fascinating quantum process by which the result of a computation may be learned without actually running the computer. In previous experimental studies, the counterfactual efficiency is limited to below 50%. Here we report an experimental realization of the generalized CFC protocol, in which the counterfactual efficiency can break the 50% limit and even approach unity in principle. The experiment is performed with the spins of a negatively charged nitrogen-vacancy color center in diamond. Taking advantage of the quantum Zeno effect, the computer can remain in the not-running subspace due to the frequent projection by the environment, while the computation result can be revealed by final detection. The counterfactual efficiency up to 85% has been demonstrated in our experiment, which opens the possibility of many exciting applications of CFC, such as high-efficiency quantum integration and imaging.
Application configuration selection for energy-efficient execution on multicore systems
Wang, Shinan; Luo, Bing; Shi, Weisong; ...
2015-09-21
Balanced performance and energy consumption are incorporated in the design of modern computer systems. Several run-time factors, such as concurrency levels, thread mapping strategies, and dynamic voltage and frequency scaling (DVFS), should be considered in order to achieve optimal energy efficiency for a workload. Selecting appropriate run-time factors, however, is one of the most challenging tasks because the run-time factors are architecture-specific and workload-specific. While most existing works concentrate on either static analysis of the workload or run-time prediction results, we present a hybrid two-step method that utilizes concurrency levels and DVFS settings to achieve the energy-efficient configuration for a workload. The experimental results based on a Xeon E5620 server with the NPB and PARSEC benchmark suites show that the model is able to predict the energy-efficient configuration accurately. On average, an additional 10% EDP (Energy Delay Product) saving is obtained by using run-time DVFS for the entire system. An off-line optimal solution is used to compare with the proposed scheme. Finally, the experimental results show that the average extra EDP saved by the optimal solution is within 5% on selective parallel benchmarks.
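Since the method is scored with the energy-delay product, a small sketch of how candidate (concurrency, DVFS) configurations can be ranked by EDP is given below. The timing and power numbers are invented for illustration; they are not measurements from the paper.

```python
# Sketch of EDP-based ranking of run-time configurations (hypothetical data).
candidates = [
    # (threads, frequency_GHz, runtime_s, avg_power_W)
    (4,  2.4, 120.0,  95.0),
    (8,  2.4,  70.0, 130.0),
    (8,  1.6,  95.0,  90.0),
    (16, 2.4,  55.0, 180.0),
]

def edp(runtime_s, power_w):
    # energy (J) x delay (s); lower is better
    return (power_w * runtime_s) * runtime_s

best = min(candidates, key=lambda c: edp(c[2], c[3]))
print("lowest-EDP configuration:", best, "EDP =", edp(best[2], best[3]))
```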
Running of scalar spectral index in multi-field inflation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gong, Jinn-Ouk, E-mail: jinn-ouk.gong@apctp.org
We compute the running of the scalar spectral index in general multi-field slow-roll inflation. By incorporating explicit momentum dependence at the moment of horizon crossing, we can find the running straightforwardly. At the same time, we can distinguish the contributions from the quasi de Sitter background and the super-horizon evolution of the field fluctuations.
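For context, the quantity being computed is the standard running of the scalar spectral index, defined from the curvature power spectrum. A minimal statement of the definitions (not the paper's multi-field result) is:

```latex
% Standard definitions only; the multi-field expression derived in the paper is not reproduced here.
n_s(k) - 1 \equiv \frac{d \ln \mathcal{P}_\zeta(k)}{d \ln k},
\qquad
\alpha_s(k) \equiv \frac{d n_s(k)}{d \ln k},
\qquad
\mathcal{P}_\zeta(k) \simeq \mathcal{P}_\zeta(k_*)
\left(\frac{k}{k_*}\right)^{\,n_s - 1 + \frac{1}{2}\alpha_s \ln(k/k_*)}.
```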
Gong, Chunye; Bao, Weimin; Tang, Guojian; Jiang, Yuewen; Liu, Jie
2014-01-01
It is very time consuming to solve fractional differential equations. The computational complexity of solving a two-dimensional fractional differential equation (2D-TFDE) with an iterative implicit finite difference method is O(M_x M_y N^2). In this paper, we present a parallel algorithm for the 2D-TFDE and give an in-depth discussion of this algorithm. A task distribution model and a data layout with virtual boundaries are designed for this parallel algorithm. The experimental results show that the parallel algorithm agrees well with the exact solution. The parallel algorithm on a single Intel Xeon X5540 CPU runs 3.16-4.17 times faster than the serial algorithm on a single CPU core. The parallel efficiency of 81 processes is up to 88.24% compared with 9 processes on a distributed memory cluster system. We believe that parallel computing will become a basic method for computationally intensive fractional applications in the near future.
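The speedup and parallel-efficiency figures quoted above follow from the usual bookkeeping, sketched below with made-up timings chosen only so the numbers land near the reported 3-4x speedup and roughly 88% efficiency.

```python
# Sketch of speedup and relative parallel efficiency (illustrative timings,
# not the paper's measurements).
def speedup(t_ref, t_par):
    return t_ref / t_par

def efficiency(t_ref, p_ref, t_par, p_par):
    """Efficiency of p_par processes relative to a p_ref-process baseline."""
    return (t_ref * p_ref) / (t_par * p_par)

t_serial, t_parallel = 100.0, 27.0      # hypothetical: 1 core vs 1 multi-core CPU
print("CPU speedup:", speedup(t_serial, t_parallel))            # ~3.7x
t9, t81 = 50.0, 6.3                     # hypothetical 9- and 81-process run times
print("efficiency of 81 vs 9 processes:", efficiency(t9, 9, t81, 81))  # ~0.88
```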
Vectorized schemes for conical potential flow using the artificial density method
NASA Technical Reports Server (NTRS)
Bradley, P. F.; Dwoyer, D. L.; South, J. C., Jr.; Keen, J. M.
1984-01-01
A method is developed to determine solutions to the full-potential equation for steady supersonic conical flow using the artificial density method. Various update schemes used generally for transonic potential solutions are investigated. The schemes are compared for speed and robustness. All versions of the computer code have been vectorized and are currently running on the CYBER-203 computer. The update schemes are vectorized, where possible, either fully (explicit schemes) or partially (implicit schemes). Since each version of the code differs only by the update scheme and elements other than the update scheme are completely vectorizable, comparisons of computational effort and convergence rate among schemes are a measure of the specific scheme's performance. Results are presented for circular and elliptical cones at angle of attack for subcritical and supercritical crossflows.
Computation of transonic flow about helicopter rotor blades
NASA Technical Reports Server (NTRS)
Arieli, R.; Tauber, M. E.; Saunders, D. A.; Caughey, D. A.
1986-01-01
An inviscid, nonconservative, three-dimensional full-potential flow code, ROT22, has been developed for computing the quasi-steady flow about a lifting rotor blade. The code is valid throughout the subsonic and transonic regime. Calculations from the code are compared with detailed laser velocimeter measurements made in the tip region of a nonlifting rotor at a tip Mach number of 0.95 and zero advance ratio. In addition, comparisons are made with chordwise surface pressure measurements obtained in a wind tunnel for a nonlifting rotor blade at transonic tip speeds at advance ratios from 0.40 to 0.50. The overall agreement between theoretical calculations and experiment is very good. A typical run on a CRAY X-MP computer requires about 30 CPU seconds for one rotor position at transonic tip speed.
Numerical solutions of 3-dimensional Navier-Stokes equations for closed bluff-bodies
NASA Technical Reports Server (NTRS)
Abolhassani, J. S.; Tiwari, S. N.
1985-01-01
The Navier-Stokes equations are solved numerically. These equations are unsteady, compressible, viscous, and three-dimensional without neglecting any terms. The time dependency of the governing equations allows the solution to progress naturally from an arbitrary initial guess to an asymptotic steady state, if one exists. The equations are transformed from physical coordinates to computational coordinates, allowing the solution of the governing equations in a rectangular parallelepiped domain. The equations are solved by the MacCormack time-split technique, which is vectorized and programmed to run on the CDC VPS-32 computer. The codes are written in 32-bit (half word) FORTRAN, which provides an approximate factor-of-two decrease in computational time and doubles the effective memory size compared with the 64-bit word size.
Discrete Fourier Transform Analysis in a Complex Vector Space
NASA Technical Reports Server (NTRS)
Dean, Bruce H.
2009-01-01
Alternative computational strategies for the Discrete Fourier Transform (DFT) have been developed using analysis of geometric manifolds. This approach provides a general framework for performing DFT calculations, and suggests a more efficient implementation of the DFT for applications using iterative transform methods, particularly phase retrieval. The DFT can thus be implemented using fewer operations when compared to the usual DFT counterpart. The software decreases the run time of the DFT in certain applications such as phase retrieval that iteratively call the DFT function. The algorithm exploits a special computational approach based on analysis of the DFT as a transformation in a complex vector space. As such, this approach has the potential to realize a DFT computation that approaches N operations versus Nlog(N) operations for the equivalent Fast Fourier Transform (FFT) calculation.
Mendel-GPU: haplotyping and genotype imputation on graphics processing units
Chen, Gary K.; Wang, Kai; Stram, Alex H.; Sobel, Eric M.; Lange, Kenneth
2012-01-01
Motivation: In modern sequencing studies, one can improve the confidence of genotype calls by phasing haplotypes using information from an external reference panel of fully typed unrelated individuals. However, the computational demands are so high that they prohibit researchers with limited computational resources from haplotyping large-scale sequence data. Results: Our graphics processing unit based software delivers haplotyping and imputation accuracies comparable to competing programs at a fraction of the computational cost and peak memory demand. Availability: Mendel-GPU, our OpenCL software, runs on Linux platforms and is portable across AMD and nVidia GPUs. Users can download both code and documentation at http://code.google.com/p/mendel-gpu/. Contact: gary.k.chen@usc.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22954633
Program Processes Thermocouple Readings
NASA Technical Reports Server (NTRS)
Quave, Christine A.; Nail, William, III
1995-01-01
Digital Signal Processor for Thermocouples (DART) computer program implements precise and fast method of converting voltage to temperature for large-temperature-range thermocouple applications. Written using LabVIEW software. DART available only as object code for use on Macintosh II FX or higher-series computers running System 7.0 or later and IBM PC-series and compatible computers running Microsoft Windows 3.1. Macintosh version of DART (SSC-00032) requires LabVIEW 2.2.1 or 3.0 for execution. IBM PC version (SSC-00031) requires LabVIEW 3.0 for Windows 3.1. LabVIEW software product of National Instruments and not included with program.
NASA Technical Reports Server (NTRS)
Mcenulty, R. E.
1977-01-01
The G189A simulation of the Shuttle Orbiter ECLSS was upgraded. All simulation library versions and simulation models were converted from the EXEC2 to the EXEC8 computer system, and a new program, G189PL, was added to the combination master program library. The program permits the post-plotting of up to 100 frames of plot data over any time interval of a G189 simulation run. The overlay structure of the G189A simulations was restructured for the purpose of conserving computer core requirements and minimizing run time requirements.
INHYD: Computer code for intraply hybrid composite design. A users manual
NASA Technical Reports Server (NTRS)
Chamis, C. C.; Sinclair, J. H.
1983-01-01
A computer program (INHYD) was developed for intraply hybrid composite design. A users manual for INHYD is presented. INHYD embodies several composite micromechanics theories, intraply hybrid composite theories, and an integrated hygrothermomechanical theory. INHYD can be run in both interactive and batch modes. It has considerable flexibility and capability, which the user can exercise through several options. These options are demonstrated through appropriate INHYD runs in the manual.
Topology Optimization for Reducing Additive Manufacturing Processing Distortions
2017-12-01
features that curl or warp under thermal load and are subsequently struck by the recoater blade/roller. Support structures act to wick heat away and ... was run for 150 iterations. The material properties for all examples were Young's modulus E = 1 GPa, Poisson's ratio ν = 0.25, and thermal expansion ... the element-birth model is significantly more computationally expensive for a full optimization run. Consider the computational complexity of a
MindModeling@Home . . . and Anywhere Else You Have Idle Processors
2009-12-01
was SETI@Home. It was established in 1999 for the purpose of demonstrating the utility of "distributed grid computing" by providing a mechanism for ... the public imagination, and SETI@Home remains the longest running and one of the most popular volunteer computing projects in the world. This ... pursuits. Most of them, including SETI@Home, run on a software architecture called the Berkeley Open Infrastructure for Network Computing (BOINC). Some of
NASA Astrophysics Data System (ADS)
Zhiying, Chen; Ping, Zhou
2017-11-01
Considering the computational precision and efficiency of robust optimization for complex mechanical assembly relationships such as turbine blade-tip radial running clearance, a hierarchical response surface robust optimization algorithm is proposed. The distributed collaborative response surface method is used to generate an assembly-system-level approximation model relating the overall parameters to the blade-tip clearance, and then a set of samples of the design parameters and the mean and/or standard deviation of the objective response is generated using the system approximation model and a design-of-experiments method. Finally, a new response surface approximation model is constructed from those samples, and this approximation model is used for the robust optimization process. The analysis results demonstrate that the proposed method can dramatically reduce the computational cost while ensuring computational precision. The presented research offers an effective way for the robust optimization design of turbine blade-tip radial running clearance.
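A toy version of the two-level idea (sample the statistics of the response, fit response surfaces to those statistics, then optimize a mean-plus-k-sigma objective on the fitted surfaces) is sketched below. The one-dimensional design variable, the sampled data, and the robustness weight k are all hypothetical.

```python
# Sketch: fit response surfaces to sampled mean/std of the objective, then run
# a robust (mean + k*sigma) optimization on the cheap surfaces (illustrative data).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Hypothetical DOE samples: design variable x -> (mean, std) of blade-tip clearance.
x_doe = np.linspace(0.0, 1.0, 15)
mean_doe = 1.0 + (x_doe - 0.6) ** 2 + 0.01 * rng.normal(size=x_doe.size)
std_doe = 0.05 + 0.1 * x_doe + 0.005 * rng.normal(size=x_doe.size)

# Second-level response surfaces: quadratic fits to the sampled statistics.
mean_rs = np.poly1d(np.polyfit(x_doe, mean_doe, 2))
std_rs = np.poly1d(np.polyfit(x_doe, std_doe, 2))

k = 3.0                                     # robustness weight
robust_obj = lambda x: float(mean_rs(x[0]) + k * std_rs(x[0]))
res = minimize(robust_obj, x0=[0.5], bounds=[(0.0, 1.0)])
print("robust optimum design variable:", res.x[0], "objective:", res.fun)
```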
Implementing Parquet equations using HPX
NASA Astrophysics Data System (ADS)
Kellar, Samuel; Wagle, Bibek; Yang, Shuxiang; Tam, Ka-Ming; Kaiser, Hartmut; Moreno, Juana; Jarrell, Mark
A new C++ runtime system (HPX) enables simulations of complex systems to run more efficiently on parallel and heterogeneous systems. This increased efficiency allows for solutions to larger simulations of the parquet approximation for a system with impurities. The relevancy of the parquet equations depends upon the ability to solve systems which require long runs and large amounts of memory. These limitations, in addition to numerical complications arising from stability of the solutions, necessitate running on large distributed systems. As computational resources trend towards the exascale and the limitations arising from computational resources vanish, the efficiency of large-scale simulations becomes a focus. HPX facilitates efficient simulations through intelligent overlapping of computation and communication. Simulations such as the parquet equations which require the transfer of large amounts of data should benefit from HPX implementations. Supported by the NSF EPSCoR Cooperative Agreement No. EPS-1003897 with additional support from the Louisiana Board of Regents.
Prediction of sound radiated from different practical jet engine inlets
NASA Technical Reports Server (NTRS)
Zinn, B. T.; Meyer, W. L.
1980-01-01
Existing computer codes for calculating the far field radiation patterns surrounding various practical jet engine inlet configurations under different excitation conditions were upgraded. The computer codes were refined and expanded so that they are now more efficient computationally by a factor of about three and they are now capable of producing accurate results up to nondimensional wave numbers of twenty. Computer programs were also developed to help generate accurate geometrical representations of the inlets to be investigated. This data is required as input for the computer programs which calculate the sound fields. This new geometry generating computer program considerably reduces the time required to generate the input data which was one of the most time consuming steps in the process. The results of sample runs using the NASA-Lewis QCSEE inlet are presented and comparison of run times and accuracy are made between the old and upgraded computer codes. The overall accuracy of the computations is determined by comparison of the results of the computations with simple source solutions.
NASA Astrophysics Data System (ADS)
Gerjuoy, Edward
2005-06-01
The security of messages encoded via the widely used RSA public key encryption system rests on the enormous computational effort required to find the prime factors of a large number N using classical (conventional) computers. In 1994 Peter Shor showed that for sufficiently large N, a quantum computer could perform the factoring with much less computational effort. This paper endeavors to explain, in a fashion comprehensible to the nonexpert, the RSA encryption protocol; the various quantum computer manipulations constituting the Shor algorithm; how the Shor algorithm performs the factoring; and the precise sense in which a quantum computer employing Shor's algorithm can be said to accomplish the factoring of very large numbers with less computational effort than a classical computer. It is made apparent that factoring N generally requires many successive runs of the algorithm. Our analysis reveals that the probability of achieving a successful factorization on a single run is about twice as large as commonly quoted in the literature.
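The "many successive runs" point can be made concrete with the classical post-processing step that follows quantum order-finding: given the period r of a modulo N, the factors come from greatest common divisors, and the run must be repeated when r is odd or yields only trivial divisors. The sketch below shows only this classical step, not the quantum part of Shor's algorithm.

```python
# Classical post-processing of Shor's algorithm: recover factors of N from the
# period r of a**x mod N, illustrating why several runs may be needed.
from math import gcd

def factors_from_period(N, a, r):
    """Return a nontrivial factor pair of N from the period r of a mod N, or None."""
    if r % 2 == 1:
        return None                   # odd period: retry with another a
    y = pow(a, r // 2, N)
    if y == N - 1:
        return None                   # a**(r/2) = -1 (mod N): retry
    p, q = gcd(y - 1, N), gcd(y + 1, N)
    if 1 < p < N:
        return p, N // p
    if 1 < q < N:
        return q, N // q
    return None

# Toy example: N = 15, a = 7 has period r = 4, giving the factors 3 and 5.
print(factors_from_period(15, 7, 4))
```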
Programming the social computer.
Robertson, David; Giunchiglia, Fausto
2013-03-28
The aim of 'programming the global computer' was identified by Milner and others as one of the grand challenges of computing research. At the time this phrase was coined, it was natural to assume that this objective might be achieved primarily through extending programming and specification languages. The Internet, however, has brought with it a different style of computation that (although harnessing variants of traditional programming languages) operates in a style different to those with which we are familiar. The 'computer' on which we are running these computations is a social computer in the sense that many of the elementary functions of the computations it runs are performed by humans, and successful execution of a program often depends on properties of the human society over which the program operates. These sorts of programs are not programmed in a traditional way and may have to be understood in a way that is different from the traditional view of programming. This shift in perspective raises new challenges for the science of the Web and for computing in general.
Acceleration of discrete stochastic biochemical simulation using GPGPU.
Sumiyoshi, Kei; Hirata, Kazuki; Hiroi, Noriko; Funahashi, Akira
2015-01-01
For systems made up of a small number of molecules, such as a biochemical network in a single cell, a simulation requires a stochastic approach, instead of a deterministic approach. The stochastic simulation algorithm (SSA) simulates the stochastic behavior of a spatially homogeneous system. Since stochastic approaches produce different results each time they are used, multiple runs are required in order to obtain statistical results; this results in a large computational cost. We have implemented a parallel method for using SSA to simulate a stochastic model; the method uses a graphics processing unit (GPU), which enables multiple realizations at the same time, and thus reduces the computational time and cost. During the simulation, for the purpose of analysis, each time course is recorded at each time step. A straightforward implementation of this method on a GPU is about 16 times faster than a sequential simulation on a CPU with hybrid parallelization; each of the multiple simulations is run simultaneously, and the computational tasks within each simulation are parallelized. We also implemented an improvement to the memory access and reduced the memory footprint, in order to optimize the computations on the GPU. We also implemented an asynchronous data transfer scheme to accelerate the time course recording function. To analyze the acceleration of our implementation on various sizes of model, we performed SSA simulations on different model sizes and compared these computation times to those for sequential simulations with a CPU. When used with the improved time course recording function, our method was shown to accelerate the SSA simulation by a factor of up to 130.
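For readers unfamiliar with SSA, the sketch below shows the Gillespie direct method on a toy birth-death system; in the approach described above, each GPU thread effectively runs one such realization with its own random stream. The rate constants and model are illustrative only.

```python
# Minimal Gillespie direct-method SSA for a toy birth-death process, recording
# the time course at every step (as the GPU implementation does per realization).
import math, random

def ssa_birth_death(k_birth=5.0, k_death=0.1, x0=0, t_end=50.0, seed=None):
    rng = random.Random(seed)
    t, x, trajectory = 0.0, x0, [(0.0, x0)]
    while t < t_end:
        a1, a2 = k_birth, k_death * x            # reaction propensities
        a0 = a1 + a2
        if a0 == 0.0:
            break
        t += -math.log(1.0 - rng.random()) / a0  # exponential waiting time
        x += 1 if rng.random() * a0 < a1 else -1
        trajectory.append((t, x))                # time-course recording
    return trajectory

# Independent realizations (run in parallel on the GPU in the paper, serial here).
final_states = [ssa_birth_death(seed=i)[-1] for i in range(4)]
print(final_states)
```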
Acceleration of discrete stochastic biochemical simulation using GPGPU
Sumiyoshi, Kei; Hirata, Kazuki; Hiroi, Noriko; Funahashi, Akira
2015-01-01
For systems made up of a small number of molecules, such as a biochemical network in a single cell, a simulation requires a stochastic approach, instead of a deterministic approach. The stochastic simulation algorithm (SSA) simulates the stochastic behavior of a spatially homogeneous system. Since stochastic approaches produce different results each time they are used, multiple runs are required in order to obtain statistical results; this results in a large computational cost. We have implemented a parallel method for using SSA to simulate a stochastic model; the method uses a graphics processing unit (GPU), which enables multiple realizations at the same time, and thus reduces the computational time and cost. During the simulation, for the purpose of analysis, each time course is recorded at each time step. A straightforward implementation of this method on a GPU is about 16 times faster than a sequential simulation on a CPU with hybrid parallelization; each of the multiple simulations is run simultaneously, and the computational tasks within each simulation are parallelized. We also implemented an improvement to the memory access and reduced the memory footprint, in order to optimize the computations on the GPU. We also implemented an asynchronous data transfer scheme to accelerate the time course recording function. To analyze the acceleration of our implementation on various sizes of model, we performed SSA simulations on different model sizes and compared these computation times to those for sequential simulations with a CPU. When used with the improved time course recording function, our method was shown to accelerate the SSA simulation by a factor of up to 130. PMID:25762936
Myers, E W; Mount, D W
1986-01-01
We describe a program which may be used to find approximate matches to a short predefined DNA sequence in a larger target DNA sequence. The program predicts the usefulness of specific DNA probes and sequencing primers and finds nearly identical sequences that might represent the same regulatory signal. The program is written in the C programming language and will run on virtually any computer system with a C compiler, such as the IBM/PC and other computers running under the MS/DOS and UNIX operating systems. The program has been integrated into an existing software package for the IBM personal computer (see article by Mount and Conrad, this volume). Some examples of its use are given. PMID:3753785
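The core operation (finding approximate matches of a short probe in a longer target) can be sketched as a simple mismatch-counting scan. This is a substitutions-only simplification; the original C program's exact matching model and probe-scoring features are not reproduced here.

```python
# Sketch: report every window of the target within a mismatch budget of the probe.
def approximate_matches(probe, target, max_mismatches):
    """Yield (position, mismatches) for each window within the mismatch budget."""
    m = len(probe)
    for i in range(len(target) - m + 1):
        mism = sum(1 for a, b in zip(probe, target[i:i + m]) if a != b)
        if mism <= max_mismatches:
            yield i, mism

target = "ACGTTGACCTAGGACGTAGACCTTG"
print(list(approximate_matches("GACCT", target, max_mismatches=1)))
```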
AGIS: Evolution of Distributed Computing information system for ATLAS
NASA Astrophysics Data System (ADS)
Anisenkov, A.; Di Girolamo, A.; Alandes, M.; Karavakis, E.
2015-12-01
ATLAS, a particle physics experiment at the Large Hadron Collider at CERN, produces petabytes of data annually through simulation production and tens of petabytes of data per year from the detector itself. The ATLAS computing model embraces the Grid paradigm and a high degree of decentralization of computing resources in order to meet the ATLAS requirements of petabyte-scale data operations. It has evolved after the first period of LHC data taking (Run-1) in order to cope with the new challenges of the upcoming Run-2. In this paper we describe the evolution and recent developments of the ATLAS Grid Information System (AGIS), developed in order to integrate configuration and status information about resources, services and topology of the computing infrastructure used by the ATLAS Distributed Computing applications and services.
Jaschob, Daniel; Riffle, Michael
2012-07-30
Laboratories engaged in computational biology or bioinformatics frequently need to run lengthy, multistep, and user-driven computational jobs. Each job can tie up a computer for a few minutes to several days, and many laboratories lack the expertise or resources to build and maintain a dedicated computer cluster. JobCenter is a client-server application and framework for job management and distributed job execution. The client and server components are both written in Java and are cross-platform and relatively easy to install. All communication with the server is client-driven, which allows worker nodes to run anywhere (even behind external firewalls or "in the cloud") and provides inherent load balancing. Adding a worker node to the worker pool is as simple as dropping the JobCenter client files onto any computer and performing basic configuration, which provides tremendous ease-of-use, flexibility, and limitless horizontal scalability. Each worker installation may be independently configured, including the types of jobs it is able to run. Executed jobs may be written in any language and may include multistep workflows. JobCenter is a versatile and scalable distributed job management system that allows laboratories to very efficiently distribute all computational work among available resources. JobCenter is freely available at http://code.google.com/p/jobcenter/.
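The client-driven pattern described above (workers poll the server for jobs, execute them locally, and report results back, so no inbound connections to workers are needed) can be sketched as follows. The in-memory "server" and job format are stand-ins so the loop is runnable; this is not JobCenter's actual API.

```python
# Generic sketch of a client-driven worker loop: poll, run, report.
import queue, subprocess, time

class FakeServer:
    """Stand-in for the job server; real workers would talk to it over HTTP."""
    def __init__(self, jobs):
        self._jobs = queue.Queue()
        for j in jobs:
            self._jobs.put(j)
        self.results = []

    def request_job(self):
        try:
            return self._jobs.get_nowait()   # the worker pulls; the server never pushes
        except queue.Empty:
            return None

    def report(self, job, output):
        self.results.append((job["name"], output.strip()))

def worker_loop(server, poll_interval=0.1):
    while True:
        job = server.request_job()
        if job is None:
            break                            # a real worker would sleep and re-poll
        out = subprocess.run(job["cmd"], capture_output=True, text=True).stdout
        server.report(job, out)
        time.sleep(poll_interval)

server = FakeServer([{"name": "echo1", "cmd": ["echo", "hello"]},
                     {"name": "echo2", "cmd": ["echo", "world"]}])
worker_loop(server)
print(server.results)
```

Because all communication is initiated by the client, a worker behind a firewall or in a cloud instance needs only outbound access, which is what gives the design its easy horizontal scalability.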
CPU SIM: A Computer Simulator for Use in an Introductory Computer Organization-Architecture Class.
ERIC Educational Resources Information Center
Skrein, Dale
1994-01-01
CPU SIM, an interactive low-level computer simulation package that runs on the Macintosh computer, is described. The program is designed for instructional use in the first or second year of undergraduate computer science, to teach various features of typical computer organization through hands-on exercises. (MSE)
Flame-Vortex Studies to Quantify Markstein Numbers Needed to Model Flame Extinction Limits
NASA Technical Reports Server (NTRS)
Driscoll, James F.; Feikema, Douglas A.
2003-01-01
This work has quantified a database of Markstein numbers for unsteady flames; future work will quantify a database of flame extinction limits for unsteady conditions. Unsteady extinction limits have not been documented previously; both a stretch rate and a residence time must be measured, since extinction requires that the stretch rate be sufficiently large for a sufficiently long residence time. The Markstein number Ma was measured for an inwardly-propagating flame (IPF) that is negatively-stretched under microgravity conditions. Computations also were performed using RUN-1DL to explain the measurements. The Markstein number of an inwardly-propagating flame, for both the microgravity experiment and the computations, is significantly larger than that of an outwardly-propagating flame (OPF). The computed profiles of the various species within the flame suggest reasons. Computed hydrogen concentrations build up ahead of the IPF but not the OPF. Understanding was gained by running the computations for both simplified and full-chemistry conditions. Numerical simulations: to explain the experimental findings, numerical simulations of both inwardly and outwardly propagating spherical flames (with complex chemistry) were generated using the RUN-1DL code, which includes 16 species and 46 reactions.
Design and performance of the virtualization platform for offline computing on the ATLAS TDAQ Farm
NASA Astrophysics Data System (ADS)
Ballestrero, S.; Batraneanu, S. M.; Brasolin, F.; Contescu, C.; Di Girolamo, A.; Lee, C. J.; Pozo Astigarraga, M. E.; Scannicchio, D. A.; Twomey, M. S.; Zaytsev, A.
2014-06-01
With the LHC collider at CERN currently going through the period of Long Shutdown 1 there is an opportunity to use the computing resources of the experiments' large trigger farms for other data processing activities. In the case of the ATLAS experiment, the TDAQ farm, consisting of more than 1500 compute nodes, is suitable for running Monte Carlo (MC) production jobs that are mostly CPU and not I/O bound. This contribution gives a thorough review of the design and deployment of a virtualized platform running on this computing resource and of its use to run large groups of CernVM based virtual machines operating as a single CERN-P1 WLCG site. This platform has been designed to guarantee the security and the usability of the ATLAS private network, and to minimize interference with TDAQ's usage of the farm. Openstack has been chosen to provide a cloud management layer. The experience gained in the last 3.5 months shows that the use of the TDAQ farm for the MC simulation contributes to the ATLAS data processing at the level of a large Tier-1 WLCG site, despite the opportunistic nature of the underlying computing resources being used.
Wager, Justin C; Challis, John H
2016-03-21
During locomotion, the lower limb tendons undergo stretch and recoil, functioning like springs that recycle energy with each step. Cadaveric testing has demonstrated that the arch of the foot operates in this capacity during simple loading, yet it remains unclear whether this function exists during locomotion. In this study, one of the arch's passive elastic tissues (the plantar aponeurosis; PA) was investigated to glean insights about it and the entire arch of the foot during running. Subject specific computer models of the foot were driven using the kinematics of eight subjects running at 3.1 m/s using two initial contact patterns (rearfoot and non-rearfoot). These models were used to estimate PA strain, force, and elastic energy storage during the stance phase. To examine the release of stored energy, the foot joint moments, powers, and work created by the PA were computed. Mean elastic energy stored in the PA was 3.1±1.6 J, which was comparable to in situ testing values. Changes to the initial contact pattern did not change elastic energy storage or late stance PA function, but did alter PA pre-tensioning and function during early stance. In both initial contact pattern conditions, the PA power was positive during late stance, which reveals that the release of the stored elastic energy assists with shortening of the arch during push-off. As the PA is just one of the arch's passive elastic tissues, the entire arch may store additional energy and impact the metabolic cost of running. Copyright © 2016 Elsevier Ltd. All rights reserved.
Efficient multitasking of Choleski matrix factorization on CRAY supercomputers
NASA Technical Reports Server (NTRS)
Overman, Andrea L.; Poole, Eugene L.
1991-01-01
A Choleski method is described and used to solve linear systems of equations that arise in large scale structural analysis. The method uses a novel variable-band storage scheme and is structured to exploit fast local memory caches while minimizing data access delays between main memory and vector registers. Several parallel implementations of this method are described for the CRAY-2 and CRAY Y-MP computers demonstrating the use of microtasking and autotasking directives. A portable parallel language, FORCE, is used for comparison with the microtasked and autotasked implementations. Results are presented comparing the matrix factorization times for three representative structural analysis problems from runs made in both dedicated and multi-user modes on both computers. CPU and wall clock timings are given for the parallel implementations and are compared to single processor timings of the same algorithm.
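For context, a plain unblocked Cholesky factorization is shown below; the column update inside the loop is the work that the paper distributes with microtasking/autotasking and stores in its variable-band scheme, neither of which is reproduced in this sketch.

```python
# Reference column-oriented Cholesky factorization (unblocked, dense).
import numpy as np

def cholesky_lower(A):
    """Return lower-triangular L with A = L @ L.T for symmetric positive-definite A."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    L = np.zeros_like(A)
    for j in range(n):
        L[j, j] = np.sqrt(A[j, j] - L[j, :j] @ L[j, :j])
        # The column update below is the part that parallel implementations
        # split across processors or vector registers.
        L[j + 1:, j] = (A[j + 1:, j] - L[j + 1:, :j] @ L[j, :j]) / L[j, j]
    return L

A = np.array([[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]])
L = cholesky_lower(A)
print(np.allclose(L @ L.T, A))
```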
Bike and run pacing on downhill segments predict Ironman triathlon relative success.
Johnson, Evan C; Pryor, J Luke; Casa, Douglas J; Belval, Luke N; Vance, James S; DeMartini, Julie K; Maresh, Carl M; Armstrong, Lawrence E
2015-01-01
Determine if performance- and physiologically-based pacing characteristics over the varied terrain of a triathlon predicted relative bike, run, and/or overall success. Poor self-regulation of intensity during long distance (Full Iron) triathlon can manifest in adverse discontinuities in performance. Observational study of a random sample of Ironman World Championship athletes. High performing (HP) and low performing (LP) groups were established upon race completion. Participants wore global positioning system and heart rate enabled watches during the race. Percentage difference from pre-race disclosed goal pace (%off) and mean HR were calculated for nine segments of the bike and 11 segments of the run. Normalized graded running pace (accounting for changes in elevation) was computed via analysis software. Step-wise regression analyses identified segments predictive of relative success, and HP and LP were compared at these segments to confirm importance. %Off of goal velocity during two downhill segments of the bike (HP: -6.8±3.2%, -14.2±2.6% versus LP: -1.2±4.2%, -5.1±11.5%; p<0.020) and %off from NGP during one downhill segment of the run (HP: 4.8±5.2% versus LP: 33.3±38.7%; p=0.033) significantly predicted relative performance. Also, HP displayed more consistency in mean HR (141±12 to 138±11 bpm) compared to LP (139±17 to 131±16 bpm; p=0.019) over the climb and descent from the turn-around point during the bike component. Athletes who maintained faster relative speeds on downhill segments, and who had smaller changes in HR between consecutive up and downhill segments were more successful relative to their goal times. Copyright © 2013 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
Scalable load balancing for massively parallel distributed Monte Carlo particle transport
DOE Office of Scientific and Technical Information (OSTI.GOV)
O'Brien, M. J.; Brantley, P. S.; Joy, K. I.
2013-07-01
In order to run computer simulations efficiently on massively parallel computers with hundreds of thousands or millions of processors, care must be taken that the calculation is load balanced across the processors. Examining the workload of every processor leads to an unscalable algorithm, with run time at least as large as O(N), where N is the number of processors. We present a scalable load balancing algorithm, with run time O(log(N)), that involves iterated processor-pair-wise balancing steps, ultimately leading to a globally balanced workload. We demonstrate scalability of the algorithm up to 2 million processors on the Sequoia supercomputer at Lawrence Livermore National Laboratory. (authors)
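The pair-wise balancing idea can be illustrated with a toy in-memory simulation: in round d, each rank averages its load with the partner whose rank differs in bit d, so after log2(N) rounds every rank holds the global average. Real implementations exchange particles over MPI rather than averaging numbers; this sketch only shows the communication pattern.

```python
# Sketch of iterated processor-pair-wise balancing on a hypercube-style pattern.
def pairwise_balance(work):
    n = len(work)                         # assume n is a power of two
    rounds = n.bit_length() - 1           # log2(n) rounds
    for d in range(rounds):
        for rank in range(n):
            partner = rank ^ (1 << d)     # partner differs in bit d
            if partner > rank:
                avg = (work[rank] + work[partner]) / 2.0
                work[rank] = work[partner] = avg
    return work

print(pairwise_balance([100.0, 0.0, 40.0, 20.0, 0.0, 60.0, 10.0, 90.0]))
```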
Performance of a supercharged direct-injection stratified-charge rotary combustion engine
NASA Technical Reports Server (NTRS)
Bartrand, Timothy A.; Willis, Edward A.
1990-01-01
A zero-dimensional thermodynamic performance computer model for direct-injection stratified-charge rotary combustion engines was modified and run for a single rotor supercharged engine. Operating conditions for the computer runs were a single boost pressure and a matrix of speeds, loads and engine materials. A representative engine map is presented showing the predicted range of efficient operation. After discussion of the engine map, a number of engine features are analyzed individually. These features are: heat transfer and the influence insulating materials have on engine performance and exhaust energy; intake manifold pressure oscillations and interactions with the combustion chamber; and performance losses and seal friction. Finally, code running times and convergence data are presented.
Decrease in Ground-Run Distance of Small Airplanes by Applying Electrically-Driven Wheels
NASA Astrophysics Data System (ADS)
Kobayashi, Hiroshi; Nishizawa, Akira
A new takeoff method for small airplanes was proposed. The ground-roll performance of an airplane driven by electrically-powered wheels was studied experimentally and computationally. The experiments verified that the ground-run distance was halved by combining the powered wheels with the propeller, without any increase in energy consumption during the ground roll. The computational analysis showed that the ground-run distance of the wheel-driven aircraft was independent of the motor power when the motor capability exceeded the friction between the tires and the ground. Furthermore, the distance was minimized when the angle of attack was set so that the wing generated negative lift.
NASA Technical Reports Server (NTRS)
Elliott, R. D.; Werner, N. M.; Baker, W. M.
1975-01-01
The Aerodynamic Data Analysis and Integration System (ADAIS) is described: a highly interactive computer graphics program capable of manipulating large quantities of data such that addressable elements of a data base can be called up for graphic display, compared, curve fit, stored, retrieved, differenced, etc. The general nature of the system is evidenced by the fact that limited usage has already occurred with data bases consisting of thermodynamic, basic loads, and flight dynamics data. Productivity using ADAIS of five times that of conventional manual methods of wind tunnel data analysis is routinely achieved. In wind tunnel data analysis, data from one or more runs of a particular test may be called up and displayed along with data from one or more runs of a different test. Curves may be faired through the data points by any of four methods, including cubic spline and least squares polynomial fit up to seventh order.
The study on the control strategy of micro grid considering the economy of energy storage operation
NASA Astrophysics Data System (ADS)
Ma, Zhiwei; Liu, Yiqun; Wang, Xin; Li, Bei; Zeng, Ming
2017-08-01
To optimize the running of a micro grid, guarantee the balance between electricity supply and demand, and promote the utilization of renewable energy, the control strategy of the micro grid energy storage system is studied. Firstly, a mixed integer linear programming model is established based on receding horizon control. Secondly, a modified cuckoo search algorithm is proposed to solve the model. Finally, a case study is carried out to examine the signal characteristics of the micro grid and batteries under the optimal control strategy, and the convergence of the modified cuckoo search algorithm is compared with other algorithms to verify the validity of the proposed model and method. The results show that different micro grid running targets can affect the control strategy of the energy storage system, which further affects the signal characteristics of the micro grid. Meanwhile, the convergence speed, computing time and economy of the modified cuckoo search algorithm are improved compared with the traditional cuckoo search algorithm and the differential evolution algorithm.
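For readers unfamiliar with the metaheuristic, a compact cuckoo-search sketch is given below, minimizing a stand-in dispatch-cost function and using Gaussian steps in place of full Levy flights; it is not the paper's modified algorithm or its micro grid model.

```python
# Simplified cuckoo search on a hypothetical dispatch-cost function.
import numpy as np

rng = np.random.default_rng(3)
cost = lambda x: np.sum((x - 0.3) ** 2)       # stand-in objective

def cuckoo_search(dim=5, n_nests=15, pa=0.25, iters=200, step=0.1):
    nests = rng.uniform(0, 1, size=(n_nests, dim))
    fit = np.array([cost(n) for n in nests])
    for _ in range(iters):
        # Generate new candidate solutions around current nests (simplified flight).
        trial = np.clip(nests + step * rng.normal(size=nests.shape), 0, 1)
        tfit = np.array([cost(t) for t in trial])
        better = tfit < fit
        nests[better], fit[better] = trial[better], tfit[better]
        # Abandon a fraction pa of the worst nests and re-seed them randomly.
        worst = np.argsort(fit)[-int(pa * n_nests):]
        nests[worst] = rng.uniform(0, 1, size=(len(worst), dim))
        fit[worst] = [cost(n) for n in nests[worst]]
    best = np.argmin(fit)
    return nests[best], fit[best]

print(cuckoo_search())
```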
NASADIG - NASA DEVICE INDEPENDENT GRAPHICS LIBRARY (AMDAHL VERSION)
NASA Technical Reports Server (NTRS)
Rogers, J. E.
1994-01-01
The NASA Device Independent Graphics Library, NASADIG, can be used with many computer-based engineering and management applications. The library gives the user the opportunity to translate data into effective graphic displays for presentation. The software offers many features which allow the user flexibility in creating graphics. These include two-dimensional plots, subplot projections in 3D-space, surface contour line plots, and surface contour color-shaded plots. Routines for three-dimensional plotting, wireframe surface plots, surface plots with hidden line removal, and surface contour line plots are provided. Other features include polar and spherical coordinate plotting, world map plotting utilizing either cylindrical equidistant or Lambert equal area projection, plot translation, plot rotation, plot blowup, splines and polynomial interpolation, area blanking control, multiple log/linear axes, legends and text control, curve thickness control, and multiple text fonts (18 regular, 4 bold). NASADIG contains several groups of subroutines. Included are subroutines for plot area and axis definition; text set-up and display; area blanking; line style set-up, interpolation, and plotting; color shading and pattern control; legend, text block, and character control; device initialization; mixed alphabets setting; and other useful functions. The usefulness of many routines is dependent on the prior definition of basic parameters. The program's control structure uses a serial-level construct with each routine restricted for activation at some prescribed level(s) of problem definition. NASADIG provides the following output device drivers: Selanar 100XL, VECTOR Move/Draw ASCII and PostScript files, Tektronix 40xx, 41xx, and 4510 Rasterizer, DEC VT-240 (4014 mode), IBM AT/PC compatible with SmartTerm 240 emulator, HP Lasergrafix Film Recorder, QMS 800/1200, DEC LN03+ Laserprinters, and HP LaserJet (Series III). NASADIG is written in FORTRAN and is available for several platforms. NASADIG 5.7 is available for DEC VAX series computers running VMS 5.0 or later (MSC-21801), Cray X-MP and Y-MP series computers running UNICOS (COS-10049), and Amdahl 5990 mainframe computers running UTS (COS-10050). NASADIG 5.1 is available for UNIX-based operating systems (MSC-22001). The UNIX version has been successfully implemented on Sun4 series computers running SunOS, SGI IRIS computers running IRIX, Hewlett Packard 9000 computers running HP-UX, and Convex computers running Convex OS (MSC-22001). The standard distribution medium for MSC-21801 is a set of two 6250 BPI 9-track magnetic tapes in DEC VAX BACKUP format. It is also available on a set of two TK50 tape cartridges in DEC VAX BACKUP format. The standard distribution medium for COS-10049 and COS-10050 is a 6250 BPI 9-track magnetic tape in UNIX tar format. Other distribution media and formats may be available upon request. The standard distribution medium for MSC-22001 is a .25 inch streaming magnetic tape cartridge (Sun QIC-24) in UNIX tar format. Alternate distribution media and formats are available upon request. With minor modification, the UNIX source code can be ported to other platforms including IBM PC/AT series computers and compatibles. NASADIG is also available bundled with TRASYS, the Thermal Radiation Analysis System (COS-10026, DEC VAX version; COS-10040, CRAY version).
NASADIG - NASA DEVICE INDEPENDENT GRAPHICS LIBRARY (UNIX VERSION)
NASA Technical Reports Server (NTRS)
Rogers, J. E.
1994-01-01
The NASA Device Independent Graphics Library, NASADIG, can be used with many computer-based engineering and management applications. The library gives the user the opportunity to translate data into effective graphic displays for presentation. The software offers many features which allow the user flexibility in creating graphics. These include two-dimensional plots, subplot projections in 3D-space, surface contour line plots, and surface contour color-shaded plots. Routines for three-dimensional plotting, wireframe surface plots, surface plots with hidden line removal, and surface contour line plots are provided. Other features include polar and spherical coordinate plotting, world map plotting utilizing either cylindrical equidistant or Lambert equal area projection, plot translation, plot rotation, plot blowup, splines and polynomial interpolation, area blanking control, multiple log/linear axes, legends and text control, curve thickness control, and multiple text fonts (18 regular, 4 bold). NASADIG contains several groups of subroutines. Included are subroutines for plot area and axis definition; text set-up and display; area blanking; line style set-up, interpolation, and plotting; color shading and pattern control; legend, text block, and character control; device initialization; mixed alphabets setting; and other useful functions. The usefulness of many routines is dependent on the prior definition of basic parameters. The program's control structure uses a serial-level construct with each routine restricted for activation at some prescribed level(s) of problem definition. NASADIG provides the following output device drivers: Selanar 100XL, VECTOR Move/Draw ASCII and PostScript files, Tektronix 40xx, 41xx, and 4510 Rasterizer, DEC VT-240 (4014 mode), IBM AT/PC compatible with SmartTerm 240 emulator, HP Lasergrafix Film Recorder, QMS 800/1200, DEC LN03+ Laserprinters, and HP LaserJet (Series III). NASADIG is written in FORTRAN and is available for several platforms. NASADIG 5.7 is available for DEC VAX series computers running VMS 5.0 or later (MSC-21801), Cray X-MP and Y-MP series computers running UNICOS (COS-10049), and Amdahl 5990 mainframe computers running UTS (COS-10050). NASADIG 5.1 is available for UNIX-based operating systems (MSC-22001). The UNIX version has been successfully implemented on Sun4 series computers running SunOS, SGI IRIS computers running IRIX, Hewlett Packard 9000 computers running HP-UX, and Convex computers running Convex OS (MSC-22001). The standard distribution medium for MSC-21801 is a set of two 6250 BPI 9-track magnetic tapes in DEC VAX BACKUP format. It is also available on a set of two TK50 tape cartridges in DEC VAX BACKUP format. The standard distribution medium for COS-10049 and COS-10050 is a 6250 BPI 9-track magnetic tape in UNIX tar format. Other distribution media and formats may be available upon request. The standard distribution medium for MSC-22001 is a .25 inch streaming magnetic tape cartridge (Sun QIC-24) in UNIX tar format. Alternate distribution media and formats are available upon request. With minor modification, the UNIX source code can be ported to other platforms including IBM PC/AT series computers and compatibles. NASADIG is also available bundled with TRASYS, the Thermal Radiation Analysis System (COS-10026, DEC VAX version; COS-10040, CRAY version).
Multiple running speed signals in medial entorhinal cortex
Hinman, James R.; Brandon, Mark P.; Climer, Jason R.; Chapman, G. William; Hasselmo, Michael E.
2016-01-01
Grid cells in medial entorhinal cortex (MEC) can be modeled using oscillatory interference or attractor dynamic mechanisms that perform path integration, a computation requiring information about running direction and speed. The two classes of computational models often use either an oscillatory frequency or a firing rate that increases as a function of running speed. Yet it is currently not known whether these are two manifestations of the same speed signal or dissociable signals with potentially different anatomical substrates. We examined coding of running speed in MEC and identified these two speed signals to be independent of each other within individual neurons. The medial septum (MS) is strongly linked to locomotor behavior and removal of MS input resulted in strengthening of the firing rate speed signal, while decreasing the strength of the oscillatory speed signal. Thus two speed signals are present in MEC that are differentially affected by disrupted MS input. PMID:27427460
Running SINDA '85/FLUINT interactive on the VAX
NASA Technical Reports Server (NTRS)
Simmonds, Boris
1992-01-01
Computer software used as engineering tools is typically run in three modes: batch, demand, and interactive. The first two are the most popular in the SINDA world. The third one is not so popular, probably because users lack access to the command procedure files for running SINDA '85 or are unfamiliar with the SINDA '85 execution processes (pre-processor, processor, compilation, linking, execution, and all of the file assignments, creations, deletions and de-assignments). Interactive is the mode that makes thermal analysis with SINDA '85 a real-time design tool. This paper explains a command procedure sufficient (the minimum modifications required in an existing demand command procedure) to run SINDA '85 on the VAX in an interactive mode. To exercise the procedure, a sample problem is presented exemplifying the mode, plus additional programming capabilities available in SINDA '85. Following the same guidelines, the process can be extended to other SINDA '85 resident computer platforms.
Multi-GPGPU Tsunami simulation at Toyama-bay
NASA Astrophysics Data System (ADS)
Furuyama, Shoichi; Ueda, Yuki
2017-07-01
Accelerated multi-General-Purpose Graphics Processing Unit (GPGPU) calculation of a tsunami run-up simulation was achieved over a wide area (the whole of Toyama Bay in Japan) by a faster computation technique. Toyama Bay has active faults in the seabed, so earthquakes and tsunami waves are likely in the case of a huge earthquake; predicting the tsunami run-up area is therefore important for reducing damage to residents. However, such a simulation is a very hard task because of the computer resources required. A resolution on the order of several meters is needed for a running-up tsunami simulation because artificial structures on the ground, such as roads, buildings, and houses, are very small. On the other hand, a huge simulation area is also required; in the Toyama-bay case the area is 42 [km] × 15 [km]. When 5 [m] × 5 [m] computational cells are used for the simulation, over 26,000,000 computational cells are generated, and a normal desktop CPU computer took about 10 hours for the calculation. Reducing the calculation time is an important problem for an immediate tsunami run-up prediction system, which would contribute to protecting residents in coastal regions. This study decreased the calculation time by using a multi-GPGPU system equipped with six NVIDIA Tesla K20X cards, with InfiniBand connections between the computer nodes and the MVAPICH library. As a result, the calculation ran 5.16 times faster on six GPUs than on one GPU, an 86% parallel efficiency relative to linear speed-up.
An analysis of running skyline load path.
Ward W. Carson; Charles N. Mann
1971-01-01
This paper is intended for those who wish to prepare an algorithm to determine the load path of a running skyline. The mathematics of a simplified approach to this running skyline design problem are presented. The approach employs assumptions which reduce the complexity of the problem to the point where it can be solved on desk-top computers of limited capacities. The...
Job Priorities on Peregrine | High-Performance Computing | NREL
Jobs run with qos=high are charged against the project's allocation. Requesting a node reservation, for work that requires one, helps the scheduler more efficiently plan resources for larger jobs. When projects reach their allocation limit, jobs associated with those projects will run at very low priority, which ensures that these jobs run only when resources would otherwise go unused.
Running High-Throughput Jobs on Peregrine | High-Performance Computing |
Give each task a unique name (using "name=") and use the task name to create a unique output file name. Specify how many tasks to give to each worker at a time using the NITRO_COORD_OPTIONS environment variable. Finally, start Nitro by executing launch_nitro.sh. A sample Nitro job script shows how to run a job using these settings.
AlgoRun: a Docker-based packaging system for platform-agnostic implemented algorithms.
Hosny, Abdelrahman; Vera-Licona, Paola; Laubenbacher, Reinhard; Favre, Thibauld
2016-08-01
There is a growing need in bioinformatics for easy-to-use software implementations of algorithms that are usable across platforms. At the same time, reproducibility of computational results is critical and often a challenge due to source code changes over time and dependencies. The approach introduced in this paper addresses both of these needs with AlgoRun, a dedicated packaging system for implemented algorithms, using Docker technology. Implemented algorithms, packaged with AlgoRun, can be executed through a user-friendly interface directly from a web browser or via a standardized RESTful web API to allow easy integration into more complex workflows. The packaged algorithm includes the entire software execution environment, thereby eliminating the common problem of software dependencies and the irreproducibility of computations over time. AlgoRun-packaged algorithms can be published on http://algorun.org, a centralized searchable directory to find existing AlgoRun-packaged algorithms. AlgoRun is available at http://algorun.org, and the source code is available under the GPL license at https://github.com/algorun. Contact: laubenbacher@uchc.edu. Supplementary data are available at Bioinformatics online.
Neubauer, Jakob; Benndorf, Matthias; Reidelbach, Carolin; Krauß, Tobias; Lampert, Florian; Zajonc, Horst; Kotter, Elmar; Langer, Mathias; Fiebich, Martin; Goerke, Sebastian M.
2016-01-01
Purpose: To compare the diagnostic accuracy of radiography, radiography-equivalent-dose multidetector computed tomography (RED-MDCT), and radiography-equivalent-dose cone beam computed tomography (RED-CBCT) for wrist fractures. Methods: As study subjects we obtained 10 cadaveric human hands from body donors. Distal radius, distal ulna, and carpal bones (n = 100) were artificially fractured in random order in a controlled experimental setting. We performed radiation-dose-equivalent radiography (settings as in standard clinical care), RED-MDCT in a 320-row MDCT with single-shot mode, and RED-CBCT in a device dedicated to musculoskeletal imaging. Three raters independently evaluated the resulting images for fractures and the level of confidence for each finding. The gold standard was established by consensus reading of a high-dose MDCT. Results: Pooled sensitivity was higher in RED-MDCT (0.89) and RED-CBCT (0.81) compared to radiography (0.54) (P < .004). No significant differences were detected among the modalities' specificities. Raters' confidence was higher in RED-MDCT and RED-CBCT compared to radiography (P < .001). Conclusion: The diagnostic accuracy of RED-MDCT and RED-CBCT for wrist fractures proved to be similar and in some parts even higher compared to radiography. Readers are more confident in their reporting with the cross-sectional modalities. Dose-equivalent cross-sectional computed tomography of the wrist could replace plain radiography for fracture diagnosis in the long run. PMID:27788215
Running head: What color is it
NASA Technical Reports Server (NTRS)
Young, Andrew T.
1988-01-01
Color vision provides low-resolution spectrophotometric information about candidate materials for planetary surfaces that is comparable in precision to wideband photoelectric photometry, and considerably superior to Voyager TV data. The basic concepts, terminology, and notation of color science are briefly explained, and it is shown how to convert a reflectance spectrum into a color specification. An Appendix lists a simple computer subroutine to convert spectral reflectance into CIE coordinates, and the text explains how to convert these to a surface color in a standard color atlas. Target and printed Solar System colors are compared to show how accurate the printed colors are.
Xu, Y; Hou, Q; Wang, C; Simpson, T; Bennett, B; Russell, S
2017-01-01
We aimed to test how well modern, nonhabitually barefoot people can adapt to barefoot walking and to Minimalist Bare Foot Technology (MBFT) shoes with regard to gait symmetry. Twenty-eight healthy university students (22 females/6 males) were recruited to walk on a 10-meter walkway, in random order barefoot, in MBFT shoes, and in neutral running shoes, at their comfortable walking speed. Kinetic and kinematic data were collected using an 8-camera motion capture system. Joint angles, joint forces, and joint moments were extracted to compute a consecutive symmetry index. Compared to walking in neutral running shoes, walking barefoot led to worse symmetry of ankle joint force in the sagittal plane, knee joint moment in the transverse plane, and ankle joint moment in the frontal plane, while improving the symmetry of joint angle in the sagittal plane at the ankle joint and at the global (hip-knee-ankle) level. Walking in MBFT shoes showed intermediate gait symmetry compared with walking barefoot and walking in neutral running shoes. We conclude that modern, nonhabitually barefoot adults will lose some gait symmetry in joint force/moment if they switch to barefoot walking without a period of adaptation; MBFT shoes might be an ideal compromise for healthy young adults as regards gait symmetry in walking.
ASDIR-II. Volume I. User Manual
1975-12-01
normally the most significant part of the overall aircraft IR signature. The radiance is directly dependent upon the geometric view factors, a set ... factors as punched-card output in a view-factor computer run. For the view-factor computer run, IB49 through 53 and all IDS inputs, from IDS-2 to IDS-6, ... may be excluded from the input string if the program execution is requested to stop after punching the view factors. Inputs required for punching
Feasibility of Virtual Machine and Cloud Computing Technologies for High Performance Computing
2014-05-01
(List of acronyms:) RHEL, Red Hat Enterprise Linux; SaaS, software as a service; VM, virtual machine; vNUMA, virtual non-uniform memory access; WRF, weather research and forecasting. ... previously mentioned in Chapter I, Section B1 of this paper, which is used to run the weather research and forecasting (WRF) model in their experiments ... against a VMware virtualization solution of WRF. The experiment consisted of running WRF in a standard configuration between the D-VTM and VMware while ...
The Air Force Geophysics Laboratory Standalone Data Acquisition System: A Functional Description.
1980-10-09
the board are a buffer for the RUN/HALT front panel switch and a retriggerable oneshot multivibrator. This latter circuit senses the SRUN pulse train...recording on the data tapes, and providing the master timing source for data acquisition. An Electronic Research Company (ERC) model 2446 digital...the computer is fed to a retriggerable oneshot multivibrator on the board. (SRUN consists of a pulse train that is present when the computer is running
Improved Algorithms Speed It Up for Codes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hazi, A
2005-09-20
Huge computers, huge codes, complex problems to solve. The longer it takes to run a code, the more it costs. One way to speed things up and save time and money is through hardware improvements--faster processors, different system designs, bigger computers. But another side of supercomputing can reap savings in time and speed: software improvements to make codes--particularly the mathematical algorithms that form them--run faster and more efficiently. Speed up math? Is that really possible? According to Livermore physicist Eugene Brooks, the answer is a resounding yes. ''Sure, you get great speed-ups by improving hardware,'' says Brooks, the deputy leader for Computational Physics in N Division, which is part of Livermore's Physics and Advanced Technologies (PAT) Directorate. ''But the real bonus comes on the software side, where improvements in software can lead to orders of magnitude improvement in run times.'' Brooks knows whereof he speaks. Working with Laboratory physicist Abraham Szoeke and others, he has been instrumental in devising ways to shrink the running time of what has, historically, been a tough computational nut to crack: radiation transport codes based on the statistical or Monte Carlo method of calculation. And Brooks is not the only one. Others around the Laboratory, including physicists Andrew Williamson, Randolph Hood, and Jeff Grossman, have come up with innovative ways to speed up Monte Carlo calculations using pure mathematics.
Hari, Pradip; Ko, Kevin; Koukoumidis, Emmanouil; Kremer, Ulrich; Martonosi, Margaret; Ottoni, Desiree; Peh, Li-Shiuan; Zhang, Pei
2008-10-28
Increasingly, spatial awareness plays a central role in many distributed and mobile computing applications. Spatially aware applications rely on information about the geographical position of compute devices and their supported services in order to support novel functionality. While many spatial application drivers already exist in mobile and distributed computing, very little systems research has explored how best to program these applications, to express their spatial and temporal constraints, and to allow efficient implementations on highly dynamic real-world platforms. This paper proposes the SARANA system architecture, which includes language and run-time system support for spatially aware and resource-aware applications. SARANA allows users to express spatial regions of interest, as well as trade-offs between quality of result (QoR), latency and cost. The goal is to produce applications that use resources efficiently and that can be run on diverse resource-constrained platforms ranging from laptops to personal digital assistants and to smart phones. SARANA's run-time system manages QoR and cost trade-offs dynamically by tracking resource availability and locations, brokering usage/pricing agreements and migrating programs to nodes accordingly. A resource cost model permeates the SARANA system layers, permitting users to express their resource needs and QoR expectations in units that make sense to them. Although we are still early in the system development, initial versions have been demonstrated on a nine-node system prototype.
Lee, Jae H.; Yao, Yushu; Shrestha, Uttam; Gullberg, Grant T.; Seo, Youngho
2014-01-01
The primary goal of this project is to implement the iterative statistical image reconstruction algorithm, in this case maximum-likelihood expectation-maximization (MLEM) as used for dynamic cardiac single photon emission computed tomography, on Spark/GraphX. This involves porting the algorithm to run on large-scale parallel computing systems. Spark is an easy-to-program software platform that can handle large amounts of data in parallel. GraphX is a graph analytic system running on top of Spark to handle graph and sparse linear algebra operations in parallel. The main advantage of implementing the MLEM algorithm in Spark/GraphX is that it allows users to parallelize such computation without any expertise in parallel computing or prior knowledge in computer science. In this paper we demonstrate a successful implementation of MLEM in Spark/GraphX and present the performance gains, with the goal of eventually making it usable in a clinical setting. PMID:27081299
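For context, the per-voxel update that such a port parallelizes is the standard MLEM iteration, written here in its usual textbook form (the abstract does not give the exact formulation used in the Spark/GraphX implementation):

$$x_j^{(k+1)} = \frac{x_j^{(k)}}{\sum_i a_{ij}} \sum_i a_{ij}\,\frac{y_i}{\sum_{j'} a_{ij'}\,x_{j'}^{(k)}},$$

where $y_i$ is the measured count in projection bin $i$, $a_{ij}$ is the system-matrix element giving the probability that an emission from voxel $j$ is detected in bin $i$, and $x_j^{(k)}$ is the current image estimate. The forward projection in the denominator and the back projection in the outer sum are sparse matrix-vector products, which is the kind of operation the abstract says GraphX handles in parallel.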
Effect of Minimalist Footwear on Running Efficiency: A Randomized Crossover Trial.
Gillinov, Stephen M; Laux, Sara; Kuivila, Thomas; Hass, Daniel; Joy, Susan M
2015-05-01
Although minimalist footwear is increasingly popular among runners, claims that minimalist footwear enhances running biomechanics and efficiency are controversial. The hypothesis was that minimalist and barefoot conditions improve running efficiency when compared with traditional running shoes. Study design: randomized crossover trial; level of evidence 3. Fifteen experienced runners each completed three 90-second running trials on a treadmill, each trial performed in a different type of footwear: traditional running shoes with a heavily cushioned heel, minimalist running shoes with minimal heel cushioning, and barefoot (socked). High-speed photography was used to determine foot strike, ground contact time, knee angle, and stride cadence with each footwear type. Runners had more rearfoot strikes in traditional shoes (87%) compared with minimalist shoes (67%) and socked (40%) (P = 0.03). Ground contact time was longest in traditional shoes (265.9 ± 10.9 ms) when compared with minimalist shoes (253.4 ± 11.2 ms) and socked (250.6 ± 16.2 ms) (P = 0.005). There was no difference between groups with respect to knee angle (P = 0.37) or stride cadence (P = 0.20). When comparing running socked to running with minimalist running shoes, there were no differences in measures of running efficiency. When compared with running in traditional, cushioned shoes, both barefoot (socked) running and minimalist running shoes produce greater running efficiency in some experienced runners, with a greater tendency toward a midfoot or forefoot strike and a shorter ground contact time. Minimalist shoes closely approximate socked running in the four measurements performed. With regard to running efficiency and biomechanics, in some runners, barefoot (socked) and minimalist footwear are preferable to traditional running shoes.
A Monte-Carlo maplet for the study of the optical properties of biological tissues
NASA Astrophysics Data System (ADS)
Yip, Man Ho; Carvalho, M. J.
2007-12-01
Monte-Carlo simulations are commonly used to study complex physical processes in various fields of physics. In this paper we present a Maple program intended for Monte-Carlo simulations of photon transport in biological tissues. The program has been designed so that the input data and output display can be handled by a maplet (an easy and user-friendly graphical interface), named the MonteCarloMaplet. A thorough explanation of the programming steps and how to use the maplet is given. Results obtained with the Maple program are compared with corresponding results available in the literature. Program summary -- Program title: MonteCarloMaplet. Catalogue identifier: ADZU_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADZU_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html. No. of lines in distributed program, including test data, etc.: 3251. No. of bytes in distributed program, including test data, etc.: 296 465. Distribution format: tar.gz. Programming language: Maple 10. Computer: Acer Aspire 5610 (any running Maple 10). Operating system: Windows XP Professional (any running Maple 10). Classification: 3.1, 5. Nature of problem: simulate the transport of radiation in biological tissues. Solution method: the Maple program follows the steps of the C program of L. Wang et al. [L. Wang, S.L. Jacques, L. Zheng, Computer Methods and Programs in Biomedicine 47 (1995) 131-146]; the Maple library routine for random number generation is used [Maple 10 User Manual, © Maplesoft, a division of Waterloo Maple Inc., 2005]. Restrictions: running time increases rapidly with the number of photons used in the simulation. Unusual features: a maplet (graphical user interface) has been programmed for data input and output. Note that the Monte-Carlo simulation was programmed with Maple 10; if attempting to run the simulation with an earlier version of Maple, appropriate modifications (regarding typesetting fonts) are required, and once effected the worksheet runs without problem, although some of the windows of the maplet may still appear distorted. Running time: depends essentially on the number of photons used in the simulation. Elapsed times for particular runs are reported in the main text.
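The heart of such a photon-transport code is compact: sample a free path length from an exponential distribution, deposit part of the photon weight at each interaction, scatter, and repeat. The Python sketch below illustrates that loop in a deliberately stripped-down one-dimensional form; it is a generic illustration of the Wang et al. style of simulation, not a transcription of the Maple program, and the tissue coefficients are placeholder values.

    import math, random

    mu_a, mu_s = 0.1, 10.0    # absorption and scattering coefficients, 1/cm (placeholders)
    mu_t = mu_a + mu_s        # total interaction coefficient

    def one_photon(max_steps=1000):
        """Follow one photon's weight through successive interactions (depth only, for brevity)."""
        weight, depth, direction = 1.0, 0.0, 1.0
        for _ in range(max_steps):
            step = -math.log(1.0 - random.random()) / mu_t   # free path ~ exponential distribution
            depth += direction * step
            if depth < 0.0:
                return None                          # photon escaped back through the surface
            weight *= mu_s / mu_t                    # fraction of the weight surviving absorption
            direction = random.uniform(-1.0, 1.0)    # isotropic scattering (real codes use Henyey-Greenstein)
            if weight < 1e-4:                        # simplified termination once the weight is negligible
                return depth
        return depth

    depths = [d for d in (one_photon() for _ in range(10_000)) if d is not None]
    print(sum(depths) / len(depths))                 # mean depth at which photon weight is deposited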
Scaling predictive modeling in drug development with cloud computing.
Moghadam, Behrooz Torabi; Alvarsson, Jonathan; Holm, Marcus; Eklund, Martin; Carlsson, Lars; Spjuth, Ola
2015-01-26
Growing data sets with increased time for analysis is hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations are parallelized and run on the Amazon Elastic Cloud. We trained models on open data sets of varying sizes for the end points logP and Ames mutagenicity and compare with model building parallelized on a traditional high-performance computing cluster. We show that while high-performance computing results in faster model building, the use of cloud computing resources is feasible for large data sets and scales well within cloud instances. An additional advantage of cloud computing is that the costs of predictive models can be easily quantified, and a choice can be made between speed and economy. The easy access to computational resources with no up-front investments makes cloud computing an attractive alternative for scientists, especially for those without access to a supercomputer, and our study shows that it enables cost-efficient modeling of large data sets on demand within reasonable time.
Research on elastic resource management for multi-queue under cloud computing environment
NASA Astrophysics Data System (ADS)
CHENG, Zhenjing; LI, Haibo; HUANG, Qiulan; Cheng, Yaodong; CHEN, Gang
2017-10-01
As a new approach to managing computing resources, virtualization technology is more and more widely applied in the high-energy physics field. A virtual computing cluster based on OpenStack was built at IHEP, using HTCondor as the job queue management system. In a traditional static cluster, a fixed number of virtual machines are pre-allocated to the job queues of different experiments. However, this method cannot adapt well to the volatility of computing resource requirements. To solve this problem, an elastic computing resource management system for a cloud computing environment has been designed. This system performs unified management of virtual computing nodes on the basis of the job queues in HTCondor, based on dual resource thresholds as well as a quota service. A two-stage pool is designed to improve the efficiency of resource pool expansion. This paper presents several use cases of the elastic resource management system in IHEPCloud. Practical runs show that virtual computing resources dynamically expand or shrink as computing requirements change. Additionally, the CPU utilization ratio of the computing resources increased significantly compared with traditional resource management. The system also performs well when there are multiple HTCondor schedulers and multiple job queues.
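The dual-threshold policy described above reduces to a small decision rule evaluated per job queue. The sketch below is a schematic reading of that policy; the threshold values, the quota check, and the function name are assumptions made for illustration, not details of the IHEP system.

    def scaling_decision(idle_jobs, idle_nodes, quota_remaining,
                         expand_threshold=10, shrink_threshold=2):
        """Return the number of virtual nodes to add (positive) or release (negative)."""
        if idle_jobs > expand_threshold and quota_remaining > 0:
            # Queue is backed up and the experiment still has quota: grow the pool,
            # but never beyond what the quota service allows.
            return min(idle_jobs - expand_threshold, quota_remaining)
        if idle_jobs == 0 and idle_nodes > shrink_threshold:
            # Nothing waiting and nodes are idle: release the surplus back to the cloud.
            return -(idle_nodes - shrink_threshold)
        return 0  # within both thresholds: leave the pool unchanged

    print(scaling_decision(idle_jobs=25, idle_nodes=0, quota_remaining=8))   # +8 nodes
    print(scaling_decision(idle_jobs=0, idle_nodes=6, quota_remaining=8))    # -4 nodes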
The engineering design integration (EDIN) system. [digital computer program complex
NASA Technical Reports Server (NTRS)
Glatt, C. R.; Hirsch, G. N.; Alford, G. E.; Colquitt, W. N.; Reiners, S. J.
1974-01-01
A digital computer program complex for the evaluation of aerospace vehicle preliminary designs is described. The system consists of a Univac 1100 series computer and peripherals using the Exec 8 operating system, a set of demand access terminals of the alphanumeric and graphics types, and a library of independent computer programs. Modification of the partial run streams, data base maintenance and construction, and control of program sequencing are provided by a data manipulation program called the DLG processor. The executive control of library program execution is performed by the Univac Exec 8 operating system through a user established run stream. A combination of demand and batch operations is employed in the evaluation of preliminary designs. Applications accomplished with the EDIN system are described.
Specvis: Free and open-source software for visual field examination.
Dzwiniel, Piotr; Gola, Mateusz; Wójcik-Gryciuk, Anna; Waleszczyk, Wioletta J
2017-01-01
Visual field impairment affects more than 100 million people globally. However, due to the lack of access to appropriate ophthalmic healthcare in undeveloped regions as a result of associated costs and expertise, this number may be an underestimate. Improved access to affordable diagnostic software designed for visual field examination could slow the progression of diseases, such as glaucoma, allowing for early diagnosis and intervention. We have developed Specvis, a free and open-source application written in the Java programming language that can run on any personal computer, to meet this requirement (http://www.specvis.pl/). Specvis was tested on glaucomatous, retinitis pigmentosa, and stroke patients, and the results were compared to results using the Medmont M700 Automated Static Perimeter. The application was also tested for inter-test intrapersonal variability. The results from both validation studies indicated low inter-test intrapersonal variability and suitable reliability for a fast and simple assessment of visual field impairment. Specvis easily identifies visual field areas of zero sensitivity and allows for evaluation of sensitivity levels throughout the visual field. Thus, Specvis is a new, reliable application that can be successfully used for visual field examination and can fill the gap between confrontation and perimetry tests. The main advantages of Specvis over existing methods are its availability (free), affordability (runs on any personal computer), and reliability (comparable to high-cost solutions).
An evaluation of four single element airfoil analytic methods
NASA Technical Reports Server (NTRS)
Freuler, R. J.; Gregorek, G. M.
1979-01-01
A comparison of four computer codes for the analysis of two-dimensional single element airfoil sections is presented for three classes of section geometries. Two of the computer codes utilize vortex singularities methods to obtain the potential flow solution. The other two codes solve the full inviscid potential flow equation using finite differencing techniques, allowing results to be obtained for transonic flow about an airfoil including weak shocks. Each program incorporates boundary layer routines for computing the boundary layer displacement thickness and boundary layer effects on aerodynamic coefficients. Computational results are given for a symmetrical section represented by an NACA 0012 profile, a conventional section illustrated by an NACA 65A413 profile, and a supercritical type section for general aviation applications typified by a NASA LS(1)-0413 section. The four codes are compared and contrasted in the areas of method of approach, range of applicability, agreement among each other and with experiment, individual advantages and disadvantages, computer run times and memory requirements, and operational idiosyncrasies.
Using Avizo Software on the Peregrine System | High-Performance Computing |
Avizo can be run remotely from the Peregrine visualization node. First, launch a TurboVNC remote desktop; then, from a terminal in that remote desktop, run "module load avizo" followed by "vglrun avizo". Avizo can also be run locally.
Applying Parallel Adaptive Methods with GeoFEST/PYRAMID to Simulate Earth Surface Crustal Dynamics
NASA Technical Reports Server (NTRS)
Norton, Charles D.; Lyzenga, Greg; Parker, Jay; Glasscoe, Margaret; Donnellan, Andrea; Li, Peggy
2006-01-01
This viewgraph presentation reviews the use of Adaptive Mesh Refinement (AMR) in simulating the crustal dynamics of Earth's surface. AMR simultaneously improves solution quality, time to solution, and computer memory requirements when compared to generating/running on a globally fine mesh. The use of AMR in simulating the dynamics of the Earth's surface is spurred by proposed future NASA missions, such as InSAR, for Earth surface deformation and other measurements. These missions will require support for large-scale adaptive numerical methods using AMR to model observations. AMR was chosen because it has been successful in computational fluid dynamics for predictive simulation of complex flows around complex structures.
A Computing Infrastructure for Supporting Climate Studies
NASA Astrophysics Data System (ADS)
Yang, C.; Bambacus, M.; Freeman, S. M.; Huang, Q.; Li, J.; Sun, M.; Xu, C.; Wojcik, G. S.; Cahalan, R. F.; NASA Climate @ Home Project Team
2011-12-01
Climate change is one of the major challenges facing our planet in the 21st century. Scientists have built many models to simulate the past and predict climate change over the next decades or century. Most of the models run at a low resolution, with some targeting high resolution in linkage to practical climate change preparedness. To calibrate and validate the models, millions of model runs are needed to find the best simulation and configuration. This paper introduces the NASA Climate@Home project effort to build a supercomputer based on advanced computing technologies, such as cloud computing, grid computing, and others. The Climate@Home computing infrastructure includes several aspects: 1) a cloud computing platform is utilized to manage potential spikes in access to the centralized components, such as the grid computing server for dispatching model runs and collecting results; 2) a grid computing engine is developed based on MapReduce to dispatch models and model configurations, and to collect simulation results and contribution statistics; 3) a portal serves as the entry point for the project, providing management, sharing, and data exploration for end users; 4) scientists can access customized tools to configure model runs and visualize model results; and 5) the public can follow Twitter and Facebook to get the latest news about the project. This paper introduces the latest progress of the project and demonstrates the operational system during the AGU fall meeting. It also discusses how this technology can become a trailblazer for other climate studies and relevant sciences, and shares how the challenges in computation and software integration were solved.
Zhang, Yong; Otani, Akihito; Maginn, Edward J
2015-08-11
Equilibrium molecular dynamics is often used in conjunction with a Green-Kubo integral of the pressure tensor autocorrelation function to compute the shear viscosity of fluids. This approach is computationally expensive and is subject to a large amount of variability because the plateau region of the Green-Kubo integral is difficult to identify unambiguously. Here, we propose a time decomposition approach for computing the shear viscosity using the Green-Kubo formalism. Instead of one long trajectory, multiple independent trajectories are run and the Green-Kubo relation is applied to each trajectory. The averaged running integral as a function of time is fit to a double-exponential function with a weighting function derived from the standard deviation of the running integrals. Such a weighting function minimizes the uncertainty of the estimated shear viscosity and provides an objective means of estimating the viscosity. While the formal Green-Kubo integral requires an integration to infinite time, we suggest an integration cutoff time tcut, which can be determined by the relative values of the running integral and the corresponding standard deviation. This approach for computing the shear viscosity can be easily automated and used in computational screening studies where human judgment and intervention in the data analysis are impractical. The method has been applied to the calculation of the shear viscosity of a relatively low-viscosity liquid, ethanol, and relatively high-viscosity ionic liquid, 1-n-butyl-3-methylimidazolium bis(trifluoromethane-sulfonyl)imide ([BMIM][Tf2N]), over a range of temperatures. These test cases show that the method is robust and yields reproducible and reliable shear viscosity values.
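For reference, the quantity being accumulated in those running integrals is the standard Green-Kubo expression for shear viscosity (the specific double-exponential fitting form and weighting scheme are defined in the paper itself and are not reproduced here):

$$\eta = \frac{V}{k_B T}\int_0^{\infty} \left\langle P_{\alpha\beta}(0)\,P_{\alpha\beta}(t)\right\rangle \, dt,$$

where $V$ is the system volume, $T$ the temperature, $k_B$ Boltzmann's constant, and $P_{\alpha\beta}$ an off-diagonal element of the pressure tensor. The time-decomposition approach replaces one long running integral with the average of the running integrals from many independent trajectories, truncated at the cutoff time $t_{\mathrm{cut}}$ and fit to a double exponential with a standard-deviation-derived weight.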
SCEC Earthquake System Science Using High Performance Computing
NASA Astrophysics Data System (ADS)
Maechling, P. J.; Jordan, T. H.; Archuleta, R.; Beroza, G.; Bielak, J.; Chen, P.; Cui, Y.; Day, S.; Deelman, E.; Graves, R. W.; Minster, J. B.; Olsen, K. B.
2008-12-01
The SCEC Community Modeling Environment (SCEC/CME) collaboration performs basic scientific research using high performance computing with the goal of developing a predictive understanding of earthquake processes and seismic hazards in California. SCEC/CME research areas include dynamic rupture modeling, wave propagation modeling, probabilistic seismic hazard analysis (PSHA), and full 3D tomography. SCEC/CME computational capabilities are organized around the development and application of robust, reusable, well-validated simulation systems we call computational platforms. The SCEC earthquake system science research program includes a wide range of numerical modeling efforts, and we continue to extend our numerical modeling codes to include more realistic physics and to run at higher and higher resolution. During this year, the SCEC/USGS OpenSHA PSHA computational platform was used to calculate PSHA hazard curves and hazard maps using the new UCERF2.0 ERF and new 2008 attenuation relationships. Three SCEC/CME modeling groups ran 1Hz ShakeOut simulations using different codes and computer systems and carefully compared the results. The DynaShake Platform was used to calculate several dynamic rupture-based source descriptions equivalent in magnitude and final surface slip to the ShakeOut 1.2 kinematic source description. A SCEC/CME modeler produced 10Hz synthetic seismograms for the ShakeOut 1.2 scenario rupture by combining 1Hz deterministic simulation results with 10Hz stochastic seismograms. SCEC/CME modelers ran an ensemble of seven ShakeOut-D simulations to investigate the variability of ground motions produced by dynamic rupture-based source descriptions. The CyberShake Platform was used to calculate more than 15 new probabilistic seismic hazard analysis (PSHA) hazard curves using full 3D waveform modeling and the new UCERF2.0 ERF. The SCEC/CME group has also produced significant computer science results this year. Large-scale SCEC/CME high performance codes were run on NSF TeraGrid sites, including simulations that used the full PSC Big Ben supercomputer (4096 cores) and simulations that ran on more than 10K cores at TACC Ranger. The SCEC/CME group used scientific workflow tools and grid computing to run more than 1.5 million jobs at NCSA for the CyberShake project. Visualizations produced by a SCEC/CME researcher of the 10Hz ShakeOut 1.2 scenario simulation data were used by USGS in ShakeOut publications and public outreach efforts. OpenSHA was ported onto an NSF supercomputer and was used to produce very high resolution PSHA hazard maps that contained more than 1.6 million hazard curves.
Shared address collectives using counter mechanisms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blocksome, Michael; Dozsa, Gabor; Gooding, Thomas M
A shared address space on a compute node stores data received from a network and data to transmit to the network. The shared address space includes an application buffer that can be directly operated upon by a plurality of processes, for instance, running on different cores on the compute node. A shared counter is used for one or more of signaling arrival of the data across the plurality of processes running on the compute node, signaling completion of an operation performed by one or more of the plurality of processes, obtaining reservation slots by one or more of the plurality of processes, or combinations thereof.
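As a rough conceptual analogue of that counter-based signaling (not the patented Blue Gene-style implementation, and using operating-system processes rather than hardware counters), the Python sketch below has several processes write directly into a shared buffer and bump a shared counter to signal completion.

    from multiprocessing import Process, Value, Array

    def worker(rank, buffer, arrivals, n_items):
        """Write this process's slice of the shared buffer, then signal arrival on the shared counter."""
        for i in range(n_items):
            buffer[rank * n_items + i] = rank        # operate directly on the shared application buffer
        with arrivals.get_lock():
            arrivals.value += 1                      # signal completion across processes

    if __name__ == "__main__":
        n_procs, n_items = 4, 8
        buffer = Array('i', n_procs * n_items, lock=False)   # stand-in for the shared address space
        arrivals = Value('i', 0)                             # shared counter
        procs = [Process(target=worker, args=(r, buffer, arrivals, n_items)) for r in range(n_procs)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        assert arrivals.value == n_procs                     # every participant has signalled
        print(list(buffer))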
Crew appliance computer program manual, volume 1
NASA Technical Reports Server (NTRS)
Russell, D. J.
1975-01-01
Trade studies of numerous appliance concepts for advanced spacecraft galley, personal hygiene, housekeeping, and other areas were made to determine which best satisfy the space shuttle orbiter and modular space station mission requirements. Analytical models of selected appliance concepts not currently included in the G-189A Generalized Environmental/Thermal Control and Life Support Systems (ETCLSS) Computer Program subroutine library were developed. The new appliance subroutines are given along with complete analytical model descriptions, solution methods, user's input instructions, and validation run results. The appliance components modeled were integrated with G-189A ETCLSS models for shuttle orbiter and modular space station, and results from computer runs of these systems are presented.
NASA Astrophysics Data System (ADS)
Steiger, Damian S.; Haener, Thomas; Troyer, Matthias
Quantum computers promise to transform our notions of computation by offering a completely new paradigm. A high level quantum programming language and optimizing compilers are essential components to achieve scalable quantum computation. In order to address this, we introduce the ProjectQ software framework - an open source effort to support both theorists and experimentalists by providing intuitive tools to implement and run quantum algorithms. Here, we present our ProjectQ quantum compiler, which compiles a quantum algorithm from our high-level Python-embedded language down to low-level quantum gates available on the target system. We demonstrate how this compiler can be used to control actual hardware and to run high-performance simulations.
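The published minimal ProjectQ example is a single-qubit random bit run on the default simulator backend; it shows how the Python-embedded language and the compiler fit together (hardware backends are supplied when constructing MainEngine):

    from projectq import MainEngine
    from projectq.ops import H, Measure

    eng = MainEngine()            # default compiler engines and simulator backend
    qubit = eng.allocate_qubit()  # allocate one logical qubit
    H | qubit                     # put it into an equal superposition
    Measure | qubit               # measure in the computational basis
    eng.flush()                   # compile and send the circuit to the backend
    print(int(qubit))             # 0 or 1 with equal probability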
PHREEQCI; a graphical user interface for the geochemical computer program PHREEQC
Charlton, Scott R.; Macklin, Clifford L.; Parkhurst, David L.
1997-01-01
PhreeqcI is a Windows-based graphical user interface for the geochemical computer program PHREEQC. PhreeqcI provides the capability to generate and edit input data files, run simulations, and view text files containing simulation results, all within the framework of a single interface. PHREEQC is a multipurpose geochemical program that can perform speciation, inverse, reaction-path, and 1D advective reaction-transport modeling. Interactive access to all of the capabilities of PHREEQC is available with PhreeqcI. The interface is written in Visual Basic and will run on personal computers under the Windows 3.1, Windows 95, and Windows NT operating systems.
Local rollback for fault-tolerance in parallel computing systems
Blumrich, Matthias A [Yorktown Heights, NY; Chen, Dong [Yorktown Heights, NY; Gara, Alan [Yorktown Heights, NY; Giampapa, Mark E [Yorktown Heights, NY; Heidelberger, Philip [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Steinmacher-Burow, Burkhard [Boeblingen, DE; Sugavanam, Krishnan [Yorktown Heights, NY
2012-01-24
A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.
[Groupamatic 360 C1 and automated blood donor processing in a transfusion center].
Guimbretiere, J; Toscer, M; Harousseau, H
1978-03-01
Automation of the donor management flow path is controlled by: --a three-slip "port-a-punch" card, --the Groupamatic unit, with results sorted out on punched paper tape, --the management computer, connected off line to the Groupamatic. Data tracking at blood collection time is done by punching a card, with the donor card used as a master card. The Groupamatic performs: --standard blood grouping, with one run for registered donors and two runs for new donors, --phenotyping, with two runs, --screening for irregular antibodies. The management computer checks the correlation between the data of the two runs, or between the data of a single run and that of the previous file. It updates the data resident in the central file and prints out: --the controls of the different blood groups for the red cell panel, --the listing of error messages, --the listing of emergency call-ups, --the listing of collected blood units when they arrive at the blood center, with quantitative and qualitative information such as the number of blood units collected, donor addresses, etc., --statistics, --donor cards, --diplomas.
NASA Astrophysics Data System (ADS)
Boudreaux, R. D.; Metzger, C. E.; Macias, B. R.; Shirazi-Fard, Y.; Hogan, H. A.; Bloomfield, S. A.
2014-06-01
Astronauts on long duration missions continue to experience bone loss, as much as 1-2% each month, for up to 4.5 years after a mission. Mechanical loading of bone with exercise has been shown to increase bone formation, mass, and geometry. The aim of this study was to compare the efficacy of two exercise protocols during a period of reduced gravitational loading (1/6th body weight) in mice. Since muscle contractions via resistance exercise impart the largest physiological loads on the skeleton, we hypothesized that resistance training (via vertical tower climbing) would better protect against the deleterious musculoskeletal effects of reduced gravitational weight bearing when compared to endurance exercise (treadmill running). Young adult female BALB/cBYJ mice were randomly assigned to three groups: 1/6 g (G/6; n=6), 1/6 g with treadmill running (G/6+RUN; n=8), or 1/6 g with vertical tower climbing (G/6+CLB; n=9). Exercise was performed five times per week. Reduced weight bearing for 21 days was achieved through a novel harness suspension system. Treadmill velocity (12-20 m/min) and daily run time duration (32-51 min) increased incrementally throughout the study. Bone geometry and volumetric bone mineral density (vBMD) at proximal metaphysis and mid-diaphysis tibia were assessed by in vivo peripheral quantitative computed tomography (pQCT) on days 0 and 21 and standard dynamic histomorphometry was performed on undemineralized sections of the mid-diaphysis after tissue harvest. G/6 caused a significant decrease (P<0.001) in proximal tibia metaphysis total vBMD (-9.6%). These reductions of tibia metaphyseal vBMD in G/6 mice were mitigated in both G/6+RUN and G/6+CLB groups (P<0.05). After 21 days of G/6, we saw an absolute increase in tibia mid-diaphysis vBMD and in distal metaphysis femur vBMD in both G/6+RUN and G/6+CLB mice (P<0.05). Substantial increases in endocortical and periosteal mineralizing surface (MS/BS) at mid-diaphysis tibia in G/6+CLB demonstrate that bone formation can be increased even in the presence of reduced weight bearing. These data suggest that moderately vigorous endurance exercise and resistance training, through treadmill running or climb training mitigates decrements in vBMD during 21 days of reduced weight bearing. Consistent with our hypothesis, tower climb training, most pronounced in the tibia mid-diaphysis, provides a more potent osteogenic response compared to treadmill running.
Abusive User Policy | High-Performance Computing | NREL
First incident: the user's ability to run new jobs or store new data will be suspended temporarily; once the user has acknowledged and participated in a remedy, the ability to run new jobs or store new data will be restored. Second incident: running new jobs or storing new data will be suspended, and jobs will be terminated if necessary.
PNNL streamlines energy-guzzling computers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beckman, Mary T.; Marquez, Andres
In a room the size of a garage, two rows of six-foot-tall racks holding supercomputer hard drives sit back-to-back. Thin tubes and wires snake off the hard drives, slithering into the corners. Stepping between the rows, a rush of heat whips around you -- the air from fans blowing off processing heat. But walk farther in, between the next racks of hard drives, and the temperature drops noticeably. These drives are being cooled by a non-conducting liquid that runs right over the hardworking processors. The liquid carries the heat away in tubes, saving the air a few degrees. This is the Energy Smart Data Center at Pacific Northwest National Laboratory. The bigger, faster, and meatier supercomputers get, the more energy they consume. PNNL's Andres Marquez has developed this test bed to learn how to train the behemoths in energy efficiency. The work will help supercomputers perform better as well. Processors have to keep cool or suffer from "thermal throttling," says Marquez. "That's the performance threshold where the computer is too hot to run well. That threshold is an industry secret." The center at EMSL, DOE's national scientific user facility at PNNL, harbors several ways of experimenting with energy usage. For example, the room's air conditioning is isolated from the rest of EMSL -- pipes running beneath the floor carry temperature-controlled water through heat exchangers to cooling towers outside. "We can test whether it's more energy efficient to cool directly on the processing chips or out in the water tower," says Marquez. The hard drives feed energy and temperature data to a network server running specially designed software that controls and monitors the data center. To test the center's limits, the team runs the processors flat out -- not only on carefully controlled test programs in the Energy Smart computers, but also on real world software from other EMSL research, such as regional weather forecasting models. Marquez's group is also developing "power aware computing," where the computer programs themselves perform calculations more energy efficiently. Maybe once computers get smart about energy, they'll have tips for their users.
A Menu-Driven Interface to Unix-Based Resources
Evans, Elizabeth A.
1989-01-01
Unix has often been overlooked in the past as a viable operating system for anyone other than computer scientists. Its terseness, non-mnemonic nature of the commands, and the lack of user-friendly software to run under it are but a few of the user-related reasons which have been cited. It is, nevertheless, the operating system of choice in many cases. This paper describes a menu-driven interface to Unix which provides user-friendlier access to the software resources available on the computers running under Unix.
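The idea generalizes readily: a thin menu layer maps friendly descriptions onto terse Unix commands. The Python sketch below is a minimal modern illustration of that idea, not the interface described in the paper; the menu entries are arbitrary examples.

    import subprocess

    # Illustrative menu entries: label -> Unix command
    MENU = {
        "1": ("List files in this directory", ["ls", "-l"]),
        "2": ("Show disk usage", ["df", "-h"]),
        "3": ("Show who is logged in", ["who"]),
    }

    def main():
        while True:
            for key, (label, _) in sorted(MENU.items()):
                print(f"{key}) {label}")
            choice = input("Select an option (q to quit): ").strip()
            if choice == "q":
                break
            if choice in MENU:
                subprocess.run(MENU[choice][1])   # run the underlying Unix command for the user
            else:
                print("Unrecognized option.")

    if __name__ == "__main__":
        main()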
The Impact of Typhoons on the Ocean in the Pacific (ITOP) Field and Data Management Support
2011-12-16
Predicted measurement strategies necessitated a dry run experiment in October of 2009 to develop effective sampling strategies for 2010. EOL/Computing Data and Software Facility (CDS) supported the ITOP Dry Run ... The catalog contains products from 21 September through October 2009 and remains accessible at EOL at the above-mentioned URL. The products listed by ...
Interoperability...NMCI and Beyond
2001-05-31
wireless. "On The Road": pagers, cell phones, palm-size PDAs, two-way pagers, hand-held computing devices, laptop computers, two-way radios ... combat capability" ... [Cost-per-hour comparison chart, $0-$25 scale: F/A-18 flying hour $1,134.00; federal civilian salary (mean) $23.80; cell phone air time $11.00; first-run movie $4.00; NMCI seat $1.38; electric power $0.20; DSN ...]
NASA Astrophysics Data System (ADS)
Tong, Qiujie; Wang, Qianqian; Li, Xiaoyang; Shan, Bin; Cui, Xuntai; Li, Chenyu; Peng, Zhong
2016-11-01
In order to satisfy real-time and generality requirements, a laser target simulator for a semi-physical simulation system based on the RTX + LabWindows/CVI platform is proposed in this paper. Compared with the upper/lower-computer simulation platform architecture used in most current real-time systems, this system has better maintainability and portability. The system runs on the Windows platform, using the Windows RTX real-time extension subsystem to ensure real-time performance, combined with a reflective memory network to complete real-time tasks such as calculating the simulation model, transmitting the simulation data, and maintaining real-time communication. The real-time tasks of the simulation system run under the RTSS process. At the same time, LabWindows/CVI is used to build a graphical interface and to complete non-real-time tasks in the simulation process, such as man-machine interaction and the display and storage of the simulation data, which run under the Win32 process. Through the design of RTX shared memory and a task scheduling algorithm, data interaction between the RTSS real-time task process and the Win32 non-real-time task process is accomplished. The experimental results show that the system has strong real-time performance, high stability, and high simulation accuracy, as well as good human-computer interaction.
NASA Astrophysics Data System (ADS)
Silva, F.; Maechling, P. J.; Goulet, C. A.; Somerville, P.; Jordan, T. H.
2014-12-01
The Southern California Earthquake Center (SCEC) Broadband Platform is a collaborative software development project involving geoscientists, earthquake engineers, graduate students, and the SCEC Community Modeling Environment. The SCEC Broadband Platform (BBP) is open-source scientific software that can generate broadband (0-100Hz) ground motions for earthquakes, integrating complex scientific modules that implement rupture generation, low and high-frequency seismogram synthesis, non-linear site effects calculation, and visualization into a software system that supports easy on-demand computation of seismograms. The Broadband Platform operates in two primary modes: validation simulations and scenario simulations. In validation mode, the Platform runs earthquake rupture and wave propagation modeling software to calculate seismograms for a well-observed historical earthquake. Then, the BBP calculates a number of goodness of fit measurements that quantify how well the model-based broadband seismograms match the observed seismograms for a certain event. Based on these results, the Platform can be used to tune and validate different numerical modeling techniques. In scenario mode, the Broadband Platform can run simulations for hypothetical (scenario) earthquakes. In this mode, users input an earthquake description, a list of station names and locations, and a 1D velocity model for their region of interest, and the Broadband Platform software then calculates ground motions for the specified stations. Working in close collaboration with scientists and research engineers, the SCEC software development group continues to add new capabilities to the Broadband Platform and to release new versions as open-source scientific software distributions that can be compiled and run on many Linux computer systems. Our latest release includes 5 simulation methods, 7 simulation regions covering California, Japan, and Eastern North America, the ability to compare simulation results against GMPEs, and several new data products, such as map and distance-based goodness of fit plots. As the number and complexity of scenarios simulated using the Broadband Platform increases, we have added batching utilities to substantially improve support for running large-scale simulations on computing clusters.
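Goodness of fit in this setting is commonly summarized as a bias and standard deviation of log residuals of a ground-motion measure over stations. The sketch below shows that generic calculation with made-up peak-ground-acceleration values; it is not the Broadband Platform's own goodness-of-fit module, whose exact metrics and plots are documented with the software.

    import math

    # Hypothetical peak ground accelerations (g) at four stations
    observed  = [0.21, 0.35, 0.18, 0.44]
    simulated = [0.19, 0.41, 0.15, 0.40]

    residuals = [math.log(o / s) for o, s in zip(observed, simulated)]
    bias = sum(residuals) / len(residuals)     # > 0 means the simulation under-predicts on average
    sigma = math.sqrt(sum((r - bias) ** 2 for r in residuals) / len(residuals))
    print(f"bias = {bias:+.3f}, sigma = {sigma:.3f}")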
Barlough, J E; Jacobson, R H; Downing, D R; Lynch, T J; Scott, F W
1987-01-01
The computer-assisted, kinetics-based enzyme-linked immunosorbent assay for coronavirus antibodies in cats was calibrated to the conventional indirect immunofluorescence assay by linear regression analysis and computerized interpolation (generation of "immunofluorescence assay-equivalent" titers). Procedures were developed for normalization and standardization of kinetics-based enzyme-linked immunosorbent assay results through incorporation of five different control sera of predetermined ("expected") titer in daily runs. When used with such sera and with computer assistance, the kinetics-based enzyme-linked immunosorbent assay minimized both within-run and between-run variability while allowing also for efficient data reduction and statistical analysis and reporting of results. PMID:3032390
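The calibration step described above is, at its core, a linear regression between the two assays' readouts followed by interpolation of new samples onto that line. The Python sketch below illustrates the idea with invented numbers; it is not the published calibration data or software.

    import numpy as np

    # Hypothetical paired results: kinetics-ELISA readings vs. log2 IFA titers (illustrative only)
    kelisa_reading = np.array([0.12, 0.25, 0.41, 0.63, 0.88])
    log2_ifa_titer = np.array([4.0, 5.0, 6.0, 7.0, 8.0])

    slope, intercept = np.polyfit(kelisa_reading, log2_ifa_titer, 1)   # linear calibration curve

    def ifa_equivalent_titer(reading):
        """Interpolate an 'immunofluorescence assay-equivalent' titer from a kELISA reading."""
        return 2 ** (slope * reading + intercept)

    print(round(ifa_equivalent_titer(0.50)))   # IFA-equivalent titer for a mid-range reading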
COSP - A computer model of cyclic oxidation
NASA Technical Reports Server (NTRS)
Lowell, Carl E.; Barrett, Charles A.; Palmer, Raymond W.; Auping, Judith V.; Probst, Hubert B.
1991-01-01
A computer model useful in predicting the cyclic oxidation behavior of alloys is presented. The model considers the oxygen uptake due to scale formation during the heating cycle and the loss of oxide due to spalling during the cooling cycle. The balance between scale formation and scale loss is modeled and used to predict weight change and metal loss kinetics. A simple uniform spalling model is compared to a more complex random spall site model. In nearly all cases, the simpler uniform spall model gave predictions as accurate as the more complex model. The model has been applied to several nickel-base alloys which, depending upon composition, form Al2O3 or Cr2O3 during oxidation. The model has been validated by several experimental approaches. Versions of the model that run on a personal computer are available.
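The balance the model strikes (parabolic scale growth on heating against a fraction of the scale lost on cooling) can be sketched in a few lines. The loop below is a minimal illustration of the uniform-spalling idea with placeholder constants; the actual COSP model tracks the oxide and metal inventories in considerably more detail.

    import math

    # Placeholder constants: parabolic growth constant ((mg/cm^2)^2 per cycle),
    # spall fraction per cooling cycle, and oxygen mass fraction of the oxide (~Al2O3-like).
    k_p, q0, f_oxygen = 0.05, 0.02, 0.47

    oxide, metal_consumed = 0.0, 0.0     # retained oxide and cumulative metal loss, mg/cm^2
    history = []
    for cycle in range(1, 501):
        grown = math.sqrt(oxide**2 + k_p) - oxide   # heating: parabolic scale growth
        oxide += grown
        metal_consumed += grown * (1.0 - f_oxygen)  # metal tied up in the new oxide
        oxide *= (1.0 - q0)                         # cooling: a uniform fraction of the scale spalls
        history.append(oxide - metal_consumed)      # net specimen weight change

    peak = max(history)
    print(f"peak gain {peak:+.2f} mg/cm^2 at cycle {history.index(peak) + 1}; "
          f"final change {history[-1]:+.2f} mg/cm^2 after 500 cycles")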
Response surface method in geotechnical/structural analysis, phase 1
NASA Astrophysics Data System (ADS)
Wong, F. S.
1981-02-01
In the response surface approach, an approximating function is fit to a long-running computer code based on a limited number of code calculations. The approximating function, called the response surface, is then used to replace the code in subsequent repetitive computations required in a statistical analysis. The procedure of response surface development and the feasibility of the method are shown using a sample problem in slope stability, which is based on data from centrifuge experiments of model soil slopes and involves five random soil parameters. It is shown that a response surface can be constructed based on as few as four code calculations and that the response surface is computationally extremely efficient compared to the code calculation. Potential applications of this research include probabilistic analysis of dynamic, complex, nonlinear soil/structure systems such as slope stability, liquefaction, and nuclear reactor safety.
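In its simplest form the response surface is just a low-order polynomial fitted to a handful of code evaluations, after which the polynomial stands in for the code in the repetitive statistical computations. The Python sketch below shows that idea for a one-dimensional toy "code"; the toy function and sample counts are assumptions for illustration, whereas the cited study fitted a surface over five random soil parameters.

    import numpy as np

    def expensive_code(x):
        """Stand-in for a long-running analysis code (toy function for illustration)."""
        return np.exp(-x) * np.sin(3 * x)

    # A small number of deliberate code runs at chosen design points ...
    design_points = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
    responses = expensive_code(design_points)

    # ... fitted with a quadratic response surface
    surface = np.poly1d(np.polyfit(design_points, responses, deg=2))

    # The cheap surface now replaces the code in repetitive (e.g., Monte Carlo) evaluations
    samples = np.random.default_rng(0).uniform(0.0, 2.0, 100_000)
    print(f"mean response estimated from the surface: {surface(samples).mean():.3f}")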
2012-01-01
Background Laboratories engaged in computational biology or bioinformatics frequently need to run lengthy, multistep, and user-driven computational jobs. Each job can tie up a computer for a few minutes to several days, and many laboratories lack the expertise or resources to build and maintain a dedicated computer cluster. Results JobCenter is a client–server application and framework for job management and distributed job execution. The client and server components are both written in Java and are cross-platform and relatively easy to install. All communication with the server is client-driven, which allows worker nodes to run anywhere (even behind external firewalls or “in the cloud”) and provides inherent load balancing. Adding a worker node to the worker pool is as simple as dropping the JobCenter client files onto any computer and performing basic configuration, which provides tremendous ease-of-use, flexibility, and limitless horizontal scalability. Each worker installation may be independently configured, including the types of jobs it is able to run. Executed jobs may be written in any language and may include multistep workflows. Conclusions JobCenter is a versatile and scalable distributed job management system that allows laboratories to very efficiently distribute all computational work among available resources. JobCenter is freely available at http://code.google.com/p/jobcenter/. PMID:22846423
CMG-biotools, a free workbench for basic comparative microbial genomics.
Vesth, Tammi; Lagesen, Karin; Acar, Öncel; Ussery, David
2013-01-01
Today, there are more than a hundred times as many sequenced prokaryotic genomes as were available in the year 2000. The economical sequencing of genomic DNA has facilitated a whole new approach to microbial genomics. The real power of genomics is manifested through comparative genomics that can reveal strain specific characteristics, diversity within species and many other aspects. However, comparative genomics is a field not easily entered into by scientists with few computational skills. The CMG-biotools package is designed for microbiologists with limited knowledge of computational analysis and can be used to perform a number of analyses and comparisons of genomic data. The CMG-biotools system presents a stand-alone interface for comparative microbial genomics. The package is a customized operating system, based on Xubuntu 10.10, available through the open source Ubuntu project. The system can be installed on a virtual computer, allowing the user to run the system alongside any other operating system. Source codes for all programs are provided under GNU license, which makes it possible to transfer the programs to other systems if so desired. We here demonstrate the package by comparing and analyzing the diversity within the class Negativicutes, represented by 31 genomes including 10 genera. The analyses include 16S rRNA phylogeny, basic DNA and codon statistics, proteome comparisons using BLAST and graphical analyses of DNA structures. This paper shows the strength and diverse use of the CMG-biotools system. The system can be installed on a wide range of host operating systems and utilizes as much of the host computer as desired. It allows the user to compare multiple genomes, from various sources using standardized data formats and intuitive visualizations of results. The examples presented here clearly show that users with limited computational experience can perform complicated analyses without much training.
Changes in running kinematics, kinetics, and spring-mass behavior over a 24-h run.
Morin, Jean-Benoît; Samozino, Pierre; Millet, Guillaume Y
2011-05-01
This study investigated the changes in running mechanics and spring-mass behavior over a 24-h treadmill run (24TR). Kinematics, kinetics, and spring-mass characteristics of the running step were assessed in 10 experienced ultralong-distance runners before, every 2 h, and after a 24TR using an instrumented treadmill dynamometer. These measurements were performed at 10 km·h(-1), and mechanical parameters were sampled at 1000 Hz for 10 consecutive steps. Contact and aerial times were determined from ground reaction force (GRF) signals and used to compute step frequency. Maximal GRF, loading rate, downward displacement of the center of mass, and leg length change during the support phase were determined and used to compute both vertical and leg stiffness. Subjects' running pattern and spring-mass behavior significantly changed over the 24TR with a 4.9% higher step frequency on average (because of a significant 4.5% shorter contact time), a lower maximal GRF (by 4.4% on average), a 13.0% lower leg length change during contact, and an increase in both leg and vertical stiffness (+9.9% and +8.6% on average, respectively). Most of these changes were significant from the early phase of the 24TR (fourth to sixth hour of running) and may contribute to limiting the potentially harmful consequences of such a long-duration run on subjects' musculoskeletal system. During a 24TR, the changes in running mechanics and spring-mass behavior show a clear shift toward a higher oscillating frequency and stiffness, along with lower GRF and leg length change (hence a reduced overall eccentric load) during the support phase of running. © 2011 by the American College of Sports Medicine
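The step-level quantities named in the abstract follow from simple definitions; the sketch below computes step frequency from contact and aerial times and vertical stiffness as peak GRF over downward COM displacement, using invented numbers rather than the study's measurements.

```python
# Spring-mass step descriptors from ground reaction force data (illustrative inputs).
contact_time = 0.25        # s
aerial_time = 0.11         # s
peak_grf = 1600.0          # N, maximal vertical ground reaction force
com_displacement = 0.055   # m, downward displacement of the centre of mass in stance

step_frequency = 1.0 / (contact_time + aerial_time)    # Hz
vertical_stiffness = peak_grf / com_displacement       # N/m (kvert = Fmax / delta_y)

print(f"step frequency: {step_frequency:.2f} Hz")
print(f"vertical stiffness: {vertical_stiffness / 1000:.1f} kN/m")
```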
Modeling and experimental studies of a side band power re-injection locked magnetron
NASA Astrophysics Data System (ADS)
Ye, Wen-Jun; Zhang, Yi; Yuan, Ping; Zhu, Hua-Cheng; Huang, Ka-Ma; Yang, Yang
2016-12-01
A side band power re-injection locked (SBPRIL) magnetron is presented in this paper. A tuning stub is placed between the external injection locked (EIL) magnetron and the circulator. Side band power of the EIL magnetron is reflected back to the magnetron. The reflected side band power is reused and pulled back to the central frequency. A phase-locking model is developed from circuit theory to explain the process of reuse of side band power in the SBPRIL magnetron. Theoretical analysis proves that the side band power is pulled back to the central frequency of the SBPRIL magnetron; the amplitude of the RF voltage then increases and the phase noise performance improves. Particle-in-cell (PIC) simulation of a 10-vane continuous wave (CW) magnetron model is presented. Computer simulation predicts that the peak of the frequency spectrum of the SBPRIL magnetron increases by 3.25 dB compared with the free-running magnetron. The phase noise at the side band offset is reduced by 12.05 dB for the SBPRIL magnetron. An SBPRIL magnetron experiment is also presented. Experimental results show that the spectrum peak rises by 14.29% for the SBPRIL magnetron compared with the free-running magnetron. The phase noise is reduced by more than 25 dB at a 45-kHz offset compared with the free-running magnetron. Project supported by the National Basic Research Program of China (Grant No. 2013CB328902) and the National Natural Science Foundation of China (Grant No. 61501311).
ERIC Educational Resources Information Center
Shade, Daniel D.
1994-01-01
Provides advice and suggestions for educators or parents who are trying to decide what type of computer to buy to run the latest computer software for children. Suggests that purchasers should buy a computer with as large a hard drive as possible, at least 10 megabytes of RAM, and a CD-ROM drive. (MDM)
Xia, Yidong; Lou, Jialin; Luo, Hong; ...
2015-02-09
Here, an OpenACC directive-based graphics processing unit (GPU) parallel scheme is presented for solving the compressible Navier–Stokes equations on 3D hybrid unstructured grids with a third-order reconstructed discontinuous Galerkin method. The developed scheme requires minimal code intrusion and algorithm alteration to upgrade a legacy solver with GPU computing capability at very little extra programming effort, which leads to a unified and portable code development strategy. A face coloring algorithm is adopted to eliminate the memory contention caused by the threading of internal and boundary face integrals. A number of flow problems are presented to verify the implementation of the developed scheme. Timing measurements were obtained by running the resulting GPU code on one Nvidia Tesla K20c GPU card (Nvidia Corporation, Santa Clara, CA, USA) and compared with those obtained by running the equivalent Message Passing Interface (MPI) parallel CPU code on a compute node (consisting of two AMD Opteron 6128 eight-core CPUs (Advanced Micro Devices, Inc., Sunnyvale, CA, USA)). Speedup factors of up to 24× and 1.6× for the GPU code were achieved with respect to one and 16 CPU cores, respectively. The numerical results indicate that this OpenACC-based parallel scheme is an effective and extensible approach to port unstructured high-order CFD solvers to GPU computing.
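The face-colouring idea mentioned above can be illustrated with a small greedy colouring: faces that share a cell receive different colours, so all faces of one colour can be integrated concurrently without write conflicts. The toy connectivity below is an assumption for illustration, not the paper's mesh or algorithm.

```python
# Greedy face colouring to avoid write contention when threading face integrals.
from collections import defaultdict

# face -> (left cell, right cell); illustrative connectivity only
faces = {0: (0, 1), 1: (1, 2), 2: (2, 3), 3: (3, 0), 4: (0, 2)}

cell_colors = defaultdict(set)   # colours already used by faces touching a cell
face_color = {}
for f, (cl, cr) in faces.items():
    used = cell_colors[cl] | cell_colors[cr]
    c = 0
    while c in used:             # smallest colour not used by either adjacent cell
        c += 1
    face_color[f] = c
    cell_colors[cl].add(c)
    cell_colors[cr].add(c)

print(face_color)  # faces sharing a colour can be processed concurrently
```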
Transonic Drag Prediction Using an Unstructured Multigrid Solver
NASA Technical Reports Server (NTRS)
Mavriplis, D. J.; Levy, David W.
2001-01-01
This paper summarizes the results obtained with the NSU-3D unstructured multigrid solver for the AIAA Drag Prediction Workshop held in Anaheim, CA, June 2001. The test case for the workshop consists of a wing-body configuration at transonic flow conditions. Flow analyses for a complete test matrix of lift coefficient values and Mach numbers at a constant Reynolds number are performed, thus producing a set of drag polars and drag rise curves which are compared with experimental data. Results were obtained independently by both authors using an identical baseline grid and different refined grids. Most cases were run in parallel on commodity cluster-type machines while the largest cases were run on an SGI Origin machine using 128 processors. The objective of this paper is to study the accuracy of the subject unstructured grid solver for predicting drag in the transonic cruise regime, to assess the efficiency of the method in terms of convergence, cpu time, and memory, and to determine the effects of grid resolution on this predictive ability and its computational efficiency. A good predictive ability is demonstrated over a wide range of conditions, although accuracy was found to degrade for cases at higher Mach numbers and lift values where increasing amounts of flow separation occur. The ability to rapidly compute large numbers of cases at varying flow conditions using an unstructured solver on inexpensive clusters of commodity computers is also demonstrated.
Web-Beagle: a web server for the alignment of RNA secondary structures.
Mattei, Eugenio; Pietrosanto, Marco; Ferrè, Fabrizio; Helmer-Citterich, Manuela
2015-07-01
Web-Beagle (http://beagle.bio.uniroma2.it) is a web server for the pairwise global or local alignment of RNA secondary structures. The server exploits a new encoding for RNA secondary structure and a substitution matrix of RNA structural elements to perform RNA structural alignments. The web server allows the user to compute up to 10 000 alignments in a single run, taking as input sets of RNA sequences and structures or primary sequences alone. In the latter case, the server computes the secondary structure prediction for the RNAs on-the-fly using RNAfold (free energy minimization). The user can also compare a set of input RNAs to one of five pre-compiled RNA datasets including lncRNAs and 3' UTRs. All types of comparison produce as output the pairwise alignments along with structural similarity and statistical significance measures for each resulting alignment. A graphical color-coded representation of the alignments allows the user to easily identify structural similarities between RNAs. Web-Beagle can be used for finding structurally related regions in two or more RNAs, for the identification of homologous regions or for functional annotation. Benchmark tests show that Web-Beagle has lower computational complexity and running time, and better performance, than other available methods. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
NASA Astrophysics Data System (ADS)
Bekas, C.; Curioni, A.
2010-06-01
Enforcing the orthogonality of approximate wavefunctions becomes one of the dominant computational kernels in planewave based Density Functional Theory electronic structure calculations that involve thousands of atoms. In this context, algorithms that enjoy both excellent scalability and single processor performance properties are much needed. In this paper we present block versions of the Gram-Schmidt method and we show that they are excellent candidates for our purposes. We compare the new approach with the state of the art practice in planewave based calculations and find that it has much to offer, especially when applied on massively parallel supercomputers such as the IBM Blue Gene/P Supercomputer. The new method achieves excellent sustained performance that surpasses 73 TFLOPS (67% of peak) on 8 Blue Gene/P racks (32 768 compute cores), while it enables more than a twofold decrease in run time when compared with the best competing methodology.
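A minimal NumPy sketch of the block Gram-Schmidt idea follows: each panel of vectors is orthogonalised against the already-orthogonal columns with matrix-matrix products (the operations that scale well), then orthonormalised internally. Block size, problem size, and the single reorthogonalisation pass are choices made here for illustration.

```python
# Block (classical) Gram-Schmidt with one reorthogonalisation pass, using BLAS-3 products.
import numpy as np

def block_gram_schmidt(A, block=4):
    m, n = A.shape
    Q = np.zeros((m, n))
    for j in range(0, n, block):
        panel = A[:, j:j + block].copy()
        if j > 0:
            Qp = Q[:, :j]
            panel -= Qp @ (Qp.T @ panel)     # project out previous blocks
            panel -= Qp @ (Qp.T @ panel)     # reorthogonalise once for stability
        q, _ = np.linalg.qr(panel)           # orthonormalise within the panel
        Q[:, j:j + panel.shape[1]] = q
    return Q

A = np.random.default_rng(1).standard_normal((500, 32))
Q = block_gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(32), atol=1e-10))   # check orthogonality
```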
NASA Technical Reports Server (NTRS)
Poole, L. R.
1974-01-01
A study was conducted of an alternate method for storage and use of bathymetry data in the Langley Research Center and Virginia Institute of Marine Science mid-Atlantic continental-shelf wave-refraction computer program. The regional bathymetry array was divided into 105 indexed modules which can be read individually into memory in a nonsequential manner from a peripheral file using special random-access subroutines. In running a sample refraction case, a 75-percent decrease in program field length was achieved by using the random-access storage method in comparison with the conventional method of total regional array storage. This field-length decrease was accompanied by a comparative 5-percent increase in central processing time and a 477-percent increase in the number of operating-system calls. A comparative Langley Research Center computer system cost savings of 68 percent was achieved by using the random-access storage method.
Martin, Bryan D.; Wolfson, Julian; Adomavicius, Gediminas; Fan, Yingling
2017-01-01
We propose and compare combinations of several methods for classifying transportation activity data from smartphone GPS and accelerometer sensors. We have two main objectives. First, we aim to classify our data as accurately as possible. Second, we aim to reduce the dimensionality of the data as much as possible in order to reduce the computational burden of the classification. We combine dimension reduction and classification algorithms and compare them with a metric that balances accuracy and dimensionality. In doing so, we develop a classification algorithm that accurately classifies five different modes of transportation (i.e., walking, biking, car, bus and rail) while being computationally simple enough to run on a typical smartphone. Further, we use data that required no behavioral changes from the smartphone users to collect. Our best classification model uses the random forest algorithm to achieve 96.8% accuracy. PMID:28885550
Martin, Bryan D; Addona, Vittorio; Wolfson, Julian; Adomavicius, Gediminas; Fan, Yingling
2017-09-08
We propose and compare combinations of several methods for classifying transportation activity data from smartphone GPS and accelerometer sensors. We have two main objectives. First, we aim to classify our data as accurately as possible. Second, we aim to reduce the dimensionality of the data as much as possible in order to reduce the computational burden of the classification. We combine dimension reduction and classification algorithms and compare them with a metric that balances accuracy and dimensionality. In doing so, we develop a classification algorithm that accurately classifies five different modes of transportation (i.e., walking, biking, car, bus and rail) while being computationally simple enough to run on a typical smartphone. Further, we use data that required no behavioral changes from the smartphone users to collect. Our best classification model uses the random forest algorithm to achieve 96.8% accuracy.
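A hedged sketch of the pipeline described in both entries above, combining a dimension-reduction step with a random forest classifier via scikit-learn; the feature matrix and labels are synthetic placeholders, so the score printed will not reproduce the reported 96.8% accuracy.

```python
# Dimension reduction + random forest classification of transport modes (synthetic data).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 60))     # placeholder GPS/accelerometer features
y = rng.integers(0, 5, size=1000)       # 0-4: walk, bike, car, bus, rail (assumed coding)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = make_pipeline(PCA(n_components=10),
                      RandomForestClassifier(n_estimators=200, random_state=0))
model.fit(X_tr, y_tr)
print(f"held-out accuracy: {model.score(X_te, y_te):.3f}")
```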
Geometry Helps to Compare Persistence Diagrams
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kerber, Michael; Morozov, Dmitriy; Nigmetov, Arnur
2015-11-16
Exploiting geometric structure to improve the asymptotic complexity of discrete assignment problems is a well-studied subject. In contrast, the practical advantages of using geometry for such problems have not been explored. We implement geometric variants of the Hopcroft-Karp algorithm for bottleneck matching (based on previous work by Efrat et al.), and of the auction algorithm by Bertsekas for Wasserstein distance computation. Both implementations use k-d trees to replace a linear scan with a geometric proximity query. Our interest in this problem stems from the desire to compute distances between persistence diagrams, a problem that comes up frequently in topological data analysis. We show that our geometric matching algorithms lead to a substantial performance gain, both in running time and in memory consumption, over their purely combinatorial counterparts. Moreover, our implementation significantly outperforms the only other implementation available for comparing persistence diagrams.
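The code below sketches only the geometric kernel mentioned above: using a k-d tree to answer proximity queries between two persistence diagrams (point sets) under the infinity norm, in place of a linear scan. It is not the Hopcroft-Karp or auction machinery, and the random diagrams are placeholders.

```python
# k-d tree proximity query between two persistence diagrams (infinity-norm distances).
import numpy as np
from scipy.spatial import cKDTree

diagram_a = np.random.default_rng(2).uniform(0, 1, size=(200, 2))  # (birth, death) pairs
diagram_b = np.random.default_rng(3).uniform(0, 1, size=(200, 2))

tree = cKDTree(diagram_b)
# nearest candidate in B for every point of A, measured in the Chebyshev (L-inf) norm
dist, idx = tree.query(diagram_a, k=1, p=np.inf)
# largest nearest-neighbour distance from A into B (matches to the diagonal are ignored here)
print(dist.max())
```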
Use of UNIX in large online processor farms
NASA Astrophysics Data System (ADS)
Biel, Joseph R.
1990-08-01
There has been a recent rapid increase in the power of RISC computers running the UNIX operating system. Fermilab has begun to make use of these computers in the next generation of offline computer farms. It is also planning to use such computers in online computer farms. Issues involved in constructing online UNIX farms are discussed.
Optimizing a mobile robot control system using GPU acceleration
NASA Astrophysics Data System (ADS)
Tuck, Nat; McGuinness, Michael; Martin, Fred
2012-01-01
This paper describes our attempt to optimize a robot control program for the Intelligent Ground Vehicle Competition (IGVC) by running computationally intensive portions of the system on a commodity graphics processing unit (GPU). The IGVC Autonomous Challenge requires a control program that performs a number of different computationally intensive tasks ranging from computer vision to path planning. For the 2011 competition our Robot Operating System (ROS) based control system would not run comfortably on the multicore CPU on our custom robot platform. The process of profiling the ROS control program and selecting appropriate modules for porting to run on a GPU is described. A GPU-targeting compiler, Bacon, is used to speed up development and help optimize the ported modules. The impact of the ported modules on overall performance is discussed. We conclude that GPU optimization can free a significant amount of CPU resources with minimal effort for expensive user-written code, but that replacing heavily-optimized library functions is more difficult, and a much less efficient use of time.
NASA Technical Reports Server (NTRS)
Meyer, Donald; Uchenik, Igor
2007-01-01
The PPC750 Performance Monitor (Perfmon) is a computer program that helps the user to assess the performance characteristics of application programs running under the Wind River VxWorks real-time operating system on a PPC750 computer. Perfmon generates a user-friendly interface and collects performance data by use of performance registers provided by the PPC750 architecture. It processes and presents run-time statistics on a per-task basis over a repeating time interval (typically, several seconds or minutes) specified by the user. When the Perfmon software module is loaded with the user's software modules, it is available for use through Perfmon commands, without any modification of the user's code and at negligible performance penalty. Per-task run-time performance data made available by Perfmon include percentage time, number of instructions executed per unit time, dispatch ratio, stack high water mark, and level-1 instruction and data cache miss rates. The performance data are written to a file specified by the user or to the serial port of the computer.
Validation of Magnetic Resonance Thermometry by Computational Fluid Dynamics
NASA Astrophysics Data System (ADS)
Rydquist, Grant; Owkes, Mark; Verhulst, Claire M.; Benson, Michael J.; Vanpoppel, Bret P.; Burton, Sascha; Eaton, John K.; Elkins, Christopher P.
2016-11-01
Magnetic Resonance Thermometry (MRT) is a new experimental technique that can create fully three-dimensional temperature fields in a noninvasive manner. However, validation is still required to determine the accuracy of measured results. One method of examination is to compare data gathered experimentally to data computed with computational fluid dynamics (CFD). In this study, large-eddy simulations have been performed with the NGA computational platform to generate data for a comparison with previously run MRT experiments. The experimental setup consisted of a heated jet inclined at 30° injected into a larger channel. In the simulations, viscosity and density were scaled according to the local temperature to account for differences in buoyant and viscous forces. A mesh-independence study was performed with 5-, 15-, and 45-million-cell meshes. The program Star-CCM+ was used to simulate the complete experimental geometry. This was compared to data generated from NGA. Overall, both programs show good agreement with the experimental data gathered with MRT. With these data, the validity of MRT as a diagnostic tool has been shown and the tool can be used to further our understanding of a range of flows with non-trivial temperature distributions.
NASA Astrophysics Data System (ADS)
Menzel, R.; Paynter, D.; Jones, A. L.
2017-12-01
Due to their relatively low computational cost, radiative transfer models in global climate models (GCMs) run on traditional CPU architectures generally consist of shortwave and longwave parameterizations over a small number of wavelength bands. With the rise of newer GPU and MIC architectures, however, the performance of high resolution line-by-line radiative transfer models may soon approach those of the physical parameterizations currently employed in GCMs. Here we present an analysis of the current performance of a new line-by-line radiative transfer model currently under development at GFDL. Although originally designed to specifically exploit GPU architectures through the use of CUDA, the radiative transfer model has recently been extended to include OpenMP in an effort to also effectively target MIC architectures such as Intel's Xeon Phi. Using input data provided by the upcoming Radiative Forcing Model Intercomparison Project (RFMIP, as part of CMIP 6), we compare model results and performance data for various model configurations and spectral resolutions run on both GPU and Intel Knights Landing architectures to analogous runs of the standard Oxford Reference Forward Model on traditional CPUs.
Volunteer Computing Experience with ATLAS@Home
NASA Astrophysics Data System (ADS)
Adam-Bourdarios, C.; Bianchi, R.; Cameron, D.; Filipčič, A.; Isacchini, G.; Lançon, E.; Wu, W.;
2017-10-01
ATLAS@Home is a volunteer computing project which allows the public to contribute to computing for the ATLAS experiment through their home or office computers. The project has grown continuously since its creation in mid-2014 and now counts almost 100,000 volunteers. The combined volunteers’ resources make up a sizeable fraction of overall resources for ATLAS simulation. This paper takes stock of the experience gained so far and describes the next steps in the evolution of the project. These improvements include running natively on Linux to ease deployment on, for example, university clusters; using multiple cores inside one task to reduce the memory requirements; and running different types of workload, such as event generation. In addition to technical details, the success of ATLAS@Home as an outreach tool is evaluated.
MPI_XSTAR: MPI-based Parallelization of the XSTAR Photoionization Program
NASA Astrophysics Data System (ADS)
Danehkar, Ashkbiz; Nowak, Michael A.; Lee, Julia C.; Smith, Randall K.
2018-02-01
We describe a program for the parallel implementation of multiple runs of XSTAR, a photoionization code that is used to predict the physical properties of an ionized gas from its emission and/or absorption lines. The parallelization program, called MPI_XSTAR, has been developed and implemented in the C++ language by using the Message Passing Interface (MPI) protocol, a conventional standard of parallel computing. We have benchmarked parallel multiprocessing executions of XSTAR, using MPI_XSTAR, against a serial execution of XSTAR, in terms of the parallelization speedup and the computing resource efficiency. Our experience indicates that the parallel execution runs significantly faster than the serial execution; however, the efficiency in terms of computing resource usage decreases as the number of processors used in the parallel computation increases.
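The two benchmark quantities discussed above are simple ratios; the sketch below computes speedup and efficiency from made-up serial and parallel timings (the numbers are illustrative, not the paper's measurements) and shows the efficiency falling as processors are added.

```python
# Speedup S = T_serial / T_parallel and efficiency E = S / n_processors, with assumed timings.
t_serial = 36.0 * 3600                                     # serial run time in seconds (assumed)
parallel_runs = {2: 19.0 * 3600, 8: 5.5 * 3600, 32: 1.9 * 3600}   # assumed parallel timings

for nproc, t_par in parallel_runs.items():
    speedup = t_serial / t_par
    efficiency = speedup / nproc
    print(f"{nproc:3d} procs: speedup {speedup:5.1f}, efficiency {efficiency:.2f}")
```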
NASA Astrophysics Data System (ADS)
Sharma, Diksha; Badal, Andreu; Badano, Aldo
2012-04-01
The computational modeling of medical imaging systems often requires obtaining a large number of simulated images with low statistical uncertainty which translates into prohibitive computing times. We describe a novel hybrid approach for Monte Carlo simulations that maximizes utilization of CPUs and GPUs in modern workstations. We apply the method to the modeling of indirect x-ray detectors using a new and improved version of the code MANTIS, an open source software tool used for the Monte Carlo simulations of indirect x-ray imagers. We first describe a GPU implementation of the physics and geometry models in fastDETECT2 (the optical transport model) and a serial CPU version of the same code. We discuss its new features like on-the-fly column geometry and columnar crosstalk in relation to the MANTIS code, and point out areas where our model provides more flexibility for the modeling of realistic columnar structures in large area detectors. Second, we modify PENELOPE (the open source software package that handles the x-ray and electron transport in MANTIS) to allow direct output of location and energy deposited during x-ray and electron interactions occurring within the scintillator. This information is then handled by optical transport routines in fastDETECT2. A load balancer dynamically allocates optical transport showers to the GPU and CPU computing cores. Our hybridMANTIS approach achieves a significant speed-up factor of 627 when compared to MANTIS and of 35 when compared to the same code running only in a CPU instead of a GPU. Using hybridMANTIS, we successfully hide hours of optical transport time by running it in parallel with the x-ray and electron transport, thus shifting the computational bottleneck from optical to x-ray transport. The new code requires much less memory than MANTIS and, as a result, allows us to efficiently simulate large area detectors.
Fast non-Abelian geometric gates via transitionless quantum driving.
Zhang, J; Kyaw, Thi Ha; Tong, D M; Sjöqvist, Erik; Kwek, Leong-Chuan
2015-12-21
A practical quantum computer must be capable of performing high fidelity quantum gates on a set of quantum bits (qubits). In the presence of noise, the realization of such gates poses daunting challenges. Geometric phases, which possess intrinsic noise-tolerant features, hold the promise for performing robust quantum computation. In particular, quantum holonomies, i.e., non-Abelian geometric phases, naturally lead to universal quantum computation due to their non-commutativity. Although quantum gates based on adiabatic holonomies have already been proposed, the slow evolution eventually compromises qubit coherence and computational power. Here, we propose a general approach to speed up an implementation of adiabatic holonomic gates by using transitionless driving techniques and show how such a universal set of fast geometric quantum gates in a superconducting circuit architecture can be obtained in an all-geometric approach. Compared with standard non-adiabatic holonomic quantum computation, the holonomies obtained in our approach tend asymptotically to those of the adiabatic approach in the long run-time limit and thus might open up a new horizon for realizing a practical quantum computer.
Fast non-Abelian geometric gates via transitionless quantum driving
Zhang, J.; Kyaw, Thi Ha; Tong, D. M.; Sjöqvist, Erik; Kwek, Leong-Chuan
2015-01-01
A practical quantum computer must be capable of performing high fidelity quantum gates on a set of quantum bits (qubits). In the presence of noise, the realization of such gates poses daunting challenges. Geometric phases, which possess intrinsic noise-tolerant features, hold the promise for performing robust quantum computation. In particular, quantum holonomies, i.e., non-Abelian geometric phases, naturally lead to universal quantum computation due to their non-commutativity. Although quantum gates based on adiabatic holonomies have already been proposed, the slow evolution eventually compromises qubit coherence and computational power. Here, we propose a general approach to speed up an implementation of adiabatic holonomic gates by using transitionless driving techniques and show how such a universal set of fast geometric quantum gates in a superconducting circuit architecture can be obtained in an all-geometric approach. Compared with standard non-adiabatic holonomic quantum computation, the holonomies obtained in our approach tend asymptotically to those of the adiabatic approach in the long run-time limit and thus might open up a new horizon for realizing a practical quantum computer. PMID:26687580
Investigation of Small-Caliber Primer Function Using a Multiphase Computational Model
2008-07-01
all solid walls along with specified inflow at the primer orifice (0.102 cm < Y < 0.102 cm at X = 0). Initially, the entire flowfield is filled ... to explicitly treat both the gas and solid phase. The model is based on the One Dimensional Turbulence modeling approach that has recently emerged as ... a powerful tool in multiphase simulations. Initial results are shown for the model run as a stand-alone code and are compared to recent experiments.
Effects of thickness, insulation, and surface color on the net heat loss through an adobe wall
DOE Office of Scientific and Technical Information (OSTI.GOV)
Herman, R.W.
1980-01-01
A finite difference computer program was written and run to study the net thermal losses through a large variety of adobe walls. Fifty-four different combinations of surface color, wall thickness, and insulation position and R value were modeled over a typical two week winter period for locations similar to Albuquerque, New Mexico. A transient analysis of the heat loss from the room to the interior wall surface was compared to both conventional U value and steady-state calculations.
Turbulent Chemical Interaction Models in NCC: Comparison
NASA Technical Reports Server (NTRS)
Norris, Andrew T.; Liu, Nan-Suey
2006-01-01
The performance of a scalar PDF hydrogen-air combustion model in predicting a complex reacting flow is evaluated. In addition, the results are compared to those obtained by running the same case with the so-called laminar chemistry model and with a new model based on the concept of mapping partially stirred reactor data onto perfectly stirred reactor data. The results show that the scalar PDF model produces significantly different results from the other two models, and at a significantly higher computational cost.
Soft tissues store and return mechanical energy in human running.
Riddick, R C; Kuo, A D
2016-02-08
During human running, softer parts of the body may deform under load and dissipate mechanical energy. Although tissues such as the heel pad have been characterized individually, the aggregate work performed by all soft tissues during running is unknown. We therefore estimated the work performed by soft tissues (N=8 healthy adults) at running speeds ranging from 2 to 5 m s(-1), computed as the difference between joint work performed on rigid segments and whole-body estimates of work performed on the (non-rigid) body center of mass (COM) and peripheral to the COM. Soft tissues performed aggregate negative work, with magnitude increasing linearly with speed. The amount was about -19 J per stance phase at a nominal 3 m s(-1), accounting for more than 25% of stance phase negative work performed by the entire body. Fluctuations in soft tissue mechanical power over time resembled a damped oscillation starting at ground contact, with peak negative power comparable to that for the knee joint (about -500 W). Even the positive work from soft tissue rebound was significant, about 13 J per stance phase (about 17% of the positive work of the entire body). Assuming that the net dissipative work is offset by an equal amount of active, positive muscle work performed at 25% efficiency, soft tissue dissipation could account for about 29% of the net metabolic expenditure for running at 5 m s(-1). During running, soft tissue deformations dissipate mechanical energy that must be offset by active muscle work at non-negligible metabolic cost. Copyright © 2016 Elsevier Ltd. All rights reserved.
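Combining the per-stance figures quoted above with the authors' stated 25% muscle-efficiency assumption gives a small worked calculation; the resulting per-stance metabolic figure is illustrative arithmetic, not a value reported in the paper.

```python
# Worked arithmetic from the quoted per-stance values, under the 25% efficiency assumption.
negative_soft_tissue_work = -19.0   # J per stance phase, at ~3 m/s (from the abstract)
positive_rebound_work = 13.0        # J per stance phase (from the abstract)
muscle_efficiency = 0.25            # assumption stated by the authors

net_dissipation = -(negative_soft_tissue_work + positive_rebound_work)   # 6 J lost per stance
metabolic_cost_per_stance = net_dissipation / muscle_efficiency           # metabolic energy needed
print(metabolic_cost_per_stance)    # 24 J per stance to offset soft-tissue losses (illustrative)
```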
Match running performance and fitness in youth soccer.
Buchheit, M; Mendez-Villanueva, A; Simpson, B M; Bourdon, P C
2010-11-01
The activity profiles of highly trained young soccer players were examined in relation to age, playing position and physical capacity. Time-motion analyses (global positioning system) were performed on 77 players (U13-U18; fullbacks [FB], centre-backs [CB], midfielders [MD], wide midfielders [W], second strikers [2ndS] and strikers [S]) during 42 international club games. Total distance covered (TD) and very high-intensity activities (VHIA; >16.1 km·h(-1)) were computed during 186 entire player-matches. Physical capacity was assessed via field test measures (e.g., peak running speed during an incremental field test, VVam-eval). Match running performance showed an increasing trend with age (P<0.001, partial eta-squared (η²): 0.20-0.45). When adjusted for age and individual playing time, match running performance was position-dependent (P<0.001, η²: 0.13-0.40). MD covered the greatest TD; CB the lowest (P<0.05). Distance for VHIA was lower for CB compared with all other positions (P<0.05); W and S displayed the highest VHIA (P<0.05). Relationships between match running performance and physical capacities were position-dependent, with poor or non-significant correlations within FB, CB, MD and W (e.g., VHIA vs. VVam-eval: R=0.06 in FB) but large associations within 2ndS and S positions (e.g., VHIA vs. VVam-eval: R=0.70 in 2ndS). In highly trained young soccer players, the importance of fitness level as a determinant of match running performance should be regarded as a function of playing position.
Garcia-Vicente, Ana María; Molina, David; Pérez-Beteta, Julián; Amo-Salas, Mariano; Martínez-González, Alicia; Bueno, Gloria; Tello-Galán, María Jesús; Soriano-Castrejón, Ángel
2017-12-01
To study the influence of dual time point 18F-FDG PET/CT on textural features and SUV-based variables, and the relations among them. Fifty-six patients with locally advanced breast cancer (LABC) were prospectively included. All of them underwent a standard 18F-FDG PET/CT (PET-1) and a delayed acquisition (PET-2). After segmentation, SUV variables (SUVmax, SUVmean, and SUVpeak), metabolic tumor volume (MTV), and total lesion glycolysis (TLG) were obtained. Eighteen three-dimensional (3D) textural measures were computed, including run-length matrix (RLM) features, co-occurrence matrix (CM) features, and energies. Differences between all PET-derived variables obtained in PET-1 and PET-2 were studied. Significant differences were found between the SUV-based parameters and MTV obtained in the dual time point PET/CT, with higher values of SUV-based variables and lower MTV in PET-2 with respect to PET-1. In relation to the textural parameters obtained in the dual time point acquisition, significant differences were found for the short run emphasis, low gray-level run emphasis, short run high gray-level emphasis, run percentage, long run emphasis, gray-level non-uniformity, homogeneity, and dissimilarity. Textural variables showed relations with MTV and TLG. Significant differences in textural features were found in dual time point 18F-FDG PET/CT. Thus, a dynamic behavior of metabolic characteristics should be expected, with higher heterogeneity in the delayed PET acquisition compared with the standard PET. Greater heterogeneity was found in bigger tumors.
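One of the run-length features listed above, short run emphasis, can be sketched for a tiny 2D image using horizontal runs only; a real 3D PET implementation would build the full grey-level run-length matrix over several directions, so this is a simplified illustration.

```python
# Simplified short run emphasis (SRE) from horizontal runs of a small 2D image.
import numpy as np

def short_run_emphasis(img):
    runs = []                                    # (gray level, run length) pairs
    for row in img:
        start = 0
        for i in range(1, len(row) + 1):
            if i == len(row) or row[i] != row[start]:
                runs.append((row[start], i - start))
                start = i
    lengths = np.array([length for _, length in runs], dtype=float)
    return np.sum(1.0 / lengths**2) / len(runs)  # weights short runs most heavily

img = np.array([[1, 1, 2, 2, 2],
                [3, 3, 3, 1, 2]])
print(short_run_emphasis(img))
```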
Adapting NBODY4 with a GRAPE-6a Supercomputer for Web Access, Using NBodyLab
NASA Astrophysics Data System (ADS)
Johnson, V.; Aarseth, S.
2006-07-01
A demonstration site has been developed by the authors that enables researchers and students to experiment with the capabilities and performance of NBODY4 running on a GRAPE-6a over the web. NBODY4 is a sophisticated open-source N-body code for high accuracy simulations of dense stellar systems (Aarseth 2003). In 2004, NBODY4 was successfully tested with a GRAPE-6a, yielding an unprecedented low-cost tool for astrophysical research. The GRAPE-6a is a supercomputer card developed by astrophysicists to accelerate high accuracy N-body simulations with a cluster or a desktop PC (Fukushige et al. 2005, Makino & Taiji 1998). The GRAPE-6a card became commercially available in 2004, runs at 125 Gflops peak, has a standard PCI interface and costs less than $10,000. Researchers running the widely used NBODY6 (which does not require GRAPE hardware) can compare their own PC or laptop performance with simulations run on http://www.NbodyLab.org. Such comparisons may help justify acquisition of a GRAPE-6a. For workgroups such as university physics or astronomy departments, the demonstration site may be replicated or serve as a model for a shared computing resource. The site was constructed using an NBodyLab server-side framework.
Power Analysis of an Enterprise Wireless Communication Architecture
2017-09-01
easily plug a satellite-based communication module into the enterprise processor when needed. Once plugged-in, it automatically runs the corresponding...reduce the SWaP by using a singular processing/computing module to run user applications and to implement waveform algorithms. This approach would...GPP) technology improved enough to allow a wide variety of waveforms to run in the GPP; thus giving rise to the SDR (Brannon 2004). Today’s
Data intensive computing at Sandia.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wilson, Andrew T.
2010-09-01
Data-Intensive Computing is parallel computing where you design your algorithms and your software around efficient access and traversal of a data set, and where hardware requirements are dictated by data size as much as by desired run times, usually with the goal of distilling compact results from massive data.
SFU Hacking for Non-Hackers v. 1.005
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carter, David James
The program provides a limited virtual environment for exploring some concepts of computer hacking. It simulates a simple computer system with intentional vulnerabilities, allowing the user to issue commands and observe their results. It does not affect the computer on which it runs.
GPU-Accelerated Voxelwise Hepatic Perfusion Quantification
Wang, H; Cao, Y
2012-01-01
Voxelwise quantification of hepatic perfusion parameters from dynamic contrast enhanced (DCE) imaging greatly contributes to assessment of liver function in response to radiation therapy. However, the efficiency of the estimation of hepatic perfusion parameters voxel-by-voxel in the whole liver using a dual-input single-compartment model requires substantial improvement for routine clinical applications. In this paper, we utilize the parallel computation power of a graphics processing unit (GPU) to accelerate the computation, while maintaining the same accuracy as the conventional method. Using CUDA-GPU, the hepatic perfusion computations over multiple voxels are run across the GPU blocks concurrently but independently. At each voxel, non-linear least squares fitting of the time series of the liver DCE data to the compartmental model is distributed to multiple threads in a block, and the computations of different time points are performed simultaneously and synchronously. An efficient fast Fourier transform in a block is also developed for the convolution computation in the model. The GPU computations of the voxel-by-voxel hepatic perfusion images are compared with ones by the CPU using the simulated DCE data and the experimental DCE MR images from patients. The computation speed is improved by 30 times using a NVIDIA Tesla C2050 GPU compared to a 2.67 GHz Intel Xeon CPU processor. To obtain liver perfusion maps with 626,400 voxels in a patient's liver, it takes 0.9 min with the GPU-accelerated voxelwise computation, compared to 110 min with the CPU, while both methods yield perfusion parameter differences of less than 10^-6. The method will be useful for generating liver perfusion images in clinical settings. PMID:22892645
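The serial, per-voxel structure that the GPU version parallelises can be sketched with a generic saturating-uptake model fitted by non-linear least squares; the model form, parameters, and synthetic curves below are placeholders for the paper's dual-input single-compartment model.

```python
# Serial per-voxel curve fitting sketch (the loop a GPU implementation would distribute).
import numpy as np
from scipy.optimize import curve_fit

t = np.linspace(0, 120, 60)                    # acquisition times, s (assumed)

def uptake(t, amplitude, rate):
    # simple saturating-uptake curve standing in for the compartmental model
    return amplitude * (1.0 - np.exp(-rate * t))

rng = np.random.default_rng(0)
true = (1.5, 0.03)
voxels = uptake(t, *true) + 0.05 * rng.standard_normal((100, t.size))  # 100 synthetic voxel curves

params = np.empty((voxels.shape[0], 2))
for i, curve in enumerate(voxels):             # one independent fit per voxel
    params[i], _ = curve_fit(uptake, t, curve, p0=(1.0, 0.01))
print(params.mean(axis=0))                     # should be close to (1.5, 0.03)
```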
Improving the Efficiency of Free Energy Calculations in the Amber Molecular Dynamics Package.
Kaus, Joseph W; Pierce, Levi T; Walker, Ross C; McCammon, J Andrew
2013-09-10
Alchemical transformations are widely used methods to calculate free energies. Amber has traditionally included support for alchemical transformations as part of the sander molecular dynamics (MD) engine. Here we describe the implementation of a more efficient approach to alchemical transformations in the Amber MD package. Specifically, we have implemented this new approach within the more computationally efficient and scalable pmemd MD engine that is included with the Amber MD package. The majority of the gain in efficiency comes from the improved design of the calculation, which includes better parallel scaling and reduction in the calculation of redundant terms. This new implementation is able to reproduce results from equivalent simulations run with the existing functionality, but at 2.5 times greater computational efficiency. This new implementation is also able to run softcore simulations at the λ end states, making direct calculation of free energies more accurate compared to the extrapolation required in the existing implementation. The updated alchemical transformation functionality will be included in the next major release of Amber (scheduled for release in Q1 2014) and will be available at http://ambermd.org, under the Amber license.
The Mayak Worker Dosimetry System (MWDS-2013): Implementation of the Dose Calculations.
Zhdanov, А; Vostrotin, V; Efimov, А; Birchall, A; Puncher, M
2016-07-15
The calculation of internal doses for the Mayak Worker Dosimetry System (MWDS-2013) involved extensive computational resources due to the complexity and sheer number of calculations required. The required output consisted of a set of 1000 hyper-realizations: each hyper-realization consists of a set (1 for each worker) of probability distributions of organ doses. This report describes the hardware components and computational approaches required to make the calculation tractable. Together with the software, this system is referred to here as the 'PANDORA system'. It is based on a commercial SQL server database running on a series of six workstations. A complete run of the entire Mayak worker cohort entailed a huge number of calculations in PANDORA, and due to the relatively slow speed of writing the data into the SQL server, each run took about 47 days. Quality control was monitored by comparing doses calculated in PANDORA with those in a specially modified version of the commercial software 'IMBA Professional Plus'. Suggestions are also made for increasing calculation and storage efficiency for future dosimetry calculations using PANDORA. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Simulated Raman Spectral Analysis of Organic Molecules
NASA Astrophysics Data System (ADS)
Lu, Lu
The advent of laser technology in the 1960s solved the main difficulty of Raman spectroscopy, resulting in simplified Raman spectroscopy instruments and boosting the sensitivity of the technique. Today, Raman spectroscopy is commonly used in chemistry and biology. As vibrational information is specific to the chemical bonds, Raman spectroscopy provides fingerprints to identify the types of molecules in a sample. In this thesis, we simulate the Raman spectra of organic and inorganic materials with the General Atomic and Molecular Electronic Structure System (GAMESS) and Gaussian, two computational codes that perform general chemistry calculations. We run these codes on our CPU-based high-performance cluster (HPC). Through the message passing interface (MPI), a standardized and portable message-passing system which allows the codes to run in parallel, we are able to decrease the computation time and increase the sizes and capacities of the systems simulated by the codes. From our simulations, we will set up a database that allows a search algorithm to quickly identify N-H and O-H bonds in different materials. Our ultimate goal is to analyze and identify the spectra of organic matter from meteorites and to compare these spectra with those of terrestrial, biologically produced amino acids and residues.
Improving the Efficiency of Free Energy Calculations in the Amber Molecular Dynamics Package
Pierce, Levi T.; Walker, Ross C.; McCammon, J. Andrew
2013-01-01
Alchemical transformations are widely used methods to calculate free energies. Amber has traditionally included support for alchemical transformations as part of the sander molecular dynamics (MD) engine. Here we describe the implementation of a more efficient approach to alchemical transformations in the Amber MD package. Specifically, we have implemented this new approach within the more computationally efficient and scalable pmemd MD engine that is included with the Amber MD package. The majority of the gain in efficiency comes from the improved design of the calculation, which includes better parallel scaling and reduction in the calculation of redundant terms. This new implementation is able to reproduce results from equivalent simulations run with the existing functionality, but at 2.5 times greater computational efficiency. This new implementation is also able to run softcore simulations at the λ end states, making direct calculation of free energies more accurate compared to the extrapolation required in the existing implementation. The updated alchemical transformation functionality will be included in the next major release of Amber (scheduled for release in Q1 2014) and will be available at http://ambermd.org, under the Amber license. PMID:24185531
Modeling disease transmission near eradication: An equation free approach
NASA Astrophysics Data System (ADS)
Williams, Matthew O.; Proctor, Joshua L.; Kutz, J. Nathan
2015-01-01
Although disease transmission in the near eradication regime is inherently stochastic, deterministic quantities such as the probability of eradication are of interest to policy makers and researchers. Rather than running large ensembles of discrete stochastic simulations over long intervals in time to compute these deterministic quantities, we create a data-driven and deterministic "coarse" model for them using the Equation Free (EF) framework. In lieu of deriving an explicit coarse model, the EF framework approximates any needed information, such as coarse time derivatives, by running short computational experiments. However, the choice of the coarse variables (i.e., the state of the coarse system) is critical if the resulting model is to be accurate. In this manuscript, we propose a set of coarse variables that result in an accurate model in the endemic and near eradication regimes, and demonstrate this on a compartmental model representing the spread of Poliomyelitis. When combined with adaptive time-stepping coarse projective integrators, this approach can yield over a factor of two speedup compared to direct simulation, and due to its lower dimensionality, could be beneficial when conducting systems level tasks such as designing eradication or monitoring campaigns.
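A minimal sketch of coarse projective integration in the equation-free spirit: run a short burst of a fine-scale model, estimate the coarse time derivative from the burst, and take a large extrapolation step. The toy SIR-like ODE and step sizes below are assumptions standing in for the paper's stochastic disease simulator.

```python
# Coarse projective integration sketch with a cheap stand-in for the fine-scale simulator.
import numpy as np

def fine_burst(x, dt, nsteps, beta=0.3, gamma=0.1):
    """Stand-in for an ensemble of short fine-scale simulations (toy SIR-like dynamics)."""
    s, i = x
    for _ in range(nsteps):
        ds = -beta * s * i
        di = beta * s * i - gamma * i
        s, i = s + dt * ds, i + dt * di
    return np.array([s, i])

x = np.array([0.99, 0.01])          # coarse variables: susceptible and infected fractions
dt, nsteps, big_step = 0.01, 20, 2.0
for _ in range(50):
    burst_end = fine_burst(x, dt, nsteps)
    coarse_slope = (burst_end - x) / (dt * nsteps)   # estimated coarse time derivative
    x = burst_end + big_step * coarse_slope          # projective (extrapolation) step
print(x)
```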
Development of a Computing Cluster At the University of Richmond
NASA Astrophysics Data System (ADS)
Carbonneau, J.; Gilfoyle, G. P.; Bunn, E. F.
2010-11-01
The University of Richmond has developed a computing cluster to support the massive simulation and data analysis requirements for programs in intermediate-energy nuclear physics and cosmology. It is a 20-node, 240-core system running Red Hat Enterprise Linux 5. We have built and installed the physics software packages (Geant4, gemc, MADmap...) and developed shell and Perl scripts for running those programs on the remote nodes. The system has a theoretical processing peak of about 2500 GFLOPS. Testing with the High Performance Linpack (HPL) benchmarking program (one of the standard benchmarks used by the TOP500 list of fastest supercomputers) resulted in speeds of over 900 GFLOPS. The difference between the maximum and measured speeds is due to limitations in the communication speed among the nodes, which creates a bottleneck for large-memory problems. As HPL sends data between nodes, the gigabit Ethernet connection cannot keep up with the processing power. We will show how both the theoretical and actual performance of the cluster compares with other current and past clusters, as well as the cost per GFLOP. We will also examine the scaling of the performance when distributed to increasing numbers of nodes.
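A theoretical peak of the order quoted above follows from nodes x cores x clock x FLOPs per cycle; the clock rate and FLOPs-per-cycle values below are assumptions chosen only to show the arithmetic.

```python
# Back-of-envelope theoretical peak for a 20-node, 240-core cluster (assumed clock and FLOPs/cycle).
nodes = 20
cores_per_node = 12
clock_ghz = 2.66          # assumed core clock
flops_per_cycle = 4       # assumed (e.g. SSE: 2 adds + 2 multiplies per cycle)

peak_gflops = nodes * cores_per_node * clock_ghz * flops_per_cycle
print(peak_gflops)        # roughly 2550 GFLOPS, i.e. "about 2500 GFLOPS"
```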
pFlogger: The Parallel Fortran Logging Utility
NASA Technical Reports Server (NTRS)
Clune, Tom; Cruz, Carlos A.
2017-01-01
In the context of high performance computing (HPC), software investments in support of text-based diagnostics, which monitor a running application, are typically limited compared to those for other types of IO. Examples of such diagnostics include reiteration of configuration parameters, progress indicators, simple metrics (e.g., mass conservation, convergence of solvers, etc.), and timers. To some degree, this difference in priority is justifiable as other forms of output are the primary products of a scientific model and, due to their large data volume, much more likely to be a significant performance concern. In contrast, text-based diagnostic content is generally not shared beyond the individual or group running an application and is most often used to troubleshoot when something goes wrong. We suggest that a more systematic approach enabled by a logging facility (or 'logger') similar to those routinely used by many communities would provide significant value to complex scientific applications. In the context of high-performance computing, an appropriate logger would provide specialized support for distributed and shared-memory parallelism and have low performance overhead. In this paper, we present our prototype implementation of pFlogger, a parallel Fortran-based logging framework, and assess its suitability for use in a complex scientific application.
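As a hedged illustration of what such a parallel-aware logger provides (pFlogger itself is Fortran and its API is not shown here), the sketch below tags messages with the MPI rank and keeps INFO-level chatter to the root rank, using Python's logging module and mpi4py.

```python
# Minimal rank-aware logger sketch: rank-tagged messages, INFO restricted to rank 0.
import logging
from mpi4py import MPI

def get_parallel_logger(name="model"):
    rank = MPI.COMM_WORLD.Get_rank()
    logger = logging.getLogger(name)
    logger.propagate = False                      # avoid duplicate output via the root logger
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(f"[rank {rank:04d}] %(levelname)s: %(message)s"))
    logger.addHandler(handler)
    # keep routine INFO chatter on the root rank; warnings and errors from every rank
    logger.setLevel(logging.INFO if rank == 0 else logging.WARNING)
    return logger

log = get_parallel_logger()
log.info("solver converged in 42 iterations")            # printed by rank 0 only
log.warning("mass conservation drift above tolerance")   # printed by every rank
```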
Working research codes into fluid dynamics education: a science gateway approach
NASA Astrophysics Data System (ADS)
Mason, Lachlan; Hetherington, James; O'Reilly, Martin; Yong, May; Jersakova, Radka; Grieve, Stuart; Perez-Suarez, David; Klapaukh, Roman; Craster, Richard V.; Matar, Omar K.
2017-11-01
Research codes are effective for illustrating complex concepts in educational fluid dynamics courses: compared to textbook examples, an interactive three-dimensional visualisation can bring a problem to life! Various barriers, however, prevent the adoption of research codes in teaching: codes are typically created for highly-specific `once-off' calculations and, as such, have no user interface and a steep learning curve. Moreover, a code may require access to high-performance computing resources that are not readily available in the classroom. This project allows academics to rapidly work research codes into their teaching via a minimalist `science gateway' framework. The gateway is a simple, yet flexible, web interface allowing students to construct and run simulations, as well as view and share their output. Behind the scenes, the common operations of job configuration, submission, monitoring and post-processing are customisable at the level of shell scripting. In this talk, we demonstrate the creation of an example teaching gateway connected to the Code BLUE fluid dynamics software. Student simulations can be run via a third-party cloud computing provider or a local high-performance cluster. EPSRC, UK, MEMPHIS program Grant (EP/K003976/1), RAEng Research Chair (OKM).
Virtual network computing: cross-platform remote display and collaboration software.
Konerding, D E
1999-04-01
VNC (Virtual Network Computing) is a computer program written to address the problem of cross-platform remote desktop/application display. VNC uses a client/server model in which an image of the desktop of the server is transmitted to the client and displayed. The client collects mouse and keyboard input from the user and transmits them back to the server. The VNC client and server can run on Windows 95/98/NT, MacOS, and Unix (including Linux) operating systems. VNC is multi-user on Unix machines (any number of servers can be run, and they are unrelated to the primary display of the computer), while it is effectively single-user on Macintosh and Windows machines (only one server can be run, displaying the contents of the primary display of the server). The VNC servers can be configured to allow more than one client to connect at one time, effectively allowing collaboration through the shared desktop. I describe the function of VNC, provide details of installation, describe how it achieves its goal, and evaluate the use of VNC for molecular modelling. VNC is an extremely useful tool for collaboration, instruction, software development, and debugging of graphical programs with remote users.
Injecting Artificial Memory Errors Into a Running Computer Program
NASA Technical Reports Server (NTRS)
Bornstein, Benjamin J.; Granat, Robert A.; Wagstaff, Kiri L.
2008-01-01
Single-event upsets (SEUs) or bitflips are computer memory errors caused by radiation. BITFLIPS (Basic Instrumentation Tool for Fault Localized Injection of Probabilistic SEUs) is a computer program that deliberately injects SEUs into another computer program, while the latter is running, for the purpose of evaluating the fault tolerance of that program. BITFLIPS was written as a plug-in extension of the open-source Valgrind debugging and profiling software. BITFLIPS can inject SEUs into any program that can be run on the Linux operating system, without needing to modify the program's source code. Further, if access to the original program source code is available, BITFLIPS offers fine-grained control over exactly when and which areas of memory (as specified via program variables) will be subjected to SEUs. The rate of injection of SEUs is controlled by specifying either a fault probability or a fault rate based on memory size and radiation exposure time, in units of SEUs per byte per second. BITFLIPS can also log each SEU that it injects and, if program source code is available, report the magnitude of effect of the SEU on a floating-point value or other program variable.
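The basic injection idea, stripped of the Valgrind plumbing, can be sketched by flipping random bits of an in-memory array with a given per-byte probability; the function below is a hypothetical illustration, not the BITFLIPS mechanism.

```python
# Toy SEU injector: flip a random bit of each upset byte of a float64 array.
import numpy as np

def inject_seu(arr, prob_per_byte=1e-7, rng=np.random.default_rng()):
    view = arr.view(np.uint8)                       # raw bytes of the array
    hits = rng.random(view.size) < prob_per_byte    # which bytes suffer an upset
    for byte_index in np.flatnonzero(hits):
        bit = rng.integers(0, 8)
        view[byte_index] ^= np.uint8(1 << bit)      # flip one bit in that byte
    return int(hits.sum())

data = np.ones(1_000_000)
n_flips = inject_seu(data, prob_per_byte=1e-6)
print(n_flips, np.flatnonzero(data != 1.0))         # number of upsets and affected elements
```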
Recent Performance Results of VPIC on Trinity
NASA Astrophysics Data System (ADS)
Nystrom, W. D.; Bergen, B.; Bird, R. F.; Bowers, K. J.; Daughton, W. S.; Guo, F.; Le, A.; Li, H.; Nam, H.; Pang, X.; Stark, D. J.; Rust, W. N., III; Yin, L.; Albright, B. J.
2017-10-01
Trinity is a new DOE compute resource now in production at Los Alamos National Laboratory. Trinity has several new and unique features including two compute partitions, one with dual socket Intel Haswell Xeon compute nodes and one with Intel Knights Landing (KNL) Xeon Phi compute nodes, use of on package high bandwidth memory (HBM) for KNL nodes, ability to configure KNL nodes with respect to HBM model and on die network topology in a variety of operational modes at run time, and use of solid state storage via burst buffer technology to reduce time required to perform I/O. An effort is in progress to optimize VPIC on Trinity by taking advantage of these new architectural features. Results of work will be presented on performance of VPIC on Haswell and KNL partitions for single node runs and runs at scale. Results include use of burst buffers at scale to optimize I/O, comparison of strategies for using MPI and threads, performance benefits using HBM and effectiveness of using intrinsics for vectorization. Work performed under auspices of U.S. Dept. of Energy by Los Alamos National Security, LLC Los Alamos National Laboratory under contract DE-AC52-06NA25396 and supported by LANL LDRD program.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kendon, Viv
2014-12-04
Quantum versions of random walks have diverse applications that are motivating experimental implementations as well as theoretical studies. Recent results showing quantum walks are “universal for quantum computation” relate to algorithms to be run on quantum computers. We consider whether an experimental implementation of a quantum walk could provide useful computation before we have a universal quantum computer.
29 CFR 102.111 - Time computation.
Code of Federal Regulations, 2010 CFR
2010-07-01
... 29 Labor 2 2010-07-01 2010-07-01 false Time computation. 102.111 Section 102.111 Labor Regulations... Papers § 102.111 Time computation. (a) In computing any period of time prescribed or allowed by these rules, the day of the act, event, or default after which the designated period of time begins to run is...
29 CFR 102.111 - Time computation.
Code of Federal Regulations, 2014 CFR
2014-07-01
... 29 Labor 2 2014-07-01 2014-07-01 false Time computation. 102.111 Section 102.111 Labor Regulations... Papers § 102.111 Time computation. (a) In computing any period of time prescribed or allowed by these rules, the day of the act, event, or default after which the designated period of time begins to run is...
29 CFR 102.111 - Time computation.
Code of Federal Regulations, 2012 CFR
2012-07-01
... 29 Labor 2 2012-07-01 2012-07-01 false Time computation. 102.111 Section 102.111 Labor Regulations... Papers § 102.111 Time computation. (a) In computing any period of time prescribed or allowed by these rules, the day of the act, event, or default after which the designated period of time begins to run is...
29 CFR 102.111 - Time computation.
Code of Federal Regulations, 2013 CFR
2013-07-01
... 29 Labor 2 2013-07-01 2013-07-01 false Time computation. 102.111 Section 102.111 Labor Regulations... Papers § 102.111 Time computation. (a) In computing any period of time prescribed or allowed by these rules, the day of the act, event, or default after which the designated period of time begins to run is...
Non-volatile memory for checkpoint storage
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blumrich, Matthias A.; Chen, Dong; Cipolla, Thomas M.
A system, method and computer program product for supporting system initiated checkpoints in high performance parallel computing systems and storing of checkpoint data to a non-volatile memory storage device. The system and method generates selective control signals to perform checkpointing of system related data in presence of messaging activity associated with a user application running at the node. The checkpointing is initiated by the system such that checkpoint data of a plurality of network nodes may be obtained even in the presence of user applications running on highly parallel computers that include ongoing user messaging activity. In one embodiment, the non-volatile memory is a pluggable flash memory card.
Integration of High-Performance Computing into Cloud Computing Services
NASA Astrophysics Data System (ADS)
Vouk, Mladen A.; Sills, Eric; Dreher, Patrick
High-Performance Computing (HPC) projects span a spectrum of computer hardware implementations ranging from peta-flop supercomputers, high-end tera-flop facilities running a variety of operating systems and applications, to mid-range and smaller computational clusters used for HPC application development, pilot runs and prototype staging clusters. What they all have in common is that they operate as a stand-alone system rather than a scalable and shared user re-configurable resource. The advent of cloud computing has changed the traditional HPC implementation. In this article, we will discuss a very successful production-level architecture and policy framework for supporting HPC services within a more general cloud computing infrastructure. This integrated environment, called Virtual Computing Lab (VCL), has been operating at NC State since fall 2004. Nearly 8,500,000 HPC CPU-Hrs were delivered by this environment to NC State faculty and students during 2009. In addition, we present and discuss operational data that show that integration of HPC and non-HPC (or general VCL) services in a cloud can substantially reduce the cost of delivering cloud services (down to cents per CPU hour).
Control of the TSU 2-m automatic telescope
NASA Astrophysics Data System (ADS)
Eaton, Joel A.; Williamson, Michael H.
2004-09-01
Tennessee State University is operating a 2-m automatic telescope for high-dispersion spectroscopy. The alt-azimuth telescope is fiber-coupled to a conventional echelle spectrograph with two resolutions (R=30,000 and 70,000). We control this instrument with four computers running linux and communicating over ethernet through the UDP protocol. A computer physically located on the telescope handles the acquisition and tracking of stars. We avoid the need for real-time programming in this application by periodically latching the positions of the axes in a commercial motion controller and the time in a GPS receiver. A second (spectrograph) computer sets up the spectrograph and runs its CCD, a third (roof) computer controls the roll-off roof and front flap of the telescope enclosure, and the fourth (executive) computer makes decisions about which stars to observe and when to close the observatory for bad weather. The only human intervention in the telescope's operation involves changing the observing program, copying data back to TSU, and running quality-control checks on the data. It has been running reliably in this completely automatic, unattended mode for more than a year with all day-to-day administration carried out over the Internet. To support automatic operation, we have written a number of useful tools to predict and analyze what the telescope does. These include a simulator that predicts roughly how the telescope will operate on a given night, a quality-control program to parse logfiles from the telescope and identify problems, and a rescheduling program that calculates new priorities to keep the frequency of observation for the various stars roughly as desired. We have also set up a database to keep track of the tens of thousands of spectra we expect to get each year.
Implementation of an Intelligent Control System
1992-05-01
therefore implemented in a portable equipment rack. The controls computer consists of a microcomputer running a real-time operating system; interface...circuit boards are mounted in an industry-standard Multibus I chassis. The microcomputer runs the iRMX real-time operating system. This operating system
10 CFR 205.5 - Computation of time.
Code of Federal Regulations, 2010 CFR
2010-01-01
..., event, or default from which the designated period of time begins to run is not to be included. The last... holiday in which event the period runs until the end of the next day that is neither a Saturday, Sunday... be added to the prescribed period. ...
10 CFR 205.5 - Computation of time.
Code of Federal Regulations, 2013 CFR
2013-01-01
..., event, or default from which the designated period of time begins to run is not to be included. The last... holiday in which event the period runs until the end of the next day that is neither a Saturday, Sunday... be added to the prescribed period. ...
10 CFR 205.5 - Computation of time.
Code of Federal Regulations, 2012 CFR
2012-01-01
..., event, or default from which the designated period of time begins to run is not to be included. The last... holiday in which event the period runs until the end of the next day that is neither a Saturday, Sunday... be added to the prescribed period. ...
10 CFR 205.5 - Computation of time.
Code of Federal Regulations, 2011 CFR
2011-01-01
..., event, or default from which the designated period of time begins to run is not to be included. The last... holiday in which event the period runs until the end of the next day that is neither a Saturday, Sunday... be added to the prescribed period. ...
10 CFR 205.5 - Computation of time.
Code of Federal Regulations, 2014 CFR
2014-01-01
..., event, or default from which the designated period of time begins to run is not to be included. The last... holiday in which event the period runs until the end of the next day that is neither a Saturday, Sunday... be added to the prescribed period. ...
DOT National Transportation Integrated Search
1995-09-05
The Run-Off-Road Collision Avoidance Using IVHS Countermeasures program is intended to address the single-vehicle crash problem through the application of technology to prevent and/or reduce the severity of these crashes. This report documents the RORSIM comput...
A web-based computer aided system for liver surgery planning: initial implementation on RayPlus
NASA Astrophysics Data System (ADS)
Luo, Ming; Yuan, Rong; Sun, Zhi; Li, Tianhong; Xie, Qingguo
2016-03-01
At present, computer aided systems for liver surgery design and risk evaluation are widely used in clinical practice all over the world. However, most systems are local applications that run on high-performance workstations, and the images have to be processed offline. Compared with local applications, a web-based system is accessible anywhere, regardless of relative processing power or operating system. RayPlus (http://rayplus.life.hust.edu.cn), a B/S platform for medical image processing, was developed to give a jump start on web-based medical image processing. In this paper, we implement a computer aided system for liver surgery planning on the architecture of RayPlus. The system consists of a series of processing steps applied to CT images, including filtering, segmentation, visualization and analysis. Each processing step is packaged into an executable program and runs on the server side. CT images in DICOM format are processed step by step, up to interactive modeling in the browser, with zero installation and server-side computing. The system supports users to semi-automatically segment the liver, intrahepatic vessel and tumor from the pre-processed images. Then, surface and volume models are built to analyze the vessel structure and the relative position between adjacent organs. The results show that the initial implementation meets its first-order objectives satisfactorily and provides an accurate 3D delineation of the liver anatomy. Vessel labeling and resection simulation are planned to be added in the future. The system is available on the Internet at the link mentioned above, and an open username for testing is offered.
Pedretti, Kevin
2008-11-18
A compute processor allocator architecture for allocating compute processors to run applications in a multiple processor computing apparatus is distributed among a subset of processors within the computing apparatus. Each processor of the subset includes a compute processor allocator. The compute processor allocators can share a common database of information pertinent to compute processor allocation. A communication path permits retrieval of information from the database independently of the compute processor allocators.
Supercomputers ready for use as discovery machines for neuroscience.
Helias, Moritz; Kunkel, Susanne; Masumoto, Gen; Igarashi, Jun; Eppler, Jochen Martin; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus
2012-01-01
NEST is a widely used tool to simulate biological spiking neural networks. Here we explain the improvements, guided by a mathematical model of memory consumption, that enable us to exploit for the first time the computational power of the K supercomputer for neuroscience. Multi-threaded components for wiring and simulation combine 8 cores per MPI process to achieve excellent scaling. K is capable of simulating networks corresponding to a brain area with 10(8) neurons and 10(12) synapses in the worst case scenario of random connectivity; for larger networks of the brain its hierarchical organization can be exploited to constrain the number of communicating computer nodes. We discuss the limits of the software technology, comparing maximum filling scaling plots for K and the JUGENE BG/P system. The usability of these machines for network simulations has become comparable to running simulations on a single PC. Turn-around times in the range of minutes even for the largest systems enable a quasi interactive working style and render simulations on this scale a practical tool for computational neuroscience.
Supercomputers Ready for Use as Discovery Machines for Neuroscience
Helias, Moritz; Kunkel, Susanne; Masumoto, Gen; Igarashi, Jun; Eppler, Jochen Martin; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus
2012-01-01
NEST is a widely used tool to simulate biological spiking neural networks. Here we explain the improvements, guided by a mathematical model of memory consumption, that enable us to exploit for the first time the computational power of the K supercomputer for neuroscience. Multi-threaded components for wiring and simulation combine 8 cores per MPI process to achieve excellent scaling. K is capable of simulating networks corresponding to a brain area with 10^8 neurons and 10^12 synapses in the worst case scenario of random connectivity; for larger networks of the brain its hierarchical organization can be exploited to constrain the number of communicating computer nodes. We discuss the limits of the software technology, comparing maximum filling scaling plots for K and the JUGENE BG/P system. The usability of these machines for network simulations has become comparable to running simulations on a single PC. Turn-around times in the range of minutes even for the largest systems enable a quasi interactive working style and render simulations on this scale a practical tool for computational neuroscience. PMID:23129998
Gray: a ray tracing-based Monte Carlo simulator for PET
NASA Astrophysics Data System (ADS)
Freese, David L.; Olcott, Peter D.; Buss, Samuel R.; Levin, Craig S.
2018-05-01
Monte Carlo simulation software plays a critical role in PET system design. Performing complex, repeated Monte Carlo simulations can be computationally prohibitive, as even a single simulation can require a large amount of time and a computing cluster to complete. Here we introduce Gray, a Monte Carlo simulation software for PET systems. Gray exploits ray tracing methods used in the computer graphics community to greatly accelerate simulations of PET systems with complex geometries. We demonstrate the implementation of models for positron range, annihilation acolinearity, photoelectric absorption, Compton scatter, and Rayleigh scatter. For validation, we simulate the GATE PET benchmark, and compare energy, distribution of hits, coincidences, and run time. We show a speedup using Gray, compared to GATE for the same simulation, while demonstrating nearly identical results. We additionally simulate the Siemens Biograph mCT system with both the NEMA NU-2 scatter phantom and sensitivity phantom. We estimate the total sensitivity within % when accounting for differences in peak NECR. We also estimate the peak NECR to be kcps, or within % of published experimental data. The activity concentration of the peak is also estimated within 1.3%.
Accuracy of the lattice-Boltzmann method using the Cell processor
NASA Astrophysics Data System (ADS)
Harvey, M. J.; de Fabritiis, G.; Giupponi, G.
2008-11-01
Accelerator processors like the new Cell processor are extending the traditional platforms for scientific computation, allowing orders of magnitude more floating-point operations per second (flops) compared to standard central processing units. However, they currently lack double-precision support and support for some IEEE 754 capabilities. In this work, we develop a lattice-Boltzmann (LB) code to run on the Cell processor and test the accuracy of this lattice method on this platform. We run tests for different flow topologies, boundary conditions, and Reynolds numbers in the range Re = 6-350. In one case, simulation results show a reduced mass and momentum conservation compared to an equivalent double-precision LB implementation. All other cases demonstrate the utility of the Cell processor for fluid dynamics simulations. Benchmarks on two Cell-based platforms are performed, the Sony Playstation3 and the QS20/QS21 IBM blade, obtaining a speed-up factor of 7 and 21, respectively, compared to the original PC version of the code, and a conservative sustained performance of 28 gigaflops per single Cell processor. Our results suggest that choice of IEEE 754 rounding mode is possibly as important as double-precision support for this specific scientific application.
Influence of trunk posture on lower extremity energetics during running.
Teng, Hsiang-Ling; Powers, Christopher M
2015-03-01
This study aimed to examine the influence of sagittal plane trunk posture on lower extremity energetics during running. Forty asymptomatic recreational runners (20 males and 20 females) ran overground at a speed of 3.4 m·s(-1). Sagittal plane trunk kinematics and lower extremity kinematics and energetics during the stance phase of running were computed. Subjects were dichotomized into high flexion (HF) and low flexion (LF) groups on the basis of the mean trunk flexion angle. The mean (±SD) trunk flexion angles of the HF and LF groups were 10.8° ± 2.2° and 3.6° ± 2.8°, respectively. When compared with the LF group, the HF group demonstrated significantly higher hip extensor energy generation (0.12 ± 0.06 vs 0.05 ± 0.04 J·kg(-1), P < 0.001) and lower knee extensor energy absorption (0.60 ± 0.14 vs 0.74 ± 0.09 J·kg(-1), P = 0.001) and generation (0.30 ± 0.05 vs 0.34 ± 0.06 J·kg(-1), P = 0.02). There was no significant group difference for the ankle plantarflexor energy absorption or generation (P > 0.05). Sagittal plane trunk flexion has a significant influence on hip and knee energetics during running. Increasing forward trunk lean during running may be used as a strategy to reduce knee loading without increasing the biomechanical demand at the ankle plantarflexors.
Validation of NASA Thermal Ice Protection Computer Codes. Part 3; The Validation of Antice
NASA Technical Reports Server (NTRS)
Al-Khalil, Kamel M.; Horvath, Charles; Miller, Dean R.; Wright, William B.
2001-01-01
An experimental program was generated by the Icing Technology Branch at NASA Glenn Research Center to validate two ice protection simulation codes: (1) LEWICE/Thermal for transient electrothermal de-icing and anti-icing simulations, and (2) ANTICE for steady state hot gas and electrothermal anti-icing simulations. An electrothermal ice protection system was designed and constructed integral to a 36 inch chord NACA0012 airfoil. The model was fully instrumented with thermocouples, RTDs, and heat flux gages. Tests were conducted at several icing environmental conditions during a two-week period at the NASA Glenn Icing Research Tunnel. Experimental results of running-wet and evaporative cases were compared to the ANTICE computer code predictions and are presented in this paper.
Implementing the SU(2) Symmetry for the DMRG
NASA Astrophysics Data System (ADS)
Alvarez, Gonzalo
2010-03-01
In the Density Matrix Renormalization Group (DMRG) algorithm (White, 1992), Hamiltonian symmetries play an important role. Using symmetries, the matrix representation of the Hamiltonian can be blocked. Diagonalizing each matrix block is more efficient than diagonalizing the original matrix. This talk will explain how the DMRG++ code (arXiv:0902.3185; Computer Physics Communications 180 (2009) 1572-1578) has been extended to handle the non-local SU(2) symmetry in a model independent way. Improvements in CPU times compared to runs with only local symmetries will be discussed for typical tight-binding models of strongly correlated electronic systems. The computational bottleneck of the algorithm, and the use of shared memory parallelization will also be addressed. Finally, a roadmap for future work on DMRG++ will be presented.
An automatic frequency control loop using overlapping DFTs (Discrete Fourier Transforms)
NASA Technical Reports Server (NTRS)
Aguirre, S.
1988-01-01
An automatic frequency control (AFC) loop is introduced and analyzed in detail. The new scheme is a generalization of the well known Cross Product AFC loop that uses running overlapping discrete Fourier transforms (DFTs) to create a discriminator curve. Linear analysis is included and supported with computer simulations. The algorithm is tested in a low carrier to noise ratio (CNR) dynamic environment, and the probability of loss of lock is estimated via computer simulations. The algorithm discussed is a suboptimum tracking scheme with a larger frequency error variance compared to an optimum strategy, but offers simplicity of implementation and a very low operating threshold CNR. This technique can be applied during the carrier acquisition and re-acquisition process in the Advanced Receiver.
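As a rough illustration of the idea (not the Advanced Receiver implementation), the sketch below forms a discriminator from running overlapping DFTs of complex baseband samples: the strongest bin gives a coarse estimate, and the normalized cross product of consecutive bin outputs gives the fine frequency offset. The window length, hop size and test signal are arbitrary illustrative choices.

```python
import numpy as np

def cross_product_discriminator(x, fs, nfft=64, hop=16):
    """Estimate carrier frequency from running overlapping DFTs.

    For consecutive windows the same DFT bin k is compared; after removing the
    expected inter-window rotation of that bin, the normalized cross product
    Im(X_new * conj(X_old)) equals sin(2*pi*df*hop/fs), where df is the offset
    of the carrier from the bin centre.  Sketch only; the loop filtering and
    acquisition logic of the receiver are omitted.
    """
    window = np.hanning(nfft)
    prev, prev_k = None, None
    estimates = []
    for start in range(0, len(x) - nfft + 1, hop):
        X = np.fft.fft(window * x[start:start + nfft])
        k = int(np.argmax(np.abs(X)))                   # coarse: strongest bin
        if prev is not None and k == prev_k:
            rot = np.exp(-2j * np.pi * k * hop / nfft)  # expected bin rotation
            z = X[k] * np.conj(prev[k]) * rot
            s = np.imag(z) / (np.abs(z) + 1e-12)        # cross-product S-curve
            df = np.arcsin(np.clip(s, -1.0, 1.0)) * fs / (2 * np.pi * hop)
            bin_freq = k * fs / nfft if k < nfft // 2 else (k - nfft) * fs / nfft
            estimates.append(bin_freq + df)             # fine: offset from bin
        prev, prev_k = X, k
    return np.array(estimates)

# Example: a noisy 330 Hz complex exponential sampled at 8 kHz
fs = 8000.0
t = np.arange(4096) / fs
rng = np.random.default_rng(0)
x = np.exp(2j * np.pi * 330.0 * t) + 0.2 * (rng.standard_normal(t.size)
                                            + 1j * rng.standard_normal(t.size))
print("mean estimate (Hz):", cross_product_discriminator(x, fs).mean())
```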
Automatic computation of 2D cardiac measurements from B-mode echocardiography
NASA Astrophysics Data System (ADS)
Park, JinHyeong; Feng, Shaolei; Zhou, S. Kevin
2012-03-01
We propose a robust and fully automatic algorithm which computes the 2D echocardiography measurements recommended by the American Society of Echocardiography. The algorithm employs knowledge-based imaging technologies which can learn the expert's knowledge from training images and expert annotations. Based on the models constructed in the learning stage, the algorithm searches for the initial locations of the landmark points for the measurements by utilizing the heart structure of the left ventricle, including the mitral valve and aortic valve. It then refines the measurement landmark points using the pseudo anatomic M-mode image generated by accumulating line images from the 2D parasternal long-axis view over time. Experimental results with a large volume of data show that the algorithm runs fast and is robust, with accuracy comparable to that of experts.
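The pseudo anatomic M-mode construction lends itself to a compact sketch: sample a fixed line through the 2D frames and stack the samples over time. The line endpoints, synthetic frames and nearest-neighbour sampling below are illustrative choices, not the paper's implementation.

```python
import numpy as np

def pseudo_m_mode(frames, p0, p1, n_samples=128):
    """Build a pseudo anatomic M-mode image from a 2D B-mode cine loop.

    For every frame, intensities are sampled along the fixed line from p0 to
    p1 (e.g. a line through the left ventricle in the parasternal long-axis
    view) and stacked as one column, giving shape (n_samples, n_frames).
    """
    t = np.linspace(0.0, 1.0, n_samples)
    rows = np.rint(p0[0] + t * (p1[0] - p0[0])).astype(int)
    cols = np.rint(p0[1] + t * (p1[1] - p0[1])).astype(int)
    return np.stack([frame[rows, cols] for frame in frames], axis=1)

# Synthetic example: a bright band that moves up and down over the cycle,
# mimicking a wall crossing the sampling line.
frames = []
for k in range(32):
    img = np.zeros((128, 128))
    centre = 64 + int(20 * np.sin(2 * np.pi * k / 32))
    img[centre - 3:centre + 3, :] = 1.0
    frames.append(img)
mmode = pseudo_m_mode(frames, p0=(20, 64), p1=(110, 64))
print(mmode.shape)          # (128, 32): depth along the line vs. time
```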
Design and test of a 10kW ORC supersonic turbine generator
NASA Astrophysics Data System (ADS)
Seume, J. R.; Peters, M.; Kunte, H.
2017-03-01
Manufacturers are searching for possibilities to increase the efficiency of combustion engines by using the remaining energy of the exhaust gas. One possibility to recover some of this thermal energy is an organic Rankine cycle (ORC). For such an ORC running with ethanol, the aerothermodynamic design and test of a supersonic axial, single-stage impulse turbine generator unit are described. The blade design as well as the regulation by variable partial admission is shown. Additionally, the mechanical design of the directly coupled turbine generator unit, including the aerodynamic sealing, and the test facility are presented. Finally, the results of CFD-based computations are compared to the experimental measurements. The comparison shows remarkably good agreement between the numerical computations and the test data.
Data assimilation using a GPU accelerated path integral Monte Carlo approach
NASA Astrophysics Data System (ADS)
Quinn, John C.; Abarbanel, Henry D. I.
2011-09-01
The answers to data assimilation questions can be expressed as path integrals over all possible state and parameter histories. We show how these path integrals can be evaluated numerically using a Markov Chain Monte Carlo method designed to run in parallel on a graphics processing unit (GPU). We demonstrate the application of the method to an example with a transmembrane voltage time series of a simulated neuron as an input, and using a Hodgkin-Huxley neuron model. By taking advantage of GPU computing, we gain a parallel speedup factor of up to about 300, compared to an equivalent serial computation on a CPU, with performance increasing as the length of the observation time used for data assimilation increases.
Rotary Kiln Gasification of Solid Waste for Base Camps
2017-10-02
cup after full day run. 3.3 Feedstock Handling System: Garbage bags containing waste feedstock are placed into feed bin FB-101. Ram feeder RF-102 ... Environmental Science and Technology using the Factory Talk SCADA software running on a laptop computer. A wireless Ethernet router that is located within the ... pyrolysis oil produced required consistent draining from the system during operation and became a liquid waste disposal problem. A 5-hour test run could
Certification of computational results
NASA Technical Reports Server (NTRS)
Sullivan, Gregory F.; Wilson, Dwight S.; Masson, Gerald M.
1993-01-01
A conceptually novel and powerful technique to achieve fault detection and fault tolerance in hardware and software systems is described. When used for software fault detection, this new technique uses time and software redundancy and can be outlined as follows. In the initial phase, a program is run to solve a problem and store the result. In addition, this program leaves behind a trail of data called a certification trail. In the second phase, another program is run which solves the original problem again. This program, however, has access to the certification trail left by the first program. Because of the availability of the certification trail, the second phase can be performed by a less complex program and can execute more quickly. In the final phase, the two results are compared and if they agree the results are accepted as correct; otherwise an error is indicated. An essential aspect of this approach is that the second program must always generate either an error indication or a correct output even when the certification trail it receives from the first program is incorrect. The certification trail approach to fault tolerance is formalized and realizations of it are illustrated by considering algorithms for the following problems: convex hull, sorting, and shortest path. Cases in which the second phase can be run concurrently with the first and act as a monitor are discussed. The certification trail approach is compared to other approaches to fault tolerance.
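A toy version of the certification-trail idea for sorting (one of the problems considered) can be written in a few lines; the function names and the choice of the sorting permutation as the trail are illustrative, not the paper's exact constructions.

```python
def first_phase(data):
    """Solve the problem (sorting) and leave a certification trail.

    The trail here is the permutation that sorts the input; producing it costs
    O(n log n), but checking it later costs only O(n).
    """
    perm = sorted(range(len(data)), key=lambda i: data[i])
    return [data[i] for i in perm], perm

def second_phase(data, trail):
    """Re-solve the problem guided by the trail; must detect a bad trail.

    Verifies that the trail is a genuine permutation and that applying it
    yields a non-decreasing sequence; otherwise an error is signalled.
    """
    if sorted(trail) != list(range(len(data))):
        raise ValueError("certification trail is not a permutation")
    result = [data[i] for i in trail]
    if any(result[i] > result[i + 1] for i in range(len(result) - 1)):
        raise ValueError("certification trail does not sort the input")
    return result

data = [5, 3, 8, 1, 4]
out1, trail = first_phase(data)
out2 = second_phase(data, trail)        # cheaper second run, guided by the trail
assert out1 == out2                     # final phase: compare the two results
print("accepted:", out1)
```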
Fortran Program for X-Ray Photoelectron Spectroscopy Data Reformatting
NASA Technical Reports Server (NTRS)
Abel, Phillip B.
1989-01-01
A FORTRAN program has been written for use on an IBM PC/XT or AT or compatible microcomputer (personal computer, PC) that converts a column of ASCII-format numbers into a binary-format file suitable for interactive analysis on a Digital Equipment Corporation (DEC) computer running the VGS-5000 Enhanced Data Processing (EDP) software package. The incompatible floating-point number representations of the two computers were compared, and a subroutine was created to correctly store floating-point numbers on the IBM PC, which can be directly read by the DEC computer. Any file transfer protocol having provision for binary data can be used to transmit the resulting file from the PC to the DEC machine. The data file header required by the EDP programs for an x-ray photoelectron spectrum is also written to the file. The user is prompted for the relevant experimental parameters, which are then properly coded into the format used internally by all of the VGS-5000 series EDP packages.
Integration of Titan supercomputer at OLCF with ATLAS Production System
NASA Astrophysics Data System (ADS)
Barreiro Megino, F.; De, K.; Jha, S.; Klimentov, A.; Maeno, T.; Nilsson, P.; Oleynik, D.; Padolski, S.; Panitkin, S.; Wells, J.; Wenaus, T.; ATLAS Collaboration
2017-10-01
The PanDA (Production and Distributed Analysis) workload management system was developed to meet the scale and complexity of distributed computing for the ATLAS experiment. PanDA managed resources are distributed worldwide, on hundreds of computing sites, with thousands of physicists accessing hundreds of Petabytes of data, and the rate of data processing already exceeds an exabyte per year. While PanDA currently uses more than 200,000 cores at well over 100 Grid sites, future LHC data taking runs will require more resources than Grid computing can possibly provide. Additional computing and storage resources are required. Therefore ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. In this paper we will describe a project aimed at integration of the ATLAS Production System with the Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA Pilot framework for job submission to Titan’s batch queues and local data management, with lightweight MPI wrappers to run single node workloads in parallel on Titan’s multi-core worker nodes. It provides for running of standard ATLAS production jobs on unused resources (backfill) on Titan. The system has already allowed ATLAS to collect on Titan millions of core-hours per month and execute hundreds of thousands of jobs, while simultaneously improving Titan’s utilization efficiency. We will discuss the details of the implementation, current experience with running the system, as well as future plans aimed at improvements in scalability and efficiency. Notice: This manuscript has been authored by employees of Brookhaven Science Associates, LLC under Contract No. DE-AC02-98CH10886 with the U.S. Department of Energy. The publisher by accepting the manuscript for publication acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.
Simulation Framework for Intelligent Transportation Systems
DOT National Transportation Integrated Search
1996-10-01
A simulation framework has been developed for a large-scale, comprehensive, scalable simulation of an Intelligent Transportation System. The simulator is designed for running on parallel computers and distributed (networked) computer systems, but ca...
OpenMEEG: opensource software for quasistatic bioelectromagnetics.
Gramfort, Alexandre; Papadopoulo, Théodore; Olivi, Emmanuel; Clerc, Maureen
2010-09-06
Interpreting and controlling bioelectromagnetic phenomena require realistic physiological models and accurate numerical solvers. A semi-realistic model often used in practice is the piecewise constant conductivity model, for which only the interfaces have to be meshed. This simplified model makes it possible to use Boundary Element Methods. Unfortunately, most Boundary Element solutions are confronted with accuracy issues when the conductivity ratio between neighboring tissues is high, as for instance the scalp/skull conductivity ratio in electro-encephalography. To overcome this difficulty, we proposed a new method called the symmetric BEM, which is implemented in the OpenMEEG software. The aim of this paper is to present OpenMEEG, both from the theoretical and the practical point of view, and to compare its performances with other competing software packages. We have run a benchmark study in the field of electro- and magneto-encephalography, in order to compare the accuracy of OpenMEEG with other freely distributed forward solvers. We considered spherical models, for which analytical solutions exist, and we designed randomized meshes to assess the variability of the accuracy. Two measures were used to characterize the accuracy: the Relative Difference Measure and the Magnitude ratio. The comparisons were run either with a constant number of mesh nodes or a constant number of unknowns across methods. Computing times were also compared. We observed more pronounced differences in accuracy in electroencephalography than in magnetoencephalography. The methods could be classified in three categories: the linear collocation methods, that run very fast but with low accuracy, the linear collocation methods with isolated skull approach for which the accuracy is improved, and OpenMEEG that clearly outperforms the others. As far as speed is concerned, OpenMEEG is on par with the other methods for a constant number of unknowns, and is hence faster for a prescribed accuracy level. This study clearly shows that OpenMEEG represents the state of the art for forward computations. Moreover, our software development strategies have made it handy to use and to integrate with other packages. The bioelectromagnetic research community should therefore be able to benefit from OpenMEEG with a limited development effort.
OpenMEEG: opensource software for quasistatic bioelectromagnetics
2010-01-01
Background Interpreting and controlling bioelectromagnetic phenomena require realistic physiological models and accurate numerical solvers. A semi-realistic model often used in practice is the piecewise constant conductivity model, for which only the interfaces have to be meshed. This simplified model makes it possible to use Boundary Element Methods. Unfortunately, most Boundary Element solutions are confronted with accuracy issues when the conductivity ratio between neighboring tissues is high, as for instance the scalp/skull conductivity ratio in electro-encephalography. To overcome this difficulty, we proposed a new method called the symmetric BEM, which is implemented in the OpenMEEG software. The aim of this paper is to present OpenMEEG, both from the theoretical and the practical point of view, and to compare its performances with other competing software packages. Methods We have run a benchmark study in the field of electro- and magneto-encephalography, in order to compare the accuracy of OpenMEEG with other freely distributed forward solvers. We considered spherical models, for which analytical solutions exist, and we designed randomized meshes to assess the variability of the accuracy. Two measures were used to characterize the accuracy: the Relative Difference Measure and the Magnitude ratio. The comparisons were run either with a constant number of mesh nodes or a constant number of unknowns across methods. Computing times were also compared. Results We observed more pronounced differences in accuracy in electroencephalography than in magnetoencephalography. The methods could be classified in three categories: the linear collocation methods, that run very fast but with low accuracy, the linear collocation methods with isolated skull approach for which the accuracy is improved, and OpenMEEG that clearly outperforms the others. As far as speed is concerned, OpenMEEG is on par with the other methods for a constant number of unknowns, and is hence faster for a prescribed accuracy level. Conclusions This study clearly shows that OpenMEEG represents the state of the art for forward computations. Moreover, our software development strategies have made it handy to use and to integrate with other packages. The bioelectromagnetic research community should therefore be able to benefit from OpenMEEG with a limited development effort. PMID:20819204
Prediction of sound radiation from different practical jet engine inlets
NASA Technical Reports Server (NTRS)
Zinn, B. T.; Meyer, W. L.
1982-01-01
The computer codes necessary for this study were developed and checked against exact solutions generated by the point source method using the NASA Lewis QCSEE inlet geometry. These computer codes were used to predict the acoustic properties of the following five inlet configurations: the NASA Langley Bellmouth, the NASA Lewis JT15D-1 Ground Test Nacelle, and three finite hyperbolic inlets of 50, 70 and 90 degrees. Thirty-five computer runs were done for the NASA Langley Bellmouth. For each of these computer runs, the reflection coefficient at the duct exit plane was calculated as was the far field radiation pattern. These results are presented in both graphical and tabular form with many of the results cross-plotted so that trends in the results versus cut-off ratio (wave number) and tangential mode number may be easily identified.
Balancing Particle and Mesh Computation in a Particle-In-Cell Code
DOE Office of Scientific and Technical Information (OSTI.GOV)
Worley, Patrick H; D'Azevedo, Eduardo; Hager, Robert
2016-01-01
The XGC1 plasma microturbulence particle-in-cell simulation code has both particle-based and mesh-based computational kernels that dominate performance. Both of these are subject to load imbalances that can degrade performance and that evolve during a simulation. Each separately can be addressed adequately, but optimizing just for one can introduce significant load imbalances in the other, degrading overall performance. A technique has been developed based on Golden Section Search that minimizes wallclock time given prior information on wallclock time, and on current particle distribution and mesh cost per cell, and also adapts to evolution in load imbalance in both particle and mesh work. In problems of interest this doubled the performance on full system runs on the XK7 at the Oak Ridge Leadership Computing Facility compared to load balancing only one of the kernels.
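Golden Section Search itself is simple to sketch. The code below minimizes a hypothetical wallclock model over a single balance parameter that trades particle work against mesh work; the model function is invented for illustration and has nothing to do with XGC1's real cost data.

```python
import math

def golden_section_minimize(f, a, b, tol=1e-3):
    """Minimize a unimodal function f on [a, b] by golden section search."""
    invphi = (math.sqrt(5) - 1) / 2                 # 1/phi ~ 0.618
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:
            b, d, fd = d, c, fc                     # minimum lies in [a, d]
            c = b - invphi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd                     # minimum lies in [c, b]
            d = a + invphi * (b - a)
            fd = f(d)
    return (a + b) / 2

def wallclock(x):
    """Hypothetical cost model: total time is the slower of the two kernels."""
    particle_time = 10.0 / (0.2 + x)                # improves as x grows
    mesh_time = 2.0 + 15.0 * x                      # degrades as x grows
    return max(particle_time, mesh_time)

x_opt = golden_section_minimize(wallclock, 0.0, 1.0)
print(f"balance parameter ~ {x_opt:.3f}, wallclock ~ {wallclock(x_opt):.2f}")
```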
Wu, Xin; Koslowski, Axel; Thiel, Walter
2012-07-10
In this work, we demonstrate that semiempirical quantum chemical calculations can be accelerated significantly by leveraging the graphics processing unit (GPU) as a coprocessor on a hybrid multicore CPU-GPU computing platform. Semiempirical calculations using the MNDO, AM1, PM3, OM1, OM2, and OM3 model Hamiltonians were systematically profiled for three types of test systems (fullerenes, water clusters, and solvated crambin) to identify the most time-consuming sections of the code. The corresponding routines were ported to the GPU and optimized employing both existing library functions and a GPU kernel that carries out a sequence of noniterative Jacobi transformations during pseudodiagonalization. The overall computation times for single-point energy calculations and geometry optimizations of large molecules were reduced by one order of magnitude for all methods, as compared to runs on a single CPU core.
Secure Genomic Computation through Site-Wise Encryption
Zhao, Yongan; Wang, XiaoFeng; Tang, Haixu
2015-01-01
Commercial clouds provide on-demand IT services for big-data analysis, which have become an attractive option for users who have no access to comparable infrastructure. However, utilizing these services for human genome analysis is highly risky, as human genomic data contains identifiable information of human individuals and their disease susceptibility. Therefore, currently, no computation on personal human genomic data is conducted on public clouds. To address this issue, here we present a site-wise encryption approach to encrypt whole human genome sequences, which can be subject to secure searching of genomic signatures on public clouds. We implemented this method within the Hadoop framework, and tested it on the case of searching disease markers retrieved from the ClinVar database against patients’ genomic sequences. The secure search runs only one order of magnitude slower than the simple search without encryption, indicating our method is ready to be used for secure genomic computation on public clouds. PMID:26306278
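A minimal sketch of the site-wise idea, assuming a deterministic keyed primitive (here an HMAC) applied per genomic site so that equality search still works on ciphertext; the published system's construction and its Hadoop pipeline are considerably more elaborate than this toy, and the function names are invented.

```python
import hmac
import hashlib

def encrypt_sites(genotypes, key):
    """Deterministically encrypt a genome site-by-site.

    Each (position, allele) pair is replaced by an HMAC keyed by the data
    owner, so equal plaintext sites map to equal tokens and can be matched on
    an untrusted server without revealing the alleles.
    """
    return {pos: hmac.new(key, f"{pos}:{allele}".encode(), hashlib.sha256).hexdigest()
            for pos, allele in genotypes.items()}

def search_markers(encrypted_genome, markers, key):
    """Count disease markers present in the encrypted genome.

    In a real deployment the marker queries would be tokenized by the key
    holder before being sent to the server; this toy does both steps locally.
    """
    tokens = encrypt_sites(markers, key)
    return sum(1 for pos, tok in tokens.items() if encrypted_genome.get(pos) == tok)

key = b"per-patient secret key"
patient = {101: "A", 202: "G", 303: "T", 404: "C"}
clinvar_markers = {202: "G", 303: "C", 505: "A"}
enc_patient = encrypt_sites(patient, key)
print("matching markers:", search_markers(enc_patient, clinvar_markers, key))  # -> 1
```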
An acceleration framework for synthetic aperture radar algorithms
NASA Astrophysics Data System (ADS)
Kim, Youngsoo; Gloster, Clay S.; Alexander, Winser E.
2017-04-01
Algorithms for radar signal processing, such as Synthetic Aperture Radar (SAR), are computationally intensive and require considerable execution time on a general purpose processor. Reconfigurable logic can be used to off-load the primary computational kernel onto a custom computing machine in order to reduce execution time by an order of magnitude as compared to kernel execution on a general purpose processor. Specifically, Field Programmable Gate Arrays (FPGAs) can be used to accelerate these kernels using hardware-based custom logic implementations. In this paper, we demonstrate a framework for algorithm acceleration. We used SAR as a case study to illustrate the potential for algorithm acceleration offered by FPGAs. Initially, we profiled the SAR algorithm and implemented a homomorphic filter using a hardware implementation of the natural logarithm. Experimental results show a linear speedup by adding reasonably small processing elements in a Field Programmable Gate Array (FPGA) as opposed to using a software implementation running on a typical general purpose processor.
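For reference, a software analogue of the homomorphic-filtering kernel (log transform, frequency-domain high-boost filter, exponentiation) is sketched below in NumPy; the cutoff and gain values are placeholders, and the hardware natural-logarithm implementation discussed above is of course not represented here.

```python
import numpy as np

def homomorphic_filter(image, cutoff=0.1, low_gain=0.5, high_gain=1.5):
    """Homomorphic filtering of a non-negative image.

    The multiplicative illumination/reflectance model becomes additive after a
    natural logarithm (the operation accelerated in hardware above); a Gaussian
    high-boost filter in the frequency domain then compresses the slowly
    varying component and enhances detail, and exp() returns to the image
    domain.  The parameters are illustrative, not taken from the paper.
    """
    log_img = np.log1p(image.astype(float))             # log stage
    rows, cols = log_img.shape
    u = np.fft.fftfreq(rows)[:, None]
    v = np.fft.fftfreq(cols)[None, :]
    radius2 = u ** 2 + v ** 2
    H = low_gain + (high_gain - low_gain) * (1.0 - np.exp(-radius2 / (2 * cutoff ** 2)))
    filtered = np.real(np.fft.ifft2(np.fft.fft2(log_img) * H))
    return np.expm1(filtered)                            # back from the log domain

# Synthetic "SAR-like" magnitude image: smooth illumination times speckle
rng = np.random.default_rng(0)
illum = np.outer(np.linspace(1.0, 4.0, 128), np.linspace(1.0, 4.0, 128))
speckle = rng.exponential(1.0, size=(128, 128))
out = homomorphic_filter(illum * speckle)
print(out.shape, float(out.min()), float(out.max()))
```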
GPU accelerated implementation of NCI calculations using promolecular density.
Rubez, Gaëtan; Etancelin, Jean-Matthieu; Vigouroux, Xavier; Krajecki, Michael; Boisson, Jean-Charles; Hénon, Eric
2017-05-30
The NCI approach is a modern tool to reveal chemical noncovalent interactions. It is particularly attractive to describe ligand-protein binding. A custom implementation for NCI using promolecular density is presented. It is designed to leverage the computational power of NVIDIA graphics processing unit (GPU) accelerators through the CUDA programming model. The performance of three code versions is examined on a test set of 144 systems. NCI calculations are particularly well suited to the GPU architecture, which drastically reduces the computational time. On a single compute node, the dual-GPU version leads to a 39-fold improvement for the biggest instance compared to the optimal OpenMP parallel run (C code, icc compiler) with 16 CPU cores. Energy consumption measurements carried out on both CPU and GPU NCI tests show that the GPU approach provides substantial energy savings. © 2017 Wiley Periodicals, Inc.
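The core quantity behind NCI, the reduced density gradient s = |∇ρ| / (2 (3π²)^(1/3) ρ^(4/3)) evaluated on a promolecular density, can be sketched on the CPU as below. The single-exponential atomic densities and their parameters are placeholders rather than the fitted promolecular densities used by NCI codes, and the GPU/CUDA parallelization is not shown.

```python
import numpy as np

# Each "atom" contributes rho_i(r) = c_i * exp(-|r - R_i| / z_i); this simple
# form stands in for the tabulated atomic densities of real NCI programs.

def promolecular_rho_and_grad(points, atoms):
    rho = np.zeros(len(points))
    grad = np.zeros((len(points), 3))
    for center, c, zeta in atoms:
        d = points - center                            # (N, 3) displacement
        r = np.linalg.norm(d, axis=1) + 1e-12
        rho_i = c * np.exp(-r / zeta)
        rho += rho_i
        # gradient of c*exp(-r/zeta) is -rho_i/(zeta*r) times the displacement
        grad += (-rho_i / (zeta * r))[:, None] * d
    return rho, grad

def reduced_density_gradient(rho, grad):
    prefac = 2.0 * (3.0 * np.pi ** 2) ** (1.0 / 3.0)
    return np.linalg.norm(grad, axis=1) / (prefac * rho ** (4.0 / 3.0))

# Two "atoms" 2 bohr apart; s drops near the midpoint, the NCI signature
atoms = [(np.array([0.0, 0.0, 0.0]), 1.0, 0.5),
         (np.array([2.0, 0.0, 0.0]), 1.0, 0.5)]
x = np.linspace(-1.0, 3.0, 9)
points = np.stack([x, np.zeros_like(x), np.zeros_like(x)], axis=1)
rho, grad = promolecular_rho_and_grad(points, atoms)
print(np.round(reduced_density_gradient(rho, grad), 3))
```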
Secure Genomic Computation through Site-Wise Encryption.
Zhao, Yongan; Wang, XiaoFeng; Tang, Haixu
2015-01-01
Commercial clouds provide on-demand IT services for big-data analysis, which have become an attractive option for users who have no access to comparable infrastructure. However, utilizing these services for human genome analysis is highly risky, as human genomic data contains identifiable information of human individuals and their disease susceptibility. Therefore, currently, no computation on personal human genomic data is conducted on public clouds. To address this issue, here we present a site-wise encryption approach to encrypt whole human genome sequences, which can be subject to secure searching of genomic signatures on public clouds. We implemented this method within the Hadoop framework, and tested it on the case of searching disease markers retrieved from the ClinVar database against patients' genomic sequences. The secure search runs only one order of magnitude slower than the simple search without encryption, indicating our method is ready to be used for secure genomic computation on public clouds.
ERIC Educational Resources Information Center
Navarro, Aaron B.
1981-01-01
Presents a program in Level II BASIC for a TRS-80 computer that simulates a Turing machine and discusses the nature of the device. The program is run interactively and is designed to be used as an educational tool by computer science or mathematics students studying computational or automata theory. (MP)
10 CFR 2.1003 - Availability of material.
Code of Federal Regulations, 2013 CFR
2013-01-01
... its license application for a geologic repository, the NRC shall make available no later than thirty... privilege in § 2.1006, graphic-oriented documentary material that includes raw data, computer runs, computer... discrepancies; (ii) Gauge, meter and computer settings; (iii) Probe locations; (iv) Logging intervals and rates...
10 CFR 2.1003 - Availability of material.
Code of Federal Regulations, 2014 CFR
2014-01-01
... its license application for a geologic repository, the NRC shall make available no later than thirty... privilege in § 2.1006, graphic-oriented documentary material that includes raw data, computer runs, computer... discrepancies; (ii) Gauge, meter and computer settings; (iii) Probe locations; (iv) Logging intervals and rates...
SU-F-BRD-13: Quantum Annealing Applied to IMRT Beamlet Intensity Optimization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nazareth, D; Spaans, J
Purpose: We report on the first application of quantum annealing (QA) to the process of beamlet intensity optimization for IMRT. QA is a new technology, which employs novel hardware and software techniques to address various discrete optimization problems in many fields. Methods: We apply the D-Wave Inc. proprietary hardware, which natively exploits quantum mechanical effects for improved optimization. The new QA algorithm, running on this hardware, is most similar to simulated annealing, but relies on natural processes to directly minimize the free energy of a system. A simple quantum system is slowly evolved into a classical system, representing the objective function. To apply QA to IMRT-type optimization, two prostate cases were considered. A reduced number of beamlets were employed, due to the current QA hardware limitation of ∼500 binary variables. The beamlet dose matrices were computed using CERR, and an objective function was defined based on typical clinical constraints, including dose-volume objectives. The objective function was discretized, and the QA method was compared to two standard optimization methods: simulated annealing and Tabu search, run on a conventional computing cluster. Results: Based on several runs, the average final objective function value achieved by the QA was 16.9 for the first patient, compared with 10.0 for Tabu and 6.7 for the SA. For the second patient, the values were 70.7 for the QA, 120.0 for Tabu, and 22.9 for the SA. The QA algorithm required 27–38% of the time required by the other two methods. Conclusion: In terms of objective function value, the QA performance was similar to Tabu but less effective than the SA. However, its speed was 3–4 times faster than the other two methods. This initial experiment suggests that QA-based heuristics may offer significant speedup over conventional clinical optimization methods, as quantum annealing hardware scales to larger sizes.
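For context, the simulated-annealing baseline over discretized beamlet intensities can be sketched as follows; the toy dose matrix, weights, levels and cooling schedule are invented for illustration and do not reflect the clinical objective function or the CERR data used in the study.

```python
import math
import random

def simulated_annealing(dose, target, weights, levels, steps=20000, t0=50.0, seed=0):
    """Anneal discretized beamlet intensities against a quadratic dose objective.

    dose[v][b] is the dose to voxel v per unit intensity of beamlet b (a toy
    stand-in for beamlet dose matrices), target[v] the prescribed dose and
    weights[v] the penalty weight; intensities are restricted to discrete
    levels, mirroring the discretization needed for the annealer.
    """
    rng = random.Random(seed)
    n_beamlets = len(dose[0])
    x = [rng.choice(levels) for _ in range(n_beamlets)]

    def objective(plan):
        total = 0.0
        for v in range(len(dose)):
            d = sum(dose[v][b] * plan[b] for b in range(n_beamlets))
            total += weights[v] * (d - target[v]) ** 2
        return total

    cur = best = objective(x)
    best_x = list(x)
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-6         # linear cooling schedule
        b = rng.randrange(n_beamlets)
        old = x[b]
        x[b] = rng.choice(levels)                       # propose a new intensity
        new = objective(x)
        if new < cur or rng.random() < math.exp(-(new - cur) / temp):
            cur = new
            if new < best:
                best, best_x = new, list(x)
        else:
            x[b] = old                                  # reject the move
    return best, best_x

# Tiny example: 3 voxels (tumour, organ-at-risk, normal tissue), 4 beamlets
dose = [[0.9, 0.8, 0.1, 0.1],
        [0.5, 0.1, 0.1, 0.0],
        [0.1, 0.1, 0.6, 0.7]]
target = [60.0, 10.0, 5.0]
weights = [1.0, 2.0, 1.0]
levels = list(range(0, 81, 5))                          # 0, 5, ..., 80
val, plan = simulated_annealing(dose, target, weights, levels)
print("objective:", round(val, 2), "intensities:", plan)
```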
Kaufmann, Tobias; Holz, Elisa M; Kübler, Andrea
2013-01-01
This paper describes a case study with a patient in the classic locked-in state, who currently has no means of independent communication. Following a user-centered approach, we investigated event-related potentials (ERP) elicited in different modalities for use in brain-computer interface (BCI) systems. Such systems could provide her with an alternative communication channel. To investigate the most viable modality for achieving BCI based communication, classic oddball paradigms (1 rare and 1 frequent stimulus, ratio 1:5) in the visual, auditory and tactile modality were conducted (2 runs per modality). Classifiers were built on one run and tested offline on another run (and vice versa). In these paradigms, the tactile modality was clearly superior to other modalities, displaying high offline accuracy even when classification was performed on single trials only. Consequently, we tested the tactile paradigm online and the patient successfully selected targets without any error. Furthermore, we investigated use of the visual or tactile modality for different BCI systems with more than two selection options. In the visual modality, several BCI paradigms were tested offline. Neither matrix-based nor so-called gaze-independent paradigms constituted a means of control. These results may thus question the gaze-independence of current gaze-independent approaches to BCI. A tactile four-choice BCI resulted in high offline classification accuracies. Yet, online use raised various issues. Although performance was clearly above chance, practical daily life use appeared unlikely when compared to other communication approaches (e.g., partner scanning). Our results emphasize the need for user-centered design in BCI development including identification of the best stimulus modality for a particular user. Finally, the paper discusses feasibility of EEG-based BCI systems for patients in classic locked-in state and compares BCI to other AT solutions that we also tested during the study.
ABSENTEE COMPUTATIONS IN A MULTIPLE-ACCESS COMPUTER SYSTEM.
require user interaction, and the user may therefore want to run these computations ’absentee’ (or, user not present). A mechanism is presented which...provides for the handling of absentee computations in a multiple-access computer system. The design is intended to be implementation-independent...Some novel features of the system’s design are: a user can switch computations from interactive to absentee (and vice versa), the system can
Kok, H P; de Greef, M; Bel, A; Crezee, J
2009-08-01
In regional hyperthermia, optimization is useful to obtain adequate applicator settings. A speed-up of the previously published method for high resolution temperature based optimization is proposed. Element grouping as described in literature uses selected voxel sets instead of single voxels to reduce computation time. Elements which achieve their maximum heating potential for approximately the same phase/amplitude setting are grouped. To form groups, eigenvalues and eigenvectors of precomputed temperature matrices are used. At high resolution temperature matrices are unknown and temperatures are estimated using low resolution (1 cm) computations and the high resolution (2 mm) temperature distribution computed for low resolution optimized settings using zooming. This technique can be applied to estimate an upper bound for high resolution eigenvalues. The heating potential of elements was estimated using these upper bounds. Correlations between elements were estimated with low resolution eigenvalues and eigenvectors, since high resolution eigenvectors remain unknown. Four different grouping criteria were applied. Constraints were set to the average group temperatures. Element grouping was applied for five patients and optimal settings for the AMC-8 system were determined. Without element grouping the average computation times for five and ten runs were 7.1 and 14.4 h, respectively. Strict grouping criteria were necessary to prevent an unacceptable exceeding of the normal tissue constraints (up to approximately 2 degrees C), caused by constraining average instead of maximum temperatures. When strict criteria were applied, speed-up factors of 1.8-2.1 and 2.6-3.5 were achieved for five and ten runs, respectively, depending on the grouping criteria. When many runs are performed, the speed-up factor will converge to 4.3-8.5, which is the average reduction factor of the constraints and depends on the grouping criteria. Tumor temperatures were comparable. Maximum exceeding of the constraint in a hot spot was 0.24-0.34 degree C; average maximum exceeding over all five patients was 0.09-0.21 degree C, which is acceptable. High resolution temperature based optimization using element grouping can achieve a speed-up factor of 4-8, without large deviations from the conventional method.
Spirou, Spiridon V; Papadimitroulas, Panagiotis; Liakou, Paraskevi; Georgoulias, Panagiotis; Loudos, George
2015-09-01
To present and evaluate a new methodology to investigate the effect of attenuation correction (AC) in single-photon emission computed tomography (SPECT) using textural features analysis, Monte Carlo techniques, and a computational anthropomorphic model. The GATE Monte Carlo toolkit was used to simulate SPECT experiments using the XCAT computational anthropomorphic model, filled with a realistic biodistribution of (99m)Tc-N-DBODC. The simulated gamma camera was the Siemens ECAM Dual-Head, equipped with a parallel hole lead collimator, with an image resolution of 3.54 × 3.54 mm(2). Thirty-six equispaced camera positions, spanning a full 360° arc, were simulated. Projections were calculated after applying a ± 20% energy window or after eliminating all scattered photons. The activity of the radioisotope was reconstructed using the MLEM algorithm. Photon attenuation was accounted for by calculating the radiological pathlength in a perpendicular line from the center of each voxel to the gamma camera. Twenty-two textural features were calculated on each slice, with and without AC, using 16 and 64 gray levels. A mask was used to identify only those pixels that belonged to each organ. Twelve of the 22 features showed almost no dependence on AC, irrespective of the organ involved. In both the heart and the liver, the mean and SD were the features most affected by AC. In the liver, six features were affected by AC only on some slices. Depending on the slice, skewness decreased by 22-34% with AC, kurtosis by 35-50%, long-run emphasis mean by 71-91%, and long-run emphasis range by 62-95%. In contrast, gray-level non-uniformity mean increased by 78-218% compared with the value without AC and run percentage mean by 51-159%. These results were not affected by the number of gray levels (16 vs. 64) or the data used for reconstruction: with the energy window or without scattered photons. The mean and SD were the main features affected by AC. In the heart, no other feature was affected. In the liver, other features were affected, but the effect was slice dependent. The number of gray levels did not affect the results.
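As an illustration of the feature computation on a masked organ region (not the GATE/XCAT pipeline itself), the sketch below quantizes a slice to a chosen number of gray levels inside a mask and returns the first-order features reported above as most sensitive to attenuation correction; run-length features are omitted, and the toy attenuation field is purely illustrative.

```python
import numpy as np

def first_order_features(slice_img, mask, n_levels=16):
    """First-order texture features over a masked organ region.

    The slice is quantized to n_levels gray levels inside the mask (as in the
    16/64-level analysis above) and the mean, standard deviation, skewness and
    kurtosis are returned.
    """
    vals = slice_img[mask].astype(float)
    lo, hi = vals.min(), vals.max()
    q = np.floor((vals - lo) / (hi - lo + 1e-12) * n_levels).clip(0, n_levels - 1) + 1
    mu, sd = q.mean(), q.std()
    skew = np.mean(((q - mu) / (sd + 1e-12)) ** 3)
    kurt = np.mean(((q - mu) / (sd + 1e-12)) ** 4) - 3.0
    return {"mean": mu, "sd": sd, "skewness": skew, "kurtosis": kurt}

# Toy comparison of a circular "organ" region with and without a crude
# multiplicative correction field standing in for attenuation correction.
rng = np.random.default_rng(0)
img = rng.poisson(50, size=(64, 64)).astype(float)
yy, xx = np.mgrid[0:64, 0:64]
mask = (yy - 32) ** 2 + (xx - 32) ** 2 < 20 ** 2
ac_field = 1.0 + 0.5 * xx / 63.0
print("no AC  :", first_order_features(img, mask))
print("with AC:", first_order_features(img * ac_field, mask))
```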
Exploiting CMS data popularity to model the evolution of data management for Run-2 and beyond
NASA Astrophysics Data System (ADS)
Bonacorsi, D.; Boccali, T.; Giordano, D.; Girone, M.; Neri, M.; Magini, N.; Kuznetsov, V.; Wildish, T.
2015-12-01
During the LHC Run-1 data taking, all experiments collected large data volumes from proton-proton and heavy-ion collisions. The collisions data, together with massive volumes of simulated data, were replicated in multiple copies, transferred among various Tier levels, transformed/slimmed in format/content. These data were then accessed (both locally and remotely) by large groups of distributed analysis communities exploiting the WorldWide LHC Computing Grid infrastructure and services. While efficient data placement strategies - together with optimal data redistribution and deletions on demand - have become the core of static versus dynamic data management projects, little effort has so far been invested in understanding the detailed data-access patterns which surfaced in Run-1. These patterns, if understood, can be used as input to simulation of computing models at the LHC, to optimise existing systems by tuning their behaviour, and to explore next-generation CPU/storage/network co-scheduling solutions. This is of great importance, given that the scale of the computing problem will increase far faster than the resources available to the experiments, for Run-2 and beyond. Studying data-access patterns involves the validation of the quality of the monitoring data collected on the “popularity” of each dataset, the analysis of the frequency and pattern of accesses to different datasets by analysis end-users, the exploration of different views of the popularity data (by physics activity, by region, by data type), the study of the evolution of Run-1 data exploitation over time, the evaluation of the impact of different data placement and distribution choices on the available network and storage resources and their impact on the computing operations. This work presents some insights from studies on the popularity data from the CMS experiment. We present the properties of a range of physics analysis activities as seen by the data popularity, and make recommendations for how to tune the initial distribution of data in anticipation of how it will be used in Run-2 and beyond.
CalSimHydro Tool - A Web-based interactive tool for the CalSim 3.0 Hydrology Preprocessor
NASA Astrophysics Data System (ADS)
Li, P.; Stough, T.; Vu, Q.; Granger, S. L.; Jones, D. J.; Ferreira, I.; Chen, Z.
2011-12-01
CalSimHydro, the CalSim 3.0 Hydrology Preprocessor, is an application designed to automate the various steps in the computation of hydrologic inputs for CalSim 3.0, a water resources planning model developed jointly by California State Department of Water Resources and United States Bureau of Reclamation, Mid-Pacific Region. CalSimHydro consists of a five-step FORTRAN-based program that runs the individual models in succession, passing information from one model to the next and aggregating data as required by each model. The final product of CalSimHydro is an updated CalSim 3.0 state variable (SV) DSS input file. CalSimHydro consists of (1) a Rainfall-Runoff Model to compute monthly infiltration, (2) a Soil moisture and demand calculator (IDC) that estimates surface runoff, deep percolation, and water demands for natural vegetation cover and various crops other than rice, (3) a Rice Water Use Model to compute the water demands, deep percolation, irrigation return flow, and runoff from precipitation for the rice fields, (4) a Refuge Water Use Model that simulates the ponding operations for managed wetlands, and (5) a Data Aggregation and Transfer Module to aggregate the outputs from the above modules and transfer them to the CalSim SV input file. In this presentation, we describe a web-based user interface for CalSimHydro using Google Earth Plug-In. The CalSimHydro tool allows users to: interact with geo-referenced layers of the Water Budget Areas (WBA) and Demand Units (DU) displayed over the Sacramento Valley; view the input parameters of the hydrology preprocessor for a selected WBA or DU in a time series plot or a tabular form; edit the values of the input parameters in the table or by downloading a spreadsheet of the selected parameter in a selected time range; run the CalSimHydro modules on the backend server and notify the user when the job is done; visualize the model output and compare it with a base run result; and download the output SV file to be used to run CalSim 3.0. The CalSimHydro tool streamlines the complicated steps to configure and run the hydrology preprocessor by providing a user-friendly visual interface and back-end services to validate user inputs and manage the model execution. It is a powerful addition to the new CalSim 3.0 system.
An improved design method and experimental performance of two dimensional curved wall diffusers
NASA Technical Reports Server (NTRS)
Yang, T.; Hudson, W. G.; El-Nashar, A. M.
1972-01-01
A computer design program was developed to incorporate the suction slots in solving the potential flow equations with prescribed boundary conditions. Using the contour generated from this program, two Griffith diffusers were tested with area ratios AR = 3 and 4. The inlet Reynolds number ranged from 600,000 to 7 million. It was found that the slot suction required for metastable operation depends on the sidewall suction applied. A slot suction of 8% of the inlet flow rate was required for AR = 4 under metastable conditions, provided that enough sidewall suction was applied. For AR = 3, the required slot suction was about 25% lower than for AR = 4. For nearly all unseparated test runs, the effectiveness was 100% and the exit flow was uniform. In addition to the Griffith diffusers, dump and cusp diffusers of comparable area ratios were built and tested. The results obtained from these diffusers were compared with those of the Griffith diffusers. Flow separation occurred in all test runs with the dump and cusp diffusers.
webPIPSA: a web server for the comparison of protein interaction properties
Richter, Stefan; Wenzel, Anne; Stein, Matthias; Gabdoulline, Razif R.; Wade, Rebecca C.
2008-01-01
Protein molecular interaction fields are key determinants of protein functionality. PIPSA (Protein Interaction Property Similarity Analysis) is a procedure to compare and analyze protein molecular interaction fields, such as the electrostatic potential. PIPSA may assist in protein functional assignment, classification of proteins, the comparison of binding properties and the estimation of enzyme kinetic parameters. webPIPSA is a web server that enables the use of PIPSA to compare and analyze protein electrostatic potentials. While PIPSA can be run with downloadable software (see http://projects.eml.org/mcm/software/pipsa), webPIPSA extends and simplifies a PIPSA run. This allows non-expert users to perform PIPSA for their protein datasets. With input protein coordinates, the superposition of protein structures, as well as the computation and analysis of electrostatic potentials, is automated. The results are provided as electrostatic similarity matrices from an all-pairwise comparison of the proteins which can be subjected to clustering and visualized as epograms (tree-like diagrams showing electrostatic potential differences) or heat maps. webPIPSA is freely available at: http://pipsa.eml.org. PMID:18420653
Adaptive mesh fluid simulations on GPU
NASA Astrophysics Data System (ADS)
Wang, Peng; Abel, Tom; Kaehler, Ralf
2010-10-01
We describe an implementation of compressible inviscid fluid solvers with block-structured adaptive mesh refinement on Graphics Processing Units using NVIDIA's CUDA. We show that a class of high resolution shock capturing schemes can be mapped naturally on this architecture. Using the method of lines approach with the second order total variation diminishing Runge-Kutta time integration scheme, piecewise linear reconstruction, and a Harten-Lax-van Leer Riemann solver, we achieve an overall speedup of approximately 10 times faster execution on one graphics card as compared to a single core on the host computer. We attain this speedup in uniform grid runs as well as in problems with deep AMR hierarchies. Our framework can readily be applied to more general systems of conservation laws and extended to higher order shock capturing schemes. This is shown directly by an implementation of a magneto-hydrodynamic solver and comparing its performance to the pure hydrodynamic case. Finally, we also combined our CUDA parallel scheme with MPI to make the code run on GPU clusters. Close to ideal speedup is observed on up to four GPUs.
Federated data storage system prototype for LHC experiments and data intensive science
NASA Astrophysics Data System (ADS)
Kiryanov, A.; Klimentov, A.; Krasnopevtsev, D.; Ryabinkin, E.; Zarochentsev, A.
2017-10-01
Rapid increase of data volume from the experiments running at the Large Hadron Collider (LHC) prompted the physics computing community to evaluate new data handling and processing solutions. Russian grid sites and university clusters scattered over a large area aim to unite their resources for future productive work, while also providing support to large physics collaborations. In our project we address the fundamental problem of designing a computing architecture to integrate distributed storage resources for LHC experiments and other data-intensive science applications and to provide access to data from heterogeneous computing facilities. Studies include development and implementation of a federated data storage prototype for Worldwide LHC Computing Grid (WLCG) centres of different levels and university clusters within one National Cloud. The prototype is based on computing resources located in Moscow, Dubna, Saint Petersburg, Gatchina and Geneva. This project intends to implement a federated distributed storage for all kinds of operations, such as read/write/transfer and access via WAN from Grid centres, university clusters, supercomputers, academic and commercial clouds. The efficiency and performance of the system are demonstrated using synthetic and experiment-specific tests including real data processing and analysis workflows from the ATLAS and ALICE experiments, as well as compute-intensive bioinformatics applications (PALEOMIX) running on supercomputers. We present the topology and architecture of the designed system, report performance and statistics for different access patterns and show how federated data storage can be used efficiently by physicists and biologists. We also describe how sharing data on a widely distributed storage system can lead to a new computing model and changes in computing style, for instance how a bioinformatics program running on supercomputers can read/write data from the federated storage.
Simulation framework for intelligent transportation systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ewing, T.; Doss, E.; Hanebutte, U.
1996-10-01
A simulation framework has been developed for a large-scale, comprehensive, scalable simulation of an Intelligent Transportation System (ITS). The simulator is designed for running on parallel computers and distributed (networked) computer systems, but can run on standalone workstations for smaller simulations. The simulator currently models instrumented smart vehicles with in-vehicle navigation units capable of optimal route planning and Traffic Management Centers (TMC). The TMC has probe vehicle tracking capabilities (display position and attributes of instrumented vehicles), and can provide two-way interaction with traffic to provide advisories and link times. Both the in-vehicle navigation module and the TMC feature detailed graphical user interfaces to support human-factors studies. Realistic modeling of variations of the posted driving speed is based on human factors studies that take into consideration weather, road conditions, driver personality and behavior, and vehicle type. The prototype has been developed on a distributed system of networked UNIX computers but is designed to run on parallel computers, such as ANL's IBM SP-2, for large-scale problems. A novel feature of the approach is that vehicles are represented by autonomous computer processes which exchange messages with other processes. The vehicles have a behavior model which governs route selection and driving behavior, and can react to external traffic events much like real vehicles. With this approach, the simulation is scalable to take advantage of emerging massively parallel processor (MPP) systems.
Lanier, T.H.
1996-01-01
The 100-year flood plain was determined for Upper Three Runs, its tributaries, and the part of the Savannah River that borders the Savannah River Site. The results are provided in tabular and graphical formats. The 100-year flood-plain maps and flood profiles provide water-resource managers of the Savannah River Site with a technical basis for making flood-plain management decisions that could minimize future flood problems and provide a basis for designing and constructing drainage structures along roadways. A hydrologic analysis was made to estimate the 100-year recurrence-interval flow for Upper Three Runs and its tributaries. The analysis showed that the well-drained, sandy soils in the headwaters of Upper Three Runs reduce the high flows in the stream; therefore, the South Carolina upper Coastal Plain regional-rural-regression equation does not apply for Upper Three Runs. Consequently, a relation was established for 100-year recurrence-interval flow and drainage area using streamflow data from U.S. Geological Survey gaging stations on Upper Three Runs. This relation was used to compute 100-year recurrence-interval flows at selected points along the stream. The regional regression equations were applicable for the tributaries to Upper Three Runs, because the soil types in the drainage basins of the tributaries resemble those normally occurring in upper Coastal Plain basins. This was verified by analysis of the flood-frequency data collected from U.S. Geological Survey gaging station 02197342 on Fourmile Branch. Cross sections were surveyed throughout each reach, and other pertinent data such as flow resistance and land-use were collected. The surveyed cross sections and computed 100-year recurrence-interval flows were used in a step-backwater model to compute the 100-year flood profile for Upper Three Runs and its tributaries. The profiles were used to delineate the 100-year flood plain on topographic maps. The Savannah River forms the southwestern border of the Savannah River Site. Data from previously published reports were used to delineate the 100-year flood plain for the Savannah River from the downstream site boundary at the mouth of Lower Three Runs at river mile 125 to the upstream site boundary at river mile 163.
Preventing Run-Time Bugs at Compile-Time Using Advanced C++
DOE Office of Scientific and Technical Information (OSTI.GOV)
Neswold, Richard
When writing software, we develop algorithms that tell the computer what to do at run-time. Our solutions are easier to understand and debug when they are properly modeled using class hierarchies, enumerations, and a well-factored API. Unfortunately, even with these design tools, we end up having to debug our programs at run-time. Worse still, debugging an embedded system changes its dynamics, making it tough to find and fix concurrency issues. This paper describes techniques using C++ to detect run-time bugs *at compile time*. A concurrency library, developed at Fermilab, is used for examples in illustrating these techniques.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bartoletti, T.
SPI/U3.1 consists of five tools used to assess and report the security posture of computers running the UNIX operating system. The tools are: Access Control Test: A rule-based system which identifies sequential dependencies in UNIX access controls. Binary Inspector Tool: Evaluates the release status of system binaries by comparing a crypto-checksum to provided table entries. Change Detection Tool: Maintains and applies a snapshot of critical system files and attributes for purposes of change detection. Configuration Query Language: Accepts CQL-based scripts (provided) to evaluate queries over the status of system files, configuration of services and many other elements of UNIX system security. Password Security Inspector: Tests for weak or aged passwords. The tools are packaged with a forms-based user interface providing on-line context-sensitive help, job scheduling, parameter management and output report management utilities. Tools may be run independently of the UI.
Mercury BLASTP: Accelerating Protein Sequence Alignment
Jacob, Arpith; Lancaster, Joseph; Buhler, Jeremy; Harris, Brandon; Chamberlain, Roger D.
2008-01-01
Large-scale protein sequence comparison is an important but compute-intensive task in molecular biology. BLASTP is the most popular tool for comparative analysis of protein sequences. In recent years, an exponential increase in the size of protein sequence databases has required either exponentially more running time or a cluster of machines to keep pace. To address this problem, we have designed and built a high-performance FPGA-accelerated version of BLASTP, Mercury BLASTP. In this paper, we describe the architecture of the portions of the application that are accelerated in the FPGA, and we also describe the integration of these FPGA-accelerated portions with the existing BLASTP software. We have implemented Mercury BLASTP on a commodity workstation with two Xilinx Virtex-II 6000 FPGAs. We show that the new design runs 11-15 times faster than software BLASTP on a modern CPU while delivering close to 99% identical results. PMID:19492068
SPI/U3.2. Security Profile Inspector for UNIX Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bartoletti, A.
1994-08-01
SPI/U3.2 consists of five tools used to assess and report the security posture of computers running the UNIX operating system. The tools are: Access Control Test: A rule-based system which identifies sequential dependencies in UNIX access controls. Binary Authentication Tool: Evaluates the release status of system binaries by comparing a crypto-checksum to provided table entries. Change Detection Tool: Maintains and applies a snapshot of critical system files and attributes for purposes of change detection. Configuration Query Language: Accepts CQL-based scripts (provided) to evaluate queries over the status of system files, configuration of services and many other elements of UNIX system security. Password Security Inspector: Tests for weak or aged passwords. The tools are packaged with a forms-based user interface providing on-line context-sensitive help, job scheduling, parameter management and output report management utilities. Tools may be run independently of the UI.
Giving students the run of sprinting models
NASA Astrophysics Data System (ADS)
Heck, André; Ellermeijer, Ton
2009-11-01
A biomechanical study of sprinting is an interesting task for students who have a background in mechanics and calculus. These students can work with real data and do practical investigations similar to the way sports scientists do research. Student research activities are viable when the students are familiar with tools to collect and work with data from sensors and video recordings and with modeling tools for comparing simulation and experimental results. This article describes a multipurpose system, named COACH, that offers a versatile integrated set of tools for learning, doing, and teaching mathematics and science in a computer-based inquiry approach. Automated tracking of reference points and correction of perspective distortion in videos, state-of-the-art algorithms for data smoothing and numerical differentiation, and graphical system dynamics based modeling are some of the built-in techniques that are suitable for motion analysis. Their implementation and their application in student activities involving models of running are discussed.
A computer program for uncertainty analysis integrating regression and Bayesian methods
Lu, Dan; Ye, Ming; Hill, Mary C.; Poeter, Eileen P.; Curtis, Gary
2014-01-01
This work develops a new functionality in UCODE_2014 to evaluate Bayesian credible intervals using the Markov Chain Monte Carlo (MCMC) method. The MCMC capability in UCODE_2014 is based on the FORTRAN version of the differential evolution adaptive Metropolis (DREAM) algorithm of Vrugt et al. (2009), which estimates the posterior probability density function of model parameters in high-dimensional and multimodal sampling problems. The UCODE MCMC capability provides eleven prior probability distributions and three ways to initialize the sampling process. It evaluates parametric and predictive uncertainties and it has parallel computing capability based on multiple chains to accelerate the sampling process. This paper tests and demonstrates the MCMC capability using a 10-dimensional multimodal mathematical function, a 100-dimensional Gaussian function, and a groundwater reactive transport model. The use of the MCMC capability is made straightforward and flexible by adopting the JUPITER API protocol. With the new MCMC capability, UCODE_2014 can be used to calculate three types of uncertainty intervals, which all can account for prior information: (1) linear confidence intervals which require linearity and Gaussian error assumptions and typically 10s–100s of highly parallelizable model runs after optimization, (2) nonlinear confidence intervals which require a smooth objective function surface and Gaussian observation error assumptions and typically 100s–1,000s of partially parallelizable model runs after optimization, and (3) MCMC Bayesian credible intervals which require few assumptions and commonly 10,000s–100,000s or more partially parallelizable model runs. Ready access allows users to select methods best suited to their work, and to compare methods in many circumstances.
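For readers unfamiliar with how credible intervals are obtained from MCMC output, the sketch below shows the general pattern with a plain Metropolis sampler applied to a toy straight-line model; it is not UCODE_2014 or the DREAM algorithm, and all function names, step sizes, and data are illustrative.

```python
# Minimal, generic Metropolis MCMC sketch (not the DREAM sampler used by UCODE_2014):
# draw posterior samples for model parameters, then report Bayesian credible intervals.
import numpy as np

def log_posterior(theta, x, y_obs, sigma=0.1):
    # Gaussian likelihood around a straight-line model y = a*x + b, flat prior (illustrative).
    resid = y_obs - (theta[0] * x + theta[1])
    return -0.5 * np.sum((resid / sigma) ** 2)

def metropolis(x, y_obs, theta0, n_samples=20000, step=0.05, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    logp = log_posterior(theta, x, y_obs)
    chain = np.empty((n_samples, theta.size))
    for i in range(n_samples):
        proposal = theta + step * rng.standard_normal(theta.size)
        logp_new = log_posterior(proposal, x, y_obs)
        if np.log(rng.random()) < logp_new - logp:   # Metropolis accept/reject
            theta, logp = proposal, logp_new
        chain[i] = theta
    return chain

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x - 1.0 + 0.1 * rng.standard_normal(x.size)   # synthetic observations
chain = metropolis(x, y, theta0=[0.0, 0.0])[5000:]       # discard burn-in
lo, hi = np.percentile(chain, [2.5, 97.5], axis=0)       # 95% credible intervals
print("slope interval:", lo[0], hi[0], " intercept interval:", lo[1], hi[1])
```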
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, S. Y.
The Au beam at the RHIC ramp in run 2014 is reviewed together with runs 2011 and 2012. The observed bunch length and longitudinal emittance are compared with IBS simulations. The IBS growth rate of the longitudinal emittance in run 2014 is similar to run 2011, and both are larger than in run 2012. This is explained by the large transverse emittance at high intensity observed in run 2012, but not in run 2014. The big improvement of the AGS ramping in run 2014 might be related to this change. The importance of the injector intensity improvement in run 2014 is emphasized, which gives rise to the initial luminosity improvement of 50% in run 2014 compared with the previous Au-Au run 2011. In addition, a modified IBS model, which is calibrated using the RHIC Au runs from 9.8 GeV/n to 100 GeV/n, is presented and used in the study.
Development of a particle method of characteristics (PMOC) for one-dimensional shock waves
NASA Astrophysics Data System (ADS)
Hwang, Y.-H.
2018-03-01
In the present study, a particle method of characteristics is put forward to simulate the evolution of one-dimensional shock waves in barotropic gaseous, closed-conduit, open-channel, and two-phase flows. All these flow phenomena can be described with the same set of governing equations. The proposed scheme is established based on the characteristic equations and formulated by assigning the computational particles to move along the characteristic curves. Both the right- and left-running characteristics are traced and represented by their associated computational particles. The scheme inherits the computational merits of the conventional method of characteristics (MOC) and the moving particle method, but without their individual deficiencies. In addition, special particles with dual states, derived from enforcement of the Rankine-Hugoniot relation, are deliberately imposed to emulate the shock structure. Numerical tests are carried out by solving some benchmark problems, and the computational results are compared with available analytical solutions. From the derivation procedure and the obtained computational results, it is concluded that the proposed PMOC will be a useful tool to replicate one-dimensional shock waves.
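As a much-simplified illustration of particles carrying information along right- and left-running characteristics, the sketch below treats linear acoustics with a constant sound speed, where each particle family carries one Riemann invariant; this is not the authors' PMOC scheme and omits the shock-capturing dual-state particles entirely.

```python
# Particle-along-characteristics sketch for LINEAR acoustics (constant sound speed c),
# far simpler than the nonlinear barotropic case in the paper: right- and left-running
# particles each carry one Riemann invariant, w+ = p + rho*c*u and w- = p - rho*c*u,
# which stay unchanged along dx/dt = +c and dx/dt = -c respectively.
import numpy as np

rho, c = 1.0, 1.0
x0 = np.linspace(-1.0, 1.0, 401)
p0 = np.exp(-50.0 * x0**2)           # initial pressure pulse
u0 = np.zeros_like(x0)               # fluid initially at rest

w_plus = p0 + rho * c * u0           # invariants carried by the particles
w_minus = p0 - rho * c * u0
t = 0.4
x_plus = x0 + c * t                  # right-running particle positions at time t
x_minus = x0 - c * t                 # left-running particle positions at time t

# Reconstruct (p, u) on a fixed output grid by interpolating each particle family.
x_out = np.linspace(-1.0, 1.0, 201)
wp = np.interp(x_out, x_plus, w_plus)
wm = np.interp(x_out, x_minus, w_minus)
p = 0.5 * (wp + wm)
u = (wp - wm) / (2.0 * rho * c)
print("peak pressure at t=0.4:", p.max())   # two half-amplitude pulses moving apart
```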
NASA Technical Reports Server (NTRS)
Kocher, Joshua E; Gilliam, David P.
2005-01-01
Secure computing is a necessity in the hostile environment that the internet has become. Protection from nefarious individuals and organizations requires a solution that is more a methodology than a one-time fix. One aspect of this methodology is knowing which network ports a computer has open to the world. These network ports are essentially the doorways from the internet into the computer. An assessment method which uses the nmap software to scan ports has been developed to aid System Administrators (SAs) with analysis of open ports on their system(s). Additionally, baselines for several operating systems have been developed so that SAs can compare their open ports to a baseline for a given operating system. Further, the tool is deployed on a website where SAs and Users can request a port scan of their computer. The results are then emailed to the requestor. This tool aids Users, SAs, and security professionals by providing an overall picture of what services are running, what ports are open, potential Trojan programs or backdoors, and what ports can be closed.
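A minimal sketch of this kind of assessment is shown below: it runs nmap over a port range, extracts the open TCP ports, and diffs them against a stored baseline. The host address, baseline file name, and helper functions are hypothetical; only standard nmap options (-p, --open, -oG) are used.

```python
# Minimal sketch: run nmap on a host, collect open TCP ports, and diff them against a
# stored baseline (the host address and baseline file below are hypothetical examples).
import re
import subprocess

def open_ports(host, port_range="1-1024"):
    # -oG - : grepable output to stdout; --open : report only open ports.
    out = subprocess.run(
        ["nmap", "-p", port_range, "--open", "-oG", "-", host],
        capture_output=True, text=True, check=True,
    ).stdout
    return sorted({int(m) for m in re.findall(r"(\d+)/open/tcp", out)})

def compare_to_baseline(ports, baseline_file="baseline_ports.txt"):
    baseline = {int(line) for line in open(baseline_file) if line.strip()}
    return sorted(set(ports) - baseline), sorted(baseline - set(ports))

if __name__ == "__main__":
    ports = open_ports("192.0.2.10")            # documentation/test address
    unexpected, missing = compare_to_baseline(ports)
    print("open ports:", ports)
    print("not in baseline:", unexpected, " expected but closed:", missing)
```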
Navigating the Challenges of the Cloud
ERIC Educational Resources Information Center
Ovadia, Steven
2010-01-01
Cloud computing is increasingly popular in education. Cloud computing is "the delivery of computer services from vast warehouses of shared machines that enables companies and individuals to cut costs by handing over the running of their email, customer databases or accounting software to someone else, and then accessing it over the internet."…
Onboard Flow Sensing For Downwash Detection and Avoidance On Small Quadrotor Helicopters
2015-01-01
…onboard computers, one for flight stabilization and a Linux computer for sensor integration and control calculations. The Linux computer runs Robot…
5 CFR 841.109 - Computation of time.
Code of Federal Regulations, 2011 CFR
2011-01-01
841.109 Administrative Personnel OFFICE OF PERSONNEL MANAGEMENT (CONTINUED) CIVIL SERVICE REGULATIONS... Computation of time. In computing a period of time for filing documents, the day of the action or event after... included unless it is a Saturday, a Sunday, or a legal holiday; in this event, the period runs until the...
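The snippet above is truncated, but the counting convention it describes (the day of the triggering action or event is excluded, and if the last day of the period falls on a Saturday, Sunday, or legal holiday the period runs to the next business day) can be illustrated as below; the holiday set and example dates are placeholders, not an official calendar.

```python
# Illustrative sketch of the counting rule quoted above: exclude the day of the event,
# include the last day unless it falls on a Saturday, Sunday, or legal holiday, in
# which case the period runs to the next business day. The holiday list is a
# placeholder, not an official OPM calendar.
from datetime import date, timedelta

HOLIDAYS = {date(2011, 7, 4), date(2011, 12, 26)}   # example observed federal holidays

def filing_deadline(event_day: date, period_days: int) -> date:
    deadline = event_day + timedelta(days=period_days)      # day of the event not counted
    while deadline.weekday() >= 5 or deadline in HOLIDAYS:   # 5 = Saturday, 6 = Sunday
        deadline += timedelta(days=1)
    return deadline

# A 30-day filing period triggered on Friday, 2011-06-03:
print(filing_deadline(date(2011, 6, 3), 30))   # 2011-07-05 (July 3 is a Sunday, July 4 a holiday)
```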
Implications of Windowing Techniques for CAI.
ERIC Educational Resources Information Center
Heines, Jesse M.; Grinstein, Georges G.
This paper discusses the use of a technique called windowing in computer assisted instruction to allow independent control of functional areas in complex CAI displays and simultaneous display of output from a running computer program and coordinated instructional material. Two obstacles to widespread use of CAI in computer science courses are…
More Colleges Eye outside Companies to Run Their Computer Operations.
ERIC Educational Resources Information Center
DeLoughry, Thomas J.
1993-01-01
Increasingly, budget pressures and rapid technological change are causing colleges to consider "outsourcing" for computer operations management, particularly for administrative purposes. Supporters see the trend as similar to hiring experts for other, ancillary services. Critics fear loss of control of the institution's vital computer systems.…
Representing, Running, and Revising Mental Models: A Computational Model
ERIC Educational Resources Information Center
Friedman, Scott; Forbus, Kenneth; Sherin, Bruce
2018-01-01
People use commonsense science knowledge to flexibly explain, predict, and manipulate the world around them, yet we lack computational models of how this commonsense science knowledge is represented, acquired, utilized, and revised. This is an important challenge for cognitive science: Building higher order computational models in this area will…
Computational Participation: Understanding Coding as an Extension of Literacy Instruction
ERIC Educational Resources Information Center
Burke, Quinn; O'Byrne, W. Ian; Kafai, Yasmin B.
2016-01-01
Understanding the computational concepts on which countless digital applications run offers learners the opportunity to no longer simply read such media but also become more discerning end users and potentially innovative "writers" of new media themselves. To think computationally--to solve problems, to design systems, and to process and…
Mobile Computer-Assisted-Instruction in Rural New Mexico.
ERIC Educational Resources Information Center
Gittinger, Jack D., Jr.
The University of New Mexico's three-year Computer Assisted Instruction Project established one mobile and five permanent laboratories offering remedial and vocational instruction in winter, 1984-85. Each laboratory has a Degem learning system with minicomputer, teacher terminal, and 32 student terminals. A Digital PDP-11 host computer runs the…
Quantum Statistical Mechanics on a Quantum Computer
NASA Astrophysics Data System (ADS)
Raedt, H. D.; Hams, A. H.; Michielsen, K.; Miyashita, S.; Saito, K.
We describe a quantum algorithm to compute the density of states and thermal equilibrium properties of quantum many-body systems. We present results obtained by running this algorithm on a software implementation of a 21-qubit quantum computer for the case of an antiferromagnetic Heisenberg model on triangular lattices of different size.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Busser, Wendy M. H., E-mail: wendy.busser@radboudumc.nl; Arntz, Mark J.; Jenniskens, Sjoerd F. M.
2015-08-15
Purpose: We assessed whether image registration of cone-beam computed tomography (CBCT) and contrast-enhanced CT (CE-CT) images indicating the locations of the adrenal veins can aid in increasing the success rate of first-attempt adrenal vein sampling (AVS) and therefore decreasing patient radiation dose. Materials and Methods: CBCT scans were acquired in the interventional suite (Philips Allura Xper FD20) and rigidly registered to the vertebra in previously acquired CE-CT. Adrenal vein locations were marked on the CT image and superimposed with live fluoroscopy and digital-subtraction angiography (DSA) to guide the AVS. Seventeen first attempts at AVS were performed with image registration and retrospectively compared with 15 first attempts without image registration performed earlier by the same 2 interventional radiologists. First-attempt AVS was considered successful when both adrenal vein samples showed representative cortisol levels. Sampling time, dose-area product (DAP), number of DSA runs, fluoroscopy time, and skin dose were recorded. Results: Without image registration, the first attempt at sampling was successful in 8 of 15 procedures, a success rate of 53.3%. This increased to 76.5% (13 of 17) by adding CBCT and CE-CT image registration to AVS procedures (p = 0.266). DAP values (p = 0.001) and DSA runs (p = 0.026) decreased significantly by adding image registration guidance. Sampling and fluoroscopy times and skin dose showed no significant changes. Conclusion: Guidance based on registration of CBCT and previously acquired diagnostic CE-CT can aid in enhancing localization of the adrenal veins, thereby increasing the success rate of first-attempt AVS with a significant decrease in the number of DSA runs and, consequently, radiation dose required.
Particle Identification on an FPGA Accelerated Compute Platform for the LHCb Upgrade
NASA Astrophysics Data System (ADS)
Färber, Christian; Schwemmer, Rainer; Machen, Jonathan; Neufeld, Niko
2017-07-01
The current LHCb readout system will be upgraded in 2018 to a “triggerless” readout of the entire detector at the Large Hadron Collider collision rate of 40 MHz. The corresponding bandwidth from the detector down to the foreseen dedicated computing farm (event filter farm), which acts as the trigger, has to be increased by a factor of almost 100, from currently 500 Gb/s up to 40 Tb/s. The event filter farm will preanalyze the data and select events on an event-by-event basis. This will reduce the bandwidth down to a manageable size to write the interesting physics data to tape. The design of such a system is a challenging task, which is why different new technologies are considered and have to be investigated for the different parts of the system. For use in the event building farm or in the event filter farm (trigger), an experimental field programmable gate array (FPGA) accelerated computing platform is considered and tested. FPGA compute accelerators are used more and more in standard servers, for example for Microsoft Bing search or Baidu search. The platform we use hosts a general Intel CPU and a high-performance FPGA linked via the high-speed Intel QuickPath Interconnect. An accelerator is implemented on the FPGA. It is very likely that these platforms, which are built, in general, for high-performance computing, are also very interesting for the high-energy physics community. First, the performance results of smaller test cases performed at the beginning are presented. Afterward, a part of the existing LHCb RICH particle identification is ported to the experimental FPGA accelerated platform and tested. We have compared the performance of the LHCb RICH particle identification running on a normal CPU with the performance of the same algorithm running on the Xeon-FPGA compute accelerator platform.
A performance comparison of the Cray-2 and the Cray X-MP
NASA Technical Reports Server (NTRS)
Schmickley, Ronald; Bailey, David H.
1986-01-01
A suite of thirteen large Fortran benchmark codes was run on Cray-2 and Cray X-MP supercomputers. These codes were a mix of compute-intensive scientific application programs (mostly Computational Fluid Dynamics) and some special vectorized computation exercise programs. For the general class of programs tested on the Cray-2, most of which were not specially tuned for speed, the floating point operation rates varied under a variety of system load configurations from 40 percent up to 125 percent of X-MP performance rates. It is concluded that the Cray-2, in the original system configuration studied (without memory pseudo-banking), will run untuned Fortran code, on average, at about 70 percent of X-MP speeds.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Amerio, S.; Behari, S.; Boyd, J.
The Fermilab Tevatron collider's data-taking run ended in September 2011, yielding a dataset with rich scientific potential. The CDF and D0 experiments each have approximately 9 PB of collider and simulated data stored on tape. A large computing infrastructure consisting of tape storage, disk cache, and distributed grid computing for physics analysis with the Tevatron data is present at Fermilab. The Fermilab Run II data preservation project intends to keep this analysis capability sustained through the year 2020 and beyond. To achieve this goal, we have implemented a system that utilizes virtualization, automated validation, and migration to new standards in both software and data storage technology and leverages resources available from currently-running experiments at Fermilab. These efforts have also provided useful lessons in ensuring long-term data access for numerous experiments, and enable high-quality scientific output for years to come.
Visualization and Tracking of Parallel CFD Simulations
NASA Technical Reports Server (NTRS)
Vaziri, Arsi; Kremenetsky, Mark
1995-01-01
We describe a system for interactive visualization and tracking of a 3-D unsteady computational fluid dynamics (CFD) simulation on a parallel computer. CM/AVS, a distributed, parallel implementation of a visualization environment (AVS), runs on the CM-5 parallel supercomputer. A CFD solver is run as a CM/AVS module on the CM-5. Data communication between the solver, other parallel visualization modules, and a graphics workstation, which is running AVS, is handled by CM/AVS. Partitioning of the visualization task between the CM-5 and the workstation can be done interactively in the visual programming environment provided by AVS. Flow solver parameters can also be altered by programmable interactive widgets. This system partially removes the requirement of storing large solution files at frequent time steps, a characteristic of the traditional 'simulate → store → visualize' post-processing approach.
Data preservation at the Fermilab Tevatron
NASA Astrophysics Data System (ADS)
Amerio, S.; Behari, S.; Boyd, J.; Brochmann, M.; Culbertson, R.; Diesburg, M.; Freeman, J.; Garren, L.; Greenlee, H.; Herner, K.; Illingworth, R.; Jayatilaka, B.; Jonckheere, A.; Li, Q.; Naymola, S.; Oleynik, G.; Sakumoto, W.; Varnes, E.; Vellidis, C.; Watts, G.; White, S.
2017-04-01
The Fermilab Tevatron collider's data-taking run ended in September 2011, yielding a dataset with rich scientific potential. The CDF and D0 experiments each have approximately 9 PB of collider and simulated data stored on tape. A large computing infrastructure consisting of tape storage, disk cache, and distributed grid computing for physics analysis with the Tevatron data is present at Fermilab. The Fermilab Run II data preservation project intends to keep this analysis capability sustained through the year 2020 and beyond. To achieve this goal, we have implemented a system that utilizes virtualization, automated validation, and migration to new standards in both software and data storage technology and leverages resources available from currently-running experiments at Fermilab. These efforts have also provided useful lessons in ensuring long-term data access for numerous experiments, and enable high-quality scientific output for years to come.
CMSA: a heterogeneous CPU/GPU computing system for multiple similar RNA/DNA sequence alignment.
Chen, Xi; Wang, Chen; Tang, Shanjiang; Yu, Ce; Zou, Quan
2017-06-24
The multiple sequence alignment (MSA) is a classic and powerful technique for sequence analysis in bioinformatics. With the rapid growth of biological datasets, MSA parallelization becomes necessary to keep its running time at an acceptable level. Although there is a lot of work on MSA problems, existing approaches are either insufficient or contain implicit assumptions that limit the generality of their usage. First, the information about users' sequences, including the sizes of datasets and the lengths of sequences, can take arbitrary values and is generally unknown before submission, a fact that previous work unfortunately ignores. Second, the center star strategy is suited for aligning similar sequences, but its first stage, center sequence selection, is highly time-consuming and requires further optimization. Moreover, given the heterogeneous CPU/GPU platform, prior studies consider MSA parallelization on GPU devices only, leaving the CPUs idle during the computation. Co-run computation, however, can maximize the utilization of the computing resources by enabling workload computation on both CPU and GPU simultaneously. This paper presents CMSA, a robust and efficient MSA system for large-scale datasets on the heterogeneous CPU/GPU platform. It performs and optimizes multiple sequence alignment automatically for users' submitted sequences without any assumptions. CMSA adopts the co-run computation model so that both CPU and GPU devices are fully utilized. Moreover, CMSA proposes an improved center star strategy that reduces the time complexity of its center sequence selection process from O(mn²) to O(mn). The experimental results show that CMSA achieves an up to 11× speedup and outperforms the state-of-the-art software. CMSA focuses on the multiple similar RNA/DNA sequence alignment and proposes a novel bitmap-based algorithm to improve the center star strategy. We can conclude that harvesting the high performance of modern GPUs is a promising approach to accelerate multiple sequence alignment. Besides, adopting the co-run computation model can maximize the entire system utilization significantly. The source code is available at https://github.com/wangvsa/CMSA .
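To make the cost of the center-star selection stage concrete, the sketch below shows the classic approach CMSA improves upon: score every pair of sequences and pick the one with the smallest total distance to all others. The k-mer Jaccard distance here is a stand-in for the paper's similarity measure, and this is not CMSA's bitmap algorithm.

```python
# Sketch of the classic center-star selection step that CMSA optimizes: choose the
# sequence with the smallest total distance to all others as the alignment center.
# A simple k-mer set (Jaccard) distance stands in for the paper's measure; this is
# not CMSA's bitmap-based algorithm.
from itertools import combinations

def kmer_set(seq, k=4):
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def kmer_distance(a, b, k=4):
    sa, sb = kmer_set(a, k), kmer_set(b, k)
    return 1.0 - len(sa & sb) / max(1, len(sa | sb))   # Jaccard distance

def pick_center(seqs):
    totals = [0.0] * len(seqs)
    for i, j in combinations(range(len(seqs)), 2):     # all pairs -> quadratic in the number of sequences
        d = kmer_distance(seqs[i], seqs[j])
        totals[i] += d
        totals[j] += d
    return min(range(len(seqs)), key=totals.__getitem__)

seqs = ["ACGTACGTGGA", "ACGTACGTGTA", "ACGTTCGTGGA", "TTTTACGTGGA"]
print("center sequence index:", pick_center(seqs))
```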
Thongchote, Kanogwun; Svasti, Saovaros; Teerapornpuntakit, Jarinthorn; Krishnamra, Nateetip; Charoenphandhu, Narattaphol
2014-06-15
A marked decrease in β-globin production led to β-thalassemia, a hereditary anemic disease associated with bone marrow expansion, bone erosion, and osteoporosis. Herein, we aimed to investigate changes in bone mineral density (BMD) and trabecular microstructure in hemizygous β-globin knockout thalassemic (BKO) mice and to determine whether endurance running (60 min/day, 5 days/wk for 12 wk in running wheels) could effectively alleviate bone loss in BKO mice. Both male and female BKO mice (1-2 mo old) showed growth retardation as indicated by smaller body weight and femoral length than their wild-type littermates. A decrease in BMD was more severe in female than in male BKO mice. Bone histomorphometry revealed that BKO mice had decreases in trabecular bone volume, trabecular number, and trabecular thickness, presumably due to suppression of osteoblast-mediated bone formation and activation of osteoclast-mediated bone resorption, the latter of which was consistent with elevated serum levels of osteoclastogenic cytokines IL-1α and -1β. As determined by peripheral quantitative computed tomography, running increased cortical density and thickness in the femoral and tibial diaphyses of BKO mice compared with those of sedentary BKO mice. Several histomorphometric parameters suggested an enhancement of bone formation (e.g., increased mineral apposition rate) and suppression of bone resorption (e.g., decreased osteoclast surface), which led to increases in trabecular bone volume and trabecular thickness in running BKO mice. In conclusion, BKO mice exhibited pervasive osteopenia and impaired bone microstructure, whereas running exercise appeared to be an effective intervention in alleviating bone microstructural defect in β-thalassemia. Copyright © 2014 the American Physiological Society.
Predictive simulation of gait at low gravity reveals skipping as the preferred locomotion strategy
Ackermann, Marko; van den Bogert, Antonie J.
2012-01-01
The investigation of gait strategies at low gravity environments gained momentum recently as manned missions to the Moon and to Mars are reconsidered. Although reports by astronauts of the Apollo missions indicate alternative gait strategies might be favored on the Moon, computational simulations and experimental investigations have been almost exclusively limited to the study of either walking or running, the locomotion modes preferred under Earth's gravity. In order to investigate the gait strategies likely to be favored at low gravity a series of predictive, computational simulations of gait are performed using a physiological model of the musculoskeletal system, without assuming any particular type of gait. A computationally efficient optimization strategy is utilized allowing for multiple simulations. The results reveal skipping as more efficient and less fatiguing than walking or running and suggest the existence of a walk-skip rather than a walk-run transition at low gravity. The results are expected to serve as a background to the design of experimental investigations of gait under simulated low gravity. PMID:22365845
Predictive simulation of gait at low gravity reveals skipping as the preferred locomotion strategy.
Ackermann, Marko; van den Bogert, Antonie J
2012-04-30
The investigation of gait strategies at low gravity environments gained momentum recently as manned missions to the Moon and to Mars are reconsidered. Although reports by astronauts of the Apollo missions indicate alternative gait strategies might be favored on the Moon, computational simulations and experimental investigations have been almost exclusively limited to the study of either walking or running, the locomotion modes preferred under Earth's gravity. In order to investigate the gait strategies likely to be favored at low gravity a series of predictive, computational simulations of gait are performed using a physiological model of the musculoskeletal system, without assuming any particular type of gait. A computationally efficient optimization strategy is utilized allowing for multiple simulations. The results reveal skipping as more efficient and less fatiguing than walking or running and suggest the existence of a walk-skip rather than a walk-run transition at low gravity. The results are expected to serve as a background to the design of experimental investigations of gait under simulated low gravity. Copyright © 2012 Elsevier Ltd. All rights reserved.
Simulation of a master-slave event set processor
DOE Office of Scientific and Technical Information (OSTI.GOV)
Comfort, J.C.
1984-03-01
Event set manipulation may consume a considerable amount of the computation time spent in performing a discrete-event simulation. One way of minimizing this time is to allow event set processing to proceed in parallel with the remainder of the simulation computation. The paper describes a multiprocessor simulation computer in which all non-event-set processing is performed by the principal processor (called the host). Event set processing is coordinated by a front-end processor (the master) and actually performed by several other functionally identical processors (the slaves). A trace-driven simulation program modeling this system was constructed and run with trace output taken from two different simulation programs. Output from this simulation suggests that a significant reduction in run time may be realized by this approach. Sensitivity analysis was performed on the significant parameters of the system (number of slave processors, relative processor speeds, and interprocessor communication times). A comparison between actual and simulation run times for a one-processor system was used to assist in the validation of the simulation. 7 references.
NASA Astrophysics Data System (ADS)
Calafiura, Paolo; Leggett, Charles; Seuster, Rolf; Tsulaia, Vakhtang; Van Gemmeren, Peter
2015-12-01
AthenaMP is a multi-process version of the ATLAS reconstruction, simulation and data analysis framework Athena. By leveraging Linux fork and copy-on-write mechanisms, it allows for sharing of memory pages between event processors running on the same compute node with little to no change in the application code. Originally targeted to optimize the memory footprint of reconstruction jobs, AthenaMP has demonstrated that it can reduce the memory usage of certain configurations of ATLAS production jobs by a factor of 2. AthenaMP has also evolved to become the parallel event-processing core of the recently developed ATLAS infrastructure for fine-grained event processing (Event Service) which allows the running of AthenaMP inside massively parallel distributed applications on hundreds of compute nodes simultaneously. We present the architecture of AthenaMP, various strategies implemented by AthenaMP for scheduling workload to worker processes (for example: Shared Event Queue and Shared Distributor of Event Tokens) and the usage of AthenaMP in the diversity of ATLAS event processing workloads on various computing resources: Grid, opportunistic resources and HPC.
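The fork/copy-on-write plus shared-event-queue pattern described above can be sketched generically as below; this is illustrative Python multiprocessing code, not ATLAS or AthenaMP code, and the data array, event count, and worker count are arbitrary placeholders.

```python
# Minimal sketch of the fork + copy-on-write + shared event queue pattern described
# above (illustrative only, not ATLAS/AthenaMP code).
import multiprocessing as mp
import os

import numpy as np

# "Conditions" data loaded once in the parent; forked children inherit these pages
# copy-on-write, so physical memory is shared as long as nobody writes to them.
CONDITIONS = np.random.default_rng(0).standard_normal(5_000_000)

def worker(event_queue, result_queue):
    while True:
        event_id = event_queue.get()
        if event_id is None:                     # sentinel: no more events to process
            break
        value = float(CONDITIONS[event_id % CONDITIONS.size])   # read-only access
        result_queue.put((os.getpid(), event_id, value))

if __name__ == "__main__":
    ctx = mp.get_context("fork")                 # fork start method (Linux)
    event_queue, result_queue = ctx.Queue(), ctx.Queue()
    workers = [ctx.Process(target=worker, args=(event_queue, result_queue)) for _ in range(4)]
    for w in workers:
        w.start()
    for event_id in range(100):                  # one shared event queue feeds all workers
        event_queue.put(event_id)
    for _ in workers:
        event_queue.put(None)
    results = [result_queue.get() for _ in range(100)]
    for w in workers:
        w.join()
    print("processed", len(results), "events on",
          len({pid for pid, _, _ in results}), "worker processes")
```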
NASA Astrophysics Data System (ADS)
Chaney, N.; Wood, E. F.
2014-12-01
The increasing accessibility of high-resolution land data (< 100 m) and high performance computing allow improved parameterizations of subgrid hydrologic processes in macroscale land surface models. Continental scale fully distributed modeling at these spatial scales is possible; however, its practicality for operational use is still unknown due to uncertainties in input data, model parameters, and storage requirements. To address these concerns, we propose a modeling framework that provides the spatial detail of a fully distributed model yet maintains the benefits of a semi-distributed model. In this presentation we will introduce DTOPLATS-MP, a coupling between the NOAH-MP land surface model and the Dynamic TOPMODEL hydrologic model. This new model captures a catchment's spatial heterogeneity by clustering high-resolution land datasets (soil, topography, and land cover) into hundreds of hydrologic similar units (HSUs). A prior DEM analysis defines the connections between HSUs. At each time step, the 1D land surface model updates each HSU; the HSUs then interact laterally via the subsurface and surface. When compared to the fully distributed form of the model, this framework allows a significant decrease in computation and storage while providing most of the same information and enabling parameter transferability. As a proof of concept, we will show how this new modeling framework can be run over CONUS at a 30-meter spatial resolution. For each catchment in the WBD HUC-12 dataset, the model is run between 2002 and 2012 using available high-resolution continental scale land and meteorological datasets over CONUS (dSSURGO, NLCD, NED, and NCEP Stage IV). For each catchment, the model is run with 1000 model parameter sets obtained from a Latin hypercube sample. This exercise will illustrate the feasibility of running the model operationally at continental scales while accounting for model parameter uncertainty.
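A minimal sketch of the clustering step behind HSUs is shown below: grid cells are grouped by similarity of their land attributes using k-means. The attribute fields, their synthetic distributions, and the number of clusters are illustrative placeholders, not the datasets or method settings used in DTOPLATS-MP.

```python
# Sketch of the clustering idea behind HSUs: group high-resolution grid cells with
# similar land attributes (here: random stand-ins for elevation, slope, soil depth,
# land-cover class) into a few hundred units that the 1D land surface model updates
# once each, instead of updating every cell.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_cells = 100_000                                    # 30 m cells in one catchment (illustrative)
attributes = np.column_stack([
    rng.normal(300.0, 50.0, n_cells),                # elevation [m]
    rng.gamma(2.0, 2.0, n_cells),                    # slope [%]
    rng.normal(1.2, 0.3, n_cells),                   # soil depth [m]
    rng.integers(0, 5, n_cells),                     # land-cover class id
])

X = StandardScaler().fit_transform(attributes)       # put attributes on a common scale
hsu_id = KMeans(n_clusters=200, n_init=4, random_state=0).fit_predict(X)
counts = np.bincount(hsu_id)
print("cells per HSU (min/mean/max):", counts.min(), int(counts.mean()), counts.max())
```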
Accelerating cardiac bidomain simulations using graphics processing units.
Neic, A; Liebmann, M; Hoetzl, E; Mitchell, L; Vigmond, E J; Haase, G; Plank, G
2012-08-01
Anatomically realistic and biophysically detailed multiscale computer models of the heart are playing an increasingly important role in advancing our understanding of integrated cardiac function in health and disease. Such detailed simulations, however, are computationally vastly demanding, which is a limiting factor for a wider adoption of in-silico modeling. While current trends in high-performance computing (HPC) hardware promise to alleviate this problem, exploiting the potential of such architectures remains challenging since strongly scalable algorithms are necessitated to reduce execution times. Alternatively, acceleration technologies such as graphics processing units (GPUs) are being considered. While the potential of GPUs has been demonstrated in various applications, benefits in the context of bidomain simulations where large sparse linear systems have to be solved in parallel with advanced numerical techniques are less clear. In this study, the feasibility of multi-GPU bidomain simulations is demonstrated by running strong scalability benchmarks using a state-of-the-art model of rabbit ventricles. The model is spatially discretized using the finite element methods (FEM) on fully unstructured grids. The GPU code is directly derived from a large pre-existing code, the Cardiac Arrhythmia Research Package (CARP), with very minor perturbation of the code base. Overall, bidomain simulations were sped up by a factor of 11.8 to 16.3 in benchmarks running on 6-20 GPUs compared to the same number of CPU cores. To match the fastest GPU simulation which engaged 20 GPUs, 476 CPU cores were required on a national supercomputing facility.
Toward Interactive Scenario Analysis and Exploration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gayle, Thomas R.; Summers, Kenneth Lee; Jungels, John
2015-01-01
As Modeling and Simulation (M&S) tools have matured, their applicability and importance have increased across many national security challenges. In particular, they provide a way to test how something may behave without the need for real-world testing. However, current and future changes across several factors including capabilities, policy, and funding are driving a need for rapid response or evaluation in ways that many M&S tools cannot address. Issues around large data, computational requirements, delivery mechanisms, and analyst involvement already exist and pose significant challenges. Furthermore, rising expectations, rising input complexity, and increasing depth of analysis will only increase the difficulty of these challenges. In this study we examine whether innovations in M&S software coupled with advances in "cloud" computing and "big-data" methodologies can overcome many of these challenges. In particular, we propose a simple, horizontally-scalable distributed computing environment that could provide the foundation (i.e. "cloud") for next-generation M&S-based applications based on the notion of "parallel multi-simulation". In our context, the goal of parallel multi-simulation is to consider as many simultaneous paths of execution as possible. Therefore, with sufficient resources, the complexity is dominated by the cost of single scenario runs as opposed to the number of runs required. We show the feasibility of this architecture through a stable prototype implementation coupled with the Umbra Simulation Framework [6]. Finally, we highlight the utility through multiple novel analysis tools and by showing the performance improvement compared to existing tools.
Accelerating Cardiac Bidomain Simulations Using Graphics Processing Units
Neic, Aurel; Liebmann, Manfred; Hoetzl, Elena; Mitchell, Lawrence; Vigmond, Edward J.; Haase, Gundolf
2013-01-01
Anatomically realistic and biophysically detailed multiscale computer models of the heart are playing an increasingly important role in advancing our understanding of integrated cardiac function in health and disease. Such detailed simulations, however, are computationally vastly demanding, which is a limiting factor for a wider adoption of in-silico modeling. While current trends in high-performance computing (HPC) hardware promise to alleviate this problem, exploiting the potential of such architectures remains challenging since strongly scalable algorithms are necessitated to reduce execution times. Alternatively, acceleration technologies such as graphics processing units (GPUs) are being considered. While the potential of GPUs has been demonstrated in various applications, benefits in the context of bidomain simulations where large sparse linear systems have to be solved in parallel with advanced numerical techniques are less clear. In this study, the feasibility of multi-GPU bidomain simulations is demonstrated by running strong scalability benchmarks using a state-of-the-art model of rabbit ventricles. The model is spatially discretized using the finite element methods (FEM) on fully unstructured grids. The GPU code is directly derived from a large pre-existing code, the Cardiac Arrhythmia Research Package (CARP), with very minor perturbation of the code base. Overall, bidomain simulations were sped up by a factor of 11.8 to 16.3 in benchmarks running on 6–20 GPUs compared to the same number of CPU cores. To match the fastest GPU simulation which engaged 20GPUs, 476 CPU cores were required on a national supercomputing facility. PMID:22692867
Two-Dimensional High-Lift Aerodynamic Optimization Using Neural Networks
NASA Technical Reports Server (NTRS)
Greenman, Roxana M.
1998-01-01
The high-lift performance of a multi-element airfoil was optimized by using neural-net predictions that were trained using a computational data set. The numerical data was generated using a two-dimensional, incompressible, Navier-Stokes algorithm with the Spalart-Allmaras turbulence model. Because it is difficult to predict maximum lift for high-lift systems, an empirically-based maximum lift criteria was used in this study to determine both the maximum lift and the angle at which it occurs. The 'pressure difference rule,' which states that the maximum lift condition corresponds to a certain pressure difference between the peak suction pressure and the pressure at the trailing edge of the element, was applied and verified with experimental observations for this configuration. Multiple input, single output networks were trained using the NASA Ames variation of the Levenberg-Marquardt algorithm for each of the aerodynamic coefficients (lift, drag and moment). The artificial neural networks were integrated with a gradient-based optimizer. Using independent numerical simulations and experimental data for this high-lift configuration, it was shown that this design process successfully optimized flap deflection, gap, overlap, and angle of attack to maximize lift. Once the neural nets were trained and integrated with the optimizer, minimal additional computer resources were required to perform optimization runs with different initial conditions and parameters. Applying the neural networks within the high-lift rigging optimization process reduced the amount of computational time and resources by 44% compared with traditional gradient-based optimization procedures for multiple optimization runs.
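The surrogate-plus-optimizer pattern described above can be sketched as follows: fit a small neural network to samples of (flap deflection, gap, overlap, angle of attack) versus lift, then drive a gradient-based optimizer on the surrogate. The "lift" function, parameter bounds, and network settings below are synthetic placeholders, not the paper's CFD data, network variant, or optimizer.

```python
# Sketch of the surrogate-plus-optimizer pattern: fit a small neural network to
# (flap deflection, gap, overlap, angle of attack) -> lift samples, then maximize the
# surrogate with a gradient-based method. The "lift" below is a synthetic stand-in.
import numpy as np
from scipy.optimize import minimize
from sklearn.neural_network import MLPRegressor

def fake_lift(x):
    # Smooth synthetic response with an interior optimum (illustrative only).
    d, g, o, a = x.T if x.ndim > 1 else x
    return 2.5 - (d - 30)**2 / 400 - (g - 0.02)**2 * 2e3 - (o - 0.01)**2 * 1e3 - (a - 8)**2 / 100

rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(10, 40, 500),       # flap deflection [deg]
                     rng.uniform(0.0, 0.05, 500),    # gap [fraction of chord]
                     rng.uniform(-0.02, 0.04, 500),  # overlap [fraction of chord]
                     rng.uniform(0, 16, 500)])       # angle of attack [deg]
y = fake_lift(X)

net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0).fit(X, y)

# Gradient-based search on the surrogate (finite-difference gradients via L-BFGS-B);
# evaluating the net is far cheaper than rerunning the flow solver.
res = minimize(lambda x: -net.predict(x.reshape(1, -1))[0],
               x0=np.array([20.0, 0.03, 0.0, 4.0]),
               bounds=[(10, 40), (0.0, 0.05), (-0.02, 0.04), (0, 16)],
               method="L-BFGS-B")
print("optimized rigging:", res.x, " predicted lift:", -res.fun)
```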
Wada, Yoshiro; Nishiike, Suetaka; Kitahara, Tadashi; Yamanaka, Toshiaki; Imai, Takao; Ito, Taeko; Sato, Go; Matsuda, Kazunori; Kitamura, Yoshiaki; Takeda, Noriaki
2016-11-01
After repeated snowboard exercises in the virtual reality (VR) world with increasing time lags in trials 3-8, it is suggested that adaptation to repeated visual-vestibulosomatosensory conflict in the VR world improved dynamic posture control and motor performance in the real world without the development of motion sickness. VR technology was used to examine the effects of repeated snowboard exercise in the VR world, with time lags between the visual scene and body rotation, on head stability and slalom run performance during exercise in healthy subjects. Forty-two healthy young subjects participated in the study. After trials 1 and 2 of snowboard exercise in the VR world without time lag, trials 3-8 were conducted with 0.1, 0.2, 0.3, 0.4, 0.5, and 0.6 s time lags, respectively, of the computer-generated visual scene behind board rotation. Finally, trial 9 was conducted without time lag. Head linear accelerations and subjective slalom run performance were evaluated. The standard deviations of head linear accelerations in the inter-aural direction were significantly increased in trial 8, with a time lag of 0.6 s, but significantly decreased in trial 9 without a time lag, compared with those in trial 2 without a time lag. The subjective scores of slalom run performance were significantly decreased in trial 8, with a time lag of 0.6 s, but significantly increased in trial 9 without a time lag, compared with those in trial 2 without a time lag. Motion sickness was not induced in any subjects.
RELAP5-3D Resolution of Known Restart/Backup Issues
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mesina, George L.; Anderson, Nolan A.
2014-12-01
The state-of-the-art nuclear reactor system safety analysis computer program developed at the Idaho National Laboratory (INL), RELAP5-3D, continues to adapt to changes in computer hardware and software and to develop to meet the ever-expanding needs of the nuclear industry. To continue at the forefront, code testing must evolve with both code and industry developments, and it must work correctly. To best ensure this, the processes of Software Verification and Validation (V&V) are applied. Verification compares coding against its documented algorithms and equations and compares its calculations against analytical solutions and the method of manufactured solutions. A form of this, sequential verification, checks code specifications against coding only when originally written, then applies regression testing, which compares code calculations between consecutive updates or versions on a set of test cases to check that the performance does not change. A sequential verification testing system was specially constructed for RELAP5-3D to both detect errors with extreme accuracy and cover all nuclear-plant-relevant code features. Detection is provided through a “verification file” that records double precision sums of key variables. Coverage is provided by a test suite of input decks that exercise code features and capabilities necessary to model a nuclear power plant. A matrix of test features and short-running cases that exercise them is presented. This testing system is used to test base cases (called null testing) as well as restart and backup cases. It can test RELAP5-3D performance in both standalone and coupled (through PVM to other codes) runs. Application of verification testing revealed numerous restart and backup issues in both standalone and coupled modes. This document reports the resolution of these issues.
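A generic sketch of the verification-file idea is shown below: store double precision sums of key solution arrays for each test case and compare a later run against the stored reference. The variable names, tolerance, and file format are illustrative; this is not RELAP5-3D code.

```python
# Sketch of the "verification file" idea described above (illustrative, not RELAP5-3D
# code): record double-precision sums of key solution arrays for each test case and
# compare them against stored reference values during regression testing.
import json
import numpy as np

def verification_record(solution: dict) -> dict:
    # One checksum per key variable; sums are sensitive even to tiny numerical drift.
    return {name: float(np.sum(np.asarray(values, dtype=np.float64)))
            for name, values in solution.items()}

def check_against_reference(record: dict, reference_file: str, rtol=1e-12) -> list:
    with open(reference_file) as f:
        reference = json.load(f)
    return [name for name, value in record.items()
            if not np.isclose(value, reference[name], rtol=rtol, atol=0.0)]

# Example: store a reference for one test case, then verify a later run against it.
solution = {"pressure": np.full(1000, 1.0e5), "void_fraction": np.linspace(0.0, 0.5, 1000)}
with open("testcase_ref.json", "w") as f:
    json.dump(verification_record(solution), f)
failed = check_against_reference(verification_record(solution), "testcase_ref.json")
print("variables that drifted:", failed)   # empty list -> regression test passes
```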
A WPS Based Architecture for Climate Data Analytic Services (CDAS) at NASA
NASA Astrophysics Data System (ADS)
Maxwell, T. P.; McInerney, M.; Duffy, D.; Carriere, L.; Potter, G. L.; Doutriaux, C.
2015-12-01
Faced with unprecedented growth in the Big Data domain of climate science, NASA has developed the Climate Data Analytic Services (CDAS) framework. This framework enables scientists to execute trusted and tested analysis operations in a high performance environment close to the massive data stores at NASA. The data is accessed in standard (NetCDF, HDF, etc.) formats in a POSIX file system and processed using trusted climate data analysis tools (ESMF, CDAT, NCO, etc.). The framework is structured as a set of interacting modules allowing maximal flexibility in deployment choices. The current set of module managers includes the Staging Manager, which runs the computation locally on the WPS server or remotely using tools such as celery or SLURM; the Compute Engine Manager, which runs the computation serially or distributed over nodes using a parallelization framework such as celery or spark; the Decomposition Manager, which manages strategies for distributing the data over nodes; the Data Manager, which handles the import of domain data from long term storage and manages the in-memory and disk-based caching architectures; and the Kernel Manager, where a kernel is an encapsulated computational unit which executes a processor's compute task. Each kernel is implemented in python exploiting existing analysis packages (e.g. CDAT) and is compatible with all CDAS compute engines and decompositions. CDAS services are accessed via a WPS API being developed in collaboration with the ESGF Compute Working Team to support server-side analytics for ESGF. The API can be executed using either direct web service calls, a python script or application, or a javascript-based web application. Client packages in python or javascript contain everything needed to make CDAS requests. The CDAS architecture brings together the tools, data storage, and high-performance computing required for timely analysis of large-scale data sets, where the data resides, to ultimately produce societal benefits. It is currently deployed at NASA in support of the Collaborative REAnalysis Technical Environment (CREATE) project, which centralizes numerous global reanalysis datasets onto a single advanced data analytics platform. This service permits decision makers to investigate climate changes around the globe, inspect model trends, compare multiple reanalysis datasets, and examine their variability.
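Since CDAS is accessed through a WPS API callable via direct web service calls, a generic OGC WPS Execute request conveys the flavor of such a call. The endpoint URL, operation identifier, and data inputs below are assumptions for illustration only; the abstract does not publish the concrete CDAS service addresses or kernel names.

```python
import requests

# Hypothetical endpoint and operation identifier; the abstract only states that
# CDAS exposes a WPS API, not the concrete URL or kernel catalogue.
CDAS_WPS_URL = "https://example.nasa.gov/cdas/wps"

params = {
    "service": "WPS",
    "request": "Execute",
    "version": "1.0.0",
    "identifier": "timeSeriesAverage",                                # assumed kernel name
    "datainputs": "variable=tas;domain=global;dataset=reanalysis-1",  # assumed inputs
}

response = requests.get(CDAS_WPS_URL, params=params, timeout=60)
response.raise_for_status()
print(response.text)  # WPS responses are typically XML status/result documents
```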
1001 Ways to run AutoDock Vina for virtual screening
NASA Astrophysics Data System (ADS)
Jaghoori, Mohammad Mahdi; Bleijlevens, Boris; Olabarriaga, Silvia D.
2016-03-01
Large-scale computing technologies have enabled high-throughput virtual screening involving thousands to millions of drug candidates. It is not trivial, however, for biochemical scientists to evaluate the technical alternatives and their implications for running such large experiments. Besides experience with the molecular docking tool itself, the scientist needs to learn how to run it on high-performance computing (HPC) infrastructures, and understand the impact of the choices made. Here, we review such considerations for a specific tool, AutoDock Vina, and use experimental data to illustrate the following points: (1) an additional level of parallelization increases virtual screening throughput on a multi-core machine; (2) capturing of the random seed is not enough (though necessary) for reproducibility on heterogeneous distributed computing systems; (3) the overall time spent on the screening of a ligand library can be improved by analysis of factors affecting execution time per ligand, including number of active torsions, heavy atoms and exhaustiveness. We also illustrate differences among four common HPC infrastructures: grid, Hadoop, small cluster and multi-core (virtual machine on the cloud). Our analysis shows that these platforms are suitable for screening experiments of different sizes. These considerations can guide scientists when choosing the best computing platform and set-up for their future large virtual screening experiments.
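Point (1) above, adding a level of parallelization on a multi-core machine, amounts to running many single-core docking processes concurrently rather than one multi-threaded run, while point (2) argues for recording the random seed explicitly. The sketch below illustrates both with standard AutoDock Vina command-line options; the file names, worker count, and exhaustiveness value are illustrative, and the `vina` executable is assumed to be on the PATH.

```python
import subprocess
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

RECEPTOR = "receptor.pdbqt"    # assumed file names, for illustration only
LIGAND_DIR = Path("ligands")
OUT_DIR = Path("docked")
SEED = 42                      # fixed and recorded seed, per point (2)

def dock_one(ligand: Path) -> int:
    """Dock a single ligand with one Vina process pinned to one CPU."""
    out = OUT_DIR / ligand.name
    cmd = [
        "vina",
        "--receptor", RECEPTOR,
        "--ligand", str(ligand),
        "--out", str(out),
        "--cpu", "1",               # one core per process ...
        "--seed", str(SEED),        # ... with an explicitly captured seed
        "--exhaustiveness", "8",
    ]
    return subprocess.run(cmd, check=True).returncode

if __name__ == "__main__":
    OUT_DIR.mkdir(exist_ok=True)
    ligands = sorted(LIGAND_DIR.glob("*.pdbqt"))
    # The extra level of parallelization: many single-core Vina processes
    # running concurrently instead of one multi-threaded Vina run.
    with ProcessPoolExecutor(max_workers=8) as pool:
        list(pool.map(dock_one, ligands))
```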
Supersonic Retro-Propulsion Experimental Design for Computational Fluid Dynamics Model Validation
NASA Technical Reports Server (NTRS)
Berry, Scott A.; Laws, Christopher T.; Kleb, W. L.; Rhode, Matthew N.; Spells, Courtney; McCrea, Andrew C.; Truble, Kerry A.; Schauerhamer, Daniel G.; Oberkampf, William L.
2011-01-01
The development of supersonic retro-propulsion, an enabling technology for heavy payload exploration missions to Mars, is the primary focus of the present paper. A new experimental model, intended to provide computational fluid dynamics model validation data, was recently designed for the Langley Research Center Unitary Plan Wind Tunnel Test Section 2. Pre-test computations were instrumental for sizing and refining the model, over the Mach number range of 2.4 to 4.6, such that tunnel blockage and internal flow separation issues would be minimized. A 5-in diameter 70-deg sphere-cone forebody, which accommodates up to four 4:1 area ratio nozzles, followed by a 10-in long cylindrical aftbody was developed for this study based on the computational results. The model was designed to allow for a large number of surface pressure measurements on the forebody and aftbody. Supplemental data included high-speed Schlieren video and internal pressures and temperatures. The run matrix was developed to allow for the quantification of various sources of experimental uncertainty, such as random errors due to run-to-run variations and bias errors due to flow field or model misalignments. Some preliminary results and observations from the test are presented, although detailed analyses of the data and uncertainties are still ongoing.
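The run matrix quantifies random error through repeat runs at the same nominal condition. As a minimal sketch, assuming a hypothetical set of repeated pressure-coefficient readings at one tap and a 2-sigma coverage factor (neither of which is taken from the test team's actual analysis), the run-to-run random uncertainty on the mean could be estimated as follows.

```python
import numpy as np

def run_to_run_uncertainty(repeat_runs):
    """Estimate random (run-to-run) uncertainty for one pressure tap.

    `repeat_runs` is assumed to hold the pressure coefficient measured at the
    same tap and nominal condition across repeated runs; this layout is
    illustrative, not the actual test data format.
    """
    cp = np.asarray(repeat_runs, dtype=float)
    n = cp.size
    std = cp.std(ddof=1)                        # sample standard deviation across runs
    return {
        "mean_cp": cp.mean(),
        "std_cp": std,
        "u_random_95": 2.0 * std / np.sqrt(n),  # ~95% confidence half-width on the mean
    }

print(run_to_run_uncertainty([0.512, 0.508, 0.515, 0.510]))
```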