Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kozacik, Stephen
Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.
A set of devices for Mechanics Laboratory assisted by a Computer
NASA Astrophysics Data System (ADS)
Rusu, Alexandru; Pirtac, Constantin
2015-12-01
The booklet give a description of a set of devices designed for unified work out of a number of Laboratory works in Mechanics for students at Technical Universities. It consists of a clock, adjusted to a computer, which allows to compute times with an error not greater than 0.0001 s. It allows also to make the calculations of the physical quantities measured in the experience and present the compilation of the final report. The least square method is used throughout the workshop.
ERIC Educational Resources Information Center
Moran, Mark; Hawkes, Mark; El Gayar, Omar
2010-01-01
Many educational institutions have implemented ubiquitous or required laptop, notebook, or tablet personal computing programs for their students. Yet, limited evidence exists to validate integration and acceptance of the technology among student populations. This research examines student acceptance of mobile computing devices using a modification…
Beal, Jacob; Viroli, Mirko
2015-07-28
Computation increasingly takes place not on an individual device, but distributed throughout a material or environment, whether it be a silicon surface, a network of wireless devices, a collection of biological cells or a programmable material. Emerging programming models embrace this reality and provide abstractions inspired by physics, such as computational fields, that allow such systems to be programmed holistically, rather than in terms of individual devices. This paper aims to provide a unified approach for the investigation and engineering of computations programmed with the aid of space-time abstractions, by bringing together a number of recent results, as well as to identify critical open problems. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
2014-06-01
in large-scale datasets such as might be obtained by monitoring a corporate network or social network. Identifying guilty actors, rather than payload...by monitoring a corporate network or social network. Identifying guilty actors, rather than payload-carrying objects, is entirely novel in steganalysis...implementation using Compute Unified Device Architecture (CUDA) on NVIDIA graphics cards. The key to good performance is to combine computations so that
GPU acceleration of Dock6's Amber scoring computation.
Yang, Hailong; Zhou, Qiongqiong; Li, Bo; Wang, Yongjian; Luan, Zhongzhi; Qian, Depei; Li, Hanlu
2010-01-01
Dressing the problem of virtual screening is a long-term goal in the drug discovery field, which if properly solved, can significantly shorten new drugs' R&D cycle. The scoring functionality that evaluates the fitness of the docking result is one of the major challenges in virtual screening. In general, scoring functionality in docking requires a large amount of floating-point calculations, which usually takes several weeks or even months to be finished. This time-consuming procedure is unacceptable, especially when highly fatal and infectious virus arises such as SARS and H1N1, which forces the scoring task to be done in a limited time. This paper presents how to leverage the computational power of GPU to accelerate Dock6's (http://dock.compbio.ucsf.edu/DOCK_6/) Amber (J. Comput. Chem. 25: 1157-1174, 2004) scoring with NVIDIA CUDA (NVIDIA Corporation Technical Staff, Compute Unified Device Architecture - Programming Guide, NVIDIA Corporation, 2008) (Compute Unified Device Architecture) platform. We also discuss many factors that will greatly influence the performance after porting the Amber scoring to GPU, including thread management, data transfer, and divergence hidden. Our experiments show that the GPU-accelerated Amber scoring achieves a 6.5× speedup with respect to the original version running on AMD dual-core CPU for the same problem size. This acceleration makes the Amber scoring more competitive and efficient for large-scale virtual screening problems.
Novel synaptic memory device for neuromorphic computing
NASA Astrophysics Data System (ADS)
Mandal, Saptarshi; El-Amin, Ammaarah; Alexander, Kaitlyn; Rajendran, Bipin; Jha, Rashmi
2014-06-01
This report discusses the electrical characteristics of two-terminal synaptic memory devices capable of demonstrating an analog change in conductance in response to the varying amplitude and pulse-width of the applied signal. The devices are based on Mn doped HfO2 material. The mechanism behind reconfiguration was studied and a unified model is presented to explain the underlying device physics. The model was then utilized to show the application of these devices in speech recognition. A comparison between a 20 nm × 20 nm sized synaptic memory device with that of a state-of-the-art VLSI SRAM synapse showed ~10× reduction in area and >106 times reduction in the power consumption per learning cycle.
Design Tools for Accelerating Development and Usage of Multi-Core Computing Platforms
2014-04-01
Government formulated or supplied the drawings, specifications, or other data does not license the holder or any other person or corporation ; or convey...multicore PDSP platforms. The GPU- based capabilities of TDIF are currently oriented towards NVIDIA GPUs, based on the Compute Unified Device Architecture...CUDA) programming language [ NVIDIA 2007], which can be viewed as an extension of C. The multicore PDSP capabilities currently in TDIF are oriented
A unified framework for heat and mass transport at the atomic scale
NASA Astrophysics Data System (ADS)
Ponga, Mauricio; Sun, Dingyi
2018-04-01
We present a unified framework to simulate heat and mass transport in systems of particles. The proposed framework is based on kinematic mean field theory and uses a phenomenological master equation to compute effective transport rates between particles without the need to evaluate operators. We exploit this advantage and apply the model to simulate transport phenomena at the nanoscale. We demonstrate that, when calibrated to experimentally-measured transport coefficients, the model can accurately predict transient and steady state temperature and concentration profiles even in scenarios where the length of the device is comparable to the mean free path of the carriers. Through several example applications, we demonstrate the validity of our model for all classes of materials, including ones that, until now, would have been outside the domain of computational feasibility.
CUDA-based real time surgery simulation.
Liu, Youquan; De, Suvranu
2008-01-01
In this paper we present a general software platform that enables real time surgery simulation on the newly available compute unified device architecture (CUDA)from NVIDIA. CUDA-enabled GPUs harness the power of 128 processors which allow data parallel computations. Compared to the previous GPGPU, it is significantly more flexible with a C language interface. We report implementation of both collision detection and consequent deformation computation algorithms. Our test results indicate that the CUDA enables a twenty times speedup for collision detection and about fifteen times speedup for deformation computation on an Intel Core 2 Quad 2.66 GHz machine with GeForce 8800 GTX.
gpuSPHASE-A shared memory caching implementation for 2D SPH using CUDA
NASA Astrophysics Data System (ADS)
Winkler, Daniel; Meister, Michael; Rezavand, Massoud; Rauch, Wolfgang
2017-04-01
Smoothed particle hydrodynamics (SPH) is a meshless Lagrangian method that has been successfully applied to computational fluid dynamics (CFD), solid mechanics and many other multi-physics problems. Using the method to solve transport phenomena in process engineering requires the simulation of several days to weeks of physical time. Based on the high computational demand of CFD such simulations in 3D need a computation time of years so that a reduction to a 2D domain is inevitable. In this paper gpuSPHASE, a new open-source 2D SPH solver implementation for graphics devices, is developed. It is optimized for simulations that must be executed with thousands of frames per second to be computed in reasonable time. A novel caching algorithm for Compute Unified Device Architecture (CUDA) shared memory is proposed and implemented. The software is validated and the performance is evaluated for the well established dambreak test case.
NASA Astrophysics Data System (ADS)
Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide
2015-09-01
The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.
2014-06-30
steganalysis) in large-scale datasets such as might be obtained by monitoring a corporate network or social network. Identifying guilty actors...guilty’ user (of steganalysis) in large-scale datasets such as might be obtained by monitoring a corporate network or social network. Identifying guilty...floating point operations (1 TFLOPs) for a 1 megapixel image. We designed a new implementation using Compute Unified Device Architecture (CUDA) on NVIDIA
Mobile Computing: 5 Solutions for Your Tablet Management Woes
ERIC Educational Resources Information Center
Raths, David
2013-01-01
At the end of the 2010-2011 school year, San Diego Unified School District, had 10,000 iPads in use and was set to add 18,000 more the next semester. Teachers and students were enthusiastic about using the devices yet the district faced a serious problem: Teachers were reporting to the instructional tech team that using Apple Configurator and…
Graphics processing unit based computation for NDE applications
NASA Astrophysics Data System (ADS)
Nahas, C. A.; Rajagopal, Prabhu; Balasubramaniam, Krishnan; Krishnamurthy, C. V.
2012-05-01
Advances in parallel processing in recent years are helping to improve the cost of numerical simulation. Breakthroughs in Graphical Processing Unit (GPU) based computation now offer the prospect of further drastic improvements. The introduction of 'compute unified device architecture' (CUDA) by NVIDIA (the global technology company based in Santa Clara, California, USA) has made programming GPUs for general purpose computing accessible to the average programmer. Here we use CUDA to develop parallel finite difference schemes as applicable to two problems of interest to NDE community, namely heat diffusion and elastic wave propagation. The implementations are for two-dimensions. Performance improvement of the GPU implementation against serial CPU implementation is then discussed.
A Computational framework for telemedicine.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Foster, I.; von Laszewski, G.; Thiruvathukal, G. K.
1998-07-01
Emerging telemedicine applications require the ability to exploit diverse and geographically distributed resources. Highspeed networks are used to integrate advanced visualization devices, sophisticated instruments, large databases, archival storage devices, PCs, workstations, and supercomputers. This form of telemedical environment is similar to networked virtual supercomputers, also known as metacomputers. Metacomputers are already being used in many scientific application areas. In this article, we analyze requirements necessary for a telemedical computing infrastructure and compare them with requirements found in a typical metacomputing environment. We will show that metacomputing environments can be used to enable a more powerful and unified computational infrastructure formore » telemedicine. The Globus metacomputing toolkit can provide the necessary low level mechanisms to enable a large scale telemedical infrastructure. The Globus toolkit components are designed in a modular fashion and can be extended to support the specific requirements for telemedicine.« less
Peer-to-peer Monte Carlo simulation of photon migration in topical applications of biomedical optics
NASA Astrophysics Data System (ADS)
Doronin, Alexander; Meglinski, Igor
2012-09-01
In the framework of further development of the unified approach of photon migration in complex turbid media, such as biological tissues we present a peer-to-peer (P2P) Monte Carlo (MC) code. The object-oriented programming is used for generalization of MC model for multipurpose use in various applications of biomedical optics. The online user interface providing multiuser access is developed using modern web technologies, such as Microsoft Silverlight, ASP.NET. The emerging P2P network utilizing computers with different types of compute unified device architecture-capable graphics processing units (GPUs) is applied for acceleration and to overcome the limitations, imposed by multiuser access in the online MC computational tool. The developed P2P MC was validated by comparing the results of simulation of diffuse reflectance and fluence rate distribution for semi-infinite scattering medium with known analytical results, results of adding-doubling method, and with other GPU-based MC techniques developed in the past. The best speedup of processing multiuser requests in a range of 4 to 35 s was achieved using single-precision computing, and the double-precision computing for floating-point arithmetic operations provides higher accuracy.
Doronin, Alexander; Meglinski, Igor
2012-09-01
In the framework of further development of the unified approach of photon migration in complex turbid media, such as biological tissues we present a peer-to-peer (P2P) Monte Carlo (MC) code. The object-oriented programming is used for generalization of MC model for multipurpose use in various applications of biomedical optics. The online user interface providing multiuser access is developed using modern web technologies, such as Microsoft Silverlight, ASP.NET. The emerging P2P network utilizing computers with different types of compute unified device architecture-capable graphics processing units (GPUs) is applied for acceleration and to overcome the limitations, imposed by multiuser access in the online MC computational tool. The developed P2P MC was validated by comparing the results of simulation of diffuse reflectance and fluence rate distribution for semi-infinite scattering medium with known analytical results, results of adding-doubling method, and with other GPU-based MC techniques developed in the past. The best speedup of processing multiuser requests in a range of 4 to 35 s was achieved using single-precision computing, and the double-precision computing for floating-point arithmetic operations provides higher accuracy.
Advanced computer graphic techniques for laser range finder (LRF) simulation
NASA Astrophysics Data System (ADS)
Bedkowski, Janusz; Jankowski, Stanislaw
2008-11-01
This paper show an advanced computer graphic techniques for laser range finder (LRF) simulation. The LRF is the common sensor for unmanned ground vehicle, autonomous mobile robot and security applications. The cost of the measurement system is extremely high, therefore the simulation tool is designed. The simulation gives an opportunity to execute algorithm such as the obstacle avoidance[1], slam for robot localization[2], detection of vegetation and water obstacles in surroundings of the robot chassis[3], LRF measurement in crowd of people[1]. The Axis Aligned Bounding Box (AABB) and alternative technique based on CUDA (NVIDIA Compute Unified Device Architecture) is presented.
NASA Astrophysics Data System (ADS)
Raksharam; Dutta, Aloke K.
2017-04-01
In this paper, a unified analytical model for the drain current of a symmetric Double-Gate Junctionless Field-Effect Transistor (DG-JLFET) is presented. The operation of the device has been classified into four modes: subthreshold, semi-depleted, accumulation, and hybrid; with the main focus of this work being on the accumulation mode, which has not been dealt with in detail so far in the literature. A physics-based model, using a simplified one-dimensional approach, has been developed for this mode, and it has been successfully integrated with the model for the hybrid mode. It also includes the effect of carrier mobility degradation due to the transverse electric field, which was hitherto missing in the earlier models reported in the literature. The piece-wise models have been unified using suitable interpolation functions. In addition, the model includes two most important short-channel effects pertaining to DG-JLFETs, namely the Drain Induced Barrier Lowering (DIBL) and the Subthreshold Swing (SS) degradation. The model is completely analytical, and is thus computationally highly efficient. The results of our model have shown an excellent match with those obtained from TCAD simulations for both long- and short-channel devices, as well as with the experimental data reported in the literature.
Massanes, Francesc; Cadennes, Marie; Brankov, Jovan G.
2012-01-01
In this paper we describe and evaluate a fast implementation of a classical block matching motion estimation algorithm for multiple Graphical Processing Units (GPUs) using the Compute Unified Device Architecture (CUDA) computing engine. The implemented block matching algorithm (BMA) uses summed absolute difference (SAD) error criterion and full grid search (FS) for finding optimal block displacement. In this evaluation we compared the execution time of a GPU and CPU implementation for images of various sizes, using integer and non-integer search grids. The results show that use of a GPU card can shorten computation time by a factor of 200 times for integer and 1000 times for a non-integer search grid. The additional speedup for non-integer search grid comes from the fact that GPU has built-in hardware for image interpolation. Further, when using multiple GPU cards, the presented evaluation shows the importance of the data splitting method across multiple cards, but an almost linear speedup with a number of cards is achievable. In addition we compared execution time of the proposed FS GPU implementation with two existing, highly optimized non-full grid search CPU based motion estimations methods, namely implementation of the Pyramidal Lucas Kanade Optical flow algorithm in OpenCV and Simplified Unsymmetrical multi-Hexagon search in H.264/AVC standard. In these comparisons, FS GPU implementation still showed modest improvement even though the computational complexity of FS GPU implementation is substantially higher than non-FS CPU implementation. We also demonstrated that for an image sequence of 720×480 pixels in resolution, commonly used in video surveillance, the proposed GPU implementation is sufficiently fast for real-time motion estimation at 30 frames-per-second using two NVIDIA C1060 Tesla GPU cards. PMID:22347787
Massanes, Francesc; Cadennes, Marie; Brankov, Jovan G
2011-07-01
In this paper we describe and evaluate a fast implementation of a classical block matching motion estimation algorithm for multiple Graphical Processing Units (GPUs) using the Compute Unified Device Architecture (CUDA) computing engine. The implemented block matching algorithm (BMA) uses summed absolute difference (SAD) error criterion and full grid search (FS) for finding optimal block displacement. In this evaluation we compared the execution time of a GPU and CPU implementation for images of various sizes, using integer and non-integer search grids.The results show that use of a GPU card can shorten computation time by a factor of 200 times for integer and 1000 times for a non-integer search grid. The additional speedup for non-integer search grid comes from the fact that GPU has built-in hardware for image interpolation. Further, when using multiple GPU cards, the presented evaluation shows the importance of the data splitting method across multiple cards, but an almost linear speedup with a number of cards is achievable.In addition we compared execution time of the proposed FS GPU implementation with two existing, highly optimized non-full grid search CPU based motion estimations methods, namely implementation of the Pyramidal Lucas Kanade Optical flow algorithm in OpenCV and Simplified Unsymmetrical multi-Hexagon search in H.264/AVC standard. In these comparisons, FS GPU implementation still showed modest improvement even though the computational complexity of FS GPU implementation is substantially higher than non-FS CPU implementation.We also demonstrated that for an image sequence of 720×480 pixels in resolution, commonly used in video surveillance, the proposed GPU implementation is sufficiently fast for real-time motion estimation at 30 frames-per-second using two NVIDIA C1060 Tesla GPU cards.
Graphics processing unit (GPU)-based computation of heat conduction in thermally anisotropic solids
NASA Astrophysics Data System (ADS)
Nahas, C. A.; Balasubramaniam, Krishnan; Rajagopal, Prabhu
2013-01-01
Numerical modeling of anisotropic media is a computationally intensive task since it brings additional complexity to the field problem in such a way that the physical properties are different in different directions. Largely used in the aerospace industry because of their lightweight nature, composite materials are a very good example of thermally anisotropic media. With advancements in video gaming technology, parallel processors are much cheaper today and accessibility to higher-end graphical processing devices has increased dramatically over the past couple of years. Since these massively parallel GPUs are very good in handling floating point arithmetic, they provide a new platform for engineers and scientists to accelerate their numerical models using commodity hardware. In this paper we implement a parallel finite difference model of thermal diffusion through anisotropic media using the NVIDIA CUDA (Compute Unified device Architecture). We use the NVIDIA GeForce GTX 560 Ti as our primary computing device which consists of 384 CUDA cores clocked at 1645 MHz with a standard desktop pc as the host platform. We compare the results from standard CPU implementation for its accuracy and speed and draw implications for simulation using the GPU paradigm.
Parallelized seeded region growing using CUDA.
Park, Seongjin; Lee, Jeongjin; Lee, Hyunna; Shin, Juneseuk; Seo, Jinwook; Lee, Kyoung Ho; Shin, Yeong-Gil; Kim, Bohyoung
2014-01-01
This paper presents a novel method for parallelizing the seeded region growing (SRG) algorithm using Compute Unified Device Architecture (CUDA) technology, with intention to overcome the theoretical weakness of SRG algorithm of its computation time being directly proportional to the size of a segmented region. The segmentation performance of the proposed CUDA-based SRG is compared with SRG implementations on single-core CPUs, quad-core CPUs, and shader language programming, using synthetic datasets and 20 body CT scans. Based on the experimental results, the CUDA-based SRG outperforms the other three implementations, advocating that it can substantially assist the segmentation during massive CT screening tests.
NASA Astrophysics Data System (ADS)
Grzeszczuk, A.; Kowalski, S.
2015-04-01
Compute Unified Device Architecture (CUDA) is a parallel computing platform developed by Nvidia for increase speed of graphics by usage of parallel mode for processes calculation. The success of this solution has opened technology General-Purpose Graphic Processor Units (GPGPUs) for applications not coupled with graphics. The GPGPUs system can be applying as effective tool for reducing huge number of data for pulse shape analysis measures, by on-line recalculation or by very quick system of compression. The simplified structure of CUDA system and model of programming based on example Nvidia GForce GTX580 card are presented by our poster contribution in stand-alone version and as ROOT application.
Operating room integration and telehealth.
Bucholz, Richard D; Laycock, Keith A; McDurmont, Leslie
2011-01-01
The increasing use of advanced automated and computer-controlled systems and devices in surgical procedures has resulted in problems arising from the crowding of the operating room with equipment and the incompatible control and communication standards associated with each system. This lack of compatibility between systems and centralized control means that the surgeon is frequently required to interact with multiple computer interfaces in order to obtain updates and exert control over the various devices at his disposal. To reduce this complexity and provide the surgeon with more complete and precise control of the operating room systems, a unified interface and communication network has been developed. In addition to improving efficiency, this network also allows the surgeon to grant remote access to consultants and observers at other institutions, enabling experts to participate in the procedure without having to travel to the site.
A Project Course Sequence in Innovation and Commercialization of Medical Devices.
Eberhardt, Alan W; Tillman, Shea; Kirkland, Brandon; Sherrod, Brandon
2017-07-01
There exists a need for educational processes in which students gain experience with design and commercialization of medical devices. This manuscript describes the implementation of, and assessment results from, the first year offering of a project course sequence in Master of Engineering (MEng) in Design and Commercialization at our institution. The three-semester course sequence focused on developing and applying hands-on skills that contribute to product development to address medical device needs found within our university hospital and local community. The first semester integrated computer-aided drawing (CAD) as preparation for manufacturing of device-related components (hand machining, computer numeric control (CNC), three-dimensional (3D) printing, and plastics molding), followed by an introduction to microcontrollers (MCUs) and printed circuit boards (PCBs) for associated electronics and control systems. In the second semester, the students applied these skills on a unified project, working together to construct and test multiple weighing scales for wheelchair users. In the final semester, the students applied industrial design concepts to four distinct device designs, including user and context reassessment, human factors (functional and aesthetic) design refinement, and advanced visualization for commercialization. The assessment results are described, along with lessons learned and plans for enhancement of the course sequence.
A learnable parallel processing architecture towards unity of memory and computing
NASA Astrophysics Data System (ADS)
Li, H.; Gao, B.; Chen, Z.; Zhao, Y.; Huang, P.; Ye, H.; Liu, L.; Liu, X.; Kang, J.
2015-08-01
Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named “iMemComp”, where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.
A learnable parallel processing architecture towards unity of memory and computing.
Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J
2015-08-14
Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.
Real-time radar signal processing using GPGPU (general-purpose graphic processing unit)
NASA Astrophysics Data System (ADS)
Kong, Fanxing; Zhang, Yan Rockee; Cai, Jingxiao; Palmer, Robert D.
2016-05-01
This study introduces a practical approach to develop real-time signal processing chain for general phased array radar on NVIDIA GPUs(Graphical Processing Units) using CUDA (Compute Unified Device Architecture) libraries such as cuBlas and cuFFT, which are adopted from open source libraries and optimized for the NVIDIA GPUs. The processed results are rigorously verified against those from the CPUs. Performance benchmarked in computation time with various input data cube sizes are compared across GPUs and CPUs. Through the analysis, it will be demonstrated that GPGPUs (General Purpose GPU) real-time processing of the array radar data is possible with relatively low-cost commercial GPUs.
Parallelized Seeded Region Growing Using CUDA
Park, Seongjin; Lee, Hyunna; Seo, Jinwook; Lee, Kyoung Ho; Shin, Yeong-Gil; Kim, Bohyoung
2014-01-01
This paper presents a novel method for parallelizing the seeded region growing (SRG) algorithm using Compute Unified Device Architecture (CUDA) technology, with intention to overcome the theoretical weakness of SRG algorithm of its computation time being directly proportional to the size of a segmented region. The segmentation performance of the proposed CUDA-based SRG is compared with SRG implementations on single-core CPUs, quad-core CPUs, and shader language programming, using synthetic datasets and 20 body CT scans. Based on the experimental results, the CUDA-based SRG outperforms the other three implementations, advocating that it can substantially assist the segmentation during massive CT screening tests. PMID:25309619
Optimized Laplacian image sharpening algorithm based on graphic processing unit
NASA Astrophysics Data System (ADS)
Ma, Tinghuai; Li, Lu; Ji, Sai; Wang, Xin; Tian, Yuan; Al-Dhelaan, Abdullah; Al-Rodhaan, Mznah
2014-12-01
In classical Laplacian image sharpening, all pixels are processed one by one, which leads to large amount of computation. Traditional Laplacian sharpening processed on CPU is considerably time-consuming especially for those large pictures. In this paper, we propose a parallel implementation of Laplacian sharpening based on Compute Unified Device Architecture (CUDA), which is a computing platform of Graphic Processing Units (GPU), and analyze the impact of picture size on performance and the relationship between the processing time of between data transfer time and parallel computing time. Further, according to different features of different memory, an improved scheme of our method is developed, which exploits shared memory in GPU instead of global memory and further increases the efficiency. Experimental results prove that two novel algorithms outperform traditional consequentially method based on OpenCV in the aspect of computing speed.
Development and application of unified algorithms for problems in computational science
NASA Technical Reports Server (NTRS)
Shankar, Vijaya; Chakravarthy, Sukumar
1987-01-01
A framework is presented for developing computationally unified numerical algorithms for solving nonlinear equations that arise in modeling various problems in mathematical physics. The concept of computational unification is an attempt to encompass efficient solution procedures for computing various nonlinear phenomena that may occur in a given problem. For example, in Computational Fluid Dynamics (CFD), a unified algorithm will be one that allows for solutions to subsonic (elliptic), transonic (mixed elliptic-hyperbolic), and supersonic (hyperbolic) flows for both steady and unsteady problems. The objectives are: development of superior unified algorithms emphasizing accuracy and efficiency aspects; development of codes based on selected algorithms leading to validation; application of mature codes to realistic problems; and extension/application of CFD-based algorithms to problems in other areas of mathematical physics. The ultimate objective is to achieve integration of multidisciplinary technologies to enhance synergism in the design process through computational simulation. Specific unified algorithms for a hierarchy of gas dynamics equations and their applications to two other areas: electromagnetic scattering, and laser-materials interaction accounting for melting.
Peng, Kuan; He, Ling; Zhu, Ziqiang; Tang, Jingtian; Xiao, Jiaying
2013-12-01
Compared with commonly used analytical reconstruction methods, the frequency-domain finite element method (FEM) based approach has proven to be an accurate and flexible algorithm for photoacoustic tomography. However, the FEM-based algorithm is computationally demanding, especially for three-dimensional cases. To enhance the algorithm's efficiency, in this work a parallel computational strategy is implemented in the framework of the FEM-based reconstruction algorithm using a graphic-processing-unit parallel frame named the "compute unified device architecture." A series of simulation experiments is carried out to test the accuracy and accelerating effect of the improved method. The results obtained indicate that the parallel calculation does not change the accuracy of the reconstruction algorithm, while its computational cost is significantly reduced by a factor of 38.9 with a GTX 580 graphics card using the improved method.
NASA Astrophysics Data System (ADS)
Murni, Bustamam, A.; Ernastuti, Handhika, T.; Kerami, D.
2017-07-01
Calculation of the matrix-vector multiplication in the real-world problems often involves large matrix with arbitrary size. Therefore, parallelization is needed to speed up the calculation process that usually takes a long time. Graph partitioning techniques that have been discussed in the previous studies cannot be used to complete the parallelized calculation of matrix-vector multiplication with arbitrary size. This is due to the assumption of graph partitioning techniques that can only solve the square and symmetric matrix. Hypergraph partitioning techniques will overcome the shortcomings of the graph partitioning technique. This paper addresses the efficient parallelization of matrix-vector multiplication through hypergraph partitioning techniques using CUDA GPU-based parallel computing. CUDA (compute unified device architecture) is a parallel computing platform and programming model that was created by NVIDIA and implemented by the GPU (graphics processing unit).
GPU accelerated fuzzy connected image segmentation by using CUDA.
Zhuge, Ying; Cao, Yong; Miller, Robert W
2009-01-01
Image segmentation techniques using fuzzy connectedness principles have shown their effectiveness in segmenting a variety of objects in several large applications in recent years. However, one problem of these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays commodity graphics hardware provides high parallel computing power. In this paper, we present a parallel fuzzy connected image segmentation algorithm on Nvidia's Compute Unified Device Architecture (CUDA) platform for segmenting large medical image data sets. Our experiments based on three data sets with small, medium, and large data size demonstrate the efficiency of the parallel algorithm, which achieves a speed-up factor of 7.2x, 7.3x, and 14.4x, correspondingly, for the three data sets over the sequential implementation of fuzzy connected image segmentation algorithm on CPU.
Unified computational model of transport in metal-insulating oxide-metal systems
NASA Astrophysics Data System (ADS)
Tierney, B. D.; Hjalmarson, H. P.; Jacobs-Gedrim, R. B.; Agarwal, Sapan; James, C. D.; Marinella, M. J.
2018-04-01
A unified physics-based model of electron transport in metal-insulator-metal (MIM) systems is presented. In this model, transport through metal-oxide interfaces occurs by electron tunneling between the metal electrodes and oxide defect states. Transport in the oxide bulk is dominated by hopping, modeled as a series of tunneling events that alter the electron occupancy of defect states. Electron transport in the oxide conduction band is treated by the drift-diffusion formalism and defect chemistry reactions link all the various transport mechanisms. It is shown that the current-limiting effect of the interface band offsets is a function of the defect vacancy concentration. These results provide insight into the underlying physical mechanisms of leakage currents in oxide-based capacitors and steady-state electron transport in resistive random access memory (ReRAM) MIM devices. Finally, an explanation of ReRAM bipolar switching behavior based on these results is proposed.
14 CFR 1221.108 - Establishment of the NASA Unified Visual Communications System.
Code of Federal Regulations, 2012 CFR
2012-01-01
... 14 Aeronautics and Space 5 2012-01-01 2012-01-01 false Establishment of the NASA Unified Visual... ADMINISTRATION THE NASA SEAL AND OTHER DEVICES, AND THE CONGRESSIONAL SPACE MEDAL OF HONOR NASA Seal, NASA Insignia, NASA Logotype, NASA Program Identifiers, NASA Flags, and the Agency's Unified Visual...
14 CFR § 1221.108 - Establishment of the NASA Unified Visual Communications System.
Code of Federal Regulations, 2014 CFR
2014-01-01
... 14 Aeronautics and Space 5 2014-01-01 2014-01-01 false Establishment of the NASA Unified Visual... ADMINISTRATION THE NASA SEAL AND OTHER DEVICES, AND THE CONGRESSIONAL SPACE MEDAL OF HONOR NASA Seal, NASA Insignia, NASA Logotype, NASA Program Identifiers, NASA Flags, and the Agency's Unified Visual...
14 CFR 1221.108 - Establishment of the NASA Unified Visual Communications System.
Code of Federal Regulations, 2010 CFR
2010-01-01
... 14 Aeronautics and Space 5 2010-01-01 2010-01-01 false Establishment of the NASA Unified Visual... ADMINISTRATION THE NASA SEAL AND OTHER DEVICES, AND THE CONGRESSIONAL SPACE MEDAL OF HONOR NASA Seal, NASA Insignia, NASA Logotype, NASA Program Identifiers, NASA Flags, and the Agency's Unified Visual...
GPU-accelerated computing for Lagrangian coherent structures of multi-body gravitational regimes
NASA Astrophysics Data System (ADS)
Lin, Mingpei; Xu, Ming; Fu, Xiaoyu
2017-04-01
Based on a well-established theoretical foundation, Lagrangian Coherent Structures (LCSs) have elicited widespread research on the intrinsic structures of dynamical systems in many fields, including the field of astrodynamics. Although the application of LCSs in dynamical problems seems straightforward theoretically, its associated computational cost is prohibitive. We propose a block decomposition algorithm developed on Compute Unified Device Architecture (CUDA) platform for the computation of the LCSs of multi-body gravitational regimes. In order to take advantage of GPU's outstanding computing properties, such as Shared Memory, Constant Memory, and Zero-Copy, the algorithm utilizes a block decomposition strategy to facilitate computation of finite-time Lyapunov exponent (FTLE) fields of arbitrary size and timespan. Simulation results demonstrate that this GPU-based algorithm can satisfy double-precision accuracy requirements and greatly decrease the time needed to calculate final results, increasing speed by approximately 13 times. Additionally, this algorithm can be generalized to various large-scale computing problems, such as particle filters, constellation design, and Monte-Carlo simulation.
Chen, Duan; Wei, Guo-Wei
2010-01-01
The miniaturization of nano-scale electronic devices, such as metal oxide semiconductor field effect transistors (MOSFETs), has given rise to a pressing demand in the new theoretical understanding and practical tactic for dealing with quantum mechanical effects in integrated circuits. Modeling and simulation of this class of problems have emerged as an important topic in applied and computational mathematics. This work presents mathematical models and computational algorithms for the simulation of nano-scale MOSFETs. We introduce a unified two-scale energy functional to describe the electrons and the continuum electrostatic potential of the nano-electronic device. This framework enables us to put microscopic and macroscopic descriptions in an equal footing at nano scale. By optimization of the energy functional, we derive consistently-coupled Poisson-Kohn-Sham equations. Additionally, layered structures are crucial to the electrostatic and transport properties of nano transistors. A material interface model is proposed for more accurate description of the electrostatics governed by the Poisson equation. Finally, a new individual dopant model that utilizes the Dirac delta function is proposed to understand the random doping effect in nano electronic devices. Two mathematical algorithms, the matched interface and boundary (MIB) method and the Dirichlet-to-Neumann mapping (DNM) technique, are introduced to improve the computational efficiency of nano-device simulations. Electronic structures are computed via subband decomposition and the transport properties, such as the I-V curves and electron density, are evaluated via the non-equilibrium Green's functions (NEGF) formalism. Two distinct device configurations, a double-gate MOSFET and a four-gate MOSFET, are considered in our three-dimensional numerical simulations. For these devices, the current fluctuation and voltage threshold lowering effect induced by the discrete dopant model are explored. Numerical convergence and model well-posedness are also investigated in the present work. PMID:20396650
ERIC Educational Resources Information Center
Knowlton, Marie; Wetzel, Robin
2006-01-01
This study compared the length of text in English Braille American Edition, the Nemeth code, and the computer braille code with the Unified English Braille Code (UEBC)--also known as Unified English Braille (UEB). The findings indicate that differences in the length of text are dependent on the type of material that is transcribed and the grade…
Impact of office productivity cloud computing on energy consumption and greenhouse gas emissions.
Williams, Daniel R; Tang, Yinshan
2013-05-07
Cloud computing is usually regarded as being energy efficient and thus emitting less greenhouse gases (GHG) than traditional forms of computing. When the energy consumption of Microsoft's cloud computing Office 365 (O365) and traditional Office 2010 (O2010) software suites were tested and modeled, some cloud services were found to consume more energy than the traditional form. The developed model in this research took into consideration the energy consumption at the three main stages of data transmission; data center, network, and end user device. Comparable products from each suite were selected and activities were defined for each product to represent a different computing type. Microsoft provided highly confidential data for the data center stage, while the networking and user device stages were measured directly. A new measurement and software apportionment approach was defined and utilized allowing the power consumption of cloud services to be directly measured for the user device stage. Results indicated that cloud computing is more energy efficient for Excel and Outlook which consumed less energy and emitted less GHG than the standalone counterpart. The power consumption of the cloud based Outlook (8%) and Excel (17%) was lower than their traditional counterparts. However, the power consumption of the cloud version of Word was 17% higher than its traditional equivalent. A third mixed access method was also measured for Word which emitted 5% more GHG than the traditional version. It is evident that cloud computing may not provide a unified way forward to reduce energy consumption and GHG. Direct conversion from the standalone package into the cloud provision platform can now consider energy and GHG emissions at the software development and cloud service design stage using the methods described in this research.
A Unified Approach to Modeling Multidisciplinary Interactions
NASA Technical Reports Server (NTRS)
Samareh, Jamshid A.; Bhatia, Kumar G.
2000-01-01
There are a number of existing methods to transfer information among various disciplines. For a multidisciplinary application with n disciplines, the traditional methods may be required to model (n(exp 2) - n) interactions. This paper presents a unified three-dimensional approach that reduces the number of interactions from (n(exp 2) - n) to 2n by using a computer-aided design model. The proposed modeling approach unifies the interactions among various disciplines. The approach is independent of specific discipline implementation, and a number of existing methods can be reformulated in the context of the proposed unified approach. This paper provides an overview of the proposed unified approach and reformulations for two existing methods. The unified approach is specially tailored for application environments where the geometry is created and managed through a computer-aided design system. Results are presented for a blended-wing body and a high-speed civil transport.
Unified web-based network management based on distributed object orientated software agents
NASA Astrophysics Data System (ADS)
Djalalian, Amir; Mukhtar, Rami; Zukerman, Moshe
2002-09-01
This paper presents an architecture that provides a unified web interface to managed network devices that support CORBA, OSI or Internet-based network management protocols. A client gains access to managed devices through a web browser, which is used to issue management operations and receive event notifications. The proposed architecture is compatible with both the OSI Management reference Model and CORBA. The steps required for designing the building blocks of such architecture are identified.
What Does CALL Have to Offer Computer Science and What Does Computer Science Have to Offer CALL?
ERIC Educational Resources Information Center
Cushion, Steve
2006-01-01
We will argue that CALL can usefully be viewed as a subset of computer software engineering and can profit from adopting some of the recent progress in software development theory. The unified modelling language has become the industry standard modelling technique and the accompanying unified process is rapidly gaining acceptance. The manner in…
Heterogeneous compute in computer vision: OpenCL in OpenCV
NASA Astrophysics Data System (ADS)
Gasparakis, Harris
2014-02-01
We explore the relevance of Heterogeneous System Architecture (HSA) in Computer Vision, both as a long term vision, and as a near term emerging reality via the recently ratified OpenCL 2.0 Khronos standard. After a brief review of OpenCL 1.2 and 2.0, including HSA features such as Shared Virtual Memory (SVM) and platform atomics, we identify what genres of Computer Vision workloads stand to benefit by leveraging those features, and we suggest a new mental framework that replaces GPU compute with hybrid HSA APU compute. As a case in point, we discuss, in some detail, popular object recognition algorithms (part-based models), emphasizing the interplay and concurrent collaboration between the GPU and CPU. We conclude by describing how OpenCL has been incorporated in OpenCV, a popular open source computer vision library, emphasizing recent work on the Transparent API, to appear in OpenCV 3.0, which unifies the native CPU and OpenCL execution paths under a single API, allowing the same code to execute either on CPU or on a OpenCL enabled device, without even recompiling.
Providing security for automated process control systems at hydropower engineering facilities
NASA Astrophysics Data System (ADS)
Vasiliev, Y. S.; Zegzhda, P. D.; Zegzhda, D. P.
2016-12-01
This article suggests the concept of a cyberphysical system to manage computer security of automated process control systems at hydropower engineering facilities. According to the authors, this system consists of a set of information processing tools and computer-controlled physical devices. Examples of cyber attacks on power engineering facilities are provided, and a strategy of improving cybersecurity of hydropower engineering systems is suggested. The architecture of the multilevel protection of the automated process control system (APCS) of power engineering facilities is given, including security systems, control systems, access control, encryption, secure virtual private network of subsystems for monitoring and analysis of security events. The distinctive aspect of the approach is consideration of interrelations and cyber threats, arising when SCADA is integrated with the unified enterprise information system.
A GPU-paralleled implementation of an enhanced face recognition algorithm
NASA Astrophysics Data System (ADS)
Chen, Hao; Liu, Xiyang; Shao, Shuai; Zan, Jiguo
2013-03-01
Face recognition algorithm based on compressed sensing and sparse representation is hotly argued in these years. The scheme of this algorithm increases recognition rate as well as anti-noise capability. However, the computational cost is expensive and has become a main restricting factor for real world applications. In this paper, we introduce a GPU-accelerated hybrid variant of face recognition algorithm named parallel face recognition algorithm (pFRA). We describe here how to carry out parallel optimization design to take full advantage of many-core structure of a GPU. The pFRA is tested and compared with several other implementations under different data sample size. Finally, Our pFRA, implemented with NVIDIA GPU and Computer Unified Device Architecture (CUDA) programming model, achieves a significant speedup over the traditional CPU implementations.
Quantum engineering of transistors based on 2D materials heterostructures
NASA Astrophysics Data System (ADS)
Iannaccone, Giuseppe; Bonaccorso, Francesco; Colombo, Luigi; Fiori, Gianluca
2018-03-01
Quantum engineering entails atom-by-atom design and fabrication of electronic devices. This innovative technology that unifies materials science and device engineering has been fostered by the recent progress in the fabrication of vertical and lateral heterostructures of two-dimensional materials and by the assessment of the technology potential via computational nanotechnology. But how close are we to the possibility of the practical realization of next-generation atomically thin transistors? In this Perspective, we analyse the outlook and the challenges of quantum-engineered transistors using heterostructures of two-dimensional materials against the benchmark of silicon technology and its foreseeable evolution in terms of potential performance and manufacturability. Transistors based on lateral heterostructures emerge as the most promising option from a performance point of view, even if heterostructure formation and control are in the initial technology development stage.
Quantum engineering of transistors based on 2D materials heterostructures.
Iannaccone, Giuseppe; Bonaccorso, Francesco; Colombo, Luigi; Fiori, Gianluca
2018-03-01
Quantum engineering entails atom-by-atom design and fabrication of electronic devices. This innovative technology that unifies materials science and device engineering has been fostered by the recent progress in the fabrication of vertical and lateral heterostructures of two-dimensional materials and by the assessment of the technology potential via computational nanotechnology. But how close are we to the possibility of the practical realization of next-generation atomically thin transistors? In this Perspective, we analyse the outlook and the challenges of quantum-engineered transistors using heterostructures of two-dimensional materials against the benchmark of silicon technology and its foreseeable evolution in terms of potential performance and manufacturability. Transistors based on lateral heterostructures emerge as the most promising option from a performance point of view, even if heterostructure formation and control are in the initial technology development stage.
Symplectic multi-particle tracking on GPUs
NASA Astrophysics Data System (ADS)
Liu, Zhicong; Qiang, Ji
2018-05-01
A symplectic multi-particle tracking model is implemented on the Graphic Processing Units (GPUs) using the Compute Unified Device Architecture (CUDA) language. The symplectic tracking model can preserve phase space structure and reduce non-physical effects in long term simulation, which is important for beam property evaluation in particle accelerators. Though this model is computationally expensive, it is very suitable for parallelization and can be accelerated significantly by using GPUs. In this paper, we optimized the implementation of the symplectic tracking model on both single GPU and multiple GPUs. Using a single GPU processor, the code achieves a factor of 2-10 speedup for a range of problem sizes compared with the time on a single state-of-the-art Central Processing Unit (CPU) node with similar power consumption and semiconductor technology. It also shows good scalability on a multi-GPU cluster at Oak Ridge Leadership Computing Facility. In an application to beam dynamics simulation, the GPU implementation helps save more than a factor of two total computing time in comparison to the CPU implementation.
An, Gary; Bartels, John; Vodovotz, Yoram
2011-03-01
The clinical translation of promising basic biomedical findings, whether derived from reductionist studies in academic laboratories or as the product of extensive high-throughput and -content screens in the biotechnology and pharmaceutical industries, has reached a period of stagnation in which ever higher research and development costs are yielding ever fewer new drugs. Systems biology and computational modeling have been touted as potential avenues by which to break through this logjam. However, few mechanistic computational approaches are utilized in a manner that is fully cognizant of the inherent clinical realities in which the drugs developed through this ostensibly rational process will be ultimately used. In this article, we present a Translational Systems Biology approach to inflammation. This approach is based on the use of mechanistic computational modeling centered on inherent clinical applicability, namely that a unified suite of models can be applied to generate in silico clinical trials, individualized computational models as tools for personalized medicine, and rational drug and device design based on disease mechanism.
Nagaoka, Tomoaki; Watanabe, Soichi
2012-01-01
Electromagnetic simulation with anatomically realistic computational human model using the finite-difference time domain (FDTD) method has recently been performed in a number of fields in biomedical engineering. To improve the method's calculation speed and realize large-scale computing with the computational human model, we adapt three-dimensional FDTD code to a multi-GPU cluster environment with Compute Unified Device Architecture and Message Passing Interface. Our multi-GPU cluster system consists of three nodes. The seven GPU boards (NVIDIA Tesla C2070) are mounted on each node. We examined the performance of the FDTD calculation on multi-GPU cluster environment. We confirmed that the FDTD calculation on the multi-GPU clusters is faster than that on a multi-GPU (a single workstation), and we also found that the GPU cluster system calculate faster than a vector supercomputer. In addition, our GPU cluster system allowed us to perform the large-scale FDTD calculation because were able to use GPU memory of over 100 GB.
GPU-accelerated phase-field simulation of dendritic solidification in a binary alloy
NASA Astrophysics Data System (ADS)
Yamanaka, Akinori; Aoki, Takayuki; Ogawa, Satoi; Takaki, Tomohiro
2011-03-01
The phase-field simulation for dendritic solidification of a binary alloy has been accelerated by using a graphic processing unit (GPU). To perform the phase-field simulation of the alloy solidification on GPU, a program code was developed with computer unified device architecture (CUDA). In this paper, the implementation technique of the phase-field model on GPU is presented. Also, we evaluated the acceleration performance of the three-dimensional solidification simulation by using a single NVIDIA TESLA C1060 GPU and the developed program code. The results showed that the GPU calculation for 5763 computational grids achieved the performance of 170 GFLOPS by utilizing the shared memory as a software-managed cache. Furthermore, it can be demonstrated that the computation with the GPU is 100 times faster than that with a single CPU core. From the obtained results, we confirmed the feasibility of realizing a real-time full three-dimensional phase-field simulation of microstructure evolution on a personal desktop computer.
High performance hybrid functional Petri net simulations of biological pathway models on CUDA.
Chalkidis, Georgios; Nagasaki, Masao; Miyano, Satoru
2011-01-01
Hybrid functional Petri nets are a wide-spread tool for representing and simulating biological models. Due to their potential of providing virtual drug testing environments, biological simulations have a growing impact on pharmaceutical research. Continuous research advancements in biology and medicine lead to exponentially increasing simulation times, thus raising the demand for performance accelerations by efficient and inexpensive parallel computation solutions. Recent developments in the field of general-purpose computation on graphics processing units (GPGPU) enabled the scientific community to port a variety of compute intensive algorithms onto the graphics processing unit (GPU). This work presents the first scheme for mapping biological hybrid functional Petri net models, which can handle both discrete and continuous entities, onto compute unified device architecture (CUDA) enabled GPUs. GPU accelerated simulations are observed to run up to 18 times faster than sequential implementations. Simulating the cell boundary formation by Delta-Notch signaling on a CUDA enabled GPU results in a speedup of approximately 7x for a model containing 1,600 cells.
Parallel Computer System for 3D Visualization Stereo on GPU
NASA Astrophysics Data System (ADS)
Al-Oraiqat, Anas M.; Zori, Sergii A.
2018-03-01
This paper proposes the organization of a parallel computer system based on Graphic Processors Unit (GPU) for 3D stereo image synthesis. The development is based on the modified ray tracing method developed by the authors for fast search of tracing rays intersections with scene objects. The system allows significant increase in the productivity for the 3D stereo synthesis of photorealistic quality. The generalized procedure of 3D stereo image synthesis on the Graphics Processing Unit/Graphics Processing Clusters (GPU/GPC) is proposed. The efficiency of the proposed solutions by GPU implementation is compared with single-threaded and multithreaded implementations on the CPU. The achieved average acceleration in multi-thread implementation on the test GPU and CPU is about 7.5 and 1.6 times, respectively. Studying the influence of choosing the size and configuration of the computational Compute Unified Device Archi-tecture (CUDA) network on the computational speed shows the importance of their correct selection. The obtained experimental estimations can be significantly improved by new GPUs with a large number of processing cores and multiprocessors, as well as optimized configuration of the computing CUDA network.
Crespo, Alejandro C.; Dominguez, Jose M.; Barreiro, Anxo; Gómez-Gesteira, Moncho; Rogers, Benedict D.
2011-01-01
Smoothed Particle Hydrodynamics (SPH) is a numerical method commonly used in Computational Fluid Dynamics (CFD) to simulate complex free-surface flows. Simulations with this mesh-free particle method far exceed the capacity of a single processor. In this paper, as part of a dual-functioning code for either central processing units (CPUs) or Graphics Processor Units (GPUs), a parallelisation using GPUs is presented. The GPU parallelisation technique uses the Compute Unified Device Architecture (CUDA) of nVidia devices. Simulations with more than one million particles on a single GPU card exhibit speedups of up to two orders of magnitude over using a single-core CPU. It is demonstrated that the code achieves different speedups with different CUDA-enabled GPUs. The numerical behaviour of the SPH code is validated with a standard benchmark test case of dam break flow impacting on an obstacle where good agreement with the experimental results is observed. Both the achieved speed-ups and the quantitative agreement with experiments suggest that CUDA-based GPU programming can be used in SPH methods with efficiency and reliability. PMID:21695185
An SNMP-based solution to enable remote ISO/IEEE 11073 technical management.
Lasierra, Nelia; Alesanco, Alvaro; García, José
2012-07-01
This paper presents the design and implementation of an architecture based on the integration of simple network management protocol version 3 (SNMPv3) and the standard ISO/IEEE 11073 (X73) to manage technical information in home-based telemonitoring scenarios. This architecture includes the development of an SNMPv3-proxyX73 agent which comprises a management information base (MIB) module adapted to X73. In the proposed scenario, medical devices (MDs) send information to a concentrator device [designated as compute engine (CE)] using the X73 standard. This information together with extra information collected in the CE is stored in the developed MIB. Finally, the information collected is available for remote access via SNMP connection. Moreover, alarms and events can be configured by an external manager in order to provide warnings of irregularities in the MDs' technical performance evaluation. This proposed SNMPv3 agent provides a solution to integrate and unify technical device management in home-based telemonitoring scenarios fully adapted to X73.
Liu, Kui; Wei, Sixiao; Chen, Zhijiang; Jia, Bin; Chen, Genshe; Ling, Haibin; Sheaff, Carolyn; Blasch, Erik
2017-01-01
This paper presents the first attempt at combining Cloud with Graphic Processing Units (GPUs) in a complementary manner within the framework of a real-time high performance computation architecture for the application of detecting and tracking multiple moving targets based on Wide Area Motion Imagery (WAMI). More specifically, the GPU and Cloud Moving Target Tracking (GC-MTT) system applied a front-end web based server to perform the interaction with Hadoop and highly parallelized computation functions based on the Compute Unified Device Architecture (CUDA©). The introduced multiple moving target detection and tracking method can be extended to other applications such as pedestrian tracking, group tracking, and Patterns of Life (PoL) analysis. The cloud and GPUs based computing provides an efficient real-time target recognition and tracking approach as compared to methods when the work flow is applied using only central processing units (CPUs). The simultaneous tracking and recognition results demonstrate that a GC-MTT based approach provides drastically improved tracking with low frame rates over realistic conditions. PMID:28208684
Liu, Kui; Wei, Sixiao; Chen, Zhijiang; Jia, Bin; Chen, Genshe; Ling, Haibin; Sheaff, Carolyn; Blasch, Erik
2017-02-12
This paper presents the first attempt at combining Cloud with Graphic Processing Units (GPUs) in a complementary manner within the framework of a real-time high performance computation architecture for the application of detecting and tracking multiple moving targets based on Wide Area Motion Imagery (WAMI). More specifically, the GPU and Cloud Moving Target Tracking (GC-MTT) system applied a front-end web based server to perform the interaction with Hadoop and highly parallelized computation functions based on the Compute Unified Device Architecture (CUDA©). The introduced multiple moving target detection and tracking method can be extended to other applications such as pedestrian tracking, group tracking, and Patterns of Life (PoL) analysis. The cloud and GPUs based computing provides an efficient real-time target recognition and tracking approach as compared to methods when the work flow is applied using only central processing units (CPUs). The simultaneous tracking and recognition results demonstrate that a GC-MTT based approach provides drastically improved tracking with low frame rates over realistic conditions.
Miao, Yipu; Merz, Kenneth M
2015-04-14
We present an efficient implementation of ab initio self-consistent field (SCF) energy and gradient calculations that run on Compute Unified Device Architecture (CUDA) enabled graphical processing units (GPUs) using recurrence relations. We first discuss the machine-generated code that calculates the electron-repulsion integrals (ERIs) for different ERI types. Next we describe the porting of the SCF gradient calculation to GPUs, which results in an acceleration of the computation of the first-order derivative of the ERIs. However, only s, p, and d ERIs and s and p derivatives could be executed simultaneously on GPUs using the current version of CUDA and generation of NVidia GPUs using a previously described algorithm [Miao and Merz J. Chem. Theory Comput. 2013, 9, 965-976.]. Hence, we developed an algorithm to compute f type ERIs and d type ERI derivatives on GPUs. Our benchmarks shows the performance GPU enable ERI and ERI derivative computation yielded speedups of 10-18 times relative to traditional CPU execution. An accuracy analysis using double-precision calculations demonstrates that the overall accuracy is satisfactory for most applications.
NASA Astrophysics Data System (ADS)
Fehr, M.; Navarro, V.; Martin, L.; Fletcher, E.
2013-08-01
Space Situational Awareness[8] (SSA) is defined as the comprehensive knowledge, understanding and maintained awareness of the population of space objects, the space environment and existing threats and risks. As ESA's SSA Conjunction Prediction Service (CPS) requires the repetitive application of a processing algorithm against a data set of man-made space objects, it is crucial to exploit the highly parallelizable nature of this problem. Currently the CPS system makes use of OpenMP[7] for parallelization purposes using CPU threads, but only a GPU with its hundreds of cores can fully benefit from such high levels of parallelism. This paper presents the adaptation of several core algorithms[5] of the CPS for general-purpose computing on graphics processing units (GPGPU) using NVIDIAs Compute Unified Device Architecture (CUDA).
Multi-GPU accelerated three-dimensional FDTD method for electromagnetic simulation.
Nagaoka, Tomoaki; Watanabe, Soichi
2011-01-01
Numerical simulation with a numerical human model using the finite-difference time domain (FDTD) method has recently been performed in a number of fields in biomedical engineering. To improve the method's calculation speed and realize large-scale computing with the numerical human model, we adapt three-dimensional FDTD code to a multi-GPU environment using Compute Unified Device Architecture (CUDA). In this study, we used NVIDIA Tesla C2070 as GPGPU boards. The performance of multi-GPU is evaluated in comparison with that of a single GPU and vector supercomputer. The calculation speed with four GPUs was approximately 3.5 times faster than with a single GPU, and was slightly (approx. 1.3 times) slower than with the supercomputer. Calculation speed of the three-dimensional FDTD method using GPUs can significantly improve with an expanding number of GPUs.
NASA Astrophysics Data System (ADS)
Miller, Jacob; Sanders, Stephen; Miyake, Akimasa
2017-12-01
While quantum speed-up in solving certain decision problems by a fault-tolerant universal quantum computer has been promised, a timely research interest includes how far one can reduce the resource requirement to demonstrate a provable advantage in quantum devices without demanding quantum error correction, which is crucial for prolonging the coherence time of qubits. We propose a model device made of locally interacting multiple qubits, designed such that simultaneous single-qubit measurements on it can output probability distributions whose average-case sampling is classically intractable, under similar assumptions as the sampling of noninteracting bosons and instantaneous quantum circuits. Notably, in contrast to these previous unitary-based realizations, our measurement-based implementation has two distinctive features. (i) Our implementation involves no adaptation of measurement bases, leading output probability distributions to be generated in constant time, independent of the system size. Thus, it could be implemented in principle without quantum error correction. (ii) Verifying the classical intractability of our sampling is done by changing the Pauli measurement bases only at certain output qubits. Our usage of random commuting quantum circuits in place of computationally universal circuits allows a unique unification of sampling and verification, so they require the same physical resource requirements in contrast to the more demanding verification protocols seen elsewhere in the literature.
BarraCUDA - a fast short read sequence aligner using graphics processing units
2012-01-01
Background With the maturation of next-generation DNA sequencing (NGS) technologies, the throughput of DNA sequencing reads has soared to over 600 gigabases from a single instrument run. General purpose computing on graphics processing units (GPGPU), extracts the computing power from hundreds of parallel stream processors within graphics processing cores and provides a cost-effective and energy efficient alternative to traditional high-performance computing (HPC) clusters. In this article, we describe the implementation of BarraCUDA, a GPGPU sequence alignment software that is based on BWA, to accelerate the alignment of sequencing reads generated by these instruments to a reference DNA sequence. Findings Using the NVIDIA Compute Unified Device Architecture (CUDA) software development environment, we ported the most computational-intensive alignment component of BWA to GPU to take advantage of the massive parallelism. As a result, BarraCUDA offers a magnitude of performance boost in alignment throughput when compared to a CPU core while delivering the same level of alignment fidelity. The software is also capable of supporting multiple CUDA devices in parallel to further accelerate the alignment throughput. Conclusions BarraCUDA is designed to take advantage of the parallelism of GPU to accelerate the alignment of millions of sequencing reads generated by NGS instruments. By doing this, we could, at least in part streamline the current bioinformatics pipeline such that the wider scientific community could benefit from the sequencing technology. BarraCUDA is currently available from http://seqbarracuda.sf.net PMID:22244497
Establishing a Novel Modeling Tool: A Python-Based Interface for a Neuromorphic Hardware System
Brüderle, Daniel; Müller, Eric; Davison, Andrew; Muller, Eilif; Schemmel, Johannes; Meier, Karlheinz
2008-01-01
Neuromorphic hardware systems provide new possibilities for the neuroscience modeling community. Due to the intrinsic parallelism of the micro-electronic emulation of neural computation, such models are highly scalable without a loss of speed. However, the communities of software simulator users and neuromorphic engineering in neuroscience are rather disjoint. We present a software concept that provides the possibility to establish such hardware devices as valuable modeling tools. It is based on the integration of the hardware interface into a simulator-independent language which allows for unified experiment descriptions that can be run on various simulation platforms without modification, implying experiment portability and a huge simplification of the quantitative comparison of hardware and simulator results. We introduce an accelerated neuromorphic hardware device and describe the implementation of the proposed concept for this system. An example setup and results acquired by utilizing both the hardware system and a software simulator are demonstrated. PMID:19562085
Establishing a novel modeling tool: a python-based interface for a neuromorphic hardware system.
Brüderle, Daniel; Müller, Eric; Davison, Andrew; Muller, Eilif; Schemmel, Johannes; Meier, Karlheinz
2009-01-01
Neuromorphic hardware systems provide new possibilities for the neuroscience modeling community. Due to the intrinsic parallelism of the micro-electronic emulation of neural computation, such models are highly scalable without a loss of speed. However, the communities of software simulator users and neuromorphic engineering in neuroscience are rather disjoint. We present a software concept that provides the possibility to establish such hardware devices as valuable modeling tools. It is based on the integration of the hardware interface into a simulator-independent language which allows for unified experiment descriptions that can be run on various simulation platforms without modification, implying experiment portability and a huge simplification of the quantitative comparison of hardware and simulator results. We introduce an accelerated neuromorphic hardware device and describe the implementation of the proposed concept for this system. An example setup and results acquired by utilizing both the hardware system and a software simulator are demonstrated.
ERIC Educational Resources Information Center
Holbrook, M. Cay; MacCuspie, P. Ann
2010-01-01
Braille-reading mathematicians, scientists, and computer scientists were asked to examine the usability of the Unified English Braille Code (UEB) for technical materials. They had little knowledge of the code prior to the study. The research included two reading tasks, a short tutorial about UEB, and a focus group. The results indicated that the…
Zheng, Song; Zhang, Qi; Zheng, Rong; Huang, Bi-Qin; Song, Yi-Lin; Chen, Xin-Chu
2017-01-01
In recent years, the smart home field has gained wide attention for its broad application prospects. However, families using smart home systems must usually adopt various heterogeneous smart devices, including sensors and devices, which makes it more difficult to manage and control their home system. How to design a unified control platform to deal with the collaborative control problem of heterogeneous smart devices is one of the greatest challenges in the current smart home field. The main contribution of this paper is to propose a universal smart home control platform architecture (IAPhome) based on a multi-agent system and communication middleware, which shows significant adaptability and advantages in many aspects, including heterogeneous devices connectivity, collaborative control, human-computer interaction and user self-management. The communication middleware is an important foundation to design and implement this architecture which makes it possible to integrate heterogeneous smart devices in a flexible way. A concrete method of applying the multi-agent software technique to solve the integrated control problem of the smart home system is also presented. The proposed platform architecture has been tested in a real smart home environment, and the results indicate that the effectiveness of our approach for solving the collaborative control problem of different smart devices. PMID:28926957
Zheng, Song; Zhang, Qi; Zheng, Rong; Huang, Bi-Qin; Song, Yi-Lin; Chen, Xin-Chu
2017-09-16
In recent years, the smart home field has gained wide attention for its broad application prospects. However, families using smart home systems must usually adopt various heterogeneous smart devices, including sensors and devices, which makes it more difficult to manage and control their home system. How to design a unified control platform to deal with the collaborative control problem of heterogeneous smart devices is one of the greatest challenges in the current smart home field. The main contribution of this paper is to propose a universal smart home control platform architecture (IAPhome) based on a multi-agent system and communication middleware, which shows significant adaptability and advantages in many aspects, including heterogeneous devices connectivity, collaborative control, human-computer interaction and user self-management. The communication middleware is an important foundation to design and implement this architecture which makes it possible to integrate heterogeneous smart devices in a flexible way. A concrete method of applying the multi-agent software technique to solve the integrated control problem of the smart home system is also presented. The proposed platform architecture has been tested in a real smart home environment, and the results indicate that the effectiveness of our approach for solving the collaborative control problem of different smart devices.
Pokorney, Sean D; Greenfield, Ruth Ann; Atwater, Brett D; Daubert, James P; Piccini, Jonathan P
2014-12-01
Battery failure is an uncommon complication of implantable cardioverter-defibrillators (ICDs), but unanticipated battery depletion can have life-threatening consequences. The purpose of this study was to describe the prevalence of a novel mechanism of battery failure in St. Jude Medical Fortify and Unify ICDs. Cases of premature Fortify battery failure from a single center are reported. A search (January 1, 2010 through November 30, 2013) for Fortify and Unify premature batter failure was conducted of the Food and Drug Administration's Manufacturer and User Facility Device Experience Database (MAUDE). These findings were supplemented with information provided by St. Jude Medical. Premature battery failure for 2 Fortify ICDs in our practice were attributed to the presence of lithium clusters near the cathode, causing a short circuit and high current drain. The prevalence of this mechanism of premature battery failure was 0.6% in our practice. A MAUDE search identified 39 cases of Fortify (30) and Unify (9) premature battery depletion confirmed by the manufacturer, representing a 0.03% prevalence. Four additional Fortify and 2 Unify cases were identified in MAUDE as suspected premature battery depletion, but in these cases the pulse generator was not returned to the manufacturer for evaluation. St. Jude Medical identified 10 cases of premature battery failure due to lithium clusters in Fortify devices (9) and Unify devices (1), representing a 0.004% prevalence. The deposition of lithium clusters near the cathode is a novel mechanism of premature battery failure. The prevalence of this problem is unknown. Providers should be aware of this mechanism for patient management. Copyright © 2014 Heart Rhythm Society. Published by Elsevier Inc. All rights reserved.
Exploiting current-generation graphics hardware for synthetic-scene generation
NASA Astrophysics Data System (ADS)
Tanner, Michael A.; Keen, Wayne A.
2010-04-01
Increasing seeker frame rate and pixel count, as well as the demand for higher levels of scene fidelity, have driven scene generation software for hardware-in-the-loop (HWIL) and software-in-the-loop (SWIL) testing to higher levels of parallelization. Because modern PC graphics cards provide multiple computational cores (240 shader cores for a current NVIDIA Corporation GeForce and Quadro cards), implementation of phenomenology codes on graphics processing units (GPUs) offers significant potential for simultaneous enhancement of simulation frame rate and fidelity. To take advantage of this potential requires algorithm implementation that is structured to minimize data transfers between the central processing unit (CPU) and the GPU. In this paper, preliminary methodologies developed at the Kinetic Hardware In-The-Loop Simulator (KHILS) will be presented. Included in this paper will be various language tradeoffs between conventional shader programming, Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL), including performance trades and possible pathways for future tool development.
Kohno, R; Hotta, K; Nishioka, S; Matsubara, K; Tansho, R; Suzuki, T
2011-11-21
We implemented the simplified Monte Carlo (SMC) method on graphics processing unit (GPU) architecture under the computer-unified device architecture platform developed by NVIDIA. The GPU-based SMC was clinically applied for four patients with head and neck, lung, or prostate cancer. The results were compared to those obtained by a traditional CPU-based SMC with respect to the computation time and discrepancy. In the CPU- and GPU-based SMC calculations, the estimated mean statistical errors of the calculated doses in the planning target volume region were within 0.5% rms. The dose distributions calculated by the GPU- and CPU-based SMCs were similar, within statistical errors. The GPU-based SMC showed 12.30-16.00 times faster performance than the CPU-based SMC. The computation time per beam arrangement using the GPU-based SMC for the clinical cases ranged 9-67 s. The results demonstrate the successful application of the GPU-based SMC to a clinical proton treatment planning.
MIGS-GPU: Microarray Image Gridding and Segmentation on the GPU.
Katsigiannis, Stamos; Zacharia, Eleni; Maroulis, Dimitris
2017-05-01
Complementary DNA (cDNA) microarray is a powerful tool for simultaneously studying the expression level of thousands of genes. Nevertheless, the analysis of microarray images remains an arduous and challenging task due to the poor quality of the images that often suffer from noise, artifacts, and uneven background. In this study, the MIGS-GPU [Microarray Image Gridding and Segmentation on Graphics Processing Unit (GPU)] software for gridding and segmenting microarray images is presented. MIGS-GPU's computations are performed on the GPU by means of the compute unified device architecture (CUDA) in order to achieve fast performance and increase the utilization of available system resources. Evaluation on both real and synthetic cDNA microarray images showed that MIGS-GPU provides better performance than state-of-the-art alternatives, while the proposed GPU implementation achieves significantly lower computational times compared to the respective CPU approaches. Consequently, MIGS-GPU can be an advantageous and useful tool for biomedical laboratories, offering a user-friendly interface that requires minimum input in order to run.
A versatile model for soft patchy particles with various patch arrangements.
Li, Zhan-Wei; Zhu, You-Liang; Lu, Zhong-Yuan; Sun, Zhao-Yan
2016-01-21
We propose a simple and general mesoscale soft patchy particle model, which can felicitously describe the deformable and surface-anisotropic characteristics of soft patchy particles. This model can be used in dynamics simulations to investigate the aggregation behavior and mechanism of various types of soft patchy particles with tunable number, size, direction, and geometrical arrangement of the patches. To improve the computational efficiency of this mesoscale model in dynamics simulations, we give the simulation algorithm that fits the compute unified device architecture (CUDA) framework of NVIDIA graphics processing units (GPUs). The validation of the model and the performance of the simulations using GPUs are demonstrated by simulating several benchmark systems of soft patchy particles with 1 to 4 patches in a regular geometrical arrangement. Because of its simplicity and computational efficiency, the soft patchy particle model will provide a powerful tool to investigate the aggregation behavior of soft patchy particles, such as patchy micelles, patchy microgels, and patchy dendrimers, over larger spatial and temporal scales.
Acceleration of FDTD mode solver by high-performance computing techniques.
Han, Lin; Xi, Yanping; Huang, Wei-Ping
2010-06-21
A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the bench-mark finite-difference (FD) eigen mode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is in the same order of magnitude of the standard finite-difference eigen mode solver and yet require much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigen value mode solvers are no longer applicable due to memory limitation.
An, Gary; Bartels, John; Vodovotz, Yoram
2011-01-01
The clinical translation of promising basic biomedical findings, whether derived from reductionist studies in academic laboratories or as the product of extensive high-throughput and –content screens in the biotechnology and pharmaceutical industries, has reached a period of stagnation in which ever higher research and development costs are yielding ever fewer new drugs. Systems biology and computational modeling have been touted as potential avenues by which to break through this logjam. However, few mechanistic computational approaches are utilized in a manner that is fully cognizant of the inherent clinical realities in which the drugs developed through this ostensibly rational process will be ultimately used. In this article, we present a Translational Systems Biology approach to inflammation. This approach is based on the use of mechanistic computational modeling centered on inherent clinical applicability, namely that a unified suite of models can be applied to generate in silico clinical trials, individualized computational models as tools for personalized medicine, and rational drug and device design based on disease mechanism. PMID:21552346
A real-time spike sorting method based on the embedded GPU.
Zelan Yang; Kedi Xu; Xiang Tian; Shaomin Zhang; Xiaoxiang Zheng
2017-07-01
Microelectrode arrays with hundreds of channels have been widely used to acquire neuron population signals in neuroscience studies. Online spike sorting is becoming one of the most important challenges for high-throughput neural signal acquisition systems. Graphic processing unit (GPU) with high parallel computing capability might provide an alternative solution for increasing real-time computational demands on spike sorting. This study reported a method of real-time spike sorting through computing unified device architecture (CUDA) which was implemented on an embedded GPU (NVIDIA JETSON Tegra K1, TK1). The sorting approach is based on the principal component analysis (PCA) and K-means. By analyzing the parallelism of each process, the method was further optimized in the thread memory model of GPU. Our results showed that the GPU-based classifier on TK1 is 37.92 times faster than the MATLAB-based classifier on PC while their accuracies were the same with each other. The high-performance computing features of embedded GPU demonstrated in our studies suggested that the embedded GPU provide a promising platform for the real-time neural signal processing.
GPU Accelerated Vector Median Filter
NASA Technical Reports Server (NTRS)
Aras, Rifat; Shen, Yuzhong
2011-01-01
Noise reduction is an important step for most image processing tasks. For three channel color images, a widely used technique is vector median filter in which color values of pixels are treated as 3-component vectors. Vector median filters are computationally expensive; for a window size of n x n, each of the n(sup 2) vectors has to be compared with other n(sup 2) - 1 vectors in distances. General purpose computation on graphics processing units (GPUs) is the paradigm of utilizing high-performance many-core GPU architectures for computation tasks that are normally handled by CPUs. In this work. NVIDIA's Compute Unified Device Architecture (CUDA) paradigm is used to accelerate vector median filtering. which has to the best of our knowledge never been done before. The performance of GPU accelerated vector median filter is compared to that of the CPU and MPI-based versions for different image and window sizes, Initial findings of the study showed 100x improvement of performance of vector median filter implementation on GPUs over CPU implementations and further speed-up is expected after more extensive optimizations of the GPU algorithm .
Wang, Chunfei; Zhang, Guang; Wu, Taihu; Zhan, Ningbo; Wang, Yaling
2016-03-01
High-quality cardiopulmonary resuscitation contributes to cardiac arrest survival. The traditional chest compression (CC) standard, which neglects individual differences, uses unified standards for compression depth and compression rate in practice. In this study, an effective and personalized CC method for automatic mechanical compression devices is provided. We rebuild Charles F. Babbs' human circulation model with a coronary perfusion pressure (CPP) simulation module and propose a closed-loop controller based on a fuzzy control algorithm for CCs, which adjusts the CC depth according to the CPP. Compared with a traditional proportion-integration-differentiation (PID) controller, the performance of the fuzzy controller is evaluated in computer simulation studies. The simulation results demonstrate that the fuzzy closed-loop controller results in shorter regulation time, fewer oscillations and smaller overshoot than traditional PID controllers and outperforms the traditional PID controller for CPP regulation and maintenance.
POM.gpu-v1.0: a GPU-based Princeton Ocean Model
NASA Astrophysics Data System (ADS)
Xu, S.; Huang, X.; Oey, L.-Y.; Xu, F.; Fu, H.; Zhang, Y.; Yang, G.
2015-09-01
Graphics processing units (GPUs) are an attractive solution in many scientific applications due to their high performance. However, most existing GPU conversions of climate models use GPUs for only a few computationally intensive regions. In the present study, we redesign the mpiPOM (a parallel version of the Princeton Ocean Model) with GPUs. Specifically, we first convert the model from its original Fortran form to a new Compute Unified Device Architecture C (CUDA-C) code, then we optimize the code on each of the GPUs, the communications between the GPUs, and the I / O between the GPUs and the central processing units (CPUs). We show that the performance of the new model on a workstation containing four GPUs is comparable to that on a powerful cluster with 408 standard CPU cores, and it reduces the energy consumption by a factor of 6.8.
CUDAEASY - a GPU accelerated cosmological lattice program
NASA Astrophysics Data System (ADS)
Sainio, J.
2010-05-01
This paper presents, to the author's knowledge, the first graphics processing unit (GPU) accelerated program that solves the evolution of interacting scalar fields in an expanding universe. We present the implementation in NVIDIA's Compute Unified Device Architecture (CUDA) and compare the performance to other similar programs in chaotic inflation models. We report speedups between one and two orders of magnitude depending on the used hardware and software while achieving small errors in single precision. Simulations that used to last roughly one day to compute can now be done in hours and this difference is expected to increase in the future. The program has been written in the spirit of LATTICEEASY and users of the aforementioned program should find it relatively easy to start using CUDAEASY in lattice simulations. The program is available at http://www.physics.utu.fi/theory/particlecosmology/cudaeasy/ under the GNU General Public License.
NASA Astrophysics Data System (ADS)
Panov, Yu. D.; Moskvin, A. S.; Rybakov, F. N.; Borisov, A. B.
2016-12-01
We made use of a special algorithm for compute unified device architecture for NVIDIA graphics cards, a nonlinear conjugate-gradient method to minimize energy functional, and Monte-Carlo technique to directly observe the forming of the ground state configuration for the 2D hard-core bosons by lowering the temperature and its evolution with deviation away from half-filling. The novel technique allowed us to examine earlier implications and uncover novel features of the phase transitions, in particular, look upon the nucleation of the odd domain structure, emergence of filamentary superfluidity nucleated at the antiphase domain walls of the charge-ordered phase, and nucleation and evolution of different topological structures.
Unified Engineering Software System
NASA Technical Reports Server (NTRS)
Purves, L. R.; Gordon, S.; Peltzman, A.; Dube, M.
1989-01-01
Collection of computer programs performs diverse functions in prototype engineering. NEXUS, NASA Engineering Extendible Unified Software system, is research set of computer programs designed to support full sequence of activities encountered in NASA engineering projects. Sequence spans preliminary design, design analysis, detailed design, manufacturing, assembly, and testing. Primarily addresses process of prototype engineering, task of getting single or small number of copies of product to work. Written in FORTRAN 77 and PROLOG.
UFC advisor: An AI-based system for the automatic test environment
NASA Technical Reports Server (NTRS)
Lincoln, David T.; Fink, Pamela K.
1990-01-01
The Air Logistics Command within the Air Force is responsible for maintaining a wide variety of aircraft fleets and weapon systems. To maintain these fleets and systems requires specialized test equipment that provides data concerning the behavior of a particular device. The test equipment is used to 'poke and prod' the device to determine its functionality. The data represent voltages, pressures, torques, temperatures, etc. and are called testpoints. These testpoints can be defined numerically as being in or out of limits/tolerance. Some test equipment is termed 'automatic' because it is computer-controlled. Due to the fact that effective maintenance in the test arena requires a significant amount of expertise, it is an ideal area for the application of knowledge-based system technology. Such a system would take testpoint data, identify values out-of-limits, and determine potential underlying problems based on what is out-of-limits and how far. This paper discusses the application of this technology to a device called the Unified Fuel Control (UFC) which is maintained in this manner.
VIDANA: Data Management System for Nano Satellites
NASA Astrophysics Data System (ADS)
Montenegro, Sergio; Walter, Thomas; Dilger, Erik
2013-08-01
A Vidana data management system is a network of software and hardware components. This implies a software network, a hardware network and a smooth connection between both of them. Our strategy is based on our innovative middleware. A reliable interconnection network (SW & HW) which can interconnect many unreliable redundant components such as sensors, actuators, communication devices, computers, and storage elements,... and software components! Component failures are detected, the affected device is disabled and its function is taken over by a redundant component. Our middleware doesn't connect only software, but also devices and software together. Software and hardware communicate with each other without having to distinguish which functions are in software and which are implemented in hardware. Components may be turned on and off at any time, and the whole system will autonomously adapt to its new configuration in order to continue fulfilling its task. In VIDANA we aim dynamic adaptability (run tine), static adaptability (tailoring), and unified HW/SW communication protocols. For many of these aspects we use "learn from the nature" where we can find astonishing reference implementations.
Numerical Simulation of Transit-Time Ultrasonic Flowmeters by a Direct Approach.
Luca, Adrian; Marchiano, Regis; Chassaing, Jean-Camille
2016-06-01
This paper deals with the development of a computational code for the numerical simulation of wave propagation through domains with a complex geometry consisting in both solids and moving fluids. The emphasis is on the numerical simulation of ultrasonic flowmeters (UFMs) by modeling the wave propagation in solids with the equations of linear elasticity (ELE) and in fluids with the linearized Euler equations (LEEs). This approach requires high performance computing because of the high number of degrees of freedom and the long propagation distances. Therefore, the numerical method should be chosen with care. In order to minimize the numerical dissipation which may occur in this kind of configuration, the numerical method employed here is the nodal discontinuous Galerkin (DG) method. Also, this method is well suited for parallel computing. To speed up the code, almost all the computational stages have been implemented to run on graphical processing unit (GPU) by using the compute unified device architecture (CUDA) programming model from NVIDIA. This approach has been validated and then used for the two-dimensional simulation of gas UFMs. The large contrast of acoustic impedance characteristic to gas UFMs makes their simulation a real challenge.
2011-10-14
landscapes. It is motivated by statistical learning arguments and unifies the tasks of biasing the molecular dynamics to escape free energy wells and...statistical learning arguments and unifies the tasks of biasing the molecular dynamics to escape free energy wells and estimating the free energy...experimentally, to characterize global changes as well as investigate relative stabilities. In most applications, a brute- force computation based on
A Unified Methodology for Computing Accurate Quaternion Color Moments and Moment Invariants.
Karakasis, Evangelos G; Papakostas, George A; Koulouriotis, Dimitrios E; Tourassis, Vassilios D
2014-02-01
In this paper, a general framework for computing accurate quaternion color moments and their corresponding invariants is proposed. The proposed unified scheme arose by studying the characteristics of different orthogonal polynomials. These polynomials are used as kernels in order to form moments, the invariants of which can easily be derived. The resulted scheme permits the usage of any polynomial-like kernel in a unified and consistent way. The resulted moments and moment invariants demonstrate robustness to noisy conditions and high discriminative power. Additionally, in the case of continuous moments, accurate computations take place to avoid approximation errors. Based on this general methodology, the quaternion Tchebichef, Krawtchouk, Dual Hahn, Legendre, orthogonal Fourier-Mellin, pseudo Zernike and Zernike color moments, and their corresponding invariants are introduced. A selected paradigm presents the reconstruction capability of each moment family, whereas proper classification scenarios evaluate the performance of color moment invariants.
A convergent model for distributed processing of Big Sensor Data in urban engineering networks
NASA Astrophysics Data System (ADS)
Parygin, D. S.; Finogeev, A. G.; Kamaev, V. A.; Finogeev, A. A.; Gnedkova, E. P.; Tyukov, A. P.
2017-01-01
The problems of development and research of a convergent model of the grid, cloud, fog and mobile computing for analytical Big Sensor Data processing are reviewed. The model is meant to create monitoring systems of spatially distributed objects of urban engineering networks and processes. The proposed approach is the convergence model of the distributed data processing organization. The fog computing model is used for the processing and aggregation of sensor data at the network nodes and/or industrial controllers. The program agents are loaded to perform computing tasks for the primary processing and data aggregation. The grid and the cloud computing models are used for integral indicators mining and accumulating. A computing cluster has a three-tier architecture, which includes the main server at the first level, a cluster of SCADA system servers at the second level, a lot of GPU video cards with the support for the Compute Unified Device Architecture at the third level. The mobile computing model is applied to visualize the results of intellectual analysis with the elements of augmented reality and geo-information technologies. The integrated indicators are transferred to the data center for accumulation in a multidimensional storage for the purpose of data mining and knowledge gaining.
A CFD Heterogeneous Parallel Solver Based on Collaborating CPU and GPU
NASA Astrophysics Data System (ADS)
Lai, Jianqi; Tian, Zhengyu; Li, Hua; Pan, Sha
2018-03-01
Since Graphic Processing Unit (GPU) has a strong ability of floating-point computation and memory bandwidth for data parallelism, it has been widely used in the areas of common computing such as molecular dynamics (MD), computational fluid dynamics (CFD) and so on. The emergence of compute unified device architecture (CUDA), which reduces the complexity of compiling program, brings the great opportunities to CFD. There are three different modes for parallel solution of NS equations: parallel solver based on CPU, parallel solver based on GPU and heterogeneous parallel solver based on collaborating CPU and GPU. As we can see, GPUs are relatively rich in compute capacity but poor in memory capacity and the CPUs do the opposite. We need to make full use of the GPUs and CPUs, so a CFD heterogeneous parallel solver based on collaborating CPU and GPU has been established. Three cases are presented to analyse the solver’s computational accuracy and heterogeneous parallel efficiency. The numerical results agree well with experiment results, which demonstrate that the heterogeneous parallel solver has high computational precision. The speedup on a single GPU is more than 40 for laminar flow, it decreases for turbulent flow, but it still can reach more than 20. What’s more, the speedup increases as the grid size becomes larger.
Technical Evaluation of Sample-Processing, Collection, and Preservation Methods
2014-07-01
For the Gram-positive organism, B. atrophaeus var. globigii (Unified Culture Collection [ UCC ] designation: BACI051) was selected as a surrogate for...the well-known biothreat agent Bacillus anthracis. For the Gram-negative organism, Y. pestis CO92 (pgm–) ( UCC designation: YERS059) was selected...Diagnostics device) TAMRA tetramethylrhodamine TE buffer tris-ethylenediaminetetraacetic acid buffer UCC Unified Culture Collection USG U.S. Government
Unified first wall - blanket structure for plasma device applications
Gruen, D.M.
A plasma device is described for use in controlling nuclear reactions within the plasma including a first wall and blanket formed in a one-piece structure composed of a solid solution containing copper and lithium and melting above about 500/sup 0/C.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peyret, Thomas; Poulin, Patrick; Krishnan, Kannan, E-mail: kannan.krishnan@umontreal.ca
The algorithms in the literature focusing to predict tissue:blood PC (P{sub tb}) for environmental chemicals and tissue:plasma PC based on total (K{sub p}) or unbound concentration (K{sub pu}) for drugs differ in their consideration of binding to hemoglobin, plasma proteins and charged phospholipids. The objective of the present study was to develop a unified algorithm such that P{sub tb}, K{sub p} and K{sub pu} for both drugs and environmental chemicals could be predicted. The development of the unified algorithm was accomplished by integrating all mechanistic algorithms previously published to compute the PCs. Furthermore, the algorithm was structured in such amore » way as to facilitate predictions of the distribution of organic compounds at the macro (i.e. whole tissue) and micro (i.e. cells and fluids) levels. The resulting unified algorithm was applied to compute the rat P{sub tb}, K{sub p} or K{sub pu} of muscle (n = 174), liver (n = 139) and adipose tissue (n = 141) for acidic, neutral, zwitterionic and basic drugs as well as ketones, acetate esters, alcohols, aliphatic hydrocarbons, aromatic hydrocarbons and ethers. The unified algorithm reproduced adequately the values predicted previously by the published algorithms for a total of 142 drugs and chemicals. The sensitivity analysis demonstrated the relative importance of the various compound properties reflective of specific mechanistic determinants relevant to prediction of PC values of drugs and environmental chemicals. Overall, the present unified algorithm uniquely facilitates the computation of macro and micro level PCs for developing organ and cellular-level PBPK models for both chemicals and drugs.« less
NASA Astrophysics Data System (ADS)
Qian, Feng; Li, Guoqiang
2001-12-01
In this paper a generalized look-ahead logic algorithm for number conversion from signed-digit to its complement representation is developed. By properly encoding the signed digits, all the operations are performed by binary logic, and unified logical expressions can be obtained for conversion from modified-signed-digit (MSD) to 2's complement, trinary signed-digit (TSD) to 3's complement, and quaternary signed-digit (QSD) to 4's complement. For optical implementation, a parallel logical array module using electron-trapping device is employed, which is suitable for realizing complex logic functions in the form of sum-of-product. The proposed algorithm and architecture are compatible with a general-purpose optoelectronic computing system.
Fast simulation of the NICER instrument
NASA Astrophysics Data System (ADS)
Doty, John P.; Wampler-Doty, Matthew P.; Prigozhin, Gregory Y.; Okajima, Takashi; Arzoumanian, Zaven; Gendreau, Keith
2016-07-01
The NICER1 mission uses a complicated physical system to collect information from objects that are, by x-ray timing science standards, rather faint. To get the most out of the data we will need a rigorous understanding of all instrumental effects. We are in the process of constructing a very fast, high fidelity simulator that will help us to assess instrument performance, support simulation-based data reduction, and improve our estimates of measurement error. We will combine and extend existing optics, detector, and electronics simulations. We will employ the Compute Unified Device Architecture (CUDA2) to parallelize these calculations. The price of suitable CUDA-compatible multi-giga op cores is about $0.20/core, so this approach will be very cost-effective.
Workflow of the Grover algorithm simulation incorporating CUDA and GPGPU
NASA Astrophysics Data System (ADS)
Lu, Xiangwen; Yuan, Jiabin; Zhang, Weiwei
2013-09-01
The Grover quantum search algorithm, one of only a few representative quantum algorithms, can speed up many classical algorithms that use search heuristics. No true quantum computer has yet been developed. For the present, simulation is one effective means of verifying the search algorithm. In this work, we focus on the simulation workflow using a compute unified device architecture (CUDA). Two simulation workflow schemes are proposed. These schemes combine the characteristics of the Grover algorithm and the parallelism of general-purpose computing on graphics processing units (GPGPU). We also analyzed the optimization of memory space and memory access from this perspective. We implemented four programs on CUDA to evaluate the performance of schemes and optimization. Through experimentation, we analyzed the organization of threads suited to Grover algorithm simulations, compared the storage costs of the four programs, and validated the effectiveness of optimization. Experimental results also showed that the distinguished program on CUDA outperformed the serial program of libquantum on a CPU with a speedup of up to 23 times (12 times on average), depending on the scale of the simulation.
Kwon, M-W; Kim, S-C; Yoon, S-E; Ho, Y-S; Kim, E-S
2015-02-09
A new object tracking mask-based novel-look-up-table (OTM-NLUT) method is proposed and implemented on graphics-processing-units (GPUs) for real-time generation of holographic videos of three-dimensional (3-D) scenes. Since the proposed method is designed to be matched with software and memory structures of the GPU, the number of compute-unified-device-architecture (CUDA) kernel function calls and the computer-generated hologram (CGH) buffer size of the proposed method have been significantly reduced. It therefore results in a great increase of the computational speed of the proposed method and enables real-time generation of CGH patterns of 3-D scenes. Experimental results show that the proposed method can generate 31.1 frames of Fresnel CGH patterns with 1,920 × 1,080 pixels per second, on average, for three test 3-D video scenarios with 12,666 object points on three GPU boards of NVIDIA GTX TITAN, and confirm the feasibility of the proposed method in the practical application of electro-holographic 3-D displays.
Kwon, Min-Woo; Kim, Seung-Cheol; Kim, Eun-Soo
2016-01-20
A three-directional motion-compensation mask-based novel look-up table method is proposed and implemented on graphics processing units (GPUs) for video-rate generation of digital holographic videos of three-dimensional (3D) scenes. Since the proposed method is designed to be well matched with the software and memory structures of GPUs, the number of compute-unified-device-architecture kernel function calls can be significantly reduced. This results in a great increase of the computational speed of the proposed method, allowing video-rate generation of the computer-generated hologram (CGH) patterns of 3D scenes. Experimental results reveal that the proposed method can generate 39.8 frames of Fresnel CGH patterns with 1920×1080 pixels per second for the test 3D video scenario with 12,088 object points on dual GPU boards of NVIDIA GTX TITANs, and they confirm the feasibility of the proposed method in the practical application fields of electroholographic 3D displays.
A Unified Taxonomic Approach to the Laboratory Assessment of Visionic Devices
2006-09-01
the ratification stage with member nations. Marasco and Task 4 presented a large array of tests applicable to image intensification-based visionic...aircraft.” In print. 4. Marasco , P. L., and Task, H. L. 1999. “Optical characterization of wide field-of-view night vision devices,” in
Visual based laser speckle pattern recognition method for structural health monitoring
NASA Astrophysics Data System (ADS)
Park, Kyeongtaek; Torbol, Marco
2017-04-01
This study performed the system identification of a target structure by analyzing the laser speckle pattern taken by a camera. The laser speckle pattern is generated by the diffuse reflection of the laser beam on a rough surface of the target structure. The camera, equipped with a red filter, records the scattered speckle particles of the laser light in real time and the raw speckle image of the pixel data is fed to the graphic processing unit (GPU) in the system. The algorithm for laser speckle contrast analysis (LASCA) computes: the laser speckle contrast images and the laser speckle flow images. The k-mean clustering algorithm is used to classify the pixels in each frame and the clusters' centroids, which function as virtual sensors, track the displacement between different frames in time domain. The fast Fourier transform (FFT) and the frequency domain decomposition (FDD) compute the modal properties of the structure: natural frequencies and damping ratios. This study takes advantage of the large scale computational capability of GPU. The algorithm is written in Compute Unifies Device Architecture (CUDA C) that allows the processing of speckle images in real time.
SU (2) lattice gauge theory simulations on Fermi GPUs
NASA Astrophysics Data System (ADS)
Cardoso, Nuno; Bicudo, Pedro
2011-05-01
In this work we explore the performance of CUDA in quenched lattice SU (2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi architectures) are also presented. In order to obtain a high performance, the code must be optimized for the GPU architecture, i.e., an implementation that exploits the memory hierarchy of the CUDA programming model. We produce codes for the Monte Carlo generation of SU (2) lattice gauge configurations, for the mean plaquette, for the Polyakov Loop at finite T and for the Wilson loop. We also present results for the potential using many configurations (50,000) without smearing and almost 2000 configurations with APE smearing. With two Fermi GPUs we have achieved an excellent performance of 200× the speed over one CPU, in single precision, around 110 Gflops/s. We also find that, using the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2× slower) than single precision computations.
A model for calculating expected performance of the Apollo unified S-band (USB) communication system
NASA Technical Reports Server (NTRS)
Schroeder, N. W.
1971-01-01
A model for calculating the expected performance of the Apollo unified S-band (USB) communication system is presented. The general organization of the Apollo USB is described. The mathematical model is reviewed and the computer program for implementation of the calculations is included.
Eghtesad, Adnan; Germaschewski, Kai; Beyerlein, Irene J.; ...
2017-10-14
We present the first high-performance computing implementation of the meso-scale phase field dislocation dynamics (PFDD) model on a graphics processing unit (GPU)-based platform. The implementation takes advantage of the portable OpenACC standard directive pragmas along with Nvidia's compute unified device architecture (CUDA) fast Fourier transform (FFT) library called CUFFT to execute the FFT computations within the PFDD formulation on the same GPU platform. The overall implementation is termed ACCPFDD-CUFFT. The package is entirely performance portable due to the use of OPENACC-CUDA inter-operability, in which calls to CUDA functions are replaced with the OPENACC data regions for a host central processingmore » unit (CPU) and device (GPU). A comprehensive benchmark study has been conducted, which compares a number of FFT routines, the Numerical Recipes FFT (FOURN), Fastest Fourier Transform in the West (FFTW), and the CUFFT. The last one exploits the advantages of the GPU hardware for FFT calculations. The novel ACCPFDD-CUFFT implementation is verified using the analytical solutions for the stress field around an infinite edge dislocation and subsequently applied to simulate the interaction and motion of dislocations through a bi-phase copper-nickel (Cu–Ni) interface. It is demonstrated that the ACCPFDD-CUFFT implementation on a single TESLA K80 GPU offers a 27.6X speedup relative to the serial version and a 5X speedup relative to the 22-multicore Intel Xeon CPU E5-2699 v4 @ 2.20 GHz version of the code.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eghtesad, Adnan; Germaschewski, Kai; Beyerlein, Irene J.
We present the first high-performance computing implementation of the meso-scale phase field dislocation dynamics (PFDD) model on a graphics processing unit (GPU)-based platform. The implementation takes advantage of the portable OpenACC standard directive pragmas along with Nvidia's compute unified device architecture (CUDA) fast Fourier transform (FFT) library called CUFFT to execute the FFT computations within the PFDD formulation on the same GPU platform. The overall implementation is termed ACCPFDD-CUFFT. The package is entirely performance portable due to the use of OPENACC-CUDA inter-operability, in which calls to CUDA functions are replaced with the OPENACC data regions for a host central processingmore » unit (CPU) and device (GPU). A comprehensive benchmark study has been conducted, which compares a number of FFT routines, the Numerical Recipes FFT (FOURN), Fastest Fourier Transform in the West (FFTW), and the CUFFT. The last one exploits the advantages of the GPU hardware for FFT calculations. The novel ACCPFDD-CUFFT implementation is verified using the analytical solutions for the stress field around an infinite edge dislocation and subsequently applied to simulate the interaction and motion of dislocations through a bi-phase copper-nickel (Cu–Ni) interface. It is demonstrated that the ACCPFDD-CUFFT implementation on a single TESLA K80 GPU offers a 27.6X speedup relative to the serial version and a 5X speedup relative to the 22-multicore Intel Xeon CPU E5-2699 v4 @ 2.20 GHz version of the code.« less
Chen, Qingkui; Zhao, Deyu; Wang, Jingjuan
2017-01-01
This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) Programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamic coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes’ diversity, algorithm characteristics, etc. The results show that the performance of the algorithms significantly increased by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell on average with TLPOM and the RGCA ensures that our IoT computing system provides low-cost and high-reliability services. PMID:28777325
Fang, Yuling; Chen, Qingkui; Xiong, Neal N; Zhao, Deyu; Wang, Jingjuan
2017-08-04
This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) Programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamic coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes' diversity, algorithm characteristics, etc. The results show that the performance of the algorithms significantly increased by 34.1%, 33.96% and 24.07% for Fermi, Kepler and Maxwell on average with TLPOM and the RGCA ensures that our IoT computing system provides low-cost and high-reliability services.
GPU computing of compressible flow problems by a meshless method with space-filling curves
NASA Astrophysics Data System (ADS)
Ma, Z. H.; Wang, H.; Pu, S. H.
2014-04-01
A graphic processing unit (GPU) implementation of a meshless method for solving compressible flow problems is presented in this paper. Least-square fit is used to discretize the spatial derivatives of Euler equations and an upwind scheme is applied to estimate the flux terms. The compute unified device architecture (CUDA) C programming model is employed to efficiently and flexibly port the meshless solver from CPU to GPU. Considering the data locality of randomly distributed points, space-filling curves are adopted to re-number the points in order to improve the memory performance. Detailed evaluations are firstly carried out to assess the accuracy and conservation property of the underlying numerical method. Then the GPU accelerated flow solver is used to solve external steady flows over aerodynamic configurations. Representative results are validated through extensive comparisons with the experimental, finite volume or other available reference solutions. Performance analysis reveals that the running time cost of simulations is significantly reduced while impressive (more than an order of magnitude) speedups are achieved.
GPU accelerated manifold correction method for spinning compact binaries
NASA Astrophysics Data System (ADS)
Ran, Chong-xi; Liu, Song; Zhong, Shuang-ying
2018-04-01
The graphics processing unit (GPU) acceleration of the manifold correction algorithm based on the compute unified device architecture (CUDA) technology is designed to simulate the dynamic evolution of the Post-Newtonian (PN) Hamiltonian formulation of spinning compact binaries. The feasibility and the efficiency of parallel computation on GPU have been confirmed by various numerical experiments. The numerical comparisons show that the accuracy on GPU execution of manifold corrections method has a good agreement with the execution of codes on merely central processing unit (CPU-based) method. The acceleration ability when the codes are implemented on GPU can increase enormously through the use of shared memory and register optimization techniques without additional hardware costs, implying that the speedup is nearly 13 times as compared with the codes executed on CPU for phase space scan (including 314 × 314 orbits). In addition, GPU-accelerated manifold correction method is used to numerically study how dynamics are affected by the spin-induced quadrupole-monopole interaction for black hole binary system.
Porting marine ecosystem model spin-up using transport matrices to GPUs
NASA Astrophysics Data System (ADS)
Siewertsen, E.; Piwonski, J.; Slawig, T.
2013-01-01
We have ported an implementation of the spin-up for marine ecosystem models based on transport matrices to graphics processing units (GPUs). The original implementation was designed for distributed-memory architectures and uses the Portable, Extensible Toolkit for Scientific Computation (PETSc) library that is based on the Message Passing Interface (MPI) standard. The spin-up computes a steady seasonal cycle of ecosystem tracers with climatological ocean circulation data as forcing. Since the transport is linear with respect to the tracers, the resulting operator is represented by matrices. Each iteration of the spin-up involves two matrix-vector multiplications and the evaluation of the used biogeochemical model. The original code was written in C and Fortran. On the GPU, we use the Compute Unified Device Architecture (CUDA) standard, a customized version of PETSc and a commercial CUDA Fortran compiler. We describe the extensions to PETSc and the modifications of the original C and Fortran codes that had to be done. Here we make use of freely available libraries for the GPU. We analyze the computational effort of the main parts of the spin-up for two exemplar ecosystem models and compare the overall computational time to those necessary on different CPUs. The results show that a consumer GPU can compete with a significant number of cluster CPUs without further code optimization.
Fast parallel tandem mass spectral library searching using GPU hardware acceleration.
Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K; Martin, Daniel B
2011-06-03
Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate-limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper, we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching), is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA, which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment.
Choi, Jeeyae; Choi, Jeungok E
2014-01-01
To provide best recommendations at the point of care, guidelines have been implemented in computer systems. As a prerequisite, guidelines are translated into a computer-interpretable guideline format. Since there are no specific tools to translate nursing guidelines, only a few nursing guidelines are translated and implemented in computer systems. Unified modeling language (UML) is a software writing language and is known to well and accurately represent end-users' perspective, due to the expressive characteristics of the UML. In order to facilitate the development of computer systems for nurses' use, the UML was used to translate a paper-based nursing guideline, and its ease of use and the usefulness were tested through a case study of a genetic counseling guideline. The UML was found to be a useful tool to nurse informaticians and a sufficient tool to model a guideline in a computer program.
Unifying framework for multimodal brain MRI segmentation based on Hidden Markov Chains.
Bricq, S; Collet, Ch; Armspach, J P
2008-12-01
In the frame of 3D medical imaging, accurate segmentation of multimodal brain MR images is of interest for many brain disorders. However, due to several factors such as noise, imaging artifacts, intrinsic tissue variation and partial volume effects, tissue classification remains a challenging task. In this paper, we present a unifying framework for unsupervised segmentation of multimodal brain MR images including partial volume effect, bias field correction, and information given by a probabilistic atlas. Here-proposed method takes into account neighborhood information using a Hidden Markov Chain (HMC) model. Due to the limited resolution of imaging devices, voxels may be composed of a mixture of different tissue types, this partial volume effect is included to achieve an accurate segmentation of brain tissues. Instead of assigning each voxel to a single tissue class (i.e., hard classification), we compute the relative amount of each pure tissue class in each voxel (mixture estimation). Further, a bias field estimation step is added to the proposed algorithm to correct intensity inhomogeneities. Furthermore, atlas priors were incorporated using probabilistic brain atlas containing prior expectations about the spatial localization of different tissue classes. This atlas is considered as a complementary sensor and the proposed method is extended to multimodal brain MRI without any user-tunable parameter (unsupervised algorithm). To validate this new unifying framework, we present experimental results on both synthetic and real brain images, for which the ground truth is available. Comparison with other often used techniques demonstrates the accuracy and the robustness of this new Markovian segmentation scheme.
Semiotics, Information Science, Documents and Computers.
ERIC Educational Resources Information Center
Warner, Julian
1990-01-01
Discusses the relationship and value of semiotics to the established domains of information science. Highlights include documentation; computer operations; the language of computing; automata theory; linguistics; speech and writing; and the written language as a unifying principle for the document and the computer. (93 references) (LRW)
Dual-Arm Generalized Compliant Motion With Shared Control
NASA Technical Reports Server (NTRS)
Backes, Paul G.
1994-01-01
Dual-Arm Generalized Compliant Motion (DAGCM) primitive computer program implementing improved unified control scheme for two manipulator arms cooperating in task in which both grasp same object. Provides capabilities for autonomous, teleoperation, and shared control of two robot arms. Unifies cooperative dual-arm control with multi-sensor-based task control and makes complete task-control capability available to higher-level task-planning computer system via large set of input parameters used to describe desired force and position trajectories followed by manipulator arms. Some concepts discussed in "A Generalized-Compliant-Motion Primitive" (NPO-18134).
Unified Quest 2004 Revisits Future War, Volume 6, Issue 3, April-June 2004
2004-06-01
fought campaign plans with students from the other Senior Level Colleges in a free - play computer-assisted war game. INSIDE THIS ISSUE • Unified...dynamic free - play environment. The exercise, guided by the participants’ own goals and objectives, and not by scripts or the Master Scenario Event
NASA Technical Reports Server (NTRS)
Maag, W.
1977-01-01
The Flight Design System (FDS) and the Unified System for Orbit Computation (USOC) are compared and described in relation to mission planning for the shuttle transportation system (STS). The FDS is designed to meet the requirements of a standardized production tool and the USOC is designed for rapid generation of particular application programs. The main emphasis in USOC is put on adaptability to new types of missions. It is concluded that a software system having a USOC-like structure, adapted to the specific needs of MPAD, would be appropriate to support planning tasks in the area unique to STS missions.
NASA Astrophysics Data System (ADS)
Yang, Wei; Hall, Trevor
2012-12-01
The Internet is entering an era of cloud computing to provide more cost effective, eco-friendly and reliable services to consumer and business users and the nature of the Internet traffic will undertake a fundamental transformation. Consequently, the current Internet will no longer suffice for serving cloud traffic in metro areas. This work proposes an infrastructure with a unified control plane that integrates simple packet aggregation technology with optical express through the interoperation between IP routers and electrical traffic controllers in optical metro networks. The proposed infrastructure provides flexible, intelligent, and eco-friendly bandwidth on demand for cloud computing in metro areas.
Adaptive correlation filter-based video stabilization without accumulative global motion estimation
NASA Astrophysics Data System (ADS)
Koh, Eunjin; Lee, Chanyong; Jeong, Dong Gil
2014-12-01
We present a digital video stabilization approach that provides both robustness and efficiency for practical applications. In this approach, we adopt a stabilization model that maintains spatio-temporal information of past input frames efficiently and can track original stabilization position. Because of the stabilization model, the proposed method does not need accumulative global motion estimation and can recover the original position even if there is a failure in interframe motion estimation. It can also intelligently overcome the situation of damaged or interrupted video sequences. Moreover, because it is simple and suitable to parallel scheme, we implement it on a commercial field programmable gate array and a graphics processing unit board with compute unified device architecture in a breeze. Experimental results show that the proposed approach is both fast and robust.
NASA Astrophysics Data System (ADS)
Li, Guoqiang; Qian, Feng
2001-11-01
We present, for the first time to our knowledge, a generalized lookahead logic algorithm for number conversion from signed-digit to complement representation. By properly encoding the signed-digits, all the operations are performed by binary logic, and unified logical expressions can be obtained for conversion from modified-signed- digit (MSD) to 2's complement, trinary signed-digit (TSD) to 3's complement, and quarternary signed-digit (QSD) to 4's complement. For optical implementation, a parallel logical array module using an electron-trapping device is employed and experimental results are shown. This optical module is suitable for implementing complex logic functions in the form of the sum of the product. The algorithm and architecture are compatible with a general-purpose optoelectronic computing system.
NASA Astrophysics Data System (ADS)
Amaral, Barbara; Cabello, Adán; Cunha, Marcelo Terra; Aolita, Leandro
2018-03-01
Contextuality is a fundamental feature of quantum theory necessary for certain models of quantum computation and communication. Serious steps have therefore been taken towards a formal framework for contextuality as an operational resource. However, the main ingredient of a resource theory—a concrete, explicit form of free operations of contextuality—was still missing. Here we provide such a component by introducing noncontextual wirings: a class of contextuality-free operations with a clear operational interpretation and a friendly parametrization. We characterize them completely for general black-box measurement devices with arbitrarily many inputs and outputs. As applications, we show that the relative entropy of contextuality is a contextuality monotone and that maximally contextual boxes that serve as contextuality bits exist for a broad class of scenarios. Our results complete a unified resource-theoretic framework for contextuality and Bell nonlocality.
Amaral, Barbara; Cabello, Adán; Cunha, Marcelo Terra; Aolita, Leandro
2018-03-30
Contextuality is a fundamental feature of quantum theory necessary for certain models of quantum computation and communication. Serious steps have therefore been taken towards a formal framework for contextuality as an operational resource. However, the main ingredient of a resource theory-a concrete, explicit form of free operations of contextuality-was still missing. Here we provide such a component by introducing noncontextual wirings: a class of contextuality-free operations with a clear operational interpretation and a friendly parametrization. We characterize them completely for general black-box measurement devices with arbitrarily many inputs and outputs. As applications, we show that the relative entropy of contextuality is a contextuality monotone and that maximally contextual boxes that serve as contextuality bits exist for a broad class of scenarios. Our results complete a unified resource-theoretic framework for contextuality and Bell nonlocality.
A simple GPU-accelerated two-dimensional MUSCL-Hancock solver for ideal magnetohydrodynamics
NASA Astrophysics Data System (ADS)
Bard, Christopher M.; Dorelli, John C.
2014-02-01
We describe our experience using NVIDIA's CUDA (Compute Unified Device Architecture) C programming environment to implement a two-dimensional second-order MUSCL-Hancock ideal magnetohydrodynamics (MHD) solver on a GTX 480 Graphics Processing Unit (GPU). Taking a simple approach in which the MHD variables are stored exclusively in the global memory of the GTX 480 and accessed in a cache-friendly manner (without further optimizing memory access by, for example, staging data in the GPU's faster shared memory), we achieved a maximum speed-up of ≈126 for a 10242 grid relative to the sequential C code running on a single Intel Nehalem (2.8 GHz) core. This speedup is consistent with simple estimates based on the known floating point performance, memory throughput and parallel processing capacity of the GTX 480.
14 CFR 1215.102 - Definitions.
Code of Federal Regulations, 2013 CFR
2013-01-01
... necessary TDRSS operational areas, interface devices, and NASA communication circuits that unify the above... stream. The electronic signals acquired by TDRSS from the user craft or the user-generated input commands...
Cloud Computing - A Unified Approach for Surveillance Issues
NASA Astrophysics Data System (ADS)
Rachana, C. R.; Banu, Reshma, Dr.; Ahammed, G. F. Ali, Dr.; Parameshachari, B. D., Dr.
2017-08-01
Cloud computing describes highly scalable resources provided as an external service via the Internet on a basis of pay-per-use. From the economic point of view, the main attractiveness of cloud computing is that users only use what they need, and only pay for what they actually use. Resources are available for access from the cloud at any time, and from any location through networks. Cloud computing is gradually replacing the traditional Information Technology Infrastructure. Securing data is one of the leading concerns and biggest issue for cloud computing. Privacy of information is always a crucial pointespecially when an individual’s personalinformation or sensitive information is beingstored in the organization. It is indeed true that today; cloud authorization systems are notrobust enough. This paper presents a unified approach for analyzing the various security issues and techniques to overcome the challenges in the cloud environment.
A unified account of gloss and lightness perception in terms of gamut relativity.
Vladusich, Tony
2013-08-01
A recently introduced computational theory of visual surface representation, termed gamut relativity, overturns the classical assumption that brightness, lightness, and transparency constitute perceptual dimensions corresponding to the physical dimensions of luminance, diffuse reflectance, and transmittance, respectively. Here I extend the theory to show how surface gloss and lightness can be understood in a unified manner in terms of the vector computation of "layered representations" of surface and illumination properties, rather than as perceptual dimensions corresponding to diffuse and specular reflectance, respectively. The theory simulates the effects of image histogram skewness on surface gloss/lightness and lightness constancy as a function of specular highlight intensity. More generally, gamut relativity clarifies, unifies, and generalizes a wide body of previous theoretical and experimental work aimed at understanding how the visual system parses the retinal image into layered representations of surface and illumination properties.
Benchmarking and Evaluating Unified Memory for OpenMP GPU Offloading
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mishra, Alok; Li, Lingda; Kong, Martin
Here, the latest OpenMP standard offers automatic device offloading capabilities which facilitate GPU programming. Despite this, there remain many challenges. One of these is the unified memory feature introduced in recent GPUs. GPUs in current and future HPC systems have enhanced support for unified memory space. In such systems, CPU and GPU can access each other's memory transparently, that is, the data movement is managed automatically by the underlying system software and hardware. Memory over subscription is also possible in these systems. However, there is a significant lack of knowledge about how this mechanism will perform, and how programmers shouldmore » use it. We have modified several benchmarks codes, in the Rodinia benchmark suite, to study the behavior of OpenMP accelerator extensions and have used them to explore the impact of unified memory in an OpenMP context. We moreover modified the open source LLVM compiler to allow OpenMP programs to exploit unified memory. The results of our evaluation reveal that, while the performance of unified memory is comparable with that of normal GPU offloading for benchmarks with little data reuse, it suffers from significant overhead when GPU memory is over subcribed for benchmarks with large amount of data reuse. Based on these results, we provide several guidelines for programmers to achieve better performance with unified memory.« less
Kokkos: Enabling manycore performance portability through polymorphic memory access patterns
Carter Edwards, H.; Trott, Christian R.; Sunderland, Daniel
2014-07-22
The manycore revolution can be characterized by increasing thread counts, decreasing memory per thread, and diversity of continually evolving manycore architectures. High performance computing (HPC) applications and libraries must exploit increasingly finer levels of parallelism within their codes to sustain scalability on these devices. We found that a major obstacle to performance portability is the diverse and conflicting set of constraints on memory access patterns across devices. Contemporary portable programming models address manycore parallelism (e.g., OpenMP, OpenACC, OpenCL) but fail to address memory access patterns. The Kokkos C++ library enables applications and domain libraries to achieve performance portability on diversemore » manycore architectures by unifying abstractions for both fine-grain data parallelism and memory access patterns. In this paper we describe Kokkos’ abstractions, summarize its application programmer interface (API), present performance results for unit-test kernels and mini-applications, and outline an incremental strategy for migrating legacy C++ codes to Kokkos. Furthermore, the Kokkos library is under active research and development to incorporate capabilities from new generations of manycore architectures, and to address a growing list of applications and domain libraries.« less
A multichannel model for the self-consistent analysis of coherent transport in graphene nanoribbons.
Mencarelli, Davide; Pierantoni, Luca; Farina, Marco; Di Donato, Andrea; Rozzi, Tullio
2011-08-23
In this contribution, we analyze the multichannel coherent transport in graphene nanoribbons (GNRs) by a scattering matrix approach. We consider the transport properties of GNR devices of a very general form, involving multiple bands and multiple leads. The 2D quantum transport over the whole GNR surface, described by the Schrödinger equation, is strongly nonlinear as it implies calculation of self-generated and externally applied electrostatic potentials, solutions of the 3D Poisson equation. The surface charge density is computed as a balance of carriers traveling through the channel at all of the allowed energies. Moreover, formation of bound charges corresponding to a discrete modal spectrum is observed and included in the model. We provide simulation examples by considering GNR configurations typical for transistor devices and GNR protrusions that find an interesting application as cold cathodes for X-ray generation. With reference to the latter case, a unified model is required in order to couple charge transport and charge emission. However, to a first approximation, these could be considered as independent problems, as in the example. © 2011 American Chemical Society
The Unified Levelling Network of Sarawak and its Adjustment
NASA Astrophysics Data System (ADS)
Som, Z. A. M.; Yazid, A. M.; Ming, T. K.; Yazid, N. M.
2016-09-01
The height reference network of Sarawak has seen major improvement in over the past two decades. The most significant improvement was the establishment of extended precise leveling network of which is now able to connect all three major datum points at Pulau Lakei, Original and Bintulu. Datum by following the major accessible routes across Sarawak. This means the leveling network in Sarawak has now been inter-connected and unified. By having such a unified network leads to the possibility of having a common single least squares adjustment been performed for the first time. The least squares adjustment of this unified levelling network was attempted in order to compute the height of all Bench Marks established in the entire levelling network. The adjustment was done by using MoreFix levelling adjustment package developed at FGHT UTM. The computational procedure adopted is linear parametric adjustment by minimum constraint. Since Sarawak has three separate datums therefore three separate adjustments were implemented by utilizing datum at Pulau Lakei, Original Miri and Bintulu Datum respectively. Results of the MoreFix unified adjustment agreed very well with adjustment repeated using Starnet. Further the results were compared with solution given by Jupem and they are in good agreement as well. The difference in height analysed were within 10mm for the case of minimum constraint at Pulau Lakei datum and with much better agreement in the case of Original Miri Datum.
CT to Cone-beam CT Deformable Registration With Simultaneous Intensity Correction
Zhen, Xin; Gu, Xuejun; Yan, Hao; Zhou, Linghong; Jia, Xun; Jiang, Steve B.
2012-01-01
Computed tomography (CT) to cone-beam computed tomography (CBCT) deformable image registration (DIR) is a crucial step in adaptive radiation therapy. Current intensity-based registration algorithms, such as demons, may fail in the context of CT-CBCT DIR because of inconsistent intensities between the two modalities. In this paper, we propose a variant of demons, called Deformation with Intensity Simultaneously Corrected (DISC), to deal with CT-CBCT DIR. DISC distinguishes itself from the original demons algorithm by performing an adaptive intensity correction step on the CBCT image at every iteration step of the demons registration. Specifically, the intensity correction of a voxel in CBCT is achieved by matching the first and the second moments of the voxel intensities inside a patch around the voxel with those on the CT image. It is expected that such a strategy can remove artifacts in the CBCT image, as well as ensuring the intensity consistency between the two modalities. DISC is implemented on computer graphics processing units (GPUs) in compute unified device architecture (CUDA) programming environment. The performance of DISC is evaluated on a simulated patient case and six clinical head-and-neck cancer patient data. It is found that DISC is robust against the CBCT artifacts and intensity inconsistency and significantly improves the registration accuracy when compared with the original demons. PMID:23032638
Implementation and evaluation of various demons deformable image registration algorithms on a GPU.
Gu, Xuejun; Pan, Hubert; Liang, Yun; Castillo, Richard; Yang, Deshan; Choi, Dongju; Castillo, Edward; Majumdar, Amitava; Guerrero, Thomas; Jiang, Steve B
2010-01-07
Online adaptive radiation therapy (ART) promises the ability to deliver an optimal treatment in response to daily patient anatomic variation. A major technical barrier for the clinical implementation of online ART is the requirement of rapid image segmentation. Deformable image registration (DIR) has been used as an automated segmentation method to transfer tumor/organ contours from the planning image to daily images. However, the current computational time of DIR is insufficient for online ART. In this work, this issue is addressed by using computer graphics processing units (GPUs). A gray-scale-based DIR algorithm called demons and five of its variants were implemented on GPUs using the compute unified device architecture (CUDA) programming environment. The spatial accuracy of these algorithms was evaluated over five sets of pulmonary 4D CT images with an average size of 256 x 256 x 100 and more than 1100 expert-determined landmark point pairs each. For all the testing scenarios presented in this paper, the GPU-based DIR computation required around 7 to 11 s to yield an average 3D error ranging from 1.5 to 1.8 mm. It is interesting to find out that the original passive force demons algorithms outperform subsequently proposed variants based on the combination of accuracy, efficiency and ease of implementation.
High-Speed GPU-Based Fully Three-Dimensional Diffuse Optical Tomographic System
Saikia, Manob Jyoti; Kanhirodan, Rajan; Mohan Vasu, Ram
2014-01-01
We have developed a graphics processor unit (GPU-) based high-speed fully 3D system for diffuse optical tomography (DOT). The reduction in execution time of 3D DOT algorithm, a severely ill-posed problem, is made possible through the use of (1) an algorithmic improvement that uses Broyden approach for updating the Jacobian matrix and thereby updating the parameter matrix and (2) the multinode multithreaded GPU and CUDA (Compute Unified Device Architecture) software architecture. Two different GPU implementations of DOT programs are developed in this study: (1) conventional C language program augmented by GPU CUDA and CULA routines (C GPU), (2) MATLAB program supported by MATLAB parallel computing toolkit for GPU (MATLAB GPU). The computation time of the algorithm on host CPU and the GPU system is presented for C and Matlab implementations. The forward computation uses finite element method (FEM) and the problem domain is discretized into 14610, 30823, and 66514 tetrahedral elements. The reconstruction time, so achieved for one iteration of the DOT reconstruction for 14610 elements, is 0.52 seconds for a C based GPU program for 2-plane measurements. The corresponding MATLAB based GPU program took 0.86 seconds. The maximum number of reconstructed frames so achieved is 2 frames per second. PMID:24891848
High-Speed GPU-Based Fully Three-Dimensional Diffuse Optical Tomographic System.
Saikia, Manob Jyoti; Kanhirodan, Rajan; Mohan Vasu, Ram
2014-01-01
We have developed a graphics processor unit (GPU-) based high-speed fully 3D system for diffuse optical tomography (DOT). The reduction in execution time of 3D DOT algorithm, a severely ill-posed problem, is made possible through the use of (1) an algorithmic improvement that uses Broyden approach for updating the Jacobian matrix and thereby updating the parameter matrix and (2) the multinode multithreaded GPU and CUDA (Compute Unified Device Architecture) software architecture. Two different GPU implementations of DOT programs are developed in this study: (1) conventional C language program augmented by GPU CUDA and CULA routines (C GPU), (2) MATLAB program supported by MATLAB parallel computing toolkit for GPU (MATLAB GPU). The computation time of the algorithm on host CPU and the GPU system is presented for C and Matlab implementations. The forward computation uses finite element method (FEM) and the problem domain is discretized into 14610, 30823, and 66514 tetrahedral elements. The reconstruction time, so achieved for one iteration of the DOT reconstruction for 14610 elements, is 0.52 seconds for a C based GPU program for 2-plane measurements. The corresponding MATLAB based GPU program took 0.86 seconds. The maximum number of reconstructed frames so achieved is 2 frames per second.
SU (2) lattice gauge theory simulations on Fermi GPUs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cardoso, Nuno, E-mail: nunocardoso@cftp.ist.utl.p; Bicudo, Pedro, E-mail: bicudo@ist.utl.p
2011-05-10
In this work we explore the performance of CUDA in quenched lattice SU (2) simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware and software architecture developed by NVIDIA for computing on the GPU. We present an analysis and performance comparison between the GPU and CPU in single and double precision. Analyses with multiple GPUs and two different architectures (G200 and Fermi architectures) are also presented. In order to obtain a high performance, the code must be optimized for the GPU architecture, i.e., an implementation that exploits the memory hierarchy of the CUDA programming model. We produce codes formore » the Monte Carlo generation of SU (2) lattice gauge configurations, for the mean plaquette, for the Polyakov Loop at finite T and for the Wilson loop. We also present results for the potential using many configurations (50,000) without smearing and almost 2000 configurations with APE smearing. With two Fermi GPUs we have achieved an excellent performance of 200x the speed over one CPU, in single precision, around 110 Gflops/s. We also find that, using the Fermi architecture, double precision computations for the static quark-antiquark potential are not much slower (less than 2x slower) than single precision computations.« less
A Cloud-based Infrastructure and Architecture for Environmental System Research
NASA Astrophysics Data System (ADS)
Wang, D.; Wei, Y.; Shankar, M.; Quigley, J.; Wilson, B. E.
2016-12-01
The present availability of high-capacity networks, low-cost computers and storage devices, and the widespread adoption of hardware virtualization and service-oriented architecture provide a great opportunity to enable data and computing infrastructure sharing between closely related research activities. By taking advantage of these approaches, along with the world-class high computing and data infrastructure located at Oak Ridge National Laboratory, a cloud-based infrastructure and architecture has been developed to efficiently deliver essential data and informatics service and utilities to the environmental system research community, and will provide unique capabilities that allows terrestrial ecosystem research projects to share their software utilities (tools), data and even data submission workflow in a straightforward fashion. The infrastructure will minimize large disruptions from current project-based data submission workflows for better acceptances from existing projects, since many ecosystem research projects already have their own requirements or preferences for data submission and collection. The infrastructure will eliminate scalability problems with current project silos by provide unified data services and infrastructure. The Infrastructure consists of two key components (1) a collection of configurable virtual computing environments and user management systems that expedite data submission and collection from environmental system research community, and (2) scalable data management services and system, originated and development by ORNL data centers.
Efficient parallel implementation of active appearance model fitting algorithm on GPU.
Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou
2014-01-01
The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods which has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine grain parallelism in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on the Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experiment results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures.
Efficient Parallel Implementation of Active Appearance Model Fitting Algorithm on GPU
Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou
2014-01-01
The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods which has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine grain parallelism in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on the Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experiment results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures. PMID:24723812
NASA Astrophysics Data System (ADS)
Derkachov, G.; Jakubczyk, T.; Jakubczyk, D.; Archer, J.; Woźniak, M.
2017-07-01
Utilising Compute Unified Device Architecture (CUDA) platform for Graphics Processing Units (GPUs) enables significant reduction of computation time at a moderate cost, by means of parallel computing. In the paper [Jakubczyk et al., Opto-Electron. Rev., 2016] we reported using GPU for Mie scattering inverse problem solving (up to 800-fold speed-up). Here we report the development of two subroutines utilising GPU at data preprocessing stages for the inversion procedure: (i) A subroutine, based on ray tracing, for finding spherical aberration correction function. (ii) A subroutine performing the conversion of an image to a 1D distribution of light intensity versus azimuth angle (i.e. scattering diagram), fed from a movie-reading CPU subroutine running in parallel. All subroutines are incorporated in PikeReader application, which we make available on GitHub repository. PikeReader returns a sequence of intensity distributions versus a common azimuth angle vector, corresponding to the recorded movie. We obtained an overall ∼ 400 -fold speed-up of calculations at data preprocessing stages using CUDA codes running on GPU in comparison to single thread MATLAB-only code running on CPU.
High Performance GPU-Based Fourier Volume Rendering.
Abdellah, Marwan; Eldeib, Ayman; Sharawi, Amr
2015-01-01
Fourier volume rendering (FVR) is a significant visualization technique that has been used widely in digital radiography. As a result of its (N (2)logN) time complexity, it provides a faster alternative to spatial domain volume rendering algorithms that are (N (3)) computationally complex. Relying on the Fourier projection-slice theorem, this technique operates on the spectral representation of a 3D volume instead of processing its spatial representation to generate attenuation-only projections that look like X-ray radiographs. Due to the rapid evolution of its underlying architecture, the graphics processing unit (GPU) became an attractive competent platform that can deliver giant computational raw power compared to the central processing unit (CPU) on a per-dollar-basis. The introduction of the compute unified device architecture (CUDA) technology enables embarrassingly-parallel algorithms to run efficiently on CUDA-capable GPU architectures. In this work, a high performance GPU-accelerated implementation of the FVR pipeline on CUDA-enabled GPUs is presented. This proposed implementation can achieve a speed-up of 117x compared to a single-threaded hybrid implementation that uses the CPU and GPU together by taking advantage of executing the rendering pipeline entirely on recent GPU architectures.
Using IKAROS as a data transfer and management utility within the KM3NeT computing model
NASA Astrophysics Data System (ADS)
Filippidis, Christos; Cotronis, Yiannis; Markou, Christos
2016-04-01
KM3NeT is a future European deep-sea research infrastructure hosting a new generation neutrino detectors that - located at the bottom of the Mediterranean Sea - will open a new window on the universe and answer fundamental questions both in particle physics and astrophysics. IKAROS is a framework that enables creating scalable storage formations on-demand and helps addressing several limitations that the current file systems face when dealing with very large scale infrastructures. It enables creating ad-hoc nearby storage formations and can use a huge number of I/O nodes in order to increase the available bandwidth (I/O and network). IKAROS unifies remote and local access in the overall data flow, by permitting direct access to each I/O node. In this way we can handle the overall data flow at the network layer, limiting the interaction with the operating system. This approach allows virtually connecting, at the users level, the several different computing facilities used (Grids, Clouds, HPCs, Data Centers, Local computing Clusters and personal storage devices), on-demand, based on the needs, by using well known standards and protocols, like HTTP.
Su, Xiaoquan; Wang, Xuetao; Jing, Gongchao; Ning, Kang
2014-04-01
The number of microbial community samples is increasing with exponential speed. Data-mining among microbial community samples could facilitate the discovery of valuable biological information that is still hidden in the massive data. However, current methods for the comparison among microbial communities are limited by their ability to process large amount of samples each with complex community structure. We have developed an optimized GPU-based software, GPU-Meta-Storms, to efficiently measure the quantitative phylogenetic similarity among massive amount of microbial community samples. Our results have shown that GPU-Meta-Storms would be able to compute the pair-wise similarity scores for 10 240 samples within 20 min, which gained a speed-up of >17 000 times compared with single-core CPU, and >2600 times compared with 16-core CPU. Therefore, the high-performance of GPU-Meta-Storms could facilitate in-depth data mining among massive microbial community samples, and make the real-time analysis and monitoring of temporal or conditional changes for microbial communities possible. GPU-Meta-Storms is implemented by CUDA (Compute Unified Device Architecture) and C++. Source code is available at http://www.computationalbioenergy.org/meta-storms.html.
Blind multirigid retrospective motion correction of MR images.
Loktyushin, Alexander; Nickisch, Hannes; Pohmann, Rolf; Schölkopf, Bernhard
2015-04-01
Physiological nonrigid motion is inevitable when imaging, e.g., abdominal viscera, and can lead to serious deterioration of the image quality. Prospective techniques for motion correction can handle only special types of nonrigid motion, as they only allow global correction. Retrospective methods developed so far need guidance from navigator sequences or external sensors. We propose a fully retrospective nonrigid motion correction scheme that only needs raw data as an input. Our method is based on a forward model that describes the effects of nonrigid motion by partitioning the image into patches with locally rigid motion. Using this forward model, we construct an objective function that we can optimize with respect to both unknown motion parameters per patch and the underlying sharp image. We evaluate our method on both synthetic and real data in 2D and 3D. In vivo data was acquired using standard imaging sequences. The correction algorithm significantly improves the image quality. Our compute unified device architecture (CUDA)-enabled graphic processing unit implementation ensures feasible computation times. The presented technique is the first computationally feasible retrospective method that uses the raw data of standard imaging sequences, and allows to correct for nonrigid motion without guidance from external motion sensors. © 2014 Wiley Periodicals, Inc.
GPU implementation of prior image constrained compressed sensing (PICCS)
NASA Astrophysics Data System (ADS)
Nett, Brian E.; Tang, Jie; Chen, Guang-Hong
2010-04-01
The Prior Image Constrained Compressed Sensing (PICCS) algorithm (Med. Phys. 35, pg. 660, 2008) has been applied to several computed tomography applications with both standard CT systems and flat-panel based systems designed for guiding interventional procedures and radiation therapy treatment delivery. The PICCS algorithm typically utilizes a prior image which is reconstructed via the standard Filtered Backprojection (FBP) reconstruction algorithm. The algorithm then iteratively solves for the image volume that matches the measured data, while simultaneously assuring the image is similar to the prior image. The PICCS algorithm has demonstrated utility in several applications including: improved temporal resolution reconstruction, 4D respiratory phase specific reconstructions for radiation therapy, and cardiac reconstruction from data acquired on an interventional C-arm. One disadvantage of the PICCS algorithm, just as other iterative algorithms, is the long computation times typically associated with reconstruction. In order for an algorithm to gain clinical acceptance reconstruction must be achievable in minutes rather than hours. In this work the PICCS algorithm has been implemented on the GPU in order to significantly reduce the reconstruction time of the PICCS algorithm. The Compute Unified Device Architecture (CUDA) was used in this implementation.
Parallel hyperspectral image reconstruction using random projections
NASA Astrophysics Data System (ADS)
Sevilla, Jorge; Martín, Gabriel; Nascimento, José M. P.
2016-10-01
Spaceborne sensors systems are characterized by scarce onboard computing and storage resources and by communication links with reduced bandwidth. Random projections techniques have been demonstrated as an effective and very light way to reduce the number of measurements in hyperspectral data, thus, the data to be transmitted to the Earth station is reduced. However, the reconstruction of the original data from the random projections may be computationally expensive. SpeCA is a blind hyperspectral reconstruction technique that exploits the fact that hyperspectral vectors often belong to a low dimensional subspace. SpeCA has shown promising results in the task of recovering hyperspectral data from a reduced number of random measurements. In this manuscript we focus on the implementation of the SpeCA algorithm for graphics processing units (GPU) using the compute unified device architecture (CUDA). Experimental results conducted using synthetic and real hyperspectral datasets on the GPU architecture by NVIDIA: GeForce GTX 980, reveal that the use of GPUs can provide real-time reconstruction. The achieved speedup is up to 22 times when compared with the processing time of SpeCA running on one core of the Intel i7-4790K CPU (3.4GHz), with 32 Gbyte memory.
Parallel design of JPEG-LS encoder on graphics processing units
NASA Astrophysics Data System (ADS)
Duan, Hao; Fang, Yong; Huang, Bormin
2012-01-01
With recent technical advances in graphic processing units (GPUs), GPUs have outperformed CPUs in terms of compute capability and memory bandwidth. Many successful GPU applications to high performance computing have been reported. JPEG-LS is an ISO/IEC standard for lossless image compression which utilizes adaptive context modeling and run-length coding to improve compression ratio. However, adaptive context modeling causes data dependency among adjacent pixels and the run-length coding has to be performed in a sequential way. Hence, using JPEG-LS to compress large-volume hyperspectral image data is quite time-consuming. We implement an efficient parallel JPEG-LS encoder for lossless hyperspectral compression on a NVIDIA GPU using the computer unified device architecture (CUDA) programming technology. We use the block parallel strategy, as well as such CUDA techniques as coalesced global memory access, parallel prefix sum, and asynchronous data transfer. We also show the relation between GPU speedup and AVIRIS block size, as well as the relation between compression ratio and AVIRIS block size. When AVIRIS images are divided into blocks, each with 64×64 pixels, we gain the best GPU performance with 26.3x speedup over its original CPU code.
Dynamic partial reconfiguration of logic controllers implemented in FPGAs
NASA Astrophysics Data System (ADS)
Bazydło, Grzegorz; Wiśniewski, Remigiusz
2016-09-01
Technological progress in recent years benefits in digital circuits containing millions of logic gates with the capability for reprogramming and reconfiguring. On the one hand it provides the unprecedented computational power, but on the other hand the modelled systems are becoming increasingly complex, hierarchical and concurrent. Therefore, abstract modelling supported by the Computer Aided Design tools becomes a very important task. Even the higher consumption of the basic electronic components seems to be acceptable because chip manufacturing costs tend to fall over the time. The paper presents a modelling approach for logic controllers with the use of Unified Modelling Language (UML). Thanks to the Model Driven Development approach, starting with a UML state machine model, through the construction of an intermediate Hierarchical Concurrent Finite State Machine model, a collection of Verilog files is created. The system description generated in hardware description language can be synthesized and implemented in reconfigurable devices, such as FPGAs. Modular specification of the prototyped controller permits for further dynamic partial reconfiguration of the prototyped system. The idea bases on the exchanging of the functionality of the already implemented controller without stopping of the FPGA device. It means, that a part (for example a single module) of the logic controller is replaced by other version (called context), while the rest of the system is still running. The method is illustrated by a practical example by an exemplary Home Area Network system.
Near real-time digital holographic microscope based on GPU parallel computing
NASA Astrophysics Data System (ADS)
Zhu, Gang; Zhao, Zhixiong; Wang, Huarui; Yang, Yan
2018-01-01
A transmission near real-time digital holographic microscope with in-line and off-axis light path is presented, in which the parallel computing technology based on compute unified device architecture (CUDA) and digital holographic microscopy are combined. Compared to other holographic microscopes, which have to implement reconstruction in multiple focal planes and are time-consuming the reconstruction speed of the near real-time digital holographic microscope can be greatly improved with the parallel computing technology based on CUDA, so it is especially suitable for measurements of particle field in micrometer and nanometer scale. Simulations and experiments show that the proposed transmission digital holographic microscope can accurately measure and display the velocity of particle field in micrometer scale, and the average velocity error is lower than 10%.With the graphic processing units(GPU), the computing time of the 100 reconstruction planes(512×512 grids) is lower than 120ms, while it is 4.9s using traditional reconstruction method by CPU. The reconstruction speed has been raised by 40 times. In other words, it can handle holograms at 8.3 frames per second and the near real-time measurement and display of particle velocity field are realized. The real-time three-dimensional reconstruction of particle velocity field is expected to achieve by further optimization of software and hardware. Keywords: digital holographic microscope,
NASA Astrophysics Data System (ADS)
Wu, Yuanfeng; Gao, Lianru; Zhang, Bing; Zhao, Haina; Li, Jun
2014-01-01
We present a parallel implementation of the optimized maximum noise fraction (G-OMNF) transform algorithm for feature extraction of hyperspectral images on commodity graphics processing units (GPUs). The proposed approach explored the algorithm data-level concurrency and optimized the computing flow. We first defined a three-dimensional grid, in which each thread calculates a sub-block data to easily facilitate the spatial and spectral neighborhood data searches in noise estimation, which is one of the most important steps involved in OMNF. Then, we optimized the processing flow and computed the noise covariance matrix before computing the image covariance matrix to reduce the original hyperspectral image data transmission. These optimization strategies can greatly improve the computing efficiency and can be applied to other feature extraction algorithms. The proposed parallel feature extraction algorithm was implemented on an Nvidia Tesla GPU using the compute unified device architecture and basic linear algebra subroutines library. Through the experiments on several real hyperspectral images, our GPU parallel implementation provides a significant speedup of the algorithm compared with the CPU implementation, especially for highly data parallelizable and arithmetically intensive algorithm parts, such as noise estimation. In order to further evaluate the effectiveness of G-OMNF, we used two different applications: spectral unmixing and classification for evaluation. Considering the sensor scanning rate and the data acquisition time, the proposed parallel implementation met the on-board real-time feature extraction.
Fast parallel tandem mass spectral library searching using GPU hardware acceleration
Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K.; Martin, Daniel B.
2011-01-01
Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching) is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment. PMID:21545112
A Simple GPU-Accelerated Two-Dimensional MUSCL-Hancock Solver for Ideal Magnetohydrodynamics
NASA Technical Reports Server (NTRS)
Bard, Christopher; Dorelli, John C.
2013-01-01
We describe our experience using NVIDIA's CUDA (Compute Unified Device Architecture) C programming environment to implement a two-dimensional second-order MUSCL-Hancock ideal magnetohydrodynamics (MHD) solver on a GTX 480 Graphics Processing Unit (GPU). Taking a simple approach in which the MHD variables are stored exclusively in the global memory of the GTX 480 and accessed in a cache-friendly manner (without further optimizing memory access by, for example, staging data in the GPU's faster shared memory), we achieved a maximum speed-up of approx. = 126 for a sq 1024 grid relative to the sequential C code running on a single Intel Nehalem (2.8 GHz) core. This speedup is consistent with simple estimates based on the known floating point performance, memory throughput and parallel processing capacity of the GTX 480.
NASA Astrophysics Data System (ADS)
Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng
2018-02-01
De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we proposed a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPU). Then, we designed an imaging point parallel strategy to achieve an optimal parallel computing performance. Afterward, we adopted an asynchronous double buffering scheme for multi-stream to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies of computation and storage based on the compute unified device architecture (CUDA) were adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improved the efficiency. A numerical example employing real large-scale, 3D seismic data showed that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significant reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.
Conservative boundary conditions for 3D gas dynamics problems
NASA Technical Reports Server (NTRS)
Gerasimov, B. P.; Karagichev, A. B.; Semushin, S. A.
1986-01-01
A method is described for 3D-gas dynamics computer simulation in regions of complicated shape by means of nonadjusted rectangular grids providing unified treatment of various problems. Some test problem computation results are given.
NASA Astrophysics Data System (ADS)
Peng, Ao-Ping; Li, Zhi-Hui; Wu, Jun-Lin; Jiang, Xin-Yu
2016-12-01
Based on the previous researches of the Gas-Kinetic Unified Algorithm (GKUA) for flows from highly rarefied free-molecule transition to continuum, a new implicit scheme of cell-centered finite volume method is presented for directly solving the unified Boltzmann model equation covering various flow regimes. In view of the difficulty in generating the single-block grid system with high quality for complex irregular bodies, a multi-block docking grid generation method is designed on the basis of data transmission between blocks, and the data structure is constructed for processing arbitrary connection relations between blocks with high efficiency and reliability. As a result, the gas-kinetic unified algorithm with the implicit scheme and multi-block docking grid has been firstly established and used to solve the reentry flow problems around the multi-bodies covering all flow regimes with the whole range of Knudsen numbers from 10 to 3.7E-6. The implicit and explicit schemes are applied to computing and analyzing the supersonic flows in near-continuum and continuum regimes around a circular cylinder with careful comparison each other. It is shown that the present algorithm and modelling possess much higher computational efficiency and faster converging properties. The flow problems including two and three side-by-side cylinders are simulated from highly rarefied to near-continuum flow regimes, and the present computed results are found in good agreement with the related DSMC simulation and theoretical analysis solutions, which verify the good accuracy and reliability of the present method. It is observed that the spacing of the multi-body is smaller, the cylindrical throat obstruction is greater with the flow field of single-body asymmetrical more obviously and the normal force coefficient bigger. While in the near-continuum transitional flow regime of near-space flying surroundings, the spacing of the multi-body increases to six times of the diameter of the single-body, the interference effects of the multi-bodies tend to be negligible. The computing practice has confirmed that it is feasible for the present method to compute the aerodynamics and reveal flow mechanism around complex multi-body vehicles covering all flow regimes from the gas-kinetic point of view of solving the unified Boltzmann model velocity distribution function equation.
Unified sensor management in unknown dynamic clutter
NASA Astrophysics Data System (ADS)
Mahler, Ronald; El-Fallah, Adel
2010-04-01
In recent years the first author has developed a unified, computationally tractable approach to multisensor-multitarget sensor management. This approach consists of closed-loop recursion of a PHD or CPHD filter with maximization of a "natural" sensor management objective function called PENT (posterior expected number of targets). In this paper we extend this approach so that it can be used in unknown, dynamic clutter backgrounds.
NASA Astrophysics Data System (ADS)
Nguyen, Van-Dung; Wu, Ling; Noels, Ludovic
2017-03-01
This work provides a unified treatment of arbitrary kinds of microscopic boundary conditions usually considered in the multi-scale computational homogenization method for nonlinear multi-physics problems. An efficient procedure is developed to enforce the multi-point linear constraints arising from the microscopic boundary condition either by the direct constraint elimination or by the Lagrange multiplier elimination methods. The macroscopic tangent operators are computed in an efficient way from a multiple right hand sides linear system whose left hand side matrix is the stiffness matrix of the microscopic linearized system at the converged solution. The number of vectors at the right hand side is equal to the number of the macroscopic kinematic variables used to formulate the microscopic boundary condition. As the resolution of the microscopic linearized system often follows a direct factorization procedure, the computation of the macroscopic tangent operators is then performed using this factorized matrix at a reduced computational time.
Unified Computational Methods for Regression Analysis of Zero-Inflated and Bound-Inflated Data
Yang, Yan; Simpson, Douglas
2010-01-01
Bounded data with excess observations at the boundary are common in many areas of application. Various individual cases of inflated mixture models have been studied in the literature for bound-inflated data, yet the computational methods have been developed separately for each type of model. In this article we use a common framework for computing these models, and expand the range of models for both discrete and semi-continuous data with point inflation at the lower boundary. The quasi-Newton and EM algorithms are adapted and compared for estimation of model parameters. The numerical Hessian and generalized Louis method are investigated as means for computing standard errors after optimization. Correlated data are included in this framework via generalized estimating equations. The estimation of parameters and effectiveness of standard errors are demonstrated through simulation and in the analysis of data from an ultrasound bioeffect study. The unified approach enables reliable computation for a wide class of inflated mixture models and comparison of competing models. PMID:20228950
Fast simulation of Proton Induced X-Ray Emission Tomography using CUDA
NASA Astrophysics Data System (ADS)
Beasley, D. G.; Marques, A. C.; Alves, L. C.; da Silva, R. C.
2013-07-01
A new 3D Proton Induced X-Ray Emission Tomography (PIXE-T) and Scanning Transmission Ion Microscopy Tomography (STIM-T) simulation software has been developed in Java and uses NVIDIA™ Common Unified Device Architecture (CUDA) to calculate the X-ray attenuation for large detector areas. A challenge with PIXE-T is to get sufficient counts while retaining a small beam spot size. Therefore a high geometric efficiency is required. However, as the detector solid angle increases the calculations required for accurate reconstruction of the data increase substantially. To overcome this limitation, the CUDA parallel computing platform was used which enables general purpose programming of NVIDIA graphics processing units (GPUs) to perform computations traditionally handled by the central processing unit (CPU). For simulation performance evaluation, the results of a CPU- and a CUDA-based simulation of a phantom are presented. Furthermore, a comparison with the simulation code in the PIXE-Tomography reconstruction software DISRA (A. Sakellariou, D.N. Jamieson, G.J.F. Legge, 2001) is also shown. Compared to a CPU implementation, the CUDA based simulation is approximately 30× faster.
Parallel Implementation of MAFFT on CUDA-Enabled Graphics Hardware.
Zhu, Xiangyuan; Li, Kenli; Salah, Ahmad; Shi, Lin; Li, Keqin
2015-01-01
Multiple sequence alignment (MSA) constitutes an extremely powerful tool for many biological applications including phylogenetic tree estimation, secondary structure prediction, and critical residue identification. However, aligning large biological sequences with popular tools such as MAFFT requires long runtimes on sequential architectures. Due to the ever increasing sizes of sequence databases, there is increasing demand to accelerate this task. In this paper, we demonstrate how graphic processing units (GPUs), powered by the compute unified device architecture (CUDA), can be used as an efficient computational platform to accelerate the MAFFT algorithm. To fully exploit the GPU's capabilities for accelerating MAFFT, we have optimized the sequence data organization to eliminate the bandwidth bottleneck of memory access, designed a memory allocation and reuse strategy to make full use of limited memory of GPUs, proposed a new modified-run-length encoding (MRLE) scheme to reduce memory consumption, and used high-performance shared memory to speed up I/O operations. Our implementation tested in three NVIDIA GPUs achieves speedup up to 11.28 on a Tesla K20m GPU compared to the sequential MAFFT 7.015.
Global magnetohydrodynamic simulations on multiple GPUs
NASA Astrophysics Data System (ADS)
Wong, Un-Hong; Wong, Hon-Cheng; Ma, Yonghui
2014-01-01
Global magnetohydrodynamic (MHD) models play the major role in investigating the solar wind-magnetosphere interaction. However, the huge computation requirement in global MHD simulations is also the main problem that needs to be solved. With the recent development of modern graphics processing units (GPUs) and the Compute Unified Device Architecture (CUDA), it is possible to perform global MHD simulations in a more efficient manner. In this paper, we present a global magnetohydrodynamic (MHD) simulator on multiple GPUs using CUDA 4.0 with GPUDirect 2.0. Our implementation is based on the modified leapfrog scheme, which is a combination of the leapfrog scheme and the two-step Lax-Wendroff scheme. GPUDirect 2.0 is used in our implementation to drive multiple GPUs. All data transferring and kernel processing are managed with CUDA 4.0 API instead of using MPI or OpenMP. Performance measurements are made on a multi-GPU system with eight NVIDIA Tesla M2050 (Fermi architecture) graphics cards. These measurements show that our multi-GPU implementation achieves a peak performance of 97.36 GFLOPS in double precision.
NMF-mGPU: non-negative matrix factorization on multi-GPU systems.
Mejía-Roa, Edgardo; Tabas-Madrid, Daniel; Setoain, Javier; García, Carlos; Tirado, Francisco; Pascual-Montano, Alberto
2015-02-13
In the last few years, the Non-negative Matrix Factorization ( NMF ) technique has gained a great interest among the Bioinformatics community, since it is able to extract interpretable parts from high-dimensional datasets. However, the computing time required to process large data matrices may become impractical, even for a parallel application running on a multiprocessors cluster. In this paper, we present NMF-mGPU, an efficient and easy-to-use implementation of the NMF algorithm that takes advantage of the high computing performance delivered by Graphics-Processing Units ( GPUs ). Driven by the ever-growing demands from the video-games industry, graphics cards usually provided in PCs and laptops have evolved from simple graphics-drawing platforms into high-performance programmable systems that can be used as coprocessors for linear-algebra operations. However, these devices may have a limited amount of on-board memory, which is not considered by other NMF implementations on GPU. NMF-mGPU is based on CUDA ( Compute Unified Device Architecture ), the NVIDIA's framework for GPU computing. On devices with low memory available, large input matrices are blockwise transferred from the system's main memory to the GPU's memory, and processed accordingly. In addition, NMF-mGPU has been explicitly optimized for the different CUDA architectures. Finally, platforms with multiple GPUs can be synchronized through MPI ( Message Passing Interface ). In a four-GPU system, this implementation is about 120 times faster than a single conventional processor, and more than four times faster than a single GPU device (i.e., a super-linear speedup). Applications of GPUs in Bioinformatics are getting more and more attention due to their outstanding performance when compared to traditional processors. In addition, their relatively low price represents a highly cost-effective alternative to conventional clusters. In life sciences, this results in an excellent opportunity to facilitate the daily work of bioinformaticians that are trying to extract biological meaning out of hundreds of gigabytes of experimental information. NMF-mGPU can be used "out of the box" by researchers with little or no expertise in GPU programming in a variety of platforms, such as PCs, laptops, or high-end GPU clusters. NMF-mGPU is freely available at https://github.com/bioinfo-cnb/bionmf-gpu .
Partly cloudy with a chance of migration: Weather, radars, and aeroecology
USDA-ARS?s Scientific Manuscript database
Aeroecology is an emerging scientific discipline that integrates atmospheric science, terrestrial science, geography, ecology, computer science, computational biology, and engineering to further the understanding of ecological patterns and processes. The unifying concept underlying this new transdis...
Parallel fuzzy connected image segmentation on GPU
Zhuge, Ying; Cao, Yong; Udupa, Jayaram K.; Miller, Robert W.
2011-01-01
Purpose: Image segmentation techniques using fuzzy connectedness (FC) principles have shown their effectiveness in segmenting a variety of objects in several large applications. However, one challenge in these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays, commodity graphics hardware provides a highly parallel computing environment. In this paper, the authors present a parallel fuzzy connected image segmentation algorithm implementation on NVIDIA’s compute unified device Architecture (cuda) platform for segmenting medical image data sets. Methods: In the FC algorithm, there are two major computational tasks: (i) computing the fuzzy affinity relations and (ii) computing the fuzzy connectedness relations. These two tasks are implemented as cuda kernels and executed on GPU. A dramatic improvement in speed for both tasks is achieved as a result. Results: Our experiments based on three data sets of small, medium, and large data size demonstrate the efficiency of the parallel algorithm, which achieves a speed-up factor of 24.4x, 18.1x, and 10.3x, correspondingly, for the three data sets on the NVIDIA Tesla C1060 over the implementation of the algorithm on CPU, and takes 0.25, 0.72, and 15.04 s, correspondingly, for the three data sets. Conclusions: The authors developed a parallel algorithm of the widely used fuzzy connected image segmentation method on the NVIDIA GPUs, which are far more cost- and speed-effective than both cluster of workstations and multiprocessing systems. A near-interactive speed of segmentation has been achieved, even for the large data set. PMID:21859037
Parallel fuzzy connected image segmentation on GPU.
Zhuge, Ying; Cao, Yong; Udupa, Jayaram K; Miller, Robert W
2011-07-01
Image segmentation techniques using fuzzy connectedness (FC) principles have shown their effectiveness in segmenting a variety of objects in several large applications. However, one challenge in these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays, commodity graphics hardware provides a highly parallel computing environment. In this paper, the authors present a parallel fuzzy connected image segmentation algorithm implementation on NVIDIA's compute unified device Architecture (CUDA) platform for segmenting medical image data sets. In the FC algorithm, there are two major computational tasks: (i) computing the fuzzy affinity relations and (ii) computing the fuzzy connectedness relations. These two tasks are implemented as CUDA kernels and executed on GPU. A dramatic improvement in speed for both tasks is achieved as a result. Our experiments based on three data sets of small, medium, and large data size demonstrate the efficiency of the parallel algorithm, which achieves a speed-up factor of 24.4x, 18.1x, and 10.3x, correspondingly, for the three data sets on the NVIDIA Tesla C1060 over the implementation of the algorithm on CPU, and takes 0.25, 0.72, and 15.04 s, correspondingly, for the three data sets. The authors developed a parallel algorithm of the widely used fuzzy connected image segmentation method on the NVIDIA GPUs, which are far more cost- and speed-effective than both cluster of workstations and multiprocessing systems. A near-interactive speed of segmentation has been achieved, even for the large data set.
Unified-theory-of-reinforcement neural networks do not simulate the blocking effect.
Calvin, Nicholas T; J McDowell, J
2015-11-01
For the last 20 years the unified theory of reinforcement (Donahoe et al., 1993) has been used to develop computer simulations to evaluate its plausibility as an account for behavior. The unified theory of reinforcement states that operant and respondent learning occurs via the same neural mechanisms. As part of a larger project to evaluate the operant behavior predicted by the theory, this project was the first replication of neural network models based on the unified theory of reinforcement. In the process of replicating these neural network models it became apparent that a previously published finding, namely, that the networks simulate the blocking phenomenon (Donahoe et al., 1993), was a misinterpretation of the data. We show that the apparent blocking produced by these networks is an artifact of the inability of these networks to generate the same conditioned response to multiple stimuli. The piecemeal approach to evaluate the unified theory of reinforcement via simulation is critiqued and alternatives are discussed. Copyright © 2015 Elsevier B.V. All rights reserved.
Current and Future Development of a Non-hydrostatic Unified Atmospheric Model (NUMA)
2010-09-09
following capabilities: 1. Highly scalable on current and future computer architectures ( exascale computing and beyond and GPUs) 2. Flexibility... Exascale Computing • 10 of Top 500 are already in the Petascale range • Should also keep our eyes on GPUs (e.g., Mare Nostrum) 2. Numerical
NASA Technical Reports Server (NTRS)
Chu, Y. Y.
1978-01-01
A unified formulation of computer-aided, multi-task, decision making is presented. Strategy for the allocation of decision making responsibility between human and computer is developed. The plans of a flight management systems are studied. A model based on the queueing theory was implemented.
Medical Device Integration Model Based on the Internet of Things
Hao, Aiyu; Wang, Ling
2015-01-01
At present, hospitals in our country have basically established the HIS system, which manages registration, treatment, and charge, among many others, of patients. During treatment, patients need to use medical devices repeatedly to acquire all sorts of inspection data. Currently, the output data of the medical devices are often manually input into information system, which is easy to get wrong or easy to cause mismatches between inspection reports and patients. For some small hospitals of which information construction is still relatively weak, the information generated by the devices is still presented in the form of paper reports. When doctors or patients want to have access to the data at a given time again, they can only look at the paper files. Data integration between medical devices has long been a difficult problem for the medical information system, because the data from medical devices are lack of mandatory unified global standards and have outstanding heterogeneity of devices. In order to protect their own interests, manufacturers use special protocols, etc., thus causing medical decices to still be the "lonely island" of hospital information system. Besides, unfocused application of the data will lead to failure to achieve a reasonable distribution of medical resources. With the deepening of IT construction in hospitals, medical information systems will be bound to develop towards mobile applications, intelligent analysis, and interconnection and interworking, on the premise that there is an effective medical device integration (MDI) technology. To this end, this paper presents a MDI model based on the Internet of Things (IoT). Through abstract classification, this model is able to extract the common characteristics of the devices, resolve the heterogeneous differences between them, and employ a unified protocol to integrate data between devices. And by the IoT technology, it realizes interconnection network of devices and conducts associate matching between the data and the inspected at the device terminal in a timely manner. PMID:26628938
Medical Device Integration Model Based on the Internet of Things.
Hao, Aiyu; Wang, Ling
2015-01-01
At present, hospitals in our country have basically established the HIS system, which manages registration, treatment, and charge, among many others, of patients. During treatment, patients need to use medical devices repeatedly to acquire all sorts of inspection data. Currently, the output data of the medical devices are often manually input into information system, which is easy to get wrong or easy to cause mismatches between inspection reports and patients. For some small hospitals of which information construction is still relatively weak, the information generated by the devices is still presented in the form of paper reports. When doctors or patients want to have access to the data at a given time again, they can only look at the paper files. Data integration between medical devices has long been a difficult problem for the medical information system, because the data from medical devices are lack of mandatory unified global standards and have outstanding heterogeneity of devices. In order to protect their own interests, manufacturers use special protocols, etc., thus causing medical decices to still be the "lonely island" of hospital information system. Besides, unfocused application of the data will lead to failure to achieve a reasonable distribution of medical resources. With the deepening of IT construction in hospitals, medical information systems will be bound to develop towards mobile applications, intelligent analysis, and interconnection and interworking, on the premise that there is an effective medical device integration (MDI) technology. To this end, this paper presents a MDI model based on the Internet of Things (IoT). Through abstract classification, this model is able to extract the common characteristics of the devices, resolve the heterogeneous differences between them, and employ a unified protocol to integrate data between devices. And by the IoT technology, it realizes interconnection network of devices and conducts associate matching between the data and the inspected at the device terminal in a timely manner.
Cross-Platform Development Techniques for Mobile Devices
2013-09-01
many other platforms including Windows, Blackberry , and Symbian. Each of these platforms has their own distinct architecture and programming language...sales of iPhones and the increasing use of Android-based devices have forced less successful competitors such as Microsoft, Blackberry , and Symbian... Blackberry and Windows Phone are planned [12] in this tool’s attempt to reuse code with a unified JavaScript API while at the same time supporting unique
Translations on Environmental Quality, Number 127
1976-12-29
first cleansing is done "by the highly effective method of electrocoagulation . It is "based on the passage of a constant electric flow with a voltage of...chemical rea- gents and becomes suitable for repeated use in industry. With the aid of iod- exchange devices electrocoagulation is supplemented with... electrocoagulation method, it is creating unified devices for the various substances contained in water and is improving plant design. All this will guarantee
ERIC Educational Resources Information Center
Venkatesh, Vijay P.
2013-01-01
The current computing landscape owes its roots to the birth of hardware and software technologies from the 1940s and 1950s. Since then, the advent of mainframes, miniaturized computing, and internetworking has given rise to the now prevalent cloud computing era. In the past few months just after 2010, cloud computing adoption has picked up pace…
Unified Ultrasonic/Eddy-Current Data Acquisition
NASA Technical Reports Server (NTRS)
Chern, E. James; Butler, David W.
1993-01-01
Imaging station for detecting cracks and flaws in solid materials developed combining both ultrasonic C-scan and eddy-current imaging. Incorporation of both techniques into one system eliminates duplication of computers and of mechanical scanners; unifies acquisition, processing, and storage of data; reduces setup time for repetitious ultrasonic and eddy-current scans; and increases efficiency of system. Same mechanical scanner used to maneuver either ultrasonic or eddy-current probe over specimen and acquire point-by-point data. For ultrasonic scanning, probe linked to ultrasonic pulser/receiver circuit card, while, for eddy-current imaging, probe linked to impedance-analyzer circuit card. Both ultrasonic and eddy-current imaging subsystems share same desktop-computer controller, containing dedicated plug-in circuit boards for each.
Topical perspective on massive threading and parallelism.
Farber, Robert M
2011-09-01
Unquestionably computer architectures have undergone a recent and noteworthy paradigm shift that now delivers multi- and many-core systems with tens to many thousands of concurrent hardware processing elements per workstation or supercomputer node. GPGPU (General Purpose Graphics Processor Unit) technology in particular has attracted significant attention as new software development capabilities, namely CUDA (Compute Unified Device Architecture) and OpenCL™, have made it possible for students as well as small and large research organizations to achieve excellent speedup for many applications over more conventional computing architectures. The current scientific literature reflects this shift with numerous examples of GPGPU applications that have achieved one, two, and in some special cases, three-orders of magnitude increased computational performance through the use of massive threading to exploit parallelism. Multi-core architectures are also evolving quickly to exploit both massive-threading and massive-parallelism such as the 1.3 million threads Blue Waters supercomputer. The challenge confronting scientists in planning future experimental and theoretical research efforts--be they individual efforts with one computer or collaborative efforts proposing to use the largest supercomputers in the world is how to capitalize on these new massively threaded computational architectures--especially as not all computational problems will scale to massive parallelism. In particular, the costs associated with restructuring software (and potentially redesigning algorithms) to exploit the parallelism of these multi- and many-threaded machines must be considered along with application scalability and lifespan. This perspective is an overview of the current state of threading and parallelize with some insight into the future. Published by Elsevier Inc.
Read-noise characterization of focal plane array detectors via mean-variance analysis.
Sperline, R P; Knight, A K; Gresham, C A; Koppenaal, D W; Hieftje, G M; Denton, M B
2005-11-01
Mean-variance analysis is described as a method for characterization of the read-noise and gain of focal plane array (FPA) detectors, including charge-coupled devices (CCDs), charge-injection devices (CIDs), and complementary metal-oxide-semiconductor (CMOS) multiplexers (infrared arrays). Practical FPA detector characterization is outlined. The nondestructive readout capability available in some CIDs and FPA devices is discussed as a means for signal-to-noise ratio improvement. Derivations of the equations are fully presented to unify understanding of this method by the spectroscopic community.
The Research and Implementation of MUSER CLEAN Algorithm Based on OpenCL
NASA Astrophysics Data System (ADS)
Feng, Y.; Chen, K.; Deng, H.; Wang, F.; Mei, Y.; Wei, S. L.; Dai, W.; Yang, Q. P.; Liu, Y. B.; Wu, J. P.
2017-03-01
It's urgent to carry out high-performance data processing with a single machine in the development of astronomical software. However, due to the different configuration of the machine, traditional programming techniques such as multi-threading, and CUDA (Compute Unified Device Architecture)+GPU (Graphic Processing Unit) have obvious limitations in portability and seamlessness between different operation systems. The OpenCL (Open Computing Language) used in the development of MUSER (MingantU SpEctral Radioheliograph) data processing system is introduced. And the Högbom CLEAN algorithm is re-implemented into parallel CLEAN algorithm by the Python language and PyOpenCL extended package. The experimental results show that the CLEAN algorithm based on OpenCL has approximately equally operating efficiency compared with the former CLEAN algorithm based on CUDA. More important, the data processing in merely CPU (Central Processing Unit) environment of this system can also achieve high performance, which has solved the problem of environmental dependence of CUDA+GPU. Overall, the research improves the adaptability of the system with emphasis on performance of MUSER image clean computing. In the meanwhile, the realization of OpenCL in MUSER proves its availability in scientific data processing. In view of the high-performance computing features of OpenCL in heterogeneous environment, it will probably become the preferred technology in the future high-performance astronomical software development.
Welch, M C; Kwan, P W; Sajeev, A S M
2014-10-01
Agent-based modelling has proven to be a promising approach for developing rich simulations for complex phenomena that provide decision support functions across a broad range of areas including biological, social and agricultural sciences. This paper demonstrates how high performance computing technologies, namely General-Purpose Computing on Graphics Processing Units (GPGPU), and commercial Geographic Information Systems (GIS) can be applied to develop a national scale, agent-based simulation of an incursion of Old World Screwworm fly (OWS fly) into the Australian mainland. The development of this simulation model leverages the combination of massively data-parallel processing capabilities supported by NVidia's Compute Unified Device Architecture (CUDA) and the advanced spatial visualisation capabilities of GIS. These technologies have enabled the implementation of an individual-based, stochastic lifecycle and dispersal algorithm for the OWS fly invasion. The simulation model draws upon a wide range of biological data as input to stochastically determine the reproduction and survival of the OWS fly through the different stages of its lifecycle and dispersal of gravid females. Through this model, a highly efficient computational platform has been developed for studying the effectiveness of control and mitigation strategies and their associated economic impact on livestock industries can be materialised. Copyright © 2014 International Atomic Energy Agency 2014. Published by Elsevier B.V. All rights reserved.
GPU acceleration of Runge Kutta-Fehlberg and its comparison with Dormand-Prince method
NASA Astrophysics Data System (ADS)
Seen, Wo Mei; Gobithaasan, R. U.; Miura, Kenjiro T.
2014-07-01
There is a significant reduction of processing time and speedup of performance in computer graphics with the emergence of Graphic Processing Units (GPUs). GPUs have been developed to surpass Central Processing Unit (CPU) in terms of performance and processing speed. This evolution has opened up a new area in computing and researches where highly parallel GPU has been used for non-graphical algorithms. Physical or phenomenal simulations and modelling can be accelerated through General Purpose Graphic Processing Units (GPGPU) and Compute Unified Device Architecture (CUDA) implementations. These phenomena can be represented with mathematical models in the form of Ordinary Differential Equations (ODEs) which encompasses the gist of change rate between independent and dependent variables. ODEs are numerically integrated over time in order to simulate these behaviours. The classical Runge-Kutta (RK) scheme is the common method used to numerically solve ODEs. The Runge Kutta Fehlberg (RKF) scheme has been specially developed to provide an estimate of the principal local truncation error at each step, known as embedding estimate technique. This paper delves into the implementation of RKF scheme for GPU devices and compares its result with Dorman Prince method. A pseudo code is developed to show the implementation in detail. Hence, practitioners will be able to understand the data allocation in GPU, formation of RKF kernels and the flow of data to/from GPU-CPU upon RKF kernel evaluation. The pseudo code is then written in C Language and two ODE models are executed to show the achievable speedup as compared to CPU implementation. The accuracy and efficiency of the proposed implementation method is discussed in the final section of this paper.
Finding order in complexity: themes from the career of Dr. Robert F. Wagner
NASA Astrophysics Data System (ADS)
Myers, Kyle J.
2009-02-01
Over the course of his long and productive career, Dr. Robert F. Wagner built a framework for the evaluation of imaging systems based on a task-based, decision theoretic approach. His most recent contributions involved the consideration of the random effects associated with multiple readers of medical images and the logical extension of this work to the problem of the evaluation of multiple competing classifiers in statistical pattern recognition. This contemporary work expanded on familiar themes from Bob's many SPIE presentations in earlier years. It was driven by the need for practical solutions to current problems facing FDA'S Center for Devices and Radiological Health and the medical imaging community regarding the assessment of new computer-aided diagnosis tools and Bob's unique ability to unify concepts across a range of disciplines as he gave order to increasingly complex problems in our field.
Neurotech for Neuroscience: Unifying Concepts, Organizing Principles, and Emerging Tools
Silver, Rae; Boahen, Kwabena; Grillner, Sten; Kopell, Nancy; Olsen, Kathie L.
2012-01-01
The ability to tackle analysis of the brain at multiple levels simultaneously is emerging from rapid methodological developments. The classical research strategies of “measure,” “model,” and “make” are being applied to the exploration of nervous system function. These include novel conceptual and theoretical approaches, creative use of mathematical modeling, and attempts to build brain-like devices and systems, as well as other developments including instrumentation and statistical modeling (not covered here). Increasingly, these efforts require teams of scientists from a variety of traditional scientific disciplines to work together. The potential of such efforts for understanding directed motor movement, emergence of cognitive function from neuronal activity, and development of neuromimetic computers are described by a team that includes individuals experienced in behavior and neuroscience, mathematics, and engineering. Funding agencies, including the National Science Foundation, explore the potential of these changing frontiers of research for developing research policies and long-term planning. PMID:17978017
General purpose graphic processing unit implementation of adaptive pulse compression algorithms
NASA Astrophysics Data System (ADS)
Cai, Jingxiao; Zhang, Yan
2017-07-01
This study introduces a practical approach to implement real-time signal processing algorithms for general surveillance radar based on NVIDIA graphical processing units (GPUs). The pulse compression algorithms are implemented using compute unified device architecture (CUDA) libraries such as CUDA basic linear algebra subroutines and CUDA fast Fourier transform library, which are adopted from open source libraries and optimized for the NVIDIA GPUs. For more advanced, adaptive processing algorithms such as adaptive pulse compression, customized kernel optimization is needed and investigated. A statistical optimization approach is developed for this purpose without needing much knowledge of the physical configurations of the kernels. It was found that the kernel optimization approach can significantly improve the performance. Benchmark performance is compared with the CPU performance in terms of processing accelerations. The proposed implementation framework can be used in various radar systems including ground-based phased array radar, airborne sense and avoid radar, and aerospace surveillance radar.
A GPU-based calculation using the three-dimensional FDTD method for electromagnetic field analysis.
Nagaoka, Tomoaki; Watanabe, Soichi
2010-01-01
Numerical simulations with the numerical human model using the finite-difference time domain (FDTD) method have recently been performed frequently in a number of fields in biomedical engineering. However, the FDTD calculation runs too slowly. We focus, therefore, on general purpose programming on the graphics processing unit (GPGPU). The three-dimensional FDTD method was implemented on the GPU using Compute Unified Device Architecture (CUDA). In this study, we used the NVIDIA Tesla C1060 as a GPGPU board. The performance of the GPU is evaluated in comparison with the performance of a conventional CPU and a vector supercomputer. The results indicate that three-dimensional FDTD calculations using a GPU can significantly reduce run time in comparison with that using a conventional CPU, even a native GPU implementation of the three-dimensional FDTD method, while the GPU/CPU speed ratio varies with the calculation domain and thread block size.
Dynamic mapping of EDDL device descriptions to OPC UA
NASA Astrophysics Data System (ADS)
Atta Nsiah, Kofi; Schappacher, Manuel; Sikora, Axel
2017-07-01
OPC UA (Open Platform Communications Unified Architecture) is already a well-known concept used widely in the automation industry. In the area of factory automation, OPC UA models the underlying field devices such as sensors and actuators in an OPC UA server to allow connecting OPC UA clients to access device-specific information via a standardized information model. One of the requirements of the OPC UA server to represent field device data using its information model is to have advanced knowledge about the properties of the field devices in the form of device descriptions. The international standard IEC 61804 specifies EDDL (Electronic Device Description Language) as a generic language for describing the properties of field devices. In this paper, the authors describe a possibility to dynamically map and integrate field device descriptions based on EDDL into OPCUA.
NASA Astrophysics Data System (ADS)
Shen, Yanfeng; Cesnik, Carlos E. S.
2016-04-01
This paper presents a parallelized modeling technique for the efficient simulation of nonlinear ultrasonics introduced by the wave interaction with fatigue cracks. The elastodynamic wave equations with contact effects are formulated using an explicit Local Interaction Simulation Approach (LISA). The LISA formulation is extended to capture the contact-impact phenomena during the wave damage interaction based on the penalty method. A Coulomb friction model is integrated into the computation procedure to capture the stick-slip contact shear motion. The LISA procedure is coded using the Compute Unified Device Architecture (CUDA), which enables the highly parallelized supercomputing on powerful graphic cards. Both the explicit contact formulation and the parallel feature facilitates LISA's superb computational efficiency over the conventional finite element method (FEM). The theoretical formulations based on the penalty method is introduced and a guideline for the proper choice of the contact stiffness is given. The convergence behavior of the solution under various contact stiffness values is examined. A numerical benchmark problem is used to investigate the new LISA formulation and results are compared with a conventional contact finite element solution. Various nonlinear ultrasonic phenomena are successfully captured using this contact LISA formulation, including the generation of nonlinear higher harmonic responses. Nonlinear mode conversion of guided waves at fatigue cracks is also studied.
On-line range images registration with GPGPU
NASA Astrophysics Data System (ADS)
Będkowski, J.; Naruniec, J.
2013-03-01
This paper concerns implementation of algorithms in the two important aspects of modern 3D data processing: data registration and segmentation. Solution proposed for the first topic is based on the 3D space decomposition, while the latter on image processing and local neighbourhood search. Data processing is implemented by using NVIDIA compute unified device architecture (NIVIDIA CUDA) parallel computation. The result of the segmentation is a coloured map where different colours correspond to different objects, such as walls, floor and stairs. The research is related to the problem of collecting 3D data with a RGB-D camera mounted on a rotated head, to be used in mobile robot applications. Performance of the data registration algorithm is aimed for on-line processing. The iterative closest point (ICP) approach is chosen as a registration method. Computations are based on the parallel fast nearest neighbour search. This procedure decomposes 3D space into cubic buckets and, therefore, the time of the matching is deterministic. First technique of the data segmentation uses accele-rometers integrated with a RGB-D sensor to obtain rotation compensation and image processing method for defining pre-requisites of the known categories. The second technique uses the adapted nearest neighbour search procedure for obtaining normal vectors for each range point.
The Development of the Non-hydrostatic Unified Model of the Atmosphere (NUMA)
2011-09-19
capabilities: 1. Highly scalable on current and future computer architectures ( exascale computing: this means CPUs and GPUs) 2. Flexibility to use a...From Terascale to Petascale/ Exascale Computing • 10 of Top 500 are already in the Petascale range • 3 of top 10 are GPU-based machines 2
Cognitive Computational Neuroscience: A New Conference for an Emerging Discipline.
Naselaris, Thomas; Bassett, Danielle S; Fletcher, Alyson K; Kording, Konrad; Kriegeskorte, Nikolaus; Nienborg, Hendrikje; Poldrack, Russell A; Shohamy, Daphna; Kay, Kendrick
2018-05-01
Understanding the computational principles that underlie complex behavior is a central goal in cognitive science, artificial intelligence, and neuroscience. In an attempt to unify these disconnected communities, we created a new conference called Cognitive Computational Neuroscience (CCN). The inaugural meeting revealed considerable enthusiasm but significant obstacles remain. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Lin, Mingpei; Xu, Ming; Fu, Xiaoyu
2017-05-01
Currently, a tremendous amount of space debris in Earth's orbit imperils operational spacecraft. It is essential to undertake risk assessments of collisions and predict dangerous encounters in space. However, collision predictions for an enormous amount of space debris give rise to large-scale computations. In this paper, a parallel algorithm is established on the Compute Unified Device Architecture (CUDA) platform of NVIDIA Corporation for collision prediction. According to the parallel structure of NVIDIA graphics processors, a block decomposition strategy is adopted in the algorithm. Space debris is divided into batches, and the computation and data transfer operations of adjacent batches overlap. As a consequence, the latency to access shared memory during the entire computing process is significantly reduced, and a higher computing speed is reached. Theoretically, a simulation of collision prediction for space debris of any amount and for any time span can be executed. To verify this algorithm, a simulation example including 1382 pieces of debris, whose operational time scales vary from 1 min to 3 days, is conducted on Tesla C2075 of NVIDIA. The simulation results demonstrate that with the same computational accuracy as that of a CPU, the computing speed of the parallel algorithm on a GPU is 30 times that on a CPU. Based on this algorithm, collision prediction of over 150 Chinese spacecraft for a time span of 3 days can be completed in less than 3 h on a single computer, which meets the timeliness requirement of the initial screening task. Furthermore, the algorithm can be adapted for multiple tasks, including particle filtration, constellation design, and Monte-Carlo simulation of an orbital computation.
Toward a unified account of comprehension and production in language development.
McCauley, Stewart M; Christiansen, Morten H
2013-08-01
Although Pickering & Garrod (P&G) argue convincingly for a unified system for language comprehension and production, they fail to explain how such a system might develop. Using a recent computational model of language acquisition as an example, we sketch a developmental perspective on the integration of comprehension and production. We conclude that only through development can we fully understand the intertwined nature of comprehension and production in adult processing.
gpuPOM: a GPU-based Princeton Ocean Model
NASA Astrophysics Data System (ADS)
Xu, S.; Huang, X.; Zhang, Y.; Fu, H.; Oey, L.-Y.; Xu, F.; Yang, G.
2014-11-01
Rapid advances in the performance of the graphics processing unit (GPU) have made the GPU a compelling solution for a series of scientific applications. However, most existing GPU acceleration works for climate models are doing partial code porting for certain hot spots, and can only achieve limited speedup for the entire model. In this work, we take the mpiPOM (a parallel version of the Princeton Ocean Model) as our starting point, design and implement a GPU-based Princeton Ocean Model. By carefully considering the architectural features of the state-of-the-art GPU devices, we rewrite the full mpiPOM model from the original Fortran version into a new Compute Unified Device Architecture C (CUDA-C) version. We take several accelerating methods to further improve the performance of gpuPOM, including optimizing memory access in a single GPU, overlapping communication and boundary operations among multiple GPUs, and overlapping input/output (I/O) between the hybrid Central Processing Unit (CPU) and the GPU. Our experimental results indicate that the performance of the gpuPOM on a workstation containing 4 GPUs is comparable to a powerful cluster with 408 CPU cores and it reduces the energy consumption by 6.8 times.
A novel VLSI processor architecture for supercomputing arrays
NASA Technical Reports Server (NTRS)
Venkateswaran, N.; Pattabiraman, S.; Devanathan, R.; Ahmed, Ashaf; Venkataraman, S.; Ganesh, N.
1993-01-01
Design of the processor element for general purpose massively parallel supercomputing arrays is highly complex and cost ineffective. To overcome this, the architecture and organization of the functional units of the processor element should be such as to suit the diverse computational structures and simplify mapping of complex communication structures of different classes of algorithms. This demands that the computation and communication structures of different class of algorithms be unified. While unifying the different communication structures is a difficult process, analysis of a wide class of algorithms reveals that their computation structures can be expressed in terms of basic IP,IP,OP,CM,R,SM, and MAA operations. The execution of these operations is unified on the PAcube macro-cell array. Based on this PAcube macro-cell array, we present a novel processor element called the GIPOP processor, which has dedicated functional units to perform the above operations. The architecture and organization of these functional units are such to satisfy the two important criteria mentioned above. The structure of the macro-cell and the unification process has led to a very regular and simpler design of the GIPOP processor. The production cost of the GIPOP processor is drastically reduced as it is designed on high performance mask programmable PAcube arrays.
Constitutive modeling for isotropic materials (HOST)
NASA Technical Reports Server (NTRS)
Lindholm, Ulric S.; Chan, Kwai S.; Bodner, S. R.; Weber, R. M.; Walker, K. P.; Cassenti, B. N.
1984-01-01
The results of the first year of work on a program to validate unified constitutive models for isotropic materials utilized in high temperature regions of gas turbine engines and to demonstrate their usefulness in computing stress-strain-time-temperature histories in complex three-dimensional structural components. The unified theories combine all inelastic strain-rate components in a single term avoiding, for example, treating plasticity and creep as separate response phenomena. An extensive review of existing unified theories is given and numerical methods for integrating these stiff time-temperature-dependent constitutive equations are discussed. Two particular models, those developed by Bodner and Partom and by Walker, were selected for more detailed development and evaluation against experimental tensile, creep and cyclic strain tests on specimens of a cast nickel base alloy, B19000+Hf. Initial results comparing computed and test results for tensile and cyclic straining for temperature from ambient to 982 C and strain rates from 10(exp-7) 10(exp-3) s(exp-1) are given. Some preliminary date correlations are presented also for highly non-proportional biaxial loading which demonstrate an increase in biaxial cyclic hardening rate over uniaxial or proportional loading conditions. Initial work has begun on the implementation of both constitutive models in the MARC finite element computer code.
Parallel mutual information estimation for inferring gene regulatory networks on GPUs
2011-01-01
Background Mutual information is a measure of similarity between two variables. It has been widely used in various application domains including computational biology, machine learning, statistics, image processing, and financial computing. Previously used simple histogram based mutual information estimators lack the precision in quality compared to kernel based methods. The recently introduced B-spline function based mutual information estimation method is competitive to the kernel based methods in terms of quality but at a lower computational complexity. Results We present a new approach to accelerate the B-spline function based mutual information estimation algorithm with commodity graphics hardware. To derive an efficient mapping onto this type of architecture, we have used the Compute Unified Device Architecture (CUDA) programming model to design and implement a new parallel algorithm. Our implementation, called CUDA-MI, can achieve speedups of up to 82 using double precision on a single GPU compared to a multi-threaded implementation on a quad-core CPU for large microarray datasets. We have used the results obtained by CUDA-MI to infer gene regulatory networks (GRNs) from microarray data. The comparisons to existing methods including ARACNE and TINGe show that CUDA-MI produces GRNs of higher quality in less time. Conclusions CUDA-MI is publicly available open-source software, written in CUDA and C++ programming languages. It obtains significant speedup over sequential multi-threaded implementation by fully exploiting the compute capability of commonly used CUDA-enabled low-cost GPUs. PMID:21672264
GPU-based cone beam computed tomography.
Noël, Peter B; Walczak, Alan M; Xu, Jinhui; Corso, Jason J; Hoffmann, Kenneth R; Schafer, Sebastian
2010-06-01
The use of cone beam computed tomography (CBCT) is growing in the clinical arena due to its ability to provide 3D information during interventions, its high diagnostic quality (sub-millimeter resolution), and its short scanning times (60 s). In many situations, the short scanning time of CBCT is followed by a time-consuming 3D reconstruction. The standard reconstruction algorithm for CBCT data is the filtered backprojection, which for a volume of size 256(3) takes up to 25 min on a standard system. Recent developments in the area of Graphic Processing Units (GPUs) make it possible to have access to high-performance computing solutions at a low cost, allowing their use in many scientific problems. We have implemented an algorithm for 3D reconstruction of CBCT data using the Compute Unified Device Architecture (CUDA) provided by NVIDIA (NVIDIA Corporation, Santa Clara, California), which was executed on a NVIDIA GeForce GTX 280. Our implementation results in improved reconstruction times from minutes, and perhaps hours, to a matter of seconds, while also giving the clinician the ability to view 3D volumetric data at higher resolutions. We evaluated our implementation on ten clinical data sets and one phantom data set to observe if differences occur between CPU and GPU-based reconstructions. By using our approach, the computation time for 256(3) is reduced from 25 min on the CPU to 3.2 s on the GPU. The GPU reconstruction time for 512(3) volumes is 8.5 s. Copyright 2009 Elsevier Ireland Ltd. All rights reserved.
Design Principles and Guidelines for Security
2007-11-21
Padula , Secure Computer Systems: Unified Exposition and Multics Interpretation. Electronic Systems Division, USAF. ESD-TR-75-306, MTR-2997 Rev.1...Hanscom AFB, MA. March 1976 [7] David Elliott Bell. “Looking Back at the Bell-La Padula Model,” Proc. Annual Computer Security Applications Conference
The semiotics of medical image Segmentation.
Baxter, John S H; Gibson, Eli; Eagleson, Roy; Peters, Terry M
2018-02-01
As the interaction between clinicians and computational processes increases in complexity, more nuanced mechanisms are required to describe how their communication is mediated. Medical image segmentation in particular affords a large number of distinct loci for interaction which can act on a deep, knowledge-driven level which complicates the naive interpretation of the computer as a symbol processing machine. Using the perspective of the computer as dialogue partner, we can motivate the semiotic understanding of medical image segmentation. Taking advantage of Peircean semiotic traditions and new philosophical inquiry into the structure and quality of metaphors, we can construct a unified framework for the interpretation of medical image segmentation as a sign exchange in which each sign acts as an interface metaphor. This allows for a notion of finite semiosis, described through a schematic medium, that can rigorously describe how clinicians and computers interpret the signs mediating their interaction. Altogether, this framework provides a unified approach to the understanding and development of medical image segmentation interfaces. Copyright © 2017 Elsevier B.V. All rights reserved.
BowMapCL: Burrows-Wheeler Mapping on Multiple Heterogeneous Accelerators.
Nogueira, David; Tomas, Pedro; Roma, Nuno
2016-01-01
The computational demand of exact-search procedures has pressed the exploitation of parallel processing accelerators to reduce the execution time of many applications. However, this often imposes strict restrictions in terms of the problem size and implementation efforts, mainly due to their possibly distinct architectures. To circumvent this limitation, a new exact-search alignment tool (BowMapCL) based on the Burrows-Wheeler Transform and FM-Index is presented. Contrasting to other alternatives, BowMapCL is based on a unified implementation using OpenCL, allowing the exploitation of multiple and possibly different devices (e.g., NVIDIA, AMD/ATI, and Intel GPUs/APUs). Furthermore, to efficiently exploit such heterogeneous architectures, BowMapCL incorporates several techniques to promote its performance and scalability, including multiple buffering, work-queue task-distribution, and dynamic load-balancing, together with index partitioning, bit-encoding, and sampling. When compared with state-of-the-art tools, the attained results showed that BowMapCL (using a single GPU) is 2 × to 7.5 × faster than mainstream multi-threaded CPU BWT-based aligners, like Bowtie, BWA, and SOAP2; and up to 4 × faster than the best performing state-of-the-art GPU implementations (namely, SOAP3 and HPG-BWT). When multiple and completely distinct devices are considered, BowMapCL efficiently scales the offered throughput, ensuring a convenient load-balance of the involved processing in the several distinct devices.
GPU accelerated dynamic functional connectivity analysis for functional MRI data.
Akgün, Devrim; Sakoğlu, Ünal; Esquivel, Johnny; Adinoff, Bryon; Mete, Mutlu
2015-07-01
Recent advances in multi-core processors and graphics card based computational technologies have paved the way for an improved and dynamic utilization of parallel computing techniques. Numerous applications have been implemented for the acceleration of computationally-intensive problems in various computational science fields including bioinformatics, in which big data problems are prevalent. In neuroimaging, dynamic functional connectivity (DFC) analysis is a computationally demanding method used to investigate dynamic functional interactions among different brain regions or networks identified with functional magnetic resonance imaging (fMRI) data. In this study, we implemented and analyzed a parallel DFC algorithm based on thread-based and block-based approaches. The thread-based approach was designed to parallelize DFC computations and was implemented in both Open Multi-Processing (OpenMP) and Compute Unified Device Architecture (CUDA) programming platforms. Another approach developed in this study to better utilize CUDA architecture is the block-based approach, where parallelization involves smaller parts of fMRI time-courses obtained by sliding-windows. Experimental results showed that the proposed parallel design solutions enabled by the GPUs significantly reduce the computation time for DFC analysis. Multicore implementation using OpenMP on 8-core processor provides up to 7.7× speed-up. GPU implementation using CUDA yielded substantial accelerations ranging from 18.5× to 157× speed-up once thread-based and block-based approaches were combined in the analysis. Proposed parallel programming solutions showed that multi-core processor and CUDA-supported GPU implementations accelerated the DFC analyses significantly. Developed algorithms make the DFC analyses more practical for multi-subject studies with more dynamic analyses. Copyright © 2015 Elsevier Ltd. All rights reserved.
SSC San Diego Biennial Review 2003. Vol 2: Communication and Information Systems
2003-01-01
University, Department of Electrical and Computer Engineering) Michael Jablecki (Science and Technology Corporation) Stochastic Unified Multiple...wearable computers and cellular phones. The technology-transfer process involved a coalition of government and industrial partners, each providing...the design and fabrication of the coupler. SSC San Diego developed a computer -controlled fused fiber fabrication station to achieve the required
DOE Office of Scientific and Technical Information (OSTI.GOV)
Childs, Andrew M.; Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139; Leung, Debbie W.
We present unified, systematic derivations of schemes in the two known measurement-based models of quantum computation. The first model (introduced by Raussendorf and Briegel, [Phys. Rev. Lett. 86, 5188 (2001)]) uses a fixed entangled state, adaptive measurements on single qubits, and feedforward of the measurement results. The second model (proposed by Nielsen, [Phys. Lett. A 308, 96 (2003)] and further simplified by Leung, [Int. J. Quant. Inf. 2, 33 (2004)]) uses adaptive two-qubit measurements that can be applied to arbitrary pairs of qubits, and feedforward of the measurement results. The underlying principle of our derivations is a variant of teleportationmore » introduced by Zhou, Leung, and Chuang, [Phys. Rev. A 62, 052316 (2000)]. Our derivations unify these two measurement-based models of quantum computation and provide significantly simpler schemes.« less
Aerospace System Unified Life Cycle Engineering Producibility Measurement Issues
1989-05-01
Control .................................................................. 11-9 5 . C o st...in the development process; these computer -aided models offer clarity approaching that of a prototype model. Once a part geometry is represented...of part geometry , allowing manufacturability evaluation and possibly other computer -integrated manufacturing (CIM) tasks. (Other papers that discuss
Computer Networking Strategies for Building Collaboration among Science Educators.
ERIC Educational Resources Information Center
Aust, Ronald
The development and dissemination of science materials can be associated with technical delivery systems such as the Unified Network for Informatics in Teacher Education (UNITE). The UNITE project was designed to investigate ways for using computer networking to improve communications and collaboration among university schools of education and…
A Model for Conducting and Assessing Interdisciplinary Undergraduate Dissertations
ERIC Educational Resources Information Center
Engström, Henrik
2015-01-01
This paper presents an effort to create a unified model for conducting and assessing undergraduate dissertations, shared by all disciplines involved in computer game development at a Swedish university. Computer game development includes technology-oriented disciplines as well as disciplines with aesthetical traditions. The challenge has been to…
Computer Utilization by Schools: An Example.
ERIC Educational Resources Information Center
Tondow, Murray
1968-01-01
The Educational Data Services Department of the Palo Alto Unified School District is responsible for implementing data processing needs to improve the quality of education in Palo Alto, California. Information from the schools enters the Department data library to be scanned, coded, and corrected prior to IBM 1620 computer input. Operating 17…
Unified commutation-pruning technique for efficient computation of composite DFTs
NASA Astrophysics Data System (ADS)
Castro-Palazuelos, David E.; Medina-Melendrez, Modesto Gpe.; Torres-Roman, Deni L.; Shkvarko, Yuriy V.
2015-12-01
An efficient computation of a composite length discrete Fourier transform (DFT), as well as a fast Fourier transform (FFT) of both time and space data sequences in uncertain (non-sparse or sparse) computational scenarios, requires specific processing algorithms. Traditional algorithms typically employ some pruning methods without any commutations, which prevents them from attaining the potential computational efficiency. In this paper, we propose an alternative unified approach with automatic commutations between three computational modalities aimed at efficient computations of the pruned DFTs adapted for variable composite lengths of the non-sparse input-output data. The first modality is an implementation of the direct computation of a composite length DFT, the second one employs the second-order recursive filtering method, and the third one performs the new pruned decomposed transform. The pruned decomposed transform algorithm performs the decimation in time or space (DIT) data acquisition domain and, then, decimation in frequency (DIF). The unified combination of these three algorithms is addressed as the DFTCOMM technique. Based on the treatment of the combinational-type hypotheses testing optimization problem of preferable allocations between all feasible commuting-pruning modalities, we have found the global optimal solution to the pruning problem that always requires a fewer or, at most, the same number of arithmetic operations than other feasible modalities. The DFTCOMM method outperforms the existing competing pruning techniques in the sense of attainable savings in the number of required arithmetic operations. It requires fewer or at most the same number of arithmetic operations for its execution than any other of the competing pruning methods reported in the literature. Finally, we provide the comparison of the DFTCOMM with the recently developed sparse fast Fourier transform (SFFT) algorithmic family. We feature that, in the sensing scenarios with sparse/non-sparse data Fourier spectrum, the DFTCOMM technique manifests robustness against such model uncertainties in the sense of insensitivity for sparsity/non-sparsity restrictions and the variability of the operating parameters.
Pronominalization: A Device for Unifying Sentences in Memory
ERIC Educational Resources Information Center
Lesgold, Alan M.
1972-01-01
Based on the author's doctoral dissertation. Experiments supported by a National Science Foundation Predoctoral Fellowship and a National Institute of Mental Health grant; paper preparation supported by the Learning Research and Development Center, University of Pittsburgh, through the U.S. Office of Education. (VM)
NASA Astrophysics Data System (ADS)
Prudden, R.; Arribas, A.; Tomlinson, J.; Robinson, N.
2017-12-01
The Unified Model is a numerical model of the atmosphere used at the UK Met Office (and numerous partner organisations including Korean Meteorological Agency, Australian Bureau of Meteorology and US Air Force) for both weather and climate applications.Especifically, dynamical models such as the Unified Model are now a central part of weather forecasting. Starting from basic physical laws, these models make it possible to predict events such as storms before they have even begun to form. The Unified Model can be simply described as having two components: one component solves the navier-stokes equations (usually referred to as the "dynamics"); the other solves relevant sub-grid physical processes (usually referred to as the "physics"). Running weather forecasts requires substantial computing resources - for example, the UK Met Office operates the largest operational High Performance Computer in Europe - and the cost of a typical simulation is spent roughly 50% in the "dynamics" and 50% in the "physics". Therefore there is a high incentive to reduce cost of weather forecasts and Machine Learning is a possible option because, once a machine learning model has been trained, it is often much faster to run than a full simulation. This is the motivation for a technique called model emulation, the idea being to build a fast statistical model which closely approximates a far more expensive simulation. In this paper we discuss the use of Machine Learning as an emulator to replace the "physics" component of the Unified Model. Various approaches and options will be presented and the implications for further model development, operational running of forecasting systems, development of data assimilation schemes, and development of ensemble prediction techniques will be discussed.
A unified approach to computational drug discovery.
Tseng, Chih-Yuan; Tuszynski, Jack
2015-11-01
It has been reported that a slowdown in the development of new medical therapies is affecting clinical outcomes. The FDA has thus initiated the Critical Path Initiative project investigating better approaches. We review the current strategies in drug discovery and focus on the advantages of the maximum entropy method being introduced in this area. The maximum entropy principle is derived from statistical thermodynamics and has been demonstrated to be an inductive inference tool. We propose a unified method to drug discovery that hinges on robust information processing using entropic inductive inference. Increasingly, applications of maximum entropy in drug discovery employ this unified approach and demonstrate the usefulness of the concept in the area of pharmaceutical sciences. Copyright © 2015. Published by Elsevier Ltd.
1977-01-26
Sisteme Matematicheskogo Obespecheniya YeS EVM [ Applied Programs in the Software System for the Unified System of Computers], by A. Ye. Fateyev, A. I...computerized systems are most effective in large production complexes , in which the level of utilization of computers can be as high as 500,000...performance of these tasks could be furthered by the complex introduction of electronic computers in automated control systems. The creation of ASU
[Research on fast implementation method of image Gaussian RBF interpolation based on CUDA].
Chen, Hao; Yu, Haizhong
2014-04-01
Image interpolation is often required during medical image processing and analysis. Although interpolation method based on Gaussian radial basis function (GRBF) has high precision, the long calculation time still limits its application in field of image interpolation. To overcome this problem, a method of two-dimensional and three-dimensional medical image GRBF interpolation based on computing unified device architecture (CUDA) is proposed in this paper. According to single instruction multiple threads (SIMT) executive model of CUDA, various optimizing measures such as coalesced access and shared memory are adopted in this study. To eliminate the edge distortion of image interpolation, natural suture algorithm is utilized in overlapping regions while adopting data space strategy of separating 2D images into blocks or dividing 3D images into sub-volumes. Keeping a high interpolation precision, the 2D and 3D medical image GRBF interpolation achieved great acceleration in each basic computing step. The experiments showed that the operative efficiency of image GRBF interpolation based on CUDA platform was obviously improved compared with CPU calculation. The present method is of a considerable reference value in the application field of image interpolation.
Deducing the reachable space from fingertip positions.
Hai-Trieu Pham; Pathirana, Pubudu N
2015-01-01
The reachable space of the hand has received significant interests in the past from relevant medical researchers and health professionals. The reachable space was often computed from the joint angles acquired from a motion capture system such as gloves or markers attached to each bone of the finger. However, the contact between the hand and device can cause difficulties particularly for hand with injuries, burns or experiencing certain dermatological conditions. This paper introduces an approach to find the reachable space of the hand in a non-contact measurement form utilizing the Leap Motion Controller. The approach is based on the analysis of each position in the motion path of the fingertip acquired by the Leap Motion Controller. For each position of the fingertip, the inverse kinematic problem was solved under the physiological multiple constraints of the human hand to find a set of all possible configurations of three finger joints. Subsequently, all the sets are unified to form a set of all possible configurations specific for that motion. Finally, a reachable space is computed from the configuration corresponding to the complete extension and the complete flexion of the finger joint angles in this set.
Investigation of unifying transcutaneous transformer for transmission of energy and information.
Tamura, Nozomi; Yamamoto, Takahiko; Aoki, Hirooki; Koshiji, Kohji; Homma, Akihiko; Tatsumi, Eisuke; Taenaka, Yoshiyuki
2009-01-01
When patients are fitted with a totally implantable artificial heart (TAH), they need to be implanted with two additional devices: one for the transmission of energy and one for information. However, this is a cumbersome process that affects the quality of life of the recipient. Therefore, we investigated the use of electromagnetic coupling for the transmission of energy and information and the possibility of unifying two transcutaneous transformers for the simultaneous transmission of energy and information. While unifying the transformers, it is important to suppress the electromagnetic coupling between energy and information transmission. Therefore, we ensured that the electromagnetic fields generated from the transformer windings for the transmissions of information and energy intersected perpendicularly. If the fields are perpendicular, the electromagnetic coupling between the energy and information transmissions will be suppressed significantly. The characteristics of the simultaneous transmission of information and energy using the unified transcutaneous transformer, developed experimentally, were evaluated by changing the number of windings used for the transmission of information. The electromagnetic coupling between the energy and information transmissions was suppressed by determining the direction of the magnetic field. Moreover, the optimum number of transformer windings required for the simultaneous transmission of energy and information was determined. We concluded that the externally coupled transcutaneous transformer unified for the simultaneous transmission of energy and information performed with good transmission characteristics.
Tajstra, Mateusz; Sokal, Adam; Gwóźdź, Arkadiusz; Wilczek, Marcin; Gacek, Adam; Wojciechowski, Konrad; Gadula-Gacek, Elżbieta; Adamowicz-Czoch, Elżbieta; Chłosta-Niepiekło, Katarzyna; Milewski, Krzysztof; Rozentryt, Piotr; Kalarus, Zbigniew; Gąsior, Mariusz; Poloński, Lech
2017-07-01
The number of patients with heart failure implantable cardiac electronic devices (CIEDs) is growing. Hospitalization rate in this group is very high and generates enormous costs. To avoid the need for hospital treatment, optimized monitoring and follow-up is crucial. Remote monitoring (RM) has been widely put into practice in the management of CIEDs but it may be difficult due to the presence of differences in systems provided by device manufacturers and loss of gathered data in case of device reimplantation. Additionally, conclusions derived from studies about usefulness of RM in clinical practice apply to devices coming only from a single company. An integrated monitoring platform allows for more comprehensive data analysis and interpretation. Therefore, the primary objective of Remote Supervision to Decrease Hospitalization Rate (RESULT) study is to evaluate the impact of RM on the clinical status of patients with ICDs or CRT-Ds using an integrated platform. Six hundred consecutive patients with ICDs or CRT-Ds implanted will be prospectively randomized to either a traditional or RM-based follow-up model. The primary clinical endpoint will be a composite of all-cause mortality or hospitalization for cardiovascular reasons within 12 months after randomization. The primary technical endpoint will be to construct and evaluate a unified and integrated platform for the data collected from RM devices manufactured by different companies. This manuscript describes the design and methodology of the prospective, randomized trial designed to determine whether remote monitoring using an integrated platform for different companies is safe, feasible, and efficacious (ClinicalTrials.gov Identifier: NCT02409225). © 2016 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Ye, Liu; Hu, GuiYu; Li, AiXia
2011-01-01
We propose a unified scheme to implement the optimal 1 → 3 economical phase-covariant quantum cloning and optimal 1 → 3 economical real state cloning with superconducting quantum interference devices (SQUIDs) in a cavity. During this process, no transfer of quantum information between the SQUIDs and cavity is required. The cavity field is only virtually excited. The scheme is insensitive to cavity decay. Therefore, the scheme can be experimentally realized in the range of current cavity QED techniques.
Curricular Improvements through Computation and Experiment Based Learning Modules
ERIC Educational Resources Information Center
Khan, Fazeel; Singh, Kumar
2015-01-01
Engineers often need to predict how a part, mechanism or machine will perform in service, and this insight is typically achieved thorough computer simulations. Therefore, instruction in the creation and application of simulation models is essential for aspiring engineers. The purpose of this project was to develop a unified approach to teaching…
A subsequent closed-form description of propagated signaling phenomena in the membrane of an axon
DOE Office of Scientific and Technical Information (OSTI.GOV)
Melendy, Robert F., E-mail: rfmelendy@liberty.edu
2016-05-15
I recently introduced a closed-form description of propagated signaling phenomena in the membrane of an axon [R.F. Melendy, Journal of Applied Physics 118, 244701 (2015)]. Those results demonstrate how intracellular conductance, the thermodynamics of magnetization, and current modulation, function together in generating an action potential in a unified, closed-form description. At present, I report on a subsequent closed-form model that unifies intracellular conductance and the thermodynamics of magnetization, with the membrane electric field, E{sub m}. It’s anticipated this work will compel researchers in biophysics, physical biology, and the computational neurosciences, to probe deeper into the classical and quantum features ofmore » membrane magnetization and signaling, informed by the computational features of this subsequent model.« less
Simultaneous analysis and design
NASA Technical Reports Server (NTRS)
Haftka, R. T.
1984-01-01
Optimization techniques are increasingly being used for performing nonlinear structural analysis. The development of element by element (EBE) preconditioned conjugate gradient (CG) techniques is expected to extend this trend to linear analysis. Under these circumstances the structural design problem can be viewed as a nested optimization problem. There are computational benefits to treating this nested problem as a large single optimization problem. The response variables (such as displacements) and the structural parameters are all treated as design variables in a unified formulation which performs simultaneously the design and analysis. Two examples are used for demonstration. A seventy-two bar truss is optimized subject to linear stress constraints and a wing box structure is optimized subject to nonlinear collapse constraints. Both examples show substantial computational savings with the unified approach as compared to the traditional nested approach.
NASA Astrophysics Data System (ADS)
Binti Shamsuddin, Norsila
Technology advancement and development in a higher learning institution is a chance for students to be motivated to learn in depth in the information technology areas. Students should take hold of the opportunity to blend their skills towards these technologies as preparation for them when graduating. The curriculum itself can rise up the students' interest and persuade them to be directly involved in the evolvement of the technology. The aim of this study is to see how deep is the students' involvement as well as their acceptance towards the adoption of the technology used in Computer Graphics and Image Processing subjects. The study will be towards the Bachelor students in Faculty of Industrial Information Technology (FIIT), Universiti Industri Selangor (UNISEL); Bac. In Multimedia Industry, BSc. Computer Science and BSc. Computer Science (Software Engineering). This study utilizes the new Unified Theory of Acceptance and Use of Technology (UTAUT) to further validate the model and enhance our understanding of the adoption of Computer Graphics and Image Processing Technologies. Four (4) out of eight (8) independent factors in UTAUT will be studied towards the dependent factor.
Structural behavior of composites with progressive fracture
NASA Technical Reports Server (NTRS)
Minnetyan, L.; Murthy, P. L. N.; Chamis, C. C.
1989-01-01
The objective of the study is to unify several computational tools developed for the prediction of progressive damage and fracture with efforts for the prediction of the overall response of damaged composite structures. In particular, a computational finite element model for the damaged structure is developed using a computer program as a byproduct of the analysis of progressive damage and fracture. Thus, a single computational investigation can predict progressive fracture and the resulting variation in structural properties of angleplied composites.
Region Templates: Data Representation and Management for High-Throughput Image Analysis
Pan, Tony; Kurc, Tahsin; Kong, Jun; Cooper, Lee; Klasky, Scott; Saltz, Joel
2015-01-01
We introduce a region template abstraction and framework for the efficient storage, management and processing of common data types in analysis of large datasets of high resolution images on clusters of hybrid computing nodes. The region template abstraction provides a generic container template for common data structures, such as points, arrays, regions, and object sets, within a spatial and temporal bounding box. It allows for different data management strategies and I/O implementations, while providing a homogeneous, unified interface to applications for data storage and retrieval. A region template application is represented as a hierarchical dataflow in which each computing stage may be represented as another dataflow of finer-grain tasks. The execution of the application is coordinated by a runtime system that implements optimizations for hybrid machines, including performance-aware scheduling for maximizing the utilization of computing devices and techniques to reduce the impact of data transfers between CPUs and GPUs. An experimental evaluation on a state-of-the-art hybrid cluster using a microscopy imaging application shows that the abstraction adds negligible overhead (about 3%) and achieves good scalability and high data transfer rates. Optimizations in a high speed disk based storage implementation of the abstraction to support asynchronous data transfers and computation result in an application performance gain of about 1.13×. Finally, a processing rate of 11,730 4K×4K tiles per minute was achieved for the microscopy imaging application on a cluster with 100 nodes (300 GPUs and 1,200 CPU cores). This computation rate enables studies with very large datasets. PMID:26139953
Construction of multi-functional open modulized Matlab simulation toolbox for imaging ladar system
NASA Astrophysics Data System (ADS)
Wu, Long; Zhao, Yuan; Tang, Meng; He, Jiang; Zhang, Yong
2011-06-01
Ladar system simulation is to simulate the ladar models using computer simulation technology in order to predict the performance of the ladar system. This paper presents the developments of laser imaging radar simulation for domestic and overseas studies and the studies of computer simulation on ladar system with different application requests. The LadarSim and FOI-LadarSIM simulation facilities of Utah State University and Swedish Defence Research Agency are introduced in details. This paper presents the low level of simulation scale, un-unified design and applications of domestic researches in imaging ladar system simulation, which are mostly to achieve simple function simulation based on ranging equations for ladar systems. Design of laser imaging radar simulation with open and modularized structure is proposed to design unified modules for ladar system, laser emitter, atmosphere models, target models, signal receiver, parameters setting and system controller. Unified Matlab toolbox and standard control modules have been built with regulated input and output of the functions, and the communication protocols between hardware modules. A simulation based on ICCD gain-modulated imaging ladar system for a space shuttle is made based on the toolbox. The simulation result shows that the models and parameter settings of the Matlab toolbox are able to simulate the actual detection process precisely. The unified control module and pre-defined parameter settings simplify the simulation of imaging ladar detection. Its open structures enable the toolbox to be modified for specialized requests. The modulization gives simulations flexibility.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-09-21
... Devices, Portable Music and Data Processing Devices, Computers, and Components Thereof; Institution of... communication devices, portable music and data processing devices, computers, and components thereof by reason... certain wireless communication devices, portable music and data processing devices, computers, and...
14 CFR § 1221.105 - Establishment of NASA Program Identifiers.
Code of Federal Regulations, 2014 CFR
2014-01-01
... 14 Aeronautics and Space 5 2014-01-01 2014-01-01 false Establishment of NASA Program Identifiers... THE NASA SEAL AND OTHER DEVICES, AND THE CONGRESSIONAL SPACE MEDAL OF HONOR NASA Seal, NASA Insignia, NASA Logotype, NASA Program Identifiers, NASA Flags, and the Agency's Unified Visual Communications...
PHYSICAL EDUCATION FOR BOYS, GRADES 7-12.
ERIC Educational Resources Information Center
LEBOWITZ, GORDON; AND OTHERS
TEACHERS IN THE JUNIOR AND SENIOR HIGH SCHOOLS ARE PROVIDED WITH TEACHING OUTLINES, TEACHING DEVICES, AND OTHER MATERIALS TO DEVELOP PUPILS' SKILLS, APTITUDES, AND PROFICIENCY IN PHYSICAL ACTIVITIES AND SPORTS. A GRADED AND SEQUENTIAL DEVELOPMENT OF ACTIVITIES IN A UNIFIED PROGRAM, BASED UPON THE CONCEPT OF UNIT TEACHING IN SEASONAL ACTIVITIES, IS…
NASA Astrophysics Data System (ADS)
Shen, Yanfeng
2017-04-01
This paper presents a numerical investigation of the nonlinear interactions between multimodal guided waves and delamination in composite structures. The elastodynamic wave equations for anisotropic composite laminate were formulated using an explicit Local Interaction Simulation Approach (LISA). The contact dynamics was modeled using the penalty method. In order to capture the stick-slip contact motion, a Coulomb friction law was integrated into the computation procedure. A random gap function was defined for the contact pairs to model distributed initial closures or openings to approximate the nature of rough delamination interfaces. The LISA procedure was coded using the Compute Unified Device Architecture (CUDA), which enables the highly parallelized computation on powerful graphic cards. Several guided wave modes centered at various frequencies were investigated as the incident wave. Numerical case studies of different delamination locations across the thickness were carried out. The capability of different wave modes at various frequencies to trigger the Contact Acoustic Nonlinearity (CAN) was studied. The correlation between the delamination size and the signal nonlinearity was also investigated. Furthermore, the influence from the roughness of the delamination interfaces was discussed as well. The numerical investigation shows that the nonlinear features of wave delamination interactions can enhance the evaluation capability of guided wave Structural Health Monitoring (SHM) system. This paper finishes with discussion, concluding remarks, and suggestions for future work.
Xia, Yidong; Lou, Jialin; Luo, Hong; ...
2015-02-09
Here, an OpenACC directive-based graphics processing unit (GPU) parallel scheme is presented for solving the compressible Navier–Stokes equations on 3D hybrid unstructured grids with a third-order reconstructed discontinuous Galerkin method. The developed scheme requires the minimum code intrusion and algorithm alteration for upgrading a legacy solver with the GPU computing capability at very little extra effort in programming, which leads to a unified and portable code development strategy. A face coloring algorithm is adopted to eliminate the memory contention because of the threading of internal and boundary face integrals. A number of flow problems are presented to verify the implementationmore » of the developed scheme. Timing measurements were obtained by running the resulting GPU code on one Nvidia Tesla K20c GPU card (Nvidia Corporation, Santa Clara, CA, USA) and compared with those obtained by running the equivalent Message Passing Interface (MPI) parallel CPU code on a compute node (consisting of two AMD Opteron 6128 eight-core CPUs (Advanced Micro Devices, Inc., Sunnyvale, CA, USA)). Speedup factors of up to 24× and 1.6× for the GPU code were achieved with respect to one and 16 CPU cores, respectively. The numerical results indicate that this OpenACC-based parallel scheme is an effective and extensible approach to port unstructured high-order CFD solvers to GPU computing.« less
NASA Astrophysics Data System (ADS)
Yu, Jian; Yin, Qian; Guo, Ping; Luo, A.-li
2014-09-01
This paper presents an efficient method for the extraction of astronomical spectra from two-dimensional (2D) multifibre spectrographs based on the regularized least-squares QR-factorization (LSQR) algorithm. We address two issues: we propose a modified Gaussian point spread function (PSF) for modelling the 2D PSF from multi-emission-line gas-discharge lamp images (arc images), and we develop an efficient deconvolution method to extract spectra in real circumstances. The proposed modified 2D Gaussian PSF model can fit various types of 2D PSFs, including different radial distortion angles and ellipticities. We adopt the regularized LSQR algorithm to solve the sparse linear equations constructed from the sparse convolution matrix, which we designate the deconvolution spectrum extraction method. Furthermore, we implement a parallelized LSQR algorithm based on graphics processing unit programming in the Compute Unified Device Architecture to accelerate the computational processing. Experimental results illustrate that the proposed extraction method can greatly reduce the computational cost and memory use of the deconvolution method and, consequently, increase its efficiency and practicability. In addition, the proposed extraction method has a stronger noise tolerance than other methods, such as the boxcar (aperture) extraction and profile extraction methods. Finally, we present an analysis of the sensitivity of the extraction results to the radius and full width at half-maximum of the 2D PSF.
NASA Astrophysics Data System (ADS)
Ramos, José A.; Mercère, Guillaume
2016-12-01
In this paper, we present an algorithm for identifying two-dimensional (2D) causal, recursive and separable-in-denominator (CRSD) state-space models in the Roesser form with deterministic-stochastic inputs. The algorithm implements the N4SID, PO-MOESP and CCA methods, which are well known in the literature on 1D system identification, but here we do so for the 2D CRSD Roesser model. The algorithm solves the 2D system identification problem by maintaining the constraint structure imposed by the problem (i.e. Toeplitz and Hankel) and computes the horizontal and vertical system orders, system parameter matrices and covariance matrices of a 2D CRSD Roesser model. From a computational point of view, the algorithm has been presented in a unified framework, where the user can select which of the three methods to use. Furthermore, the identification task is divided into three main parts: (1) computing the deterministic horizontal model parameters, (2) computing the deterministic vertical model parameters and (3) computing the stochastic components. Specific attention has been paid to the computation of a stabilised Kalman gain matrix and a positive real solution when required. The efficiency and robustness of the unified algorithm have been demonstrated via a thorough simulation example.
Rosen's (M,R) system in Unified Modelling Language.
Zhang, Ling; Williams, Richard A; Gatherer, Derek
2016-01-01
Robert Rosen's (M,R) system is an abstract biological network architecture that is allegedly non-computable on a Turing machine. If (M,R) is truly non-computable, there are serious implications for the modelling of large biological networks in computer software. A body of work has now accumulated addressing Rosen's claim concerning (M,R) by attempting to instantiate it in various software systems. However, a conclusive refutation has remained elusive, principally since none of the attempts to date have unambiguously avoided the critique that they have altered the properties of (M,R) in the coding process, producing merely approximate simulations of (M,R) rather than true computational models. In this paper, we use the Unified Modelling Language (UML), a diagrammatic notation standard, to express (M,R) as a system of objects having attributes, functions and relations. We believe that this instantiates (M,R) in such a way than none of the original properties of the system are corrupted in the process. Crucially, we demonstrate that (M,R) as classically represented in the relational biology literature is implicitly a UML communication diagram. Furthermore, since UML is formally compatible with object-oriented computing languages, instantiation of (M,R) in UML strongly implies its computability in object-oriented coding languages. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
ELSI: A unified software interface for Kohn–Sham electronic structure solvers
Yu, Victor Wen-zhe; Corsetti, Fabiano; Garcia, Alberto; ...
2017-09-15
Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. We here present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aimsmore » to simplify the implementation and optimal use of the different strategies, by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats, and in the future (d) recommendation of the optimal solver depending on the specific problem. As a result, comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed memory supercomputing architectures.« less
A unified model for transfer alignment at random misalignment angles based on second-order EKF
NASA Astrophysics Data System (ADS)
Cui, Xiao; Mei, Chunbo; Qin, Yongyuan; Yan, Gongmin; Liu, Zhenbo
2017-04-01
In the transfer alignment process of inertial navigation systems (INSs), the conventional linear error model based on the small misalignment angle assumption cannot be applied to large misalignment situations. Furthermore, the nonlinear model based on the large misalignment angle suffers from redundant computation with nonlinear filters. This paper presents a unified model for transfer alignment suitable for arbitrary misalignment angles. The alignment problem is transformed into an estimation of the relative attitude between the master INS (MINS) and the slave INS (SINS), by decomposing the attitude matrix of the latter. Based on the Rodriguez parameters, a unified alignment model in the inertial frame with the linear state-space equation and a second order nonlinear measurement equation are established, without making any assumptions about the misalignment angles. Furthermore, we employ the Taylor series expansions on the second-order nonlinear measurement equation to implement the second-order extended Kalman filter (EKF2). Monte-Carlo simulations demonstrate that the initial alignment can be fulfilled within 10 s, with higher accuracy and much smaller computational cost compared with the traditional unscented Kalman filter (UKF) at large misalignment angles.
ELSI: A unified software interface for Kohn-Sham electronic structure solvers
NASA Astrophysics Data System (ADS)
Yu, Victor Wen-zhe; Corsetti, Fabiano; García, Alberto; Huhn, William P.; Jacquelin, Mathias; Jia, Weile; Lange, Björn; Lin, Lin; Lu, Jianfeng; Mi, Wenhui; Seifitokaldani, Ali; Vázquez-Mayagoitia, Álvaro; Yang, Chao; Yang, Haizhao; Blum, Volker
2018-01-01
Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. We here present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aims to simplify the implementation and optimal use of the different strategies, by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats, and in the future (d) recommendation of the optimal solver depending on the specific problem. Comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed memory supercomputing architectures.
ELSI: A unified software interface for Kohn–Sham electronic structure solvers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yu, Victor Wen-zhe; Corsetti, Fabiano; Garcia, Alberto
Solving the electronic structure from a generalized or standard eigenproblem is often the bottleneck in large scale calculations based on Kohn-Sham density-functional theory. This problem must be addressed by essentially all current electronic structure codes, based on similar matrix expressions, and by high-performance computation. We here present a unified software interface, ELSI, to access different strategies that address the Kohn-Sham eigenvalue problem. Currently supported algorithms include the dense generalized eigensolver library ELPA, the orbital minimization method implemented in libOMM, and the pole expansion and selected inversion (PEXSI) approach with lower computational complexity for semilocal density functionals. The ELSI interface aimsmore » to simplify the implementation and optimal use of the different strategies, by offering (a) a unified software framework designed for the electronic structure solvers in Kohn-Sham density-functional theory; (b) reasonable default parameters for a chosen solver; (c) automatic conversion between input and internal working matrix formats, and in the future (d) recommendation of the optimal solver depending on the specific problem. As a result, comparative benchmarks are shown for system sizes up to 11,520 atoms (172,800 basis functions) on distributed memory supercomputing architectures.« less
Computation of elementary modes: a unifying framework and the new binary approach
Gagneur, Julien; Klamt, Steffen
2004-01-01
Background Metabolic pathway analysis has been recognized as a central approach to the structural analysis of metabolic networks. The concept of elementary (flux) modes provides a rigorous formalism to describe and assess pathways and has proven to be valuable for many applications. However, computing elementary modes is a hard computational task. In recent years we assisted in a multiplication of algorithms dedicated to it. We require a summarizing point of view and a continued improvement of the current methods. Results We show that computing the set of elementary modes is equivalent to computing the set of extreme rays of a convex cone. This standard mathematical representation provides a unified framework that encompasses the most prominent algorithmic methods that compute elementary modes and allows a clear comparison between them. Taking lessons from this benchmark, we here introduce a new method, the binary approach, which computes the elementary modes as binary patterns of participating reactions from which the respective stoichiometric coefficients can be computed in a post-processing step. We implemented the binary approach in FluxAnalyzer 5.1, a software that is free for academics. The binary approach decreases the memory demand up to 96% without loss of speed giving the most efficient method available for computing elementary modes to date. Conclusions The equivalence between elementary modes and extreme ray computations offers opportunities for employing tools from polyhedral computation for metabolic pathway analysis. The new binary approach introduced herein was derived from this general theoretical framework and facilitates the computation of elementary modes in considerably larger networks. PMID:15527509
ERIC Educational Resources Information Center
Giannakos, Michail N.
2014-01-01
Computer Science (CS) courses comprise both Programming and Information and Communication Technology (ICT) issues; however these two areas have substantial differences, inter alia the attitudes and beliefs of the students regarding the intended learning content. In this research, factors from the Social Cognitive Theory and Unified Theory of…
Integration of Computers into the Medical School Curriculum: An Example from a Microbiology Course.
ERIC Educational Resources Information Center
Platt, Mark W.; And Others
1994-01-01
While the use of computers has become widespread in recent years, a unified, integrated approach to their use in the medical school curriculum has not yet emerged. Describes a program at the University of New Mexico that will phase-in computerization of its curriculum beginning in the fall of 1993. (LZ)
Using Technology To Support Comprehensive Guidance Program Operations: A Variety of Strategies.
ERIC Educational Resources Information Center
Bowers, Judy
The Tucson Unified School District made a goal for 2000-2001 for all counselors to have their own computer at school. This article looks at how these computers are used to enhance the counselors' jobs. At the district level, the staff communicates with counselors through e-mail. Meeting reminders, general information, and upcoming events are…
minimUML: A Minimalist Approach to UML Diagramming for Early Computer Science Education
ERIC Educational Resources Information Center
Turner, Scott A.; Perez-Quinones, Manuel A.; Edwards, Stephen H.
2005-01-01
In introductory computer science courses, the Unified Modeling Language (UML) is commonly used to teach basic object-oriented design. However, there appears to be a lack of suitable software to support this task. Many of the available programs that support UML focus on developing code and not on enhancing learning. Programs designed for…
Using Rule-Based Computer Programming to Unify Communication Rules Research.
ERIC Educational Resources Information Center
Sanford, David L.; Roach, J. W.
This paper proposes the use of a rule-based computer programming language as a standard for the expression of rules, arguing that the adoption of a standard would enable researchers to communicate about rules in a consistent and significant way. Focusing on the formal equivalence of artificial intelligence (AI) programming to different types of…
Visual Debugging of Object-Oriented Systems With the Unified Modeling Language
2004-03-01
to be “the systematic and imaginative use of the technology of interactive computer graphics and the disciplines of graphic design , typography ... Graphics volume 23 no 6, pp893-901, 1999. [SHN98] Shneiderman, B. Designing the User Interface. Strategies for Effective Human-Computer Interaction...System Design Objectives ................................................................................ 44 3.3 System Architecture
Roux-Rouquié, Magali; Caritey, Nicolas; Gaubert, Laurent; Rosenthal-Sabroux, Camille
2004-07-01
One of the main issues in Systems Biology is to deal with semantic data integration. Previously, we examined the requirements for a reference conceptual model to guide semantic integration based on the systemic principles. In the present paper, we examine the usefulness of the Unified Modelling Language (UML) to describe and specify biological systems and processes. This makes unambiguous representations of biological systems, which would be suitable for translation into mathematical and computational formalisms, enabling analysis, simulation and prediction of these systems behaviours.
Unified Monitoring Architecture for IT and Grid Services
NASA Astrophysics Data System (ADS)
Aimar, A.; Aguado Corman, A.; Andrade, P.; Belov, S.; Delgado Fernandez, J.; Garrido Bear, B.; Georgiou, M.; Karavakis, E.; Magnoni, L.; Rama Ballesteros, R.; Riahi, H.; Rodriguez Martinez, J.; Saiz, P.; Zolnai, D.
2017-10-01
This paper provides a detailed overview of the Unified Monitoring Architecture (UMA) that aims at merging the monitoring of the CERN IT data centres and the WLCG monitoring using common and widely-adopted open source technologies such as Flume, Elasticsearch, Hadoop, Spark, Kibana, Grafana and Zeppelin. It provides insights and details on the lessons learned, explaining the work performed in order to monitor the CERN IT data centres and the WLCG computing activities such as the job processing, data access and transfers, and the status of sites and services.
Recent Theoretical Studies On Excitation and Recombination
NASA Technical Reports Server (NTRS)
Pradhan, Anil K.
2000-01-01
New advances in the theoretical treatment of atomic processes in plasmas are described. These enable not only an integrated, unified, and self-consistent treatment of important radiative and collisional processes, but also large-scale computation of atomic data with high accuracy. An extension of the R-matrix work, from excitation and photoionization to electron-ion recombination, includes a unified method that subsumes both the radiative and the di-electronic recombination processes in an ab initio manner. The extensive collisional calculations for iron and iron-peak elements under the Iron Project are also discussed.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-08
... Wireless Communication Devices, Tablet Computers, Media Players, and Televisions, and Components Thereof... devices, including wireless communication devices, tablet computers, media players, and televisions, and... wireless communication devices, tablet computers, media players, and televisions, and components thereof...
An Unified Multiscale Framework for Planar, Surface, and Curve Skeletonization.
Jalba, Andrei C; Sobiecki, Andre; Telea, Alexandru C
2016-01-01
Computing skeletons of 2D shapes, and medial surface and curve skeletons of 3D shapes, is a challenging task. In particular, there is no unified framework that detects all types of skeletons using a single model, and also produces a multiscale representation which allows to progressively simplify, or regularize, all skeleton types. In this paper, we present such a framework. We model skeleton detection and regularization by a conservative mass transport process from a shape's boundary to its surface skeleton, next to its curve skeleton, and finally to the shape center. The resulting density field can be thresholded to obtain a multiscale representation of progressively simplified surface, or curve, skeletons. We detail a numerical implementation of our framework which is demonstrably stable and has high computational efficiency. We demonstrate our framework on several complex 2D and 3D shapes.
A unified account of perceptual layering and surface appearance in terms of gamut relativity.
Vladusich, Tony; McDonnell, Mark D
2014-01-01
When we look at the world--or a graphical depiction of the world--we perceive surface materials (e.g. a ceramic black and white checkerboard) independently of variations in illumination (e.g. shading or shadow) and atmospheric media (e.g. clouds or smoke). Such percepts are partly based on the way physical surfaces and media reflect and transmit light and partly on the way the human visual system processes the complex patterns of light reaching the eye. One way to understand how these percepts arise is to assume that the visual system parses patterns of light into layered perceptual representations of surfaces, illumination and atmospheric media, one seen through another. Despite a great deal of previous experimental and modelling work on layered representation, however, a unified computational model of key perceptual demonstrations is still lacking. Here we present the first general computational model of perceptual layering and surface appearance--based on a boarder theoretical framework called gamut relativity--that is consistent with these demonstrations. The model (a) qualitatively explains striking effects of perceptual transparency, figure-ground separation and lightness, (b) quantitatively accounts for the role of stimulus- and task-driven constraints on perceptual matching performance, and (c) unifies two prominent theoretical frameworks for understanding surface appearance. The model thereby provides novel insights into the remarkable capacity of the human visual system to represent and identify surface materials, illumination and atmospheric media, which can be exploited in computer graphics applications.
A Unified Account of Perceptual Layering and Surface Appearance in Terms of Gamut Relativity
Vladusich, Tony; McDonnell, Mark D.
2014-01-01
When we look at the world—or a graphical depiction of the world—we perceive surface materials (e.g. a ceramic black and white checkerboard) independently of variations in illumination (e.g. shading or shadow) and atmospheric media (e.g. clouds or smoke). Such percepts are partly based on the way physical surfaces and media reflect and transmit light and partly on the way the human visual system processes the complex patterns of light reaching the eye. One way to understand how these percepts arise is to assume that the visual system parses patterns of light into layered perceptual representations of surfaces, illumination and atmospheric media, one seen through another. Despite a great deal of previous experimental and modelling work on layered representation, however, a unified computational model of key perceptual demonstrations is still lacking. Here we present the first general computational model of perceptual layering and surface appearance—based on a boarder theoretical framework called gamut relativity—that is consistent with these demonstrations. The model (a) qualitatively explains striking effects of perceptual transparency, figure-ground separation and lightness, (b) quantitatively accounts for the role of stimulus- and task-driven constraints on perceptual matching performance, and (c) unifies two prominent theoretical frameworks for understanding surface appearance. The model thereby provides novel insights into the remarkable capacity of the human visual system to represent and identify surface materials, illumination and atmospheric media, which can be exploited in computer graphics applications. PMID:25402466
Models and techniques for evaluating the effectiveness of aircraft computing systems
NASA Technical Reports Server (NTRS)
Meyer, J. F.
1977-01-01
Models, measures and techniques were developed for evaluating the effectiveness of aircraft computing systems. The concept of effectiveness involves aspects of system performance, reliability and worth. Specifically done was a detailed development of model hierarchy at mission, functional task, and computational task levels. An appropriate class of stochastic models was investigated which served as bottom level models in the hierarchial scheme. A unified measure of effectiveness called 'performability' was defined and formulated.
NASA Astrophysics Data System (ADS)
Pantelić, Dejan V.; Grujić, Dušan Ž.; Vasiljević, Darko M.
2014-12-01
We describe a method for dual-view biomechanical strain measurements of highly asymmetrical biological objects, like teeth or bones. By using a spherical mirror, we were able to simultaneously record a digital hologram of the object itself and the mirror image of its (otherwise invisible) rear side. A single laser beam was sufficient to illuminate both sides of the object, and to provide a reference beam. As a result, the system was mechanically very stable, enabling long exposure times (up to 2 min) without the need for vibration isolation. The setup is simple to construct and adjust, and can be used to interferometrically observe any object that is smaller than the mirror diameter. Parallel data processing on a CUDA-enabled (compute unified device architecture) graphics card was used to reconstruct digital holograms and to further correct image distortion. We used the setup to measure the deformation of a tooth due to mastication forces. The finite-element method was used to compare experimental results and theoretical predictions.
NASA Astrophysics Data System (ADS)
Peña, Adrian F.; Devine, Jack; Doronin, Alexander; Meglinski, Igor
2014-03-01
We report the use of conventional Optical Coherence Tomography (OCT) for visualization of propagation of low frequency electric field in soft biological tissues ex vivo. To increase the overall quality of the experimental images an adaptive Wiener filtering technique has been employed. Fourier domain correlation has been subsequently applied to enhance spatial resolution of images of biological tissues influenced by low frequency electric field. Image processing has been performed on Graphics Processing Units (GPUs) utilizing Compute Unified Device Architecture (CUDA) framework in the frequencydomain. The results show that variation in voltage and frequency of the applied electric field relates exponentially to the magnitude of its influence on biological tissue. The magnitude of influence is about twice more for fresh tissue samples in comparison to non-fresh ones. The obtained results suggest that OCT can be used for observation and quantitative evaluation of the electro-kinetic changes in biological tissues under different physiological conditions, functional electrical stimulation, and potentially can be used non-invasively for food quality control.
Plugin free remote visualization in the browser
NASA Astrophysics Data System (ADS)
Tamm, Georg; Slusallek, Philipp
2015-01-01
Today, users access information and rich media from anywhere using the web browser on their desktop computers, tablets or smartphones. But the web evolves beyond media delivery. Interactive graphics applications like visualization or gaming become feasible as browsers advance in the functionality they provide. However, to deliver large-scale visualization to thin clients like mobile devices, a dedicated server component is necessary. Ideally, the client runs directly within the browser the user is accustomed to, requiring no installation of a plugin or native application. In this paper, we present the state-of-the-art of technologies which enable plugin free remote rendering in the browser. Further, we describe a remote visualization system unifying these technologies. The system transfers rendering results to the client as images or as a video stream. We utilize the upcoming World Wide Web Consortium (W3C) conform Web Real-Time Communication (WebRTC) standard, and the Native Client (NaCl) technology built into Chrome, to deliver video with low latency.
Pantelić, Dejan V; Grujić, Dušan Ž; Vasiljević, Darko M
2014-12-01
We describe a method for dual-view biomechanical strain measurements of highly asymmetrical biological objects, like teeth or bones. By using a spherical mirror, we were able to simultaneously record a digital hologram of the object itself and the mirror image of its (otherwise invisible) rear side. A single laser beam was sufficient to illuminate both sides of the object, and to provide a reference beam. As a result, the system was mechanically very stable, enabling long exposure times (up to 2 min) without the need for vibration isolation. The setup is simple to construct and adjust, and can be used to interferometrically observe any object that is smaller than the mirror diameter. Parallel data processing on a CUDA-enabled (compute unified device architecture) graphics card was used to reconstruct digital holograms and to further correct image distortion. We used the setup to measure the deformation of a tooth due to mastication forces. The finite-element method was used to compare experimental results and theoretical predictions.
A new gravitational N-body simulation algorithm for investigation of cosmological chaotic advection
NASA Astrophysics Data System (ADS)
Stalder, Diego H.; Rosa, Reinaldo R.; da Silva Junior, José R.; Clua, Esteban; Ruiz, Renata S. R.; Velho, Haroldo F. Campos; Ramos, Fernando M.; Araújo, Amarísio Da S.; Conrado, Vitor G.
2012-10-01
Recently alternative approaches in cosmology seeks to explain the nature of dark matter as a direct result of the non-linear spacetime curvature due to different types of deformation potentials. In this context, a key test for this hypothesis is to examine the effects of deformation on the evolution of large scales structures. An important requirement for the fine analysis of this pure gravitational signature (without dark matter elements) is to characterize the position of a galaxy during its trajectory to the gravitational collapse of super clusters at low redshifts. In this context, each element in an gravitational N-body simulation behaves as a tracer of collapse governed by the process known as chaotic advection (or lagrangian turbulence). In order to develop a detailed study of this new approach we develop the COsmic LAgrangian TUrbulence Simulator (COLATUS) to perform gravitational N-body simulations based on Compute Unified Device Architecture (CUDA) for graphics processing units (GPUs). In this paper we report the first robust results obtained from COLATUS.
NASA Astrophysics Data System (ADS)
Liu, Tianyu; Du, Xining; Ji, Wei; Xu, X. George; Brown, Forrest B.
2014-06-01
For nuclear reactor analysis such as the neutron eigenvalue calculations, the time consuming Monte Carlo (MC) simulations can be accelerated by using graphics processing units (GPUs). However, traditional MC methods are often history-based, and their performance on GPUs is affected significantly by the thread divergence problem. In this paper we describe the development of a newly designed event-based vectorized MC algorithm for solving the neutron eigenvalue problem. The code was implemented using NVIDIA's Compute Unified Device Architecture (CUDA), and tested on a NVIDIA Tesla M2090 GPU card. We found that although the vectorized MC algorithm greatly reduces the occurrence of thread divergence thus enhancing the warp execution efficiency, the overall simulation speed is roughly ten times slower than the history-based MC code on GPUs. Profiling results suggest that the slow speed is probably due to the memory access latency caused by the large amount of global memory transactions. Possible solutions to improve the code efficiency are discussed.
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
2015-11-19
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. This database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.
Information Geometry for Landmark Shape Analysis: Unifying Shape Representation and Deformation
Peter, Adrian M.; Rangarajan, Anand
2010-01-01
Shape matching plays a prominent role in the comparison of similar structures. We present a unifying framework for shape matching that uses mixture models to couple both the shape representation and deformation. The theoretical foundation is drawn from information geometry wherein information matrices are used to establish intrinsic distances between parametric densities. When a parameterized probability density function is used to represent a landmark-based shape, the modes of deformation are automatically established through the information matrix of the density. We first show that given two shapes parameterized by Gaussian mixture models (GMMs), the well-known Fisher information matrix of the mixture model is also a Riemannian metric (actually, the Fisher-Rao Riemannian metric) and can therefore be used for computing shape geodesics. The Fisher-Rao metric has the advantage of being an intrinsic metric and invariant to reparameterization. The geodesic—computed using this metric—establishes an intrinsic deformation between the shapes, thus unifying both shape representation and deformation. A fundamental drawback of the Fisher-Rao metric is that it is not available in closed form for the GMM. Consequently, shape comparisons are computationally very expensive. To address this, we develop a new Riemannian metric based on generalized ϕ-entropy measures. In sharp contrast to the Fisher-Rao metric, the new metric is available in closed form. Geodesic computations using the new metric are considerably more efficient. We validate the performance and discriminative capabilities of these new information geometry-based metrics by pairwise matching of corpus callosum shapes. We also study the deformations of fish shapes that have various topological properties. A comprehensive comparative analysis is also provided using other landmark-based distances, including the Hausdorff distance, the Procrustes metric, landmark-based diffeomorphisms, and the bending energies of the thin-plate (TPS) and Wendland splines. PMID:19110497
Malaysian University Students' Use of Mobile Phones for Study
ERIC Educational Resources Information Center
Pullen, Darren; J-F; Swabey, Karen; Abadooz, M.; Sing, Termit Kaur Ranjit
2015-01-01
Mobile technology coupled with Internet accessibility has increased not only how we communicate but also how we might engage in learning. The ubiquity of mobile technology, such as smart phones and tablet devices, makes it a valuable tool for accessing learning resources on the Internet. The unified theory of acceptance and use of technology…
14 CFR § 1221.112 - Use of the NASA Program Identifiers.
Code of Federal Regulations, 2014 CFR
2014-01-01
... 14 Aeronautics and Space 5 2014-01-01 2014-01-01 false Use of the NASA Program Identifiers. Â... NASA SEAL AND OTHER DEVICES, AND THE CONGRESSIONAL SPACE MEDAL OF HONOR NASA Seal, NASA Insignia, NASA Logotype, NASA Program Identifiers, NASA Flags, and the Agency's Unified Visual Communications System...
Web-based monitoring and management system for integrated enterprise-wide imaging networks
NASA Astrophysics Data System (ADS)
Ma, Keith; Slik, David; Lam, Alvin; Ng, Won
2003-05-01
Mass proliferation of IP networks and the maturity of standards has enabled the creation of sophisticated image distribution networks that operate over Intranets, Extranets, Communities of Interest (CoI) and even the public Internet. Unified monitoring, provisioning and management of such systems at the application and protocol levels represent a challenge. This paper presents a web based monitoring and management tool that employs established telecom standards for the creation of an open system that enables proactive management, provisioning and monitoring of image management systems at the enterprise level and across multi-site geographically distributed deployments. Utilizing established standards including ITU-T M.3100, and web technologies such as XML/XSLT, JSP/JSTL, and J2SE, the system allows for seamless device and protocol adaptation between multiple disparate devices. The goal has been to develop a unified interface that provides network topology views, multi-level customizable alerts, real-time fault detection as well as real-time and historical reporting of all monitored resources, including network connectivity, system load, DICOM transactions and storage capacities.
Castillo, Encarnación; López-Ramos, Juan A.; Morales, Diego P.
2018-01-01
Security is a critical challenge for the effective expansion of all new emerging applications in the Internet of Things paradigm. Therefore, it is necessary to define and implement different mechanisms for guaranteeing security and privacy of data interchanged within the multiple wireless sensor networks being part of the Internet of Things. However, in this context, low power and low area are required, limiting the resources available for security and thus hindering the implementation of adequate security protocols. Group keys can save resources and communications bandwidth, but should be combined with public key cryptography to be really secure. In this paper, a compact and unified co-processor for enabling Elliptic Curve Cryptography along to Advanced Encryption Standard with low area requirements and Group-Key support is presented. The designed co-processor allows securing wireless sensor networks with independence of the communications protocols used. With an area occupancy of only 2101 LUTs over Spartan 6 devices from Xilinx, it requires 15% less area while achieving near 490% better performance when compared to cryptoprocessors with similar features in the literature. PMID:29337921
Parrilla, Luis; Castillo, Encarnación; López-Ramos, Juan A; Álvarez-Bermejo, José A; García, Antonio; Morales, Diego P
2018-01-16
Security is a critical challenge for the effective expansion of all new emerging applications in the Internet of Things paradigm. Therefore, it is necessary to define and implement different mechanisms for guaranteeing security and privacy of data interchanged within the multiple wireless sensor networks being part of the Internet of Things. However, in this context, low power and low area are required, limiting the resources available for security and thus hindering the implementation of adequate security protocols. Group keys can save resources and communications bandwidth, but should be combined with public key cryptography to be really secure. In this paper, a compact and unified co-processor for enabling Elliptic Curve Cryptography along to Advanced Encryption Standard with low area requirements and Group-Key support is presented. The designed co-processor allows securing wireless sensor networks with independence of the communications protocols used. With an area occupancy of only 2101 LUTs over Spartan 6 devices from Xilinx, it requires 15% less area while achieving near 490% better performance when compared to cryptoprocessors with similar features in the literature.
NASA Astrophysics Data System (ADS)
Li, Zhi-Hui; Peng, Ao-Ping; Zhang, Han-Xin; Yang, Jaw-Yen
2015-04-01
This article reviews rarefied gas flow computations based on nonlinear model Boltzmann equations using deterministic high-order gas-kinetic unified algorithms (GKUA) in phase space. The nonlinear Boltzmann model equations considered include the BGK model, the Shakhov model, the Ellipsoidal Statistical model and the Morse model. Several high-order gas-kinetic unified algorithms, which combine the discrete velocity ordinate method in velocity space and the compact high-order finite-difference schemes in physical space, are developed. The parallel strategies implemented with the accompanying algorithms are of equal importance. Accurate computations of rarefied gas flow problems using various kinetic models over wide ranges of Mach numbers 1.2-20 and Knudsen numbers 0.0001-5 are reported. The effects of different high resolution schemes on the flow resolution under the same discrete velocity ordinate method are studied. A conservative discrete velocity ordinate method to ensure the kinetic compatibility condition is also implemented. The present algorithms are tested for the one-dimensional unsteady shock-tube problems with various Knudsen numbers, the steady normal shock wave structures for different Mach numbers, the two-dimensional flows past a circular cylinder and a NACA 0012 airfoil to verify the present methodology and to simulate gas transport phenomena covering various flow regimes. Illustrations of large scale parallel computations of three-dimensional hypersonic rarefied flows over the reusable sphere-cone satellite and the re-entry spacecraft using almost the largest computer systems available in China are also reported. The present computed results are compared with the theoretical prediction from gas dynamics, related DSMC results, slip N-S solutions and experimental data, and good agreement can be found. The numerical experience indicates that although the direct model Boltzmann equation solver in phase space can be computationally expensive, nevertheless, the present GKUAs for kinetic model Boltzmann equations in conjunction with current available high-performance parallel computer power can provide a vital engineering tool for analyzing rarefied gas flows covering the whole range of flow regimes in aerospace engineering applications.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-08-24
... Music and Data Processing Devices, Computers, and Components Thereof; Notice of Receipt of Complaint... complaint entitled Wireless Communication Devices, Portable Music and Data Processing Devices, Computers..., portable music and data processing devices, computers, and components thereof. The complaint names as...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beltran-Valle, Omar; Pena-Gallardo, Rafael; Segundo-Ramirez, Juan
This paper presents a comparative study of the application of Flexible AC Transmission System (FACTS) devices, as Thyristor Controlled Series Capacitor (TCSC), Static Synchronous Compensator (STATCOM) and Unified Power Controller (UPFC) on congestion management and voltage support in the area of the Istmo of Tehuantepec, Oaxaca, Mexico. The present work provides an analysis about the performance of the control of active and reactive power of the FACTS controllers applied to mentioned problems in the power system.
Computer-Aided Group Problem Solving for Unified Life Cycle Engineering (ULCE)
1989-02-01
defining the problem, generating alternative solutions, evaluating alternatives, selecting alternatives, and implementing the solution. Systems...specialist in group dynamics, assists the group in formulating the problem and selecting a model framework. The analyst provides the group with computer...allocating resources, evaluating and selecting options, making judgments explicit, and analyzing dynamic systems. c. University of Rhode Island Drs. Geoffery
Design Of Computer Based Test Using The Unified Modeling Language
NASA Astrophysics Data System (ADS)
Tedyyana, Agus; Danuri; Lidyawati
2017-12-01
The Admission selection of Politeknik Negeri Bengkalis through interest and talent search (PMDK), Joint Selection of admission test for state Polytechnics (SB-UMPN) and Independent (UM-Polbeng) were conducted by using paper-based Test (PBT). Paper Based Test model has some weaknesses. They are wasting too much paper, the leaking of the questios to the public, and data manipulation of the test result. This reasearch was Aimed to create a Computer-based Test (CBT) models by using Unified Modeling Language (UML) the which consists of Use Case diagrams, Activity diagram and sequence diagrams. During the designing process of the application, it is important to pay attention on the process of giving the password for the test questions before they were shown through encryption and description process. RSA cryptography algorithm was used in this process. Then, the questions shown in the questions banks were randomized by using the Fisher-Yates Shuffle method. The network architecture used in Computer Based test application was a client-server network models and Local Area Network (LAN). The result of the design was the Computer Based Test application for admission to the selection of Politeknik Negeri Bengkalis.
NASA Technical Reports Server (NTRS)
Amiet, R. K.
1991-01-01
A unified theory for aerodynamics and noise of advanced turboprops is presented. The theory and a computer code developed for evaluation at the shielding benefits that might be expected by an aircraft wing in a wing-mounted propeller installation are presented. Several computed directivity patterns are presented to demonstrate the theory. Recently with the advent of the concept of using the wing of an aircraft for noise shielding, the case of diffraction by a surface in a flow has been given attention. The present analysis is based on the case of diffraction of no flow. By combining a Galilean and a Lorentz transform, the wave equation with a mean flow can be reduced to the ordinary equation. Allowance is also made in the analysis for the case of a swept wing. The same combination of Galilean and Lorentz transforms lead to a problem with no flow but a different sweep. The solution procedures for the cases of leading and trailing edges are basically the same. Two normalizations of the solution are given by the computer program. FORTRAN computer programs are presented with detailed documentation. The output from these programs compares favorably with the results of other investigators.
Aguilar, I; Misztal, I; Legarra, A; Tsuruta, S
2011-12-01
Genomic evaluations can be calculated using a unified procedure that combines phenotypic, pedigree and genomic information. Implementation of such a procedure requires the inverse of the relationship matrix based on pedigree and genomic relationships. The objective of this study was to investigate efficient computing options to create relationship matrices based on genomic markers and pedigree information as well as their inverses. SNP maker information was simulated for a panel of 40 K SNPs, with the number of genotyped animals up to 30 000. Matrix multiplication in the computation of the genomic relationship was by a simple 'do' loop, by two optimized versions of the loop, and by a specific matrix multiplication subroutine. Inversion was by a generalized inverse algorithm and by a LAPACK subroutine. With the most efficient choices and parallel processing, creation of matrices for 30 000 animals would take a few hours. Matrices required to implement a unified approach can be computed efficiently. Optimizations can be either by modifications of existing code or by the use of efficient automatic optimizations provided by open source or third-party libraries. © 2011 Blackwell Verlag GmbH.
A unified RANS–LES model: Computational development, accuracy and cost
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gopalan, Harish, E-mail: hgopalan@uwyo.edu; Heinz, Stefan, E-mail: heinz@uwyo.edu; Stöllinger, Michael K., E-mail: MStoell@uwyo.edu
2013-09-15
Large eddy simulation (LES) is computationally extremely expensive for the investigation of wall-bounded turbulent flows at high Reynolds numbers. A way to reduce the computational cost of LES by orders of magnitude is to combine LES equations with Reynolds-averaged Navier–Stokes (RANS) equations used in the near-wall region. A large variety of such hybrid RANS–LES methods are currently in use such that there is the question of which hybrid RANS-LES method represents the optimal approach. The properties of an optimal hybrid RANS–LES model are formulated here by taking reference to fundamental properties of fluid flow equations. It is shown that unifiedmore » RANS–LES models derived from an underlying stochastic turbulence model have the properties of optimal hybrid RANS–LES models. The rest of the paper is organized in two parts. First, a priori and a posteriori analyses of channel flow data are used to find the optimal computational formulation of the theoretically derived unified RANS–LES model and to show that this computational model, which is referred to as linear unified model (LUM), does also have all the properties of an optimal hybrid RANS–LES model. Second, a posteriori analyses of channel flow data are used to study the accuracy and cost features of the LUM. The following conclusions are obtained. (i) Compared to RANS, which require evidence for their predictions, the LUM has the significant advantage that the quality of predictions is relatively independent of the RANS model applied. (ii) Compared to LES, the significant advantage of the LUM is a cost reduction of high-Reynolds number simulations by a factor of 0.07Re{sup 0.46}. For coarse grids, the LUM has a significant accuracy advantage over corresponding LES. (iii) Compared to other usually applied hybrid RANS–LES models, it is shown that the LUM provides significantly improved predictions.« less
Nonequilibrium Energy Transfer at Nanoscale: A Unified Theory from Weak to Strong Coupling
NASA Astrophysics Data System (ADS)
Wang, Chen; Ren, Jie; Cao, Jianshu
2015-07-01
Unraveling the microscopic mechanism of quantum energy transfer across two-level systems provides crucial insights to the optimal design and potential applications of low-dimensional nanodevices. Here, we study the non-equilibrium spin-boson model as a minimal prototype and develop a fluctuation-decoupled quantum master equation approach that is valid ranging from the weak to the strong system-bath coupling regime. The exact expression of energy flux is analytically established, which dissects the energy transfer as multiple boson processes with even and odd parity. Our analysis provides a unified interpretation of several observations, including coherence-enhanced heat flux and negative differential thermal conductance. The results will have broad implications for the fine control of energy transfer in nano-structural devices.
NASA Astrophysics Data System (ADS)
Cai, Xiaohui; Liu, Yang; Ren, Zhiming
2018-06-01
Reverse-time migration (RTM) is a powerful tool for imaging geologically complex structures such as steep-dip and subsalt. However, its implementation is quite computationally expensive. Recently, as a low-cost solution, the graphic processing unit (GPU) was introduced to improve the efficiency of RTM. In the paper, we develop three ameliorative strategies to implement RTM on GPU card. First, given the high accuracy and efficiency of the adaptive optimal finite-difference (FD) method based on least squares (LS) on central processing unit (CPU), we study the optimal LS-based FD method on GPU. Second, we develop the CPU-based hybrid absorbing boundary condition (ABC) to the GPU-based one by addressing two issues of the former when introduced to GPU card: time-consuming and chaotic threads. Third, for large-scale data, the combinatorial strategy for optimal checkpointing and efficient boundary storage is introduced for the trade-off between memory and recomputation. To save the time of communication between host and disk, the portable operating system interface (POSIX) thread is utilized to create the other CPU core at the checkpoints. Applications of the three strategies on GPU with the compute unified device architecture (CUDA) programming language in RTM demonstrate their efficiency and validity.
Tsuchimoto, Masashi; Tanimura, Yoshitaka
2015-08-11
A system with many energy states coupled to a harmonic oscillator bath is considered. To study quantum non-Markovian system-bath dynamics numerically rigorously and nonperturbatively, we developed a computer code for the reduced hierarchy equations of motion (HEOM) for a graphics processor unit (GPU) that can treat the system as large as 4096 energy states. The code employs a Padé spectrum decomposition (PSD) for a construction of HEOM and the exponential integrators. Dynamics of a quantum spin glass system are studied by calculating the free induction decay signal for the cases of 3 × 2 to 3 × 4 triangular lattices with antiferromagnetic interactions. We found that spins relax faster at lower temperature due to transitions through a quantum coherent state, as represented by the off-diagonal elements of the reduced density matrix, while it has been known that the spins relax slower due to suppression of thermal activation in a classical case. The decay of the spins are qualitatively similar regardless of the lattice sizes. The pathway of spin relaxation is analyzed under a sudden temperature drop condition. The Compute Unified Device Architecture (CUDA) based source code used in the present calculations is provided as Supporting Information .
Hernández, Moisés; Guerrero, Ginés D.; Cecilia, José M.; García, José M.; Inuggi, Alberto; Jbabdi, Saad; Behrens, Timothy E. J.; Sotiropoulos, Stamatios N.
2013-01-01
With the performance of central processing units (CPUs) having effectively reached a limit, parallel processing offers an alternative for applications with high computational demands. Modern graphics processing units (GPUs) are massively parallel processors that can execute simultaneously thousands of light-weight processes. In this study, we propose and implement a parallel GPU-based design of a popular method that is used for the analysis of brain magnetic resonance imaging (MRI). More specifically, we are concerned with a model-based approach for extracting tissue structural information from diffusion-weighted (DW) MRI data. DW-MRI offers, through tractography approaches, the only way to study brain structural connectivity, non-invasively and in-vivo. We parallelise the Bayesian inference framework for the ball & stick model, as it is implemented in the tractography toolbox of the popular FSL software package (University of Oxford). For our implementation, we utilise the Compute Unified Device Architecture (CUDA) programming model. We show that the parameter estimation, performed through Markov Chain Monte Carlo (MCMC), is accelerated by at least two orders of magnitude, when comparing a single GPU with the respective sequential single-core CPU version. We also illustrate similar speed-up factors (up to 120x) when comparing a multi-GPU with a multi-CPU implementation. PMID:23658616
NASA Astrophysics Data System (ADS)
Yan, Jiawei; Ke, Youqi
2016-07-01
Electron transport properties of nanoelectronics can be significantly influenced by the inevitable and randomly distributed impurities/defects. For theoretical simulation of disordered nanoscale electronics, one is interested in both the configurationally averaged transport property and its statistical fluctuation that tells device-to-device variability induced by disorder. However, due to the lack of an effective method to do disorder averaging under the nonequilibrium condition, the important effects of disorders on electron transport remain largely unexplored or poorly understood. In this work, we report a general formalism of Green's function based nonequilibrium effective medium theory to calculate the disordered nanoelectronics. In this method, based on a generalized coherent potential approximation for the Keldysh nonequilibrium Green's function, we developed a generalized nonequilibrium vertex correction method to calculate the average of a two-Keldysh-Green's-function correlator. We obtain nine nonequilibrium vertex correction terms, as a complete family, to express the average of any two-Green's-function correlator and find they can be solved by a set of linear equations. As an important result, the averaged nonequilibrium density matrix, averaged current, disorder-induced current fluctuation, and averaged shot noise, which involve different two-Green's-function correlators, can all be derived and computed in an effective and unified way. To test the general applicability of this method, we applied it to compute the transmission coefficient and its fluctuation with a square-lattice tight-binding model and compared with the exact results and other previously proposed approximations. Our results show very good agreement with the exact results for a wide range of disorder concentrations and energies. In addition, to incorporate with density functional theory to realize first-principles quantum transport simulation, we have also derived a general form of conditionally averaged nonequilibrium Green's function for multicomponent disorders.
NASA Astrophysics Data System (ADS)
Okanoya, Kazuo
2014-09-01
The comparative computational approach of Fitch [1] attempts to renew the classical David Marr paradigm of computation, algorithm, and implementation, by introducing evolutionary view of the relationship between neural architecture and cognition. This comparative evolutionary view provides constraints useful in narrowing down the problem space for both cognition and neural mechanisms. I will provide two examples from our own studies that reinforce and extend Fitch's proposal.
NASA Astrophysics Data System (ADS)
Bogdanov, Alexander; Khramushin, Vasily
2016-02-01
The architecture of a digital computing system determines the technical foundation of a unified mathematical language for exact arithmetic-logical description of phenomena and laws of continuum mechanics for applications in fluid mechanics and theoretical physics. The deep parallelization of the computing processes results in functional programming at a new technological level, providing traceability of the computing processes with automatic application of multiscale hybrid circuits and adaptive mathematical models for the true reproduction of the fundamental laws of physics and continuum mechanics.
Computer Access. Tech Use Guide: Using Computer Technology.
ERIC Educational Resources Information Center
Council for Exceptional Children, Reston, VA. Center for Special Education Technology.
One of nine brief guides for special educators on using computer technology, this guide focuses on access including adaptations in input devices, output devices, and computer interfaces. Low technology devices include "no-technology" devices (usually modifications to existing devices), simple switches, and multiple switches. High technology input…
Choi, Jeeyae; Jansen, Kay; Coenen, Amy
In recent years, Decision Support Systems (DSSs) have been developed and used to achieve "meaningful use". One approach to developing DSSs is to translate clinical guidelines into a computer-interpretable format. However, there is no specific guideline modeling approach to translate nursing guidelines to computer-interpretable guidelines. This results in limited use of DSSs in nursing. Unified modeling language (UML) is a software writing language known to accurately represent the end-users' perspective, due to its expressive characteristics. Furthermore, standard terminology enabled DSSs have been shown to smoothly integrate into existing health information systems. In order to facilitate development of nursing DSSs, the UML was used to represent a guideline for medication management for older adults encode with the International Classification for Nursing Practice (ICNP®). The UML was found to be a useful and sufficient tool to model a nursing guideline for a DSS.
Choi, Jeeyae; Jansen, Kay; Coenen, Amy
2015-01-01
In recent years, Decision Support Systems (DSSs) have been developed and used to achieve “meaningful use”. One approach to developing DSSs is to translate clinical guidelines into a computer-interpretable format. However, there is no specific guideline modeling approach to translate nursing guidelines to computer-interpretable guidelines. This results in limited use of DSSs in nursing. Unified modeling language (UML) is a software writing language known to accurately represent the end-users’ perspective, due to its expressive characteristics. Furthermore, standard terminology enabled DSSs have been shown to smoothly integrate into existing health information systems. In order to facilitate development of nursing DSSs, the UML was used to represent a guideline for medication management for older adults encode with the International Classification for Nursing Practice (ICNP®). The UML was found to be a useful and sufficient tool to model a nursing guideline for a DSS. PMID:26958174
A unified wall function for compressible turbulence modelling
NASA Astrophysics Data System (ADS)
Ong, K. C.; Chan, A.
2018-05-01
Turbulence modelling near the wall often requires a high mesh density clustered around the wall and the first cells adjacent to the wall to be placed in the viscous sublayer. As a result, the numerical stability is constrained by the smallest cell size and hence requires high computational overhead. In the present study, a unified wall function is developed which is valid for viscous sublayer, buffer sublayer and inertial sublayer, as well as including effects of compressibility, heat transfer and pressure gradient. The resulting wall function applies to compressible turbulence modelling for both isothermal and adiabatic wall boundary conditions with the non-zero pressure gradient. Two simple wall function algorithms are implemented for practical computation of isothermal and adiabatic wall boundary conditions. The numerical results show that the wall function evaluates the wall shear stress and turbulent quantities of wall adjacent cells at wide range of non-dimensional wall distance and alleviate the number and size of cells required.
A unified convergence theory of a numerical method, and applications to the replenishment policies.
Mi, Xiang-jiang; Wang, Xing-hua
2004-01-01
In determining the replenishment policy for an inventory system, some researchers advocated that the iterative method of Newton could be applied to the derivative of the total cost function in order to get the optimal solution. But this approach requires calculation of the second derivative of the function. Avoiding this complex computation we use another iterative method presented by the second author. One of the goals of this paper is to present a unified convergence theory of this method. Then we give a numerical example to show the application of our theory.
Self-evaluation of decision-making: A general Bayesian framework for metacognitive computation.
Fleming, Stephen M; Daw, Nathaniel D
2017-01-01
People are often aware of their mistakes, and report levels of confidence in their choices that correlate with objective performance. These metacognitive assessments of decision quality are important for the guidance of behavior, particularly when external feedback is absent or sporadic. However, a computational framework that accounts for both confidence and error detection is lacking. In addition, accounts of dissociations between performance and metacognition have often relied on ad hoc assumptions, precluding a unified account of intact and impaired self-evaluation. Here we present a general Bayesian framework in which self-evaluation is cast as a "second-order" inference on a coupled but distinct decision system, computationally equivalent to inferring the performance of another actor. Second-order computation may ensue whenever there is a separation between internal states supporting decisions and confidence estimates over space and/or time. We contrast second-order computation against simpler first-order models in which the same internal state supports both decisions and confidence estimates. Through simulations we show that second-order computation provides a unified account of different types of self-evaluation often considered in separate literatures, such as confidence and error detection, and generates novel predictions about the contribution of one's own actions to metacognitive judgments. In addition, the model provides insight into why subjects' metacognition may sometimes be better or worse than task performance. We suggest that second-order computation may underpin self-evaluative judgments across a range of domains. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Self-Evaluation of Decision-Making: A General Bayesian Framework for Metacognitive Computation
2017-01-01
People are often aware of their mistakes, and report levels of confidence in their choices that correlate with objective performance. These metacognitive assessments of decision quality are important for the guidance of behavior, particularly when external feedback is absent or sporadic. However, a computational framework that accounts for both confidence and error detection is lacking. In addition, accounts of dissociations between performance and metacognition have often relied on ad hoc assumptions, precluding a unified account of intact and impaired self-evaluation. Here we present a general Bayesian framework in which self-evaluation is cast as a “second-order” inference on a coupled but distinct decision system, computationally equivalent to inferring the performance of another actor. Second-order computation may ensue whenever there is a separation between internal states supporting decisions and confidence estimates over space and/or time. We contrast second-order computation against simpler first-order models in which the same internal state supports both decisions and confidence estimates. Through simulations we show that second-order computation provides a unified account of different types of self-evaluation often considered in separate literatures, such as confidence and error detection, and generates novel predictions about the contribution of one’s own actions to metacognitive judgments. In addition, the model provides insight into why subjects’ metacognition may sometimes be better or worse than task performance. We suggest that second-order computation may underpin self-evaluative judgments across a range of domains. PMID:28004960
Integrating Xgrid into the HENP distributed computing model
NASA Astrophysics Data System (ADS)
Hajdu, L.; Kocoloski, A.; Lauret, J.; Miller, M.
2008-07-01
Modern Macintosh computers feature Xgrid, a distributed computing architecture built directly into Apple's OS X operating system. While the approach is radically different from those generally expected by the Unix based Grid infrastructures (Open Science Grid, TeraGrid, EGEE), opportunistic computing on Xgrid is nonetheless a tempting and novel way to assemble a computing cluster with a minimum of additional configuration. In fact, it requires only the default operating system and authentication to a central controller from each node. OS X also implements arbitrarily extensible metadata, allowing an instantly updated file catalog to be stored as part of the filesystem itself. The low barrier to entry allows an Xgrid cluster to grow quickly and organically. This paper and presentation will detail the steps that can be taken to make such a cluster a viable resource for HENP research computing. We will further show how to provide to users a unified job submission framework by integrating Xgrid through the STAR Unified Meta-Scheduler (SUMS), making tasks and jobs submission effortlessly at reach for those users already using the tool for traditional Grid or local cluster job submission. We will discuss additional steps that can be taken to make an Xgrid cluster a full partner in grid computing initiatives, focusing on Open Science Grid integration. MIT's Xgrid system currently supports the work of multiple research groups in the Laboratory for Nuclear Science, and has become an important tool for generating simulations and conducting data analyses at the Massachusetts Institute of Technology.
Role of the ATLAS Grid Information System (AGIS) in Distributed Data Analysis and Simulation
NASA Astrophysics Data System (ADS)
Anisenkov, A. V.
2018-03-01
In modern high-energy physics experiments, particular attention is paid to the global integration of information and computing resources into a unified system for efficient storage and processing of experimental data. Annually, the ATLAS experiment performed at the Large Hadron Collider at the European Organization for Nuclear Research (CERN) produces tens of petabytes raw data from the recording electronics and several petabytes of data from the simulation system. For processing and storage of such super-large volumes of data, the computing model of the ATLAS experiment is based on heterogeneous geographically distributed computing environment, which includes the worldwide LHC computing grid (WLCG) infrastructure and is able to meet the requirements of the experiment for processing huge data sets and provide a high degree of their accessibility (hundreds of petabytes). The paper considers the ATLAS grid information system (AGIS) used by the ATLAS collaboration to describe the topology and resources of the computing infrastructure, to configure and connect the high-level software systems of computer centers, to describe and store all possible parameters, control, configuration, and other auxiliary information required for the effective operation of the ATLAS distributed computing applications and services. The role of the AGIS system in the development of a unified description of the computing resources provided by grid sites, supercomputer centers, and cloud computing into a consistent information model for the ATLAS experiment is outlined. This approach has allowed the collaboration to extend the computing capabilities of the WLCG project and integrate the supercomputers and cloud computing platforms into the software components of the production and distributed analysis workload management system (PanDA, ATLAS).
Toward unification of taxonomy databases in a distributed computer environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kitakami, Hajime; Tateno, Yoshio; Gojobori, Takashi
1994-12-31
All the taxonomy databases constructed with the DNA databases of the international DNA data banks are powerful electronic dictionaries which aid in biological research by computer. The taxonomy databases are, however not consistently unified with a relational format. If we can achieve consistent unification of the taxonomy databases, it will be useful in comparing many research results, and investigating future research directions from existent research results. In particular, it will be useful in comparing relationships between phylogenetic trees inferred from molecular data and those constructed from morphological data. The goal of the present study is to unify the existent taxonomymore » databases and eliminate inconsistencies (errors) that are present in them. Inconsistencies occur particularly in the restructuring of the existent taxonomy databases, since classification rules for constructing the taxonomy have rapidly changed with biological advancements. A repair system is needed to remove inconsistencies in each data bank and mismatches among data banks. This paper describes a new methodology for removing both inconsistencies and mismatches from the databases on a distributed computer environment. The methodology is implemented in a relational database management system, SYBASE.« less
Efficiency limits for photoelectrochemical water-splitting
Fountaine, Katherine T.; Lewerenz, Hans Joachim; Atwater, Harry A.
2016-12-02
Theoretical limiting efficiencies have a critical role in determining technological viability and expectations for device prototypes, as evidenced by the photovoltaics community’s focus on detailed balance. However, due to their multicomponent nature, photoelectrochemical devices do not have an equivalent analogue to detailed balance, and reported theoretical efficiency limits vary depending on the assumptions made. Here we introduce a unified framework for photoelectrochemical device performance through which all previous limiting efficiencies can be understood and contextualized. Ideal and experimentally realistic limiting efficiencies are presented, and then generalized using five representative parameters—semiconductor absorption fraction, external radiative efficiency, series resistance, shunt resistance andmore » catalytic exchange current density—to account for imperfect light absorption, charge transport and catalysis. Finally, we discuss the origin of deviations between the limits discussed herein and reported water-splitting efficiencies. This analysis provides insight into the primary factors that determine device performance and a powerful handle to improve device efficiency.« less
NASA Technical Reports Server (NTRS)
Davidson, Paul; Pineda, Evan J.; Heinrich, Christian; Waas, Anthony M.
2013-01-01
The open hole tensile and compressive strengths are important design parameters in qualifying fiber reinforced laminates for a wide variety of structural applications in the aerospace industry. In this paper, we present a unified model that can be used for predicting both these strengths (tensile and compressive) using the same set of coupon level, material property data. As a prelude to the unified computational model that follows, simplified approaches, referred to as "zeroth order", "first order", etc. with increasing levels of fidelity are first presented. The results and methods presented are practical and validated against experimental data. They serve as an introductory step in establishing a virtual building block, bottom-up approach to designing future airframe structures with composite materials. The results are useful for aerospace design engineers, particularly those that deal with airframe design.
A unified approach to computer analysis and modeling of spacecraft environmental interactions
NASA Technical Reports Server (NTRS)
Katz, I.; Mandell, M. J.; Cassidy, J. J.
1986-01-01
A new, coordinated, unified approach to the development of spacecraft plasma interaction models is proposed. The objective is to eliminate the unnecessary duplicative work in order to allow researchers to concentrate on the scientific aspects. By streamlining the developmental process, the interchange between theories and experimentalists is enhanced, and the transfer of technology to the spacecraft engineering community is faster. This approach is called the UNIfied Spacecraft Interaction Model (UNISIM). UNISIM is a coordinated system of software, hardware, and specifications. It is a tool for modeling and analyzing spacecraft interactions. It will be used to design experiments, to interpret results of experiments, and to aid in future spacecraft design. It breaks a Spacecraft Ineraction analysis into several modules. Each module will perform an analysis for some physical process, using phenomenology and algorithms which are well documented and have been subject to review. This system and its characteristics are discussed.
NASA Technical Reports Server (NTRS)
Kaufman, A.; Laflen, J. H.; Lindholm, U. S.
1985-01-01
Unified constitutive material models were developed for structural analyses of aircraft gas turbine engine components with particular application to isotropic materials used for high-pressure stage turbine blades and vanes. Forms or combinations of models independently proposed by Bodner and Walker were considered. These theories combine time-dependent and time-independent aspects of inelasticity into a continuous spectrum of behavior. This is in sharp contrast to previous classical approaches that partition inelastic strain into uncoupled plastic and creep components. Predicted stress-strain responses from these models were evaluated against monotonic and cyclic test results for uniaxial specimens of two cast nickel-base alloys, B1900+Hf and Rene' 80. Previously obtained tension-torsion test results for Hastelloy X alloy were used to evaluate multiaxial stress-strain cycle predictions. The unified models, as well as appropriate algorithms for integrating the constitutive equations, were implemented in finite-element computer codes.
21 CFR 868.1730 - Oxygen uptake computer.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Oxygen uptake computer. 868.1730 Section 868.1730...) MEDICAL DEVICES ANESTHESIOLOGY DEVICES Diagnostic Devices § 868.1730 Oxygen uptake computer. (a) Identification. An oxygen uptake computer is a device intended to compute the amount of oxygen consumed by a...
21 CFR 868.1730 - Oxygen uptake computer.
Code of Federal Regulations, 2011 CFR
2011-04-01
... 21 Food and Drugs 8 2011-04-01 2011-04-01 false Oxygen uptake computer. 868.1730 Section 868.1730...) MEDICAL DEVICES ANESTHESIOLOGY DEVICES Diagnostic Devices § 868.1730 Oxygen uptake computer. (a) Identification. An oxygen uptake computer is a device intended to compute the amount of oxygen consumed by a...
21 CFR 868.1730 - Oxygen uptake computer.
Code of Federal Regulations, 2014 CFR
2014-04-01
... 21 Food and Drugs 8 2014-04-01 2014-04-01 false Oxygen uptake computer. 868.1730 Section 868.1730...) MEDICAL DEVICES ANESTHESIOLOGY DEVICES Diagnostic Devices § 868.1730 Oxygen uptake computer. (a) Identification. An oxygen uptake computer is a device intended to compute the amount of oxygen consumed by a...
21 CFR 868.1730 - Oxygen uptake computer.
Code of Federal Regulations, 2013 CFR
2013-04-01
... 21 Food and Drugs 8 2013-04-01 2013-04-01 false Oxygen uptake computer. 868.1730 Section 868.1730...) MEDICAL DEVICES ANESTHESIOLOGY DEVICES Diagnostic Devices § 868.1730 Oxygen uptake computer. (a) Identification. An oxygen uptake computer is a device intended to compute the amount of oxygen consumed by a...
21 CFR 868.1730 - Oxygen uptake computer.
Code of Federal Regulations, 2012 CFR
2012-04-01
... 21 Food and Drugs 8 2012-04-01 2012-04-01 false Oxygen uptake computer. 868.1730 Section 868.1730...) MEDICAL DEVICES ANESTHESIOLOGY DEVICES Diagnostic Devices § 868.1730 Oxygen uptake computer. (a) Identification. An oxygen uptake computer is a device intended to compute the amount of oxygen consumed by a...
NASA Astrophysics Data System (ADS)
Yan, Jiawei; Wang, Shizhuo; Xia, Ke; Ke, Youqi
2017-03-01
Because disorders are inevitable in realistic nanodevices, the capability to quantitatively simulate the disorder effects on electron transport is indispensable for quantum transport theory. Here, we report a unified and effective first-principles quantum transport method for analyzing effects of chemical or substitutional disorder on transport properties of nanoelectronics, including averaged transmission coefficient, shot noise, and disorder-induced device-to-device variability. All our theoretical formulations and numerical implementations are worked out within the framework of the tight-binding linear muffin tin orbital method. In this method, we carry out the electronic structure calculation with the density functional theory, treat the nonequilibrium statistics by the nonequilbrium Green's function method, and include the effects of multiple impurity scattering with the generalized nonequilibrium vertex correction (NVC) method in coherent potential approximation (CPA). The generalized NVC equations are solved from first principles to obtain various disorder-averaged two-Green's-function correlators. This method provides a unified way to obtain different disorder-averaged transport properties of disordered nanoelectronics from first principles. To test our implementation, we apply the method to investigate the shot noise in the disordered copper conductor, and find all our results for different disorder concentrations approach a universal Fano factor 1 /3 . As the second test, we calculate the device-to-device variability in the spin-dependent transport through the disordered Cu/Co interface and find the conductance fluctuation is very large in the minority spin channel and negligible in the majority spin channel. Our results agree well with experimental measurements and other theories. In both applications, we show the generalized nonequilibrium vertex corrections play a determinant role in electron transport simulation. Our results demonstrate the effectiveness of the first-principles generalized CPA-NVC for atomistic analysis of disordered nanoelectronics, extending the capability of quantum transport simulation.
ERIC Educational Resources Information Center
Los Angeles Unified School District, CA. Div. of Adult and Occupational Education.
This document consists of performance, computational, and communication modules used by the Working Smart workplace literacy project, a project conducted for the hotel and food industry in the Los Angeles area by a public school district and several profit and nonprofit companies. Literacy instruction was merged with job requirements of the…
Unified Method for Delay Analysis of Random Multiple Access Algorithms.
1985-08-01
packets in the first cell of the stack. The rules of the algorithm yield the following relation for the wi’s: n-1 n w 0= ; W =1; i i 9Q h I+ + zwI .+N...for computer communica- tions", in Proc. 1970 Fall Joint Computer Conf., AFIPS Press, vol. 37, 1970, pp. 281 -285. (15] N. D. Vvedenskaya and B. S
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-02
... INTERNATIONAL TRADE COMMISSION [Inv. No. 337-TA-841] Certain Computers and Computer Peripheral... after importation of certain computers and computer peripheral devices and components thereof and... computers and computer peripheral devices and components thereof and products containing the same that...
ROI-Based On-Board Compression for Hyperspectral Remote Sensing Images on GPU.
Giordano, Rossella; Guccione, Pietro
2017-05-19
In recent years, hyperspectral sensors for Earth remote sensing have become very popular. Such systems are able to provide the user with images having both spectral and spatial information. The current hyperspectral spaceborne sensors are able to capture large areas with increased spatial and spectral resolution. For this reason, the volume of acquired data needs to be reduced on board in order to avoid a low orbital duty cycle due to limited storage space. Recently, literature has focused the attention on efficient ways for on-board data compression. This topic is a challenging task due to the difficult environment (outer space) and due to the limited time, power and computing resources. Often, the hardware properties of Graphic Processing Units (GPU) have been adopted to reduce the processing time using parallel computing. The current work proposes a framework for on-board operation on a GPU, using NVIDIA's CUDA (Compute Unified Device Architecture) architecture. The algorithm aims at performing on-board compression using the target's related strategy. In detail, the main operations are: the automatic recognition of land cover types or detection of events in near real time in regions of interest (this is a user related choice) with an unsupervised classifier; the compression of specific regions with space-variant different bit rates including Principal Component Analysis (PCA), wavelet and arithmetic coding; and data volume management to the Ground Station. Experiments are provided using a real dataset taken from an AVIRIS (Airborne Visible/Infrared Imaging Spectrometer) airborne sensor in a harbor area.
2014-08-01
performance computing, smoothed particle hydrodynamics, rigid body dynamics, flexible body dynamics ARMAN PAZOUKI ∗, RADU SERBAN ∗, DAN NEGRUT ∗ A...HIGH PERFORMANCE COMPUTING APPROACH TO THE SIMULATION OF FLUID-SOLID INTERACTION PROBLEMS WITH RIGID AND FLEXIBLE COMPONENTS This work outlines a unified...are implemented to model rigid and flexible multibody dynamics. The two- way coupling of the fluid and solid phases is supported through use of
A Unified Framework for Simulating Markovian Models of Highly Dependable Systems
1989-07-01
ependability I’valuiation of Complex lault- lolerant Computing Systems. Ptreedings of the 1-.et-enth Sv~npmiun on Falult- lolerant Comnputing. Portland, Maine...New York. [12] (icis;t, R.M. and ’I’rivedi, K.S. (1983). I!Itra-Il gh Reliability Prediction for Fault-’ lolerant Computer Systems. IEE.-E Trw.%,.cions... 1998 ). Surv’ey of Software Tools for [valuating Reli- ability. A vailability, and Serviceabilitv. ACA1 Computing S urveyjs 20. 4, 227-269). [32] Meyer
The Road to a New Unified Command
2008-01-01
The Directorate of Command, Control, Communications, and Computers (C4) Systems is chartered with information architecture (including in Africa...Friday afternoon cinema presentations where a documentary or feature film covering an African historic event was played, followed by dialogue
Federal Register 2010, 2011, 2012, 2013, 2014
2013-04-26
... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-745] Certain Wireless Communication Devices, Portable Music and Data Processing Devices, Computers and Components Thereof; Commission Decision... importation of certain wireless communication devices, portable music and data processing devices, computers...
Federal Register 2010, 2011, 2012, 2013, 2014
2013-02-25
... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-745] Certain Wireless Communication Devices, Portable Music and Data Processing Devices, Computers and Components Thereof; Commission Decision... importation of certain wireless communication devices, portable music and data processing devices, computers...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-08-30
... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-745] Certain Wireless Communication Devices, Portable Music and Data Processing Devices, Computers and Components Thereof; Notice of... communication devices, portable music and data processing devices, computers and components thereof by reason of...
Crupi, Vincenzo; Nelson, Jonathan D; Meder, Björn; Cevolani, Gustavo; Tentori, Katya
2018-06-17
Searching for information is critical in many situations. In medicine, for instance, careful choice of a diagnostic test can help narrow down the range of plausible diseases that the patient might have. In a probabilistic framework, test selection is often modeled by assuming that people's goal is to reduce uncertainty about possible states of the world. In cognitive science, psychology, and medical decision making, Shannon entropy is the most prominent and most widely used model to formalize probabilistic uncertainty and the reduction thereof. However, a variety of alternative entropy metrics (Hartley, Quadratic, Tsallis, Rényi, and more) are popular in the social and the natural sciences, computer science, and philosophy of science. Particular entropy measures have been predominant in particular research areas, and it is often an open issue whether these divergences emerge from different theoretical and practical goals or are merely due to historical accident. Cutting across disciplinary boundaries, we show that several entropy and entropy reduction measures arise as special cases in a unified formalism, the Sharma-Mittal framework. Using mathematical results, computer simulations, and analyses of published behavioral data, we discuss four key questions: How do various entropy models relate to each other? What insights can be obtained by considering diverse entropy models within a unified framework? What is the psychological plausibility of different entropy models? What new questions and insights for research on human information acquisition follow? Our work provides several new pathways for theoretical and empirical research, reconciling apparently conflicting approaches and empirical findings within a comprehensive and unified information-theoretic formalism. Copyright © 2018 Cognitive Science Society, Inc.
21 CFR 870.1425 - Programmable diagnostic computer.
Code of Federal Regulations, 2013 CFR
2013-04-01
... 21 Food and Drugs 8 2013-04-01 2013-04-01 false Programmable diagnostic computer. 870.1425 Section... (CONTINUED) MEDICAL DEVICES CARDIOVASCULAR DEVICES Cardiovascular Diagnostic Devices § 870.1425 Programmable diagnostic computer. (a) Identification. A programmable diagnostic computer is a device that can be...
21 CFR 870.1425 - Programmable diagnostic computer.
Code of Federal Regulations, 2012 CFR
2012-04-01
... 21 Food and Drugs 8 2012-04-01 2012-04-01 false Programmable diagnostic computer. 870.1425 Section... (CONTINUED) MEDICAL DEVICES CARDIOVASCULAR DEVICES Cardiovascular Diagnostic Devices § 870.1425 Programmable diagnostic computer. (a) Identification. A programmable diagnostic computer is a device that can be...
21 CFR 870.1425 - Programmable diagnostic computer.
Code of Federal Regulations, 2014 CFR
2014-04-01
... 21 Food and Drugs 8 2014-04-01 2014-04-01 false Programmable diagnostic computer. 870.1425 Section... (CONTINUED) MEDICAL DEVICES CARDIOVASCULAR DEVICES Cardiovascular Diagnostic Devices § 870.1425 Programmable diagnostic computer. (a) Identification. A programmable diagnostic computer is a device that can be...
21 CFR 870.1425 - Programmable diagnostic computer.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Programmable diagnostic computer. 870.1425 Section... (CONTINUED) MEDICAL DEVICES CARDIOVASCULAR DEVICES Cardiovascular Diagnostic Devices § 870.1425 Programmable diagnostic computer. (a) Identification. A programmable diagnostic computer is a device that can be...
21 CFR 870.1425 - Programmable diagnostic computer.
Code of Federal Regulations, 2011 CFR
2011-04-01
... 21 Food and Drugs 8 2011-04-01 2011-04-01 false Programmable diagnostic computer. 870.1425 Section... (CONTINUED) MEDICAL DEVICES CARDIOVASCULAR DEVICES Cardiovascular Diagnostic Devices § 870.1425 Programmable diagnostic computer. (a) Identification. A programmable diagnostic computer is a device that can be...
Statistical fingerprinting for malware detection and classification
Prowell, Stacy J.; Rathgeb, Christopher T.
2015-09-15
A system detects malware in a computing architecture with an unknown pedigree. The system includes a first computing device having a known pedigree and operating free of malware. The first computing device executes a series of instrumented functions that, when executed, provide a statistical baseline that is representative of the time it takes the software application to run on a computing device having a known pedigree. A second computing device executes a second series of instrumented functions that, when executed, provides an actual time that is representative of the time the known software application runs on the second computing device. The system detects malware when there is a difference in execution times between the first and the second computing devices.
NASA Astrophysics Data System (ADS)
Iyyappan, I.; Ponmurugan, M.
2017-09-01
We study the performance of a three-terminal thermoelectric device such as heat engine and refrigerator with broken time-reversal symmetry by applying the unified trade-off figure of merit (\\dotΩ criterion) which accounts for both useful energy and losses. For the heat engine, we find that a thermoelectric device working under the maximum \\dotΩ criterion gives a significantly better performance than a device working at maximum power output. Within the framework of linear irreversible thermodynamics such a direct comparison is not possible for refrigerators, however, our study indicates that, for refrigerator, the maximum cooling load gives a better performance than the maximum \\dotΩ criterion for a larger asymmetry. Our results can be useful to choose a suitable optimization criterion for operating a real thermoelectric device with broken time-reversal symmetry.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-07-03
...] Guidances for Industry and Food and Drug Administration Staff: Computer-Assisted Detection Devices Applied... Clinical Performance Assessment: Considerations for Computer-Assisted Detection Devices Applied to... guidance, entitled ``Computer-Assisted Detection Devices Applied to Radiology Images and Radiology Device...
Nonequilibrium Energy Transfer at Nanoscale: A Unified Theory from Weak to Strong Coupling
Wang, Chen; Ren, Jie; Cao, Jianshu
2015-01-01
Unraveling the microscopic mechanism of quantum energy transfer across two-level systems provides crucial insights to the optimal design and potential applications of low-dimensional nanodevices. Here, we study the non-equilibrium spin-boson model as a minimal prototype and develop a fluctuation-decoupled quantum master equation approach that is valid ranging from the weak to the strong system-bath coupling regime. The exact expression of energy flux is analytically established, which dissects the energy transfer as multiple boson processes with even and odd parity. Our analysis provides a unified interpretation of several observations, including coherence-enhanced heat flux and negative differential thermal conductance. The results will have broad implications for the fine control of energy transfer in nano-structural devices. PMID:26152705
76 FR 40154 - Unified Agenda of Federal Regulatory and Deregulatory Actions-Spring 2011
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-07
... investment and innovation in applications and devices that will be used not only in the TV band but... Service (ET Docket No. 10-142). 348 Innovation in the 3060-AJ57 Broadcast Television Bands; ET Docket No... and 3060-AH23 Designation of Spectrum in the 36.0- 43.5 GHz Band. 353 Space Station Licensing 3060...
Thayer-Calder, K.; Gettelman, A.; Craig, C.; ...
2015-06-30
Most global climate models parameterize separate cloud types using separate parameterizations. This approach has several disadvantages, including obscure interactions between parameterizations and inaccurate triggering of cumulus parameterizations. Alternatively, a unified cloud parameterization uses one equation set to represent all cloud types. Such cloud types include stratiform liquid and ice cloud, shallow convective cloud, and deep convective cloud. Vital to the success of a unified parameterization is a general interface between clouds and microphysics. One such interface involves drawing Monte Carlo samples of subgrid variability of temperature, water vapor, cloud liquid, and cloud ice, and feeding the sample points into amore » microphysics scheme.This study evaluates a unified cloud parameterization and a Monte Carlo microphysics interface that has been implemented in the Community Atmosphere Model (CAM) version 5.3. Results describing the mean climate and tropical variability from global simulations are presented. The new model shows a degradation in precipitation skill but improvements in short-wave cloud forcing, liquid water path, long-wave cloud forcing, precipitable water, and tropical wave simulation. Also presented are estimations of computational expense and investigation of sensitivity to number of subcolumns.« less
Thayer-Calder, Katherine; Gettelman, A.; Craig, Cheryl; ...
2015-12-01
Most global climate models parameterize separate cloud types using separate parameterizations.This approach has several disadvantages, including obscure interactions between parameterizations and inaccurate triggering of cumulus parameterizations. Alternatively, a unified cloud parameterization uses one equation set to represent all cloud types. Such cloud types include stratiform liquid and ice cloud, shallow convective cloud, and deep convective cloud. Vital to the success of a unified parameterization is a general interface between clouds and microphysics. One such interface involves drawing Monte Carlo samples of subgrid variability of temperature, water vapor, cloud liquid, and cloud ice, and feeding the sample points into a microphysicsmore » scheme. This study evaluates a unified cloud parameterization and a Monte Carlo microphysics interface that has been implemented in the Community Atmosphere Model (CAM) version 5.3. Results describing the mean climate and tropical variability from global simulations are presented. In conclusion, the new model shows a degradation in precipitation skill but improvements in short-wave cloud forcing, liquid water path, long-wave cloud forcing, perceptible water, and tropical wave simulation. Also presented are estimations of computational expense and investigation of sensitivity to number of subcolumns.« less
Franz, A; Triesch, J
2010-12-01
The perception of the unity of objects, their permanence when out of sight, and the ability to perceive continuous object trajectories even during occlusion belong to the first and most important capacities that infants have to acquire. Despite much research a unified model of the development of these abilities is still missing. Here we make an attempt to provide such a unified model. We present a recurrent artificial neural network that learns to predict the motion of stimuli occluding each other and that develops representations of occluded object parts. It represents completely occluded, moving objects for several time steps and successfully predicts their reappearance after occlusion. This framework allows us to account for a broad range of experimental data. Specifically, the model explains how the perception of object unity develops, the role of the width of the occluders, and it also accounts for differences between data for moving and stationary stimuli. We demonstrate that these abilities can be acquired by learning to predict the sensory input. The model makes specific predictions and provides a unifying framework that has the potential to be extended to other visual event categories. Copyright © 2010 Elsevier Inc. All rights reserved.
Exascale computing and big data
Reed, Daniel A.; Dongarra, Jack
2015-06-25
Scientific discovery and engineering innovation requires unifying traditionally separated high-performance computing and big data analytics. The tools and cultures of high-performance computing and big data analytics have diverged, to the detriment of both; unification is essential to address a spectrum of major research domains. The challenges of scale tax our ability to transmit data, compute complicated functions on that data, or store a substantial part of it; new approaches are required to meet these challenges. Finally, the international nature of science demands further development of advanced computer architectures and global standards for processing data, even as international competition complicates themore » openness of the scientific process.« less
Exascale computing and big data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Reed, Daniel A.; Dongarra, Jack
Scientific discovery and engineering innovation requires unifying traditionally separated high-performance computing and big data analytics. The tools and cultures of high-performance computing and big data analytics have diverged, to the detriment of both; unification is essential to address a spectrum of major research domains. The challenges of scale tax our ability to transmit data, compute complicated functions on that data, or store a substantial part of it; new approaches are required to meet these challenges. Finally, the international nature of science demands further development of advanced computer architectures and global standards for processing data, even as international competition complicates themore » openness of the scientific process.« less
Federal Register 2010, 2011, 2012, 2013, 2014
2010-02-24
... Communications and Computer Devices and Components Thereof; Notice of Investigation AGENCY: U.S. International... States after importation of certain mobile communications and computer devices and components thereof by... importation of certain mobile communications or computer devices or components thereof that infringe one or...
Research and development activities in unified control-structure modeling and design
NASA Technical Reports Server (NTRS)
Nayak, A. P.
1985-01-01
Results of work to develop a unified control/structures modeling and design capability for large space structures modeling are presented. Recent analytical results are presented to demonstrate the significant interdependence between structural and control properties. A new design methodology is suggested in which the structure, material properties, dynamic model and control design are all optimized simultaneously. Parallel research done by other researchers is reviewed. The development of a methodology for global design optimization is recommended as a long-term goal. It is suggested that this methodology should be incorporated into computer aided engineering programs, which eventually will be supplemented by an expert system to aid design optimization.
NASA Technical Reports Server (NTRS)
Fleming, Gary A. (Technical Monitor); Schwartz, Richard J.
2004-01-01
The desire to revolutionize the aircraft design cycle from its currently lethargic pace to a fast turn-around operation enabling the optimization of non-traditional configurations is a critical challenge facing the aeronautics industry. In response, a large scale effort is underway to not only advance the state of the art in wind tunnel testing, computational modeling, and information technology, but to unify these often disparate elements into a cohesive design resource. This paper will address Seamless Data Transfer, the critical central nervous system that will enable a wide variety of varied components to work together.
Modified unified kinetic scheme for all flow regimes.
Liu, Sha; Zhong, Chengwen
2012-06-01
A modified unified kinetic scheme for the prediction of fluid flow behaviors in all flow regimes is described. The time evolution of macrovariables at the cell interface is calculated with the idea that both free transport and collision mechanisms should be considered. The time evolution of macrovariables is obtained through the conservation constraints. The time evolution of local Maxwellian distribution is obtained directly through the one-to-one mapping from the evolution of macrovariables. These improvements provide more physical realities in flow behaviors and more accurate numerical results in all flow regimes especially in the complex transition flow regime. In addition, the improvement steps introduce no extra computational complexity.
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system
DOE Office of Scientific and Technical Information (OSTI.GOV)
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database inmore » which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.« less
An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system
AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide
2015-11-19
Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database inmore » which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the development of programmatic computational methods that capitalize on integration of structural and biochemical datasets. The database can be accessed at http://ProteinDNA.hms.harvard.edu.« less
Federal Register 2010, 2011, 2012, 2013, 2014
2011-07-14
... Communications and Computer Devices and Components Thereof; Notice of Commission Determination Not To Review an... in its entirety Inv. No. 337-TA-704, Certain Mobile Communications and Computer Devices and... importation of certain mobile communications and computer devices and components thereof by reason of...
Usability of a Low-Cost Head Tracking Computer Access Method following Stroke.
Mah, Jasmine; Jutai, Jeffrey W; Finestone, Hillel; Mckee, Hilary; Carter, Melanie
2015-01-01
Assistive technology devices for computer access can facilitate social reintegration and promote independence for people who have had a stroke. This work describes the exploration of the usefulness and acceptability of a new computer access device called the Nouse™ (Nose-as-mouse). The device uses standard webcam and video recognition algorithms to map the movement of the user's nose to a computer cursor, thereby allowing hands-free computer operation. Ten participants receiving in- or outpatient stroke rehabilitation completed a series of standardized and everyday computer tasks using the Nouse™ and then completed a device usability questionnaire. Task completion rates were high (90%) for computer activities only in the absence of time constraints. Most of the participants were satisfied with ease of use (70%) and liked using the Nouse™ (60%), indicating they could resume most of their usual computer activities apart from word-processing using the device. The findings suggest that hands-free computer access devices like the Nouse™ may be an option for people who experience upper motor impairment caused by stroke and are highly motivated to resume personal computing. More research is necessary to further evaluate the effectiveness of this technology, especially in relation to other computer access assistive technology devices.
Power throttling of collections of computing elements
Bellofatto, Ralph E [Ridgefield, CT; Coteus, Paul W [Yorktown Heights, NY; Crumley, Paul G [Yorktown Heights, NY; Gara, Alan G [Mount Kidsco, NY; Giampapa, Mark E [Irvington, NY; Gooding,; Thomas, M [Rochester, MN; Haring, Rudolf A [Cortlandt Manor, NY; Megerian, Mark G [Rochester, MN; Ohmacht, Martin [Yorktown Heights, NY; Reed, Don D [Mantorville, MN; Swetz, Richard A [Mahopac, NY; Takken, Todd [Brewster, NY
2011-08-16
An apparatus and method for controlling power usage in a computer includes a plurality of computers communicating with a local control device, and a power source supplying power to the local control device and the computer. A plurality of sensors communicate with the computer for ascertaining power usage of the computer, and a system control device communicates with the computer for controlling power usage of the computer.
Models for evaluating the performability of degradable computing systems
NASA Technical Reports Server (NTRS)
Wu, L. T.
1982-01-01
Recent advances in multiprocessor technology established the need for unified methods to evaluate computing systems performance and reliability. In response to this modeling need, a general modeling framework that permits the modeling, analysis and evaluation of degradable computing systems is considered. Within this framework, several user oriented performance variables are identified and shown to be proper generalizations of the traditional notions of system performance and reliability. Furthermore, a time varying version of the model is developed to generalize the traditional fault tree reliability evaluation methods of phased missions.
FREQ: A computational package for multivariable system loop-shaping procedures
NASA Technical Reports Server (NTRS)
Giesy, Daniel P.; Armstrong, Ernest S.
1989-01-01
Many approaches in the field of linear, multivariable time-invariant systems analysis and controller synthesis employ loop-sharing procedures wherein design parameters are chosen to shape frequency-response singular value plots of selected transfer matrices. A software package, FREQ, is documented for computing within on unified framework many of the most used multivariable transfer matrices for both continuous and discrete systems. The matrices are evaluated at user-selected frequency-response values, and singular values against frequency. Example computations are presented to demonstrate the use of the FREQ code.
One approach for evaluating the Distributed Computing Design System (DCDS)
NASA Technical Reports Server (NTRS)
Ellis, J. T.
1985-01-01
The Distributed Computer Design System (DCDS) provides an integrated environment to support the life cycle of developing real-time distributed computing systems. The primary focus of DCDS is to significantly increase system reliability and software development productivity, and to minimize schedule and cost risk. DCDS consists of integrated methodologies, languages, and tools to support the life cycle of developing distributed software and systems. Smooth and well-defined transistions from phase to phase, language to language, and tool to tool provide a unique and unified environment. An approach to evaluating DCDS highlights its benefits.
Unified algorithm of cone optics to compute solar flux on central receiver
NASA Astrophysics Data System (ADS)
Grigoriev, Victor; Corsi, Clotilde
2017-06-01
Analytical algorithms to compute flux distribution on central receiver are considered as a faster alternative to ray tracing. They have quite too many modifications, with HFLCAL and UNIZAR being the most recognized and verified. In this work, a generalized algorithm is presented which is valid for arbitrary sun shape of radial symmetry. Heliostat mirrors can have a nonrectangular profile, and the effects of shading and blocking, strong defocusing and astigmatism can be taken into account. The algorithm is suitable for parallel computing and can benefit from hardware acceleration of polygon texturing.
Jena, Manas Kumar; Samantaray, Subhransu Ranjan
2016-01-01
This paper presents a data-mining-based intelligent differential relaying scheme for transmission lines, including flexible ac transmission system device, such as unified power flow controller (UPFC) and wind farms. Initially, the current and voltage signals are processed through extended Kalman filter phasor measurement unit for phasor estimation, and 21 potential features are computed at both ends of the line. Once the features are extracted at both ends, the corresponding differential features are derived. These differential features are fed to a data-mining model known as decision tree (DT) to provide the final relaying decision. The proposed technique has been extensively tested for single-circuit transmission line, including UPFC and wind farms with in-feed, double-circuit line with UPFC on one line and wind farm as one of the substations with wide variations in operating parameters. The test results obtained from simulation as well as in real-time digital simulator testing indicate that the DT-based intelligent differential relaying scheme is highly reliable and accurate with a response time of 2.25 cycles from the fault inception.
Real-time blood flow visualization using the graphics processing unit
NASA Astrophysics Data System (ADS)
Yang, Owen; Cuccia, David; Choi, Bernard
2011-01-01
Laser speckle imaging (LSI) is a technique in which coherent light incident on a surface produces a reflected speckle pattern that is related to the underlying movement of optical scatterers, such as red blood cells, indicating blood flow. Image-processing algorithms can be applied to produce speckle flow index (SFI) maps of relative blood flow. We present a novel algorithm that employs the NVIDIA Compute Unified Device Architecture (CUDA) platform to perform laser speckle image processing on the graphics processing unit. Software written in C was integrated with CUDA and integrated into a LabVIEW Virtual Instrument (VI) that is interfaced with a monochrome CCD camera able to acquire high-resolution raw speckle images at nearly 10 fps. With the CUDA code integrated into the LabVIEW VI, the processing and display of SFI images were performed also at ~10 fps. We present three video examples depicting real-time flow imaging during a reactive hyperemia maneuver, with fluid flow through an in vitro phantom, and a demonstration of real-time LSI during laser surgery of a port wine stain birthmark.
Real-time blood flow visualization using the graphics processing unit
Yang, Owen; Cuccia, David; Choi, Bernard
2011-01-01
Laser speckle imaging (LSI) is a technique in which coherent light incident on a surface produces a reflected speckle pattern that is related to the underlying movement of optical scatterers, such as red blood cells, indicating blood flow. Image-processing algorithms can be applied to produce speckle flow index (SFI) maps of relative blood flow. We present a novel algorithm that employs the NVIDIA Compute Unified Device Architecture (CUDA) platform to perform laser speckle image processing on the graphics processing unit. Software written in C was integrated with CUDA and integrated into a LabVIEW Virtual Instrument (VI) that is interfaced with a monochrome CCD camera able to acquire high-resolution raw speckle images at nearly 10 fps. With the CUDA code integrated into the LabVIEW VI, the processing and display of SFI images were performed also at ∼10 fps. We present three video examples depicting real-time flow imaging during a reactive hyperemia maneuver, with fluid flow through an in vitro phantom, and a demonstration of real-time LSI during laser surgery of a port wine stain birthmark. PMID:21280915
NASA Astrophysics Data System (ADS)
Hou, Zhenlong; Huang, Danian
2017-09-01
In this paper, we make a study on the inversion of probability tomography (IPT) with gravity gradiometry data at first. The space resolution of the results is improved by multi-tensor joint inversion, depth weighting matrix and the other methods. Aiming at solving the problems brought by the big data in the exploration, we present the parallel algorithm and the performance analysis combining Compute Unified Device Architecture (CUDA) with Open Multi-Processing (OpenMP) based on Graphics Processing Unit (GPU) accelerating. In the test of the synthetic model and real data from Vinton Dome, we get the improved results. It is also proved that the improved inversion algorithm is effective and feasible. The performance of parallel algorithm we designed is better than the other ones with CUDA. The maximum speedup could be more than 200. In the performance analysis, multi-GPU speedup and multi-GPU efficiency are applied to analyze the scalability of the multi-GPU programs. The designed parallel algorithm is demonstrated to be able to process larger scale of data and the new analysis method is practical.
Characterization of PET preforms using spectral domain optical coherence tomography
NASA Astrophysics Data System (ADS)
Hosseiny, Hamid; Ferreira, Manuel João.; Martins, Teresa; Carmelo Rosa, Carla
2013-11-01
Polyethylene terephthalate (PET) preforms are massively produced nowadays with the purpose of producing food and beverages packaging and liquid containers. Some varieties of these preforms are produced as multilayer structures, where very thin inner film(s) act as a barrier for nutrients leakage. The knowledge of the thickness of this thin inner layer is important in the production line. The quality control of preforms production requires a fast approach and normally the thickness control is performed by destructive means out of the production line. A spectral domain optical coherence tomography (SD-OCT) method was proposed to examine the thin layers in real time. This paper describes a nondestructive approach and all required signal processing steps to characterize the thin inner layers and also to improve the imaging speed and the signal to noise ratio. The algorithm was developed by using graphics processing unit (GPU) with computer unified device architecture (CUDA). This GPU-accelerated white light interferometry technique nondestructively assesses the samples and has high imaging speed advantage, overcoming the bottlenecks in PET performs quality control.
Observability of Automata Networks: Fixed and Switching Cases.
Li, Rui; Hong, Yiguang; Wang, Xingyuan
2018-04-01
Automata networks are a class of fully discrete dynamical systems, which have received considerable interest in various different areas. This brief addresses the observability of automata networks and switched automata networks in a unified framework, and proposes simple necessary and sufficient conditions for observability. The results are achieved by employing methods from symbolic computation, and are suited for implementation using computer algebra systems. Several examples are presented to demonstrate the application of the results.
Nurse-computer performance. Considerations for the nurse administrator.
Mills, M E; Staggers, N
1994-11-01
Regulatory reporting requirements and economic pressures to create a unified healthcare database are leading to the development of a fully computerized patient record. Nursing staff members will be responsible increasingly for using this technology, yet little is known about the interaction effect of staff characteristics and computer screen design on on-line accuracy and speed. In examining these issues, new considerations are raised for nurse administrators interested in facilitating staff use of clinical information systems.
Sandow, M J; Fisher, T J; Howard, C Q; Papas, S
2014-05-01
This study was part of a larger project to develop a (kinetic) theory of carpal motion based on computationally derived isometric constraints. Three-dimensional models were created from computed tomography scans of the wrists of ten normal subjects and carpal spatial relationships at physiological motion extremes were assessed. Specific points on the surface of the various carpal bones and the radius that remained isometric through range of movement were identified. Analysis of the isometric constraints and intercarpal motion suggests that the carpus functions as a stable central column (lunate-capitate-hamate-trapezoid-trapezium) with a supporting lateral column (scaphoid), which behaves as a 'two gear four bar linkage'. The triquetrum functions as an ulnar translation restraint, as well as controlling lunate flexion. The 'trapezoid'-shaped trapezoid places the trapezium anterior to the transverse plane of the radius and ulna, and thus rotates the principal axis of the central column to correspond to that used in the 'dart thrower's motion'. This study presents a forward kinematic analysis of the carpus that provides the basis for the development of a unifying kinetic theory of wrist motion based on isometric constraints and rules-based motion.
Verifying Stability of Dynamic Soft-Computing Systems
NASA Technical Reports Server (NTRS)
Wen, Wu; Napolitano, Marcello; Callahan, John
1997-01-01
Soft computing is a general term for algorithms that learn from human knowledge and mimic human skills. Example of such algorithms are fuzzy inference systems and neural networks. Many applications, especially in control engineering, have demonstrated their appropriateness in building intelligent systems that are flexible and robust. Although recent research have shown that certain class of neuro-fuzzy controllers can be proven bounded and stable, they are implementation dependent and difficult to apply to the design and validation process. Many practitioners adopt the trial and error approach for system validation or resort to exhaustive testing using prototypes. In this paper, we describe our on-going research towards establishing necessary theoretic foundation as well as building practical tools for the verification and validation of soft-computing systems. A unified model for general neuro-fuzzy system is adopted. Classic non-linear system control theory and recent results of its applications to neuro-fuzzy systems are incorporated and applied to the unified model. It is hoped that general tools can be developed to help the designer to visualize and manipulate the regions of stability and boundedness, much the same way Bode plots and Root locus plots have helped conventional control design and validation.
FACTS Devices Cost Recovery During Congestion Management in Deregulated Electricity Markets
NASA Astrophysics Data System (ADS)
Sharma, Ashwani Kumar; Mittapalli, Ram Kumar; Pal, Yash
2016-09-01
In future electricity markets, flexible alternating current transmission system (FACTS) devices will play key role for providing ancillary services. Since huge cost is involved for the FACTS devices placement in the power system, the cost invested has to be recovered in their life time for the replacement of these devices. The FACTS devices in future electricity markets can act as an ancillary services provider and have to be remunerated. The main contributions of the paper are: (1) investment recovery of FACTS devices during congestion management such as static VAR compensator and unified power flow controller along with thyristor controlled series compensator using non-linear bid curves, (2) the impact of ZIP load model on the FACTS cost recovery of the devices, (3) the comparison of results obtained without ZIP load model for both pool and hybrid market model, (4) secure bilateral transactions incorporation in hybrid market model. An optimal power flow based approach has been developed for maximizing social welfare including FACTS devices cost. The optimal placement of the FACTS devices have been obtained based on maximum social welfare. The results have been obtained for both pool and hybrid electricity market for IEEE 24-bus RTS.
Teachers and Electronic Mail: Networking on the Network.
ERIC Educational Resources Information Center
Broholm, John R.; Aust, Ronald
1994-01-01
Describes a study that examined the communication patterns of teachers who used UNITE (Unified Network for Informatics in Teacher Education), an electronic mail system designed to encourage curricular collaboration and resource sharing. Highlights include computer-mediated communication, use of UNITE by librarians, and recommendations for…
A remote sensing computer-assisted learning tool developed using the unified modeling language
NASA Astrophysics Data System (ADS)
Friedrich, J.; Karslioglu, M. O.
The goal of this work has been to create an easy-to-use and simple-to-make learning tool for remote sensing at an introductory level. Many students struggle to comprehend what seems to be a very basic knowledge of digital images, image processing and image arithmetic, for example. Because professional programs are generally too complex and overwhelming for beginners and often not tailored to the specific needs of a course regarding functionality, a computer-assisted learning (CAL) program was developed based on the unified modeling language (UML), the present standard for object-oriented (OO) system development. A major advantage of this approach is an easier transition from modeling to coding of such an application, if modern UML tools are being used. After introducing the constructed UML model, its implementation is briefly described followed by a series of learning exercises. They illustrate how the resulting CAL tool supports students taking an introductory course in remote sensing at the author's institution.
The Unified Medical Language System
Humphreys, Betsy L.; Lindberg, Donald A. B.; Schoolman, Harold M.; Barnett, G. Octo
1998-01-01
In 1986, the National Library of Medicine (NLM) assembled a large multidisciplinary, multisite team to work on the Unified Medical Language System (UMLS), a collaborative research project aimed at reducing fundamental barriers to the application of computers to medicine. Beyond its tangible products, the UMLS Knowledge Sources, and its influence on the field of informatics, the UMLS project is an interesting case study in collaborative research and development. It illustrates the strengths and challenges of substantive collaboration among widely distributed research groups. Over the past decade, advances in computing and communications have minimized the technical difficulties associated with UMLS collaboration and also facilitated the development, dissemination, and use of the UMLS Knowledge Sources. The spread of the World Wide Web has increased the visibility of the information access problems caused by multiple vocabularies and many information sources which are the focus of UMLS work. The time is propitious for building on UMLS accomplishments and making more progress on the informatics research issues first highlighted by the UMLS project more than 10 years ago. PMID:9452981
The Unified Medical Language System: an informatics research collaboration.
Humphreys, B L; Lindberg, D A; Schoolman, H M; Barnett, G O
1998-01-01
In 1986, the National Library of Medicine (NLM) assembled a large multidisciplinary, multisite team to work on the Unified Medical Language System (UMLS), a collaborative research project aimed at reducing fundamental barriers to the application of computers to medicine. Beyond its tangible products, the UMLS Knowledge Sources, and its influence on the field of informatics, the UMLS project is an interesting case study in collaborative research and development. It illustrates the strengths and challenges of substantive collaboration among widely distributed research groups. Over the past decade, advances in computing and communications have minimized the technical difficulties associated with UMLS collaboration and also facilitated the development, dissemination, and use of the UMLS Knowledge Sources. The spread of the World Wide Web has increased the visibility of the information access problems caused by multiple vocabularies and many information sources which are the focus of UMLS work. The time is propitious for building on UMLS accomplishments and making more progress on the informatics research issues first highlighted by the UMLS project more than 10 years ago.
NASA Astrophysics Data System (ADS)
Arendt, V.; Shalchi, A.
2018-06-01
We explore numerically the transport of energetic particles in a turbulent magnetic field configuration. A test-particle code is employed to compute running diffusion coefficients as well as particle distribution functions in the different directions of space. Our numerical findings are compared with models commonly used in diffusion theory such as Gaussian distribution functions and solutions of the cosmic ray Fokker-Planck equation. Furthermore, we compare the running diffusion coefficients across the mean magnetic field with solutions obtained from the time-dependent version of the unified non-linear transport theory. In most cases we find that particle distribution functions are indeed of Gaussian form as long as a two-component turbulence model is employed. For turbulence setups with reduced dimensionality, however, the Gaussian distribution can no longer be obtained. It is also shown that the unified non-linear transport theory agrees with simulated perpendicular diffusion coefficients as long as the pure two-dimensional model is excluded.
MELLO: Medical lifelog ontology for data terms from self-tracking and lifelog devices.
Kim, Hye Hyeon; Lee, Soo Youn; Baik, Su Youn; Kim, Ju Han
2015-12-01
The increasing use of health self-tracking devices is making the integration of heterogeneous data and shared decision-making more challenging. Computational analysis of lifelog data has been hampered by the lack of semantic and syntactic consistency among lifelog terms and related ontologies. Medical lifelog ontology (MELLO) was developed by identifying lifelog concepts and relationships between concepts, and it provides clear definitions by following ontology development methods. MELLO aims to support the classification and semantic mapping of lifelog data from diverse health self-tracking devices. MELLO was developed using the General Formal Ontology method with a manual iterative process comprising five steps: (1) defining the scope of lifelog data, (2) identifying lifelog concepts, (3) assigning relationships among MELLO concepts, (4) developing MELLO properties (e.g., synonyms, preferred terms, and definitions) for each MELLO concept, and (5) evaluating representative layers of the ontology content. An evaluation was performed by classifying 11 devices into 3 classes by subjects, and performing pairwise comparisons of lifelog terms among 5 devices in each class as measured using the Jaccard similarity index. MELLO represents a comprehensive knowledge base of 1998 lifelog concepts, with 4996 synonyms for 1211 (61%) concepts and 1395 definitions for 926 (46%) concepts. The MELLO Browser and MELLO Mapper provide convenient access and annotating non-standard proprietary terms with MELLO (http://mello.snubi.org/). MELLO covers 88.1% of lifelog terms from 11 health self-tracking devices and uses simple string matching to match semantically similar terms provided by various devices that are not yet integrated. The results from the comparisons of Jaccard similarities between simple string matching and MELLO matching revealed increases of 2.5, 2.2, and 5.7 folds for physical activity,body measure, and sleep classes, respectively. MELLO is the first ontology for representing health-related lifelog data with rich contents including definitions, synonyms, and semantic relationships. MELLO fills the semantic gap between heterogeneous lifelog terms that are generated by diverse health self-tracking devices. The unified representation of lifelog terms facilitated by MELLO can help describe an individual's lifestyle and environmental factors, which can be included with user-generated data for clinical research and thereby enhance data integration and sharing. Copyright © 2015. Published by Elsevier Ireland Ltd.
Federal Register 2010, 2011, 2012, 2013, 2014
2013-06-10
..., Including Wireless Communication Devices, Portable Music and Data Processing Devices, and Tablet Computers... importing wireless communication devices, portable music and data processing devices, and tablet computers... certain electronic devices, including wireless communication devices, portable music and data processing...
Unified Planetary Coordinates System: A Searchable Database of Geodetic Information
NASA Technical Reports Server (NTRS)
Becker, K. J.a; Gaddis, L. R.; Soderblom, L. A.; Kirk, R. L.; Archinal, B. A.; Johnson, J. R.; Anderson, J. A.; Bowman-Cisneros, E.; LaVoie, S.; McAuley, M.
2005-01-01
Over the past 40 years, an enormous quantity of orbital remote sensing data has been collected for Mars from many missions and instruments. Unfortunately these datasets currently exist in a wide range of disparate coordinate systems, making it extremely difficult for the scientific community to easily correlate, combine, and compare data from different Mars missions and instruments. As part of our work for the PDS Imaging Node and on behalf of the USGS Astrogeology Team, we are working to solve this problem and to provide the NASA scientific research community with easy access to Mars orbital data in a unified, consistent coordinate system along with a wide variety of other key geometric variables. The Unified Planetary Coordinates (UPC) system is comprised of two main elements: (1) a database containing Mars orbital remote sensing data computed using a uniform coordinate system, and (2) a process by which continual maintainance and updates to the contents of the database are performed.
Unified tensor model for space-frequency spreading-multiplexing (SFSM) MIMO communication systems
NASA Astrophysics Data System (ADS)
de Almeida, André LF; Favier, Gérard
2013-12-01
This paper presents a unified tensor model for space-frequency spreading-multiplexing (SFSM) multiple-input multiple-output (MIMO) wireless communication systems that combine space- and frequency-domain spreadings, followed by a space-frequency multiplexing. Spreading across space (transmit antennas) and frequency (subcarriers) adds resilience against deep channel fades and provides space and frequency diversities, while orthogonal space-frequency multiplexing enables multi-stream transmission. We adopt a tensor-based formulation for the proposed SFSM MIMO system that incorporates space, frequency, time, and code dimensions by means of the parallel factor model. The developed SFSM tensor model unifies the tensorial formulation of some existing multiple-access/multicarrier MIMO signaling schemes as special cases, while revealing interesting tradeoffs due to combined space, frequency, and time diversities which are of practical relevance for joint symbol-channel-code estimation. The performance of the proposed SFSM MIMO system using either a zero forcing receiver or a semi-blind tensor-based receiver is illustrated by means of computer simulation results under realistic channel and system parameters.
The Living Heart Project: A robust and integrative simulator for human heart function.
Baillargeon, Brian; Rebelo, Nuno; Fox, David D; Taylor, Robert L; Kuhl, Ellen
2014-11-01
The heart is not only our most vital, but also our most complex organ: Precisely controlled by the interplay of electrical and mechanical fields, it consists of four chambers and four valves, which act in concert to regulate its filling, ejection, and overall pump function. While numerous computational models exist to study either the electrical or the mechanical response of its individual chambers, the integrative electro-mechanical response of the whole heart remains poorly understood. Here we present a proof-of-concept simulator for a four-chamber human heart model created from computer topography and magnetic resonance images. We illustrate the governing equations of excitation-contraction coupling and discretize them using a single, unified finite element environment. To illustrate the basic features of our model, we visualize the electrical potential and the mechanical deformation across the human heart throughout its cardiac cycle. To compare our simulation against common metrics of cardiac function, we extract the pressure-volume relationship and show that it agrees well with clinical observations. Our prototype model allows us to explore and understand the key features, physics, and technologies to create an integrative, predictive model of the living human heart. Ultimately, our simulator will open opportunities to probe landscapes of clinical parameters, and guide device design and treatment planning in cardiac diseases such as stenosis, regurgitation, or prolapse of the aortic, pulmonary, tricuspid, or mitral valve.
Eshraghian, Jason K; Baek, Seungbum; Kim, Jun-Ho; Iannella, Nicolangelo; Cho, Kyoungrok; Goo, Yong Sook; Iu, Herbert H C; Kang, Sung-Mo; Eshraghian, Kamran
2018-02-13
Existing computational models of the retina often compromise between the biophysical accuracy and a hardware-adaptable methodology of implementation. When compared to the current modes of vision restoration, algorithmic models often contain a greater correlation between stimuli and the affected neural network, but lack physical hardware practicality. Thus, if the present processing methods are adapted to complement very-large-scale circuit design techniques, it is anticipated that it will engender a more feasible approach to the physical construction of the artificial retina. The computational model presented in this research serves to provide a fast and accurate predictive model of the retina, a deeper understanding of neural responses to visual stimulation, and an architecture that can realistically be transformed into a hardware device. Traditionally, implicit (or semi-implicit) ordinary differential equations (OES) have been used for optimal speed and accuracy. We present a novel approach that requires the effective integration of different dynamical time scales within a unified framework of neural responses, where the rod, cone, amacrine, bipolar, and ganglion cells correspond to the implemented pathways. Furthermore, we show that adopting numerical integration can both accelerate retinal pathway simulations by more than 50% when compared with traditional ODE solvers in some cases, and prove to be a more realizable solution for the hardware implementation of predictive retinal models.
High-performance computing for airborne applications
DOE Office of Scientific and Technical Information (OSTI.GOV)
Quinn, Heather M; Manuzzato, Andrea; Fairbanks, Tom
2010-06-28
Recently, there has been attempts to move common satellite tasks to unmanned aerial vehicles (UAVs). UAVs are significantly cheaper to buy than satellites and easier to deploy on an as-needed basis. The more benign radiation environment also allows for an aggressive adoption of state-of-the-art commercial computational devices, which increases the amount of data that can be collected. There are a number of commercial computing devices currently available that are well-suited to high-performance computing. These devices range from specialized computational devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), to traditional computing platforms, such as microprocessors. Even thoughmore » the radiation environment is relatively benign, these devices could be susceptible to single-event effects. In this paper, we will present radiation data for high-performance computing devices in a accelerated neutron environment. These devices include a multi-core digital signal processor, two field-programmable gate arrays, and a microprocessor. From these results, we found that all of these devices are suitable for many airplane environments without reliability problems.« less
Fast Particle Methods for Multiscale Phenomena Simulations
NASA Technical Reports Server (NTRS)
Koumoutsakos, P.; Wray, A.; Shariff, K.; Pohorille, Andrew
2000-01-01
We are developing particle methods oriented at improving computational modeling capabilities of multiscale physical phenomena in : (i) high Reynolds number unsteady vortical flows, (ii) particle laden and interfacial flows, (iii)molecular dynamics studies of nanoscale droplets and studies of the structure, functions, and evolution of the earliest living cell. The unifying computational approach involves particle methods implemented in parallel computer architectures. The inherent adaptivity, robustness and efficiency of particle methods makes them a multidisciplinary computational tool capable of bridging the gap of micro-scale and continuum flow simulations. Using efficient tree data structures, multipole expansion algorithms, and improved particle-grid interpolation, particle methods allow for simulations using millions of computational elements, making possible the resolution of a wide range of length and time scales of these important physical phenomena.The current challenges in these simulations are in : [i] the proper formulation of particle methods in the molecular and continuous level for the discretization of the governing equations [ii] the resolution of the wide range of time and length scales governing the phenomena under investigation. [iii] the minimization of numerical artifacts that may interfere with the physics of the systems under consideration. [iv] the parallelization of processes such as tree traversal and grid-particle interpolations We are conducting simulations using vortex methods, molecular dynamics and smooth particle hydrodynamics, exploiting their unifying concepts such as : the solution of the N-body problem in parallel computers, highly accurate particle-particle and grid-particle interpolations, parallel FFT's and the formulation of processes such as diffusion in the context of particle methods. This approach enables us to transcend among seemingly unrelated areas of research.
NASA Astrophysics Data System (ADS)
Ha, Sanghyun; Park, Junshin; You, Donghyun
2018-01-01
Utility of the computational power of Graphics Processing Units (GPUs) is elaborated for solutions of incompressible Navier-Stokes equations which are integrated using a semi-implicit fractional-step method. The Alternating Direction Implicit (ADI) and the Fourier-transform-based direct solution methods used in the semi-implicit fractional-step method take advantage of multiple tridiagonal matrices whose inversion is known as the major bottleneck for acceleration on a typical multi-core machine. A novel implementation of the semi-implicit fractional-step method designed for GPU acceleration of the incompressible Navier-Stokes equations is presented. Aspects of the programing model of Compute Unified Device Architecture (CUDA), which are critical to the bandwidth-bound nature of the present method are discussed in detail. A data layout for efficient use of CUDA libraries is proposed for acceleration of tridiagonal matrix inversion and fast Fourier transform. OpenMP is employed for concurrent collection of turbulence statistics on a CPU while the Navier-Stokes equations are computed on a GPU. Performance of the present method using CUDA is assessed by comparing the speed of solving three tridiagonal matrices using ADI with the speed of solving one heptadiagonal matrix using a conjugate gradient method. An overall speedup of 20 times is achieved using a Tesla K40 GPU in comparison with a single-core Xeon E5-2660 v3 CPU in simulations of turbulent boundary-layer flow over a flat plate conducted on over 134 million grids. Enhanced performance of 48 times speedup is reached for the same problem using a Tesla P100 GPU.
NASA Astrophysics Data System (ADS)
Xu, Shigang; Liu, Yang
2018-03-01
The conventional pseudo-acoustic wave equations (PWEs) in arbitrary orthorhombic anisotropic (OA) media usually have coupled P- and SV-wave modes. These coupled equations may introduce strong SV-wave artifacts and numerical instabilities in P-wave simulation results and reverse-time migration (RTM) profiles. However, pure acoustic wave equations (PAWEs) completely decouple the P-wave component from the full elastic wavefield and naturally solve all the aforementioned problems. In this article, we present a novel PAWE in arbitrary OA media and compare it with the conventional coupled PWEs. Through decomposing the solution of the corresponding eigenvalue equation for the original PWE into an ellipsoidal differential operator (EDO) and an ellipsoidal scalar operator (ESO), the new PAWE in time-space domain is constructed by applying the combination of these two solvable operators and can effectively describe P-wave features in arbitrary OA media. Furthermore, we adopt the optimal finite-difference method (FDM) to solve the newly derived PAWE. In addition, the three-dimensional (3D) hybrid absorbing boundary condition (HABC) with some reasonable modifications is developed for reducing artificial edge reflections in anisotropic media. To improve computational efficiency in 3D case, we adopt graphic processing unit (GPU) with Compute Unified Device Architecture (CUDA) instead of traditional central processing unit (CPU) architecture. Several numerical experiments for arbitrary OA models confirm that the proposed schemes can produce pure, stable and accurate P-wave modeling results and RTM images with higher computational efficiency. Moreover, the 3D numerical simulations can provide us with a comprehensive and real description of wave propagation.
NASA Astrophysics Data System (ADS)
Meng, Hui; Hui, Hui; Hu, Chaoen; Yang, Xin; Tian, Jie
2017-03-01
The ability of fast and single-neuron resolution imaging of neural activities enables light-sheet fluorescence microscopy (LSFM) as a powerful imaging technique in functional neural connection applications. The state-of-art LSFM imaging system can record the neuronal activities of entire brain for small animal, such as zebrafish or C. elegans at single-neuron resolution. However, the stimulated and spontaneous movements in animal brain result in inconsistent neuron positions during recording process. It is time consuming to register the acquired large-scale images with conventional method. In this work, we address the problem of fast registration of neural positions in stacks of LSFM images. This is necessary to register brain structures and activities. To achieve fast registration of neural activities, we present a rigid registration architecture by implementation of Graphics Processing Unit (GPU). In this approach, the image stacks were preprocessed on GPU by mean stretching to reduce the computation effort. The present image was registered to the previous image stack that considered as reference. A fast Fourier transform (FFT) algorithm was used for calculating the shift of the image stack. The calculations for image registration were performed in different threads while the preparation functionality was refactored and called only once by the master thread. We implemented our registration algorithm on NVIDIA Quadro K4200 GPU under Compute Unified Device Architecture (CUDA) programming environment. The experimental results showed that the registration computation can speed-up to 550ms for a full high-resolution brain image. Our approach also has potential to be used for other dynamic image registrations in biomedical applications.
Lexical Problems in Large Distributed Information Systems.
ERIC Educational Resources Information Center
Berkovich, Simon Ya; Shneiderman, Ben
1980-01-01
Suggests a unified concept of a lexical subsystem as part of an information system to deal with lexical problems in local and network environments. The linguistic and control functions of the lexical subsystems in solving problems for large computer systems are described, and references are included. (Author/BK)
Federal Register 2010, 2011, 2012, 2013, 2014
2012-10-04
... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-794] Certain Electronic Devices, Including Wireless Commmunication Devices, Portable Music and Data Processing Devices, and Tablet Computers... communication devices, portable music and data processing devices, and tablet computers, imported by Apple Inc...
A UML model for the description of different brain-computer interface systems.
Quitadamo, Lucia Rita; Abbafati, Manuel; Saggio, Giovanni; Marciani, Maria Grazia; Cardarilli, Gian Carlo; Bianchi, Luigi
2008-01-01
BCI research lacks a universal descriptive language among labs and a unique standard model for the description of BCI systems. This results in a serious problem in comparing performances of different BCI processes and in unifying tools and resources. In such a view we implemented a Unified Modeling Language (UML) model for the description virtually of any BCI protocol and we demonstrated that it can be successfully applied to the most common ones such as P300, mu-rhythms, SCP, SSVEP, fMRI. Finally we illustrated the advantages in utilizing a standard terminology for BCIs and how the same basic structure can be successfully adopted for the implementation of new systems.
Unifying Gate Synthesis and Magic State Distillation.
Campbell, Earl T; Howard, Mark
2017-02-10
The leading paradigm for performing a computation on quantum memories can be encapsulated as distill-then-synthesize. Initially, one performs several rounds of distillation to create high-fidelity magic states that provide one good T gate, an essential quantum logic gate. Subsequently, gate synthesis intersperses many T gates with Clifford gates to realize a desired circuit. We introduce a unified framework that implements one round of distillation and multiquibit gate synthesis in a single step. Typically, our method uses the same number of T gates as conventional synthesis but with the added benefit of quadratic error suppression. Because of this, one less round of magic state distillation needs to be performed, leading to significant resource savings.
Quantum-gravity predictions for the fine-structure constant
NASA Astrophysics Data System (ADS)
Eichhorn, Astrid; Held, Aaron; Wetterich, Christof
2018-07-01
Asymptotically safe quantum fluctuations of gravity can uniquely determine the value of the gauge coupling for a large class of grand unified models. In turn, this makes the electromagnetic fine-structure constant calculable. The balance of gravity and matter fluctuations results in a fixed point for the running of the gauge coupling. It is approached as the momentum scale is lowered in the transplanckian regime, leading to a uniquely predicted value of the gauge coupling at the Planck scale. The precise value of the predicted fine-structure constant depends on the matter content of the grand unified model. It is proportional to the gravitational fluctuation effects for which computational uncertainties remain to be settled.
A Unified Theory of Trust and Collaboration
NASA Astrophysics Data System (ADS)
Cai, Guoray; Squicciarini, Anna
We consider a type of applications where collaboration and trust are tightly coupled with the need to protect sensitive information. Existing trust management technologies have been limited to offering generic mechanisms for enforcing access control policies based on exchanged credentials, and rarely deal with the situated meaning of trust in a specific collaborative context. Towards trust management for highly dynamic and collaborative activities, this paper describes a theory of trust intention and semantics that makes explicit connections between collaborative activities and trust. The model supports inferring trust state based on knowledge about state of collaborative activity. It is the first step towards a unified approach for computer-mediated trust communication in the context of collaborative work.
OVERGRID: A Unified Overset Grid Generation Graphical Interface
NASA Technical Reports Server (NTRS)
Chan, William M.; Akien, Edwin W. (Technical Monitor)
1999-01-01
This paper presents a unified graphical interface and gridding strategy for performing overset grid generation. The interface called OVERGRID has been specifically designed to follow an efficient overset gridding strategy, and contains general grid manipulation capabilities as well as modules that are specifically suited for overset grids. General grid utilities include functions for grid redistribution, smoothing, concatenation, extraction, extrapolation, projection, and many others. Modules specially tailored for overset grids include a seam curve extractor, hyperbolic and algebraic surface grid generators, a hyperbolic volume grid generator, and a Cartesian box grid generator, Grid visualization is achieved using OpenGL while widgets are constructed with Tcl/Tk. The software is portable between various platforms from UNIX workstations to personal computers.
Sensor sentinel computing device
Damico, Joseph P.
2016-08-02
Technologies pertaining to authenticating data output by sensors in an industrial environment are described herein. A sensor sentinel computing device receives time-series data from a sensor by way of a wireline connection. The sensor sentinel computing device generates a validation signal that is a function of the time-series signal. The sensor sentinel computing device then transmits the validation signal to a programmable logic controller in the industrial environment.
A Unified Air-Sea Visualization System: Survey on Gridding Structures
NASA Technical Reports Server (NTRS)
Anand, Harsh; Moorhead, Robert
1995-01-01
The goal is to develop a Unified Air-Sea Visualization System (UASVS) to enable the rapid fusion of observational, archival, and model data for verification and analysis. To design and develop UASVS, modelers were polled to determine the gridding structures and visualization systems used, and their needs with respect to visual analysis. A basic UASVS requirement is to allow a modeler to explore multiple data sets within a single environment, or to interpolate multiple datasets onto one unified grid. From this survey, the UASVS should be able to visualize 3D scalar/vector fields; render isosurfaces; visualize arbitrary slices of the 3D data; visualize data defined on spectral element grids with the minimum number of interpolation stages; render contours; produce 3D vector plots and streamlines; provide unified visualization of satellite images, observations and model output overlays; display the visualization on a projection of the users choice; implement functions so the user can derive diagnostic values; animate the data to see the time-evolution; animate ocean and atmosphere at different rates; store the record of cursor movement, smooth the path, and animate a window around the moving path; repeatedly start and stop the visual time-stepping; generate VHS tape animations; work on a variety of workstations; and allow visualization across clusters of workstations and scalable high performance computer systems.
Simple video format for mobile applications
NASA Astrophysics Data System (ADS)
Smith, John R.; Miao, Zhourong; Li, Chung-Sheng
2000-04-01
With the advent of pervasive computing, there is a growing demand for enabling multimedia applications on mobile devices. Large numbers of pervasive computing devices, such as personal digital assistants (PDAs), hand-held computer (HHC), smart phones, portable audio players, automotive computing devices, and wearable computers are gaining access to online information sources. However, the pervasive computing devices are often constrained along a number of dimensions, such as processing power, local storage, display size and depth, connectivity, and communication bandwidth, which makes it difficult to access rich image and video content. In this paper, we report on our initial efforts in designing a simple scalable video format with low-decoding and transcoding complexity for pervasive computing. The goal is to enable image and video access for mobile applications such as electronic catalog shopping, video conferencing, remote surveillance and video mail using pervasive computing devices.
NASA Astrophysics Data System (ADS)
Perfors, Amy
2014-09-01
There is much to approve of in this provocative and interesting paper. I strongly agree in many parts, especially the point that dichotomies like nature/nurture are actively detrimental to the field. I also appreciate the idea that cognitive scientists should take the "biological wetware" of the cell (rather than the network) more seriously.
Proactive human-computer collaboration for information discovery
NASA Astrophysics Data System (ADS)
DiBona, Phil; Shilliday, Andrew; Barry, Kevin
2016-05-01
Lockheed Martin Advanced Technology Laboratories (LM ATL) is researching methods, representations, and processes for human/autonomy collaboration to scale analysis and hypotheses substantiation for intelligence analysts. This research establishes a machinereadable hypothesis representation that is commonsensical to the human analyst. The representation unifies context between the human and computer, enabling autonomy in the form of analytic software, to support the analyst through proactively acquiring, assessing, and organizing high-value information that is needed to inform and substantiate hypotheses.
Interfacing External Quantum Devices to a Universal Quantum Computer
Lagana, Antonio A.; Lohe, Max A.; von Smekal, Lorenz
2011-01-01
We present a scheme to use external quantum devices using the universal quantum computer previously constructed. We thereby show how the universal quantum computer can utilize networked quantum information resources to carry out local computations. Such information may come from specialized quantum devices or even from remote universal quantum computers. We show how to accomplish this by devising universal quantum computer programs that implement well known oracle based quantum algorithms, namely the Deutsch, Deutsch-Jozsa, and the Grover algorithms using external black-box quantum oracle devices. In the process, we demonstrate a method to map existing quantum algorithms onto the universal quantum computer. PMID:22216276
Interfacing external quantum devices to a universal quantum computer.
Lagana, Antonio A; Lohe, Max A; von Smekal, Lorenz
2011-01-01
We present a scheme to use external quantum devices using the universal quantum computer previously constructed. We thereby show how the universal quantum computer can utilize networked quantum information resources to carry out local computations. Such information may come from specialized quantum devices or even from remote universal quantum computers. We show how to accomplish this by devising universal quantum computer programs that implement well known oracle based quantum algorithms, namely the Deutsch, Deutsch-Jozsa, and the Grover algorithms using external black-box quantum oracle devices. In the process, we demonstrate a method to map existing quantum algorithms onto the universal quantum computer. © 2011 Lagana et al.
ERIC Educational Resources Information Center
Peale, Lydia
An effective, 3-week unit on the 1920's in America, inspired by a reading of "The Great Gatsby," is designed around the idea of structure in history, government, literature, and writing. It has the following aims for students: (1) to understand the history of the 1920s, (2) to see literature as a creative shaping of the experience of the…
Bruno Garza, J L; Young, J G
2015-01-01
Extended use of conventional computer input devices is associated with negative musculoskeletal outcomes. While many alternative designs have been proposed, it is unclear whether these devices reduce biomechanical loading and musculoskeletal outcomes. To review studies describing and evaluating the biomechanical loading and musculoskeletal outcomes associated with conventional and alternative input devices. Included studies evaluated biomechanical loading and/or musculoskeletal outcomes of users' distal or proximal upper extremity regions associated with the operation of alternative input devices (pointing devices, mice, other devices) that could be used in a desktop personal computing environment during typical office work. Some alternative pointing device designs (e.g. rollerbar) were consistently associated with decreased biomechanical loading while other designs had inconsistent results across studies. Most alternative keyboards evaluated in the literature reduce biomechanical loading and musculoskeletal outcomes. Studies of other input devices (e.g. touchscreen and gestural controls) were rare, however, those reported to date indicate that these devices are currently unsuitable as replacements for traditional devices. Alternative input devices that reduce biomechanical loading may make better choices for preventing or alleviating musculoskeletal outcomes during computer use, however, it is unclear whether many existing designs are effective.
Fitting the Reduced RUM with Mplus: A Tutorial
ERIC Educational Resources Information Center
Chiu, Chia-Yi; Köhn, Hans-Friedrich; Wu, Huey-Min
2016-01-01
The Reduced Reparameterized Unified Model (Reduced RUM) is a diagnostic classification model for educational assessment that has received considerable attention among psychometricians. However, the computational options for researchers and practitioners who wish to use the Reduced RUM in their work, but do not feel comfortable writing their own…
The Collins Center Update. Volume 6, Issue 3, April-June 2004
2004-06-01
fought campaign plans with students from the other Senior Level Colleges in a free - play computer-assisted war game. INSIDE THIS ISSUE • Unified...phase, where the students came together to execute their plans in a dynamic free - play environment. The exercise, guided by the participants’ own
MATREX: A Unifying Modeling and Simulation Architecture for Live-Virtual-Constructive Applications
2007-05-23
Deployment Systems Acquisition Operations & Support B C Sustainment FRP Decision Review FOC LRIP/IOT& ECritical Design Review Pre-Systems...CMS2 – Comprehensive Munitions & Sensor Server • CSAT – C4ISR Static Analysis Tool • C4ISR – Command & Control, Communications, Computers
Integrating Incremental Learning and Episodic Memory Models of the Hippocampal Region
ERIC Educational Resources Information Center
Meeter, M.; Myers, C. E.; Gluck, M. A.
2005-01-01
By integrating previous computational models of corticohippocampal function, the authors develop and test a unified theory of the neural substrates of familiarity, recollection, and classical conditioning. This approach integrates models from 2 traditions of hippocampal modeling, those of episodic memory and incremental learning, by drawing on an…
Building Efficient Wireless Infrastructures for Pervasive Computing Environments
ERIC Educational Resources Information Center
Sheng, Bo
2010-01-01
Pervasive computing is an emerging concept that thoroughly brings computing devices and the consequent technology into people's daily life and activities. Most of these computing devices are very small, sometimes even "invisible", and often embedded into the objects surrounding people. In addition, these devices usually are not isolated, but…
Generic-distributed framework for cloud services marketplace based on unified ontology.
Hasan, Samer; Valli Kumari, V
2017-11-01
Cloud computing is a pattern for delivering ubiquitous and on demand computing resources based on pay-as-you-use financial model. Typically, cloud providers advertise cloud service descriptions in various formats on the Internet. On the other hand, cloud consumers use available search engines (Google and Yahoo) to explore cloud service descriptions and find the adequate service. Unfortunately, general purpose search engines are not designed to provide a small and complete set of results, which makes the process a big challenge. This paper presents a generic-distrusted framework for cloud services marketplace to automate cloud services discovery and selection process, and remove the barriers between service providers and consumers. Additionally, this work implements two instances of generic framework by adopting two different matching algorithms; namely dominant and recessive attributes algorithm borrowed from gene science and semantic similarity algorithm based on unified cloud service ontology. Finally, this paper presents unified cloud services ontology and models the real-life cloud services according to the proposed ontology. To the best of the authors' knowledge, this is the first attempt to build a cloud services marketplace where cloud providers and cloud consumers can trend cloud services as utilities. In comparison with existing work, semantic approach reduced the execution time by 20% and maintained the same values for all other parameters. On the other hand, dominant and recessive attributes approach reduced the execution time by 57% but showed lower value for recall.
Towards Scalable Graph Computation on Mobile Devices.
Chen, Yiqi; Lin, Zhiyuan; Pienta, Robert; Kahng, Minsuk; Chau, Duen Horng
2014-10-01
Mobile devices have become increasingly central to our everyday activities, due to their portability, multi-touch capabilities, and ever-improving computational power. Such attractive features have spurred research interest in leveraging mobile devices for computation. We explore a novel approach that aims to use a single mobile device to perform scalable graph computation on large graphs that do not fit in the device's limited main memory, opening up the possibility of performing on-device analysis of large datasets, without relying on the cloud. Based on the familiar memory mapping capability provided by today's mobile operating systems, our approach to scale up computation is powerful and intentionally kept simple to maximize its applicability across the iOS and Android platforms. Our experiments demonstrate that an iPad mini can perform fast computation on large real graphs with as many as 272 million edges (Google+ social graph), at a speed that is only a few times slower than a 13″ Macbook Pro. Through creating a real world iOS app with this technique, we demonstrate the strong potential application for scalable graph computation on a single mobile device using our approach.
Towards Scalable Graph Computation on Mobile Devices
Chen, Yiqi; Lin, Zhiyuan; Pienta, Robert; Kahng, Minsuk; Chau, Duen Horng
2015-01-01
Mobile devices have become increasingly central to our everyday activities, due to their portability, multi-touch capabilities, and ever-improving computational power. Such attractive features have spurred research interest in leveraging mobile devices for computation. We explore a novel approach that aims to use a single mobile device to perform scalable graph computation on large graphs that do not fit in the device's limited main memory, opening up the possibility of performing on-device analysis of large datasets, without relying on the cloud. Based on the familiar memory mapping capability provided by today's mobile operating systems, our approach to scale up computation is powerful and intentionally kept simple to maximize its applicability across the iOS and Android platforms. Our experiments demonstrate that an iPad mini can perform fast computation on large real graphs with as many as 272 million edges (Google+ social graph), at a speed that is only a few times slower than a 13″ Macbook Pro. Through creating a real world iOS app with this technique, we demonstrate the strong potential application for scalable graph computation on a single mobile device using our approach. PMID:25859564
Materials requirements for optical processing and computing devices
NASA Technical Reports Server (NTRS)
Tanguay, A. R., Jr.
1985-01-01
Devices for optical processing and computing systems are discussed, with emphasis on the materials requirements imposed by functional constraints. Generalized optical processing and computing systems are described in order to identify principal categories of requisite components for complete system implementation. Three principal device categories are selected for analysis in some detail: spatial light modulators, volume holographic optical elements, and bistable optical devices. The implications for optical processing and computing systems of the materials requirements identified for these device categories are described, and directions for future research are proposed.
NASA Astrophysics Data System (ADS)
Bessonov, O.; Silvestrov, P.
2017-02-01
This paper describes the general idea and the first implementation of the Interactive information and simulation system - an integrated environment that combines computational modules for modeling the aerodynamics and aerothermodynamics of re-entry space vehicles with the large collection of different information materials on this topic. The internal organization and the composition of the system are described and illustrated. Examples of the computational and information output are presented. The system has the unified implementation for Windows and Linux operation systems and can be deployed on any modern high-performance personal computer.
NASA Technical Reports Server (NTRS)
Muellerschoen, R. J.
1988-01-01
A unified method to permute vector stored Upper triangular Diagonal factorized covariance and vector stored upper triangular Square Root Information arrays is presented. The method involves cyclic permutation of the rows and columns of the arrays and retriangularization with fast (slow) Givens rotations (reflections). Minimal computation is performed, and a one dimensional scratch array is required. To make the method efficient for large arrays on a virtual memory machine, computations are arranged so as to avoid expensive paging faults. This method is potentially important for processing large volumes of radio metric data in the Deep Space Network.
An Enduring Dialogue between Computational and Empirical Vision.
Martinez-Conde, Susana; Macknik, Stephen L; Heeger, David J
2018-04-01
In the late 1970s, key discoveries in neurophysiology, psychophysics, computer vision, and image processing had reached a tipping point that would shape visual science for decades to come. David Marr and Ellen Hildreth's 'Theory of edge detection', published in 1980, set out to integrate the newly available wealth of data from behavioral, physiological, and computational approaches in a unifying theory. Although their work had wide and enduring ramifications, their most important contribution may have been to consolidate the foundations of the ongoing dialogue between theoretical and empirical vision science. Copyright © 2018 Elsevier Ltd. All rights reserved.
Quantum Computing Architectural Design
NASA Astrophysics Data System (ADS)
West, Jacob; Simms, Geoffrey; Gyure, Mark
2006-03-01
Large scale quantum computers will invariably require scalable architectures in addition to high fidelity gate operations. Quantum computing architectural design (QCAD) addresses the problems of actually implementing fault-tolerant algorithms given physical and architectural constraints beyond those of basic gate-level fidelity. Here we introduce a unified framework for QCAD that enables the scientist to study the impact of varying error correction schemes, architectural parameters including layout and scheduling, and physical operations native to a given architecture. Our software package, aptly named QCAD, provides compilation, manipulation/transformation, multi-paradigm simulation, and visualization tools. We demonstrate various features of the QCAD software package through several examples.
NASA Technical Reports Server (NTRS)
Thakur, Siddarth; Wright, Jeffrey
2006-01-01
The traditional design and analysis practice for advanced propulsion systems, particularly chemical rocket engines, relies heavily on expensive full-scale prototype development and testing. Over the past decade, use of high-fidelity analysis and design tools such as CFD early in the product development cycle has been identified as one way to alleviate testing costs and to develop these devices better, faster and cheaper. Increased emphasis is being placed on developing and applying CFD models to simulate the flow field environments and performance of advanced propulsion systems. This necessitates the development of next generation computational tools which can be used effectively and reliably in a design environment by non-CFD specialists. A computational tool, called Loci-STREAM is being developed for this purpose. It is a pressure-based, Reynolds-averaged Navier-Stokes (RANS) solver for generalized unstructured grids, which is designed to handle all-speed flows (incompressible to hypersonic) and is particularly suitable for solving multi-species flow in fixed-frame combustion devices. Loci-STREAM integrates proven numerical methods for generalized grids and state-of-the-art physical models in a novel rule-based programming framework called Loci which allows: (a) seamless integration of multidisciplinary physics in a unified manner, and (b) automatic handling of massively parallel computing. The objective of the ongoing work is to develop a robust simulation capability for combustion problems in rocket engines. As an initial step towards validating this capability, a model problem is investigated in the present study which involves a gaseous oxygen/gaseous hydrogen (GO2/GH2) shear coaxial single element injector, for which experimental data are available. The sensitivity of the computed solutions to grid density, grid distribution, different turbulence models, and different near-wall treatments is investigated. A refined grid, which is clustered in the vicinity of the solid walls as well as the flame, is used to obtain a steady state solution which may be considered as the best solution attainable with the steady-state RANS methodology. From a design point of view, quick turnaround times are desirable; with this in mind, coarser grids are also employed and the resulting solutions are evaluated with respect to the fine grid solution.
Chi, Chia-Fen; Tseng, Li-Kai; Jang, Yuh
2012-07-01
Many disabled individuals lack extensive knowledge about assistive technology, which could help them use computers. In 1997, Denis Anson developed a decision tree of 49 evaluative questions designed to evaluate the functional capabilities of the disabled user and choose an appropriate combination of assistive devices, from a selection of 26, that enable the individual to use a computer. In general, occupational therapists guide the disabled users through this process. They often have to go over repetitive questions in order to find an appropriate device. A disabled user may require an alphanumeric entry device, a pointing device, an output device, a performance enhancement device, or some combination of these. Therefore, the current research eliminates redundant questions and divides Anson's decision tree into multiple independent subtrees to meet the actual demand of computer users with disabilities. The modified decision tree was tested by six disabled users to prove it can determine a complete set of assistive devices with a smaller number of evaluative questions. The means to insert new categories of computer-related assistive devices was included to ensure the decision tree can be expanded and updated. The current decision tree can help the disabled users and assistive technology practitioners to find appropriate computer-related assistive devices that meet with clients' individual needs in an efficient manner.
NASA Astrophysics Data System (ADS)
Gonzalez-Ayala, Julian; Calvo Hernández, A.; Roco, J. M. M.
2016-07-01
The main unified energetic properties of low dissipation heat engines and refrigerator engines allow for both endoreversible or irreversible configurations. This is accomplished by means of the constraints imposed on the characteristic global operation time or the contact times between the working system with the external heat baths and modulated by the dissipation symmetries. A suited unified figure of merit (which becomes power output for heat engines) is analyzed and the influence of the symmetries on the optimum performance discussed. The obtained results, independent on any heat transfer law, are faced with those obtained from Carnot-like heat models where specific heat transfer laws are needed. Thus, it is shown that only the inverse phenomenological law, often used in linear irreversible thermodynamics, correctly reproduces all optimized values for both the efficiency and coefficient of performance values.
NASA Astrophysics Data System (ADS)
Qin, Cheng-Zhi; Zhan, Lijun
2012-06-01
As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU-based algorithms based on existing parallelization strategies.
Klonoff, David C
2017-07-01
The Internet of Things (IoT) is generating an immense volume of data. With cloud computing, medical sensor and actuator data can be stored and analyzed remotely by distributed servers. The results can then be delivered via the Internet. The number of devices in IoT includes such wireless diabetes devices as blood glucose monitors, continuous glucose monitors, insulin pens, insulin pumps, and closed-loop systems. The cloud model for data storage and analysis is increasingly unable to process the data avalanche, and processing is being pushed out to the edge of the network closer to where the data-generating devices are. Fog computing and edge computing are two architectures for data handling that can offload data from the cloud, process it nearby the patient, and transmit information machine-to-machine or machine-to-human in milliseconds or seconds. Sensor data can be processed near the sensing and actuating devices with fog computing (with local nodes) and with edge computing (within the sensing devices). Compared to cloud computing, fog computing and edge computing offer five advantages: (1) greater data transmission speed, (2) less dependence on limited bandwidths, (3) greater privacy and security, (4) greater control over data generated in foreign countries where laws may limit use or permit unwanted governmental access, and (5) lower costs because more sensor-derived data are used locally and less data are transmitted remotely. Connected diabetes devices almost all use fog computing or edge computing because diabetes patients require a very rapid response to sensor input and cannot tolerate delays for cloud computing.
NASA Astrophysics Data System (ADS)
Honing, Henkjan; Zuidema, Willem
2014-09-01
The future of cognitive science will be about bridging neuroscience and behavioral studies, with essential roles played by comparative biology, formal modeling, and the theory of computation. Nowhere will this integration be more strongly needed than in understanding the biological basis of language and music. We thus strongly sympathize with the general framework that Fitch [1] proposes, and welcome the remarkably broad and readable review he presents to support it.
Optimization of coupled systems: A critical overview of approaches
NASA Technical Reports Server (NTRS)
Balling, R. J.; Sobieszczanski-Sobieski, J.
1994-01-01
A unified overview is given of problem formulation approaches for the optimization of multidisciplinary coupled systems. The overview includes six fundamental approaches upon which a large number of variations may be made. Consistent approach names and a compact approach notation are given. The approaches are formulated to apply to general nonhierarchic systems. The approaches are compared both from a computational viewpoint and a managerial viewpoint. Opportunities for parallelism of both computation and manpower resources are discussed. Recommendations regarding the need for future research are advanced.
NASA Technical Reports Server (NTRS)
Huynh, H. T.; Wang, Z. J.; Vincent, P. E.
2013-01-01
Popular high-order schemes with compact stencils for Computational Fluid Dynamics (CFD) include Discontinuous Galerkin (DG), Spectral Difference (SD), and Spectral Volume (SV) methods. The recently proposed Flux Reconstruction (FR) approach or Correction Procedure using Reconstruction (CPR) is based on a differential formulation and provides a unifying framework for these high-order schemes. Here we present a brief review of recent developments for the FR/CPR schemes as well as some pacing items.
NASA Astrophysics Data System (ADS)
Knoeferle, Pia
2016-03-01
In his review article [19], Arbib outlines an ambitious research agenda: to accommodate within a unified framework the evolution, the development, and the processing of language in natural settings (implicating other systems such as vision). He does so with neuro-computationally explicit modeling in mind [1,2] and inspired by research on the mirror neuron system in primates. Similar research questions have received substantial attention also among other scientists [3,4,12].
2008-03-01
computational version of the CASIE architecture serves to demonstrate the functionality of our primary theories. However, implementation of several other...following facts. First, based on Theorem 3 and Theorem 5, the objective function is non -increasing under updating rule (6); second, by the criteria for...reassignment in updating rule (7), it is trivial to show that the objective function is non -increasing under updating rule (7). A Unified View to Graph
Electrical transport and low-frequency noise in chemical vapor deposited single-layer MoS2 devices.
Sharma, Deepak; Amani, Matin; Motayed, Abhishek; Shah, Pankaj B; Birdwell, A Glen; Najmaei, Sina; Ajayan, Pulickel M; Lou, Jun; Dubey, Madan; Li, Qiliang; Davydov, Albert V
2014-04-18
We have studied temperature-dependent (77-300 K) electrical characteristics and low-frequency noise (LFN) in chemical vapor deposited (CVD) single-layer molybdenum disulfide (MoS2) based back-gated field-effect transistors (FETs). Electrical characterization and LFN measurements were conducted on MoS2 FETs with Al2O3 top-surface passivation. We also studied the effect of top-surface passivation etching on the electrical characteristics of the device. Significant decrease in channel current and transconductance was observed in these devices after the Al2O3 passivation etching. For passivated devices, the two-terminal resistance variation with temperature showed a good fit to the activation energy model, whereas for the etched devices the trend indicated a hopping transport mechanism. A significant increase in the normalized drain current noise power spectral density (PSD) was observed after the etching of the top passivation layer. The observed channel current noise was explained using a standard unified model incorporating carrier number fluctuation and correlated surface mobility fluctuation mechanisms. Detailed analysis of the gate-referred noise voltage PSD indicated the presence of different trapping states in passivated devices when compared to the etched devices. Etched devices showed weak temperature dependence of the channel current noise, whereas passivated devices exhibited near-linear temperature dependence.
Mesoscale Interfacial Dynamics in Magnetoelectric Nanocomposites
DOE Office of Scientific and Technical Information (OSTI.GOV)
Khachaturyan, Armen G.
Theory and modeling of chessboard-like self-assembling of vertically aligned columnar nanostructures in films has been developed. By means of modeling and three-dimensional computational simulations, we proposed a novel self-assembly process that can produce good chessboard nanostructure architectures through a pseudo-spinodal decomposition of an epitaxial film under optimal thermodynamic and crystallographic conditions (appropriate choice of the temperature, composition of the film, and crystal lattice parameters of the film and substrate). These conditions are formulated. The obtained results have been published on Nano Letters. Based on the principles of the formation of chessboard nanostructured films, we are currently trying to find goodmore » decomposing material systems that satisfy the optimal conditions to produce the chessboard nanostructure architecture. In addition we are under way doing 'computer experiments' to look for the appropriate materials with the chessboard columnar nanostructures, as a potential candidate for engineering of optical devices, high-efficiency multiferroics, and high-density magnetic perpendicular recording media. We are also currently to investigate the magnetoelectric response of multiferroic chessboard nanostructures under applied electric/magnetic fields. A unified 3-dimensional phase field theory of the strain-mediated magnetoelectric effect in magnetoelectric composites is developed. The theory is based on the established equivalency paradigm: we proved that by using a variational priciple the exact values of the electric, magnetic and strain fields in a magnetoelectric composite of arbitrary morphology and their coupled magneto-electric-mechanical response can be evaluated by considering an equivalent homogeneous system with the specially chosen effective eigenstrain, polarization and magnetization. These equivalency parameters are spatially inhomogeneous fields, which are obtained by solving the time-dependent Ginzburg-Landau equations. The paper summarizing these results is to be submitted to JAP. We are currently using the computational model based on the unified phase field theory to predict the local and overall response of the magnetoelectric composites with arbitrary configuration under applied fields, and to find the optimal composite microstructure that produces the strongest ME coupling. We have developed modeling and simulations to support Dr. S. Pryia efforts to produce the strongest ME coupling by searching the optimal configuration of applied electric/magnetic fields, and microstructure of polycrystalline multiferroics. An analytical model demonstrates that the optimization of a magnetoelectric (ME) coupling of a laminar magnetic/piezoelectric polycrystalline composite could be obtained by a proper choice of the magnetic and electric poling directions and the directions of the applied a.c. fields. The results have been published on JAP. Our next step is to determine the domain of optimal parameters and configurations by using our optimization theory and computational modeling.« less
NASA Astrophysics Data System (ADS)
Ryoo, Jean; Goode, Joanna; Margolis, Jane
2015-10-01
This article describes the importance that high school computer science teachers place on a teachers' professional learning community designed around an inquiry- and equity-oriented approach for broadening participation in computing. Using grounded theory to analyze four years of teacher surveys and interviews from the Exploring Computer Science (ECS) program in the Los Angeles Unified School District, this article describes how participating in professional development activities purposefully aimed at fostering a teachers' professional learning community helps ECS teachers make the transition to an inquiry-based classroom culture and break professional isolation. This professional learning community also provides experiences that challenge prevalent deficit notions and stereotypes about which students can or cannot excel in computer science.
Teaching and Learning with Mobile Computing Devices: Case Study in K-12 Classrooms
ERIC Educational Resources Information Center
Grant, Michael M.; Tamim, Suha; Brown, Dorian B.; Sweeney, Joseph P.; Ferguson, Fatima K.; Jones, Lakavious B.
2015-01-01
While ownership of mobile computing devices, such as cellphones, smartphones, and tablet computers, has been rapid, the adoption of these devices in K-12 classrooms has been measured. Some schools and individual teachers have integrated mobile devices to support teaching and learning. The purpose of this qualitative research was to describe the…
On Emulation of Flueric Devices in Excitable Chemical Medium
Adamatzky, Andrew
2016-01-01
Flueric devices are fluidic devices without moving parts. Fluidic devices use fluid as a medium for information transfer and computation. A Belousov-Zhabotinsky (BZ) medium is a thin-layer spatially extended excitable chemical medium which exhibits travelling excitation wave-fronts. The excitation wave-fronts transfer information. Flueric devices compute via jets interaction. BZ devices compute via excitation wave-fronts interaction. In numerical model of BZ medium we show that functions of key flueric devices are implemented in the excitable chemical system: signal generator, and, xor, not and nor Boolean gates, delay elements, diodes and sensors. Flueric devices have been widely used in industry since late 1960s and are still employed in automotive and aircraft technologies. Implementation of analog of the flueric devices in the excitable chemical systems opens doors to further applications of excitation wave-based unconventional computing in soft robotics, embedded organic electronics and living technologies. PMID:27997561
On Emulation of Flueric Devices in Excitable Chemical Medium.
Adamatzky, Andrew
2016-01-01
Flueric devices are fluidic devices without moving parts. Fluidic devices use fluid as a medium for information transfer and computation. A Belousov-Zhabotinsky (BZ) medium is a thin-layer spatially extended excitable chemical medium which exhibits travelling excitation wave-fronts. The excitation wave-fronts transfer information. Flueric devices compute via jets interaction. BZ devices compute via excitation wave-fronts interaction. In numerical model of BZ medium we show that functions of key flueric devices are implemented in the excitable chemical system: signal generator, and, xor, not and nor Boolean gates, delay elements, diodes and sensors. Flueric devices have been widely used in industry since late 1960s and are still employed in automotive and aircraft technologies. Implementation of analog of the flueric devices in the excitable chemical systems opens doors to further applications of excitation wave-based unconventional computing in soft robotics, embedded organic electronics and living technologies.
Medical Imaging Lesion Detection Based on Unified Gravitational Fuzzy Clustering
Vianney Kinani, Jean Marie; Gallegos Funes, Francisco; Mújica Vargas, Dante; Ramos Díaz, Eduardo; Arellano, Alfonso
2017-01-01
We develop a swift, robust, and practical tool for detecting brain lesions with minimal user intervention to assist clinicians and researchers in the diagnosis process, radiosurgery planning, and assessment of the patient's response to the therapy. We propose a unified gravitational fuzzy clustering-based segmentation algorithm, which integrates the Newtonian concept of gravity into fuzzy clustering. We first perform fuzzy rule-based image enhancement on our database which is comprised of T1/T2 weighted magnetic resonance (MR) and fluid-attenuated inversion recovery (FLAIR) images to facilitate a smoother segmentation. The scalar output obtained is fed into a gravitational fuzzy clustering algorithm, which separates healthy structures from the unhealthy. Finally, the lesion contour is automatically outlined through the initialization-free level set evolution method. An advantage of this lesion detection algorithm is its precision and its simultaneous use of features computed from the intensity properties of the MR scan in a cascading pattern, which makes the computation fast, robust, and self-contained. Furthermore, we validate our algorithm with large-scale experiments using clinical and synthetic brain lesion datasets. As a result, an 84%–93% overlap performance is obtained, with an emphasis on robustness with respect to different and heterogeneous types of lesion and a swift computation time. PMID:29158887
Takano, Yu; Nakata, Kazuto; Yonezawa, Yasushige; Nakamura, Haruki
2016-05-05
A massively parallel program for quantum mechanical-molecular mechanical (QM/MM) molecular dynamics simulation, called Platypus (PLATform for dYnamic Protein Unified Simulation), was developed to elucidate protein functions. The speedup and the parallelization ratio of Platypus in the QM and QM/MM calculations were assessed for a bacteriochlorophyll dimer in the photosynthetic reaction center (DIMER) on the K computer, a massively parallel computer achieving 10 PetaFLOPs with 705,024 cores. Platypus exhibited the increase in speedup up to 20,000 core processors at the HF/cc-pVDZ and B3LYP/cc-pVDZ, and up to 10,000 core processors by the CASCI(16,16)/6-31G** calculations. We also performed excited QM/MM-MD simulations on the chromophore of Sirius (SIRIUS) in water. Sirius is a pH-insensitive and photo-stable ultramarine fluorescent protein. Platypus accelerated on-the-fly excited-state QM/MM-MD simulations for SIRIUS in water, using over 4000 core processors. In addition, it also succeeded in 50-ps (200,000-step) on-the-fly excited-state QM/MM-MD simulations for the SIRIUS in water. © 2016 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Arkadov, G. V.; Zhukavin, A. P.; Kroshilin, A. E.; Parshikov, I. A.; Solov'ev, S. L.; Shishov, A. V.
2014-10-01
The article describes the "Virtual Digital VVER-Based Nuclear Power Plant" computerized system comprising a totality of verified initial data (sets of input data for a model intended for describing the behavior of nuclear power plant (NPP) systems in design and emergency modes of their operation) and a unified system of new-generation computation codes intended for carrying out coordinated computation of the variety of physical processes in the reactor core and NPP equipment. Experiments with the demonstration version of the "Virtual Digital VVER-Based NPP" computerized system has shown that it is in principle possible to set up a unified system of computation codes in a common software environment for carrying out interconnected calculations of various physical phenomena at NPPs constructed according to the standard AES-2006 project. With the full-scale version of the "Virtual Digital VVER-Based NPP" computerized system put in operation, the concerned engineering, design, construction, and operating organizations will have access to all necessary information relating to the NPP power unit project throughout its entire lifecycle. The domestically developed commercial-grade software product set to operate as an independently operating application to the project will bring about additional competitive advantages in the modern market of nuclear power technologies.
Computational and experimental studies of LEBUs at high device Reynolds numbers
NASA Technical Reports Server (NTRS)
Bertelrud, Arild; Watson, R. D.
1988-01-01
The present paper summarizes computational and experimental studies for large-eddy breakup devices (LEBUs). LEBU optimization (using a computational approach considering compressibility, Reynolds number, and the unsteadiness of the flow) and experiments with LEBUs at high Reynolds numbers in flight are discussed. The measurements include streamwise as well as spanwise distributions of local skin friction. The unsteady flows around the LEBU devices and far downstream are characterized by strain-gage measurements on the devices and hot-wire readings downstream. Computations are made with available time-averaged and quasi-stationary techniques to find suitable device profiles with minimum drag.
Direct modeling for computational fluid dynamics
NASA Astrophysics Data System (ADS)
Xu, Kun
2015-06-01
All fluid dynamic equations are valid under their modeling scales, such as the particle mean free path and mean collision time scale of the Boltzmann equation and the hydrodynamic scale of the Navier-Stokes (NS) equations. The current computational fluid dynamics (CFD) focuses on the numerical solution of partial differential equations (PDEs), and its aim is to get the accurate solution of these governing equations. Under such a CFD practice, it is hard to develop a unified scheme that covers flow physics from kinetic to hydrodynamic scales continuously because there is no such governing equation which could make a smooth transition from the Boltzmann to the NS modeling. The study of fluid dynamics needs to go beyond the traditional numerical partial differential equations. The emerging engineering applications, such as air-vehicle design for near-space flight and flow and heat transfer in micro-devices, do require further expansion of the concept of gas dynamics to a larger domain of physical reality, rather than the traditional distinguishable governing equations. At the current stage, the non-equilibrium flow physics has not yet been well explored or clearly understood due to the lack of appropriate tools. Unfortunately, under the current numerical PDE approach, it is hard to develop such a meaningful tool due to the absence of valid PDEs. In order to construct multiscale and multiphysics simulation methods similar to the modeling process of constructing the Boltzmann or the NS governing equations, the development of a numerical algorithm should be based on the first principle of physical modeling. In this paper, instead of following the traditional numerical PDE path, we introduce direct modeling as a principle for CFD algorithm development. Since all computations are conducted in a discretized space with limited cell resolution, the flow physics to be modeled has to be done in the mesh size and time step scales. Here, the CFD is more or less a direct construction of discrete numerical evolution equations, where the mesh size and time step will play dynamic roles in the modeling process. With the variation of the ratio between mesh size and local particle mean free path, the scheme will capture flow physics from the kinetic particle transport and collision to the hydrodynamic wave propagation. Based on the direct modeling, a continuous dynamics of flow motion will be captured in the unified gas-kinetic scheme. This scheme can be faithfully used to study the unexplored non-equilibrium flow physics in the transition regime.
NASA Technical Reports Server (NTRS)
Hall, William A. (Inventor)
1993-01-01
A bus programmable slave module card for use in a computer control system is disclosed which comprises a master computer and one or more slave computer modules interfacing by means of a bus. Each slave module includes its own microprocessor, memory, and control program for acting as a single loop controller. The slave card includes a plurality of memory means (S1, S2...) corresponding to a like plurality of memory devices (C1, C2...) in the master computer, for each slave memory means its own communication lines connectable through the bus with memory communication lines of an associated memory device in the master computer, and a one-way electronic door which is switchable to either a closed condition or a one-way open condition. With the door closed, communication lines between master computer memory (C1, C2...) and slave memory (S1, S2...) are blocked. In the one-way open condition invention, the memory communication lines or each slave memory means (S1, S2...) connect with the memory communication lines of its associated memory device (C1, C2...) in the master computer, and the memory devices (C1, C2...) of the master computer and slave card are electrically parallel such that information seen by the master's memory is also seen by the slave's memory. The slave card is also connectable to a switch for electronically removing the slave microprocessor from the system. With the master computer and the slave card in programming mode relationship, and the slave microprocessor electronically removed from the system, loading a program in the memory devices (C1, C2...) of the master accomplishes a parallel loading into the memory devices (S1, S2...) of the slave.
Unified Performance and Power Modeling of Scientific Workloads
DOE Office of Scientific and Technical Information (OSTI.GOV)
Song, Shuaiwen; Barker, Kevin J.; Kerbyson, Darren J.
2013-11-17
It is expected that scientific applications executing on future large-scale HPC must be optimized not only in terms of performance, but also in terms of power consumption. As power and energy become increasingly constrained resources, researchers and developers must have access to tools that will allow for accurate prediction of both performance and power consumption. Reasoning about performance and power consumption in concert will be critical for achieving maximum utilization of limited resources on future HPC systems. To this end, we present a unified performance and power model for the Nek-Bone mini-application developed as part of the DOE's CESAR Exascalemore » Co-Design Center. Our models consider the impact of computation, point-to-point communication, and collective communication« less
NASA Technical Reports Server (NTRS)
Gunness, R. C., Jr.; Knight, C. J.; Dsylva, E.
1972-01-01
The unified small disturbance equations are numerically solved using the well-known Lax-Wendroff finite difference technique. The method allows complete determination of the inviscid flow field and surface properties as long as the flow remains supersonic. Shock waves and other discontinuities are accounted for implicity in the numerical method. This technique was programed for general application to the three-dimensional case. The validity of the method is demonstrated by calculations on cones, axisymmetric bodies, lifting bodies, delta wings, and a conical wing/body combination. Part 1 contains the discussion of problem development and results of the study. Part 2 contains flow charts, subroutine descriptions, and a listing of the computer program.
Empirical relations for cavitation and liquid impingement erosion processes
NASA Technical Reports Server (NTRS)
Rao, P. V.; Buckley, D. H.
1984-01-01
A unified power-law relationship between average erosion rate and cumulative erosion is presented. Extensive data analyses from venturi, magnetostriction (stationary and oscillating specimens), liquid drop, and jet impact devices appear to conform to this relation. A normalization technique using cavitation and liquid impingement erosion data is also presented to facilitate prediction. Attempts are made to understand the relationship between the coefficients in the power-law relationships and the material properties.
2008-12-01
Coordination Center RDD--Radiological Dispersal Device TOP OFF--Top Officials Exercise TQP--Tiered Qualifications Plan UC--Unified Command UCG ...National Press Club, Release No. FNF-06- 019, November 30, 2006. 101 W. Chan Kim and Renee Mauborgne, Blue Ocean Strategy: How to Create Uncontested Market ...Ask Open Ended Questions,” Evan Carmichael, http://www.evancarmichael.com/ marketing /80/ask-open-ended-questions.html (accessed on September 21, 2008
Manycore Performance-Portability: Kokkos Multidimensional Array Library
Edwards, H. Carter; Sunderland, Daniel; Porter, Vicki; ...
2012-01-01
Large, complex scientific and engineering application code have a significant investment in computational kernels to implement their mathematical models. Porting these computational kernels to the collection of modern manycore accelerator devices is a major challenge in that these devices have diverse programming models, application programming interfaces (APIs), and performance requirements. The Kokkos Array programming model provides library-based approach to implement computational kernels that are performance-portable to CPU-multicore and GPGPU accelerator devices. This programming model is based upon three fundamental concepts: (1) manycore compute devices each with its own memory space, (2) data parallel kernels and (3) multidimensional arrays. Kernel executionmore » performance is, especially for NVIDIA® devices, extremely dependent on data access patterns. Optimal data access pattern can be different for different manycore devices – potentially leading to different implementations of computational kernels specialized for different devices. The Kokkos Array programming model supports performance-portable kernels by (1) separating data access patterns from computational kernels through a multidimensional array API and (2) introduce device-specific data access mappings when a kernel is compiled. An implementation of Kokkos Array is available through Trilinos [Trilinos website, http://trilinos.sandia.gov/, August 2011].« less
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-04
... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-769] Certain Handheld Electronic Computing Devices, Related Software, and Components Thereof; Termination of the Investigation Based on... electronic computing devices, related software, and components thereof by reason of infringement of certain...
Glitches in Los Angeles Payroll System Spark Furor
ERIC Educational Resources Information Center
Trotter, Andrew
2007-01-01
Thousands of Los Angeles teachers have not been paid properly for months because of errors in a corporate-style payroll system that was introduced in January as part of a sweeping, $95 million computer modernization. The Los Angeles Unified School District acknowledges that the payroll system's rollout was rushed and tainted by numerous…
Practical Issues in Estimating Classification Accuracy and Consistency with R Package cacIRT
ERIC Educational Resources Information Center
Lathrop, Quinn N.
2015-01-01
There are two main lines of research in estimating classification accuracy (CA) and classification consistency (CC) under Item Response Theory (IRT). The R package cacIRT provides computer implementations of both approaches in an accessible and unified framework. Even with available implementations, there remains decisions a researcher faces when…
Optimization Techniques for Analysis of Biological and Social Networks
2012-03-28
analyzing a new metaheuristic technique, variable objective search. 3. Experimentation and application: Implement the proposed algorithms , test and fine...alternative mathematical programming formulations, their theoretical analysis, the development of exact algorithms , and heuristics. Originally, clusters...systematic fashion under a unifying theoretical and algorithmic framework. Optimization, Complex Networks, Social Network Analysis, Computational
NASA Technical Reports Server (NTRS)
1972-01-01
A general graded life detection approach ranging from environmental and in situ observations to specific metabolic experiments on Mars is reported. A zero dead volume leak and computer mediated data acquisition system is also discussed. Soil analyses were also conducted.
A Software Hub for High Assurance Model-Driven Development and Analysis
2007-01-23
verification of UML models in TLPVS. In Thomas Baar, Alfred Strohmeier, Ana Moreira, and Stephen J. Mellor, editors, UML 2004 - The Unified Modeling...volume 3785 of Lecture Notes in Computer Science, pages 52–65, Manchester, UK, Nov 2005. Springer. [GH04] Günter Graw and Peter Herrmann. Transformation
A Unified Theory of Solid Propellant Ignition. Part 3. Computer Solutions
1975-12-01
characteristics of the sol«.tU.n were examined: (1) the time (t ) to attain zero surface chemical heating (endothermic heat of pyroly - sis equal to exothermic... pyrolys .3 ictivation ener- gies can be and stiil permit ignition when both pyrolyses are endothermic has not been determined. The jnly systematic
Mathematics: PROJECT DESIGN. Educational Needs, Fresno, 1968, Number 12.
ERIC Educational Resources Information Center
Smart, James R.
This report examines and summarizes the needs in mathematics of the Fresno City school system. The study is one in a series of needs assessment reports for PROJECT DESIGN, an ESEA Title III project administered by the Fresno City Unified School District. Theoretical concepts, rather than computational drill, would be emphasized in the proposed…
A support architecture for reliable distributed computing systems
NASA Technical Reports Server (NTRS)
Dasgupta, Partha; Leblanc, Richard J., Jr.
1988-01-01
The Clouds project is well underway to its goal of building a unified distributed operating system supporting the object model. The operating system design uses the object concept of structuring software at all levels of the system. The basic operating system was developed and work is under progress to build a usable system.
Unifying Computer-Based Assessment across Conceptual Instruction, Problem-Solving, and Digital Games
ERIC Educational Resources Information Center
Miller, William L.; Baker, Ryan S.; Rossi, Lisa M.
2014-01-01
As students work through online learning systems such as the Reasoning Mind blended learning system, they often are not confined to working within a single educational activity; instead, they work through various different activities such as conceptual instruction, problem-solving items, and fluency-building games. However, most work on assessing…
MLeXAI: A Project-Based Application-Oriented Model
ERIC Educational Resources Information Center
Russell, Ingrid; Markov, Zdravko; Neller, Todd; Coleman, Susan
2010-01-01
Our approach to teaching introductory artificial intelligence (AI) unifies its diverse core topics through a theme of machine learning, and emphasizes how AI relates more broadly with computer science. Our work, funded by a grant from the National Science Foundation, involves the development, implementation, and testing of a suite of projects that…
The Use of Computer-Simulated Trajectories to Teach Real Particle Flight
ERIC Educational Resources Information Center
Gagnon, Michel
2011-01-01
The close relationship between charged particles and electromagnetic fields has been well known since the 19th century, thanks to James Clerk Maxwell's brilliant unified theory of electricity and magnetism. Today, electromagnetism is recognized as an essential aspect of human activity and has consequently become a major component of senior…
Energy efficient hybrid computing systems using spin devices
NASA Astrophysics Data System (ADS)
Sharad, Mrigank
Emerging spin-devices like magnetic tunnel junctions (MTJ's), spin-valves and domain wall magnets (DWM) have opened new avenues for spin-based logic design. This work explored potential computing applications which can exploit such devices for higher energy-efficiency and performance. The proposed applications involve hybrid design schemes, where charge-based devices supplement the spin-devices, to gain large benefits at the system level. As an example, lateral spin valves (LSV) involve switching of nanomagnets using spin-polarized current injection through a metallic channel such as Cu. Such spin-torque based devices possess several interesting properties that can be exploited for ultra-low power computation. Analog characteristic of spin current facilitate non-Boolean computation like majority evaluation that can be used to model a neuron. The magneto-metallic neurons can operate at ultra-low terminal voltage of ˜20mV, thereby resulting in small computation power. Moreover, since nano-magnets inherently act as memory elements, these devices can facilitate integration of logic and memory in interesting ways. The spin based neurons can be integrated with CMOS and other emerging devices leading to different classes of neuromorphic/non-Von-Neumann architectures. The spin-based designs involve `mixed-mode' processing and hence can provide very compact and ultra-low energy solutions for complex computation blocks, both digital as well as analog. Such low-power, hybrid designs can be suitable for various data processing applications like cognitive computing, associative memory, and currentmode on-chip global interconnects. Simulation results for these applications based on device-circuit co-simulation framework predict more than ˜100x improvement in computation energy as compared to state of the art CMOS design, for optimal spin-device parameters.
Toward a unifying framework for evolutionary processes.
Paixão, Tiago; Badkobeh, Golnaz; Barton, Nick; Çörüş, Doğan; Dang, Duc-Cuong; Friedrich, Tobias; Lehre, Per Kristian; Sudholt, Dirk; Sutton, Andrew M; Trubenová, Barbora
2015-10-21
The theory of population genetics and evolutionary computation have been evolving separately for nearly 30 years. Many results have been independently obtained in both fields and many others are unique to its respective field. We aim to bridge this gap by developing a unifying framework for evolutionary processes that allows both evolutionary algorithms and population genetics models to be cast in the same formal framework. The framework we present here decomposes the evolutionary process into its several components in order to facilitate the identification of similarities between different models. In particular, we propose a classification of evolutionary operators based on the defining properties of the different components. We cast several commonly used operators from both fields into this common framework. Using this, we map different evolutionary and genetic algorithms to different evolutionary regimes and identify candidates with the most potential for the translation of results between the fields. This provides a unified description of evolutionary processes and represents a stepping stone towards new tools and results to both fields. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
BioNet Digital Communications Framework
NASA Technical Reports Server (NTRS)
Gifford, Kevin; Kuzminsky, Sebastian; Williams, Shea
2010-01-01
BioNet v2 is a peer-to-peer middleware that enables digital communication devices to talk to each other. It provides a software development framework, standardized application, network-transparent device integration services, a flexible messaging model, and network communications for distributed applications. BioNet is an implementation of the Constellation Program Command, Control, Communications and Information (C3I) Interoperability specification, given in CxP 70022-01. The system architecture provides the necessary infrastructure for the integration of heterogeneous wired and wireless sensing and control devices into a unified data system with a standardized application interface, providing plug-and-play operation for hardware and software systems. BioNet v2 features a naming schema for mobility and coarse-grained localization information, data normalization within a network-transparent device driver framework, enabling of network communications to non-IP devices, and fine-grained application control of data subscription band width usage. BioNet directly integrates Disruption Tolerant Networking (DTN) as a communications technology, enabling networked communications with assets that are only intermittently connected including orbiting relay satellites and planetary rover vehicles.
Hybrid quantum-classical modeling of quantum dot devices
NASA Astrophysics Data System (ADS)
Kantner, Markus; Mittnenzweig, Markus; Koprucki, Thomas
2017-11-01
The design of electrically driven quantum dot devices for quantum optical applications asks for modeling approaches combining classical device physics with quantum mechanics. We connect the well-established fields of semiclassical semiconductor transport theory and the theory of open quantum systems to meet this requirement. By coupling the van Roosbroeck system with a quantum master equation in Lindblad form, we introduce a new hybrid quantum-classical modeling approach, which provides a comprehensive description of quantum dot devices on multiple scales: it enables the calculation of quantum optical figures of merit and the spatially resolved simulation of the current flow in realistic semiconductor device geometries in a unified way. We construct the interface between both theories in such a way, that the resulting hybrid system obeys the fundamental axioms of (non)equilibrium thermodynamics. We show that our approach guarantees the conservation of charge, consistency with the thermodynamic equilibrium and the second law of thermodynamics. The feasibility of the approach is demonstrated by numerical simulations of an electrically driven single-photon source based on a single quantum dot in the stationary and transient operation regime.
The Unified Floating Point Vector Coprocessor for Reconfigurable Hardware
NASA Astrophysics Data System (ADS)
Kathiara, Jainik
There has been an increased interest recently in using embedded cores on FPGAs. Many of the applications that make use of these cores have floating point operations. Due to the complexity and expense of floating point hardware, these algorithms are usually converted to fixed point operations or implemented using floating-point emulation in software. As the technology advances, more and more homogeneous computational resources and fixed function embedded blocks are added to FPGAs and hence implementation of floating point hardware becomes a feasible option. In this research we have implemented a high performance, autonomous floating point vector Coprocessor (FPVC) that works independently within an embedded processor system. We have presented a unified approach to vector and scalar computation, using a single register file for both scalar operands and vector elements. The Hybrid vector/SIMD computational model of FPVC results in greater overall performance for most applications along with improved peak performance compared to other approaches. By parameterizing vector length and the number of vector lanes, we can design an application specific FPVC and take optimal advantage of the FPGA fabric. For this research we have also initiated designing a software library for various computational kernels, each of which adapts FPVC's configuration and provide maximal performance. The kernels implemented are from the area of linear algebra and include matrix multiplication and QR and Cholesky decomposition. We have demonstrated the operation of FPVC on a Xilinx Virtex 5 using the embedded PowerPC.
NASA Astrophysics Data System (ADS)
Lin, S. J.
2015-12-01
The NOAA/Geophysical Fluid Dynamics Laboratory has been developing a unified regional-global modeling system with variable resolution capabilities that can be used for severe weather predictions (e.g., tornado outbreak events and cat-5 hurricanes) and ultra-high-resolution (1-km) regional climate simulations within a consistent global modeling framework. The fundation of this flexible regional-global modeling system is the non-hydrostatic extension of the vertically Lagrangian dynamical core (Lin 2004, Monthly Weather Review) known in the community as FV3 (finite-volume on the cubed-sphere). Because of its flexability and computational efficiency, the FV3 is one of the final candidates of NOAA's Next Generation Global Prediction System (NGGPS). We have built into the modeling system a stretched (single) grid capability, a two-way (regional-global) multiple nested grid capability, and the combination of the stretched and two-way nests, so as to make convection-resolving regional climate simulation within a consistent global modeling system feasible using today's High Performance Computing System. One of our main scientific goals is to enable simulations of high impact weather phenomena (such as tornadoes, thunderstorms, category-5 hurricanes) within an IPCC-class climate modeling system previously regarded as impossible. In this presentation I will demonstrate that it is computationally feasible to simulate not only super-cell thunderstorms, but also the subsequent genesis of tornadoes using a global model that was originally designed for century long climate simulations. As a unified weather-climate modeling system, we evaluated the performance of the model with horizontal resolution ranging from 1 km to as low as 200 km. In particular, for downscaling studies, we have developed various tests to ensure that the large-scale circulation within the global varaible resolution system is well simulated while at the same time the small-scale can be accurately captured within the targeted high resolution region.
Gschwind, Michael K
2013-04-16
Mechanisms for generating and executing programs for a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA) are provided. A computer program product comprising a computer recordable medium having a computer readable program recorded thereon is provided. The computer readable program, when executed on a computing device, causes the computing device to receive one or more instructions and execute the one or more instructions using logic in an execution unit of the computing device. The logic implements a floating point (FP) only single instruction multiple data (SIMD) instruction set architecture (ISA), based on data stored in a vector register file of the computing device. The vector register file is configured to store both scalar and floating point values as vectors having a plurality of vector elements.
Local rollback for fault-tolerance in parallel computing systems
Blumrich, Matthias A [Yorktown Heights, NY; Chen, Dong [Yorktown Heights, NY; Gara, Alan [Yorktown Heights, NY; Giampapa, Mark E [Yorktown Heights, NY; Heidelberger, Philip [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Steinmacher-Burow, Burkhard [Boeblingen, DE; Sugavanam, Krishnan [Yorktown Heights, NY
2012-01-24
A control logic device performs a local rollback in a parallel super computing system. The super computing system includes at least one cache memory device. The control logic device determines a local rollback interval. The control logic device runs at least one instruction in the local rollback interval. The control logic device evaluates whether an unrecoverable condition occurs while running the at least one instruction during the local rollback interval. The control logic device checks whether an error occurs during the local rollback. The control logic device restarts the local rollback interval if the error occurs and the unrecoverable condition does not occur during the local rollback interval.
Research on computer virus database management system
NASA Astrophysics Data System (ADS)
Qi, Guoquan
2011-12-01
The growing proliferation of computer viruses becomes the lethal threat and research focus of the security of network information. While new virus is emerging, the number of viruses is growing, virus classification increasing complex. Virus naming because of agencies' capture time differences can not be unified. Although each agency has its own virus database, the communication between each other lacks, or virus information is incomplete, or a small number of sample information. This paper introduces the current construction status of the virus database at home and abroad, analyzes how to standardize and complete description of virus characteristics, and then gives the information integrity, storage security and manageable computer virus database design scheme.
Experimental and Computational Study of Ductile Fracture in Small Punch Tests
Bargmann, Swantje; Hähner, Peter
2017-01-01
A unified experimental-computational study on ductile fracture initiation and propagation during small punch testing is presented. Tests are carried out at room temperature with unnotched disks of different thicknesses where large-scale yielding prevails. In thinner specimens, the fracture occurs with severe necking under membrane tension, whereas for thicker ones a through thickness shearing mode prevails changing the crack orientation relative to the loading direction. Computational studies involve finite element simulations using a shear modified Gurson-Tvergaard-Needleman porous plasticity model with an integral-type nonlocal formulation. The predicted punch load-displacement curves and deformed profiles are in good agreement with the experimental results. PMID:29039748
Experimental and Computational Study of Ductile Fracture in Small Punch Tests.
Gülçimen Çakan, Betül; Soyarslan, Celal; Bargmann, Swantje; Hähner, Peter
2017-10-17
A unified experimental-computational study on ductile fracture initiation and propagation during small punch testing is presented. Tests are carried out at room temperature with unnotched disks of different thicknesses where large-scale yielding prevails. In thinner specimens, the fracture occurs with severe necking under membrane tension, whereas for thicker ones a through thickness shearing mode prevails changing the crack orientation relative to the loading direction. Computational studies involve finite element simulations using a shear modified Gurson-Tvergaard-Needleman porous plasticity model with an integral-type nonlocal formulation. The predicted punch load-displacement curves and deformed profiles are in good agreement with the experimental results.
Computation of free energy profiles with parallel adaptive dynamics
NASA Astrophysics Data System (ADS)
Lelièvre, Tony; Rousset, Mathias; Stoltz, Gabriel
2007-04-01
We propose a formulation of an adaptive computation of free energy differences, in the adaptive biasing force or nonequilibrium metadynamics spirit, using conditional distributions of samples of configurations which evolve in time. This allows us to present a truly unifying framework for these methods, and to prove convergence results for certain classes of algorithms. From a numerical viewpoint, a parallel implementation of these methods is very natural, the replicas interacting through the reconstructed free energy. We demonstrate how to improve this parallel implementation by resorting to some selection mechanism on the replicas. This is illustrated by computations on a model system of conformational changes.
Embedded-Based Graphics Processing Unit Cluster Platform for Multiple Sequence Alignments
Wei, Jyh-Da; Cheng, Hui-Jun; Lin, Chun-Yuan; Ye, Jin; Yeh, Kuan-Yu
2017-01-01
High-end graphics processing units (GPUs), such as NVIDIA Tesla/Fermi/Kepler series cards with thousands of cores per chip, are widely applied to high-performance computing fields in a decade. These desktop GPU cards should be installed in personal computers/servers with desktop CPUs, and the cost and power consumption of constructing a GPU cluster platform are very high. In recent years, NVIDIA releases an embedded board, called Jetson Tegra K1 (TK1), which contains 4 ARM Cortex-A15 CPUs and 192 Compute Unified Device Architecture cores (belong to Kepler GPUs). Jetson Tegra K1 has several advantages, such as the low cost, low power consumption, and high applicability, and it has been applied into several specific applications. In our previous work, a bioinformatics platform with a single TK1 (STK platform) was constructed, and this previous work is also used to prove that the Web and mobile services can be implemented in the STK platform with a good cost-performance ratio by comparing a STK platform with the desktop CPU and GPU. In this work, an embedded-based GPU cluster platform will be constructed with multiple TK1s (MTK platform). Complex system installation and setup are necessary procedures at first. Then, 2 job assignment modes are designed for the MTK platform to provide services for users. Finally, ClustalW v2.0.11 and ClustalWtk will be ported to the MTK platform. The experimental results showed that the speedup ratios achieved 5.5 and 4.8 times for ClustalW v2.0.11 and ClustalWtk, respectively, by comparing 6 TK1s with a single TK1. The MTK platform is proven to be useful for multiple sequence alignments. PMID:28835734
NASA Astrophysics Data System (ADS)
Ford, Eric B.
2009-05-01
We present the results of a highly parallel Kepler equation solver using the Graphics Processing Unit (GPU) on a commercial nVidia GeForce 280GTX and the "Compute Unified Device Architecture" (CUDA) programming environment. We apply this to evaluate a goodness-of-fit statistic (e.g., χ2) for Doppler observations of stars potentially harboring multiple planetary companions (assuming negligible planet-planet interactions). Given the high-dimensionality of the model parameter space (at least five dimensions per planet), a global search is extremely computationally demanding. We expect that the underlying Kepler solver and model evaluator will be combined with a wide variety of more sophisticated algorithms to provide efficient global search, parameter estimation, model comparison, and adaptive experimental design for radial velocity and/or astrometric planet searches. We tested multiple implementations using single precision, double precision, pairs of single precision, and mixed precision arithmetic. We find that the vast majority of computations can be performed using single precision arithmetic, with selective use of compensated summation for increased precision. However, standard single precision is not adequate for calculating the mean anomaly from the time of observation and orbital period when evaluating the goodness-of-fit for real planetary systems and observational data sets. Using all double precision, our GPU code outperforms a similar code using a modern CPU by a factor of over 60. Using mixed precision, our GPU code provides a speed-up factor of over 600, when evaluating nsys > 1024 models planetary systems each containing npl = 4 planets and assuming nobs = 256 observations of each system. We conclude that modern GPUs also offer a powerful tool for repeatedly evaluating Kepler's equation and a goodness-of-fit statistic for orbital models when presented with a large parameter space.
Specialized Computer Systems for Environment Visualization
NASA Astrophysics Data System (ADS)
Al-Oraiqat, Anas M.; Bashkov, Evgeniy A.; Zori, Sergii A.
2018-06-01
The need for real time image generation of landscapes arises in various fields as part of tasks solved by virtual and augmented reality systems, as well as geographic information systems. Such systems provide opportunities for collecting, storing, analyzing and graphically visualizing geographic data. Algorithmic and hardware software tools for increasing the realism and efficiency of the environment visualization in 3D visualization systems are proposed. This paper discusses a modified path tracing algorithm with a two-level hierarchy of bounding volumes and finding intersections with Axis-Aligned Bounding Box. The proposed algorithm eliminates the branching and hence makes the algorithm more suitable to be implemented on the multi-threaded CPU and GPU. A modified ROAM algorithm is used to solve the qualitative visualization of reliefs' problems and landscapes. The algorithm is implemented on parallel systems—cluster and Compute Unified Device Architecture-networks. Results show that the implementation on MPI clusters is more efficient than Graphics Processing Unit/Graphics Processing Clusters and allows real-time synthesis. The organization and algorithms of the parallel GPU system for the 3D pseudo stereo image/video synthesis are proposed. With realizing possibility analysis on a parallel GPU-architecture of each stage, 3D pseudo stereo synthesis is performed. An experimental prototype of a specialized hardware-software system 3D pseudo stereo imaging and video was developed on the CPU/GPU. The experimental results show that the proposed adaptation of 3D pseudo stereo imaging to the architecture of GPU-systems is efficient. Also it accelerates the computational procedures of 3D pseudo-stereo synthesis for the anaglyph and anamorphic formats of the 3D stereo frame without performing optimization procedures. The acceleration is on average 11 and 54 times for test GPUs.
Considerations for Software Defined Networking (SDN): Approaches and use cases
NASA Astrophysics Data System (ADS)
Bakshi, K.
Software Defined Networking (SDN) is an evolutionary approach to network design and functionality based on the ability to programmatically modify the behavior of network devices. SDN uses user-customizable and configurable software that's independent of hardware to enable networked systems to expand data flow control. SDN is in large part about understanding and managing a network as a unified abstraction. It will make networks more flexible, dynamic, and cost-efficient, while greatly simplifying operational complexity. And this advanced solution provides several benefits including network and service customizability, configurability, improved operations, and increased performance. There are several approaches to SDN and its practical implementation. Among them, two have risen to prominence with differences in pedigree and implementation. This paper's main focus will be to define, review, and evaluate salient approaches and use cases of the OpenFlow and Virtual Network Overlay approaches to SDN. OpenFlow is a communication protocol that gives access to the forwarding plane of a network's switches and routers. The Virtual Network Overlay relies on a completely virtualized network infrastructure and services to abstract the underlying physical network, which allows the overlay to be mobile to other physical networks. This is an important requirement for cloud computing, where applications and associated network services are migrated to cloud service providers and remote data centers on the fly as resource demands dictate. The paper will discuss how and where SDN can be applied and implemented, including research and academia, virtual multitenant data center, and cloud computing applications. Specific attention will be given to the cloud computing use case, where automated provisioning and programmable overlay for scalable multi-tenancy is leveraged via the SDN approach.
Embedded-Based Graphics Processing Unit Cluster Platform for Multiple Sequence Alignments.
Wei, Jyh-Da; Cheng, Hui-Jun; Lin, Chun-Yuan; Ye, Jin; Yeh, Kuan-Yu
2017-01-01
High-end graphics processing units (GPUs), such as NVIDIA Tesla/Fermi/Kepler series cards with thousands of cores per chip, are widely applied to high-performance computing fields in a decade. These desktop GPU cards should be installed in personal computers/servers with desktop CPUs, and the cost and power consumption of constructing a GPU cluster platform are very high. In recent years, NVIDIA releases an embedded board, called Jetson Tegra K1 (TK1), which contains 4 ARM Cortex-A15 CPUs and 192 Compute Unified Device Architecture cores (belong to Kepler GPUs). Jetson Tegra K1 has several advantages, such as the low cost, low power consumption, and high applicability, and it has been applied into several specific applications. In our previous work, a bioinformatics platform with a single TK1 (STK platform) was constructed, and this previous work is also used to prove that the Web and mobile services can be implemented in the STK platform with a good cost-performance ratio by comparing a STK platform with the desktop CPU and GPU. In this work, an embedded-based GPU cluster platform will be constructed with multiple TK1s (MTK platform). Complex system installation and setup are necessary procedures at first. Then, 2 job assignment modes are designed for the MTK platform to provide services for users. Finally, ClustalW v2.0.11 and ClustalWtk will be ported to the MTK platform. The experimental results showed that the speedup ratios achieved 5.5 and 4.8 times for ClustalW v2.0.11 and ClustalWtk, respectively, by comparing 6 TK1s with a single TK1. The MTK platform is proven to be useful for multiple sequence alignments.
Leaving patients to their own devices? Smart technology, safety and therapeutic relationships.
Ho, Anita; Quick, Oliver
2018-03-06
This debate article explores how smart technologies may create a double-edged sword for patient safety and effective therapeutic relationships. Increasing utilization of health monitoring devices by patients will likely become an important aspect of self-care and preventive medicine. It may also help to enhance accurate symptom reports, diagnoses, and prompt referral to specialist care where appropriate. However, the development, marketing, and use of such technology raise significant ethical implications for therapeutic relationships and patient safety. Drawing on lessons learned from other direct-to-consumer health products such as genetic testing, this article explores how smart technology can also pose regulatory challenges and encourage overutilization of healthcare services. In order for smart technology to promote safer care and effective therapeutic encounters, the technology and its utilization must be safe. This article argues for unified regulatory guidelines and better education for both healthcare providers and patients regarding the benefits and risks of these devices.
Photo-generated carriers lose energy during extraction from polymer-fullerene solar cells
Melianas, Armantas; Etzold, Fabian; Savenije, Tom J.; Laquai, Frédéric; Inganäs, Olle; Kemerink, Martijn
2015-01-01
In photovoltaic devices, the photo-generated charge carriers are typically assumed to be in thermal equilibrium with the lattice. In conventional materials, this assumption is experimentally justified as carrier thermalization completes before any significant carrier transport has occurred. Here, we demonstrate by unifying time-resolved optical and electrical experiments and Monte Carlo simulations over an exceptionally wide dynamic range that in the case of organic photovoltaic devices, this assumption is invalid. As the photo-generated carriers are transported to the electrodes, a substantial amount of their energy is lost by continuous thermalization in the disorder broadened density of states. Since thermalization occurs downward in energy, carrier motion is boosted by this process, leading to a time-dependent carrier mobility as confirmed by direct experiments. We identify the time and distance scales relevant for carrier extraction and show that the photo-generated carriers are extracted from the operating device before reaching thermal equilibrium. PMID:26537357
NASA Astrophysics Data System (ADS)
Park, C.; Kim, Y.; Jang, H.
2016-12-01
Poor temporal distribution of precipitation increases winter drought risks in mountain valley areas in Korea. Since perennial streams or reservoirs for water use are rare in the areas, groundwater is usually a major water resource. Significant amount of the precipitation contributing groundwater recharge mostly occurs during the summer season. However, a volume of groundwater recharge is limited by rapid runoff because of the topographic characteristics such as steep hill and slope. A groundwater reservoir using artificial recharge method with rain water reuse can be a suitable solution to secure water resource for the mountain valley areas. Successful groundwater reservoir design depends on optimization of well placement and operation. This study introduces a combined approach using GA (Genetic Algorithm) and MODFLOW and its rapid application. The methodology is based on RAD (Rapid Application Development) concept in order to minimize the cost of implementation. DEAP (Distributed Evolutionary Algorithms in Python), a framework for prototyping and testing evolutionary algorithms, is applied for quick code development and CUDA (Compute Unified Device Architecture), a parallel computing platform using GPU (Graphics Processing Unit), is introduced to reduce runtime. The application was successfully applied to Samdeok-ri, Gosung, Korea. The site is located in a mountain valley area and unconfined aquifers are major source of water use. The results of the application produced the best location and optimized operation schedule of wells including pumping and injecting.
Ichihashi, Yasuyuki; Oi, Ryutaro; Senoh, Takanori; Yamamoto, Kenji; Kurita, Taiichiro
2012-09-10
We developed a real-time capture and reconstruction system for three-dimensional (3D) live scenes. In previous research, we used integral photography (IP) to capture 3D images and then generated holograms from the IP images to implement a real-time reconstruction system. In this paper, we use a 4K (3,840 × 2,160) camera to capture IP images and 8K (7,680 × 4,320) liquid crystal display (LCD) panels for the reconstruction of holograms. We investigate two methods for enlarging the 4K images that were captured by integral photography to 8K images. One of the methods increases the number of pixels of each elemental image. The other increases the number of elemental images. In addition, we developed a personal computer (PC) cluster system with graphics processing units (GPUs) for the enlargement of IP images and the generation of holograms from the IP images using fast Fourier transform (FFT). We used the Compute Unified Device Architecture (CUDA) as the development environment for the GPUs. The Fast Fourier transform is performed using the CUFFT (CUDA FFT) library. As a result, we developed an integrated system for performing all processing from the capture to the reconstruction of 3D images by using these components and successfully used this system to reconstruct a 3D live scene at 12 frames per second.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-08
... Phones and Tablet Computers, and Components Thereof; Notice of Receipt of Complaint; Solicitation of... entitled Certain Electronic Devices, Including Mobile Phones and Tablet Computers, and Components Thereof... the United States after importation of certain electronic devices, including mobile phones and tablet...
Computation offloading for real-time health-monitoring devices.
Kalantarian, Haik; Sideris, Costas; Tuan Le; Hosseini, Anahita; Sarrafzadeh, Majid
2016-08-01
Among the major challenges in the development of real-time wearable health monitoring systems is to optimize battery life. One of the major techniques with which this objective can be achieved is computation offloading, in which portions of computation can be partitioned between the device and other resources such as a server or cloud. In this paper, we describe a novel dynamic computation offloading scheme for real-time wearable health monitoring devices that adjusts the partitioning of data between the wearable device and mobile application as a function of desired classification accuracy.
Technical Note: scuda: A software platform for cumulative dose assessment.
Park, Seyoun; McNutt, Todd; Plishker, William; Quon, Harry; Wong, John; Shekhar, Raj; Lee, Junghoon
2016-10-01
Accurate tracking of anatomical changes and computation of actually delivered dose to the patient are critical for successful adaptive radiation therapy (ART). Additionally, efficient data management and fast processing are practically important for the adoption in clinic as ART involves a large amount of image and treatment data. The purpose of this study was to develop an accurate and efficient Software platform for CUmulative Dose Assessment (scuda) that can be seamlessly integrated into the clinical workflow. scuda consists of deformable image registration (DIR), segmentation, dose computation modules, and a graphical user interface. It is connected to our image PACS and radiotherapy informatics databases from which it automatically queries/retrieves patient images, radiotherapy plan, beam data, and daily treatment information, thus providing an efficient and unified workflow. For accurate registration of the planning CT and daily CBCTs, the authors iteratively correct CBCT intensities by matching local intensity histograms during the DIR process. Contours of the target tumor and critical structures are then propagated from the planning CT to daily CBCTs using the computed deformations. The actual delivered daily dose is computed using the registered CT and patient setup information by a superposition/convolution algorithm, and accumulated using the computed deformation fields. Both DIR and dose computation modules are accelerated by a graphics processing unit. The cumulative dose computation process has been validated on 30 head and neck (HN) cancer cases, showing 3.5 ± 5.0 Gy (mean±STD) absolute mean dose differences between the planned and the actually delivered doses in the parotid glands. On average, DIR, dose computation, and segmentation take 20 s/fraction and 17 min for a 35-fraction treatment including additional computation for dose accumulation. The authors developed a unified software platform that provides accurate and efficient monitoring of anatomical changes and computation of actually delivered dose to the patient, thus realizing an efficient cumulative dose computation workflow. Evaluation on HN cases demonstrated the utility of our platform for monitoring the treatment quality and detecting significant dosimetric variations that are keys to successful ART.
Technical Note: SCUDA: A software platform for cumulative dose assessment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Park, Seyoun; McNutt, Todd; Quon, Harry
Purpose: Accurate tracking of anatomical changes and computation of actually delivered dose to the patient are critical for successful adaptive radiation therapy (ART). Additionally, efficient data management and fast processing are practically important for the adoption in clinic as ART involves a large amount of image and treatment data. The purpose of this study was to develop an accurate and efficient Software platform for CUmulative Dose Assessment (SCUDA) that can be seamlessly integrated into the clinical workflow. Methods: SCUDA consists of deformable image registration (DIR), segmentation, dose computation modules, and a graphical user interface. It is connected to our imagemore » PACS and radiotherapy informatics databases from which it automatically queries/retrieves patient images, radiotherapy plan, beam data, and daily treatment information, thus providing an efficient and unified workflow. For accurate registration of the planning CT and daily CBCTs, the authors iteratively correct CBCT intensities by matching local intensity histograms during the DIR process. Contours of the target tumor and critical structures are then propagated from the planning CT to daily CBCTs using the computed deformations. The actual delivered daily dose is computed using the registered CT and patient setup information by a superposition/convolution algorithm, and accumulated using the computed deformation fields. Both DIR and dose computation modules are accelerated by a graphics processing unit. Results: The cumulative dose computation process has been validated on 30 head and neck (HN) cancer cases, showing 3.5 ± 5.0 Gy (mean±STD) absolute mean dose differences between the planned and the actually delivered doses in the parotid glands. On average, DIR, dose computation, and segmentation take 20 s/fraction and 17 min for a 35-fraction treatment including additional computation for dose accumulation. Conclusions: The authors developed a unified software platform that provides accurate and efficient monitoring of anatomical changes and computation of actually delivered dose to the patient, thus realizing an efficient cumulative dose computation workflow. Evaluation on HN cases demonstrated the utility of our platform for monitoring the treatment quality and detecting significant dosimetric variations that are keys to successful ART.« less
Parallel hyperspectral compressive sensing method on GPU
NASA Astrophysics Data System (ADS)
Bernabé, Sergio; Martín, Gabriel; Nascimento, José M. P.
2015-10-01
Remote hyperspectral sensors collect large amounts of data per flight usually with low spatial resolution. It is known that the bandwidth connection between the satellite/airborne platform and the ground station is reduced, thus a compression onboard method is desirable to reduce the amount of data to be transmitted. This paper presents a parallel implementation of an compressive sensing method, called parallel hyperspectral coded aperture (P-HYCA), for graphics processing units (GPU) using the compute unified device architecture (CUDA). This method takes into account two main properties of hyperspectral dataset, namely the high correlation existing among the spectral bands and the generally low number of endmembers needed to explain the data, which largely reduces the number of measurements necessary to correctly reconstruct the original data. Experimental results conducted using synthetic and real hyperspectral datasets on two different GPU architectures by NVIDIA: GeForce GTX 590 and GeForce GTX TITAN, reveal that the use of GPUs can provide real-time compressive sensing performance. The achieved speedup is up to 20 times when compared with the processing time of HYCA running on one core of the Intel i7-2600 CPU (3.4GHz), with 16 Gbyte memory.
NASA Astrophysics Data System (ADS)
Sanchez del Rio, Manuel; Dejus, Roger J.
1997-11-01
XOP (X-ray OPtics utilities) is a graphical user interface (GUI) created to execute several computer programs that calculate the basic information needed by a synchrotron beamline scientist (designer or experimentalist). Typical examples of such calculations are: insertion device (undulator or wiggler) spectral and angular distributions, mirror and multilayer reflectivities, and crystal diffraction profiles. All programs are provided to the user under a unified GUI, which greatly simplifies their execution. The XOP optics applications (especially mirror calculations) take their basic input (optical constants, compound and mixture tables) from a flexible file-oriented database, which allows the user to select data from a large number of choices and also to customize their own data sets. XOP includes many mathematical and visualization capabilities. It also permits the combination of reflectivities from several mirrors and filters, and their effect, onto a source spectrum. This feature is very useful when calculating thermal load on a series of optical elements. The XOP interface is written in the IDL (Interactive Data Language). An embedded version of XOP, which freely runs under most Unix platforms (HP, Sun, Dec, Linux, etc) and under Windows95 and NT, is available upon request.
Molecular Monte Carlo Simulations Using Graphics Processing Units: To Waste Recycle or Not?
Kim, Jihan; Rodgers, Jocelyn M; Athènes, Manuel; Smit, Berend
2011-10-11
In the waste recycling Monte Carlo (WRMC) algorithm, (1) multiple trial states may be simultaneously generated and utilized during Monte Carlo moves to improve the statistical accuracy of the simulations, suggesting that such an algorithm may be well posed for implementation in parallel on graphics processing units (GPUs). In this paper, we implement two waste recycling Monte Carlo algorithms in CUDA (Compute Unified Device Architecture) using uniformly distributed random trial states and trial states based on displacement random-walk steps, and we test the methods on a methane-zeolite MFI framework system to evaluate their utility. We discuss the specific implementation details of the waste recycling GPU algorithm and compare the methods to other parallel algorithms optimized for the framework system. We analyze the relationship between the statistical accuracy of our simulations and the CUDA block size to determine the efficient allocation of the GPU hardware resources. We make comparisons between the GPU and the serial CPU Monte Carlo implementations to assess speedup over conventional microprocessors. Finally, we apply our optimized GPU algorithms to the important problem of determining free energy landscapes, in this case for molecular motion through the zeolite LTA.
Jiansen Li; Jianqi Sun; Ying Song; Yanran Xu; Jun Zhao
2014-01-01
An effective way to improve the data acquisition speed of magnetic resonance imaging (MRI) is using under-sampled k-space data, and dictionary learning method can be used to maintain the reconstruction quality. Three-dimensional dictionary trains the atoms in dictionary in the form of blocks, which can utilize the spatial correlation among slices. Dual-dictionary learning method includes a low-resolution dictionary and a high-resolution dictionary, for sparse coding and image updating respectively. However, the amount of data is huge for three-dimensional reconstruction, especially when the number of slices is large. Thus, the procedure is time-consuming. In this paper, we first utilize the NVIDIA Corporation's compute unified device architecture (CUDA) programming model to design the parallel algorithms on graphics processing unit (GPU) to accelerate the reconstruction procedure. The main optimizations operate in the dictionary learning algorithm and the image updating part, such as the orthogonal matching pursuit (OMP) algorithm and the k-singular value decomposition (K-SVD) algorithm. Then we develop another version of CUDA code with algorithmic optimization. Experimental results show that more than 324 times of speedup is achieved compared with the CPU-only codes when the number of MRI slices is 24.
Physarum machines: encapsulating reaction-diffusion to compute spanning tree
NASA Astrophysics Data System (ADS)
Adamatzky, Andrew
2007-12-01
The Physarum machine is a biological computing device, which employs plasmodium of Physarum polycephalum as an unconventional computing substrate. A reaction-diffusion computer is a chemical computing device that computes by propagating diffusive or excitation wave fronts. Reaction-diffusion computers, despite being computationally universal machines, are unable to construct certain classes of proximity graphs without the assistance of an external computing device. I demonstrate that the problem can be solved if the reaction-diffusion system is enclosed in a membrane with few ‘growth points’, sites guiding the pattern propagation. Experimental approximation of spanning trees by P. polycephalum slime mold demonstrates the feasibility of the approach. Findings provided advance theory of reaction-diffusion computation by enriching it with ideas of slime mold computation.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-11-08
... Communication Devices, Portable Music and Data Processing Devices, Computers and Components Thereof; Notice of... within the United States after importation of certain wireless communication devices, portable music and... music and data processing devices, computers and components thereof that infringe one or more of claim...
Indiva: a middleware for managing distributed media environment
NASA Astrophysics Data System (ADS)
Ooi, Wei-Tsang; Pletcher, Peter; Rowe, Lawrence A.
2003-12-01
This paper presents a unified set of abstractions and operations for hardware devices, software processes, and media data in a distributed audio and video environment. These abstractions, which are provided through a middleware layer called Indiva, use a file system metaphor to access resources and high-level commands to simplify the development of Internet webcast and distributed collaboration control applications. The design and implementation of Indiva are described and examples are presented to illustrate the usefulness of the abstractions.
Processing Device for High-Speed Execution of an Xrisc Computer Program
NASA Technical Reports Server (NTRS)
Ng, Tak-Kwong (Inventor); Mills, Carl S. (Inventor)
2016-01-01
A processing device for high-speed execution of a computer program is provided. A memory module may store one or more computer programs. A sequencer may select one of the computer programs and controls execution of the selected program. A register module may store intermediate values associated with a current calculation set, a set of output values associated with a previous calculation set, and a set of input values associated with a subsequent calculation set. An external interface may receive the set of input values from a computing device and provides the set of output values to the computing device. A computation interface may provide a set of operands for computation during processing of the current calculation set. The set of input values are loaded into the register and the set of output values are unloaded from the register in parallel with processing of the current calculation set.
Toward a computational theory for motion understanding: The expert animators model
NASA Technical Reports Server (NTRS)
Mohamed, Ahmed S.; Armstrong, William W.
1988-01-01
Artificial intelligence researchers claim to understand some aspect of human intelligence when their model is able to emulate it. In the context of computer graphics, the ability to go from motion representation to convincing animation should accordingly be treated not simply as a trick for computer graphics programmers but as important epistemological and methodological goal. In this paper we investigate a unifying model for animating a group of articulated bodies such as humans and robots in a three-dimensional environment. The proposed model is considered in the framework of knowledge representation and processing, with special reference to motion knowledge. The model is meant to help setting the basis for a computational theory for motion understanding applied to articulated bodies.
NASA Technical Reports Server (NTRS)
Tomayko, James E.
1986-01-01
Twenty-five years of spacecraft onboard computer development have resulted in a better understanding of the requirements for effective, efficient, and fault tolerant flight computer systems. Lessons from eight flight programs (Gemini, Apollo, Skylab, Shuttle, Mariner, Voyager, and Galileo) and three reserach programs (digital fly-by-wire, STAR, and the Unified Data System) are useful in projecting the computer hardware configuration of the Space Station and the ways in which the Ada programming language will enhance the development of the necessary software. The evolution of hardware technology, fault protection methods, and software architectures used in space flight in order to provide insight into the pending development of such items for the Space Station are reviewed.
A computer program for predicting nonlinear uniaxial material responses using viscoplastic models
NASA Technical Reports Server (NTRS)
Chang, T. Y.; Thompson, R. L.
1984-01-01
A computer program was developed for predicting nonlinear uniaxial material responses using viscoplastic constitutive models. Four specific models, i.e., those due to Miller, Walker, Krieg-Swearengen-Rhode, and Robinson, are included. Any other unified model is easily implemented into the program in the form of subroutines. Analysis features include stress-strain cycling, creep response, stress relaxation, thermomechanical fatigue loop, or any combination of these responses. An outline is given on the theoretical background of uniaxial constitutive models, analysis procedure, and numerical integration methods for solving the nonlinear constitutive equations. In addition, a discussion on the computer program implementation is also given. Finally, seven numerical examples are included to demonstrate the versatility of the computer program developed.
Multi-input and binary reproducible, high bandwidth floating point adder in a collective network
Chen, Dong; Eisley, Noel A.; Heidelberger, Philip; Steinmacher-Burow, Burkhard
2016-11-15
To add floating point numbers in a parallel computing system, a collective logic device receives the floating point numbers from computing nodes. The collective logic devices converts the floating point numbers to integer numbers. The collective logic device adds the integer numbers and generating a summation of the integer numbers. The collective logic device converts the summation to a floating point number. The collective logic device performs the receiving, the converting the floating point numbers, the adding, the generating and the converting the summation in one pass. One pass indicates that the computing nodes send inputs only once to the collective logic device and receive outputs only once from the collective logic device.
Gender Divide and Acceptance of Collaborative Web 2.0 Applications for Learning in Higher Education
ERIC Educational Resources Information Center
Huang, Wen-Hao David; Hood, Denice Ward; Yoo, Sun Joo
2013-01-01
Situated in the gender digital divide framework, this survey study investigated the role of computer anxiety in influencing female college students' perceptions toward Web 2.0 applications for learning. Based on 432 college students' "Web 2.0 for learning" perception ratings collected by relevant categories of "Unified Theory of Acceptance and Use…
Global Weather Prediction and High-End Computing at NASA
NASA Technical Reports Server (NTRS)
Lin, Shian-Jiann; Atlas, Robert; Yeh, Kao-San
2003-01-01
We demonstrate current capabilities of the NASA finite-volume General Circulation Model an high-resolution global weather prediction, and discuss its development path in the foreseeable future. This model can be regarded as a prototype of a future NASA Earth modeling system intended to unify development activities cutting across various disciplines within the NASA Earth Science Enterprise.
A Vision on the Status and Evolution of HEP Physics Software Tools
DOE Office of Scientific and Technical Information (OSTI.GOV)
Canal, P.; Elvira, D.; Hatcher, R.
2013-07-28
This paper represents the vision of the members of the Fermilab Scientific Computing Division's Computational Physics Department (SCD-CPD) on the status and the evolution of various HEP software tools such as the Geant4 detector simulation toolkit, the Pythia and GENIE physics generators, and the ROOT data analysis framework. The goal of this paper is to contribute ideas to the Snowmass 2013 process toward the composition of a unified document on the current status and potential evolution of the physics software tools which are essential to HEP.
Computing an operating parameter of a unified power flow controller
Wilson, David G.; Robinett, III, Rush D.
2017-12-26
A Unified Power Flow Controller described herein comprises a sensor that outputs at least one sensed condition, a processor that receives the at least one sensed condition, a memory that comprises control logic that is executable by the processor; and power electronics that comprise power storage, wherein the processor causes the power electronics to selectively cause the power storage to act as one of a power generator or a load based at least in part upon the at least one sensed condition output by the sensor and the control logic, and wherein at least one operating parameter of the power electronics is designed to facilitate maximal transmittal of electrical power generated at a variable power generation system to a grid system while meeting power constraints set forth by the electrical power grid.
On numerical integration and computer implementation of viscoplastic models
NASA Technical Reports Server (NTRS)
Chang, T. Y.; Chang, J. P.; Thompson, R. L.
1985-01-01
Due to the stringent design requirement for aerospace or nuclear structural components, considerable research interests have been generated on the development of constitutive models for representing the inelastic behavior of metals at elevated temperatures. In particular, a class of unified theories (or viscoplastic constitutive models) have been proposed to simulate material responses such as cyclic plasticity, rate sensitivity, creep deformations, strain hardening or softening, etc. This approach differs from the conventional creep and plasticity theory in that both the creep and plastic deformations are treated as unified time-dependent quantities. Although most of viscoplastic models give better material behavior representation, the associated constitutive differential equations have stiff regimes which present numerical difficulties in time-dependent analysis. In this connection, appropriate solution algorithm must be developed for viscoplastic analysis via finite element method.
Coltrin, Michael E.; Kee, Robert J.
2016-06-18
This paper develops a unified analysis of stagnation flow heat and mass transport, considering both semi-infinite domains and finite gaps, with and without rotation of the stagnation surface. An important objective is to derive Nusselt- and Sherwood-number correlations that represent heat and mass transport at the stagnation surface. The approach is based on computationally solving the governing conservation equations in similarity form as a boundary-value problem. The formulation considers ideal gases and incompressible fluids. The correlated results depend on fluid properties in terms of Prandtl, Schmidt, and Damkohler numbers. Heterogeneous chemistry at the stagnation surface is represented as a singlemore » first-order reaction. A composite Reynolds number represents the combination of stagnation flows with and without stagnation-surface rotation.« less
NASA Technical Reports Server (NTRS)
Rozendaal, Rodger A.; Behbehani, Roxanna
1990-01-01
NASA initiated the Variable Sweep Transition Flight Experiment (VSTFE) to establish a boundary layer transition database for laminar flow wing design. For this experiment, full-span upper surface gloves were fitted to a variable sweep F-14 aircraft. The development of an improved laminar boundary layer stability analysis system called the Unified Stability System (USS) is documented and results of its use on the VSTFE flight data are shown. The USS consists of eight computer codes. The theoretical background of the system is described, as is the input, output, and usage hints. The USS is capable of analyzing boundary layer stability over a wide range of disturbance frequencies and orientations, making it possible to use different philosophies in calculating the growth of disturbances on sweptwings.
Application-oriented integrated control center (AICC) for heterogeneous optical networks
NASA Astrophysics Data System (ADS)
Zhao, Yongli; Zhang, Jie; Cao, Xuping; Wang, Dajiang; Wu, Koubo; Cai, Yinxiang; Gu, Wanyi
2011-12-01
Various broad bandwidth services have being swallowing the bandwidth resource of optical networks, such as the data center application and cloud computation. There are still some challenges for future optical networks although the available bandwidth is increasing with the development of transmission technologies. The relationship between upper application layer and lower network resource layer is necessary to be researched further. In order to improve the efficiency of network resources and capability of service provisioning, heterogeneous optical networks resource can be abstracted as unified Application Programming Interfaces (APIs) which can be open to various upper applications through Application-oriented Integrated Control Center (AICC) proposed in the paper. A novel Openflow-based unified control architecture is proposed for the optimization of cross layer resources. Numeric results show good performance of AICC through simulation experiments.
Computing an operating parameter of a unified power flow controller
Wilson, David G; Robinett, III, Rush D
2015-01-06
A Unified Power Flow Controller described herein comprises a sensor that outputs at least one sensed condition, a processor that receives the at least one sensed condition, a memory that comprises control logic that is executable by the processor; and power electronics that comprise power storage, wherein the processor causes the power electronics to selectively cause the power storage to act as one of a power generator or a load based at least in part upon the at least one sensed condition output by the sensor and the control logic, and wherein at least one operating parameter of the power electronics is designed to facilitate maximal transmittal of electrical power generated at a variable power generation system to a grid system while meeting power constraints set forth by the electrical power grid.
A unified perspective on robot control - The energy Lyapunov function approach
NASA Technical Reports Server (NTRS)
Wen, John T.
1990-01-01
A unified framework for the stability analysis of robot tracking control is presented. By using an energy-motivated Lyapunov function candidate, the closed-loop stability is shown for a large family of control laws sharing a common structure of proportional and derivative feedback and a model-based feedforward. The feedforward can be zero, partial or complete linearized dynamics, partial or complete nonlinear dynamics, or linearized or nonlinear dynamics with parameter adaptation. As result, the dichotomous approaches to the robot control problem based on the open-loop linearization and nonlinear Lyapunov analysis are both included in this treatment. Furthermore, quantitative estimates of the trade-offs between different schemes in terms of the tracking performance, steady state error, domain of convergence, realtime computation load and required a prior model information are derived.
[Research of joint-robotics-based design of biomechanics testing device on human spine].
Deng, Guoyong; Tian, Lianfang; Mao, Zongyuan
2009-12-01
This paper introduces the hardware and software of a biomechanical robot-based testing device. The bottom control orders, posture and torque data transmission, and the control algorithms are integrated in a unified visual control platform by Visual C+ +, with easy control and management. By using hybrid force-displacement control method to load the human spine, we can test the organizational structure and the force state of the FSU (Functional spinal unit) well, which overcomes the shortcomings due to the separation of the force and displacement measurement, thus greatly improves the measurement accuracy. Also it is esay to identify the spinal degeneration and the load-bearing impact on the organizational structure of the FSU after various types of surgery.
A Model-Based Probabilistic Inversion Framework for Wire Fault Detection Using TDR
NASA Technical Reports Server (NTRS)
Schuet, Stefan R.; Timucin, Dogan A.; Wheeler, Kevin R.
2010-01-01
Time-domain reflectometry (TDR) is one of the standard methods for diagnosing faults in electrical wiring and interconnect systems, with a long-standing history focused mainly on hardware development of both high-fidelity systems for laboratory use and portable hand-held devices for field deployment. While these devices can easily assess distance to hard faults such as sustained opens or shorts, their ability to assess subtle but important degradation such as chafing remains an open question. This paper presents a unified framework for TDR-based chafing fault detection in lossy coaxial cables by combining an S-parameter based forward modeling approach with a probabilistic (Bayesian) inference algorithm. Results are presented for the estimation of nominal and faulty cable parameters from laboratory data.
Creation and use of a survey instrument for comparing mobile computing devices.
Macri, Jennifer M; Lee, Paul P; Silvey, Garry M; Lobach, David F
2005-01-01
Both personal digital assistants (PDAs) and tablet computers have emerged to facilitate data collection at the point of care. However, little research has been reported comparing these mobile computing devices in specific care settings. In this study we present an approach for comparing functionally identical applications on a Palm operating system-based PDA and a Windows-based tablet computer for point-of-care documentation of clinical observations by eye care professionals when caring for patients with diabetes. Eye-care professionals compared the devices through focus group sessions and through validated usability surveys. This poster describes the development and use of the survey instrument used for comparing mobile computing devices.
Computational Hemodynamics Involving Artificial Devices
NASA Technical Reports Server (NTRS)
Kwak, Dochan; Kiris, Cetin; Feiereisen, William (Technical Monitor)
2001-01-01
This paper reports the progress being made towards developing complete blood flow simulation capability in human, especially, in the presence of artificial devices such as valves and ventricular assist devices. Devices modeling poses unique challenges different from computing the blood flow in natural hearts and arteries. There are many elements needed such as flow solvers, geometry modeling including flexible walls, moving boundary procedures and physiological characterization of blood. As a first step, computational technology developed for aerospace applications was extended in the recent past to the analysis and development of mechanical devices. The blood flow in these devices is practically incompressible and Newtonian, and thus various incompressible Navier-Stokes solution procedures can be selected depending on the choice of formulations, variables and numerical schemes. Two primitive variable formulations used are discussed as well as the overset grid approach to handle complex moving geometry. This procedure has been applied to several artificial devices. Among these, recent progress made in developing DeBakey axial flow blood pump will be presented from computational point of view. Computational and clinical issues will be discussed in detail as well as additional work needed.
Boolean and brain-inspired computing using spin-transfer torque devices
NASA Astrophysics Data System (ADS)
Fan, Deliang
Several completely new approaches (such as spintronic, carbon nanotube, graphene, TFETs, etc.) to information processing and data storage technologies are emerging to address the time frame beyond current Complementary Metal-Oxide-Semiconductor (CMOS) roadmap. The high speed magnetization switching of a nano-magnet due to current induced spin-transfer torque (STT) have been demonstrated in recent experiments. Such STT devices can be explored in compact, low power memory and logic design. In order to truly leverage STT devices based computing, researchers require a re-think of circuit, architecture, and computing model, since the STT devices are unlikely to be drop-in replacements for CMOS. The potential of STT devices based computing will be best realized by considering new computing models that are inherently suited to the characteristics of STT devices, and new applications that are enabled by their unique capabilities, thereby attaining performance that CMOS cannot achieve. The goal of this research is to conduct synergistic exploration in architecture, circuit and device levels for Boolean and brain-inspired computing using nanoscale STT devices. Specifically, we first show that the non-volatile STT devices can be used in designing configurable Boolean logic blocks. We propose a spin-memristor threshold logic (SMTL) gate design, where memristive cross-bar array is used to perform current mode summation of binary inputs and the low power current mode spintronic threshold device carries out the energy efficient threshold operation. Next, for brain-inspired computing, we have exploited different spin-transfer torque device structures that can implement the hard-limiting and soft-limiting artificial neuron transfer functions respectively. We apply such STT based neuron (or 'spin-neuron') in various neural network architectures, such as hierarchical temporal memory and feed-forward neural network, for performing "human-like" cognitive computing, which show more than two orders of lower energy consumption compared to state of the art CMOS implementation. Finally, we show the dynamics of injection locked Spin Hall Effect Spin-Torque Oscillator (SHE-STO) cluster can be exploited as a robust multi-dimensional distance metric for associative computing, image/ video analysis, etc. Our simulation results show that the proposed system architecture with injection locked SHE-STOs and the associated CMOS interface circuits can be suitable for robust and energy efficient associative computing and pattern matching.
OmniPHR: A distributed architecture model to integrate personal health records.
Roehrs, Alex; da Costa, Cristiano André; da Rosa Righi, Rodrigo
2017-07-01
The advances in the Information and Communications Technology (ICT) brought many benefits to the healthcare area, specially to digital storage of patients' health records. However, it is still a challenge to have a unified viewpoint of patients' health history, because typically health data is scattered among different health organizations. Furthermore, there are several standards for these records, some of them open and others proprietary. Usually health records are stored in databases within health organizations and rarely have external access. This situation applies mainly to cases where patients' data are maintained by healthcare providers, known as EHRs (Electronic Health Records). In case of PHRs (Personal Health Records), in which patients by definition can manage their health records, they usually have no control over their data stored in healthcare providers' databases. Thereby, we envision two main challenges regarding PHR context: first, how patients could have a unified view of their scattered health records, and second, how healthcare providers can access up-to-date data regarding their patients, even though changes occurred elsewhere. For addressing these issues, this work proposes a model named OmniPHR, a distributed model to integrate PHRs, for patients and healthcare providers use. The scientific contribution is to propose an architecture model to support a distributed PHR, where patients can maintain their health history in an unified viewpoint, from any device anywhere. Likewise, for healthcare providers, the possibility of having their patients data interconnected among health organizations. The evaluation demonstrates the feasibility of the model in maintaining health records distributed in an architecture model that promotes a unified view of PHR with elasticity and scalability of the solution. Copyright © 2017 Elsevier Inc. All rights reserved.
A unifying framework for rigid multibody dynamics and serial and parallel computational issues
NASA Technical Reports Server (NTRS)
Fijany, Amir; Jain, Abhinandan
1989-01-01
A unifying framework for various formulations of the dynamics of open-chain rigid multibody systems is discussed. Their suitability for serial and parallel processing is assessed. The framework is based on the derivation of intrinsic, i.e., coordinate-free, equations of the algorithms which provides a suitable abstraction and permits a distinction to be made between the computational redundancy in the intrinsic and extrinsic equations. A set of spatial notation is used which allows the derivation of the various algorithms in a common setting and thus clarifies the relationships among them. The three classes of algorithms viz., O(n), O(n exp 2) and O(n exp 3) or the solution of the dynamics problem are investigated. Researchers begin with the derivation of O(n exp 3) algorithms based on the explicit computation of the mass matrix and it provides insight into the underlying basis of the O(n) algorithms. From a computational perspective, the optimal choice of a coordinate frame for the projection of the intrinsic equations is discussed and the serial computational complexity of the different algorithms is evaluated. The three classes of algorithms are also analyzed for suitability for parallel processing. It is shown that the problem belongs to the class of N C and the time and processor bounds are of O(log2/2(n)) and O(n exp 4), respectively. However, the algorithm that achieves the above bounds is not stable. Researchers show that the fastest stable parallel algorithm achieves a computational complexity of O(n) with O(n exp 4), respectively. However, the algorithm that achieves the above bounds is not stable. Researchers show that the fastest stable parallel algorithm achieves a computational complexity of O(n) with O(n exp 2) processors, and results from the parallelization of the O(n exp 3) serial algorithm.
3D Fluid-Structure Interaction Simulation of Aortic Valves Using a Unified Continuum ALE FEM Model.
Spühler, Jeannette H; Jansson, Johan; Jansson, Niclas; Hoffman, Johan
2018-01-01
Due to advances in medical imaging, computational fluid dynamics algorithms and high performance computing, computer simulation is developing into an important tool for understanding the relationship between cardiovascular diseases and intraventricular blood flow. The field of cardiac flow simulation is challenging and highly interdisciplinary. We apply a computational framework for automated solutions of partial differential equations using Finite Element Methods where any mathematical description directly can be translated to code. This allows us to develop a cardiac model where specific properties of the heart such as fluid-structure interaction of the aortic valve can be added in a modular way without extensive efforts. In previous work, we simulated the blood flow in the left ventricle of the heart. In this paper, we extend this model by placing prototypes of both a native and a mechanical aortic valve in the outflow region of the left ventricle. Numerical simulation of the blood flow in the vicinity of the valve offers the possibility to improve the treatment of aortic valve diseases as aortic stenosis (narrowing of the valve opening) or regurgitation (leaking) and to optimize the design of prosthetic heart valves in a controlled and specific way. The fluid-structure interaction and contact problem are formulated in a unified continuum model using the conservation laws for mass and momentum and a phase function. The discretization is based on an Arbitrary Lagrangian-Eulerian space-time finite element method with streamline diffusion stabilization, and it is implemented in the open source software Unicorn which shows near optimal scaling up to thousands of cores. Computational results are presented to demonstrate the capability of our framework.
3D Fluid-Structure Interaction Simulation of Aortic Valves Using a Unified Continuum ALE FEM Model
Spühler, Jeannette H.; Jansson, Johan; Jansson, Niclas; Hoffman, Johan
2018-01-01
Due to advances in medical imaging, computational fluid dynamics algorithms and high performance computing, computer simulation is developing into an important tool for understanding the relationship between cardiovascular diseases and intraventricular blood flow. The field of cardiac flow simulation is challenging and highly interdisciplinary. We apply a computational framework for automated solutions of partial differential equations using Finite Element Methods where any mathematical description directly can be translated to code. This allows us to develop a cardiac model where specific properties of the heart such as fluid-structure interaction of the aortic valve can be added in a modular way without extensive efforts. In previous work, we simulated the blood flow in the left ventricle of the heart. In this paper, we extend this model by placing prototypes of both a native and a mechanical aortic valve in the outflow region of the left ventricle. Numerical simulation of the blood flow in the vicinity of the valve offers the possibility to improve the treatment of aortic valve diseases as aortic stenosis (narrowing of the valve opening) or regurgitation (leaking) and to optimize the design of prosthetic heart valves in a controlled and specific way. The fluid-structure interaction and contact problem are formulated in a unified continuum model using the conservation laws for mass and momentum and a phase function. The discretization is based on an Arbitrary Lagrangian-Eulerian space-time finite element method with streamline diffusion stabilization, and it is implemented in the open source software Unicorn which shows near optimal scaling up to thousands of cores. Computational results are presented to demonstrate the capability of our framework. PMID:29713288
Consolidation and development roadmap of the EMI middleware
NASA Astrophysics Data System (ADS)
Kónya, B.; Aiftimiei, C.; Cecchi, M.; Field, L.; Fuhrmann, P.; Nilsen, J. K.; White, J.
2012-12-01
Scientific research communities have benefited recently from the increasing availability of computing and data infrastructures with unprecedented capabilities for large scale distributed initiatives. These infrastructures are largely defined and enabled by the middleware they deploy. One of the major issues in the current usage of research infrastructures is the need to use similar but often incompatible middleware solutions. The European Middleware Initiative (EMI) is a collaboration of the major European middleware providers ARC, dCache, gLite and UNICORE. EMI aims to: deliver a consolidated set of middleware components for deployment in EGI, PRACE and other Distributed Computing Infrastructures; extend the interoperability between grids and other computing infrastructures; strengthen the reliability of the services; establish a sustainable model to maintain and evolve the middleware; fulfil the requirements of the user communities. This paper presents the consolidation and development objectives of the EMI software stack covering the last two years. The EMI development roadmap is introduced along the four technical areas of compute, data, security and infrastructure. The compute area plan focuses on consolidation of standards and agreements through a unified interface for job submission and management, a common format for accounting, the wide adoption of GLUE schema version 2.0 and the provision of a common framework for the execution of parallel jobs. The security area is working towards a unified security model and lowering the barriers to Grid usage by allowing users to gain access with their own credentials. The data area is focusing on implementing standards to ensure interoperability with other grids and industry components and to reuse already existing clients in operating systems and open source distributions. One of the highlights of the infrastructure area is the consolidation of the information system services via the creation of a common information backbone.
Federal Register 2010, 2011, 2012, 2013, 2014
2012-05-15
... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-745] Certain Wireless Communication Devices, Portable Music and Data Processing Devices, Computers and Components Thereof; Notice of Request for Statements on the Public Interest AGENCY: U.S. International Trade Commission. ACTION: Notice...
Federal Register 2010, 2011, 2012, 2013, 2014
2012-10-29
... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-856] Certain Wireless Communication Devices, Portable Music and Data Processing Devices, Computers, and Components Thereof AGENCY: U.S. International Trade Commission. ACTION: Notice. SUMMARY: Notice is hereby given that the U.S. International...
Orthorectification by Using Gpgpu Method
NASA Astrophysics Data System (ADS)
Sahin, H.; Kulur, S.
2012-07-01
Thanks to the nature of the graphics processing, the newly released products offer highly parallel processing units with high-memory bandwidth and computational power of more than teraflops per second. The modern GPUs are not only powerful graphic engines but also they are high level parallel programmable processors with very fast computing capabilities and high-memory bandwidth speed compared to central processing units (CPU). Data-parallel computations can be shortly described as mapping data elements to parallel processing threads. The rapid development of GPUs programmability and capabilities attracted the attentions of researchers dealing with complex problems which need high level calculations. This interest has revealed the concepts of "General Purpose Computation on Graphics Processing Units (GPGPU)" and "stream processing". The graphic processors are powerful hardware which is really cheap and affordable. So the graphic processors became an alternative to computer processors. The graphic chips which were standard application hardware have been transformed into modern, powerful and programmable processors to meet the overall needs. Especially in recent years, the phenomenon of the usage of graphics processing units in general purpose computation has led the researchers and developers to this point. The biggest problem is that the graphics processing units use different programming models unlike current programming methods. Therefore, an efficient GPU programming requires re-coding of the current program algorithm by considering the limitations and the structure of the graphics hardware. Currently, multi-core processors can not be programmed by using traditional programming methods. Event procedure programming method can not be used for programming the multi-core processors. GPUs are especially effective in finding solution for repetition of the computing steps for many data elements when high accuracy is needed. Thus, it provides the computing process more quickly and accurately. Compared to the GPUs, CPUs which perform just one computing in a time according to the flow control are slower in performance. This structure can be evaluated for various applications of computer technology. In this study covers how general purpose parallel programming and computational power of the GPUs can be used in photogrammetric applications especially direct georeferencing. The direct georeferencing algorithm is coded by using GPGPU method and CUDA (Compute Unified Device Architecture) programming language. Results provided by this method were compared with the traditional CPU programming. In the other application the projective rectification is coded by using GPGPU method and CUDA programming language. Sample images of various sizes, as compared to the results of the program were evaluated. GPGPU method can be used especially in repetition of same computations on highly dense data, thus finding the solution quickly.
Probabilistic arithmetic automata and their applications.
Marschall, Tobias; Herms, Inke; Kaltenbach, Hans-Michael; Rahmann, Sven
2012-01-01
We present a comprehensive review on probabilistic arithmetic automata (PAAs), a general model to describe chains of operations whose operands depend on chance, along with two algorithms to numerically compute the distribution of the results of such probabilistic calculations. PAAs provide a unifying framework to approach many problems arising in computational biology and elsewhere. We present five different applications, namely 1) pattern matching statistics on random texts, including the computation of the distribution of occurrence counts, waiting times, and clump sizes under hidden Markov background models; 2) exact analysis of window-based pattern matching algorithms; 3) sensitivity of filtration seeds used to detect candidate sequence alignments; 4) length and mass statistics of peptide fragments resulting from enzymatic cleavage reactions; and 5) read length statistics of 454 and IonTorrent sequencing reads. The diversity of these applications indicates the flexibility and unifying character of the presented framework. While the construction of a PAA depends on the particular application, we single out a frequently applicable construction method: We introduce deterministic arithmetic automata (DAAs) to model deterministic calculations on sequences, and demonstrate how to construct a PAA from a given DAA and a finite-memory random text model. This procedure is used for all five discussed applications and greatly simplifies the construction of PAAs. Implementations are available as part of the MoSDi package. Its application programming interface facilitates the rapid development of new applications based on the PAA framework.
Design Considerations of a Virtual Laboratory for Advanced X-ray Sources
NASA Astrophysics Data System (ADS)
Luginsland, J. W.; Frese, M. H.; Frese, S. D.; Watrous, J. J.; Heileman, G. L.
2004-11-01
The field of scientific computation has greatly advanced in the last few years, resulting in the ability to perform complex computer simulations that can predict the performance of real-world experiments in a number of fields of study. Among the forces driving this new computational capability is the advent of parallel algorithms, allowing calculations in three-dimensional space with realistic time scales. Electromagnetic radiation sources driven by high-voltage, high-current electron beams offer an area to further push the state-of-the-art in high fidelity, first-principles simulation tools. The physics of these x-ray sources combine kinetic plasma physics (electron beams) with dense fluid-like plasma physics (anode plasmas) and x-ray generation (bremsstrahlung). There are a number of mature techniques and software packages for dealing with the individual aspects of these sources, such as Particle-In-Cell (PIC), Magneto-Hydrodynamics (MHD), and radiation transport codes. The current effort is focused on developing an object-oriented software environment using the Rational© Unified Process and the Unified Modeling Language (UML) to provide a framework where multiple 3D parallel physics packages, such as a PIC code (ICEPIC), a MHD code (MACH), and a x-ray transport code (ITS) can co-exist in a system-of-systems approach to modeling advanced x-ray sources. Initial software design and assessments of the various physics algorithms' fidelity will be presented.
NASA Astrophysics Data System (ADS)
Reichenberger, Michael A.; Nichols, Daniel M.; Stevenson, Sarah R.; Swope, Tanner M.; Hilger, Caden W.; Unruh, Troy C.; McGregor, Douglas S.; Roberts, Jeremy A.
2017-08-01
Advancements in nuclear reactor core modeling and computational capability have encouraged further development of in-core neutron sensors. Micro-Pocket Fission Detectors (MPFDs) have been fabricated and tested previously, but successful testing of these prior detectors was limited to single-node operation with specialized designs. Described in this work is a modular, four-node MPFD array fabricated and tested at Kansas State University (KSU). The four sensor nodes were equally spaced to span the length of the fuel-region of the KSU TRIGA Mk. II research nuclear reactor core. The encapsulated array was filled with argon gas, serving as an ionization medium in the small cavities of the MPFDs. The unified design improved device ruggedness and simplified construction over previous designs. A 0.315-in. (8-mm) penetration in the upper grid plate of the KSU TRIGA Mk. II research nuclear reactor was used to deploy the array between fuel elements in the core. The MPFD array was coupled to an electronic support system which has been developed to support pulse-mode operation. Neutron-induced pulses were observed on all four sensor channels. Stable device operation was confirmed by testing under steady-state reactor conditions. Each of the four sensors in the array responded to changes in reactor power between 10 kWth and full power (750 kWth). Reactor power transients were observed in real-time including positive transients with periods of 5, 15, and 30 s. Finally, manual reactor power oscillations were observed in real-time.
GPU: the biggest key processor for AI and parallel processing
NASA Astrophysics Data System (ADS)
Baji, Toru
2017-07-01
Two types of processors exist in the market. One is the conventional CPU and the other is Graphic Processor Unit (GPU). Typical CPU is composed of 1 to 8 cores while GPU has thousands of cores. CPU is good for sequential processing, while GPU is good to accelerate software with heavy parallel executions. GPU was initially dedicated for 3D graphics. However from 2006, when GPU started to apply general-purpose cores, it was noticed that this architecture can be used as a general purpose massive-parallel processor. NVIDIA developed a software framework Compute Unified Device Architecture (CUDA) that make it possible to easily program the GPU for these application. With CUDA, GPU started to be used in workstations and supercomputers widely. Recently two key technologies are highlighted in the industry. The Artificial Intelligence (AI) and Autonomous Driving Cars. AI requires a massive parallel operation to train many-layers of neural networks. With CPU alone, it was impossible to finish the training in a practical time. The latest multi-GPU system with P100 makes it possible to finish the training in a few hours. For the autonomous driving cars, TOPS class of performance is required to implement perception, localization, path planning processing and again SoC with integrated GPU will play a key role there. In this paper, the evolution of the GPU which is one of the biggest commercial devices requiring state-of-the-art fabrication technology will be introduced. Also overview of the GPU demanding key application like the ones described above will be introduced.
Unified description of H-atom-induced chemicurrents and inelastic scattering.
Kandratsenka, Alexander; Jiang, Hongyan; Dorenkamp, Yvonne; Janke, Svenja M; Kammler, Marvin; Wodtke, Alec M; Bünermann, Oliver
2018-01-23
The Born-Oppenheimer approximation (BOA) provides the foundation for virtually all computational studies of chemical binding and reactivity, and it is the justification for the widely used "balls and springs" picture of molecules. The BOA assumes that nuclei effectively stand still on the timescale of electronic motion, due to their large masses relative to electrons. This implies electrons never change their energy quantum state. When molecules react, atoms must move, meaning that electrons may become excited in violation of the BOA. Such electronic excitation is clearly seen for: ( i ) Schottky diodes where H adsorption at Ag surfaces produces electrical "chemicurrent;" ( ii ) Au-based metal-insulator-metal (MIM) devices, where chemicurrents arise from H-H surface recombination; and ( iii ) Inelastic energy transfer, where H collisions with Au surfaces show H-atom translation excites the metal's electrons. As part of this work, we report isotopically selective hydrogen/deuterium (H/D) translational inelasticity measurements in collisions with Ag and Au. Together, these experiments provide an opportunity to test new theories that simultaneously describe both nuclear and electronic motion, a standing challenge to the field. Here, we show results of a recently developed first-principles theory that quantitatively explains both inelastic scattering experiments that probe nuclear motion and chemicurrent experiments that probe electronic excitation. The theory explains the magnitude of chemicurrents on Ag Schottky diodes and resolves an apparent paradox--chemicurrents exhibit a much larger isotope effect than does H/D inelastic scattering. It also explains why, unlike Ag-based Schottky diodes, Au-based MIM devices are insensitive to H adsorption.
Multi-input and binary reproducible, high bandwidth floating point adder in a collective network
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Dong; Eisley, Noel A; Heidelberger, Philip
To add floating point numbers in a parallel computing system, a collective logic device receives the floating point numbers from computing nodes. The collective logic devices converts the floating point numbers to integer numbers. The collective logic device adds the integer numbers and generating a summation of the integer numbers. The collective logic device converts the summation to a floating point number. The collective logic device performs the receiving, the converting the floating point numbers, the adding, the generating and the converting the summation in one pass. One pass indicates that the computing nodes send inputs only once to themore » collective logic device and receive outputs only once from the collective logic device.« less
Simulation of Aerosols and Chemistry with a Unified Global Model
NASA Technical Reports Server (NTRS)
Chin, Mian
2004-01-01
This project is to continue the development of the global simulation capabilities of tropospheric and stratospheric chemistry and aerosols in a unified global model. This is a part of our overall investigation of aerosol-chemistry-climate interaction. In the past year, we have enabled the tropospheric chemistry simulations based on the GEOS-CHEM model, and added stratospheric chemical reactions into the GEOS-CHEM such that a globally unified troposphere-stratosphere chemistry and transport can be simulated consistently without any simplifications. The tropospheric chemical mechanism in the GEOS-CHEM includes 80 species and 150 reactions. 24 tracers are transported, including O3, NOx, total nitrogen (NOy), H2O2, CO, and several types of hydrocarbon. The chemical solver used in the GEOS-CHEM model is a highly accurate sparse-matrix vectorized Gear solver (SMVGEAR). The stratospheric chemical mechanism includes an additional approximately 100 reactions and photolysis processes. Because of the large number of total chemical reactions and photolysis processes and very different photochemical regimes involved in the unified simulation, the model demands significant computer resources that are currently not practical. Therefore, several improvements will be taken, such as massive parallelization, code optimization, or selecting a faster solver. We have also continued aerosol simulation (including sulfate, dust, black carbon, organic carbon, and sea-salt) in the global model to cover most of year 2002. These results have been made available to many groups worldwide and accessible from the website http://code916.gsfc.nasa.gov/People/Chin/aot.html.
Qualification and Approval of Personal Computer-Based Aviation Training Devices
DOT National Transportation Integrated Search
1997-05-12
This Advisory Circular (AC) provides information and guidance to potential training device manufacturers and aviation training consumers concerning a means, acceptable to the Administrator, by which personal computer-based aviation training devices (...
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-12
... Certain Wireless Communication Devices, Portable Music and Data Processing Devices, Computers and..., portable music and data processing devices, computers and components thereof. The complaint names as...