Sample records for improve computation efficiency

  1. Memristive Mixed-Signal Neuromorphic Systems: Energy-Efficient Learning at the Circuit-Level

    DOE PAGES

    Chakma, Gangotree; Adnan, Md Musabbir; Wyer, Austin R.; ...

    2017-11-23

    Neuromorphic computing is non-von Neumann computer architecture for the post Moore’s law era of computing. Since a main focus of the post Moore’s law era is energy-efficient computing with fewer resources and less area, neuromorphic computing contributes effectively in this research. Here in this paper, we present a memristive neuromorphic system for improved power and area efficiency. Our particular mixed-signal approach implements neural networks with spiking events in a synchronous way. Moreover, the use of nano-scale memristive devices saves both area and power in the system. We also provide device-level considerations that make the system more energy-efficient. The proposed system additionally includes synchronous digital long term plasticity, an online learning methodology that helps the system train the neural networks during the operation phase and improves the efficiency in learning considering the power consumption and area overhead.

  2. Memristive Mixed-Signal Neuromorphic Systems: Energy-Efficient Learning at the Circuit-Level

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chakma, Gangotree; Adnan, Md Musabbir; Wyer, Austin R.

    Neuromorphic computing is non-von Neumann computer architecture for the post Moore’s law era of computing. Since a main focus of the post Moore’s law era is energy-efficient computing with fewer resources and less area, neuromorphic computing contributes effectively in this research. Here in this paper, we present a memristive neuromorphic system for improved power and area efficiency. Our particular mixed-signal approach implements neural networks with spiking events in a synchronous way. Moreover, the use of nano-scale memristive devices saves both area and power in the system. We also provide device-level considerations that make the system more energy-efficient. The proposed system additionally includes synchronous digital long term plasticity, an online learning methodology that helps the system train the neural networks during the operation phase and improves the efficiency in learning considering the power consumption and area overhead.

  3. Reducing Vehicle Weight and Improving U.S. Energy Efficiency Using Integrated Computational Materials Engineering

    NASA Astrophysics Data System (ADS)

    Joost, William J.

    2012-09-01

    Transportation accounts for approximately 28% of U.S. energy consumption with the majority of transportation energy derived from petroleum sources. Many technologies such as vehicle electrification, advanced combustion, and advanced fuels can reduce transportation energy consumption by improving the efficiency of cars and trucks. Lightweight materials are another important technology that can improve passenger vehicle fuel efficiency by 6-8% for each 10% reduction in weight while also making electric and alternative vehicles more competitive. Despite the opportunities for improved efficiency, widespread deployment of lightweight materials for automotive structures is hampered by technology gaps most often associated with performance, manufacturability, and cost. In this report, the impact of reduced vehicle weight on energy efficiency is discussed with a particular emphasis on quantitative relationships determined by several researchers. The most promising lightweight materials systems are described along with a brief review of the most significant technical barriers to their implementation. For each material system, the development of accurate material models is critical to support simulation-intensive processing and structural design for vehicles; improved models also contribute to an integrated computational materials engineering (ICME) approach for addressing technical barriers and accelerating deployment. The value of computational techniques is described by considering recent ICME and computational materials science success stories with an emphasis on applying problem-specific methods.

  4. Improvements in the efficiency of turboexpanders in cryogenic applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Agahi, R.R.; Lin, M.C.; Ershaghi, B.

    1996-12-31

    Process designers have utilized turboexpanders in cryogenic processes because of their higher thermal efficiencies when compared with conventional refrigeration cycles. Process design and equipment performance have improved substantially through the utilization of modern technologies. Turboexpander manufacturers have also adopted Computational Fluid Dynamic Software, Computer Numerical Control Technology and Holography Techniques to further improve an already impressive turboexpander efficiency performance. In this paper, the authors explain the design process of the turboexpander utilizing modern technology. Two cases of turboexpanders processing helium (4.35 K) and hydrogen (56 K) will be presented.

  5. An Adaptive Evolutionary Algorithm for Traveling Salesman Problem with Precedence Constraints

    PubMed Central

    Sung, Jinmo; Jeong, Bongju

    2014-01-01

    The traveling salesman problem with precedence constraints is one of the most notorious problems in terms of the efficiency of its solution approach, even though it has a very wide range of industrial applications. We propose a new evolutionary algorithm to efficiently obtain good solutions by improving the search process. Our genetic operators guarantee the feasibility of solutions over the generations of population, which significantly improves the computational efficiency even when it is combined with our flexible adaptive searching strategy. The efficiency of the algorithm is investigated by computational experiments. PMID:24701158

  6. An adaptive evolutionary algorithm for traveling salesman problem with precedence constraints.

    PubMed

    Sung, Jinmo; Jeong, Bongju

    2014-01-01

    The traveling salesman problem with precedence constraints is one of the most notorious problems in terms of the efficiency of its solution approach, even though it has a very wide range of industrial applications. We propose a new evolutionary algorithm to efficiently obtain good solutions by improving the search process. Our genetic operators guarantee the feasibility of solutions over the generations of population, which significantly improves the computational efficiency even when it is combined with our flexible adaptive searching strategy. The efficiency of the algorithm is investigated by computational experiments.

  7. Irregular large-scale computed tomography on multiple graphics processors improves energy-efficiency metrics for industrial applications

    NASA Astrophysics Data System (ADS)

    Jimenez, Edward S.; Goodman, Eric L.; Park, Ryeojin; Orr, Laurel J.; Thompson, Kyle R.

    2014-09-01

    This paper will investigate energy-efficiency for various real-world industrial computed-tomography reconstruction algorithms, both CPU- and GPU-based implementations. This work shows that the energy required for a given reconstruction is based on performance and problem size. There are many ways to describe performance and energy efficiency, thus this work will investigate multiple metrics including performance-per-watt, energy-delay product, and energy consumption. This work found that irregular GPU-based approaches realized tremendous savings in energy consumption when compared to CPU implementations while also significantly improving the performance-per-watt and energy-delay product metrics. Additional energy savings and other metric improvements were realized on the GPU-based reconstructions by improving storage I/O by implementing a parallel MIMD-like modularization of the compute and I/O tasks.
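
    The three metrics named in this abstract can be sketched with their common textbook definitions. This is an illustration only, not code or data from the paper: the function names and the example power and runtime figures are hypothetical, and the paper may define its metrics differently.

```python
def energy_joules(avg_power_watts, runtime_seconds):
    """Energy consumption: average power multiplied by runtime."""
    return avg_power_watts * runtime_seconds


def performance_per_watt(work_units, runtime_seconds, avg_power_watts):
    """Throughput (work units per second) divided by average power."""
    return (work_units / runtime_seconds) / avg_power_watts


def energy_delay_product(avg_power_watts, runtime_seconds):
    """Energy multiplied by runtime; lower is better."""
    return energy_joules(avg_power_watts, runtime_seconds) * runtime_seconds


if __name__ == "__main__":
    # Hypothetical figures chosen only to exercise the functions.
    for name, power, seconds in [("cpu", 200.0, 100.0), ("gpu", 300.0, 10.0)]:
        print(name,
              energy_joules(power, seconds),
              performance_per_watt(1000.0, seconds, power),
              energy_delay_product(power, seconds))
```

    Because the energy-delay product multiplies energy by runtime, it penalizes slow runs more heavily than energy consumption alone does.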

  8. Construction and application of Red5 cluster based on OpenStack

    NASA Astrophysics Data System (ADS)

    Wang, Jiaqing; Song, Jianxin

    2017-08-01

    With the application and development of cloud computing technology in various fields, the resource utilization rate of the data center has been improved obviously, and the system based on cloud computing platform has also improved the expansibility and stability. In the traditional way, Red5 cluster resource utilization is low and the system stability is poor. This paper uses cloud computing to efficiently calculate the resource allocation ability, and builds a Red5 server cluster based on OpenStack. Multimedia applications can be published to the Red5 cloud server cluster. The system achieves the flexible construction of computing resources, but also greatly improves the stability of the cluster and service efficiency.

  9. Speeding Up Ecological and Evolutionary Computations in R; Essentials of High Performance Computing for Biologists

    PubMed Central

    Visser, Marco D.; McMahon, Sean M.; Merow, Cory; Dixon, Philip M.; Record, Sydne; Jongejans, Eelke

    2015-01-01

    Computation has become a critical component of research in biology. A risk has emerged that computational and programming challenges may limit research scope, depth, and quality. We review various solutions to common computational efficiency problems in ecological and evolutionary research. Our review pulls together material that is currently scattered across many sources and emphasizes those techniques that are especially effective for typical ecological and environmental problems. We demonstrate how straightforward it can be to write efficient code and implement techniques such as profiling or parallel computing. We supply a newly developed R package (aprof) that helps to identify computational bottlenecks in R code and determine whether optimization can be effective. Our review is complemented by a practical set of examples and detailed Supporting Information material (S1–S3 Texts) that demonstrate large improvements in computational speed (ranging from 10.5 times to 14,000 times faster). By improving computational efficiency, biologists can feasibly solve more complex tasks, ask more ambitious questions, and include more sophisticated analyses in their research. PMID:25811842

  10. Application of the MacCormack scheme to overland flow routing for high-spatial resolution distributed hydrological model

    NASA Astrophysics Data System (ADS)

    Zhang, Ling; Nan, Zhuotong; Liang, Xu; Xu, Yi; Hernández, Felipe; Li, Lianxia

    2018-03-01

    Although process-based distributed hydrological models (PDHMs) have evolved rapidly over the last few decades, their extensive applications are still challenged by the computational expenses. This study attempted, for the first time, to apply the numerically efficient MacCormack algorithm to overland flow routing in a representative high-spatial resolution PDHM, i.e., the distributed hydrology-soil-vegetation model (DHSVM), in order to improve its computational efficiency. The analytical verification indicates that both the semi and full versions of the MacCormack schemes exhibit robust numerical stability and are more computationally efficient than the conventional explicit linear scheme. The full-version outperforms the semi-version in terms of simulation accuracy when the same time step is adopted. The semi-MacCormack scheme was implemented into DHSVM (version 3.1.2) to solve the kinematic wave equations for overland flow routing. The performance and practicality of the enhanced DHSVM-MacCormack model was assessed by performing two groups of modeling experiments in the Mercer Creek watershed, a small urban catchment near Bellevue, Washington. The experiments show that DHSVM-MacCormack can considerably improve the computational efficiency without compromising the simulation accuracy of the original DHSVM model. More specifically, with the same computational environment and model settings, the computational time required by DHSVM-MacCormack can be reduced to several dozen minutes for a simulation period of three months (in contrast with one day and a half by the original DHSVM model) without noticeable sacrifice of the accuracy. The MacCormack scheme proves to be applicable to overland flow routing in DHSVM, which implies that it can be coupled into other PDHMs for watershed routing to either significantly improve their computational efficiency or to make the kinematic wave routing for high resolution modeling computationally feasible.
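
    For orientation, the textbook two-step (predictor-corrector) MacCormack scheme can be sketched on the simplest possible case. This is not the DHSVM implementation or the paper's semi-MacCormack variant: it assumes the linear advection equation u_t + a u_x = 0 on a periodic grid rather than the kinematic wave equations, and the function name and the `courant` parameter (a * dt / dx) are hypothetical.

```python
def maccormack_step(u, courant):
    """Advance u by one time step of u_t + a*u_x = 0 on a periodic grid.

    courant is a*dt/dx; the scheme is stable for |courant| <= 1.
    """
    n = len(u)
    # Predictor: forward difference in space.
    pred = [u[i] - courant * (u[(i + 1) % n] - u[i]) for i in range(n)]
    # Corrector: backward difference on the predicted values, averaged
    # with the old solution. pred[i - 1] wraps around at i == 0.
    return [0.5 * (u[i] + pred[i] - courant * (pred[i] - pred[i - 1]))
            for i in range(n)]


if __name__ == "__main__":
    pulse = [0.0, 1.0, 0.0, 0.0]
    # With courant == 1 the pulse moves exactly one cell to the right.
    print(maccormack_step(pulse, 1.0))
```

    Alternating the one-sided differences between the two steps is what gives the scheme second-order accuracy while keeping each step explicit.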

  11. Design and Verification of Remote Sensing Image Data Center Storage Architecture Based on Hadoop

    NASA Astrophysics Data System (ADS)

    Tang, D.; Zhou, X.; Jing, Y.; Cong, W.; Li, C.

    2018-04-01

    The data center is a new concept of data processing and application proposed in recent years. It is a new method of processing technologies based on data, parallel computing, and compatibility with different hardware clusters. While optimizing the data storage management structure, it fully utilizes cluster resource computing nodes and improves the efficiency of data parallel application. This paper used mature Hadoop technology to build a large-scale distributed image management architecture for remote sensing imagery. Using MapReduce parallel processing technology, it called many computing nodes to process image storage blocks and pyramids in the background to improve the efficiency of image reading and application and solved the need for concurrent multi-user high-speed access to remotely sensed data. It verified the rationality, reliability and superiority of the system design by testing the storage efficiency of different image data and multi-users and analyzing the distributed storage architecture to improve the application efficiency of remote sensing images through building an actual Hadoop service system.

  12. Improved Discrete Ordinate Solutions in the Presence of an Anisotropically Reflecting Lower Boundary: Upgrades of the DISORT Computational Tool

    NASA Technical Reports Server (NTRS)

    Lin, Z.; Stamnes, S.; Jin, Z.; Laszlo, I.; Tsay, S. C.; Wiscombe, W. J.; Stamnes, K.

    2015-01-01

    A successor version 3 of DISORT (DISORT3) is presented with important upgrades that improve the accuracy, efficiency, and stability of the algorithm. Compared with version 2 (DISORT2 released in 2000) these upgrades include (a) a redesigned BRDF computation that improves both speed and accuracy, (b) a revised treatment of the single scattering correction, and (c) additional efficiency and stability upgrades for beam sources. In DISORT3 the BRDF computation is improved in the following three ways: (i) the Fourier decomposition is prepared "off-line", thus avoiding the repeated internal computations done in DISORT2; (ii) a large enough number of terms in the Fourier expansion of the BRDF is employed to guarantee accurate values of the expansion coefficients (default is 200 instead of 50 in DISORT2); (iii) in the post processing step the reflection of the direct attenuated beam from the lower boundary is included resulting in a more accurate single scattering correction. These improvements in the treatment of the BRDF have led to improved accuracy and a several-fold increase in speed. In addition, the stability of beam sources has been improved by removing a singularity occurring when the cosine of the incident beam angle is too close to the reciprocal of any of the eigenvalues. The efficiency for beam sources has been further improved by reducing by a factor of 2 (compared to DISORT2) the dimension of the linear system of equations that must be solved to obtain the particular solutions, and by replacing the LINPACK routines used in DISORT2 by LAPACK 3.5 in DISORT3. These beam source stability and efficiency upgrades bring enhanced stability and an additional 5-7% improvement in speed. Numerical results are provided to demonstrate and quantify the improvements in accuracy and efficiency of DISORT3 compared to DISORT2.

  13. On computational methods for crashworthiness

    NASA Technical Reports Server (NTRS)

    Belytschko, T.

    1992-01-01

    The evolution of computational methods for crashworthiness and related fields is described and linked with the decreasing cost of computational resources and with improvements in computation methodologies. The latter includes more effective time integration procedures and more efficient elements. Some recent developments in methodologies and future trends are also summarized. These include multi-time step integration (or subcycling), further improvements in elements, adaptive meshes, and the exploitation of parallel computers.

  14. An efficient and accurate 3D displacements tracking strategy for digital volume correlation

    NASA Astrophysics Data System (ADS)

    Pan, Bing; Wang, Bo; Wu, Dafang; Lubineau, Gilles

    2014-07-01

    Owing to its inherent computational complexity, practical implementation of digital volume correlation (DVC) for internal displacement and strain mapping faces important challenges in improving its computational efficiency. In this work, an efficient and accurate 3D displacement tracking strategy is proposed for fast DVC calculation. The efficiency advantage is achieved by using three improvements. First, to eliminate the need of updating Hessian matrix in each iteration, an efficient 3D inverse compositional Gauss-Newton (3D IC-GN) algorithm is introduced to replace existing forward additive algorithms for accurate sub-voxel displacement registration. Second, to ensure that the 3D IC-GN algorithm converges accurately and rapidly and to avoid time-consuming integer-voxel displacement searching, a generalized reliability-guided displacement tracking strategy is designed to transfer accurate and complete initial guess of deformation for each calculation point from its computed neighbors. Third, to avoid the repeated computation of sub-voxel intensity interpolation coefficients, an interpolation coefficient lookup table is established for tricubic interpolation. The computational complexity of the proposed fast DVC and the existing typical DVC algorithms are first analyzed quantitatively according to necessary arithmetic operations. Then, numerical tests are performed to verify the performance of the fast DVC algorithm in terms of measurement accuracy and computational efficiency. The experimental results indicate that, compared with the existing DVC algorithm, the presented fast DVC algorithm produces similar precision and slightly higher accuracy at a substantially reduced computational cost.

  15. Remote sensing image ship target detection method based on visual attention model

    NASA Astrophysics Data System (ADS)

    Sun, Yuejiao; Lei, Wuhu; Ren, Xiaodong

    2017-11-01

    The traditional methods of detecting ship targets in remote sensing images mostly use a sliding window to search the whole image comprehensively. However, the target usually occupies only a small fraction of the image. This method has high computational complexity for large format visible image data. The bottom-up selective attention mechanism can selectively allocate computing resources according to visual stimuli, thus improving the computational efficiency and reducing the difficulty of analysis. Considering this, a method of ship target detection in remote sensing images based on visual attention model was proposed in this paper. The experimental results show that the proposed method can reduce the computational complexity while improving the detection accuracy, and improve the detection efficiency of ship targets in remote sensing images.

  16. Influence of computational fluid dynamics on experimental aerospace facilities: A fifteen year projection

    NASA Technical Reports Server (NTRS)

    1983-01-01

    An assessment was made of the impact of developments in computational fluid dynamics (CFD) on the traditional role of aerospace ground test facilities over the next fifteen years. With improvements in CFD and more powerful scientific computers projected over this period it is expected to have the capability to compute the flow over a complete aircraft at a unit cost three orders of magnitude lower than presently possible. Over the same period improvements in ground test facilities will progress by application of computational techniques including CFD to data acquisition, facility operational efficiency, and simulation of the flight envelope; however, no dramatic change in unit cost is expected as greater efficiency will be countered by higher energy and labor costs.

  17. Methods for Computationally Efficient Structured CFD Simulations of Complex Turbomachinery Flows

    NASA Technical Reports Server (NTRS)

    Herrick, Gregory P.; Chen, Jen-Ping

    2012-01-01

    This research presents more efficient computational methods by which to perform multi-block structured Computational Fluid Dynamics (CFD) simulations of turbomachinery, thus facilitating higher-fidelity solutions of complicated geometries and their associated flows. This computational framework offers flexibility in allocating resources to balance process count and wall-clock computation time, while facilitating research interests of simulating axial compressor stall inception with more complete gridding of the flow passages and rotor tip clearance regions than is typically practiced with structured codes. The paradigm presented herein facilitates CFD simulation of previously impractical geometries and flows. These methods are validated and demonstrate improved computational efficiency when applied to complicated geometries and flows.

  18. Variable cross-section windings for efficiency improvement of electric machines

    NASA Astrophysics Data System (ADS)

    Grachev, P. Yu; Bazarov, A. A.; Tabachinskiy, A. S.

    2018-02-01

    Implementation of energy-saving technologies in industry is impossible without efficiency improvement of electric machines. The article considers the ways of efficiency improvement and mass and dimensions reduction of electric machines with electronic control. Features of compact winding design for stators and armatures are described. Influence of compact winding on thermal and electrical process is given. Finite element method was used in computer simulation.

  19. Efficient Strategies for Estimating the Spatial Coherence of Backscatter

    PubMed Central

    Hyun, Dongwoon; Crowley, Anna Lisa C.; Dahl, Jeremy J.

    2017-01-01

    The spatial coherence of ultrasound backscatter has been proposed to reduce clutter in medical imaging, to measure the anisotropy of the scattering source, and to improve the detection of blood flow. These techniques rely on correlation estimates that are obtained using computationally expensive strategies. In this study, we assess existing spatial coherence estimation methods and propose three computationally efficient modifications: a reduced kernel, a downsampled receive aperture, and the use of an ensemble correlation coefficient. The proposed methods are implemented in simulation and in vivo studies. Reducing the kernel to a single sample improved computational throughput and improved axial resolution. Downsampling the receive aperture was found to have negligible effect on estimator variance, and improved computational throughput by an order of magnitude for a downsample factor of 4. The ensemble correlation estimator demonstrated lower variance than the currently used average correlation. Combining the three methods, the throughput was improved 105-fold in simulation with a downsample factor of 4 and 20-fold in vivo with a downsample factor of 2. PMID:27913342

  20. The Improvement of Efficiency in the Numerical Computation of Orbit Trajectories

    NASA Technical Reports Server (NTRS)

    Dyer, J.; Danchick, R.; Pierce, S.; Haney, R.

    1972-01-01

    An analysis, system design, programming, and evaluation of results are described for numerical computation of orbit trajectories. Evaluation of generalized methods, interaction of different formulations for satellite motion, transformation of equations of motion and integrator loads, and development of efficient integrators are also considered.

  1. Bringing the CMS distributed computing system into scalable operations

    NASA Astrophysics Data System (ADS)

    Belforte, S.; Fanfani, A.; Fisk, I.; Flix, J.; Hernández, J. M.; Kress, T.; Letts, J.; Magini, N.; Miccio, V.; Sciabà, A.

    2010-04-01

    Establishing efficient and scalable operations of the CMS distributed computing system critically relies on the proper integration, commissioning and scale testing of the data and workload management tools, the various computing workflows and the underlying computing infrastructure, located at more than 50 computing centres worldwide and interconnected by the Worldwide LHC Computing Grid. Computing challenges periodically undertaken by CMS in the past years with increasing scale and complexity have revealed the need for a sustained effort on computing integration and commissioning activities. The Processing and Data Access (PADA) Task Force was established at the beginning of 2008 within the CMS Computing Program with the mandate of validating the infrastructure for organized processing and user analysis including the sites and the workload and data management tools, validating the distributed production system by performing functionality, reliability and scale tests, helping sites to commission, configure and optimize the networking and storage through scale testing data transfers and data processing, and improving the efficiency of accessing data across the CMS computing system from global transfers to local access. This contribution reports on the tools and procedures developed by CMS for computing commissioning and scale testing as well as the improvements accomplished towards efficient, reliable and scalable computing operations. The activities include the development and operation of load generators for job submission and data transfers with the aim of stressing the experiment and Grid data management and workload management systems, site commissioning procedures and tools to monitor and improve site availability and reliability, as well as activities targeted to the commissioning of the distributed production, user analysis and monitoring systems.

  2. A Survey of Methods for Analyzing and Improving GPU Energy Efficiency

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mittal, Sparsh; Vetter, Jeffrey S

    2014-01-01

    Recent years have witnessed a phenomenal growth in the computational capabilities and applications of GPUs. However, this trend has also led to dramatic increase in their power consumption. This paper surveys research works on analyzing and improving energy efficiency of GPUs. It also provides a classification of these techniques on the basis of their main research idea. Further, it attempts to synthesize research works which compare energy efficiency of GPUs with other computing systems, e.g. FPGAs and CPUs. The aim of this survey is to provide researchers with knowledge of state-of-the-art in GPU power management and motivate them to architect highly energy-efficient GPUs of tomorrow.

  3. Improved Quasi-Newton method via PSB update for solving systems of nonlinear equations

    NASA Astrophysics Data System (ADS)

    Mamat, Mustafa; Dauda, M. K.; Waziri, M. Y.; Ahmad, Fadhilah; Mohamad, Fatma Susilawati

    2016-10-01

    The Newton method has some shortcomings, which include computation of the Jacobian matrix, which may be difficult or even impossible to compute, and solving the Newton system in every iteration. Also, the common setback with some quasi-Newton methods is that they need to compute and store an n × n matrix at each iteration; this is computationally costly for large scale problems. To overcome such drawbacks, an improved method for solving systems of nonlinear equations via PSB (Powell-Symmetric-Broyden) update is proposed. In the proposed method, the approximate Jacobian inverse Hk of PSB is updated and its efficiency is improved, thereby requiring low memory storage, which is the main aim of this paper. The preliminary numerical results show that the proposed method is practically efficient when applied on some benchmark problems.
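
    As background, the standard PSB (Powell-Symmetric-Broyden) rank-two update can be sketched as follows. This is the textbook direct form, which updates a Jacobian approximation B; it is not the paper's improved method, which works with the approximate inverse Hk. The function name and the plain nested-list matrix representation are assumptions made for illustration.

```python
def psb_update(B, s, y):
    """Return the PSB update of a symmetric matrix B.

    s is the step x_new - x_old and y is F(x_new) - F(x_old).
    The result stays symmetric and satisfies the secant equation
    B_new * s = y.
    """
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    r = [y[i] - Bs[i] for i in range(n)]          # secant residual y - B s
    ss = sum(v * v for v in s)                    # s^T s
    rs = sum(r[i] * s[i] for i in range(n))       # r^T s
    return [[B[i][j]
             + (r[i] * s[j] + s[i] * r[j]) / ss
             - rs * s[i] * s[j] / ss ** 2
             for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(psb_update(identity, [1.0, 2.0], [3.0, 1.0]))
```

    The update needs only matrix-vector and outer products, so no Jacobian evaluation or linear solve is required to form it.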

  4. Cognitive Load Theory vs. Constructivist Approaches: Which Best Leads to Efficient, Deep Learning?

    ERIC Educational Resources Information Center

    Vogel-Walcutt, J. J.; Gebrim, J. B.; Bowers, C.; Carper, T. M.; Nicholson, D.

    2011-01-01

    Computer-assisted learning, in the form of simulation-based training, is heavily focused upon by the military. Because computer-based learning offers highly portable, reusable, and cost-efficient training options, the military has dedicated significant resources to the investigation of instructional strategies that improve learning efficiency…

  5. Open source acceleration of wave optics simulations on energy efficient high-performance computing platforms

    NASA Astrophysics Data System (ADS)

    Beck, Jeffrey; Bos, Jeremy P.

    2017-05-01

    We compare several modifications to the open-source wave optics package, WavePy, intended to improve execution time. Specifically, we compare the relative performance of the Intel MKL, a CPU based OpenCV distribution, and GPU-based version. Performance is compared between distributions both on the same compute platform and between a fully-featured computing workstation and the NVIDIA Jetson TX1 platform. Comparisons are drawn in terms of both execution time and power consumption. We have found that substituting the Fast Fourier Transform operation from OpenCV provides a marked improvement on all platforms. In addition, we show that embedded platforms offer some possibility for extensive improvement in terms of efficiency compared to a fully featured workstation.

  6. An Initial Multi-Domain Modeling of an Actively Cooled Structure

    NASA Technical Reports Server (NTRS)

    Steinthorsson, Erlendur

    1997-01-01

    A methodology for the simulation of turbine cooling flows is being developed. The methodology seeks to combine numerical techniques that optimize both accuracy and computational efficiency. Key components of the methodology include the use of multiblock grid systems for modeling complex geometries, and multigrid convergence acceleration for enhancing computational efficiency in highly resolved fluid flow simulations. The use of the methodology has been demonstrated in several turbo machinery flow and heat transfer studies. Ongoing and future work involves implementing additional turbulence models, improving computational efficiency, adding AMR.

  7. Rapid Optimization of External Quantum Efficiency of Thin Film Solar Cells Using Surrogate Modeling of Absorptivity.

    PubMed

    Kaya, Mine; Hajimirza, Shima

    2018-05-25

    This paper uses surrogate modeling for very fast design of thin film solar cells with improved solar-to-electricity conversion efficiency. We demonstrate that the wavelength-specific optical absorptivity of a thin film multi-layered amorphous-silicon-based solar cell can be modeled accurately with Neural Networks and can be efficiently approximated as a function of cell geometry and wavelength. Consequently, the external quantum efficiency can be computed by averaging surrogate absorption and carrier recombination contributions over the entire irradiance spectrum in an efficient way. Using this framework, we optimize a multi-layer structure consisting of ITO front coating, metallic back-reflector and oxide layers for achieving maximum efficiency. Our required computation time for an entire model fitting and optimization is 5 to 20 times less than the best previous optimization results based on direct Finite Difference Time Domain (FDTD) simulations, therefore proving the value of surrogate modeling. The resulting optimization solution suggests at least 50% improvement in the external quantum efficiency compared to bare silicon, and 25% improvement compared to a random design.

  8. Efficient computation of the Grünwald-Letnikov fractional diffusion derivative using adaptive time step memory

    NASA Astrophysics Data System (ADS)

    MacDonald, Christopher L.; Bhattacharya, Nirupama; Sprouse, Brian P.; Silva, Gabriel A.

    2015-09-01

    Computing numerical solutions to fractional differential equations can be computationally intensive due to the effect of non-local derivatives in which all previous time points contribute to the current iteration. In general, numerical approaches that depend on truncating part of the system history, while efficient, can suffer from high degrees of error and inaccuracy. Here we present an adaptive time step memory method for smooth functions applied to the Grünwald-Letnikov fractional diffusion derivative. This method is computationally efficient and results in smaller errors during numerical simulations. Sampled points along the system's history at progressively longer intervals are assumed to reflect the values of neighboring time points. By including progressively fewer points backward in time, a temporally 'weighted' history is computed that includes contributions from the entire past of the system, maintaining accuracy, but with fewer points actually calculated, greatly improving computational efficiency.
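
    The trade-off described above can be illustrated with a minimal sketch. It evaluates the Grünwald-Letnikov sum once with the full history and once with a fixed-window truncation, the kind of history cut-off the abstract says can be inaccurate. It does not implement the authors' adaptive time step memory method; the function names, the test function f(t) = t, and the window length are illustrative assumptions.

```python
import math


def gl_weights(alpha, n):
    """Grünwald-Letnikov weights w_k = (-1)^k * C(alpha, k) for k = 0..n."""
    w = [1.0]
    for k in range(1, n + 1):
        # Standard recurrence; avoids evaluating binomials or gamma functions.
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w


def gl_derivative(f, t, alpha, h, memory=None):
    """Approximate the order-alpha derivative of f at t with step h.

    memory=None uses the full history back to t = 0; an integer keeps only
    the `memory` most recent points (fixed-window truncation).
    """
    n = int(round(t / h))
    if memory is not None:
        n = min(n, memory)
    w = gl_weights(alpha, n)
    return sum(w[k] * f(t - k * h) for k in range(n + 1)) / h ** alpha


# Half-derivative of f(t) = t at t = 1 is t**0.5 / Gamma(1.5) = 2 / sqrt(pi).
exact = 2.0 / math.sqrt(math.pi)
full = gl_derivative(lambda t: t, 1.0, 0.5, 1e-3)
short = gl_derivative(lambda t: t, 1.0, 0.5, 1e-3, memory=100)
print("full-history error:", abs(full - exact))
print("truncated error:   ", abs(short - exact))
```

    The full-history sum costs one term per past time point at every step, which is the expense the abstract refers to; the truncated sum is cheaper but discards the slowly decaying tail of the weights.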

  9. Software Surface Modeling and Grid Generation Steering Committee

    NASA Technical Reports Server (NTRS)

    Smith, Robert E. (Editor)

    1992-01-01

    It is a NASA objective to promote improvements in the capability and efficiency of computational fluid dynamics. Grid generation, the creation of a discrete representation of the solution domain, is an essential part of computational fluid dynamics. However, grid generation about complex boundaries requires sophisticated surface-model descriptions of the boundaries. The surface modeling and the associated computation of surface grids consume an extremely large percentage of the total time required for volume grid generation. Efficient and user friendly software systems for surface modeling and grid generation are critical for computational fluid dynamics to reach its potential. The papers presented here represent the state-of-the-art in software systems for surface modeling and grid generation. Several papers describe improved techniques for grid generation.

  10. Evaluation of Emerging Energy-Efficient Heterogeneous Computing Platforms for Biomolecular and Cellular Simulation Workloads.

    PubMed

    Stone, John E; Hallock, Michael J; Phillips, James C; Peterson, Joseph R; Luthey-Schulten, Zaida; Schulten, Klaus

    2016-05-01

    Many of the continuing scientific advances achieved through computational biology are predicated on the availability of ongoing increases in computational power required for detailed simulation and analysis of cellular processes on biologically-relevant timescales. A critical challenge facing the development of future exascale supercomputer systems is the development of new computing hardware and associated scientific applications that dramatically improve upon the energy efficiency of existing solutions, while providing increased simulation, analysis, and visualization performance. Mobile computing platforms have recently become powerful enough to support interactive molecular visualization tasks that were previously only possible on laptops and workstations, creating future opportunities for their convenient use for meetings, remote collaboration, and as head mounted displays for immersive stereoscopic viewing. We describe early experiences adapting several biomolecular simulation and analysis applications for emerging heterogeneous computing platforms that combine power-efficient system-on-chip multi-core CPUs with high-performance massively parallel GPUs. We present low-cost power monitoring instrumentation that provides sufficient temporal resolution to evaluate the power consumption of individual CPU algorithms and GPU kernels. We compare the performance and energy efficiency of scientific applications running on emerging platforms with results obtained on traditional platforms, identify hardware and algorithmic performance bottlenecks that affect the usability of these platforms, and describe avenues for improving both the hardware and applications in pursuit of the needs of molecular modeling tasks on mobile devices and future exascale computers.

  11. Numerical convergence improvements for porflow unsaturated flow simulations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Flach, Greg

    2017-08-14

    Section 3.6 of SRNL (2016) discusses various PORFLOW code improvements to increase modeling efficiency, in preparation for the next E-Area Performance Assessment (WSRC 2008) revision. This memorandum documents interaction with Analytic & Computational Research, Inc. (http://www.acricfd.com/default.htm) to improve numerical convergence efficiency using PORFLOW version 6.42 for unsaturated flow simulations.

  12. Computer-assisted coding and clinical documentation: first things first.

    PubMed

    Tully, Melinda; Carmichael, Angela

    2012-10-01

    Computer-assisted coding tools have the potential to drive improvements in seven areas: transparency of coding; productivity (generally by 20 to 25 percent for inpatient claims); accuracy (by improving specificity of documentation); cost containment (by reducing overtime expenses, audit fees, and denials); compliance; efficiency; and consistency.

  13. Low-Energy Truly Random Number Generation with Superparamagnetic Tunnel Junctions for Unconventional Computing

    NASA Astrophysics Data System (ADS)

    Vodenicarevic, D.; Locatelli, N.; Mizrahi, A.; Friedman, J. S.; Vincent, A. F.; Romera, M.; Fukushima, A.; Yakushiji, K.; Kubota, H.; Yuasa, S.; Tiwari, S.; Grollier, J.; Querlioz, D.

    2017-11-01

    Low-energy random number generation is critical for many emerging computing schemes proposed to complement or replace von Neumann architectures. However, current random number generators are always associated with an energy cost that is prohibitive for these computing schemes. We introduce random number bit generation based on specific nanodevices: superparamagnetic tunnel junctions. We experimentally demonstrate high-quality random bit generation that represents an orders-of-magnitude improvement in energy efficiency over current solutions. We show that the random generation speed improves with nanodevice scaling, and we investigate the impact of temperature, magnetic field, and cross talk. Finally, we show how alternative computing schemes can be implemented using superparamagnetic tunnel junctions as random number generators. These results open the way for fabricating efficient hardware computing devices leveraging stochasticity, and they highlight an alternative use for emerging nanodevices.

  14. Parallel discontinuous Galerkin FEM for computing hyperbolic conservation law on unstructured grids

    NASA Astrophysics Data System (ADS)

    Ma, Xinrong; Duan, Zhijian

    2018-04-01

    High-order discontinuous Galerkin finite element methods (DGFEM) are known to be effective for solving the Euler and Navier-Stokes equations on unstructured grids, but they consume substantial computational resources. An efficient parallel algorithm was presented for solving the compressible Euler equations. Moreover, a multigrid strategy based on a three-stage, third-order TVD Runge-Kutta scheme was used to improve the computational efficiency of DGFEM and to accelerate the convergence of the solution of the unsteady compressible Euler equations. To keep the load balanced across processors, the domain decomposition method was employed. Numerical experiments were performed for inviscid transonic flow around the NACA0012 airfoil and the M6 wing. The results indicated that the parallel algorithm improves acceleration and efficiency significantly, making it suitable for computing complex flows.
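
    For readers unfamiliar with the time integrator named above, the following is a minimal sketch of one step of a three-stage, third-order TVD Runge-Kutta scheme. The abstract does not give coefficients, so the widely used Shu-Osher form is assumed here; `rhs` stands in for the spatial discretisation and is a hypothetical placeholder, shown for a scalar unknown only.

```python
def ssp_rk3_step(u, dt, rhs):
    """One step of the three-stage, third-order TVD Runge-Kutta scheme.

    Each stage is a convex combination of forward-Euler updates, which is
    what gives the scheme its TVD (strong-stability-preserving) property.
    """
    u1 = u + dt * rhs(u)
    u2 = 0.75 * u + 0.25 * (u1 + dt * rhs(u1))
    return u / 3.0 + (2.0 / 3.0) * (u2 + dt * rhs(u2))


# Usage on the scalar test problem u' = -u with u(0) = 1.
u_next = ssp_rk3_step(1.0, 0.1, lambda u: -u)
```

    For a linear right-hand side the step reproduces the Taylor expansion of the exact solution through the cubic term, which is a convenient check of third-order accuracy.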

  15. Efficient computation of kinship and identity coefficients on large pedigrees.

    PubMed

    Cheng, En; Elliott, Brendan; Ozsoyoglu, Z Meral

    2009-06-01

    With the rapidly expanding field of medical genetics and genetic counseling, genealogy information is becoming increasingly abundant. An important computation on pedigree data is the calculation of identity coefficients, which provide a complete description of the degree of relatedness of a pair of individuals. The areas of application of identity coefficients are numerous and diverse, from genetic counseling to disease tracking, and thus, the computation of identity coefficients merits special attention. However, the computation of identity coefficients is not done directly, but rather as the final step after computing a set of generalized kinship coefficients. In this paper, we first propose a novel Path-Counting Formula for calculating generalized kinship coefficients, which is motivated by Wright's path-counting method for computing the inbreeding coefficient. We then present an efficient and scalable scheme for calculating generalized kinship coefficients on large pedigrees using NodeCodes, a special encoding scheme for expediting the evaluation of queries on pedigree graph structures. Furthermore, we propose an improved scheme using Family NodeCodes for the computation of generalized kinship coefficients, which is motivated by the significant improvement of using Family NodeCodes for the inbreeding coefficient over the use of NodeCodes. We also perform experiments for evaluating the efficiency of our method, and compare it with the performance of the traditional recursive algorithm for three individuals. Experimental results demonstrate that the resulting scheme is more scalable and efficient than the traditional recursive methods for computing generalized kinship coefficients.
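
    As background for the "traditional recursive algorithm" used as a baseline above, here is a minimal sketch of the classical recursion for the ordinary two-individual kinship coefficient. It is not the generalized (three-individual) version evaluated in the paper, and it does not use NodeCodes. The toy pedigree, its id ordering, and the function name are illustrative assumptions.

```python
from functools import lru_cache

# Hypothetical toy pedigree: id -> (father, mother); founders have (None, None).
# Ids are assigned so that every parent has a smaller id than its children.
PEDIGREE = {
    1: (None, None),
    2: (None, None),
    3: (1, 2),
    4: (1, 2),
    5: (3, 4),   # offspring of full siblings
}


@lru_cache(maxsize=None)
def kinship(i, j):
    """Kinship coefficient of individuals i and j by the classical recursion."""
    if i is None or j is None:
        return 0.0                      # unknown parent: treated as unrelated
    if i < j:
        i, j = j, i                     # always recurse on the younger individual
    father, mother = PEDIGREE[i]
    if i == j:
        # Self-kinship is (1 + inbreeding coefficient) / 2.
        return 0.5 * (1.0 + kinship(father, mother))
    return 0.5 * (kinship(father, j) + kinship(mother, j))


print(kinship(3, 4))   # full siblings
print(kinship(5, 5))   # inbred individual
```

    Without the cache the recursion revisits the same ancestor pairs many times, which is one reason recursive evaluation scales poorly on large pedigrees.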

  16. Student Ability, Confidence, and Attitudes Toward Incorporating a Computer into a Patient Interview.

    PubMed

    Ray, Sarah; Valdovinos, Katie

    2015-05-25

    To improve pharmacy students' ability to effectively incorporate a computer into a simulated patient encounter and to improve their awareness of barriers and attitudes towards and their confidence in using a computer during simulated patient encounters. Students completed a survey that assessed their awareness of, confidence in, and attitudes towards computer use during simulated patient encounters. Students were evaluated with a rubric on their ability to incorporate a computer into a simulated patient encounter. Students were resurveyed and reevaluated after instruction. Students improved in their ability to effectively incorporate computer usage into a simulated patient encounter. They also became more aware of and improved their attitudes toward barriers regarding such usage and gained more confidence in their ability to use a computer during simulated patient encounters. Instruction can improve pharmacy students' ability to incorporate a computer into simulated patient encounters. This skill is critical to developing efficiency while maintaining rapport with patients.

  17. Improving the Efficiency and Effectiveness of Grading through the Use of Computer-Assisted Grading Rubrics

    ERIC Educational Resources Information Center

    Anglin, Linda; Anglin, Kenneth; Schumann, Paul L.; Kaliski, John A.

    2008-01-01

    This study tests the use of computer-assisted grading rubrics compared to other grading methods with respect to the efficiency and effectiveness of different grading processes for subjective assignments. The test was performed on a large Introduction to Business course. The students in this course were randomly assigned to four treatment groups…

  18. Introduction to Computer Aided Instruction in the Language Laboratory.

    ERIC Educational Resources Information Center

    Hughett, Harvey L.

    The first half of this book focuses on the rationale, ideas, and information for the use of technology, including microcomputers, to improve language teaching efficiency. Topics discussed include foreign language computer assisted instruction (CAI), hardware and software selection, computer literacy, educational computing organizations, ease of…

  19. Research related to improved computer aided design software package. [comparative efficiency of finite, boundary, and hybrid element methods in elastostatics]

    NASA Technical Reports Server (NTRS)

    Walston, W. H., Jr.

    1986-01-01

    The comparative computational efficiencies of the finite element (FEM), boundary element (BEM), and hybrid boundary element-finite element (HVFEM) analysis techniques are evaluated for representative bounded domain interior and unbounded domain exterior problems in elastostatics. Computational efficiency is carefully defined in this study as the computer time required to attain a specified level of solution accuracy. The study found the FEM superior to the BEM for the interior problem, while the reverse was true for the exterior problem. The hybrid analysis technique was found to be comparable or superior to both the FEM and BEM for both the interior and exterior problems.

  20. Radiation shielding evaluation of the BNCT treatment room at THOR: a TORT-coupled MCNP Monte Carlo simulation study.

    PubMed

    Chen, A Y; Liu, Y-W H; Sheu, R J

    2008-01-01

    This study investigates the radiation shielding design of the treatment room for boron neutron capture therapy at the Tsing Hua Open-pool Reactor using the "TORT-coupled MCNP" method. With this method, the computational efficiency is improved significantly, by two to three orders of magnitude compared to the analog Monte Carlo MCNP calculation. This makes the calculation feasible using a single CPU in less than 1 day. Further optimization of the photon weight windows leads to an additional 50-75% improvement in the overall computational efficiency.

  1. GPU-accelerated element-free reverse-time migration with Gauss points partition

    NASA Astrophysics Data System (ADS)

    Zhou, Zhen; Jia, Xiaofeng; Qiang, Xiaodong

    2018-06-01

    An element-free method (EFM) has been demonstrated successfully in elasticity, heat conduction and fatigue crack growth problems. We present the theory of EFM and its numerical applications in seismic modelling and reverse time migration (RTM). Compared with the finite difference method and the finite element method, the EFM has unique advantages: (1) independence of grids in computation and (2) lower expense and more flexibility (because only the information of the nodes and the boundary of the concerned area is required). However, in EFM, due to improper computation and storage of some large sparse matrices, such as the mass matrix and the stiffness matrix, the method is difficult to apply to seismic modelling and RTM for a large velocity model. To solve the problem of storage and computation efficiency, we propose a concept of Gauss points partition and utilise the graphics processing unit to improve the computational efficiency. We employ the compressed sparse row format to compress the intermediate large sparse matrices and attempt to simplify the operations by solving the linear equations with CULA solver. To improve the computation efficiency further, we introduce the concept of the lumped mass matrix. Numerical experiments indicate that the proposed method is accurate and more efficient than the regular EFM.
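
    Two of the ingredients mentioned above, compressed sparse row (CSR) storage and a lumped mass matrix, can be sketched in a few lines. This is a CPU-only illustration in plain Python, not the authors' GPU implementation. Row-sum lumping is assumed as the lumping rule, since the abstract does not say which one is used, and the small matrix is made up for the example.

```python
def to_csr(dense):
    """Convert a dense row-major matrix (list of lists) to CSR arrays."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0.0:
                data.append(value)       # nonzero values, row by row
                indices.append(col)      # column index of each stored value
        indptr.append(len(data))         # where the next row starts in `data`
    return data, indices, indptr


def csr_matvec(data, indices, indptr, x):
    """Compute y = A @ x touching only the stored nonzeros."""
    return [
        sum(data[p] * x[indices[p]] for p in range(indptr[i], indptr[i + 1]))
        for i in range(len(indptr) - 1)
    ]


def lump_rows(data, indptr):
    """Row-sum lumping: collapse each row onto a single diagonal entry."""
    return [sum(data[indptr[i]:indptr[i + 1]]) for i in range(len(indptr) - 1)]


M = [[4.0, 1.0, 0.0],
     [1.0, 4.0, 1.0],
     [0.0, 1.0, 4.0]]
data, indices, indptr = to_csr(M)
y = csr_matvec(data, indices, indptr, [1.0, 2.0, 3.0])
lumped = lump_rows(data, indptr)
```

    A lumped (diagonal) mass matrix is inverted entry by entry, so no linear solve is needed for it at each time step.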

  2. Evaluation of Emerging Energy-Efficient Heterogeneous Computing Platforms for Biomolecular and Cellular Simulation Workloads

    PubMed Central

    Stone, John E.; Hallock, Michael J.; Phillips, James C.; Peterson, Joseph R.; Luthey-Schulten, Zaida; Schulten, Klaus

    2016-01-01

    Many of the continuing scientific advances achieved through computational biology are predicated on the availability of ongoing increases in computational power required for detailed simulation and analysis of cellular processes on biologically-relevant timescales. A critical challenge facing the development of future exascale supercomputer systems is the development of new computing hardware and associated scientific applications that dramatically improve upon the energy efficiency of existing solutions, while providing increased simulation, analysis, and visualization performance. Mobile computing platforms have recently become powerful enough to support interactive molecular visualization tasks that were previously only possible on laptops and workstations, creating future opportunities for their convenient use for meetings, remote collaboration, and as head mounted displays for immersive stereoscopic viewing. We describe early experiences adapting several biomolecular simulation and analysis applications for emerging heterogeneous computing platforms that combine power-efficient system-on-chip multi-core CPUs with high-performance massively parallel GPUs. We present low-cost power monitoring instrumentation that provides sufficient temporal resolution to evaluate the power consumption of individual CPU algorithms and GPU kernels. We compare the performance and energy efficiency of scientific applications running on emerging platforms with results obtained on traditional platforms, identify hardware and algorithmic performance bottlenecks that affect the usability of these platforms, and describe avenues for improving both the hardware and applications in pursuit of the needs of molecular modeling tasks on mobile devices and future exascale computers. PMID:27516922

  3. Computational efficiency for the surface renewal method

    NASA Astrophysics Data System (ADS)

    Kelley, Jason; Higgins, Chad

    2018-04-01

    Measuring surface fluxes using the surface renewal (SR) method requires programmatic algorithms for tabulation, algebraic calculation, and data quality control. A number of different methods have been published describing automated calibration of SR parameters. Because the SR method utilizes high-frequency (10 Hz+) measurements, some steps in the flux calculation are computationally expensive, especially when automating SR to perform many iterations of these calculations. Several new algorithms were written that perform the required calculations more efficiently and rapidly, and that were tested for sensitivity to length of flux averaging period, ability to measure over a large range of lag timescales, and overall computational efficiency. These algorithms utilize signal processing techniques and algebraic simplifications that demonstrate simple modifications that dramatically improve computational efficiency. The results here complement efforts by other authors to standardize a robust and accurate computational SR method. Increased speed of computation time grants flexibility to implementing the SR method, opening new avenues for SR to be used in research, for applied monitoring, and in novel field deployments.

  4. Aerothermal modeling program, phase 2. Element B: Flow interaction experiment

    NASA Technical Reports Server (NTRS)

    Nikjooy, M.; Mongia, H. C.; Murthy, S. N. B.; Sullivan, J. P.

    1986-01-01

    The design process was improved and the efficiency, life, and maintenance costs of the turbine engine hot section were enhanced. Recently, there has been much emphasis on the need for improved numerical codes for the design of efficient combustors. For the development of improved computational codes, there is a need for an experimentally obtained data base to be used as test cases for the accuracy of the computations. The purpose of Element-B is to establish benchmark-quality velocity and scalar measurements of the flow interaction of circular jets with swirling flow typical of that in the dome region of an annular combustor. In addition to the detailed experimental effort, extensive computations of the swirling flows are to be compared with the measurements for the purpose of assessing the accuracy of current and advanced turbulence and scalar transport models.

  5. On Improving Efficiency of Differential Evolution for Aerodynamic Shape Optimization Applications

    NASA Technical Reports Server (NTRS)

    Madavan, Nateri K.

    2004-01-01

    Differential Evolution (DE) is a simple and robust evolutionary strategy that has been proven effective in determining the global optimum for several difficult optimization problems. Although DE offers several advantages over traditional optimization approaches, its use in applications such as aerodynamic shape optimization where the objective function evaluations are computationally expensive is limited by the large number of function evaluations often required. In this paper various approaches for improving the efficiency of DE are reviewed and discussed. These approaches are implemented in a DE-based aerodynamic shape optimization method that uses a Navier-Stokes solver for the objective function evaluations. Parallelization techniques on distributed computers are used to reduce turnaround times. Results are presented for the inverse design of a turbine airfoil. The efficiency improvements achieved by the different approaches are evaluated and compared.
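
    To make the cost argument above concrete, here is a minimal sketch of the classic DE/rand/1/bin variant. It is not the modified method reviewed in the paper. The control parameters (`F`, `CR`, population size), the clipping rule, and the cheap sphere objective are illustrative assumptions; in the setting described above each call to `f` would be a Navier-Stokes solve.

```python
import random


def differential_evolution(f, bounds, pop_size=20, F=0.5, CR=0.9,
                           generations=300, seed=0):
    """Minimise f over the box `bounds` with the classic DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [f(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct members, all different from the target i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)       # guarantees one mutated gene
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if j == j_rand or rng.random() < CR:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)   # clip to the box
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_cost = f(trial)             # the expensive call in practice
            if trial_cost <= cost[i]:         # greedy one-to-one selection
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def sphere(x):
    """Cheap stand-in objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_cost = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

    Each generation costs `pop_size` objective evaluations, so the total here is 20 + 20 × 300 = 6020 calls. That count is what becomes prohibitive when a single evaluation is a flow simulation, and the trial evaluations within a generation are independent, which is what makes parallelization natural.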

  6. Using Neural Net Technology To Enhance the Efficiency of a Computer Adaptive Testing Application.

    ERIC Educational Resources Information Center

    Van Nelson, C.; Henriksen, Larry W.

    The potential for computer adaptive testing (CAT) has been well documented. In order to improve the efficiency of this process, it may be possible to utilize a neural network, or more specifically, a back propagation neural network. The paper asserts that in order to accomplish this end, it must be shown that grouping examinees by ability as…

  7. Efficient convolutional sparse coding

    DOEpatents

    Wohlberg, Brendt

    2017-06-20

    Computationally efficient algorithms may be applied for fast dictionary learning solving the convolutional sparse coding problem in the Fourier domain. More specifically, efficient convolutional sparse coding may be derived within an alternating direction method of multipliers (ADMM) framework that utilizes fast Fourier transforms (FFT) to solve the main linear system in the frequency domain. Such algorithms may enable a significant reduction in computational cost over conventional approaches by implementing a linear solver for the most critical and computationally expensive component of the conventional iterative algorithm. The theoretical computational cost of the algorithm may be reduced from O(M^3 N) to O(MN log N), where N is the dimensionality of the data and M is the number of elements in the dictionary. This significant improvement in efficiency may greatly increase the range of problems that can practically be addressed via convolutional sparse representations.
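
    One way a per-frequency solve can drop from cubic to linear cost in M is sketched below. It assumes that, at each frequency, the system matrix is a scaled identity plus a rank-one term, in which case the Sherman-Morrison formula gives the solution in O(M) operations. Whether the patented solver takes exactly this form is an assumption; the abstract says only that a dedicated linear solver is used. The vectors `a` and `b` and the penalty `rho` are made-up illustrative values.

```python
def solve_rank_one_plus_identity(a, b, rho):
    """Solve (rho*I + a a^H) x = b in O(M) via the Sherman-Morrison formula."""
    a_h_b = sum(ai.conjugate() * bi for ai, bi in zip(a, b))   # a^H b
    a_h_a = sum(abs(ai) ** 2 for ai in a)                      # a^H a (real)
    scale = a_h_b / (rho + a_h_a)
    return [(bi - ai * scale) / rho for ai, bi in zip(a, b)]


def residual(a, b, rho, x):
    """Largest entry of |(rho*I + a a^H) x - b|, used as a correctness check."""
    a_h_x = sum(ai.conjugate() * xi for ai, xi in zip(a, x))
    return max(abs(rho * xi + ai * a_h_x - bi) for ai, xi, bi in zip(a, x, b))


a = [1 + 2j, -0.5j, 3 + 0j]      # hypothetical filter spectra at one frequency
b = [2 - 1j, 1 + 1j, -4j]        # hypothetical right-hand side
x = solve_rank_one_plus_identity(a, b, rho=0.7)
print(residual(a, b, 0.7, x))
```

    Repeating this O(M) solve at each of the N frequencies costs O(MN), so under this assumption the forward and inverse FFTs, at O(MN log N), would dominate the total.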

  8. Coalescent: an open-science framework for importance sampling in coalescent theory.

    PubMed

    Tewari, Susanta; Spouge, John L

    2015-01-01

    Background. In coalescent theory, computer programs often use importance sampling to calculate likelihoods and other statistical quantities. An importance sampling scheme can exploit human intuition to improve statistical efficiency of computations, but unfortunately, in the absence of general computer frameworks on importance sampling, researchers often struggle to translate new sampling schemes computationally or benchmark against different schemes, in a manner that is reliable and maintainable. Moreover, most studies use computer programs lacking a convenient user interface or the flexibility to meet the current demands of open science. In particular, current computer frameworks can only evaluate the efficiency of a single importance sampling scheme or compare the efficiencies of different schemes in an ad hoc manner. Results. We have designed a general framework (http://coalescent.sourceforge.net; language: Java; License: GPLv3) for importance sampling that computes likelihoods under the standard neutral coalescent model of a single, well-mixed population of constant size over time following infinite sites model of mutation. The framework models the necessary core concepts, comes integrated with several data sets of varying size, implements the standard competing proposals, and integrates tightly with our previous framework for calculating exact probabilities. For a given dataset, it computes the likelihood and provides the maximum likelihood estimate of the mutation parameter. Well-known benchmarks in the coalescent literature validate the accuracy of the framework. The framework provides an intuitive user interface with minimal clutter. For performance, the framework switches automatically to modern multicore hardware, if available. It runs on three major platforms (Windows, Mac and Linux). Extensive tests and coverage make the framework reliable and maintainable. Conclusions. In coalescent theory, many studies of computational efficiency consider only effective sample size. Here, we evaluate proposals in the coalescent literature, to discover that the order of efficiency among the three importance sampling schemes changes when one considers running time as well as effective sample size. We also describe a computational technique called "just-in-time delegation" available to improve the trade-off between running time and precision by constructing improved importance sampling schemes from existing ones. Thus, our systems approach is a potential solution to the "2^8 programs problem" highlighted by Felsenstein, because it provides the flexibility to include or exclude various features of similar coalescent models or importance sampling schemes.
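
    The distinction drawn above between effective sample size and running time can be sketched as follows. The effective sample size uses the standard Kish formula for importance weights. Dividing it by wall-clock time is only one simple way to fold running time into the comparison and is an illustrative assumption, not the metric defined by the framework; the function names are hypothetical.

```python
def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def ess_per_second(weights, seconds):
    """Effective samples produced per unit of running time."""
    return effective_sample_size(weights) / seconds


# Equal weights behave like independent draws; one dominant weight does not.
even = effective_sample_size([1.0, 1.0, 1.0, 1.0])
skewed = effective_sample_size([1.0, 0.0, 0.0, 0.0])
```

    A scheme with a higher effective sample size can still rank lower on a per-second basis if each draw is slow, which is the kind of reordering the abstract reports.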

  9. A strategy for improved computational efficiency of the method of anchored distributions

    NASA Astrophysics Data System (ADS)

    Over, Matthew William; Yang, Yarong; Chen, Xingyuan; Rubin, Yoram

    2013-06-01

    This paper proposes a strategy for improving the computational efficiency of model inversion using the method of anchored distributions (MAD) by "bundling" similar model parametrizations in the likelihood function. Inferring the likelihood function typically requires a large number of forward model (FM) simulations for each possible model parametrization; as a result, the process is quite expensive. To ease this prohibitive cost, we present an approximation for the likelihood function called bundling that relaxes the requirement for high quantities of FM simulations. This approximation redefines the conditional statement of the likelihood function as the probability of a set of similar model parametrizations "bundle" replicating field measurements, which we show is neither a model reduction nor a sampling approach to improving the computational efficiency of model inversion. To evaluate the effectiveness of these modifications, we compare the quality of predictions and computational cost of bundling relative to a baseline MAD inversion of 3-D flow and transport model parameters. Additionally, to aid understanding of the implementation we provide a tutorial for bundling in the form of a sample data set and script for the R statistical computing language. For our synthetic experiment, bundling achieved a 35% reduction in overall computational cost and had a limited negative impact on predicted probability distributions of the model parameters. Strategies for minimizing error in the bundling approximation, for enforcing similarity among the sets of model parametrizations, and for identifying convergence of the likelihood function are also presented.

  10. Smart integrated microsystems: the energy efficiency challenge (Conference Presentation) (Plenary Presentation)

    NASA Astrophysics Data System (ADS)

    Benini, Luca

    2017-06-01

    The "internet of everything" envisions trillions of connected objects loaded with high-bandwidth sensors requiring massive amounts of local signal processing, fusion, pattern extraction and classification. From the computational viewpoint, the challenge is formidable and can be addressed only by pushing computing fabrics toward massive parallelism and brain-like energy efficiency levels. CMOS technology can still take us a long way toward this goal, but technology scaling is losing steam. Energy efficiency improvement will increasingly hinge on architecture, circuits, design techniques such as heterogeneous 3D integration, mixed-signal preprocessing, event-based approximate computing and non-Von-Neumann architectures for scalable acceleration.

  11. Carbon nanotube computer.

    PubMed

    Shulaker, Max M; Hills, Gage; Patil, Nishant; Wei, Hai; Chen, Hong-Yu; Wong, H-S Philip; Mitra, Subhasish

    2013-09-26

    The miniaturization of electronic devices has been the principal driving force behind the semiconductor industry, and has brought about major improvements in computational power and energy efficiency. Although advances with silicon-based electronics continue to be made, alternative technologies are being explored. Digital circuits based on transistors fabricated from carbon nanotubes (CNTs) have the potential to outperform silicon by improving the energy-delay product, a metric of energy efficiency, by more than an order of magnitude. Hence, CNTs are an exciting complement to existing semiconductor technologies. Owing to substantial fundamental imperfections inherent in CNTs, however, only very basic circuit blocks have been demonstrated. Here we show how these imperfections can be overcome, and demonstrate the first computer built entirely using CNT-based transistors. The CNT computer runs an operating system that is capable of multitasking: as a demonstration, we perform counting and integer-sorting simultaneously. In addition, we implement 20 different instructions from the commercial MIPS instruction set to demonstrate the generality of our CNT computer. This experimental demonstration is the most complex carbon-based electronic system yet realized. It is a considerable advance because CNTs are prominent among a variety of emerging technologies that are being considered for the next generation of highly energy-efficient electronic systems.

  12. Improved numerical methods for turbulent viscous flows aerothermal modeling program, phase 2

    NASA Technical Reports Server (NTRS)

    Karki, K. C.; Patankar, S. V.; Runchal, A. K.; Mongia, H. C.

    1988-01-01

    The details of a study to develop accurate and efficient numerical schemes to predict complex flows are described. In this program, several discretization schemes were evaluated using simple test cases. This assessment led to the selection of three schemes for an in-depth evaluation based on two-dimensional flows. The scheme with the superior overall performance was incorporated in a computer program for three-dimensional flows. To improve the computational efficiency, the selected discretization scheme was combined with a direct solution approach in which the fluid flow equations are solved simultaneously rather than sequentially.

  13. Local deformation for soft tissue simulation

    PubMed Central

    Omar, Nadzeri; Zhong, Yongmin; Smith, Julian; Gu, Chengfan

    2016-01-01

    This paper presents a new methodology to localize the deformation range to improve the computational efficiency for soft tissue simulation. This methodology identifies the local deformation range from the stress distribution in soft tissues due to an external force. A stress estimation method is used based on elastic theory to estimate the stress in soft tissues according to a depth from the contact surface. The proposed methodology can be used with both mass-spring and finite element modeling approaches for soft tissue deformation. Experimental results show that the proposed methodology can improve the computational efficiency while maintaining the modeling realism. PMID:27286482

  14. On Improving Efficiency of Differential Evolution for Aerodynamic Shape Optimization Applications

    NASA Technical Reports Server (NTRS)

    Madavan, Nateri K.

    2004-01-01

    Differential Evolution (DE) is a simple and robust evolutionary strategy that has been proven effective in determining the global optimum for several difficult optimization problems. Although DE offers several advantages over traditional optimization approaches, its use in applications such as aerodynamic shape optimization where the objective function evaluations are computationally expensive is limited by the large number of function evaluations often required. In this paper various approaches for improving the efficiency of DE are reviewed and discussed. Several approaches that have proven effective for other evolutionary algorithms are modified and implemented in a DE-based aerodynamic shape optimization method that uses a Navier-Stokes solver for the objective function evaluations. Parallelization techniques on distributed computers are used to reduce turnaround times. Results are presented for standard test optimization problems and for the inverse design of a turbine airfoil. The efficiency improvements achieved by the different approaches are evaluated and compared.
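    To make the cost argument in the abstract above concrete, the following is a minimal sketch of a textbook DE variant (DE/rand/1/bin), not the authors' implementation. The function names, the parameter defaults, and the `sphere` test objective are assumptions chosen for illustration; in the paper's setting each call to `objective` would be a Navier-Stokes solve.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f_weight=0.8,
                           crossover=0.9, generations=200, seed=0):
    """Textbook DE/rand/1/bin sketch; all defaults are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the box constraints.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < crossover:
                    value = pop[a][d] + f_weight * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][d]
                trial.append(value)
            # One (expensive) objective evaluation per trial vector.
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


def sphere(x):
    """Cheap stand-in for an expensive objective function."""
    return sum(v * v for v in x)
```

    With these defaults the loop performs pop_size * (generations + 1) objective evaluations, which is the quantity that becomes prohibitive when every evaluation is a flow solve. A call such as `differential_evolution(sphere, [(-5.0, 5.0)] * 3)` returns the best vector found and its cost.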

  15. Robust efficient video fingerprinting

    NASA Astrophysics Data System (ADS)

    Puri, Manika; Lubin, Jeffrey

    2009-02-01

    We have developed a video fingerprinting system with robustness and efficiency as the primary and secondary design criteria. In extensive testing, the system has shown robustness to cropping, letter-boxing, sub-titling, blur, drastic compression, frame rate changes, size changes and color changes, as well as to the geometric distortions often associated with camcorder capture in cinema settings. Efficiency is afforded by a novel two-stage detection process in which a fast matching process first computes a number of likely candidates, which are then passed to a second slower process that computes the overall best match with minimal false alarm probability. One key component of the algorithm is a maximally stable volume computation - a three-dimensional generalization of maximally stable extremal regions - that provides a content-centric coordinate system for subsequent hash function computation, independent of any affine transformation or extensive cropping. Other key features include an efficient bin-based polling strategy for initial candidate selection, and a final SIFT feature-based computation for final verification. We describe the algorithm and its performance, and then discuss additional modifications that can provide further improvement to efficiency and accuracy.

  16. Electronic and mechanical improvement of the receiving terminal of a free-space microwave power transmission system

    NASA Technical Reports Server (NTRS)

    Brown, W. C.

    1977-01-01

    Significant advancements were made in a number of areas: improved efficiency of the basic receiving element at low power density levels; improved resolution and confidence in efficiency measurements; mathematical modelling and computer simulation of the receiving element; and the design, construction, and testing of an environmentally protected two-plane construction suitable for low cost, highly automated construction of large receiving arrays.

  17. Optimal Computing Budget Allocation for Particle Swarm Optimization in Stochastic Optimization.

    PubMed

    Zhang, Si; Xu, Jie; Lee, Loo Hay; Chew, Ek Peng; Wong, Wai Peng; Chen, Chun-Hung

    2017-04-01

    Particle Swarm Optimization (PSO) is a popular metaheuristic for deterministic optimization. Originating in interpretations of the movement of individuals in a bird flock or fish school, PSO introduces the concepts of personal best and global best to simulate the pattern of searching for food by flocking and successfully translates the natural phenomena to the optimization of complex functions. Many real-life applications of PSO cope with stochastic problems. To solve a stochastic problem using PSO, a straightforward approach is to equally allocate computational effort among all particles and obtain the same number of samples of fitness values. This is not an efficient use of computational budget and leaves considerable room for improvement. This paper proposes a seamless integration of the concept of optimal computing budget allocation (OCBA) into PSO to improve the computational efficiency of PSO for stochastic optimization problems. We derive an asymptotically optimal allocation rule to intelligently determine the number of samples for all particles such that the PSO algorithm can efficiently select the personal best and global best when there is stochastic estimation noise in fitness values. We also propose an easy-to-implement sequential procedure. Numerical tests show that our new approach can obtain much better results using the same amount of computational effort.

  18. Optimal Computing Budget Allocation for Particle Swarm Optimization in Stochastic Optimization

    PubMed Central

    Zhang, Si; Xu, Jie; Lee, Loo Hay; Chew, Ek Peng; Chen, Chun-Hung

    2017-01-01

    Particle Swarm Optimization (PSO) is a popular metaheuristic for deterministic optimization. Originating in interpretations of the movement of individuals in a bird flock or fish school, PSO introduces the concepts of personal best and global best to simulate the pattern of searching for food by flocking and successfully translates the natural phenomena to the optimization of complex functions. Many real-life applications of PSO cope with stochastic problems. To solve a stochastic problem using PSO, a straightforward approach is to equally allocate computational effort among all particles and obtain the same number of samples of fitness values. This is not an efficient use of computational budget and leaves considerable room for improvement. This paper proposes a seamless integration of the concept of optimal computing budget allocation (OCBA) into PSO to improve the computational efficiency of PSO for stochastic optimization problems. We derive an asymptotically optimal allocation rule to intelligently determine the number of samples for all particles such that the PSO algorithm can efficiently select the personal best and global best when there is stochastic estimation noise in fitness values. We also propose an easy-to-implement sequential procedure. Numerical tests show that our new approach can obtain much better results using the same amount of computational effort. PMID:29170617

  19. An approximate solution to improve computational efficiency of impedance-type payload load prediction

    NASA Technical Reports Server (NTRS)

    White, C. W.

    1981-01-01

    The computational efficiency of the impedance type loads prediction method was studied. Three goals were addressed: devise a method to make the impedance method operate more efficiently in the computer; assess the accuracy and convenience of the method for determining the effect of design changes; and investigate the use of the method to identify design changes for reduction of payload loads. The method is suitable for calculation of dynamic response in either the frequency or time domain. It is concluded that: the choice of an orthogonal coordinate system will allow the impedance method to operate more efficiently in the computer; the approximate mode impedance technique is adequate for determining the effect of design changes, and is applicable for both statically determinate and statically indeterminate payload attachments; and beneficial design changes to reduce payload loads can be identified by the combined application of impedance techniques and energy distribution review techniques.

  20. Improving Computational Efficiency of Prediction in Model-Based Prognostics Using the Unscented Transform

    NASA Technical Reports Server (NTRS)

    Daigle, Matthew John; Goebel, Kai Frank

    2010-01-01

    Model-based prognostics captures system knowledge in the form of physics-based models of components, and how they fail, in order to obtain accurate predictions of end of life (EOL). EOL is predicted based on the estimated current state distribution of a component and expected profiles of future usage. In general, this requires simulations of the component using the underlying models. In this paper, we develop a simulation-based prediction methodology that achieves computational efficiency by performing only the minimal number of simulations needed in order to accurately approximate the mean and variance of the complete EOL distribution. This is performed through the use of the unscented transform, which predicts the means and covariances of a distribution passed through a nonlinear transformation. In this case, the EOL simulation acts as that nonlinear transformation. In this paper, we review the unscented transform, and describe how this concept is applied to efficient EOL prediction. As a case study, we develop a physics-based model of a solenoid valve, and perform simulation experiments to demonstrate improved computational efficiency without sacrificing prediction accuracy.
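    The idea of approximating the EOL mean and variance from a handful of simulations can be illustrated with a minimal one-dimensional sketch of the standard symmetric unscented transform. This is not the authors' code: the function names, the scaling parameter `kappa`, and the `toy_eol` stand-in for an EOL simulation are assumptions made for illustration only.

```python
import math


def unscented_transform_1d(mean, var, simulate, kappa=2.0):
    """Propagate a scalar mean/variance through `simulate` using 3 sigma points.

    Uses the standard symmetric sigma-point set for state dimension n = 1;
    `kappa` is the usual scaling parameter (its default here is an assumption).
    """
    n = 1
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    # Only three simulations are run, one per sigma point.
    outputs = [simulate(p) for p in points]
    out_mean = sum(w * y for w, y in zip(weights, outputs))
    out_var = sum(w * (y - out_mean) ** 2 for w, y in zip(weights, outputs))
    return out_mean, out_var


def toy_eol(wear):
    """Hypothetical nonlinear 'simulation': remaining life shrinks with wear."""
    return 100.0 / (1.0 + wear)
```

    For example, `unscented_transform_1d(0.5, 0.01, toy_eol)` estimates the mean and variance of the toy EOL from three evaluations of `toy_eol`, where a sampling-based approach would typically need many more. For a state of dimension n the same construction uses 2n + 1 sigma points.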

  1. Development of Efficient Real-Fluid Model in Simulating Liquid Rocket Injector Flows

    NASA Technical Reports Server (NTRS)

    Cheng, Gary; Farmer, Richard

    2003-01-01

    The characteristics of propellant mixing near the injector have a profound effect on the liquid rocket engine performance. However, the flow features near the injector of liquid rocket engines are extremely complicated; for example, supercritical-pressure spray, turbulent mixing, and chemical reactions are present. Previously, a homogeneous spray approach with a real-fluid property model was developed to account for the compressibility and evaporation effects such that thermodynamic properties of a mixture at a wide range of pressures and temperatures can be properly calculated, including liquid-phase, gas-phase, two-phase, and dense fluid regions. The developed homogeneous spray model demonstrated good success in simulating uni-element shear coaxial injector spray combustion flows. However, the real-fluid model suffered a computational deficiency when applied to a pressure-based computational fluid dynamics (CFD) code. The deficiency is caused by the pressure and enthalpy being the independent variables in the solution procedure of a pressure-based code, whereas the real-fluid model utilizes density and temperature as independent variables. The objective of the present research work is to improve the computational efficiency of the real-fluid property model in computing thermal properties. The proposed approach is called an efficient real-fluid model, and the improvement of computational efficiency is achieved by using a combination of a liquid species and a gaseous species to represent a real-fluid species.

  2. Multi-GPU hybrid programming accelerated three-dimensional phase-field model in binary alloy

    NASA Astrophysics Data System (ADS)

    Zhu, Changsheng; Liu, Jieqiong; Zhu, Mingfang; Feng, Li

    2018-03-01

    In dendritic growth simulation, the computational efficiency and the problem scale strongly influence the simulation efficiency of a three-dimensional phase-field model. Thus, finding a high performance calculation method that improves computational efficiency and expands the problem scale is of great significance to research on material microstructure. A high performance calculation method based on the MPI+CUDA hybrid programming model is introduced. Multiple GPUs are used to implement quantitative numerical simulations of a three-dimensional phase-field model in a binary alloy under the condition of coupled multi-physical processes. The acceleration effect of different GPU nodes on different calculation scales is explored. On the basis of the multi-GPU calculation model, two optimization schemes are proposed: non-blocking communication optimization, and optimization by overlapping MPI and GPU computing. The results of the two optimization schemes and the basic multi-GPU model are compared. The calculation results show that the multi-GPU calculation model clearly improves the computational efficiency of the three-dimensional phase-field model, reaching 13 times that of a single GPU, and the problem scales have been expanded to 8193. The feasibility of the two optimization schemes is shown, and the overlap of MPI and GPU computing performs better, reaching 1.7 times the performance of the basic multi-GPU model when 21 GPUs are used.

  3. A Foothold for Handhelds.

    ERIC Educational Resources Information Center

    Joyner, Amy

    2003-01-01

    Handheld computers provide students tremendous computing and learning power at about a 10th the cost of a regular computer. Describes the evolution of handhelds; provides some examples of their uses; and cites research indicating they are effective classroom tools that can improve efficiency and instruction. A sidebar lists handheld resources.…

  4. Quality indexing with computer-aided lexicography

    NASA Technical Reports Server (NTRS)

    Buchan, Ronald L.

    1992-01-01

    Indexing with computers is a far cry from indexing with the first indexing tool, the manual card sorter. With the aid of computer-aided lexicography, both indexing and indexing tools can provide standardization, consistency, and accuracy, resulting in greater quality control than ever before. A brief survey of computer activity in indexing is presented with detailed illustrations from NASA activity. Applications from techniques mentioned, such as Retrospective Indexing (RI), can be made to many indexing systems. In addition to improving the quality of indexing with computers, the improved efficiency with which certain tasks can be done is demonstrated.

  5. Performance of computer-designed small-size multistage depressed collectors for a high-perveance traveling wave tube

    NASA Technical Reports Server (NTRS)

    Ramins, P.

    1984-01-01

    Computer-designed axisymmetric 2.4-cm-diameter three-, four-, and five-stage depressed collectors were evaluated in conjunction with an octave bandwidth, high-perveance, and high-electronic-efficiency, griddled-gun traveling wave tube (TWT). Spent-beam refocusing was used to condition the beam for optimum entry into the depressed collectors. Both the TWT and multistage depressed collector (MDC) efficiencies were measured, as well as the MDC current, dissipated thermal power, and DC input power distributions, for the TWT operating both at saturation over its bandwidth and over its full dynamic range. Relatively high collector efficiencies were obtained, leading to a very substantial improvement in the overall TWT efficiency. In spite of large fixed TWT body losses (due largely to the 6 to 8 percent beam interception), average overall efficiencies of 45 to 47 percent (for three to five collector stages) were obtained at saturation across the 2.5- to 5.5-GHz operating band. For operation below saturation the collector efficiencies improved steadily, leading to reasonable ( 20 percent) overall efficiencies as far as 6 dB below saturation.

  6. Additional development of the XTRAN3S computer program

    NASA Technical Reports Server (NTRS)

    Borland, C. J.

    1989-01-01

    Additional developments and enhancements to the XTRAN3S computer program, a code for calculation of steady and unsteady aerodynamics, and associated aeroelastic solutions, for 3-D wings in the transonic flow regime are described. Algorithm improvements for the XTRAN3S program were provided including an implicit finite difference scheme to enhance the allowable time step and vectorization for improved computational efficiency. The code was modified to treat configurations with a fuselage, multiple stores/nacelles/pylons, and winglets. Computer program changes (updates) for error corrections and updates for version control are provided.

  7. On the generalized VIP time integral methodology for transient thermal problems

    NASA Technical Reports Server (NTRS)

    Mei, Youping; Chen, Xiaoqin; Tamma, Kumar K.; Sha, Desong

    1993-01-01

    The paper describes the development and applicability of a generalized VIrtual-Pulse (VIP) time integral method of computation for thermal problems. Unlike past approaches for general heat transfer computations, and with the advent of high speed computing technology and the importance of parallel computations for efficient use of computing environments, a major motivation via the developments described in this paper is the need for developing explicit computational procedures with improved accuracy and stability characteristics. As a consequence, a new and effective VIP methodology is described which inherits these improved characteristics. Numerical illustrative examples are provided to demonstrate the developments and validate the results obtained for thermal problems.

  8. Energy 101: Energy Efficient Data Centers

    ScienceCinema

    None

    2018-04-16

    Data centers provide mission-critical computing functions vital to the daily operation of top U.S. economic, scientific, and technological organizations. These data centers consume large amounts of energy to run and maintain their computer systems, servers, and associated high-performance components—up to 3% of all U.S. electricity powers data centers. And as more information comes online, data centers will consume even more energy. Data centers can become more energy efficient by incorporating features like power-saving "stand-by" modes, energy monitoring software, and efficient cooling systems instead of energy-intensive air conditioners. These and other efficiency improvements to data centers can produce significant energy savings, reduce the load on the electric grid, and help protect the nation by increasing the reliability of critical computer operations.

  9. Efficient storage, computation, and exposure of computer-generated holograms by electron-beam lithography.

    PubMed

    Newman, D M; Hawley, R W; Goeckel, D L; Crawford, R D; Abraham, S; Gallagher, N C

    1993-05-10

    An efficient storage format was developed for computer-generated holograms for use in electron-beam lithography. This method employs run-length encoding and Lempel-Ziv-Welch compression and succeeds in exposing holograms that were previously infeasible owing to the hologram's tremendous pattern-data file size. These holograms also require significant computation; thus the algorithm was implemented on a parallel computer, which improved performance by 2 orders of magnitude. The decompression algorithm was integrated into the Cambridge electron-beam machine's front-end processor. Although this provides much-needed ability, some hardware enhancements will be required in the future to overcome inadequacies in the current front-end processor that result in a lengthy exposure time.
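    Run-length encoding, the first of the two compression steps named above, can be sketched in a few lines. This is a generic illustration on one row of a binary pattern, not the storage format from the paper; the function names and the (value, count) layout are assumptions, and the Lempel-Ziv-Welch stage is not shown.

```python
def rle_encode(row):
    """Collapse consecutive equal values into (value, count) pairs."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([pixel, 1])  # start a new run
    return [(value, count) for value, count in runs]


def rle_decode(runs):
    """Expand (value, count) pairs back into the original row."""
    row = []
    for value, count in runs:
        row.extend([value] * count)
    return row
```

    Long uniform stretches, which are common in exposure patterns, shrink to a single pair, while `rle_decode(rle_encode(row))` reproduces the row exactly.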

  10. Energy Efficient Engine (E3) controls and accessories detail design report

    NASA Technical Reports Server (NTRS)

    Beitler, R. S.; Lavash, J. P.

    1982-01-01

    An Energy Efficient Engine program has been established by NASA to develop technology for improving the energy efficiency of future commercial transport aircraft engines. As part of this program, a new turbofan engine was designed. This report describes the fuel and control system for this engine. The system design is based on many of the proven concepts and component designs used on the General Electric CF6 family of engines. One significant difference is the incorporation of digital electronic computation in place of the hydromechanical computation currently used.

  11. IMPROVING TACONITE PROCESSING PLANT EFFICIENCY BY COMPUTER SIMULATION, Final Report

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    William M. Bond; Salih Ersayin

    2007-03-30

    This project involved industrial scale testing of a mineral processing simulator to improve the efficiency of a taconite processing plant, namely the Minorca mine. The Concentrator Modeling Center at the Coleraine Minerals Research Laboratory, University of Minnesota Duluth, enhanced the capabilities of available software, Usim Pac, by developing mathematical models needed for accurate simulation of taconite plants. This project provided funding for this technology to prove itself in the industrial environment. As the first step, data representing existing plant conditions were collected by sampling and sample analysis. Data were then balanced and provided a basis for assessing the efficiency of individual devices and the plant, and also for performing simulations aimed at improving plant efficiency. Performance evaluation served as a guide in developing alternative process strategies for more efficient production. A large number of computer simulations were then performed to quantify the benefits and effects of implementing these alternative schemes. Modification of makeup ball size was selected as the most feasible option for the target performance improvement. This was combined with replacement of existing hydrocyclones with more efficient ones. After plant implementation of these modifications, plant sampling surveys were carried out to validate findings of the simulation-based study. Plant data showed very good agreement with the simulated data, confirming results of simulation. After the implementation of modifications in the plant, several upstream bottlenecks became visible. Despite these bottlenecks limiting full capacity, concentrator energy improvement of 7% was obtained. Further improvements in energy efficiency are expected in the near future. The success of this project demonstrated the feasibility of a simulation-based approach. Currently, the Center provides simulation-based service to all the iron ore mining companies operating in northern Minnesota, and future proposals are pending with non-taconite mineral processing applications.

  12. Accurate first-principles structures and energies of diversely bonded systems from an efficient density functional

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sun, Jianwei; Remsing, Richard C.; Zhang, Yubo

    2016-06-13

    One atom or molecule binds to another through various types of bond, the strengths of which range from several meV to several eV. Although some computational methods can provide accurate descriptions of all bond types, those methods are not efficient enough for many studies (for example, large systems, ab initio molecular dynamics and high-throughput searches for functional materials). Here, we show that the recently developed non-empirical strongly constrained and appropriately normed (SCAN) meta-generalized gradient approximation (meta-GGA) within the density functional theory framework predicts accurate geometries and energies of diversely bonded molecules and materials (including covalent, metallic, ionic, hydrogen and van der Waals bonds). This represents a significant improvement at comparable efficiency over its predecessors, the GGAs that currently dominate materials computation. Often, SCAN matches or improves on the accuracy of a computationally expensive hybrid functional, at almost-GGA cost. SCAN is therefore expected to have a broad impact on chemistry and materials science.

  13. Accurate first-principles structures and energies of diversely bonded systems from an efficient density functional.

    PubMed

    Sun, Jianwei; Remsing, Richard C; Zhang, Yubo; Sun, Zhaoru; Ruzsinszky, Adrienn; Peng, Haowei; Yang, Zenghui; Paul, Arpita; Waghmare, Umesh; Wu, Xifan; Klein, Michael L; Perdew, John P

    2016-09-01

    One atom or molecule binds to another through various types of bond, the strengths of which range from several meV to several eV. Although some computational methods can provide accurate descriptions of all bond types, those methods are not efficient enough for many studies (for example, large systems, ab initio molecular dynamics and high-throughput searches for functional materials). Here, we show that the recently developed non-empirical strongly constrained and appropriately normed (SCAN) meta-generalized gradient approximation (meta-GGA) within the density functional theory framework predicts accurate geometries and energies of diversely bonded molecules and materials (including covalent, metallic, ionic, hydrogen and van der Waals bonds). This represents a significant improvement at comparable efficiency over its predecessors, the GGAs that currently dominate materials computation. Often, SCAN matches or improves on the accuracy of a computationally expensive hybrid functional, at almost-GGA cost. SCAN is therefore expected to have a broad impact on chemistry and materials science.

  14. Efficient high-order structure-preserving methods for the generalized Rosenau-type equation with power law nonlinearity

    NASA Astrophysics Data System (ADS)

    Cai, Jiaxiang; Liang, Hua; Zhang, Chun

    2018-06-01

    Based on the multi-symplectic Hamiltonian formula of the generalized Rosenau-type equation, a multi-symplectic scheme and an energy-preserving scheme are proposed. To improve the accuracy of the solution, we apply the composition technique to the obtained schemes to develop high-order schemes which are also multi-symplectic and energy-preserving respectively. Discrete fast Fourier transform makes a significant improvement to the computational efficiency of schemes. Numerical results verify that all the proposed schemes have satisfactory performance in providing accurate solution and preserving the discrete mass and energy invariants. Numerical results also show that although each basic time step is divided into several composition steps, the computational efficiency of the composition schemes is much higher than that of the non-composite schemes.

  15. Usability of an Adaptive Computer Assistant that Improves Self-care and Health Literacy of Older Adults

    PubMed Central

    Blanson Henkemans, O. A.; Rogers, W. A.; Fisk, A. D.; Neerincx, M. A.; Lindenberg, J.; van der Mast, C. A. P. G.

    2014-01-01

    Summary. Objectives: We developed an adaptive computer assistant for the supervision of diabetics’ self-care, to support limiting illness and need for acute treatment, and improve health literacy. This assistant monitors self-care activities logged in the patient’s electronic diary. Accordingly, it provides context-aware feedback. The objective was to evaluate whether older adults in general can make use of the computer assistant and to compare an adaptive computer assistant with a fixed one, concerning its usability and contribution to health literacy. Methods: We conducted a laboratory experiment in the Georgia Tech Aware Home wherein 28 older adults participated in a usability evaluation of the computer assistant, while engaged in scenarios reflecting normal and health-critical situations. We evaluated the assistant on effectiveness, efficiency, satisfaction, and educational value. Finally, we studied the moderating effects of the subjects’ personal characteristics. Results: Logging self-care tasks and receiving feedback from the computer assistant enhanced the subjects’ knowledge of diabetes. The adaptive assistant was more effective in dealing with normal and health-critical situations, and, generally, it led to more time efficiency. Subjects’ personal characteristics had substantial effects on the effectiveness and efficiency of the two computer assistants. Conclusions: Older adults were able to use the adaptive computer assistant. In addition, it had a positive effect on the development of health literacy. The assistant has the potential to support older diabetics’ self care while maintaining quality of life. PMID:18213433

  16. Dynamic water allocation policies improve the global efficiency of storage systems

    NASA Astrophysics Data System (ADS)

    Niayifar, Amin; Perona, Paolo

    2017-06-01

    Water impoundment by dams strongly affects the river natural flow regime, its attributes and the related ecosystem biodiversity. Fostering the sustainability of water uses (e.g., hydropower systems) thus implies searching for innovative operational policies able to generate Dynamic Environmental Flows (DEF) that mimic natural flow variability. The objective of this study is to propose a Direct Policy Search (DPS) framework based on defining dynamic flow release rules to improve the global efficiency of storage systems. The water allocation policies proposed for dammed systems are an extension of previously developed flow redistribution rules for small hydropower plants by Razurel et al. (2016). The mathematical form of the Fermi-Dirac statistical distribution applied to lake equations for the stored water in the dam is used to formulate non-proportional redistribution rules that partition the flow for energy production and environmental use. While energy production is computed from technical data, riverine ecological benefits associated with DEF are computed by integrating the Weighted Usable Area (WUA) for fishes with Richter's hydrological indicators. Then, multiobjective evolutionary algorithms (MOEAs) are applied to build an ecological versus economic efficiency plot and locate its (Pareto) frontier. This study benchmarks two MOEAs (NSGA II and Borg MOEA) and compares their efficiency in terms of the quality of Pareto's frontier and computational cost. A detailed analysis of dam characteristics is performed to examine their impact on the global system efficiency and choice of the best redistribution rule. Finally, it is found that non-proportional flow releases can statistically improve the global efficiency, specifically the ecological one, of the hydropower system when compared to constant minimal flows.

  17. High performance transcription factor-DNA docking with GPU computing

    PubMed Central

    2012-01-01

    Background: Protein-DNA docking is a very challenging problem in structural bioinformatics and has important implications in a number of applications, such as structure-based prediction of transcription factor binding sites and rational drug design. Protein-DNA docking is very computationally demanding due to the high cost of energy calculation and the statistical nature of conformational sampling algorithms. More importantly, experiments show that the docking quality depends on the coverage of the conformational sampling space. It is therefore desirable to accelerate the computation of the docking algorithm, not only to reduce computing time, but also to improve docking quality. Methods: In an attempt to accelerate the sampling process and to improve the docking performance, we developed a graphics processing unit (GPU)-based protein-DNA docking algorithm. The algorithm employs a potential-based energy function to describe the binding affinity of a protein-DNA pair, and integrates Monte-Carlo simulation and a simulated annealing method to search through the conformational space. Algorithmic techniques were developed to improve the computation efficiency and scalability on GPU-based high performance computing systems. Results: The effectiveness of our approach is tested on a non-redundant set of 75 TF-DNA complexes and a newly developed TF-DNA docking benchmark. We demonstrated that the GPU-based docking algorithm can significantly accelerate the simulation process and thereby improve the chance of finding near-native TF-DNA complex structures. This study also suggests that further improvement in protein-DNA docking research would require efforts from two integral aspects: improvement in computation efficiency and energy function design. Conclusions: We present a high performance computing approach for improving the prediction accuracy of protein-DNA docking. The GPU-based docking algorithm accelerates the search of the conformational space and thus increases the chance of finding more near-native structures. To the best of our knowledge, this is the first ad hoc effort of applying GPU or GPU clusters to the protein-DNA docking problem. PMID:22759575

  18. Computational efficiency improvements for image colorization

    NASA Astrophysics Data System (ADS)

    Yu, Chao; Sharma, Gaurav; Aly, Hussein

    2013-03-01

    We propose an efficient algorithm for colorization of greyscale images. As in prior work, colorization is posed as an optimization problem: a user specifies the color for a few scribbles drawn on the greyscale image and the color image is obtained by propagating color information from the scribbles to surrounding regions, while maximizing the local smoothness of colors. In this formulation, colorization is obtained by solving a large sparse linear system, which normally requires substantial computation and memory resources. Our algorithm improves the computational performance through three innovations over prior colorization implementations. First, the linear system is solved iteratively without explicitly constructing the sparse matrix, which significantly reduces the required memory. Second, we formulate each iteration in terms of integral images obtained by dynamic programming, reducing repetitive computation. Third, we use a coarse-to-fine framework, where a lower resolution subsampled image is first colorized and this low resolution color image is upsampled to initialize the colorization process for the fine level. The improvements we develop provide significant speedup and memory savings compared to the conventional approach of solving the linear system directly using off-the-shelf sparse solvers, and allow us to colorize images with typical sizes encountered in realistic applications on typical commodity computing platforms.
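
    The integral images mentioned in this abstract can be sketched as follows. This is a generic summed-area table built by dynamic programming, not the paper's colorization code; the function names and the 3x3 sample image are assumptions for illustration.

```python
def integral_image(img):
    """Return a (rows+1) x (cols+1) summed-area table with a zero top row and left column."""
    rows, cols = len(img), len(img[0])
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0  # running sum of the current image row
        for c in range(cols):
            row_sum += img[r][c]
            # Everything above (already computed) plus the current row so far.
            sat[r + 1][c + 1] = sat[r][c + 1] + row_sum
    return sat


def box_sum(sat, top, left, bottom, right):
    """Sum of the pixels in rows [top, bottom) and columns [left, right) in O(1)."""
    return sat[bottom][right] - sat[top][right] - sat[bottom][left] + sat[top][left]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
sat = integral_image(image)
window_total = box_sum(sat, 1, 1, 3, 3)  # covers the pixels 5, 6, 8, 9
```

    Once the table is built, any rectangular window sum costs four lookups regardless of window size, which is how repeated local sums in an iterative solver can avoid redundant work.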

  19. Dynamic Load-Balancing for Distributed Heterogeneous Computing of Parallel CFD Problems

    NASA Technical Reports Server (NTRS)

    Ecer, A.; Chien, Y. P.; Boenisch, T.; Akay, H. U.

    2000-01-01

    The developed methodology is aimed at improving the efficiency of executing block-structured algorithms on parallel, distributed, heterogeneous computers. The basic approach of these algorithms is to divide the flow domain into many subdomains called blocks, and solve the governing equations over these blocks. The dynamic load balancing problem is defined as the efficient distribution of the blocks among the available processors over a period of several hours of computations. In environments with computers of different architecture, operating systems, CPU speed, memory size, load, and network speed, balancing the loads and managing the communication between processors becomes crucial. Load balancing software tools for mutually dependent parallel processes have been created to efficiently utilize an advanced computation environment and algorithms. These tools are dynamic in nature because of the changes in the computer environment during execution time. More recently, these tools were extended to a second operating system: NT. In this paper, the problems associated with this application will be discussed. Also, the developed algorithms were combined with the load sharing capability of LSF to efficiently utilize workstation clusters for parallel computing. Finally, results will be presented on running the NASA-based code ADPAC to demonstrate the developed tools for dynamic load balancing.
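
    The idea of distributing blocks among processors of different speeds can be illustrated with a simple greedy heuristic. This is only a sketch under assumptions: it is not the load balancing tool described in the abstract, it ignores communication cost, and the block costs and relative processor speeds are made-up numbers.

```python
def balance_blocks(block_costs, speeds):
    """Greedy assignment: take the largest block first and give it to the
    processor that would finish it soonest, given its relative speed."""
    finish = [0.0] * len(speeds)          # projected busy time per processor
    assignment = [[] for _ in speeds]     # block indices per processor
    order = sorted(range(len(block_costs)), key=lambda b: -block_costs[b])
    for b in order:
        p = min(range(len(speeds)),
                key=lambda i: finish[i] + block_costs[b] / speeds[i])
        finish[p] += block_costs[b] / speeds[p]
        assignment[p].append(b)
    return assignment, finish


costs = [8.0, 4.0, 4.0, 2.0, 2.0]   # hypothetical work per block
speeds = [2.0, 1.0]                 # hypothetical relative processor speeds
assignment, finish = balance_blocks(costs, speeds)
```

    Re-running such an assignment with freshly measured speeds is one simple way to react to the changes in the computing environment that make the problem dynamic.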

  20. A one-model approach based on relaxed combinations of inputs for evaluating input congestion in DEA

    NASA Astrophysics Data System (ADS)

    Khodabakhshi, Mohammad

    2009-08-01

    This paper provides a one-model approach to input congestion based on the input relaxation model developed in data envelopment analysis (e.g. [G.R. Jahanshahloo, M. Khodabakhshi, Suitable combination of inputs for improving outputs in DEA with determining input congestion -- Considering textile industry of China, Applied Mathematics and Computation (1) (2004) 263-273; G.R. Jahanshahloo, M. Khodabakhshi, Determining assurance interval for non-Archimedean ele improving outputs model in DEA, Applied Mathematics and Computation 151 (2) (2004) 501-506; M. Khodabakhshi, A super-efficiency model based on improved outputs in data envelopment analysis, Applied Mathematics and Computation 184 (2) (2007) 695-703; M. Khodabakhshi, M. Asgharian, An input relaxation measure of efficiency in stochastic data analysis, Applied Mathematical Modelling 33 (2009) 2010-2023]). This approach reduces the three problems that must be solved in the two-model approach introduced in the first of the above-mentioned references to two problems, which is important from a computational point of view. The model is applied to a set of data extracted from the ISI database to estimate the input congestion of 12 Canadian business schools.

  1. Final Report Collaborative Project. Improving the Representation of Coastal and Estuarine Processes in Earth System Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bryan, Frank; Dennis, John; MacCready, Parker

    This project aimed to improve long term global climate simulations by resolving and enhancing the representation of the processes involved in the cycling of freshwater through estuaries and coastal regions. This was a collaborative multi-institution project consisting of physical oceanographers, climate model developers, and computational scientists. It specifically targeted the DOE objectives of advancing simulation and predictive capability of climate models through improvements in resolution and physical process representation. The main computational objectives were: 1. To develop computationally efficient, but physically based, parameterizations of estuary and continental shelf mixing processes for use in an Earth System Model (CESM). 2. To develop a two-way nested regional modeling framework in order to dynamically downscale the climate response of particular coastal ocean regions and to upscale the impact of the regional coastal processes to the global climate in an Earth System Model (CESM). 3. To develop computational infrastructure to enhance the efficiency of data transfer between specific sources and destinations, i.e., a point-to-point communication capability, (used in objective 1) within POP, the ocean component of CESM.

  2. Computationally efficient method for Fourier transform of highly chirped pulses for laser and parametric amplifier modeling.

    PubMed

    Andrianov, Alexey; Szabo, Aron; Sergeev, Alexander; Kim, Arkady; Chvykov, Vladimir; Kalashnikov, Mikhail

    2016-11-14

    We developed an improved approach to calculate the Fourier transform of signals with arbitrarily large quadratic phase, which can be efficiently implemented in numerical simulations utilizing the fast Fourier transform. The proposed algorithm significantly reduces the computational cost of the Fourier transform of a highly chirped and stretched pulse by splitting it into two separate transforms of almost transform-limited pulses, thereby reducing the required grid size roughly by a factor of the pulse stretching. The application of our improved Fourier transform algorithm in the split-step method for numerical modeling of CPA and OPCPA shows excellent agreement with standard algorithms.

  3. Experimental and Computational Sonic Boom Assessment of Lockheed-Martin N+2 Low Boom Models

    NASA Technical Reports Server (NTRS)

    Cliff, Susan E.; Durston, Donald A.; Elmiligui, Alaa A.; Walker, Eric L.; Carter, Melissa B.

    2015-01-01

    Flight at speeds greater than the speed of sound is not permitted over land, primarily because of the noise and structural damage caused by sonic boom pressure waves of supersonic aircraft. Mitigation of sonic boom is a key focus area of the High Speed Project under NASA's Fundamental Aeronautics Program. The project is focusing on technologies to enable future civilian aircraft to fly efficiently with reduced sonic boom, engine and aircraft noise, and emissions. A major objective of the project is to improve both computational and experimental capabilities for design of low-boom, high-efficiency aircraft. NASA and industry partners are developing improved wind tunnel testing techniques and new pressure instrumentation to measure the weak sonic boom pressure signatures of modern vehicle concepts. In parallel, computational methods are being developed to provide rapid design and analysis of supersonic aircraft with improved meshing techniques that provide efficient, robust, and accurate on- and off-body pressures at several body lengths from vehicles with very low sonic boom overpressures. The maturity of these critical parallel efforts is necessary before low-boom flight can be demonstrated and commercial supersonic flight can be realized.

  4. Efficient visibility-driven medical image visualisation via adaptive binned visibility histogram.

    PubMed

    Jung, Younhyun; Kim, Jinman; Kumar, Ashnil; Feng, David Dagan; Fulham, Michael

    2016-07-01

    'Visibility' is a fundamental optical property that represents the observable, by users, proportion of the voxels in a volume during interactive volume rendering. The manipulation of this 'visibility' improves the volume rendering processes; for instance by ensuring the visibility of regions of interest (ROIs) or by guiding the identification of an optimal rendering view-point. The construction of visibility histograms (VHs), which represent the distribution of all the visibility of all voxels in the rendered volume, enables users to explore the volume with real-time feedback about occlusion patterns among spatially related structures during volume rendering manipulations. Volume rendered medical images have been a primary beneficiary of VH given the need to ensure that specific ROIs are visible relative to the surrounding structures, e.g. the visualisation of tumours that may otherwise be occluded by neighbouring structures. VH construction and its subsequent manipulations, however, are computationally expensive due to the histogram binning of the visibilities. This limits the real-time application of VH to medical images that have large intensity ranges and volume dimensions and require a large number of histogram bins. In this study, we introduce an efficient adaptive binned visibility histogram (AB-VH) in which a smaller number of histogram bins are used to represent the visibility distribution of the full VH. We adaptively bin medical images by using a cluster analysis algorithm that groups the voxels according to their intensity similarities into a smaller subset of bins while preserving the distribution of the intensity range of the original images. We increase efficiency by exploiting the parallel computation and multiple render targets (MRT) extension of the modern graphical processing units (GPUs) and this enables efficient computation of the histogram. 
We show the application of our method to single-modality computed tomography (CT), magnetic resonance (MR) imaging and multi-modality positron emission tomography-CT (PET-CT). In our experiments, the AB-VH markedly improved the computational efficiency of the VH construction and thus improved the subsequent VH-driven volume manipulations. This efficiency was achieved without major visual degradation of the VH or large numerical differences between the AB-VH and its full-bin counterpart. We applied several variants of the K-means clustering algorithm with varying Ks (the number of clusters) and found that higher values of K resulted in better performance at a lower computational gain. The AB-VH also had an improved performance when compared to the conventional method of down-sampling of the histogram bins (equal binning) for volume rendering visualisation. Copyright © 2016 Elsevier Ltd. All rights reserved.
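
    The adaptive binning step, in which voxel intensities are grouped into a small number of bins by K-means clustering, can be sketched in one dimension as below. This is a plain Lloyd-style K-means on a toy intensity list, not the authors' GPU implementation; the function names, the evenly spaced initial centres, and the sample intensities are assumptions.

```python
def kmeans_1d(values, k, iterations=20):
    """Lloyd's algorithm on scalar intensities; requires k >= 2.
    Centres start evenly spaced over the intensity range (an assumed initialisation)."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old centre when a cluster receives no values.
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers


def nearest_bin(value, centers):
    """Index of the adaptive bin (cluster centre) closest to an intensity."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


intensities = [0, 1, 2, 100, 101, 102]   # toy voxel intensities
centers = kmeans_1d(intensities, k=2)
bins = [nearest_bin(v, centers) for v in intensities]
```

    A histogram over `bins` then needs only K entries instead of one entry per intensity level, which is the source of the reduced binning cost described in the abstract.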

  5. The AAHA Computer Program. American Animal Hospital Association.

    PubMed

    Albers, J W

    1986-07-01

    The American Animal Hospital Association Computer Program should benefit all small animal practitioners. Through the availability of well-researched and well-developed certified software, veterinarians will have increased confidence in their purchase decisions. With the expansion of computer applications to improve practice management efficiency, veterinary computer systems will further justify their initial expense. The development of the Association's veterinary computer network will provide a variety of important services to the profession.

  6. Automatic domain updating technique for improving computational efficiency of 2-D flood-inundation simulation

    NASA Astrophysics Data System (ADS)

    Tanaka, T.; Tachikawa, Y.; Ichikawa, Y.; Yorozu, K.

    2017-12-01

    Floods are among the most hazardous disasters and cause serious damage to people and property around the world. To prevent or mitigate flood damage through early warning systems and/or river management planning, numerical modelling of flood-inundation processes is essential. In the literature, flood-inundation models have been extensively developed and improved to simulate flood flow over complex topography at high resolution. With increasing demands on flood-inundation modelling, its computational burden is now one of the key issues. The computational efficiency of the full shallow water equations has been improved from various perspectives, such as approximations of the momentum equations, parallelization techniques, and coarsening approaches. To support these techniques and further improve the computational efficiency of flood-inundation simulations, this study proposes an Automatic Domain Updating (ADU) method for 2-D flood-inundation simulation. The ADU method traces the wet and dry interface and automatically updates the simulation domain in response to the progress and recession of flood propagation. The updating algorithm is as follows: first, register the simulation cells that are potentially flooded at the initial stage (such as floodplains near river channels); then, if a registered cell is flooded, register its surrounding cells. The time for this additional process is kept small by checking only cells at the wet and dry interface. The computation time is reduced by skipping the processing of non-flooded areas. This algorithm is easily applied to any type of 2-D flood-inundation model. The proposed ADU method is implemented in 2-D local inertial equations for the Yodo River basin, Japan. Case studies for two flood events show that the simulation finishes in a time two to 10 times shorter than without the ADU method, while producing the same result.
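
    The registration rule of the updating algorithm described in this abstract can be sketched on a regular grid. This is an illustrative reading, not the authors' code: the function name, the use of the four edge-sharing neighbours as the "surrounding cells", and the 3x3 grid are assumptions, and the sketch covers only domain growth, not the recession of the flood.

```python
def update_domain(registered, newly_wet, shape):
    """Grow the simulation domain: for every registered cell that has just become
    wet, also register its 4-neighbours (an assumed neighbourhood definition)."""
    rows, cols = shape
    grown = set(registered)
    for r, c in newly_wet:
        if (r, c) not in registered:
            continue  # only registered cells are part of the simulation
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                grown.add((nr, nc))
    return grown


# One registered cell in the middle of a 3x3 grid becomes wet.
registered = {(1, 1)}
registered = update_domain(registered, newly_wet={(1, 1)}, shape=(3, 3))
```

    Passing only the cells that became wet in the latest step keeps the check confined to the wet and dry interface, so the bookkeeping cost does not grow with the full grid size.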

  7. A Computational Study of a New Dual Throat Fluidic Thrust Vectoring Nozzle Concept

    NASA Technical Reports Server (NTRS)

    Deere, Karen A.; Berrier, Bobby L.; Flamm, Jeffrey D.; Johnson, Stuart K.

    2005-01-01

    A computational investigation of a two-dimensional nozzle was completed to assess the use of fluidic injection to manipulate flow separation and cause thrust vectoring of the primary jet thrust. The nozzle was designed with a recessed cavity to enhance the throat shifting method of fluidic thrust vectoring. Several design cycles with the structured-grid, computational fluid dynamics code PAB3D and with experiments in the NASA Langley Research Center Jet Exit Test Facility have been completed to guide the nozzle design and analyze performance. This paper presents computational results on potential design improvements for the best experimental configuration tested to date. Nozzle design variables included cavity divergence angle, cavity convergence angle and upstream throat height. Pulsed fluidic injection was also investigated for its ability to decrease mass flow requirements. Internal nozzle performance (wind-off conditions) and thrust vector angles were computed for several configurations over a range of nozzle pressure ratios from 2 to 7, with the fluidic injection flow rate equal to 3 percent of the primary flow rate. Computational results indicate that increasing cavity divergence angle beyond 10 degrees is detrimental to thrust vectoring efficiency, while increasing cavity convergence angle from 20 to 30 degrees improves thrust vectoring efficiency at nozzle pressure ratios greater than 2, albeit at the expense of discharge coefficient. Pulsed injection was no more efficient than steady injection for the Dual Throat Nozzle concept.

  8. Efficient parallelization of analytic bond-order potentials for large-scale atomistic simulations

    NASA Astrophysics Data System (ADS)

    Teijeiro, C.; Hammerschmidt, T.; Drautz, R.; Sutmann, G.

    2016-07-01

    Analytic bond-order potentials (BOPs) provide a way to compute atomistic properties with controllable accuracy. For large-scale computations of heterogeneous compounds at the atomistic level, both the computational efficiency and memory demand of BOP implementations have to be optimized. Since the evaluation of BOPs is a local operation within a finite environment, the parallelization concepts known from short-range interacting particle simulations can be applied to improve the performance of these simulations. In this work, several efficient parallelization methods for BOPs that use three-dimensional domain decomposition schemes are described. The schemes are implemented into the bond-order potential code BOPfox, and their performance is measured in a series of benchmarks. Systems of up to several millions of atoms are simulated on a high performance computing system, and parallel scaling is demonstrated for up to thousands of processors.

  9. Finite element analysis of transonic flows in cascades: Importance of computational grids in improving accuracy and convergence

    NASA Technical Reports Server (NTRS)

    Ecer, A.; Akay, H. U.

    1981-01-01

    The finite element method is applied for the solution of transonic potential flows through a cascade of airfoils. Convergence characteristics of the solution scheme are discussed. Accuracy of the numerical solutions is investigated for various flow regions in the transonic flow configuration. The design of an efficient finite element computational grid is discussed for improving accuracy and convergence.

  10. Multiresolution molecular mechanics: Implementation and efficiency

    NASA Astrophysics Data System (ADS)

    Biyikli, Emre; To, Albert C.

    2017-01-01

    Atomistic/continuum coupling methods combine accurate atomistic methods and efficient continuum methods to simulate the behavior of highly ordered crystalline systems. Coupled methods utilize the advantages of both approaches to simulate systems at a lower computational cost, while retaining the accuracy associated with atomistic methods. Many concurrent atomistic/continuum coupling methods have been proposed in the past; however, their true computational efficiency has not been demonstrated. The present work presents an efficient implementation of a concurrent coupling method called the Multiresolution Molecular Mechanics (MMM) for serial, parallel, and adaptive analysis. First, we present the features of the software implemented along with the associated technologies. The scalability of the software implementation is demonstrated, and the competing effects of multiscale modeling and parallelization are discussed. Then, the algorithms contributing to the efficiency of the software are presented. These include algorithms for eliminating latent ghost atoms from calculations and measurement-based dynamic balancing of parallel workload. The efficiency improvements made by these algorithms are demonstrated by benchmark tests. The efficiency of the software is found to be on par with LAMMPS, a state-of-the-art Molecular Dynamics (MD) simulation code, when performing full atomistic simulations. Speed-up of the MMM method is shown to be directly proportional to the reduction of the number of the atoms visited in force computation. Finally, an adaptive MMM analysis on a nanoindentation problem, containing over a million atoms, is performed, yielding an improvement of 6.3-8.5 times in efficiency, over the full atomistic MD method. For the first time, the efficiency of a concurrent atomistic/continuum coupling method is comprehensively investigated and demonstrated.

  11. Analytical prediction with multidimensional computer programs and experimental verification of the performance, at a variety of operating conditions, of two traveling wave tubes with depressed collectors

    NASA Technical Reports Server (NTRS)

    Dayton, J. A., Jr.; Kosmahl, H. G.; Ramins, P.; Stankiewicz, N.

    1979-01-01

    Experimental and analytical results are compared for two high performance, octave bandwidth TWT's that use depressed collectors (MDC's) to improve the efficiency. The computations were carried out with advanced, multidimensional computer programs that are described here in detail. These programs model the electron beam as a series of either disks or rings of charge and follow their multidimensional trajectories from the RF input of the ideal TWT, through the slow wave structure, through the magnetic refocusing system, to their points of impact in the depressed collector. Traveling wave tube performance, collector efficiency, and collector current distribution were computed and the results compared with measurements for a number of TWT-MDC systems. Power conservation and correct accounting of TWT and collector losses were observed. For the TWT's operating at saturation, very good agreement was obtained between the computed and measured collector efficiencies. For a TWT operating 3 and 6 dB below saturation, excellent agreement between computed and measured collector efficiencies was obtained in some cases but only fair agreement in others. However, deviations can largely be explained by small differences in the computed and actual spent beam energy distributions. The analytical tools used here appear to be sufficiently refined to design efficient collectors for this class of TWT. However, for maximum efficiency, some experimental optimization (e.g., collector voltages and aperture sizes) will most likely be required.

  12. Improving robustness and computational efficiency using modern C++

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Paterno, M.; Kowalkowski, J.; Green, C.

    2014-01-01

    For nearly two decades, the C++ programming language has been the dominant programming language for experimental HEP. The publication of ISO/IEC 14882:2011, the current version of the international standard for the C++ programming language, makes available a variety of language and library facilities for improving the robustness, expressiveness, and computational efficiency of C++ code. However, much of the C++ written by the experimental HEP community does not take advantage of the features of the language to obtain these benefits, either due to lack of familiarity with these features or concern that these features must somehow be computationally inefficient. In this paper, we address some of the features of modern C++, and show how they can be used to make programs that are both robust and computationally efficient. We compare and contrast simple yet realistic examples of some common implementation patterns in C, currently-typical C++, and modern C++, and show (when necessary, down to the level of generated assembly language code) the quality of the executable code produced by recent C++ compilers, with the aim of allowing the HEP community to make informed decisions on the costs and benefits of the use of modern C++.

  13. Methods for operating parallel computing systems employing sequenced communications

    DOEpatents

    Benner, R.E.; Gustafson, J.L.; Montry, G.R.

    1999-08-10

    A parallel computing system and method are disclosed having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system. 15 figs.

  14. Methods for operating parallel computing systems employing sequenced communications

    DOEpatents

    Benner, Robert E.; Gustafson, John L.; Montry, Gary R.

    1999-01-01

    A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes with the computing system.

  15. Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo

    DOE PAGES

    McDaniel, Tyler; D’Azevedo, Ed F.; Li, Ying Wai; ...

    2017-11-07

    Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is therefore formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. Here this procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi-core CPUs and GPUs.
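
    The rank-1 Sherman-Morrison update that this abstract uses as its baseline can be sketched as follows. This shows only the conventional one-move-at-a-time update, not the delayed multiple-rank scheme the paper proposes; the function names, the convention that a move replaces one row of the matrix, and the 2x2 example are assumptions for illustration.

```python
def det_ratio(a_inv, new_row, k):
    """det(A') / det(A) when row k of A is replaced by new_row.
    Only column k of the stored inverse is needed, so this costs O(n)."""
    return sum(new_row[j] * a_inv[j][k] for j in range(len(new_row)))


def sherman_morrison_row_update(a_inv, new_row, k):
    """Return the inverse of A with row k replaced by new_row, in O(n^2),
    without recomputing the inverse from scratch."""
    n = len(a_inv)
    ratio = det_ratio(a_inv, new_row, k)
    # w = new_row^T A^{-1}; subtracting e_k gives (new_row - old_row)^T A^{-1},
    # because old_row^T A^{-1} is the k-th unit vector.
    w = [sum(new_row[m] * a_inv[m][j] for m in range(n)) for j in range(n)]
    w[k] -= 1.0
    return [[a_inv[i][j] - a_inv[i][k] * w[j] / ratio for j in range(n)]
            for i in range(n)]


# A = [[2, 0], [0, 4]] has the inverse below; replace row 0 of A by [1, 1].
a_inv = [[0.5, 0.0], [0.0, 0.25]]
ratio = det_ratio(a_inv, [1.0, 1.0], 0)
new_inv = sherman_morrison_row_update(a_inv, [1.0, 1.0], 0)
```

    The update is dominated by matrix-vector style work. Grouping several accepted moves and applying them together turns it into matrix-matrix work, which is the higher arithmetic intensity the abstract refers to.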

  16. Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McDaniel, Tyler; D’Azevedo, Ed F.; Li, Ying Wai

    Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is therefore formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. Here this procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi-core CPUs and GPUs.

  17. Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo.

    PubMed

    McDaniel, T; D'Azevedo, E F; Li, Y W; Wong, K; Kent, P R C

    2017-11-07

    Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is, therefore, formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with an application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. This procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo, where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi-core central processing units and graphical processing units.

  18. Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo

    NASA Astrophysics Data System (ADS)

    McDaniel, T.; D'Azevedo, E. F.; Li, Y. W.; Wong, K.; Kent, P. R. C.

    2017-11-01

    Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is, therefore, formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with an application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. This procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo, where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi-core central processing units and graphical processing units.

  19. Enhancing battery efficiency for pervasive health-monitoring systems based on electronic textiles.

    PubMed

    Zheng, Nenggan; Wu, Zhaohui; Lin, Man; Yang, Laurence Tianruo

    2010-03-01

    Electronic textiles are regarded as one of the most important computation platforms for future computer-assisted health-monitoring applications. In these novel systems, multiple batteries are used in order to prolong their operational lifetime, which is a significant metric for system usability. However, due to the nonlinear features of batteries, computing systems with multiple batteries cannot achieve the same battery efficiency as those powered by a monolithic battery of equal capacity. In this paper, we propose an algorithm aiming to maximize battery efficiency globally for computer-assisted health-care systems with multiple batteries. Based on an accurate analytical battery model, the concept of weighted battery fatigue degree is introduced and a novel battery-scheduling algorithm called predicted weighted fatigue degree least first (PWFDLF) is developed. We also discuss two approaches tried during the search for PWFDLF: a weighted round-robin (WRR) policy and a greedy algorithm that achieves the highest local battery efficiency, which reduces to the sequential discharging policy. Evaluation results show that a considerable improvement in battery efficiency can be obtained by PWFDLF under various battery configurations and current profiles compared to conventional sequential and WRR discharging policies.

  20. Matching pursuit parallel decomposition of seismic data

    NASA Astrophysics Data System (ADS)

    Li, Chuanhui; Zhang, Fanchang

    2017-07-01

    In order to improve the computation speed of matching pursuit decomposition of seismic data, a matching pursuit parallel algorithm is designed in this paper. In every iteration, we pick a fixed number of envelope peaks from the current signal according to the number of compute nodes and distribute them evenly among the compute nodes, which search for the optimal Morlet wavelets in parallel. Using parallel computer systems and the Message Passing Interface, the parallel algorithm fully exploits the advantages of parallel computing to significantly improve the computation speed of the matching pursuit decomposition, and it also has good expandability. In addition, having every compute node search for only one optimal Morlet wavelet per iteration is the most efficient implementation.

  1. Computer-aided design of large-scale integrated circuits - A concept

    NASA Technical Reports Server (NTRS)

    Schansman, T. T.

    1971-01-01

    Circuit design and mask development sequence are improved by using general purpose computer with interactive graphics capability establishing efficient two way communications link between design engineer and system. Interactive graphics capability places design engineer in direct control of circuit development.

  2. Computational Modeling in Concert with Laboratory Studies: Application to B Cell Differentiation

    EPA Science Inventory

    Remediation is expensive, so accurate prediction of dose-response is important to help control costs. Dose response is a function of biological mechanisms. Computational models of these mechanisms improve the efficiency of research and provide the capability for prediction.

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Biyikli, Emre; To, Albert C., E-mail: albertto@pitt.edu

    Atomistic/continuum coupling methods combine accurate atomistic methods and efficient continuum methods to simulate the behavior of highly ordered crystalline systems. Coupled methods utilize the advantages of both approaches to simulate systems at a lower computational cost, while retaining the accuracy associated with atomistic methods. Many concurrent atomistic/continuum coupling methods have been proposed in the past; however, their true computational efficiency has not been demonstrated. The present work presents an efficient implementation of a concurrent coupling method called the Multiresolution Molecular Mechanics (MMM) for serial, parallel, and adaptive analysis. First, we present the features of the software implemented along with the associated technologies. The scalability of the software implementation is demonstrated, and the competing effects of multiscale modeling and parallelization are discussed. Then, the algorithms contributing to the efficiency of the software are presented. These include algorithms for eliminating latent ghost atoms from calculations and measurement-based dynamic balancing of parallel workload. The efficiency improvements made by these algorithms are demonstrated by benchmark tests. The efficiency of the software is found to be on par with LAMMPS, a state-of-the-art Molecular Dynamics (MD) simulation code, when performing full atomistic simulations. Speed-up of the MMM method is shown to be directly proportional to the reduction of the number of the atoms visited in force computation. Finally, an adaptive MMM analysis on a nanoindentation problem, containing over a million atoms, is performed, yielding an improvement of 6.3–8.5 times in efficiency, over the full atomistic MD method. For the first time, the efficiency of a concurrent atomistic/continuum coupling method is comprehensively investigated and demonstrated.

  4. Efficient Unsteady Flow Visualization with High-Order Access Dependencies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Jiang; Guo, Hanqi; Yuan, Xiaoru

    We present a novel high-order access dependencies based model for efficient pathline computation in unsteady flow visualization. By taking longer access sequences into account to model more sophisticated data access patterns in particle tracing, our method greatly improves the accuracy and reliability in data access prediction. In our work, high-order access dependencies are calculated by tracing uniformly-seeded pathlines in both forward and backward directions in a preprocessing stage. The effectiveness of our proposed approach is demonstrated through a parallel particle tracing framework with high-order data prefetching. Results show that our method achieves higher data locality and hence improves the efficiency of pathline computation.

  5. Workflow computing. Improving management and efficiency of pathology diagnostic services.

    PubMed

    Buffone, G J; Moreau, D; Beck, J R

    1996-04-01

    Traditionally, information technology in health care has helped practitioners to collect, store, and present information and also to add a degree of automation to simple tasks (instrument interfaces supporting result entry, for example). Thus commercially available information systems do little to support the need to model, execute, monitor, coordinate, and revise the various complex clinical processes required to support health-care delivery. Workflow computing, which is already implemented and improving the efficiency of operations in several nonmedical industries, can address the need to manage complex clinical processes. Workflow computing not only provides a means to define and manage the events, roles, and information integral to health-care delivery but also supports the explicit implementation of policy or rules appropriate to the process. This article explains how workflow computing may be applied to health-care and the inherent advantages of the technology, and it defines workflow system requirements for use in health-care delivery with special reference to diagnostic pathology.

  6. An efficient method for hybrid density functional calculation with spin-orbit coupling

    NASA Astrophysics Data System (ADS)

    Wang, Maoyuan; Liu, Gui-Bin; Guo, Hong; Yao, Yugui

    2018-03-01

    In first-principles calculations, hybrid functional is often used to improve accuracy from local exchange correlation functionals. A drawback is that evaluating the hybrid functional needs significantly more computing effort. When spin-orbit coupling (SOC) is taken into account, the non-collinear spin structure increases computing effort by at least eight times. As a result, hybrid functional calculations with SOC are intractable in most cases. In this paper, we present an approximate solution to this problem by developing an efficient method based on a mixed linear combination of atomic orbital (LCAO) scheme. We demonstrate the power of this method using several examples and we show that the results compare very well with those of direct hybrid functional calculations with SOC, yet the method only requires a computing effort similar to that without SOC. The presented technique provides a good balance between computing efficiency and accuracy, and it can be extended to magnetic materials.

  7. Image restoration for three-dimensional fluorescence microscopy using an orthonormal basis for efficient representation of depth-variant point-spread functions

    PubMed Central

    Patwary, Nurmohammed; Preza, Chrysanthe

    2015-01-01

    A depth-variant (DV) image restoration algorithm for wide field fluorescence microscopy, using an orthonormal basis decomposition of DV point-spread functions (PSFs), is investigated in this study. The efficient PSF representation is based on a previously developed principal component analysis (PCA), which is computationally intensive. We present an approach developed to reduce the number of DV PSFs required for the PCA computation, thereby making the PCA-based approach computationally tractable for thick samples. Restoration results from both synthetic and experimental images show consistency and that the proposed algorithm addresses efficiently depth-induced aberration using a small number of principal components. Comparison of the PCA-based algorithm with a previously-developed strata-based DV restoration algorithm demonstrates that the proposed method improves performance by 50% in terms of accuracy and simultaneously reduces the processing time by 64% using comparable computational resources. PMID:26504634

  8. Efficient Computation of Closed-loop Frequency Response for Large Order Flexible Systems

    NASA Technical Reports Server (NTRS)

    Maghami, Peiman G.; Giesy, Daniel P.

    1997-01-01

    An efficient and robust computational scheme is given for the calculation of the frequency response function of a large order, flexible system implemented with a linear, time invariant control system. Advantage is taken of the highly structured sparsity of the system matrix of the plant based on a model of the structure using normal mode coordinates. The computational time per frequency point of the new computational scheme is a linear function of system size, a significant improvement over traditional, full-matrix techniques whose computational times per frequency point range from quadratic to cubic functions of system size. This permits the practical frequency domain analysis of systems of much larger order than by traditional, full-matrix techniques. Formulations are given for both open and closed loop systems. Numerical examples are presented showing the advantages of the present formulation over traditional approaches, both in speed and in accuracy. Using a model with 703 structural modes, a speed-up of almost two orders of magnitude was observed while accuracy improved by up to 5 decimal places.

  9. Fundamental Aeronautics Program: Overview of Project Work in Supersonic Cruise Efficiency

    NASA Technical Reports Server (NTRS)

    Castner, Raymond

    2011-01-01

    The Supersonics Project, part of NASA's Fundamental Aeronautics Program, contains a number of technical challenge areas which include sonic boom community response, airport noise, high altitude emissions, cruise efficiency, light weight durable engines/airframes, and integrated multi-discipline system design. This presentation provides an overview of the current (2011) activities in the supersonic cruise efficiency technical challenge, and is focused specifically on propulsion technologies. The intent is to develop and validate high-performance supersonic inlet and nozzle technologies. Additional work is planned for design and analysis tools for highly-integrated low-noise, low-boom applications. If successful, the payoffs include improved technologies and tools for optimized propulsion systems, propulsion technologies for a minimized sonic boom signature, and a balanced approach to meeting efficiency and community noise goals. In this propulsion area, the work is divided into advanced supersonic inlet concepts, advanced supersonic nozzle concepts, low fidelity computational tool development, high fidelity computational tools, and improved sensors and measurement capability. The current work in each area is summarized.

  10. Fundamental Aeronautics Program: Overview of Propulsion Work in the Supersonic Cruise Efficiency Technical Challenge

    NASA Technical Reports Server (NTRS)

    Castner, Ray

    2012-01-01

    The Supersonics Project, part of NASA's Fundamental Aeronautics Program, contains a number of technical challenge areas which include sonic boom community response, airport noise, high altitude emissions, cruise efficiency, light weight durable engines/airframes, and integrated multi-discipline system design. This presentation provides an overview of the current (2012) activities in the supersonic cruise efficiency technical challenge, and is focused specifically on propulsion technologies. The intent is to develop and validate high-performance supersonic inlet and nozzle technologies. Additional work is planned for design and analysis tools for highly-integrated low-noise, low-boom applications. If successful, the payoffs include improved technologies and tools for optimized propulsion systems, propulsion technologies for a minimized sonic boom signature, and a balanced approach to meeting efficiency and community noise goals. In this propulsion area, the work is divided into advanced supersonic inlet concepts, advanced supersonic nozzle concepts, low fidelity computational tool development, high fidelity computational tools, and improved sensors and measurement capability. The current work in each area is summarized.

  11. Improving membrane protein expression by optimizing integration efficiency

    PubMed Central

    2017-01-01

    The heterologous overexpression of integral membrane proteins in Escherichia coli often yields insufficient quantities of purifiable protein for applications of interest. The current study leverages a recently demonstrated link between co-translational membrane integration efficiency and protein expression levels to predict protein sequence modifications that improve expression. Membrane integration efficiencies, obtained using a coarse-grained simulation approach, robustly predicted effects on expression of the integral membrane protein TatC for a set of 140 sequence modifications, including loop-swap chimeras and single-residue mutations distributed throughout the protein sequence. Mutations that improve simulated integration efficiency were 4-fold enriched with respect to improved experimentally observed expression levels. Furthermore, the effects of double mutations on both simulated integration efficiency and experimentally observed expression levels were cumulative and largely independent, suggesting that multiple mutations can be introduced to yield higher levels of purifiable protein. This work provides a foundation for a general method for the rational overexpression of integral membrane proteins based on computationally simulated membrane integration efficiencies. PMID:28918393

  12. U.S. EPA computational toxicology programs: Central role of chemical-annotation efforts and molecular databases

    EPA Science Inventory

    EPA’s National Center for Computational Toxicology is engaged in high-profile research efforts to improve the ability to more efficiently and effectively prioritize and screen thousands of environmental chemicals for potential toxicity. A central component of these efforts invol...

  13. Improved Safety Margin Characterization of Risk from Loss of Offsite Power

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nelson, Paul

    Original intent: The original intent of this task was “support of the Risk-Informed Safety Margin Characteristic (RISMC) methodology in order” “to address … efficiency of computation so that more accurate and cost-effective techniques can be used to address safety margin characterizations” (S. M. Hess et al., “Risk-Informed Safety Margin Characterization,” Procs. ICONE17, Brussels, July 2009, CD format). It was intended that “in Task 1 itself this improvement will be directed toward upon the very important issue of Loss of Offsite Power (LOOP) events,” more specifically toward the challenge of efficient computation of the multidimensional nonrecovery integral that has been discussed by many previous contributors to the theory of nuclear safety. It was further envisioned that “three different computational approaches will be explored,” corresponding to the three subtasks listed below; deliverables were tied to the individual subtasks.

  14. Distributed collaborative response surface method for mechanical dynamic assembly reliability design

    NASA Astrophysics Data System (ADS)

    Bai, Guangchen; Fei, Chengwei

    2013-11-01

    Because of the randomness of many impact factors influencing the dynamic assembly relationship of complex machinery, the reliability analysis of the dynamic assembly relationship needs to account for this randomness from a probabilistic perspective. To improve the accuracy and efficiency of dynamic assembly relationship reliability analysis, the mechanical dynamic assembly reliability (MDAR) theory and a distributed collaborative response surface method (DCRSM) are proposed. The mathematical model of DCRSM is established based on the quadratic response surface function, and verified by the assembly relationship reliability analysis of aeroengine high pressure turbine (HPT) blade-tip radial running clearance (BTRRC). Through the comparison of the DCRSM, traditional response surface method (RSM) and Monte Carlo method (MCM), the results show that the DCRSM is not only able to accomplish the computational task which is impossible for the other methods when the number of simulations is more than 100 000, but also that the computational precision of the DCRSM is basically consistent with the MCM and improved by 0.40–4.63% relative to the RSM; furthermore, the computational efficiency of the DCRSM is up to about 188 times that of the MCM and 55 times that of the RSM under 10 000 simulations. The DCRSM is demonstrated to be a feasible and effective approach for markedly improving the computational efficiency and accuracy of MDAR analysis. Thus, the proposed research provides a promising theory and method for MDAR design and optimization, and opens a novel research direction of probabilistic analysis for developing high-performance and high-reliability aeroengines.

  15. An accurate and efficient reliability-based design optimization using the second order reliability method and improved stability transformation method

    NASA Astrophysics Data System (ADS)

    Meng, Zeng; Yang, Dixiong; Zhou, Huanlin; Yu, Bo

    2018-05-01

    The first order reliability method has been extensively adopted for reliability-based design optimization (RBDO), but it shows inaccuracy in calculating the failure probability with highly nonlinear performance functions. Thus, the second order reliability method is required to evaluate the reliability accurately. However, its application for RBDO is quite challenging owing to the expensive computational cost incurred by the repeated reliability evaluation and Hessian calculation of probabilistic constraints. In this article, a new improved stability transformation method is proposed to search the most probable point efficiently, and the Hessian matrix is calculated by the symmetric rank-one update. The computational capability of the proposed method is illustrated and compared to the existing RBDO approaches through three mathematical and two engineering examples. The comparison results indicate that the proposed method is very efficient and accurate, providing an alternative tool for RBDO of engineering structures.
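
    The symmetric rank-one (SR1) update named in this abstract has a well-known generic form, sketched below in pure Python. This is the textbook quasi-Newton formula, not the article's specific procedure; the function name, the list-of-lists layout, and the skip tolerance are assumptions made for the example.

```python
import math


def sr1_update(b, s, y, tol=1e-8):
    """Return the SR1-updated Hessian approximation.

    b : current symmetric approximation (list of lists)
    s : step taken in the variables
    y : corresponding change in the gradient
    The update is B + r r^T / (r^T s) with r = y - B s, and it is skipped
    when the denominator is too small to be numerically safe.
    """
    n = len(s)
    bs = [sum(b[i][j] * s[j] for j in range(n)) for i in range(n)]
    r = [y[i] - bs[i] for i in range(n)]
    denom = sum(r[i] * s[i] for i in range(n))
    norm_r = math.sqrt(sum(v * v for v in r))
    norm_s = math.sqrt(sum(v * v for v in s))
    if abs(denom) <= tol * norm_r * norm_s:
        return [row[:] for row in b]  # safeguard: leave B unchanged
    return [[b[i][j] + r[i] * r[j] / denom for j in range(n)] for i in range(n)]


# Example on a quadratic whose true Hessian is diag(2, 4), starting from I.
b0 = [[1.0, 0.0], [0.0, 1.0]]
b1 = sr1_update(b0, [1.0, 0.0], [2.0, 0.0])
b2 = sr1_update(b1, [0.0, 1.0], [0.0, 4.0])
```

    Each update needs only gradient differences, which is why a rank-one scheme of this kind can avoid recomputing a full Hessian at every reliability evaluation.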

  16. Accelerated simulation of stochastic particle removal processes in particle-resolved aerosol models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Curtis, J.H.; Michelotti, M.D.; Riemer, N.

    2016-10-01

    Stochastic particle-resolved methods have proven useful for simulating multi-dimensional systems such as composition-resolved aerosol size distributions. While particle-resolved methods have substantial benefits for highly detailed simulations, these techniques suffer from high computational cost, motivating efforts to improve their algorithmic efficiency. Here we formulate an algorithm for accelerating particle removal processes by aggregating particles of similar size into bins. We present the Binned Algorithm for particle removal processes and analyze its performance with application to the atmospherically relevant process of aerosol dry deposition. We show that the Binned Algorithm can dramatically improve the efficiency of particle removals, particularly for low removal rates, and that computational cost is reduced without introducing additional error. In simulations of aerosol particle removal by dry deposition in atmospherically relevant conditions, we demonstrate about 50-times increase in algorithm efficiency.

  17. Efficient multiscale magnetic-domain analysis of iron-core material under mechanical stress

    NASA Astrophysics Data System (ADS)

    Nishikubo, Atsushi; Ito, Shumpei; Mifune, Takeshi; Matsuo, Tetsuji; Kaido, Chikara; Takahashi, Yasuhito; Fujiwara, Koji

    2018-05-01

    For an efficient analysis of magnetization, a partial-implicit solution method is improved using an assembled domain structure model with six-domain mesoscopic particles exhibiting pinning-type hysteresis. The quantitative analysis of non-oriented silicon steel succeeds in predicting the stress dependence of hysteresis loss with computation times greatly reduced by using the improved partial-implicit method. The effect of cell division along the thickness direction is also evaluated.

  18. SPAR improved structural-fluid dynamic analysis capability

    NASA Technical Reports Server (NTRS)

    Pearson, M. L.

    1985-01-01

    The results of a study whose objective was to improve the operation of the SPAR computer code by improving efficiency, user features, and documentation are presented. Additional capability was added to the SPAR arithmetic utility system, including trigonometric functions, numerical integration, interpolation, and matrix combinations. Improvements were made in the EIG processor. A processor was created to compute and store principal stresses in table-format data sets. An additional capability was developed and incorporated into the plot processor which permits plotting directly from table-format data sets. Documentation of all these features is provided in the form of updates to the SPAR users manual.

  19. Fast Reduction Method in Dominance-Based Information Systems

    NASA Astrophysics Data System (ADS)

    Li, Yan; Zhou, Qinghua; Wen, Yongchuan

    2018-01-01

    In real world applications, there are often some data with continuous values or preference-ordered values. Rough sets based on dominance relations can effectively deal with these kinds of data. Attribute reduction can be done in the framework of the dominance-relation based approach to better extract decision rules. However, the computational cost of the dominance classes greatly affects the efficiency of attribute reduction and rule extraction. This paper presents an efficient method of computing dominance classes, and further compares it with the traditional method as the numbers of attributes and samples increase. Experiments on UCI data sets show that the proposed algorithm clearly improves on the efficiency of the traditional method, especially for large-scale data.

  20. Redundancy management for efficient fault recovery in NASA's distributed computing system

    NASA Technical Reports Server (NTRS)

    Malek, Miroslaw; Pandya, Mihir; Yau, Kitty

    1991-01-01

    The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding of computational graphs of tasks in the system architecture and reconfiguration of these tasks after a failure has occurred. The computational structure represented by a path and the complete binary tree was considered and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance.

  1. A novel combined SLAM based on RBPF-SLAM and EIF-SLAM for mobile system sensing in a large scale environment.

    PubMed

    He, Bo; Zhang, Shujing; Yan, Tianhong; Zhang, Tao; Liang, Yan; Zhang, Hongjin

    2011-01-01

    Mobile autonomous systems are very important for marine scientific investigation and military applications. Many algorithms have been studied to deal with the computational efficiency problem required for large scale simultaneous localization and mapping (SLAM) and its related accuracy and consistency. Among these methods, submap-based SLAM is a more effective one. By combining the strength of two popular mapping algorithms, the Rao-Blackwellised particle filter (RBPF) and extended information filter (EIF), this paper presents a combined SLAM, an efficient submap-based solution to the SLAM problem in a large scale environment. RBPF-SLAM is used to produce local maps, which are periodically fused into an EIF-SLAM algorithm. RBPF-SLAM can avoid linearization of the robot model during operation and provide a robust data association, while EIF-SLAM can improve the whole computational speed, and avoid the tendency of RBPF-SLAM to be over-confident. In order to further improve the computational speed in a real time environment, a binary-tree-based decision-making strategy is introduced. Simulation experiments show that the proposed combined SLAM algorithm significantly outperforms currently existing algorithms in terms of accuracy and consistency, as well as the computing efficiency. Finally, the combined SLAM algorithm is experimentally validated in a real environment by using the Victoria Park dataset.

  2. Entrepreneurial systems. Do it yourself for profit and pleasure.

    PubMed

    Manuel, G; Young, K

    1991-11-28

    You don't have to have a degree in computing or a 1 m pounds budget to create a system that improves efficiency or saves money. Gren Manuel introduces a special report which celebrates the small-scale initiatives devised and implemented by health staff armed only with a personal computer.

  3. Complexity control algorithm based on adaptive mode selection for interframe coding in high efficiency video coding

    NASA Astrophysics Data System (ADS)

    Chen, Gang; Yang, Bing; Zhang, Xiaoyun; Gao, Zhiyong

    2017-07-01

    The latest high efficiency video coding (HEVC) standard significantly increases the encoding complexity for improving its coding efficiency. Due to the limited computational capability of handheld devices, complexity constrained video coding has drawn great attention in recent years. A complexity control algorithm based on adaptive mode selection is proposed for interframe coding in HEVC. Considering the direct proportionality between encoding time and computational complexity, the computational complexity is measured in terms of encoding time. First, complexity is mapped to a target in terms of prediction modes. Then, an adaptive mode selection algorithm is proposed for the mode decision process. Specifically, the optimal mode combination scheme that is chosen through offline statistics is developed at low complexity. If the complexity budget has not been used up, an adaptive mode sorting method is employed to further improve coding efficiency. The experimental results show that the proposed algorithm achieves a very large complexity control range (as low as 10%) for the HEVC encoder while maintaining good rate-distortion performance. For the lowdelayP condition, compared with the direct resource allocation method and the state-of-the-art method, an average gain of 0.63 and 0.17 dB in BDPSNR is observed for 18 sequences when the target complexity is around 40%.

  4. Accurate optimization of amino acid form factors for computing small-angle X-ray scattering intensity of atomistic protein structures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tong, Dudu; Yang, Sichun; Lu, Lanyuan

    2016-06-20

    Structure modelling via small-angle X-ray scattering (SAXS) data generally requires intensive computations of scattering intensity from any given biomolecular structure, where the accurate evaluation of SAXS profiles using coarse-grained (CG) methods is vital to improve computational efficiency. To date, most CG SAXS computing methods have been based on a single-bead-per-residue approximation but have neglected structural correlations between amino acids. To improve the accuracy of scattering calculations, accurate CG form factors of amino acids are now derived using a rigorous optimization strategy, termed electron-density matching (EDM), to best fit electron-density distributions of protein structures. This EDM method is compared with and tested against other CG SAXS computing methods, and the resulting CG SAXS profiles from EDM agree better with all-atom theoretical SAXS data. By including the protein hydration shell represented by explicit CG water molecules and the correction of protein excluded volume, the developed CG form factors also reproduce the selected experimental SAXS profiles with very small deviations. Taken together, these EDM-derived CG form factors present an accurate and efficient computational approach for SAXS computing, especially when higher molecular details (represented by the q range of the SAXS data) become necessary for effective structure modelling.

  5. A flexible and accurate digital volume correlation method applicable to high-resolution volumetric images

    NASA Astrophysics Data System (ADS)

    Pan, Bing; Wang, Bo

    2017-10-01

    Digital volume correlation (DVC) is a powerful technique for quantifying interior deformation within solid opaque materials and biological tissues. In the last two decades, great efforts have been made to improve the accuracy and efficiency of the DVC algorithm. However, there is still a lack of a flexible, robust and accurate version that can be efficiently implemented in personal computers with limited RAM. This paper proposes an advanced DVC method that can realize accurate full-field internal deformation measurement applicable to high-resolution volume images with up to billions of voxels. Specifically, a novel layer-wise reliability-guided displacement tracking strategy combined with dynamic data management is presented to guide the DVC computation from slice to slice. The displacements at specified calculation points in each layer are computed using the advanced 3D inverse-compositional Gauss-Newton algorithm with the complete initial guess of the deformation vector accurately predicted from the computed calculation points. Since only limited slices of interest in the reference and deformed volume images rather than the whole volume images are required, the DVC calculation can thus be efficiently implemented on personal computers. The flexibility, accuracy and efficiency of the presented DVC approach are demonstrated by analyzing computer-simulated and experimentally obtained high-resolution volume images.

  6. Using Unified Modelling Language (UML) as a process-modelling technique for clinical-research process improvement.

    PubMed

    Kumarapeli, P; De Lusignan, S; Ellis, T; Jones, B

    2007-03-01

    The Primary Care Data Quality programme (PCDQ) is a quality-improvement programme which processes routinely collected general practice computer data. Patient data collected from a wide range of different brands of clinical computer systems are aggregated, processed, and fed back to practices in an educational context to improve the quality of care. Process modelling is a well-established approach used to gain understanding and systematic appraisal, and identify areas of improvement of a business process. Unified modelling language (UML) is a general purpose modelling technique used for this purpose. We used UML to appraise the PCDQ process to see if the efficiency and predictability of the process could be improved. Activity analysis and thinking-aloud sessions were used to collect data to generate UML diagrams. The UML model highlighted the sequential nature of the current process as a barrier for efficiency gains. It also identified the uneven distribution of process controls, lack of symmetric communication channels, critical dependencies among processing stages, and failure to implement all the lessons learned in the piloting phase. It also suggested that improved structured reporting at each stage - especially from the pilot phase, parallel processing of data and correctly positioned process controls - should improve the efficiency and predictability of research projects. Process modelling provided a rational basis for the critical appraisal of a clinical data processing system; its potential may be underutilized within health care.

  7. Improving multivariate Horner schemes with Monte Carlo tree search

    NASA Astrophysics Data System (ADS)

    Kuipers, J.; Plaat, A.; Vermaseren, J. A. M.; van den Herik, H. J.

    2013-11-01

    Optimizing the cost of evaluating a polynomial is a classic problem in computer science. For polynomials in one variable, Horner's method provides a scheme for producing a computationally efficient form. For multivariate polynomials it is possible to generalize Horner's method, but this leaves freedom in the order of the variables. Traditionally, greedy schemes like most-occurring variable first are used. This simple textbook algorithm has given remarkably efficient results. Finding better algorithms has proved difficult. In trying to improve upon the greedy scheme we have implemented Monte Carlo tree search, a recent search method from the field of artificial intelligence. This results in better Horner schemes and reduces the cost of evaluating polynomials, sometimes by factors up to two.
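
    The greedy baseline mentioned in the abstract (most-occurring variable first) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dictionary representation of the polynomial and the function name horner are assumptions made here, and the Monte Carlo tree search over variable orders that the paper contributes is not shown.

```python
from collections import Counter


def horner(poly, names):
    """Return a nested (Horner-style) expression string for poly.

    poly maps exponent tuples to coefficients, e.g. {(2, 1): 3} is 3*x^2*y.
    names gives the variable name for each tuple position (hypothetical layout).
    """
    if not poly:
        return "0"
    # Count, for every variable, how many terms contain it.
    counts = Counter(i for exps in poly for i, e in enumerate(exps) if e > 0)
    if not counts:
        # Only the constant term is left.
        return str(poly[(0,) * len(names)])
    # Greedy choice: most-occurring variable first (ties go to the lowest index).
    var = max(sorted(counts), key=lambda i: counts[i])
    inside, rest = {}, {}
    for exps, coeff in poly.items():
        if exps[var] > 0:
            # Factor one power of the chosen variable out of this term.
            lowered = exps[:var] + (exps[var] - 1,) + exps[var + 1:]
            inside[lowered] = coeff
        else:
            rest[exps] = coeff
    text = f"{names[var]}*({horner(inside, names)})"
    return f"{text} + {horner(rest, names)}" if rest else text
```

    For x^2*y + x*y + x + 1, the sketch factors out x first (it appears in three terms) and then y inside the bracket. A different variable order can give a different operation count, which is the freedom the search method explores.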

  8. Optimization of Particle-in-Cell Codes on RISC Processors

    NASA Technical Reports Server (NTRS)

    Decyk, Viktor K.; Karmesin, Steve Roy; Boer, Aeint de; Liewer, Paulette C.

    1996-01-01

    General strategies are developed to optimize particle-in-cell codes written in Fortran for RISC processors which are commonly used on massively parallel computers. These strategies include data reorganization to improve cache utilization and code reorganization to improve efficiency of arithmetic pipelines.

  9. Machines That Teach.

    ERIC Educational Resources Information Center

    Sniecinski, Jozef

    This paper reviews efforts which have been made to improve the effectiveness of teaching through the development of principles of programed teaching and the construction of teaching machines, concluding that a combination of computer technology and programed teaching principles offers an efficient approach to improving teaching. Three different…

  10. Efficiency improvement of technological preparation of power equipment manufacturing

    NASA Astrophysics Data System (ADS)

    Milukov, I. A.; Rogalev, A. N.; Sokolov, V. P.; Shevchenko, I. V.

    2017-11-01

    Competitiveness of power equipment primarily depends on speeding up the development and mastering of new equipment samples and technologies, and on enhancing the organisation and management of design, manufacturing and operation. Current political, technological and economic conditions create an acute need to change the strategy and tactics of process planning. The issues of equipment maintenance with simultaneous improvement of its efficiency and compatibility with domestically produced components are also considered. To solve these problems, using computer-aided process planning systems for process design at all stages of the power equipment life cycle is economically viable. Computer-aided process planning is developed to improve process planning by using mathematical methods and optimisation of design and management processes on the basis of CALS technologies, which allows for simultaneous process design, process planning organisation and management based on mathematical and physical modelling of interrelated design objects and the production system. An integration of computer-aided systems providing the interaction of informative and material processes at all stages of the product life cycle is proposed as an effective solution to the challenges in new equipment design and process planning.

  11. A community detection algorithm based on structural similarity

    NASA Astrophysics Data System (ADS)

    Guo, Xuchao; Hao, Xia; Liu, Yaqiong; Zhang, Li; Wang, Lu

    2017-09-01

    In order to further improve the efficiency and accuracy of community detection algorithms, a new algorithm named SSTCA (the community detection algorithm based on structural similarity with threshold) is proposed. In this algorithm, the structural similarities are taken as the weights of edges, and a threshold k is used to remove multiple edges whose weights are less than the threshold, which improves the computational efficiency. The proposed algorithm was tested on Zachary’s network, the Dolphins’ social network and the Football dataset, and compared with the GN and SSNCA algorithms. The results show that the new algorithm is superior to the other algorithms in accuracy for dense networks and that its operating efficiency is clearly improved.
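
    The edge-weighting and thresholding step described above can be sketched as follows. The abstract does not state which structural similarity measure SSTCA uses, so the closed-neighbourhood (SCAN-style) measure below is an assumption, as are the function names.

```python
from math import sqrt


def structural_similarity(adj, u, v):
    """Assumed measure: shared closed neighbourhood over the geometric mean of sizes."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def prune_edges(edges, k):
    """Weight each edge by structural similarity and drop edges with weight below k."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    weighted = {(u, v): structural_similarity(adj, u, v) for u, v in edges}
    return {edge: w for edge, w in weighted.items() if w >= k}
```

    On two triangles joined by a single bridge edge, the bridge receives the lowest weight, so a moderate threshold removes it and leaves the two dense groups intact. Later stages then work on a smaller graph.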

  12. PVT: An Efficient Computational Procedure to Speed up Next-generation Sequence Analysis

    PubMed Central

    2014-01-01

    Background: High-throughput Next-Generation Sequencing (NGS) techniques are advancing genomics and molecular biology research. This technology generates substantially large data, which poses a major challenge to scientists seeking an efficient, cost- and time-effective solution to analyse such data. Further, for the different types of NGS data, there are certain common challenging steps involved in analysing those data. Spliced alignment is one such fundamental step in NGS data analysis which is extremely computationally intensive as well as time consuming. Serious problems exist even with the most widely used spliced alignment tools. TopHat is one such widely used spliced alignment tool which, although it supports multithreading, does not efficiently utilize computational resources in terms of CPU utilization and memory. Here we have introduced PVT (Pipelined Version of TopHat) where we take up a modular approach by breaking TopHat’s serial execution into a pipeline of multiple stages, thereby increasing the degree of parallelization and computational resource utilization. Thus we address the discrepancies in TopHat so as to analyze large NGS data efficiently. Results: We analysed the SRA dataset (SRX026839 and SRX026838) consisting of single end reads and SRA data SRR1027730 consisting of paired-end reads. We used TopHat v2.0.8 to analyse these datasets and noted the CPU usage, memory footprint and execution time during spliced alignment. With this basic information, we designed PVT, a pipelined version of TopHat that removes the redundant computational steps during ‘spliced alignment’ and breaks the job into a pipeline of multiple stages (each comprising different step(s)) to improve its resource utilization, thus reducing the execution time. Conclusions: PVT provides an improvement over TopHat for spliced alignment of NGS data analysis. PVT thus resulted in the reduction of the execution time to ~23% for the single end read dataset. Further, PVT designed for paired end reads showed an improved performance of ~41% over TopHat (for the chosen data) with respect to execution time. Moreover we propose PVT-Cloud which implements the PVT pipeline in a cloud computing system. PMID:24894600

  13. PVT: an efficient computational procedure to speed up next-generation sequence analysis.

    PubMed

    Maji, Ranjan Kumar; Sarkar, Arijita; Khatua, Sunirmal; Dasgupta, Subhasis; Ghosh, Zhumur

    2014-06-04

    High-throughput Next-Generation Sequencing (NGS) techniques are advancing genomics and molecular biology research. This technology generates substantially large data, which poses a major challenge to scientists seeking an efficient, cost- and time-effective solution to analyse such data. Further, for the different types of NGS data, there are certain common challenging steps involved in analysing those data. Spliced alignment is one such fundamental step in NGS data analysis which is extremely computationally intensive as well as time consuming. Serious problems exist even with the most widely used spliced alignment tools. TopHat is one such widely used spliced alignment tool which, although it supports multithreading, does not efficiently utilize computational resources in terms of CPU utilization and memory. Here we have introduced PVT (Pipelined Version of TopHat) where we take up a modular approach by breaking TopHat's serial execution into a pipeline of multiple stages, thereby increasing the degree of parallelization and computational resource utilization. Thus we address the discrepancies in TopHat so as to analyze large NGS data efficiently. We analysed the SRA dataset (SRX026839 and SRX026838) consisting of single end reads and SRA data SRR1027730 consisting of paired-end reads. We used TopHat v2.0.8 to analyse these datasets and noted the CPU usage, memory footprint and execution time during spliced alignment. With this basic information, we designed PVT, a pipelined version of TopHat that removes the redundant computational steps during 'spliced alignment' and breaks the job into a pipeline of multiple stages (each comprising different step(s)) to improve its resource utilization, thus reducing the execution time. PVT provides an improvement over TopHat for spliced alignment of NGS data analysis. PVT thus resulted in the reduction of the execution time to ~23% for the single end read dataset. Further, PVT designed for paired end reads showed an improved performance of ~41% over TopHat (for the chosen data) with respect to execution time. Moreover we propose PVT-Cloud which implements the PVT pipeline in a cloud computing system.
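
    The general idea of breaking a serial job into a pipeline of stages can be sketched as follows. This is a generic illustration only: the stage functions in the usage note are hypothetical stand-ins, not TopHat's or PVT's actual stages, and Python threads are used purely to show the structure (CPU-bound stages would need separate processes to run truly in parallel).

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the input stream


def _stage(func, inbox, outbox):
    """Apply func to every item from inbox and forward each result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # pass the end marker downstream
            return
        outbox.put(func(item))


def run_pipeline(items, funcs):
    """Push items through funcs in order, with one worker thread per stage."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=_stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for worker in workers:
        worker.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)
    results = []
    while True:
        out = queues[-1].get()
        if out is _DONE:
            break
        results.append(out)
    for worker in workers:
        worker.join()
    return results
```

    For example, run_pipeline(["acgt", "ttga"], [str.upper, lambda s: s[::-1]]) lets the second stage start on the first read while the first stage is already handling the second read. Each stage is a single FIFO worker, so output order matches input order.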

  14. PIMS: Memristor-Based Processing-in-Memory-and-Storage.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cook, Jeanine

    Continued progress in computing has augmented the quest for higher performance with a new quest for higher energy efficiency. This has led to the re-emergence of Processing-In-Memory (PIM) architectures that offer higher density and performance with some boost in energy efficiency. Past PIM work either integrated a standard CPU with a conventional DRAM to improve the CPU-memory link, or used a bit-level processor with Single Instruction Multiple Data (SIMD) control, but neither matched the energy consumption of the memory to the computation. We originally proposed to develop a new architecture derived from PIM that more effectively addressed energy efficiency for high performance scientific, data analytics, and neuromorphic applications. We also originally planned to implement a von Neumann architecture with arithmetic/logic units (ALUs) that matched the power consumption of an advanced storage array to maximize energy efficiency. Implementing this architecture in storage was our original idea, since by augmenting storage (instead of memory), the system could address both in-memory computation and applications that accessed larger data sets directly from storage, hence Processing-in-Memory-and-Storage (PIMS). However, as our research matured, we discovered several things that changed our original direction, the most important being that a PIM that implements a standard von Neumann-type architecture results in significant energy efficiency improvement, but only about a O(10) performance improvement. In addition to this, the emergence of new memory technologies moved us to proposing a non-von Neumann architecture, called Superstrider, implemented not in storage, but in a new DRAM technology called High Bandwidth Memory (HBM). HBM is a stacked DRAM technology that includes a logic layer where an architecture such as Superstrider could potentially be implemented.

  15. Method for simultaneous overlapped communications between neighboring processors in a multiple

    DOEpatents

    Benner, Robert E.; Gustafson, John L.; Montry, Gary R.

    1991-01-01

    A parallel computing system and method having improved performance where a program is concurrently run on a plurality of nodes for reducing total processing time, each node having a processor, a memory, and a predetermined number of communication channels connected to the node and independently connected directly to other nodes. The present invention improves performance of the parallel computing system by providing a system which can provide efficient communication between the processors and between the system and input and output devices. A method is also disclosed which can locate defective nodes within the computing system.

  16. DoDs Efforts to Consolidate Data Centers Need Improvement

    DTIC Science & Technology

    2016-03-29

    Consolidation Initiative, February 26, 2010. 3 Green IT minimizes negative environmental impact of IT operations by ensuring that computers and computer-related...objectives for consolidating data centers. DoD’s objectives were to: • reduce cost; • reduce environmental impact ; • improve efficiency and service levels...number of DoD data centers. Finding A DODIG-2016-068 │ 7 information in DCIM, the DoD CIO did not confirm whether those changes would impact DoD’s

  17. Improving urban district heating systems and assessing the efficiency of the energy usage therein

    NASA Astrophysics Data System (ADS)

    Orlov, M. E.; Sharapov, V. I.

    2017-11-01

    The report describes issues in connection with improving urban district heating systems supplied from combined heat and power plants (CHPs), and proposes ways of improving the reliability and the efficiency of the energy usage (often referred to as “energy efficiency”) in such systems. The main direction for improving such urban district heating systems is a transition to combined heating systems that include structural elements of both centralized and decentralized systems. Such systems provide the basic part of thermal power via highly efficient methods for extracting steam from thermal power plant turbines, while peak loads are covered by decentralized peak thermal power sources mounted at consumers’ locations, with the peak sources also serving as reserve thermal power sources. A methodology was developed for assessing the energy efficiency of combined district heating systems and implemented as a computer software product capable of comparatively calculating the saving on reference fuel for the system.

  18. A Survey of Architectural Techniques For Improving Cache Power Efficiency

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mittal, Sparsh

    Modern processors are using increasingly larger sized on-chip caches. Also, with each CMOS technology generation, there has been a significant increase in their leakage energy consumption. For this reason, cache power management has become a crucial research issue in modern processor design. To address this challenge and also meet the goals of sustainable computing, researchers have proposed several techniques for improving energy efficiency of cache architectures. This paper surveys recent architectural techniques for improving cache power efficiency and also presents a classification of these techniques based on their characteristics. For providing an application perspective, this paper also reviews several real-world processor chips that employ cache energy saving techniques. The aim of this survey is to enable engineers and researchers to get insights into the techniques for improving cache power efficiency and motivate them to invent novel solutions for enabling low-power operation of caches.

  19. Improving Design Efficiency for Large-Scale Heterogeneous Circuits

    NASA Astrophysics Data System (ADS)

    Gregerson, Anthony

    Despite increases in logic density, many Big Data applications must still be partitioned across multiple computing devices in order to meet their strict performance requirements. Among the most demanding of these applications is high-energy physics (HEP), which uses complex computing systems consisting of thousands of FPGAs and ASICs to process the sensor data created by experiments at particle accelerators such as the Large Hadron Collider (LHC). Designing such computing systems is challenging due to the scale of the systems, the exceptionally high-throughput and low-latency performance constraints that necessitate application-specific hardware implementations, the requirement that algorithms are efficiently partitioned across many devices, and the possible need to update the implemented algorithms during the lifetime of the system. In this work, we describe our research to develop flexible architectures for implementing such large-scale circuits on FPGAs. In particular, this work is motivated by (but not limited in scope to) high-energy physics algorithms for the Compact Muon Solenoid (CMS) experiment at the LHC. To make efficient use of logic resources in multi-FPGA systems, we introduce Multi-Personality Partitioning, a novel form of the graph partitioning problem, and present partitioning algorithms that can significantly improve resource utilization on heterogeneous devices while also reducing inter-chip connections. To reduce the high communication costs of Big Data applications, we also introduce Information-Aware Partitioning, a partitioning method that analyzes the data content of application-specific circuits, characterizes their entropy, and selects circuit partitions that enable efficient compression of data between chips. We employ our information-aware partitioning method to improve the performance of the hardware validation platform for evaluating new algorithms for the CMS experiment. Together, these research efforts help to improve the efficiency and decrease the cost of developing the large-scale, heterogeneous circuits needed to enable large-scale applications in high-energy physics and other important areas.

  20. Sensitivity Analysis and Optimization of Enclosure Radiation with Applications to Crystal Growth

    NASA Technical Reports Server (NTRS)

    Tiller, Michael M.

    1995-01-01

    In engineering, simulation software is often used as a convenient means for carrying out experiments to evaluate physical systems. The benefit of using simulations as 'numerical' experiments is that the experimental conditions can be easily modified and repeated at much lower cost than the comparable physical experiment. The goal of these experiments is to 'improve' the process or result of the experiment. In most cases, the computational experiments employ the same trial and error approach as their physical counterparts. When using this approach for complex systems, the cause and effect relationship of the system may never be fully understood and efficient strategies for improvement never utilized. However, it is possible when running simulations to accurately and efficiently determine the sensitivity of the system results with respect to simulation parameters (e.g., initial conditions, boundary conditions, and material properties) by manipulating the underlying computations. This results in a better understanding of the system dynamics and gives us efficient means to improve processing conditions. We begin by discussing the steps involved in performing simulations. Then we consider how sensitivity information about simulation results can be obtained and ways this information may be used to improve the process or result of the experiment. Next, we discuss optimization and the efficient algorithms which use sensitivity information. We draw on all this information to propose a generalized approach for integrating simulation and optimization, with an emphasis on software programming issues. After discussing our approach to simulation and optimization we consider an application involving crystal growth. This application is interesting because it includes radiative heat transfer. 
We discuss the computation of radiative view factors and the impact this mode of heat transfer has on our approach. Finally, we will demonstrate the results of our optimization.

  1. Speech recognition-based and automaticity programs to help students with severe reading and spelling problems.

    PubMed

    Higgins, Eleanor L; Raskind, Marshall H

    2004-12-01

    This study was conducted to assess the effectiveness of two programs developed by the Frostig Center Research Department to improve the reading and spelling of students with learning disabilities (LD): a computer Speech Recognition-based Program (SRBP) and a computer and text-based Automaticity Program (AP). Twenty-eight LD students with reading and spelling difficulties (aged 8 to 18) received each program for 17 weeks and were compared with 16 students in a contrast group who did not receive either program. After adjusting for age and IQ, both the SRBP and AP groups showed significant differences over the contrast group in improving word recognition and reading comprehension. Neither program showed significant differences over contrasts in spelling. The SRBP also improved the performance of the target group when compared with the contrast group on phonological elision and nonword reading efficiency tasks. The AP showed significant differences in all process and reading efficiency measures.

  2. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback

    PubMed Central

    Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery. PMID:27861505

  3. Content-Based Discovery for Web Map Service using Support Vector Machine and User Relevance Feedback.

    PubMed

    Hu, Kai; Gui, Zhipeng; Cheng, Xiaoqiang; Qi, Kunlun; Zheng, Jie; You, Lan; Wu, Huayi

    2016-01-01

    Many discovery methods for geographic information services have been proposed. There are approaches for finding and matching geographic information services, methods for constructing geographic information service classification schemes, and automatic geographic information discovery. Overall, the efficiency of geographic information discovery keeps improving. There are, however, still two problems in Web Map Service (WMS) discovery that must be solved. Mismatches between the graphic contents of a WMS and the semantic descriptions in the metadata make discovery difficult for human users. End-users and computers comprehend WMSs differently, creating semantic gaps in human-computer interactions. To address these problems, we propose an improved query process for WMSs based on the graphic contents of WMS layers, combining Support Vector Machine (SVM) and user relevance feedback. Our experiments demonstrate that the proposed method can improve the accuracy and efficiency of WMS discovery.

  4. Parallelization of ARC3D with Computer-Aided Tools

    NASA Technical Reports Server (NTRS)

    Jin, Haoqiang; Hribar, Michelle; Yan, Jerry; Saini, Subhash (Technical Monitor)

    1998-01-01

    A series of efforts have been devoted to investigating methods of porting and parallelizing applications quickly and efficiently for new architectures, such as the SGI Origin 2000 and Cray T3E. This report presents the parallelization of a CFD application, ARC3D, using the computer-aided tools, CAPTools. Steps of parallelizing this code and requirements of achieving better performance are discussed. The generated parallel version has achieved reasonably good performance, for example, having a speedup of 30 for 36 Cray T3E processors. However, this performance could not be obtained without modification of the original serial code. It is suggested that in many cases improving serial code and performing necessary code transformations are important parts of the automated parallelization process, although user intervention in many of these parts is still necessary. Nevertheless, development and improvement of useful software tools, such as CAPTools, can help trim down many tedious parallelization details and improve the processing efficiency.
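
    The reported speedup of 30 on 36 processors can be put in perspective with a short calculation. The parallel efficiency follows directly from the two reported numbers. The serial fraction is an estimate under the assumption, made here and not in the report, that Amdahl's law S = 1 / (f + (1 - f) / p) describes the run and that communication overhead is ignored.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup that was achieved."""
    return speedup / processors


def amdahl_serial_fraction(speedup, processors):
    """Serial fraction f implied by Amdahl's law S = 1 / (f + (1 - f) / p)."""
    return (processors / speedup - 1) / (processors - 1)


if __name__ == "__main__":
    print(f"efficiency: {parallel_efficiency(30, 36):.3f}")
    print(f"implied serial fraction: {amdahl_serial_fraction(30, 36):.4f}")
```

    With these inputs the efficiency is 30/36, about 83%, and the implied serial fraction is (36/30 - 1)/35, a little under 0.6% of the work.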

  5. Bandwidth reduction for video-on-demand broadcasting using secondary content insertion

    NASA Astrophysics Data System (ADS)

    Golynski, Alexander; Lopez-Ortiz, Alejandro; Poirier, Guillaume; Quimper, Claude-Guy

    2005-01-01

    An optimal broadcasting scheme under the presence of secondary content (i.e. advertisements) is proposed. The proposed scheme works both for movies encoded in a Constant Bit Rate (CBR) or a Variable Bit Rate (VBR) format. It is shown experimentally that secondary content in movies can make Video-on-Demand (VoD) broadcasting systems more efficient. An efficient algorithm is given to compute the optimal broadcasting schedule with secondary content, which in particular significantly improves over the best previously known algorithm for computing the optimal broadcasting schedule without secondary content.

  6. Computer Applications in Class and Transportation Scheduling. Educational Management Review Series Number 1.

    ERIC Educational Resources Information Center

    Piele, Philip K.

    This document shows how computer technology can aid educators in meeting demands for improved class scheduling and more efficient use of transportation resources. The first section surveys literature on operational systems that provide individualized scheduling for students, varied class structures, and maximum use of space and staff skills.…

  7. Mobile Learning According to Students of Computer Engineering and Computer Education: A Comparison of Attitudes

    ERIC Educational Resources Information Center

    Gezgin, Deniz Mertkan; Adnan, Muge; Acar Guvendir, Meltem

    2018-01-01

    Mobile learning has started to perform an increasingly significant role in improving learning outcomes in education. Successful and efficient implementation of m-learning in higher education, as with all educational levels, depends on users' acceptance of this technology. This study focuses on investigating the attitudes of undergraduate students…

  8. DOSESCREEN: a computer program to aid dose placement

    Treesearch

    Kimberly C. Smith; Jacqueline L. Robertson

    1984-01-01

    Careful selection of an experimental design for a bioassay substantially improves the precision of effective dose (ED) estimates. Design considerations typically include determination of sample size, dose selection, and allocation of subjects to doses. DOSESCREEN is a computer program written to help investigators select an efficient design for the estimation of an...

  9. Improving the Efficiency of Free Energy Calculations in the Amber Molecular Dynamics Package.

    PubMed

    Kaus, Joseph W; Pierce, Levi T; Walker, Ross C; McCammon, J Andrew

    2013-09-10

    Alchemical transformations are widely used methods to calculate free energies. Amber has traditionally included support for alchemical transformations as part of the sander molecular dynamics (MD) engine. Here we describe the implementation of a more efficient approach to alchemical transformations in the Amber MD package. Specifically we have implemented this new approach within the more computationally efficient and scalable pmemd MD engine that is included with the Amber MD package. The majority of the gain in efficiency comes from the improved design of the calculation, which includes better parallel scaling and reduction in the calculation of redundant terms. This new implementation is able to reproduce results from equivalent simulations run with the existing functionality, but at 2.5 times greater computational efficiency. This new implementation is also able to run softcore simulations at the λ end states making direct calculation of free energies more accurate, compared to the extrapolation required in the existing implementation. The updated alchemical transformation functionality will be included in the next major release of Amber (scheduled for release in Q1 2014) and will be available at http://ambermd.org, under the Amber license.

  10. Improving the Efficiency of Free Energy Calculations in the Amber Molecular Dynamics Package

    PubMed Central

    Pierce, Levi T.; Walker, Ross C.; McCammon, J. Andrew

    2013-01-01

    Alchemical transformations are widely used methods to calculate free energies. Amber has traditionally included support for alchemical transformations as part of the sander molecular dynamics (MD) engine. Here we describe the implementation of a more efficient approach to alchemical transformations in the Amber MD package. Specifically we have implemented this new approach within the more computationally efficient and scalable pmemd MD engine that is included with the Amber MD package. The majority of the gain in efficiency comes from the improved design of the calculation, which includes better parallel scaling and reduction in the calculation of redundant terms. This new implementation is able to reproduce results from equivalent simulations run with the existing functionality, but at 2.5 times greater computational efficiency. This new implementation is also able to run softcore simulations at the λ end states making direct calculation of free energies more accurate, compared to the extrapolation required in the existing implementation. The updated alchemical transformation functionality will be included in the next major release of Amber (scheduled for release in Q1 2014) and will be available at http://ambermd.org, under the Amber license. PMID:24185531

  11. Evidence of effectiveness of health care professionals using handheld computers: a scoping review of systematic reviews.

    PubMed

    Mickan, Sharon; Tilson, Julie K; Atherton, Helen; Roberts, Nia Wyn; Heneghan, Carl

    2013-10-28

    Handheld computers and mobile devices provide instant access to vast amounts and types of useful information for health care professionals. Their reduced size and increased processing speed have led to rapid adoption in health care. Thus, it is important to identify whether handheld computers are actually effective in clinical practice. A scoping review of systematic reviews was designed to provide a quick overview of the documented evidence of effectiveness for health care professionals using handheld computers in their clinical work. A detailed search, sensitive for systematic reviews, was applied to the Cochrane, Medline, EMBASE, PsycINFO, Allied and Complementary Medicine Database (AMED), Global Health, and Cumulative Index to Nursing and Allied Health Literature (CINAHL) databases. All outcomes that demonstrated effectiveness in clinical practice were included. Classroom learning and patient use of handheld computers were excluded. Quality was assessed using the Assessment of Multiple Systematic Reviews (AMSTAR) tool. A previously published conceptual framework was used as the basis for dual data extraction. Reported outcomes were summarized according to the primary function of the handheld computer. Five systematic reviews met the inclusion and quality criteria. Together, they reviewed 138 unique primary studies. Most reviewed descriptive intervention studies, where physicians, pharmacists, or medical students used personal digital assistants. Effectiveness was demonstrated across four distinct functions of handheld computers: patient documentation, patient care, information seeking, and professional work patterns. Within each of these functions, a range of positive outcomes were reported using both objective and self-report measures. The use of handheld computers improved patient documentation through more complete recording, fewer documentation errors, and increased efficiency. Handheld computers provided easy access to clinical decision support systems and patient management systems, which improved decision making for patient care. Handheld computers saved time and gave earlier access to new information. There were also reports that handheld computers enhanced work patterns and efficiency. This scoping review summarizes the secondary evidence for effectiveness of handheld computers and mHealth. It provides a snapshot of effective use by health care professionals across four key functions. We identified evidence to suggest that handheld computers provide easy and timely access to information and enable accurate and complete documentation. Further, they can give health care professionals instant access to evidence-based decision support and patient management systems to improve clinical decision making. Finally, there is evidence that handheld computers allow health professionals to be more efficient in their work practices. It is anticipated that this evidence will guide clinicians and managers in implementing handheld computers in clinical practice and in designing future research.

  12. Computer-intensive simulation of solid-state NMR experiments using SIMPSON.

    PubMed

    Tošner, Zdeněk; Andersen, Rasmus; Stevensson, Baltzar; Edén, Mattias; Nielsen, Niels Chr; Vosegaard, Thomas

    2014-09-01

    Conducting large-scale solid-state NMR simulations requires fast computer software, potentially in combination with efficient computational resources, to complete within a reasonable time frame. Such simulations may involve large spin systems, multiple-parameter fitting of experimental spectra, or multiple-pulse experiment design using parameter scan, non-linear optimization, or optimal control procedures. To efficiently accommodate such simulations, we here present an improved version of the widely distributed open-source SIMPSON NMR simulation software package adapted to contemporary high performance hardware setups. The software is optimized for fast performance on standard stand-alone computers, multi-core processors, and large clusters of identical nodes. We describe the novel features for fast computation including internal matrix manipulations, propagator setups and acquisition strategies. For efficient calculation of powder averages, we implemented the interpolation method of Alderman, Solum, and Grant, as well as the recently introduced fast Wigner transform interpolation technique. The potential of the optimal control toolbox is greatly enhanced by higher precision gradients in combination with the efficient optimization algorithm known as limited memory Broyden-Fletcher-Goldfarb-Shanno. In addition, advanced parallelization can be used in all types of calculations, providing significant time reductions. SIMPSON thus reflects current knowledge in the field of numerical simulations of solid-state NMR experiments. The efficiency and novel features are demonstrated on representative simulations. Copyright © 2014 Elsevier Inc. All rights reserved.

  13. Efficient parallel resolution of the simplified transport equations in mixed-dual formulation

    NASA Astrophysics Data System (ADS)

    Barrault, M.; Lathuilière, B.; Ramet, P.; Roman, J.

    2011-03-01

    A reactivity computation consists of computing the highest eigenvalue of a generalized eigenvalue problem, for which an inverse power algorithm is commonly used. Very finely discretized models are difficult to treat with our sequential solver, which is based on the simplified transport equations, because of memory consumption and computational time. A first implementation of a Lagrangian-based domain decomposition method led to poor parallel efficiency because of an increase in the number of power iterations [1]. In order to obtain a high parallel efficiency, we improve the parallelization scheme by changing the location of the loop over the subdomains in the overall algorithm and by benefiting from the characteristics of the Raviart-Thomas finite element. The new parallel algorithm still allows us to locally adapt the numerical scheme (mesh, finite element order). However, it can be significantly optimized for the matching grid case. The good behavior of the new parallelization scheme is demonstrated for the matching grid case on several hundreds of nodes for computations based on a pin-by-pin discretization.
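
    The inverse power algorithm mentioned in this abstract can be illustrated with a small dense sketch. It assumes the usual reactor-physics form A·phi = (1/k)·F·phi and iterates phi <- A^{-1}·F·phi; the matrices, the function names (`solve`, `matvec`, `inverse_power`) and the tolerance are illustrative assumptions, not taken from the paper, and the paper's domain-decomposed solver is not reproduced here.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def matvec(a, v):
    """Dense matrix-vector product."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def inverse_power(a, f, tol=1e-12, max_iter=1000):
    """Estimate the largest k such that A phi = (1/k) F phi."""
    phi = [1.0] * len(a)
    k = 1.0
    for _ in range(max_iter):
        source = matvec(f, phi)
        new_phi = solve(a, [s / k for s in source])  # apply A^{-1}
        new_source = matvec(f, new_phi)
        k_new = k * sum(new_source) / sum(source)    # update eigenvalue estimate
        phi = new_phi
        if abs(k_new - k) < tol * abs(k_new):
            return k_new, phi
        k = k_new
    return k, phi


# Hypothetical 2x2 operators chosen so the answer is easy to check by hand.
A = [[2.0, 0.0], [0.0, 4.0]]
F = [[1.0, 0.0], [0.0, 1.0]]
k_eff, flux = inverse_power(A, F)
print(k_eff)
```

    Each iteration costs one linear solve with A, which is the step a domain decomposition method would distribute across subdomains.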

  14. A GPU-accelerated implicit meshless method for compressible flows

    NASA Astrophysics Data System (ADS)

    Zhang, Jia-Le; Ma, Zhi-Hua; Chen, Hong-Quan; Cao, Cheng

    2018-05-01

    This paper develops a recently proposed GPU based two-dimensional explicit meshless method (Ma et al., 2014) by devising and implementing an efficient parallel LU-SGS implicit algorithm to further improve the computational efficiency. The capability of the original 2D meshless code is extended to deal with 3D complex compressible flow problems. To resolve the inherent data dependency of the standard LU-SGS method, which causes thread-racing conditions destabilizing numerical computation, a generic rainbow coloring method is presented and applied to organize the computational points into different groups by painting neighboring points with different colors. The original LU-SGS method is modified and parallelized accordingly to perform calculations in a color-by-color manner. The CUDA Fortran programming model is employed to develop the key kernel functions to apply boundary conditions, calculate time steps, evaluate residuals as well as advance and update the solution in the temporal space. A series of two- and three-dimensional test cases including compressible flows over single- and multi-element airfoils and an M6 wing are carried out to verify the developed code. The obtained solutions agree well with experimental data and other computational results reported in the literature. Detailed analysis of the performance of the developed code reveals that the developed CPU based implicit meshless method is at least four to eight times faster than its explicit counterpart. The computational efficiency of the implicit method could be further improved by ten to fifteen times on the GPU.
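
    The idea of painting neighboring points with different colors, so that all points of one color can be updated concurrently, can be sketched as follows. A plain greedy coloring is swapped in here as an assumption; the abstract does not give the details of the authors' rainbow coloring method, and the neighbor lists and function names are hypothetical.

```python
def color_points(neighbors):
    """Greedy coloring: neighbors[p] lists the points coupled to point p."""
    colors = {}
    for point in sorted(neighbors):
        # Colors already taken by neighbors that have been painted.
        used = {colors[n] for n in neighbors[point] if n in colors}
        c = 0
        while c in used:
            c += 1
        colors[point] = c
    return colors


def group_by_color(colors):
    """Return one list of points per color, in increasing color order."""
    groups = {}
    for point, c in colors.items():
        groups.setdefault(c, []).append(point)
    return [groups[c] for c in sorted(groups)]


# Hypothetical cloud of four points arranged in a ring.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
colors = color_points(ring)
groups = group_by_color(colors)
print(groups)
```

    Points within one group share no neighbor relation, so a sweep can process each group in parallel and move through the groups one color at a time without the write conflicts described in the abstract.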

  15. Automation to improve efficiency of field expedient injury prediction screening.

    PubMed

    Teyhen, Deydre S; Shaffer, Scott W; Umlauf, Jon A; Akerman, Raymond J; Canada, John B; Butler, Robert J; Goffar, Stephen L; Walker, Michael J; Kiesel, Kyle B; Plisky, Phillip J

    2012-07-01

    Musculoskeletal injuries are a primary source of disability in the U.S. Military. Physical training and sports-related activities account for up to 90% of all injuries, and 80% of these injuries are considered overuse in nature. As a result, there is a need to develop an evidence-based musculoskeletal screen that can assist with injury prevention. The purpose of this study was to assess the capability of an automated system to improve the efficiency of field expedient tests that may help predict injury risk and provide corrective strategies for deficits identified. The field expedient tests include survey questions and measures of movement quality, balance, trunk stability, power, mobility, and foot structure and mobility. Data entry for these tests was automated using handheld computers, barcode scanning, and netbook computers. An automated algorithm for injury risk stratification and mitigation techniques was run on a server computer. Without automation support, subjects were assessed in 84.5 ± 9.1 minutes per subject compared with 66.8 ± 6.1 minutes per subject with automation and 47.1 ± 5.2 minutes per subject with automation and process improvement measures (p < 0.001). The average time to manually enter the data was 22.2 ± 7.4 minutes per subject. An additional 11.5 ± 2.5 minutes per subject was required to manually assign an intervention strategy. Automation of this injury prevention screening protocol using handheld devices and netbook computers allowed for real-time data entry and enhanced the efficiency of injury screening, risk stratification, and prescription of a risk mitigation strategy.
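
    The relative time savings implied by the mean assessment times reported above can be worked out directly. The short calculation below uses only the three means from the abstract; the helper name `percent_reduction` is an illustrative assumption.

```python
# Mean assessment times in minutes per subject, as reported in the abstract.
baseline = 84.5            # without automation support
automated = 66.8           # with automation
automated_improved = 47.1  # with automation and process improvement measures


def percent_reduction(before, after):
    """Relative time saving of `after` with respect to `before`, in percent."""
    return 100.0 * (before - after) / before


saving_automation = percent_reduction(baseline, automated)
saving_combined = percent_reduction(baseline, automated_improved)
print(f"automation alone: {saving_automation:.1f}% less time per subject")
print(f"automation plus process improvement: {saving_combined:.1f}% less time per subject")
```

    On these means, automation alone saves roughly 21% of the assessment time, and automation combined with process improvement saves roughly 44%.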

  16. Two tradeoffs between economy and reliability in loss of load probability constrained unit commitment

    NASA Astrophysics Data System (ADS)

    Liu, Yuan; Wang, Mingqiang; Ning, Xingyao

    2018-02-01

    Spinning reserve (SR) should be scheduled considering the balance between economy and reliability. To address the computational intractability caused by the computation of loss of load probability (LOLP), many probabilistic methods use simplified formulations of LOLP to improve the computational efficiency. Two tradeoffs embedded in the SR optimization model are not explicitly analyzed in these methods. In this paper, two tradeoffs between economy and reliability, a primary tradeoff and a secondary tradeoff, in the maximum-LOLP-constrained unit commitment (UC) model are explored and analyzed in a small system and in the IEEE-RTS system. The analysis of the two tradeoffs can help in establishing new efficient simplified LOLP formulations and new SR optimization models.

  17. Hamiltonian Monte Carlo acceleration using surrogate functions with random bases.

    PubMed

    Zhang, Cheng; Shahbaba, Babak; Zhao, Hongkai

    2017-11-01

    For big data analysis, the high computational cost of Bayesian methods often limits their applications in practice. In recent years, there have been many attempts to improve the computational efficiency of Bayesian inference. Here we propose an efficient and scalable computational technique for a state-of-the-art Markov chain Monte Carlo method, namely, Hamiltonian Monte Carlo. The key idea is to explore and exploit the structure and regularity in parameter space for the underlying probabilistic model to construct an effective approximation of its geometric properties. To this end, we build a surrogate function to approximate the target distribution using properly chosen random bases and an efficient optimization process. The resulting method provides a flexible, scalable, and efficient sampling algorithm, which converges to the correct target distribution. We show that by choosing the basis functions and optimization process differently, our method can be related to other approaches for the construction of surrogate functions such as generalized additive models or Gaussian process models. Experiments based on simulated and real data show that our approach leads to substantially more efficient sampling algorithms compared to existing state-of-the-art methods.

  18. Optimizing Engineering Tools Using Modern Ground Architectures

    DTIC Science & Technology

    2017-12-01

    Considerations,” International Journal of Computer Science & Engineering Survey , vol. 5, no. 4, 2014. [10] R. Bell. (n.d). A beginner’s guide to big O notation...scientific community. Traditional computing architectures were not capable of processing the data efficiently, or in some cases, could not process the...thesis investigates how these modern computing architectures could be leveraged by industry and academia to improve the performance and capabilities of

  19. Efficient ICCG on a shared memory multiprocessor

    NASA Technical Reports Server (NTRS)

    Hammond, Steven W.; Schreiber, Robert

    1989-01-01

    Different approaches are discussed for exploiting parallelism in the ICCG (Incomplete Cholesky Conjugate Gradient) method for solving large sparse symmetric positive definite systems of equations on a shared memory parallel computer. Techniques for efficiently solving triangular systems and computing sparse matrix-vector products are explored. Three methods for scheduling the tasks in solving triangular systems are implemented on the Sequent Balance 21000. Sample problems that are representative of a large class of problems solved using iterative methods are used. We show that a static analysis to determine data dependences in the triangular solve can greatly improve its parallel efficiency. We also show that ignoring symmetry and storing the whole matrix can reduce solution time substantially.
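
    The static analysis of data dependences in the triangular solve can be illustrated with level scheduling: each row is assigned to the earliest level at which all the unknowns it depends on are already known, and rows within one level are independent. This is a minimal sketch under that assumption; the abstract does not specify the three scheduling methods actually implemented, and the row format and function names are hypothetical.

```python
def levels_of(rows):
    """rows[i] maps column j <= i to L[i][j]; returns row indices grouped by level."""
    level = [0] * len(rows)
    for i, row in enumerate(rows):
        deps = [level[j] for j in row if j < i]  # rows this row must wait for
        level[i] = 1 + max(deps) if deps else 0
    schedule = [[] for _ in range(max(level) + 1)]
    for i, lv in enumerate(level):
        schedule[lv].append(i)
    return schedule


def solve_lower(rows, b):
    """Forward substitution L x = b, visiting rows level by level."""
    x = [0.0] * len(rows)
    for batch in levels_of(rows):
        # Rows in one batch do not depend on each other, so a parallel
        # machine could hand each row of the batch to a different processor.
        for i in batch:
            s = sum(v * x[j] for j, v in rows[i].items() if j < i)
            x[i] = (b[i] - s) / rows[i][i]
    return x


# Hypothetical sparse lower triangular matrix with 4 rows.
L_rows = [
    {0: 2.0},
    {1: 1.0},
    {0: 1.0, 2: 1.0},
    {1: 1.0, 2: 1.0, 3: 2.0},
]
schedule = levels_of(L_rows)
x = solve_lower(L_rows, [2.0, 3.0, 4.0, 10.0])
print(schedule, x)
```

    Because the levels depend only on the sparsity pattern, they can be computed once and reused for every triangular solve in the conjugate gradient iteration.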

  20. LaRC local area networks to support distributed computing

    NASA Technical Reports Server (NTRS)

    Riddle, E. P.

    1984-01-01

    The Langley Research Center's (LaRC) Local Area Network (LAN) effort is discussed. LaRC initiated the development of a LAN to support a growing distributed computing environment at the Center. The purpose of the network is to provide an improved capability (over interactive and RJE terminal access) for sharing multivendor computer resources. Specifically, the network will provide a data highway for the transfer of files between mainframe computers, minicomputers, work stations, and personal computers. An important influence on the overall network design was the vital need of LaRC researchers to efficiently utilize the large CDC mainframe computers in the central scientific computing facility. Although there was a steady migration from a centralized to a distributed computing environment at LaRC in recent years, the work load on the central resources increased. Major emphasis in the network design was on communication with the central resources within the distributed environment. The network to be implemented will allow researchers to utilize the central resources, distributed minicomputers, work stations, and personal computers to obtain the proper level of computing power to efficiently perform their jobs.

  1. Multiscale Methods for Accurate, Efficient, and Scale-Aware Models of the Earth System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goldhaber, Steve; Holland, Marika

    The major goal of this project was to contribute improvements to the infrastructure of an Earth System Model in order to support research in the Multiscale Methods for Accurate, Efficient, and Scale-Aware models of the Earth System project. In support of this, the NCAR team accomplished two main tasks: improving input/output performance of the model and improving atmospheric model simulation quality. Improvement of the performance and scalability of data input and diagnostic output within the model required a new infrastructure which can efficiently handle the unstructured grids common in multiscale simulations. This allows for a more computationally efficient model, enabling more years of Earth System simulation. The quality of the model simulations was improved by reducing grid-point noise in the spectral element version of the Community Atmosphere Model (CAM-SE). This was achieved by running the physics of the model using grid-cell data on a finite-volume grid.

  2. Extensions and improvements on XTRAN3S

    NASA Technical Reports Server (NTRS)

    Borland, C. J.

    1989-01-01

    Improvements to the XTRAN3S computer program are summarized. Work on this code, for steady and unsteady aerodynamic and aeroelastic analysis in the transonic flow regime has concentrated on the following areas: (1) Maintenance of the XTRAN3S code, including correction of errors, enhancement of operational capability, and installation on the Cray X-MP system; (2) Extension of the vectorization concepts in XTRAN3S to include additional areas of the code for improved execution speed; (3) Modification of the XTRAN3S algorithm for improved numerical stability for swept, tapered wing cases and improved computational efficiency; and (4) Extension of the wing-only version of XTRAN3S to include pylon and nacelle or external store capability.

  3. Elastic Cloud Computing Architecture and System for Heterogeneous Spatiotemporal Computing

    NASA Astrophysics Data System (ADS)

    Shi, X.

    2017-10-01

    Spatiotemporal computation implements a variety of different algorithms. When big data are involved, a desktop computer or standalone application may not be able to complete the computation task because of limited memory and computing power. Now that a variety of hardware accelerators and computing platforms are available to improve the performance of geocomputation, different algorithms may behave differently on different computing infrastructures and platforms. Some are perfect for implementation on a cluster of graphics processing units (GPUs), while GPUs may not be useful for certain kinds of spatiotemporal computation. The same holds when utilizing a cluster of Intel's many-integrated-core (MIC) or Xeon Phi, as well as Hadoop or Spark platforms, to handle big spatiotemporal data. Furthermore, considering the energy efficiency requirement in general computation, a Field Programmable Gate Array (FPGA) may be the better solution for energy efficiency when its computational performance is similar to or better than that of GPUs and MICs. It is expected that an elastic cloud computing architecture and system that integrates GPUs, MICs, and FPGAs could be developed and deployed to support spatiotemporal computing over heterogeneous data types and computational problems.

  4. Electronic Timekeeping: North Dakota State University Improves Payroll Processing.

    ERIC Educational Resources Information Center

    Vetter, Ronald J.; And Others

    1993-01-01

    North Dakota State University has adopted automated timekeeping to improve the efficiency and effectiveness of payroll processing. The microcomputer-based system accurately records and computes employee time, tracks labor distribution, accommodates complex labor policies and company pay practices, provides automatic data processing and reporting,…

  5. Improved traveling wave tubes

    NASA Technical Reports Server (NTRS)

    Buck, E.

    1980-01-01

    After a brief description of how a typical TWT works, multi-stage depressed collectors (MDC) are discussed. A quick method for computing the expected efficiency of a well engineered TWT is outlined to aid in estimating power supply needs. Applications of improved TWTs and a new power supply are suggested.

  6. A learnable parallel processing architecture towards unity of memory and computing

    NASA Astrophysics Data System (ADS)

    Li, H.; Gao, B.; Chen, Z.; Zhao, Y.; Huang, P.; Ye, H.; Liu, L.; Liu, X.; Kang, J.

    2015-08-01

    Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named “iMemComp”, where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.

  7. A learnable parallel processing architecture towards unity of memory and computing.

    PubMed

    Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J

    2015-08-14

    Developing energy-efficient parallel information processing systems beyond von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need of efficient information processing for the data-driven applications such as big data and Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such architecture eliminates the energy-hungry data movement in von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve the speed by 76.8% and the power dissipation by 60.3%, together with a 700 times aggressive reduction in the circuit area.

  8. Acceleration of FDTD mode solver by high-performance computing techniques.

    PubMed

    Han, Lin; Xi, Yanping; Huang, Wei-Ping

    2010-06-21

    A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the benchmark finite-difference (FD) eigen mode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver, with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on the CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is of the same order of magnitude as that of the standard finite-difference eigen mode solver, yet it requires much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigen value mode solvers are no longer applicable due to memory limitation.

  9. Computer-Assisted Face Processing Instruction Improves Emotion Recognition, Mentalizing, and Social Skills in Students with ASD.

    PubMed

    Rice, Linda Marie; Wall, Carla Anne; Fogel, Adam; Shic, Frederick

    2015-07-01

    This study examined the extent to which a computer-based social skills intervention called FaceSay was associated with improvements in affect recognition, mentalizing, and social skills of school-aged children with Autism Spectrum Disorder (ASD). FaceSay offers students simulated practice with eye gaze, joint attention, and facial recognition skills. This randomized control trial included school-aged children meeting educational criteria for autism (N = 31). Results demonstrated that participants who received the intervention improved their affect recognition and mentalizing skills, as well as their social skills. These findings suggest that, by targeting face-processing skills, computer-based interventions may produce changes in broader cognitive and social-skills domains in a cost- and time-efficient manner.

  10. Improved sonic-box computer program for calculating transonic aerodynamic loads on oscillating wings with thickness

    NASA Technical Reports Server (NTRS)

    Ruo, S. Y.

    1978-01-01

    A computer program was developed to account approximately for the effects of finite wing thickness in transonic potential flow over an oscillating wing of finite span. The program is based on the original sonic box computer program for planar wings, which was extended to account for the effect of wing thickness. Computational efficiency and accuracy were improved and swept trailing edges were accounted for. The nonuniform flow caused by finite thickness was accounted for by application of the local linearization concept with appropriate coordinate transformation. A brief description of each computer routine and the applications of cubic spline and spline surface data fitting techniques used in the program are given, and the method of input is shown in detail. Sample calculations as well as a complete listing of the computer program are presented.

  11. Improving Distributed Diagnosis Through Structural Model Decomposition

    NASA Technical Reports Server (NTRS)

    Bregon, Anibal; Daigle, Matthew John; Roychoudhury, Indranil; Biswas, Gautam; Koutsoukos, Xenofon; Pulido, Belarmino

    2011-01-01

    Complex engineering systems require efficient fault diagnosis methodologies, but centralized approaches do not scale well, and this motivates the development of distributed solutions. This work presents an event-based approach for distributed diagnosis of abrupt parametric faults in continuous systems, by using the structural model decomposition capabilities provided by Possible Conflicts. We develop a distributed diagnosis algorithm that uses residuals computed by extending Possible Conflicts to build local event-based diagnosers based on global diagnosability analysis. The proposed approach is applied to a multitank system, and results demonstrate an improvement in the design of local diagnosers. Since local diagnosers use only a subset of the residuals, and use subsystem models to compute residuals (instead of the global system model), the local diagnosers are more efficient than previously developed distributed approaches.

  12. Compute as Fast as the Engineers Can Think! ULTRAFAST COMPUTING TEAM FINAL REPORT

    NASA Technical Reports Server (NTRS)

    Biedron, R. T.; Mehrotra, P.; Nelson, M. L.; Preston, M. L.; Rehder, J. J.; Rogers, J. L.; Rudy, D. H.; Sobieski, J.; Storaasli, O. O.

    1999-01-01

    This report documents findings and recommendations by the Ultrafast Computing Team (UCT). In the period 10-12/98, UCT reviewed design case scenarios for a supersonic transport and a reusable launch vehicle to derive computing requirements necessary for support of a design process with efficiency so radically improved that human thought rather than the computer paces the process. Assessment of the present computing capability against the above requirements indicated a need for further improvement in computing speed by several orders of magnitude to reduce time to solution from tens of hours to seconds in major applications. Evaluation of the trends in computer technology revealed a potential to attain the postulated improvement by further increases of single processor performance combined with massively parallel processing in a heterogeneous environment. However, utilization of massively parallel processing to its full capability will require redevelopment of the engineering analysis and optimization methods, including invention of new paradigms. To that end UCT recommends initiation of a new activity at LaRC called Computational Engineering for development of new methods and tools geared to the new computer architectures in disciplines, their coordination, and validation and benefit demonstration through applications.

  13. Efficient shortest-path-tree computation in network routing based on pulse-coupled neural networks.

    PubMed

    Qu, Hong; Yi, Zhang; Yang, Simon X

    2013-06-01

    Shortest path tree (SPT) computation is a critical issue for routers using link-state routing protocols, such as the most commonly used open shortest path first and intermediate system to intermediate system. Each router needs to recompute a new SPT rooted from itself whenever a change happens in the link state. Most commercial routers do this computation by deleting the current SPT and building a new one using static algorithms such as the Dijkstra algorithm at the beginning. Such recomputation of an entire SPT is inefficient, which may consume a considerable amount of CPU time and result in a time delay in the network. Some dynamic updating methods using the information in the updated SPT have been proposed in recent years. However, there are still many limitations in those dynamic algorithms. In this paper, a new modified model of pulse-coupled neural networks (M-PCNNs) is proposed for the SPT computation. It is rigorously proved that the proposed model is capable of solving some optimization problems, such as the SPT. A static algorithm is proposed based on the M-PCNNs to compute the SPT efficiently for large-scale problems. In addition, a dynamic algorithm that makes use of the structure of the previously computed SPT is proposed, which significantly improves the efficiency of the algorithm. Simulation results demonstrate the effective and efficient performance of the proposed approach.
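
    The static baseline that this abstract describes, rebuilding the whole SPT with the Dijkstra algorithm after every link-state change, can be sketched as below. This is the conventional approach the paper compares against, not the proposed M-PCNN model; the topology, link costs and function name are hypothetical.

```python
import heapq


def shortest_path_tree(graph, root):
    """Dijkstra's algorithm; returns (distance, parent) maps for the SPT at root."""
    dist = {root: 0}
    parent = {root: None}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent


# Hypothetical four-router topology with symmetric link costs.
net = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
dist, parent = shortest_path_tree(net, "A")

# A link-state change: the C-D link becomes expensive. The static approach
# throws the old tree away and recomputes everything from scratch.
net["C"]["D"] = net["D"]["C"] = 10
new_dist, new_parent = shortest_path_tree(net, "A")
print(parent, new_parent)
```

    In this example only router D changes its parent after the update, yet the static approach revisits every node; a dynamic algorithm aims to reuse the unchanged part of the previous tree.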

  14. Investigating power capping toward energy-efficient scientific applications

    DOE PAGES

    Haidar, Azzam; Jagode, Heike; Vaccaro, Phil; ...

    2018-03-22

    The emergence of power efficiency as a primary constraint in processor and system design poses new challenges concerning power and energy awareness for numerical libraries and scientific applications. Power consumption also plays a major role in the design of data centers, which may house petascale or exascale-level computing systems. At these extreme scales, understanding and improving the energy efficiency of numerical libraries and their related applications becomes a crucial part of the successful implementation and operation of the computing system. In this paper, we study and investigate the practice of controlling a compute system's power usage, and we explore how different power caps affect the performance of numerical algorithms with different computational intensities. Further, we determine the impact, in terms of performance and energy usage, that these caps have on a system running scientific applications. This analysis will enable us to characterize the types of algorithms that benefit most from these power management schemes. Our experiments are performed using a set of representative kernels and several popular scientific benchmarks. Lastly, we quantify a number of power and performance measurements and draw observations and conclusions that can be viewed as a roadmap to achieving energy efficiency in the design and execution of scientific algorithms.

  15. Investigating power capping toward energy-efficient scientific applications

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Haidar, Azzam; Jagode, Heike; Vaccaro, Phil

    The emergence of power efficiency as a primary constraint in processor and system design poses new challenges concerning power and energy awareness for numerical libraries and scientific applications. Power consumption also plays a major role in the design of data centers, which may house petascale or exascale-level computing systems. At these extreme scales, understanding and improving the energy efficiency of numerical libraries and their related applications becomes a crucial part of the successful implementation and operation of the computing system. In this paper, we study and investigate the practice of controlling a compute system's power usage, and we explore how different power caps affect the performance of numerical algorithms with different computational intensities. Further, we determine the impact, in terms of performance and energy usage, that these caps have on a system running scientific applications. This analysis will enable us to characterize the types of algorithms that benefit most from these power management schemes. Our experiments are performed using a set of representative kernels and several popular scientific benchmarks. Lastly, we quantify a number of power and performance measurements and draw observations and conclusions that can be viewed as a roadmap to achieving energy efficiency in the design and execution of scientific algorithms.

  16. Efficient volumetric estimation from plenoptic data

    NASA Astrophysics Data System (ADS)

    Anglin, Paul; Reeves, Stanley J.; Thurow, Brian S.

    2013-03-01

    The commercial release of the Lytro camera, and greater availability of plenoptic imaging systems in general, have given the image processing community cost-effective tools for light-field imaging. While this data is most commonly used to generate planar images at arbitrary focal depths, reconstruction of volumetric fields is also possible. Similarly, deconvolution is a technique that is conventionally used in planar image reconstruction, or deblurring, algorithms. However, when leveraged with the ability of a light-field camera to quickly reproduce multiple focal planes within an imaged volume, deconvolution offers a computationally efficient method of volumetric reconstruction. Related research has shown that light-field imaging systems in conjunction with tomographic reconstruction techniques are also capable of estimating the imaged volume and have been successfully applied to particle image velocimetry (PIV). However, while tomographic volumetric estimation through algorithms such as multiplicative algebraic reconstruction techniques (MART) has proven to be highly accurate, it is computationally intensive. In this paper, the reconstruction problem is shown to be solvable by deconvolution. Deconvolution offers significant improvement in computational efficiency through the use of fast Fourier transforms (FFTs) when compared to other tomographic methods. This work describes a deconvolution algorithm designed to reconstruct a 3-D particle field from simulated plenoptic data. A 3-D extension of existing 2-D FFT-based refocusing techniques is presented to further improve efficiency when computing object focal stacks and system point spread functions (PSF). Reconstruction artifacts are identified; their underlying source and methods of mitigation are explored where possible, and reconstructions of simulated particle fields are provided.
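
    As a rough illustration of why FFTs make deconvolution cheap, the sketch below performs a regularized inverse filter on a 1-D periodic signal. It is a toy under stated assumptions, not the authors' 3-D algorithm: the radix-2 FFT, the function names, the regularization constant eps, and the test signal are all chosen here for illustration.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def ifft(X):
    """Inverse FFT with the usual 1/n normalization."""
    n = len(X)
    return [v / n for v in fft(X, inverse=True)]


def circular_convolve(signal, psf):
    """Direct O(n^2) circular convolution, used here to simulate blur."""
    n = len(signal)
    return [sum(signal[j] * psf[(i - j) % n] for j in range(n))
            for i in range(n)]


def deconvolve(blurred, psf, eps=1e-3):
    """Regularized inverse filter: F = G * conj(H) / (|H|^2 + eps)."""
    G = fft(blurred)
    H = fft(psf)
    F = [g * h.conjugate() / (abs(h) ** 2 + eps) for g, h in zip(G, H)]
    return [v.real for v in ifft(F)]


# Toy data: a single "particle" at index 3, blurred by a small PSF.
truth = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
psf = [0.6, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2]
blurred = circular_convolve(truth, psf)
restored = deconvolve(blurred, psf)
print(restored.index(max(restored)))
```

    The division in the frequency domain replaces an iterative solve, so the cost is dominated by the transforms themselves; eps guards against dividing by near-zero frequency components of the PSF.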

  17. Progress in a novel architecture for high performance processing

    NASA Astrophysics Data System (ADS)

    Zhang, Zhiwei; Liu, Meng; Liu, Zijun; Du, Xueliang; Xie, Shaolin; Ma, Hong; Ding, Guangxin; Ren, Weili; Zhou, Fabiao; Sun, Wenqin; Wang, Huijuan; Wang, Donglin

    2018-04-01

    High performance processing (HPP) is an innovative architecture that targets high performance computing with excellent power efficiency and computing performance. It is suitable for data intensive applications like supercomputing, machine learning and wireless communication. An example chip with four application-specific integrated circuit (ASIC) cores, the first generation of HPP cores, has been taped out successfully on the Taiwan Semiconductor Manufacturing Company (TSMC) 40 nm low-power process. The innovative architecture shows great energy efficiency over the traditional central processing unit (CPU) and general-purpose computing on graphics processing units (GPGPU). Compared with MaPU, HPP makes substantial architectural improvements. A chip with 32 HPP cores is being developed on the TSMC 16 nm field effect transistor (FFC) technology process and is planned for commercial use. The peak performance of this chip can reach 4.3 teraFLOPS (TFLOPS) and its power efficiency is up to 89.5 gigaFLOPS per watt (GFLOPS/W).
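
    The two headline figures can be related with a one-line calculation: power = performance / efficiency. The sketch below assumes, hypothetically, that the peak performance and the peak efficiency are reached at the same operating point, which the abstract does not state; the function name is illustrative.

```python
def implied_power_watts(peak_tflops, gflops_per_watt):
    """Power draw implied by a throughput and an efficiency figure."""
    return peak_tflops * 1000.0 / gflops_per_watt  # 1 TFLOPS = 1000 GFLOPS


# Figures quoted in the abstract: 4.3 TFLOPS and 89.5 GFLOPS/W.
print(round(implied_power_watts(4.3, 89.5), 1))  # about 48 W
```

    Under that assumption the chip would draw roughly 48 W at peak; if the two maxima occur at different operating points, the actual figure would differ.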

  18. Resource-Efficient, Hierarchical Auto-Tuning of a Hybrid Lattice Boltzmann Computation on the Cray XT4

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Computational Research Division, Lawrence Berkeley National Laboratory; NERSC, Lawrence Berkeley National Laboratory; Computer Science Department, University of California, Berkeley

    2009-05-04

    We apply auto-tuning to a hybrid MPI-pthreads lattice Boltzmann computation running on the Cray XT4 at National Energy Research Scientific Computing Center (NERSC). Previous work showed that multicore-specific auto-tuning can improve the performance of lattice Boltzmann magnetohydrodynamics (LBMHD) by a factor of 4x when running on dual- and quad-core Opteron dual-socket SMPs. We extend these studies to the distributed memory arena via a hybrid MPI/pthreads implementation. In addition to conventional auto-tuning at the local SMP node, we tune at the message-passing level to determine the optimal aspect ratio as well as the correct balance between MPI tasks and threads per MPI task. Our study presents a detailed performance analysis when moving along an isocurve of constant hardware usage: fixed total memory, total cores, and total nodes. Overall, our work points to approaches for improving intra- and inter-node efficiency on large-scale multicore systems for demanding scientific applications.

  19. Distributed collaborative probabilistic design for turbine blade-tip radial running clearance using support vector machine of regression

    NASA Astrophysics Data System (ADS)

    Fei, Cheng-Wei; Bai, Guang-Chen

    2014-12-01

    To improve the computational precision and efficiency of probabilistic design for mechanical dynamic assemblies such as the blade-tip radial running clearance (BTRRC) of a gas turbine, a distributed collaborative probabilistic design method based on support vector machine regression (SR), called DCSRM, is proposed by integrating the distributed collaborative response surface method with a support vector machine regression model. The mathematical model of DCSRM is established and the probabilistic design idea of DCSRM is introduced. The dynamic assembly probabilistic design of an aeroengine high-pressure turbine (HPT) BTRRC is carried out to verify the proposed DCSRM. The analysis yields the optimal static blade-tip clearance of the HPT for designing the BTRRC and for improving the performance and reliability of the aeroengine. The comparison of methods shows that the DCSRM has high computational accuracy and high computational efficiency in BTRRC probabilistic analysis. The present research offers an effective way for the reliability design of mechanical dynamic assembly and enriches mechanical reliability theory and method.

  20. Space Radiation Transport Methods Development

    NASA Technical Reports Server (NTRS)

    Wilson, J. W.; Tripathi, R. K.; Qualls, G. D.; Cucinotta, F. A.; Prael, R. E.; Norbury, J. W.; Heinbockel, J. H.; Tweed, J.

    2002-01-01

    Improved spacecraft shield design requires early entry of radiation constraints into the design process to maximize performance and minimize costs. As a result, we have been investigating high-speed computational procedures to allow shield analysis from the preliminary design concepts to the final design. In particular, we will discuss the progress towards a full three-dimensional and computationally efficient deterministic code for which the current HZETRN evaluates the lowest order asymptotic term. HZETRN is the first deterministic solution to the Boltzmann equation allowing field mapping within the International Space Station (ISS) in tens of minutes using standard Finite Element Method (FEM) geometry common to engineering design practice enabling development of integrated multidisciplinary design optimization methods. A single ray trace in ISS FEM geometry requires 14 milliseconds and severely limits application of Monte Carlo methods to such engineering models. A potential means of improving the Monte Carlo efficiency in coupling to spacecraft geometry is given in terms of reconfigurable computing and could be utilized in the final design as verification of the deterministic method optimized design.

  1. Energy Use and Power Levels in New Monitors and Personal Computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Roberson, Judy A.; Homan, Gregory K.; Mahajan, Akshay

    2002-07-23

    Our research was conducted in support of the EPA ENERGY STAR Office Equipment program, whose goal is to reduce the amount of electricity consumed by office equipment in the U.S. The most energy-efficient models in each office equipment category are eligible for the ENERGY STAR label, which consumers can use to identify and select efficient products. As the efficiency of each category improves over time, the ENERGY STAR criteria need to be revised accordingly. The purpose of this study was to provide reliable data on the energy consumption of the newest personal computers and monitors that the EPA can use to evaluate revisions to current ENERGY STAR criteria as well as to improve the accuracy of ENERGY STAR program savings estimates. We report the results of measuring the power consumption and power management capabilities of a sample of new monitors and computers. These results will be used to improve estimates of program energy savings and carbon emission reductions, and to inform revisions of the ENERGY STAR criteria for these products. Our sample consists of 35 monitors and 26 computers manufactured between July 2000 and October 2001; it includes cathode ray tube (CRT) and liquid crystal display (LCD) monitors, Macintosh and Intel-architecture computers, desktop and laptop computers, and integrated computer systems, in which power consumption of the computer and monitor cannot be measured separately. For each machine we measured power consumption when off, on, and in each low-power level. We identify trends in and opportunities to reduce power consumption in new personal computers and monitors. Our results include a trend among monitor manufacturers to provide a single very low low-power level, well below the current ENERGY STAR criteria for sleep power consumption. These very low sleep power results mean that energy consumed when monitors are off or in active use has become more important in terms of contribution to the overall unit energy consumption (UEC). Current ENERGY STAR monitor and computer criteria do not specify off or on power, but our results suggest opportunities for saving energy in these modes. Also, significant differences between CRT and LCD technology, and between field-measured and manufacturer-reported power levels reveal the need for standard methods and metrics for measuring and comparing monitor power consumption.

  2. On the Use of Electrooculogram for Efficient Human Computer Interfaces

    PubMed Central

    Usakli, A. B.; Gurkan, S.; Aloise, F.; Vecchiato, G.; Babiloni, F.

    2010-01-01

    The aim of this study is to present electrooculogram (EOG) signals that can be used efficiently for human computer interfaces. Establishing an efficient alternative channel for communication without overt speech and hand movements is important to increase the quality of life for patients suffering from Amyotrophic Lateral Sclerosis or other illnesses that prevent correct limb and facial muscular responses. We conducted several experiments to compare the P300-based BCI speller with the new EOG-based system. On average, a five-letter word can be written in 25 seconds with the new system and in 105 seconds with the EEG-based device. Giving a message such as “clean-up” could be performed in 3 seconds with the new system. The new system is more efficient than the P300-based BCI system in terms of accuracy, speed, applicability, and cost efficiency. Using EOG signals, it is possible to improve the communication abilities of those patients who can move their eyes. PMID:19841687

  3. Implementation of the diagonalization-free algorithm in the self-consistent field procedure within the four-component relativistic scheme.

    PubMed

    Hrdá, Marcela; Kulich, Tomáš; Repiský, Michal; Noga, Jozef; Malkina, Olga L; Malkin, Vladimir G

    2014-09-05

    A recently developed Thouless-expansion-based diagonalization-free approach for improving the efficiency of self-consistent field (SCF) methods (Noga and Šimunek, J. Chem. Theory Comput. 2010, 6, 2706) has been adapted to the four-component relativistic scheme and implemented within the program package ReSpect. In addition to the implementation, the method has been thoroughly analyzed, particularly with respect to cases for which it is difficult or computationally expensive to find a good initial guess. Based on this analysis, several modifications of the original algorithm, refining its stability and efficiency, are proposed. To demonstrate the robustness and efficiency of the improved algorithm, we present the results of four-component diagonalization-free SCF calculations on several heavy-metal complexes, the largest of which contains more than 80 atoms (about 6000 4-spinor basis functions). The diagonalization-free procedure is about twice as fast as the corresponding diagonalization. Copyright © 2014 Wiley Periodicals, Inc.

  4. Improved Ant Colony Clustering Algorithm and Its Performance Study

    PubMed Central

    Gao, Wei

    2016-01-01

    Clustering analysis is used in many disciplines and applications; it is an important tool that descriptively identifies homogeneous groups of objects based on attribute values. The ant colony clustering algorithm is a swarm-intelligent method used for clustering problems that is inspired by the behavior of ant colonies that cluster their corpses and sort their larvae. A new abstraction ant colony clustering algorithm using a data combination mechanism is proposed to improve the computational efficiency and accuracy of the ant colony clustering algorithm. The abstraction ant colony clustering algorithm is used to cluster benchmark problems, and its performance is compared with the ant colony clustering algorithm and other methods used in existing literature. Based on similar computational difficulties and complexities, the results show that the abstraction ant colony clustering algorithm produces results that are not only more accurate but also more efficiently determined than the ant colony clustering algorithm and the other methods. Thus, the abstraction ant colony clustering algorithm can be used for efficient multivariate data clustering. PMID:26839533
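
    The corpse-clustering behavior mentioned above is commonly modeled with pick-up and drop probabilities that depend on the local density of similar items. The sketch below shows that basic model only, not the abstraction algorithm or data combination mechanism proposed in the paper; the function names and the constants k1 and k2 are illustrative assumptions.

```python
def pick_probability(f, k1=0.1):
    """Chance that an unladen ant picks up an item.

    f is the fraction of similar items perceived nearby (0 to 1).
    Isolated items (small f) are picked up almost always.
    """
    return (k1 / (k1 + f)) ** 2


def drop_probability(f, k2=0.3):
    """Chance that a laden ant drops its item.

    Items are dropped preferentially where many similar items
    already lie (large f), which makes clusters grow.
    """
    return (f / (k2 + f)) ** 2


for density in (0.0, 0.5, 1.0):
    print(density, pick_probability(density), drop_probability(density))
```

    Because picking is likely in sparse neighborhoods and dropping is likely in dense ones, repeated random walks by many ants gradually move isolated items into groups of similar items.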

  5. Computer Adaptive Testing, Big Data and Algorithmic Approaches to Education

    ERIC Educational Resources Information Center

    Thompson, Greg

    2017-01-01

    This article critically considers the promise of computer adaptive testing (CAT) and digital data to provide better and quicker data that will improve the quality, efficiency and effectiveness of schooling. In particular, it uses the case of the Australian NAPLAN test that will become an online, adaptive test from 2016. The article argues that…

  6. Evaluating a Computer-Assisted Pronunciation Training (CAPT) Technique for Efficient Classroom Instruction

    ERIC Educational Resources Information Center

    Luo, Beate

    2016-01-01

    This study investigates a computer-assisted pronunciation training (CAPT) technique that combines oral reading with peer review to improve pronunciation of Taiwanese English major students. In addition to traditional in-class instruction, students were given a short passage every week along with a recording of the respective text, read by a native…

  7. A Study on the Methods of Assessment and Strategy of Knowledge Sharing in Computer Course

    ERIC Educational Resources Information Center

    Chan, Pat P. W.

    2014-01-01

    With the advancement of information and communication technology, collaboration and knowledge sharing through technology is facilitated which enhances the learning process and improves the learning efficiency. The purpose of this paper is to review the methods of assessment and strategy of collaboration and knowledge sharing in a computer course,…

  8. Automatic finite element generators

    NASA Technical Reports Server (NTRS)

    Wang, P. S.

    1984-01-01

    The design and implementation of a software system for generating finite elements and related computations are described. Exact symbolic computational techniques are employed to derive strain-displacement matrices and element stiffness matrices. Methods for dealing with the excessive growth of symbolic expressions are discussed. Automatic FORTRAN code generation is described with emphasis on improving the efficiency of the resultant code.

  9. Hidden Markov induced Dynamic Bayesian Network for recovering time evolving gene regulatory networks

    NASA Astrophysics Data System (ADS)

    Zhu, Shijia; Wang, Yadong

    2015-12-01

    Dynamic Bayesian Networks (DBN) have been widely used to recover gene regulatory relationships from time-series data in computational systems biology. Its standard assumption is ‘stationarity’, and therefore, several research efforts have been recently proposed to relax this restriction. However, those methods suffer from three challenges: long running time, low accuracy and reliance on parameter settings. To address these problems, we propose a novel non-stationary DBN model by extending each hidden node of Hidden Markov Model into a DBN (called HMDBN), which properly handles the underlying time-evolving networks. Correspondingly, an improved structural EM algorithm is proposed to learn the HMDBN. It dramatically reduces searching space, thereby substantially improving computational efficiency. Additionally, we derived a novel generalized Bayesian Information Criterion under the non-stationary assumption (called BWBIC), which can help significantly improve the reconstruction accuracy and largely reduce over-fitting. Moreover, the re-estimation formulas for all parameters of our model are derived, enabling us to avoid reliance on parameter settings. Compared to the state-of-the-art methods, the experimental evaluation of our proposed method on both synthetic and real biological data demonstrates more stably high prediction accuracy and significantly improved computation efficiency, even with no prior knowledge and parameter settings.

  10. Enhancing Arithmetic and Word-Problem Solving Skills Efficiently by Individualized Computer-Assisted Practice

    ERIC Educational Resources Information Center

    Schoppek, Wolfgang; Tulis, Maria

    2010-01-01

    The fluency of basic arithmetical operations is a precondition for mathematical problem solving. However, the training of skills plays a minor role in contemporary mathematics instruction. The authors proposed individualization of practice as a means to improve its efficiency, so that the time spent with the training of skills is minimized. As a…

  11. On the tip of the tongue: learning typing and pointing with an intra-oral computer interface.

    PubMed

    Caltenco, Héctor A; Breidegard, Björn; Struijk, Lotte N S Andreasen

    2014-07-01

    To evaluate typing and pointing performance and improvement over time of four able-bodied participants using an intra-oral tongue-computer interface for computer control. A physically disabled individual may lack the ability to efficiently control standard computer input devices. There have been several efforts to produce and evaluate interfaces that provide individuals with physical disabilities the possibility to control personal computers. Training with the intra-oral tongue-computer interface was performed by playing games over 18 sessions. Skill improvement was measured through typing and pointing exercises at the end of each training session. Typing throughput improved from averages of 2.36 to 5.43 correct words per minute. Pointing throughput improved from averages of 0.47 to 0.85 bits/s. Target tracking performance, measured as relative time on target, improved from averages of 36% to 47%. Path following throughput improved from averages of 0.31 to 0.83 bits/s and decreased to 0.53 bits/s with more difficult tasks. Learning curves support the notion that the tongue can rapidly learn novel motor tasks. Typing and pointing performance of the tongue-computer interface is comparable to performances of other proficient assistive devices, which makes the tongue a feasible input organ for computer control. Intra-oral computer interfaces could provide individuals with severe upper-limb mobility impairments the opportunity to control computers and automatic equipment. Typing and pointing performance of the tongue-computer interface is comparable to performances of other proficient assistive devices, but does not cause fatigue easily and might be invisible to other people, which is highly prioritized by assistive device users. Combination of visual and auditory feedback is vital for a good performance of an intra-oral computer interface and helps to reduce involuntary or erroneous activations.

  12. An energy-efficient failure detector for vehicular cloud computing.

    PubMed

    Liu, Jiaxi; Wu, Zhibo; Dong, Jian; Wu, Jin; Wen, Dongxin

    2018-01-01

    Failure detectors are one of the fundamental components for maintaining the high availability of vehicular cloud computing. In vehicular cloud computing, many roadside units (RSUs) are deployed along the road to improve connectivity. Many of them are equipped with solar batteries because wired electrical power is unavailable or too expensive. It is therefore important to reduce the battery consumption of RSUs. However, existing failure detection algorithms are not designed to save RSU battery consumption. To solve this problem, a new energy-efficient failure detector, 2E-FD, has been proposed specifically for vehicular cloud computing. 2E-FD not only provides acceptable failure detection service but also reduces the battery consumption of RSUs. Comparative experiments show that our failure detector performs better in terms of speed, accuracy and battery consumption.

  13. An energy-efficient failure detector for vehicular cloud computing

    PubMed Central

    Liu, Jiaxi; Wu, Zhibo; Wu, Jin; Wen, Dongxin

    2018-01-01

    Failure detectors are one of the fundamental components for maintaining the high availability of vehicular cloud computing. In vehicular cloud computing, many roadside units (RSUs) are deployed along the road to improve connectivity. Many of them are equipped with solar batteries because wired electrical power is unavailable or too expensive. It is therefore important to reduce the battery consumption of RSUs. However, existing failure detection algorithms are not designed to save RSU battery consumption. To solve this problem, a new energy-efficient failure detector, 2E-FD, has been proposed specifically for vehicular cloud computing. 2E-FD not only provides acceptable failure detection service but also reduces the battery consumption of RSUs. Comparative experiments show that our failure detector performs better in terms of speed, accuracy and battery consumption. PMID:29352282

  14. A Survey of Architectural Techniques for Near-Threshold Computing

    DOE PAGES

    Mittal, Sparsh

    2015-12-28

    Energy efficiency has now become the primary obstacle in scaling the performance of all classes of computing systems. Low-voltage computing, and specifically near-threshold voltage computing (NTC), which involves operating the transistor very close to and yet above its threshold voltage, holds the promise of providing many-fold improvement in energy efficiency. However, use of NTC also presents several challenges such as increased parametric variation, failure rate and performance loss. Our paper surveys several recent techniques which aim to offset these challenges for fully leveraging the potential of NTC. By classifying these techniques along several dimensions, we also highlight their similarities and differences. Ultimately, we hope that this paper will provide insights into state-of-art NTC techniques to researchers and system-designers and inspire further research in this field.

  15. A Survey of Architectural Techniques for Near-Threshold Computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mittal, Sparsh

    Energy efficiency has now become the primary obstacle in scaling the performance of all classes of computing systems. Low-voltage computing, and specifically near-threshold voltage computing (NTC), which involves operating the transistor very close to and yet above its threshold voltage, holds the promise of providing many-fold improvement in energy efficiency. However, use of NTC also presents several challenges such as increased parametric variation, failure rate and performance loss. Our paper surveys several recent techniques which aim to offset these challenges for fully leveraging the potential of NTC. By classifying these techniques along several dimensions, we also highlight their similarities and differences. Ultimately, we hope that this paper will provide insights into state-of-art NTC techniques to researchers and system-designers and inspire further research in this field.

  16. A 3D inversion for all-space magnetotelluric data with static shift correction

    NASA Astrophysics Data System (ADS)

    Zhang, Kun

    2017-04-01

    Based on previous studies of static shift correction and 3D inversion algorithms, we improve the NLCG 3D inversion method and propose a new static shift correction method that works within the inversion. The static shift correction method is based on 3D theory and real data. The static shift can be detected by quantitative analysis of the apparent parameters (apparent resistivity and impedance phase) of MT in the high-frequency range, and the correction is completed within the inversion. The method is an automatic computer processing technique with no additional cost, and it avoids additional field work and indoor processing while giving good results. The 3D inversion algorithm is improved (Zhang et al., 2013) based on the NLCG method of Newman & Alumbaugh (2000) and Rodi & Mackie (2001). For the algorithm, we added a parallel structure, improved the computational efficiency, reduced the memory requirement, and added topographic and marine factors. As a result, the 3D inversion can run on a general PC with high efficiency and accuracy, and MT data from surface, seabed and underground stations can all be used in the inversion algorithm.

  17. Addressing the coming radiology crisis-the Society for Computer Applications in Radiology transforming the radiological interpretation process (TRIP) initiative.

    PubMed

    Andriole, Katherine P; Morin, Richard L; Arenson, Ronald L; Carrino, John A; Erickson, Bradley J; Horii, Steven C; Piraino, David W; Reiner, Bruce I; Seibert, J Anthony; Siegel, Eliot

    2004-12-01

    The Society for Computer Applications in Radiology (SCAR) Transforming the Radiological Interpretation Process (TRIP) Initiative aims to spearhead research, education, and discovery of innovative solutions to address the problem of information and image data overload. The initiative will foster interdisciplinary research on technological, environmental and human factors to better manage and exploit the massive amounts of data. TRIP will focus on the following basic objectives: improving the efficiency of interpretation of large data sets, improving the timeliness and effectiveness of communication, and decreasing medical errors. The ultimate goal of the initiative is to improve the quality and safety of patient care. Interdisciplinary research into several broad areas will be necessary to make progress in managing the ever-increasing volume of data. The six concepts involved are human perception, image processing and computer-aided detection (CAD), visualization, navigation and usability, databases and integration, and evaluation and validation of methods and performance. The result of this transformation will affect several key processes in radiology, including image interpretation; communication of imaging results; workflow and efficiency within the health care enterprise; diagnostic accuracy and a reduction in medical errors; and, ultimately, the overall quality of care.

  18. Distributed collaborative probabilistic design of multi-failure structure with fluid-structure interaction using fuzzy neural network of regression

    NASA Astrophysics Data System (ADS)

    Song, Lu-Kai; Wen, Jie; Fei, Cheng-Wei; Bai, Guang-Chen

    2018-05-01

    To improve the computing efficiency and precision of probabilistic design for multi-failure structure, a distributed collaborative probabilistic design method based on a fuzzy neural network of regression (FR), called DCFRM, is proposed by integrating the distributed collaborative response surface method with a fuzzy neural network regression model. The mathematical model of DCFRM is established and the probabilistic design idea with DCFRM is introduced. The probabilistic analysis of turbine blisk involving multi-failure modes (deformation failure, stress failure and strain failure) was investigated by considering fluid-structure interaction with the proposed method. The distribution characteristics, reliability degree, and sensitivity degree of each failure mode and overall failure mode on turbine blisk are obtained, which provides a useful reference for improving the performance and reliability of the aeroengine. The comparison of methods shows that the DCFRM reshapes the probability of probabilistic analysis for multi-failure structure and improves the computing efficiency while keeping acceptable computational precision. Moreover, the proposed method offers a useful insight for reliability-based design optimization of multi-failure structure and thereby also enriches the theory and method of mechanical reliability design.

  19. Improved transition path sampling methods for simulation of rare events

    NASA Astrophysics Data System (ADS)

    Chopra, Manan; Malshe, Rohit; Reddy, Allam S.; de Pablo, J. J.

    2008-04-01

    The free energy surfaces of a wide variety of systems encountered in physics, chemistry, and biology are characterized by the existence of deep minima separated by numerous barriers. One of the central aims of recent research in computational chemistry and physics has been to determine how transitions occur between deep local minima on rugged free energy landscapes, and transition path sampling (TPS) Monte-Carlo methods have emerged as an effective means for numerical investigation of such transitions. Many of the shortcomings of TPS-like approaches generally stem from their high computational demands. Two new algorithms are presented in this work that improve the efficiency of TPS simulations. The first algorithm uses biased shooting moves to render the sampling of reactive trajectories more efficient. The second algorithm is shown to substantially improve the accuracy of the transition state ensemble by introducing a subset of local transition path simulations in the transition state. The system considered in this work consists of a two-dimensional rough energy surface that is representative of numerous systems encountered in applications. When taken together, these algorithms provide gains in efficiency of over two orders of magnitude when compared to traditional TPS simulations.

  20. Feature Selection in Classification of Eye Movements Using Electrooculography for Activity Recognition

    PubMed Central

    Mala, S.; Latha, K.

    2014-01-01

    Activity recognition is needed in different applications, for example, reconnaissance systems, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. In selecting a subset of features, Differential Evolution (DE), an efficient evolutionary optimizer, is used to find informative features from eye movements recorded using electrooculography (EOG). Many researchers use EOG signals in human-computer interactions with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates more on the feature selection algorithm based on DE in order to improve the classification for faultless activity recognition. PMID:25574185

  1. Feature selection in classification of eye movements using electrooculography for activity recognition.

    PubMed

    Mala, S; Latha, K

    2014-01-01

    Activity recognition is needed in different applications, for example, reconnaissance systems, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. To select a subset of features, Differential Evolution (DE), an efficient evolutionary optimizer, is used to find informative features from eye movements recorded using electrooculography (EOG). Many researchers use EOG signals in human-computer interactions with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates more on the feature selection algorithm based on DE in order to improve the classification for faultless activity recognition.

  2. Further reduction of minimal first-met bad markings for the computationally efficient synthesis of a maximally permissive controller

    NASA Astrophysics Data System (ADS)

    Liu, GaiYun; Chao, Daniel Yuh

    2015-08-01

    To date, research on the supervisor design for flexible manufacturing systems focuses on speeding up the computation of optimal (maximally permissive) liveness-enforcing controllers. Recent deadlock prevention policies for systems of simple sequential processes with resources (S3PR) reduce the computation burden by considering only the minimal portion of all first-met bad markings (FBMs). Maximal permissiveness is ensured by not forbidding any live state. This paper proposes a method to further reduce the size of the minimal set of FBMs to efficiently solve integer linear programming problems while maintaining maximal permissiveness using a vector-covering approach. This paper improves on the previous work and achieves the simplest structure with the minimal number of monitors.

  3. Toward high-efficiency and detailed Monte Carlo simulation study of the granular flow spallation target

    NASA Astrophysics Data System (ADS)

    Cai, Han-Jie; Zhang, Zhi-Lei; Fu, Fen; Li, Jian-Yang; Zhang, Xun-Chao; Zhang, Ya-Ling; Yan, Xue-Song; Lin, Ping; Xv, Jian-Ya; Yang, Lei

    2018-02-01

    The dense granular flow spallation target is a new target concept chosen for the Accelerator-Driven Subcritical (ADS) project in China. For the R&D of this kind of target concept, a dedicated Monte Carlo (MC) program named GMT was developed to perform the simulation study of the beam-target interaction. Owing to the complexities of the target geometry, the computational cost of the MC simulation of particle tracks is highly expensive. Thus, improvement of computational efficiency will be essential for the detailed MC simulation studies of the dense granular target. Here we present the special design of the GMT program and its high efficiency performance. In addition, the speedup potential of the GPU-accelerated spallation models is discussed.

  4. Improving the sampling efficiency of Monte Carlo molecular simulations: an evolutionary approach

    NASA Astrophysics Data System (ADS)

    Leblanc, Benoit; Braunschweig, Bertrand; Toulhoat, Hervé; Lutton, Evelyne

    We present a new approach in order to improve the convergence of Monte Carlo (MC) simulations of molecular systems belonging to complex energetic landscapes: the problem is redefined in terms of the dynamic allocation of MC move frequencies depending on their past efficiency, measured with respect to a relevant sampling criterion. We introduce various empirical criteria with the aim of accounting for the proper convergence in phase space sampling. The dynamic allocation is performed over parallel simulations by means of a new evolutionary algorithm involving 'immortal' individuals. The method is benchmarked with respect to conventional procedures on a model for melt linear polyethylene. We record significant improvement in sampling efficiencies, thus in computational load, while the optimal sets of move frequencies are liable to allow interesting physical insights into the particular systems simulated. This last aspect should provide a new tool for designing more efficient new MC moves.
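
    The idea of allocating move frequencies according to past efficiency can be illustrated with a minimal sketch. This is not the evolutionary algorithm of the record above: it is a simple proportional rule, and the function name, the `floor` parameter, and the move names are all hypothetical. Each move type receives a selection probability proportional to its measured efficiency, with a small minimum share so that no move is ever switched off.

```python
def update_move_frequencies(efficiencies, floor=0.05):
    """Return move-selection probabilities proportional to past efficiency.

    efficiencies: dict mapping a move name to a non-negative efficiency score
                  (how the score is measured is left open; it is an assumption).
    floor:        minimum probability guaranteed to every move type.
    """
    n = len(efficiencies)
    if floor * n >= 1.0:
        raise ValueError("floor is too large for the number of move types")
    total = sum(efficiencies.values())
    if total <= 0.0:
        # No information yet: fall back to a uniform allocation.
        return {name: 1.0 / n for name in efficiencies}
    free = 1.0 - floor * n  # probability mass shared out by efficiency
    return {name: floor + free * score / total
            for name, score in efficiencies.items()}


freqs = update_move_frequencies({"translate": 3.0, "rotate": 1.0}, floor=0.1)
```

    In this example the more efficient move is chosen more often, while the floor keeps the less efficient one in play so that its efficiency can still be re-measured later.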

  5. P-HS-SFM: a parallel harmony search algorithm for the reproduction of experimental data in the continuous microscopic crowd dynamic models

    NASA Astrophysics Data System (ADS)

    Jaber, Khalid Mohammad; Alia, Osama Moh'd.; Shuaib, Mohammed Mahmod

    2018-03-01

    Finding the optimal parameters that can reproduce experimental data (such as the velocity-density relation and the specific flow rate) is a very important component of the validation and calibration of microscopic crowd dynamic models. Heavy computational demand during parameter search is a known limitation that exists in a previously developed model known as the Harmony Search-Based Social Force Model (HS-SFM). In this paper, a parallel-based mechanism is proposed to reduce the computational time and memory resource utilisation required to find these parameters. More specifically, two MATLAB-based multicore techniques (parfor and create independent jobs) using shared memory are developed by taking advantage of the multithreading capabilities of parallel computing, resulting in a new framework called the Parallel Harmony Search-Based Social Force Model (P-HS-SFM). The experimental results show that the parfor-based P-HS-SFM achieved a better computational time of about 26 h, an efficiency improvement of approximately 54% and a speedup factor of 2.196 times in comparison with the HS-SFM sequential processor. The performance of the P-HS-SFM using the create independent jobs approach is also comparable to parfor with a computational time of 26.8 h, an efficiency improvement of about 30% and a speedup of 2.137 times.

  6. Computing exact bundle compliance control charts via probability generating functions.

    PubMed

    Chen, Binchao; Matis, Timothy; Benneyan, James

    2016-06-01

    Compliance with evidence-based practices, individually and in 'bundles', remains an important focus of healthcare quality improvement for many clinical conditions. The exact probability distribution of composite bundle compliance measures used to develop corresponding control charts and other statistical tests is based on a fairly large convolution whose direct calculation can be computationally prohibitive. Various series expansions and other approximation approaches have been proposed, each with computational and accuracy tradeoffs, especially in the tails. This same probability distribution also arises in other important healthcare applications, such as for risk-adjusted outcomes and bed demand prediction, with the same computational difficulties. As an alternative, we use probability generating functions to rapidly obtain exact results and illustrate the improved accuracy and detection over other methods. Numerical testing across a wide range of applications demonstrates the computational efficiency and accuracy of this approach.
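
    A small sketch may help show why a probability generating function (PGF) yields exact results. It assumes, purely for illustration, that the composite count is a sum of independent Bernoulli trials with unequal success probabilities; this may not be the exact model of the record above, and the function names are hypothetical. The PGF of such a sum is the product of the factors (1 - p_i + p_i z), and the coefficient of z^k in the expanded product is the exact probability of k successes.

```python
def poisson_binomial_pmf(probs):
    """Exact PMF of a sum of independent Bernoulli(p_i) variables.

    Expands the PGF prod_i (1 - p_i + p_i * z) one factor at a time;
    entry k of the returned list is P(sum == k).
    """
    coeffs = [1.0]  # PGF of the constant 0 (no trials yet)
    for p in probs:
        nxt = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += c * (1.0 - p)  # this trial fails: count stays at k
            nxt[k + 1] += c * p      # this trial succeeds: count moves to k + 1
        coeffs = nxt
    return coeffs


def upper_tail(pmf, k):
    """P(sum >= k), the kind of tail probability a control limit relies on."""
    return sum(pmf[k:])


pmf = poisson_binomial_pmf([0.9, 0.8, 0.95])
```

    Each factor costs time proportional to the current number of coefficients, so n trials need on the order of n^2 operations, and the tail probabilities are exact up to floating-point rounding.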

  7. Scattering of Airy elastic sheets by a cylindrical cavity in a solid.

    PubMed

    Mitri, F G

    2017-11-01

    The prediction of the elastic scattering by voids (and cracks) in materials is an important process in structural health monitoring, phononic crystals, metamaterials and non-destructive evaluation/imaging to name a few examples. Earlier analytical theories and numerical computations considered the elastic scattering by voids in plane waves of infinite extent. However, current research suggesting the use of (limited-diffracting, accelerating and self-healing) Airy acoustical-sheet beams for non-destructive evaluation or imaging applications in elastic solids requires the development of an improved analytical formalism to predict the scattering efficiency used as a priori information in quantitative material characterization. Based on the definition of the time-averaged scattered power flow density, an analytical expression for the scattering efficiency of a cylindrical empty cavity (i.e., void) encased in an elastic medium is derived for compressional and normally-polarized shear-wave Airy beams. The multipole expansion method using cylindrical wave functions is utilized. Numerical computations for the scattering energy efficiency factors for compressional and shear waves illustrate the analysis with particular emphasis on the Airy beam parameters and the non-dimensional frequency, for various elastic materials surrounding the cavity. The ratio of the compressional to the shear wave speed stimulates the generation of elastic resonances, which are manifested as a series of peaks in the scattering efficiency plots. The present analysis provides an improved method for the computations of the scattering energy efficiency factors using compressional and shear-wave Airy beams in elastic materials as opposed to plane waves of infinite extent. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. TOUGH3: A new efficient version of the TOUGH suite of multiphase flow and transport simulators

    NASA Astrophysics Data System (ADS)

    Jung, Yoojin; Pau, George Shu Heng; Finsterle, Stefan; Pollyea, Ryan M.

    2017-11-01

    The TOUGH suite of nonisothermal multiphase flow and transport simulators has been updated by various developers over many years to address a vast range of challenging subsurface problems. The increasing complexity of the simulated processes as well as the growing size of model domains that need to be handled call for an improvement in the simulator's computational robustness and efficiency. Moreover, modifications have been frequently introduced independently, resulting in multiple versions of TOUGH that (1) led to inconsistencies in feature implementation and usage, (2) made code maintenance and development inefficient, and (3) caused confusion to users and developers. TOUGH3, a new base version of TOUGH, addresses these issues. It consolidates both the serial (TOUGH2 V2.1) and parallel (TOUGH2-MP V2.0) implementations, enabling simulations to be performed on desktop computers and supercomputers using a single code. New PETSc parallel linear solvers are added to the existing serial solvers of TOUGH2 and the Aztec solver used in TOUGH2-MP. The PETSc solvers generally perform better than the Aztec solvers in parallel and the internal TOUGH3 linear solver in serial. TOUGH3 also incorporates many new features, addresses bugs, and improves the flexibility of data handling. Due to the improved capabilities and usability, TOUGH3 is more robust and efficient for solving tough and computationally demanding problems in diverse scientific and practical applications related to subsurface flow modeling.

  9. Energy Efficient Image/Video Data Transmission on Commercial Multi-Core Processors

    PubMed Central

    Lee, Sungju; Kim, Heegon; Chung, Yongwha; Park, Daihee

    2012-01-01

    In transmitting image/video data over Video Sensor Networks (VSNs), energy consumption must be minimized while maintaining high image/video quality. Although image/video compression is well known for its efficiency and usefulness in VSNs, the excessive costs associated with encoding computation and complexity still hinder its adoption for practical use. However, it is anticipated that high-performance handheld multi-core devices will be used as VSN processing nodes in the near future. In this paper, we propose a way to improve the energy efficiency of image and video compression with multi-core processors while maintaining the image/video quality. We improve the compression efficiency at the algorithmic level or derive the optimal parameters for the combination of a machine and compression based on the tradeoff between the energy consumption and the image/video quality. Based on experimental results, we confirm that the proposed approach can improve the energy efficiency of the straightforward approach by a factor of 2∼5 without compromising image/video quality. PMID:23202181

  10. ClimateSpark: An in-memory distributed computing framework for big climate data analytics

    NASA Astrophysics Data System (ADS)

    Hu, Fei; Yang, Chaowei; Schnase, John L.; Duffy, Daniel Q.; Xu, Mengchao; Bowen, Michael K.; Lee, Tsengdar; Song, Weiwei

    2018-06-01

    The unprecedented growth of climate data creates new opportunities for climate studies, and yet big climate data pose a grand challenge to climatologists to efficiently manage and analyze big data. The complexity of climate data content and analytical algorithms increases the difficulty of implementing algorithms on high performance computing systems. This paper proposes an in-memory, distributed computing framework, ClimateSpark, to facilitate complex big data analytics and time-consuming computational tasks. Chunking data structure improves parallel I/O efficiency, while a spatiotemporal index is built for the chunks to avoid unnecessary data reading and preprocessing. An integrated, multi-dimensional, array-based data model (ClimateRDD) and ETL operations are developed to address big climate data variety by integrating the processing components of the climate data lifecycle. ClimateSpark utilizes Spark SQL and Apache Zeppelin to develop a web portal to facilitate the interaction among climatologists, climate data, analytic operations and computing resources (e.g., using SQL query and Scala/Python notebook). Experimental results show that ClimateSpark conducts different spatiotemporal data queries/analytics with high efficiency and data locality. ClimateSpark is easily adaptable to other big multiple-dimensional, array-based datasets in various geoscience domains.

  11. Efficient computation of the joint sample frequency spectra for multiple populations.

    PubMed

    Kamm, John A; Terhorst, Jonathan; Song, Yun S

    2017-01-01

    A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity.
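
    For readers unfamiliar with the statistic, the following minimal sketch computes an observed SFS for a single population from a 0/1 haplotype matrix. It only illustrates the definition given above; it is not the expected joint SFS computed by momi, and the function name and input layout are assumptions.

```python
from collections import Counter


def sample_frequency_spectrum(haplotypes):
    """Observed sample frequency spectrum of one population.

    haplotypes: list of equal-length rows of 0/1 values, one row per sampled
                sequence and one column per site; 1 marks the mutant allele.
    Returns a list sfs of length n + 1 where sfs[k] is the number of sites
    at which exactly k of the n sampled sequences carry the mutant allele.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    # Count mutant copies at each site, then tally sites by that count.
    counts = Counter(sum(row[j] for row in haplotypes) for j in range(n_sites))
    return [counts.get(k, 0) for k in range(n + 1)]


sfs = sample_frequency_spectrum([
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
])
```

    The whole matrix collapses to n + 1 integers, which is the dimensional reduction the abstract refers to; a joint SFS over several populations would index sites by one such count per population.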

  12. Efficient computation of the joint sample frequency spectra for multiple populations

    PubMed Central

    Kamm, John A.; Terhorst, Jonathan; Song, Yun S.

    2016-01-01

    A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity. PMID:28239248

  13. Computing rank-revealing QR factorizations of dense matrices.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bischof, C. H.; Quintana-Orti, G.; Mathematics and Computer Science

    1998-06-01

    We develop algorithms and implementations for computing rank-revealing QR (RRQR) factorizations of dense matrices. First, we develop an efficient block algorithm for approximating an RRQR factorization, employing a windowed version of the commonly used Golub pivoting strategy, aided by incremental condition estimation. Second, we develop efficiently implementable variants of guaranteed reliable RRQR algorithms for triangular matrices originally suggested by Chandrasekaran and Ipsen and by Pan and Tang. We suggest algorithmic improvements with respect to condition estimation, termination criteria, and Givens updating. By combining the block algorithm with one of the triangular postprocessing steps, we arrive at an efficient and reliable algorithm for computing an RRQR factorization of a dense matrix. Experimental results on IBM RS/6000 and SGI R8000 platforms show that this approach performs up to three times faster than the less reliable QR factorization with column pivoting as it is currently implemented in LAPACK, and comes within 15% of the performance of the LAPACK block algorithm for computing a QR factorization without any column exchanges. Thus, we expect this routine to be useful in many circumstances where numerical rank deficiency cannot be ruled out, but currently has been ignored because of the computational cost of dealing with it.

  14. Do Computers Improve the Drawing of a Geometrical Figure for 10 Year-Old Children?

    ERIC Educational Resources Information Center

    Martin, Perrine; Velay, Jean-Luc

    2012-01-01

    Nowadays, computer aided design (CAD) is widely used by designers. Would children learn to draw more easily and more efficiently if they were taught with computerised tools? To answer this question, we made an experiment designed to compare two methods for children to do the same drawing: the classical "pen and paper" method and a CAD…

  15. Assisting People with Developmental Disabilities to Improve Computer Pointing Efficiency through Multiple Mice and Automatic Pointing Assistive Programs

    ERIC Educational Resources Information Center

    Shih, Ching-Hsiang

    2011-01-01

    This study combines multi-mice technology (people with disabilities can use standard mice, instead of specialized alternative computer input devices, to achieve complete mouse operation) with an assistive pointing function (i.e. cursor-capturing, which enables the user to move the cursor to the target center automatically), to assess whether two…

  16. Computational fluid dynamics research

    NASA Technical Reports Server (NTRS)

    Chandra, Suresh; Jones, Kenneth; Hassan, Hassan; Mcrae, David Scott

    1992-01-01

    The focus of research in the computational fluid dynamics (CFD) area is two fold: (1) to develop new approaches for turbulence modeling so that high speed compressible flows can be studied for applications to entry and re-entry flows; and (2) to perform research to improve CFD algorithm accuracy and efficiency for high speed flows. Research activities, faculty and student participation, publications, and financial information are outlined.

  17. Adaptation of a program for nonlinear finite element analysis to the CDC STAR 100 computer

    NASA Technical Reports Server (NTRS)

    Pifko, A. B.; Ogilvie, P. L.

    1978-01-01

    The conversion of a nonlinear finite element program to the CDC STAR 100 pipeline computer is discussed. The program called DYCAST was developed for the crash simulation of structures. Initial results with the STAR 100 computer indicated that significant gains in computation time are possible for operations on global arrays. However, for element level computations that do not lend themselves easily to long vector processing, the STAR 100 was slower than comparable scalar computers. On this basis it is concluded that in order for pipeline computers to impact the economic feasibility of large nonlinear analyses it is absolutely essential that algorithms be devised to improve the efficiency of element level computations.

  18. Evidence of Effectiveness of Health Care Professionals Using Handheld Computers: A Scoping Review of Systematic Reviews

    PubMed Central

    2013-01-01

    Background Handheld computers and mobile devices provide instant access to vast amounts and types of useful information for health care professionals. Their reduced size and increased processing speed have led to rapid adoption in health care. Thus, it is important to identify whether handheld computers are actually effective in clinical practice. Objective A scoping review of systematic reviews was designed to provide a quick overview of the documented evidence of effectiveness for health care professionals using handheld computers in their clinical work. Methods A detailed search, sensitive for systematic reviews was applied for Cochrane, Medline, EMBASE, PsycINFO, Allied and Complementary Medicine Database (AMED), Global Health, and Cumulative Index to Nursing and Allied Health Literature (CINAHL) databases. All outcomes that demonstrated effectiveness in clinical practice were included. Classroom learning and patient use of handheld computers were excluded. Quality was assessed using the Assessment of Multiple Systematic Reviews (AMSTAR) tool. A previously published conceptual framework was used as the basis for dual data extraction. Reported outcomes were summarized according to the primary function of the handheld computer. Results Five systematic reviews met the inclusion and quality criteria. Together, they reviewed 138 unique primary studies. Most reviewed descriptive intervention studies, where physicians, pharmacists, or medical students used personal digital assistants. Effectiveness was demonstrated across four distinct functions of handheld computers: patient documentation, patient care, information seeking, and professional work patterns. Within each of these functions, a range of positive outcomes were reported using both objective and self-report measures. The use of handheld computers improved patient documentation through more complete recording, fewer documentation errors, and increased efficiency. Handheld computers provided easy access to clinical decision support systems and patient management systems, which improved decision making for patient care. Handheld computers saved time and gave earlier access to new information. There were also reports that handheld computers enhanced work patterns and efficiency. Conclusions This scoping review summarizes the secondary evidence for effectiveness of handheld computers and mhealth. It provides a snapshot of effective use by health care professionals across four key functions. We identified evidence to suggest that handheld computers provide easy and timely access to information and enable accurate and complete documentation. Further, they can give health care professionals instant access to evidence-based decision support and patient management systems to improve clinical decision making. Finally, there is evidence that handheld computers allow health professionals to be more efficient in their work practices. It is anticipated that this evidence will guide clinicians and managers in implementing handheld computers in clinical practice and in designing future research. PMID:24165786

  19. About improving efficiency of the P3M algorithms when computing the inter-particle forces in beam dynamics

    NASA Astrophysics Data System (ADS)

    Kozynchenko, Alexander I.; Kozynchenko, Sergey A.

    2017-03-01

    In the paper, a problem of improving efficiency of the particle-particle-particle-mesh (P3M) algorithm in computing the inter-particle electrostatic forces is considered. The particle-mesh (PM) part of the algorithm is modified in such a way that the space field equation is solved by the direct method of summation of potentials over the ensemble of particles lying not too close to a reference particle. For this purpose, a specific matrix "pattern" is introduced to describe the spatial field distribution of a single point charge, so the "pattern" contains pre-calculated potential values. This approach reduces the set of arithmetic operations performed in the innermost of the nested loops to an addition and an assignment and, therefore, decreases the running time substantially. The simulation model developed in C++ substantiates this view, showing decent accuracy that is acceptable in particle beam calculations together with improved speed performance.
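
    The pre-calculated "pattern" idea can be sketched as follows. This is only an illustration under simplifying assumptions, not the authors' C++ model: the grid is two-dimensional, all particles carry the same charge and sit exactly on mesh nodes, physical constants are omitted, and the names `build_pattern` and `accumulate_potential` are hypothetical. The potential of one point charge is tabulated once per grid offset, so the innermost update needs no division or square root.

```python
import math


def build_pattern(radius, spacing, charge=1.0, skip=1):
    """Tabulate charge / r for every grid offset within `radius` nodes.

    Offsets closer than `skip` nodes are left out; in a P3M scheme such
    near neighbours would be handled by the particle-particle part.
    """
    pattern = {}
    for di in range(-radius, radius + 1):
        for dj in range(-radius, radius + 1):
            if max(abs(di), abs(dj)) < skip:
                continue
            pattern[(di, dj)] = charge / (spacing * math.hypot(di, dj))
    return pattern


def accumulate_potential(nodes, pattern, shape):
    """Sum the tabulated single-charge potential of every particle on the mesh."""
    ni, nj = shape
    phi = [[0.0] * nj for _ in range(ni)]
    for pi, pj in nodes:
        for (di, dj), value in pattern.items():
            i, j = pi + di, pj + dj
            if 0 <= i < ni and 0 <= j < nj:
                phi[i][j] += value  # the update itself is one addition and assignment
    return phi


pattern = build_pattern(radius=2, spacing=1.0)
phi = accumulate_potential([(2, 2)], pattern, shape=(5, 5))
```

    The cost of evaluating 1/r is paid once when the pattern is built rather than once per particle and node, which is where a saving of the kind described above would come from.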

  20. Secure chaotic map based block cryptosystem with application to camera sensor networks.

    PubMed

    Guo, Xianfeng; Zhang, Jiashu; Khan, Muhammad Khurram; Alghathbar, Khaled

    2011-01-01

    Recently, Wang et al. presented an efficient logistic map based block encryption system. The encryption system employs feedback ciphertext to achieve plaintext dependence of sub-keys. Unfortunately, we discovered that their scheme is unable to withstand key stream attack. To improve its security, this paper proposes a novel chaotic map based block cryptosystem. At the same time, a secure architecture for camera sensor network is constructed. The network comprises a set of inexpensive camera sensors to capture the images, a sink node equipped with sufficient computation and storage capabilities and a data processing server. The transmission security between the sink node and the server is gained by utilizing the improved cipher. Both theoretical analysis and simulation results indicate that the improved algorithm can overcome the flaws and maintain all the merits of the original cryptosystem. In addition, computational costs and efficiency of the proposed scheme are encouraging for the practical implementation in the real environment as well as camera sensor network.

  1. Secure Chaotic Map Based Block Cryptosystem with Application to Camera Sensor Networks

    PubMed Central

    Guo, Xianfeng; Zhang, Jiashu; Khan, Muhammad Khurram; Alghathbar, Khaled

    2011-01-01

    Recently, Wang et al. presented an efficient logistic map based block encryption system. The encryption system employs feedback ciphertext to achieve plaintext dependence of sub-keys. Unfortunately, we discovered that their scheme is unable to withstand key stream attack. To improve its security, this paper proposes a novel chaotic map based block cryptosystem. At the same time, a secure architecture for camera sensor network is constructed. The network comprises a set of inexpensive camera sensors to capture the images, a sink node equipped with sufficient computation and storage capabilities and a data processing server. The transmission security between the sink node and the server is gained by utilizing the improved cipher. Both theoretical analysis and simulation results indicate that the improved algorithm can overcome the flaws and maintain all the merits of the original cryptosystem. In addition, computational costs and efficiency of the proposed scheme are encouraging for the practical implementation in the real environment as well as camera sensor network. PMID:22319371

  2. 75 FR 20111 - Energy Conservation Program: Energy Conservation Standards for Residential Water Heaters, Direct...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-04-16

    ... the ``three heating products'') must be designed to ``achieve the maximum improvement in energy... and CO 2 savings are performed with different computer models, leading to different time frames for... of EPCA sets forth a variety of provisions designed to improve energy efficiency. Part A\\1\\ of Title...

  3. Leveraging Gibbs Ensemble Molecular Dynamics and Hybrid Monte Carlo/Molecular Dynamics for Efficient Study of Phase Equilibria.

    PubMed

    Gartner, Thomas E; Epps, Thomas H; Jayaraman, Arthi

    2016-11-08

    We describe an extension of the Gibbs ensemble molecular dynamics (GEMD) method for studying phase equilibria. Our modifications to GEMD allow for direct control over particle transfer between phases and improve the method's numerical stability. Additionally, we found that the modified GEMD approach had advantages in computational efficiency in comparison to a hybrid Monte Carlo (MC)/MD Gibbs ensemble scheme in the context of the single component Lennard-Jones fluid. We note that this increase in computational efficiency does not compromise the close agreement of phase equilibrium results between the two methods. However, numerical instabilities in the GEMD scheme hamper GEMD's use near the critical point. We propose that the computationally efficient GEMD simulations can be used to map out the majority of the phase window, with hybrid MC/MD used as a follow up for conditions under which GEMD may be unstable (e.g., near-critical behavior). In this manner, we can capitalize on the contrasting strengths of these two methods to enable the efficient study of phase equilibria for systems that present challenges for a purely stochastic GEMC method, such as dense or low temperature systems, and/or those with complex molecular topologies.

  4. Improving Engine Efficiency Through Core Developments

    NASA Technical Reports Server (NTRS)

    Heidmann, James D.

    2011-01-01

    The NASA Environmentally Responsible Aviation (ERA) Project and Fundamental Aeronautics Projects are supporting compressor and turbine research with the goal of reducing aircraft engine fuel burn and greenhouse gas emissions. The primary goals of this work are to increase aircraft propulsion system fuel efficiency for a given mission by increasing the overall pressure ratio (OPR) of the engine while maintaining or improving aerodynamic efficiency of these components. An additional area of work involves reducing the amount of cooling air required to cool the turbine blades while increasing the turbine inlet temperature. This is complicated by the fact that the cooling air is becoming hotter due to the increases in OPR. Various methods are being investigated to achieve these goals, ranging from improved compressor three-dimensional blade designs to improved turbine cooling hole shapes and methods. Finally, a complementary effort in improving the accuracy, range, and speed of computational fluid mechanics (CFD) methods is proceeding to better capture the physical mechanisms underlying all these problems, for the purpose of improving understanding and future designs.

  5. Traffic information computing platform for big data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Duan, Zongtao, E-mail: ztduan@chd.edu.cn; Li, Ying, E-mail: ztduan@chd.edu.cn; Zheng, Xibin, E-mail: ztduan@chd.edu.cn

    Big data environments create the data conditions for improving the quality of traffic information services. The goal of this article is to construct a traffic information computing platform for the big data environment. Through in-depth analysis of the connotations and technical characteristics of big data and traffic information services, a distributed traffic atomic information computing platform architecture is proposed. In a big data environment, this type of traffic atomic information computing architecture helps to guarantee traffic safety and efficient operation, and makes more intelligent and personalized traffic information services available to traffic information users.

  6. Rapid road repair vehicle

    DOEpatents

    Mara, Leo M.

    1999-01-01

    Disclosed are improvements to a rapid road repair vehicle comprising an improved cleaning device arrangement, two dispensing arrays for filling defects more rapidly and efficiently, an array of pre-heaters to heat the road way surface in order to help the repair material better bond to the repaired surface, a means for detecting, measuring, and computing the number, location and volume of each of the detected surface imperfections, and a computer means schema for controlling the operation of the plurality of vehicle subsystems. The improved vehicle is, therefore, better able to perform its intended function of filling surface imperfections while moving over those surfaces at near normal traffic speeds.

  7. Efficient Irregular Wavefront Propagation Algorithms on Hybrid CPU-GPU Machines

    PubMed Central

    Teodoro, George; Pan, Tony; Kurc, Tahsin; Kong, Jun; Cooper, Lee; Saltz, Joel

    2013-01-01

    We address the problem of efficient execution of a computation pattern, referred to here as the irregular wavefront propagation pattern (IWPP), on hybrid systems with multiple CPUs and GPUs. The IWPP is common in several image processing operations. In the IWPP, data elements in the wavefront propagate waves to their neighboring elements on a grid if a propagation condition is satisfied. Elements receiving the propagated waves become part of the wavefront. This pattern results in irregular data accesses and computations. We develop and evaluate strategies for efficient computation and propagation of wavefronts using a multi-level queue structure. This queue structure improves the utilization of fast memories in a GPU and reduces synchronization overheads. We also develop a tile-based parallelization strategy to support execution on multiple CPUs and GPUs. We evaluate our approaches on a state-of-the-art GPU accelerated machine (equipped with 3 GPUs and 2 multicore CPUs) using the IWPP implementations of two widely used image processing operations: morphological reconstruction and Euclidean distance transform. Our results show significant performance improvements on GPUs. The use of multiple CPUs and GPUs cooperatively attains speedups of 50× and 85× with respect to single core CPU executions for morphological reconstruction and Euclidean distance transform, respectively. PMID:23908562
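
    The wavefront pattern described above can be illustrated with a minimal, sequential sketch of queue-based grayscale morphological reconstruction. This is not the multi-level GPU queue or the tile-based scheme of the paper; it assumes a single FIFO queue, 4-connectivity, and images stored as lists of lists, and the function name `morph_reconstruct` is hypothetical.

```python
from collections import deque


def morph_reconstruct(marker, mask):
    """Grayscale reconstruction by dilation (assumes marker <= mask pixelwise).

    Sequential illustration only: one FIFO queue, 4-connected neighbors.
    """
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in marker]
    # Initial wavefront: every pixel (simple, not the most economical seed).
    queue = deque((r, c) for r in range(rows) for c in range(cols))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Propagation condition: the neighbor can still be raised
                # without exceeding the mask.
                new_val = min(out[r][c], mask[nr][nc])
                if out[nr][nc] < new_val:
                    out[nr][nc] = new_val
                    # The updated neighbor joins the wavefront.
                    queue.append((nr, nc))
    return out


mask = [[5, 5, 1, 5],
        [5, 5, 1, 5]]
marker = [[5, 0, 0, 0],
          [0, 0, 0, 0]]
result = morph_reconstruct(marker, mask)
```

    In this toy input the bright seed floods the left block, but the low column in the mask limits what reaches the right side. Which pixels enter the queue depends on the data, which is the source of the irregular accesses the abstract mentions.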

  8. Lean and Efficient Software: Whole-Program Optimization of Executables

    DTIC Science & Technology

    2015-06-30

    Approved for public release; distribution is unlimited. Financial Data Contact: Krisztina Nagy T: (607) 273-7340 x.117 F : (607) 273-8752 knagy...grammatech.com Administrative Contact: Derek Burrows T: (607) 273-7340 x.113 F : (607) 273-8752 dburrows@grammatech.com Report Documentation Page...library subroutines, removing redundant argument checking and interface layers, eliminating dead code, and improving computational efficiency. In

  9. Energy Efficient Digital Logic Using Nanoscale Magnetic Devices

    NASA Astrophysics Data System (ADS)

    Lambson, Brian James

    Increasing demand for information processing in the last 50 years has been largely satisfied by the steadily declining price and improving performance of microelectronic devices. Much of this progress has been made by aggressively scaling the size of semiconductor transistors and metal interconnects that microprocessors are built from. As devices shrink to the size regime in which quantum effects pose significant challenges, new physics may be required in order to continue historical scaling trends. A variety of new devices and physics are currently under investigation throughout the scientific and engineering community to meet these challenges. One of the more drastic proposals on the table is to replace the electronic components of information processors with magnetic components. Magnetic components are already commonplace in computers for their information storage capability. Unlike most electronic devices, magnetic materials can store data in the absence of a power supply. Today's magnetic hard disk drives can routinely hold billions of bits of information and are in widespread commercial use. Their ability to function without a constant power source hints at an intrinsic energy efficiency. The question we investigate in this dissertation is whether or not this advantage can be extended from information storage to the notoriously energy intensive task of information processing. Several proof-of-concept magnetic logic devices were proposed and tested in the past decade. In this dissertation, we build on the prior work by answering fundamental questions about how magnetic devices achieve such high energy efficiency and how they can best function in digital logic applications. The results of this analysis are used to suggest and test improvements to nanomagnetic computing devices. 
Two of our results are seen as especially important to the field of nanomagnetic computing: (1) we show that it is possible to operate nanomagnetic computers at the fundamental thermodynamic limits of computation and (2) we develop a nanomagnet with a unique shape that is engineered to significantly improve the reliability of nanomagnetic logic.

  10. Scalable and massively parallel Monte Carlo photon transport simulations for heterogeneous computing platforms

    NASA Astrophysics Data System (ADS)

    Yu, Leiming; Nina-Paravecino, Fanny; Kaeli, David; Fang, Qianqian

    2018-01-01

    We present a highly scalable Monte Carlo (MC) three-dimensional photon transport simulation platform designed for heterogeneous computing systems. Through the development of a massively parallel MC algorithm using the Open Computing Language framework, this research extends our existing graphics processing unit (GPU)-accelerated MC technique to a highly scalable vendor-independent heterogeneous computing environment, achieving significantly improved performance and software portability. A number of parallel computing techniques are investigated to achieve portable performance over a wide range of computing hardware. Furthermore, multiple thread-level and device-level load-balancing strategies are developed to obtain efficient simulations using multiple central processing units and GPUs.

  11. Public-Private Partnership: Joint recommendations to improve downloads of large Earth observation data

    NASA Astrophysics Data System (ADS)

    Ramachandran, R.; Murphy, K. J.; Baynes, K.; Lynnes, C.

    2016-12-01

    With the volume of Earth observation data expanding rapidly, cloud computing is quickly changing the way Earth observation data is processed, analyzed, and visualized. The cloud infrastructure provides the flexibility to scale up to large volumes of data and handle high velocity data streams efficiently. Having freely available Earth observation data collocated on a cloud infrastructure creates opportunities for innovation and value-added data re-use in ways unforeseen by the original data provider. These innovations spur new industries and applications and spawn new scientific pathways that were previously limited due to data volume and computational infrastructure issues. NASA, in collaboration with Amazon, Google, and Microsoft, has jointly developed a set of recommendations to enable efficient transfer of Earth observation data from existing data systems to a cloud computing infrastructure. The purpose of these recommendations is to provide guidelines against which all data providers can evaluate existing data systems and address any issues uncovered, enabling efficient search, access, and use of large volumes of data. Additionally, these guidelines ensure that all cloud providers utilize a common methodology for bulk-downloading data from data providers, thus preventing the data providers from building custom capabilities to meet the needs of individual cloud providers. The intent is to share these recommendations with other Federal agencies and organizations that serve Earth observation to enable efficient search, access, and use of large volumes of data. Additionally, the adoption of these recommendations will benefit data users interested in moving large volumes of data from data systems to any other location. These data users include the cloud providers, cloud users such as scientists, and other users working in a high performance computing environment who need to move large volumes of data.

  12. An adaptive sparse-grid high-order stochastic collocation method for Bayesian inference in groundwater reactive transport modeling

    NASA Astrophysics Data System (ADS)

    Zhang, Guannan; Lu, Dan; Ye, Ming; Gunzburger, Max; Webster, Clayton

    2013-10-01

    Bayesian analysis has become vital to uncertainty quantification in groundwater modeling, but its application has been hindered by the computational cost associated with numerous model executions required by exploring the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, a new approach is developed to improve the computational efficiency of Bayesian inference by constructing a surrogate of the PPDF, using an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using a first-order hierarchical basis, this paper utilizes a compactly supported higher-order hierarchical basis to construct the surrogate system, resulting in a significant reduction in the number of required model executions. In addition, using the hierarchical surplus as an error indicator allows locally adaptive refinement of sparse grids in the parameter space, which further improves computational efficiency. To efficiently build the surrogate system for the PPDF with multiple significant modes, optimization techniques are used to identify the modes, for which high-probability regions are defined and components of the aSG-hSC approximation are constructed. After the surrogate is determined, the PPDF can be evaluated by sampling the surrogate system directly without model execution, resulting in improved efficiency of the surrogate-based MCMC compared with conventional MCMC. The developed method is evaluated using two synthetic groundwater reactive transport models. The first example involves coupled linear reactions and demonstrates the accuracy of our high-order hierarchical basis approach in approximating a high-dimensional posterior distribution. 
The second example is highly nonlinear because of the reactions of uranium surface complexation, and demonstrates how the iterative aSG-hSC method is able to capture multimodal and non-Gaussian features of PPDF caused by model nonlinearity. Both experiments show that aSG-hSC is an effective and efficient tool for Bayesian inference.

  13. Nonuniform depth grids in parabolic equation solutions.

    PubMed

    Sanders, William M; Collins, Michael D

    2013-04-01

    The parabolic wave equation is solved using a finite-difference solution in depth that involves a nonuniform grid. The depth operator is discretized using Galerkin's method with asymmetric hat functions. Examples are presented to illustrate that this approach can be used to improve efficiency for problems in ocean acoustics and seismo-acoustics. For shallow water problems, accuracy is sensitive to the precise placement of the ocean bottom interface. This issue is often addressed with the inefficient approach of using a fine grid spacing over all depth. Efficiency may be improved by using a relatively coarse grid with nonuniform sampling to precisely position the interface. Efficiency may also be improved by reducing the sampling in the sediment and in an absorbing layer that is used to truncate the computational domain. Nonuniform sampling may also be used to improve the implementation of a single-scattering approximation for sloping fluid-solid interfaces.

  14. A Worst-Case Approach for On-Line Flutter Prediction

    NASA Technical Reports Server (NTRS)

    Lind, Rick C.; Brenner, Martin J.

    1998-01-01

    Worst-case flutter margins may be computed for a linear model with respect to a set of uncertainty operators using the structured singular value. This paper considers an on-line implementation to compute these robust margins in a flight test program. Uncertainty descriptions are updated at test points to account for unmodeled time-varying dynamics of the airplane by ensuring the robust model is not invalidated by measured flight data. Robust margins computed with respect to this uncertainty remain conservative to the changing dynamics throughout the flight. A simulation clearly demonstrates this method can improve the efficiency of flight testing by accurately predicting the flutter margin to improve safety while reducing the necessary flight time.

  15. An Improved Clustering Algorithm of Tunnel Monitoring Data for Cloud Computing

    PubMed Central

    Zhong, Luo; Tang, KunHao; Li, Lin; Yang, Guang; Ye, JingJing

    2014-01-01

    With the rapid development of urban construction, the number of urban tunnels is increasing and the data they produce are becoming more and more complex. As a result, traditional clustering algorithms cannot handle the mass data of tunnels. To solve this problem, an improved parallel clustering algorithm based on k-means is proposed. The algorithm uses MapReduce within cloud computing to process the data. It not only can handle mass data but is also more efficient. Moreover, it computes the average dissimilarity degree of each cluster in order to clean abnormal data. PMID:24982971
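
    As a rough illustration of how k-means maps onto the MapReduce model, the sketch below expresses one iteration as a map phase (assign each point to its nearest centroid) and a reduce phase (average the points of each cluster). It runs in a single process and does not reproduce the paper's improved algorithm or its dissimilarity-based cleaning step; the function names and the sample data are hypothetical.

```python
from collections import defaultdict


def map_phase(points, centroids):
    """Emit (cluster_index, point) pairs: each point goes to its nearest centroid."""
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        yield dists.index(min(dists)), p


def reduce_phase(pairs, centroids):
    """Average the points of each cluster to obtain updated centroids."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    new_centroids = list(centroids)  # an empty cluster keeps its old centroid
    for idx, pts in groups.items():
        new_centroids[idx] = tuple(sum(x) / len(pts) for x in zip(*pts))
    return new_centroids


def kmeans(points, centroids, iterations=10):
    """Repeat map and reduce for a fixed number of iterations."""
    for _ in range(iterations):
        centroids = reduce_phase(map_phase(points, centroids), centroids)
    return centroids


# Hypothetical sample data: two well-separated groups of 2D points.
points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centers = kmeans(points, [(0, 0), (10, 10)])
```

    In a real MapReduce deployment the map calls would run on separate data partitions and the reduce calls would run per cluster key; only the centroids would need to be broadcast between iterations.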

  16. Electromagnetomechanical elastodynamic model for Lamb wave damage quantification in composites

    NASA Astrophysics Data System (ADS)

    Borkowski, Luke; Chattopadhyay, Aditi

    2014-03-01

    Physics-based wave propagation computational models play a key role in structural health monitoring (SHM) and the development of improved damage quantification methodologies. Guided waves (GWs), such as Lamb waves, provide the capability to monitor large plate-like aerospace structures with limited actuators and sensors and are sensitive to small scale damage; however due to the complex nature of GWs, accurate and efficient computation tools are necessary to investigate the mechanisms responsible for dispersion, coupling, and interaction with damage. In this paper, the local interaction simulation approach (LISA) coupled with the sharp interface model (SIM) solution methodology is used to solve the fully coupled electro-magneto-mechanical elastodynamic equations for the piezoelectric and piezomagnetic actuation and sensing of GWs in fiber reinforced composite material systems. The final framework provides the full three-dimensional displacement as well as electrical and magnetic potential fields for arbitrary plate and transducer geometries and excitation waveform and frequency. The model is validated experimentally and proven computationally efficient for a laminated composite plate. Studies are performed with surface bonded piezoelectric and embedded piezomagnetic sensors to gain insight into the physics of experimental techniques used for SHM. The symmetric collocation of piezoelectric actuators is modeled to demonstrate mode suppression in laminated composites for the purpose of damage detection. The effect of delamination and damage (i.e., matrix cracking) on the GW propagation is demonstrated and quantified. The developed model provides a valuable tool for the improvement of SHM techniques due to its proven accuracy and computational efficiency.

  17. The monitoring and managing application of cloud computing based on Internet of Things.

    PubMed

    Luo, Shiliang; Ren, Bin

    2016-07-01

    Cloud computing and the Internet of Things are two hot topics in the Internet application field. The application of these two technologies is widely discussed and researched, but much less so in the field of medical monitoring and management. Thus, in this paper, we study and analyze the application of cloud computing and the Internet of Things in the medical field, and we combine the two techniques for medical monitoring and management. First, the model architecture for a remote monitoring cloud platform of healthcare information (RMCPHI) was established. Then the RMCPHI architecture was analyzed. Finally, an efficient PSOSAA algorithm was proposed for the medical monitoring and managing application of cloud computing. Simulation results showed that our proposed scheme can improve efficiency by about 50%. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  18. Accelerating Full Configuration Interaction Calculations for Nuclear Structure

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, Chao; Sternberg, Philip; Maris, Pieter

    2008-04-14

    One of the emerging computational approaches in nuclear physics is the full configuration interaction (FCI) method for solving the many-body nuclear Hamiltonian in a sufficiently large single-particle basis space to obtain exact answers - either directly or by extrapolation. The lowest eigenvalues and corresponding eigenvectors for very large, sparse and unstructured nuclear Hamiltonian matrices are obtained and used to evaluate additional experimental quantities. These matrices pose a significant challenge to the design and implementation of efficient and scalable algorithms for obtaining solutions on massively parallel computer systems. In this paper, we describe the computational strategies employed in a state-of-the-art FCI code MFDn (Many Fermion Dynamics - nuclear) as well as techniques we recently developed to enhance the computational efficiency of MFDn. We will demonstrate the current capability of MFDn and report the latest performance improvement we have achieved. We will also outline our future research directions.

  19. A 3D staggered-grid finite difference scheme for poroelastic wave equation

    NASA Astrophysics Data System (ADS)

    Zhang, Yijie; Gao, Jinghuai

    2014-10-01

    Three-dimensional numerical modeling has been a viable tool for understanding wave propagation in real media. Poroelastic media can better describe the phenomena of hydrocarbon reservoirs than acoustic and elastic media. However, numerical modeling in 3D poroelastic media demands significantly more computational capacity, including both computational time and memory. In this paper, we present a 3D poroelastic staggered-grid finite difference (SFD) scheme. During the procedure, parallel computing is implemented to reduce the computational time. Parallelization is based on domain decomposition, and communication between processors is performed using the message passing interface (MPI). Parallel analysis shows that the parallelized SFD scheme significantly improves the simulation efficiency and that 3D domain decomposition is the most efficient. We also analyze the numerical dispersion and stability condition of the 3D poroelastic SFD method. Numerical results show that the 3D numerical simulation can provide a realistic description of wave propagation.

  20. A novel parallel pipeline structure of VP9 decoder

    NASA Astrophysics Data System (ADS)

    Qin, Huabiao; Chen, Wu; Yi, Sijun; Tan, Yunfei; Yi, Huan

    2018-04-01

    To improve the efficiency of VP9 decoder, a novel parallel pipeline structure of VP9 decoder is presented in this paper. According to the decoding workflow, VP9 decoder can be divided into sub-modules which include entropy decoding, inverse quantization, inverse transform, intra prediction, inter prediction, deblocking and pixel adaptive compensation. By analyzing the computing time of each module, hotspot modules are located and the causes of low efficiency of VP9 decoder can be found. Then, a novel pipeline decoder structure is designed by using mixed parallel decoding methods of data division and function division. The experimental results show that this structure can greatly improve the decoding efficiency of VP9.

  1. Parallel Implicit Runge-Kutta Methods Applied to Coupled Orbit/Attitude Propagation

    NASA Astrophysics Data System (ADS)

    Hatten, Noble; Russell, Ryan P.

    2017-12-01

    A variable-step Gauss-Legendre implicit Runge-Kutta (GLIRK) propagator is applied to coupled orbit/attitude propagation. Concepts previously shown to improve efficiency in 3DOF propagation are modified and extended to the 6DOF problem, including the use of variable-fidelity dynamics models. The impact of computing the stage dynamics of a single step in parallel is examined using up to 23 threads and 22 associated GLIRK stages; one thread is reserved for an extra dynamics function evaluation used in the estimation of the local truncation error. Efficiency is found to peak for typical examples when using approximately 8 to 12 stages for both serial and parallel implementations. Accuracy and efficiency compare favorably to explicit Runge-Kutta and linear-multistep solvers for representative scenarios. However, linear-multistep methods are found to be more efficient for some applications, particularly in a serial computing environment, or when parallelism can be applied across multiple trajectories.
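
    To make the structure of a Gauss-Legendre implicit Runge-Kutta step concrete, here is a minimal sketch of the standard 2-stage, fourth-order method applied to a scalar ODE. It is a simplification of what the abstract describes: the step size is fixed rather than variable, the stage equations are solved by plain fixed-point iteration, and everything runs serially. The function names `glirk2_step` and `integrate` are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)
# Butcher tableau of the 2-stage Gauss-Legendre method (order 4).
A = ((0.25, 0.25 - SQRT3 / 6.0),
     (0.25 + SQRT3 / 6.0, 0.25))
B = (0.5, 0.5)
C = (0.5 - SQRT3 / 6.0, 0.5 + SQRT3 / 6.0)


def glirk2_step(f, t, y, h, iterations=50):
    """Advance y' = f(t, y) by one step of size h (scalar y)."""
    k = [f(t, y), f(t, y)]  # initial guess for the stage derivatives
    for _ in range(iterations):
        # Within one iteration the stage evaluations are independent of each
        # other, so this is the part that could be distributed across threads.
        k = [f(t + C[i] * h, y + h * (A[i][0] * k[0] + A[i][1] * k[1]))
             for i in range(2)]
    return y + h * (B[0] * k[0] + B[1] * k[1])


def integrate(f, t0, y0, h, steps):
    """Apply a fixed number of equal steps starting from (t0, y0)."""
    t, y = t0, y0
    for _ in range(steps):
        y = glirk2_step(f, t, y, h)
        t += h
    return y


# Test problem (an assumption for illustration): y' = -y, y(0) = 1.
y_end = integrate(lambda t, y: -y, 0.0, 1.0, 0.1, 10)
```

    Fixed-point iteration only converges when the step size is small relative to the stiffness of the problem; a production propagator would more likely use a Newton-type solve and an error estimate to choose the step.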

  2. EPPRD: An Efficient Privacy-Preserving Power Requirement and Distribution Aggregation Scheme for a Smart Grid.

    PubMed

    Zhang, Lei; Zhang, Jing

    2017-08-07

    A Smart Grid (SG) facilitates bidirectional demand-response communication between individual users and power providers with high computation and communication performance but also brings about the risk of leaking users' private information. Therefore, improving the individual power requirement and distribution efficiency to ensure communication reliability while preserving user privacy is a new challenge for SG. Based on this issue, we propose an efficient and privacy-preserving power requirement and distribution aggregation scheme (EPPRD) based on a hierarchical communication architecture. In the proposed scheme, an efficient encryption and authentication mechanism is proposed for better fit to each individual demand-response situation. Through extensive analysis and experiment, we demonstrate how the EPPRD resists various security threats and preserves user privacy while satisfying the individual requirement in a semi-honest model; it involves less communication overhead and computation time than the existing competing schemes.

  3. Efficient LIDAR Point Cloud Data Managing and Processing in a Hadoop-Based Distributed Framework

    NASA Astrophysics Data System (ADS)

    Wang, C.; Hu, F.; Sha, D.; Han, X.

    2017-10-01

    Light Detection and Ranging (LiDAR) is one of the most promising technologies in surveying and mapping, city management, forestry, object recognition, computer vision engineering, and other fields. However, it is challenging to efficiently store, query, and analyze high-resolution 3D LiDAR data due to its volume and complexity. In order to improve the productivity of LiDAR data processing, this study proposes a Hadoop-based framework to efficiently manage and process LiDAR data in a distributed and parallel manner, which takes advantage of Hadoop's storage and computing ability. At the same time, the Point Cloud Library (PCL), an open-source project for 2D/3D image and point cloud processing, is integrated with HDFS and MapReduce to run the LiDAR data analysis algorithms provided by PCL in a parallel fashion. The experimental results show that the proposed framework can efficiently manage and process big LiDAR data.

  4. EPPRD: An Efficient Privacy-Preserving Power Requirement and Distribution Aggregation Scheme for a Smart Grid

    PubMed Central

    Zhang, Lei; Zhang, Jing

    2017-01-01

    A Smart Grid (SG) facilitates bidirectional demand-response communication between individual users and power providers with high computation and communication performance but also brings about the risk of leaking users’ private information. Therefore, improving the individual power requirement and distribution efficiency to ensure communication reliability while preserving user privacy is a new challenge for SG. Based on this issue, we propose an efficient and privacy-preserving power requirement and distribution aggregation scheme (EPPRD) based on a hierarchical communication architecture. In the proposed scheme, an efficient encryption and authentication mechanism is proposed for better fit to each individual demand-response situation. Through extensive analysis and experiment, we demonstrate how the EPPRD resists various security threats and preserves user privacy while satisfying the individual requirement in a semi-honest model; it involves less communication overhead and computation time than the existing competing schemes. PMID:28783122

  5. An adaptive grid to improve the efficiency and accuracy of modelling underwater noise from shipping

    NASA Astrophysics Data System (ADS)

    Trigg, Leah; Chen, Feng; Shapiro, Georgy; Ingram, Simon; Embling, Clare

    2017-04-01

    Underwater noise from shipping is becoming a significant concern and has been listed as a pollutant under Descriptor 11 of the Marine Strategy Framework Directive. Underwater noise models are an essential tool to assess and predict noise levels for regulatory procedures such as environmental impact assessments and ship noise monitoring. There are generally two approaches to noise modelling. The first is based on simplified energy flux models, assuming either spherical or cylindrical propagation of sound energy. These models are very quick but they ignore important water column and seabed properties, and produce significant errors in the areas subject to temperature stratification (Shapiro et al., 2014). The second type of model (e.g. ray-tracing and parabolic equation) is based on an advanced physical representation of sound propagation. However, these acoustic propagation models are computationally expensive to execute. Shipping noise modelling requires spatial discretization in order to group noise sources together using a grid. A uniform grid size is often selected to achieve either the greatest efficiency (i.e. speed of computations) or the greatest accuracy. In contrast, this work aims to produce efficient and accurate noise level predictions by presenting an adaptive grid where cell size varies with distance from the receiver. The spatial range over which a certain cell size is suitable was determined by calculating the distance from the receiver at which propagation loss becomes uniform across a grid cell. The computational efficiency and accuracy of the resulting adaptive grid was tested by comparing it to uniform 1 km and 5 km grids. These represent an accurate and computationally efficient grid respectively. 
For a case study of the Celtic Sea, an application of the adaptive grid over an area of 160×160 km reduced the number of model executions required from 25600 for a 1 km grid to 5356 in December and to between 5056 and 13132 in August, which represents a 2 to 5-fold increase in efficiency. The 5 km grid reduces the number of model executions further to 1024. However, over the first 25 km the 5 km grid produces errors of up to 13.8 dB when compared to the highly accurate but inefficient 1 km grid. The newly developed adaptive grid generates much smaller errors of less than 0.5 dB while demonstrating high computational efficiency. Our results show that the adaptive grid provides the ability to retain the accuracy of noise level predictions and improve the efficiency of the modelling process. This can help safeguard sensitive marine ecosystems from noise pollution by improving the underwater noise predictions that inform management activities. References Shapiro, G., Chen, F., Thain, R., 2014. The Effect of Ocean Fronts on Acoustic Wave Propagation in a Shallow Sea, Journal of Marine Systems, 139: 217 - 226. http://dx.doi.org/10.1016/j.jmarsys.2014.06.007.
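
    The execution counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below assumes one model execution per grid cell and measures efficiency as the ratio of executions relative to the uniform 1 km grid; both are interpretations of the abstract, not statements from the paper, and the helper name is hypothetical.

```python
def uniform_cell_count(side_km, cell_km):
    """Number of square cells of size cell_km covering a side_km x side_km area."""
    per_side = side_km // cell_km
    return per_side * per_side


fine = uniform_cell_count(160, 1)    # uniform 1 km grid
coarse = uniform_cell_count(160, 5)  # uniform 5 km grid

# Adaptive-grid execution counts as quoted in the abstract.
adaptive_counts = {"December": 5356, "August (min)": 5056, "August (max)": 13132}

# Assumed efficiency measure: executions on the 1 km grid divided by
# executions on the adaptive grid.
speedups = {label: fine / n for label, n in adaptive_counts.items()}

for label, ratio in speedups.items():
    print(f"{label}: {ratio:.2f}x fewer executions than the 1 km grid")
```

    Under these assumptions the ratios fall between roughly 1.9 and 5.1, consistent with the 2 to 5-fold range stated in the abstract.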

  6. A new approach to integrate GPU-based Monte Carlo simulation into inverse treatment plan optimization for proton therapy.

    PubMed

    Li, Yongbao; Tian, Zhen; Song, Ting; Wu, Zhaoxia; Liu, Yaqiang; Jiang, Steve; Jia, Xun

    2017-01-07

    Monte Carlo (MC)-based spot dose calculation is highly desired for inverse treatment planning in proton therapy because of its accuracy. Recent studies on biological optimization have also indicated the use of MC methods to compute relevant quantities of interest, e.g. linear energy transfer. Although GPU-based MC engines have been developed to address inverse optimization problems, their efficiency still needs to be improved. Also, the use of a large number of GPUs in MC calculation is not favorable for clinical applications. The previously proposed adaptive particle sampling (APS) method can improve the efficiency of MC-based inverse optimization by using the computationally expensive MC simulation more effectively. This method is more efficient than the conventional approach that performs spot dose calculation and optimization in two sequential steps. In this paper, we propose a computational library to perform MC-based spot dose calculation on GPU with the APS scheme. The implemented APS method performs a non-uniform sampling of the particles from pencil beam spots during the optimization process, favoring those from the high intensity spots. The library also conducts two computationally intensive matrix-vector operations frequently used when solving an optimization problem. This library design allows a streamlined integration of the MC-based spot dose calculation into an existing proton therapy inverse planning process. We tested the developed library in a typical inverse optimization system with four patient cases. The library achieved the targeted functions by supporting inverse planning in various proton therapy schemes, e.g. single field uniform dose, 3D intensity modulated proton therapy, and distal edge tracking. The efficiency was 41.6  ±  15.3% higher than the use of a GPU-based MC package in a conventional calculation scheme. The total computation time ranged between 2 and 50 min on a single GPU card depending on the problem size.
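
    The idea of sampling particles non-uniformly in favor of high-intensity spots can be sketched as a proportional allocation of a fixed particle budget. The abstract does not give the actual sampling rule of the APS method, so the rule below (budget split in proportion to current spot intensity, with largest-remainder rounding) is purely an assumption for illustration, and `allocate_particles` is a hypothetical name.

```python
def allocate_particles(intensities, budget):
    """Split a particle budget across spots in proportion to spot intensity.

    Assumed rule for illustration only; not taken from the APS paper.
    """
    total = sum(intensities)
    if total <= 0:
        raise ValueError("at least one spot must have positive intensity")
    raw = [budget * w / total for w in intensities]
    counts = [int(x) for x in raw]  # round down first
    # Hand the leftover particles to the spots with the largest remainders,
    # so the counts always sum exactly to the budget.
    leftover = budget - sum(counts)
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


# Hypothetical spot intensities from an intermediate optimization iterate.
counts = allocate_particles([1.0, 3.0, 6.0], 1000)
```

    In an optimization loop, such an allocation would be recomputed as the spot intensities change, so simulation effort follows the spots that currently contribute most to the dose.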

  7. A new approach to integrate GPU-based Monte Carlo simulation into inverse treatment plan optimization for proton therapy

    NASA Astrophysics Data System (ADS)

    Li, Yongbao; Tian, Zhen; Song, Ting; Wu, Zhaoxia; Liu, Yaqiang; Jiang, Steve; Jia, Xun

    2017-01-01

    Monte Carlo (MC)-based spot dose calculation is highly desired for inverse treatment planning in proton therapy because of its accuracy. Recent studies on biological optimization have also indicated the use of MC methods to compute relevant quantities of interest, e.g. linear energy transfer. Although GPU-based MC engines have been developed to address inverse optimization problems, their efficiency still needs to be improved. Also, the use of a large number of GPUs in MC calculation is not favorable for clinical applications. The previously proposed adaptive particle sampling (APS) method can improve the efficiency of MC-based inverse optimization by using the computationally expensive MC simulation more effectively. This method is more efficient than the conventional approach that performs spot dose calculation and optimization in two sequential steps. In this paper, we propose a computational library to perform MC-based spot dose calculation on GPU with the APS scheme. The implemented APS method performs a non-uniform sampling of the particles from pencil beam spots during the optimization process, favoring those from the high intensity spots. The library also conducts two computationally intensive matrix-vector operations frequently used when solving an optimization problem. This library design allows a streamlined integration of the MC-based spot dose calculation into an existing proton therapy inverse planning process. We tested the developed library in a typical inverse optimization system with four patient cases. The library achieved the targeted functions by supporting inverse planning in various proton therapy schemes, e.g. single field uniform dose, 3D intensity modulated proton therapy, and distal edge tracking. The efficiency was 41.6  ±  15.3% higher than the use of a GPU-based MC package in a conventional calculation scheme. The total computation time ranged between 2 and 50 min on a single GPU card depending on the problem size.

  8. A New Approach to Integrate GPU-based Monte Carlo Simulation into Inverse Treatment Plan Optimization for Proton Therapy

    PubMed Central

    Li, Yongbao; Tian, Zhen; Song, Ting; Wu, Zhaoxia; Liu, Yaqiang; Jiang, Steve; Jia, Xun

    2016-01-01

    Monte Carlo (MC)-based spot dose calculation is highly desired for inverse treatment planning in proton therapy because of its accuracy. Recent studies on biological optimization have also indicated the use of MC methods to compute relevant quantities of interest, e.g. linear energy transfer. Although GPU-based MC engines have been developed to address inverse optimization problems, their efficiency still needs to be improved. Also, the use of a large number of GPUs in MC calculation is not favorable for clinical applications. The previously proposed adaptive particle sampling (APS) method can improve the efficiency of MC-based inverse optimization by using the computationally expensive MC simulation more effectively. This method is more efficient than the conventional approach that performs spot dose calculation and optimization in two sequential steps. In this paper, we propose a computational library to perform MC-based spot dose calculation on GPU with the APS scheme. The implemented APS method performs a non-uniform sampling of the particles from pencil beam spots during the optimization process, favoring those from the high intensity spots. The library also conducts two computationally intensive matrix-vector operations frequently used when solving an optimization problem. This library design allows a streamlined integration of the MC-based spot dose calculation into an existing proton therapy inverse planning process. We tested the developed library in a typical inverse optimization system with four patient cases. The library achieved the targeted functions by supporting inverse planning in various proton therapy schemes, e.g. single field uniform dose, 3D intensity modulated proton therapy, and distal edge tracking. The efficiency was 41.6±15.3% higher than the use of a GPU-based MC package in a conventional calculation scheme. The total computation time ranged between 2 and 50 min on a single GPU card depending on the problem size. PMID:27991456
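
    The abstract says only that APS samples particles non-uniformly, favoring high-intensity spots; it does not give the sampling rule. As a minimal sketch, assuming the particle budget is simply split in proportion to the current spot intensities (the function name and the proportional rule are illustrative assumptions, not the library's API), the allocation could look like this:

```python
def allocate_particles(intensities, total_particles):
    """Split a particle budget across spots in proportion to spot intensity.

    Assumption: strictly proportional allocation with largest-remainder
    rounding. The actual APS rule in the paper may differ.
    """
    total = sum(intensities)
    if total <= 0:
        raise ValueError("at least one spot must have positive intensity")

    # Ideal (fractional) share of the budget for each spot.
    raw = [total_particles * w / total for w in intensities]
    counts = [int(r) for r in raw]

    # Hand out the particles lost to rounding, largest remainder first,
    # so the counts always sum to the requested budget.
    leftover = total_particles - sum(counts)
    order = sorted(range(len(raw)), key=lambda i: raw[i] - counts[i], reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts
```

    In an optimization loop, such an allocation would be recomputed whenever the spot intensities are updated, so that simulation effort follows the spots that currently dominate the dose.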

  9. Improved object optimal synthetic description, modeling, learning, and discrimination by GEOGINE computational kernel

    NASA Astrophysics Data System (ADS)

    Fiorini, Rodolfo A.; Dacquino, Gianfranco

    2005-03-01

    GEOGINE (GEOmetrical enGINE), a state-of-the-art OMG (Ontological Model Generator) based on n-D Tensor Invariants for n-Dimensional shape/texture optimal synthetic representation, description and learning, was presented at previous conferences. Improved computational algorithms based on the computational invariant theory of finite groups in Euclidean space and a demo application are presented. Progressive model automatic generation is discussed. GEOGINE can be used as an efficient computational kernel for fast, reliable application development and delivery, mainly in advanced biomedical engineering, biometric, intelligent computing, target recognition, content image retrieval, and data mining technological areas. Ontology can be regarded as a logical theory accounting for the intended meaning of a formal dictionary, i.e., its ontological commitment to a particular conceptualization of the world object. According to this approach, "n-D Tensor Calculus" can be considered a "Formal Language" to reliably compute optimized "n-Dimensional Tensor Invariants" as specific object "invariant parameter and attribute words" for automated n-Dimensional shape/texture optimal synthetic object description by incremental model generation. The class of those "invariant parameter and attribute words" can be thought of as a specific "Formal Vocabulary" learned from a "Generalized Formal Dictionary" of the "Computational Tensor Invariants" language. Even object chromatic attributes can be effectively and reliably computed from object geometric parameters into robust colour shape invariant characteristics. As a matter of fact, any highly sophisticated application needing effective, robust object geometric/colour invariant attribute capture and parameterization features for reliable automated object learning and discrimination can benefit deeply from the performance of the GEOGINE progressive automated model generation computational kernel. The main operational advantages over previous, similar approaches are: 1) Progressive Automated Invariant Model Generation, 2) Invariant Minimal Complete Description Set for computational efficiency, 3) Arbitrary Model Precision for robust object description and identification.

  10. Metalevel programming in robotics: Some issues

    NASA Technical Reports Server (NTRS)

    Kumarn, A.; Parameswaran, N.

    1987-01-01

    Computing in robotics has two important requirements: efficiency and flexibility. Algorithms for robot actions are implemented usually in procedural languages such as VAL and AL. But, since their excessive bindings create inflexible structures of computation, it is proposed that Logic Programming is a more suitable language for robot programming due to its non-determinism, declarative nature, and provision for metalevel programming. Logic Programming, however, results in inefficient computations. As a solution to this problem, researchers discuss a framework in which controls can be described to improve efficiency. They have divided controls into: (1) in-code and (2) metalevel and discussed them with reference to selection of rules and dataflow. Researchers illustrated the merit of Logic Programming by modelling the motion of a robot from one point to another avoiding obstacles.

  11. An energy efficient and high speed architecture for convolution computing based on binary resistive random access memory

    NASA Astrophysics Data System (ADS)

    Liu, Chen; Han, Runze; Zhou, Zheng; Huang, Peng; Liu, Lifeng; Liu, Xiaoyan; Kang, Jinfeng

    2018-04-01

    In this work we present a novel convolution computing architecture based on metal oxide resistive random access memory (RRAM) to process the image data stored in the RRAM arrays. The proposed image storage architecture achieves better speed-device consumption efficiency than the previous kernel storage architecture. We further improve the architecture for high-accuracy, low-power computing by utilizing binary storage and a series resistor. For a 28 × 28 image and 10 kernels with a size of 3 × 3, compared with the previous kernel storage approach, the newly proposed architecture shows: 1) almost 100% accuracy within 20% LRS variation and 90% HRS variation; 2) a speed boost of more than 67 times; 3) a 71.4% energy saving.

  12. SAGE: The Self-Adaptive Grid Code. 3

    NASA Technical Reports Server (NTRS)

    Davies, Carol B.; Venkatapathy, Ethiraj

    1999-01-01

    The multi-dimensional self-adaptive grid code, SAGE, is an important tool in the field of computational fluid dynamics (CFD). It provides an efficient method to improve the accuracy of flow solutions while simultaneously reducing computer processing time. Briefly, SAGE enhances an initial computational grid by redistributing the mesh points into more appropriate locations. The movement of these points is driven by an equal-error-distribution algorithm that utilizes the relationship between high flow gradients and excessive solution errors. The method also provides a balance between clustering points in the high gradient regions and maintaining the smoothness and continuity of the adapted grid. The latest version, Version 3, includes the ability to change the boundaries of a given grid to more efficiently enclose flow structures and provides alternative redistribution algorithms.
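
    The idea of redistributing mesh points so that each cell carries an equal share of an error indicator can be illustrated in one dimension. The sketch below is not SAGE's algorithm (which is multi-dimensional and balances clustering against grid smoothness); it assumes a simple weight of 1 + |du/dx| per cell, and the function name is hypothetical:

```python
def equidistribute(x, u, n_new=None):
    """Redistribute 1D grid points so each new cell holds equal weight.

    Assumption: the error indicator is w = 1 + |du/dx| on each original
    cell; the constant 1 keeps some points in smooth regions.
    """
    n_new = len(x) if n_new is None else n_new
    if n_new < 2:
        raise ValueError("need at least two grid points")

    # Cumulative integral of the weight over the original grid.
    cum = [0.0]
    for i in range(len(x) - 1):
        dx = x[i + 1] - x[i]
        grad = abs((u[i + 1] - u[i]) / dx)
        cum.append(cum[-1] + (1.0 + grad) * dx)
    total = cum[-1]

    # Place interior points at equal increments of cumulative weight,
    # interpolating linearly inside the original cell that contains each one.
    new_x = [x[0]]
    k = 0
    for m in range(1, n_new - 1):
        target = total * m / (n_new - 1)
        while cum[k + 1] < target:
            k += 1
        frac = (target - cum[k]) / (cum[k + 1] - cum[k])
        new_x.append(x[k] + frac * (x[k + 1] - x[k]))
    new_x.append(x[-1])
    return new_x
```

    For a solution with a sharp step, most of the cumulative weight sits in the cell containing the step, so the new points cluster there while the end points stay fixed.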

  13. Computationally efficient method for optical simulation of solar cells and their applications

    NASA Astrophysics Data System (ADS)

    Semenikhin, I.; Zanuccoli, M.; Fiegna, C.; Vyurkov, V.; Sangiorgi, E.

    2013-01-01

    This paper presents two novel implementations of the Differential method to solve the Maxwell equations in nanostructured optoelectronic solid state devices. The first proposed implementation is based on an improved and computationally efficient T-matrix formulation that adopts multiple-precision arithmetic to tackle the numerical instability problem which arises due to evanescent modes. The second implementation adopts an iterative approach that achieves a low computational complexity of O(N log N) or better. The proposed algorithms may work with structures with arbitrary spatial variation of the permittivity. The developed two-dimensional numerical simulator is applied to analyze the dependence of the absorption characteristics of a thin silicon slab on the morphology of the front interface and on the angle of incidence of the radiation with respect to the device surface.

  14. Symmetry-conserving purification of quantum states within the density matrix renormalization group

    DOE PAGES

    Nocera, Alberto; Alvarez, Gonzalo

    2016-01-28

    The density matrix renormalization group (DMRG) algorithm was originally designed to efficiently compute the zero-temperature or ground-state properties of one-dimensional strongly correlated quantum systems. The development of the algorithm at finite temperature has been a topic of much interest, because of the usefulness of thermodynamic quantities in understanding the physics of condensed matter systems, and because of the increased complexity associated with efficiently computing temperature-dependent properties. The ancilla method is a DMRG technique that enables the computation of these thermodynamic quantities. In this paper, we review the ancilla method, and improve its performance by working on reduced Hilbert spaces and using canonical approaches. Furthermore, we explore its applicability beyond spin systems to t-J and Hubbard models.

  15. Parallelization of Nullspace Algorithm for the computation of metabolic pathways

    PubMed Central

    Jevremović, Dimitrije; Trinh, Cong T.; Srienc, Friedrich; Sosa, Carlos P.; Boley, Daniel

    2011-01-01

    Elementary mode analysis is a useful metabolic pathway analysis tool in understanding and analyzing cellular metabolism, since elementary modes can represent metabolic pathways with unique and minimal sets of enzyme-catalyzed reactions of a metabolic network under steady state conditions. However, computation of the elementary modes of a genome-scale metabolic network with 100–1000 reactions is very expensive and sometimes not feasible with the commonly used serial Nullspace Algorithm. In this work, we develop a distributed memory parallelization of the Nullspace Algorithm to efficiently handle the computation of the elementary modes of a large metabolic network. We give an implementation in C++ with the support of MPI library functions for the parallel communication. Our proposed algorithm is accompanied by an analysis of the complexity and identification of major bottlenecks during computation of all possible pathways of a large metabolic network. The algorithm includes methods to achieve load balancing among the compute-nodes and specific communication patterns to reduce the communication overhead and improve efficiency. PMID:22058581

  16. A Computationally Efficient Parallel Levenberg-Marquardt Algorithm for Large-Scale Big-Data Inversion

    NASA Astrophysics Data System (ADS)

    Lin, Y.; O'Malley, D.; Vesselinov, V. V.

    2015-12-01

    Inverse modeling seeks model parameters given a set of observed state variables. However, for many practical problems the observed data sets are large and the model parameters numerous, so conventional methods for solving the inverse problem can be computationally expensive. We have developed a new, computationally efficient Levenberg-Marquardt method for solving large-scale inverse modeling problems. Levenberg-Marquardt methods require the solution of a dense linear system of equations which can be prohibitively expensive to compute for large-scale inverse problems. Our novel method projects the original large-scale linear problem down to a Krylov subspace, such that the dimensionality of the measurements can be significantly reduced. Furthermore, instead of solving the linear system for every Levenberg-Marquardt damping parameter, we store the Krylov subspace computed when solving for the first damping parameter and recycle it for all the following damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved by using these computational techniques. We apply this new inverse modeling method to invert for a random transmissivity field. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) at each computational node in the model domain. The inversion is also aided by the use of regularization techniques. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Julia is an advanced high-level scientific programming language that allows for efficient memory management and utilization of high-performance computational resources. Compared with a Levenberg-Marquardt method using standard linear inversion techniques, our Levenberg-Marquardt method yields a speed-up ratio of 15 in a multi-core computational environment and a speed-up ratio of 45 in a single-core computational environment. Therefore, our new inverse modeling method is a powerful tool for large-scale applications.
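
    The reason one Krylov subspace can be reused for every damping parameter follows from a standard property of Krylov subspaces. The sketch below uses the textbook form of the Levenberg-Marquardt step; the symbols (Jacobian J, residual r, damping parameter \lambda, orthonormal basis V_k) are assumptions introduced here for illustration, and the authors' exact formulation may differ:

```latex
% Levenberg-Marquardt step for residual r, Jacobian J and damping \lambda
(J^{\top} J + \lambda I)\,\delta_{\lambda} = J^{\top} r

% Krylov subspace of dimension k, spanned by the columns of an orthonormal basis V_k
\mathcal{K}_k(J^{\top} J,\; J^{\top} r)
  = \operatorname{span}\{\, J^{\top} r,\; (J^{\top} J)\, J^{\top} r,\; \dots,\;
    (J^{\top} J)^{k-1} J^{\top} r \,\}

% Shift invariance: adding \lambda I does not change the subspace
\mathcal{K}_k(J^{\top} J + \lambda I,\; J^{\top} r) = \mathcal{K}_k(J^{\top} J,\; J^{\top} r)

% Small projected system, solved anew for each \lambda with the same V_k
(V_k^{\top} J^{\top} J\, V_k + \lambda I_k)\, y_{\lambda} = V_k^{\top} J^{\top} r,
\qquad \delta_{\lambda} \approx V_k\, y_{\lambda}
```

    Because the subspace does not depend on \lambda, the expensive part (building V_k) is done once, and each additional damping parameter costs only the solution of a k-by-k system.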

  17. Using tablet computers to teach evidence-based medicine to pediatrics residents: a prospective study.

    PubMed

    Soma, David B; Homme, Jason H; Jacobson, Robert M

    2013-01-01

    We sought to determine if tablet computers, supported by a laboratory experience focused upon skill development, would improve not only evidence-based medicine (EBM) knowledge but also skills and behavior. We conducted a prospective cohort study where we provided tablet computers to our pediatric residents and then held a series of laboratory sessions focused on speed and efficiency in performing EBM at the bedside. We evaluated the intervention with pre- and postintervention tests and surveys based on a validated tool available for use on MedEdPORTAL. The attending pediatric hospitalists also completed surveys regarding their observations of the residents' behavior. All 38 pediatric residents completed the preintervention test and the pre- and postintervention surveys. All but one completed the posttest. All 7 attending pediatric hospitalists completed their surveys. The testing, targeted to assess EBM knowledge, revealed a median increase of 16 points out of a possible 60 points (P < .0001). We found substantial increases in individual residents' test scores across all 3 years of residency. Resident responses demonstrated statistically significant improvements in self-reported comfort with 6 out of 6 EBM skills and statistically significant increases in self-reported frequencies for 4 out of 7 EBM behaviors. Attending pediatric hospitalists reported improvements in 5 of 7 resident behaviors. This novel approach for teaching EBM to pediatric residents improved knowledge, skills, and behavior through the introduction of a tablet computer and laboratory sessions designed to teach the quick and efficient application of EBM at the bedside. Copyright © 2013 Academic Pediatric Association. Published by Elsevier Inc. All rights reserved.

  18. A synthetic design environment for ship design

    NASA Technical Reports Server (NTRS)

    Chipman, Richard R.

    1995-01-01

    Rapid advances in computer science and information system technology have made possible the creation of synthetic design environments (SDE) which use virtual prototypes to increase the efficiency and agility of the design process. This next generation of computer-based design tools will rely heavily on simulation and advanced visualization techniques to enable integrated product and process teams to concurrently conceptualize, design, and test a product and its fabrication processes. This paper summarizes a successful demonstration of the feasibility of using a simulation based design environment in the shipbuilding industry. As computer science and information science technologies have evolved, there have been many attempts to apply and integrate the new capabilities into systems for the improvement of the process of design. We see the benefits of those efforts in the abundance of highly reliable, technologically complex products and services in the modern marketplace. Furthermore, the computer-based technologies have been so cost effective that the improvements embodied in modern products have been accompanied by lowered costs. Today the state-of-the-art in computerized design has advanced so dramatically that the focus is no longer on merely improving design methodology; rather the goal is to revolutionize the entire process by which complex products are conceived, designed, fabricated, tested, deployed, operated, maintained, refurbished and eventually decommissioned. By concurrently addressing all life-cycle issues, the basic decision making process within an enterprise will be improved dramatically, leading to new levels of quality, innovation, efficiency, and customer responsiveness. By integrating functions and people within an enterprise, such systems will change the fundamental way American industries are organized, creating companies that are more competitive, creative, and productive.

  19. Performance Improvement Through Indexing of Turbine Airfoils. Part 2; Numerical Simulation

    NASA Technical Reports Server (NTRS)

    Griffin, Lisa W.; Huber, Frank W.; Sharma, Om P.

    1996-01-01

    An experimental/analytical study has been conducted to determine the performance improvements achievable by circumferentially indexing succeeding rows of turbine stator airfoils. A series of tests was conducted to experimentally investigate stator wake clocking effects on the performance of the space shuttle main engine (SSME) alternate turbopump development (ATD) fuel turbine test article (TTA). The results from this study indicate that significant increases in stage efficiency can be attained through application of this airfoil clocking concept. Details of the experiment and its results are documented in part 1 of this paper. In order to gain insight into the mechanisms of the performance improvement, extensive computational fluid dynamics (CFD) simulations were executed. The subject of the present paper is the initial results from the CFD investigation of the configurations and conditions detailed in part 1 of the paper. To characterize the aerodynamic environments in the experimental test series, two-dimensional (2D), time accurate, multistage, viscous analyses were performed at the TTA midspan. Computational analyses for five different circumferential positions of the first stage stator have been completed. Details of the computational procedure and the results are presented. The analytical results verify the experimentally demonstrated performance improvement and are compared with data whenever possible. Predictions of time-averaged turbine efficiencies as well as gas conditions throughout the flow field are presented. An initial understanding of the turbine performance improvement mechanism based on the results from this investigation is described.

  20. Deep learning with coherent nanophotonic circuits

    NASA Astrophysics Data System (ADS)

    Shen, Yichen; Harris, Nicholas C.; Skirlo, Scott; Prabhu, Mihika; Baehr-Jones, Tom; Hochberg, Michael; Sun, Xin; Zhao, Shijie; Larochelle, Hugo; Englund, Dirk; Soljačić, Marin

    2017-07-01

    Artificial neural networks are computational network models inspired by signal processing in the brain. These models have dramatically improved performance for many machine-learning tasks, including speech and image recognition. However, today's computing hardware is inefficient at implementing neural networks, in large part because much of it was designed for von Neumann computing schemes. Significant effort has been made towards developing electronic architectures tuned to implement artificial neural networks that exhibit improved computational speed and accuracy. Here, we propose a new architecture for a fully optical neural network that, in principle, could offer an enhancement in computational speed and power efficiency over state-of-the-art electronics for conventional inference tasks. We experimentally demonstrate the essential part of the concept using a programmable nanophotonic processor featuring a cascaded array of 56 programmable Mach-Zehnder interferometers in a silicon photonic integrated circuit and show its utility for vowel recognition.

  1. Efficient Computation Of Manipulator Inertia Matrix

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Bejczy, Antal K.

    1991-01-01

    Improved method for computation of manipulator inertia matrix developed, based on concept of spatial inertia of composite rigid body. Required for implementation of advanced dynamic-control schemes as well as dynamic simulation of manipulator motion. Motivated by increasing demand for fast algorithms to provide real-time control and simulation capability and, particularly, need for faster-than-real-time simulation capability, required in many anticipated space teleoperation applications.

  2. SIMD Optimization of Linear Expressions for Programmable Graphics Hardware

    PubMed Central

    Bajaj, Chandrajit; Ihm, Insung; Min, Jungki; Oh, Jinsang

    2009-01-01

    The increased programmability of graphics hardware allows efficient graphical processing unit (GPU) implementations of a wide range of general computations on commodity PCs. An important factor in such implementations is how to fully exploit the SIMD computing capacities offered by modern graphics processors. Linear expressions in the form of ȳ = Ax̄ + b̄, where A is a matrix, and x̄, ȳ and b̄ are vectors, constitute one of the most basic operations in many scientific computations. In this paper, we propose a SIMD code optimization technique that enables efficient shader codes to be generated for evaluating linear expressions. It is shown that performance can be improved considerably by efficiently packing arithmetic operations into four-wide SIMD instructions through reordering of the operations in linear expressions. We demonstrate that the presented technique can be used effectively for programming both vertex and pixel shaders for a variety of mathematical applications, including integrating differential equations and solving a sparse linear system of equations using iterative methods. PMID:19946569
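
    To make the four-wide packing idea concrete, the sketch below evaluates y = Ax + b four output rows at a time, so that every inner step is a single emulated four-lane multiply-add. It is a plain Python illustration under stated assumptions: the names `mad4` and `linear_expr_4wide` are hypothetical, the column-by-column ordering is one simple packing choice, and it is not the code-generation technique of the paper itself.

```python
def mad4(a, s, acc):
    """One emulated four-wide multiply-add: acc + a * s, lane by lane."""
    return tuple(acc_i + a_i * s for a_i, acc_i in zip(a, acc))


def linear_expr_4wide(A, x, b):
    """Evaluate y = A x + b, processing four output rows per 'instruction'.

    A is a list of rows; rows are grouped in blocks of four and the last
    block is zero-padded, as unused SIMD lanes would be.
    """
    y = []
    for r0 in range(0, len(A), 4):
        rows = range(r0, min(r0 + 4, len(A)))
        pad = 4 - len(rows)
        # Start the accumulator from b so the addition costs no extra step.
        acc = tuple(b[r] for r in rows) + (0.0,) * pad
        for j, xj in enumerate(x):
            # Column j of the current row block, scaled by the scalar x[j].
            col = tuple(A[r][j] for r in rows) + (0.0,) * pad
            acc = mad4(col, xj, acc)
        y.extend(acc[:len(rows)])
    return y
```

    With this ordering, a block of four rows and n columns needs n four-wide multiply-adds instead of 4n scalar ones, which is the kind of saving that packing aims for.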

  3. Power and Performance Trade-offs for Space Time Adaptive Processing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gawande, Nitin A.; Manzano Franco, Joseph B.; Tumeo, Antonino

    Computational efficiency – performance relative to power or energy – is one of the most important concerns when designing RADAR processing systems. This paper analyzes power and performance trade-offs for a typical Space Time Adaptive Processing (STAP) application. We study STAP implementations for CUDA and OpenMP on two computationally efficient architectures, Intel Haswell Core I7-4770TE and NVIDIA Kayla with a GK208 GPU. We analyze the power and performance of STAP’s computationally intensive kernels across the two hardware testbeds. We also show the impact and trade-offs of GPU optimization techniques. We show that data parallelism can be exploited for efficient implementation on the Haswell CPU architecture. The GPU architecture is able to process large size data sets without an increase in power requirement. The use of shared memory has a significant impact on the power requirement for the GPU. A balance between the use of shared memory and main memory access leads to an improved performance in a typical STAP application.

  4. Low-power, transparent optical network interface for high bandwidth off-chip interconnects.

    PubMed

    Liboiron-Ladouceur, Odile; Wang, Howard; Garg, Ajay S; Bergman, Keren

    2009-04-13

    The recent emergence of multicore architectures and chip multiprocessors (CMPs) has accelerated the bandwidth requirements in high-performance processors for both on-chip and off-chip interconnects. For next generation computing clusters, the delivery of scalable power efficient off-chip communications to each compute node has emerged as a key bottleneck to realizing the full computational performance of these systems. The power dissipation is dominated by the off-chip interface and the necessity to drive high-speed signals over long distances. We present a scalable photonic network interface approach that fully exploits the bandwidth capacity offered by optical interconnects while offering significant power savings over traditional E/O and O/E approaches. The power-efficient interface optically aggregates electronic serial data streams into a multiple WDM channel packet structure at time-of-flight latencies. We demonstrate a scalable optical network interface with 70% improvement in power efficiency for a complete end-to-end PCI Express data transfer.

  5. A Generalized Method for the Comparable and Rigorous Calculation of the Polytropic Efficiencies of Turbocompressors

    NASA Astrophysics Data System (ADS)

    Dimitrakopoulos, Panagiotis

    2018-03-01

    The calculation of polytropic efficiencies is a very important task, especially during the development of new compression units, like compressor impellers, stages and stage groups. Such calculations are also crucial for the determination of the performance of a whole compressor. As processors and computational capacities have substantially been improved in the last years, the need for a new, rigorous, robust, accurate and at the same time standardized method emerged, regarding the computation of the polytropic efficiencies, especially based on thermodynamics of real gases. The proposed method is based on the rigorous definition of the polytropic efficiency. The input consists of pressure and temperature values at the end points of the compression path (suction and discharge), for a given working fluid. The average relative error for the studied cases was 0.536%. Thus, this high-accuracy method is proposed for efficiency calculations related to turbocompressors and their compression units, especially when they are operating at high power levels, for example in jet engines and high-power plants.
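
    The proposed method targets real gases and its details are not given in the abstract. For orientation only, the sketch below shows the much simpler textbook case of an ideal gas with constant heat-capacity ratio, which uses the same inputs (suction and discharge pressure and temperature). The function name and the default gamma = 1.4 are assumptions made for illustration:

```python
import math


def polytropic_efficiency_ideal_gas(p1, t1, p2, t2, gamma=1.4):
    """Polytropic efficiency of a compression from (p1, t1) to (p2, t2).

    Assumption: ideal gas with constant heat-capacity ratio gamma, for which
    T2/T1 = (p2/p1) ** ((gamma - 1) / (gamma * eta_p)).
    Pressures and temperatures must be absolute (e.g. Pa and K).
    """
    if not (p2 > p1 > 0 and t2 > t1 > 0):
        raise ValueError("expected a compression with p2 > p1 > 0 and t2 > t1 > 0")
    return (gamma - 1.0) / gamma * math.log(p2 / p1) / math.log(t2 / t1)
```

    A real-gas method replaces the constant gamma with property evaluations from an equation of state along the compression path, which is where the additional rigor and computational cost come from.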

  6. Improve the efficiency of the Cartesian tensor based fast multipole method for Coulomb interaction using the traces

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, He; Luo, Li -Shi; Li, Rui

    To compute the non-oscillating mutual interaction for a system with N points, the fast multipole method (FMM) has an efficiency that scales linearly with the number of points. Specifically, for Coulomb interaction, FMM can be constructed using either the spherical harmonic functions or the totally symmetric Cartesian tensors. In this paper, we show that the efficiency of the Cartesian tensor-based FMM for the Coulomb interaction can be significantly improved by using the traces of the Cartesian tensors in the calculation to reduce the number of independent elements of the n-th rank totally symmetric Cartesian tensor from (n + 1)(n + 2)/2 to 2n + 1. The computational complexity of the operations in FMM is analyzed and expressed as polynomials of the highest rank of the Cartesian tensors. For most operations, the complexity is reduced by one order. Numerical examples regarding the convergence and the efficiency of the new algorithm are demonstrated. As a result, a reduction of computation time up to 50% has been observed for a moderate number of points and rank of tensors.

  7. Improve the efficiency of the Cartesian tensor based fast multipole method for Coulomb interaction using the traces

    DOE PAGES

    Huang, He; Luo, Li -Shi; Li, Rui; ...

    2018-05-17

    To compute the non-oscillating mutual interaction for a system with N points, the fast multipole method (FMM) has an efficiency that scales linearly with the number of points. Specifically, for Coulomb interaction, FMM can be constructed using either the spherical harmonic functions or the totally symmetric Cartesian tensors. In this paper, we show that the efficiency of the Cartesian tensor-based FMM for the Coulomb interaction can be significantly improved by using the traces of the Cartesian tensors in the calculation to reduce the number of independent elements of the n-th rank totally symmetric Cartesian tensor from (n + 1)(n + 2)/2 to 2n + 1. The computational complexity of the operations in FMM is analyzed and expressed as polynomials of the highest rank of the Cartesian tensors. For most operations, the complexity is reduced by one order. Numerical examples regarding the convergence and the efficiency of the new algorithm are demonstrated. As a result, a reduction of computation time up to 50% has been observed for a moderate number of points and rank of tensors.
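
    The size of the saving can be seen by tabulating the two element counts quoted in the abstract: (n + 1)(n + 2)/2 for a totally symmetric rank-n Cartesian tensor in three dimensions, and 2n + 1 once the traces are used. The helper names below are hypothetical and serve only to evaluate the two formulas:

```python
def symmetric_components(n):
    """Independent elements of a totally symmetric rank-n tensor in 3D."""
    return (n + 1) * (n + 2) // 2


def traceless_components(n):
    """Independent elements remaining once the traces are exploited."""
    return 2 * n + 1


if __name__ == "__main__":
    # The gap grows quadratically versus linearly with the rank n.
    for n in range(7):
        print(n, symmetric_components(n), traceless_components(n))
```

    For rank 4, for example, the count drops from 15 to 9, and the gap widens with rank because one count grows quadratically and the other linearly. This is consistent with the abstract's statement that the complexity of most operations is reduced by one order.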

  8. Multi-Objective Aerodynamic Optimization of the Streamlined Shape of High-Speed Trains Based on the Kriging Model.

    PubMed

    Xu, Gang; Liang, Xifeng; Yao, Shuanbao; Chen, Dawei; Li, Zhiwei

    2017-01-01

    Minimizing the aerodynamic drag and the lift of the train coach remains a key issue for high-speed trains. With the development of computing technology and computational fluid dynamics (CFD) in the engineering field, CFD has been successfully applied to the design process of high-speed trains. However, developing a new streamlined shape for high-speed trains with excellent aerodynamic performance requires huge computational costs. Furthermore, relationships between multiple design variables and the aerodynamic loads are seldom obtained. In the present study, the Kriging surrogate model is used to perform a multi-objective optimization of the streamlined shape of high-speed trains, where the drag and the lift of the train coach are the optimization objectives. To improve the prediction accuracy of the Kriging model, the cross-validation method is used to construct the optimal Kriging model. The optimization results show that the two objectives are efficiently optimized, indicating that the optimization strategy used in the present study can greatly improve the optimization efficiency and meet the engineering requirements.

  9. A space radiation transport method development

    NASA Technical Reports Server (NTRS)

    Wilson, J. W.; Tripathi, R. K.; Qualls, G. D.; Cucinotta, F. A.; Prael, R. E.; Norbury, J. W.; Heinbockel, J. H.; Tweed, J.

    2004-01-01

    Improved spacecraft shield design requires early entry of radiation constraints into the design process to maximize performance and minimize costs. As a result, we have been investigating high-speed computational procedures to allow shield analysis from the preliminary design concepts to the final design. In particular, we will discuss the progress towards a full three-dimensional and computationally efficient deterministic code for which the current HZETRN evaluates the lowest-order asymptotic term. HZETRN is the first deterministic solution to the Boltzmann equation allowing field mapping within the International Space Station (ISS) in tens of minutes using standard finite element method (FEM) geometry common to engineering design practice enabling development of integrated multidisciplinary design optimization methods. A single ray trace in ISS FEM geometry requires 14 ms and severely limits application of Monte Carlo methods to such engineering models. A potential means of improving the Monte Carlo efficiency in coupling to spacecraft geometry is given in terms of re-configurable computing and could be utilized in the final design as verification of the deterministic method optimized design. Published by Elsevier Ltd on behalf of COSPAR.

  10. Efficient self-consistent viscous-inviscid solutions for unsteady transonic flow

    NASA Technical Reports Server (NTRS)

    Howlett, J. T.

    1985-01-01

    An improved method is presented for coupling a boundary layer code with an unsteady inviscid transonic computer code in a quasi-steady fashion. At each fixed time step, the boundary layer and inviscid equations are successively solved until the process converges. An explicit coupling of the equations is described which greatly accelerates the convergence process. Computer times for converged viscous-inviscid solutions are about 1.8 times the comparable inviscid values. Comparisons of the results with experimental data on three airfoils are presented. These comparisons demonstrate that the explicitly coupled viscous-inviscid solutions can provide efficient predictions of pressure distributions and lift for unsteady two-dimensional transonic flows.

  11. Efficient self-consistent viscous-inviscid solutions for unsteady transonic flow

    NASA Technical Reports Server (NTRS)

    Howlett, J. T.

    1985-01-01

    An improved method is presented for coupling a boundary layer code with an unsteady inviscid transonic computer code in a quasi-steady fashion. At each fixed time step, the boundary layer and inviscid equations are successively solved until the process converges. An explicit coupling of the equations is described which greatly accelerates the convergence process. Computer times for converged viscous-inviscid solutions are about 1.8 times the comparable inviscid values. Comparisons of the results with experimental data on three airfoils are presented. These comparisons demonstrate that the explicitly coupled viscous-inviscid solutions can provide efficient predictions of pressure distributions and lift for unsteady two-dimensional transonic flow.

  12. Interaction sorting method for molecular dynamics on multi-core SIMD CPU architecture.

    PubMed

    Matvienko, Sergey; Alemasov, Nikolay; Fomin, Eduard

    2015-02-01

    Molecular dynamics (MD) is widely used in computational biology for studying binding mechanisms of molecules, molecular transport, conformational transitions, protein folding, etc. The method is computationally expensive; thus, the demand for the development of novel, much more efficient algorithms is still high. Therefore, the new algorithm designed in 2007 and called interaction sorting (IS) clearly attracted interest, as it outperformed the most efficient MD algorithms. In this work, a new IS modification is proposed which allows the algorithm to utilize SIMD processor instructions. This paper shows that the improvement provides an additional gain in performance, 9% to 45% in comparison to the original IS method.

  13. Using Simulation to Examine the Effect of Physician Heterogeneity on the Operational Efficiency of an Overcrowded Hospital Emergency Department

    NASA Astrophysics Data System (ADS)

    Kuo, Y.-H.; Leung, J. M. Y.; Graham, C. A.

    2015-05-01

    In this paper, we present a case study of modelling and analyzing the patient flow of a hospital emergency department in Hong Kong. The emergency department is facing the challenge of overcrowding, and its patients usually experience long waiting times. Our project team was requested by a senior consultant of the emergency department to analyze the patient flow and provide a decision support tool to help improve their operations. We adopt a simulation approach to mimic their daily operations. With the simulation model, we conduct a computational study to examine the effect of physician heterogeneity on the emergency department performance. We found that physician heterogeneity has a great impact on the operational efficiency and thus should be considered when developing simulation models. Our computational results show that, with the same average of service rates among the physicians, variation in the rates can improve the overcrowding situation. This suggests that emergency departments may consider having some efficient physicians to speed up the overall service rate in return for more time for patients who need extra medical care.
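
    The kind of experiment described above can be sketched with a small queueing simulation. This is only an illustrative toy, not the authors' model: the function name `mean_wait`, the Poisson arrivals, the exponential service times, and all rate values are assumptions introduced here. It compares two hypothetical physician teams with the same average service rate, one homogeneous and one heterogeneous.

```python
import random


def mean_wait(service_rates, arrival_rate, n_patients, seed=0):
    """Average waiting time in a first-come-first-served multi-physician queue.

    Assumed toy model: Poisson arrivals, and exponential service times whose
    rate depends on the physician who treats the patient.
    """
    rng = random.Random(seed)
    free_at = [0.0] * len(service_rates)  # time at which each physician is next free
    clock = 0.0
    total_wait = 0.0
    for _ in range(n_patients):
        clock += rng.expovariate(arrival_rate)  # next patient arrives
        # The patient is seen by whichever physician becomes free first.
        k = min(range(len(free_at)), key=free_at.__getitem__)
        start = max(clock, free_at[k])
        total_wait += start - clock
        free_at[k] = start + rng.expovariate(service_rates[k])
    return total_wait / n_patients


# Two hypothetical teams with the same average service rate (1.0 per time unit).
homogeneous = [1.0, 1.0, 1.0]
heterogeneous = [0.5, 1.0, 1.5]

for label, rates in (("homogeneous", homogeneous), ("heterogeneous", heterogeneous)):
    print(label, round(mean_wait(rates, arrival_rate=2.4, n_patients=20000), 3))
```

    Whether heterogeneity helps or hurts in such a toy depends on the routing rule and the load, so the printed numbers should not be read as a reproduction of the study's finding.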

  14. Probabilistic power flow using improved Monte Carlo simulation method with correlated wind sources

    NASA Astrophysics Data System (ADS)

    Bie, Pei; Zhang, Buhan; Li, Hang; Deng, Weisi; Wu, Jiasi

    2017-01-01

    Probabilistic Power Flow (PPF) is a very useful tool for power system steady-state analysis. However, correlation among random power injections (such as wind power) makes PPF difficult to calculate. Monte Carlo simulation (MCS) and analytical methods are the two commonly used approaches to solving PPF. MCS has high accuracy but is very time consuming. Analytical methods such as the cumulants method (CM) have high computing efficiency, but calculating the cumulants is not convenient when wind power output does not obey any typical distribution, especially when correlated wind sources are considered. In this paper, an Improved Monte Carlo simulation method (IMCS) is proposed. The joint empirical distribution is applied to model the output of different wind power sources. This method combines the advantages of both MCS and analytical methods. It not only has high computing efficiency but also provides solutions with sufficient accuracy, which makes it very suitable for on-line analysis.
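
    One way to see why a joint empirical distribution matters for correlated sources is to compare two resampling schemes. The sketch below is not the IMCS method itself; it is a minimal illustration under assumed, synthetic data. The helper names (`make_history`, `sample_joint`, `sample_independent`) and the two-farm setup are hypothetical. Drawing whole historical rows keeps the observed pairing between farms, while drawing each farm separately discards it.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def make_history(n, rng):
    """Synthetic (assumed) outputs of two wind farms driven by a shared wind term."""
    history = []
    for _ in range(n):
        common = rng.random()
        farm_a = 0.8 * common + 0.2 * rng.random()
        farm_b = 0.8 * common + 0.2 * rng.random()
        history.append((farm_a, farm_b))
    return history


def sample_joint(history, n, rng):
    """Draw whole rows, i.e. sample from the joint empirical distribution."""
    return [rng.choice(history) for _ in range(n)]


def sample_independent(history, n, rng):
    """Draw each farm from its own marginal, which breaks the pairing."""
    return [(rng.choice(history)[0], rng.choice(history)[1]) for _ in range(n)]


rng = random.Random(1)
history = make_history(2000, rng)
joint = sample_joint(history, 5000, rng)
indep = sample_independent(history, 5000, rng)

rho_hist = pearson(*zip(*history))
rho_joint = pearson(*zip(*joint))
rho_indep = pearson(*zip(*indep))
print(round(rho_hist, 2), round(rho_joint, 2), round(rho_indep, 2))
```

    In this toy setup the joint samples retain a correlation close to the historical one, while the independently drawn samples are nearly uncorrelated.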

  15. Multilayer shallow water models with locally variable number of layers and semi-implicit time discretization

    NASA Astrophysics Data System (ADS)

    Bonaventura, Luca; Fernández-Nieto, Enrique D.; Garres-Díaz, José; Narbona-Reina, Gladys

    2018-07-01

    We propose an extension of the discretization approaches for multilayer shallow water models, aimed at making them more flexible and efficient for realistic applications to coastal flows. A novel discretization approach is proposed, in which the number of vertical layers and their distribution are allowed to change in different regions of the computational domain. Furthermore, semi-implicit schemes are employed for the time discretization, leading to a significant efficiency improvement for subcritical regimes. We show that, in the typical regimes in which the application of multilayer shallow water models is justified, the resulting discretization does not introduce any major spurious feature and allows again to reduce substantially the computational cost in areas with complex bathymetry. As an example of the potential of the proposed technique, an application to a sediment transport problem is presented, showing a remarkable improvement with respect to standard discretization approaches.

  16. Analysis on trust influencing factors and trust model from multiple perspectives of online Auction

    NASA Astrophysics Data System (ADS)

    Yu, Wang

    2017-10-01

    Current reputation models do not fully address online auction trading, so they cannot entirely reflect the reputation status of users and may cause problems with operability. To evaluate user trust in online auctions correctly, a trust computing model based on multiple influencing factors is established. It aims at overcoming the inefficiency of current trust computing methods and the limitations of traditional theoretical trust models. The improved model comprehensively considers the trust degree evaluation factors of three types of participants, according to the different participation modes of online auctioneers, to improve the accuracy, effectiveness and robustness of the trust degree. The experiments test the efficiency and performance of our model under different scales of malicious users, in environments such as the eBay and Sporas models. Analysis of the experimental results shows that the model proposed in this paper makes up for the deficiencies of existing models and also has better feasibility.

  17. Robust object tracking based on self-adaptive search area

    NASA Astrophysics Data System (ADS)

    Dong, Taihang; Zhong, Sheng

    2018-02-01

    Discriminative correlation filter (DCF) based trackers have recently achieved excellent performance with great computational efficiency. However, DCF based trackers suffer from boundary effects, which result in unstable performance in challenging situations exhibiting fast motion. In this paper, we propose a novel method to mitigate this side-effect in DCF based trackers. We change the search area according to the prediction of target motion. When the object moves fast, a broad search area can alleviate boundary effects and preserve the probability of locating the object. When the object moves slowly, a narrow search area can exclude useless background information and improve computational efficiency to attain real-time performance. This strategy can substantially reduce boundary effects in situations exhibiting fast motion and motion blur, and it can be used in almost all DCF based trackers. The experiments on the OTB benchmark show that the proposed framework improves the performance compared with the baseline trackers.

  18. NASA Lewis Stirling SPRE testing and analysis with reduced number of cooler tubes

    NASA Technical Reports Server (NTRS)

    Wong, Wayne A.; Cairelli, James E.; Swec, Diane M.; Doeberling, Thomas J.; Lakatos, Thomas F.; Madi, Frank J.

    1992-01-01

    Free-piston Stirling power converters are candidates for high capacity space power applications. The Space Power Research Engine (SPRE), a free-piston Stirling engine coupled with a linear alternator, is being tested at the NASA Lewis Research Center in support of the Civil Space Technology Initiative. The SPRE is used as a test bed for evaluating converter modifications which have the potential to improve the converter performance and for validating computer code predictions. Reducing the number of cooler tubes on the SPRE has been identified as a modification with the potential to significantly improve power and efficiency. Experimental tests designed to investigate the effects of reducing the number of cooler tubes on converter power, efficiency and dynamics are described. Presented are test results from the converter operating with a reduced number of cooler tubes and comparisons between this data and both baseline test data and computer code predictions.

  19. Improving the accuracy and efficiency of time-resolved electronic spectra calculations: cellular dephasing representation with a prefactor.

    PubMed

    Zambrano, Eduardo; Šulc, Miroslav; Vaníček, Jiří

    2013-08-07

    Time-resolved electronic spectra can be obtained as the Fourier transform of a special type of time correlation function known as fidelity amplitude, which, in turn, can be evaluated approximately and efficiently with the dephasing representation. Here we improve both the accuracy of this approximation (with an amplitude correction derived from the phase-space propagator) and its efficiency (with an improved cellular scheme employing inverse Weierstrass transform and optimal scaling of the cell size). We demonstrate the advantages of the new methodology by computing dispersed time-resolved stimulated emission spectra in the harmonic potential, pyrazine, and the NCO molecule. In contrast, we show that in strongly chaotic systems such as the quartic oscillator the original dephasing representation is more appropriate than either the cellular or prefactor-corrected methods.

  20. Reducing Bottlenecks to Improve the Efficiency of the Lung Cancer Care Delivery Process: A Process Engineering Modeling Approach to Patient-Centered Care.

    PubMed

    Ju, Feng; Lee, Hyo Kyung; Yu, Xinhua; Faris, Nicholas R; Rugless, Fedoria; Jiang, Shan; Li, Jingshan; Osarogiagbon, Raymond U

    2017-12-01

    The process of lung cancer care from initial lesion detection to treatment is complex, involving multiple steps, each introducing the potential for substantial delays. Identifying the steps with the greatest delays enables a focused effort to improve the timeliness of care-delivery, without sacrificing quality. We retrospectively reviewed clinical events from initial detection, through histologic diagnosis, radiologic and invasive staging, and medical clearance, to surgery for all patients who had an attempted resection of a suspected lung cancer in a community healthcare system. We used a computer process modeling approach to evaluate delays in care delivery, in order to identify potential 'bottlenecks' in waiting time, the reduction of which could produce greater care efficiency. We also conducted 'what-if' analyses to predict the relative impact of simulated changes in the care delivery process to determine the most efficient pathways to surgery. The waiting time between radiologic lesion detection and diagnostic biopsy, and the waiting time from radiologic staging to surgery were the two most critical bottlenecks impeding efficient care delivery (more than 3 times larger compared to reducing other waiting times). Additionally, instituting surgical consultation prior to cardiac consultation for medical clearance and decreasing the waiting time between CT scans and diagnostic biopsies, were potentially the most impactful measures to reduce care delays before surgery. Rigorous computer simulation modeling, using clinical data, can provide useful information to identify areas for improving the efficiency of care delivery by process engineering, for patients who receive surgery for lung cancer.

  1. 28 percent efficient GaAs concentrator solar cells

    NASA Technical Reports Server (NTRS)

    Macmillan, H. F.; Hamaker, H. C.; Kaminar, N. R.; Kuryla, M. S.; Ladle Ristow, M.

    1988-01-01

    AlGaAs/GaAs heteroface solar concentrator cells which exhibit efficiencies in excess of 27 percent at high solar concentrations (over 400 suns, AM1.5D, 100 mW/sq cm) have been fabricated with both n/p and p/n configurations. The best n/p cell achieved an efficiency of 28.1 percent around 400 suns, and the best p/n cell achieved an efficiency of 27.5 percent around 1000 suns. The high performance of these GaAs concentrator cells compared to earlier high-efficiency cells was due to improved control of the metal-organic chemical vapor deposition growth conditions and improved cell fabrication procedures (gridline definition and edge passivation). The design parameters of the solar cell structures and optimized grid pattern were determined with a realistic computer modeling program. An evaluation of the device characteristics and a discussion of future GaAs concentrator cell development are presented.

  2. Assessing the relationship between computational speed and precision: a case study comparing an interpreted versus compiled programming language using a stochastic simulation model in diabetes care.

    PubMed

    McEwan, Phil; Bergenheim, Klas; Yuan, Yong; Tetlow, Anthony P; Gordon, Jason P

    2010-01-01

    Simulation techniques are well suited to modelling diseases yet can be computationally intensive. This study explores the relationship between modelled effect size, statistical precision, and efficiency gains achieved using variance reduction and an executable programming language. A published simulation model designed to model a population with type 2 diabetes mellitus based on the UKPDS 68 outcomes equations was coded in both Visual Basic for Applications (VBA) and C++. Efficiency gains due to the programming language were evaluated, as was the impact of antithetic variates to reduce variance, using predicted QALYs over a 40-year time horizon. The use of C++ provided a 75- and 90-fold reduction in simulation run time when using mean and sampled input values, respectively. For a series of 50 one-way sensitivity analyses, this would yield a total run time of 2 minutes when using C++, compared with 155 minutes for VBA when using mean input values. The use of antithetic variates typically resulted in a 53% reduction in the number of simulation replications and run time required. When drawing all input values to the model from distributions, the use of C++ and variance reduction resulted in a 246-fold improvement in computation time compared with VBA - for which the evaluation of 50 scenarios would correspondingly require 3.8 hours (C++) and approximately 14.5 days (VBA). The choice of programming language used in an economic model, as well as the methods for improving precision of model output can have profound effects on computation time. When constructing complex models, more computationally efficient approaches such as C++ and variance reduction should be considered; concerns regarding model transparency using compiled languages are best addressed via thorough documentation and model validation.
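
    The variance-reduction idea mentioned above, antithetic variates, can be illustrated on a toy problem. The sketch below does not use the diabetes model or reproduce the 53% figure; it assumes a simple stand-in integrand, exp(U) with U uniform on (0, 1), and hypothetical function names. Each antithetic estimate pairs a draw u with its mirror 1 - u, so the two evaluations are negatively correlated and their average varies less than the average of two independent draws.

```python
import math
import random
import statistics


def plain_estimates(n_pairs, rng):
    """Each estimate averages exp(U) over two independent uniform draws."""
    return [
        (math.exp(rng.random()) + math.exp(rng.random())) / 2
        for _ in range(n_pairs)
    ]


def antithetic_estimates(n_pairs, rng):
    """Each estimate averages exp(U) over a draw u and its mirror 1 - u."""
    estimates = []
    for _ in range(n_pairs):
        u = rng.random()
        estimates.append((math.exp(u) + math.exp(1.0 - u)) / 2)
    return estimates


rng = random.Random(0)
plain = plain_estimates(5000, rng)
anti = antithetic_estimates(5000, rng)

plain_var = statistics.variance(plain)
anti_var = statistics.variance(anti)
print("true value      :", math.e - 1)
print("plain mean, var :", statistics.mean(plain), plain_var)
print("antith mean, var:", statistics.mean(anti), anti_var)
```

    Both schemes use two function evaluations per estimate, so the lower variance of the antithetic estimates translates directly into fewer replications for the same precision in this toy case.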

  3. Computational Study of an Axisymmetric Dual Throat Fluidic Thrust Vectoring Nozzle for a Supersonic Aircraft Application

    NASA Technical Reports Server (NTRS)

    Deere, Karen A.; Flamm, Jeffrey D.; Berrier, Bobby L.; Johnson, Stuart K.

    2007-01-01

    A computational investigation of an axisymmetric Dual Throat Nozzle concept has been conducted. This fluidic thrust-vectoring nozzle was designed with a recessed cavity to enhance the throat shifting technique for improved thrust vectoring. The structured-grid, unsteady Reynolds-Averaged Navier-Stokes flow solver PAB3D was used to guide the nozzle design and analyze performance. Nozzle design variables included extent of circumferential injection, cavity divergence angle, cavity length, and cavity convergence angle. Internal nozzle performance (wind-off conditions) and thrust vector angles were computed for several configurations over a range of nozzle pressure ratios from 1.89 to 10, with the fluidic injection flow rate equal to zero and up to 4 percent of the primary flow rate. The effect of a variable expansion ratio on nozzle performance over a range of freestream Mach numbers up to 2 was investigated. Results indicated that a 60° circumferential injection was a good compromise between large thrust vector angles and efficient internal nozzle performance. A cavity divergence angle greater than 10° was detrimental to thrust vector angle. Shortening the cavity length improved internal nozzle performance with a small penalty to thrust vector angle. Contrary to expectations, a variable expansion ratio did not improve thrust efficiency at the flight conditions investigated.

  4. Improving primary health care facility performance in Ghana: efficiency analysis and fiscal space implications.

    PubMed

    Novignon, Jacob; Nonvignon, Justice

    2017-06-12

    Health centers in Ghana play an important role in health care delivery especially in deprived communities. They usually serve as the first line of service and meet basic health care needs. Unfortunately, these facilities are faced with inadequate resources. While health policy makers seek to increase resources committed to primary healthcare, it is important to understand the nature of inefficiencies that exist in these facilities. Therefore, the objectives of this study are threefold; (i) estimate efficiency among primary health facilities (health centers), (ii) examine the potential fiscal space from improved efficiency and (iii) investigate the efficiency disparities in public and private facilities. Data was from the 2015 Access Bottlenecks, Cost and Equity (ABCE) project conducted by the Institute for Health Metrics and Evaluation. The Stochastic Frontier Analysis (SFA) was used to estimate efficiency of health facilities. Efficiency scores were then used to compute potential savings from improved efficiency. Outpatient visits was used as output while number of personnel, hospital beds, expenditure on other capital items and administration were used as inputs. Disparities in efficiency between public and private facilities was estimated using the Nopo matching decomposition procedure. Average efficiency score across all health centers included in the sample was estimated to be 0.51. Also, average efficiency was estimated to be about 0.65 and 0.50 for private and public facilities, respectively. Significant disparities in efficiency were identified across the various administrative regions. With regards to potential fiscal space, we found that, on average, facilities could save about GH₵11,450.70 (US$7633.80) if efficiency was improved. We also found that fiscal space from efficiency gains varies across rural/urban as well as private/public facilities, if best practices are followed. The matching decomposition showed an efficiency gap of 0.29 between private and public facilities. There is need for primary health facility managers to improve productivity via effective and efficient resource use. Efforts to improve efficiency should focus on training health workers and improving facility environment alongside effective monitoring and evaluation exercises.

  5. Optimized blind gamma-ray pulsar searches at fixed computing budget

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pletsch, Holger J.; Clark, Colin J., E-mail: holger.pletsch@aei.mpg.de

    The sensitivity of blind gamma-ray pulsar searches in multiple years worth of photon data, as from the Fermi LAT, is primarily limited by the finite computational resources available. Addressing this 'needle in a haystack' problem, here we present methods for optimizing blind searches to achieve the highest sensitivity at fixed computing cost. For both coherent and semicoherent methods, we consider their statistical properties and study their search sensitivity under computational constraints. The results validate a multistage strategy, where the first stage scans the entire parameter space using an efficient semicoherent method and promising candidates are then refined through a fully coherent analysis. We also find that for the first stage of a blind search incoherent harmonic summing of powers is not worthwhile at fixed computing cost for typical gamma-ray pulsars. Further enhancing sensitivity, we present efficiency-improved interpolation techniques for the semicoherent search stage. Via realistic simulations we demonstrate that overall these optimizations can significantly lower the minimum detectable pulsed fraction by almost 50% at the same computational expense.

  6. High-Speed Photonic Reservoir Computing Using a Time-Delay-Based Architecture: Million Words per Second Classification

    NASA Astrophysics Data System (ADS)

    Larger, Laurent; Baylón-Fuentes, Antonio; Martinenghi, Romain; Udaltsov, Vladimir S.; Chembo, Yanne K.; Jacquot, Maxime

    2017-01-01

    Reservoir computing, originally referred to as an echo state network or a liquid state machine, is a brain-inspired paradigm for processing temporal information. It involves learning a "read-out" interpretation for nonlinear transients developed by high-dimensional dynamics when the latter is excited by the information signal to be processed. This novel computational paradigm is derived from recurrent neural network and machine learning techniques. It has recently been implemented in photonic hardware for a dynamical system, which opens the path to ultrafast brain-inspired computing. We report on a novel implementation involving an electro-optic phase-delay dynamics designed with off-the-shelf optoelectronic telecom devices, thus providing the targeted wide bandwidth. Computational efficiency is demonstrated experimentally with speech-recognition tasks. State-of-the-art speed performances reach one million words per second, with very low word error rate. In addition to record-speed processing, our investigations have revealed computing-efficiency improvements through yet-unexplored temporal-information-processing techniques, such as simultaneous multisample injection and pitched sampling at the read-out compared to information "write-in".

  7. Advanced Computational Thermal Fluid Physics (CTFP) and Its Assessment for Light Water Reactors and Supercritical Reactors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    D.M. McEligot; K. G. Condie; G. E. McCreery

    2005-10-01

    Background: The ultimate goal of the study is the improvement of predictive methods for safety analyses and design of Generation IV reactor systems such as supercritical water reactors (SCWR) for higher efficiency, improved performance and operation, design simplification, enhanced safety and reduced waste and cost. The objective of this Korean / US / laboratory / university collaboration of coupled fundamental computational and experimental studies is to develop the supporting knowledge needed for improved predictive techniques for use in the technology development of Generation IV reactor concepts and their passive safety systems. The present study emphasizes SCWR concepts in the Generation IV program.

  8. Civil propulsion technology for the next twenty-five years

    NASA Technical Reports Server (NTRS)

    Rosen, Robert; Facey, John R.

    1987-01-01

    The next twenty-five years will see major advances in civil propulsion technology that will result in completely new aircraft systems for domestic, international, commuter and high-speed transports. These aircraft will include advanced aerodynamic, structural, and avionic technologies resulting in major new system capabilities and economic improvements. Propulsion technologies will include high-speed turboprops in the near term, very high bypass ratio turbofans, high efficiency small engines and advanced cycles utilizing high temperature materials for high-speed propulsion. Key fundamental enabling technologies include increased temperature capability and advanced design methods. Increased temperature capability will be based on improved composite materials such as metal matrix, intermetallics, ceramics, and carbon/carbon as well as advanced heat transfer techniques. Advanced design methods will make use of advances in internal computational fluid mechanics, reacting flow computation, computational structural mechanics and computational chemistry. The combination of advanced enabling technologies, new propulsion concepts and advanced control approaches will provide major improvements in civil aircraft.

  9. On some methods for improving time of reachability sets computation for the dynamic system control problem

    NASA Astrophysics Data System (ADS)

    Zimovets, Artem; Matviychuk, Alexander; Ushakov, Vladimir

    2016-12-01

    The paper presents two different approaches to reduce the time of computer calculation of reachability sets. The first approach uses different data structures for storing the reachability sets in computer memory for calculation in single-threaded mode. The second approach is based on parallel algorithms that build on the data structures from the first approach. Within the framework of this paper, a parallel algorithm for approximate reachability set calculation on a computer with SMP architecture is proposed. The results of numerical modelling are presented in the form of tables which demonstrate the high efficiency of parallel computing technology and also show how computing time depends on the data structure used.

  10. Fragment informatics and computational fragment-based drug design: an overview and update.

    PubMed

    Sheng, Chunquan; Zhang, Wannian

    2013-05-01

    Fragment-based drug design (FBDD) is a promising approach for the discovery and optimization of lead compounds. Despite its successes, FBDD also faces some internal limitations and challenges. FBDD requires a high quality of target protein and good solubility of fragments. Biophysical techniques for fragment screening necessitate expensive detection equipment and the strategies for evolving fragment hits to leads remain to be improved. Regardless, FBDD is necessary for investigating larger chemical space and can be applied to challenging biological targets. In this scenario, cheminformatics and computational chemistry can be used as alternative approaches that can significantly improve the efficiency and success rate of lead discovery and optimization. Cheminformatics and computational tools assist FBDD in a very flexible manner. Computational FBDD can be used independently or in parallel with experimental FBDD for efficiently generating and optimizing leads. Computational FBDD can also be integrated into each step of experimental FBDD and help to play a synergistic role by maximizing its performance. This review will provide critical analysis of the complementarity between computational and experimental FBDD and highlight recent advances in new algorithms and successful examples of their applications. In particular, fragment-based cheminformatics tools, high-throughput fragment docking, and fragment-based de novo drug design will provide the focus of this review. We will also discuss the advantages and limitations of different methods and the trends in new developments that should inspire future research. © 2012 Wiley Periodicals, Inc.

  11. Three-dimensional photoacoustic tomography based on graphics-processing-unit-accelerated finite element method.

    PubMed

    Peng, Kuan; He, Ling; Zhu, Ziqiang; Tang, Jingtian; Xiao, Jiaying

    2013-12-01

    Compared with commonly used analytical reconstruction methods, the frequency-domain finite element method (FEM) based approach has proven to be an accurate and flexible algorithm for photoacoustic tomography. However, the FEM-based algorithm is computationally demanding, especially for three-dimensional cases. To enhance the algorithm's efficiency, in this work a parallel computational strategy is implemented in the framework of the FEM-based reconstruction algorithm using a graphic-processing-unit parallel frame named the "compute unified device architecture." A series of simulation experiments is carried out to test the accuracy and accelerating effect of the improved method. The results obtained indicate that the parallel calculation does not change the accuracy of the reconstruction algorithm, while its computational cost is significantly reduced by a factor of 38.9 with a GTX 580 graphics card using the improved method.

  12. Optical Computers and Space Technology

    NASA Technical Reports Server (NTRS)

    Abdeldayem, Hossin A.; Frazier, Donald O.; Penn, Benjamin; Paley, Mark S.; Witherow, William K.; Banks, Curtis; Hicks, Rosilen; Shields, Angela

    1995-01-01

    The rapidly increasing demand for greater speed and efficiency on the information superhighway requires significant improvements over conventional electronic logic circuits. Optical interconnections and optical integrated circuits are strong candidates to overcome the severe limitations that conventional electronic logic circuits impose on the growth in speed and complexity of today's computations. The new optical technology has increased the demand for high quality optical materials. NASA's recent involvement in processing optical materials in space has demonstrated that a new and unique class of high quality optical materials is processible in a microgravity environment. Microgravity processing can induce improved order in these materials and could have a significant impact on the development of optical computers. We will discuss NASA's role in processing these materials and report on some of the associated nonlinear optical properties which are quite useful for optical computer technology.

  13. Multiprocessing on supercomputers for computational aerodynamics

    NASA Technical Reports Server (NTRS)

    Yarrow, Maurice; Mehta, Unmeel B.

    1991-01-01

    Little use is made of multiple processors available on current supercomputers (computers with a theoretical peak performance capability equal to 100 MFLOPS or more) to improve turnaround time in computational aerodynamics. The productivity of a computer user is directly related to this turnaround time. In a time-sharing environment, such improvement in this speed is achieved when multiple processors are used efficiently to execute an algorithm. The concept of multiple instructions and multiple data (MIMD) is applied through multitasking via a strategy that requires relatively minor modifications to an existing code for a single processor. This approach maps the available memory to multiple processors, exploiting the C-Fortran-Unix interface. The existing code is mapped without the need for developing a new algorithm. The procedure for building a code utilizing this approach is automated with the Unix stream editor.

  14. Multigrid optimal mass transport for image registration and morphing

    NASA Astrophysics Data System (ADS)

    Rehman, Tauseef ur; Tannenbaum, Allen

    2007-02-01

    In this paper we present a computationally efficient Optimal Mass Transport algorithm. This method is based on the Monge-Kantorovich theory and is used for computing elastic registration and warping maps in image registration and morphing applications. This is a parameter-free method which utilizes all of the grayscale data in an image pair in a symmetric fashion. No landmarks need to be specified for correspondence. In our work, we demonstrate significant improvement in computation time when our algorithm is applied as compared to the originally proposed method by Haker et al [1]. The original algorithm was based on a gradient descent method for removing the curl from an initial mass-preserving map regarded as a 2D vector field. This involves inverting the Laplacian in each iteration, which is now computed using a full multigrid technique, resulting in an improvement in computational time by a factor of two. Greater improvement is achieved by decimating the curl in a multi-resolutional framework. The algorithm was applied to 2D short axis cardiac MRI images and brain MRI images for testing and comparison.

  15. When Does Changing Representation Improve Problem-Solving Performance?

    NASA Technical Reports Server (NTRS)

    Holte, Robert; Zimmer, Robert; MacDonald, Alan

    1992-01-01

    The aim of changing representation is the improvement of problem-solving efficiency. For the most widely studied family of methods of change of representation it is shown that the value of a single parameter, called the expansion factor, is critical in determining (1) whether the change of representation will improve or degrade problem-solving efficiency and (2) whether the solutions produced using the change of representation will or will not be exponentially longer than the shortest solution. A method of computing the expansion factor for a given change of representation is sketched in general and described in detail for homomorphic changes of representation. The results are illustrated with homomorphic decompositions of the Towers of Hanoi problem.

  16. Impact of computer-assisted data collection, evaluation and management on the cancer genetic counselor's time providing patient care.

    PubMed

    Cohen, Stephanie A; McIlvried, Dawn E

    2011-06-01

    Cancer genetic counseling sessions traditionally encompass collecting medical and family history information, evaluating that information for the likelihood of a genetic predisposition for a hereditary cancer syndrome, conveying that information to the patient, offering genetic testing when appropriate, obtaining consent and subsequently documenting the encounter with a clinic note and pedigree. Software programs exist to collect family and medical history information electronically, intending to improve efficiency and simplicity of collecting, managing and storing this data. This study compares the genetic counselor's time spent in cancer genetic counseling tasks in a traditional model and one using computer-assisted data collection, which is then used to generate a pedigree, risk assessment and consult note. Genetic counselor time spent collecting family and medical history and providing face-to-face counseling for a new patient session decreased from an average of 85 min to 69 min when using the computer-assisted data collection. However, there was no statistically significant change in overall genetic counselor time on all aspects of the genetic counseling process, due to an increased amount of time spent generating an electronic pedigree and consult note. Improvements in the computer program's technical design would potentially minimize data manipulation. Certain aspects of this program, such as electronic collection of family history and risk assessment, appear effective in improving cancer genetic counseling efficiency while others, such as generating an electronic pedigree and consult note, do not.

  17. Strategic deployment plan : intelligent transportation system (ITS) : early deployment study, Kansas City metropolitan bi-state area

    DOT National Transportation Integrated Search

    1997-01-01

    Intelligent transportation systems (ITS) are systems that utilize advanced technologies, including computer, communications and process control technologies, to improve the efficiency and safety of the transportation system. These systems encompass a...

  18. An efficient implementation of 3D high-resolution imaging for large-scale seismic data with GPU/CPU heterogeneous parallel computing

    NASA Astrophysics Data System (ADS)

    Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng

    2018-02-01

    De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we propose a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPU). Then, we design an imaging point parallel strategy to achieve an optimal parallel computing performance. Afterward, we adopt an asynchronous double buffering scheme for multi-stream to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies of computation and storage based on the compute unified device architecture (CUDA) are adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improves the efficiency. A numerical example employing real large-scale, 3D seismic data shows that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significantly reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.

  19. Stirling Air Conditioner for Compact Cooling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    2010-09-01

    BEETIT Project: Infinia is developing a compact air conditioner that uses an unconventional, highly efficient Stirling cycle system (vs. conventional vapor compression systems) to produce cool air that is energy efficient and does not rely on polluting refrigerants. The Stirling cycle system is a type of air conditioning system that uses a motor with a piston to remove heat to the outside atmosphere using a gas refrigerant. To date, Stirling systems have been expensive and have not had the right kind of heat exchanger to help cool air efficiently. Infinia is using chip cooling technology from the computer industry to make improvements to the heat exchanger and improve system performance. Infinia’s air conditioner uses helium gas as refrigerant, an environmentally benign gas that does not react with other chemicals and does not burn. Infinia’s improvements to the Stirling cycle system will enable the cost-effective mass production of high-efficiency air conditioners that use no polluting refrigerants.

  20. Study of relationships of material properties and high efficiency solar cell performance on material composition

    NASA Technical Reports Server (NTRS)

    Sah, C. T.

    1983-01-01

    The performance improvements obtainable from extending the traditionally thin back-surface-field (BSF) layer deep into the base of silicon solar cells under terrestrial solar illumination (AM1) are analyzed. This extended BSF cell is also known as the back-drift-field cell. About 100 silicon cells were analyzed, each with a different emitter or base dopant impurity distribution whose selection was based on physically anticipated improvements. The four principal performance parameters (the open-circuit voltage, the short-circuit current, the fill factor, and the maximum efficiency) are computed using a FORTRAN program, called Circuit Technique for Semiconductor-device Analysis (CTSA), which numerically solves the six Shockley Equations under AM1 solar illumination at 88.92 mW/cm^2, at an optimum cell thickness of 50 µm. The results show that very significant performance improvements can be realized by extending the BSF layer thickness from 2 µm (18% efficiency) to 40 µm (20% efficiency).

  1. Efficient parallelization for AMR MHD multiphysics calculations; implementation in AstroBEAR

    NASA Astrophysics Data System (ADS)

    Carroll-Nellenback, Jonathan J.; Shroyer, Brandon; Frank, Adam; Ding, Chen

    2013-03-01

    Current adaptive mesh refinement (AMR) simulations require algorithms that are highly parallelized and manage memory efficiently. As compute engines grow larger, AMR simulations will require algorithms that achieve new levels of efficient parallelization and memory management. We have attempted to employ new techniques to achieve both of these goals. Patch or grid based AMR often employs ghost cells to decouple the hyperbolic advances of each grid on a given refinement level. This decoupling allows each grid to be advanced independently. In AstroBEAR we utilize this independence by threading the grid advances on each level with preference going to the finer level grids. This allows for global load balancing instead of level by level load balancing and allows for greater parallelization across both physical space and AMR level. Threading of level advances can also improve performance by interleaving communication with computation, especially in deep simulations with many levels of refinement. While we see improvements of up to 30% on deep simulations run on a few cores, the speedup is typically more modest (5-20%) for larger scale simulations. To improve memory management we have employed a distributed tree algorithm that requires processors to only store and communicate local sections of the AMR tree structure with neighboring processors. Using this distributed approach we are able to get reasonable scaling efficiency (>80%) out to 12288 cores and up to 8 levels of AMR - independent of the use of threading.
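
    The ">80%" figure above is a parallel scaling efficiency. The abstract does not say how it was measured, so the following is only a minimal sketch assuming the common strong-scaling definition (fixed problem size, efficiency relative to a smaller reference run). The function name and the timings are hypothetical and are not AstroBEAR results.

```python
def strong_scaling_efficiency(t_ref, n_ref, t_n, n):
    """Strong-scaling efficiency relative to a reference run.

    t_ref: wall time on n_ref cores (reference run)
    t_n:   wall time on n cores (same problem size)
    Returns 1.0 for ideal scaling, i.e. time inversely proportional to cores.
    """
    if min(t_ref, n_ref, t_n, n) <= 0:
        raise ValueError("times and core counts must be positive")
    return (t_ref * n_ref) / (t_n * n)


# Hypothetical timings, chosen only to illustrate the arithmetic.
efficiency = strong_scaling_efficiency(t_ref=100.0, n_ref=1536, t_n=15.0, n=12288)
print(f"{efficiency:.1%}")
```

    With eight times as many cores, ideal scaling would cut the hypothetical 100 s run to 12.5 s; taking 15 s instead corresponds to an efficiency of about 83%.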

  2. Fast approximation for joint optimization of segmentation, shape, and location priors, and its application in gallbladder segmentation.

    PubMed

    Saito, Atsushi; Nawano, Shigeru; Shimizu, Akinobu

    2017-05-01

    This paper addresses joint optimization for segmentation and shape priors, including translation, to overcome inter-subject variability in the location of an organ. Because a simple extension of the previous exact optimization method is too computationally complex, we propose a fast approximation for optimization. The effectiveness of the proposed approximation is validated in the context of gallbladder segmentation from a non-contrast computed tomography (CT) volume. After spatial standardization and estimation of the posterior probability of the target organ, simultaneous optimization of the segmentation, shape, and location priors is performed using a branch-and-bound method. Fast approximation is achieved by combining sampling in the eigenshape space to reduce the number of shape priors and an efficient computational technique for evaluating the lower bound. Performance was evaluated using threefold cross-validation of 27 CT volumes. Optimization in terms of translation of the shape prior significantly improved segmentation performance. The proposed method achieved a result of 0.623 on the Jaccard index in gallbladder segmentation, which is comparable to that of state-of-the-art methods. The computational efficiency of the algorithm is confirmed to be good enough to allow execution on a personal computer. Joint optimization of the segmentation, shape, and location priors was proposed, and it proved to be effective in gallbladder segmentation with high computational efficiency.
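
    The Jaccard index reported above measures the overlap between a predicted segmentation and a reference segmentation. A minimal sketch follows, assuming both are given as flat binary masks of equal length. The function name, the sample masks, and the convention for two empty masks are illustrative assumptions, not details from the paper.

```python
def jaccard_index(mask_a, mask_b):
    """Jaccard index |A and B| / |A or B| for two equally sized binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    union = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    # Assumption: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Toy six-voxel masks (hypothetical data).
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 0, 1, 1, 1, 0]
print(jaccard_index(predicted, reference))  # 2 shared voxels / 4 in the union = 0.5
```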

  3. Efficient scatter model for simulation of ultrasound images from computed tomography data

    NASA Astrophysics Data System (ADS)

    D'Amato, J. P.; Lo Vercio, L.; Rubi, P.; Fernandez Vera, E.; Barbuzza, R.; Del Fresno, M.; Larrabide, I.

    2015-12-01

    Background and motivation: Real-time ultrasound simulation refers to the process of computationally creating fully synthetic ultrasound images instantly. Due to the high value of specialized low cost training for healthcare professionals, there is a growing interest in the use of this technology and the development of high fidelity systems that simulate the acquisition of echographic images. The objective is to create an efficient and reproducible simulator that can run either on notebooks or desktops using low cost devices. Materials and methods: We present an interactive ultrasound simulator based on CT data. This simulator is based on ray-casting and provides real-time interaction capabilities. The simulation of scattering that is coherent with the transducer position in real time is also introduced. Such noise is produced using a simplified model of multiplicative noise and convolution with point spread functions (PSF) tailored for this purpose. Results: The generation of scattering maps was revised to improve its computational efficiency. This allowed a more efficient simulation of coherent scattering in the synthetic echographic images while providing highly realistic results. We describe some quality and performance metrics to validate these results, where a performance of up to 55 fps was achieved. Conclusion: The proposed technique for real-time scattering modeling provides realistic yet computationally efficient scatter distributions. The error between the original image and the simulated scattering image was compared for the proposed method and the state-of-the-art, showing negligible differences in its distribution.

  4. Far-field radiation patterns of aperture antennas by the Winograd Fourier transform algorithm

    NASA Technical Reports Server (NTRS)

    Heisler, R.

    1978-01-01

    A more time-efficient algorithm for computing the discrete Fourier transform, the Winograd Fourier transform (WFT), is described. The WFT algorithm is compared with other transform algorithms. Results indicate that antenna analysis is a very successful application of the WFT algorithm. Significant savings in CPU time will improve the computer turnaround time and circumvent the need to resort to weekend runs.
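
    For reference, the quantity that the WFT and the other transform algorithms all compute is the standard discrete Fourier transform of a length-N sequence. The symbols below are notation chosen here, not taken from the report.

```latex
X_k = \sum_{n=0}^{N-1} x_n \, e^{-2\pi i \, k n / N}, \qquad k = 0, 1, \dots, N-1
```

    Evaluating this sum directly costs on the order of N^2 complex multiplications. Fast algorithms, the WFT among them, produce the same X_k with fewer operations.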

  5. MOLAR: Modular Linux and Adaptive Runtime Support for HEC OS/R Research

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Frank Mueller

    2009-02-05

    MOLAR is a multi-institution research effort that concentrates on adaptive, reliable, and efficient operating and runtime system solutions for ultra-scale high-end scientific computing on the next generation of supercomputers. This research addresses the challenges outlined by the FAST-OS (forum to address scalable technology for runtime and operating systems) and HECRTF (high-end computing revitalization task force) activities by providing a modular Linux and adaptable runtime support for high-end computing operating and runtime systems. The MOLAR research has the following goals to address these issues. (1) Create a modular and configurable Linux system that allows customized changes based on the requirements of the applications, runtime systems, and cluster management software. (2) Build runtime systems that leverage the OS modularity and configurability to improve efficiency, reliability, scalability, and ease-of-use, and to provide support to legacy and promising programming models. (3) Advance computer reliability, availability and serviceability (RAS) management systems to work cooperatively with the OS/R to identify and preemptively resolve system issues. (4) Explore the use of advanced monitoring and adaptation to improve application performance and predictability of system interruptions. The overall goal of the research conducted at NCSU is to develop scalable algorithms for high-availability without single points of failure and without single points of control.

  6. A Taylor Expansion-Based Adaptive Design Strategy for Global Surrogate Modeling With Applications in Groundwater Modeling

    NASA Astrophysics Data System (ADS)

    Mo, Shaoxing; Lu, Dan; Shi, Xiaoqing; Zhang, Guannan; Ye, Ming; Wu, Jianfeng; Wu, Jichun

    2017-12-01

    Global sensitivity analysis (GSA) and uncertainty quantification (UQ) for groundwater modeling are challenging because of the model complexity and significant computational requirements. To reduce the massive computational cost, a cheap-to-evaluate surrogate model is usually constructed to approximate and replace the expensive groundwater models in the GSA and UQ. Constructing an accurate surrogate requires actual model simulations on a number of parameter samples. Thus, a robust experimental design strategy is desired to locate informative samples so as to reduce the computational cost in surrogate construction and consequently to improve the efficiency in the GSA and UQ. In this study, we develop a Taylor expansion-based adaptive design (TEAD) that aims to build an accurate global surrogate model with a small training sample size. TEAD defines a novel hybrid score function to search for informative samples, and a robust stopping criterion to terminate the sample search that guarantees the resulting approximation errors satisfy the desired accuracy. The good performance of TEAD in building global surrogate models is demonstrated on seven analytical functions with different dimensionality and complexity in comparison to two widely used experimental design methods. The application of the TEAD-based surrogate method in two groundwater models shows that the TEAD design can effectively improve the computational efficiency of GSA and UQ for groundwater modeling.

  7. Iterated Gate Teleportation and Blind Quantum Computation.

    PubMed

    Pérez-Delgado, Carlos A; Fitzsimons, Joseph F

    2015-06-05

    Blind quantum computation allows a user to delegate a computation to an untrusted server while keeping the computation hidden. A number of recent works have sought to establish bounds on the communication requirements necessary to implement blind computation, and a bound based on the no-programming theorem of Nielsen and Chuang has emerged as a natural limiting factor. Here we show that this constraint only holds in limited scenarios, and show how to overcome it using a novel method of iterated gate teleportations. This technique enables drastic reductions in the communication required for distributed quantum protocols, extending beyond the blind computation setting. Applied to blind quantum computation, this technique offers significant efficiency improvements, and in some scenarios offers an exponential reduction in communication requirements.

  8. A Round-Efficient Authenticated Key Agreement Scheme Based on Extended Chaotic Maps for Group Cloud Meeting.

    PubMed

    Lin, Tsung-Hung; Tsung, Chen-Kun; Lee, Tian-Fu; Wang, Zeng-Bo

    2017-12-03

    Security is a critical issue for business purposes. For example, a cloud meeting requires strong security to maintain communication privacy. For the cloud meeting scenario, we apply an extended chaotic map to construct a passwordless group authentication key agreement scheme, termed Passwordless Group Authentication Key Agreement (PL-GAKA). PL-GAKA improves on the computation efficiency of the simple group password-based authenticated key agreement (SGPAKE) proposed by Lee et al. in terms of computing the session key. Since the extended chaotic map has a security level equivalent to that of the Diffie-Hellman key exchange scheme applied by SGPAKE, the security of PL-GAKA is not sacrificed when improving the computation efficiency. Moreover, PL-GAKA is a passwordless scheme, so password maintenance is not necessary. Short-term authentication is considered, hence communication security is stronger than in other protocols because a session key is generated dynamically in each cloud meeting. In our analysis, we first prove that each meeting member can get the correct information during the meeting. We analyze common security issues for the proposed PL-GAKA in terms of session key security, mutual authentication, perfect forward security, and data integrity. Moreover, we also demonstrate that communicating in PL-GAKA is secure against replay attacks, impersonation attacks, privileged insider attacks, and stolen-verifier attacks. Finally, an overall comparison is given to show the performance of PL-GAKA, SGPAKE and related solutions.

  9. Solving the Coupled System Improves Computational Efficiency of the Bidomain Equations

    PubMed Central

    Southern, James A.; Plank, Gernot; Vigmond, Edward J.; Whiteley, Jonathan P.

    2017-01-01

    The bidomain equations are frequently used to model the propagation of cardiac action potentials across cardiac tissue. At the whole organ level the size of the computational mesh required makes their solution a significant computational challenge. As the accuracy of the numerical solution cannot be compromised, efficiency of the solution technique is important to ensure that the results of the simulation can be obtained in a reasonable time whilst still encapsulating the complexities of the system. In an attempt to increase efficiency of the solver, the bidomain equations are often decoupled into one parabolic equation that is computationally very cheap to solve and an elliptic equation that is much more expensive to solve. In this study the performance of this uncoupled solution method is compared with an alternative strategy in which the bidomain equations are solved as a coupled system. This seems counter-intuitive as the alternative method requires the solution of a much larger linear system at each time step. However, in tests on two 3-D rabbit ventricle benchmarks it is shown that the coupled method is up to 80% faster than the conventional uncoupled method — and that parallel performance is better for the larger coupled problem. PMID:19457741

  10. Extension of the TDCR model to compute counting efficiencies for radionuclides with complex decay schemes.

    PubMed

    Kossert, K; Cassette, Ph; Carles, A Grau; Jörg, G; Gostomski, Christroph Lierse V; Nähle, O; Wolf, Ch

    2014-05-01

    The triple-to-double coincidence ratio (TDCR) method is frequently used to measure the activity of radionuclides decaying by pure β emission or electron capture (EC). Some radionuclides with more complex decays have also been studied, but accurate calculations of decay branches which are accompanied by many coincident γ transitions have not yet been investigated. This paper describes recent extensions of the model to make efficiency computations for more complex decay schemes possible. In particular, the MICELLE2 program that applies a stochastic approach of the free parameter model was extended. With an improved code, efficiencies for β(-), β(+) and EC branches with up to seven coincident γ transitions can be calculated. Moreover, a new parametrization for the computation of electron stopping powers has been implemented to compute the ionization quenching function of 10 commercial scintillation cocktails. In order to demonstrate the capabilities of the TDCR method, the following radionuclides are discussed: (166m)Ho (complex β(-)/γ), (59)Fe (complex β(-)/γ), (64)Cu (β(-), β(+), EC and EC/γ) and (229)Th in equilibrium with its progenies (decay chain with many α, β and complex β(-)/γ transitions). © 2013 Published by Elsevier Ltd.

  11. A new sparse optimization scheme for simultaneous beam angle and fluence map optimization in radiotherapy planning

    NASA Astrophysics Data System (ADS)

    Liu, Hongcheng; Dong, Peng; Xing, Lei

    2017-08-01

    $\ell_{2,1}$-minimization-based sparse optimization was employed to solve the beam angle optimization (BAO) in intensity-modulated radiation therapy (IMRT) planning. The technique approximates the exact BAO formulation with efficiently computable convex surrogates, leading to plans that are inferior to those attainable with recently proposed gradient-based greedy schemes. In this paper, we alleviate/reduce the nontrivial inconsistencies between the $\ell_{2,1}$-based formulations and the exact BAO model by proposing a new sparse optimization framework based on the most recent developments in group variable selection. We propose the incorporation of the group-folded concave penalty (gFCP) as a substitution to the $\ell_{2,1}$-minimization framework. The new formulation is then solved by a variation of an existing gradient method. The performance of the proposed scheme is evaluated by both plan quality and the computational efficiency using three IMRT cases: a coplanar prostate case, a coplanar head-and-neck case, and a noncoplanar liver case. Involved in the evaluation are two alternative schemes: the $\ell_{2,1}$-minimization approach and the gradient norm method (GNM). The gFCP-based scheme outperforms both counterpart approaches. In particular, gFCP generates better plans than those obtained using the $\ell_{2,1}$-minimization for all three cases with a comparable computation time. As compared to the GNM, the gFCP improves both the plan quality and computational efficiency. The proposed gFCP-based scheme provides a promising framework for BAO and promises to improve both planning time and plan quality.

  12. A new sparse optimization scheme for simultaneous beam angle and fluence map optimization in radiotherapy planning.

    PubMed

    Liu, Hongcheng; Dong, Peng; Xing, Lei

    2017-07-20

    $\ell_{2,1}$-minimization-based sparse optimization was employed to solve the beam angle optimization (BAO) in intensity-modulated radiation therapy (IMRT) planning. The technique approximates the exact BAO formulation with efficiently computable convex surrogates, leading to plans that are inferior to those attainable with recently proposed gradient-based greedy schemes. In this paper, we alleviate/reduce the nontrivial inconsistencies between the $\ell_{2,1}$-based formulations and the exact BAO model by proposing a new sparse optimization framework based on the most recent developments in group variable selection. We propose the incorporation of the group-folded concave penalty (gFCP) as a substitution to the $\ell_{2,1}$-minimization framework. The new formulation is then solved by a variation of an existing gradient method. The performance of the proposed scheme is evaluated by both plan quality and the computational efficiency using three IMRT cases: a coplanar prostate case, a coplanar head-and-neck case, and a noncoplanar liver case. Involved in the evaluation are two alternative schemes: the $\ell_{2,1}$-minimization approach and the gradient norm method (GNM). The gFCP-based scheme outperforms both counterpart approaches. In particular, gFCP generates better plans than those obtained using the $\ell_{2,1}$-minimization for all three cases with a comparable computation time. As compared to the GNM, the gFCP improves both the plan quality and computational efficiency. The proposed gFCP-based scheme provides a promising framework for BAO and promises to improve both planning time and plan quality.

  13. Facilitating mathematics learning for students with upper extremity disabilities using touch-input system.

    PubMed

    Choi, Kup-Sze; Chan, Tak-Yin

    2015-03-01

    This study investigates the feasibility of using a tablet device as the user interface for students with upper extremity disabilities to input mathematics efficiently into a computer. A touch-input system using a tablet device as the user interface was proposed to assist these students in writing mathematics. User-switchable and context-specific keyboard layouts were designed to streamline the input process. The system could be integrated with conventional computer systems with only minor software setup. A two-week pre-post test study involving five participants was conducted to evaluate the performance of the system and collect user feedback. The mathematics input efficiency of the participants was found to improve during the experiment sessions. In particular, their performance in entering trigonometric expressions using the touch-input system was significantly better than with conventional mathematics editing software using keyboard and mouse. The participants rated the touch-input system positively and were confident that they could operate it with ease given more practice. The proposed touch-input system provides a convenient way for students with hand impairment to write mathematics and has the potential to facilitate their mathematics learning. Implications for Rehabilitation: Students with upper extremity disabilities often face barriers to learning mathematics, which is largely based on handwriting. Conventional computer user interfaces are inefficient for them to input mathematics into a computer. A touch-input system with context-specific and user-switchable keyboard layouts was designed to improve the efficiency of mathematics input. Experimental results and user feedback suggested that the system has the potential to facilitate mathematics learning for the students.

  14. Swellix: a computational tool to explore RNA conformational space.

    PubMed

    Sloat, Nathan; Liu, Jui-Wen; Schroeder, Susan J

    2017-11-21

    The sequence of nucleotides in an RNA determines the possible base pairs for an RNA fold and thus also determines the overall shape and function of an RNA. The Swellix program presented here combines a helix abstraction with a combinatorial approach to the RNA folding problem in order to compute all possible non-pseudoknotted RNA structures for RNA sequences. The Swellix program builds on the Crumple program and can include experimental constraints on global RNA structures such as the minimum number and lengths of helices from crystallography, cryoelectron microscopy, or in vivo crosslinking and chemical probing methods. The conceptual advance in Swellix is to count helices and generate all possible combinations of helices rather than counting and combining base pairs. Swellix bundles similar helices and includes improvements in memory use and efficient parallelization. Biological applications of Swellix are demonstrated by computing the reduction in conformational space and entropy due to naturally modified nucleotides in tRNA sequences and by motif searches in Human Endogenous Retroviral (HERV) RNA sequences. The Swellix motif search reveals occurrences of protein and drug binding motifs in the HERV RNA ensemble that do not occur in minimum free energy or centroid predicted structures. Swellix presents significant improvements over Crumple in terms of efficiency and memory use. The efficient parallelization of Swellix enables the computation of sequences as long as 418 nucleotides with sufficient experimental constraints. Thus, Swellix provides a practical alternative to free energy minimization tools when multiple structures, kinetically determined structures, or complex RNA-RNA and RNA-protein interactions are present in an RNA folding problem.

  15. Validation of numerical model for cook stove using Reynolds averaged Navier-Stokes based solver

    NASA Astrophysics Data System (ADS)

    Islam, Md. Moinul; Hasan, Md. Abdullah Al; Rahman, Md. Mominur; Rahaman, Md. Mashiur

    2017-12-01

    Biomass-fired cook stoves have for many years been the main cooking appliance for rural people in developing countries. Several studies have been carried out to find efficient stoves. In the present study, a numerical model of an improved household cook stove is developed to analyze the heat transfer and flow behavior of gas during operation. The numerical model is validated against experimental results. The computation is executed using the non-premixed combustion model. The Reynolds-averaged Navier-Stokes (RaNS) equations, along with the κ - ɛ model, govern the turbulent flow within the computed domain. The computational results are in good agreement with the experiment. The developed numerical model can be used to predict the effect of different biomasses on the efficiency of the cook stove.

  16. On modelling three-dimensional piezoelectric smart structures with boundary spectral element method

    NASA Astrophysics Data System (ADS)

    Zou, Fangxin; Aliabadi, M. H.

    2017-05-01

    The computational efficiency of the boundary element method in elastodynamic analysis can be significantly improved by employing high-order spectral elements for boundary discretisation. In this work, for the first time, the so-called boundary spectral element method is utilised to formulate the piezoelectric smart structures that are widely used in structural health monitoring (SHM) applications. The resultant boundary spectral element formulation has been validated by the finite element method (FEM) and physical experiments. The new formulation has demonstrated a lower demand on computational resources and a higher numerical stability than commercial FEM packages. Compared to the conventional boundary element formulation, a significant reduction in computational expenses has been achieved. In summary, the boundary spectral element formulation presented in this paper provides a highly efficient and stable mathematical tool for the development of SHM applications.

  17. Algebraic model checking for Boolean gene regulatory networks.

    PubMed

    Tran, Quoc-Nam

    2011-01-01

    We present a computational method in which modular and Groebner bases (GB) computation in Boolean rings are used for solving problems in Boolean gene regulatory networks (BN). In contrast to other known algebraic approaches, the degree of intermediate polynomials during the calculation of Groebner bases using our method will never grow resulting in a significant improvement in running time and memory space consumption. We also show how calculation in temporal logic for model checking can be done by means of our direct and efficient Groebner basis computation in Boolean rings. We present our experimental results in finding attractors and control strategies of Boolean networks to illustrate our theoretical arguments. The results are promising. Our algebraic approach is more efficient than the state-of-the-art model checker NuSMV on BNs. More importantly, our approach finds all solutions for the BN problems.

  18. Parallel processing for scientific computations

    NASA Technical Reports Server (NTRS)

    Alkhatib, Hasan S.

    1991-01-01

    The main contribution of the effort in the last two years is the introduction of the MOPPS system. After doing extensive literature search, we introduced the system which is described next. MOPPS employs a new solution to the problem of managing programs which solve scientific and engineering applications on a distributed processing environment. Autonomous computers cooperate efficiently in solving large scientific problems with this solution. MOPPS has the advantage of not assuming the presence of any particular network topology or configuration, computer architecture, or operating system. It imposes little overhead on network and processor resources while efficiently managing programs concurrently. The core of MOPPS is an intelligent program manager that builds a knowledge base of the execution performance of the parallel programs it is managing under various conditions. The manager applies this knowledge to improve the performance of future runs. The program manager learns from experience.

  19. Accelerating global optimization of aerodynamic shapes using a new surrogate-assisted parallel genetic algorithm

    NASA Astrophysics Data System (ADS)

    Ebrahimi, Mehdi; Jahangirian, Alireza

    2017-12-01

    An efficient strategy is presented for global shape optimization of wing sections with a parallel genetic algorithm. Several computational techniques are applied to increase the convergence rate and the efficiency of the method. A variable fidelity computational evaluation method is applied in which the expensive Navier-Stokes flow solver is complemented by an inexpensive multi-layer perceptron neural network for the objective function evaluations. A population dispersion method that consists of two phases, of exploration and refinement, is developed to improve the convergence rate and the robustness of the genetic algorithm. Owing to the nature of the optimization problem, a parallel framework based on the master/slave approach is used. The outcomes indicate that the method is able to find the global optimum with significantly lower computational time in comparison to the conventional genetic algorithm.

  20. Efficiency of using first-generation information during second-generation selection: results of computer simulation.

    Treesearch

    T.Z. Ye; K.J.S. Jayawickrama; G.R. Johnson

    2004-01-01

    BLUP (Best linear unbiased prediction) method has been widely used in forest tree improvement programs. Since one of the properties of BLUP is that related individuals contribute to the predictions of each other, it seems logical that integrating data from all generations and from all populations would improve both the precision and accuracy in predicting genetic...

  1. Assisting People with Multiple Disabilities by Improving Their Computer Pointing Efficiency with an Automatic Target Acquisition Program

    ERIC Educational Resources Information Center

    Shih, Ching-Hsiang; Shih, Ching-Tien; Peng, Chin-Ling

    2011-01-01

    This study evaluated whether two people with multiple disabilities would be able to improve their pointing performance through an Automatic Target Acquisition Program (ATAP) and a newly developed mouse driver (i.e. a new mouse driver replaces standard mouse driver, and is able to monitor mouse movement and intercept click action). Initially, both…

  2. A novel composite adaptive flap controller design by a high-efficient modified differential evolution identification approach.

    PubMed

    Li, Nailu; Mu, Anle; Yang, Xiyun; Magar, Kaman T; Liu, Chao

    2018-05-01

    The optimal tuning of adaptive flap controller can improve adaptive flap control performance on uncertain operating environments, but the optimization process is usually time-consuming and it is difficult to design proper optimal tuning strategy for the flap control system (FCS). To solve this problem, a novel adaptive flap controller is designed based on a high-efficient differential evolution (DE) identification technique and composite adaptive internal model control (CAIMC) strategy. The optimal tuning can be easily obtained by DE identified inverse of the FCS via CAIMC structure. To achieve fast tuning, a high-efficient modified adaptive DE algorithm is proposed with new mutant operator and varying range adaptive mechanism for the FCS identification. A tradeoff between optimized adaptive flap control and low computation cost is successfully achieved by proposed controller. Simulation results show the robustness of proposed method and its superiority to conventional adaptive IMC (AIMC) flap controller and the CAIMC flap controllers using other DE algorithms on various uncertain operating conditions. The high computation efficiency of proposed controller is also verified based on the computation time on those operating cases. Copyright © 2018 ISA. Published by Elsevier Ltd. All rights reserved.

  3. Higher Order Time Integration Schemes for the Unsteady Navier-Stokes Equations on Unstructured Meshes

    NASA Technical Reports Server (NTRS)

    Jothiprasad, Giridhar; Mavriplis, Dimitri J.; Caughey, David A.

    2002-01-01

    The rapid increase in available computational power over the last decade has enabled higher resolution flow simulations and more widespread use of unstructured grid methods for complex geometries. While much of this effort has been focused on steady-state calculations in the aerodynamics community, the need to accurately predict off-design conditions, which may involve substantial amounts of flow separation, points to the need to efficiently simulate unsteady flow fields. Accurate unsteady flow simulations can easily require several orders of magnitude more computational effort than a corresponding steady-state simulation. For this reason, techniques for improving the efficiency of unsteady flow simulations are required in order to make such calculations feasible in the foreseeable future. The purpose of this work is to investigate possible reductions in computer time due to the choice of an efficient time-integration scheme from a series of schemes differing in the order of time-accuracy, and by the use of more efficient techniques to solve the nonlinear equations which arise while using implicit time-integration schemes. This investigation is carried out in the context of a two-dimensional unstructured mesh laminar Navier-Stokes solver.

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Krueger, Jens; Micikevicius, Paulius; Williams, Samuel

    Reverse Time Migration (RTM) is one of the main approaches in the seismic processing industry for imaging the subsurface structure of the Earth. While RTM provides qualitative advantages over its predecessors, it has a high computational cost warranting implementation on HPC architectures. We focus on three progressively more complex kernels extracted from RTM: for isotropic (ISO), vertical transverse isotropic (VTI) and tilted transverse isotropic (TTI) media. In this work, we examine performance optimization of forward wave modeling, which describes the computational kernels used in RTM, on emerging multi- and manycore processors and introduce a novel common subexpression elimination optimization for TTI kernels. We compare attained performance and energy efficiency in both the single-node and distributed memory environments in order to satisfy industry’s demands for fidelity, performance, and energy efficiency. Moreover, we discuss the interplay between architecture (chip and system) and optimizations (both on-node computation) highlighting the importance of NUMA-aware approaches to MPI communication. Ultimately, our results show we can improve CPU energy efficiency by more than 10× on Magny Cours nodes while acceleration via multiple GPUs can surpass the energy-efficient Intel Sandy Bridge by as much as 3.6×.

  5. Seismic imaging using finite-differences and parallel computers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ober, C.C.

    1997-12-31

    A key to reducing the risks and costs associated with oil and gas exploration is the fast, accurate imaging of complex geologies, such as salt domes in the Gulf of Mexico and overthrust regions in US onshore regions. Prestack depth migration generally yields the most accurate images, and one approach to this is to solve the scalar wave equation using finite differences. As part of an ongoing ACTI project funded by the US Department of Energy, a finite difference, 3-D prestack, depth migration code has been developed. The goal of this work is to demonstrate that massively parallel computers can be used efficiently for seismic imaging, and that sufficient computing power exists (or soon will exist) to make finite difference, prestack, depth migration practical for oil and gas exploration. Several problems had to be addressed to get an efficient code for the Intel Paragon. These include efficient I/O, efficient parallel tridiagonal solves, and high single-node performance. Furthermore, to provide portable code the author has been restricted to the use of high-level programming languages (C and Fortran) and interprocessor communications using MPI. He has been using the SUNMOS operating system, which has affected many of his programming decisions. He will present images created from two verification datasets (the Marmousi Model and the SEG/EAEG 3D Salt Model). Also, he will show recent images from real datasets, and point out locations of improved imaging. Finally, he will discuss areas of current research which will hopefully improve the image quality and reduce computational costs.

  6. FWT2D: A massively parallel program for frequency-domain full-waveform tomography of wide-aperture seismic data—Part 1: Algorithm

    NASA Astrophysics Data System (ADS)

    Sourbier, Florent; Operto, Stéphane; Virieux, Jean; Amestoy, Patrick; L'Excellent, Jean-Yves

    2009-03-01

    This is the first paper in a two-part series that describes a massively parallel code that performs 2D frequency-domain full-waveform inversion of wide-aperture seismic data for imaging complex structures. Full-waveform inversion methods, namely quantitative seismic imaging methods based on the resolution of the full wave equation, are computationally expensive. Therefore, designing efficient algorithms which take advantage of parallel computing facilities is critical for the appraisal of these approaches when applied to representative case studies and for further improvements. Full-waveform modelling requires the resolution of a large sparse system of linear equations which is performed with the massively parallel direct solver MUMPS for efficient multiple-shot simulations. Efficiency of the multiple-shot solution phase (forward/backward substitutions) is improved by using the BLAS3 library. The inverse problem relies on a classic local optimization approach implemented with a gradient method. The direct solver returns the multiple-shot wavefield solutions distributed over the processors according to a domain decomposition driven by the distribution of the LU factors. The domain decomposition of the wavefield solutions is used to compute in parallel the gradient of the objective function and the diagonal Hessian, this latter providing a suitable scaling of the gradient. The algorithm allows one to test different strategies for multiscale frequency inversion ranging from successive mono-frequency inversion to simultaneous multifrequency inversion. These different inversion strategies will be illustrated in the following companion paper. The parallel efficiency and the scalability of the code will also be quantified.

  7. Distributed GPU Computing in GIScience

    NASA Astrophysics Data System (ADS)

    Jiang, Y.; Yang, C.; Huang, Q.; Li, J.; Sun, M.

    2013-12-01

    Geoscientists strive to discover potential principles and patterns hidden inside ever-growing Big Data for scientific discoveries. To better achieve this objective, more capable computing resources are required to process, analyze and visualize Big Data (Ferreira et al., 2003; Li et al., 2013). Current CPU-based computing techniques cannot promptly meet the computing challenges caused by the increasing number of datasets from different domains, such as social media, earth observation, and environmental sensing (Li et al., 2013). Meanwhile, CPU-based computing resources structured as clusters or supercomputers are costly. In the past several years, as GPU-based technology has matured in both capability and performance, GPU-based computing has emerged as a new computing paradigm. Compared to the traditional microprocessor, the modern GPU, as a compelling alternative microprocessor, has outstanding parallel processing capability with cost-effectiveness and efficiency (Owens et al., 2008), although it was initially designed for graphical rendering in the visualization pipeline. This presentation reports a distributed GPU computing framework for integrating GPU-based computing within a distributed environment. Within this framework, 1) for each single computer, both GPU-based and CPU-based computing resources can be fully utilized to improve the performance of visualizing and processing Big Data; 2) within a network environment, a variety of computers can be used to build up a virtual supercomputer to support CPU-based and GPU-based computing in a distributed computing environment; 3) GPUs, as specific graphics-targeted devices, are used to greatly improve the rendering efficiency in distributed geo-visualization, especially for 3D/4D visualization. Key words: Geovisualization, GIScience, Spatiotemporal Studies. References: 1. Ferreira de Oliveira, M. C., & Levkowitz, H. (2003). From visual data exploration to visual data mining: A survey. Visualization and Computer Graphics, IEEE Transactions on, 9(3), 378-394. 2. Li, J., Jiang, Y., Yang, C., Huang, Q., & Rice, M. (2013). Visualizing 3D/4D Environmental Data Using Many-core Graphics Processing Units (GPUs) and Multi-core Central Processing Units (CPUs). Computers & Geosciences, 59(9), 78-89. 3. Owens, J. D., Houston, M., Luebke, D., Green, S., Stone, J. E., & Phillips, J. C. (2008). GPU computing. Proceedings of the IEEE, 96(5), 879-899.

  8. Computationally Efficient Adaptive Beamformer for Ultrasound Imaging Based on QR Decomposition.

    PubMed

    Park, Jongin; Wi, Seok-Min; Lee, Jin S

    2016-02-01

    Adaptive beamforming methods for ultrasound imaging have been studied to improve image resolution and contrast. The most common approach is the minimum variance (MV) beamformer, which minimizes the power of the beamformed output while keeping the response from the direction of interest constant. The method achieves higher resolution and better contrast than the delay-and-sum (DAS) beamformer, but it suffers from high computational cost. This cost is mainly due to the computation of the spatial covariance matrix and its inverse, which requires O(L^3) computations, where L denotes the subarray size. In this study, we propose a computationally efficient MV beamformer based on QR decomposition. The idea behind our approach is to transform the spatial covariance matrix into a scalar matrix σI, and we subsequently obtain the apodization weights and the beamformed output without computing the matrix inverse. To do so, a QR decomposition algorithm is used, which can be executed at low cost, and therefore the computational complexity is reduced to O(L^2). In addition, our approach is mathematically equivalent to the conventional MV beamformer and thus shows equivalent performance. The simulation and experimental results support the validity of our approach.
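
    For orientation, the conventional MV (Capon) beamformer that this record takes as its baseline can be written in its standard textbook form. The symbols below are assumptions introduced here, not notation from the abstract: R is the L×L spatial covariance matrix, a is the steering vector for the direction of interest, and w is the vector of apodization weights. This is the conventional formulation, not the QR-based variant the authors propose.

```latex
\begin{aligned}
\mathbf{w}_{\mathrm{MV}} &= \arg\min_{\mathbf{w}} \; \mathbf{w}^{H}\mathbf{R}\,\mathbf{w}
\quad \text{subject to} \quad \mathbf{w}^{H}\mathbf{a} = 1, \\
\mathbf{w}_{\mathrm{MV}} &= \frac{\mathbf{R}^{-1}\mathbf{a}}{\mathbf{a}^{H}\mathbf{R}^{-1}\mathbf{a}} .
\end{aligned}
```

    Forming R^{-1} (or solving the equivalent linear system) for a dense L×L matrix is the cubic-cost step that the abstract identifies as the bottleneck.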

  9. Automated dynamic analytical model improvement for damped structures

    NASA Technical Reports Server (NTRS)

    Fuh, J. S.; Berman, A.

    1985-01-01

    A method is described to improve a linear nonproportionally damped analytical model of a structure. The procedure finds the smallest changes in the analytical model such that the improved model matches the measured modal parameters. Features of the method are: (1) ability to properly treat complex valued modal parameters of a damped system; (2) applicability to realistically large structural models; and (3) computational efficiency without involving eigensolutions and inversion of a large matrix.

  10. Multidisciplinary design optimization using multiobjective formulation techniques

    NASA Technical Reports Server (NTRS)

    Chattopadhyay, Aditi; Pagaldipti, Narayanan S.

    1995-01-01

    This report addresses the development of a multidisciplinary optimization procedure using an efficient semi-analytical sensitivity analysis technique and multilevel decomposition for the design of aerospace vehicles. A semi-analytical sensitivity analysis procedure is developed for calculating computational grid sensitivities and aerodynamic design sensitivities. Accuracy and efficiency of the sensitivity analysis procedure are established through comparison of the results with those obtained using a finite difference technique. The developed sensitivity analysis technique is then used within a multidisciplinary optimization procedure for designing aerospace vehicles. The optimization problem, with the integration of aerodynamics and structures, is decomposed into two levels. Optimization is performed for improved aerodynamic performance at the first level and improved structural performance at the second level. Aerodynamic analysis is performed by solving the three-dimensional parabolized Navier Stokes equations. A nonlinear programming technique and an approximate analysis procedure are used for optimization. The procedure developed is applied to design the wing of a high speed aircraft. Results obtained show significant improvements in the aircraft aerodynamic and structural performance when compared to a reference or baseline configuration. The use of the semi-analytical sensitivity technique provides significant computational savings.

  11. Implementing the Data Center Energy Productivity Metric in a High Performance Computing Data Center

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sego, Landon H.; Marquez, Andres; Rawson, Andrew

    2013-06-30

    As data centers proliferate in size and number, the improvement of their energy efficiency and productivity has become an economic and environmental imperative. Making these improvements requires metrics that are robust, interpretable, and practical. We discuss the properties of a number of the proposed metrics of energy efficiency and productivity. In particular, we focus on the Data Center Energy Productivity (DCeP) metric, which is the ratio of useful work produced by the data center to the energy consumed performing that work. We describe our approach for using DCeP as the principal outcome of a designed experiment using a highly instrumented, high-performance computing data center. We found that DCeP was successful in clearly distinguishing different operational states in the data center, thereby validating its utility as a metric for identifying configurations of hardware and software that would improve energy productivity. We also discuss some of the challenges and benefits associated with implementing the DCeP metric, and we examine the efficacy of the metric in making comparisons within a data center and between data centers.
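
    The ratio that defines DCeP can be illustrated with a minimal Python sketch. The function name `dcep`, the choice of kWh as the energy unit, and the two operational states with their numbers are all hypothetical, chosen only to show the arithmetic; the abstract does not say how "useful work" is measured.

```python
def dcep(useful_work, energy_kwh):
    """Return Data Center Energy Productivity: useful work per unit energy.

    Both arguments must cover the same assessment window. The unit of
    `useful_work` (for example, completed jobs) is an assumption left to
    the caller; the source abstract does not fix it.
    """
    if energy_kwh <= 0:
        raise ValueError("energy_kwh must be positive")
    return useful_work / energy_kwh


# Hypothetical operational states: (label, useful work, energy in kWh).
states = [("baseline", 1200.0, 400.0), ("tuned", 1200.0, 300.0)]
for label, work, energy in states:
    print(f"{label}: DCeP = {dcep(work, energy):.2f} work units per kWh")
```

    In this made-up comparison the same amount of work is delivered for less energy in the second state, so its DCeP is higher; this is the kind of distinction between operational states that the abstract reports.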

  12. Visualization assisted by parallel processing

    NASA Astrophysics Data System (ADS)

    Lange, B.; Rey, H.; Vasques, X.; Puech, W.; Rodriguez, N.

    2011-01-01

    This paper discusses the experimental results of our visualization model for data extracted from sensors. The objective of this paper is to find a computationally efficient method to produce a real-time rendering visualization for a large amount of data. We develop a visualization method to monitor the temperature variance of a data center. Sensors are placed on three layers and do not cover the whole room. We use a particle paradigm to interpolate sensor data. Particles model the "space" of the room. In this work we use a partition of the particle set, using two mathematical methods: Delaunay triangulation and Voronoi cells. These two algorithms are presented by Avis and Bhattacharya. Particles provide information on the room temperature at different coordinates over time. To locate and update particle data we define a computational cost function. To solve this function in an efficient way, we use a client-server paradigm. The server computes data and the client displays this data on different kinds of hardware. This paper is organized as follows. The first part presents related algorithms used to visualize large flows of data. The second part presents the different platforms and methods used, which were evaluated in order to determine the best solution for the proposed task. The benchmark uses the computational cost of our algorithm, which is based on locating particles relative to sensors and on updating particle values. The benchmark was run on a personal computer using CPU, multi-core programming, GPU programming and hybrid GPU/CPU. GPU programming is a growing method in the research field; it allows real-time rendering instead of precomputed rendering. To improve our results, we ran our algorithm on a High Performance Computing (HPC) system; this benchmark was used to improve the multi-core method. HPC is commonly used in data visualization (astronomy, physics, etc.) for improving rendering and achieving real-time performance.

  13. GPU-Accelerated Voxelwise Hepatic Perfusion Quantification

    PubMed Central

    Wang, H; Cao, Y

    2012-01-01

    Voxelwise quantification of hepatic perfusion parameters from dynamic contrast enhanced (DCE) imaging greatly contributes to assessment of liver function in response to radiation therapy. However, the efficiency of the estimation of hepatic perfusion parameters voxel-by-voxel in the whole liver using a dual-input single-compartment model requires substantial improvement for routine clinical applications. In this paper, we utilize the parallel computation power of a graphics processing unit (GPU) to accelerate the computation, while maintaining the same accuracy as the conventional method. Using CUDA-GPU, the hepatic perfusion computations over multiple voxels are run across the GPU blocks concurrently but independently. At each voxel, non-linear least squares fitting of the time series of the liver DCE data to the compartmental model is distributed to multiple threads in a block, and the computations of different time points are performed simultaneously and synchronously. An efficient fast Fourier transform in a block is also developed for the convolution computation in the model. The GPU computations of the voxel-by-voxel hepatic perfusion images are compared with those by the CPU using the simulated DCE data and the experimental DCE MR images from patients. The computation speed is improved by 30 times using an NVIDIA Tesla C2050 GPU compared to a 2.67 GHz Intel Xeon CPU processor. To obtain liver perfusion maps with 626400 voxels in a patient’s liver, it takes 0.9 min with the GPU-accelerated voxelwise computation, compared to 110 min with the CPU, while the two methods yield perfusion parameter differences of less than 10^-6. The method will be useful for generating liver perfusion images in clinical settings. PMID:22892645

  14. Rational design of an enzyme mutant for anti-cocaine therapeutics

    NASA Astrophysics Data System (ADS)

    Zheng, Fang; Zhan, Chang-Guo

    2008-09-01

    (-)-Cocaine is a widely abused drug and there is no available anti-cocaine therapeutic. The disastrous medical and social consequences of cocaine addiction have made the development of an effective pharmacological treatment a high priority. An ideal anti-cocaine medication would accelerate (-)-cocaine metabolism, producing biologically inactive metabolites. The main metabolic pathway of cocaine in the body is hydrolysis at its benzoyl ester group. Reviewed in this article is the state-of-the-art computational design of high-activity mutants of human butyrylcholinesterase (BChE) against (-)-cocaine. The computational design of BChE mutants has been based on not only the structure of the enzyme, but also the detailed catalytic mechanisms for BChE-catalyzed hydrolysis of (-)-cocaine and (+)-cocaine. Computational studies of the detailed catalytic mechanisms and the structure-and-mechanism-based computational design have been carried out through the combined use of a variety of state-of-the-art techniques of molecular modeling. By using the computational insights into the catalytic mechanisms, a recently developed unique computational design strategy based on the simulation of the rate-determining transition state has been employed to design high-activity mutants of human BChE for hydrolysis of (-)-cocaine, leading to the exciting discovery of BChE mutants with a considerably improved catalytic efficiency against (-)-cocaine. One of the discovered BChE mutants (i.e., A199S/S287G/A328W/Y332G) has a ~456-fold improved catalytic efficiency against (-)-cocaine. The encouraging outcome of the computational design and discovery effort demonstrates that the unique computational design approach based on the transition-state simulation is promising for rational enzyme redesign and drug discovery.

  15. Dust cyclone research in the 21st century

    USDA-ARS's Scientific Manuscript database

    Research to meet the demand for ever more efficient dust cyclones continues after some eighty years. Recent trends emphasize design optimization through computational fluid dynamics (CFD) and testing design subtleties not modeled by semi-empirical equations. Improvements to current best available ...

  16. Interplanetary Trajectories, Encke Method (ITEM)

    NASA Technical Reports Server (NTRS)

    Whitlock, F. H.; Wolfe, H.; Lefton, L.; Levine, N.

    1972-01-01

    Modified program has been developed using improved variation of Encke method which avoids accumulation of round-off errors and avoids numerical ambiguities arising from near-circular orbits of low inclination. Variety of interplanetary trajectory problems can be computed with maximum accuracy and efficiency.

  17. A hybrid fuzzy logic and extreme learning machine for improving efficiency of circulating water systems in power generation plant

    NASA Astrophysics Data System (ADS)

    Aziz, Nur Liyana Afiqah Abdul; Siah Yap, Keem; Afif Bunyamin, Muhammad

    2013-06-01

    This paper presents a new approach to fault detection for improving the efficiency of the circulating water system (CWS) in a power generation plant using a hybrid Fuzzy Logic System (FLS) and Extreme Learning Machine (ELM) neural network. The FLS is a mathematical tool for calculating the uncertainties where precision and significance are applied in the real world. It is based on natural language which has the ability of "computing the word". The ELM is an extremely fast learning algorithm for neural networks that can complete the training cycle in a very short time. By combining the FLS and ELM, a new hybrid model, i.e., FLS-ELM, is developed. The applicability of this proposed hybrid model is validated in fault detection in the CWS, which may help to improve the overall efficiency of a power generation plant, hence consuming fewer natural resources and producing less pollution.

  18. Computer-aided programming for message-passing system; Problems and a solution

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, M.Y.; Gajski, D.D.

    1989-12-01

    As the number of processors and the complexity of problems to be solved increase, programming multiprocessing systems becomes more difficult and error-prone. Program development tools are necessary since programmers are not able to develop complex parallel programs efficiently. Parallel models of computation, parallelization problems, and tools for computer-aided programming (CAP) are discussed. As an example, a CAP tool that performs scheduling and inserts communication primitives automatically is described. It also generates the performance estimates and other program quality measures to help programmers in improving their algorithms and programs.

  19. Computer-assisted design of organic synthesis

    NASA Technical Reports Server (NTRS)

    Kaminaka, H.

    1986-01-01

    Computer programs to design synthetic pathways of organic compounds have been utilized throughout the world since the first system was reported by Corey in 1969, and LHASA, reported in 1972, became the predominant system. Many programs have been reported mainly in the United States and Europe, and groups of corporations, especially chemical companies, have been trying to improve programs and increase the efficiency of research. In Japan, unfortunately, no concrete movement in this area has been seen. Of course, it goes without saying that these kinds of programs are effective for efficient research, but the remarkable aspect is that these can present unexpected data to the researchers to stimulate them to develop new ideas.

  20. An efficient quantum circuit analyser on qubits and qudits

    NASA Astrophysics Data System (ADS)

    Loke, T.; Wang, J. B.

    2011-10-01

    This paper presents a highly efficient decomposition scheme and its associated Mathematica notebook for the analysis of complicated quantum circuits comprised of single/multiple qubit and qudit quantum gates. In particular, this scheme reduces the evaluation of multiple unitary gate operations with many conditionals to just two matrix additions, regardless of the number of conditionals or gate dimensions. This significantly improves the capability of a quantum circuit analyser implemented on a classical computer. This is also the first efficient quantum circuit analyser to include qudit quantum logic gates.

  1. A computationally efficient parallel Levenberg-Marquardt algorithm for highly parameterized inverse model analyses

    NASA Astrophysics Data System (ADS)

    Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.

    2016-09-01

    Inverse modeling seeks model parameters given a set of observations. However, for practical problems, because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2-D and a random hydraulic conductivity field in 3-D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multicore computational environment. Therefore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate to large-scale problems.
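
    The damped linear solve that the abstract refers to can be illustrated with a minimal sketch. This is a textbook Levenberg-Marquardt loop for a hypothetical two-parameter model y = a*exp(b*x), not the authors' Julia/MADS code; it does not include the Krylov subspace projection or subspace recycling, and the names `levenberg_marquardt` and `fit_exponential` are assumptions made for illustration only.

```python
import math


def levenberg_marquardt(residual, jacobian, p, lam=1e-2, iters=100):
    """Minimise sum(residual(p)**2) for a two-parameter model (illustrative)."""
    def cost(q):
        return sum(r * r for r in residual(q))

    for _ in range(iters):
        r = residual(p)
        J = jacobian(p)
        # Normal-equation pieces: A = J^T J and g = J^T r.
        a11 = sum(j[0] * j[0] for j in J)
        a12 = sum(j[0] * j[1] for j in J)
        a22 = sum(j[1] * j[1] for j in J)
        g1 = sum(j[0] * ri for j, ri in zip(J, r))
        g2 = sum(j[1] * ri for j, ri in zip(J, r))
        # Damped system (A + lam*I) d = -g, solved by Cramer's rule.
        d11, d22 = a11 + lam, a22 + lam
        det = d11 * d22 - a12 * a12
        d1 = (-g1 * d22 + g2 * a12) / det
        d2 = (-g2 * d11 + g1 * a12) / det
        trial = (p[0] + d1, p[1] + d2)
        if cost(trial) < cost(p):
            p, lam = trial, lam * 0.5   # step accepted: relax the damping
        else:
            lam *= 10.0                 # step rejected: increase the damping
    return p


def fit_exponential(xs, ys, start):
    """Fit y = a*exp(b*x) to data (hypothetical example model)."""
    def residual(p):
        return [p[0] * math.exp(p[1] * x) - y for x, y in zip(xs, ys)]

    def jacobian(p):
        # Partial derivatives of each residual with respect to a and b.
        return [(math.exp(p[1] * x), p[0] * x * math.exp(p[1] * x)) for x in xs]

    return levenberg_marquardt(residual, jacobian, start)
```

    Note that A and g above do not depend on the damping parameter, so they are reused unchanged whenever a step is rejected; the subspace recycling described in the abstract exploits a similar observation for much larger systems.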

  2. Parallel computing method for simulating hydrological processes of large rivers under climate change

    NASA Astrophysics Data System (ADS)

    Wang, H.; Chen, Y.

    2016-12-01

    Climate change is one of the best-known global environmental problems. Climate change has altered watershed hydrological processes in their distribution over time and space, especially in the world's large rivers. Watershed hydrological process simulation based on a physically based distributed hydrological model can give better results than lumped models. However, watershed hydrological process simulation involves a large amount of calculation, especially for large rivers, and thus needs huge computing resources that may not be steadily available to researchers or are available only at high expense; this has seriously restricted research and application. To solve this problem, current parallel methods mostly parallelize the computation in the space and time dimensions. They calculate the natural features in order, based on the distributed hydrological model, by grid (unit, or basin) from upstream to downstream. This article proposes a high-performance computing method for hydrological process simulation with a high speedup ratio and parallel efficiency. It combines the temporal and spatial runoff characteristics of the distributed hydrological model with methods that adopt distributed data storage, an in-memory database, distributed computing, and parallel computing based on computing power units. The method has strong adaptability and extensibility, which means it can make full use of the computing and storage resources under the condition of limited computing resources, and the computing efficiency can be improved linearly with the increase of computing resources. This method can satisfy the parallel computing requirements of hydrological process simulation in small, medium and large rivers.

  3. Aerodynamic optimization studies on advanced architecture computers

    NASA Technical Reports Server (NTRS)

    Chawla, Kalpana

    1995-01-01

    The approach to carrying out multi-discipline aerospace design studies in the future, especially in massively parallel computing environments, comprises choosing (1) suitable solvers to compute solutions to equations characterizing a discipline, and (2) efficient optimization methods. In addition, for aerodynamic optimization problems, (3) smart methodologies must be selected to modify the surface shape. In this research effort, a 'direct' optimization method is implemented on the Cray C-90 to improve aerodynamic design. It is coupled with an existing implicit Navier-Stokes solver, OVERFLOW, to compute flow solutions. The optimization method is chosen such that it can accommodate multi-discipline optimization in future computations. In this work, however, only single discipline aerodynamic optimization will be included.

  4. Meshfree and efficient modeling of swimming cells

    NASA Astrophysics Data System (ADS)

    Gallagher, Meurig T.; Smith, David J.

    2018-05-01

    Locomotion in Stokes flow is an intensively studied problem because it describes important biological phenomena such as the motility of many species' sperm, bacteria, algae, and protozoa. Numerical computations can be challenging, particularly in three dimensions, due to the presence of moving boundaries and complex geometries; methods which combine ease of implementation and computational efficiency are therefore needed. A recently proposed method to discretize the regularized Stokeslet boundary integral equation without the need for a connected mesh is applied to the inertialess locomotion problem in Stokes flow. The mathematical formulation and key aspects of the computational implementation in matlab® or GNU Octave are described, followed by numerical experiments with biflagellate algae and multiple uniflagellate sperm swimming between no-slip surfaces, for which both swimming trajectories and flow fields are calculated. These computational experiments required minutes of time on modest hardware; an extensible implementation is provided in a GitHub repository. The nearest-neighbor discretization dramatically improves convergence and robustness, a key challenge in extending the regularized Stokeslet method to complicated three-dimensional biological fluid problems.

  5. VOFTools - A software package of calculation tools for volume of fluid methods using general convex grids

    NASA Astrophysics Data System (ADS)

    López, J.; Hernández, J.; Gómez, P.; Faura, F.

    2018-02-01

    The VOFTools library includes efficient analytical and geometrical routines for (1) area/volume computation, (2) truncation operations that typically arise in VOF (volume of fluid) methods, (3) area/volume conservation enforcement (VCE) in PLIC (piecewise linear interface calculation) reconstruction and (4) computation of the distance from a given point to the reconstructed interface. The computation of a polyhedron volume uses an efficient formula based on a quadrilateral decomposition and a 2D projection of each polyhedron face. The analytical VCE method is based on coupling an interpolation procedure to bracket the solution with an improved final calculation step based on the above volume computation formula. Although the library was originally created to help develop highly accurate advection and reconstruction schemes in the context of VOF methods, it may have more general applications. To assess the performance of the supplied routines, different tests, which are provided in FORTRAN and C, were implemented for several 2D and 3D geometries.
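
    As a rough illustration of what a polyhedron volume routine computes, the sketch below uses the standard divergence-theorem formula with a fan triangulation of each face. It is not the quadrilateral-decomposition and 2D-projection formula that the abstract attributes to VOFTools, and the function name `polyhedron_volume` and the input layout are assumptions chosen for this example. Faces must list their vertex indices counterclockwise when viewed from outside.

```python
def polyhedron_volume(vertices, faces):
    """Volume of a closed polyhedron whose faces are outward-oriented polygons."""
    def cross(u, v):
        return (u[1] * v[2] - u[2] * v[1],
                u[2] * v[0] - u[0] * v[2],
                u[0] * v[1] - u[1] * v[0])

    def dot(u, v):
        return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

    total = 0.0
    for face in faces:
        v0 = vertices[face[0]]
        # Fan triangulation: (v0, v_i, v_{i+1}) for each consecutive pair.
        for i in range(1, len(face) - 1):
            v1 = vertices[face[i]]
            v2 = vertices[face[i + 1]]
            # Signed volume of the tetrahedron (origin, v0, v1, v2), times 6.
            total += dot(v0, cross(v1, v2))
    return total / 6.0


# Unit cube with outward-oriented quadrilateral faces.
CUBE_VERTICES = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
                 (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
CUBE_FACES = [(0, 3, 2, 1), (4, 5, 6, 7), (0, 1, 5, 4),
              (3, 7, 6, 2), (0, 4, 7, 3), (1, 2, 6, 5)]
```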

  6. An Improved Computational Technique for Calculating Electromagnetic Forces and Power Absorptions Generated in Spherical and Deformed Body in Levitation Melting Devices

    NASA Technical Reports Server (NTRS)

    Zong, Jin-Ho; Szekely, Julian; Schwartz, Elliot

    1992-01-01

    An improved computational technique for calculating the electromagnetic force field, the power absorption and the deformation of an electromagnetically levitated metal sample is described. The technique is based on the volume integral method, but represents a substantial refinement; the coordinate transformation employed allows the efficient treatment of a broad class of rotationally symmetrical bodies. Computed results are presented to represent the behavior of levitation melted metal samples in a multi-coil, multi-frequency levitation unit to be used in microgravity experiments. The theoretical predictions are compared with both analytical solutions and with the results of previous computational efforts for the spherical samples and the agreement has been very good. The treatment of problems involving deformed surfaces and actually predicting the deformed shape of the specimens breaks new ground and should be the major usefulness of the proposed method.

  7. Zero-block mode decision algorithm for H.264/AVC.

    PubMed

    Lee, Yu-Ming; Lin, Yinyi

    2009-03-01

    In the previous paper, we proposed a zero-block intermode decision algorithm for H.264 video coding based upon the number of zero-blocks of 4 x 4 DCT coefficients between the current macroblock and the co-located macroblock. The proposed algorithm can achieve significant improvement in computation, but the computation performance is limited for high bit-rate coding. To improve computation efficiency, in this paper, we suggest an enhanced zero-block decision algorithm, which uses an early zero-block detection method to compute the number of zero-blocks instead of direct DCT and quantization (DCT/Q) calculation and incorporates two adequate decision methods into semi-stationary and nonstationary regions of a video sequence. In addition, the zero-block decision algorithm is also applied to the intramode prediction in the P frame. The enhanced zero-block decision algorithm yields an average reduction of 27% in total encoding time compared to the zero-block decision algorithm.

  8. Accelerated computer generated holography using sparse bases in the STFT domain.

    PubMed

    Blinder, David; Schelkens, Peter

    2018-01-22

    Computer-generated holography at high resolutions is a computationally intensive task. Efficient algorithms are needed to generate holograms at acceptable speeds, especially for real-time and interactive applications such as holographic displays. We propose a novel technique to generate holograms using a sparse basis representation in the short-time Fourier space combined with a wavefront-recording plane placed in the middle of the 3D object. By computing the point spread functions in the transform domain, we update only a small subset of the precomputed largest-magnitude coefficients to significantly accelerate the algorithm over conventional look-up table methods. We implement the algorithm on a GPU, and report a speedup factor of over 30. We show that this transform is superior to wavelet-based approaches, and show quantitative and qualitative improvements over the state-of-the-art WASABI method; we report accuracy gains of 2 dB PSNR, as well as improved view preservation.

  9. Users' manual for computer program for three-dimensional analysis of coupler-cavity traveling wave tubes

    NASA Technical Reports Server (NTRS)

    Omalley, T. A.

    1984-01-01

    The use of the coupled cavity traveling wave tube for space communications has led to an increased interest in improving the efficiency of the basic interaction process in these devices through velocity resynchronization and other methods. A flexible, three dimensional, axially symmetric, large signal computer program was developed for use on the IBM 370 time sharing system. A users' manual for this program is included.

  10. Electrooptical adaptive switching network for the hypercube computer

    NASA Technical Reports Server (NTRS)

    Chow, E.; Peterson, J.

    1988-01-01

    An all-optical network design for the hyperswitch network using regular free-space interconnects between electronic processor nodes is presented. The adaptive routing model used is described, and an adaptive routing control example is presented. The design demonstrates that existing electrooptical techniques are sufficient for implementing efficient parallel architectures without the need for more complex means of implementing arbitrary interconnection schemes. The electrooptical hyperswitch network significantly improves the communication performance of the hypercube computer.

  11. Improving Computational Efficiency of Prediction in Model-based Prognostics Using the Unscented Transform

    DTIC Science & Technology

    2010-10-01

    bodies becomes greater as surface asperities wear down (Hutchings, 1992). We characterize friction damage by a change in the friction coefficient...points are such a set, and satisfy an additional constraint in which the skew (third moment) is minimized, which reduces the average error for a...On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10, 197–208. Hutchings, I. M. (1992). Tribology: friction
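
    Since the record above survives only as fragments, the following is a generic sketch of the unscented transform named in the title, not the paper's prognostics code. It assumes a scalar state and the common symmetric sigma-point set with a scaling parameter `kappa`; the function name `unscented_transform_1d` is hypothetical.

```python
import math


def unscented_transform_1d(mean, var, f, kappa=2.0):
    """Propagate a scalar mean and variance through f using three sigma points."""
    n = 1  # state dimension
    spread = math.sqrt((n + kappa) * var)
    # Symmetric sigma points: the mean plus one point on either side.
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    ys = [f(x) for x in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var
```

    Only three evaluations of `f` are needed here, which is the source of the efficiency gain over sampling-based prediction; for a linear `f` the propagated mean and variance are exact.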

  12. Fault Tolerant Parallel Implementations of Iterative Algorithms for Optimal Control Problems

    DTIC Science & Technology

    1988-01-21

    p/.V)] steps, but did not discuss any specific parallel implementation. Gajski [5] improved upon this result by performing the SIMD computation in...N = p2. our approach reduces to that of [5], except that Gajski presents the coefficient computation and partial solution phases as a single...the SIMD algorithm presented by Gajski [5] can be most efficiently mapped to a unidirectional ring network with broadcasting capability. Based

  13. First-in-Man Computed Tomography-Guided Percutaneous Revascularization of Coronary Chronic Total Occlusion Using a Wearable Computer: Proof of Concept.

    PubMed

    Opolski, Maksymilian P; Debski, Artur; Borucki, Bartosz A; Szpak, Marcin; Staruch, Adam D; Kepka, Cezary; Witkowski, Adam

    2016-06-01

    We report a case of successful computed tomography-guided percutaneous revascularization of a chronically occluded right coronary artery using a wearable, hands-free computer with a head-mounted display worn by interventional cardiologists in the catheterization laboratory. The projection of 3-dimensional computed tomographic reconstructions onto the screen of virtual reality glass allowed the operators to clearly visualize the distal coronary vessel, and verify the direction of the guide wire advancement relative to the course of the occluded vessel segment. This case provides proof of concept that wearable computers can improve operator comfort and procedure efficiency in interventional cardiology. Copyright © 2016 Canadian Cardiovascular Society. Published by Elsevier Inc. All rights reserved.

  14. Estimation of Radiative Efficiency of Chemicals with Potentially Significant Global Warming Potential.

    PubMed

    Betowski, Don; Bevington, Charles; Allison, Thomas C

    2016-01-19

    Halogenated chemical substances are used in a broad array of applications, and new chemical substances are continually being developed and introduced into commerce. While recent research has considerably increased our understanding of the global warming potentials (GWPs) of multiple individual chemical substances, this research inevitably lags behind the development of new chemical substances. There are currently over 200 substances known to have high GWP. Evaluation of schemes to estimate radiative efficiency (RE) based on computational chemistry are useful where no measured IR spectrum is available. This study assesses the reliability of values of RE calculated using computational chemistry techniques for 235 chemical substances against the best available values. Computed vibrational frequency data is used to estimate RE values using several Pinnock-type models, and reasonable agreement with reported values is found. Significant improvement is obtained through scaling of both vibrational frequencies and intensities. The effect of varying the computational method and basis set used to calculate the frequency data is discussed. It is found that the vibrational intensities have a strong dependence on basis set and are largely responsible for differences in computed RE values.

  15. REVEAL: An Extensible Reduced Order Model Builder for Simulation and Modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Agarwal, Khushbu; Sharma, Poorva; Ma, Jinliang

    2013-04-30

    Many science domains need to build computationally efficient and accurate representations of high fidelity, computationally expensive simulations. These computationally efficient versions are known as reduced-order models. This paper presents the design and implementation of a novel reduced-order model (ROM) builder, the REVEAL toolset. This toolset generates ROMs based on science- and engineering-domain specific simulations executed on high performance computing (HPC) platforms. The toolset encompasses a range of sampling and regression methods that can be used to generate a ROM, automatically quantifies the ROM accuracy, and provides support for an iterative approach to improve ROM accuracy. REVEAL is designed to be extensible in order to utilize the core functionality with any simulator that has published input and output formats. It also defines programmatic interfaces to include new sampling and regression techniques so that users can ‘mix and match’ mathematical techniques to best suit the characteristics of their model. In this paper, we describe the architecture of REVEAL and demonstrate its usage with a computational fluid dynamics model used in carbon capture.

  16. Encoding neural and synaptic functionalities in electron spin: A pathway to efficient neuromorphic computing

    NASA Astrophysics Data System (ADS)

    Sengupta, Abhronil; Roy, Kaushik

    2017-12-01

    Present day computers expend orders of magnitude more computational resources to perform various cognitive and perception related tasks that humans routinely perform every day. This has recently resulted in a seismic shift in the field of computation where research efforts are being directed to develop a neurocomputer that attempts to mimic the human brain by nanoelectronic components and thereby harness its efficiency in recognition problems. Bridging the gap between neuroscience and nanoelectronics, this paper attempts to provide a review of the recent developments in the field of spintronic device based neuromorphic computing. A description of various spin-transfer torque mechanisms that can be potentially utilized for realizing device structures mimicking neural and synaptic functionalities is provided. A cross-layer perspective extending from the device to the circuit and system level is presented to envision the design of an All-Spin neuromorphic processor enabled with on-chip learning functionalities. A device-circuit-algorithm co-simulation framework calibrated to experimental results suggests that such All-Spin neuromorphic systems can potentially achieve almost two orders of magnitude energy improvement in comparison to state-of-the-art CMOS implementations.

  17. Decomposition method for fast computation of gigapixel-sized Fresnel holograms on a graphics processing unit cluster.

    PubMed

    Jackin, Boaz Jessie; Watanabe, Shinpei; Ootsu, Kanemitsu; Ohkawa, Takeshi; Yokota, Takashi; Hayasaki, Yoshio; Yatagai, Toyohiko; Baba, Takanobu

    2018-04-20

    A parallel computation method for large-size Fresnel computer-generated hologram (CGH) is reported. The method was introduced by us in an earlier report as a technique for calculating Fourier CGH from 2D object data. In this paper we extend the method to compute Fresnel CGH from 3D object data. The scale of the computation problem is also expanded to 2 gigapixels, making it closer to real application requirements. The significant feature of the reported method is its ability to avoid communication overhead and thereby fully utilize the computing power of parallel devices. The method exhibits three layers of parallelism that favor small to large scale parallel computing machines. Simulation and optical experiments were conducted to demonstrate the workability and to evaluate the efficiency of the proposed technique. A two-times improvement in computation speed has been achieved compared to the conventional method, on a 16-node cluster (one GPU per node) utilizing only one layer of parallelism. A 20-times improvement in computation speed has been estimated utilizing two layers of parallelism on a very large-scale parallel machine with 16 nodes, where each node has 16 GPUs.

  18. Ultrasound window-modulated compounding Nakagami imaging: Resolution improvement and computational acceleration for liver characterization.

    PubMed

    Ma, Hsiang-Yang; Lin, Ying-Hsiu; Wang, Chiao-Yin; Chen, Chiung-Nien; Ho, Ming-Chih; Tsui, Po-Hsiang

    2016-08-01

    Ultrasound Nakagami imaging is an attractive method for visualizing changes in envelope statistics. Window-modulated compounding (WMC) Nakagami imaging was reported to improve image smoothness. The sliding window technique is typically used for constructing ultrasound parametric and Nakagami images. Using a large window overlap ratio may improve the WMC Nakagami image resolution but reduces computational efficiency. Therefore, the objectives of this study include: (i) exploring the effects of the window overlap ratio on the resolution and smoothness of WMC Nakagami images; (ii) proposing a fast algorithm that is based on the convolution operator (FACO) to accelerate WMC Nakagami imaging. Computer simulations and preliminary clinical tests on liver fibrosis samples (n=48) were performed to validate the FACO-based WMC Nakagami imaging. The results demonstrated that the width of the autocorrelation function and the parameter distribution of the WMC Nakagami image reduce with the increase in the window overlap ratio. One-pixel shifting (i.e., sliding the window on the image data in steps of one pixel for parametric imaging) as the maximum overlap ratio significantly improves the WMC Nakagami image quality. Concurrently, the proposed FACO method combined with a computational platform that optimizes the matrix computation can accelerate WMC Nakagami imaging, allowing the detection of liver fibrosis-induced changes in envelope statistics. FACO-accelerated WMC Nakagami imaging is a new-generation Nakagami imaging technique with an improved image quality and fast computation. Copyright © 2016 Elsevier B.V. All rights reserved.
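
    The relationship between window overlap ratio and computational load can be sketched on a one-dimensional envelope signal. The example below uses the standard moment-based estimate of the Nakagami parameter, m = E[R^2]^2 / Var(R^2); it is not the FACO algorithm or the window-modulated compounding step, and the function names are assumptions for illustration.

```python
def nakagami_m(envelope):
    """Moment-based Nakagami parameter estimate for one window of envelope data."""
    sq = [r * r for r in envelope]
    mean_sq = sum(sq) / len(sq)
    var_sq = sum((s - mean_sq) ** 2 for s in sq) / len(sq)
    # A constant envelope has zero variance; report infinity in that case.
    return mean_sq * mean_sq / var_sq if var_sq > 0 else float("inf")


def sliding_nakagami(envelope, window, overlap_ratio):
    """Estimate m in each window; a higher overlap ratio means a smaller step."""
    step = max(1, round(window * (1.0 - overlap_ratio)))
    return [nakagami_m(envelope[i:i + window])
            for i in range(0, len(envelope) - window + 1, step)]
```

    With a window of 4 samples, an overlap ratio of 0.5 advances two samples at a time, while 0.75 gives one-sample shifting and roughly doubles the number of windows to evaluate, which is the cost that a fast algorithm has to offset.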

  19. Disorder-assisted quantum transport in suboptimal decoherence regimes

    PubMed Central

    Novo, Leonardo; Mohseni, Masoud; Omar, Yasser

    2016-01-01

    We investigate quantum transport in binary tree structures and in hypercubes for the disordered Frenkel-exciton Hamiltonian under pure dephasing noise. We compute the energy transport efficiency as a function of disorder and dephasing rates. We demonstrate that dephasing improves transport efficiency not only in the disordered case, but also in the ordered one. The maximal transport efficiency is obtained when the dephasing timescale matches the hopping timescale, which represent new examples of the Goldilocks principle at the quantum scale. Remarkably, we find that in weak dephasing regimes, away from optimal levels of environmental fluctuations, the average effect of increasing disorder is to improve the transport efficiency until an optimal value for disorder is reached. Our results suggest that rational design of the site energies statistical distributions could lead to better performances in transport systems at nanoscale when their natural environments are far from the optimal dephasing regime. PMID:26726133

  20. Designing overall stoichiometric conversions and intervening metabolic reactions

    DOE PAGES

    Chowdhury, Anupam; Maranas, Costas D.

    2015-11-04

    Existing computational tools for de novo metabolic pathway assembly, either based on mixed integer linear programming techniques or graph-search applications, generally only find linear pathways connecting the source to the target metabolite. The overall stoichiometry of conversion along with alternate co-reactant (or co-product) combinations is not part of the pathway design. Therefore, global carbon and energy efficiency is in essence fixed with no opportunities to identify more efficient routes for recycling carbon flux closer to the thermodynamic limit. Here, we introduce a two-stage computational procedure that both identifies the optimum overall stoichiometry (i.e., optStoic) and selects for (non-)native reactions (i.e., minRxn/minFlux) that maximize carbon, energy or price efficiency while satisfying thermodynamic feasibility requirements. Implementation for recent pathway design studies identified non-intuitive designs with improved efficiencies. Specifically, multiple alternatives for non-oxidative glycolysis are generated and non-intuitive ways of co-utilizing carbon dioxide with methanol are revealed for the production of C2+ metabolites with higher carbon efficiency.

  1. Classification of breast tissue in mammograms using efficient coding.

    PubMed

    Costa, Daniel D; Campos, Lúcio F; Barros, Allan K

    2011-06-24

    Female breast cancer is the major cause of death by cancer in western countries. Efforts in Computer Vision have been made in order to improve the diagnostic accuracy by radiologists. Some methods of lesion diagnosis in mammogram images were developed based on the technique of principal component analysis, which has been used in efficient coding of signals, and on 2D Gabor wavelets, used for computer vision applications and modeling biological vision. In this work, we present a methodology that uses efficient coding along with linear discriminant analysis to distinguish between mass and non-mass from 5090 regions of interest from mammograms. The results show that the best rates of success reached with Gabor wavelets and principal component analysis were 85.28% and 87.28%, respectively. In comparison, the model of efficient coding presented here reached up to 90.07%. Altogether, the results presented demonstrate that independent component analysis successfully performed the efficient coding in order to discriminate mass from non-mass tissues. In addition, we have observed that LDA with ICA bases showed high predictive performance for some datasets and thus provide significant support for a more detailed clinical investigation.

  2. The research of computer multimedia assistant in college English listening

    NASA Astrophysics Data System (ADS)

    Zhang, Qian

    2012-04-01

    With the development of network information technology, education faces increasingly serious questions. Computer multimedia applications break with traditional foreign language teaching and bring new challenges and opportunities for education. Through the use of multiple media, the teaching process is filled with animation, images, voice, and text. This can improve learning initiative and focus and greatly improve learning efficiency. In traditional foreign language teaching, people learn from text. With this method, however, theoretical performance is good but practical application is weak. Even after a long period of computer multimedia use in foreign language teaching, many teachers are still prejudiced against it. Therefore, the method is not achieving its intended effect. In view of the above, this research is significant for improving the quality of foreign language teaching.

  3. Discovering Synergistic Drug Combination from a Computational Perspective.

    PubMed

    Ding, Pingjian; Luo, Jiawei; Liang, Cheng; Xiao, Qiu; Cao, Buwen; Li, Guanghui

    2018-03-30

    Synergistic drug combinations play an important role in the treatment of complex diseases. The identification of effective drug combinations is vital to further reduce side effects and improve therapeutic efficiency. In previous years, in vitro methods have been the main route to discovering synergistic drug combinations. However, in vitro methods are limited by their time and resource consumption. Therefore, with the rapid development of computational models and the explosive growth of large-scale and phenotypic data, computational methods for discovering synergistic drug combinations are an efficient and promising tool and contribute to precision medicine. The key question for computational methods is how to construct the computational model. Different computational strategies yield different performance. In this review, the recent advancements in computational methods for predicting effective drug combinations are summarized from multiple aspects. First, various datasets utilized to discover synergistic drug combinations are summarized. Second, we discuss feature-based approaches and partition these methods into two classes: feature-based methods in terms of similarity measure, and feature-based methods in terms of machine learning. Third, we discuss network-based approaches for uncovering synergistic drug combinations. Finally, we analyze computational methods for predicting effective drug combinations and discuss their prospects. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  4. Near real-time, on-the-move software PED using VPEF

    NASA Astrophysics Data System (ADS)

    Green, Kevin; Geyer, Chris; Burnette, Chris; Agarwal, Sanjeev; Swett, Bruce; Phan, Chung; Deterline, Diane

    2015-05-01

    The scope of the Micro-Cloud for Operational, Vehicle-Based EO-IR Reconnaissance System (MOVERS) development effort, managed by the Night Vision and Electronic Sensors Directorate (NVESD), is to develop, integrate, and demonstrate new sensor technologies and algorithms that improve improvised device/mine detection using efficient and effective exploitation and fusion of sensor data and target cues from existing and future Route Clearance Package (RCP) sensor systems. Unfortunately, the majority of forward looking Full Motion Video (FMV) and computer vision processing, exploitation, and dissemination (PED) algorithms are often developed using proprietary, incompatible software. This makes the insertion of new algorithms difficult due to the lack of standardized processing chains. In order to overcome these limitations, EOIR developed the Government off-the-shelf (GOTS) Video Processing and Exploitation Framework (VPEF) to be able to provide standardized interfaces (e.g., input/output video formats, sensor metadata, and detected objects) for exploitation software and to rapidly integrate and test computer vision algorithms. EOIR developed a vehicle-based computing framework within the MOVERS and integrated it with VPEF. VPEF was further enhanced for automated processing, detection, and publishing of detections in near real-time, thus improving the efficiency and effectiveness of RCP sensor systems.

  5. Nonlinear histogram binning for quantitative analysis of lung tissue fibrosis in high-resolution CT data

    NASA Astrophysics Data System (ADS)

    Zavaletta, Vanessa A.; Bartholmai, Brian J.; Robb, Richard A.

    2007-03-01

    Diffuse lung diseases, such as idiopathic pulmonary fibrosis (IPF), can be characterized and quantified by analysis of volumetric high resolution CT scans of the lungs. These data sets typically have dimensions of 512 x 512 x 400. It is too subjective and labor intensive for a radiologist to analyze each slice and quantify regional abnormalities manually. Thus, computer aided techniques are necessary, particularly texture analysis techniques which classify various lung tissue types. Second and higher order statistics which relate the spatial variation of the intensity values are good discriminatory features for various textures. The intensity values in lung CT scans lie in the range [-1024, 1024]. Calculation of second order statistics on this range is too computationally intensive, so the data are typically binned into 16 or 32 gray levels. There are more effective ways of binning the gray level range to improve classification. An optimal and very efficient way to nonlinearly bin the histogram is to use a dynamic programming algorithm. The objective of this paper is to show that nonlinear binning using dynamic programming is computationally efficient and improves the discriminatory power of the second and higher order statistics for more accurate quantification of diffuse lung disease.
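
    As an illustration of nonlinear histogram binning by dynamic programming, the following is a minimal sketch and not the authors' implementation. The abstract does not state the objective being optimized, so this sketch assumes one: it chooses contiguous bins that minimize the total within-bin squared deviation of the gray levels, weighted by the histogram counts. The function name `optimal_bins` and its parameters are hypothetical.

```python
def optimal_bins(values, counts, k):
    """Split sorted gray levels into k contiguous bins by dynamic programming.

    Assumed objective (not taken from the paper): minimize the count-weighted
    sum of squared deviations from each bin's mean.
    Returns (edges, total_cost); bin b covers values[edges[b]:edges[b + 1]].
    """
    n = len(values)
    # Prefix sums of weight, weight*value and weight*value^2 give O(1) bin costs.
    w, s, q = [0.0], [0.0], [0.0]
    for v, c in zip(values, counts):
        w.append(w[-1] + c)
        s.append(s[-1] + c * v)
        q.append(q[-1] + c * v * v)

    def cost(i, j):
        """Weighted squared error of a single bin covering levels i..j-1."""
        weight = w[j] - w[i]
        if weight == 0:
            return 0.0
        total = s[j] - s[i]
        return (q[j] - q[i]) - total * total / weight

    inf = float("inf")
    # best[b][j]: minimal cost of covering the first j levels with b bins.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for b in range(1, k + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                candidate = best[b - 1][i] + cost(i, j)
                if candidate < best[b][j]:
                    best[b][j] = candidate
                    cut[b][j] = i

    # Walk the stored cut points backwards to recover the bin edges.
    edges = [n]
    j = n
    for b in range(k, 0, -1):
        j = cut[b][j]
        edges.append(j)
    edges.reverse()
    return edges, best[k][n]


levels = [0, 1, 2, 10, 11, 12, 50, 51]
edges, total = optimal_bins(levels, [1] * len(levels), 3)
print(edges, total)
```

    This version runs in O(k n^2) time for n gray levels and k bins. A different cost function (for example an entropy-based one) can be substituted in `cost` without changing the recurrence.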

  6. Approximate Algorithms for Computing Spatial Distance Histograms with Accuracy Guarantees

    PubMed Central

    Grupcev, Vladimir; Yuan, Yongke; Tu, Yi-Cheng; Huang, Jin; Chen, Shaoping; Pandit, Sagar; Weng, Michael

    2014-01-01

    Particle simulation has become an important research tool in many scientific and engineering fields. Data generated by such simulations impose great challenges to database storage and query processing. One of the queries against particle simulation data, the spatial distance histogram (SDH) query, is the building block of many high-level analytics, and requires quadratic time to compute using a straightforward algorithm. Previous work has developed efficient algorithms that compute exact SDHs. While beating the naive solution, such algorithms are still not practical in processing SDH queries against large-scale simulation data. In this paper, we take a different path to tackle this problem by focusing on approximate algorithms with provable error bounds. We first present a solution derived from the aforementioned exact SDH algorithm, and this solution has running time that is unrelated to the system size N. We also develop a mathematical model to analyze the mechanism that leads to errors in the basic approximate algorithm. Our model provides insights on how the algorithm can be improved to achieve higher accuracy and efficiency. Such insights give rise to a new approximate algorithm with improved time/accuracy tradeoff. Experimental results confirm our analysis. PMID:24693210
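
    For reference, the straightforward quadratic-time SDH computation that the abstract uses as its baseline can be sketched as follows. This is only the naive algorithm, not the exact or approximate algorithms developed in the paper; the function name `naive_sdh` and the choice to clamp over-range distances into the last bucket are assumptions made for illustration.

```python
import math


def naive_sdh(points, bucket_width, num_buckets):
    """Spatial distance histogram by visiting every pair of points: O(N^2) time.

    Distances beyond the last bucket are clamped into it (an assumption of
    this sketch).
    """
    hist = [0] * num_buckets
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])  # Euclidean distance
            bucket = min(int(d // bucket_width), num_buckets - 1)
            hist[bucket] += 1
    return hist


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
hist = naive_sdh(points, bucket_width=2.0, num_buckets=3)
print(hist)
```

    The histogram always sums to N(N-1)/2, the number of point pairs, which is what makes the naive approach impractical for large N.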

  7. The Examination of Strength and Weakness of Online Evaluation of Faculty Members Teaching by Students in the University of Isfahan

    ERIC Educational Resources Information Center

    Maryam, Ansary; Alireza, Shavakhi; Reza, Nasr Ahmad; Azizollah, Arbabisarjou

    2012-01-01

    Evaluation of faculty members' teaching is a device for recognition of their ability in teaching, assessing, the student's learning and it can improve efficiency of faculty members in teaching. In terms of growth of computer's technologies improvement of universities and its effect on achievement and information processing, it is necessary to use…

  8. Assisting People with Multiple Disabilities and Minimal Motor Behavior to Improve Computer Pointing Efficiency through a Mouse Wheel

    ERIC Educational Resources Information Center

    Shih, Ching-Hsiang; Chang, Man-Ling; Shih, Ching-Tien

    2009-01-01

    This study evaluated whether two people with multiple disabilities and minimal motor behavior would be able to improve their pointing performance using finger poke ability with a mouse wheel through a Dynamic Pointing Assistive Program (DPAP) and a newly developed mouse driver (i.e., a new mouse driver replaces standard mouse driver, changes a…

  9. Assisting People with Multiple Disabilities Improve Their Computer-Pointing Efficiency with Hand Swing through a Standard Mouse

    ERIC Educational Resources Information Center

    Shih, Ching-Hsiang; Chiu, Sheng-Kai; Chu, Chiung-Ling; Shih, Ching-Tien; Liao, Yung-Kun; Lin, Chia-Chen

    2010-01-01

    This study evaluated whether two people with multiple disabilities would be able to improve their pointing performance using hand swing with a standard mouse through an Extended Dynamic Pointing Assistive Program (EDPAP) and a newly developed mouse driver (i.e., a new mouse driver replaces standard mouse driver, and changes a mouse into a precise…

  10. Dynamic graph cuts for efficient inference in Markov Random Fields.

    PubMed

    Kohli, Pushmeet; Torr, Philip H S

    2007-12-01

    In this paper we present a fast new fully dynamic algorithm for the st-mincut/max-flow problem. We show how this algorithm can be used to efficiently compute MAP solutions for certain dynamically changing MRF models in computer vision such as image segmentation. Specifically, given the solution of the max-flow problem on a graph, the dynamic algorithm efficiently computes the maximum flow in a modified version of the graph. The time taken by it is roughly proportional to the total amount of change in the edge weights of the graph. Our experiments show that, when the number of changes in the graph is small, the dynamic algorithm is significantly faster than the best known static graph cut algorithm. We test the performance of our algorithm on one particular problem: the object-background segmentation problem for video. It should be noted that the application of our algorithm is not limited to the above problem; the algorithm is generic and can be used to yield similar improvements in many other cases that involve dynamic change.

  11. The future challenge for aeropropulsion

    NASA Technical Reports Server (NTRS)

    Rosen, Robert; Bowditch, David N.

    1992-01-01

    NASA's research in aeropropulsion is focused on improving the efficiency, capability, and environmental compatibility for all classes of future aircraft. The development of innovative concepts, and theoretical, experimental, and computational tools provide the knowledge base for continued propulsion system advances. Key enabling technologies include advances in internal fluid mechanics, structures, light-weight high-strength composite materials, and advanced sensors and controls. Recent emphasis has been on the development of advanced computational tools in internal fluid mechanics, structural mechanics, reacting flows, and computational chemistry. For subsonic transport applications, very high bypass ratio turbofans with increased engine pressure ratio are being investigated to increase fuel efficiency and reduce airport noise levels. In a joint supersonic cruise propulsion program with industry, the critical environmental concerns of emissions and community noise are being addressed. NASA is also providing key technologies for the National Aerospaceplane, and is studying propulsion systems that provide the capability for aircraft to accelerate to and cruise in the Mach 4-6 speed range. The combination of fundamental, component, and focused technology development underway at NASA will make possible dramatic advances in aeropropulsion efficiency and environmental compatibility for future aeronautical vehicles.

  12. Update on Bayesian Blocks: Segmented Models for Sequential Data

    NASA Technical Reports Server (NTRS)

    Scargle, Jeff

    2017-01-01

    The Bayesian Block algorithm, in wide use in astronomy and other areas, has been improved in several ways. The model for block shape has been generalized to include other than constant signal rate - e.g., linear, exponential, or other parametric models. In addition the computational efficiency has been improved, so that instead of O(N**2) the basic algorithm is O(N) in most cases. Other improvements in the theory and application of segmented representations will be described.
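
    To make the O(N**2) baseline concrete, here is a minimal sketch of the basic dynamic-programming Bayesian Blocks recursion for event (arrival-time) data with piecewise-constant blocks. It is not the improved O(N) algorithm or the generalized block shapes described above. It assumes the commonly used event-data fitness N_k (ln N_k - ln T_k) for a block with N_k events and width T_k, a constant penalty per block (the hypothetical parameter `ncp_prior`), and at least two distinct event times.

```python
import math


def bayesian_blocks(times, ncp_prior=4.0):
    """Optimal piecewise-constant segmentation of event times, O(N^2).

    Assumes len(times) >= 2 and no duplicate times. Returns the block edges.
    """
    t = sorted(times)
    n = len(t)
    # Candidate edges: the first event, midpoints between events, the last event.
    edges = [t[0]] + [0.5 * (t[i] + t[i + 1]) for i in range(n - 1)] + [t[-1]]

    best = [0.0] * n  # best[k]: optimal total fitness for events 0..k
    last = [0] * n    # last[k]: index of the first event in the final block
    for k in range(n):
        best_fit, best_start = -math.inf, 0
        for start in range(k + 1):
            count = k - start + 1
            width = edges[k + 1] - edges[start]
            fit = count * (math.log(count) - math.log(width)) - ncp_prior
            if start > 0:
                fit += best[start - 1]
            if fit > best_fit:
                best_fit, best_start = fit, start
        best[k], last[k] = best_fit, best_start

    # Backtrack through the stored block starts to recover the change points.
    starts = []
    k = n
    while k > 0:
        k = last[k - 1]
        starts.append(k)
    starts.reverse()
    return [edges[i] for i in starts] + [edges[n]]


# A dense burst of events followed by a sparse tail.
events = [i / 100 for i in range(100)] + [float(j) for j in range(2, 12)]
block_edges = bayesian_blocks(events)
print(block_edges)
```

    The inner loop over `start` is what makes this version quadratic; each new event re-evaluates every possible start of the final block.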

  13. The Role of Flow Field Computation in Improving Turbomachinery.

    DTIC Science & Technology

    1986-06-01

    and total pressure within Rolls-Royce to design turbine blade shapes. loan. The compressors need in engines designed in Moruigesn and ReJily. 7 0 ) wrote... engine improve efficiency. In one case, blading designed specific fuel consumption improved by some I Z. by NOTS for a small industrial turbine manufact...the corners between wall and blade, and in due course also for... Most small aeronautical gas turbines have therefore chosen to use several axial stages

  14. TU-AB-303-08: GPU-Based Software Platform for Efficient Image-Guided Adaptive Radiation Therapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Park, S; Robinson, A; McNutt, T

    2015-06-15

    Purpose: In this study, we develop an integrated software platform for adaptive radiation therapy (ART) that combines fast and accurate image registration, segmentation, and dose computation/accumulation methods. Methods: The proposed system consists of three key components: 1) deformable image registration (DIR), 2) automatic segmentation, and 3) dose computation/accumulation. The computationally intensive modules, including DIR and dose computation, have been implemented on a graphics processing unit (GPU). All required patient-specific data, including the planning CT (pCT) with contours, daily cone-beam CTs (CBCTs), and treatment plan, are automatically queried and retrieved from their own databases. To improve the accuracy of DIR between pCT and CBCTs, we use the double force demons DIR algorithm in combination with iterative CBCT intensity correction by local intensity histogram matching. Segmentation of daily CBCT is then obtained by propagating contours from the pCT. Daily dose delivered to the patient is computed on the registered pCT by a GPU-accelerated superposition/convolution algorithm. Finally, computed daily doses are accumulated to show the total delivered dose to date. Results: Since the accuracy of DIR critically affects the quality of the other processes, we first evaluated our DIR method on eight head-and-neck cancer cases and compared its performance. Normalized mutual-information (NMI) and normalized cross-correlation (NCC) were computed as similarity measures, and our method produced overall NMI of 0.663 and NCC of 0.987, outperforming conventional methods by 3.8% and 1.9%, respectively. Experimental results show that our registration method is more consistent and robust than existing algorithms, and also computationally efficient. Computation time at each fraction took around one minute (30–50 seconds for registration and 15–25 seconds for dose computation). Conclusion: We developed an integrated GPU-accelerated software platform that enables accurate and efficient DIR, auto-segmentation, and dose computation, thus supporting an efficient ART workflow. This work was supported by NIH/NCI under grant R42CA137886.
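
    One of the similarity measures named above, normalized cross-correlation, can be sketched as follows. The abstract does not give the exact formula used, so this sketch assumes the common zero-mean form, in which 1.0 means the two intensity arrays are identical up to a positive linear rescaling. Images are represented as flat lists of intensities and the function name `ncc` is hypothetical.

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length intensity lists.

    Assumes neither input is constant (otherwise the denominator is zero).
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same number of voxels")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)


fixed = [10.0, 20.0, 30.0, 40.0]
rescaled = [2 * v + 5 for v in fixed]   # same structure, different intensity scale
inverted = [-v for v in fixed]          # contrast-inverted copy
print(ncc(fixed, rescaled), ncc(fixed, inverted))
```

    Because the measure is invariant to linear intensity changes, it tolerates global brightness differences between two scans but not local intensity inconsistencies, which is consistent with the abstract's use of a separate intensity correction step.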

  15. GPU and APU computations of Finite Time Lyapunov Exponent fields

    NASA Astrophysics Data System (ADS)

    Conti, Christian; Rossinelli, Diego; Koumoutsakos, Petros

    2012-03-01

    We present GPU and APU accelerated computations of Finite-Time Lyapunov Exponent (FTLE) fields. The calculation of FTLEs is a computationally intensive process, as in order to obtain the sharp ridges associated with the Lagrangian Coherent Structures an extensive resampling of the flow field is required. The computational performance of this resampling is limited by the memory bandwidth of the underlying computer architecture. The present technique harnesses data-parallel execution of many-core architectures and relies on fast and accurate evaluations of moment conserving functions for the mesh to particle interpolations. We demonstrate how the computation of FTLEs can be efficiently performed on a GPU and on an APU through OpenCL and we report over one order of magnitude improvements over multi-threaded executions in FTLE computations of bluff body flows.

  16. Desktop microsimulation: a tool to improve efficiency in the medical office practice.

    PubMed

    Montgomery, James B; Linville, Beth A; Slonim, Anthony D

    2013-01-01

    Because the economic crisis in the United States continues to have an impact on healthcare organizations, industry leaders must optimize their decision making. Discrete-event computer simulation is a quality tool with a demonstrated track record of improving the precision of analysis for process redesign. However, the use of simulation to consolidate practices and design efficiencies into an unfinished medical office building was a unique task. A discrete-event computer simulation package was used to model the operations and forecast future results for four orthopedic surgery practices. The scenarios were created to allow an evaluation of the impact of process change on the output variables of exam room utilization, patient queue size, and staff utilization. The model helped with decisions regarding space allocation and efficient exam room use by demonstrating the impact of process changes in patient queues at check-in/out, x-ray, and cast room locations when compared to the status quo model. The analysis impacted decisions on facility layout, patient flow, and staff functions in this newly consolidated practice. Simulation was found to be a useful tool for process redesign and decision making even prior to building occupancy. © 2011 National Association for Healthcare Quality.

  17. Comparing memory-efficient genome assemblers on stand-alone and cloud infrastructures.

    PubMed

    Kleftogiannis, Dimitrios; Kalnis, Panos; Bajic, Vladimir B

    2013-01-01

    A fundamental problem in bioinformatics is genome assembly. Next-generation sequencing (NGS) technologies produce large volumes of fragmented genome reads, which require large amounts of memory to assemble the complete genome efficiently. With recent improvements in DNA sequencing technologies, it is expected that the memory footprint required for the assembly process will increase dramatically and will emerge as a limiting factor in processing widely available NGS-generated reads. In this report, we compare current memory-efficient techniques for genome assembly with respect to quality, memory consumption and execution time. Our experiments prove that it is possible to generate draft assemblies of reasonable quality on conventional multi-purpose computers with very limited available memory by choosing suitable assembly methods. Our study reveals the minimum memory requirements for different assembly programs even when data volume exceeds memory capacity by orders of magnitude. By combining existing methodologies, we propose two general assembly strategies that can improve short-read assembly approaches and result in reduction of the memory footprint. Finally, we discuss the possibility of utilizing cloud infrastructures for genome assembly and we comment on some findings regarding suitable computational resources for assembly.

  18. Fast Particle Methods for Multiscale Phenomena Simulations

    NASA Technical Reports Server (NTRS)

    Koumoutsakos, P.; Wray, A.; Shariff, K.; Pohorille, Andrew

    2000-01-01

    We are developing particle methods oriented at improving computational modeling capabilities of multiscale physical phenomena in: (i) high Reynolds number unsteady vortical flows, (ii) particle laden and interfacial flows, (iii) molecular dynamics studies of nanoscale droplets and studies of the structure, functions, and evolution of the earliest living cell. The unifying computational approach involves particle methods implemented in parallel computer architectures. The inherent adaptivity, robustness and efficiency of particle methods make them a multidisciplinary computational tool capable of bridging the gap between micro-scale and continuum flow simulations. Using efficient tree data structures, multipole expansion algorithms, and improved particle-grid interpolation, particle methods allow for simulations using millions of computational elements, making possible the resolution of a wide range of length and time scales of these important physical phenomena. The current challenges in these simulations are in: [i] the proper formulation of particle methods in the molecular and continuous level for the discretization of the governing equations; [ii] the resolution of the wide range of time and length scales governing the phenomena under investigation; [iii] the minimization of numerical artifacts that may interfere with the physics of the systems under consideration; [iv] the parallelization of processes such as tree traversal and grid-particle interpolations. We are conducting simulations using vortex methods, molecular dynamics and smooth particle hydrodynamics, exploiting their unifying concepts such as: the solution of the N-body problem in parallel computers, highly accurate particle-particle and grid-particle interpolations, parallel FFTs, and the formulation of processes such as diffusion in the context of particle methods. This approach enables us to move among seemingly unrelated areas of research.

  19. Improving the energy efficiency of sparse linear system solvers on multicore and manycore systems.

    PubMed

    Anzt, H; Quintana-Ortí, E S

    2014-06-28

    While most recent breakthroughs in scientific research rely on complex simulations carried out in large-scale supercomputers, the power draft and energy spent for this purpose is increasingly becoming a limiting factor to this trend. In this paper, we provide an overview of the current status in energy-efficient scientific computing by reviewing different technologies used to monitor power draft as well as power- and energy-saving mechanisms available in commodity hardware. For the particular domain of sparse linear algebra, we analyse the energy efficiency of a broad collection of hardware architectures and investigate how algorithmic and implementation modifications can improve the energy performance of sparse linear system solvers, without negatively impacting their performance. © 2014 The Author(s) Published by the Royal Society. All rights reserved.

  20. Efficient alignment-free DNA barcode analytics.

    PubMed

    Kuksa, Pavel; Pavlovic, Vladimir

    2009-11-10

    In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens the possibility for accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.). On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, the proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only a few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding.
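
    The spectrum idea, representing a sequence by counts of its short fragments and comparing sequences without alignment, can be sketched as follows. This is a generic k-mer spectrum with cosine similarity, chosen here for illustration; the paper's actual features, kernels, and classifiers are not specified in the abstract, and the function names are hypothetical.

```python
import math
from collections import Counter


def spectrum(seq, k):
    """Count every length-k fragment (k-mer) occurring in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(s1, s2):
    """Cosine of the angle between two k-mer count vectors (1.0 = same direction)."""
    dot = sum(count * s2[kmer] for kmer, count in s1.items())
    norm1 = math.sqrt(sum(c * c for c in s1.values()))
    norm2 = math.sqrt(sum(c * c for c in s2.values()))
    return dot / (norm1 * norm2)


a = spectrum("ACGTAC", 2)
b = spectrum("ACGTACGT", 2)
c = spectrum("TTTTTT", 2)
print(a)
print(cosine_similarity(a, b), cosine_similarity(a, c))
```

    Building a spectrum takes time linear in the sequence length, and comparing two spectra never requires aligning the sequences, which is the source of the speed advantage the abstract describes.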

  1. Computationally-efficient stochastic cluster dynamics method for modeling damage accumulation in irradiated materials

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hoang, Tuan L.; Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, CA 94550; Marian, Jaime, E-mail: jmarian@ucla.edu

    2015-11-01

    An improved version of a recently developed stochastic cluster dynamics (SCD) method (Marian and Bulatov, 2012) [6] is introduced as an alternative to rate theory (RT) methods for solving coupled ordinary differential equation (ODE) systems for irradiation damage simulations. SCD circumvents by design the curse of dimensionality of the variable space that renders traditional ODE-based RT approaches inefficient when handling complex defect population comprised of multiple (more than two) defect species. Several improvements introduced here enable efficient and accurate simulations of irradiated materials up to realistic (high) damage doses characteristic of next-generation nuclear systems. The first improvement is a procedure for efficiently updating the defect reaction-network and event selection in the context of a dynamically expanding reaction-network. Next is a novel implementation of the τ-leaping method that speeds up SCD simulations by advancing the state of the reaction network in large time increments when appropriate. Lastly, a volume rescaling procedure is introduced to control the computational complexity of the expanding reaction-network through occasional reductions of the defect population while maintaining accurate statistics. The enhanced SCD method is then applied to model defect cluster accumulation in iron thin films subjected to triple ion-beam (Fe3+, He+ and H+) irradiations, for which standard RT or spatially-resolved kinetic Monte Carlo simulations are prohibitively expensive.

  2. Computationally-efficient stochastic cluster dynamics method for modeling damage accumulation in irradiated materials

    NASA Astrophysics Data System (ADS)

    Hoang, Tuan L.; Marian, Jaime; Bulatov, Vasily V.; Hosemann, Peter

    2015-11-01

    An improved version of a recently developed stochastic cluster dynamics (SCD) method (Marian and Bulatov, 2012) [6] is introduced as an alternative to rate theory (RT) methods for solving coupled ordinary differential equation (ODE) systems for irradiation damage simulations. SCD circumvents by design the curse of dimensionality of the variable space that renders traditional ODE-based RT approaches inefficient when handling complex defect population comprised of multiple (more than two) defect species. Several improvements introduced here enable efficient and accurate simulations of irradiated materials up to realistic (high) damage doses characteristic of next-generation nuclear systems. The first improvement is a procedure for efficiently updating the defect reaction-network and event selection in the context of a dynamically expanding reaction-network. Next is a novel implementation of the τ-leaping method that speeds up SCD simulations by advancing the state of the reaction network in large time increments when appropriate. Lastly, a volume rescaling procedure is introduced to control the computational complexity of the expanding reaction-network through occasional reductions of the defect population while maintaining accurate statistics. The enhanced SCD method is then applied to model defect cluster accumulation in iron thin films subjected to triple ion-beam (Fe3+, He+ and H+) irradiations, for which standard RT or spatially-resolved kinetic Monte Carlo simulations are prohibitively expensive.
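
    The τ-leaping idea mentioned above, advancing a stochastic reaction system by a fixed time increment and firing a Poisson-distributed number of events per increment instead of simulating one event at a time, can be sketched on a toy problem. The example below is generic τ-leaping for a single first-order reaction A -> B and is not the SCD implementation; the reaction, the rate constant, the fixed step size, and all function names are assumptions made for illustration.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's method (suitable for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def tau_leap_decay(n0, rate, tau, steps, seed=0):
    """Tau-leaping for A -> B with propensity rate * A.

    Each step fires Poisson(rate * A * tau) reactions at once, capped at the
    number of A remaining so the population never goes negative.
    Returns a list of (time, A, B) tuples.
    """
    rng = random.Random(seed)
    a, b = n0, 0
    history = [(0.0, a, b)]
    for step in range(1, steps + 1):
        fired = min(poisson(rate * a * tau, rng), a)
        a -= fired
        b += fired
        history.append((step * tau, a, b))
    return history


history = tau_leap_decay(n0=1000, rate=0.1, tau=0.1, steps=50)
print(history[-1])
```

    The leap is only accurate while the propensity changes little within one increment, which is why practical schemes, as the abstract notes, take large increments only when appropriate.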

  3. Finite element computation on nearest neighbor connected machines

    NASA Technical Reports Server (NTRS)

    Mcaulay, A. D.

    1984-01-01

    Research aimed at faster, more cost effective parallel machines and algorithms for improving designer productivity with finite element computations is discussed. A set of 8 boards, containing 4 nearest neighbor connected arrays of commercially available floating point chips and substantial memory, are inserted into a commercially available machine. One-tenth Mflop (64 bit operation) processors provide an 89% efficiency when solving the equations arising in a finite element problem for a single variable regular grid of size 40 by 40 by 40. This is approximately 15 to 20 times faster than a much more expensive machine such as a VAX 11/780 used in double precision. The efficiency falls off as faster or more processors are envisaged because communication times become dominant. A novel successive overrelaxation algorithm which uses cyclic reduction in order to permit data transfer and computation to overlap in time is proposed.

  4. Turbine Internal and Film Cooling Modeling For 3D Navier-Stokes Codes

    NASA Technical Reports Server (NTRS)

    DeWitt, Kenneth; Garg, Vijay; Ameri, Ali

    2005-01-01

    The aim of this research project is to make use of NASA Glenn on-site computational facilities in order to develop, validate and apply aerodynamic, heat transfer, and turbine cooling models for use in advanced 3D Navier-Stokes Computational Fluid Dynamics (CFD) codes such as the Glenn-HT code. Specific areas of effort include: Application of the Glenn-HT code to specific configurations made available under Turbine Based Combined Cycle (TBCC), and Ultra Efficient Engine Technology (UEET) projects. Validating the use of a multi-block code for the time accurate computation of the detailed flow and heat transfer of cooled turbine airfoils. The goal of the current research is to improve the predictive ability of the Glenn-HT code. This will enable one to design more efficient turbine components for both aviation and power generation. The models will be tested against specific configurations provided by NASA Glenn.

  5. Design of a Variational Multiscale Method for Turbulent Compressible Flows

    NASA Technical Reports Server (NTRS)

    Diosady, Laslo Tibor; Murman, Scott M.

    2013-01-01

    A spectral-element framework is presented for the simulation of subsonic compressible high-Reynolds-number flows. The focus of the work is maximizing the efficiency of the computational schemes to enable unsteady simulations with a large number of spatial and temporal degrees of freedom. A collocation scheme is combined with optimized computational kernels to provide a residual evaluation with computational cost independent of order of accuracy up to 16th order. The optimized residual routines are used to develop a low-memory implicit scheme based on a matrix-free Newton-Krylov method. A preconditioner based on the finite-difference diagonalized ADI scheme is developed which maintains the low memory of the matrix-free implicit solver, while providing improved convergence properties. Emphasis on low memory usage throughout the solver development is leveraged to implement a coupled space-time DG solver which may offer further efficiency gains through adaptivity in both space and time.

  6. Seismic data restoration with a fast L1 norm trust region method

    NASA Astrophysics Data System (ADS)

    Cao, Jingjie; Wang, Yanfei

    2014-08-01

    Seismic data restoration is a major strategy to provide a reliable wavefield when field data do not satisfy the Shannon sampling theorem. Recovery by sparsity-promoting inversion often gets sparse solutions of seismic data in a transformed domain; however, most methods for sparsity-promoting inversion are line-search methods, which are efficient but are inclined to obtain local solutions. Using a trust region method, which can provide globally convergent solutions, is a good choice to overcome this shortcoming. A trust region method for sparse inversion has been proposed; however, its efficiency should be improved to be suitable for large-scale computation. In this paper, a new L1 norm trust region model is proposed for seismic data restoration and a robust gradient projection method for solving the sub-problem is utilized. Numerical results on synthetic and field data demonstrate that the proposed trust region method achieves excellent computation speed and is a viable alternative for large-scale computation.

  7. Computationally Efficient 2D DOA Estimation with Uniform Rectangular Array in Low-Grazing Angle.

    PubMed

    Shi, Junpeng; Hu, Guoping; Zhang, Xiaofei; Sun, Fenggang; Xiao, Yu

    2017-02-26

    In this paper, we propose a computationally efficient spatial differencing matrix set (SDMS) method for two-dimensional direction of arrival (2D DOA) estimation with uniform rectangular arrays (URAs) in a low-grazing angle (LGA) condition. By rearranging the auto-correlation and cross-correlation matrices in turn among different subarrays, the SDMS method can estimate the two parameters independently with one-dimensional (1D) subspace-based estimation techniques, where we only perform difference for auto-correlation matrices and the cross-correlation matrices are kept completely. Then, the pair-matching of two parameters is achieved by extracting the diagonal elements of URA. Thus, the proposed method can decrease the computational complexity, suppress the effect of additive noise and also have little information loss. Simulation results show that, in LGA, compared to other methods, the proposed methods can achieve performance improvement in the white or colored noise conditions.

  8. Computationally Efficient 2D DOA Estimation with Uniform Rectangular Array in Low-Grazing Angle

    PubMed Central

    Shi, Junpeng; Hu, Guoping; Zhang, Xiaofei; Sun, Fenggang; Xiao, Yu

    2017-01-01

    In this paper, we propose a computationally efficient spatial differencing matrix set (SDMS) method for two-dimensional direction of arrival (2D DOA) estimation with uniform rectangular arrays (URAs) in a low-grazing angle (LGA) condition. By rearranging the auto-correlation and cross-correlation matrices in turn among different subarrays, the SDMS method can estimate the two parameters independently with one-dimensional (1D) subspace-based estimation techniques, where we only perform difference for auto-correlation matrices and the cross-correlation matrices are kept completely. Then, the pair-matching of two parameters is achieved by extracting the diagonal elements of URA. Thus, the proposed method can decrease the computational complexity, suppress the effect of additive noise and also have little information loss. Simulation results show that, in LGA, compared to other methods, the proposed methods can achieve performance improvement in the white or colored noise conditions. PMID:28245634

  9. Perspective: Ring-polymer instanton theory

    NASA Astrophysics Data System (ADS)

    Richardson, Jeremy O.

    2018-05-01

    Since the earliest explorations of quantum mechanics, it has been a topic of great interest that quantum tunneling allows particles to penetrate classically insurmountable barriers. Instanton theory provides a simple description of these processes in terms of dominant tunneling pathways. Using a ring-polymer discretization, an efficient computational method is obtained for applying this theory to compute reaction rates and tunneling splittings in molecular systems. Unlike other quantum-dynamics approaches, the method scales well with the number of degrees of freedom, and for many polyatomic systems, the method may provide the most accurate predictions which can be practically computed. Instanton theory thus has the capability to produce useful data for many fields of low-temperature chemistry including spectroscopy, atmospheric and astrochemistry, as well as surface science. There is however still room for improvement in the efficiency of the numerical algorithms, and new theories are under development for describing tunneling in nonadiabatic transitions.

  10. PRESAGE: PRivacy-preserving gEnetic testing via SoftwAre Guard Extension.

    PubMed

    Chen, Feng; Wang, Chenghong; Dai, Wenrui; Jiang, Xiaoqian; Mohammed, Noman; Al Aziz, Md Momin; Sadat, Md Nazmus; Sahinalp, Cenk; Lauter, Kristin; Wang, Shuang

    2017-07-26

    Advances in DNA sequencing technologies have prompted a wide range of genomic applications to improve healthcare and facilitate biomedical research. However, privacy and security concerns have emerged as a challenge for utilizing cloud computing to handle sensitive genomic data. We present one of the first implementations of a Software Guard Extension (SGX) based framework for securely outsourced genetic testing, which leverages multiple cryptographic protocols and a minimal perfect hash scheme to enable efficient and secure data storage and computation outsourcing. We compared the performance of the proposed PRESAGE framework with a state-of-the-art homomorphic encryption scheme, as well as with a plaintext implementation. The experimental results demonstrated a significant performance improvement over the homomorphic encryption methods and a small computational overhead in comparison to the plaintext implementation. The proposed PRESAGE provides an alternative solution for secure and efficient genomic data outsourcing in an untrusted cloud by using a hybrid framework that combines secure hardware and multiple crypto protocols.

  11. Computer code for preliminary sizing analysis of axial-flow turbines

    NASA Technical Reports Server (NTRS)

    Glassman, Arthur J.

    1992-01-01

    This mean diameter flow analysis uses a stage average velocity diagram as the basis for the computational efficiency. Input design requirements include power or pressure ratio, flow rate, temperature, pressure, and rotative speed. Turbine designs are generated for any specified number of stages and for any of three types of velocity diagrams (symmetrical, zero exit swirl, or impulse) or for any specified stage swirl split. Exit turning vanes can be included in the design. The program output includes inlet and exit annulus dimensions, exit temperature and pressure, total and static efficiencies, flow angles, and last stage absolute and relative Mach numbers. An analysis is presented along with a description of the computer program input and output with sample cases. The analysis and code presented herein are modifications of those described in NASA-TN-D-6702. These modifications improve modeling rigor and extend code applicability.

  12. Improving the efficiency of the Finite Temperature Density Matrix Renormalization Group method

    NASA Astrophysics Data System (ADS)

    Nocera, Alberto; Alvarez, Gonzalo

    I review the basics of the finite temperature DMRG method, and then show how its efficiency can be improved by working on reduced Hilbert spaces and by using canonical approaches. My talk explains the applicability of the ancilla DMRG method beyond spin systems to t-J and Hubbard models, and addresses the computation of static and dynamical observables at finite temperature. Finally, I discuss the features of and roadmap for our DMRG++ codebase. Work done at CNMS, sponsored by the SUF Division, BES, U.S. DOE under contract with UT-Battelle. Support by the early career research program, DSUF, BES, DOE.

  13. Evaluation of a Stirling engine heater bypass with the NASA Lewis nodal-analysis performance code

    NASA Technical Reports Server (NTRS)

    Sullivan, T. J.

    1986-01-01

    In support of the U.S. Department of Energy's Stirling Engine Highway Vehicle Systems program, the NASA Lewis Research Center investigated whether bypassing the P-40 Stirling engine heater during regenerative cooling would improve engine performance. The Lewis nodal-analysis Stirling engine computer simulation was used for this investigation. Results for the heater-bypass concept showed no significant improvement in the indicated thermal efficiency for the P-40 Stirling engine operating at full-power and part-power conditions. Optimizing the heater tube length produced a small increase in the indicated thermal efficiency with the heater-bypass concept.

  14. A Lightweight Protocol for Secure Video Streaming

    PubMed Central

    Morkevicius, Nerijus; Bagdonas, Kazimieras

    2018-01-01

    The Internet of Things (IoT) introduces many new challenges which cannot be solved using traditional cloud and host computing models. A new architecture known as fog computing is emerging to address these technological and security gaps. Traditional security paradigms focused on providing perimeter-based protections and client/server point to point protocols (e.g., Transport Layer Security (TLS)) are no longer the best choices for addressing new security challenges in fog computing end devices, where energy and computational resources are limited. In this paper, we present a lightweight secure streaming protocol for the fog computing “Fog Node-End Device” layer. This protocol is lightweight, connectionless, supports broadcast and multicast operations, and is able to provide data source authentication, data integrity, and confidentiality. The protocol is based on simple and energy efficient cryptographic methods, such as Hash Message Authentication Codes (HMAC) and symmetrical ciphers, and uses modified User Datagram Protocol (UDP) packets to embed authentication data into streaming data. Data redundancy could be added to improve reliability in lossy networks. The experimental results summarized in this paper confirm that the proposed method efficiently uses energy and computational resources and at the same time provides security properties on par with the Datagram TLS (DTLS) standard. PMID:29757988

  15. A Lightweight Protocol for Secure Video Streaming.

    PubMed

    Venčkauskas, Algimantas; Morkevicius, Nerijus; Bagdonas, Kazimieras; Damaševičius, Robertas; Maskeliūnas, Rytis

    2018-05-14

    The Internet of Things (IoT) introduces many new challenges which cannot be solved using traditional cloud and host computing models. A new architecture known as fog computing is emerging to address these technological and security gaps. Traditional security paradigms focused on providing perimeter-based protections and client/server point to point protocols (e.g., Transport Layer Security (TLS)) are no longer the best choices for addressing new security challenges in fog computing end devices, where energy and computational resources are limited. In this paper, we present a lightweight secure streaming protocol for the fog computing "Fog Node-End Device" layer. This protocol is lightweight, connectionless, supports broadcast and multicast operations, and is able to provide data source authentication, data integrity, and confidentiality. The protocol is based on simple and energy efficient cryptographic methods, such as Hash Message Authentication Codes (HMAC) and symmetrical ciphers, and uses modified User Datagram Protocol (UDP) packets to embed authentication data into streaming data. Data redundancy could be added to improve reliability in lossy networks. The experimental results summarized in this paper confirm that the proposed method efficiently uses energy and computational resources and at the same time provides security properties on par with the Datagram TLS (DTLS) standard.
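
    The idea of embedding authentication data into each streamed datagram can be illustrated with a minimal sketch. It is not the packet format of the protocol described above: the layout (8-byte sequence number, payload, HMAC-SHA-256 tag) and the names `seal` and `open_datagram` are assumptions made for illustration. The sketch covers source authentication and integrity only; confidentiality would additionally need a symmetric cipher.

```python
import hashlib
import hmac
import struct

TAG_LEN = 32  # size of an HMAC-SHA-256 tag in bytes


def seal(key: bytes, seq: int, payload: bytes) -> bytes:
    """Build a datagram body: 8-byte sequence number | payload | HMAC tag."""
    header = struct.pack("!Q", seq)
    tag = hmac.new(key, header + payload, hashlib.sha256).digest()
    return header + payload + tag


def open_datagram(key: bytes, datagram: bytes):
    """Return (seq, payload) if the tag verifies, otherwise None."""
    if len(datagram) < 8 + TAG_LEN:
        return None
    body, tag = datagram[:-TAG_LEN], datagram[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many tag bytes matched.
    if not hmac.compare_digest(tag, expected):
        return None
    (seq,) = struct.unpack("!Q", body[:8])
    return seq, body[8:]
```

    Because every datagram carries its own tag, a receiver can verify packets independently, which suits a connectionless transport where packets may be lost or reordered. The sequence number is included under the tag so that a receiver could reject replays.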

  16. A nonlinear relaxation/quasi-Newton algorithm for the compressible Navier-Stokes equations

    NASA Technical Reports Server (NTRS)

    Edwards, Jack R.; Mcrae, D. S.

    1992-01-01

    A highly efficient implicit method for the computation of steady, two-dimensional compressible Navier-Stokes flowfields is presented. The discretization of the governing equations is hybrid in nature, with flux-vector splitting utilized in the streamwise direction and central differences with flux-limited artificial dissipation used for the transverse fluxes. Line Jacobi relaxation is used to provide a suitable initial guess for a new nonlinear iteration strategy based on line Gauss-Seidel sweeps. The applicability of quasi-Newton methods as convergence accelerators for this and other line relaxation algorithms is discussed, and efficient implementations of such techniques are presented. Convergence histories and comparisons with experimental data are presented for supersonic flow over a flat plate and for several high-speed compression corner interactions. Results indicate a marked improvement in computational efficiency over more conventional upwind relaxation strategies, particularly for flowfields containing large pockets of streamwise subsonic flow.

  17. Review of Computational Stirling Analysis Methods

    NASA Technical Reports Server (NTRS)

    Dyson, Rodger W.; Wilson, Scott D.; Tew, Roy C.

    2004-01-01

    Nuclear thermal to electric power conversion carries the promise of longer duration missions and higher scientific data transmission rates back to Earth for both Mars rovers and deep space missions. A free-piston Stirling convertor is a candidate technology that is considered an efficient and reliable power conversion device for such purposes. While already very efficient, it is believed that better Stirling engines can be developed if the losses inherent in its current designs could be better understood. However, they are difficult to instrument and so efforts are underway to simulate a complete Stirling engine numerically. This has only recently been attempted and a review of the methods leading up to and including such computational analysis is presented. Finally, it is proposed that the quality and depth of Stirling loss understanding may be improved by utilizing the higher fidelity and efficiency of recently developed numerical methods. One such method, the Ultra HI-FI technique, is presented in detail.

  18. The design of sport and touring aircraft

    NASA Technical Reports Server (NTRS)

    Eppler, R.; Guenther, W.

    1984-01-01

    General considerations concerning the design of a new aircraft are discussed, taking into account the objective to develop an aircraft that can economically satisfy a certain spectrum of tasks. Requirements related to the design of sport and touring aircraft included in the past mainly a high cruising speed and short take-off and landing runs. Additional requirements for new aircraft are now low fuel consumption and optimal efficiency. A computer program for the computation of flight performance makes it possible to vary automatically a number of parameters, such as flight altitude, wing area, and wing span. The appropriate design characteristics are to a large extent determined by the selection of the flight altitude. Three different wing profiles are compared. Potential improvements with respect to the performance of the aircraft and its efficiency are related to the use of fiber composites, the employment of better propeller profiles, more efficient engines, and the utilization of suitable instrumentation for optimal flight conduction.

  19. Krylov subspace methods on supercomputers

    NASA Technical Reports Server (NTRS)

    Saad, Youcef

    1988-01-01

    A short survey of recent research on Krylov subspace methods with emphasis on implementation on vector and parallel computers is presented. Conjugate gradient methods have proven very useful on traditional scalar computers, and their popularity is likely to increase as three-dimensional models gain importance. A conservative approach to derive effective iterative techniques for supercomputers has been to find efficient parallel/vector implementations of the standard algorithms. The main source of difficulty in the incomplete factorization preconditionings is in the solution of the triangular systems at each step. A few approaches consisting of implementing efficient forward and backward triangular solutions are described in detail. Polynomial preconditioning as an alternative to standard incomplete factorization techniques is also discussed. Another efficient approach is to reorder the equations so as to improve the structure of the matrix to achieve better parallelism or vectorization. An overview of these and other ideas and their effectiveness or potential for different types of architectures is given.
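
    As a rough illustration of why polynomial preconditioning is attractive on vector and parallel machines, the sketch below runs a preconditioned conjugate gradient iteration in which the preconditioner is itself built only from matrix-vector products, so no triangular solves are needed. The test matrix tridiag(-1, 2, -1), the degree-1 Neumann polynomial, and all function names are assumptions chosen for illustration and are not taken from the survey.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(matvec, b, precond=lambda r: list(r), tol=1e-10, max_iter=1000):
    """Preconditioned conjugate gradient for a symmetric positive definite system."""
    x = [0.0] * len(b)
    r = list(b)                      # residual b - A x for the initial guess x = 0
    z = precond(r)
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = precond(r)
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


def laplacian_matvec(v):
    """Apply A = tridiag(-1, 2, -1) without storing the matrix."""
    n = len(v)
    return [2.0 * v[i]
            - (v[i - 1] if i > 0 else 0.0)
            - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def neumann_precond(r):
    """Degree-1 polynomial preconditioner M^-1 = 2 D^-1 - D^-1 A D^-1, with D = 2I here."""
    s = [ri / 2.0 for ri in r]       # s = D^-1 r
    As = laplacian_matvec(s)
    return [2.0 * si - asi / 2.0 for si, asi in zip(s, As)]
```

    Every operation in `neumann_precond` is an element-wise update or a matrix-vector product, both of which vectorize naturally; an incomplete-factorization preconditioner would instead require sequential forward and backward substitutions at this point.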

  20. Indoor Pedestrian Localization Using iBeacon and Improved Kalman Filter.

    PubMed

    Sung, Kwangjae; Lee, Dong Kyu 'Roy'; Kim, Hwangnam

    2018-05-26

    Reliable and accurate indoor pedestrian positioning is one of the biggest challenges for location-based systems and applications. Most pedestrian positioning systems have drift error and large bias due to low-cost inertial sensors and random human motion, as well as unpredictable and time-varying radio-frequency (RF) signals used for position determination. To solve this problem, many indoor positioning approaches that integrate the user's motion estimated by the dead reckoning (DR) method and the location data obtained by RSS fingerprinting through a Bayesian filter, such as the Kalman filter (KF), unscented Kalman filter (UKF), and particle filter (PF), have recently been proposed to achieve higher positioning accuracy in indoor environments. Among Bayesian filtering methods, PF is the most popular integrating approach and can provide the best localization performance. However, since PF uses a large number of particles to achieve high performance, it can lead to considerable computational cost. This paper presents an indoor positioning system implemented on a smartphone, which uses simple dead reckoning (DR), RSS fingerprinting using iBeacon and a machine learning scheme, and an improved KF. The core of the system is the enhanced KF called a sigma-point Kalman particle filter (SKPF), which localizes the user by leveraging both the unscented transform of UKF and the weighting method of PF. The SKPF algorithm proposed in this study is used to provide enhanced positioning accuracy by fusing positional data obtained from both DR and fingerprinting with uncertainty. The SKPF algorithm can achieve better positioning accuracy than KF and UKF and performance comparable to PF, and it can provide higher computational efficiency than PF. iBeacon in our positioning system is used for energy-efficient localization and RSS fingerprinting. We aim to design a localization scheme that achieves high positioning accuracy, computational efficiency, and energy efficiency indoors through the SKPF and iBeacon. Empirical experiments in real environments show that the use of the SKPF algorithm and iBeacon in our indoor localization scheme can achieve very satisfactory performance in terms of localization accuracy, computational cost, and energy efficiency.

  1. Actions for productivity improvement in crew training

    NASA Technical Reports Server (NTRS)

    Miller, G. E.

    1985-01-01

    Improvement of the productivity of astronaut crew instructors in the Space Shuttle program and beyond is proposed. It is suggested that instructor certification plans should be established to shorten the time required for trainers to develop their skills and improve their ability to convey those skills. Members of the training cadre should be thoroughly cross trained in their task. This provides better understanding of the overall task and greater flexibility in instructor utilization. Improved facility access will give instructors the benefit of practical application experience. Former crews should be integrated into the training of upcoming crews to bridge some of the gap between simulated conditions and the real world. The information contained in lengthy and complex training manuals can be presented more clearly and efficiently as computer lessons. The illustration, animation and interactive capabilities of the computer combine to provide an effective means of explanation.

  2. Triangular covariance factorizations for. Ph.D. Thesis. - Calif. Univ.

    NASA Technical Reports Server (NTRS)

    Thornton, C. L.

    1976-01-01

    An improved computational form of the discrete Kalman filter is derived using an upper triangular factorization of the error covariance matrix. The covariance P is factored such that P = UDU^T, where U is unit upper triangular and D is diagonal. Recursions are developed for propagating the U-D covariance factors together with the corresponding state estimate. The resulting algorithm, referred to as the U-D filter, combines the superior numerical precision of square root filtering techniques with an efficiency comparable to that of Kalman's original formula. Moreover, this method is easily implemented and involves no more computer storage than the Kalman algorithm. These characteristics make the U-D method an attractive real-time filtering technique. A new covariance error analysis technique is obtained from an extension of the U-D filter equations. This evaluation method is flexible and efficient and may provide significantly improved numerical results. Cost comparisons show that for a large class of problems the U-D evaluation algorithm is noticeably less expensive than conventional error analysis methods.
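
    The factorization named in the abstract can be sketched in a few lines. The code below only computes U and D from a given symmetric positive definite P, working from the last column to the first; it does not reproduce the thesis's recursions for propagating the factors through filter updates. The function names are illustrative assumptions.

```python
def ud_factorize(P):
    """Factor a symmetric positive definite P as U * diag(D) * U^T.

    U is unit upper triangular and D holds the diagonal entries.
    Columns are processed right to left so that every entry needed
    on the right-hand side has already been computed.
    """
    n = len(P)
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    D = [0.0] * n
    for j in range(n - 1, -1, -1):
        D[j] = P[j][j] - sum(D[k] * U[j][k] ** 2 for k in range(j + 1, n))
        for i in range(j):
            s = sum(D[k] * U[i][k] * U[j][k] for k in range(j + 1, n))
            U[i][j] = (P[i][j] - s) / D[j]
    return U, D


def reconstruct(U, D):
    """Rebuild P = U * diag(D) * U^T, useful for checking the factors."""
    n = len(D)
    return [[sum(U[i][k] * D[k] * U[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

    Note that no square roots are taken, which is one reason a U-D form can approach the cost of the conventional covariance update while keeping the factored representation.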

  3. Heterogeneous concurrent computing with exportable services

    NASA Technical Reports Server (NTRS)

    Sunderam, Vaidy

    1995-01-01

    Heterogeneous concurrent computing, based on the traditional process-oriented model, is approaching its functionality and performance limits. An alternative paradigm, based on the concept of services, supporting data driven computation, and built on a lightweight process infrastructure, is proposed to enhance the functional capabilities and the operational efficiency of heterogeneous network-based concurrent computing. TPVM is an experimental prototype system supporting exportable services, thread-based computation, and remote memory operations that is built as an extension of and an enhancement to the PVM concurrent computing system. TPVM offers a significantly different computing paradigm for network-based computing, while maintaining a close resemblance to the conventional PVM model in the interest of compatibility and ease of transition. Preliminary experiences have demonstrated that the TPVM framework presents a natural yet powerful concurrent programming interface, while being capable of delivering performance improvements of up to thirty percent.

  4. Highly Scalable Matching Pursuit Signal Decomposition Algorithm

    NASA Technical Reports Server (NTRS)

    Christensen, Daniel; Das, Santanu; Srivastava, Ashok N.

    2009-01-01

    Matching Pursuit Decomposition (MPD) is a powerful iterative algorithm for signal decomposition and feature extraction. MPD decomposes any signal into linear combinations of its dictionary elements or atoms. A best fit atom from an arbitrarily defined dictionary is determined through cross-correlation. The selected atom is subtracted from the signal and this procedure is repeated on the residual in the subsequent iterations until a stopping criterion is met. The reconstructed signal reveals the waveform structure of the original signal. However, a sufficiently large dictionary is required for an accurate reconstruction; this in turn increases the computational burden of the algorithm, thus limiting its applicability and level of adoption. The purpose of this research is to improve the scalability and performance of the classical MPD algorithm. Correlation thresholds were defined to prune insignificant atoms from the dictionary. The Coarse-Fine Grids and Multiple Atom Extraction techniques were proposed to decrease the computational burden of the algorithm. The Coarse-Fine Grids method enabled the approximation and refinement of the parameters for the best fit atom. The ability to extract multiple atoms within a single iteration enhanced the effectiveness and efficiency of each iteration. These improvements were implemented to produce an improved Matching Pursuit Decomposition algorithm entitled MPD++. Disparate signal decomposition applications may require a particular emphasis on accuracy or computational efficiency. The prominence of the key signal features required for the proper signal classification dictates the level of accuracy necessary in the decomposition. The MPD++ algorithm may be easily adapted to accommodate the imposed requirements. Certain feature extraction applications may require rapid signal decomposition. The full potential of MPD++ may be utilized to produce incredible performance gains while extracting only slightly less energy than the standard algorithm. When the utmost accuracy must be achieved, the modified algorithm extracts atoms more conservatively but still exhibits computational gains over classical MPD. The MPD++ algorithm was demonstrated using an over-complete dictionary on real life data. Computational times were reduced by factors of 1.9 and 44 for the emphases of accuracy and performance, respectively. The modified algorithm extracted similar amounts of energy compared to classical MPD. The degree of the improvement in computational time depends on the complexity of the data, the initialization parameters, and the breadth of the dictionary. The results of the research confirm that the three modifications successfully improved the scalability and computational efficiency of the MPD algorithm. Correlation Thresholding decreased the time complexity by reducing the dictionary size. Multiple Atom Extraction also reduced the time complexity by decreasing the number of iterations required for a stopping criterion to be reached. The Coarse-Fine Grids technique enabled complicated atoms with numerous variable parameters to be effectively represented in the dictionary. Due to the nature of the three proposed modifications, they are capable of being stacked and have cumulative effects on the reduction of the time complexity.
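
    The classical loop described at the start of the abstract (correlate, pick the best atom, subtract, repeat on the residual) can be sketched as follows. This is the baseline algorithm only; none of the MPD++ modifications (correlation thresholding, coarse-fine grids, multiple atom extraction) are reproduced. The sketch assumes every atom has unit Euclidean norm, and the function name and stopping parameters are illustrative.

```python
def matching_pursuit(signal, dictionary, max_iter=50, tol=1e-6):
    """Greedy decomposition of `signal` over unit-norm atoms.

    Returns a list of (atom_index, coefficient) pairs and the final residual.
    Stops when the residual energy drops to `tol` or after `max_iter` atoms.
    """
    residual = list(signal)
    decomposition = []
    for _ in range(max_iter):
        energy = sum(r * r for r in residual)
        if energy <= tol:
            break
        # Cross-correlate the residual with every atom in the dictionary.
        corrs = [sum(r * g for r, g in zip(residual, atom)) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(corrs[k]))
        coeff = corrs[best]
        if coeff == 0.0:
            break  # residual is orthogonal to every atom
        decomposition.append((best, coeff))
        residual = [r - coeff * g for r, g in zip(residual, dictionary[best])]
    return decomposition, residual
```

    Each iteration costs one correlation per atom, so the work grows with both dictionary size and iteration count. That is where pruning the dictionary and extracting several atoms per pass, as the abstract describes, would reduce the run time.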

  5. Montage Version 3.0

    NASA Technical Reports Server (NTRS)

    Jacob, Joseph; Katz, Daniel; Prince, Thomas; Berriman, Graham; Good, John; Laity, Anastasia

    2006-01-01

    The final version (3.0) of the Montage software has been released. To recapitulate from previous NASA Tech Briefs articles about Montage: This software generates custom, science-grade mosaics of astronomical images on demand from input files that comply with the Flexible Image Transport System (FITS) standard and contain image data registered on projections that comply with the World Coordinate System (WCS) standards. This software can be executed on single-processor computers, multi-processor computers, and such networks of geographically dispersed computers as the National Science Foundation's TeraGrid or NASA's Information Power Grid. The primary advantage of running Montage in a grid environment is that computations can be done on a remote supercomputer for efficiency. Multiple computers at different sites can be used for different parts of a computation, a significant advantage in cases of computations for large mosaics that demand more processor time than is available at any one site. Version 3.0 incorporates several improvements over prior versions. The most significant improvement is that this version is accessible to scientists located anywhere, through operational Web services that provide access to data from several large astronomical surveys and construct mosaics on either local workstations or remote computational grids as needed.

  6. Predicting phenotype from genotype: Improving accuracy through more robust experimental and computational modeling

    PubMed Central

    Gallion, Jonathan; Koire, Amanda; Katsonis, Panagiotis; Schoenegge, Anne‐Marie; Bouvier, Michel

    2017-01-01

    Computational prediction yields efficient and scalable initial assessments of how variants of unknown significance may affect human health. However, when discrepancies between these predictions and direct experimental measurements of functional impact arise, inaccurate computational predictions are frequently assumed as the source. Here, we present a methodological analysis indicating that shortcomings in both computational and biological data can contribute to these disagreements. We demonstrate that incomplete assaying of multifunctional proteins can affect the strength of correlations between prediction and experiments; a variant's full impact on function is better quantified by considering multiple assays that probe an ensemble of protein functions. Additionally, many variant predictions are sensitive to protein alignment construction and can be customized to maximize relevance of predictions to a specific experimental question. We conclude that inconsistencies between computation and experiment can often be attributed to the fact that they do not test identical hypotheses. Aligning the design of the computational input with the design of the experimental output will require cooperation between computational and biological scientists, but will also lead to improved estimations of computational prediction accuracy and a better understanding of the genotype–phenotype relationship. PMID:28230923

  7. Predicting phenotype from genotype: Improving accuracy through more robust experimental and computational modeling.

    PubMed

    Gallion, Jonathan; Koire, Amanda; Katsonis, Panagiotis; Schoenegge, Anne-Marie; Bouvier, Michel; Lichtarge, Olivier

    2017-05-01

    Computational prediction yields efficient and scalable initial assessments of how variants of unknown significance may affect human health. However, when discrepancies between these predictions and direct experimental measurements of functional impact arise, inaccurate computational predictions are frequently assumed as the source. Here, we present a methodological analysis indicating that shortcomings in both computational and biological data can contribute to these disagreements. We demonstrate that incomplete assaying of multifunctional proteins can affect the strength of correlations between prediction and experiments; a variant's full impact on function is better quantified by considering multiple assays that probe an ensemble of protein functions. Additionally, many variant predictions are sensitive to protein alignment construction and can be customized to maximize relevance of predictions to a specific experimental question. We conclude that inconsistencies between computation and experiment can often be attributed to the fact that they do not test identical hypotheses. Aligning the design of the computational input with the design of the experimental output will require cooperation between computational and biological scientists, but will also lead to improved estimations of computational prediction accuracy and a better understanding of the genotype-phenotype relationship. © 2017 The Authors. Human Mutation published by Wiley Periodicals, Inc.

  8. A high-order strong stability preserving Runge-Kutta method for three-dimensional full waveform modeling and inversion of anelastic models

    NASA Astrophysics Data System (ADS)

    Wang, N.; Shen, Y.; Yang, D.; Bao, X.; Li, J.; Zhang, W.

    2017-12-01

    Accurate and efficient forward modeling methods are important for high resolution full waveform inversion. Compared with the elastic case, solving the anelastic wave equation requires more computational time, because of the need to compute additional material-independent anelastic functions. A numerical scheme with a large Courant-Friedrichs-Lewy (CFL) condition number enables us to use a large time step to simulate wave propagation, which improves computational efficiency. In this work, we apply the fourth-order strong stability preserving Runge-Kutta method with an optimal CFL coefficient to solve the anelastic wave equation. We use a fourth order DRP/opt MacCormack scheme for the spatial discretization, and we approximate the rheological behaviors of the Earth by using the generalized Maxwell body model. With a larger CFL condition number, we find that the computational efficiency is significantly improved compared with the traditional fourth-order Runge-Kutta method. Then, we apply the scattering-integral method for calculating travel time and amplitude sensitivity kernels with respect to velocity and attenuation structures. For each source, we carry out one forward simulation and save the time-dependent strain tensor. For each station, we carry out three 'backward' simulations for the three components and save the corresponding strain tensors. The sensitivity kernels at each point in the medium are the convolution of the two sets of the strain tensors. Finally, we show several synthetic tests to verify the effectiveness of the strong stability preserving Runge-Kutta method in generating accurate synthetics in full waveform modeling, and in generating accurate strain tensors for calculating sensitivity kernels at regional and global scales.
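
    To make the structure of a strong stability preserving Runge-Kutta step concrete, the sketch below uses the well-known three-stage, third-order Shu-Osher scheme as a simpler stand-in; the fourth-order scheme used in the work above has more stages and different coefficients, which are not reproduced here. The helper `cfl_time_step` and its parameter names are assumptions showing how the admissible step scales with the CFL number.

```python
def ssprk3_step(rhs, u, dt):
    """Advance u by one step of the third-order Shu-Osher SSP Runge-Kutta scheme.

    Each stage is a convex combination of forward Euler steps, which is
    what gives the scheme its strong stability preserving property.
    """
    def euler(v):
        return [vi + dt * fi for vi, fi in zip(v, rhs(v))]

    def combine(a, x, b, y):
        return [a * xi + b * yi for xi, yi in zip(x, y)]

    u1 = euler(u)
    u2 = combine(0.75, u, 0.25, euler(u1))
    return combine(1.0 / 3.0, u, 2.0 / 3.0, euler(u2))


def cfl_time_step(cfl, dx, max_speed):
    """Largest time step allowed by a CFL limit: dt = cfl * dx / max_speed."""
    return cfl * dx / max_speed
```

    For a fixed grid spacing and wave speed, the time step is proportional to the CFL number, so a scheme that tolerates a larger CFL number needs proportionally fewer steps to reach the same simulated time.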

  9. Time and learning efficiency in Internet-based learning: a systematic review and meta-analysis.

    PubMed

    Cook, David A; Levinson, Anthony J; Garside, Sarah

    2010-12-01

    Authors have claimed that Internet-based instruction promotes greater learning efficiency than non-computer methods. To determine, through a systematic synthesis of evidence in health professions education, how Internet-based instruction compares with non-computer instruction in time spent learning, and what features of Internet-based instruction are associated with improved learning efficiency, we searched databases including MEDLINE, CINAHL, EMBASE, and ERIC from 1990 through November 2008. STUDY SELECTION AND DATA ABSTRACTION: We included all studies quantifying learning time for Internet-based instruction for health professionals, compared with other instruction. Reviewers worked independently, in duplicate, to abstract information on interventions, outcomes, and study design. We identified 20 eligible studies. Random effects meta-analysis of 8 studies comparing Internet-based with non-Internet instruction (positive numbers indicating Internet longer) revealed pooled effect size (ES) for time -0.10 (p = 0.63). Among comparisons of two Internet-based interventions, providing feedback adds time (ES 0.67, p = 0.003, two studies), and greater interactivity generally takes longer (ES 0.25, p = 0.089, five studies). One study demonstrated that adapting to learner prior knowledge saves time without significantly affecting knowledge scores. Other studies revealed that audio narration, video clips, interactive models, and animations increase learning time but also facilitate higher knowledge and/or satisfaction. Across all studies, time correlated positively with knowledge outcomes (r = 0.53, p = 0.021). On average, Internet-based instruction and non-computer instruction require similar time. Instructional strategies to enhance feedback and interactivity typically prolong learning time, but in many cases also enhance learning outcomes. Isolated examples suggest potential for improving efficiency in Internet-based instruction.

  10. A Round-Efficient Authenticated Key Agreement Scheme Based on Extended Chaotic Maps for Group Cloud Meeting

    PubMed Central

    Lee, Tian-Fu; Wang, Zeng-Bo

    2017-01-01

    Security is a critical issue for business purposes. For example, a cloud meeting requires strong security to maintain communication privacy. Considering the cloud meeting scenario, we apply the extended chaotic map to present a passwordless group authentication key agreement, termed Passwordless Group Authentication Key Agreement (PL-GAKA). PL-GAKA improves the computation efficiency of the simple group password-based authenticated key agreement (SGPAKE) proposed by Lee et al. in terms of computing the session key. Since the extended chaotic map has a security level equivalent to that of the Diffie–Hellman key exchange scheme applied by SGPAKE, the security of PL-GAKA is not sacrificed when improving the computation efficiency. Moreover, PL-GAKA is a passwordless scheme, so password maintenance is not necessary. Short-term authentication is considered, hence the communication security is stronger than other protocols by dynamically generating a session key in each cloud meeting. In our analysis, we first prove that each meeting member can get the correct information during the meeting. We analyze common security issues for the proposed PL-GAKA in terms of session key security, mutual authentication, perfect forward security, and data integrity. Moreover, we also demonstrate that communicating in PL-GAKA is secure when suffering replay attacks, impersonation attacks, privileged insider attacks, and stolen-verifier attacks. Finally, an overall comparison is given to show the performance of PL-GAKA, SGPAKE and related solutions. PMID:29207509
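
    The property that lets an extended chaotic map play the role of Diffie–Hellman exponentiation is the semigroup identity of Chebyshev polynomials, T_r(T_s(x)) = T_s(T_r(x)) = T_rs(x), which also holds modulo a prime. The sketch below shows a plain two-party exchange built on that identity; it is not the PL-GAKA group protocol, and the modulus, seed, and secret values are toy assumptions with no security claim.

```python
def chebyshev_mod(n, x, p):
    """Compute the Chebyshev polynomial T_n(x) modulo p in O(log n) steps.

    Tracks the pair (T_k, T_{k+1}) and uses the doubling identities
    T_{2k} = 2*T_k^2 - 1, T_{2k+1} = 2*T_k*T_{k+1} - x, T_{2k+2} = 2*T_{k+1}^2 - 1.
    """
    a, b = 1 % p, x % p                      # (T_0, T_1)
    for bit in bin(n)[2:]:
        t2k = (2 * a * a - 1) % p
        t2k1 = (2 * a * b - x) % p
        if bit == "0":
            a, b = t2k, t2k1
        else:
            a, b = t2k1, (2 * b * b - 1) % p
    return a


# Toy two-party exchange (illustrative parameters only).
p = 2**61 - 1                                # a Mersenne prime used as the modulus
x = 123456789                                # public seed
alice_secret, bob_secret = 987654321, 192837465

alice_public = chebyshev_mod(alice_secret, x, p)
bob_public = chebyshev_mod(bob_secret, x, p)

# Each side applies its own secret to the other's public value.
alice_key = chebyshev_mod(alice_secret, bob_public, p)
bob_key = chebyshev_mod(bob_secret, alice_public, p)
```

    Both parties arrive at T_(alice_secret * bob_secret)(x) mod p without revealing their secrets, mirroring how a shared value is derived in a Diffie–Hellman exchange.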

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hopwood, J.E.; Affeldt, B.

    An IBM personal computer (PC), a Gerber coordinate digitizer, and a collection of other instruments make up a system known as the Coordinate Digitizer Interactive Processor (CDIP). The PC extracts coordinate data from the digitizer through a special interface, and then, after reformatting, transmits the data to a remote VAX computer, a floppy disk, and a display terminal. This system has improved the efficiency of producing printed circuit-board artwork and extended the useful life of the Gerber GCD-1 Digitizer. 1 ref., 12 figs.

  12. Users' manual for computer program for one-dimensional analysis of coupled-cavity traveling wave tubes

    NASA Technical Reports Server (NTRS)

    Omalley, T. A.; Connolly, D. J.

    1977-01-01

    The use of the coupled cavity traveling wave tube for space communications has led to an increased interest in improving the efficiency of the basic interaction process in these devices through velocity resynchronization and other methods. To analyze these methods, a flexible, large signal computer program for use on the IBM 360/67 time-sharing system has been developed. The present report is a users' manual for this program.

  13. A Taylor Expansion-Based Adaptive Design Strategy for Global Surrogate Modeling With Applications in Groundwater Modeling

    DOE PAGES

    Mo, Shaoxing; Lu, Dan; Shi, Xiaoqing; ...

    2017-12-27

    Global sensitivity analysis (GSA) and uncertainty quantification (UQ) for groundwater modeling are challenging because of the model complexity and significant computational requirements. To reduce the massive computational cost, a cheap-to-evaluate surrogate model is usually constructed to approximate and replace the expensive groundwater models in the GSA and UQ. Constructing an accurate surrogate requires actual model simulations on a number of parameter samples. Thus, a robust experimental design strategy is desired to locate informative samples so as to reduce the computational cost in surrogate construction and consequently to improve the efficiency in the GSA and UQ. In this study, we develop a Taylor expansion-based adaptive design (TEAD) that aims to build an accurate global surrogate model with a small training sample size. TEAD defines a novel hybrid score function to search informative samples, and a robust stopping criterion to terminate the sample search that guarantees the resulting approximation errors satisfy the desired accuracy. The good performance of TEAD in building global surrogate models is demonstrated in seven analytical functions with different dimensionality and complexity in comparison to two widely used experimental design methods. The application of the TEAD-based surrogate method in two groundwater models shows that the TEAD design can effectively improve the computational efficiency of GSA and UQ for groundwater modeling.

  14. A Taylor Expansion-Based Adaptive Design Strategy for Global Surrogate Modeling With Applications in Groundwater Modeling

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mo, Shaoxing; Lu, Dan; Shi, Xiaoqing

    Global sensitivity analysis (GSA) and uncertainty quantification (UQ) for groundwater modeling are challenging because of the model complexity and significant computational requirements. To reduce the massive computational cost, a cheap-to-evaluate surrogate model is usually constructed to approximate and replace the expensive groundwater models in the GSA and UQ. Constructing an accurate surrogate requires actual model simulations on a number of parameter samples. Thus, a robust experimental design strategy is desired to locate informative samples so as to reduce the computational cost in surrogate construction and consequently to improve the efficiency in the GSA and UQ. In this study, we develop a Taylor expansion-based adaptive design (TEAD) that aims to build an accurate global surrogate model with a small training sample size. TEAD defines a novel hybrid score function to search informative samples, and a robust stopping criterion to terminate the sample search that guarantees the resulting approximation errors satisfy the desired accuracy. The good performance of TEAD in building global surrogate models is demonstrated in seven analytical functions with different dimensionality and complexity in comparison to two widely used experimental design methods. The application of the TEAD-based surrogate method in two groundwater models shows that the TEAD design can effectively improve the computational efficiency of GSA and UQ for groundwater modeling.

  15. Enhancing Scalability and Efficiency of the TOUGH2_MP for LinuxClusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Keni; Wu, Yu-Shu

    2006-04-17

    TOUGH2_MP, the parallel version of the TOUGH2 code, has been enhanced by implementing more efficient communication schemes. This enhancement is achieved by reducing the number of small-size messages and the volume of large messages. The message exchange speed is further improved by using non-blocking communications for both linear and nonlinear iterations. In addition, we have modified the AZTEC parallel linear-equation solver to use non-blocking communication. Through improved code structuring and bug fixing, the new version of the code is now more stable, while demonstrating similar or even better nonlinear iteration convergence speed than the original TOUGH2 code. As a result, the new version of TOUGH2_MP is improved significantly in its efficiency. In this paper, the scalability and efficiency of the parallel code are demonstrated by solving two large-scale problems. The testing results indicate that speedup of the code may depend on both problem size and complexity. In general, the code has excellent scalability in memory requirement as well as computing time.

  16. Using Chebyshev polynomial interpolation to improve the computational efficiency of gravity models near an irregularly-shaped asteroid

    NASA Astrophysics Data System (ADS)

    Hu, Shou-Cun; Ji, Jiang-Hui

    2017-12-01

    In asteroid rendezvous missions, the dynamical environment near an asteroid’s surface should be made clear prior to launch of the mission. However, most asteroids have irregular shapes, which lower the efficiency of calculating their gravitational field by adopting the traditional polyhedral method. In this work, we propose a method to partition the space near an asteroid adaptively along three spherical coordinates and use Chebyshev polynomial interpolation to represent the gravitational acceleration in each cell. Moreover, we compare four different interpolation schemes to obtain the best precision with identical initial parameters. An error-adaptive octree division is combined to improve the interpolation precision near the surface. As an example, we take the typical irregularly-shaped near-Earth asteroid 4179 Toutatis to demonstrate the advantage of this method; as a result, we show that the efficiency can be increased by hundreds to thousands of times with our method. Our results indicate that this method can be applicable to other irregularly-shaped asteroids and can greatly improve the evaluation efficiency.
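
    The interpolation idea in this abstract can be illustrated in one dimension. The following is only a minimal sketch, not the authors' implementation: it assumes a single radial cell [a, b] and a toy 1/r^2 "gravity" function, and the names chebyshev_nodes, chebyshev_coefficients and chebyshev_eval are hypothetical. The adaptive three-coordinate partition and the octree refinement described in the abstract are not shown.

```python
import math

def chebyshev_nodes(n, a, b):
    """Return n Chebyshev nodes mapped from [-1, 1] to the cell [a, b]."""
    return [0.5 * (a + b) + 0.5 * (b - a) * math.cos(math.pi * (k + 0.5) / n)
            for k in range(n)]

def chebyshev_coefficients(f, n, a, b):
    """Coefficients c_j with f(x) ~ sum_j c_j T_j(t), where t maps [a, b] to [-1, 1]."""
    values = [f(x) for x in chebyshev_nodes(n, a, b)]
    coeffs = []
    for j in range(n):
        s = sum(values[k] * math.cos(math.pi * j * (k + 0.5) / n)
                for k in range(n))
        coeffs.append((1.0 if j == 0 else 2.0) * s / n)
    return coeffs

def chebyshev_eval(coeffs, x, a, b):
    """Evaluate the interpolant at x with the Clenshaw recurrence."""
    t = (2.0 * x - a - b) / (b - a)
    u1 = u2 = 0.0
    for c in reversed(coeffs[1:]):
        u1, u2 = 2.0 * t * u1 - u2 + c, u1
    return t * u1 - u2 + coeffs[0]

# Toy stand-in for an expensive gravity evaluation (assumption: point mass, GM = 1).
def toy_gravity(r):
    return 1.0 / r ** 2

# Fit once per cell, then evaluate cheaply many times.
cell_coeffs = chebyshev_coefficients(toy_gravity, 16, 1.0, 2.0)
approx = chebyshev_eval(cell_coeffs, 1.3, 1.0, 2.0)
exact = toy_gravity(1.3)
```

    The expensive function is sampled only at the nodes of each cell; afterwards every evaluation costs a short recurrence, which is where the speed-up in such schemes typically comes from.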

  17. Machine learning action parameters in lattice quantum chromodynamics

    NASA Astrophysics Data System (ADS)

    Shanahan, Phiala E.; Trewartha, Daniel; Detmold, William

    2018-05-01

    Numerical lattice quantum chromodynamics studies of the strong interaction are important in many aspects of particle and nuclear physics. Such studies require significant computing resources to undertake. A number of proposed methods promise improved efficiency of lattice calculations, and access to regions of parameter space that are currently computationally intractable, via multi-scale action-matching approaches that necessitate parametric regression of generated lattice datasets. The applicability of machine learning to this regression task is investigated, with deep neural networks found to provide an efficient solution even in cases where approaches such as principal component analysis fail. The high information content and complex symmetries inherent in lattice QCD datasets require custom neural network layers to be introduced and present opportunities for further development.

  18. Towards developing robust algorithms for solving partial differential equations on MIMD machines

    NASA Technical Reports Server (NTRS)

    Saltz, Joel H.; Naik, Vijay K.

    1988-01-01

    Methods for efficient computation of numerical algorithms on a wide variety of MIMD machines are proposed. These techniques reorganize the data dependency patterns to improve the processor utilization. The model problem finds the time-accurate solution to a parabolic partial differential equation discretized in space and implicitly marched forward in time. The algorithms are extensions of Jacobi and SOR. The extensions consist of iterating over a window of several timesteps, allowing efficient overlap of computation with communication. The methods increase the degree to which work can be performed while data are communicated between processors. The effect of the window size and of domain partitioning on the system performance is examined both by implementing the algorithm on a simulated multiprocessor system.
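
    As background for the Jacobi-type methods mentioned here, the sketch below solves one implicit (backward Euler) time step of the 1D heat equation with plain Jacobi iteration. It is a serial baseline under stated assumptions (fixed Dirichlet boundary values, r = dt/dx^2, hypothetical function name implicit_step_jacobi); the windowed iteration over several timesteps and the overlap of computation with communication described in the abstract are not shown.

```python
import math

def implicit_step_jacobi(u_old, r, tol=1e-12, max_iter=10_000):
    """Solve (1 + 2r) u_i - r (u_{i-1} + u_{i+1}) = u_old_i by Jacobi iteration.

    Boundary entries are held fixed at their old values (Dirichlet condition).
    """
    n = len(u_old)
    u = list(u_old)
    for _ in range(max_iter):
        # Every interior update uses only values from the previous sweep,
        # so all points could be updated independently (in parallel).
        u_new = [u_old[0]] + [
            (u_old[i] + r * (u[i - 1] + u[i + 1])) / (1.0 + 2.0 * r)
            for i in range(1, n - 1)
        ] + [u_old[-1]]
        change = max(abs(p - q) for p, q in zip(u_new, u))
        u = u_new
        if change < tol:
            break
    return u

# Example: one implicit step for a sine-shaped initial profile on 11 grid points.
grid_points = 11
u_start = [math.sin(math.pi * i / (grid_points - 1)) for i in range(grid_points)]
u_next = implicit_step_jacobi(u_start, r=0.5)
```

    Because each sweep depends only on the previous sweep, the update is easy to distribute across processors, which is the property that such parallel extensions build on.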

  19. A review of critical in-flight events research methodology

    NASA Technical Reports Server (NTRS)

    Giffin, W. C.; Rockwell, T. H.; Smith, P. E.

    1985-01-01

    Pilots' cognitive responses to critical in-flight events (CIFEs) were investigated, using pilots who had on average about 2540 flight hours each, in four experiments: (1) full-mission simulation in a general aviation trainer, (2) paper and pencil CIFE tests, (3) interactive computer-aided scenario testing, and (4) verbal protocols in fault diagnosis tasks. The results of both computer and paper and pencil tests showed only 50 percent efficiency in correct diagnosis of critical events. The efficiency in arriving at a diagnosis was also low: over 20 inquiries were made for 21 percent of the scenarios diagnosed. The information-seeking pattern was random, with frequent retracing over old inquiries. The measures for developing improved cognitive skills for CIFEs are discussed.

  20. Towards developing robust algorithms for solving partial differential equations on MIMD machines

    NASA Technical Reports Server (NTRS)

    Saltz, J. H.; Naik, V. K.

    1985-01-01

    Methods for efficient computation of numerical algorithms on a wide variety of MIMD machines are proposed. These techniques reorganize the data dependency patterns to improve the processor utilization. The model problem finds the time-accurate solution to a parabolic partial differential equation discretized in space and implicitly marched forward in time. The algorithms are extensions of Jacobi and SOR. The extensions consist of iterating over a window of several timesteps, allowing efficient overlap of computation with communication. The methods increase the degree to which work can be performed while data are communicated between processors. The effect of the window size and of domain partitioning on the system performance is examined both by implementing the algorithm on a simulated multiprocessor system.

  1. Computer enhancement through interpretive techniques

    NASA Technical Reports Server (NTRS)

    Foster, G.; Spaanenburg, H. A. E.; Stumpf, W. E.

    1972-01-01

    The improvement in the usage of the digital computer through the use of the technique of interpretation rather than the compilation of higher ordered languages was investigated by studying the efficiency of coding and execution of programs written in FORTRAN, ALGOL, PL/I and COBOL. FORTRAN was selected as the high level language for examining programs which were compiled, and A Programming Language (APL) was chosen for the interpretive language. It is concluded that APL is competitive, not because it and the algorithms being executed are well written, but rather because the batch processing is less efficient than has been admitted. There is not a broad base of experience founded on trying different implementation strategies which have been targeted at open competition with traditional processing methods.

  2. Multigrid treatment of implicit continuum diffusion

    NASA Astrophysics Data System (ADS)

    Francisquez, Manaure; Zhu, Ben; Rogers, Barrett

    2017-10-01

    Implicit treatment of diffusive terms of various differential orders common in continuum mechanics modeling, such as computational fluid dynamics, is investigated with spectral and multigrid algorithms in non-periodic 2D domains. In doubly periodic time dependent problems these terms can be efficiently and implicitly handled by spectral methods, but in non-periodic systems solved with distributed memory parallel computing and 2D domain decomposition, this efficiency is lost for large numbers of processors. We built and present here a multigrid algorithm for these types of problems which outperforms a spectral solution that employs the highly optimized FFTW library. This multigrid algorithm is not only suitable for high performance computing but may also be able to efficiently treat implicit diffusion of arbitrary order by introducing auxiliary equations of lower order. We test these solvers for fourth and sixth order diffusion with idealized harmonic test functions as well as a turbulent 2D magnetohydrodynamic simulation. It is also shown that an anisotropic operator without cross-terms can improve model accuracy and speed, and we examine the impact that the various diffusion operators have on the energy, the enstrophy, and the qualitative aspect of a simulation. This work was supported by DOE-SC-0010508. This research used resources of the National Energy Research Scientific Computing Center (NERSC).

  3. Fundamental energy limits of SET-based Brownian NAND and half-adder circuits. Preliminary findings from a physical-information-theoretic methodology

    NASA Astrophysics Data System (ADS)

    Ercan, İlke; Suyabatmaz, Enes

    2018-06-01

    The saturation in the efficiency and performance scaling of conventional electronic technologies brings about the development of novel computational paradigms. Brownian circuits are among the promising alternatives that can exploit fluctuations to increase the efficiency of information processing in nanocomputing. A Brownian cellular automaton, where signals propagate randomly and are driven by local transition rules, can be made computationally universal by embedding arbitrary asynchronous circuits on it. One of the potential realizations of such circuits is via single electron tunneling (SET) devices, since SET technology enables simulation of noise and fluctuations in a fashion similar to Brownian search. In this paper, we perform a physical-information-theoretic analysis of the efficiency limitations in Brownian NAND and half-adder circuits implemented using SET technology. The method we employ here establishes a solid ground that enables studying computational and physical features of this emerging technology on an equal footing, and yields fundamental lower bounds that provide valuable insights into how far its efficiency can be improved in principle. In order to provide a basis for comparison, we also analyze a NAND gate and half-adder circuit implemented in complementary metal oxide semiconductor technology to show how the fundamental bound of the Brownian circuit compares against a conventional paradigm.

  4. Multiscale finite element modeling of sheet molding compound (SMC) composite structure based on stochastic mesostructure reconstruction

    DOE PAGES

    Chen, Zhangxing; Huang, Tianyu; Shao, Yimin; ...

    2018-03-15

    Predicting the mechanical behavior of the chopped carbon fiber Sheet Molding Compound (SMC) due to spatial variations in local material properties is critical for the structural performance analysis but is computationally challenging. Such spatial variations are induced by the material flow in the compression molding process. In this work, a new multiscale SMC modeling framework and the associated computational techniques are developed to provide accurate and efficient predictions of SMC mechanical performance. The proposed multiscale modeling framework contains three modules. First, a stochastic algorithm for 3D chip-packing reconstruction is developed to efficiently generate the SMC mesoscale Representative Volume Element (RVE) model for Finite Element Analysis (FEA). A new fiber orientation tensor recovery function is embedded in the reconstruction algorithm to match reconstructions with the target characteristics of fiber orientation distribution. Second, a metamodeling module is established to improve the computational efficiency by creating the surrogates of mesoscale analyses. Third, the macroscale behaviors are predicted by an efficient multiscale model, in which the spatially varying material properties are obtained based on the local fiber orientation tensors. Our approach is further validated through experiments at both meso- and macro-scales, such as tensile tests assisted by Digital Image Correlation (DIC) and mesostructure imaging.

  5. Multiscale finite element modeling of sheet molding compound (SMC) composite structure based on stochastic mesostructure reconstruction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Zhangxing; Huang, Tianyu; Shao, Yimin

    Predicting the mechanical behavior of the chopped carbon fiber Sheet Molding Compound (SMC) due to spatial variations in local material properties is critical for the structural performance analysis but is computationally challenging. Such spatial variations are induced by the material flow in the compression molding process. In this work, a new multiscale SMC modeling framework and the associated computational techniques are developed to provide accurate and efficient predictions of SMC mechanical performance. The proposed multiscale modeling framework contains three modules. First, a stochastic algorithm for 3D chip-packing reconstruction is developed to efficiently generate the SMC mesoscale Representative Volume Element (RVE) model for Finite Element Analysis (FEA). A new fiber orientation tensor recovery function is embedded in the reconstruction algorithm to match reconstructions with the target characteristics of fiber orientation distribution. Second, a metamodeling module is established to improve the computational efficiency by creating the surrogates of mesoscale analyses. Third, the macroscale behaviors are predicted by an efficient multiscale model, in which the spatially varying material properties are obtained based on the local fiber orientation tensors. Our approach is further validated through experiments at both meso- and macro-scales, such as tensile tests assisted by Digital Image Correlation (DIC) and mesostructure imaging.

  6. ZigBee-based wireless intra-oral control system for quadriplegic patients.

    PubMed

    Peng, Qiyu; Budinger, Thomas F

    2007-01-01

    A human-to-computer system that includes a wireless intra-oral module, a wireless coordinator and distributed wireless controllers, is presented. The state-of-the-art ZigBee protocol is employed to achieve reliable, low-power and cost-efficient wireless communication between the tongue, computer and controllers. By manipulating five buttons on the wireless intra-oral module using the tongue, the subject can control cursors, computer menus, wheelchair, lights, TV, phone and robotic devices. The system is designed to improve the life quality of patients with stroke and patients with spinal cord injury.

  7. A Hierarchical Algorithm for Fast Debye Summation with Applications to Small Angle Scattering

    PubMed Central

    Gumerov, Nail A.; Berlin, Konstantin; Fushman, David; Duraiswami, Ramani

    2012-01-01

    Debye summation, which involves the summation of sinc functions of distances between all pair of atoms in three dimensional space, arises in computations performed in crystallography, small/wide angle X-ray scattering (SAXS/WAXS) and small angle neutron scattering (SANS). Direct evaluation of Debye summation has quadratic complexity, which results in computational bottleneck when determining crystal properties, or running structure refinement protocols that involve SAXS or SANS, even for moderately sized molecules. We present a fast approximation algorithm that efficiently computes the summation to any prescribed accuracy ε in linear time. The algorithm is similar to the fast multipole method (FMM), and is based on a hierarchical spatial decomposition of the molecule coupled with local harmonic expansions and translation of these expansions. An even more efficient implementation is possible when the scattering profile is all that is required, as in small angle scattering reconstruction (SAS) of macromolecules. We examine the relationship of the proposed algorithm to existing approximate methods for profile computations, and show that these methods may result in inaccurate profile computations, unless an error bound derived in this paper is used. Our theoretical and computational results show orders of magnitude improvement in computation complexity over existing methods, while maintaining prescribed accuracy. PMID:22707386
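
    For reference, the direct quadratic-cost evaluation that this abstract sets out to accelerate can be sketched as follows. This is a minimal illustration, not the hierarchical algorithm of the paper; the per-atom weights (form_factors) and the function names are assumptions made for the example.

```python
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limit value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def debye_intensity(positions, form_factors, q):
    """Direct Debye sum: I(q) = sum_i sum_j f_i f_j sinc(q * r_ij).

    The double loop visits every pair of atoms once, so the cost grows
    quadratically with the number of atoms.
    """
    n = len(positions)
    total = sum(f * f for f in form_factors)  # i == j terms, sinc(0) = 1
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            # Each unordered pair contributes twice (i, j) and (j, i).
            total += 2.0 * form_factors[i] * form_factors[j] * sinc(q * r)
    return total

# Example: two unit-weight atoms one length unit apart.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
weights = [1.0, 1.0]
intensity = debye_intensity(atoms, weights, q=math.pi / 2)
```

    At q = 0 every sinc term equals 1, so the sum reduces to the squared total weight, which gives a quick sanity check for any faster approximate method.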

  8. Improved numerical methods for turbulent viscous recirculating flows

    NASA Technical Reports Server (NTRS)

    Vandoormaal, J. P.; Turan, A.; Raithby, G. D.

    1986-01-01

    The objective of the present study is to improve both the accuracy and computational efficiency of existing numerical techniques used to predict viscous recirculating flows in combustors. A review of the status of the study is presented along with some illustrative results. The effort to improve the numerical techniques consists of the following technical tasks: (1) selection of numerical techniques to be evaluated; (2) two dimensional evaluation of selected techniques; and (3) three dimensional evaluation of technique(s) recommended in Task 2.

  9. A computationally efficient parallel Levenberg-Marquardt algorithm for highly parameterized inverse model analyses

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.

    Inverse modeling seeks model parameters given a set of observations. However, for practical problems because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally-efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace, such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2D and a random hydraulic conductivity field in 3D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multi-core computational environment. Furthermore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate- to large-scale problems.
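
    The role of the damping parameter and of the linear system mentioned in this abstract can be seen in a small dense example. The sketch below is an assumption-laden illustration, not the MADS implementation: it fits a two-parameter model y = a * exp(b * x) and solves the damped 2x2 normal equations directly, whereas the paper projects a much larger system onto a recycled Krylov subspace. The function name lm_fit_exponential and the factor-of-ten damping schedule are choices made for this example.

```python
import math

def lm_fit_exponential(xs, ys, a, b, lam=1e-2, max_iter=100):
    """Fit y = a * exp(b * x) with a basic Levenberg-Marquardt loop."""
    def cost(pa, pb):
        return sum((y - pa * math.exp(pb * x)) ** 2 for x, y in zip(xs, ys))

    current = cost(a, b)
    for _ in range(max_iter):
        # Accumulate H = J^T J and g = J^T r for the two parameters.
        h11 = h12 = h22 = g1 = g2 = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = y - a * e
            ja, jb = e, a * x * e  # derivatives of the model w.r.t. a and b
            h11 += ja * ja
            h12 += ja * jb
            h22 += jb * jb
            g1 += ja * r
            g2 += jb * r
        # Solve the damped system (H + lam * I) delta = g by Cramer's rule.
        det = (h11 + lam) * (h22 + lam) - h12 * h12
        da = ((h22 + lam) * g1 - h12 * g2) / det
        db = ((h11 + lam) * g2 - h12 * g1) / det
        trial = cost(a + da, b + db)
        if trial < current:
            a, b, current = a + da, b + db, trial
            lam /= 10.0   # step accepted: move toward Gauss-Newton
        else:
            lam *= 10.0   # step rejected: move toward gradient descent
        if max(abs(da), abs(db)) < 1e-12:
            break
    return a, b

# Example: noise-free synthetic data generated with a = 2, b = -1.
sample_x = [0.0, 1.0, 2.0, 3.0]
sample_y = [2.0 * math.exp(-x) for x in sample_x]
fit_a, fit_b = lm_fit_exponential(sample_x, sample_y, a=1.0, b=0.0)
```

    Each trial value of the damping parameter requires a new linear solve with the same J^T J, which is why reusing work across damping parameters, as the abstract describes, can pay off for large parameter spaces.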

  10. Hadoop-MCC: Efficient Multiple Compound Comparison Algorithm Using Hadoop.

    PubMed

    Hua, Guan-Jie; Hung, Che-Lun; Tang, Chuan Yi

    2018-01-01

    In the past decade, drug design technologies have improved enormously. Computer-aided drug design (CADD) has played an important role in analysis and prediction in drug development, making the procedure more economical and efficient. However, computation with big data, such as ZINC, which contains more than 60 million compounds, and GDB-13, with more than 930 million small molecules, is notably time-consuming. Therefore, we propose a novel heterogeneous high performance computing method, named Hadoop-MCC, which integrates Hadoop and GPU to cope with big chemical structure data efficiently. Hadoop-MCC gains high availability and fault tolerance from Hadoop, as Hadoop is used to scatter input data to GPU devices and gather the results from GPU devices. The Hadoop framework adopts the mapper/reducer computation model. In the proposed method, mappers are responsible for fetching SMILES data segments and performing the LINGO method on GPU, then reducers collect all comparison results produced by mappers. Due to the high availability of Hadoop, all LINGO computational jobs on mappers can be completed, even if some of the mappers encounter problems. A LINGO comparison is performed on each GPU device in parallel. According to the experimental results, the proposed method on multiple GPU devices can achieve better computational performance than CUDA-MCC on a single GPU device. Hadoop-MCC is able to achieve the scalability, high availability, and fault tolerance granted by Hadoop, as well as high performance, by integrating the computational power of both Hadoop and GPU. It has been shown that a heterogeneous architecture such as Hadoop-MCC can effectively achieve better computational performance than a single GPU device.

  11. A computationally efficient parallel Levenberg-Marquardt algorithm for highly parameterized inverse model analyses

    DOE PAGES

    Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.

    2016-09-01

    Inverse modeling seeks model parameters given a set of observations. However, for practical problems because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally-efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace, such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2D and a random hydraulic conductivity field in 3D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multi-core computational environment. Furthermore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate- to large-scale problems.

  12. Improved Algorithms Speed It Up for Codes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hazi, A

    2005-09-20

    Huge computers, huge codes, complex problems to solve. The longer it takes to run a code, the more it costs. One way to speed things up and save time and money is through hardware improvements--faster processors, different system designs, bigger computers. But another side of supercomputing can reap savings in time and speed: software improvements to make codes--particularly the mathematical algorithms that form them--run faster and more efficiently. Speed up math? Is that really possible? According to Livermore physicist Eugene Brooks, the answer is a resounding yes. "Sure, you get great speed-ups by improving hardware," says Brooks, the deputy leader for Computational Physics in N Division, which is part of Livermore's Physics and Advanced Technologies (PAT) Directorate. "But the real bonus comes on the software side, where improvements in software can lead to orders of magnitude improvement in run times." Brooks knows whereof he speaks. Working with Laboratory physicist Abraham Szoeke and others, he has been instrumental in devising ways to shrink the running time of what has, historically, been a tough computational nut to crack: radiation transport codes based on the statistical or Monte Carlo method of calculation. And Brooks is not the only one. Others around the Laboratory, including physicists Andrew Williamson, Randolph Hood, and Jeff Grossman, have come up with innovative ways to speed up Monte Carlo calculations using pure mathematics.

  13. Assisting People with Multiple Disabilities and Minimal Motor Behavior to Improve Computer Drag-and-Drop Efficiency through a Mouse Wheel

    ERIC Educational Resources Information Center

    Shih, Ching-Hsiang

    2011-01-01

    This study evaluated whether two people with multiple disabilities and minimal motor behavior would be able to improve their Drag-and-Drop (DnD) performance using their finger/thumb poke ability with a mouse scroll wheel through a Dynamic Drag-and-Drop Assistive Program (DDnDAP). A multiple probe design across participants was used in this study…

  14. Achieving Helicopter Modernization with Advanced Technology Turbine Engines

    DTIC Science & Technology

    1999-04-01

    computer modeling of compressor and turbine aerody- digital engine control ( FADEC ) with manual backup. namics. Modern directionally solidified and single...controlled by a dual RAH.66A M channel FADEC , and features a very simple installation "" Improved Gross Weight and significantly reduced pilot...air separation efficiencies as an "advanced technology" engine. Technological meas- high as 97.5%. The FADEC improves acceleration, ures include but

  15. An Adaptive Dynamic Pointing Assistance Program to Help People with Multiple Disabilities Improve Their Computer Pointing Efficiency with Hand Swing through a Standard Mouse

    ERIC Educational Resources Information Center

    Shih, Ching-Hsiang; Shih, Ching-Tien; Wu, Hsiao-Ling

    2010-01-01

    The latest research adopted software technology to redesign the mouse driver, and turned a mouse into a useful pointing assistive device for people with multiple disabilities who cannot easily or possibly use a standard mouse, to improve their pointing performance through a new operation method, Extended Dynamic Pointing Assistive Program (EDPAP),…

  16. Extending compile-time reverse mode and exploiting partial separability in ADIFOR

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bischof, C.H.; El-Khadiri, M.

    1992-10-01

    The numerical methods employed in the solution of many scientific computing problems require the computation of the gradient of a function f: R^n → R. ADIFOR is a source translator that, given a collection of subroutines to compute f, generates Fortran 77 code for computing the derivative of this function. Using the so-called torsion problem from the MINPACK-2 test collection as an example, this paper explores two issues in automatic differentiation: the efficient computation of derivatives for partially separable functions and the use of the compile-time reverse mode for the generation of derivatives. We show that orders of magnitude of improvement are possible when exploiting partial separability and maximizing use of the reverse mode.
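
    The idea of exploiting partial separability can be illustrated with a minimal Python sketch. It is not ADIFOR, not Fortran 77, and not the torsion problem; the toy function f(x) = sum_i (x[i] - x[i+1])^2 and the names `element`, `f_and_grad`, and `fd_grad` are assumptions chosen for illustration. Each element function depends on only two variables, so its small gradient can be coded directly and scattered into the full gradient, instead of differentiating f as one monolithic function of n variables.

```python
def element(a, b):
    """One element function (a - b)^2 and its gradient w.r.t. (a, b)."""
    d = a - b
    return d * d, (2.0 * d, -2.0 * d)


def f_and_grad(x):
    """Evaluate f(x) = sum_i (x[i] - x[i+1])^2 and its gradient.

    The gradient is assembled by scatter-adding the two-variable
    element gradients, so the cost grows linearly with len(x).
    """
    total = 0.0
    grad = [0.0] * len(x)
    for i in range(len(x) - 1):
        val, (ga, gb) = element(x[i], x[i + 1])
        total += val
        grad[i] += ga
        grad[i + 1] += gb
    return total, grad


def fd_grad(x, h=1e-6):
    """Central finite differences on the full function, as a cross-check.

    This needs 2 * len(x) full evaluations of f, which is the kind of
    cost that structure-aware differentiation avoids.
    """
    g = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f_and_grad(xp)[0] - f_and_grad(xm)[0]) / (2.0 * h))
    return g
```

    For x = [1.0, 3.0, 2.0] the assembled gradient can be compared entry by entry with `fd_grad(x)`; the two should agree to within the finite-difference error.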

  17. Assessing the potential for improved scramjet performance through application of electromagnetic flow control

    NASA Astrophysics Data System (ADS)

    Lindsey, Martin Forrester

    Sustained hypersonic flight using scramjet propulsion is the key technology bridging the gap between turbojets and the exoatmospheric environment where a rocket is required. Recent efforts have focused on electromagnetic (EM) flow control to mitigate the problems of high thermomechanical loads and low propulsion efficiencies associated with scramjet propulsion. This research effort is the first flight-scale, three-dimensional computational analysis of a realistic scramjet to determine how EM flow control can improve scramjet performance. Development of a quasi-one dimensional design tool culminated in the first open source geometry of an entire scramjet flowpath. This geometry was then tested extensively with the Air Force Research Laboratory's three-dimensional Navier-Stokes and EM coupled computational code. As part of improving the model fidelity, a loosely coupled algorithm was developed to incorporate thermochemistry. This resulted in the only open-source model of fuel injection, mixing and combustion in a magnetogasdynamic (MGD) flow controlled engine. In addition, a control volume analysis tool with an electron beam ionization model was presented for the first time in the context of the established computational method used. Local EM flow control within the internal inlet greatly impacted drag forces and wall heat transfer but was only marginally successful in raising the average pressure entering the combustor. The use of an MGD accelerator to locally increase flow momentum was an effective approach to improve flow into the scramjet's isolator. Combustor-based MGD generators proved superior to the inlet generator with respect to power density and overall engine efficiency. MGD acceleration was shown to be ineffective in improving overall performance, with all of the bypass engines having approximately 33% more drag than baseline and none of them achieving a self-powered state.

  18. Test Administration Models

    ERIC Educational Resources Information Center

    Becker, Kirk A.; Bergstrom, Betty A.

    2013-01-01

    The need for increased exam security, improved test formats, more flexible scheduling, better measurement, and more efficient administrative processes has caused testing agencies to consider converting the administration of their exams from paper-and-pencil to computer-based testing (CBT). Many decisions must be made in order to provide an optimal…

  19. Manufacturing Processes: New Methods for the "Materials Age." Resources in Technology.

    ERIC Educational Resources Information Center

    Technology Teacher, 1990

    1990-01-01

    To make the best use of new materials developed for everything from computers to artificial hearts to more fuel-efficient cars, improved materials syntheses and manufacturing processes are needed. This instructional module includes teacher materials, a student quiz, and possible student outcomes. (JOW)

  20. Assessment of students' ability to incorporate a computer into increasingly complex simulated patient encounters.

    PubMed

    Ray, Sarah; Valdovinos, Katie

    Pharmacy students should be exposed to and offered opportunities to practice the skill of incorporating a computer into a patient interview in the didactic setting. Faculty sought to improve retention of student ability to incorporate computers into their patient-pharmacist communication. Students were required to utilize a computer to document clinical information gathered during a simulated patient encounter (SPE). Students utilized electronic worksheets and were evaluated by instructors on their ability to effectively incorporate a computer into a SPE using a rubric. Students received specific instruction on effective computer use during patient encounters. Students were then re-evaluated by an instructor during subsequent SPEs of increasing complexity using standardized rubrics blinded from the students. Pre-instruction, 45% of students effectively incorporated a computer into a SPE. After receiving instruction, 67% of students were effective in their use of a computer during a SPE of performing a pharmaceutical care assessment for a patient with chronic obstructive pulmonary disease (COPD) (p < 0.05 compared to pre-instruction), and 58% of students were effective in their use of a computer during a SPE of retrieving a medication list and social history from a simulated alcohol-impaired patient (p = 0.087 compared to pre-instruction). Instruction can improve pharmacy students' ability to incorporate a computer into SPEs, a critical skill in building and maintaining rapport with patients and improving efficiency of patient visits. Complex encounters may affect students' ability to utilize a computer appropriately. Students may benefit from repeated practice with this skill, especially with SPEs of increasing complexity. Copyright © 2016 Elsevier Inc. All rights reserved.

  1. Energy Efficient Engine: Control system component performance report

    NASA Technical Reports Server (NTRS)

    Beitler, R. S.; Bennett, G. W.

    1984-01-01

    An Energy Efficient Engine (E3) program was established to develop technology for improving the energy efficiency of future commercial transport aircraft engines. As part of this program, General Electric designed and tested a new engine. The design, fabrication, bench and engine testing of the Full Authority Digital Electronic Control (FADEC) system used for controlling the E3 Demonstrator Engine is described. The system design was based on many of the proven concepts and component designs used on the General Electric family of engines. One significant difference is the use of the FADEC in place of hydromechanical computation currently used.

  2. Towards the clinical implementation of iterative low-dose cone-beam CT reconstruction in image-guided radiation therapy: Cone/ring artifact correction and multiple GPU implementation

    PubMed Central

    Yan, Hao; Wang, Xiaoyu; Shi, Feng; Bai, Ti; Folkerts, Michael; Cervino, Laura; Jiang, Steve B.; Jia, Xun

    2014-01-01

    Purpose: Compressed sensing (CS)-based iterative reconstruction (IR) techniques are able to reconstruct cone-beam CT (CBCT) images from undersampled noisy data, allowing for imaging dose reduction. However, there are a few practical concerns preventing the clinical implementation of these techniques. On the image quality side, data truncation along the superior–inferior direction under the cone-beam geometry produces severe cone artifacts in the reconstructed images. Ring artifacts are also seen in the half-fan scan mode. On the reconstruction efficiency side, the long computation time hinders clinical use in image-guided radiation therapy (IGRT). Methods: Image quality improvement methods are proposed to mitigate the cone and ring image artifacts in IR. The basic idea is to use weighting factors in the IR data fidelity term to improve projection data consistency with the reconstructed volume. In order to improve the computational efficiency, a multiple graphics processing units (GPUs)-based CS-IR system was developed. The parallelization scheme, detailed analyses of computation time at each step, their relationship with image resolution, and the acceleration factors were studied. The whole system was evaluated in various phantom and patient cases. Results: Ring artifacts can be mitigated by properly designing a weighting factor as a function of the spatial location on the detector. As for the cone artifact, without applying a correction method, it contaminated 13 out of 80 slices in a head-neck case (full-fan). Contamination was even more severe in a pelvis case under half-fan mode, where 36 out of 80 slices were affected, leading to poorer soft tissue delineation and reduced superior–inferior coverage. The proposed method effectively corrects those contaminated slices with mean intensity differences compared to FDK results decreasing from ∼497 and ∼293 HU to ∼39 and ∼27 HU for the full-fan and half-fan cases, respectively. 
In terms of efficiency boost, an overall 3.1 × speedup factor has been achieved with four GPU cards compared to a single GPU-based reconstruction. The total computation time is ∼30 s for typical clinical cases. Conclusions: The authors have developed a low-dose CBCT IR system for IGRT. By incorporating data consistency-based weighting factors in the IR model, cone/ring artifacts can be mitigated. A boost in computational efficiency is achieved by multi-GPU implementation. PMID:25370645
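
    To put the reported 3.1 × speedup on four GPU cards in context, the short Python sketch below computes the parallel efficiency (speedup divided by device count) and the Karp-Flatt experimentally determined serial fraction. Only the figures 3.1 and 4 come from the abstract; the function names are hypothetical, and reading the result as a serial fraction assumes an Amdahl-style model that the abstract itself does not state.

```python
def parallel_efficiency(speedup, n_devices):
    """Fraction of ideal linear scaling that was actually achieved."""
    return speedup / n_devices


def karp_flatt_serial_fraction(speedup, n_devices):
    """Karp-Flatt metric: e = (1/S - 1/n) / (1 - 1/n).

    Under an Amdahl-style model (an assumption here), e estimates the
    share of the work that does not parallelize.
    """
    if n_devices < 2:
        raise ValueError("need at least two devices")
    return (1.0 / speedup - 1.0 / n_devices) / (1.0 - 1.0 / n_devices)


if __name__ == "__main__":
    s, n = 3.1, 4  # figures reported in the abstract
    print(f"parallel efficiency: {parallel_efficiency(s, n):.3f}")
    print(f"Karp-Flatt serial fraction: {karp_flatt_serial_fraction(s, n):.3f}")
```

    With these inputs the efficiency is about 0.775, and the estimated non-parallel share is a little under 10%.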

  3. Towards the clinical implementation of iterative low-dose cone-beam CT reconstruction in image-guided radiation therapy: Cone/ring artifact correction and multiple GPU implementation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yan, Hao; Shi, Feng; Jiang, Steve B.

    Purpose: Compressed sensing (CS)-based iterative reconstruction (IR) techniques are able to reconstruct cone-beam CT (CBCT) images from undersampled noisy data, allowing for imaging dose reduction. However, there are a few practical concerns preventing the clinical implementation of these techniques. On the image quality side, data truncation along the superior–inferior direction under the cone-beam geometry produces severe cone artifacts in the reconstructed images. Ring artifacts are also seen in the half-fan scan mode. On the reconstruction efficiency side, the long computation time hinders clinical use in image-guided radiation therapy (IGRT). Methods: Image quality improvement methods are proposed to mitigate the cone and ring image artifacts in IR. The basic idea is to use weighting factors in the IR data fidelity term to improve projection data consistency with the reconstructed volume. In order to improve the computational efficiency, a multiple graphics processing units (GPUs)-based CS-IR system was developed. The parallelization scheme, detailed analyses of computation time at each step, their relationship with image resolution, and the acceleration factors were studied. The whole system was evaluated in various phantom and patient cases. Results: Ring artifacts can be mitigated by properly designing a weighting factor as a function of the spatial location on the detector. As for the cone artifact, without applying a correction method, it contaminated 13 out of 80 slices in a head-neck case (full-fan). Contamination was even more severe in a pelvis case under half-fan mode, where 36 out of 80 slices were affected, leading to poorer soft tissue delineation and reduced superior–inferior coverage. The proposed method effectively corrects those contaminated slices with mean intensity differences compared to FDK results decreasing from ∼497 and ∼293 HU to ∼39 and ∼27 HU for the full-fan and half-fan cases, respectively.
In terms of efficiency boost, an overall 3.1 × speedup factor has been achieved with four GPU cards compared to a single GPU-based reconstruction. The total computation time is ∼30 s for typical clinical cases. Conclusions: The authors have developed a low-dose CBCT IR system for IGRT. By incorporating data consistency-based weighting factors in the IR model, cone/ring artifacts can be mitigated. A boost in computational efficiency is achieved by multi-GPU implementation.

  4. Data distribution method of workflow in the cloud environment

    NASA Astrophysics Data System (ADS)

    Wang, Yong; Wu, Junjuan; Wang, Ying

    2017-08-01

    Cloud computing provides workflow applications with the required high-efficiency computation and large storage capacity, but it also brings challenges to the protection of trade secrets and other private data. Because protecting private data increases the data transmission time, this paper presents a new data allocation algorithm, based on the data collaborative damage degree, to improve the existing data allocation strategy. Security and the public cloud computing algorithm depend on the private cloud. In the initial stage, the static allocation method divides only the non-confidential data; in the operational phase, the data distribution scheme is adjusted dynamically as data continue to be generated. The experimental results show that the improved method is effective in reducing the data transmission time.

  5. Technical- and environmental-efficiency analysis of irrigated cotton-cropping systems in Punjab, Pakistan using data envelopment analysis.

    PubMed

    Ullah, Asmat; Perret, Sylvain R

    2014-08-01

    Cotton cropping in Pakistan uses substantial quantities of resources and adversely affects the environment with pollutants from the inputs, particularly pesticides. A question remains regarding to what extent the reduction of such environmental impact is possible without compromising the farmers' income. This paper investigates the environmental, technical, and economic performances of selected irrigated cotton-cropping systems in Punjab to quantify the sustainability of cotton farming and reveal options for improvement. Using mostly primary data, our study quantifies the technical, cost, and environmental efficiencies of different farm sizes. A set of indicators has been computed to reflect these three domains of efficiency using the data envelopment analysis technique. The results indicate that farmers are broadly environmentally inefficient, which primarily results from technical inefficiency. Based on an improved input mix, the average potential environmental impact reduction for small, medium, and large farms is 9, 13, and 11%, respectively, without compromising the economic return. Moreover, the differences in technical, cost, and environmental efficiencies between small and medium and small and large farm sizes were statistically significant. The second-stage regression analysis identifies that the entire farm size significantly affects the efficiencies, whereas exposure to extension and training has positive effects, and the sowing methods significantly affect the technical and environmental efficiencies. Paradoxically, the formal education level is found to affect the efficiencies negatively. This paper discusses policy interventions that can improve the technical efficiency to ultimately increase the environmental efficiency and reduce the farmers' operating costs.

  6. Time-domain wavefield reconstruction inversion

    NASA Astrophysics Data System (ADS)

    Li, Zhen-Chun; Lin, Yu-Zhao; Zhang, Kai; Li, Yuan-Yuan; Yu, Zhen-Nan

    2017-12-01

    Wavefield reconstruction inversion (WRI) is an improved full waveform inversion theory that has been proposed in recent years. The WRI method expands the search space by introducing the wave equation into the objective function and reconstructing the wavefield to update model parameters, thereby improving the computing efficiency and mitigating the influence of local minima. However, frequency-domain WRI is difficult to apply to real seismic data because of its high memory demand and the additional computational cost of the required time-frequency transformation. In this paper, wavefield reconstruction inversion theory is extended into the time domain, the augmented wave equation of WRI is derived in the time domain, and the model gradient is modified according to the numerical test with anomalies. The examples of synthetic data illustrate the accuracy of time-domain WRI and the low dependency of WRI on low-frequency information.

  7. Localized basis functions and other computational improvements in variational nonorthogonal basis function methods for quantum mechanical scattering problems involving chemical reactions

    NASA Technical Reports Server (NTRS)

    Schwenke, David W.; Truhlar, Donald G.

    1990-01-01

    The Generalized Newton Variational Principle for 3D quantum mechanical reactive scattering is briefly reviewed. Then three techniques are described which improve the efficiency of the computations. First, the fact that the Hamiltonian is Hermitian is used to reduce the number of integrals computed, and then the properties of localized basis functions are exploited in order to eliminate redundant work in the integral evaluation. A new type of localized basis function with desirable properties is suggested. It is shown how partitioned matrices can be used with localized basis functions to reduce the amount of work required to handle the complex boundary conditions. The new techniques do not introduce any approximations into the calculations, so they may be used to obtain converged solutions of the Schroedinger equation.
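
    The first saving mentioned above, using Hermiticity to reduce the number of integrals, can be sketched in a few lines of Python. This is an illustration only and not the authors' scattering code: `toy_element` is a hypothetical stand-in for an expensive integral, and `build_hermitian` is an assumed helper name. Only the entries with i <= j are evaluated; the lower triangle is filled by complex conjugation, so an n-by-n matrix needs n(n+1)/2 evaluations instead of n^2.

```python
def build_hermitian(n, element):
    """Fill an n x n Hermitian matrix from its upper triangle only.

    `element(i, j)` is called once per pair with i <= j; the mirrored
    entry H[j][i] is obtained as the complex conjugate of H[i][j].
    Returns the matrix and the number of element evaluations.
    """
    calls = 0
    H = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            v = complex(element(i, j))
            calls += 1
            H[i][j] = v
            if i != j:
                H[j][i] = v.conjugate()
    return H, calls


def toy_element(i, j):
    """Hypothetical integral stand-in, Hermitian by construction."""
    return complex(i + j, i - j)


if __name__ == "__main__":
    H, calls = build_hermitian(4, toy_element)
    print(calls, "evaluations instead of", 4 * 4)
```

    For n = 4 this performs 10 evaluations rather than 16; the relative saving approaches one half as n grows.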

  8. [Transmission efficiency analysis of near-field fiber probe using FDTD simulation].

    PubMed

    Huang, Wei; Dai, Song-Tao; Wang, Huai-Yu; Zhou, Yun-Song

    2011-10-01

    A fiber probe is the key component of near-field optical technology which is widely used in high resolution imaging, spectroscopy detection and nano processing. How to improve the transmission efficiency of the fiber probe is a very important problem in the application of near-field optical technology. Based on the results of 3D-FDTD computation, the dependence of the transmission efficiency on the cone angle, the aperture diameter, the wavelength and the thickness of metal cladding is revealed. The authors have also made a comparison between naked probe and the probe with metal cladding in terms of transmission efficiency and spatial resolution. In addition, the authors have discovered the fluctuation phenomena of transmission efficiency as the wavelength of incident laser increases.

  9. Improved treatment of exact exchange in Quantum ESPRESSO

    DOE PAGES

    Barnes, Taylor A.; Kurth, Thorsten; Carrier, Pierre; ...

    2017-01-18

    Here, we present an algorithm and implementation for the parallel computation of exact exchange in Quantum ESPRESSO (QE) that exhibits greatly improved strong scaling. QE is an open-source software package for electronic structure calculations using plane wave density functional theory, and supports the use of local, semi-local, and hybrid DFT functionals. Wider application of hybrid functionals is desirable for the improved simulation of electronic band energy alignments and thermodynamic properties, but the computational complexity of evaluating the exact exchange potential limits the practical application of hybrid functionals to large systems and requires efficient implementations. We demonstrate that existing implementations of hybrid DFT that utilize a single data structure for both the local and exact exchange regions of the code are significantly limited in the degree of parallelization achievable. We present a band-pair parallelization approach, in which the calculation of exact exchange is parallelized and evaluated independently from the parallelization of the remainder of the calculation, with the wavefunction data being efficiently transformed on-the-fly into a form that is optimal for each part of the calculation. For a 64 water molecule supercell, our new algorithm reduces the overall time to solution by nearly an order of magnitude.

  10. Efficient Simulation of Tropical Cyclone Pathways with Stochastic Perturbations

    NASA Astrophysics Data System (ADS)

    Webber, R.; Plotkin, D. A.; Abbot, D. S.; Weare, J.

    2017-12-01

    Global Climate Models (GCMs) are known to statistically underpredict intense tropical cyclones (TCs) because they fail to capture the rapid intensification and high wind speeds characteristic of the most destructive TCs. Stochastic parametrization schemes have the potential to improve the accuracy of GCMs. However, current analysis of these schemes through direct sampling is limited by the computational expense of simulating a rare weather event at fine spatial gridding. The present work introduces a stochastically perturbed parametrization tendency (SPPT) scheme to increase simulated intensity of TCs. We adapt the Weighted Ensemble algorithm to simulate the distribution of TCs at a fraction of the computational effort required in direct sampling. We illustrate the efficiency of the SPPT scheme by comparing simulations at different spatial resolutions and stochastic parameter regimes. Stochastic parametrization and rare event sampling strategies have great potential to improve TC prediction and aid understanding of tropical cyclogenesis. Since rising sea surface temperatures are postulated to increase the intensity of TCs, these strategies can also improve predictions about climate change-related weather patterns. The rare event sampling strategies used in the current work are not only a novel tool for studying TCs, but they may also be applied to sampling any range of extreme weather events.

  11. An Improved Lattice Boltzmann Model for Non-Newtonian Flows with Applications to Solid-Fluid Interactions in External Flows

    NASA Astrophysics Data System (ADS)

    Adam, Saad; Premnath, Kannan

    2016-11-01

    Fluid mechanics of non-Newtonian fluids, which arise in numerous settings, are characterized by non-linear constitutive models that pose certain unique challenges for computational methods. Here, we consider the lattice Boltzmann method (LBM), which offers some computational advantages due to its kinetic basis and its simpler stream-and-collide procedure enabling efficient simulations. However, further improvements are necessary to improve its numerical stability and accuracy for computations involving broader parameter ranges. Hence, in this study, we extend the cascaded LBM formulation by modifying its moment equilibria and relaxation parameters to handle a variety of non-Newtonian constitutive equations, including power-law and Bingham fluids, with improved stability. In addition, we include corrections to the moment equilibria to obtain an inertial frame invariant scheme without cubic-velocity defects. After performing its validation study for various benchmark flows, we study the physics of non-Newtonian flow over pairs of circular and square cylinders in a tandem arrangement, especially the wake structure interactions and their effects on resulting forces in each cylinder, and elucidate the effect of the various characteristic parameters.

  12. Scatter correction for cone-beam computed tomography using self-adaptive scatter kernel superposition

    NASA Astrophysics Data System (ADS)

    Xie, Shi-Peng; Luo, Li-Min

    2012-06-01

    The authors propose a combined scatter reduction and correction method to improve image quality in cone beam computed tomography (CBCT). The scatter kernel superposition (SKS) method has been used occasionally in previous studies. However, this method differs in that a scatter detecting blocker (SDB) was used between the X-ray source and the tested object to model the self-adaptive scatter kernel. This study first evaluates the scatter kernel parameters using the SDB, and then isolates the scatter distribution based on the SKS. The quality of image can be improved by removing the scatter distribution. The results show that the method can effectively reduce the scatter artifacts, and increase the image quality. Our approach increases the image contrast and reduces the magnitude of cupping. The accuracy of the SKS technique can be significantly improved in our method by using a self-adaptive scatter kernel. This method is computationally efficient, easy to implement, and provides scatter correction using a single scan acquisition.

  13. A visual tracking method based on improved online multiple instance learning

    NASA Astrophysics Data System (ADS)

    He, Xianhui; Wei, Yuxing

    2016-09-01

    Visual tracking is an active research topic in the field of computer vision and has been well studied in recent decades. The method based on multiple instance learning (MIL) was recently introduced into the tracking task and handles the template drift problem well. However, the MIL method has relatively poor running efficiency and accuracy, because its strong-classifier update strategy is complicated and the speed of the classifier update does not always match the change in the target's appearance. In this paper, we present a novel online effective MIL (EMIL) tracker. A new update strategy for the strong classifier is proposed to improve the running efficiency of the MIL method. In addition, to improve the tracking accuracy and stability of the MIL method, a new dynamic mechanism for renewing the classifier's learning rate and a variable search window are proposed. Experimental results show that our method performs well in complex scenes, with strong stability and high efficiency.

  14. SCI Identification (SCIDNT) program user's guide. [maximum likelihood method for linear rotorcraft models

    NASA Technical Reports Server (NTRS)

    1979-01-01

    The computer program Linear SCIDNT which evaluates rotorcraft stability and control coefficients from flight or wind tunnel test data is described. It implements the maximum likelihood method to maximize the likelihood function of the parameters based on measured input/output time histories. Linear SCIDNT may be applied to systems modeled by linear constant-coefficient differential equations. This restriction in scope allows the application of several analytical results which simplify the computation and improve its efficiency over the general nonlinear case.

  15. Computational fluid dynamics combustion analysis evaluation

    NASA Technical Reports Server (NTRS)

    Kim, Y. M.; Shang, H. M.; Chen, C. P.; Ziebarth, J. P.

    1992-01-01

    This study involves the development of numerical modelling in spray combustion. These modelling efforts are mainly motivated by the need to improve the computational efficiency of the stochastic particle tracking method as well as to incorporate the physical submodels of turbulence, combustion, vaporization, and dense spray effects. The present mathematical formulation and numerical methodologies can be cast in any time-marching pressure correction methodology (PCM), such as the FDNS code and the MAST code. A sequence of validation cases involving steady burning sprays and transient evaporating sprays will be included.

  16. Structures and Dynamics Division research and technology plans, FY 1982

    NASA Technical Reports Server (NTRS)

    Bales, K. S.

    1982-01-01

    Computational devices to improve efficiency for structural calculations are assessed. The potential of large arrays of microprocessors operating in parallel for finite element analysis is defined, and the impact of specialized computer hardware on static, dynamic, thermal analysis in the optimization of structural analysis and design calculations is determined. General aviation aircraft crashworthiness and occupant survivability is also considered. Mechanics technology required for design coefficient, fault tolerant advanced composite aircraft components subject to combined loads, impact, postbuckling effects and local discontinuities are developed.

  17. The decision tree approach to classification

    NASA Technical Reports Server (NTRS)

    Wu, C.; Landgrebe, D. A.; Swain, P. H.

    1975-01-01

    A class of multistage decision tree classifiers is proposed and studied relative to the classification of multispectral remotely sensed data. The decision tree classifiers are shown to have the potential for improving both the classification accuracy and the computation efficiency. Dimensionality in pattern recognition is discussed and two theorems on the lower bound of logic computation for multiclass classification are derived. The automatic or optimization approach is emphasized. Experimental results on real data are reported, which clearly demonstrate the usefulness of decision tree classifiers.

  18. Dimension improvement in Dhar's refutation of the Eden conjecture

    NASA Astrophysics Data System (ADS)

    Bertrand, Quentin; Pertinand, Jules

    2018-03-01

    We consider the Eden model on the d-dimensional hypercubical unoriented lattice, for large d. Initially, every lattice point is healthy, except the origin which is infected. Then, each infected lattice point contaminates any of its neighbours with rate 1. The Eden model is equivalent to first passage percolation, with exponential passage times on edges. The Eden conjecture states that the limit shape of the Eden model is a Euclidean ball. By pushing the computations of Dhar [5] a little further with modern computers and efficient implementation we obtain improved bounds for the speed of infection. This shows that the Eden conjecture does not hold in dimension superior to 22 (the lowest known dimension was 35).
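
    The equivalence between the Eden model and first passage percolation stated above can be illustrated with a small Python sketch. It is not the authors' computation: it works in dimension 2 on a finite box rather than for large d, and the function name, box radius, and seed are assumptions made for illustration. Each edge receives an independent exponential passage time with rate 1, and the infection time of a site is its shortest-path distance from the origin, found here with Dijkstra's algorithm. Restricting paths to the box can only increase the times relative to the infinite lattice.

```python
import heapq
import random


def first_passage_times(radius, seed=0):
    """Infection times on the box [-radius, radius]^2 of the square lattice.

    Edge passage times are i.i.d. Exp(1), sampled lazily and cached so
    that each undirected edge has a single weight.
    """
    rng = random.Random(seed)
    weights = {}

    def passage_time(a, b):
        key = (a, b) if a <= b else (b, a)
        if key not in weights:
            weights[key] = rng.expovariate(1.0)
        return weights[key]

    origin = (0, 0)
    times = {origin: 0.0}
    heap = [(0.0, origin)]
    while heap:
        t, p = heapq.heappop(heap)
        if t > times[p]:
            continue  # stale heap entry
        x, y = p
        for q in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if max(abs(q[0]), abs(q[1])) > radius:
                continue  # stay inside the finite box
            nt = t + passage_time(p, q)
            if nt < times.get(q, float("inf")):
                times[q] = nt
                heapq.heappush(heap, (nt, q))
    return times


if __name__ == "__main__":
    times = first_passage_times(radius=10, seed=1)
    print("time to reach (10, 0):", times[(10, 0)])
```

    The set of sites with infection time at most t is the infected cluster at time t; its shape after rescaling by t is the object the Eden conjecture concerns.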

  19. Development of energy-saving devices for a full slow-speed ship through improving propulsion performance

    NASA Astrophysics Data System (ADS)

    Kim, Jung-Hun; Choi, Jung-Eun; Choi, Bong-Jun; Chung, Seok-Ho; Seo, Heung-Won

    2015-06-01

    Energy-saving devices (ESDs) for a 317K VLCC have been developed from a propulsion standpoint. Two ESD candidates were designed via computational tools. The first device, WAFon, is composed of flow-control fins adapted to the ship wake to reduce the loss of rotational energy. The other is WAFon-D, which is a WAFon with a duct to obtain additional thrust and to make the inflow velocity distribution on the propeller plane uniform. After selecting the candidates from the computed results, the speed performances were validated with model tests. The hydrodynamic characteristics of the ESDs may be found in improved hull and propulsive efficiencies through increased wake fraction.

  20. Elimination of trait blocks from multiple trait mixed model equations with singular (Co)variance parameter matrices

    USDA-ARS's Scientific Manuscript database

    Transformations to multiple trait mixed model equations (MME) which are intended to improve computational efficiency in best linear unbiased prediction (BLUP) and restricted maximum likelihood (REML) are described. It is shown that traits that are expected or estimated to have zero residual variance...

  1. Network-Centric Data Mining for Medical Applications

    ERIC Educational Resources Information Center

    Davis, Darcy A.

    2012-01-01

    Faced with unsustainable costs and enormous amounts of under-utilized data, health care needs more efficient practices, research, and tools to harness the benefits of data. These methods create a feedback loop where computational tools guide and facilitate research, leading to improved biological knowledge and clinical standards, which will in…

  2. Economic Modeling as a Component of Academic Strategic Planning.

    ERIC Educational Resources Information Center

    MacKinnon, Joyce; Sothmann, Mark; Johnson, James

    2001-01-01

    Computer-based economic modeling was used to enable a school of allied health to define outcomes, identify associated costs, develop cost and revenue models, and create a financial planning system. As a strategic planning tool, it assisted realistic budgeting and improved efficiency and effectiveness. (Contains 18 references.) (SK)

  3. Machine-Aided Translation: From Terminology Banks to Interactive Translation Systems.

    ERIC Educational Resources Information Center

    Greenfield, Concetta C.; Serain, Daniel

    The rapid growth of the need for technical translations in recent years has led specialists to utilize computer technology to improve the efficiency and quality of translation. The two approaches considered were automatic translation and terminology banks. Since the results of fully automatic translation were considered unsatisfactory by various…

  4. Application of an improved Nelson-Nguyen analysis to eccentric, arbitrary profile liquid annular seals

    NASA Technical Reports Server (NTRS)

    Padavala, Satyasrinivas; Palazzolo, Alan B.; Vallely, Pat; Ryan, Steve

    1994-01-01

    An improved dynamic analysis for liquid annular seals with arbitrary profile based on a method, first proposed by Nelson and Nguyen, is presented. An improved first order solution that incorporates a continuous interpolation of perturbed quantities in the circumferential direction, is presented. The original method uses an approximation scheme for circumferential gradients, based on Fast Fourier Transforms (FFT). A simpler scheme based on cubic splines is found to be computationally more efficient with better convergence at higher eccentricities. A new approach of computing dynamic coefficients based on external specified load is introduced. This improved analysis is extended to account for arbitrarily varying seal profile in both axial and circumferential directions. An example case of an elliptical seal with varying degrees of axial curvature is analyzed. A case study based on actual operating clearances of an interstage seal of the Space Shuttle Main Engine High Pressure Oxygen Turbopump is presented.

  5. Enhanced electrocaloric cooling in ferroelectric single crystals by electric field reversal

    NASA Astrophysics Data System (ADS)

    Ma, Yang-Bin; Novak, Nikola; Koruza, Jurij; Yang, Tongqing; Albe, Karsten; Xu, Bai-Xiang

    2016-09-01

    An improved thermodynamic cycle is validated in ferroelectric single crystals, where the cooling effect of an electrocaloric refrigerant is enhanced by applying a reversed electric field. In contrast to conventional adiabatic heating or cooling by on-off cycles of the external electric field, applying a reversed field significantly improves the cooling efficiency, since the variation in configurational entropy is increased. By comparing results from computer simulations using Monte Carlo algorithms and experiments using direct electrocaloric measurements, we show that the electrocaloric cooling efficiency can be enhanced by more than 20% in standard ferroelectrics and also in relaxor ferroelectrics, like Pb(Mg1/3Nb2/3)0.71Ti0.29O3.

  6. Characteristics of a semi-custom library development system

    NASA Technical Reports Server (NTRS)

    Yancey, M.; Cannon, R.

    1990-01-01

    Standard cell and gate array macro libraries are in common use with workstation computer aided design (CAD) tools for application specific integrated circuit (ASIC) semi-custom application and have resulted in significant improvements in the overall design efficiencies as contrasted with custom design methodologies. Similar design methodology enhancements in providing for the efficient development of the library cells is an important factor in responding to the need for continuous technology improvement. The characteristics of a library development system that provides design flexibility and productivity enhancements for the library development engineer as he provides libraries in the state-of-the-art process technologies are presented. An overview of Gould's library development system ('Accolade') is also presented.

  7. On the value of incorporating spatial statistics in large-scale geophysical inversions: the SABRe case

    NASA Astrophysics Data System (ADS)

    Kokkinaki, A.; Sleep, B. E.; Chambers, J. E.; Cirpka, O. A.; Nowak, W.

    2010-12-01

    Electrical Resistance Tomography (ERT) is a popular method for investigating subsurface heterogeneity. The method relies on measuring electrical potential differences and obtaining, through inverse modeling, the underlying electrical conductivity field, which can be related to hydraulic conductivities. The quality of site characterization strongly depends on the utilized inversion technique. Standard ERT inversion methods, though highly computationally efficient, do not consider spatial correlation of soil properties; as a result, they often underestimate the spatial variability observed in earth materials, thereby producing unrealistic subsurface models. Also, these methods do not quantify the uncertainty of the estimated properties, thus limiting their use in subsequent investigations. Geostatistical inverse methods can be used to overcome both these limitations; however, they are computationally expensive, which has hindered their wide use in practice. In this work, we compare a standard Gauss-Newton smoothness constrained least squares inversion method against the quasi-linear geostatistical approach using the three-dimensional ERT dataset of the SABRe (Source Area Bioremediation) project. The two methods are evaluated for their ability to: a) produce physically realistic electrical conductivity fields that agree with the wide range of data available for the SABRe site while being computationally efficient, and b) provide information on the spatial statistics of other parameters of interest, such as hydraulic conductivity. To explore the trade-off between inversion quality and computational efficiency, we also employ a 2.5-D forward model with corrections for boundary conditions and source singularities. The 2.5-D model accelerates the 3-D geostatistical inversion method. New adjoint equations are developed for the 2.5-D forward model for the efficient calculation of sensitivities. 
    Our work shows that spatial statistics can be incorporated in large-scale ERT inversions to improve the inversion results without making them computationally prohibitive.

  8. An effective and secure key-management scheme for hierarchical access control in E-medicine system.

    PubMed

    Odelu, Vanga; Das, Ashok Kumar; Goswami, Adrijit

    2013-04-01

    Recently, several hierarchical access control schemes have been proposed in the literature to provide security for e-medicine systems. However, most of them are either insecure against the man-in-the-middle attack or require high storage and computational overheads. Wu and Chen proposed a key management method to solve dynamic access control problems in a user hierarchy based on a hybrid cryptosystem. Though their scheme improves computational efficiency over Nikooghadam et al.'s approach, it requires large storage space for public parameters in the public domain and is computationally inefficient due to costly elliptic curve point multiplication. Recently, Nikooghadam and Zakerolhosseini showed that Wu-Chen's scheme is vulnerable to the man-in-the-middle attack. To remedy this security weakness in Wu-Chen's scheme, they proposed a secure scheme which is again based on ECC (elliptic curve cryptography) and an efficient one-way hash function. However, their scheme incurs a huge computational cost for verifying public information in the public domain, because it uses an ECC digital signature, which is costly compared to a symmetric-key cryptosystem. In this paper, we propose an effective access control scheme in a user hierarchy which is based only on a symmetric-key cryptosystem and an efficient one-way hash function. We show that our scheme significantly reduces the storage space for both public and private domains, as well as the computational complexity, when compared to Wu-Chen's scheme, Nikooghadam-Zakerolhosseini's scheme, and other related schemes. Through informal and formal security analysis, we further show that our scheme is secure against different attacks, including the man-in-the-middle attack. Moreover, dynamic access control problems in our scheme are also solved efficiently compared to other related schemes, making our scheme well suited for practical applications of e-medicine systems.

  9. Improving and Evaluating Nested Sampling Algorithm for Marginal Likelihood Estimation

    NASA Astrophysics Data System (ADS)

    Ye, M.; Zeng, X.; Wu, J.; Wang, D.; Liu, J.

    2016-12-01

    With the growing impacts of climate change and human activities on the cycle of water resources, an increasing number of studies focus on the quantification of modeling uncertainty. Bayesian model averaging (BMA) provides a popular framework for quantifying conceptual model and parameter uncertainty. The ensemble prediction is generated by combining each plausible model's prediction, and each model is assigned a weight determined by the model's prior weight and marginal likelihood. Thus, the estimation of a model's marginal likelihood is crucial for reliable and accurate BMA prediction. The nested sampling estimator (NSE) is a newly proposed method for marginal likelihood estimation. NSE searches the parameter space gradually from low-likelihood to high-likelihood regions, and this evolution is carried out iteratively via a local sampling procedure. Thus, the efficiency of NSE is dominated by the strength of the local sampling procedure. Currently, the Metropolis-Hastings (M-H) algorithm is often used for local sampling. However, M-H is not an efficient sampling algorithm for high-dimensional or complicated parameter spaces. To improve the efficiency of NSE, it could be ideal to incorporate the robust and efficient sampling algorithm DREAMzs into the local sampling of NSE. The comparison results demonstrated that the improved NSE could improve the efficiency of marginal likelihood estimation significantly. However, both the improved and original NSEs suffer from heavy instability. In addition, the heavy computational cost of the huge number of model executions is overcome by using adaptive sparse grid surrogates.

  10. Node fingerprinting: an efficient heuristic for aligning biological networks.

    PubMed

    Radu, Alex; Charleston, Michael

    2014-10-01

    With the continuing increase in availability of biological data and improvements to biological models, biological network analysis has become a promising area of research. An emerging technique for the analysis of biological networks is through network alignment. Network alignment has been used to calculate genetic distance, similarities between regulatory structures, and the effect of external forces on gene expression, and to depict conditional activity of expression modules in cancer. Network alignment is algorithmically complex, and therefore we must rely on heuristics, ideally as efficient and accurate as possible. The majority of current techniques for network alignment rely on precomputed information, such as with protein sequence alignment, or on tunable network alignment parameters, which may introduce an increased computational overhead. Our presented algorithm, which we call Node Fingerprinting (NF), is appropriate for performing global pairwise network alignment without precomputation or tuning, can be fully parallelized, and is able to quickly compute an accurate alignment between two biological networks. It has performed as well as or better than existing algorithms on biological and simulated data, and with fewer computational resources. The algorithmic validation performed demonstrates the low computational resource requirements of NF.

  11. Enrichment of Human-Computer Interaction in Brain-Computer Interfaces via Virtual Environments

    PubMed Central

    Víctor Rodrigo, Mercado-García

    2017-01-01

    Tridimensional representations stimulate cognitive processes that are the core and foundation of human-computer interaction (HCI). Those cognitive processes take place while a user navigates and explores a virtual environment (VE) and are mainly related to spatial memory storage, attention, and perception. VEs have many distinctive features (e.g., involvement, immersion, and presence) that can significantly improve HCI in highly demanding and interactive systems such as brain-computer interfaces (BCI). BCI is a nonmuscular communication channel that attempts to reestablish the interaction between an individual and his/her environment. Although BCI research started in the sixties, this technology is not yet efficient or reliable for everyone at any time. Over the past few years, researchers have argued that the main BCI flaws could be associated with HCI issues. The evidence presented thus far shows that VEs can (1) set out working environmental conditions, (2) maximize the efficiency of BCI control panels, (3) implement navigation systems based not only on user intentions but also on user emotions, and (4) regulate user mental state to increase the differentiation between control and noncontrol modalities. PMID:29317861

  12. Rapid Calculation of Max-Min Fair Rates for Multi-Commodity Flows in Fat-Tree Networks

    DOE PAGES

    Mollah, Md Atiqul; Yuan, Xin; Pakin, Scott; ...

    2017-08-29

    Max-min fairness is often used in the performance modeling of interconnection networks. Existing methods to compute max-min fair rates for multi-commodity flows have high complexity and are computationally infeasible for large networks. In this paper, we show that by considering topological features, this problem can be solved efficiently for the fat-tree topology that is widely used in data centers and high performance compute clusters. Several efficient new algorithms are developed for this problem, including a parallel algorithm that can take advantage of multi-core and shared-memory architectures. Using these algorithms, we demonstrate that it is possible to find the max-min fair rate allocation for multi-commodity flows in fat-tree networks that support tens of thousands of nodes. We evaluate the run-time performance of the proposed algorithms and show improvement in orders of magnitude over the previously best known method. Finally, we further demonstrate a new application of max-min fair rate allocation that is only computationally feasible using our new algorithms.
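
    As an illustration of what a max-min fair allocation is, the sketch below uses the generic textbook "progressive filling" procedure: all flows are raised at the same pace, and a flow is frozen once any link on its path is full. This is not the fat-tree-specific algorithm of the record above, which the abstract does not describe; the function name, the link and flow labels, and the numerical tolerance are assumptions made for the example.

```python
def max_min_fair_rates(capacities, flow_links):
    """Generic progressive filling (illustrative, not the paper's algorithm).

    capacities: dict link -> capacity
    flow_links: dict flow -> list of links the flow traverses (non-empty)
    """
    rates = {flow: 0.0 for flow in flow_links}
    remaining = dict(capacities)
    active = set(flow_links)
    while active:
        # Count how many still-growing flows cross each link.
        load = {}
        for flow in active:
            for link in flow_links[flow]:
                load[link] = load.get(link, 0) + 1
        # Largest equal increment before some link saturates.
        inc = min(remaining[link] / n for link, n in load.items())
        for flow in active:
            rates[flow] += inc
        for link, n in load.items():
            remaining[link] -= inc * n
        # Freeze every flow that crosses a saturated link (tolerance is an assumption).
        saturated = {link for link in load if remaining[link] <= 1e-12}
        active = {f for f in active
                  if not any(link in saturated for link in flow_links[f])}
    return rates


if __name__ == "__main__":
    caps = {"A": 10.0, "B": 4.0}
    flows = {"f1": ["A"], "f2": ["A", "B"], "f3": ["B"]}
    print(max_min_fair_rates(caps, flows))
```

    In this toy case link B is the first bottleneck, so the two flows crossing it are frozen at equal rates and the remaining flow takes what is left on link A. The cost of this generic procedure grows with the number of flows and links, which is consistent with the abstract's remark that existing methods become infeasible for large networks.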

  13. Rapid Calculation of Max-Min Fair Rates for Multi-Commodity Flows in Fat-Tree Networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mollah, Md Atiqul; Yuan, Xin; Pakin, Scott

    Max-min fairness is often used in the performance modeling of interconnection networks. Existing methods to compute max-min fair rates for multi-commodity flows have high complexity and are computationally infeasible for large networks. In this paper, we show that by considering topological features, this problem can be solved efficiently for the fat-tree topology that is widely used in data centers and high performance compute clusters. Several efficient new algorithms are developed for this problem, including a parallel algorithm that can take advantage of multi-core and shared-memory architectures. Using these algorithms, we demonstrate that it is possible to find the max-min fair rate allocation for multi-commodity flows in fat-tree networks that support tens of thousands of nodes. We evaluate the run-time performance of the proposed algorithms and show improvement in orders of magnitude over the previously best known method. Finally, we further demonstrate a new application of max-min fair rate allocation that is only computationally feasible using our new algorithms.

  14. EMILiO: a fast algorithm for genome-scale strain design.

    PubMed

    Yang, Laurence; Cluett, William R; Mahadevan, Radhakrishnan

    2011-05-01

    Systems-level design of cell metabolism is becoming increasingly important for renewable production of fuels, chemicals, and drugs. Computational models are improving in the accuracy and scope of predictions, but are also growing in complexity. Consequently, efficient and scalable algorithms are increasingly important for strain design. Previous algorithms helped to consolidate the utility of computational modeling in this field. To meet intensifying demands for high-performance strains, both the number and variety of genetic manipulations involved in strain construction are increasing. Existing algorithms have experienced combinatorial increases in computational complexity when applied toward the design of such complex strains. Here, we present EMILiO, a new algorithm that increases the scope of strain design to include reactions with individually optimized fluxes. Unlike existing approaches that would experience an explosion in complexity to solve this problem, we efficiently generated numerous alternate strain designs producing succinate, l-glutamate and l-serine. This was enabled by successive linear programming, a technique new to the area of computational strain design. Copyright © 2011 Elsevier Inc. All rights reserved.

  15. A heterogeneous computing accelerated SCE-UA global optimization method using OpenMP, OpenCL, CUDA, and OpenACC.

    PubMed

    Kan, Guangyuan; He, Xiaoyan; Ding, Liuqian; Li, Jiren; Liang, Ke; Hong, Yang

    2017-10-01

    The shuffled complex evolution optimization developed at the University of Arizona (SCE-UA) has been successfully applied in various kinds of scientific and engineering optimization applications, such as hydrological model parameter calibration, for many years. The algorithm possesses good global optimality, convergence stability and robustness. However, benchmark and real-world applications reveal the poor computational efficiency of the SCE-UA. This research aims at the parallelization and acceleration of the SCE-UA method based on powerful heterogeneous computing technology. The parallel SCE-UA is implemented on an Intel Xeon multi-core CPU (using OpenMP and OpenCL) and an NVIDIA Tesla many-core GPU (using OpenCL, CUDA, and OpenACC). The serial and parallel SCE-UA were tested on the Griewank benchmark function. Comparison results indicate that the parallel SCE-UA significantly improves computational efficiency compared to the original serial version. The OpenCL implementation obtains the best overall acceleration results, although with the most complex source code. The parallel SCE-UA has bright prospects for real-world applications.

  16. Solving the hypersingular boundary integral equation in three-dimensional acoustics using a regularization relationship.

    PubMed

    Yan, Zai You; Hung, Kin Chew; Zheng, Hui

    2003-05-01

    Regularization of the hypersingular integral in the normal derivative of the conventional Helmholtz integral equation through a double surface integral method or regularization relationship has been studied. By introducing the new concept of a discretized operator matrix, evaluation of the double surface integrals is reduced to calculating the product of two discretized operator matrices. Such a treatment greatly improves the computational efficiency. As the number of frequencies to be computed increases, the computational cost of solving the composite Helmholtz integral equation is comparable to that of solving the conventional Helmholtz integral equation. In this paper, the detailed formulation of the proposed regularization method is presented. The computational efficiency and accuracy of the regularization method are demonstrated for a general class of acoustic radiation and scattering problems. The radiation of a pulsating sphere, an oscillating sphere, and a rigid sphere insonified by a plane acoustic wave are solved using the new method with curvilinear quadrilateral isoparametric elements. It is found that the numerical results rapidly converge to the corresponding analytical solutions as finer meshes are applied.

  17. Computationally efficient multibody simulations

    NASA Technical Reports Server (NTRS)

    Ramakrishnan, Jayant; Kumar, Manoj

    1994-01-01

    Computationally efficient approaches to the solution of the dynamics of multibody systems are presented in this work. The computational efficiency is derived from both the algorithmic and implementational standpoint. Order(n) approaches provide a new formulation of the equations of motion eliminating the assembly and numerical inversion of a system mass matrix as required by conventional algorithms. Computational efficiency is also gained in the implementation phase by the symbolic processing and parallel implementation of these equations. Comparison of this algorithm with existing multibody simulation programs illustrates the increased computational efficiency.

  18. Fast Image Texture Classification Using Decision Trees

    NASA Technical Reports Server (NTRS)

    Thompson, David R.

    2011-01-01

    Texture analysis would permit improved autonomous, onboard science data interpretation for adaptive navigation, sampling, and downlink decisions. These analyses would assist with terrain analysis and instrument placement in both macroscopic and microscopic image data products. Unfortunately, most state-of-the-art texture analysis demands computationally expensive convolutions of filters involving many floating-point operations. This makes them infeasible for radiation-hardened computers and spaceflight hardware. A new method approximates traditional texture classification of each image pixel with a fast decision-tree classifier. The classifier uses image features derived from simple filtering operations involving integer arithmetic. The texture analysis method is therefore amenable to implementation on FPGA (field-programmable gate array) hardware. Image features based on the "integral image" transform produce descriptive and efficient texture descriptors. Training the decision tree on a set of training data yields a classification scheme that produces reasonable approximations of optimal "texton" analysis at a fraction of the computational cost. A decision-tree learning algorithm employing the traditional k-means criterion of inter-cluster variance is used to learn tree structure from training data. The result is an efficient and accurate summary of surface morphology in images. This work is an evolutionary advance that unites several previous algorithms (k-means clustering, integral images, decision trees) and applies them to a new problem domain (morphology analysis for autonomous science during remote exploration). Advantages include order-of-magnitude improvements in runtime, feasibility for FPGA hardware, and significant improvements in texture classification accuracy.
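
    The "integral image" transform mentioned above can be sketched in a few lines. The example below is a generic illustration of the standard summed-area table, not code from the record; the function names and the half-open rectangle convention are assumptions. It shows why box-filter features need only integer additions: after one pass over the image, the sum over any rectangle takes four table lookups.

```python
def integral_image(img):
    """Summed-area table with a zero border: ii[y][x] = sum of img[:y][:x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above this row plus the running sum of this row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    table = integral_image(image)
    print(box_sum(table, 1, 1, 3, 3))  # sum of the lower-right 2x2 block
```

    Because every box sum costs the same regardless of the rectangle size, features built from differences of box sums are cheap to evaluate per pixel, which fits the abstract's emphasis on integer arithmetic and FPGA suitability.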

  19. Efficient implementation of the many-body Reactive Bond Order (REBO) potential on GPU

    NASA Astrophysics Data System (ADS)

    Trędak, Przemysław; Rudnicki, Witold R.; Majewski, Jacek A.

    2016-09-01

    The second generation Reactive Bond Order (REBO) empirical potential is commonly used to accurately model a wide range of hydrocarbon materials. It is also extensible to other atom types and interactions. The REBO potential assumes a complex multi-body interaction model that is difficult to represent efficiently in the SIMD or SIMT programming model. Hence, despite its importance, no efficient GPGPU implementation has been developed for this potential. Here we present a detailed description of a highly efficient GPGPU implementation of a molecular dynamics algorithm using the REBO potential. The presented algorithm takes advantage of rarely used properties of the SIMT architecture of a modern GPU to solve difficult synchronization issues that arise in computations of multi-body potentials. Techniques developed for this problem may also be used to achieve efficient solutions of different problems. The performance of the proposed algorithm is assessed using a range of model systems. It is compared to a highly optimized CPU implementation (both single core and OpenMP) available in the LAMMPS package. These experiments show up to 6x improvement in force computation time using a single processor of the NVIDIA Tesla K80 compared to a high-end 16-core Intel Xeon processor.

  20. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Trędak, Przemysław, E-mail: przemyslaw.tredak@fuw.edu.pl; Rudnicki, Witold R.; Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, ul. Pawińskiego 5a, 02-106 Warsaw

    The second generation Reactive Bond Order (REBO) empirical potential is commonly used to accurately model a wide range of hydrocarbon materials. It is also extensible to other atom types and interactions. The REBO potential assumes a complex multi-body interaction model that is difficult to represent efficiently in the SIMD or SIMT programming model. Hence, despite its importance, no efficient GPGPU implementation has been developed for this potential. Here we present a detailed description of a highly efficient GPGPU implementation of a molecular dynamics algorithm using the REBO potential. The presented algorithm takes advantage of rarely used properties of the SIMT architecture of a modern GPU to solve difficult synchronization issues that arise in computations of multi-body potentials. Techniques developed for this problem may also be used to achieve efficient solutions of different problems. The performance of the proposed algorithm is assessed using a range of model systems. It is compared to a highly optimized CPU implementation (both single core and OpenMP) available in the LAMMPS package. These experiments show up to 6x improvement in force computation time using a single processor of the NVIDIA Tesla K80 compared to a high-end 16-core Intel Xeon processor.

  1. CBESW: sequence alignment on the Playstation 3.

    PubMed

    Wirawan, Adrianto; Kwoh, Chee Keong; Hieu, Nim Tri; Schmidt, Bertil

    2008-09-17

    The exponential growth of available biological data has caused bioinformatics to be rapidly moving towards a data-intensive, computational science. As a result, the computational power needed by bioinformatics applications is growing exponentially as well. The recent emergence of accelerator technologies has made it possible to achieve an excellent improvement in execution time for many bioinformatics applications, compared to current general-purpose platforms. In this paper, we demonstrate how the PlayStation 3, powered by the Cell Broadband Engine, can be used as a computational platform to accelerate the Smith-Waterman algorithm. For large datasets, our implementation on the PlayStation 3 provides a significant improvement in running time compared to other implementations such as SSEARCH, Striped Smith-Waterman and CUDA. Our implementation achieves a peak performance of up to 3,646 MCUPS. The results from our experiments demonstrate that the PlayStation 3 console can be used as an efficient low cost computational platform for high performance sequence alignment applications.
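
    For readers unfamiliar with the Smith-Waterman algorithm named above, the following is a minimal serial sketch of its scoring recurrence. It is not the Cell Broadband Engine implementation of the record; the linear gap penalty and the default scores (match 2, mismatch -1, gap -1) are assumptions chosen for the example. Each inner-loop iteration is one "cell update", the unit counted by the MCUPS figure (million cell updates per second) quoted in the abstract.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of a and b (linear gap penalty, illustrative)."""
    prev = [0] * (len(b) + 1)   # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

    The table has len(a) x len(b) cells and each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent. That independence is what accelerator implementations typically exploit, though the abstract does not say which parallelization strategy CBESW uses.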

  2. CBESW: Sequence Alignment on the Playstation 3

    PubMed Central

    Wirawan, Adrianto; Kwoh, Chee Keong; Hieu, Nim Tri; Schmidt, Bertil

    2008-01-01

    Background The exponential growth of available biological data has caused bioinformatics to be rapidly moving towards a data-intensive, computational science. As a result, the computational power needed by bioinformatics applications is growing exponentially as well. The recent emergence of accelerator technologies has made it possible to achieve an excellent improvement in execution time for many bioinformatics applications, compared to current general-purpose platforms. In this paper, we demonstrate how the PlayStation® 3, powered by the Cell Broadband Engine, can be used as a computational platform to accelerate the Smith-Waterman algorithm. Results For large datasets, our implementation on the PlayStation® 3 provides a significant improvement in running time compared to other implementations such as SSEARCH, Striped Smith-Waterman and CUDA. Our implementation achieves a peak performance of up to 3,646 MCUPS. Conclusion The results from our experiments demonstrate that the PlayStation® 3 console can be used as an efficient low cost computational platform for high performance sequence alignment applications. PMID:18798993

  3. Efficient alignment-free DNA barcode analytics

    PubMed Central

    Kuksa, Pavel; Pavlovic, Vladimir

    2009-01-01

    Background In this work we consider barcode DNA analysis problems and address them using alternative, alignment-free methods and representations which model sequences as collections of short sequence fragments (features). The methods use fixed-length representations (spectrum) for barcode sequences to measure similarities or dissimilarities between sequences coming from the same or different species. The spectrum-based representation not only allows for accurate and computationally efficient species classification, but also opens possibility for accurate clustering analysis of putative species barcodes and identification of critical within-barcode loci distinguishing barcodes of different sample groups. Results New alignment-free methods provide highly accurate and fast DNA barcode-based identification and classification of species with substantial improvements in accuracy and speed over state-of-the-art barcode analysis methods. We evaluate our methods on problems of species classification and identification using barcodes, important and relevant analytical tasks in many practical applications (adverse species movement monitoring, sampling surveys for unknown or pathogenic species identification, biodiversity assessment, etc.) On several benchmark barcode datasets, including ACG, Astraptes, Hesperiidae, Fish larvae, and Birds of North America, proposed alignment-free methods considerably improve prediction accuracy compared to prior results. We also observe significant running time improvements over the state-of-the-art methods. Conclusion Our results show that newly developed alignment-free methods for DNA barcoding can efficiently and with high accuracy identify specimens by examining only few barcode features, resulting in increased scalability and interpretability of current computational approaches to barcoding. PMID:19900305
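
    The fixed-length "spectrum" representation described above can be illustrated with a short sketch: each sequence is reduced to counts of its length-k fragments, and two sequences are compared through those counts without any alignment. The choice of cosine similarity, the default k = 3, and the function names are assumptions for the example; the abstract does not specify the similarity measure or fragment length the authors use.

```python
import math
from collections import Counter


def kmer_spectrum(seq, k):
    """Count every overlapping length-k fragment of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_similarity(a, b, k=3):
    """Cosine similarity between the k-mer count vectors of a and b (assumed measure)."""
    sa, sb = kmer_spectrum(a, k), kmer_spectrum(b, k)
    dot = sum(count * sb[kmer] for kmer, count in sa.items())
    norm_a = math.sqrt(sum(c * c for c in sa.values()))
    norm_b = math.sqrt(sum(c * c for c in sb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a sequence shorter than k has an empty spectrum
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(kmer_spectrum("ACGAC", 2))
    print(spectrum_similarity("ACGTACGT", "ACGTTCGT"))
```

    Building a spectrum takes time linear in the sequence length, whereas pairwise alignment is quadratic, which is one plausible reading of the running-time improvements the abstract reports.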

  4. KARL: A Knowledge-Assisted Retrieval Language. M.S. Thesis Final Report, 1 Jul. 1985 - 31 Dec. 1987

    NASA Technical Reports Server (NTRS)

    Dominick, Wayne D. (Editor); Triantafyllopoulos, Spiros

    1985-01-01

    Data classification and storage are tasks typically performed by application specialists. In contrast, information users are primarily non-computer specialists who use information in their decision-making and other activities. Interaction efficiency between such users and the computer is often reduced by machine requirements and resulting user reluctance to use the system. This thesis examines the problems associated with information retrieval for non-computer specialist users, and proposes a method for communicating in restricted English that uses knowledge of the entities involved, relationships between entities, and basic English language syntax and semantics to translate the user requests into formal queries. The proposed method includes an intelligent dictionary, syntax and semantic verifiers, and a formal query generator. In addition, the proposed system has a learning capability that can improve portability and performance. With the increasing demand for efficient human-machine communication, the significance of this thesis becomes apparent. As human resources become more valuable, software systems that will assist in improving the human-machine interface will be needed and research addressing new solutions will be of utmost importance. This thesis presents an initial design and implementation as a foundation for further research and development into the emerging field of natural language database query systems.

  5. Communication devices in the operating room.

    PubMed

    Ruskin, Keith J

    2006-12-01

    Effective communication is essential to patient safety. Although radio pagers have been the cornerstone of medical communication, new devices such as cellular telephones, personal digital assistants (PDAs), and laptop or tablet computers can help anesthesiologists to get information quickly and reliably. Anesthesiologists can use these devices to speak with colleagues, access the medical record, or help a colleague in another location without having to leave a patient's side. Recent advances in communication technology offer anesthesiologists new ways to improve patient care. Anesthesiologists rely on a wide variety of information to make decisions, including vital signs, laboratory values, and entries in the medical record. Devices such as PDAs and computers with wireless networking can be used to access this information. Mobile telephones can be used to get help or ask for advice, and are more efficient than radio pagers. Voice over Internet protocol is a new technology that allows voice conversations to be routed over computer networks. It is widely believed that wireless devices can cause life-threatening interference with medical devices. The actual risk is very low, and is offset by a significant reduction in medical errors that results from more efficient communication. Using common technology like cellular telephones and wireless networks is a simple, cost-effective way to improve patient care.

  6. Fast dose kernel interpolation using Fourier transform with application to permanent prostate brachytherapy dosimetry.

    PubMed

    Liu, Derek; Sloboda, Ron S

    2014-05-01

    Boyer and Mok proposed a fast calculation method employing the Fourier transform (FT), for which calculation time is independent of the number of seeds but seed placement is restricted to calculation grid points. Here an interpolation method is described enabling unrestricted seed placement while preserving the computational efficiency of the original method. The Iodine-125 seed dose kernel was sampled and selected values were modified to optimize interpolation accuracy for clinically relevant doses. For each seed, the kernel was shifted to the nearest grid point via convolution with a unit impulse, implemented in the Fourier domain. The remaining fractional shift was performed using a piecewise third-order Lagrange filter. Implementation of the interpolation method greatly improved FT-based dose calculation accuracy. The dose distribution was accurate to within 2% beyond 3 mm from each seed. Isodose contours were indistinguishable from explicit TG-43 calculation. Dose-volume metric errors were negligible. Computation time for the FT interpolation method was essentially the same as Boyer's method. A FT interpolation method for permanent prostate brachytherapy TG-43 dose calculation was developed which expands upon Boyer's original method and enables unrestricted seed placement. The proposed method substantially improves the clinically relevant dose accuracy with negligible additional computation cost, preserving the efficiency of the original method.
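
    The fractional shift mentioned in this abstract can be sketched as follows. This is only an illustration of a third-order (four-tap) Lagrange interpolation filter in one dimension; the helper names `lagrange_weights` and `interpolate` are hypothetical, and the kernel optimization and Fourier-domain integer shift from the paper are not reproduced.

```python
def lagrange_weights(d, order=3):
    """Weights h[k] so that sum(h[k] * f(k)) interpolates f at position d, 0 <= k <= order."""
    weights = []
    for k in range(order + 1):
        w = 1.0
        for m in range(order + 1):
            if m != k:
                w *= (d - m) / (k - m)
        weights.append(w)
    return weights


def interpolate(samples, x):
    """Evaluate uniformly spaced samples at fractional index x using four neighbours."""
    n = len(samples)  # requires n >= 4 and 0 <= x <= n - 1
    base = min(max(int(x) - 1, 0), n - 4)  # first of the four taps
    w = lagrange_weights(x - base)
    return sum(wk * samples[base + k] for k, wk in enumerate(w))
```

    A third-order filter reproduces cubic polynomials exactly, so sampling `i**3` and interpolating at 2.5 returns 2.5 cubed up to rounding.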

  7. Decomposed multidimensional control grid interpolation for common consumer electronic image processing applications

    NASA Astrophysics Data System (ADS)

    Zwart, Christine M.; Venkatesan, Ragav; Frakes, David H.

    2012-10-01

    Interpolation is an essential and broadly employed function of signal processing. Accordingly, considerable development has focused on advancing interpolation algorithms toward optimal accuracy. Such development has motivated a clear shift in the state-of-the-art from classical interpolation to more intelligent and resourceful approaches, registration-based interpolation for example. As a natural result, many of the most accurate current algorithms are highly complex, specific, and computationally demanding. However, the diverse hardware destinations for interpolation algorithms present unique constraints that often preclude use of the most accurate available options. For example, while computationally demanding interpolators may be suitable for highly equipped image processing platforms (e.g., computer workstations and clusters), only more efficient interpolators may be practical for less well equipped platforms (e.g., smartphones and tablet computers). The latter examples of consumer electronics present a design tradeoff in this regard: high accuracy interpolation benefits the consumer experience but computing capabilities are limited. It follows that interpolators with favorable combinations of accuracy and efficiency are of great practical value to the consumer electronics industry. We address multidimensional interpolation-based image processing problems that are common to consumer electronic devices through a decomposition approach. The multidimensional problems are first broken down into multiple, independent, one-dimensional (1-D) interpolation steps that are then executed with a newly modified registration-based one-dimensional control grid interpolator. The proposed approach, decomposed multidimensional control grid interpolation (DMCGI), combines the accuracy of registration-based interpolation with the simplicity, flexibility, and computational efficiency of a 1-D interpolation framework. Results demonstrate that DMCGI provides improved interpolation accuracy (and other benefits) in image resizing, color sample demosaicing, and video deinterlacing applications, at a computational cost that is manageable or reduced in comparison to popular alternatives.
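
    The decomposition idea in this abstract, breaking a multidimensional problem into independent 1-D interpolation passes, can be sketched with image resizing. This sketch substitutes plain linear interpolation for the registration-based control grid interpolator, which the abstract does not specify in detail; `resize_1d` and `resize_2d` are hypothetical names.

```python
def resize_1d(row, new_len):
    """Resample one row to new_len samples with linear interpolation (stand-in 1-D step)."""
    n = len(row)
    if new_len == 1 or n == 1:
        return [float(row[0])] * new_len
    out = []
    for i in range(new_len):
        pos = i * (n - 1) / (new_len - 1)  # position in source coordinates
        lo = min(int(pos), n - 2)
        t = pos - lo
        out.append((1 - t) * row[lo] + t * row[lo + 1])
    return out


def resize_2d(image, new_h, new_w):
    """Resize a 2-D image as two independent sets of 1-D passes: rows, then columns."""
    rows = [resize_1d(r, new_w) for r in image]
    cols = [resize_1d(list(c), new_h) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```

    Each row pass and each column pass is independent of the others, so any 1-D interpolator could be swapped into `resize_1d` without changing the surrounding structure.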

  8. Synthetic biology as it relates to CAM photosynthesis: challenges and opportunities.

    PubMed

    DePaoli, Henrique C; Borland, Anne M; Tuskan, Gerald A; Cushman, John C; Yang, Xiaohan

    2014-07-01

    To meet future food and energy security needs, which are amplified by increasing population growth and reduced natural resource availability, metabolic engineering efforts have moved from manipulating single genes/proteins to introducing multiple genes and novel pathways to improve photosynthetic efficiency in a more comprehensive manner. Biochemical carbon-concentrating mechanisms such as crassulacean acid metabolism (CAM), which improves photosynthetic, water-use, and possibly nutrient-use efficiency, represent a strategic target for synthetic biology to engineer more productive C3 crops for a warmer and drier world. One key challenge for introducing multigene traits like CAM onto a background of C3 photosynthesis is to gain a better understanding of the dynamic spatial and temporal regulatory events that underpin photosynthetic metabolism. With the aid of systems and computational biology, vast amounts of experimental data encompassing transcriptomics, proteomics, and metabolomics can be related in a network to create dynamic models. Such models can undergo simulations to discover key regulatory elements in metabolism and suggest strategic substitution or augmentation by synthetic components to improve photosynthetic performance and water-use efficiency in C3 crops. Another key challenge in the application of synthetic biology to photosynthesis research is to develop efficient systems for multigene assembly and stacking. Here, we review recent progress in computational modelling as applied to plant photosynthesis, with attention to the requirements for CAM, and recent advances in synthetic biology tool development. Lastly, we discuss possible options for multigene pathway construction in plants with an emphasis on CAM-into-C3 engineering. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  9. Stream Kriging: Incremental and recursive ordinary Kriging over spatiotemporal data streams

    NASA Astrophysics Data System (ADS)

    Zhong, Xu; Kealy, Allison; Duckham, Matt

    2016-05-01

    Ordinary Kriging is widely used for geospatial interpolation and estimation. Due to the O(n^3) time complexity of solving the system of linear equations, ordinary Kriging for a large set of source points is computationally intensive. Conducting real-time Kriging interpolation over continuously varying spatiotemporal data streams can therefore be especially challenging. This paper develops and tests two new strategies for improving the performance of an ordinary Kriging interpolator adapted to a stream-processing environment. These strategies rely on the expectation that, over time, source data points will frequently refer to the same spatial locations (for example, where static sensor nodes are generating repeated observations of a dynamic field). First, an incremental strategy improves efficiency in cases where a relatively small proportion of previously processed spatial locations are absent from the source points at any given iteration. Second, a recursive strategy improves efficiency in cases where there is substantial set overlap between the sets of spatial locations of source points at the current and previous iterations. These two strategies are evaluated in terms of their computational efficiency in comparison to the ordinary Kriging algorithm. The results show that these two strategies can reduce the time taken to perform the interpolation by up to 90%, and approach an average-case time complexity of O(n^2) when most but not all source points refer to the same locations over time. By combining the approaches developed in this paper with existing heuristic ordinary Kriging algorithms, the conclusions indicate how further efficiency gains could potentially be accrued. The work ultimately contributes to the development of online ordinary Kriging interpolation algorithms, capable of real-time spatial interpolation with large streaming data sets.
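
    To show where the cubic cost in this abstract comes from, the following minimal sketch solves the baseline ordinary Kriging system from scratch. It assumes a linear variogram and uses plain Gaussian elimination; it does not implement the incremental or recursive strategies of the paper, and the names `solve`, `variogram` and `krige` are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (the O(n^3) step)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def variogram(h):
    """Linear variogram; an assumption made only for this sketch."""
    return h


def krige(points, values, target):
    """Return (estimate, weights) from the ordinary Kriging system with a Lagrange multiplier."""
    n = len(points)
    a = [[variogram(math.dist(p, q)) for q in points] + [1.0] for p in points]
    a.append([1.0] * n + [0.0])  # unbiasedness constraint: weights sum to 1
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * v for w, v in zip(weights, values)), weights
```

    The left-hand matrix depends only on the source locations, not on the observed values. When the same locations recur across iterations, as the abstract expects, that matrix is a natural candidate for reuse.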

  10. DOVIS 2.0: an efficient and easy to use parallel virtual screening tool based on AutoDock 4.0.

    PubMed

    Jiang, Xiaohui; Kumar, Kamal; Hu, Xin; Wallqvist, Anders; Reifman, Jaques

    2008-09-08

    Small-molecule docking is an important tool in studying receptor-ligand interactions and in identifying potential drug candidates. Previously, we developed a software tool (DOVIS) to perform large-scale virtual screening of small molecules in parallel on Linux clusters, using AutoDock 3.05 as the docking engine. DOVIS enables the seamless screening of millions of compounds on high-performance computing platforms. In this paper, we report significant advances in the software implementation of DOVIS 2.0, including enhanced screening capability, improved file system efficiency, and extended usability. To keep DOVIS up-to-date, we upgraded the software's docking engine to the more accurate AutoDock 4.0 code. We developed a new parallelization scheme to improve runtime efficiency and modified the AutoDock code to reduce excessive file operations during large-scale virtual screening jobs. We also implemented an algorithm to output docked ligands in an industry standard format, sd-file format, which can be easily interfaced with other modeling programs. Finally, we constructed a wrapper-script interface to enable automatic rescoring of docked ligands by arbitrarily selected third-party scoring programs. The significance of the new DOVIS 2.0 software compared with the previous version lies in its improved performance and usability. The new version makes the computation highly efficient by automating load balancing, significantly reducing excessive file operations by more than 95%, providing outputs that conform to industry standard sd-file format, and providing a general wrapper-script interface for rescoring of docked ligands. The new DOVIS 2.0 package is freely available to the public under the GNU General Public License.

  11. Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy

    NASA Astrophysics Data System (ADS)

    Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli

    2014-03-01

    One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turnaround time that is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as the multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of layered parallel software libraries that allow image processing applications to share the common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on Amdahl's law for symmetric multicore chips showed the potential for high performance scalability of the HPC 3D-MIP platform when a larger number of cores is available.
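
    The scalability argument in this abstract rests on Amdahl's law. A minimal sketch of the classic form is given below; the abstract refers to a variant for symmetric multicore chips, which is not reproduced here, and `amdahl_speedup` is a hypothetical helper name.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Classic Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)
```

    Under this classic form, a 12-fold speedup on 12 cores corresponds to a parallel fraction close to 1. With a parallel fraction of 0.5, two cores give a speedup of only about 1.33.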

  12. Large-scale inverse model analyses employing fast randomized data reduction

    NASA Astrophysics Data System (ADS)

    Lin, Youzuo; Le, Ellen B.; O'Malley, Daniel; Vesselinov, Velimir V.; Bui-Thanh, Tan

    2017-08-01

    When the number of observations is large, it is computationally challenging to apply classical inverse modeling techniques. We have developed a new computationally efficient technique for solving inverse problems with a large number of observations (e.g., on the order of 10^7 or greater). Our method, which we call the randomized geostatistical approach (RGA), is built upon the principal component geostatistical approach (PCGA). We employ a data reduction technique combined with the PCGA to improve the computational efficiency and reduce the memory usage. Specifically, we employ a randomized numerical linear algebra technique based on a so-called "sketching" matrix to effectively reduce the dimension of the observations without losing the information content needed for the inverse analysis. In this way, the computational and memory costs for RGA scale with the information content rather than the size of the calibration data. Our algorithm is coded in Julia and implemented in the MADS open-source high-performance computational framework (http://mads.lanl.gov). We apply our new inverse modeling method to invert for a synthetic transmissivity field. Compared to a standard geostatistical approach (GA), our method is more efficient when the number of observations is large. Most importantly, our method is capable of solving larger inverse problems than the standard GA and PCGA approaches. Therefore, our new model inversion method is a powerful tool for solving large-scale inverse problems. The method can be applied in any field and is not limited to hydrogeological applications such as the characterization of aquifer heterogeneity.
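
    The data-reduction step in this abstract can be pictured with a minimal sketch of a sketching matrix. It assumes a dense Gaussian sketch scaled by 1/sqrt(k); the abstract does not say which sketch RGA uses, and the authors' implementation is in Julia rather than Python. The names `gaussian_sketch` and `apply_sketch` are hypothetical.

```python
import math
import random


def gaussian_sketch(k, m, seed=0):
    """Build a k-by-m random matrix with N(0, 1/k) entries (assumed sketch type)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(m)] for _ in range(k)]


def apply_sketch(sketch, vector):
    """Compress an m-vector of observations to k numbers by a matrix-vector product."""
    return [sum(s * v for s, v in zip(row, vector)) for row in sketch]
```

    The same matrix would be applied to both the observations and the forward-model outputs, so later computations work with k values instead of m.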

  13. A computationally efficient ductile damage model accounting for nucleation and micro-inertia at high triaxialities

    DOE PAGES

    Versino, Daniele; Bronkhorst, Curt Allan

    2018-01-31

    The computational formulation of a micro-mechanical material model for the dynamic failure of ductile metals is presented in this paper. The statistical nature of porosity initiation is accounted for by introducing an arbitrary probability density function which describes the pores nucleation pressures. Each micropore within the representative volume element is modeled as a thick spherical shell made of plastically incompressible material. The treatment of porosity by a distribution of thick-walled spheres also allows for the inclusion of micro-inertia effects under conditions of shock and dynamic loading. The second order ordinary differential equation governing the microscopic porosity evolution is solved with a robust implicit procedure. A new Chebyshev collocation method is employed to approximate the porosity distribution and remapping is used to optimize memory usage. The adaptive approximation of the porosity distribution leads to a reduction of computational time and memory usage of up to two orders of magnitude. Moreover, the proposed model affords consistent performance: changing the nucleation pressure probability density function and/or the applied strain rate does not reduce accuracy or computational efficiency of the material model. The numerical performance of the model and algorithms presented is tested against three problems for high density tantalum: single void, one-dimensional uniaxial strain, and two-dimensional plate impact. Here, the results using the integration and algorithmic advances suggest a significant improvement in computational efficiency and accuracy over previous treatments for dynamic loading conditions.

  14. GPU-based Parallel Application Design for Emerging Mobile Devices

    NASA Astrophysics Data System (ADS)

    Gupta, Kshitij

    A revolution is underway in the computing world that is causing a fundamental paradigm shift in device capabilities and form-factor, with a move from well-established legacy desktop/laptop computers to mobile devices in varying sizes and shapes. Amongst all the tasks these devices must support, graphics has emerged as the 'killer app' for providing a fluid user interface and high-fidelity game rendering, effectively making the graphics processor (GPU) one of the key components in (present and future) mobile systems. By utilizing the GPU as a general-purpose parallel processor, this dissertation explores the GPU computing design space from an applications standpoint, in the mobile context, by focusing on key challenges presented by these devices---limited compute, memory bandwidth, and stringent power consumption requirements---while improving the overall application efficiency of the increasingly important speech recognition workload for mobile user interaction. We broadly partition trends in GPU computing into four major categories. We analyze hardware and programming model limitations in current-generation GPUs and detail an alternate programming style called Persistent Threads, identify four use case patterns, and propose minimal modifications that would be required for extending native support. We show how by manually extracting data locality and altering the speech recognition pipeline, we are able to achieve significant savings in memory bandwidth while simultaneously reducing the compute burden on GPU-like parallel processors. As we foresee GPU computing to evolve from its current 'co-processor' model into an independent 'applications processor' that is capable of executing complex work independently, we create an alternate application framework that enables the GPU to handle all control-flow dependencies autonomously at run-time while minimizing host involvement to just issuing commands, that facilitates an efficient application implementation. Finally, as compute and communication capabilities of mobile devices improve, we analyze energy implications of processing speech recognition locally (on-chip) and offloading it to servers (in-cloud).

  15. Uncertainty quantification of seabed parameters for large data volumes along survey tracks with a tempered particle filter

    NASA Astrophysics Data System (ADS)

    Dettmer, J.; Quijano, J. E.; Dosso, S. E.; Holland, C. W.; Mandolesi, E.

    2016-12-01

    Geophysical seabed properties are important for the detection and classification of unexploded ordnance. However, current surveying methods such as vertical seismic profiling, coring, or inversion are of limited use when surveying large areas with high spatial sampling density. We consider surveys based on a source and receiver array towed by an autonomous vehicle which produce large volumes of seabed reflectivity data that contain unprecedented and detailed seabed information. The data are analyzed with a particle filter, which requires efficient reflection-coefficient computation, efficient inversion algorithms and efficient use of computer resources. The filter quantifies information content of multiple sequential data sets by considering results from previous data along the survey track to inform the importance sampling at the current point. Challenges arise from environmental changes along the track where the number of sediment layers and their properties change. This is addressed by a trans-dimensional model in the filter which allows layering complexity to change along a track. Efficiency is improved by likelihood tempering of various particle subsets and including exchange moves (parallel tempering). The filter is implemented on a hybrid computer that combines central processing units (CPUs) and graphics processing units (GPUs) to exploit three levels of parallelism: (1) fine-grained parallel computation of spherical reflection coefficients with a GPU implementation of Levin integration; (2) updating particles by concurrent CPU processes which exchange information using automatic load balancing (coarse grained parallelism); (3) overlapping CPU-GPU communication (a major bottleneck) with GPU computation by staggering CPU access to the multiple GPUs. The algorithm is applied to spherical reflection coefficients for data sets along a 14-km track on the Malta Plateau, Mediterranean Sea. We demonstrate substantial efficiency gains over previous methods. [This research was supported in part by the U.S. Dept of Defense, through the Strategic Environmental Research and Development Program (SERDP).]

  16. An Efficient Neural-Network-Based Microseismic Monitoring Platform for Hydraulic Fracture on an Edge Computing Architecture.

    PubMed

    Zhang, Xiaopu; Lin, Jun; Chen, Zubin; Sun, Feng; Zhu, Xi; Fang, Gengfa

    2018-06-05

    Microseismic monitoring is one of the most critical technologies for hydraulic fracturing in oil and gas production. To detect events in an accurate and efficient way, there are two major challenges. One challenge is how to achieve high accuracy despite a poor signal-to-noise ratio (SNR). The other one is concerned with real-time data transmission. Taking these challenges into consideration, an edge-computing-based platform, namely Edge-to-Center LearnReduce, is presented in this work. The platform consists of a data center with many edge components. At the data center, a neural network model that combines a convolutional neural network (CNN) and long short-term memory (LSTM) is designed and this model is trained by using previously obtained data. Once the model is fully trained, it is sent to edge components for event detection and data reduction. At each edge component, a probabilistic inference is added to the neural network model to improve its accuracy. Finally, the reduced data is delivered to the data center. Based on experimental results, a high detection accuracy (over 96%) with less transmitted data (about 90%) was achieved by using the proposed approach on a microseismic monitoring system. These results show that the platform can simultaneously improve the accuracy and efficiency of microseismic monitoring.

  17. Computer architecture for efficient algorithmic executions in real-time systems: New technology for avionics systems and advanced space vehicles

    NASA Technical Reports Server (NTRS)

    Carroll, Chester C.; Youngblood, John N.; Saha, Aindam

    1987-01-01

    Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.

  18. A Parallel Nonrigid Registration Algorithm Based on B-Spline for Medical Images.

    PubMed

    Du, Xiaogang; Dang, Jianwu; Wang, Yangping; Wang, Song; Lei, Tao

    2016-01-01

    The nonrigid registration algorithm based on B-spline Free-Form Deformation (FFD) plays a key role and is widely applied in medical image processing due to the good flexibility and robustness. However, it requires a tremendous amount of computing time to obtain more accurate registration results especially for a large amount of medical image data. To address the issue, a parallel nonrigid registration algorithm based on B-spline is proposed in this paper. First, the Logarithm Squared Difference (LSD) is considered as the similarity metric in the B-spline registration algorithm to improve registration precision. After that, we create a parallel computing strategy and lookup tables (LUTs) to reduce the complexity of the B-spline registration algorithm. As a result, the computing time of three time-consuming steps including B-splines interpolation, LSD computation, and the analytic gradient computation of LSD, is efficiently reduced, for the B-spline registration algorithm employs the Nonlinear Conjugate Gradient (NCG) optimization method. Experimental results of registration quality and execution efficiency on the large amount of medical images show that our algorithm achieves a better registration accuracy in terms of the differences between the best deformation fields and ground truth and a speedup of 17 times over the single-threaded CPU implementation due to the powerful parallel computing ability of Graphics Processing Unit (GPU).
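
    The lookup-table idea in this abstract can be sketched for the cubic B-spline basis used in Free-Form Deformation. The sketch assumes an integer control-point spacing in voxels, so the fractional offset inside a cell takes only that many distinct values; the abstract does not describe the actual table layout, and `bspline_basis` and `build_lut` are hypothetical names.

```python
def bspline_basis(u):
    """The four uniform cubic B-spline basis values at fractional offset u in [0, 1)."""
    return [
        (1 - u) ** 3 / 6.0,
        (3 * u ** 3 - 6 * u ** 2 + 4) / 6.0,
        (-3 * u ** 3 + 3 * u ** 2 + 3 * u + 1) / 6.0,
        u ** 3 / 6.0,
    ]


def build_lut(spacing):
    """Precompute basis values for every voxel offset inside one control-point cell."""
    return [bspline_basis(i / spacing) for i in range(spacing)]
```

    With such a table, evaluating the deformation at a voxel needs only an index lookup (voxel coordinate modulo the spacing) instead of re-evaluating the cubic polynomials.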

  19. Computer architecture for efficient algorithmic executions in real-time systems: new technology for avionics systems and advanced space vehicles

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Carroll, C.C.; Youngblood, J.N.; Saha, A.

    1987-12-01

    Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.

  20. Intricacies of modern supercomputing illustrated with recent advances in simulations of strongly correlated electron systems

    NASA Astrophysics Data System (ADS)

    Schulthess, Thomas C.

    2013-03-01

    The continued thousand-fold improvement in sustained application performance per decade on modern supercomputers keeps opening new opportunities for scientific simulations. But supercomputers have become very complex machines, built with thousands or tens of thousands of complex nodes consisting of multiple CPU cores or, most recently, a combination of CPU and GPU processors. Efficient simulations on such high-end computing systems require tailored algorithms that optimally map numerical methods to particular architectures. These intricacies will be illustrated with simulations of strongly correlated electron systems, where the development of quantum cluster methods, Monte Carlo techniques, as well as their optimal implementation by means of algorithms with improved data locality and high arithmetic density have gone hand in hand with evolving computer architectures. The present work would not have been possible without continued access to computing resources at the National Center for Computational Science of Oak Ridge National Laboratory, which is funded by the Facilities Division of the Office of Advanced Scientific Computing Research, and the Swiss National Supercomputing Center (CSCS) that is funded by ETH Zurich.
