Efficient and accurate computation of generalized singular-value decompositions
NASA Astrophysics Data System (ADS)
Drmac, Zlatko
2001-11-01
We present a new family of algorithms for accurate floating-point computation of the singular value decomposition (SVD) of various forms of products (quotients) of two or three matrices. The main goal of such an algorithm is to compute all singular values to high relative accuracy; that is, we seek a guaranteed number of accurate digits even in the smallest singular values. We also want to achieve computational efficiency while maintaining high accuracy. To illustrate, consider the SVD of the product A = B^T S C. The new algorithm uses certain preconditioning (based on diagonal scalings and the LU and QR factorizations) to replace A with A' = (B')^T S' C', where A and A' have the same singular values and the matrix A' is computed explicitly. Theoretical analysis and numerical evidence show that, in the case of full-rank B, C, S, the accuracy of the new algorithm is unaffected by replacing B, S, C with, respectively, D_1 B, D_2 S D_3, D_4 C, where D_i, i = 1, ..., 4, are arbitrary diagonal matrices. As an application, the paper proposes new accurate algorithms for computing the (H,K)-SVD and (H_1,K)-SVD of S.
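The relative-accuracy goal above can be made concrete with a small experiment. Below is a minimal sketch (not Drmac's algorithm): a textbook one-sided Jacobi SVD, a classic route to high relative accuracy, compared against a standard SVD on a badly column-scaled matrix A = Q diag(d) whose singular values are the |d_i| by construction.

```python
# One-sided Jacobi SVD vs. the default SVD on a graded matrix A = Q @ diag(d);
# the exact singular values are |d_i| because A^T A = diag(d)^2.
import numpy as np

def jacobi_singular_values(A, tol=1e-15, max_sweeps=30):
    """One-sided Jacobi: orthogonalize column pairs; column norms -> sigma."""
    A = A.astype(float).copy()
    n = A.shape[1]
    for _ in range(max_sweeps):
        converged = True
        for p in range(n - 1):
            for q in range(p + 1, n):
                apq = A[:, p] @ A[:, q]
                app, aqq = A[:, p] @ A[:, p], A[:, q] @ A[:, q]
                if abs(apq) <= tol * np.sqrt(app * aqq):
                    continue
                converged = False
                tau = (aqq - app) / (2.0 * apq)
                t = (1.0 if tau >= 0 else -1.0) / (abs(tau) + np.hypot(1.0, tau))
                c = 1.0 / np.hypot(1.0, t)
                s = t * c
                A[:, [p, q]] = A[:, [p, q]] @ np.array([[c, s], [-s, c]])
        if converged:
            break
    return np.sort(np.linalg.norm(A, axis=0))[::-1]

rng = np.random.default_rng(0)
n = 6
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
d = np.logspace(0, -12, n)          # singular values spanning 12 decades
A = Q * d                           # A = Q @ diag(d); exact sigma_i = d[i]
rel = lambda s: np.abs(s - d) / d
print("Jacobi  max rel. err:", rel(jacobi_singular_values(A)).max())
print("Default max rel. err:", rel(np.linalg.svd(A, compute_uv=False)).max())
```

The Jacobi errors stay near machine precision for all singular values, while a conventional bidiagonalization-based SVD is only absolutely accurate, so its smallest values can lose most of their relative digits.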
NASA Technical Reports Server (NTRS)
Liu, Yen; Vinokur, Marcel
1989-01-01
This paper treats the accurate and efficient calculation of thermodynamic properties of arbitrary gas mixtures for equilibrium flow computations. New improvements in the Stupochenko-Jaffe model for the calculation of thermodynamic properties of diatomic molecules are presented. A unified formulation of equilibrium calculations for gas mixtures in terms of irreversible entropy is given. Using a highly accurate thermo-chemical data base, a new, efficient and vectorizable search algorithm is used to construct piecewise interpolation procedures which generate the accurate thermodynamic variables and their derivatives required by modern computational algorithms. Results are presented for equilibrium air and compared with those given by the Srinivasan program.
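As an illustration of the table-lookup idea (not the paper's actual curve fits), the sketch below builds a piecewise-cubic Hermite interpolant of a stand-in enthalpy function and evaluates both the variable and its analytic derivative with a vectorized search (np.searchsorted playing the role of the vectorizable search algorithm).

```python
# Vectorized piecewise-cubic lookup of h(T) and cp = dh/dT from tabulated data.
import numpy as np

T_nodes = np.linspace(300.0, 6000.0, 40)            # breakpoints [K]
h_ref = lambda T: 1000.0 * T + 2.0e5 * np.log(T)    # stand-in for h(T)

h = h_ref(T_nodes)
dh = np.gradient(h, T_nodes)                        # nodal slopes

def eval_h_and_cp(T):
    i = np.clip(np.searchsorted(T_nodes, T) - 1, 0, len(T_nodes) - 2)
    dt = T_nodes[i + 1] - T_nodes[i]
    s = (T - T_nodes[i]) / dt                       # local coordinate in [0,1]
    h00, h10 = 2*s**3 - 3*s**2 + 1, s**3 - 2*s**2 + s   # Hermite basis
    h01, h11 = -2*s**3 + 3*s**2, s**3 - s**2
    val = h00*h[i] + h10*dt*dh[i] + h01*h[i+1] + h11*dt*dh[i+1]
    # derivative of the Hermite basis gives cp exactly consistent with h
    d00, d10 = (6*s**2 - 6*s) / dt, 3*s**2 - 4*s + 1
    d01, d11 = (-6*s**2 + 6*s) / dt, 3*s**2 - 2*s
    cp = d00*h[i] + d10*dh[i] + d01*h[i+1] + d11*dh[i+1]
    return val, cp

T = np.array([350.0, 1234.5, 5500.0])
val, cp = eval_h_and_cp(T)
print(cp, 1000.0 + 2.0e5 / T)    # interpolated cp vs. exact derivative
```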
Automated Development of Accurate Algorithms and Efficient Codes for Computational Aeroacoustics
NASA Technical Reports Server (NTRS)
Goodrich, John W.; Dyson, Rodger W.
1999-01-01
The simulation of sound generation and propagation in three space dimensions with realistic aircraft components is a very large time-dependent computation with fine details. Simulations in open domains with embedded objects require accurate and robust algorithms for propagation, for artificial inflow and outflow boundaries, and for the definition of geometrically complex objects. The development, implementation, and validation of methods for solving these demanding problems is being done to support the NASA pillar goals for reducing aircraft noise levels. Our goal is to provide algorithms which are sufficiently accurate and efficient to produce usable results rapidly enough to allow design engineers to study the effects on sound levels of design changes in propulsion systems, and in the integration of propulsion systems with airframes. There is a lack of design tools for these purposes at this time. Our technical approach to this problem combines the development of new algorithms with the use of Mathematica and Unix utilities to automate the algorithm development, code implementation, and validation. We use explicit methods to ensure effective implementation by domain decomposition for SPMD parallel computing. There are several orders of magnitude difference in the computational efficiencies of the algorithms which we have considered. We currently have new artificial inflow and outflow boundary conditions that are stable, accurate, and unobtrusive, with implementations that match the accuracy and efficiency of the propagation methods. The artificial numerical boundary treatments have been proven to have solutions which converge to the full open-domain problems, so that the error from the boundary treatments can be driven as low as is required. The purpose of this paper is to briefly present a method for developing highly accurate algorithms for computational aeroacoustics, the use of computer automation in this process, and a brief survey of the algorithms that…
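The kind of algebra such automation handles can be shown in miniature: the sketch below derives finite-difference stencil coefficients of arbitrary formal order from the Taylor-series moment conditions (the paper uses Mathematica for this; numpy suffices for a demonstration).

```python
# Derive finite-difference weights by solving the Taylor moment conditions
# sum_j w[j] * offsets[j]**k = k! * delta_{k,deriv},  k = 0..n-1.
import numpy as np
from math import factorial

def fd_coefficients(offsets, deriv):
    """Weights w so sum_j w[j] f(x + offsets[j] h) ~ h**deriv f^(deriv)(x)."""
    offsets = np.asarray(offsets, dtype=float)
    n = len(offsets)
    V = np.vander(offsets, n, increasing=True).T   # V[k, j] = offsets[j]**k
    rhs = np.zeros(n)
    rhs[deriv] = factorial(deriv)
    return np.linalg.solve(V, rhs)

# 6th-order-accurate first derivative on a 7-point central stencil:
w = fd_coefficients(range(-3, 4), deriv=1)
print(w)   # ~ [-1/60, 3/20, -3/4, 0, 3/4, -3/20, 1/60]
```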
An Accurate and Computationally Efficient Model for Membrane-Type Circular-Symmetric Micro-Hotplates
Khan, Usman; Falconi, Christian
2014-01-01
Ideally, the design of high-performance micro-hotplates would require a large number of simulations because of the existence of many important design parameters as well as the possibly crucial effects of both spread and drift. However, the computational cost of FEM simulations, which are the only available tool for accurately predicting the temperature in micro-hotplates, is very high. As a result, micro-hotplate designers generally have no effective simulation tools for optimization. In order to circumvent these issues, we propose here a model for practical circular-symmetric micro-hotplates which takes advantage of modified Bessel functions, a computationally efficient matrix approach for handling the relevant boundary conditions, Taylor linearization for modeling the Joule heating and radiation losses, and an external-region-segmentation strategy for accurately taking into account radiation losses over the entire micro-hotplate. The proposed model is almost as accurate as FEM simulations and two to three orders of magnitude more computationally efficient (e.g., 45 s versus more than 8 h). The residual errors, which are mainly associated with the undesired heating in the electrical contacts, are small (e.g., a few degrees Celsius for an 800 °C operating temperature) and, for important analyses, almost constant. Therefore, we also introduce a computationally easy single-FEM-compensation strategy to reduce the residual errors to about 1 °C. As illustrative examples of the power of our approach, we report the systematic investigation of a spread in the membrane thermal conductivity and of combined variations of both ambient and bulk temperatures. Our model enables a much faster characterization of micro-hotplates and, thus, a much more effective optimization prior to fabrication. PMID:24763214
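The modified-Bessel building block of such a model can be sketched as follows (illustrative parameters, not the paper's full model): in an unheated annular membrane region with linearized surface losses, the excess temperature obeys a modified Bessel equation, and the boundary conditions reduce to a small linear system, the "matrix approach" in miniature.

```python
# In an annulus, theta = T - T_amb obeys theta'' + theta'/r - m^2 theta = 0,
# with general solution A*I0(m r) + B*K0(m r); two BCs fix A and B.
import numpy as np
from scipy.special import i0, k0

k, t = 30.0, 1.0e-6              # conductivity [W/m/K], thickness [m] (assumed)
h = 100.0                        # linearized loss coefficient [W/m^2/K] (assumed)
m = np.sqrt(2.0 * h / (k * t))   # losses from both faces (assumption)
r1, r2 = 100e-6, 500e-6          # heater edge and membrane rim radii [m]
T_amb, T1, T2 = 25.0, 800.0, 25.0

M = np.array([[i0(m * r1), k0(m * r1)],     # theta(r1) = T1 - T_amb
              [i0(m * r2), k0(m * r2)]])    # theta(r2) = T2 - T_amb
A, B = np.linalg.solve(M, [T1 - T_amb, T2 - T_amb])

r = np.linspace(r1, r2, 5)
theta = A * i0(m * r) + B * k0(m * r)
print(np.round(T_amb + theta, 1))           # temperature across the annulus
```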
NASA Astrophysics Data System (ADS)
Yoshidome, Takashi; Ekimoto, Toru; Matubayasi, Nobuyuki; Harano, Yuichi; Kinoshita, Masahiro; Ikeguchi, Mitsunori
2015-05-01
The hydration free energy (HFE) is a crucially important physical quantity for discussing various chemical processes in aqueous solutions. Although an explicit-solvent computation with molecular dynamics (MD) simulations is a preferable treatment of the HFE, a huge computational load has been unavoidable for large, complex solutes like proteins. In the present paper, we propose an efficient computation method for the HFE. In our method, the HFE is computed as a sum of…
Stable, accurate and efficient computation of normal modes for horizontal stratified models
NASA Astrophysics Data System (ADS)
Wu, Bo; Chen, Xiaofei
2016-08-01
We propose an adaptive root-determining strategy that is very useful when dealing with trapped modes or Stoneley modes whose energies become very insignificant on the free surface in the presence of low-velocity layers or fluid layers in the model. Loss of modes in these cases, or inaccuracy in the calculation of these modes, may then be easily avoided. Built upon the generalized reflection/transmission coefficients, the concept of a 'family of secular functions', which we here call 'adaptive mode observers', is thus naturally introduced to implement this strategy; the underlying idea, distinctly noted here for the first time, may be generalized to other applications such as free oscillations, or applied to other methods in use when these cases are encountered. Additionally, we have made further improvements upon the generalized reflection/transmission coefficient method: mode observers associated with only the free surface and low-velocity layers (and the fluid/solid interface if the model contains fluid layers) are adequate to guarantee, at the same time, no loss of any physically existent modes and high precision, without excessive calculations. Finally, the conventional definition of the fundamental mode is reconsidered, which is necessitated by the cases under study. Some computational aspects are remarked on. With the additional help afforded by our superior root-searching scheme and the possibility of speeding up the calculation by using a smaller number of layers, aided by the concept of the 'turning point', our algorithm is remarkably efficient as well as stable and accurate, and can be used as a powerful tool for a wide range of related applications.
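The root-searching idea generalizes beyond seismology. The sketch below (with a toy secular function, not the paper's) shows how a coarse sign-change scan loses closely spaced roots, and how one adaptive refinement pass around near-tangent dips of |f| recovers them.

```python
# Bracket-and-bisect root scan with an adaptive pass around |f| minima that
# show no sign change, where a pair of nearby roots can hide in one grid cell.
import numpy as np
from scipy.optimize import brentq

def find_roots(f, a, b, n=200):
    x = np.linspace(a, b, n)
    y = f(x)
    roots = [brentq(f, x[i], x[i+1]) for i in range(n-1) if y[i]*y[i+1] < 0]
    m = np.abs(y)
    for i in range(1, n - 1):
        if m[i] < m[i-1] and m[i] < m[i+1] and y[i-1]*y[i+1] > 0:
            xs = np.linspace(x[i-1], x[i+1], 64)    # refine suspicious dip
            ys = f(xs)
            roots += [brentq(f, xs[j], xs[j+1])
                      for j in range(63) if ys[j]*ys[j+1] < 0]
    return np.unique(np.round(roots, 12))

f = lambda x: np.sin(100.0 / x)        # roots cluster toward the left edge
print(len(find_roots(f, 1.0, 10.0)))   # 28 roots; a plain coarse scan misses some
```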
Wijma, Hein J; Marrink, Siewert J; Janssen, Dick B
2014-07-28
Computational approaches could decrease the need for the laborious high-throughput experimental screening that is often required to improve enzymes by mutagenesis. Here, we report that using multiple short molecular dynamics (MD) simulations makes it possible to accurately model enantioselectivity for large numbers of enzyme-substrate combinations at low computational cost. We chose four different haloalkane dehalogenases as model systems because of the availability of a large set of experimental data on the enantioselective conversion of 45 different substrates. To model the enantioselectivity, we quantified the frequency of occurrence of catalytically productive conformations (near-attack conformations) for pairs of enantiomers during MD simulations. We found that the angle of nucleophilic attack that leads to carbon-halogen bond cleavage was a critical variable that limited the occurrence of productive conformations; enantiomers for which this angle reached values close to 180° were preferentially converted. A cluster of 20-40 very short (10 ps) MD simulations allowed adequate conformational sampling and resulted in much better agreement with experimental enantioselectivities than single long MD simulations (22 ns), while the computational costs were 50- to 100-fold lower. With single long MD simulations, the dynamics of enzyme-substrate complexes remained confined to a conformational subspace that rarely changed significantly, whereas with multiple short MD simulations a larger diversity of conformations of enzyme-substrate complexes was observed. PMID:24916632
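A minimal sketch of the scoring step (with hypothetical geometry arrays, not the authors' pipeline): count near-attack conformations (NACs) for each enantiomer across pooled MD frames, then convert the ratio of NAC frequencies into an enantioselectivity and barrier difference, assuming NAC frequency is proportional to reaction rate.

```python
# NAC counting and E-value estimate from per-frame attack geometries.
import numpy as np

R, T = 8.314, 303.15                      # J/mol/K, K (assumed conditions)

def nac_fraction(angle, dist, angle_min=150.0, dist_max=3.5):
    """Fraction of frames with near-linear attack angle and short distance."""
    return np.mean((angle >= angle_min) & (dist <= dist_max))

rng = np.random.default_rng(1)
# hypothetical geometries pooled from many short MD runs, one enantiomer each
fR = nac_fraction(rng.normal(165, 10, 4000), rng.normal(3.2, 0.3, 4000))
fS = nac_fraction(rng.normal(140, 10, 4000), rng.normal(3.4, 0.3, 4000))

E = fR / fS                               # enantioselectivity estimate
ddG = -R * T * np.log(E) / 1000.0         # kJ/mol barrier difference (negative
print(f"E = {E:.1f}, ddG = {ddG:.2f} kJ/mol")   # means R converts faster)
```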
Efficiency and Accuracy of Time-Accurate Turbulent Navier-Stokes Computations
NASA Technical Reports Server (NTRS)
Rumsey, Christopher L.; Sanetrik, Mark D.; Biedron, Robert T.; Melson, N. Duane; Parlette, Edward B.
1995-01-01
The accuracy and efficiency of two types of subiterations in both explicit and implicit Navier-Stokes codes are explored for unsteady laminar circular-cylinder flow and unsteady turbulent flow over an 18-percent-thick circular-arc (biconvex) airfoil. Grid and time-step studies are used to assess the numerical accuracy of the methods. Nonsubiterative time-stepping schemes and schemes with physical-time subiterations are subject to time-step limitations in practice that are removed by pseudo-time subiterations. Computations for the circular-arc airfoil indicate that a one-equation turbulence model predicts the unsteady separated flow better than an algebraic turbulence model; also, the hysteresis with Mach number of the self-excited unsteadiness due to shock and boundary-layer separation is well predicted.
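The pseudo-time subiteration mechanism can be stripped down to a scalar model problem. The sketch below (hypothetical step sizes) takes an implicit physical step far beyond the explicit stability limit by relaxing the implicit residual in pseudo-time.

```python
# Dual time stepping for du/dt = f(u): each physical BDF1 step is solved by
# marching an explicit pseudo-time iteration to convergence.
import numpy as np

lam = 50.0
f = lambda u: -lam * u                    # stiff linear model problem

def bdf1_step(u_n, dt, dtau, iters=200, tol=1e-12):
    """Solve (u - u_n)/dt = f(u) by relaxing the residual in pseudo-time."""
    u = u_n
    for _ in range(iters):
        res = f(u) - (u - u_n) / dt       # implicit residual
        u = u + dtau * res                # explicit pseudo-time update
        if abs(res) < tol:
            break
    return u

dt = 0.1                                  # far beyond the explicit limit 2/lam
dtau = 0.9 / (lam + 1.0 / dt)             # pseudo-step obeys its own limit
u = 1.0
for _ in range(5):
    u = bdf1_step(u, dt, dtau)
print(u, (1.0 / (1.0 + lam * dt)) ** 5)   # matches the exact BDF1 solution
```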
Vela, Sergi; Fumanal, Maria; Ribas-Arino, Jordi; Robert, Vincent
2015-07-01
The DFT + U methodology is regarded as one of the most promising strategies to treat the solid state of molecular materials, as it may provide good energetic accuracy at a moderate computational cost. However, a careful parametrization of the U-term is mandatory, since the results may be dramatically affected by the selected value. Herein, we benchmarked the Hubbard-like U-term for seven Fe(II)N6-based pseudo-octahedral spin crossover (SCO) compounds, using as a reference an estimation of the electronic enthalpy difference (ΔH_elec) extracted from experimental data (T_1/2, ΔS and ΔH). The parametrized U-value obtained for each of those seven compounds ranges from 2.37 eV to 2.97 eV, with an average value of U = 2.65 eV. Interestingly, we have found that this average value can be taken as a good starting point, since it leads to an unprecedented mean absolute error (MAE) of only 4.3 kJ mol(-1) in the evaluation of ΔH_elec for the studied compounds. Moreover, by comparing our results on the solid state and the gas phase of the materials, we quantify the influence of the intermolecular interactions on the relative stability of the HS and LS states, with an average effect of ca. 5 kJ mol(-1), whose sign cannot be generalized. Overall, the findings reported in this manuscript pave the way for future studies devoted to understanding the crystalline phase of SCO compounds, or the adsorption of individual molecules on organic or metallic surfaces, in which the rational incorporation of the U-term within DFT + U yields the required energetic accuracy that is dramatically missing when using bare DFT functionals. PMID:26040609
NASA Technical Reports Server (NTRS)
Kory, Carol L.
1999-01-01
The phenomenal growth of commercial communications has created a great demand for traveling-wave tube (TWT) amplifiers. Although the helix slow-wave circuit remains the mainstay of the TWT industry because of its exceptionally wide bandwidth, until recently it has been impossible to accurately analyze a helical TWT using its exact dimensions because of the complexity of its geometrical structure. For the first time, an accurate three-dimensional helical model was developed that allows accurate prediction of TWT cold-test characteristics including operating frequency, interaction impedance, and attenuation. This computational model, which was developed at the NASA Lewis Research Center, allows TWT designers to obtain a more accurate value of interaction impedance than is possible using experimental methods. Obtaining helical slow-wave circuit interaction impedance is an important part of the design process for a TWT because it is related to the gain and efficiency of the tube. This impedance cannot be measured directly; thus, conventional methods involve perturbing a helical circuit with a cylindrical dielectric rod placed on the central axis of the circuit and obtaining the difference in resonant frequency between the perturbed and unperturbed circuits. A mathematical relationship has been derived between this frequency difference and the interaction impedance (ref. 1). However, because of the complex configuration of the helical circuit, deriving this relationship involves several approximations. In addition, this experimental procedure is time-consuming and expensive, but until recently it was widely accepted as the most accurate means of determining interaction impedance. The advent of an accurate three-dimensional helical circuit model (ref. 2) made it possible for Lewis researchers to fully investigate standard approximations made in deriving the relationship between measured perturbation data and interaction impedance. The most prominent approximations made…
Dybeck, Eric C; Schieber, Natalie P; Shirts, Michael R
2016-08-01
We examine the free energies of three benzene polymorphs as a function of temperature in the point-charge OPLS-AA and GROMOS54A7 potentials as well as the polarizable AMOEBA09 potential. For this system, using a polarizable Hamiltonian instead of the cheaper point-charge potentials is shown to have a significantly smaller effect on the stability at 250 K than on the lattice energy at 0 K. The benzene I polymorph is found to be the most stable crystal structure in all three potentials examined and at all temperatures examined. For each potential, we report the free energies over a range of temperatures and discuss the added value of using full free energy methods over the minimized lattice energy to determine the relative crystal stability at finite temperatures. The free energies in the polarizable Hamiltonian are efficiently calculated using samples collected in a cheaper point-charge potential. The polarizable free energies are estimated from the point-charge trajectories using Boltzmann reweighting with MBAR. The high configuration-space overlap necessary for efficient Boltzmann reweighting is achieved by designing point-charge potentials with intramolecular parameters matching those in the expensive polarizable Hamiltonian. Finally, we compare the computational cost of this indirect reweighted free energy estimate to the cost of simulating directly in the expensive polarizable Hamiltonian. PMID:27341280
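The reweighting step can be illustrated in its single-state form (Zwanzig exponential reweighting; MBAR generalizes this to many states and reduces the variance). The energies below are synthetic stand-ins for the same configurations evaluated in the cheap and the expensive Hamiltonians.

```python
# Boltzmann reweighting of cheap-potential samples into an expensive ensemble.
import numpy as np

kT = 8.314e-3 * 250.0                     # kJ/mol at the paper's 250 K

rng = np.random.default_rng(2)
u_cheap = rng.normal(0.0, 1.0, 20000)             # U_cheap(x_i), sampled in cheap model
u_polar = u_cheap + rng.normal(0.5, 0.3, 20000)   # U_polarizable(x_i), re-evaluated

du = u_polar - u_cheap
w = np.exp(-(du - du.min()) / kT)         # shifted exponentials for stability
dA = du.min() - kT * np.log(w.mean())     # Zwanzig free energy difference

O = u_cheap                               # stand-in observable O(x_i)
O_polar = np.sum(O * w) / np.sum(w)       # reweighted expensive-ensemble average
# effective sample size diagnoses whether configuration-space overlap suffices
neff = np.sum(w) ** 2 / np.sum(w ** 2)
print(dA, O_polar, neff)
```

Matching the intramolecular parameters of the two potentials, as the authors do, is precisely what keeps the spread of du small and neff large.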
NASA Astrophysics Data System (ADS)
Hrubý, Jan
2012-04-01
Mathematical modeling of the non-equilibrium condensing transonic steam flow in the complex 3D geometry of a steam turbine is a demanding problem both concerning the physical concepts and the required computational power. Available accurate formulations of steam properties IAPWS-95 and IAPWS-IF97 require much computation time. For this reason, the modelers often accept the unrealistic ideal-gas behavior. Here we present a computation scheme based on a piecewise, thermodynamically consistent representation of the IAPWS-95 formulation. Density and internal energy are chosen as independent variables to avoid variable transformations and iterations. On the contrary to the previous Tabular Taylor Series Expansion Method, the pressure and temperature are continuous functions of the independent variables, which is a desirable property for the solution of the differential equations of the mass, energy, and momentum conservation for both phases.
NASA Astrophysics Data System (ADS)
Feldgus, Steven; Shields, George C.
2001-10-01
The Bergman cyclization of large polycyclic enediyne systems that mimic the cores of the enediyne anticancer antibiotics was studied using the ONIOM hybrid method. Tests on small enediynes show that ONIOM can accurately match experimental data. The effect of the triggering reaction in the natural products is investigated, and we support the argument that it is strain effects that lower the cyclization barrier. The barrier for the triggered molecule is very low, leading to a reasonable half-life at biological temperatures. No evidence is found that would suggest a concerted cyclization/H-atom abstraction mechanism is necessary for DNA cleavage.
NASA Technical Reports Server (NTRS)
Lindner, Bernhard Lee; Ackerman, Thomas P.; Pollack, James B.
1990-01-01
CO2 makes up 95% of the Martian atmosphere. However, the Martian atmosphere also has a high aerosol content; dust particle sizes vary from less than 0.2 μm to greater than 3.0 μm. CO2 is an active absorber and emitter at near-IR and IR wavelengths; the near-IR absorption bands of CO2 provide significant heating of the atmosphere, and the 15 μm band provides rapid cooling. Including both CO2 and aerosol radiative transfer simultaneously in a model is difficult: aerosol radiative transfer requires a multiple-scattering code, while CO2 radiative transfer must deal with complex wavelength structure. As an alternative to the pure-atmosphere treatment in most models, which causes inaccuracies, a treatment was developed called the exponential-sum or k-distribution approximation. The chief advantage of the exponential-sum approach is that the integration over k-space of f(k) can be computed more quickly than the integration of k_ν over frequency. The exponential-sum approach is superior to the photon path distribution and emissivity techniques for dusty conditions. This study was the first application of the exponential-sum approach to Martian conditions.
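The exponential-sum (k-distribution) idea is easy to demonstrate with synthetic data: sort the spectral absorption coefficients into a cumulative distribution g(k), then integrate the transmittance over g with a handful of quadrature points instead of over thousands of frequency points.

```python
# k-distribution: replace a frequency integral of exp(-k_nu m) with a short
# quadrature over the cumulative distribution g of the sorted coefficients.
import numpy as np

rng = np.random.default_rng(3)
k_nu = 10.0 ** rng.normal(-1.0, 1.5, 50000)   # synthetic band spectrum
m = 2.0                                       # absorber amount along the path

# line-by-line reference: average transmittance over frequency
T_lbl = np.mean(np.exp(-k_nu * m))

# k as a smooth, monotone function of cumulative probability g
k_sorted = np.sort(k_nu)
g = (np.arange(k_nu.size) + 0.5) / k_nu.size

# 8-point Gauss-Legendre quadrature over g in [0, 1]
nodes, weights = np.polynomial.legendre.leggauss(8)
gq = 0.5 * (nodes + 1.0)
wq = 0.5 * weights
T_kdist = np.sum(wq * np.exp(-np.interp(gq, g, k_sorted) * m))

print(T_lbl, T_kdist)   # 8 evaluations closely approximate the 50000-point mean
```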
NASA Astrophysics Data System (ADS)
Lee, Y. C.; Thompson, H. M.; Gaskell, P. H.
2009-12-01
… industrial and physical applications. However, despite recent modelling advances, the accurate numerical solution of the equations governing such problems is still at a relatively early stage. Indeed, recent studies employing a simplifying long-wave approximation have shown that highly efficient numerical methods are necessary to solve the resulting lubrication equations in order to achieve the level of grid resolution required to accurately capture the effects of micro- and nano-scale topographical features. Solution method: A portable parallel multigrid algorithm has been developed for the above purpose, for the particular case of flow over submerged topographical features. Within the multigrid framework adopted, a W-cycle is used to accelerate convergence in respect of the time dependent nature of the problem, with relaxation sweeps performed using a fixed number of pre- and post-Red-Black Gauss-Seidel Newton iterations. In addition, the algorithm incorporates automatic adaptive time-stepping to avoid the computational expense associated with repeated time-step failure. Running time: 1.31 minutes using 128 processors on BlueGene/P with a problem size of over 16.7 million mesh points.
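A central ingredient of such a solver is the Red-Black Gauss-Seidel smoother. The sketch below applies it to the model Poisson problem (the lubrication-equation smoother is analogous but wraps a Newton iteration around the nonlinearity).

```python
# Red-Black Gauss-Seidel sweeps for -lap(u) = f on a unit square with a fixed
# Dirichlet boundary ring; same-color points never neighbor each other, so
# each color can be updated with vectorized slices.
import numpy as np

def rb_gauss_seidel(u, f, h, sweeps=3):
    for _ in range(sweeps):
        for parity in (0, 1):                      # red points, then black
            for i in range(1, u.shape[0] - 1):
                j0 = 1 + (i + parity) % 2          # checkerboard column offset
                u[i, j0:-1:2] = 0.25 * (u[i - 1, j0:-1:2] + u[i + 1, j0:-1:2]
                                        + u[i, j0 - 1:-2:2] + u[i, j0 + 1::2]
                                        + h * h * f[i, j0:-1:2])
    return u

n = 65
h = 1.0 / (n - 1)
f = np.ones((n, n))
u = rb_gauss_seidel(np.zeros((n, n)), f, h, sweeps=50)
res = f[1:-1, 1:-1] - (4 * u[1:-1, 1:-1] - u[:-2, 1:-1] - u[2:, 1:-1]
                       - u[1:-1, :-2] - u[1:-1, 2:]) / h**2
print(np.abs(res).max())   # smoothing alone converges slowly; multigrid
                           # coarse-grid corrections supply the missing speed
```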
NASA Astrophysics Data System (ADS)
Walker, Olivier; Varadan, Ranjani; Fushman, David
2004-06-01
We present a computer program, ROTDIF, for efficient determination of the complete rotational diffusion tensor of a molecule from NMR relaxation data. The derivation of the rotational diffusion tensor in the case of a fully anisotropic model is based on a six-dimensional search, which could be very time consuming, particularly if a grid search in the Euler angle space is involved. Here, we use an efficient Levenberg-Marquardt algorithm combined with Monte Carlo generation of initial guesses. The result is a dramatic, up to 50-fold improvement in computational efficiency over the previous approaches [Biochemistry 38 (1999) 10225; J. Magn. Reson. 149 (2001) 214]. The method is demonstrated on computer-generated and real protein systems. We also address the sensitivity of the diffusion tensor determination from 15N relaxation measurements to experimental errors in the relaxation rates, and discuss possible artifacts from applying a higher-symmetry tensor model and how to recognize them.
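The fitting strategy (Levenberg-Marquardt plus Monte Carlo restarts) can be sketched on a stand-in model: a quadric d_i = v_i^T A v_i fitted to per-bond-vector diffusion estimates. The data and model below are illustrative, not ROTDIF's exact equations.

```python
# Multi-start Levenberg-Marquardt fit of a symmetric 3x3 tensor to
# orientation-dependent observations d_i = v_i^T A v_i + noise.
import numpy as np
from scipy.optimize import least_squares

def quadric(params, V):
    a, b, c, d, e, f = params
    A = np.array([[a, d, e], [d, b, f], [e, f, c]])
    return np.einsum('ij,jk,ik->i', V, A, V)      # v_i^T A v_i for each row

rng = np.random.default_rng(4)
V = rng.standard_normal((60, 3))
V /= np.linalg.norm(V, axis=1, keepdims=True)     # unit NH bond vectors
true = np.array([1.0, 2.0, 4.0, 0.3, -0.2, 0.1])
d_obs = quadric(true, V) + rng.normal(0, 0.02, 60)

best = None
for _ in range(20):                               # Monte Carlo initial guesses
    x0 = rng.uniform(-5, 5, 6)
    fit = least_squares(lambda p: quadric(p, V) - d_obs, x0, method='lm')
    if best is None or fit.cost < best.cost:
        best = fit
print(np.round(best.x, 2), true)
```

The full anisotropic problem is genuinely nonlinear in the Euler angles, which is where the random restarts earn their keep; this convex stand-in just exercises the same machinery.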
Accurate modeling of parallel scientific computations
NASA Technical Reports Server (NTRS)
Nicol, David M.; Townsend, James C.
1988-01-01
Scientific codes are usually parallelized by partitioning a grid among processors. To achieve top performance it is necessary to partition the grid so as to balance workload and minimize communication/synchronization costs. This problem is particularly acute when the grid is irregular, changes over the course of the computation, and is not known until load time. Critical mapping and remapping decisions rest on the ability to accurately predict performance, given a description of a grid and its partition. This paper discusses one approach to this problem, and illustrates its use on a one-dimensional fluids code. The models constructed are shown to be accurate, and are used to find optimal remapping schedules.
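A toy version of such a performance model (coefficients hypothetical): the predicted step time of a partition is the slowest processor's compute-plus-communication cost, and candidate remappings are compared by predicted time before paying for an actual remap.

```python
# Predict the step time of 1D grid partitions and compare mapping choices.
import numpy as np

alpha, beta = 1.0e-6, 5.0e-5        # s per cell, s per boundary exchange

def predicted_time(work_per_cell, cuts):
    """cuts: indices splitting a 1D grid into contiguous per-processor blocks."""
    blocks = np.split(work_per_cell, cuts)
    compute = np.array([alpha * b.sum() for b in blocks])
    comm = beta * 2.0                          # two neighbor exchanges each
    return (compute + comm).max()              # slowest processor dominates

rng = np.random.default_rng(5)
work = np.ones(10000) + 5.0 * (rng.random(10000) < 0.1)   # irregular grid

even = np.arange(2500, 10000, 2500)                       # equal cell counts
csum = np.cumsum(work)
balanced = np.searchsorted(csum, csum[-1] * np.array([0.25, 0.5, 0.75]))
print(predicted_time(work, even), predicted_time(work, balanced))
```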
Efficient and accurate sound propagation using adaptive rectangular decomposition.
Raghuvanshi, Nikunj; Narain, Rahul; Lin, Ming C
2009-01-01
Accurate sound rendering can add significant realism to complement visual display in interactive applications, as well as facilitate acoustic predictions for many engineering applications, like accurate acoustic analysis for architectural design. Numerical simulation can provide this realism most naturally by modeling the underlying physics of wave propagation. However, wave simulation has traditionally posed a tough computational challenge. In this paper, we present a technique which relies on an adaptive rectangular decomposition of 3D scenes to enable efficient and accurate simulation of sound propagation in complex virtual environments. It exploits the known analytical solution of the Wave Equation in rectangular domains, and utilizes an efficient implementation of the Discrete Cosine Transform on Graphics Processors (GPU) to achieve at least a 100-fold performance gain compared to a standard Finite-Difference Time-Domain (FDTD) implementation with comparable accuracy, while also being 10-fold more memory efficient. Consequently, we are able to perform accurate numerical acoustic simulation on large, complex scenes in the kilohertz range. To the best of our knowledge, it was not previously possible to perform such simulations on a desktop computer. Our work thus enables acoustic analysis on large scenes and auditory display for complex virtual environments on commodity hardware. PMID:19590105
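The core interior update is simple to sketch (without the paper's domain decomposition and interface coupling): cosine modes of a rectangle evolve analytically, so one DCT pair per step advances the field exactly, for time steps far above any CFL limit.

```python
# Exact modal update of the 2D wave equation in a rectangle via the DCT:
# each cosine mode satisfies P(t+dt) = 2 cos(w dt) P(t) - P(t-dt) exactly.
import numpy as np
from scipy.fft import dctn, idctn

nx, ny, h, c = 128, 96, 0.01, 340.0
dt = 1.0e-3                                   # ~50x the FDTD stability limit

kx = np.pi * np.arange(nx) / (nx * h)         # DCT-II mode wavenumbers
ky = np.pi * np.arange(ny) / (ny * h)
omega = c * np.sqrt(kx[:, None] ** 2 + ky[None, :] ** 2)

x, y = np.meshgrid(np.arange(nx) * h, np.arange(ny) * h, indexing='ij')
p = np.exp(-((x - 0.6) ** 2 + (y - 0.5) ** 2) / 1e-3)    # initial pulse
P_now = dctn(p, norm='ortho')
P_prev = np.cos(omega * dt) * P_now           # exact p(-dt) for zero velocity

for _ in range(100):                          # exact modal leapfrog
    P_now, P_prev = 2.0 * np.cos(omega * dt) * P_now - P_prev, P_now

print(idctn(P_now, norm='ortho').max())       # field after 0.1 s, no dispersion
```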
Computationally efficient multibody simulations
NASA Technical Reports Server (NTRS)
Ramakrishnan, Jayant; Kumar, Manoj
1994-01-01
Computationally efficient approaches to the solution of the dynamics of multibody systems are presented in this work. The computational efficiency is derived from both the algorithmic and implementational standpoint. Order(n) approaches provide a new formulation of the equations of motion eliminating the assembly and numerical inversion of a system mass matrix as required by conventional algorithms. Computational efficiency is also gained in the implementation phase by the symbolic processing and parallel implementation of these equations. Comparison of this algorithm with existing multibody simulation programs illustrates the increased computational efficiency.
Accurate Measurement of Organic Solar Cell Efficiency
Emery, K.; Moriarty, T.
2008-01-01
We discuss the measurement and analysis of current vs. voltage (I-V) characteristics of organic and dye-sensitized photovoltaic cells and modules. A brief discussion of the history of photovoltaic efficiency measurements and procedures will be presented. We discuss both the error sources in the measurements and the strategies to minimize their influence. These error sources include the sample area, spectral errors, temperature fluctuations, current and voltage response time, contacting, and degradation during testing. Information that can be extracted from light and dark I-V measurement includes peak power, open-circuit voltage, short-circuit current, series and shunt resistance, diode quality factor, dark current, and photo-current. The quantum efficiency provides information on photo-current nonlinearities, current generation, and recombination mechanisms.
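Extracting the headline parameters from a measured I-V curve is straightforward. The sketch below uses synthetic single-diode data (a real measurement would follow the reference-cell and spectral-correction procedures the report describes).

```python
# Isc, Voc, peak power, fill factor, and efficiency from I-V arrays.
import numpy as np

q_kT = 1.0 / 0.02565                       # 1/(kT/q) at 25 C, in 1/V
Isc_true, I0, nid = 0.030, 1e-9, 1.8       # A, A, ideality (synthetic cell)

V = np.linspace(0.0, 0.9, 500)
I = Isc_true - I0 * (np.exp(q_kT * V / nid) - 1.0)   # generator convention

Isc = np.interp(0.0, V, I)                 # current at V = 0
Voc = np.interp(0.0, -I, V)                # -I is increasing, so interp works
Pmax = (V * I).max()
FF = Pmax / (Voc * Isc)                    # fill factor
area, irradiance = 1.0e-4, 1000.0          # m^2, W/m^2 (assumed STC input)
eff = Pmax / (irradiance * area)
print(f"Isc={Isc*1e3:.1f} mA  Voc={Voc:.3f} V  FF={FF:.3f}  eff={eff:.3f}")
```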
Merced-Grafals, Emmanuelle J; Dávila, Noraica; Ge, Ning; Williams, R Stanley; Strachan, John Paul
2016-09-01
Beyond use as high density non-volatile memories, memristors have potential as synaptic components of neuromorphic systems. We investigated the suitability of tantalum oxide (TaOx) transistor-memristor (1T1R) arrays for such applications, particularly the ability to accurately, repeatedly, and rapidly reach arbitrary conductance states. Programming is performed by applying an adaptive pulsed algorithm that utilizes the transistor gate voltage to control the SET switching operation and increase programming speed of the 1T1R cells. We show the capability of programming 64 conductance levels with <0.5% average accuracy using 100 ns pulses and studied the trade-offs between programming speed and programming error. The algorithm is also utilized to program 16 conductance levels on a population of cells in the 1T1R array showing robustness to cell-to-cell variability. In general, the proposed algorithm results in approximately 10× improvement in programming speed over standard algorithms that do not use the transistor gate to control memristor switching. In addition, after only two programming pulses (an initialization pulse followed by a programming pulse), the resulting conductance values are within 12% of the target values in all cases. Finally, endurance of more than 10^6 cycles is shown through open-loop (single pulses) programming across multiple conductance levels using the optimized gate voltage of the transistor. These results are relevant for applications that require high speed, accurate, and repeatable programming of the cells such as in neural networks and analog data processing. PMID:27479054
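A schematic closed-loop program-and-verify sketch in the spirit of the adaptive algorithm follows; the device response is a made-up stand-in (not a TaOx model) and the gate-voltage rule is illustrative only.

```python
# Program-and-verify loop: read conductance, adapt the gate voltage to the
# remaining error, and pulse until the target level is hit within tolerance.
import numpy as np

rng = np.random.default_rng(6)

def apply_pulse(g, v_gate, polarity):
    """Hypothetical device: SET (+1) raises, RESET (-1) lowers conductance;
    the transistor gate voltage scales the step size."""
    dg = polarity * 3e-6 * v_gate * rng.uniform(0.5, 1.5)
    return float(np.clip(g + dg, 1e-6, 3e-4))      # bounded conductance [S]

def program(g_target, tol=0.005, v_min=0.2, v_max=1.2, max_pulses=200):
    g = 1e-6                                        # start from the low state
    for pulse in range(1, max_pulses + 1):
        err = (g - g_target) / g_target             # verify (read) step
        if abs(err) <= tol:
            return g, pulse
        v = min(v_max, max(v_min, v_max * abs(err)))  # shrink steps near target
        g = apply_pulse(g, v, polarity=-1 if err > 0 else +1)
    return g, max_pulses

g, n = program(g_target=1.0e-4)
print(f"reached {g:.3e} S within 0.5% in {n} pulses")
```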
Computationally efficient Bayesian tracking
NASA Astrophysics Data System (ADS)
Aughenbaugh, Jason; La Cour, Brian
2012-06-01
In this paper, we describe the progress we have achieved in developing a computationally efficient, grid-based Bayesian fusion tracking system. In our approach, the probability surface is represented by a collection of multidimensional polynomials, each computed adaptively on a grid of cells representing state space. Time evolution is performed using a hybrid particle/grid approach and knowledge of the grid structure, while sensor updates use a measurement-based sampling method with a Delaunay triangulation. We present an application of this system to the problem of tracking a submarine target using a field of active and passive sonar buoys.
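A one-dimensional toy of the grid-based Bayesian tracker: the posterior lives on a grid of cells, prediction diffuses it with a motion kernel, and each sensor report multiplies in a likelihood. (The paper's polynomial-per-cell surface representation and sonar models are beyond this sketch.)

```python
# Grid-based Bayes filter: convolve to predict, multiply likelihood to update.
import numpy as np

cells = np.linspace(0.0, 100.0, 501)             # state-space grid [m]
p = np.ones_like(cells) / cells.size             # uniform prior
rng = np.random.default_rng(9)

def predict(p, drift_cells=5, sigma_cells=3.0):
    """Shifted Gaussian motion kernel applied by convolution."""
    k = np.arange(-25, 26)
    kernel = np.exp(-0.5 * ((k - drift_cells) / sigma_cells) ** 2)
    return np.convolve(p, kernel / kernel.sum(), mode='same')

def update(p, z, sigma_z=4.0):
    """Multiply by the measurement likelihood and renormalize."""
    p = p * np.exp(-0.5 * ((cells - z) / sigma_z) ** 2)
    return p / p.sum()

truth = 20.0
for _ in range(10):
    truth += 1.0                                 # target drifts +1 m per step
    p = predict(p, drift_cells=5)                # grid spacing is 0.2 m
    p = update(p, z=truth + rng.normal(0.0, 4.0))
print("MAP estimate:", cells[p.argmax()], "truth:", truth)
```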
Computationally efficient control allocation
NASA Technical Reports Server (NTRS)
Durham, Wayne (Inventor)
2001-01-01
A computationally efficient method for calculating near-optimal solutions to the three-objective, linear control allocation problem is disclosed. The control allocation problem is that of distributing the effort of redundant control effectors to achieve some desired set of objectives. The problem is deemed linear if control effectiveness is affine with respect to the individual control effectors. The optimal solution is that which exploits the collective maximum capability of the effectors within their individual physical limits. Computational efficiency is measured by the number of floating-point operations required for solution. The method presented returned optimal solutions in more than 90% of the cases examined; non-optimal solutions returned by the method were typically much less than 1% different from optimal, and the errors tended to become smaller than 0.01% as the number of controls was increased. The magnitude of the errors returned by the present method was much smaller than those that resulted from either pseudoinverse or cascaded generalized inverse solutions. The computational complexity of the method presented varied linearly with increasing numbers of controls; the number of required floating-point operations increased from 5.5 to seven times faster than did the minimum-norm solution (the pseudoinverse), and at about the same rate as did the cascaded generalized inverse solution. The computational requirements of the method presented were much better than those of previously described facet-searching methods, which increase in proportion to the square of the number of controls.
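The core of the problem can be posed as a direction-preserving linear program (a standard formulation, not the patent's method): scale the commanded objective vector d by rho <= 1 subject to effector limits; rho = 1 means the demand is attainable. All matrices below are hypothetical.

```python
# Direction-preserving control allocation: maximize rho with B u = rho d and
# box limits on the effector commands u.
import numpy as np
from scipy.optimize import linprog

B = np.array([[0.5, -0.5, 0.2, -0.2, 0.0],     # control effectiveness
              [0.1, 0.1, -0.6, -0.6, 0.0],     # (3 objectives x 5 effectors)
              [0.0, 0.3, 0.3, -0.1, 0.8]])
u_lim = 0.4                                    # symmetric effector limits
d = np.array([0.3, -0.2, 0.25])                # commanded moments

m = B.shape[1]
c = np.zeros(m + 1)
c[-1] = -1.0                                   # maximize rho
A_eq = np.hstack([B, -d[:, None]])             # B u - rho d = 0
bounds = [(-u_lim, u_lim)] * m + [(0.0, 1.0)]
res = linprog(c, A_eq=A_eq, b_eq=np.zeros(3), bounds=bounds, method='highs')
u, rho = res.x[:m], res.x[-1]
print("rho =", round(rho, 3), " u =", np.round(u, 3), " Bu =", np.round(B @ u, 3))
```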
Direct computation of parameters for accurate polarizable force fields
Verstraelen, Toon; Vandenbrande, Steven; Ayers, Paul W.
2014-11-21
We present an improved electronic linear response model to incorporate polarization and charge-transfer effects in polarizable force fields. This model is a generalization of Atom-Condensed Kohn-Sham density functional theory (DFT) approximated to second order (ACKS2): it can now be defined with any underlying variational theory (in addition to KS-DFT) and it can include atomic multipoles and off-center basis functions. Parameters in this model are computed efficiently as expectation values of an electronic wavefunction, obviating the need for their calibration, regularization, and manual tuning. In the limit of a complete density and potential basis set in the ACKS2 model, the linear response properties of the underlying theory for a given molecular geometry are reproduced exactly. A numerical validation with a test set of 110 molecules shows that very accurate models can already be obtained with fluctuating charges and dipoles. These features greatly facilitate the development of polarizable force fields.
Efficient determination of accurate atomic polarizabilities for polarizable embedding calculations.
Schröder, Heiner; Schwabe, Tobias
2016-08-15
We evaluate embedding potentials, obtained via various methods, used for polarizable embedding computations of the excitation energies of para-nitroaniline in water and organic solvents, as well as of the green fluorescent protein. We found that isotropic polarizabilities derived from DFT-D3 dispersion coefficients correlate well with those obtained via the LoProp method. We show that these polarizabilities, in conjunction with appropriately derived point charges, are in good agreement with calculations employing static multipole moments up to quadrupoles and anisotropic polarizabilities for both computed systems. The (partial) use of these easily accessible parameters drastically reduces the computational effort to obtain accurate embedding potentials, especially for proteins. PMID:27317509
Efficient Methods to Compute Genomic Predictions
Technology Transfer Automated Retrieval System (TEKTRAN)
Efficient methods for processing genomic data were developed to increase reliability of estimated breeding values and simultaneously estimate thousands of marker effects. Algorithms were derived and computer programs tested on simulated data for 50,000 markers and 2,967 bulls. Accurate estimates of ...
Efficient and Accurate WLAN Positioning with Weighted Graphs
NASA Astrophysics Data System (ADS)
Hansen, René; Thomsen, Bent
This paper concerns indoor location determination by using existing WLAN infrastructures and WLAN enabled mobile devices. The location fingerprinting technique performs localization by first constructing a radio map of signal strengths from nearby access points. The radio map is subsequently searched using a classification algorithm to determine a location estimate. This paper addresses two distinct challenges of location fingerprinting incurred by positioning moving users. Firstly, movement affects the positioning accuracy negatively due to increased signal strength fluctuations. Secondly, tracking moving users requires a low-latency overhead which translates into efficient computations to be done on a mobile device with limited capabilities. We present a technique to simultaneously improve the positioning accuracy and computational efficiency. The technique utilizes a weighted graph model of the indoor environment to improve positioning accuracy and computational efficiency by only considering the subset of locations in the radio map that are feasible to reach from a previously estimated position. The technique is general and can be used on top of any existing location system. Our results indicate that we are able to achieve similar dynamic localization accuracy to static localization. Effectively, we are able to counter the adverse effects of added signal fluctuations caused by movement. However, as some of our experiments testify, any location system is fundamentally constrained by the underlying environment. We give pointers to research which allows such problems to be detected early and thereby avoided before deploying a system.
Efficient universal blind quantum computation.
Giovannetti, Vittorio; Maccone, Lorenzo; Morimae, Tomoyuki; Rudolph, Terry G
2013-12-01
We give a cheat sensitive protocol for blind universal quantum computation that is efficient in terms of computational and communication resources: it allows one party to perform an arbitrary computation on a second party's quantum computer without revealing either which computation is performed, or its input and output. The first party's computational capabilities can be extremely limited: she must only be able to create and measure single-qubit superposition states. The second party is not required to use measurement-based quantum computation. The protocol requires the (optimal) exchange of O(J log₂(N)) single-qubit states, where J is the computational depth and N is the number of qubits needed for the computation. PMID:24476238
Accurate and efficient spin integration for particle accelerators
NASA Astrophysics Data System (ADS)
Abell, Dan T.; Meiser, Dominic; Ranjbar, Vahid H.; Barber, Desmond P.
2015-02-01
Accurate spin tracking is a valuable tool for understanding spin dynamics in particle accelerators and can help improve the performance of an accelerator. In this paper, we present a detailed discussion of the integrators in the spin tracking code gpuSpinTrack. We have implemented orbital integrators based on drift-kick, bend-kick, and matrix-kick splits. On top of the orbital integrators, we have implemented various integrators for the spin motion. These integrators use quaternions and Romberg quadratures to accelerate both the computation and the convergence of spin rotations. We evaluate their performance and accuracy in quantitative detail for individual elements as well as for the entire RHIC lattice. We exploit the inherently data-parallel nature of spin tracking to accelerate our algorithms on graphics processing units.
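The quaternion machinery for composing spin rotations is compact enough to sketch (independently of gpuSpinTrack's actual integrators); renormalizing the quaternion keeps the composite rotation exactly orthogonal over millions of steps.

```python
# Compose per-element spin rotations as unit quaternions, then rotate the spin.
import numpy as np

def quat_mul(a, b):
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def rotation_quat(axis, angle):
    axis = axis / np.linalg.norm(axis)
    return np.concatenate([[np.cos(angle / 2)], np.sin(angle / 2) * axis])

def rotate(q, s):
    qs = np.concatenate([[0.0], s])
    w, x, y, z = quat_mul(quat_mul(q, qs), q * [1, -1, -1, -1])
    return np.array([x, y, z])

q = np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(1000):                 # 1000 small precession steps about y
    q = quat_mul(rotation_quat(np.array([0.0, 1.0, 0.0]), 2 * np.pi / 1000), q)
    q /= np.linalg.norm(q)            # cheap renormalization, |q| = 1 exactly
s = rotate(q, np.array([0.0, 0.0, 1.0]))
print(s)   # spin returns to ~[0, 0, 1] after one full precession turn
```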
Efficient and Accurate Indoor Localization Using Landmark Graphs
NASA Astrophysics Data System (ADS)
Gu, F.; Kealy, A.; Khoshelham, K.; Shang, J.
2016-06-01
Indoor localization is important for a variety of applications such as location-based services, mobile social networks, and emergency response. Fusing spatial information is an effective way to achieve accurate indoor localization with little or with no need for extra hardware. However, existing indoor localization methods that make use of spatial information are either too computationally expensive or too sensitive to the completeness of landmark detection. In this paper, we solve this problem by using the proposed landmark graph. The landmark graph is a directed graph where nodes are landmarks (e.g., doors, staircases, and turns) and edges are accessible paths with heading information. We compared the proposed method with two common Dead Reckoning (DR)-based methods (namely, Compass + Accelerometer + Landmarks and Gyroscope + Accelerometer + Landmarks) by a series of experiments. Experimental results show that the proposed method can achieve 73% accuracy with a positioning error less than 2.5 meters, which outperforms the other two DR-based methods.
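A toy sketch of the landmark-graph correction (hypothetical coordinates): dead reckoning accumulates drift, and a detected landmark snaps the estimate to the nearest graph node reachable from the previous landmark, resetting the accumulated error.

```python
# Dead reckoning with landmark-graph resets; candidates are restricted to
# nodes adjacent (in the directed landmark graph) to the last confirmed one.
import numpy as np

nodes = {'door_A': (0.0, 0.0), 'turn_1': (10.0, 0.0),
         'stairs': (10.0, 8.0), 'door_B': (18.0, 8.0)}
edges = {'door_A': ['turn_1'], 'turn_1': ['door_A', 'stairs'],
         'stairs': ['turn_1', 'door_B'], 'door_B': ['stairs']}

rng = np.random.default_rng(7)
pos = np.array(nodes['door_A'])
last_lm = 'door_A'
for step_vec, lm_seen in [((1.0, 0.0), False)] * 10 + [((0.0, 0.8), True)]:
    pos = pos + np.array(step_vec) + rng.normal(0.0, 0.15, 2)  # noisy DR step
    if lm_seen:
        cands = edges[last_lm]                   # reachable landmarks only
        last_lm = min(cands, key=lambda n: np.linalg.norm(pos - np.array(nodes[n])))
        pos = np.array(nodes[last_lm])           # accumulated error reset
print("corrected position:", pos, "at landmark", last_lm)
```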
Accurate and efficient reconstruction of deep phylogenies from structured RNAs
Stocsits, Roman R.; Letsch, Harald; Hertel, Jana; Misof, Bernhard; Stadler, Peter F.
2009-01-01
Ribosomal RNA (rRNA) genes are probably the most frequently used data source in phylogenetic reconstruction. Individual columns of rRNA alignments are not independent as a consequence of their highly conserved secondary structures. Unless explicitly taken into account, these correlations can distort the phylogenetic signal and/or lead to gross overestimates of tree stability. Maximum likelihood and Bayesian approaches are of course amenable to using RNA-specific substitution models that treat conserved base pairs appropriately, but require accurate secondary structure models as input. So far, however, no accurate and easy-to-use tool has been available for computing structure-aware alignments and consensus structures that can deal with the large rRNAs. The RNAsalsa approach is designed to fill this gap. Capitalizing on the improved accuracy of pairwise consensus structures and informed by a priori knowledge of group-specific structural constraints, the tool provides both alignments and consensus structures that are of sufficient accuracy for routine phylogenetic analysis based on RNA-specific substitution models. The power of the approach is demonstrated using two rRNA data sets: a mitochondrial rRNA set of 26 Mammalia, and a collection of 28S nuclear rRNAs representative of the five major echinoderm groups. PMID:19723687
Automated generation of highly accurate, efficient and transferable pseudopotentials
NASA Astrophysics Data System (ADS)
Hansel, R. A.; Brock, C. N.; Paikoff, B. C.; Tackett, A. R.; Walker, D. G.
2015-11-01
A multi-objective genetic algorithm (MOGA) was used to automate a search for optimized pseudopotential parameters. Pseudopotentials were generated using the atomPAW program and density functional theory (DFT) simulations were conducted using the pwPAW program. The optimized parameters were the cutoff radius and projector energies for the s and p orbitals. The two objectives were low pseudopotential error and low computational work requirements. The error was determined from (1) the root mean square difference between the all-electron and pseudized-electron log derivative, (2) the calculated lattice constant versus reference data of Holzwarth et al., and (3) the calculated bulk modulus versus reference potentials. The computational work was defined as the number of flops required to perform the DFT simulation. Pseudopotential transferability was encouraged by optimizing each element in different lattices: (1) nitrogen in GaN, AlN, and YN, (2) oxygen in NO, ZnO, and SiO4, and (3) fluorine in LiF, NaF, and KF. The optimal solutions were equivalent in error and required significantly less computational work than the reference data. This proof-of-concept study demonstrates that the combination of MOGA and ab-initio simulations is a powerful tool that can generate a set of transferable potentials with a trade-off between accuracy (error) and computational efficiency (work).
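The selection step of any such MOGA reduces to non-dominated (Pareto) filtering over the two objectives. A minimal sketch, with hypothetical (error, work) pairs for candidate pseudopotentials:

```python
# Keep only candidates not dominated in both objectives (lower is better).
import numpy as np

def pareto_front(points):
    pts = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(pts):
        dominated = np.any(np.all(pts <= p, axis=1) & np.any(pts < p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

candidates = [(0.10, 9.0), (0.05, 20.0), (0.20, 5.0),
              (0.12, 12.0), (0.04, 40.0), (0.30, 4.0)]   # (error, work)
front = pareto_front(candidates)
print([candidates[i] for i in front])   # the accuracy/work trade-off curve
```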
Accurate and efficient linear scaling DFT calculations with universal applicability.
Mohr, Stephan; Ratcliff, Laura E; Genovese, Luigi; Caliste, Damien; Boulanger, Paul; Goedecker, Stefan; Deutsch, Thierry
2015-12-21
Density functional theory calculations are computationally extremely expensive for systems containing many atoms due to their intrinsic cubic scaling. This fact has led to the development of so-called linear-scaling algorithms during the last few decades. In this way it becomes possible to perform ab initio calculations for several tens of thousands of atoms within reasonable walltimes. However, even though the use of linear-scaling algorithms is physically well justified, their implementation often introduces some small errors. Consequently, most implementations offering such linear complexity either yield only a limited accuracy or, if one wants to go beyond this restriction, require a tedious fine-tuning of many parameters. In our linear-scaling approach within the BigDFT package, we were able to overcome this restriction. Using an ansatz based on localized support functions expressed in an underlying Daubechies wavelet basis - which offers ideal properties for accurate linear-scaling calculations - we obtain an amazingly high accuracy and a universal applicability while still keeping the possibility of simulating large systems with linear-scaling walltimes, requiring only a moderate demand on computing resources. We prove the effectiveness of our method on a wide variety of systems with different boundary conditions, for single-point calculations as well as for geometry optimizations and molecular dynamics. PMID:25958954
Neutron supermirrors: an accurate theory for layer thickness computation
NASA Astrophysics Data System (ADS)
Bray, Michael
2001-11-01
We present a new theory for the computation of Super-Mirror stacks, using accurate formulas derived from the classical optics field. Approximations are introduced into the computation, but at a later stage than existing theories, providing a more rigorous treatment of the problem. The final result is a continuous thickness stack, whose properties can be determined at the outset of the design. We find that the well-known fourth power dependence of number of layers versus maximum angle is (of course) asymptotically correct. We find a formula giving directly the relation between desired reflectance, maximum angle, and number of layers (for a given pair of materials). Note: The author of this article, a classical opticist, has limited knowledge of the Neutron world, and begs forgiveness for any shortcomings, erroneous assumptions and/or misinterpretation of previous authors' work on the subject.
Accurate Computation of Survival Statistics in Genome-Wide Studies
Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J.; Upfal, Eli
2015-01-01
A key challenge in genomics is to identify genetic variants that distinguish patients with different survival times following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because (i) the two populations determined by a genetic variant may have very different sizes, and (ii) the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data, where the standard log-rank test leads to many false-positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for populations of any size. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens to hundreds of likely false-positive associations as more significant than these known associations. PMID:25950620
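The flavor of the problem can be reproduced with a Monte Carlo surrogate (ExaLT itself computes the exact distribution analytically): compute the log-rank statistic as usual, but estimate its null distribution by permuting group labels. The data below are synthetic, with a deliberately tiny mutated population.

```python
# Log-rank statistic with a permutation null instead of the chi-square
# asymptotics, which break down for very unbalanced group sizes.
import numpy as np

def logrank_stat(time, event, group):
    """Observed-minus-expected events in group 1, standardized."""
    O = E = V = 0.0
    for t in np.unique(time[event]):
        at_risk = time >= t
        n, n1 = at_risk.sum(), (at_risk & (group == 1)).sum()
        d = ((time == t) & event).sum()
        d1 = ((time == t) & event & (group == 1)).sum()
        O += d1
        E += d * n1 / n
        if n > 1:
            V += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return (O - E) / np.sqrt(V)

rng = np.random.default_rng(8)
n, n1 = 120, 6                            # tiny mutated population
group = np.zeros(n, dtype=int)
group[:n1] = 1
time = rng.exponential(np.where(group == 1, 0.4, 1.0))
event = rng.random(n) < 0.8               # some patients are censored

z_obs = logrank_stat(time, event, group)
perm = np.array([logrank_stat(time, event, rng.permutation(group))
                 for _ in range(1000)])
p_perm = np.mean(np.abs(perm) >= abs(z_obs))
print(f"z = {z_obs:.2f}, permutation p = {p_perm:.4f}")
```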
Accurate ionization potential of semiconductors from efficient density functional calculations
NASA Astrophysics Data System (ADS)
Ye, Lin-Hui
2016-07-01
Despite its huge successes in total-energy-related applications, the Kohn-Sham scheme of density functional theory cannot get reliable single-particle excitation energies for solids. In particular, it has not been able to calculate the ionization potential (IP), one of the most important material parameters, for semiconductors. We illustrate that an approximate exact-exchange optimized effective potential (EXX-OEP), the Becke-Johnson exchange, can be used to largely solve this long-standing problem. For a group of 17 semiconductors, we have obtained the IPs to an accuracy similar to that of the much more sophisticated GW approximation (GWA), at the computational cost of only the local-density approximation/generalized gradient approximation. The EXX-OEP, therefore, is likely as useful for solids as for finite systems. For solid surfaces, the asymptotic behavior of v_xc has effects similar to those in finite systems which, when neglected, typically cause the semiconductor IPs to be underestimated. This may partially explain why the standard GWA systematically underestimates the IPs and why using the same GWA procedures has not been able to give an accurate IP and band gap at the same time.
An Accurate and Dynamic Computer Graphics Muscle Model
NASA Technical Reports Server (NTRS)
Levine, David Asher
1997-01-01
A computer based musculo-skeletal model was developed at the University in the departments of Mechanical and Biomedical Engineering. This model accurately represents human shoulder kinematics. The result of this model is the graphical display of bones moving through an appropriate range of motion based on inputs of EMGs and external forces. The need existed to incorporate a geometric muscle model in the larger musculo-skeletal model. Previous muscle models did not accurately represent muscle geometries, nor did they account for the kinematics of tendons. This thesis covers the creation of a new muscle model for use in the above musculo-skeletal model. This muscle model was based on anatomical data from the Visible Human Project (VHP) cadaver study. Two-dimensional digital images from the VHP were analyzed and reconstructed to recreate the three-dimensional muscle geometries. The recreated geometries were smoothed, reduced, and sliced to form data files defining the surfaces of each muscle. The muscle modeling function opened these files during run-time and recreated the muscle surface. The modeling function applied constant volume limitations to the muscle and constant geometry limitations to the tendons.
A fast and accurate computational approach to protein ionization
Spassov, Velin Z.; Yan, Lisa
2008-01-01
We report a very fast and accurate physics-based method to calculate pH-dependent electrostatic effects in protein molecules and to predict the pK values of individual sites of titration. In addition, a CHARMm-based algorithm is included to construct and refine the spatial coordinates of all hydrogen atoms at a given pH. The present method combines electrostatic energy calculations based on the Generalized Born approximation with an iterative mobile clustering approach to calculate the equilibria of proton binding to multiple titration sites in protein molecules. The use of the GBIM (Generalized Born with Implicit Membrane) CHARMm module makes it possible to model not only water-soluble proteins but membrane proteins as well. The method includes a novel algorithm for preliminary refinement of hydrogen coordinates. Another difference from existing approaches is that, instead of monopeptides, a set of relaxed pentapeptide structures are used as model compounds. Tests on a set of 24 proteins demonstrate the high accuracy of the method. On average, the RMSD between predicted and experimental pK values is close to 0.5 pK units on this data set, and the accuracy is achieved at very low computational cost. The pH-dependent assignment of hydrogen atoms also shows very good agreement with protonation states and hydrogen-bond network observed in neutron-diffraction structures. The method is implemented as a computational protocol in Accelrys Discovery Studio and provides a fast and easy way to study the effect of pH on many important mechanisms such as enzyme catalysis, ligand binding, protein–protein interactions, and protein stability. PMID:18714088
Photoacoustic computed tomography without accurate ultrasonic transducer responses
NASA Astrophysics Data System (ADS)
Sheng, Qiwei; Wang, Kun; Xia, Jun; Zhu, Liren; Wang, Lihong V.; Anastasio, Mark A.
2015-03-01
Conventional photoacoustic computed tomography (PACT) image reconstruction methods assume that the object and surrounding medium are described by a constant speed-of-sound (SOS) value. In order to accurately recover fine structures, SOS heterogeneities should be quantified and compensated for during PACT reconstruction. To address this problem, several groups have proposed hybrid systems that combine PACT with ultrasound computed tomography (USCT). In such systems, a SOS map is reconstructed first via USCT. Subsequently, this SOS map is employed to inform the PACT reconstruction method. Additionally, the SOS map can provide structural information regarding tissue, which is complementary to the functional information from the PACT image. We propose a paradigm shift in the way that images are reconstructed in hybrid PACT-USCT imaging. Inspired by our observation that information about the SOS distribution is encoded in PACT measurements, we propose to jointly reconstruct the absorbed optical energy density and SOS distributions from a combined set of USCT and PACT measurements, thereby reducing the two reconstruction problems to one. This innovative approach has several advantages over conventional approaches in which PACT and USCT images are reconstructed independently: (1) Variations in the SOS will automatically be accounted for, optimizing PACT image quality; (2) The reconstructed PACT and USCT images will possess minimal systematic artifacts because errors in the imaging models will be optimally balanced during the joint reconstruction; (3) Due to the exploitation of information regarding the SOS distribution in the full-view PACT data, our approach will permit high-resolution reconstruction of the SOS distribution from sparse array data.
Efficient computation of NACT seismograms
NASA Astrophysics Data System (ADS)
Zheng, Z.; Romanowicz, B. A.
2009-12-01
We present a modification to the NACT formalism (Li and Romanowicz, 1995) for computing synthetic seismograms and sensitivity kernels in global seismology. In the NACT theory, the perturbed seismogram consists of an along-branch coupling term, which is computed under the well-known PAVA approximation (e.g. Woodhouse and Dziewonski, 1984), and an across-branch coupling term, which is computed under the linear Born approximation. In the classical formalism, the Born part is obtained by a double summation over all pairs of coupling modes, where the numerical cost grows as (number of sources * number of receivers) * (corner frequency)^4. Here, however, by adapting the approach of Capdeville (2005), we are able to separate the computation into two single summations, which are responsible for the “source to scatterer” and the “scatterer to receiver” contributions, respectively. As a result, the numerical cost of the new scheme grows as (number of sources + number of receivers) * (corner frequency)^2. Moreover, by expanding eigenfunctions on a wavelet basis, a compression factor of at least 3 (larger at lower frequency) is achieved, leading to a factor of ~10 saving in disk storage. Numerical experiments show that the synthetic seismograms computed from the new approach agree well with those from the classical mode coupling method. The new formalism is significantly more efficient when approaching higher frequencies and in cases of large numbers of sources and receivers, while the across-branch mode coupling feature is still preserved, though not explicitly.
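A back-of-the-envelope comparison makes the two quoted scalings concrete; the numbers of sources and receivers and the corner frequency below are hypothetical, and proportionality constants are omitted.

```python
# Illustrative comparison of the classical and modified NACT cost scalings.
S, R, f = 200, 1000, 20.0          # sources, receivers, corner frequency (arbitrary units)
cost_classical = S * R * f**4      # grows as (S * R) * f^4
cost_new = (S + R) * f**2          # grows as (S + R) * f^2
print(cost_classical / cost_new)   # nominal speed-up, up to constant factors
```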
Nguyen, Thuy-Diem; Schmidt, Bertil; Zheng, Zejun; Kwoh, Chee-Keong
2015-01-01
De novo clustering is a popular technique to perform taxonomic profiling of a microbial community by grouping 16S rRNA amplicon reads into operational taxonomic units (OTUs). In this work, we introduce a new dendrogram-based OTU clustering pipeline called CRiSPy. The key idea used in CRiSPy to improve clustering accuracy is the application of an anomaly detection technique to obtain a dynamic distance cutoff instead of using the de facto value of 97 percent sequence similarity as in most existing OTU clustering pipelines. This technique works by detecting an abrupt change in the merging heights of a dendrogram. To produce the output dendrograms, CRiSPy employs the OTU hierarchical clustering approach that is computed on a genetic distance matrix derived from an all-against-all read comparison by pairwise sequence alignment. However, most existing dendrogram-based tools have difficulty processing datasets larger than 10,000 unique reads due to high computational complexity. We address this difficulty by developing two efficient algorithms for CRiSPy: a compute-efficient GPU-accelerated parallel algorithm for pairwise distance matrix computation and a memory-efficient hierarchical clustering algorithm. Our experiments on various datasets with distinct attributes show that CRiSPy is able to produce more accurate OTU groupings than most OTU clustering applications. PMID:26451819
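A minimal sketch of the dynamic-cutoff idea described above, assuming a precomputed pairwise genetic distance matrix D and using SciPy's hierarchical clustering; the largest jump in merge heights stands in for CRiSPy's anomaly detection, which is more elaborate.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def dynamic_cutoff_clusters(D, method="average"):
    """D: square pairwise genetic distance matrix (assumed precomputed)."""
    Z = linkage(squareform(D), method=method)   # dendrogram over all reads
    heights = Z[:, 2]                           # monotone merging heights
    k = int(np.argmax(np.diff(heights)))        # most abrupt height change
    cutoff = 0.5 * (heights[k] + heights[k + 1])
    return fcluster(Z, t=cutoff, criterion="distance")

# Toy data: two tight groups of reads separated by a larger gap.
rng = np.random.default_rng(1)
pts = np.concatenate([rng.normal(0.0, 0.01, (5, 1)),
                      rng.normal(0.3, 0.01, (5, 1))])
D = np.abs(pts - pts.T)
print(dynamic_cutoff_clusters(D))   # two OTUs, with no fixed 97% threshold
```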
Zhang, Hao; Zhao, Yan; Cao, Liangcai; Jin, Guofan
2015-02-23
We propose an algorithm based on fully computed holographic stereogram for calculating full-parallax computer-generated holograms (CGHs) with accurate depth cues. The proposed method integrates point source algorithm and holographic stereogram based algorithm to reconstruct the three-dimensional (3D) scenes. Precise accommodation cue and occlusion effect can be created, and computer graphics rendering techniques can be employed in the CGH generation to enhance the image fidelity. Optical experiments have been performed using a spatial light modulator (SLM) and a fabricated high-resolution hologram, the results show that our proposed algorithm can perform quality reconstructions of 3D scenes with arbitrary depth information. PMID:25836429
The development of accurate and efficient methods of numerical quadrature
NASA Technical Reports Server (NTRS)
Feagin, T.
1973-01-01
Some new methods for performing numerical quadrature of an integrable function over a finite interval are described. Each method provides a sequence of approximations of increasing order to the value of the integral. Each approximation makes use of all previously computed values of the integrand. The points at which new values of the integrand are computed are selected in such a way that the order of the approximation is maximized. The methods are compared with the quadrature methods of Clenshaw and Curtis, Gauss, Patterson, and Romberg using several examples.
Measurement of Fracture Geometry for Accurate Computation of Hydraulic Conductivity
NASA Astrophysics Data System (ADS)
Chae, B.; Ichikawa, Y.; Kim, Y.
2003-12-01
Fluid flow in rock mass is controlled by the geometry of fractures, which is mainly characterized by roughness, aperture and orientation. Fracture roughness and aperture were observed by a new confocal laser scanning microscope (CLSM; Olympus OLS1100). The wavelength of the laser is 488 nm, and the laser scanning is managed by a light polarization method using two galvano-meter scanner mirrors. The system improves resolution in the light axis (namely z) direction because of the confocal optics. The sampling is managed at a spacing of 2.5 μm along the x and y directions. The highest measurement resolution in the z direction is 0.05 μm, more accurate than that of other methods. For the roughness measurements, core specimens of coarse- and fine-grained granites were provided. Measurements were performed along three scan lines on each fracture surface. The measured data were represented as 2-D and 3-D digital images showing detailed features of roughness. Spectral analyses by the fast Fourier transform (FFT) were performed to characterize the roughness data quantitatively and to identify the influential frequencies of roughness. The FFT results showed that components of low frequencies were dominant in the fracture roughness. This study also verifies that spectral analysis is a good approach to understand the complicated characteristics of fracture roughness. For the aperture measurements, digital images of the aperture were acquired while applying five stages of uniaxial normal stress. This method can characterize the response of the aperture directly using the same specimen. The measurements show that aperture reduction differs from part to part owing to the rough geometry of the fracture walls. Laboratory permeability tests were also conducted to evaluate changes of hydraulic conductivities related to aperture variation due to different stress levels. The results showed non-uniform reduction of hydraulic conductivity under increasing normal stress and different values of
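A minimal sketch of the spectral-analysis step, assuming a roughness profile sampled every 2.5 μm along a scan line; the profile here is synthetic, standing in for CLSM measurements.

```python
import numpy as np

dx = 2.5e-6                                    # CLSM sampling interval (m)
x = np.arange(4096) * dx
rng = np.random.default_rng(0)
z = np.cumsum(rng.normal(size=x.size)) * 1e-8  # synthetic rough profile (m)

Z = np.fft.rfft(z - z.mean())                  # FFT of the detrended profile
freq = np.fft.rfftfreq(z.size, d=dx)           # spatial frequency (cycles/m)
power = np.abs(Z) ** 2                         # roughness power spectrum
# For a random-walk-like profile, low spatial frequencies dominate `power`,
# mirroring the FFT results reported above for granite fracture surfaces.
```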
Accurate and efficient halo-based galaxy clustering modelling with simulations
NASA Astrophysics Data System (ADS)
Zheng, Zheng; Guo, Hong
2016-06-01
Small- and intermediate-scale galaxy clustering can be used to establish the galaxy-halo connection to study galaxy formation and evolution and to tighten constraints on cosmological parameters. With the increasing precision of galaxy clustering measurements from ongoing and forthcoming large galaxy surveys, accurate models are required to interpret the data and extract relevant information. We introduce a method based on high-resolution N-body simulations to accurately and efficiently model the galaxy two-point correlation functions (2PCFs) in projected and redshift spaces. The basic idea is to tabulate all information of haloes in the simulations necessary for computing the galaxy 2PCFs within the framework of halo occupation distribution or conditional luminosity function. It is equivalent to populating galaxies to dark matter haloes and using the mock 2PCF measurements as the model predictions. Besides the accurate 2PCF calculations, the method is also fast and therefore enables an efficient exploration of the parameter space. As an example of the method, we decompose the redshift-space galaxy 2PCF into different components based on the type of galaxy pairs and show the redshift-space distortion effect in each component. The generalizations and limitations of the method are discussed.
Nakhleh, Luay
2014-03-12
I proposed to develop computationally efficient tools for accurate detection and reconstruction of microbes' complex evolutionary mechanisms, thus enabling rapid and accurate annotation, analysis and understanding of their genomes. To achieve this goal, I proposed to address three aspects. (1) Mathematical modeling. A major challenge facing the accurate detection of HGT is that of distinguishing between these two events on the one hand and other events that have similar "effects." I proposed to develop a novel mathematical approach for distinguishing among these events. Further, I proposed to develop a set of novel optimization criteria for the evolutionary analysis of microbial genomes in the presence of these complex evolutionary events. (2) Algorithm design. In this aspect of the project, I proposed to develop an array of efficient and accurate algorithms for analyzing microbial genomes based on the formulated optimization criteria. Further, I proposed to test the viability of the criteria and the accuracy of the algorithms in an experimental setting using both synthetic as well as biological data. (3) Software development. I proposed the final outcome to be a suite of software tools which implements the mathematical models as well as the algorithms developed.
Accurate, efficient, and (iso)geometrically flexible collocation methods for phase-field models
NASA Astrophysics Data System (ADS)
Gomez, Hector; Reali, Alessandro; Sangalli, Giancarlo
2014-04-01
We propose new collocation methods for phase-field models. Our algorithms are based on isogeometric analysis, a new technology that makes use of functions from computational geometry, such as Non-Uniform Rational B-Splines (NURBS). NURBS exhibit excellent approximability and controllable global smoothness, and can represent exactly most geometries encapsulated in Computer Aided Design (CAD) models. These attributes permitted us to derive accurate, efficient, and geometrically flexible collocation methods for phase-field models. The performance of our method is demonstrated by several numerical examples of phase separation modeled by the Cahn-Hilliard equation. We feel that our method successfully combines the geometrical flexibility of finite elements with the accuracy and simplicity of pseudo-spectral collocation methods, and is a viable alternative to classical collocation methods.
Efficient and Accurate Explicit Integration Algorithms with Application to Viscoplastic Models
NASA Technical Reports Server (NTRS)
Arya, Vinod K.
1994-01-01
Several explicit integration algorithms with self-adaptive time integration strategies are developed and investigated for efficiency and accuracy. These algorithms include the second-order Runge-Kutta method, lower-order Runge-Kutta methods of orders one and two, and the exponential integration method. The algorithms are applied to viscoplastic models put forth by Freed and Verrilli and by Bodner and Partom for thermal/mechanical loadings (including tensile, relaxation, and cyclic loadings). The large number of computations performed showed that, for comparable accuracy, the efficiency of an integration algorithm depends significantly on the type of application (loading). However, in general, for the aforementioned loadings and viscoplastic models, the exponential integration algorithm with the proposed self-adaptive time integration strategy worked more (or comparably) efficiently and accurately than the other integration algorithms. Using this strategy for integrating viscoplastic models may lead to considerable savings in computer time (better efficiency) without adversely affecting the accuracy of the results. This conclusion should encourage the utilization of viscoplastic models in the stress analysis and design of structural components.
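As a sketch of what a self-adaptive explicit strategy looks like, the code below applies step-doubling error control to a second-order Runge-Kutta step. This is not the paper's specific strategy; the tolerance and step-size-factor bounds are illustrative.

```python
import numpy as np

def rk2_step(f, t, y, h):
    """One explicit second-order Runge-Kutta (Heun) step."""
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    return y + 0.5 * h * (k1 + k2)

def integrate_adaptive(f, t0, y0, t_end, h=1e-3, tol=1e-6):
    t, y = t0, np.asarray(y0, dtype=float)
    while t < t_end:
        h = min(h, t_end - t)
        y_big = rk2_step(f, t, y, h)                         # one full step
        y_half = rk2_step(f, t, y, 0.5 * h)
        y_small = rk2_step(f, t + 0.5 * h, y_half, 0.5 * h)  # two half steps
        err = np.max(np.abs(y_small - y_big))                # error estimate
        if err <= tol:                                       # accept the step
            t, y = t + h, y_small
        # grow/shrink the step; exponent 1/3 = 1/(order + 1) for RK2
        h *= 0.9 * min(4.0, max(0.1, (tol / max(err, 1e-30)) ** (1.0 / 3.0)))
    return t, y

t, y = integrate_adaptive(lambda t, y: -50.0 * (y - np.cos(t)), 0.0, [0.0], 1.0)
```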
Coupling Efforts to the Accurate and Efficient Tsunami Modelling System
NASA Astrophysics Data System (ADS)
Son, S.
2015-12-01
In the present study, we couple two different types of tsunami models, i.e., the nondispersive shallow water model of characteristic form (MOST ver. 4) and the dispersive Boussinesq model of non-characteristic form (Son et al. (2011)), in an attempt to improve modelling accuracy and efficiency. Since each model deals with a different type of primary variable, additional care is required to match boundary conditions. Using an absorbing-generating boundary condition developed by Van Dongeren and Svendsen (1997), model coupling and integration are achieved. Characteristic variables (i.e., Riemann invariants) in MOST are converted to non-characteristic variables for the Boussinesq solver without any loss of physical consistency. The established modelling system has been validated on cases ranging from typical test problems to realistic tsunami events. Simulated results reveal good performance of the developed modelling system. Since the coupled modelling system offers flexibility during implementation, substantial gains in efficiency and accuracy are expected by applying the Boussinesq model only in selected spots of the overall tsunami propagation domain.
Efficient computation of optimal actions
Todorov, Emanuel
2009-01-01
Optimal choice of actions is a fundamental problem relevant to fields as diverse as neuroscience, psychology, economics, computer science, and control engineering. Despite this broad relevance the abstract setting is similar: we have an agent choosing actions over time, an uncertain dynamical system whose state is affected by those actions, and a performance criterion that the agent seeks to optimize. Solving problems of this kind remains hard, in part, because of overly generic formulations. Here, we propose a more structured formulation that greatly simplifies the construction of optimal control laws in both discrete and continuous domains. An exhaustive search over actions is avoided and the problem becomes linear. This yields algorithms that outperform Dynamic Programming and Reinforcement Learning, and thereby solve traditional problems more efficiently. Our framework also enables computations that were not possible before: composing optimal control laws by mixing primitives, applying deterministic methods to stochastic systems, quantifying the benefits of error tolerance, and inferring goals from behavioral data via convex optimization. Development of a general class of easily solvable problems tends to accelerate progress—as linear systems theory has done, for example. Our framework may have similar impact in fields where optimal choice of actions is relevant. PMID:19574462
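The "problem becomes linear" claim can be made concrete for a first-exit, discrete-state formulation: with state cost q(x) and passive dynamics P, the desirability z = exp(-v) satisfies a linear equation, so the optimal cost-to-go follows from a single linear solve with no search over actions. A minimal sketch under those assumptions (all names illustrative; zero terminal cost assumed):

```python
import numpy as np

def optimal_cost_to_go(P, q, terminal):
    """P: passive transition matrix (rows sum to 1); q: nonnegative state
    costs; terminal: boolean mask. Solves z = exp(-q) * (P @ z) on the
    non-terminal states, with z = 1 (zero cost) at terminal states."""
    G = np.exp(-q)[:, None] * P                  # G = diag(exp(-q)) P
    interior = ~terminal
    A = np.eye(int(interior.sum())) - G[np.ix_(interior, interior)]
    b = G[np.ix_(interior, terminal)].sum(axis=1)  # terminal z assumed 1
    z = np.ones(len(q))
    z[interior] = np.linalg.solve(A, b)          # one linear solve, no DP sweep
    return -np.log(z)                            # optimal cost-to-go v = -log z

# Tiny chain: states 0..4, state 4 terminal; random-walk passive dynamics.
P = np.array([[0.5, 0.5, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.5, 0.0],
              [0.0, 0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 0.0, 0.0, 1.0]])
q = np.array([1.0, 1.0, 1.0, 1.0, 0.0])
print(optimal_cost_to_go(P, q, terminal=np.array([0, 0, 0, 0, 1], bool)))
```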
Computationally efficient lossless image coder
NASA Astrophysics Data System (ADS)
Sriram, Parthasarathy; Sudharsanan, Subramania I.
1999-12-01
Lossless coding of image data has been a very active area of research in the fields of medical imaging, remote sensing and document processing/delivery. While several lossless image coders such as JPEG and JBIG have been in existence for a while, their compression performance for encoding continuous-tone images was rather poor. Recently, several state-of-the-art techniques like CALIC and LOCO were introduced with significant improvement in compression performance over traditional coders. However, these coders are very difficult to implement using dedicated hardware or in software using media processors due to the inherently serial nature of their encoding process. In this work, we propose a lossless image coding technique with a compression performance that is very close to that of CALIC and LOCO while being very efficient to implement in both hardware and software. Comparisons for encoding the JPEG-2000 image set show that the compression performance of the proposed coder is within 2-5% of the more complex coders while being computationally very efficient. In addition, the encoder is shown to be parallelizable at a hierarchy of levels. The execution time of the proposed encoder is smaller than that required by LOCO, while the decoder is 2-3 times faster than the LOCO decoder.
Efficient radiometrically accurate synthetic representation of IR scenes
NASA Astrophysics Data System (ADS)
Shaw, Patrick C.; Gover, Robert E.
2003-08-01
A technique is developed for synthesizing a high spectral resolution IR ship signature image, for use in an imaging IR Anti-Ship Cruise Missile (ASCM) model, from an IR scene database provided by the ship signature model NTCS/ShipIR. This synthesized IR ship image is generated for use over ranges representative of an ASCM engagement. The technique presented focuses on the application of in-band averaged transmittance to the source ship signature as a means of reducing the spectral calculations required by the cruise missile model. In order to achieve this reduction in computation, while preserving the fidelity of the apparent ship signature, the idea of sub-banding is introduced. Sub-banding describes the manner in which the IR band is partitioned into smaller bandwidths, such that the error produced in the ship's average contrast radiance due to the use of in-band averaged transmittance is minimized over range. The difference between the average contrast radiance of an IR ship image generated using in-band averaging and the average contrast radiance of a spectrally generated IR ship image is the metric for this minimization. This choice is based on measured data collected from the recent NATO SIMVEX trial, which used high quality IR measurements of the CFAV Quest in an effort to refine the NTCS/ShipIR model. The technique is general and applicable to any band(s) of interest. Results are presented which verify that the use of in-band averaged transmittance over an IR band (3.5-5.0 μm), partitioned using three optimal sub-bands, produces an IR ship image with an average contrast radiance within the desired error bar of a spectrally generated ship image's average contrast radiance.
NASA Astrophysics Data System (ADS)
Vallet, A.; Bertrand, C.; Fabbri, O.; Mudry, J.
2015-01-01
Pore water pressure build-up by recharge of underground hydrosystems is one of the main triggering factors of deep-seated landslides. In most deep-seated landslides, pore water pressure data are not available since piezometers, if any, have a very short lifespan because of slope movements. As a consequence, indirect parameters, such as the calculated recharge, are the only data which enable understanding landslide hydrodynamic behaviour. However, in landslide studies, methods and recharge-area parameters used to determine the groundwater recharge are rarely detailed. In this study, the groundwater recharge is estimated with a soil-water balance based on characterisation of evapotranspiration and parameters characterising the recharge area (soil available water capacity, runoff and vegetation coefficient). A workflow to compute daily groundwater recharge is developed. This workflow requires the records of precipitation, air temperature, relative humidity, solar radiation and wind speed within or close to the landslide area. The determination of the parameters of the recharge area is based on a spatial analysis requiring field observations and spatial data sets (digital elevation models, aerial photographs and geological maps). This study demonstrates that the performance of the correlation with landslide displacement velocity data is significantly improved using the recharge estimated with the proposed workflow. The coefficient of determination obtained with the recharge estimated with the proposed workflow is 78% higher on average than that obtained with precipitation, and is 38% higher on average than that obtained with recharge computed with a commonly used simplification in landslide studies (recharge = precipitation minus non-calibrated evapotranspiration method).
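A minimal sketch of a daily soil-water-balance recharge computation of the kind described above; this is illustrative rather than the authors' exact workflow, and all parameter values are placeholders.

```python
import numpy as np

def daily_recharge(precip, pet, awc=100.0, runoff_coef=0.1, kc=1.0):
    """precip, pet: daily series (mm); awc: soil available water capacity (mm);
    runoff_coef: fraction of rainfall lost to runoff; kc: vegetation coefficient."""
    store, recharge = 0.5 * awc, np.zeros(len(precip))
    for i, (p, e) in enumerate(zip(precip, pet)):
        infil = (1.0 - runoff_coef) * p      # rainfall minus runoff
        store += infil - kc * e              # evapotranspiration drawdown
        store = max(store, 0.0)
        if store > awc:                      # excess drains below the root zone
            recharge[i] = store - awc
            store = awc
    return recharge

print(daily_recharge(precip=np.array([0, 20, 35, 0, 10.0]),
                     pet=np.full(5, 3.0), awc=60.0))
```

Recharge is produced only once the soil store exceeds its available water capacity, which is why calibrating the recharge-area parameters matters for the correlation with displacement velocities.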
Passeri, A; Formiconi, A R; De Cristofaro, M T; Pupi, A; Meldolesi, U
1997-04-01
It is well known that the quantitative potential of emission computed tomography (ECT) relies on the ability to compensate for resolution, attenuation and scatter effects. Reconstruction algorithms which are able to take these effects into account are highly demanding in terms of computing resources. The reported work aimed to investigate the use of a parallel high-performance computing platform for ECT reconstruction taking into account an accurate model of the acquisition of single-photon emission tomographic (SPET) data. An iterative algorithm with an accurate model of the variable system response was ported on the MIMD (Multiple Instruction Multiple Data) parallel architecture of a 64-node Cray T3D massively parallel computer. The system was organized to make it easily accessible even from low-cost PC-based workstations through standard TCP/IP networking. A complete brain study of 30 (64x64) slices could be reconstructed from a set of 90 (64x64) projections with ten iterations of the conjugate gradients algorithm in 9 s, corresponding to an actual speed-up factor of 135. This work demonstrated the possibility of exploiting remote high-performance computing and networking resources from hospital sites by means of low-cost workstations using standard communication protocols without particular problems for routine use. The achievable speed-up factors allow the assessment of the clinical benefit of advanced reconstruction techniques which require a heavy computational burden for the compensation effects such as variable spatial resolution, scatter and attenuation. The possibility of using the same software on the same hardware platform with data acquired in different laboratories with various kinds of SPET instrumentation is appealing for software quality control and for the evaluation of the clinical impact of the reconstruction methods. PMID:9096089
Accurate Computation of Gaussian Quadrature for Tension Powers
NASA Astrophysics Data System (ADS)
Singer, Saša
2007-09-01
We consider Gaussian quadrature formulæ which exactly integrate a system of tension powers 1, x, x^2, …, x^(n-3), sinh(px), cosh(px) on a given interval [a,b], where n ≥ 4 is an even integer and p > 0 is a given tension parameter. In some applications it is essential that p can be changed dynamically, and we need an efficient "on-demand" algorithm that calculates the nodes and weights of Gaussian quadrature formulæ for many different values of p, which are not known in advance. It is an interesting numerical challenge to achieve the required full machine precision accuracy in such an algorithm, for all possible values of p. By exploiting various analytic and numerical techniques, we show that this can be done efficiently for all reasonably low values of n that are of any practical importance.
NASA Astrophysics Data System (ADS)
Chen, Duan; Cai, Wei; Zinser, Brian; Cho, Min Hyung
2016-09-01
In this paper, we develop an accurate and efficient Nyström volume integral equation (VIE) method for the Maxwell equations for a large number of 3-D scatterers. The Cauchy Principal Values that arise from the VIE are computed accurately using a finite size exclusion volume together with explicit correction integrals consisting of removable singularities. Also, the hyper-singular integrals are computed using interpolated quadrature formulae with tensor-product quadrature nodes for cubes, spheres and cylinders, that are frequently encountered in the design of meta-materials. The resulting Nyström VIE method is shown to have high accuracy with a small number of collocation points and demonstrates p-convergence for computing the electromagnetic scattering of these objects. Numerical calculations of multiple scatterers of cubic, spherical, and cylindrical shapes validate the efficiency and accuracy of the proposed method.
Kang, Dongwan D.; Froula, Jeff; Egan, Rob; Wang, Zhong
2015-01-01
Grouping large genomic fragments assembled from shotgun metagenomic sequences to deconvolute complex microbial communities, or metagenome binning, enables the study of individual organisms and their interactions. Because of the complex nature of these communities, existing metagenome binning methods often miss a large number of microbial species. In addition, most of the tools are not scalable to large datasets. Here we introduce automated software called MetaBAT that integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. MetaBAT outperforms alternative methods in accuracy and computational efficiency on both synthetic and real metagenome datasets. Lastly, it automatically forms hundreds of high quality genome bins on a very large assembly consisting millions of contigs in a matter of hours on a single node. MetaBAT is open source software and available at https://bitbucket.org/berkeleylab/metabat.
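One of the two signals MetaBAT combines, the tetranucleotide frequency (TNF) profile, is easy to sketch. The snippet below computes a raw 256-dimensional TNF vector per contig; MetaBAT itself works with canonical k-mers and empirical probabilistic distances, so this is only an illustration, and the contig sequences are hypothetical.

```python
from itertools import product
import numpy as np

KMERS = ["".join(p) for p in product("ACGT", repeat=4)]
INDEX = {k: i for i, k in enumerate(KMERS)}

def tnf(seq):
    """Normalized tetranucleotide-frequency vector of a contig sequence."""
    v = np.zeros(len(KMERS))
    s = seq.upper()
    for i in range(len(s) - 3):
        k = s[i:i + 4]
        if k in INDEX:               # skip windows containing N, gaps, etc.
            v[INDEX[k]] += 1
    return v / max(v.sum(), 1.0)

# Euclidean distance between two (hypothetical) contigs' TNF vectors:
d = np.linalg.norm(tnf("ACGTACGTGGCCATTA") - tnf("TTGGCCAAGGTTACGA"))
```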
Computing accurate age and distance factors in cosmology
NASA Astrophysics Data System (ADS)
Christiansen, Jodi L.; Siver, Andrew
2012-05-01
As the universe expands, astronomical observables such as brightness and angular size on the sky change in ways that differ from our simple Cartesian expectation. We show how observed quantities depend on the expansion of space and demonstrate how to calculate such quantities using the Friedmann equations. The general solution to the Friedmann equations requires a numerical solution, which is easily coded in any computing language (including Excel). We use these numerical calculations in four projects that help students build their understanding of high-redshift phenomena and cosmology. Instructions for these projects are available as supplementary materials.
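A minimal example of such a numerical calculation, assuming a flat universe with illustrative parameter values (Ωm = 0.3, ΩΛ = 0.7, H0 = 70 km/s/Mpc): the comoving distance follows from integrating 1/E(z) over redshift.

```python
import numpy as np
from scipy.integrate import quad

c, H0 = 299792.458, 70.0        # speed of light (km/s), Hubble constant (km/s/Mpc)
Om, OL = 0.3, 0.7               # flat universe: Om + OL = 1

E = lambda z: np.sqrt(Om * (1 + z) ** 3 + OL)          # dimensionless H(z)/H0
comoving = lambda z: (c / H0) * quad(lambda zp: 1.0 / E(zp), 0.0, z)[0]  # Mpc

z = 1.0
D_C = comoving(z)               # comoving distance
D_A = D_C / (1 + z)             # angular-diameter distance (sets angular size)
D_L = D_C * (1 + z)             # luminosity distance (sets brightness)
print(D_C, D_A, D_L)
```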
An accurate and efficient 3-D micromagnetic simulation of metal evaporated tape
NASA Astrophysics Data System (ADS)
Jones, M.; Miles, J. J.
1997-07-01
Metal evaporated tape (MET) has a complex column-like structure in which magnetic domains are arranged randomly. In order to accurately simulate the behaviour of MET it is important to capture these aspects of the material in a high-resolution 3-D micromagnetic model. The scale of this problem prohibits the use of traditional scalar computers and leads us to develop algorithms for a vector processor architecture. We demonstrate that despite the material's highly non-uniform structure, it is possible to develop fast vector algorithms for the computation of the magnetostatic interaction field. We do this by splitting the field calculation into near and far components. The near field component is calculated exactly using an efficient vector algorithm, whereas the far field is calculated approximately using a novel fast Fourier transform (FFT) technique. Results are presented which demonstrate that, in practice, the algorithms require sub-O(N log(N)) computation time. In addition, results of a highly realistic simulation of hysteresis in MET are presented.
Efficient computation of Wigner-Eisenbud functions
NASA Astrophysics Data System (ADS)
Raffah, Bahaaudin M.; Abbott, Paul C.
2013-06-01
The R-matrix method, introduced by Wigner and Eisenbud (1947) [1], has been applied to a broad range of electron transport problems in nanoscale quantum devices. With the rapid increase in the development and modeling of nanodevices, efficient, accurate, and general computation of Wigner-Eisenbud functions is required. This paper presents the Mathematica package WignerEisenbud, which uses the Fourier discrete cosine transform to compute the Wigner-Eisenbud functions in dimensionless units for an arbitrary potential in one dimension, and in two dimensions in cylindrical coordinates.
Program summary
Program title: WignerEisenbud
Catalogue identifier: AEOU_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEOU_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
Distribution format: tar.gz
Programming language: Mathematica
Operating system: Any platform supporting Mathematica 7.0 and above
Keywords: Wigner-Eisenbud functions, discrete cosine transform (DCT), cylindrical nanowires
Classification: 7.3, 7.9, 4.6, 5
Nature of problem: Computing the 1D and 2D Wigner-Eisenbud functions for arbitrary potentials using the DCT.
Solution method: The R-matrix method is applied to the physical problem. Separation of variables is used for eigenfunction expansion of the 2D Wigner-Eisenbud functions. Eigenfunction computation is performed using the DCT to convert the Schrödinger equation with Neumann boundary conditions to a generalized matrix eigenproblem.
Limitations: Restricted to uniform (rectangular grid) sampling of the potential. In 1D the number of sample points, n, results in matrix computations involving n×n matrices.
Unusual features: Eigenfunction expansion using the DCT is fast and accurate. Users can specify scattering potentials using functions, or interactively using mouse input. Use of dimensionless units permits application to a
Margot Gerritsen
2008-10-31
Gas-injection processes are widely and increasingly used for enhanced oil recovery (EOR). In the United States, for example, EOR production by gas injection accounts for approximately 45% of total EOR production and has tripled since 1986. The understanding of the multiphase, multicomponent flow taking place in any displacement process is essential for successful design of gas-injection projects. Due to complex reservoir geometry, reservoir fluid properties and phase behavior, the design of accurate and efficient numerical simulations for the multiphase, multicomponent flow governing these processes is nontrivial. In this work, we developed, implemented and tested a streamline-based solver for gas injection processes that is computationally very attractive: compared to traditional Eulerian solvers in use by industry, it computes solutions with a computational speed orders of magnitude higher and a comparable accuracy provided that cross-flow effects do not dominate. We contributed to the development of compositional streamline solvers in three significant ways: improvement of the overall framework allowing improved streamline coverage and partial streamline tracing, amongst others; parallelization of the streamline code, which significantly improves wall clock time; and development of new compositional solvers that can be implemented along streamlines as well as in existing Eulerian codes used by industry. We introduced several novel ideas into the streamline framework. First, we developed an adaptive streamline coverage algorithm. Adding streamlines locally can reduce computational costs by concentrating computational efforts where needed, and reduce mapping errors. Adapting streamline coverage effectively controls mass balance errors that mostly result from the mapping from streamlines to the pressure grid. We also introduced the concept of partial streamlines: streamlines that do not necessarily start and/or end at wells. This allows more efficient coverage and avoids
Curvelet-based sampling for accurate and efficient multimodal image registration
NASA Astrophysics Data System (ADS)
Safran, M. N.; Freiman, M.; Werman, M.; Joskowicz, L.
2009-02-01
We present a new non-uniform adaptive sampling method for the estimation of mutual information in multi-modal image registration. The method uses the Fast Discrete Curvelet Transform to identify regions along anatomical curves on which the mutual information is computed. Its main advantages over other non-uniform sampling schemes are that it captures the most informative regions, that it is invariant to feature shapes, orientations, and sizes, that it is efficient, and that it yields accurate results. Extensive evaluation on the registration of 20 validated clinical brain CT images to Proton Density (PD), T1-, and T2-weighted MRI images from the public RIRE database shows the effectiveness of our method. Rigid registration accuracy measured at 10 clinical targets and compared to ground truth measurements yields a mean target registration error of 0.68 mm (std = 0.4 mm) for CT-PD and 0.82 mm (std = 0.43 mm) for CT-T2. This is 0.3 mm (1 mm) more accurate in the average (worst) case than five existing sampling methods. Our method has the lowest registration errors recorded to date for the registration of CT-PD and CT-T2 images in the RIRE website when compared to methods that were tested on at least three patient datasets.
NASA Astrophysics Data System (ADS)
Peng, Liang-You; Gong, Qihuang
2010-12-01
The accurate computation of hydrogenic continuum wave functions is very important in many branches of physics such as electron-atom collisions, cold atom physics, and atomic ionization in strong laser fields. Although various algorithms and codes already exist, most of them are reliable only in certain ranges of parameters. In some practical applications, accurate continuum wave functions need to be calculated at extremely low energies, large radial distances and/or large angular momentum numbers. Here we provide such a code, which can generate accurate hydrogenic continuum wave functions and the corresponding Coulomb phase shifts over a wide range of parameters. Without any essential restriction on the angular momentum number, the present code is able to give reliable results at the electron energy range [10,10] eV for radial distances of [10,10] a.u. We also find the present code to be very efficient, which should find numerous applications in many fields such as strong field physics.
Program summary
Program title: HContinuumGautchi
Catalogue identifier: AEHD_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEHD_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 1233
No. of bytes in distributed program, including test data, etc.: 7405
Distribution format: tar.gz
Programming language: Fortran90 in fixed format
Computer: AMD Processors
Operating system: Linux
RAM: 20 MBytes
Classification: 2.7, 4.5
Nature of problem: The accurate computation of atomic continuum wave functions is very important in many research fields such as strong field physics and cold atom physics. Although various algorithms and codes already exist, most of them are applicable and reliable only in certain ranges of parameters. We present here an accurate FORTRAN program for
Towards fast and accurate algorithms for processing fuzzy data: interval computations revisited
NASA Astrophysics Data System (ADS)
Xiang, Gang; Kreinovich, Vladik
2013-02-01
In many practical applications, we need to process data, e.g. to predict the future values of different quantities based on their current values. Often, the only information that we have about the current values comes from experts, and is described in informal ('fuzzy') terms like 'small'. To process such data, it is natural to use fuzzy techniques, techniques specifically designed by Lotfi Zadeh to handle such informal information. In this survey, we start by revisiting the motivation behind Zadeh's formulae for processing fuzzy data, and explain how the algorithmic problem of processing fuzzy data can be described in terms of interval computations (α-cuts). Many fuzzy practitioners claim 'I tried interval computations, they did not work' - meaning that they got estimates which are much wider than the desired α-cuts. We show that such statements are usually based on a (widely spread) misunderstanding - that interval computations simply mean replacing each arithmetic operation with the corresponding operation with intervals. We show that while such straightforward interval techniques indeed often lead to over-wide estimates, the current advanced interval computations techniques result in estimates which are much more accurate. We overview such advanced interval computations techniques, and show that by using them, we can efficiently and accurately process fuzzy data. We wrote this survey with three audiences in mind. First, we want fuzzy researchers and practitioners to understand the current advanced interval computations techniques and to use them to come up with faster and more accurate algorithms for processing fuzzy data. For this 'fuzzy' audience, we explain these current techniques in detail. Second, we also want interval researchers to better understand this important application area for their techniques. For this 'interval' audience, we want to explain where fuzzy techniques come from, what are possible variants of these techniques, and what are the
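To illustrate the gap between straightforward and more careful interval techniques, consider propagating the α-cut [0, 1] through f(x) = x(1 - x). Naive operation-by-operation interval arithmetic ignores the dependency between the two factors; even simple subdivision tightens the enclosure. This is a minimal sketch; the advanced techniques surveyed above (e.g. centered forms) are more sophisticated.

```python
import numpy as np

def mul(a, b):
    """Interval product [a] * [b]."""
    ps = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(ps), max(ps))

def f_naive(x):
    """Replace each arithmetic operation by its interval counterpart."""
    one_minus = (1 - x[1], 1 - x[0])
    return mul(x, one_minus)          # ignores that both factors share x

def f_subdivided(x, n=64):
    """A basic remedy: subdivide the input interval and take the hull."""
    lo, hi, edges = np.inf, -np.inf, np.linspace(x[0], x[1], n + 1)
    for a, b in zip(edges[:-1], edges[1:]):
        r = f_naive((a, b))
        lo, hi = min(lo, r[0]), max(hi, r[1])
    return (lo, hi)

print(f_naive((0.0, 1.0)))        # (0.0, 1.0)  -- over-wide enclosure
print(f_subdivided((0.0, 1.0)))   # close to the true range (0.0, 0.25)
```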
Accurate methods for computing inviscid and viscous Kelvin-Helmholtz instability
NASA Astrophysics Data System (ADS)
Chen, Michael J.; Forbes, Lawrence K.
2011-02-01
The Kelvin-Helmholtz instability is modelled for inviscid and viscous fluids. Here, two bounded fluid layers flow parallel to each other with the interface between them growing in an unstable fashion when subjected to a small perturbation. In the various configurations of this problem, and the related problem of the vortex sheet, there are several phenomena associated with the evolution of the interface; notably the formation of a finite-time curvature singularity and the ‘roll-up’ of the interface. Two contrasting computational schemes will be presented. A spectral method is used to follow the evolution of the interface in the inviscid version of the problem. This allows the interface shape to be computed up to the time that a curvature singularity forms, with several computational difficulties overcome to reach that point. A weakly compressible viscous version of the problem is studied using finite difference techniques and a vorticity-streamfunction formulation. The two versions have comparable, but not identical, initial conditions and so the results exhibit some differences in timing. By including a small amount of viscosity the interface may be followed to the point that it rolls up into a classic ‘cat's-eye’ shape. Particular attention was given to computing a consistent initial condition and solving the continuity equation both accurately and efficiently.
Accurate and Efficient Resolution of Overlapping Isotopic Envelopes in Protein Tandem Mass Spectra
Xiao, Kaijie; Yu, Fan; Fang, Houqin; Xue, Bingbing; Liu, Yan; Tian, Zhixin
2015-01-01
It has long been an analytical challenge to accurately and efficiently resolve extremely dense overlapping isotopic envelopes (OIEs) in protein tandem mass spectra to confidently identify proteins. Here, we report a computationally efficient method, called OIE_CARE, to resolve OIEs by calculating the relative deviation between the ideal and observed experimental abundance. In the OIE_CARE method, the ideal experimental abundance of a particular overlapping isotopic peak (OIP) is first calculated for all the OIEs sharing this OIP. The relative deviation (RD) of the overall observed experimental abundance of this OIP relative to the summed ideal value is then calculated. The final individual abundance of the OIP for each OIE is the individual ideal experimental abundance multiplied by 1 + RD. Initial studies were performed using higher-energy collisional dissociation tandem mass spectra on myoglobin (with direct infusion) and the intact E. coli proteome (with liquid chromatographic separation). Comprehensive data at the protein and proteome levels, high confidence and good reproducibility were achieved. The resolving method reported here can, in principle, be extended to resolve any envelope-type overlapping data for which the corresponding theoretical reference values are available. PMID:26439836
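The apportioning rule just described reduces to a few lines: each envelope's share of an overlapping peak is its ideal abundance scaled by 1 + RD. The sketch below implements that rule as stated; the abundance values are illustrative.

```python
import numpy as np

def resolve_overlap(ideal, observed_total):
    """ideal: ideal experimental abundances of one overlapping isotopic peak,
    one entry per envelope sharing it; observed_total: its measured abundance."""
    ideal = np.asarray(ideal, dtype=float)
    rd = (observed_total - ideal.sum()) / ideal.sum()   # relative deviation
    return ideal * (1.0 + rd)                           # final individual abundances

print(resolve_overlap([3.0e5, 1.0e5], observed_total=4.4e5))
# -> [330000. 110000.]  (sums to the observed 4.4e5, proportions preserved)
```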
An accurate and efficient Lagrangian sub-grid model for multi-particle dispersion
NASA Astrophysics Data System (ADS)
Toschi, Federico; Mazzitelli, Irene; Lanotte, Alessandra S.
2014-11-01
Many natural and industrial processes involve the dispersion of particles in turbulent flows. Despite recent theoretical progress in the understanding of particle dynamics in simple turbulent flows, complex geometries often call for numerical approaches based on Eulerian Large Eddy Simulation (LES). One important issue related to the Lagrangian integration of tracers in under-resolved velocity fields is the lack of spatial correlations at unresolved scales. Here we propose a computationally efficient Lagrangian model for the sub-grid velocity of tracers dispersed in statistically homogeneous and isotropic turbulent flows. The model incorporates the multi-scale nature of turbulent temporal and spatial correlations that are essential to correctly reproduce the dynamics of multi-particle dispersion. The new model is able to describe the Lagrangian temporal and spatial correlations in clouds of particles. In particular, we show that pair and tetrad dispersion compare well with results from Direct Numerical Simulations of statistically isotropic and homogeneous 3-D turbulence. This model may offer an accurate and efficient way to describe multi-particle dispersion in under-resolved turbulent velocity fields such as those employed in Eulerian LES. This work is part of the research programme FP112 of the Foundation for Fundamental Research on Matter (FOM), which is part of the Netherlands Organisation for Scientific Research (NWO). We acknowledge support from the EU COST Action MP0806.
Wu, Yonghui; Bhat, Prasanna R.; Close, Timothy J.; Lonardi, Stefano
2008-01-01
Genetic linkage maps are cornerstones of a wide spectrum of biotechnology applications, including map-assisted breeding, association genetics, and map-assisted gene cloning. During the past several years, the adoption of high-throughput genotyping technologies has been paralleled by a substantial increase in the density and diversity of genetic markers. New genetic mapping algorithms are needed in order to efficiently process these large datasets and accurately construct high-density genetic maps. In this paper, we introduce a novel algorithm to order markers on a genetic linkage map. Our method is based on a simple yet fundamental mathematical property that we prove under rather general assumptions. The validity of this property allows one to determine efficiently the correct order of markers by computing the minimum spanning tree of an associated graph. Our empirical studies obtained on genotyping data for three mapping populations of barley (Hordeum vulgare), as well as extensive simulations on synthetic data, show that our algorithm consistently outperforms the best available methods in the literature, particularly when the input data are noisy or incomplete. The software implementing our algorithm is available in the public domain as a web tool under the name MSTmap. PMID:18846212
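A toy version of the MST idea is easy to demonstrate with SciPy. Given a hypothetical pairwise recombination-distance matrix whose true marker order is 0-1-2-3-4, the minimum spanning tree comes out as a path in that order. This is illustrative only; MSTmap's graph construction and handling of noisy or missing genotypes are more involved.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

# Hypothetical recombination distances for 5 markers; distances grow with
# separation along the chromosome.
D = np.array([[0.00, 0.05, 0.12, 0.21, 0.30],
              [0.05, 0.00, 0.06, 0.14, 0.24],
              [0.12, 0.06, 0.00, 0.07, 0.16],
              [0.21, 0.14, 0.07, 0.00, 0.08],
              [0.30, 0.24, 0.16, 0.08, 0.00]])

mst = minimum_spanning_tree(D).toarray()
edges = sorted((int(i), int(j)) for i, j in zip(*np.nonzero(mst)))
print(edges)   # [(0, 1), (1, 2), (2, 3), (3, 4)] -- a path giving the order
```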
Development of highly accurate approximate scheme for computing the charge transfer integral.
Pershin, Anton; Szalay, Péter G
2015-08-21
The charge transfer integral is a key parameter required by various theoretical models to describe charge transport properties, e.g., in organic semiconductors. The accuracy of this important property depends on several factors, which include the level of electronic structure theory and internal simplifications of the applied formalism. The goal of this paper is to identify the performance of various approximate approaches of the latter category, while using the high-level equation-of-motion coupled cluster theory for the electronic structure. The calculations have been performed on the ethylene dimer as one of the simplest model systems. By studying different spatial perturbations, it was shown that while both energy split in dimer and fragment charge difference methods are equivalent to the exact formulation for symmetrical displacements, they are less efficient when describing the transfer integral along the asymmetric alteration coordinate. Since the "exact" scheme was found to be computationally expensive, we examine the possibility of obtaining the asymmetric fluctuation of the transfer integral by a Taylor expansion along the coordinate space. By exploring the efficiency of this novel approach, we show that the Taylor expansion scheme represents an attractive alternative to the "exact" calculations due to a substantial reduction of computational costs, when a considerably large region of the potential energy surface is of interest. Moreover, we show that the Taylor expansion scheme, irrespective of the dimer symmetry, is very accurate for the entire range of geometry fluctuations that cover the space the molecule accesses at room temperature. PMID:26298117
Efficient Computational Model of Hysteresis
NASA Technical Reports Server (NTRS)
Shields, Joel
2005-01-01
A recently developed mathematical model of the output (displacement) versus the input (applied voltage) of a piezoelectric transducer accounts for hysteresis. For the sake of computational speed, the model is kept simple by neglecting the dynamic behavior of the transducer. Hence, the model applies to static and quasistatic displacements only. A piezoelectric transducer of the type to which the model applies is used as an actuator in a computer-based control system to effect fine position adjustments. Because the response time of the rest of such a system is usually much greater than that of a piezoelectric transducer, the model remains an acceptably close approximation for the purpose of control computations, even though the dynamics are neglected. The model (see Figure 1) represents an electrically parallel, mechanically series combination of backlash elements, each having a unique deadband width and output gain. The zeroth element in the parallel combination has zero deadband width and, hence, represents a linear component of the input/output relationship. The other elements, which have nonzero deadband widths, are used to model the nonlinear components of the hysteresis loop. The deadband widths and output gains of the elements are computed from experimental displacement-versus-voltage data. The hysteresis curve calculated by use of this model is piecewise linear beyond deadband limits.
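A minimal numerical sketch of this structure, assuming hypothetical deadband widths and gains; in the actual model these are computed from measured displacement-versus-voltage data, and the zeroth element (zero deadband width) supplies the linear term.

```python
import numpy as np

def backlash(u, r, y0=0.0):
    """Backlash (play) operator with deadband half-width r on input sequence u."""
    y, out = y0, []
    for v in u:
        y = min(max(y, v - r), v + r)   # output follows v once outside the deadband
        out.append(y)
    return np.array(out)

def hysteresis(u, widths, gains):
    """Electrically parallel combination of backlash elements, outputs summed."""
    # widths[0] == 0 gives the linear component; the rest shape the loop.
    return sum(g * backlash(u, r) for g, r in zip(gains, widths))

# Quasistatic voltage sweep up and back down traces an open hysteresis loop.
u = np.concatenate([np.linspace(0, 1, 50), np.linspace(1, -1, 100)])
y = hysteresis(u, widths=[0.0, 0.1, 0.3], gains=[0.5, 0.3, 0.2])
```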
Computing Efficiency Of Transfer Of Microwave Power
NASA Technical Reports Server (NTRS)
Pinero, L. R.; Acosta, R.
1995-01-01
BEAM computer program enables user to calculate microwave power-transfer efficiency between two circular apertures at arbitrary range. Power-transfer efficiency obtained numerically. Two apertures have generally different sizes and arbitrary taper illuminations. BEAM also analyzes effect of distance and taper illumination on transmission efficiency for two apertures of equal size. Written in FORTRAN.
Accurate 3-D finite difference computation of traveltimes in strongly heterogeneous media
NASA Astrophysics Data System (ADS)
Noble, M.; Gesret, A.; Belayouni, N.
2014-12-01
Seismic traveltimes and their spatial derivatives are the basis of many imaging methods such as pre-stack depth migration and tomography. A common approach to compute these quantities is to solve the eikonal equation with a finite-difference scheme. Although many recently published algorithms for solving the eikonal equation now yield fairly accurate traveltimes for most applications, the spatial derivatives of traveltimes remain very approximate. To address this accuracy issue, we develop a new hybrid eikonal solver that combines a spherical approximation close to the source and a plane-wave approximation far away. This algorithm properly reproduces the spherical behaviour of wave fronts in the vicinity of the source. We implement a combination of 16 local operators that enables us to handle velocity models with sharp vertical and horizontal velocity contrasts. We associate with these local operators a global fast sweeping method to take into account all possible directions of wave propagation. Our formulation allows us to introduce a variable grid spacing in all three directions of space. We demonstrate the efficiency of this algorithm in terms of computational time and the gain in accuracy of the computed traveltimes and their derivatives on several numerical examples.
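For reference, the global fast sweeping ingredient alone (without the authors' hybrid spherical/plane-wave local operators or variable grid spacing) reduces to Godunov upwind updates applied in alternating sweep orderings. A minimal 2-D sketch on a uniform grid with slowness s:

```python
import numpy as np

def fast_sweep(s, h, src, n_sweeps=8):
    """First-arrival traveltimes T solving |grad T| = s on a uniform grid."""
    ny, nx = s.shape
    T = np.full((ny, nx), 1e10)
    T[src] = 0.0
    orders = [(1, 1), (1, -1), (-1, 1), (-1, -1)]   # four sweep directions
    for sweep in range(n_sweeps):
        di, dj = orders[sweep % 4]
        for i in range(ny)[::di]:
            for j in range(nx)[::dj]:
                a = min(T[max(i - 1, 0), j], T[min(i + 1, ny - 1), j])
                b = min(T[i, max(j - 1, 0)], T[i, min(j + 1, nx - 1)])
                f = s[i, j] * h
                if abs(a - b) >= f:
                    t = min(a, b) + f                          # 1-D update
                else:
                    t = 0.5 * (a + b + np.sqrt(2 * f * f - (a - b) ** 2))
                T[i, j] = min(T[i, j], t)                      # Godunov upwind
    return T

T = fast_sweep(np.ones((101, 101)), h=0.01, src=(50, 50))  # homogeneous test
```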
A Computationally Efficient Algorithm for Aerosol Phase Equilibrium
Zaveri, Rahul A.; Easter, Richard C.; Peters, Len K.; Wexler, Anthony S.
2004-10-04
Three-dimensional models of atmospheric inorganic aerosols need an accurate yet computationally efficient thermodynamic module that is repeatedly used to compute internal aerosol phase state equilibrium. In this paper, we describe the development and evaluation of a computationally efficient numerical solver called MESA (Multicomponent Equilibrium Solver for Aerosols). The unique formulation of MESA allows iteration of all the equilibrium equations simultaneously while maintaining overall mass conservation and electroneutrality in both the solid and liquid phases. MESA is unconditionally stable, shows robust convergence, and typically requires only 10 to 20 single-level iterations (where all activity coefficients and aerosol water content are updated) per internal aerosol phase equilibrium calculation. Accuracy of MESA is comparable to that of the highly accurate Aerosol Inorganics Model (AIM), which uses a rigorous Gibbs free energy minimization approach. Performance evaluation will be presented for a number of complex multicomponent mixtures commonly found in urban and marine tropospheric aerosols.
Efficient, massively parallel eigenvalue computation
NASA Technical Reports Server (NTRS)
Huo, Yan; Schreiber, Robert
1993-01-01
In numerical simulations of disordered electronic systems, one of the most common approaches is to diagonalize random Hamiltonian matrices and to study the eigenvalues and eigenfunctions of a single electron in the presence of a random potential. An effort to implement a matrix diagonalization routine for real symmetric dense matrices on massively parallel SIMD computers, the Maspar MP-1 and MP-2 systems, is described. Results of numerical tests and timings are also presented.
NASA Technical Reports Server (NTRS)
Liu, Yi; Anusonti-Inthra, Phuriwat; Diskin, Boris
2011-01-01
A physics-based, systematically coupled, multidisciplinary prediction tool (MUTE) for rotorcraft noise was developed and validated with a wide range of flight configurations and conditions. MUTE is an aggregation of multidisciplinary computational tools that accurately and efficiently model the physics of the source of rotorcraft noise, and predict the noise at far-field observer locations. It uses systematic coupling approaches among multiple disciplines including Computational Fluid Dynamics (CFD), Computational Structural Dynamics (CSD), and high fidelity acoustics. Within MUTE, advanced high-order CFD tools are used around the rotor blade to predict the transonic flow (shock wave) effects, which generate the high-speed impulsive noise. Predictions of the blade-vortex interaction noise in low speed flight are also improved by using the Particle Vortex Transport Method (PVTM), which preserves the wake flow details required for blade/wake and fuselage/wake interactions. The accuracy of the source noise prediction is further improved by utilizing a coupling approach between CFD and CSD, so that the effects of key structural dynamics, elastic blade deformations, and trim solutions are correctly represented in the analysis. The blade loading information and/or the flow field parameters around the rotor blade predicted by the CFD/CSD coupling approach are used to predict the acoustic signatures at far-field observer locations with a high-fidelity noise propagation code (WOPWOP3). The predicted results from the MUTE tool for rotor blade aerodynamic loading and far-field acoustic signatures are compared and validated against a variety of experimental data sets, such as UH60-A data, DNW test data and HART II test data.
Lower bounds on the computational efficiency of optical computing systems
NASA Astrophysics Data System (ADS)
Barakat, Richard; Reif, John
1987-03-01
A general model for determining the computational efficiency of optical computing systems, termed the VLSIO model, is described. It is a 3-dimensional generalization of the wire model of a 2-dimensional VLSI with optical beams (via Gabor's theorem) replacing the wires as communication channels. Lower bounds (in terms of simultaneous volume and time) on the computational resources of the VLSIO are obtained for computing various problems such as matrix multiplication.
Miliordos, Evangelos; Xantheas, Sotiris S.
2015-06-21
We report MP2 and CCSD(T) binding energies with basis sets up to pentuple zeta quality for the m = 2-6, 8 clusters. Our best CCSD(T)/CBS estimates are -4.99 kcal/mol (dimer), -15.77 kcal/mol (trimer), -27.39 kcal/mol (tetramer), -35.9 ± 0.3 kcal/mol (pentamer), -46.2 ± 0.3 kcal/mol (prism hexamer), -45.9 ± 0.3 kcal/mol (cage hexamer), -45.4 ± 0.3 kcal/mol (book hexamer), -44.3 ± 0.3 kcal/mol (ring hexamer), -73.0 ± 0.5 kcal/mol (D2d octamer) and -72.9 ± 0.5 kcal/mol (S4 octamer). We have found that the percentage of both the uncorrected (dimer_e) and BSSE-corrected (dimer_e^CP) binding energies recovered with respect to the CBS limit falls into a narrow range for each basis set for all clusters, and in addition this range decreases upon increasing the basis set. Relatively accurate estimates (within < 0.5%) of the CBS limits can be obtained when using the "2/3, 1/3" (for the AVDZ set) or the "1/2, 1/2" (for the AVTZ, AVQZ and AV5Z sets) mixing ratio between dimer_e and dimer_e^CP. Based on those findings we propose an accurate and efficient computational protocol that can be used to estimate accurate binding energies of clusters at the MP2 (for up to 100 molecules) and CCSD(T) (for up to 30 molecules) levels of theory. This work was supported by the US Department of Energy, Office of Science, Office of Basic Energy Sciences, Division of Chemical Sciences, Geosciences and Biosciences. Pacific Northwest National Laboratory (PNNL) is a multiprogram national laboratory operated for DOE by Battelle. This research also used resources of the National Energy Research Scientific Computing Center, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. AC02-05CH11231.
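The quoted mixing recipe is simple enough to state as code. The sketch below assumes, following the ordering in the abstract, that the first weight applies to the uncorrected binding energy; the function name and argument conventions are ours.

def cbs_estimate(e_uncorrected, e_cp_corrected, basis):
    # Mix uncorrected and counterpoise-corrected binding energies to
    # approximate the CBS limit: "2/3, 1/3" for AVDZ, "1/2, 1/2" for
    # AVTZ/AVQZ/AV5Z, as quoted in the abstract.
    w = 2.0 / 3.0 if basis == "AVDZ" else 0.5
    return w * e_uncorrected + (1.0 - w) * e_cp_corrected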
NASA Astrophysics Data System (ADS)
Lee, Dongwook
2013-06-01
In this paper, we extend the unsplit staggered mesh scheme (USM) for 2D magnetohydrodynamics (MHD) [D. Lee, A.E. Deane, An unsplit staggered mesh scheme for multidimensional magnetohydrodynamics, J. Comput. Phys. 228 (2009) 952-975] to a full 3D MHD scheme. The scheme is a finite-volume Godunov method consisting of a constrained transport (CT) method and an efficient and accurate single-step, directionally unsplit multidimensional data reconstruction-evolution algorithm, which extends Colella's original 2D corner transport upwind (CTU) method [P. Colella, Multidimensional upwind methods for hyperbolic conservation laws, J. Comput. Phys. 87 (1990) 446-466]. We present two types of data reconstruction-evolution algorithms for 3D: (1) a reduced CTU scheme and (2) a full CTU scheme. The reduced 3D CTU scheme is a variant of a simple 3D extension of Colella's 2D CTU method and is considered a direct extension of the 2D USM scheme. The full 3D CTU scheme is our primary 3D solver which includes all multidimensional cross-derivative terms for stability. The latter method is logically analogous to the 3D unsplit CTU method by Saltzman [J. Saltzman, An unsplit 3D upwind method for hyperbolic conservation laws, J. Comput. Phys. 115 (1994) 153-168]. The major novelties in our algorithms are twofold. First, we extend the reduced CTU scheme to the full CTU scheme which is able to run with CFL numbers close to unity. Both methods utilize the transverse update technique developed in the 2D USM algorithm to account for transverse fluxes without solving intermediate Riemann problems, which in turn gives cost-effective 3D methods by reducing the total number of Riemann solves. The proposed algorithms are simple and efficient especially when including multidimensional MHD terms that maintain in-plane magnetic field dynamics. Second, we introduce a new CT scheme that makes use of proper upwind information in taking averages of electric fields. Our 3D USM schemes can be easily
A Computationally Efficient Bedrock Model
NASA Astrophysics Data System (ADS)
Fastook, J. L.
2002-05-01
Full treatments of the Earth's crust, mantle, and core for ice sheet modeling are often computationally overwhelming, in that the requirements to calculate a full self-gravitating spherical Earth model for the time-varying load history of an ice sheet are considerably greater than the computational requirements for the ice dynamics and thermodynamics combined. For this reason, we adopt a "reasonable" approximation for the behavior of the deforming bedrock beneath the ice sheet. This simpler model of the Earth treats the crust as an elastic plate supported from below by a hydrostatic fluid. Conservation of linear and angular momentum for an elastic plate leads to the classical Poisson-Kirchhoff fourth order differential equation in the crustal displacement. By adding a time-dependent term this treatment allows for an exponentially-decaying response of the bed to loading and unloading events. This component of the ice sheet model (along with the ice dynamics and thermodynamics) is solved using the Finite Element Method (FEM). C1 FEMs are difficult to implement in more than one dimension, and as such the engineering community has turned away from classical Poisson-Kirchhoff plate theory to treatments such as Reissner-Mindlin plate theory, which are able to accommodate transverse shear and hence require only C0 continuity of basis functions (only the function, and not the derivative, is required to be continuous at the element boundary) (Hughes 1987). This method reduces the complexity of the C1 formulation by adding additional degrees of freedom (the transverse shear in x and y) at each node. This "reasonable" solution is compared with two self-gravitating spherical Earth models ((1) Ivins et al. (1997) and James and Ivins (1998), and (2) Tushingham and Peltier (1991) ICE3G, run by Jim Davis and Glenn Milne), as well as with preliminary results of residual rebound rates measured with GPS by the BIFROST project. Modeled responses of a simulated ice sheet experiencing a
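For orientation, the flexure model described above can be sketched in LaTeX with our own notation (D, \rho_m, g, \tau and w_{eq} are assumed symbols, not taken from the paper). The static Poisson-Kirchhoff equation for the crustal displacement w under an ice load q is

    D \nabla^4 w + \rho_m g\, w = q,

with D the flexural rigidity of the elastic plate and \rho_m g w the hydrostatic restoring force of the supporting fluid; the added time-dependent term relaxes w exponentially toward the static solution w_{eq}:

    \frac{\partial w}{\partial t} = -\frac{1}{\tau} \left( w - w_{eq} \right).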
Candel, A.; Kabel, A.; Lee, L.; Li, Z.; Limborg, C.; Ng, C.; Prudencio, E.; Schussman, G.; Uplenchwar, R.; Ko, K.; /SLAC
2009-06-19
Over the past years, SLAC's Advanced Computations Department (ACD), under SciDAC sponsorship, has developed a suite of 3D (2D) parallel higher-order finite element (FE) codes, T3P (T2P) and Pic3P (Pic2P), aimed at accurate, large-scale simulation of wakefields and particle-field interactions in radio-frequency (RF) cavities of complex shape. The codes are built on the FE infrastructure that supports SLAC's frequency domain codes, Omega3P and S3P, to utilize conformal tetrahedral (triangular) meshes, higher-order basis functions and quadratic geometry approximation. For time integration, they adopt an unconditionally stable implicit scheme. Pic3P (Pic2P) extends T3P (T2P) to treat charged-particle dynamics self-consistently using the PIC (particle-in-cell) approach, the first such implementation on a conformal, unstructured grid using Whitney basis functions. Examples from applications to the International Linear Collider (ILC), Positron Electron Project-II (PEP-II), Linac Coherent Light Source (LCLS) and other accelerators will be presented to compare the accuracy and computational efficiency of these codes versus their counterparts using structured grids.
NASA Astrophysics Data System (ADS)
Shukla, Ratnesh K.
2014-11-01
Single fluid schemes that rely on an interface function for phase identification in multicomponent compressible flows are widely used to study hydrodynamic flow phenomena in several diverse applications. Simulations based on standard numerical implementation of these schemes suffer from an artificial increase in the width of the interface function owing to the numerical dissipation introduced by an upwind discretization of the governing equations. In addition, monotonicity requirements which ensure that the sharp interface function remains bounded at all times necessitate use of low-order accurate discretization strategies. This results in a significant reduction in accuracy along with a loss of intricate flow features. In this paper we develop a nonlinear transformation based interface capturing method which achieves superior accuracy without compromising the simplicity, computational efficiency and robustness of the original flow solver. A nonlinear map from the signed distance function to the sigmoid type interface function is used to effectively couple a standard single fluid shock and interface capturing scheme with a high-order accurate constrained level set reinitialization method in a way that allows for oscillation-free transport of the sharp material interface. Imposition of a maximum principle, which ensures that the multidimensional preconditioned interface capturing method does not produce new maxima or minima even in the extreme events of interface merger or breakup, allows for an explicit determination of the interface thickness in terms of the grid spacing. A narrow band method is formulated in order to localize computations pertinent to the preconditioned interface capturing method. Numerical tests in one dimension reveal a significant improvement in accuracy and convergence; in stark contrast to the conventional scheme, the proposed method retains its accuracy and convergence characteristics in a shifted reference frame. Results from the test
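The abstract does not give the authors' exact nonlinear map, so the following is only one common sigmoid-type choice, shown for illustration: the signed distance function \psi is mapped to a bounded interface function \phi by

    \phi(\mathbf{x}) = \frac{1}{1 + \exp\left(-\psi(\mathbf{x})/(\varepsilon h)\right)},

where h is the grid spacing and \varepsilon an O(1) parameter, so that the interface thickness is an explicit multiple of the grid spacing, consistent with the maximum principle discussed above.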
Bauer, Sebastian; Mathias, Gerald; Tavan, Paul
2014-03-14
We present a reaction field (RF) method which accurately solves the Poisson equation for proteins embedded in dielectric solvent continua at a computational effort comparable to that of an electrostatics calculation with polarizable molecular mechanics (MM) force fields. The method combines an approach originally suggested by Egwolf and Tavan [J. Chem. Phys. 118, 2039 (2003)] with concepts generalizing the Born solution [Z. Phys. 1, 45 (1920)] for a solvated ion. First, we derive an exact representation according to which the sources of the RF potential and energy are inducible atomic anti-polarization densities and atomic shielding charge distributions. Modeling these atomic densities by Gaussians leads to an approximate representation. Here, the strengths of the Gaussian shielding charge distributions are directly given in terms of the static partial charges as defined, e.g., by standard MM force fields for the various atom types, whereas the strengths of the Gaussian anti-polarization densities are calculated by a self-consistency iteration. The atomic volumes are also described by Gaussians. To account for covalently overlapping atoms, their effective volumes are calculated by another self-consistency procedure, which guarantees that the dielectric function ε(r) is close to one everywhere inside the protein. The Gaussian widths σ_i of the atoms i are parameters of the RF approximation. The remarkable accuracy of the method is demonstrated by comparison with Kirkwood's analytical solution for a spherical protein [J. Chem. Phys. 2, 351 (1934)] and with computationally expensive grid-based numerical solutions for simple model systems in dielectric continua including a di-peptide (Ac-Ala-NHMe) as modeled by a standard MM force field. The latter example shows how weakly the RF conformational free energy landscape depends on the parameters σ_i. A summarizing discussion highlights the achievements of the new theory and of its approximate solution
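For orientation, the classical Born result that the method generalizes gives the solvation free energy of an ion of charge q and radius a in a continuum of dielectric constant \varepsilon (written here in SI units; a standard formula, not notation from the paper):

    \Delta G_{\mathrm{Born}} = -\frac{q^2}{8\pi\varepsilon_0 a} \left( 1 - \frac{1}{\varepsilon} \right).

The paper replaces the single spherical ion by Gaussian atomic densities with widths \sigma_i.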
Wu, Baolin; Guan, Weihua; Pankow, James S
2016-03-01
The objective of this paper is to discuss and develop alternative computational methods to accurately and efficiently calculate significance P-values for the commonly used sequence kernel association test (SKAT) and adaptive sum of SKAT and burden test (SKAT-O) for variant set association. We show that the existing software can lead to either conservative or inflated type I errors. We develop alternative and efficient computational algorithms that quickly compute the SKAT P-value and have well-controlled type I errors. In addition, we derive an alternative and simplified formula for calculating the significance P-value of SKAT-O, which sheds light on the development of efficient and accurate numerical algorithms. We implement the proposed methods in a publicly available R package that can be readily used or adapted to large-scale sequencing studies. Given that more and more large-scale exome and whole genome sequencing or re-sequencing studies are being conducted, the proposed methods are of great practical importance. We conduct extensive numerical studies to investigate the performance of the proposed methods. We further illustrate their usefulness with application to associations between rare exonic variants and fasting glucose levels in the Atherosclerosis Risk in Communities (ARIC) study. PMID:26757198
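To make the flavor of such P-value computations concrete, the sketch below approximates the tail probability of a SKAT-type statistic, a weighted sum of independent chi-square variables, by Satterthwaite moment matching. This is a deliberately crude baseline, not the refined algorithms of the paper or its R package.

import numpy as np
from scipy.stats import chi2

def quadratic_form_pvalue(q_stat, lambdas):
    # Approximate P(Q > q) for Q = sum_i lambda_i * chi^2_1 by matching
    # the first two moments of Q to a scaled chi-square distribution.
    lambdas = np.asarray(lambdas, dtype=float)
    scale = np.sum(lambdas ** 2) / np.sum(lambdas)    # chosen so mean matches
    df = np.sum(lambdas) ** 2 / np.sum(lambdas ** 2)  # effective degrees of freedom
    return chi2.sf(q_stat / scale, df)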
Haley, William E.; Ibrahim, El-Sayed H.; Qu, Mingliang; Cernigliaro, Joseph G.; Goldfarb, David S.; McCollough, Cynthia H.
2015-01-01
Dual-energy computed tomography (DECT) has recently been suggested as the imaging modality of choice for kidney stones due to its ability to provide information on stone composition. Standard postprocessing of the dual-energy images accurately identifies uric acid stones, but not other types. Cystine stones can be identified from DECT images when analyzed with advanced postprocessing. This case report describes clinical implications of accurate diagnosis of cystine stones using DECT. PMID:26688770
Computationally efficient prediction of area per lipid
NASA Astrophysics Data System (ADS)
Chaban, Vitaly
2014-11-01
Area per lipid (APL) is an important property of biological and artificial membranes. Newly constructed bilayers are characterized by their APL, and newly elaborated force fields must reproduce APL. Computer simulations of APL are very expensive due to slow conformational dynamics. The speed of the simulated dynamics increases exponentially with temperature, while the APL dependence on temperature is linear over the entire temperature range. I provide numerical evidence that the thermal expansion coefficient of a lipid bilayer can be computed at elevated temperatures and extrapolated to the temperature of interest. Thus, sampling times to predict accurate APL are reduced by a factor of ∼10.
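The extrapolation the abstract describes reduces to a one-line linear fit. A sketch under stated assumptions (temperatures in K, APL in nm^2, made-up numbers in the usage comment):

import numpy as np

def extrapolate_apl(temps, apls, t_target):
    # Fit APL(T) from fast high-temperature simulations with a straight
    # line and extrapolate to the slow-to-sample target temperature; the
    # slope plays the role of the areal thermal expansion coefficient.
    slope, intercept = np.polyfit(temps, apls, 1)
    return slope * t_target + intercept

# Usage with hypothetical values:
# extrapolate_apl([360, 380, 400], [0.66, 0.68, 0.70], 310)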
Creation of Anatomically Accurate Computer-Aided Design (CAD) Solid Models from Medical Images
NASA Technical Reports Server (NTRS)
Stewart, John E.; Graham, R. Scott; Samareh, Jamshid A.; Oberlander, Eric J.; Broaddus, William C.
1999-01-01
Most surgical instrumentation and implants used in the world today are designed with sophisticated Computer-Aided Design (CAD)/Computer-Aided Manufacturing (CAM) software. This software automates the mechanical development of a product from its conceptual design through manufacturing. CAD software also provides a means of manipulating solid models prior to Finite Element Modeling (FEM). Few surgical products are designed in conjunction with accurate CAD models of human anatomy because of the difficulty with which these models are created. We have developed a novel technique that creates anatomically accurate, patient specific CAD solids from medical images in a matter of minutes.
Toward accurate tooth segmentation from computed tomography images using a hybrid level set model
Gan, Yangzhou; Zhao, Qunfei; Xia, Zeyang; Hu, Ying; Xiong, Jing; Zhang, Jianwei
2015-01-15
Purpose: A three-dimensional (3D) model of the teeth provides important information for orthodontic diagnosis and treatment planning. Tooth segmentation is an essential step in generating the 3D digital model from computed tomography (CT) images. The aim of this study is to develop an accurate and efficient tooth segmentation method from CT images. Methods: The 3D dental CT volumetric images are segmented slice by slice in a two-dimensional (2D) transverse plane. The 2D segmentation is composed of a manual initialization step and an automatic slice by slice segmentation step. In the manual initialization step, the user manually picks a starting slice and selects a seed point for each tooth in this slice. In the automatic slice segmentation step, a developed hybrid level set model is applied to segment tooth contours from each slice. Tooth contour propagation strategy is employed to initialize the level set function automatically. Cone beam CT (CBCT) images of two subjects were used to tune the parameters. Images of 16 additional subjects were used to validate the performance of the method. Volume overlap metrics and surface distance metrics were adopted to assess the segmentation accuracy quantitatively. The volume overlap metrics were volume difference (VD, mm^3) and Dice similarity coefficient (DSC, %). The surface distance metrics were average symmetric surface distance (ASSD, mm), RMS (root mean square) symmetric surface distance (RMSSSD, mm), and maximum symmetric surface distance (MSSD, mm). Computation time was recorded to assess the efficiency. The performance of the proposed method has been compared with two state-of-the-art methods. Results: For the tested CBCT images, the VD, DSC, ASSD, RMSSSD, and MSSD for the incisor were 38.16 ± 12.94 mm^3, 88.82 ± 2.14%, 0.29 ± 0.03 mm, 0.32 ± 0.08 mm, and 1.25 ± 0.58 mm, respectively; the VD, DSC, ASSD, RMSSSD, and MSSD for the canine were 49.12 ± 9.33 mm^3, 91.57 ± 0.82%, 0.27 ± 0.02 mm, 0
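Of the metrics above, the Dice similarity coefficient is the simplest to state; a generic sketch for two binary masks (standard formula, not code from the paper):

import numpy as np

def dice_coefficient(seg, ref):
    # DSC in percent: 200 * |A intersect B| / (|A| + |B|) for binary masks.
    seg, ref = np.asarray(seg, bool), np.asarray(ref, bool)
    denom = seg.sum() + ref.sum()
    return 200.0 * np.logical_and(seg, ref).sum() / denom if denom else 100.0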
An accurate and efficient algorithm for peptide and PTM identification by tandem mass spectrometry.
Ning, Kang; Ng, Hoong Kee; Leong, Hon Wai
2007-01-01
Peptide identification by tandem mass spectrometry (MS/MS) is one of the most important problems in proteomics. Recent advances in high-throughput MS/MS experiments result in huge amounts of spectra. Unfortunately, identification of these spectra is relatively slow, and the accuracy of current algorithms suffers in the presence of noise and post-translational modifications (PTMs). In this paper, we strive to achieve high accuracy and efficiency for the peptide identification problem, with special concern for the identification of peptides with PTMs. This paper expands our previous work on PepSOM with the introduction of two accurate modified scoring functions: Sλ for peptide identification and Sλ* for identification of peptides with PTMs. Experiments showed that our algorithm is both fast and accurate for peptide identification. Experiments on spectra with simulated and real PTMs confirmed that our algorithm is accurate for identifying PTMs. PMID:18546510
Schwörer, Magnus; Lorenzen, Konstantin; Mathias, Gerald; Tavan, Paul
2015-03-14
Recently, a novel approach to hybrid quantum mechanics/molecular mechanics (QM/MM) molecular dynamics (MD) simulations has been suggested [Schwörer et al., J. Chem. Phys. 138, 244103 (2013)]. Here, the forces acting on the atoms are calculated by grid-based density functional theory (DFT) for a solute molecule and by a polarizable molecular mechanics (PMM) force field for a large solvent environment composed of several 10^3-10^5 molecules as negative gradients of a DFT/PMM hybrid Hamiltonian. The electrostatic interactions are efficiently described by a hierarchical fast multipole method (FMM). Adopting recent progress of this FMM technique [Lorenzen et al., J. Chem. Theory Comput. 10, 3244 (2014)], which particularly entails a strictly linear scaling of the computational effort with the system size, and adapting this revised FMM approach to the computation of the interactions between the DFT and PMM fragments of a simulation system, here, we show how one can further enhance the efficiency and accuracy of such DFT/PMM-MD simulations. The resulting gain of total performance, as measured for alanine dipeptide (DFT) embedded in water (PMM) by the product of the gains in efficiency and accuracy, amounts to about one order of magnitude. We also demonstrate that the jointly parallelized implementation of the DFT and PMM-MD parts of the computation enables the efficient use of high-performance computing systems. The associated software is available online. PMID:25770527
Accurate charge capture and cost allocation: cost justification for bedside computing.
Grewal, R.; Reed, R. L.
1993-01-01
This paper shows that cost justification for bedside clinical computing can be made by recouping charges with accurate charge capture. Twelve months' worth of professional charges for a sixteen bed surgical intensive care unit are computed from charted data in a bedside clinical database and are compared to the professional charges actually billed by the unit. A substantial difference between predicted and billed charges was found. This paper also discusses the concept of appropriate cost allocation in the inpatient environment and the feasibility of appropriate allocation as a by-product of bedside computing. PMID:8130444
Efficient computation of Lorentzian 6J symbols
NASA Astrophysics Data System (ADS)
Willis, Joshua
2007-04-01
Spin foam models are a proposal for a quantum theory of gravity, and an important open question is whether they reproduce classical general relativity in the low energy limit. One approach to tackling that problem is to simulate spin-foam models on the computer, but this is hampered by the high computational cost of evaluating the basic building block of these models, the so-called 10J symbol. For Euclidean models, Christensen and Egan have developed an efficient algorithm, but for Lorentzian models this problem remains open. In this talk we describe an efficient method developed for Lorentzian 6J symbols, and we also report on recent work in progress to use this efficient algorithm in calculating the 10J symbols that are of real interest.
NASA Astrophysics Data System (ADS)
Jiang, Xikai; Karpeev, Dmitry; Li, Jiyuan; de Pablo, Juan; Hernandez-Ortiz, Juan; Heinonen, Olle
Boundary integrals arise in many electrostatic and magnetostatic problems. In computational modeling of these problems, although the integral is performed only on the boundary of a domain, its direct evaluation needs O(N^2) operations, where N is the number of unknowns on the boundary. The O(N^2) scaling impedes wider usage of the boundary integral method in the scientific and engineering communities. We have developed a parallel computational approach that utilizes the Fast Multipole Method to evaluate the boundary integral in O(N) operations. To demonstrate the accuracy, efficiency, and scalability of our approach, we consider two test cases. In the first case, we solve a boundary value problem for a ferroelectric/ferromagnetic volume in free space using a hybrid finite element-boundary integral method. In the second case, we solve an electrostatic problem involving the polarization of dielectric objects in free space using the boundary element method. The results from the test cases show that our parallel approach can enable highly efficient and accurate simulations of mesoscale electrostatic/magnetostatic problems. Computing resources were provided by Blues, a high-performance cluster operated by the Laboratory Computing Resource Center at Argonne National Laboratory. Work at Argonne was supported by U.S. DOE, Office of Science under Contract No. DE-AC02-06CH11357.
An Efficient Method for Computing All Reducts
NASA Astrophysics Data System (ADS)
Bao, Yongguang; Du, Xiaoyong; Deng, Mingrong; Ishii, Naohiro
In the process of data mining of a decision table using Rough Sets methodology, the main computational effort is associated with the determination of the reducts. Computing all reducts is a combinatorial NP-hard problem. Therefore the only way to achieve faster execution is to provide an algorithm, with a better constant factor, which may solve this problem in reasonable time for real-life data sets. The purpose of this presentation is to propose two new efficient algorithms to compute reducts in information systems. The proposed algorithms are based on properties of reducts and the relation between reducts and the discernibility matrix. Experiments measuring execution time have been conducted on several real-world domains. The results show improved execution time when compared with other methods. In real applications, the two proposed algorithms can be combined.
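The discernibility matrix the algorithms build on is easy to illustrate. The sketch below is the textbook construction for a decision table, with our own data-layout assumptions (one tuple of condition-attribute values plus one decision label per object); reducts are exactly the minimal attribute sets that hit every non-empty entry.

def discernibility_matrix(rows, decisions):
    # Entry (i, j) holds the indices of condition attributes whose values
    # discern objects i and j whenever their decisions differ.
    n = len(rows)
    matrix = [[set() for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if decisions[i] != decisions[j]:
                matrix[i][j] = {a for a, (u, v) in
                                enumerate(zip(rows[i], rows[j])) if u != v}
    return matrix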
Efficient and accurate laser shaping with liquid crystal spatial light modulators
NASA Astrophysics Data System (ADS)
Maxson, Jared M.; Bartnik, Adam C.; Bazarov, Ivan V.
2014-10-01
A phase-only spatial light modulator (SLM) is capable of precise transverse laser shaping by either functioning as a variable phase grating or by serving as a variable mask via polarization rotation. As a phase grating, the highest accuracy algorithms, based on computer generated holograms (CGHs), have been shown to yield extended laser shapes with <10% rms error, but conversely little is known about the experimental efficiency of the method in general. In this work, we compare the experimental tradeoff between error and efficiency for both the best known CGH method and polarization rotation-based intensity masking when generating hard-edged flat top beams. We find that the masking method performs comparably with CGHs, both having rms error < 10% with efficiency > 15%. Informed by best practices for high efficiency from a SLM phase grating, we introduce an adaptive refractive algorithm which has high efficiency (92%) but also higher error (16%), for nearly cylindrically symmetric cases.
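The abstract does not name the CGH algorithm it benchmarks; as a point of reference, the classic Gerchberg-Saxton iteration for a phase-only far-field hologram looks like the following sketch (illustrative only, with a random starting phase and uniform illumination assumed).

import numpy as np

def gerchberg_saxton(target_amp, n_iter=50):
    # Iterate between SLM and far-field planes, enforcing unit amplitude
    # at the SLM and the target amplitude in the far field; keep the phase.
    rng = np.random.default_rng(0)
    phase = 2 * np.pi * rng.random(target_amp.shape)
    for _ in range(n_iter):
        far = np.fft.fft2(np.exp(1j * phase))
        far = target_amp * np.exp(1j * np.angle(far))
        phase = np.angle(np.fft.ifft2(far))
    return phase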
Computer-based personality judgments are more accurate than those made by humans
Youyou, Wu; Kosinski, Michal; Stillwell, David
2015-01-01
Judging others’ personalities is an essential skill in successful social living, as personality is a key driver behind people’s interactions, behaviors, and emotions. Although accurate personality judgments stem from social-cognitive skills, developments in machine learning show that computer models can also make valid judgments. This study compares the accuracy of human and computer-based personality judgments, using a sample of 86,220 volunteers who completed a 100-item personality questionnaire. We show that (i) computer predictions based on a generic digital footprint (Facebook Likes) are more accurate (r = 0.56) than those made by the participants’ Facebook friends using a personality questionnaire (r = 0.49); (ii) computer models show higher interjudge agreement; and (iii) computer personality judgments have higher external validity when predicting life outcomes such as substance use, political attitudes, and physical health; for some outcomes, they even outperform the self-rated personality scores. Computers outpacing humans in personality judgment presents significant opportunities and challenges in the areas of psychological assessment, marketing, and privacy. PMID:25583507
NASA Technical Reports Server (NTRS)
Tang, Charles C. H.
1988-01-01
By using Von Zeipel's generating function procedure, the perturbing Earth gravitational potential is averaged with respect to the fast variable (mean anomaly) and a set of 'fictitious' mean orbital elements which can be used as a long-term satellite orbit predictor is obtained. The set of elements is shown to be a function of the nonlinear square of the second zonal harmonic coefficient. It is found that the long-term orbit prediction using the 'fictitious' mean elements is as accurate as that using the osculating elements, but has a computing speed about two orders of magnitude faster. For short-term orbit predictions, the osculating elements approach must be used.
A Unified Methodology for Computing Accurate Quaternion Color Moments and Moment Invariants.
Karakasis, Evangelos G; Papakostas, George A; Koulouriotis, Dimitrios E; Tourassis, Vassilios D
2014-02-01
In this paper, a general framework for computing accurate quaternion color moments and their corresponding invariants is proposed. The proposed unified scheme arose by studying the characteristics of different orthogonal polynomials. These polynomials are used as kernels in order to form moments, the invariants of which can easily be derived. The resulting scheme permits the use of any polynomial-like kernel in a unified and consistent way. The resulting moments and moment invariants demonstrate robustness to noisy conditions and high discriminative power. Additionally, in the case of continuous moments, accurate computations take place to avoid approximation errors. Based on this general methodology, the quaternion Tchebichef, Krawtchouk, Dual Hahn, Legendre, orthogonal Fourier-Mellin, pseudo Zernike and Zernike color moments, and their corresponding invariants are introduced. A selected paradigm presents the reconstruction capability of each moment family, whereas proper classification scenarios evaluate the performance of color moment invariants. PMID:24216719
Changing computing paradigms towards power efficiency.
Klavík, Pavel; Malossi, A Cristiano I; Bekas, Costas; Curioni, Alessandro
2014-06-28
Power awareness is fast becoming immensely important in computing, ranging from the traditional high-performance computing applications to the new generation of data-centric workloads. In this work, we describe our efforts towards a power-efficient computing paradigm that combines low- and high-precision arithmetic. We showcase our ideas for the widely used kernel of solving systems of linear equations that finds numerous applications in scientific and engineering disciplines as well as in large-scale data analytics, statistics and machine learning. Towards this goal, we developed tools for the seamless power profiling of applications at a fine-grain level. In addition, we verify here previous work on post-FLOPS/W metrics and show that these can shed much more light on the power/energy profile of important applications. PMID:24842033
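The low/high-precision combination described above follows the classic iterative-refinement pattern, sketched here generically (not the authors' code; a production version would factor the matrix once and reuse the factors instead of re-solving):

import numpy as np

def mixed_precision_solve(A, b, n_refine=5):
    # Solve Ax = b in cheap float32, then correct with float64 residuals.
    A32 = A.astype(np.float32)
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(n_refine):
        r = b - A @ x  # residual computed in high precision
        x += np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
    return x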
Brandenburg, Jan Gerit; Caldeweyher, Eike; Grimme, Stefan
2016-06-21
We extend the recently introduced PBEh-3c global hybrid density functional [S. Grimme et al., J. Chem. Phys., 2015, 143, 054107] by a screened Fock exchange variant based on the Henderson-Janesko-Scuseria exchange hole model. While the excellent performance of the global hybrid is maintained for small covalently bound molecules, its performance for computed condensed phase mass densities is further improved. Most importantly, a speed up of 30 to 50% can be achieved and especially for small orbital energy gap cases, the method is numerically much more robust. The latter point is important for many applications, e.g., for metal-organic frameworks, organic semiconductors, or protein structures. This enables an accurate density functional based electronic structure calculation of a full DNA helix structure on a single core desktop computer which is presented as an example in addition to comprehensive benchmark results. PMID:27240749
WIPPER: an accurate and efficient field phenotyping platform for large-scale applications
Utsushi, Hiroe; Abe, Akira; Tamiru, Muluneh; Ogasawara, Yumiko; Obara, Tsutomu; Sato, Emiko; Ochiai, Yusuke; Terauchi, Ryohei; Takagi, Hiroki
2015-01-01
More accurate, rapid, and easy phenotyping tools are required to match the recent advances in high-throughput genotyping for accelerating breeding and genetic analysis. The conventional practice of recording data in field notebooks and then entering it into computers for further analysis is inefficient, time-consuming, laborious, and prone to human error. Here, we report WIPPER (for Wireless Plant Phenotyper), a new phenotyping platform that combines field phenotyping and data recording with the aid of Bluetooth communication, thus saving time and labor not only for field data recording but also for inputting data to computers. Additionally, it eliminates the risk of human error associated with phenotyping and inputting data. We applied WIPPER to 100 individuals of a rice recombinant inbred line (RIL) for measuring leaf width and relative chlorophyll content (SPAD value), and were able to record accurate data in a significantly reduced time compared with the conventional method of data collection. We are currently using WIPPER for routine management of rice germplasm including recording and documenting information on phenotypic data, seeds, and DNA for their accelerated utilization in crop breeding. PMID:26175626
Accurate computation of Stokes flow driven by an open immersed interface
NASA Astrophysics Data System (ADS)
Li, Yi; Layton, Anita T.
2012-06-01
We present numerical methods for computing two-dimensional Stokes flow driven by forces singularly supported along an open, immersed interface. Two second-order accurate methods are developed: one for accurately evaluating boundary integral solutions at a point, and another for computing Stokes solution values on a rectangular mesh. We first describe a method for computing singular or nearly singular integrals, such as a double layer potential due to sources on a curve in the plane, evaluated at a point on or near the curve. To improve accuracy of the numerical quadrature, we add corrections for the errors arising from discretization, which are found by asymptotic analysis. When used to solve the Stokes equations with sources on an open, immersed interface, the method generates second-order approximations, for both the pressure and the velocity, and preserves the jumps in the solutions and their derivatives across the boundary. We then combine the method with a mesh-based solver to yield a hybrid method for computing Stokes solutions at N^2 grid points on a rectangular grid. Numerical results are presented which exhibit second-order accuracy. To demonstrate the applicability of the method, we use the method to simulate fluid dynamics induced by the beating motion of a cilium. The method preserves the sharp jumps in the Stokes solution and their derivatives across the immersed boundary. Model results illustrate the distinct hydrodynamic effects generated by the effective stroke and by the recovery stroke of the ciliary beat cycle.
Efficient communication in massively parallel computers
Cypher, R.E.
1989-01-01
A fundamental operation in parallel computation is sorting. Sorting is important not only because it is required by many algorithms, but also because it can be used to implement irregular, pointer-based communication. The author studies two algorithms for sorting in massively parallel computers. First, he examines Shellsort. Shellsort is a sorting algorithm that is based on a sequence of parameters called increments. Shellsort can be used to create a parallel sorting device known as a sorting network. Researchers have suggested that if the correct increment sequence is used, an optimal size sorting network can be obtained. All published increment sequences have been monotonically decreasing. He shows that no monotonically decreasing increment sequence will yield an optimal size sorting network. Second, he presents a sorting algorithm called Cubesort. Cubesort is the fastest known sorting algorithm for a variety of parallel computers over a wide range of parameters. He also presents a paradigm for developing parallel algorithms that have efficient communication. The paradigm, called the data reduction paradigm, consists of using a divide-and-conquer strategy. Both the division and combination phases of the divide-and-conquer algorithm may require irregular, pointer-based communication between processors. However, the problem is divided so as to limit the amount of data that must be communicated. As a result the communication can be performed efficiently. He presents data reduction algorithms for the image component labeling problem, the closest pair problem and four versions of the parallel prefix problem.
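For readers unfamiliar with increment sequences, a plain sequential Shellsort with one fixed, monotonically decreasing sequence (Knuth-style values, chosen here only for illustration) looks like this:

def shellsort(a, increments=(121, 40, 13, 4, 1)):
    # For each increment h, insertion-sort the h-strided subsequences;
    # the final pass with h = 1 is ordinary insertion sort.
    a = list(a)
    for h in increments:
        for i in range(h, len(a)):
            key, j = a[i], i
            while j >= h and a[j - h] > key:
                a[j] = a[j - h]
                j -= h
            a[j] = key
    return a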
Accurate calculation of computer-generated holograms using angular-spectrum layer-oriented method.
Zhao, Yan; Cao, Liangcai; Zhang, Hao; Kong, Dezhao; Jin, Guofan
2015-10-01
Fast calculation and correct depth cue are crucial issues in the calculation of computer-generated hologram (CGH) for high quality three-dimensional (3-D) display. An angular-spectrum based algorithm for layer-oriented CGH is proposed. Angular spectra from each layer are synthesized as a layer-corresponded sub-hologram based on the fast Fourier transform without paraxial approximation. The proposed method can avoid the huge computational cost of the point-oriented method and yield accurate predictions of the whole diffracted field compared with other layer-oriented methods. CGHs of versatile formats of 3-D digital scenes, including computed tomography and 3-D digital models, are demonstrated with precise depth performance and advanced image quality. PMID:26480062
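The angular-spectrum propagation at the heart of the method has a compact standard form; the sketch below is that textbook formulation (square pixels and suppressed evanescent components assumed), not the authors' exact implementation.

import numpy as np

def angular_spectrum_propagate(field, wavelength, dx, z):
    # Propagate a sampled complex field by distance z without the paraxial
    # approximation: multiply the spectrum by exp(i*kz*z), where
    # kz = 2*pi*sqrt(1/lambda^2 - fx^2 - fy^2).
    ny, nx = field.shape
    fx = np.fft.fftfreq(nx, d=dx)
    fy = np.fft.fftfreq(ny, d=dx)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 / wavelength ** 2 - FX ** 2 - FY ** 2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    transfer = np.exp(1j * kz * z) * (arg > 0)  # drop evanescent waves
    return np.fft.ifft2(np.fft.fft2(field) * transfer)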
Time accurate application of the MacCormack 2-4 scheme on massively parallel computers
NASA Technical Reports Server (NTRS)
Hudson, Dale A.; Long, Lyle N.
1995-01-01
Many recent computational efforts in turbulence and acoustics research have used higher order numerical algorithms. One popular method has been the explicit MacCormack 2-4 scheme. The MacCormack 2-4 scheme is second order accurate in time and fourth order accurate in space, and is stable for CFL numbers below 2/3. Current research has shown that the method can give accurate results but does exhibit significant Gibbs phenomena at sharp discontinuities. The impact of adding Jameson type second, third, and fourth order artificial viscosity was examined here. Category 2 problems, the nonlinear traveling wave and the Riemann problem, were computed using a CFL number of 0.25. This research has found that dispersion errors can be significantly reduced or nearly eliminated by using a combination of second and third order terms in the damping. Use of second and fourth order terms reduced the magnitude of dispersion errors but not as effectively as the second and third order combination. The program was coded using Thinking Machines' CM Fortran, a variant of Fortran 90/High Performance Fortran, and was executed on a 2K CM-200. Simple extrapolation boundary conditions were used for both problems.
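For concreteness, one step of a Gottlieb-Turkel-style MacCormack 2-4 scheme for linear advection u_t + a u_x = 0 on a periodic grid can be sketched as follows (our simplifications: constant advection speed, no artificial viscosity, cfl = a*dt/dx).

import numpy as np

def maccormack_24_step(u, cfl):
    # Predictor: one-sided forward differences, (8u_{i+1} - 7u_i - u_{i+2})/6.
    dup = 7.0 * (np.roll(u, -1) - u) - (np.roll(u, -2) - np.roll(u, -1))
    up = u - cfl * dup / 6.0
    # Corrector: mirrored backward differences applied to the predicted field.
    dum = 7.0 * (up - np.roll(up, 1)) - (np.roll(up, 1) - np.roll(up, 2))
    return 0.5 * (u + up - cfl * dum / 6.0)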
Palm computer demonstrates a fast and accurate means of burn data collection.
Lal, S O; Smith, F W; Davis, J P; Castro, H Y; Smith, D W; Chinkes, D L; Barrow, R E
2000-01-01
Manual biomedical data collection and entry of the data into a personal computer is time-consuming and can be prone to errors. The purpose of this study was to compare data entry into a hand-held computer versus handwritten data followed by entry of the data into a personal computer. A Palm (3Com Palm IIIx, Santa Clara, Calif) computer with a custom menu-driven program was used for the entry and retrieval of burn-related variables. These variables were also used to create an identical sheet that was filled in by hand. Identical data were retrieved twice from 110 charts 48 hours apart and then used to create an Excel (Microsoft, Redmond, Wash) spreadsheet. One time data were recorded by the Palm entry method, and the other time the data were handwritten. The method of retrieval was alternated between the Palm system and handwritten system every 10 charts. The total time required to log data and to generate an Excel spreadsheet was recorded and used as a study endpoint. The total time for the Palm method of data collection and downloading to a personal computer was 23% faster than hand recording with the personal computer entry method (P < 0.05), and 58% fewer errors were generated with the Palm method. The Palm is a faster and more accurate means of data collection than a handwritten technique. PMID:11194811
Computational efficiency improvements for image colorization
NASA Astrophysics Data System (ADS)
Yu, Chao; Sharma, Gaurav; Aly, Hussein
2013-03-01
We propose an efficient algorithm for colorization of greyscale images. As in prior work, colorization is posed as an optimization problem: a user specifies the color for a few scribbles drawn on the greyscale image and the color image is obtained by propagating color information from the scribbles to surrounding regions, while maximizing the local smoothness of colors. In this formulation, colorization is obtained by solving a large sparse linear system, which normally requires substantial computation and memory resources. Our algorithm improves the computational performance through three innovations over prior colorization implementations. First, the linear system is solved iteratively without explicitly constructing the sparse matrix, which significantly reduces the required memory. Second, we formulate each iteration in terms of integral images obtained by dynamic programming, reducing repetitive computation. Third, we use a coarse-to-fine framework, where a lower resolution subsampled image is first colorized and this low resolution color image is upsampled to initialize the colorization process for the fine level. The improvements we develop provide significant speedup and memory savings compared to the conventional approach of solving the linear system directly using off-the-shelf sparse solvers, and allow us to colorize images with typical sizes encountered in realistic applications on typical commodity computing platforms.
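A matrix-free variant of the iteration described above can be sketched in a few lines: each non-scribble pixel is repeatedly replaced by the intensity-weighted average of its neighbors' chroma, so the sparse system is solved without ever being assembled. This simplified sketch omits the paper's integral-image acceleration and coarse-to-fine pyramid; the Gaussian weighting and parameter names are our assumptions.

import numpy as np

def colorize_jacobi(gray, chroma, scribble_mask, n_iter=500, sigma=0.05):
    # Jacobi sweeps over 4-neighborhoods; scribbled pixels stay fixed.
    c = chroma.copy()
    for _ in range(n_iter):
        num = np.zeros_like(c)
        den = np.zeros_like(c)
        for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            g = np.roll(gray, (dy, dx), axis=(0, 1))
            w = np.exp(-(gray - g) ** 2 / (2 * sigma ** 2))  # intensity affinity
            num += w * np.roll(c, (dy, dx), axis=(0, 1))
            den += w
        c = np.where(scribble_mask, chroma, num / np.maximum(den, 1e-12))
    return c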
Efficient yet accurate approximations for ab initio calculations of alcohol cluster thermochemistry
NASA Astrophysics Data System (ADS)
Umer, Muhammad; Kopp, Wassja A.; Leonhard, Kai
2015-12-01
We have calculated the binding enthalpies and entropies of gas phase alcohol clusters from ethanol to 1-decanol. In addition to the monomers, we have investigated dimers, tetramers, and pentamers. Geometries have been obtained at the B3LYP/TZVP level and single point energy calculations have been performed with the Resolution of the Identity-MP2 (RIMP2) method and basis set limit extrapolation using aug-cc-pVTZ and aug-cc-pVQZ basis sets. Thermochemistry is calculated with decoupled hindered rotor treatment for large amplitude motions. The results show three points: First, it is more accurate to transfer the rigid-rotor harmonic oscillator entropies from propanol to longer alcohols than to compute them with an ultra-fine grid and tight geometry convergence criteria. Second, the computational effort can be reduced considerably by using dimerization energies of longer alcohols at density functional theory (B3LYP) level plus a RIMP2 correction obtained from 1-propanol. This approximation yields results with almost the same accuracy as RIMP2; for 1-decanol the two methods differ by only 0.4 kJ/mol. Third, the entropy of dimerization including the hindered rotation contribution is converged at 1-propanol with respect to chain length. This allows for a transfer of hindered rotation contributions from smaller alcohols to longer ones, which reduces the required computational effort and manpower considerably.
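The transfer trick in the second finding reduces to a single correction term; a sketch of the stated recipe (energies in kJ/mol, function and argument names ours):

def rimp2_corrected_dimerization(e_b3lyp_long, e_b3lyp_propanol, e_rimp2_propanol):
    # Cheap B3LYP energy of the long-chain alcohol plus the
    # RIMP2-minus-B3LYP correction evaluated once for 1-propanol.
    return e_b3lyp_long + (e_rimp2_propanol - e_b3lyp_propanol)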
Ergül, Özgür; Gürel, Levent
2008-12-01
We present a novel stabilization procedure for accurate surface formulations of electromagnetic scattering problems involving three-dimensional dielectric objects with arbitrarily low contrasts. Conventional surface integral equations provide inaccurate results for the scattered fields when the contrast of the object is low, i.e., when the electromagnetic material parameters of the scatterer and the host medium are close to each other. We propose a stabilization procedure involving the extraction of nonradiating currents and rearrangement of the right-hand side of the equations using fictitious incident fields. Then, only the radiating currents are solved to calculate the scattered fields accurately. This technique can easily be applied to the existing implementations of conventional formulations, it requires negligible extra computational cost, and it is also appropriate for the solution of large problems with the multilevel fast multipole algorithm. We show that the stabilization leads to robust formulations that are valid even for the solutions of extremely low-contrast objects.
An accurate quadrature technique for the contact boundary in 3D finite element computations
NASA Astrophysics Data System (ADS)
Duong, Thang X.; Sauer, Roger A.
2015-01-01
This paper presents a new numerical integration technique for 3D contact finite element implementations, focusing on a remedy for the inaccurate integration due to discontinuities at the boundary of contact surfaces. The method is based on the adaptive refinement of the integration domain along the boundary of the contact surface, and is accordingly denoted RBQ, for refined boundary quadrature. It can be used for common element types of any order, e.g. Lagrange, NURBS, or T-Spline elements. In terms of both computational speed and accuracy, RBQ exhibits great advantages over a naive increase of the number of quadrature points. Also, the RBQ method is shown to remain accurate for large deformations. Furthermore, since the sharp boundary of the contact surface is determined, it can be used for various purposes, such as the accurate post-processing of the contact pressure. Several examples are presented to illustrate the new technique.
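A one-dimensional analogue conveys the refinement idea: integration cells cut by the contact boundary are recursively subdivided, while cells lying fully inside the contact region are integrated directly. This is only a sketch of the principle, not the authors' 3D finite-element quadrature; the indicator function, midpoint rule, and refinement limits are illustrative assumptions.

```python
def refined_boundary_quadrature(f, in_contact, a, b, depth=0, max_depth=12, n=4):
    """Integrate f over the part of [a, b] where in_contact is True,
    refining only the cells that straddle the contact boundary.
    (Assumes each cell's contact status is decided by its endpoints.)"""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x0, x1 = a + i * h, a + (i + 1) * h
        c0, c1 = in_contact(x0), in_contact(x1)
        if c0 and c1:                            # smooth cell: midpoint rule
            total += f(0.5 * (x0 + x1)) * h
        elif (c0 or c1) and depth < max_depth:   # cut cell: refine recursively
            total += refined_boundary_quadrature(
                f, in_contact, x0, x1, depth + 1, max_depth, n)
    return total

# Example: integrate x**2 over the contact region [0, 0.6] of [0, 1];
# the exact value is 0.6**3 / 3 = 0.072.
approx = refined_boundary_quadrature(lambda x: x * x, lambda x: x <= 0.6, 0.0, 1.0)
```

Refinement effort concentrates near the boundary, which is why this beats a naive global increase in quadrature points.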
A computationally efficient modelling of laminar separation bubbles
NASA Astrophysics Data System (ADS)
Dini, Paolo; Maughmer, Mark D.
1990-07-01
In predicting the aerodynamic characteristics of airfoils operating at low Reynolds numbers, it is often important to account for the effects of laminar (transitional) separation bubbles. Previous approaches to the modelling of this viscous phenomenon range from fast but sometimes unreliable empirical correlations for the length of the bubble and the associated increase in momentum thickness, to more accurate but significantly slower displacement-thickness iteration methods employing inverse boundary-layer formulations in the separated regions. Since the penalty in computational time associated with the more general methods is unacceptable for airfoil design applications, use of an accurate yet computationally efficient model is highly desirable. To this end, a semi-empirical bubble model was developed and incorporated into the Eppler and Somers airfoil design and analysis program. The generality and the efficiency were achieved by successfully approximating the local viscous/inviscid interaction, the transition location, and the turbulent reattachment process within the framework of an integral boundary-layer method. Comparisons of the predicted aerodynamic characteristics with experimental measurements for several airfoils show excellent and consistent agreement for Reynolds numbers from 2,000,000 down to 100,000.
Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications
NASA Technical Reports Server (NTRS)
Sun, Xian-He
1997-01-01
Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as the Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate the newly developed algorithms into actual simulation packages. This work plan has been achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm
Improving the Efficiency of Abdominal Aortic Aneurysm Wall Stress Computations
Zelaya, Jaime E.; Goenezen, Sevan; Dargon, Phong T.; Azarbal, Amir-Farzin; Rugonyi, Sandra
2014-01-01
An abdominal aortic aneurysm (AAA) is a pathological dilation of the abdominal aorta, which carries a high mortality rate if ruptured. The most commonly used surrogate marker of rupture risk is the maximal transverse diameter of the aneurysm. More recent studies suggest that wall stress from models of patient-specific aneurysm geometries extracted, for instance, from computed tomography images may be a more accurate predictor of rupture risk and an important factor in AAA size progression. However, quantification of wall stress is typically computationally intensive and time-consuming, mainly due to the nonlinear mechanical behavior of the abdominal aortic aneurysm walls. These difficulties have limited the potential of computational models in clinical practice. To facilitate computation of wall stresses, we propose to use a linear approach that ensures equilibrium of wall stresses in the aneurysms. This proposed linear model approach is easy to implement and eliminates the burden of nonlinear computations. To assess the accuracy of our proposed approach to compute wall stresses, results from idealized and patient-specific model simulations were compared to those obtained using conventional approaches and to those of a hypothetical, reference abdominal aortic aneurysm model. For the reference model, wall mechanical properties and the initial unloaded and unstressed configuration were assumed to be known, and the resulting wall stresses were used as reference for comparison. Our proposed linear approach accurately approximates wall stresses for varying model geometries and wall material properties. Our findings suggest that the proposed linear approach could be used as an effective, efficient, easy-to-use clinical tool to estimate patient-specific wall stresses. PMID:25007052
CLASS2: accurate and efficient splice variant annotation from RNA-seq reads.
Song, Li; Sabunciyan, Sarven; Florea, Liliana
2016-06-01
Next generation sequencing of cellular RNA is making it possible to characterize genes and alternative splicing in unprecedented detail. However, designing bioinformatics tools to accurately capture splicing variation has proven difficult. Current programs can find major isoforms of a gene but miss lower abundance variants, or are sensitive but imprecise. CLASS2 is a novel open source tool for accurate genome-guided transcriptome assembly from RNA-seq reads based on the splice graph model. An extension of our program CLASS, CLASS2 jointly optimizes read patterns and the number of supporting reads to score and prioritize transcripts, implemented in a novel, scalable and efficient dynamic programming algorithm. When compared against reference programs, CLASS2 had the best overall accuracy and could detect up to twice as many splicing events with precision similar to the best reference program. Notably, it was the only tool to produce consistently reliable transcript models for a wide range of applications and sequencing strategies, including ribosomal RNA-depleted samples. Lightweight and multi-threaded, CLASS2 requires <3GB RAM, can analyze a 350 million read set within hours, and can be widely applied to transcriptomics studies ranging from clinical RNA sequencing, to alternative splicing analyses, and to the annotation of new genomes. PMID:26975657
NASA Technical Reports Server (NTRS)
Tamma, Kumar K.; Railkar, Sudhir B.
1988-01-01
This paper represents an attempt to apply extensions of a hybrid transfinite element computational approach for accurately predicting thermoelastic stress waves. The applicability of the present formulations for capturing the thermal stress waves induced by boundary heating for the well known Danilovskaya problems is demonstrated. A unique feature of the proposed formulations for applicability to the Danilovskaya problem of thermal stress waves in elastic solids lies in the hybrid nature of the unified formulations and the development of special purpose transfinite elements in conjunction with the classical Galerkin techniques and transformation concepts. Numerical test cases validate the applicability and superior capability to capture the thermal stress waves induced due to boundary heating.
Shaughnessy, M C; Jones, R E
2016-02-01
We develop and demonstrate a method to efficiently use density functional calculations to drive classical dynamics of complex atomic and molecular systems. The method has the potential to scale to systems and time scales unreachable with current ab initio molecular dynamics schemes. It relies on an adapting dataset of independently computed Hellmann-Feynman forces for atomic configurations endowed with a distance metric. The metric on configurations enables fast database lookup and robust interpolation of the stored forces. We discuss mechanisms for the database to adapt to the needs of the evolving dynamics, while maintaining accuracy, and other extensions of the basic algorithm. PMID:26669825
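A toy version of such a force database can be written around a KD-tree: query the stored configurations, interpolate the forces when a close match exists, and fall back to (and store) a fresh expensive evaluation otherwise. The Euclidean metric on a fixed-length descriptor, the tolerance, and the inverse-distance interpolation below are stand-in assumptions; the paper's configuration metric and adaptation mechanisms are more elaborate.

```python
import numpy as np
from scipy.spatial import cKDTree

class ForceDatabase:
    """Adapting store of precomputed forces keyed by configuration descriptors."""

    def __init__(self, tol=0.1, k=4):
        self.descriptors, self.forces = [], []
        self.tree, self.tol, self.k = None, tol, k

    def add(self, descriptor, force):
        self.descriptors.append(np.asarray(descriptor, float))
        self.forces.append(np.asarray(force, float))
        self.tree = cKDTree(np.vstack(self.descriptors))  # naive full rebuild

    def force(self, descriptor, expensive_calc):
        """Interpolate stored forces if a nearby configuration exists;
        otherwise call expensive_calc (e.g., a DFT code) and adapt."""
        if self.tree is not None:
            k = min(self.k, len(self.descriptors))
            d, idx = self.tree.query(np.asarray(descriptor, float), k=k)
            d, idx = np.atleast_1d(d), np.atleast_1d(idx)
            if d[0] < self.tol:               # close enough: interpolate
                w = 1.0 / np.maximum(d, 1e-12)
                return np.average([self.forces[i] for i in idx],
                                  axis=0, weights=w)
        f = expensive_calc(descriptor)        # miss: compute and store
        self.add(descriptor, f)
        return f
```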
NASA Astrophysics Data System (ADS)
Mehmani, Yashar; Oostrom, Mart; Balhoff, Matthew T.
2014-03-01
Several approaches have been developed in the literature for solving flow and transport at the pore scale. Some authors use a direct modeling approach where the fundamental flow and transport equations are solved on the actual pore-space geometry. Such direct modeling, while very accurate, comes at a great computational cost. Network models are computationally more efficient because the pore-space morphology is approximated. Typically, a mixed cell method (MCM) is employed for solving the flow and transport system which assumes pore-level perfect mixing. This assumption is invalid at moderate to high Peclet regimes. In this work, a novel Eulerian perspective on modeling flow and transport at the pore scale is developed. The new streamline splitting method (SSM) allows for circumventing the pore-level perfect-mixing assumption, while maintaining the computational efficiency of pore-network models. SSM was verified with direct simulations and validated against micromodel experiments; excellent matches were obtained across a wide range of pore-structure and fluid-flow parameters. The increase in the computational cost from MCM to SSM is shown to be minimal, while the accuracy of SSM is much higher than that of MCM and comparable to direct modeling approaches. Therefore, SSM can be regarded as an appropriate balance between incorporating detailed physics and controlling computational cost. The truly predictive capability of the model allows for the study of pore-level interactions of fluid flow and transport in different porous materials. In this paper, we apply SSM and MCM to study the effects of pore-level mixing on transverse dispersion in 3-D disordered granular media.
Joldes, Grand Roman; Wittek, Adam; Miller, Karol
2008-01-01
Real time computation of soft tissue deformation is important for the use of augmented reality devices and for providing haptic feedback during operation or surgeon training. This requires algorithms that are fast, accurate and can handle material nonlinearities and large deformations. A set of such algorithms is presented in this paper, starting with the finite element formulation and the integration scheme used and addressing common problems such as hourglass control and locking. The computation examples presented prove that by using these algorithms, real time computations become possible without sacrificing the accuracy of the results. For a brain model having more than 7000 degrees of freedom, we computed the reaction forces due to indentation at a frequency of around 1000 Hz using a standard dual core PC. Similarly, we conducted a simulation of brain shift using a model with more than 50 000 degrees of freedom in less than a minute. The speed benefits of our models result from combining the Total Lagrangian formulation with explicit time integration and low order finite elements. PMID:19152791
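The speed of such schemes rests on a simple structural fact: with explicit central-difference time integration and a lumped (diagonal) mass matrix, each step needs only force evaluations and vector updates, never a linear solve. A minimal sketch of that update follows, with a toy one-degree-of-freedom spring standing in for the nonlinear finite element force (this is not the authors' code, and all parameters are illustrative).

```python
import numpy as np

def explicit_dynamics(u0, v0, mass, internal_force, f_ext, dt, n_steps):
    """Central-difference explicit integration with a lumped mass matrix.
    internal_force(u) returns nodal internal forces for displacements u."""
    u = u0.copy()
    u_prev = u0 - dt * v0                       # start-up displacement
    for _ in range(n_steps):
        a = (f_ext - internal_force(u)) / mass  # diagonal mass: no solve
        u_next = 2.0 * u - u_prev + dt * dt * a
        u_prev, u = u, u_next
    return u

# Toy usage: a unit mass on a linear spring of stiffness 10
u_end = explicit_dynamics(np.array([0.1]), np.array([0.0]), np.array([1.0]),
                          lambda u: 10.0 * u, np.array([0.0]),
                          dt=1e-3, n_steps=1000)
```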
A primer on the energy efficiency of computing
Koomey, Jonathan G.
2015-03-30
The efficiency of computing at peak output has increased rapidly since the dawn of the computer age. This paper summarizes some of the key factors affecting the efficiency of computing in all usage modes. While there is still great potential for improving the efficiency of computing devices, we will need to alter how we do computing in the next few decades because we are finally approaching the limits of current technologies.
An efficient second-order accurate and continuous interpolation for block-adaptive grids
NASA Astrophysics Data System (ADS)
Borovikov, Dmitry; Sokolov, Igor V.; Tóth, Gábor
2015-09-01
In this paper we present a second-order, continuous interpolation algorithm for cell-centered adaptive-mesh-refinement (AMR) grids. The continuity requirement poses a non-trivial problem at resolution changes. We develop a classification of the resolution changes, which allows us to employ efficient and simple linear interpolation in the majority of the computational domain. The algorithm is well suited for massively parallel computations. Our interpolation algorithm allows extraction of jump-free interpolated data along lines and surfaces within the computational domain. This capability is important for various applications, including kinetic particle tracking in three dimensional vector fields, visualization (i.e. surface extraction) and extracting variables along one-dimensional curves such as field lines, streamlines and satellite trajectories. Particular examples are models for acceleration of solar energetic particles (SEPs) along magnetic field lines. As such models are sensitive to sharp gradients and discontinuities, the capability to interpolate data from the AMR grid for the SEP model without numerically producing false gradients becomes crucial. We provide a complete description of the algorithm and make the code publicly available as a Fortran 90 library.
Marelli, Damián; Baumgartner, Robert; Majdak, Piotr
2015-01-01
Head-related transfer functions (HRTFs) describe the acoustic filtering of incoming sounds by the human morphology and are essential for listeners to localize sound sources in virtual auditory displays. Since rendering complex virtual scenes is computationally demanding, we propose four algorithms for efficiently representing HRTFs in subbands, i.e., as an analysis filterbank (FB) followed by a transfer matrix and a synthesis FB. All four algorithms use sparse approximation procedures to minimize the computational complexity while maintaining perceptually relevant HRTF properties. The first two algorithms separately optimize the complexity of the transfer matrix associated to each HRTF for fixed FBs. The other two algorithms jointly optimize the FBs and transfer matrices for complete HRTF sets by two variants. The first variant aims at minimizing the complexity of the transfer matrices, while the second one does it for the FBs. Numerical experiments investigate the latency-complexity trade-off and show that the proposed methods offer significant computational savings when compared with other available approaches. Psychoacoustic localization experiments were modeled and conducted to find a reasonable approximation tolerance so that no significant localization performance degradation was introduced by the subband representation. PMID:26681930
Methods for increased computational efficiency of multibody simulations
NASA Astrophysics Data System (ADS)
Epple, Alexander
This thesis is concerned with the efficient numerical simulation of finite element based flexible multibody systems. Scaling operations are systematically applied to the governing index-3 differential algebraic equations in order to solve the problem of ill conditioning for small time step sizes. The importance of augmented Lagrangian terms is demonstrated. The use of fast sparse solvers is justified for the solution of the linearized equations of motion resulting in significant savings of computational costs. Three time stepping schemes for the integration of the governing equations of flexible multibody systems are discussed in detail. These schemes are the two-stage Radau IIA scheme, the energy decaying scheme, and the generalized-α method. Their formulations are adapted to the specific structure of the governing equations of flexible multibody systems. The efficiency of the time integration schemes is comprehensively evaluated on a series of test problems. Formulations for structural and constraint elements are reviewed and the problem of interpolation of finite rotations in geometrically exact structural elements is revisited. This results in the development of a new improved interpolation algorithm, which preserves the objectivity of the strain field and guarantees stable simulations in the presence of arbitrarily large rotations. Finally, strategies for the spatial discretization of beams in the presence of steep variations in cross-sectional properties are developed. These strategies reduce the number of degrees of freedom needed to accurately analyze beams with discontinuous properties, resulting in improved computational efficiency.
A simplified approach to characterizing a kilovoltage source spectrum for accurate dose computation
Poirier, Yannick; Kouznetsov, Alexei; Tambasco, Mauro
2012-06-15
% for the homogeneous and heterogeneous block phantoms, and agreement for the transverse dose profiles was within 6%. Conclusions: The HVL and kVp are sufficient for characterizing a kV x-ray source spectrum for accurate dose computation. As these parameters can be easily and accurately measured, they provide for a clinically feasible approach to characterizing a kV energy spectrum to be used for patient specific x-ray dose computations. Furthermore, these results provide experimental validation of our novel hybrid dose computation algorithm.
Efficient gradient computation for dynamical models
Sengupta, B.; Friston, K.J.; Penny, W.D.
2014-01-01
Data assimilation is a fundamental issue that arises across many scales in neuroscience — ranging from the study of single neurons using single electrode recordings to the interaction of thousands of neurons using fMRI. Data assimilation involves inverting a generative model that can not only explain observed data but also generate predictions. Typically, the model is inverted or fitted using conventional tools of (convex) optimization that invariably extremise some functional — norms, minimum descriptive length, variational free energy, etc. Generally, optimisation rests on evaluating the local gradients of the functional to be optimized. In this paper, we compare three different gradient estimation techniques that could be used for extremising any functional in time — (i) finite differences, (ii) forward sensitivities and a method based on (iii) the adjoint of the dynamical system. We demonstrate that the first-order gradients of a dynamical system, linear or non-linear, can be computed most efficiently using the adjoint method. This is particularly true for systems where the number of parameters is greater than the number of states. For such systems, integrating several sensitivity equations – as required with forward sensitivities – proves to be most expensive, while finite-difference approximations have an intermediate efficiency. In the context of neuroimaging, adjoint based inversion of dynamical causal models (DCMs) can, in principle, enable the study of models with large numbers of nodes and parameters. PMID:24769182
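The trade-off is easy to see on a scalar toy problem. For dx/dt = -theta*x with cost J = x(T), all three estimators below should approach the analytic gradient -T*x0*exp(-theta*T); the adjoint needs one backward pass regardless of the number of parameters, whereas forward sensitivities need one extra integration per parameter. A minimal sketch with explicit Euler throughout (step counts and parameter values are arbitrary assumptions):

```python
import numpy as np

def gradient_three_ways(theta=0.7, x0=1.0, T=1.0, n=20000):
    """dJ/dtheta for J = x(T), dx/dt = -theta*x, computed three ways."""
    dt = T / n
    # forward pass, storing the trajectory for the adjoint sweep
    x = np.empty(n + 1); x[0] = x0
    s = 0.0                                   # forward sensitivity dx/dtheta
    for i in range(n):
        s += dt * (-theta * s - x[i])         # sensitivity equation
        x[i + 1] = x[i] + dt * (-theta * x[i])
    grad_forward = s
    # adjoint sweep: dlam/dt = theta*lam backwards, with lam(T) = dJ/dx(T) = 1
    lam, grad_adjoint = 1.0, 0.0
    for i in range(n, 0, -1):
        grad_adjoint += dt * lam * (-x[i])    # accumulate lam * df/dtheta
        lam += dt * theta * lam
    # central finite differences for reference
    def xT(th):
        y = x0
        for _ in range(n):
            y += dt * (-th * y)
        return y
    eps = 1e-6
    grad_fd = (xT(theta + eps) - xT(theta - eps)) / (2 * eps)
    return grad_fd, grad_forward, grad_adjoint  # all ~ -T*x0*exp(-theta*T)
```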
NASA Astrophysics Data System (ADS)
Pau, George Shu Heng; Shen, Chaopeng; Riley, William J.; Liu, Yaning
2016-02-01
The topography and the biotic and abiotic parameters are typically upscaled to make watershed-scale hydrologic-biogeochemical models computationally tractable. However, the upscaling procedure can produce biases when nonlinear interactions between different processes are not fully captured at coarse resolutions. Here we applied the Proper Orthogonal Decomposition Mapping Method (PODMM) to downscale the field solutions from a coarse (7 km) resolution grid to a fine (220 m) resolution grid. PODMM trains a reduced-order model (ROM) with coarse-resolution and fine-resolution solutions, here obtained using PAWS+CLM, a quasi-3-D watershed processes model that has been validated for many temperate watersheds. Subsequent fine-resolution solutions were approximated based only on coarse-resolution solutions and the ROM. The approximation errors were efficiently quantified using an error estimator. By jointly estimating correlated variables and temporally varying the ROM parameters, we further reduced the approximation errors by up to 20%. We also improved the method's robustness by constructing multiple ROMs using different sets of variables, and selecting the best approximation based on the error estimator. The ROMs produced accurate downscaling of soil moisture, latent heat flux, and net primary production with O(1000) reduction in computational cost. The subgrid distributions were also nearly indistinguishable from the ones obtained using the fine-resolution model. Compared to coarse-resolution solutions, biases in upscaled ROM solutions were reduced by up to 80%. This method has the potential to help address the long-standing spatial scaling problem in hydrology and enable long-time integration, parameter estimation, and stochastic uncertainty analysis while accurately representing the heterogeneities.
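Stripped to its linear core, the downscaling idea is: learn dominant fine-scale modes from paired snapshots, then map any new coarse solution to coefficients of those modes. The sketch below is a heavily simplified caricature of this procedure using a plain least-squares map; the paper's PODMM additionally includes error estimators, multiple ROMs, and time-varying parameters, and all array shapes and the rank r here are assumptions.

```python
import numpy as np

def train_pod_map(coarse_snaps, fine_snaps, r=10):
    """coarse_snaps: (n_coarse, n_snap); fine_snaps: (n_fine, n_snap).
    Returns a fine-scale POD basis U_r and a linear map W so that a new
    fine solution is approximated by U_r @ (W @ coarse)."""
    U, _, _ = np.linalg.svd(fine_snaps, full_matrices=False)
    U_r = U[:, :r]                              # dominant fine-scale modes
    coeffs = U_r.T @ fine_snaps                 # (r, n_snap) training targets
    # least-squares fit coarse -> coefficients (needs n_snap >= n_coarse)
    X, *_ = np.linalg.lstsq(coarse_snaps.T, coeffs.T, rcond=None)
    return U_r, X.T

def downscale(coarse, U_r, W):
    return U_r @ (W @ coarse)                   # approximate fine-grid field
```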
Dimensioning storage and computing clusters for efficient high throughput computing
NASA Astrophysics Data System (ADS)
Accion, E.; Bria, A.; Bernabeu, G.; Caubet, M.; Delfino, M.; Espinal, X.; Merino, G.; Lopez, F.; Martinez, F.; Planas, E.
2012-12-01
Scientific experiments are producing huge amounts of data, and the size of their datasets and total volume of data continues to increase. These data are then processed by researchers belonging to large scientific collaborations, with the Large Hadron Collider being a good example. The focal point of scientific data centers has shifted from efficiently coping with petabyte-scale storage to delivering quality data processing throughput. The dimensioning of the internal components in High Throughput Computing (HTC) data centers is of crucial importance to cope with all the activities demanded by the experiments, both online (data acceptance) and offline (data processing, simulation and user analysis). This requires a precise setup involving disk and tape storage services, a computing cluster and the internal networking to prevent bottlenecks, overloads and undesired slowness that lead to lost CPU cycles and batch job failures. In this paper we point out relevant features for running a successful data storage and processing service in an intensive HTC environment.
Optical computed tomography of radiochromic gels for accurate three-dimensional dosimetry
NASA Astrophysics Data System (ADS)
Babic, Steven
In this thesis, three-dimensional (3-D) radiochromic Ferrous Xylenol-orange (FX) and Leuco Crystal Violet (LCV) micelle gels were imaged by laser and cone-beam (Vista™) optical computed tomography (CT) scanners. The objective was to develop optical CT of radiochromic gels for accurate 3-D dosimetry of intensity-modulated radiation therapy (IMRT) and small field techniques used in modern radiotherapy. First, the cause of a threshold dose response in FX gel dosimeters when scanned with a yellow light source was determined. This effect stems from a spectral sensitivity to multiple chemical complexes between ferric ions and xylenol-orange that occur at different dose levels. To negate the threshold dose, an initial concentration of ferric ions is needed in order to shift the chemical equilibrium so that additional dose results in a linear production of a coloured complex that preferentially absorbs at longer wavelengths. Second, a low diffusion leuco-based radiochromic gel consisting of Triton X-100 micelles was developed. The diffusion coefficient of the LCV micelle gel was found to be minimal (0.036 ± 0.001 mm² hr⁻¹). Although a dosimetric characterization revealed a reduced sensitivity to radiation, this was offset by a lower auto-oxidation rate and base optical density, a higher melting point and no spectral sensitivity. Third, the Radiological Physics Centre (RPC) head-and-neck IMRT protocol was extended to 3-D dose verification using laser and cone-beam (Vista™) optical CT scans of FX gels. Both optical systems yielded comparable measured dose distributions in high-dose regions and low gradients. The FX gel dosimetry results were cross-checked against independent thermoluminescent dosimeter and GAFChromic™ EBT film measurements made by the RPC. It was shown that optically CT scanned FX gels can be used for accurate IMRT dose verification in 3-D. Finally, corrections for FX gel diffusion and scattered stray light in the Vista™ scanner were developed to
NASA Astrophysics Data System (ADS)
Zhong, Fengquan; Ma, Sugang; Zhang, Xinyu; Sung, Chih-Jen; Niemeyer, Kyle E.
2015-10-01
In this paper, the methodology of the directed relation graph with error propagation and sensitivity analysis (DRGEPSA), proposed by Niemeyer et al. (Combust Flame 157:1760-1770, 2010), and its differences from the original directed relation graph method are described. Using DRGEPSA, the detailed mechanism of ethylene containing 71 species and 395 reaction steps is reduced to several skeletal mechanisms with different error thresholds. The 25-species, 131-step mechanism and the 24-species, 115-step mechanism are found to be accurate for predictions of ignition delay time and laminar flame speed. Although further reduction leads to a smaller skeletal mechanism with 19 species and 68 steps, it is no longer able to represent the correct reaction processes. With the DRGEPSA method, a detailed mechanism for n-dodecane considering low-temperature chemistry and containing 2115 species and 8157 steps is reduced to a much smaller mechanism with 249 species and 910 steps while retaining good accuracy. If considering only high-temperature (higher than 1000 K) applications, the detailed mechanism can be simplified to even smaller mechanisms with 65 species and 340 steps or 48 species and 220 steps. Furthermore, a detailed mechanism for a kerosene surrogate having 207 species and 1592 steps is reduced with various error thresholds, and the results show that the 72-species, 429-step mechanism and the 66-species, 392-step mechanism are capable of predicting correct combustion properties compared to those of the detailed mechanism. It is well recognized that kinetic mechanisms can be used effectively in computations only after they are reduced to a size acceptable for the available computational capacity while retaining accuracy. Thus, the skeletal mechanisms generated in the present work are expected to be useful for the application of kinetic mechanisms of hydrocarbons to numerical simulations of turbulent or supersonic combustion.
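The graph search at the heart of the DRG family is compact enough to sketch: starting from user-chosen target species (e.g., the fuel), keep every species reachable through edges whose interaction coefficient exceeds a threshold. The sketch below shows only this basic DRG step, not the error-propagation and sensitivity-analysis stages that DRGEPSA adds, and the coefficients in the example are made up.

```python
def drg_skeletal_species(relation, targets, eps=0.1):
    """Basic directed-relation-graph reduction: keep every species
    reachable from the targets through edges with coefficient >= eps.

    relation: dict mapping species -> {species: r_AB in [0, 1]}
    """
    keep, stack = set(targets), list(targets)
    while stack:
        a = stack.pop()
        for b, r_ab in relation.get(a, {}).items():
            if r_ab >= eps and b not in keep:
                keep.add(b)
                stack.append(b)
    return keep

# Hypothetical coefficients for a toy fuel system:
relation = {"C2H4": {"OH": 0.9, "HO2": 0.05}, "OH": {"H2O2": 0.4}}
print(drg_skeletal_species(relation, ["C2H4"]))  # {'C2H4', 'OH', 'H2O2'}
```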
NASA Astrophysics Data System (ADS)
Singh, Malkiat; Bettenhausen, Michael H.
2011-08-01
Faraday rotation changes the polarization plane of linearly polarized microwaves which propagate through the ionosphere. To correct for ionospheric polarization error, it is necessary to have electron density profiles on a global scale that represent the ionosphere in real time. We use raytrace through the combined models of ionospheric conductivity and electron density (ICED), Bent, and Gallagher models (RIBG model) to specify the ionospheric conditions by ingesting the GPS data from observing stations that are as close as possible to the observation time and location of the space system for which the corrections are required. To accurately calculate Faraday rotation corrections, we also utilize the raytrace utility of the RIBG model instead of the normal shell model assumption for the ionosphere. We use WindSat data, which exhibits a wide range of orientations of the raypath and a high data rate of observations, to provide a realistic data set for analysis. The standard single-shell models at 350 and 400 km are studied along with a new three-shell model and compared with the raytrace method for computation time and accuracy. We have compared the Faraday results obtained with climatological (International Reference Ionosphere and RIBG) and physics-based (Global Assimilation of Ionospheric Measurements) ionospheric models. We also study the impact of limitations in the availability of GPS data on the accuracy of the Faraday rotation calculations.
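Under a thin-shell approximation the rotation has a closed form, which is what the shell models evaluate; the raytrace approach instead integrates the electron density and the field-aligned magnetic component along the actual path. A sketch of the shell calculation follows (the leading constant is the standard SI textbook value for ionospheric Faraday rotation; the example numbers are illustrative, not WindSat calibration data).

```python
def faraday_rotation_rad(stec, b_parallel, freq_hz):
    """Single-shell Faraday rotation angle in radians.

    stec:       slant total electron content, electrons/m^2
    b_parallel: geomagnetic field component along the raypath at the
                shell height (e.g., 350 or 400 km), in tesla
    freq_hz:    wave frequency in Hz
    """
    K = 2.36e4  # standard SI constant for ionospheric Faraday rotation
    return K * b_parallel * stec / freq_hz ** 2

# Example: 50 TECU slant content, 4.5e-5 T along the path, 10.7 GHz channel
omega = faraday_rotation_rad(50 * 1e16, 4.5e-5, 10.7e9)  # ~4.6e-3 rad (~0.27 deg)
```

The 1/f² dependence explains why the correction matters most for the lower-frequency channels.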
Yao, Y. X.; Liu, J.; Liu, C.; Lu, W. C.; Wang, C. Z.; Ho, K. M.
2015-08-28
We present an efficient method for calculating the electronic structure and total energy of strongly correlated electron systems. The method extends the traditional Gutzwiller approximation for one-particle operators to the evaluation of the expectation values of two-particle operators in the many-electron Hamiltonian. The method is free of adjustable Coulomb parameters, has no double counting issues in the calculation of total energy, and has the correct atomic limit. We demonstrate that the method describes well the bonding and dissociation behaviors of the hydrogen and nitrogen clusters, as well as ammonia composed of hydrogen and nitrogen atoms. We also show that the method can satisfactorily tackle great challenging problems faced by density functional theory recently discussed in the literature. The computational workload of our method is similar to the Hartree-Fock approach while the results are comparable to high-level quantum chemistry calculations.
Efficient and accurate numerical methods for the Klein-Gordon-Schroedinger equations
Bao, Weizhu (E-mail: bao@math.nus.edu.sg); Yang, Li (E-mail: yangli@nus.edu.sg)
2007-08-10
In this paper, we present efficient, unconditionally stable and accurate numerical methods for approximations of the Klein-Gordon-Schroedinger (KGS) equations with/without damping terms. The key features of our methods are based on: (i) the application of a time-splitting spectral discretization for the Schroedinger-type equation in KGS; (ii) the utilization of a Fourier pseudospectral discretization for spatial derivatives in the Klein-Gordon equation in KGS; (iii) the adoption of solving the ordinary differential equations (ODEs) in phase space analytically under appropriately chosen transmission conditions between different time intervals, or applying Crank-Nicolson/leap-frog for the linear/nonlinear terms in the time derivatives. The numerical methods are either explicit or implicit but explicitly solvable, unconditionally stable, and of spectral accuracy in space and second-order accuracy in time. Moreover, they are time reversible and time transverse invariant when there are no damping terms in KGS, conserve (or keep the same decay rate of) the wave energy as in KGS without (or with a linear) damping term, keep the same dynamics of the mean value of the meson field, and give exact results for the plane-wave solution. Extensive numerical tests are presented to confirm the above properties of our numerical methods for KGS. Finally, the methods are applied to study solitary-wave collisions in one dimension (1D), as well as the dynamics of a 2D problem in KGS.
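The first ingredient, time-splitting combined with Fourier pseudospectral differentiation, is easiest to see on the plain Schroedinger equation. A minimal Strang-splitting sketch on a periodic domain follows; the Klein-Gordon coupling and damping terms of the full KGS system are omitted, and all parameters are illustrative.

```python
import numpy as np

def split_step_schrodinger(psi, V, L, dt, n_steps):
    """Strang-split Fourier pseudospectral integrator for
    i d(psi)/dt = -0.5 d^2(psi)/dx^2 + V(x) psi  on a periodic box [0, L)."""
    n = psi.size
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=L / n)   # spectral wavenumbers
    half_v = np.exp(-0.5j * dt * V)                # half-step potential factor
    kinetic = np.exp(-0.5j * dt * k ** 2)          # full-step kinetic factor
    for _ in range(n_steps):
        psi = half_v * psi                         # potential half step
        psi = np.fft.ifft(kinetic * np.fft.fft(psi))  # exact kinetic step
        psi = half_v * psi                         # potential half step
    return psi

# Free Gaussian wave packet on [0, 2*pi): the norm should be conserved
x = np.linspace(0, 2 * np.pi, 256, endpoint=False)
psi0 = np.exp(-10 * (x - np.pi) ** 2).astype(complex)
psi = split_step_schrodinger(psi0, np.zeros_like(x), 2 * np.pi, 1e-3, 1000)
```

Each substep is applied exactly in its natural representation, which is where the unconditional stability and spectral spatial accuracy come from.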
NASA Astrophysics Data System (ADS)
Gish, Moshe; Dafni, Amots; Inbar, Moshe
2011-09-01
Mammalian herbivores eat plants that may also provide food and shelter for insects. The direct trophic effect of the browsing and grazing of mammalian herbivory on insects, which is probably prevalent in terrestrial ecosystems, has been mostly neglected by ecologists. We examined how the aphid Uroleucon sonchi L. deals with the danger of incidental predation by mammalian herbivores. We found that most (76%) of the aphids in a colony survive the ingestion of the plant by a feeding herbivore. They do so by sensing the combination of heat and humidity in the herbivore's breath and immediately dropping off the plant in large numbers. Their ability to sense the herbivore's breath or their tendency to drop off the plant weakens as ambient temperature rises. This could indicate a limitation of the aphids' sensory system or an adaptation that enables them to avoid the hostile conditions on a hot ground. Once on the ground, U. sonchi is highly mobile and capable of locating a new host plant by advancing in a pattern that differs significantly from random movement. The accurate and efficient defense mechanism of U. sonchi emphasizes the significance of incidental predation as a danger to plant-dwelling invertebrates.
CoMOGrad and PHOG: From Computer Vision to Fast and Accurate Protein Tertiary Structure Retrieval
Karim, Rezaul; Aziz, Mohd. Momin Al; Shatabda, Swakkhar; Rahman, M. Sohel; Mia, Md. Abul Kashem; Zaman, Farhana; Rakin, Salman
2015-01-01
The number of entries in a structural database of proteins is increasing day by day. Methods for retrieving protein tertiary structures from such a large database have turned out to be the key to comparative analysis of structures, which plays an important role in understanding proteins and their functions. In this paper, we present fast and accurate methods for the retrieval of proteins having tertiary structures similar to a query protein from a large database. Our proposed methods borrow ideas from the field of computer vision. The speed and accuracy of our methods come from two newly introduced features, the co-occurrence matrix of oriented gradients (CoMOGrad) and the pyramid histogram of oriented gradients (PHOG), and from the use of Euclidean distance as the distance measure. Experimental results clearly indicate the superiority of our approach in both running time and accuracy. Our method is readily available for use from this website: http://research.buet.ac.bd:8080/Comograd/. PMID:26293226
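For readers unfamiliar with these descriptors, the sketch below computes a PHOG-style feature over a 2D array (for proteins, e.g., a residue-residue distance matrix) and ranks database entries by Euclidean distance. It follows the generic computer-vision recipe and is not necessarily identical to the paper's exact CoMOGrad/PHOG definitions; the bin count and pyramid depth are assumptions.

```python
import numpy as np

def phog_like(img, levels=3, bins=8):
    """Pyramid histogram of oriented gradients over a 2D array: at pyramid
    level l the array is split into 2^l x 2^l cells, and each cell gets a
    magnitude-weighted orientation histogram."""
    gy, gx = np.gradient(img.astype(float))
    ang = np.mod(np.arctan2(gy, gx), np.pi)          # unsigned orientation
    mag = np.hypot(gx, gy)
    feats = []
    for lev in range(levels):
        cells = 2 ** lev
        for yi in np.array_split(np.arange(img.shape[0]), cells):
            for xi in np.array_split(np.arange(img.shape[1]), cells):
                a = ang[np.ix_(yi, xi)].ravel()
                m = mag[np.ix_(yi, xi)].ravel()
                h, _ = np.histogram(a, bins=bins, range=(0, np.pi), weights=m)
                feats.append(h / (h.sum() + 1e-12))  # per-cell normalization
    return np.concatenate(feats)

def retrieve(query_feat, db_feats):
    """Rank database entries by Euclidean distance to the query feature."""
    return np.argsort(np.linalg.norm(db_feats - query_feat, axis=1))
```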
NASA Astrophysics Data System (ADS)
Hochlaf, M.; Puzzarini, C.; Senent, M. L.
2015-07-01
We present multi-component computations for rotational constants, vibrational and torsional levels of medium-sized molecules. Through the treatment of two organic sulphur molecules, ethyl mercaptan and dimethyl sulphide, which are relevant for atmospheric and astrophysical media, we point out the outstanding capabilities of the explicitly correlated coupled cluster (CCSD(T)-F12) method in conjunction with the cc-pVTZ-F12 basis set for the accurate prediction of such quantities. Indeed, we show that the CCSD(T)-F12/cc-pVTZ-F12 equilibrium rotational constants are in good agreement with those obtained by means of a composite scheme based on CCSD(T) calculations that accounts for the extrapolation to the complete basis set (CBS) limit and core-correlation effects [CCSD(T)/CBS+CV], thus leading to values of ground-state rotational constants rather close to the corresponding experimental data. For vibrational and torsional levels, our analysis reveals that the anharmonic frequencies derived from CCSD(T)-F12/cc-pVTZ-F12 harmonic frequencies and anharmonic corrections (Δν = ω - ν) at the CCSD/cc-pVTZ level closely agree with experimental results. The pattern of the torsional transitions and the shape of the potential energy surfaces along the torsional modes are also well reproduced using the CCSD(T)-F12/cc-pVTZ-F12 energies. Interestingly, this good accuracy is accompanied by a strong reduction of the computational costs. This makes the procedures proposed here the schemes of choice for effective and accurate prediction of the spectroscopic properties of organic compounds. Finally, popular density functional approaches are compared with the coupled cluster (CC) methodologies in torsional studies. The long-range corrected CAM-B3LYP functional of Handy and co-workers is recommended for large systems.
Enabling fast, stable and accurate peridynamic computations using multi-time-step integration
Lindsay, P.; Parks, M. L.; Prakash, A.
2016-04-13
Peridynamics is a nonlocal extension of classical continuum mechanics that is well-suited for solving problems with discontinuities such as cracks. This paper extends the peridynamic formulation to decompose a problem domain into a number of smaller overlapping subdomains and to enable the use of different time steps in different subdomains. This approach allows regions of interest to be isolated and solved at a small time step for increased accuracy while the rest of the problem domain can be solved at a larger time step for greater computational efficiency. Lastly, performance of the proposed method in terms of stability, accuracy, and computational cost is examined and several numerical examples are presented to corroborate the findings.
Matrix-vector multiplication using digital partitioning for more accurate optical computing
NASA Technical Reports Server (NTRS)
Gary, C. K.
1992-01-01
Digital partitioning offers a flexible means of increasing the accuracy of an optical matrix-vector processor. This algorithm can be implemented with the same architecture required for a purely analog processor, which gives optical matrix-vector processors the ability to perform high-accuracy calculations at speeds comparable with or greater than electronic computers as well as the ability to perform analog operations at a much greater speed. Digital partitioning is compared with digital multiplication by analog convolution, residue number systems, and redundant number representation in terms of the size and the speed required for an equivalent throughput as well as in terms of the hardware requirements. Digital partitioning and digital multiplication by analog convolution are found to be the most efficient algorithms if coding time and hardware are considered, and the architecture for digital partitioning permits the use of analog computations to provide the greatest throughput for a single processor.
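The recombination arithmetic behind digital partitioning is easy to emulate: split each integer operand into low-precision digits, form all digit-pair matrix-vector products (the step an analog optical processor would perform), and recombine the partial results with radix weights. A sketch with an assumed radix and digit count, for nonnegative integer data:

```python
import numpy as np

def partition_digits(v, base=16, n_digits=4):
    """Split nonnegative integers into n_digits base-`base` digits,
    least significant first; each digit fits a small dynamic range."""
    digits = []
    for _ in range(n_digits):
        digits.append(v % base)
        v = v // base
    return digits

def mat_vec_partitioned(M, x, base=16, n_digits=4):
    """Exact integer M @ x assembled from low-precision partial products."""
    Md = partition_digits(M, base, n_digits)
    xd = partition_digits(x, base, n_digits)
    out = 0
    for i, Mi in enumerate(Md):
        for j, xj in enumerate(xd):
            out = out + (base ** (i + j)) * (Mi @ xj)  # recombine partials
    return out

M = np.array([[1234, 56], [789, 1011]], dtype=np.int64)
x = np.array([321, 654], dtype=np.int64)
assert np.array_equal(mat_vec_partitioned(M, x), M @ x)
```

Each partial product involves only small digits, so an analog multiplier of limited precision can evaluate it exactly; the accuracy is recovered in the digital recombination.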
Time-Accurate Computation of Viscous Flow Around Deforming Bodies Using Overset Grids
Fast, P; Henshaw, W D
2001-04-02
Dynamically evolving boundaries and deforming bodies interacting with a flow are commonly encountered in fluid dynamics. However, the numerical simulation of flows with dynamic boundaries is difficult with current methods. We propose a new method for studying such problems. The key idea is to use the overset grid method with a thin, body-fitted grid near the deforming boundary, while using fixed Cartesian grids to cover most of the computational domain. Our approach combines the strengths of earlier moving overset grid methods for rigid body motion, and unstructured grid methods for flow-structure interactions. Large scale deformation of the flow boundaries can be handled without a global regridding, and in a computationally efficient way. In terms of computational cost, even a full overset grid regridding is significantly cheaper than a full regridding of an unstructured grid for the same domain, especially in three dimensions. Numerical studies are used to verify accuracy and convergence of our flow solver. As a computational example, we consider two-dimensional incompressible flow past a flexible filament with prescribed dynamics.
Li, Xiangrui; Lu, Zhong-Lin
2012-01-01
Display systems based on conventional computer graphics cards are capable of generating images with 8-bit gray level resolution. However, most experiments in vision research require displays with more than 12 bits of luminance resolution. Several solutions are available. Bit++ (1) and DataPixx (2) use the Digital Visual Interface (DVI) output from graphics cards and high resolution (14 or 16-bit) digital-to-analog converters to drive analog display devices. The VideoSwitcher (3) described here combines analog video signals from the red and blue channels of graphics cards with different weights using a passive resistor network (4) and an active circuit to deliver identical video signals to the three channels of color monitors. The method provides an inexpensive way to enable high-resolution monochromatic displays using conventional graphics cards and analog monitors. It can also provide trigger signals that can be used to mark stimulus onsets, making it easy to synchronize visual displays with physiological recordings or response time measurements. Although computer keyboards and mice are frequently used in measuring response times (RT), the accuracy of these measurements is quite low. The RTbox is a specialized hardware and software solution for accurate RT measurements. Connected to the host computer through a USB connection, the driver of the RTbox is compatible with all conventional operating systems. It uses a microprocessor and high-resolution clock to record the identities and timing of button events, which are buffered until the host computer retrieves them. The recorded button events are not affected by potential timing uncertainties or biases associated with data transmission and processing in the host computer. The asynchronous storage greatly simplifies the design of user programs. Several methods are available to synchronize the clocks of the RTbox and the host computer. The RTbox can also receive external triggers and be used to measure RT with respect
NASA Astrophysics Data System (ADS)
Sagui, Celeste; Pedersen, Lee G.; Darden, Thomas A.
2004-01-01
The accurate simulation of biologically active macromolecules faces serious limitations that originate in the treatment of electrostatics in the empirical force fields. The current use of "partial charges" is a significant source of errors, since these vary widely with different conformations. By contrast, the molecular electrostatic potential (MEP) obtained through the use of a distributed multipole moment description, has been shown to converge to the quantum MEP outside the van der Waals surface, when higher order multipoles are used. However, in spite of the considerable improvement to the representation of the electronic cloud, higher order multipoles are not part of current classical biomolecular force fields due to the excessive computational cost. In this paper we present an efficient formalism for the treatment of higher order multipoles in Cartesian tensor formalism. The Ewald "direct sum" is evaluated through a McMurchie-Davidson formalism [L. McMurchie and E. Davidson, J. Comput. Phys. 26, 218 (1978)]. The "reciprocal sum" has been implemented in three different ways: using an Ewald scheme, a particle mesh Ewald (PME) method, and a multigrid-based approach. We find that even though the use of the McMurchie-Davidson formalism considerably reduces the cost of the calculation with respect to the standard matrix implementation of multipole interactions, the calculation in direct space remains expensive. When most of the calculation is moved to reciprocal space via the PME method, the cost of a calculation where all multipolar interactions (up to hexadecapole-hexadecapole) are included is only about 8.5 times more expensive than a regular AMBER 7 [D. A. Pearlman et al., Comput. Phys. Commun. 91, 1 (1995)] implementation with only charge-charge interactions. The multigrid implementation is slower but shows very promising results for parallelization. It provides a natural way to interface with continuous, Gaussian-based electrostatics in the future. It is
Increasing computational efficiency of cochlear models using boundary layers
NASA Astrophysics Data System (ADS)
Alkhairy, Samiya A.; Shera, Christopher A.
2015-12-01
Our goal is to develop methods to improve the efficiency of computational models of the cochlea for applications that require the solution accurately only within a basal region of interest, specifically by decreasing the number of spatial sections needed for simulation of the problem with good accuracy. We design algebraic spatial and parametric transformations to computational models of the cochlea. These transformations are applied after the basal region of interest and allow for spatial preservation, driven by the natural characteristics of approximate spatial causality of cochlear models. The project is of foundational nature and hence the goal is to design, characterize and develop an understanding and framework rather than optimization and globalization. Our scope is as follows: designing the transformations; understanding the mechanisms by which computational load is decreased for each transformation; development of performance criteria; characterization of the results of applying each transformation to a specific physical model and discretization and solution schemes. In this manuscript, we introduce one of the proposed methods (complex spatial transformation) for a case study physical model that is a linear, passive, transmission line model in which the various abstraction layers (electric parameters, filter parameters, wave parameters) are clearer than other models. This is conducted in the frequency domain for multiple frequencies using a second order finite difference scheme for discretization and direct elimination for solving the discrete system of equations. The performance is evaluated using two developed simulative criteria for each of the transformations. In conclusion, the developed methods serve to increase efficiency of a computational traveling wave cochlear model when spatial preservation can hold, while maintaining good correspondence with the solution of interest and good accuracy, for applications in which the interest is in the solution
IMPROVING TACONITE PROCESSING PLANT EFFICIENCY BY COMPUTER SIMULATION, Final Report
William M. Bond; Salih Ersayin
2007-03-30
This project involved industrial scale testing of a mineral processing simulator to improve the efficiency of a taconite processing plant, namely the Minorca mine. The Concentrator Modeling Center at the Coleraine Minerals Research Laboratory, University of Minnesota Duluth, enhanced the capabilities of available software, Usim Pac, by developing the mathematical models needed for accurate simulation of taconite plants. This project provided funding for this technology to prove itself in the industrial environment. As the first step, data representing existing plant conditions were collected by sampling and sample analysis. Data were then balanced and provided a basis for assessing the efficiency of individual devices and of the plant as a whole, and also for performing simulations aimed at improving plant efficiency. Performance evaluation served as a guide in developing alternative process strategies for more efficient production. A large number of computer simulations were then performed to quantify the benefits and effects of implementing these alternative schemes. Modification of makeup ball size was selected as the most feasible option for the target performance improvement, combined with replacement of the existing hydrocyclones with more efficient ones. After plant implementation of these modifications, plant sampling surveys were carried out to validate the findings of the simulation-based study. Plant data showed very good agreement with the simulated data, confirming the results of the simulation. After the implementation of the modifications in the plant, several upstream bottlenecks became visible. Despite these bottlenecks limiting full capacity, an energy efficiency improvement of 7% was obtained in the concentrator. Further improvements in energy efficiency are expected in the near future. The success of this project demonstrated the feasibility of the simulation-based approach. Currently, the Center provides simulation-based services to all the iron ore mining companies operating in northern Minnesota.
A hybrid method for efficient and accurate simulations of diffusion compartment imaging signals
NASA Astrophysics Data System (ADS)
Rensonnet, Gaëtan; Jacobs, Damien; Macq, Benoît; Taquet, Maxime
2015-12-01
Diffusion-weighted imaging is sensitive to the movement of water molecules through the tissue microstructure and can therefore be used to gain insight into the tissue cellular architecture. While the diffusion signal arising from simple geometrical microstructure is known analytically, it remains unclear what diffusion signal arises from complex microstructural configurations. Such knowledge is important to design optimal acquisition sequences, to understand the limitations of diffusion-weighted imaging and to validate novel models of the brain microstructure. We present a novel framework for the efficient simulation of high-quality DW-MRI signals based on the hybrid combination of exact analytic expressions in simple geometric compartments such as cylinders and spheres and Monte Carlo simulations in more complex geometries. We validate our approach on synthetic arrangements of parallel cylinders representing the geometry of white matter fascicles, by comparing it to the full Monte Carlo simulations commonly used in the literature. For typical configurations, equal levels of accuracy are obtained with our hybrid method in less than one fifth of the computational time required for Monte Carlo simulations.
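The Monte Carlo half of such a hybrid scheme can be checked against an analytic compartment in the simplest case, free diffusion under a narrow-pulse PGSE encoding, where E(q) = exp(-q²DΔ). The sketch below assumes that signal model and illustrative parameter values; restricted compartments would add reflecting geometry to the walker updates.

    import numpy as np

    def mc_signal(q, D, Delta, n_steps=100, n_walkers=50_000, seed=1):
        """Monte Carlo diffusion-weighted signal for free diffusion:
        Gaussian random-walk steps, then the ensemble average of
        cos(q * net displacement along the gradient direction)."""
        rng = np.random.default_rng(seed)
        dt = Delta / n_steps
        steps = rng.normal(0.0, np.sqrt(2.0 * D * dt), size=(n_walkers, n_steps))
        x = steps.sum(axis=1)
        return np.cos(q * x).mean()

    q, D, Delta = 2.0e5, 2.0e-9, 0.04         # SI units, illustrative values
    print(mc_signal(q, D, Delta), np.exp(-q**2 * D * Delta))

Agreement of the two printed numbers (to Monte Carlo noise) is the kind of validation the hybrid framework performs compartment by compartment.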
Sun, Y. Y.; Kim, Y. H.; Lee, K.; Zhang, S. B.
2008-01-01
Density functional theory (DFT) in the commonly used local density or generalized gradient approximation fails to describe van der Waals (vdW) interactions that are vital to organic, biological, and other molecular systems. Here, we propose a simple, efficient, yet accurate local atomic potential (LAP) approach, named DFT+LAP, for including vdW interactions in the framework of DFT. The LAPs for H, C, N, and O are generated by fitting the DFT+LAP potential energy curves of small molecule dimers to those obtained from coupled cluster calculations with single, double, and perturbatively treated triple excitations, CCSD(T). Excellent transferability of the LAPs is demonstrated by remarkable agreement with the JSCH-2005 benchmark database [P. Jurecka et al. Phys. Chem. Chem. Phys. 8, 1985 (2006)], which provides the interaction energies of CCSD(T) quality for 165 vdW and hydrogen-bonded complexes. For over 100 vdW dominant complexes in this database, our DFT+LAP calculations give a mean absolute deviation from the benchmark results less than 0.5 kcal/mol. The DFT+LAP approach involves no extra computational cost other than standard DFT calculations and no modification of existing DFT codes, which enables straightforward quantum simulations, such as ab initio molecular dynamics, on biomolecular systems, as well as on other organic systems.
Accurate computation and interpretation of spin-dependent properties in metalloproteins
NASA Astrophysics Data System (ADS)
Rodriguez, Jorge
2006-03-01
Nature uses the properties of open-shell transition metal ions to carry out a variety of functions associated with vital life processes. Mononuclear and binuclear iron centers, in particular, are intriguing structural motifs present in many heme and non-heme proteins. Hemerythrin and methane monooxygenase, for example, are members of the latter class whose diiron active sites display magnetic ordering. We have developed a computational protocol based on spin density functional theory (SDFT) to accurately predict physico-chemical parameters of metal sites in proteins and bioinorganic complexes which traditionally had only been determined from experiment. We have used this new methodology to perform a comprehensive study of the electronic structure and magnetic properties of heme and non-heme iron proteins and related model compounds. We have been able to predict with a high degree of accuracy spectroscopic (Mössbauer, EPR, UV-vis, Raman) and magnetization parameters of iron proteins and, at the same time, gained unprecedented microscopic understanding of their physico-chemical properties. Our results have allowed us to establish important correlations between the electronic structure, geometry, spectroscopic data, and biochemical function of heme and non-heme iron proteins.
Aeroacoustic Flow Phenomena Accurately Captured by New Computational Fluid Dynamics Method
NASA Technical Reports Server (NTRS)
Blech, Richard A.
2002-01-01
One of the challenges in the computational fluid dynamics area is the accurate calculation of aeroacoustic phenomena, especially in the presence of shock waves. One such phenomenon is "transonic resonance," where an unsteady shock wave at the throat of a convergent-divergent nozzle results in the emission of acoustic tones. The space-time Conservation-Element and Solution-Element (CE/SE) method developed at the NASA Glenn Research Center can faithfully capture the shock waves, their unsteady motion, and the generated acoustic tones. The CE/SE method is a revolutionary new approach to the numerical modeling of physical phenomena where features with steep gradients (e.g., shock waves, phase transition, etc.) must coexist with those having weaker variations. The CE/SE method does not require the complex interpolation procedures (that allow for the possibility of a shock between grid cells) used by many other methods to transfer information between grid cells. These interpolation procedures can add too much numerical dissipation to the solution process. Thus, while shocks are resolved, weaker waves, such as acoustic waves, are washed out.
Fast and accurate computation of two-dimensional non-separable quadratic-phase integrals.
Koç, Aykut; Ozaktas, Haldun M; Hesselink, Lambertus
2010-06-01
We report a fast and accurate algorithm for numerical computation of two-dimensional non-separable linear canonical transforms (2D-NS-LCTs). Also known as quadratic-phase integrals, this class of integral transforms represents a broad class of optical systems including Fresnel propagation in free space, propagation in graded-index media, passage through thin lenses, and arbitrary concatenations of any number of these, including anamorphic/astigmatic/non-orthogonal cases. The general two-dimensional non-separable case poses several challenges which do not exist in the one-dimensional case and the separable two-dimensional case. The algorithm takes approximately N log N time, where N is the two-dimensional space-bandwidth product of the signal. Our method properly tracks and controls the space-bandwidth products in two dimensions, in order to achieve information theoretically sufficient, but not wastefully redundant, sampling required for the reconstruction of the underlying continuous functions at any stage of the algorithm. Additionally, we provide an alternative definition of general 2D-NS-LCTs that shows its kernel explicitly in terms of its ten parameters, and relate these parameters bidirectionally to conventional ABCD matrix parameters. PMID:20508697
Accurate computation of surface stresses and forces with immersed boundary methods
NASA Astrophysics Data System (ADS)
Goza, Andres; Liska, Sebastian; Morley, Benjamin; Colonius, Tim
2016-09-01
Many immersed boundary methods solve for surface stresses that impose the velocity boundary conditions on an immersed body. These surface stresses may contain spurious oscillations that make them ill-suited for representing the physical surface stresses on the body. Moreover, these inaccurate stresses often lead to unphysical oscillations in the history of integrated surface forces such as the coefficient of lift. While the errors in the surface stresses and forces do not necessarily affect the convergence of the velocity field, it is desirable, especially in fluid-structure interaction problems, to obtain smooth and convergent stress distributions on the surface. To this end, we show that the equation for the surface stresses is an integral equation of the first kind whose ill-posedness is the source of spurious oscillations in the stresses. We also demonstrate that for sufficiently smooth delta functions, the oscillations may be filtered out to obtain physically accurate surface stresses. The filtering is applied as a post-processing procedure, so that the convergence of the velocity field is unaffected. We demonstrate the efficacy of the method by computing stresses and forces that converge to the physical stresses and forces for several test problems.
Accurate and efficient calculation of discrete correlation functions and power spectra
NASA Astrophysics Data System (ADS)
Xu, Y. F.; Liu, J. M.; Zhu, W. D.
2015-07-01
Operational modal analysis (OMA), or output-only modal analysis, has been widely conducted, especially when the excitation applied to a structure is unknown or difficult to measure. Discrete cross-correlation functions and cross-power spectra between a reference data series and measured response data series are the bases for OMA to identify modal properties of a structure. Such functions and spectra can be efficiently transformed into each other using the discrete Fourier transform (DFT) and inverse DFT (IDFT) based on the cross-correlation theorem. However, a direct application of the theorem and transforms, including the DFT and IDFT, can yield physically erroneous results, because the DFT periodically extends the finite-length function to be transformed, an assumption that is false most of the time. Padding zero series to the ends of the data series before applying the theorem and transforms can reduce the errors, but the results are still physically erroneous. A new methodology is developed in this work to calculate discrete cross-correlation functions of non-negative time delays and the associated cross-power spectra, referred to as half spectra, for OMA. The methodology can be extended to cross-correlation functions of arbitrary time delays and the associated cross-power spectra, referred to as full spectra. The new methodology is computationally efficient due to its use of the transforms. Data series are properly processed to avoid the errors caused by the periodic extension, and the resulting cross-correlation functions and associated cross-power spectra comply exactly with their definitions. A coherence function, a convergence function, and a convergence index are introduced to evaluate the qualities of measured cross-correlation functions and associated cross-power spectra. The new methodology was numerically and experimentally applied to an ideal two-degree-of-freedom (2-DOF) mass-spring-damper system and a damaged aluminum beam, respectively, and OMA was conducted using half spectra to estimate modal properties of the structures.
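For comparison, the textbook zero-padded estimate that the above methodology improves upon fits in a few lines: padding to at least 2N-1 samples removes the circular wrap-around of the plain DFT, and dividing each lag by its overlap count removes the bias. This is only the baseline, not the authors' full processing chain.

    import numpy as np

    def linear_xcorr(x, y):
        """Cross-correlation R_xy[m], non-negative lags m, via the FFT."""
        n = len(x)
        nfft = 1 << (2 * n - 1).bit_length()   # power of two >= 2n - 1
        X = np.fft.rfft(x, nfft)
        Y = np.fft.rfft(y, nfft)
        r = np.fft.irfft(np.conj(X) * Y, nfft)[:n]
        return r / (n - np.arange(n))          # unbiased: divide by overlap

    rng = np.random.default_rng(7)
    pulse = rng.normal(size=50)
    x = np.zeros(1000)
    x[100:150] = pulse
    y = np.zeros(1000)
    y[103:153] = pulse                         # same pulse, delayed 3 samples
    print(np.argmax(linear_xcorr(x, y)))       # -> 3

The half spectrum would then follow from the DFT of the non-negative-lag sequence.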
Efficient Universal Computing Architectures for Decoding Neural Activity
Rapoport, Benjamin I.; Turicchia, Lorenzo; Wattanapanitch, Woradorn; Davidson, Thomas J.; Sarpeshkar, Rahul
2012-01-01
The ability to decode neural activity into meaningful control signals for prosthetic devices is critical to the development of clinically useful brain–machine interfaces (BMIs). Such systems require input from tens to hundreds of brain-implanted recording electrodes in order to deliver robust and accurate performance; in serving that primary function they should also minimize power dissipation in order to avoid damaging neural tissue; and they should transmit data wirelessly in order to minimize the risk of infection associated with chronic, transcutaneous implants. Electronic architectures for brain–machine interfaces must therefore minimize size and power consumption, while maximizing the ability to compress data to be transmitted over limited-bandwidth wireless channels. Here we present a system of extremely low computational complexity, designed for real-time decoding of neural signals, and suited for highly scalable implantable systems. Our programmable architecture is an explicit implementation of a universal computing machine emulating the dynamics of a network of integrate-and-fire neurons; it requires no arithmetic operations except for counting, and decodes neural signals using only computationally inexpensive logic operations. The simplicity of this architecture does not compromise its ability to compress raw neural data by large factors. We describe a set of decoding algorithms based on this computational architecture, one designed to operate within an implanted system, minimizing its power consumption and data transmission bandwidth; and a complementary set of algorithms for learning, programming the decoder, and postprocessing the decoded output, designed to operate in an external, nonimplanted unit. The implementation of the implantable portion is estimated to require fewer than 5000 operations per second. A proof-of-concept, 32-channel field-programmable gate array (FPGA) implementation of this portion is consequently energy efficient.
Efficient computation of spontaneous emission dynamics in arbitrary photonic structures
NASA Astrophysics Data System (ADS)
Teimourpour, M. H.; El-Ganainy, R.
2015-12-01
Defining a quantum mechanical wavefunction for photons is one of the remaining open problems in quantum physics. Thus quantum states of light are usually treated within the realm of second quantization. Consequently, spontaneous emission (SE) in arbitrary photonic media is often described by Fock space Hamiltonians. Here, we present a real space formulation of the SE process that can capture the physics of the problem accurately under different coupling conditions. Starting from first principles, we map the unitary evolution of a dressed two-level quantum emitter onto the problem of electromagnetic radiation from a self-interacting complex harmonic oscillator. Our formalism naturally leads to an efficient computational scheme of SE dynamics using finite difference time domain method without the need for calculating the photonic eigenmodes of the surrounding environment. In contrast to earlier investigations, our computational framework provides a unified numerical treatment for both weak and strong coupling regimes alike. We illustrate the versatility of our scheme by considering several different examples.
Efficient Computation Of Behavior Of Aircraft Tires
NASA Technical Reports Server (NTRS)
Tanner, John A.; Noor, Ahmed K.; Andersen, Carl M.
1989-01-01
NASA technical paper discusses challenging application of computational structural mechanics to numerical simulation of responses of aircraft tires during taxiing, takeoff, and landing. Presents details of three main elements of computational strategy: use of special three-field, mixed-finite-element models; use of operator splitting; and application of technique reducing substantially number of degrees of freedom. Proposed computational strategy applied to two quasi-symmetric problems: linear analysis of anisotropic tires through use of two-dimensional-shell finite elements and nonlinear analysis of orthotropic tires subjected to unsymmetric loading. Three basic types of symmetry and combinations exhibited by response of tire identified.
NASA Astrophysics Data System (ADS)
Toyokuni, Genti; Takenaka, Hiroshi
2012-06-01
We propose a method for modeling global seismic wave propagation through an attenuative Earth model including the center. This method enables accurate and efficient computations since it is based on the 2.5-D approach, which solves wave equations only on a 2-D cross section of the whole Earth and can correctly model 3-D geometrical spreading. We extend a numerical scheme for elastic waves in spherical coordinates using the finite-difference method (FDM) to solve the viscoelastodynamic equation. For computation of realistic seismic wave propagation, incorporation of anelastic attenuation is crucial. Since the nature of Earth material is both elastic solid and viscous fluid, we should solve the stress-strain relations of viscoelastic material, including attenuative structures. These relations represent the stress as a convolution integral in time, which has made viscoelasticity difficult to treat in time-domain computations such as the FDM. However, a method using so-called memory variables, invented in the 1980s and subsequently improved in Cartesian coordinates, is now available. Arbitrary values of the quality factor (Q) can be incorporated into the wave equation via an array of Zener bodies. We also introduce the multi-domain, an FD grid of several layers with different grid spacings, into our FDM scheme. This allows wider lateral grid spacings with depth, so as not to violate the FD stability criterion near the Earth's center. In addition, we propose a technique to avoid the singularity problem of the wave equation in spherical coordinates at the Earth's center. We develop a scheme to calculate wavefield variables at this point, based on linear interpolation for the velocity-stress, staggered-grid FDM. This scheme is validated through a comparison of synthetic seismograms with those obtained by the Direct Solution Method for a spherically symmetric Earth model, showing excellent accuracy for our FDM scheme. As a numerical example, we apply the method to simulate global seismic wave propagation.
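The memory-variable device can be illustrated in one space dimension. The sketch below is a Cartesian toy version (a single Zener mechanism on a second-order staggered grid), not the spherical-coordinate scheme itself; the relaxation times te, ts and material constants are illustrative and chosen to satisfy the explicit stability limits.

    import numpy as np

    def viscoelastic_1d(nx=400, nt=800, dx=10.0, dt=1.0e-3,
                        rho=2500.0, MR=1.0e10, te=7.0e-3, ts=6.0e-3):
        """1-D velocity-stress FDM with one Zener-body memory variable r;
        r removes the time convolution from the stress-strain relation."""
        v = np.zeros(nx)          # velocity, integer grid points
        s = np.zeros(nx - 1)      # stress, half grid points
        r = np.zeros(nx - 1)      # memory variable, colocated with stress
        for it in range(nt):
            v[1:-1] += dt / rho * (s[1:] - s[:-1]) / dx
            v[nx // 2] += np.exp(-((it * dt - 0.1) / 0.02) ** 2)   # source
            dvdx = (v[1:] - v[:-1]) / dx
            r += dt * (-(r + MR * (te / ts - 1.0) * dvdx) / ts)
            s += dt * (MR * (te / ts) * dvdx + r)
        return v

    print(np.abs(viscoelastic_1d()).max())

One extra array and one extra update per stress point buy arbitrary Q profiles when several such mechanisms are superposed, which is the role of the Zener-body array mentioned above.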
Efficient Parallel Engineering Computing on Linux Workstations
NASA Technical Reports Server (NTRS)
Lou, John Z.
2010-01-01
A C software module has been developed that creates lightweight processes (LWPs) dynamically to achieve parallel computing performance in a variety of engineering simulation and analysis applications to support NASA and DoD project tasks. The required interface between the module and the application it supports is simple, minimal and almost completely transparent to the user applications, and it can achieve nearly ideal computing speed-up on multi-CPU engineering workstations of all operating system platforms. The module can be integrated into an existing application (C, C++, Fortran and others) either as part of a compiled module or as a dynamically linked library (DLL).
NASA Technical Reports Server (NTRS)
Emmons, Louisa; De Zafra, Robert
1991-01-01
A simple method for milling accurate off-axis parabolic mirrors with a computer-controlled milling machine is discussed. For machines with a built-in circle-cutting routine, an exact paraboloid can be milled with few computer commands and without the use of the spherical or linear approximations. The proposed method can be adapted easily to cut off-axis sections of elliptical or spherical mirrors.
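The geometry behind the method: every horizontal cross-section of the paraboloid z = r²/(4f) is an exact circle of radius √(4fz), so a built-in circle-cutting routine only needs a table of depth/radius pairs, and an off-axis section follows from decentering the blank on the table. A small illustrative generator (units and pass count are arbitrary):

    import numpy as np

    def circle_passes(focal_length, z_max, n_passes):
        """Depth/radius pairs for milling the paraboloid z = r^2 / (4 f)."""
        z = np.linspace(0.0, z_max, n_passes + 1)[1:]
        return list(zip(z, np.sqrt(4.0 * focal_length * z)))

    for z, r in circle_passes(focal_length=200.0, z_max=5.0, n_passes=5):
        print(f"depth {z:5.2f} mm -> circle radius {r:7.2f} mm")

Finer depth increments reduce the residual staircase between passes at the cost of machining time.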
Accurate technique for complete geometric calibration of cone-beam computed tomography systems.
Cho, Youngbin; Moseley, Douglas J; Siewerdsen, Jeffrey H; Jaffray, David A
2005-04-01
Cone-beam computed tomography systems have been developed to provide in situ imaging for the purpose of guiding radiation therapy. Clinical systems based on this approach have been constructed using a clinical linear accelerator (Elekta Synergy RP) and an iso-centric C-arm. Geometric calibration involves the estimation of a set of parameters that describes the geometry of such systems, and is essential for accurate image reconstruction. We have developed a general analytic algorithm and a corresponding calibration phantom for estimating these geometric parameters in cone-beam computed tomography (CT) systems. The performance of the calibration algorithm is evaluated and its application is discussed. The algorithm makes use of a calibration phantom to estimate the geometric parameters of the system. The phantom consists of 24 steel ball bearings (BBs) in a known geometry. Twelve BBs are spaced evenly at 30 deg in each of two plane-parallel circles separated by a given distance along the tube axis. The detector (e.g., a flat panel detector) is assumed to have no spatial distortion. The method estimates geometric parameters including the position of the x-ray source, the position and rotation of the detector, and the gantry angle, and can describe complex source-detector trajectories. The accuracy and sensitivity of the calibration algorithm were analyzed. The calibration algorithm estimates geometric parameters with a level of accuracy high enough that the quality of the CT reconstruction is not degraded by the estimation error. Sensitivity analysis shows an uncertainty of 0.01 deg (around the beam direction) to 0.3 deg (normal to the beam direction) in rotation, and 0.2 mm (orthogonal to the beam direction) to 4.9 mm (along the beam direction) in position for the medical linear accelerator geometry. Experimental measurements using a laboratory-bench cone-beam CT system of known geometry demonstrate the sensitivity of the method in detecting small changes in the imaging geometry.
Efficient Computation Of Manipulator Inertia Matrix
NASA Technical Reports Server (NTRS)
Fijany, Amir; Bejczy, Antal K.
1991-01-01
Improved method for computation of manipulator inertia matrix developed, based on concept of spatial inertia of composite rigid body. Required for implementation of advanced dynamic-control schemes as well as dynamic simulation of manipulator motion. Motivated by increasing demand for fast algorithms to provide real-time control and simulation capability and, particularly, need for faster-than-real-time simulation capability, required in many anticipated space teleoperation applications.
Experimental Realization of High-Efficiency Counterfactual Computation
NASA Astrophysics Data System (ADS)
Kong, Fei; Ju, Chenyong; Huang, Pu; Wang, Pengfei; Kong, Xi; Shi, Fazhan; Jiang, Liang; Du, Jiangfeng
2015-08-01
Counterfactual computation (CFC) exemplifies the fascinating quantum process by which the result of a computation may be learned without actually running the computer. In previous experimental studies, the counterfactual efficiency was limited to below 50%. Here we report an experimental realization of the generalized CFC protocol, in which the counterfactual efficiency can break the 50% limit and even approach unity in principle. The experiment is performed with the spins of a negatively charged nitrogen-vacancy color center in diamond. Taking advantage of the quantum Zeno effect, the computer can remain in the not-running subspace due to frequent projection by the environment, while the computation result can be revealed by the final detection. A counterfactual efficiency of up to 85% has been demonstrated in our experiment, which opens the possibility of many exciting applications of CFC, such as high-efficiency quantum integration and imaging.
NASA Astrophysics Data System (ADS)
Namin, Farhad A.; Yuwen, Yu A.; Liu, Liu; Panaretos, Anastasios H.; Werner, Douglas H.; Mayer, Theresa S.
2016-02-01
In this paper, the scattering properties of two-dimensional quasicrystalline plasmonic lattices are investigated. We combine a newly developed synthesis technique, which allows for accurate fabrication of spherical nanoparticles, with a recently published variation of generalized multiparticle Mie theory to develop the first quantitative model for plasmonic nano-spherical arrays based on quasicrystalline morphologies. In particular, we study the scattering properties of Penrose and Ammann-Beenker gold spherical nanoparticle array lattices. We demonstrate that by using quasicrystalline lattices, one can obtain multi-band or broadband plasmonic resonances which are not possible in periodic structures. Unlike previously published works, our technique provides quantitative results which show excellent agreement with experimental measurements.
Efficient Associative Computation with Discrete Synapses.
Knoblauch, Andreas
2016-01-01
Neural associative networks are a promising computational paradigm for both modeling neural circuits of the brain and implementing associative memory and Hebbian cell assemblies in parallel VLSI or nanoscale hardware. Previous work has extensively investigated synaptic learning in linear models of the Hopfield type and simple nonlinear models of the Steinbuch/Willshaw type. Optimized Hopfield networks of size n can store a large number of about n²/k memories of size k (or associations between them) but require real-valued synapses, which are expensive to implement and can store at most C = 0.72 bits per synapse. Willshaw networks can store a much smaller number of about n²/k² memories but get along with much cheaper binary synapses. Here I present a learning model employing synapses with discrete synaptic weights. For optimal discretization parameters, this model can store, up to a factor ζ close to one, the same number of memories as for optimized Hopfield-type learning, for example, ζ = 0.64 for binary synapses, ζ = 0.88 for 2-bit (4-state) synapses, ζ = 0.96 for 3-bit (8-state) synapses, and ζ > 0.99 for 4-bit (16-state) synapses. The model also provides the theoretical framework to determine optimal discretization parameters for computer implementations or brainlike parallel hardware including structural plasticity. In particular, as recently shown for the Willshaw network, it is possible to store C^I = 1 bit per computer bit and up to C^S = log n bits per nonsilent synapse, whereas the absolute number of stored memories can be much larger than for the Willshaw model. PMID:26599711
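For orientation, the classical binary Willshaw network that the discrete-synapse model above generalizes takes only a few lines; the parameters here are illustrative, and recall thresholds the dendritic sums at the cue activity k.

    import numpy as np

    def store(U):
        """Clipped Hebbian (Willshaw) learning: Boolean OR of outer products."""
        W = np.zeros((U.shape[1], U.shape[1]), dtype=bool)
        for u in U:
            W |= np.outer(u, u).astype(bool)
        return W

    def recall(W, cue, k):
        """Fire units whose dendritic sum reaches the cue activity k."""
        return (W @ cue >= k).astype(int)

    rng = np.random.default_rng(0)
    n, k, m = 1000, 10, 200           # units, active bits per pattern, memories
    U = np.zeros((m, n), dtype=int)
    for row in U:
        row[rng.choice(n, size=k, replace=False)] = 1
    W = store(U)                       # autoassociative storage, binary synapses
    print(bool((recall(W, U[0], k) == U[0]).all()))

The discrete multi-state synapses of the abstract replace the Boolean OR with a bounded weight update and an optimized threshold, which is where the capacity factors ζ quoted above come from.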
Reliability and Efficiency of a DNA-Based Computation
NASA Astrophysics Data System (ADS)
Deaton, R.; Garzon, M.; Murphy, R. C.; Rose, J. A.; Franceschetti, D. R.; Stevens, S. E., Jr.
1998-01-01
DNA-based computing uses the tendency of nucleotide bases to bind (hybridize) in preferred combinations to do computation. Depending on reaction conditions, oligonucleotides can bind despite noncomplementary base pairs. These mismatched hybridizations are a source of false positives and negatives, which limit the efficiency and scalability of DNA-based computing. The ability of specific base sequences to support error-tolerant Adleman-style computation is analyzed, and criteria are proposed to increase reliability and efficiency. A method is given to calculate reaction conditions from estimates of DNA melting.
Efficient computation of parameter confidence intervals
NASA Technical Reports Server (NTRS)
Murphy, Patrick C.
1987-01-01
An important step in system identification of aircraft is the estimation of stability and control derivatives from flight data along with an assessment of parameter accuracy. When the maximum likelihood estimation technique is used, parameter accuracy is commonly assessed by the Cramer-Rao lower bound. It is known, however, that in some cases the lower bound can be substantially different from the parameter variance. Under these circumstances the Cramer-Rao bounds may be misleading as an accuracy measure. This paper discusses the confidence interval estimation problem based on likelihood ratios, which offers a more general estimate of the error bounds. Four approaches are considered for computing confidence intervals of maximum likelihood parameter estimates. Each approach is applied to real flight data and compared.
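In code, the likelihood-ratio interval amounts to locating the two points where the log-likelihood has fallen by half the appropriate chi-square quantile from its maximum. A minimal sketch for a scalar parameter, checked on a Gaussian-mean problem where the exact answer is mean ± 1.96·σ/√n:

    import numpy as np
    from scipy.optimize import brentq
    from scipy.stats import chi2

    def likelihood_ratio_ci(loglik, theta_hat, span, level=0.95):
        """Solve 2*(l(theta_hat) - l(theta)) = chi2 quantile on each side;
        span is a bracket width assumed to contain each endpoint."""
        drop = 0.5 * chi2.ppf(level, df=1)
        f = lambda t: loglik(theta_hat) - loglik(t) - drop
        return (brentq(f, theta_hat - span, theta_hat),
                brentq(f, theta_hat, theta_hat + span))

    rng = np.random.default_rng(2)
    data = rng.normal(1.0, 2.0, size=100)
    loglik = lambda mu: -0.5 * np.sum((data - mu) ** 2) / 2.0 ** 2
    print(likelihood_ratio_ci(loglik, data.mean(), span=1.0))

For flight-data problems the same recipe would be applied to the profile log-likelihood of each stability or control derivative.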
Efficient tree codes on SIMD computer architectures
NASA Astrophysics Data System (ADS)
Olson, Kevin M.
1996-11-01
This paper describes changes made to a previous implementation of an N-body tree code developed for a fine-grained, SIMD computer architecture. These changes include (1) switching from a balanced binary tree to a balanced oct tree, (2) addition of quadrupole corrections, and (3) having the particles search the tree in groups rather than individually. An algorithm for limiting errors is also discussed. In aggregate, these changes have led to a performance increase of over a factor of 10 compared to the previous code. For problems several times larger than the processor array, the code now achieves performance levels of ~1 Gflop on the Maspar MP-2, or roughly 20% of the quoted peak performance of this machine. This percentage is competitive with other parallel implementations of tree codes on MIMD architectures. This is significant, considering the low relative cost of SIMD architectures.
Efficient algorithm to compute the Berry conductivity
NASA Astrophysics Data System (ADS)
Dauphin, A.; Müller, M.; Martin-Delgado, M. A.
2014-07-01
We propose and construct a numerical algorithm to calculate the Berry conductivity in topological band insulators. The method is applicable to cold atom systems as well as solid state setups, both for the insulating case where the Fermi energy lies in the gap between two bulk bands as well as in the metallic regime. This method interpolates smoothly between both regimes. The algorithm is gauge-invariant by construction, efficient, and yields the Berry conductivity with known and controllable statistical error bars. We apply the algorithm to several paradigmatic models in the field of topological insulators, including Haldane's model on the honeycomb lattice, the multi-band Hofstadter model, and the BHZ model, which describes the 2D spin Hall effect observed in CdTe/HgTe/CdTe quantum well heterostructures.
Fortenberry, Ryan C; Huang, Xinchuan; Schwenke, David W; Lee, Timothy J
2014-02-01
In this work, computational procedures are employed to compute the rotational and rovibrational spectra and line lists for H2O, CO2, and SO2. Building on the established use of quartic force fields, MP2 and CCSD(T) Dipole Moment Surfaces (DMSs) are computed for each system of study in order to produce line intensities as well as the transition energies. The computed results exhibit a clear correlation to reference data available in the HITRAN database. Additionally, even though CCSD(T) DMSs produce intensities in closer agreement with experiment, the use of MP2 DMSs results in reliable line lists that are still comparable to experiment. The use of the less computationally costly MP2 method is beneficial in the study of larger systems, where use of CCSD(T) would be more costly. PMID:23692860
TOPICA: an accurate and efficient numerical tool for analysis and design of ICRF antennas
NASA Astrophysics Data System (ADS)
Lancellotti, V.; Milanesio, D.; Maggiora, R.; Vecchi, G.; Kyrytsya, V.
2006-07-01
The demand for a predictive tool to help in designing ion-cyclotron radio frequency (ICRF) antenna systems for today's fusion experiments has driven the development of codes such as ICANT, RANT3D, and the early development of the TOPICA (TOrino Polytechnic Ion Cyclotron Antenna) code. This paper describes the substantive evolution of the TOPICA formulation and implementation that presently allows it to handle the actual geometry of ICRF antennas (with curved, solid straps, a general-shape housing, Faraday screen, etc) as well as an accurate plasma description, accounting for density and temperature profiles and finite Larmor radius effects. The antenna is assumed to be housed in a recess-like enclosure. Both goals have been attained by formally separating the problem into two parts: the vacuum region around the antenna and the plasma region inside the toroidal chamber. Field continuity and boundary conditions allow the formulation of a set of two coupled integral equations for the unknown equivalent (current) sources; the equations are then reduced to a linear system by a method of moments solution scheme employing 2D finite elements defined over a 3D non-planar surface triangular-cell mesh. In the vacuum region calculations are done in the spatial (configuration) domain, whereas in the plasma region a spectral (wavenumber) representation of fields and currents is adopted, thus permitting a description of the plasma by a surface impedance matrix. Owing to this approach, any plasma model can be used in principle, and at present the FELICE code has been employed. The natural outcomes of TOPICA are the induced currents on the conductors (antenna, housing, etc) and the electric field in front of the plasma, whence the antenna circuit parameters (impedance/scattering matrices), the radiated power and the fields (at locations other than the chamber aperture) are then obtained. An accurate model of the feeding coaxial lines is also included.
An efficient and accurate approach to MTE-MART for time-resolved tomographic PIV
NASA Astrophysics Data System (ADS)
Lynch, K. P.; Scarano, F.
2015-03-01
The motion-tracking-enhanced MART (MTE-MART; Novara et al. in Meas Sci Technol 21:035401, 2010) has demonstrated the potential to increase the accuracy of tomographic PIV by the combined use of a short sequence of non-simultaneous recordings. A clear bottleneck of the MTE-MART technique has been its computational cost. For large datasets comprising time-resolved sequences, MTE-MART becomes unaffordable and has barely been applied, even for the analysis of densely seeded tomographic PIV datasets. A novel implementation is proposed for tomographic PIV image sequences, which strongly reduces the computational burden of MTE-MART, possibly below that of regular MART. The method is a sequential algorithm that produces a time-marching estimation of the object intensity field based on an enhanced guess, which is built upon the object reconstructed at the previous time instant. As the method becomes effective after a number of snapshots (typically 5-10), the sequential MTE-MART (SMTE) is most suited for time-resolved sequences. The computational cost reduction due to SMTE simply stems from the fewer MART iterations required for each time instant. Moreover, the method yields superior reconstruction quality and higher velocity field measurement precision when compared with both MART and MTE-MART. The working principle is assessed in terms of computational effort, reconstruction quality and velocity field accuracy with both synthetic time-resolved tomographic images of a turbulent boundary layer and two experimental databases documented in the literature. The first is the time-resolved data of flow past an airfoil trailing edge used in the study of Novara and Scarano (Exp Fluids 52:1027-1041, 2012); the second is a swirling jet in a water flow. In both cases, the effective elimination of ghost particles is demonstrated in number and intensity within a short temporal transient of 5-10 frames, depending on the seeding density.
Texture functions in image analysis: A computationally efficient solution
NASA Technical Reports Server (NTRS)
Cox, S. C.; Rose, J. F.
1983-01-01
A computationally efficient means for calculating texture measurements from digital images by use of the co-occurrence technique is presented. The calculation of the statistical descriptors of image texture and a solution that circumvents the need for calculating and storing a co-occurrence matrix are discussed. The results show that existing efficient algorithms for calculating sums, sums of squares, and cross products can be used to compute complex co-occurrence relationships directly from the digital image input.
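The trick of bypassing the co-occurrence matrix can be shown for two common texture measures: both reduce to sums, sums of squares, and cross products over neighboring pixel pairs. The displacement and test image below are illustrative.

    import numpy as np

    def texture_descriptors(img, dx=1, dy=0):
        """Contrast and correlation for displacement (dx, dy), accumulated
        directly from pixel pairs; no co-occurrence matrix is stored."""
        a = img[:img.shape[0] - dy, :img.shape[1] - dx].astype(float)
        b = img[dy:, dx:].astype(float)
        contrast = np.mean((a - b) ** 2)
        correlation = np.mean((a - a.mean()) * (b - b.mean())) / (a.std() * b.std())
        return contrast, correlation

    rng = np.random.default_rng(3)
    img = rng.integers(0, 256, size=(128, 128))
    print(texture_descriptors(img))        # near-zero correlation for noise

Descriptors that need the full gray-level distribution (entropy, for example) are the exception; the moment-based ones follow this pattern directly.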
Computationally efficient Bayesian inference for inverse problems.
Marzouk, Youssef M.; Najm, Habib N.; Rahn, Larry A.
2007-10-01
Bayesian statistics provides a foundation for inference from noisy and incomplete data, a natural mechanism for regularization in the form of prior information, and a quantitative assessment of uncertainty in the inferred results. Inverse problems - representing indirect estimation of model parameters, inputs, or structural components - can be fruitfully cast in this framework. Complex and computationally intensive forward models arising in physical applications, however, can render a Bayesian approach prohibitive. This difficulty is compounded by high-dimensional model spaces, as when the unknown is a spatiotemporal field. We present new algorithmic developments for Bayesian inference in this context, showing strong connections with the forward propagation of uncertainty. In particular, we introduce a stochastic spectral formulation that dramatically accelerates the Bayesian solution of inverse problems via rapid evaluation of a surrogate posterior. We also explore dimensionality reduction for the inference of spatiotemporal fields, using truncated spectral representations of Gaussian process priors. These new approaches are demonstrated on scalar transport problems arising in contaminant source inversion and in the inference of inhomogeneous material or transport properties. We also present a Bayesian framework for parameter estimation in stochastic models, where intrinsic stochasticity may be intermingled with observational noise. Evaluation of a likelihood function may not be analytically tractable in these cases, and thus several alternative Markov chain Monte Carlo (MCMC) schemes, operating on the product space of the observations and the parameters, are introduced.
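The surrogate idea can be miniaturized to a dozen lines: fit a cheap approximation to the expensive forward model once, then let MCMC evaluate only the surrogate. The sketch substitutes a 1-D polynomial fit for the stochastic spectral (polynomial chaos) machinery described above; the forward model, data, and prior are all illustrative.

    import numpy as np

    def expensive_forward(theta):
        """Stand-in for a costly forward model (e.g., a PDE solve)."""
        return np.sin(theta) + 0.1 * theta ** 2

    # Offline stage: build the surrogate from a handful of forward solves.
    nodes = np.linspace(-2.0, 2.0, 15)
    coeffs = np.polyfit(nodes, [expensive_forward(t) for t in nodes], deg=8)
    surrogate = lambda t: np.polyval(coeffs, t)

    # Online stage: Metropolis sampling of the surrogate posterior.
    y_obs, sigma = 0.8, 0.05
    logpost = lambda t: (-0.5 * ((y_obs - surrogate(t)) / sigma) ** 2
                         - 0.5 * t ** 2)              # standard normal prior
    rng = np.random.default_rng(4)
    theta, chain = 0.0, []
    for _ in range(20000):
        prop = theta + 0.3 * rng.normal()
        if np.log(rng.uniform()) < logpost(prop) - logpost(theta):
            theta = prop
        chain.append(theta)
    print(np.mean(chain[2000:]), np.std(chain[2000:]))

Every MCMC step costs a polynomial evaluation instead of a forward solve, which is the source of the acceleration reported above.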
Duality quantum computer and the efficient quantum simulations
NASA Astrophysics Data System (ADS)
Wei, Shi-Jie; Long, Gui-Lu
2016-03-01
Duality quantum computing is a new mode of quantum computing in which one simulates a moving quantum computer passing through a multi-slit; it exploits the particle-wave duality property for computing. A quantum computer with n qubits and a qudit simulates a moving quantum computer with n qubits passing through a d-slit. Duality quantum computing can realize an arbitrary sum of unitaries and therefore a general quantum operator, which is called a generalized quantum gate. All linear bounded operators can be realized by the generalized quantum gates, and unitary operators are just the extreme points of the set of generalized quantum gates. Duality quantum computing provides flexibility and a clear physical picture in designing quantum algorithms, and serves as a powerful bridge between quantum and classical algorithms. In this paper, after a brief review of the theory of duality quantum computing, we concentrate on the applications of duality quantum computing to simulations of Hamiltonian systems. We show that duality quantum computing can efficiently simulate quantum systems by providing descriptions of the efficient quantum simulation algorithm of Childs and Wiebe (Quantum Inf Comput 12(11-12):901-924, 2012) for the fast simulation of quantum systems with a sparse Hamiltonian, and the quantum simulation algorithm by Berry et al. (Phys Rev Lett 114:090502, 2015), which provides exponential improvement in precision for simulating systems with a sparse Hamiltonian.
TOPLHA: an accurate and efficient numerical tool for analysis and design of LH antennas
NASA Astrophysics Data System (ADS)
Milanesio, D.; Lancellotti, V.; Meneghini, O.; Maggiora, R.; Vecchi, G.; Bilato, R.
2007-09-01
Auxiliary ICRF heating systems in tokamaks often involve large complex antennas, made up of several conducting straps hosted in distinct cavities that open towards the plasma. The same holds especially true in the LH regime, wherein the antennas consist of arrays of many phased waveguides. Upon observing that the various cavities or waveguides couple to each other only through the EM fields existing over the plasma-facing apertures, we self-consistently formulated the EM problem by a convenient set of multiple coupled integral equations. Subsequent application of the Method of Moments yields a highly sparse algebraic system; therefore formal inversion of the system matrix is not particularly memory demanding, even though the number of unknowns may be quite large (typically 10⁵ or so). The overall strategy has been implemented in an enhanced version of TOPICA (Torino Polytechnic Ion Cyclotron Antenna) and in a newly developed code named TOPLHA (Torino Polytechnic Lower Hybrid Antenna). Both are simulation and prediction tools for plasma-facing antennas that incorporate commercial-grade 3D graphic interfaces along with an accurate description of the plasma. In this work we present the newly proposed formulation along with examples of application to real-life large LH antenna systems.
Chen, Yu-Wen; Tseng, Sheng-Hao
2015-03-01
In general, diffuse reflectance spectroscopy (DRS) systems work with photon diffusion models to determine the absorption coefficient μa and reduced scattering coefficient μs' of turbid samples. However, in some DRS measurement scenarios, such as using short source-detector separations (SDSs) to investigate superficial tissues with comparable μa and μs', photon diffusion models might be invalid or might not have analytical solutions. In this study, a systematic workflow for constructing a rapid, accurate photon transport model that is valid at short SDSs and over a wide range of sample albedo is presented. To create such a model, we first employed a GPU (Graphics Processing Unit) based Monte Carlo model to calculate the reflectance at various sample optical property combinations and established a database at high speed. The database was then utilized to train an artificial neural network (ANN) for determining the sample absorption and reduced scattering coefficients from the reflectance measured at several SDSs, without applying spectral constraints. The robustness of the produced ANN model was rigorously validated. We evaluated the performance of a successfully trained ANN using tissue-simulating phantoms. We also determined the 500-1000 nm absorption and reduced scattering spectra of in-vivo skin using our ANN model and found that the values agree well with those reported in several independent studies. PMID:25798300
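The database-then-network workflow can be emulated end to end with a toy forward model standing in for the GPU Monte Carlo database. The sketch assumes scikit-learn is available and uses a diffusion-like reflectance expression that is not the paper's model; ranges and SDS values are illustrative.

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    def forward_reflectance(mua, musp, sds=(0.5, 1.0, 1.5)):
        """Toy diffusion-like forward model: reflectance at several
        source-detector separations (mm), standing in for Monte Carlo."""
        mueff = np.sqrt(3.0 * mua * (mua + musp))
        return np.array([np.exp(-mueff * r) / r ** 2 for r in sds]).T

    rng = np.random.default_rng(5)
    mua = rng.uniform(0.01, 1.0, 5000)       # 1/mm, illustrative range
    musp = rng.uniform(0.5, 3.0, 5000)       # 1/mm
    X = np.log(forward_reflectance(mua, musp))
    net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=1000, random_state=0)
    net.fit(X, np.column_stack([mua, musp]))  # reflectance -> optical properties
    test = np.log(forward_reflectance(np.array([0.2]), np.array([1.5])))
    print(net.predict(test))                  # should approach [0.2, 1.5]

Replacing the toy forward model with the GPU Monte Carlo reflectance and retraining is the only change the full workflow requires in this picture.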
TTVFast: An efficient and accurate code for transit timing inversion problems
Deck, Katherine M.; Agol, Eric; Holman, Matthew J.; Nesvorný, David
2014-06-01
Transit timing variations (TTVs) have proven to be a powerful technique for confirming Kepler planet candidates, for detecting non-transiting planets, and for constraining the masses and orbital elements of multi-planet systems. These TTV applications often require the numerical integration of orbits for computation of transit times (as well as impact parameters and durations); frequently tens of millions to billions of simulations are required when running statistical analyses of the planetary system properties. We have created a fast code for transit timing computation, TTVFast, which uses a symplectic integrator with a Keplerian interpolator for the calculation of transit times. The speed comes at the expense of accuracy in the calculated times, but the accuracy lost is largely unnecessary, as transit times do not need to be calculated to accuracies significantly smaller than the measurement uncertainties on the times. The time step can be tuned to give sufficient precision for any particular system. We find a speed-up of at least an order of magnitude relative to dynamical integrations with high precision using a Bulirsch-Stoer integrator.
Earthquake detection through computationally efficient similarity search
Yoon, Clara E.; O’Reilly, Ossian; Bergen, Karianne J.; Beroza, Gregory C.
2015-01-01
Seismology is experiencing rapid growth in the quantity of data, which has outpaced the development of processing algorithms. Earthquake detection—identification of seismic events in continuous data—is a fundamental operation for observational seismology. We developed an efficient method to detect earthquakes using waveform similarity that overcomes the disadvantages of existing detection methods. Our method, called Fingerprint And Similarity Thresholding (FAST), can analyze a week of continuous seismic waveform data in less than 2 hours, or 140 times faster than autocorrelation. FAST adapts a data mining algorithm, originally designed to identify similar audio clips within large databases; it first creates compact “fingerprints” of waveforms by extracting key discriminative features, then groups similar fingerprints together within a database to facilitate fast, scalable search for similar fingerprint pairs, and finally generates a list of earthquake detections. FAST detected most (21 of 24) cataloged earthquakes and 68 uncataloged earthquakes in 1 week of continuous data from a station located near the Calaveras Fault in central California, achieving detection performance comparable to that of autocorrelation, with some additional false detections. FAST is expected to realize its full potential when applied to extremely long duration data sets over a distributed network of seismic stations. The widespread application of FAST has the potential to aid in the discovery of unexpected seismic signals, improve seismic monitoring, and promote a greater understanding of a variety of earthquake processes. PMID:26665176
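A toy version of the fingerprint-and-group idea (not FAST's actual wavelet-sign features) fits in a page: each window is reduced to the indices of its two strongest spectral bins, and identical keys are grouped in a hash table, so repeated waveforms are found without all-pairs correlation. As in FAST, chance key collisions among noise windows show up as occasional false pairs.

    import numpy as np
    from collections import defaultdict

    def fingerprints(trace, win=256, step=64):
        """One compact, hashable key per sliding window."""
        out = []
        for i in range(0, len(trace) - win + 1, step):
            mag = np.abs(np.fft.rfft(trace[i:i + win]))[1:]
            out.append((i, tuple(np.sort(np.argsort(mag)[-2:]))))
        return out

    def similar_groups(fps):
        """Hash-bucket grouping replaces O(n^2) pairwise comparison."""
        buckets = defaultdict(list)
        for i, key in fps:
            buckets[key].append(i)
        return buckets

    rng = np.random.default_rng(6)
    trace = rng.normal(size=20000)
    event = 5.0 * np.sin(2 * np.pi * 0.05 * np.arange(256))
    trace[3000:3256] += event                 # same "earthquake" repeats
    trace[12000:12256] += event
    groups = similar_groups(fingerprints(trace))
    print([g for g in groups.values() if any(3000 <= i < 3256 for i in g)])

The printed group should contain window offsets near both 3000 and 12000, that is, the repeated event is detected from fingerprint collisions alone.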
SIESTA-PEXSI: Massively parallel method for efficient and accurate ab initio materials simulation
NASA Astrophysics Data System (ADS)
Lin, Lin; Huhs, Georg; Garcia, Alberto; Yang, Chao
2014-03-01
We describe how to combine the pole expansion and selected inversion (PEXSI) technique with the SIESTA method, which uses numerical atomic orbitals for Kohn-Sham density functional theory (KSDFT) calculations. The PEXSI technique can efficiently utilize the sparsity pattern of the Hamiltonian and overlap matrices generated by codes such as SIESTA, and solves KSDFT without the cubic-scaling matrix diagonalization procedure. The complexity of PEXSI scales at most quadratically with respect to the system size, and the accuracy is comparable to that obtained from full diagonalization. One distinct feature of PEXSI is that it achieves low-order scaling without using the near-sightedness property and can therefore be applied to metals as well as insulators and semiconductors, at room temperature or even lower temperatures. The PEXSI method is highly scalable, and the recently developed massively parallel PEXSI technique can make efficient use of 10,000 to 100,000 processors on high performance machines. We demonstrate the performance of the SIESTA-PEXSI method using several examples of large scale electronic structure calculation, including a long DNA chain and graphene-like structures with more than 20,000 atoms. Funded by a Luis Alvarez fellowship at LBNL, and a DOE SciDAC project in partnership with BES.
NASA Astrophysics Data System (ADS)
Lee, Timothy J.; Huang, Xinchuan; Fortenberry, Ryan C.; Schwenke, David W.
2013-06-01
Theoretical chemists have been computing vibrational and rovibrational spectra of small molecules for more than 40 years, but over the last decade the interest in this application has grown significantly. The increased interest in computing accurate rotational and rovibrational spectra for small molecules could not come at a better time, as NASA and ESA have begun to acquire a mountain of high-resolution spectra from the Herschel mission, and soon will from the SOFIA and JWST missions. In addition, the ground-based telescope, ALMA, has begun to acquire high-resolution spectra in the same time frame. Hence the need for highly accurate line lists for many small molecules, including their minor isotopologues, will only continue to increase. I will present the latest developments from our group on using the "Best Theory + High-Resolution Experimental Data" strategy to compute highly accurate rotational and rovibrational spectra for small molecules, including NH3, CO2, and SO2. I will also present the latest work from our group in producing purely ab initio line lists and spectroscopic constants for small molecules thought to exist in various astrophysical environments, but for which there is either limited or no high-resolution experimental data available. These more limited line lists include purely rotational transitions as well as rovibrational transitions for bands up through a few combination/overtones.
Accurate and efficient fiber optical shape sensor for MRI compatible minimally invasive instruments
NASA Astrophysics Data System (ADS)
van der Heiden, M. S.; Henken, K. R.; Chen, L. K.; van den Bosch, B. G.; van den Braber, R.; Dankelman, J.; van den Dobbelsteen, J.
2012-12-01
Background: Small minimally invasive instruments have limited mechanical stiffness and thus must be treated as flexible instruments. The functional behavior of such instruments can be significantly enhanced when the instrument is equipped with a shape sensor to track its path. MRI-compatible instruments, and thus the corresponding paths, are particularly long; the accuracy requirement on the tip position is therefore stringent. Approach: We have developed and realized a thin fiber Bragg grating (FBG) based fiber-optical shape sensor. The main advantages of this fiber-optical sensor are its minimal dimensions, its intrinsic MRI compatibility, and its ability to sense deformation with sub-microstrain accuracy. The shape sensor consists of three fibers, each equipped with multiple FBGs, which are physically integrated by gluing and can be positioned inside a flexible instrument. In this study a critical-component analysis and a numerical error analysis were performed. To improve performance, a calibration procedure was developed for the shape sensor. Results and Conclusion: With current state-of-the-art interrogators it is possible to measure a local deformation with a triplet of FBG sensors very accurately. At high radii of curvature, the accuracy is dominated by the interrogator, whereas at low radii of curvature, the position of the fibers is the leading error source. The results show that the position error of a single segment of the shape sensor (outer diameter of 220 μm, segment length of 23.5 mm, and minimum bending radius of 30 mm) corresponds to accuracies (3σ) from 100 μm at low radii of curvature down to 8 μm at high radii of curvature.
Fast and accurate determination of the detergent efficiency by optical fiber sensors
NASA Astrophysics Data System (ADS)
Patitsa, Maria; Pfeiffer, Helge; Wevers, Martine
2011-06-01
An optical fiber sensor was developed to control the cleaning efficiency of surfactants. Prior to the measurements, the sensing part of the probe is covered with a uniform standardized soil layer (a lipid multilayer), and a gold mirror is deposited at the end of the optical fiber. The lipid multilayer was deposited on the fiber using the Langmuir-Blodgett technique, and the progress of deposition was followed online by ultraviolet spectroscopy. The invention provides a miniaturized surface plasmon resonance dip-sensor for automated on-line testing that can replace the costly and time-consuming existing methods and enable a breakthrough in detergent testing by combining optical sensing, surface chemistry, and automated data acquisition. The sensor is to be used to evaluate the detergency of different cleaning products and also to indicate how formulation, concentration, lipid nature, and temperature affect the cleaning behavior of a surfactant.
A fourth order accurate finite difference scheme for the computation of elastic waves
NASA Technical Reports Server (NTRS)
Bayliss, A.; Jordan, K. E.; Lemesurier, B. J.; Turkel, E.
1986-01-01
A finite difference scheme for elastic waves is introduced. The model is based on the first-order system of equations for the velocities and stresses. The differencing is fourth-order accurate in the spatial derivatives and second-order accurate in time. The model is tested on a series of examples including the Lamb problem, scattering from plane interfaces, and scattering from a fluid-elastic interface. The scheme is shown to be effective for these problems. The accuracy and stability are insensitive to the Poisson ratio. For the class of problems considered here, it is found that the fourth-order scheme requires only one-half to two-thirds of the resolution of a typical second-order scheme to give comparable accuracy.
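A one-dimensional velocity-stress analogue of such a scheme, fourth-order in space and second-order in time, can be written with the standard staggered-grid weights 9/8 and -1/24; the material constants and source wavelet below are illustrative, chosen to satisfy the stability bound.

    import numpy as np

    C1, C2 = 9.0 / 8.0, -1.0 / 24.0      # standard 4th-order staggered weights

    def d4(f, dx):
        """Fourth-order staggered derivative, located between f[j+1] and f[j+2]."""
        return (C1 * (f[2:-1] - f[1:-2]) + C2 * (f[3:] - f[:-3])) / dx

    def simulate(nx=600, nt=1200, dx=5.0, dt=5.0e-4, rho=2500.0, mu=1.0e10):
        """1-D elastic velocity-stress leapfrog: v on integer points,
        stress s on half points, 4th order in space, 2nd order in time."""
        v, s = np.zeros(nx), np.zeros(nx - 1)
        for it in range(nt):
            v[2:-2] += dt / rho * d4(s, dx)
            v[nx // 2] += np.exp(-((it * dt - 0.06) / 0.015) ** 2)   # source
            s[1:-1] += dt * mu * d4(v, dx)
        return v

    print(np.abs(simulate()).max())

Halving dx with a second-order stencil costs roughly what this wider stencil costs at the original dx, which is the resolution trade-off quantified above.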
NASA Astrophysics Data System (ADS)
Zhang, G.; Burgueño, R.; Elvin, N. G.
2010-02-01
This paper presents an efficient stiffness identification technique for truss structures based on distributed local computation. Sensor nodes on each element are assumed to collect strain data and communicate only with sensors on neighboring elements. This can significantly reduce the energy demand for data transmission and the complexity of transmission protocols, thus enabling a simplified wireless implementation. Element stiffness parameters are identified by simple low-order matrix inversion at a local level, which reduces the computational energy, allows for distributed computation and makes parallel data processing possible. The proposed method also permits addressing the problem of missing data or faulty sensors. Numerical examples, with and without missing data, are presented and the element stiffness parameters are accurately identified. The computational efficiency of the proposed method is n² times higher than that of previously proposed global damage identification methods.
Efficiently modeling neural networks on massively parallel computers
NASA Technical Reports Server (NTRS)
Farber, Robert M.
1993-01-01
Neural networks are a very useful tool for analyzing and modeling complex real world systems. Applying neural network simulations to real world problems generally involves large amounts of data and massive amounts of computation. To efficiently handle the computational requirements of large problems, we have implemented at Los Alamos a highly efficient neural network compiler for serial computers, vector computers, vector parallel computers, and fine grain SIMD computers such as the CM-2 connection machine. This paper describes the mapping used by the compiler to implement feed-forward backpropagation neural networks for a SIMD (Single Instruction Multiple Data) architecture parallel computer. Thinking Machines Corporation has benchmarked our code at 1.3 billion interconnects per second (approximately 3 gigaflops) on a 64,000 processor CM-2 connection machine (Singer 1990). This mapping is applicable to other SIMD computers and can be implemented on MIMD computers such as the CM-5 connection machine. Our mapping has virtually no communications overhead with the exception of the communications required for a global summation across the processors (which has sub-linear runtime growth on the order of O(log(number of processors))). We can efficiently model very large neural networks which have many neurons and interconnects, and our mapping can extend to arbitrarily large networks (within memory limitations) by merging the memory space of separate processors with fast adjacent-processor communications. This paper considers the simulation of only feed-forward neural networks, although the method is extendable to recurrent networks.
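As a hedged sketch of the one communication step the mapping does require, a global summation across P processors can complete in O(log P) pairwise-exchange rounds. The toy below simulates the recursive-doubling pattern in plain Python; names are illustrative and not from the original compiler.

```python
import math

def simulated_recursive_doubling_sum(values):
    """Simulate an O(log P) recursive-doubling all-reduce.

    values: one partial sum per simulated processor; P must be a power of two.
    After round k, each processor holds the sum over its 2**(k+1)-member group.
    """
    p = len(values)
    assert p & (p - 1) == 0, "P must be a power of two in this toy example"
    sums = list(values)
    for k in range(int(math.log2(p))):   # log2(P) communication rounds
        stride = 1 << k
        # each "processor" i exchanges with partner i XOR stride and accumulates
        sums = [s + sums[i ^ stride] for i, s in enumerate(sums)]
    return sums                           # every processor now holds the global sum

print(simulated_recursive_doubling_sum([1, 2, 3, 4]))  # -> [10, 10, 10, 10]
```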
An efficient method for computation of the manipulator inertia matrix
NASA Technical Reports Server (NTRS)
Fijany, Amir; Bejczy, Antal K.
1989-01-01
An efficient method of computation of the manipulator inertia matrix is presented. Using spatial notations, the method leads to the definition of the composite rigid-body spatial inertia, which is a spatial representation of the notion of augmented body. The previously proposed methods, the physical interpretations leading to their derivation, and their redundancies are analyzed. The proposed method achieves a greater efficiency by eliminating the redundancy in the intrinsic equations as well as by a better choice of coordinate frame for their projection. In this case, removing the redundancy leads to greater efficiency of the computation in both serial and parallel senses.
Xu, Jing; Ding, Yunhong; Peucheret, Christophe; Xue, Weiqi; Seoane, Jorge; Zsigri, Beáta; Jeppesen, Palle; Mørk, Jesper
2011-01-01
Although patterning effects (PEs) are known to be a limiting factor of ultrafast photonic switches based on semiconductor optical amplifiers (SOAs), a simple approach for their evaluation in numerical simulations and experiments is missing. In this work, we experimentally investigate and verify a theoretical prediction of the pseudo random binary sequence (PRBS) length needed to capture the full impact of PEs. A wide range of SOAs and operation conditions are investigated. The very simple form of the PRBS length condition highlights the role of two parameters, i.e. the recovery time of the SOAs as well as the operation bit rate. Furthermore, a simple and effective method for probing the maximum PEs is demonstrated, which may relieve the computational effort or the experimental difficulties associated with the use of long PRBSs for the simulation or characterization of SOA-based switches. Good agreement with conventional PRBS characterization is obtained. The method is suitable for quick and systematic estimation and optimization of the switching performance. PMID:21263552
Lippert, Ross A; Predescu, Cristian; Ierardi, Douglas J; Mackenzie, Kenneth M; Eastwood, Michael P; Dror, Ron O; Shaw, David E
2013-10-28
In molecular dynamics simulations, control over temperature and pressure is typically achieved by augmenting the original system with additional dynamical variables to create a thermostat and a barostat, respectively. These variables generally evolve on timescales much longer than those of particle motion, but typical integrator implementations update the additional variables along with the particle positions and momenta at each time step. We present a framework that replaces the traditional integration procedure with separate barostat, thermostat, and Newtonian particle motion updates, allowing thermostat and barostat updates to be applied infrequently. Such infrequent updates provide a particularly substantial performance advantage for simulations parallelized across many computer processors, because thermostat and barostat updates typically require communication among all processors. Infrequent updates can also improve accuracy by alleviating certain sources of error associated with limited-precision arithmetic. In addition, separating the barostat, thermostat, and particle motion update steps reduces certain truncation errors, bringing the time-average pressure closer to its target value. Finally, this framework, which we have implemented on both general-purpose and special-purpose hardware, reduces software complexity and improves software modularity. PMID:24182003
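A minimal toy version of the split-update idea follows (my own sketch on independent harmonic oscillators, not the authors' production integrator; the crude velocity-rescaling "thermostat" merely stands in for the communication-heavy global update):

```python
import numpy as np

rng = np.random.default_rng(0)

def newtonian_step(x, v, dt):
    """Velocity-Verlet step for independent unit-mass harmonic oscillators (k = 1)."""
    v_half = v - 0.5 * dt * x          # acceleration a = -x
    x_new = x + dt * v_half
    v_new = v_half - 0.5 * dt * x_new
    return x_new, v_new

def thermostat_update(v, target_T):
    """Velocity rescaling toward target_T; in a parallel MD code this is the
    step that needs an all-processor reduction for the kinetic temperature."""
    T_inst = np.mean(v ** 2)           # k_B = m = 1, one dimension
    return v * np.sqrt(target_T / T_inst)

x, v = rng.normal(size=1000), rng.normal(size=1000)
dt, n_inner, n_outer, target_T = 0.01, 50, 20, 1.0

for _ in range(n_outer):
    v = thermostat_update(v, target_T)  # infrequent, communication-heavy update
    for _ in range(n_inner):            # frequent, purely local particle updates
        x, v = newtonian_step(x, v, dt)

print("final kinetic temperature:", np.mean(v ** 2))
```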
NASA Astrophysics Data System (ADS)
Wong, Molly; Zhang, Da; Rong, John; Wu, Xizeng; Liu, Hong
2009-10-01
Our goal was to evaluate the error contributed by photon fluence measurements to the detective quantum efficiency (DQE) of an x-ray imaging system. The investigation consisted of separate error analyses for the exposure and spectrum measurements that determine the photon fluence. For each, methods were developed to determine the number of measurements required to achieve an acceptable error. A new method for calculating the magnification factor in the exposure measurements was presented and compared to the existing method. The new method not only produces much lower error at small source-to-image distances (SIDs), such as those of clinical systems, but is also independent of SID. The exposure and spectrum results were combined to determine the photon fluence error contribution to the DQE, which was 4%. The error in this study is small because the measurements resulted from precisely controlled experimental procedures designed to minimize the error. However, these procedures are difficult to follow in clinical environments, and application of this method on clinical systems could therefore provide important insight into error reduction. This investigation focused on the error in the photon fluence contribution to the DQE, but the error analysis method can easily be extended to a wide range of applications.
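For orientation, the standard frequency-dependent DQE definition (a textbook formula, not one quoted from this abstract) shows directly how fluence error propagates:

```latex
\mathrm{DQE}(f) \;=\; \frac{\mathrm{MTF}^{2}(f)}{\bar{q}\,\cdot\,\mathrm{NNPS}(f)},
```

where q̄ is the photon fluence (photons per unit area) and NNPS the normalized noise power spectrum; to first order, a relative error in q̄ therefore produces a relative error of equal magnitude (and opposite sign) in the DQE.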
Cloutier, Barbara C.; Cloutier, Ashley K.; Alocilja, Evangelyn C.
2015-01-01
Food defense requires the means to efficiently screen large volumes of food for microbial pathogens. Even rapid detection methods often require lengthy enrichment steps, making them impractical for this application. There is a great need for rapid, sensitive, specific, and inexpensive methods for extracting and concentrating microbial pathogens from food. In this study, an immuno-magnetic separation (IMS) methodology was developed for Escherichia coli O157:H7, using electrically active magnetic nanoparticles (EAMNPs). The analytical specificity of the IMS method was evaluated against Escherichia coli O55:H7 and Shigella boydii, and was improved over previous protocols by the addition of sodium chloride during the conjugation of antibodies onto the MNPs. The analytical sensitivity of the IMS method was greatest when a high concentration of antibodies (1.0 mg/mL) was present during conjugation. EAMNP concentrations of 1.0 and 0.5 mg/mL provided optimal analytical sensitivity and analytical specificity. The entire IMS procedure requires only 35 min, and antibody-conjugated MNPs show no decline in performance up to 149 days after conjugation. This analytically sensitive and specific extraction protocol has excellent longevity and shows promise as an effective extraction method for multiple electrochemical biosensor applications. PMID:25664527
NASA Technical Reports Server (NTRS)
Goodwin, Sabine A.; Raj, P.
1999-01-01
Progress to date towards the development and validation of a fast, accurate and cost-effective aeroelastic method for advanced parallel computing platforms such as the IBM SP2 and the SGI Origin 2000 is presented in this paper. The ENSAERO code, developed at the NASA-Ames Research Center has been selected for this effort. The code allows for the computation of aeroelastic responses by simultaneously integrating the Euler or Navier-Stokes equations and the modal structural equations of motion. To assess the computational performance and accuracy of the ENSAERO code, this paper reports the results of the Navier-Stokes simulations of the transonic flow over a flexible aeroelastic wing body configuration. In addition, a forced harmonic oscillation analysis in the frequency domain and an analysis in the time domain are done on a wing undergoing a rigid pitch and plunge motion. Finally, to demonstrate the ENSAERO flutter-analysis capability, aeroelastic Euler and Navier-Stokes computations on an L-1011 wind tunnel model including pylon, nacelle and empennage are underway. All computational solutions are compared with experimental data to assess the level of accuracy of ENSAERO. As the computations described above are performed, a meticulous log of computational performance in terms of wall clock time, execution speed, memory and disk storage is kept. Code scalability is also demonstrated by studying the impact of varying the number of processors on computational performance on the IBM SP2 and the Origin 2000 systems.
Bonetto, Paola; Qi, Jinyi; Leahy, Richard M.
1999-10-01
We describe a method for computing linear observer statistics for maximum a posteriori (MAP) reconstructions of PET images. The method is based on a theoretical approximation for the mean and covariance of MAP reconstructions. In particular, we derive here a closed form for the channelized Hotelling observer (CHO) statistic applied to 2D MAP images. We show reasonably good correspondence between these theoretical results and Monte Carlo studies. The accuracy and low computational cost of the approximation allow us to analyze the observer performance over a wide range of operating conditions and parameter settings for the MAP reconstruction algorithm.
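For reference, the channelized Hotelling observer named here is conventionally defined as follows (standard definition, not a formula quoted from the abstract): with channel matrix T, channel outputs v = Tᵀg, class means v̄₀ and v̄₁, and channel covariance K_c,

```latex
t(\mathbf{g}) = \mathbf{w}^{T}\mathbf{T}^{T}\mathbf{g},
\qquad
\mathbf{w} = \mathbf{K}_{c}^{-1}\left(\bar{\mathbf{v}}_{1}-\bar{\mathbf{v}}_{0}\right),
\qquad
\mathrm{SNR}^{2} = \left(\bar{\mathbf{v}}_{1}-\bar{\mathbf{v}}_{0}\right)^{T}\mathbf{K}_{c}^{-1}\left(\bar{\mathbf{v}}_{1}-\bar{\mathbf{v}}_{0}\right).
```

The theoretical approximations for the mean and covariance of the MAP reconstructions supply v̄ and K_c without Monte Carlo sampling, which is what makes the evaluation inexpensive.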
Time-Accurate Computations of Isolated Circular Synthetic Jets in Crossflow
NASA Technical Reports Server (NTRS)
Rumsey, C. L.; Schaeffler, N. W.; Milanovic, I. M.; Zaman, K. B. M. Q.
2007-01-01
Results from unsteady Reynolds-averaged Navier-Stokes computations are described for two different synthetic jet flows issuing into a turbulent boundary layer crossflow through a circular orifice. In one case the jet effect is mostly contained within the boundary layer, while in the other case the jet effect extends beyond the boundary layer edge. Both cases have momentum flux ratios less than 2. Several numerical parameters are investigated, and some lessons learned regarding the CFD methods for computing these types of flow fields are summarized. Results in both cases are compared to experiment.
Time-Accurate Computations of Isolated Circular Synthetic Jets in Crossflow
NASA Technical Reports Server (NTRS)
Rumsey, Christopher L.; Schaeffler, Norman W.; Milanovic, I. M.; Zaman, K. B. M. Q.
2005-01-01
Results from unsteady Reynolds-averaged Navier-Stokes computations are described for two different synthetic jet flows issuing into a turbulent boundary layer crossflow through a circular orifice. In one case the jet effect is mostly contained within the boundary layer, while in the other case the jet effect extends beyond the boundary layer edge. Both cases have momentum flux ratios less than 2. Several numerical parameters are investigated, and some lessons learned regarding the CFD methods for computing these types of flow fields are outlined. Results in both cases are compared to experiment.
Computer subroutine ISUDS accurately solves large system of simultaneous linear algebraic equations
NASA Technical Reports Server (NTRS)
Collier, G.
1967-01-01
A computer program implementing an Iterative Scheme Using a Direct Solution (ISUDS) obtains double-precision accuracy using a single-precision coefficient matrix. ISUDS solves a system of equations written in matrix form as AX = B, where A is a square nonsingular coefficient matrix, X is a vector, and B is a vector.
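The abstract describes what is now called mixed-precision iterative refinement; a minimal NumPy sketch of the general technique (not the original 1967 routine) follows.

```python
import numpy as np

def iterative_refinement(A, b, iters=5):
    """Solve Ax = b to near double precision using single-precision solves.

    The coefficient matrix is held in float32 (in a real implementation its
    LU factors would be computed once and reused); each refinement step
    forms the residual in float64 and solves for a correction.
    """
    A32 = A.astype(np.float32)
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(iters):
        r = b - A @ x                                    # double-precision residual
        d = np.linalg.solve(A32, r.astype(np.float32))   # single-precision correction
        x += d.astype(np.float64)
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 200))
b = rng.standard_normal(200)
x = iterative_refinement(A, b)
print("residual norm:", np.linalg.norm(b - A @ x))
```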
Revisiting the Efficiency of Malicious Two-Party Computation
NASA Astrophysics Data System (ADS)
Woodruff, David P.
In a recent paper Mohassel and Franklin study the efficiency of secure two-party computation in the presence of malicious behavior. Their aim is to make classical solutions to this problem, such as zero-knowledge compilation, more efficient. The authors provide several schemes which are the most efficient to date. We propose a modification to their main scheme using expanders. Our modification asymptotically improves at least one measure of efficiency of all known schemes. We also point out an error, and improve the analysis of one of their schemes.
Accurate Analysis and Computer Aided Design of Microstrip Dual Mode Resonators and Filters.
NASA Astrophysics Data System (ADS)
Grounds, Preston Whitfield, III
1995-01-01
Microstrip structures are of interest due to their many applications in microwave circuit design. Their small size and ease of connection to both passive and active components make them well suited for use in systems where size and space are at a premium. These include satellite communication systems, radar systems, satellite navigation systems, cellular phones and many others; in general, space is at a premium in any mobile system. Microstrip resonators find particular application in oscillators and filters. In typical filters each microstrip patch corresponds to one resonator. However, when dual-mode patches are employed, each patch acts as two resonators and therefore reduces the amount of space required to build the filter. This dissertation focuses on the accurate electromagnetic analysis of the components of planar dual-mode filters. Highly accurate analyses are required so that the resonator-to-resonator coupling and the resonator-to-input/output coupling can be predicted with precision. Hence, filters can be built with a minimum of design iterations and tuning. The analysis used herein is an integral equation formulation in the spectral domain. The analysis is done in the spectral domain since the Green's function can be derived in closed form, and the spatial-domain convolution becomes a simple product. The resulting set of equations is solved using the Method of Moments with Galerkin's procedure. The electromagnetic analysis is applied to a range of problems including unloaded dual-mode patches, dual-mode patches coupled to microstrip feedlines, and complete filter structures. At each step calculated results are compared to measured results and good agreement is found. The calculated results are also compared to results from the circuit analysis program HP EEsof™ and again good agreement is found. A dual-mode elliptic filter is built and good performance is obtained.
NASA Technical Reports Server (NTRS)
Ellison, Donald; Conway, Bruce; Englander, Jacob
2015-01-01
A significant body of work exists showing that providing a nonlinear programming (NLP) solver with expressions for the problem constraint gradient substantially increases the speed of program execution and can also improve the robustness of convergence, especially for local optimizers. Calculation of these derivatives is often accomplished through the computation of the spacecraft's state transition matrix (STM). If the two-body gravitational model is employed, as is often done in the context of preliminary design, closed-form expressions for these derivatives may be provided. If a high-fidelity dynamics model is used, one that might include perturbing forces such as the gravitational effect from multiple third bodies and solar radiation pressure, then these STMs must be computed numerically. We present a method for a dynamics model that includes a power hardware model and a full ephemeris model. An adaptive-step embedded eighth-order Dormand-Prince numerical integrator is discussed, and a method for the computation of the time-of-flight derivatives in this framework is presented. The use of these numerically calculated derivatives offers a substantial improvement over finite differencing in the context of a global optimizer. Specifically, the inclusion of these STMs into the low-thrust mission design tool chain in use at NASA Goddard Space Flight Center allows for an increased preliminary mission design cadence.
Johnson, K.A.; Holman, B.L.; Rosen, T.J.; Nagel, J.S.; English, R.J.; Growdon, J.H.
1990-04-01
To determine the diagnostic accuracy of iofetamine hydrochloride I 123 (IMP) with single photon emission computed tomography in Alzheimer's disease, we studied 58 patients with AD and 15 age-matched healthy control subjects. We used a qualitative method to assess regional IMP uptake in the entire brain and to rate image data sets as normal or abnormal without knowledge of subjects' clinical classification. The sensitivity and specificity of IMP with single photon emission computed tomography in AD were 88% and 87%, respectively. In 15 patients with mild cognitive deficits (Blessed Dementia Scale score, less than or equal to 10), sensitivity was 80%. With the use of a semiquantitative measure of regional cortical IMP uptake, the parietal lobes were the most functionally impaired in AD and the most strongly associated with the patients' Blessed Dementia Scale scores. These results indicated that IMP with single photon emission computed tomography may be a useful adjunct in the clinical diagnosis of AD in early, mild disease.
NASA Astrophysics Data System (ADS)
Zheng, Chang-Jun; Gao, Hai-Feng; Du, Lei; Chen, Hai-Bo; Zhang, Chuanzeng
2016-01-01
An accurate numerical solver is developed in this paper for eigenproblems governed by the Helmholtz equation and formulated through the boundary element method. A contour integral method is used to convert the nonlinear eigenproblem into an ordinary eigenproblem, so that eigenvalues can be extracted accurately by solving a set of standard boundary element systems of equations. In order to accelerate the solution procedure, the parameters affecting the accuracy and efficiency of the method are studied and two contour paths are compared. Moreover, a wideband fast multipole method is implemented with a block IDR(s) solver to reduce the overall solution cost of the boundary element systems of equations with multiple right-hand sides. The Burton-Miller formulation is employed to identify the fictitious eigenfrequencies of the interior acoustic problems with multiply connected domains. The actual effect of the Burton-Miller formulation on tackling the fictitious eigenfrequency problem is investigated and the optimal choice of the coupling parameter as α = i/k is confirmed through exterior sphere examples. Furthermore, the numerical eigenvalues obtained by the developed method are compared with the results obtained by the finite element method to show the accuracy and efficiency of the developed method.
NASA Astrophysics Data System (ADS)
Reinhardt, Colin N.; Ritcey, James A.
2015-09-01
We present a novel method for efficient and physically accurate modeling and simulation of anisoplanatic imaging through the atmosphere; in particular, we present a new space-variant volumetric image blur algorithm. The method is based on the use of physical atmospheric meteorology models, such as vertical turbulence profiles and aerosol/molecular profiles, which can in general be fully spatially varying in three dimensions and also evolving in time. The space-variant modeling method relies on the metadata provided by 3D computer graphics modeling and rendering systems to decompose the image into a set of slices which can be treated in an independent but physically consistent manner, achieving simulated image blur effects that are more accurate and realistic than the homogeneous and stationary blurring methods in common use today. We also present a simple illustrative example of the application of our algorithm, and show that its results and performance are in agreement with the expected relative trends and behavior of the prescribed turbulence-profile physical model used to define the initial spatially varying environmental scenario conditions. We present the details of an efficient Fourier-transform-domain formulation of the space-variant volumetric blur algorithm, detailed pseudocode for the method's implementation, and clarification of some non-obvious technical details.
A scheme for efficient quantum computation with linear optics
NASA Astrophysics Data System (ADS)
Knill, E.; Laflamme, R.; Milburn, G. J.
2001-01-01
Quantum computers promise to increase greatly the efficiency of solving problems such as factoring large integers, combinatorial optimization and quantum physics simulation. One of the greatest challenges now is to implement the basic quantum-computational elements in a physical system and to demonstrate that they can be reliably and scalably controlled. One of the earliest proposals for quantum computation is based on implementing a quantum bit with two optical modes containing one photon. The proposal is appealing because of the ease with which photon interference can be observed. Until now, it suffered from the requirement for non-linear couplings between optical modes containing few photons. Here we show that efficient quantum computation is possible using only beam splitters, phase shifters, single photon sources and photo-detectors. Our methods exploit feedback from photo-detectors and are robust against errors from photon loss and detector inefficiency. The basic elements are accessible to experimental investigation with current technology.
NASA Astrophysics Data System (ADS)
Osei-Kuffuor, Daniel; Fattebert, Jean-Luc
2014-01-01
We present the first truly scalable first-principles molecular dynamics algorithm with O(N) complexity and controllable accuracy, capable of simulating systems with finite band gaps of sizes that were previously impossible with this degree of accuracy. By avoiding global communications, we provide a practical computational scheme capable of extreme scalability. Accuracy is controlled by the mesh spacing of the finite difference discretization, the size of the localization regions in which the electronic wave functions are confined, and a cutoff beyond which the components of the overlap matrix can be omitted when computing selected elements of its inverse. We demonstrate the algorithm's excellent parallel scaling for up to 101 952 atoms on 23 328 processors, with a wall-clock time of the order of 1 min per molecular dynamics time step and numerical error on the forces of less than 7×10⁻⁴ Ha/Bohr.
iTagPlot: an accurate computation and interactive drawing tool for tag density plot
Kim, Sung-Hwan; Ezenwoye, Onyeka; Cho, Hwan-Gue; Robertson, Keith D.; Choi, Jeong-Hyeon
2015-01-01
Motivation: Tag density plots are very important to intuitively reveal biological phenomena from capture-based sequencing data by visualizing the normalized read depth in a region. Results: We have developed iTagPlot to compute tag density across functional features in parallel using multicores and a grid engine and to interactively explore it in a graphical user interface. It allows us to stratify features by defining groups based on biological function and measurement, summary statistics and unsupervised clustering. Availability and implementation: http://sourceforge.net/projects/itagplot/. Contact: jechoi@gru.edu and jeochoi@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25792550
I/O-Efficient Scientific Computation Using TPIE
NASA Technical Reports Server (NTRS)
Vengroff, Darren Erik; Vitter, Jeffrey Scott
1996-01-01
In recent years, input/output (I/O)-efficient algorithms for a wide variety of problems have appeared in the literature. However, systems specifically designed to assist programmers in implementing such algorithms have remained scarce. TPIE is a system designed to support I/O-efficient paradigms for problems from a variety of domains, including computational geometry, graph algorithms, and scientific computation. The TPIE interface frees programmers from having to deal not only with explicit read and write calls, but also the complex memory management that must be performed for I/O-efficient computation. In this paper we discuss applications of TPIE to problems in scientific computation. We discuss algorithmic issues underlying the design and implementation of the relevant components of TPIE and present performance results of programs written to solve a series of benchmark problems using our current TPIE prototype. Some of the benchmarks we present are based on the NAS parallel benchmarks while others are of our own creation. We demonstrate that the central processing unit (CPU) overhead required to manage I/O is small and that even with just a single disk, the I/O overhead of I/O-efficient computation ranges from negligible to the same order of magnitude as CPU time. We conjecture that if we use a number of disks in parallel this overhead can be all but eliminated.
Accurate Experiment to Computation Coupling for Understanding QH-mode physics using NIMROD
NASA Astrophysics Data System (ADS)
King, J. R.; Burrell, K. H.; Garofalo, A. M.; Groebner, R. J.; Hanson, J. D.; Hebert, J. D.; Hudson, S. R.; Pankin, A. Y.; Kruger, S. E.; Snyder, P. B.
2015-11-01
It is desirable to have an ITER H-mode regime that is quiescent to edge-localized modes (ELMs). The quiescent H-mode (QH-mode) with edge harmonic oscillations (EHO) is one such regime. High quality equilibria are essential for accurate EHO simulations with initial-value codes such as NIMROD. We include profiles outside the LCFS which generate associated currents when we solve the Grad-Shafranov equation with open-flux regions using the NIMEQ solver. The new solution is an equilibrium that closely resembles the original reconstruction (which does not contain open-flux currents). This regenerated equilibrium is consistent with the profiles that are measured by the high quality diagnostics on DIII-D. Results from nonlinear NIMROD simulations of the EHO are presented. The full measured rotation profiles are included in the simulation. The simulation develops into a saturated state. The saturation mechanism of the EHO is explored and the simulation is compared to magnetic-coil measurements. This work is currently supported in part by the US DOE Office of Science under awards DE-FC02-04ER54698, DE-AC02-09CH11466 and the SciDAC Center for Extended MHD Modeling.
Equilibrium analysis of the efficiency of an autonomous molecular computer
NASA Astrophysics Data System (ADS)
Rose, John A.; Deaton, Russell J.; Hagiya, Masami; Suyama, Akira
2002-02-01
In the whiplash polymerase chain reaction (WPCR), autonomous molecular computation is implemented in vitro by the recursive, self-directed polymerase extension of a mixture of DNA hairpins. Although computational efficiency is known to be reduced by a tendency for DNAs to self-inhibit by backhybridization, both the magnitude of this effect and its dependence on the reaction conditions have remained open questions. In this paper, the impact of backhybridization on WPCR efficiency is addressed by modeling the recursive extension of each strand as a Markov chain. The extension efficiency per effective polymerase-DNA encounter is then estimated within the framework of a statistical thermodynamic model. Model predictions are shown to provide close agreement with the premature halting of computation reported in a recent in vitro WPCR implementation, a particularly significant result, given that backhybridization had been discounted as the dominant error process. The scaling behavior further indicates completion times to be sufficiently long to render WPCR-based massive parallelism infeasible. A modified architecture, PNA-mediated WPCR (PWPCR) is then proposed in which the occupancy of backhybridized hairpins is reduced by targeted PNA2/DNA triplex formation. The efficiency of PWPCR is discussed using a modified form of the model developed for WPCR. Predictions indicate the PWPCR efficiency is sufficient to allow the implementation of autonomous molecular computation on a massive scale.
Gravitational Focusing and the Computation of an Accurate Moon/Mars Cratering Ratio
NASA Technical Reports Server (NTRS)
Matney, Mark J.
2006-01-01
There have been a number of attempts to use asteroid populations to simultaneously compute cratering rates on the Moon and bodies elsewhere in the Solar System to establish the cratering ratio (e.g., [1],[2]). These works use current asteroid orbit population databases combined with collision rate calculations based on orbit intersections alone. As recent work on meteoroid fluxes [3] has highlighted, however, collision rates alone are insufficient to describe the cratering rates on planetary surfaces - especially planets with stronger gravitational fields than the Moon, such as Earth and Mars. Such calculations also need to include the effects of gravitational focusing, whereby the spatial density of the slower-moving impactors is preferentially "focused" by the gravity of the body. Overall, this leads to higher fluxes and cratering rates, and is highly dependent on the detailed velocity distributions of the impactors. In this paper, a comprehensive gravitational focusing algorithm originally developed to describe fluxes of interplanetary meteoroids [3] is applied to the collision rates and cratering rates of populations of asteroids and long-period comets to compute better cratering ratios for terrestrial bodies in the Solar System. These results are compared to the calculations of other researchers.
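For a sense of the magnitude of the effect, the standard two-body gravitational focusing factor (a textbook result, not a formula quoted from the paper) for impactors arriving with hyperbolic excess speed v∞ at a body with escape speed v_esc is

```latex
\frac{\Phi}{\Phi_{0}} \;=\; 1 + \frac{v_{\mathrm{esc}}^{2}}{v_{\infty}^{2}}.
```

For example, with v∞ = 10 km/s the flux onto Earth (v_esc ≈ 11.2 km/s) is enhanced by a factor of about 2.25, while the Moon (v_esc ≈ 2.4 km/s) gains only about 1.06, illustrating why slow impactor populations shift the cratering ratios between bodies.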
Thermal Conductivities in Solids from First Principles: Accurate Computations and Rapid Estimates
NASA Astrophysics Data System (ADS)
Carbogno, Christian; Scheffler, Matthias
In spite of significant research efforts, a first-principles determination of the thermal conductivity κ at high temperatures has remained elusive. Boltzmann transport techniques that account for anharmonicity perturbatively become inaccurate under such conditions. Ab initio molecular dynamics (MD) techniques using the Green-Kubo (GK) formalism capture the full anharmonicity, but can become prohibitively costly to converge in time and size. We developed a formalism that accelerates such GK simulations by several orders of magnitude and thus enables their application within the limited time and length scales accessible in ab initio MD. For this purpose, we determine the effective harmonic potential occurring during the MD, along with the associated temperature-dependent phonon properties and lifetimes. Interpolation in reciprocal and frequency space then allows extrapolation to the macroscopic scale. For both force-field and ab initio MD, we validate this approach by computing κ for Si and ZrO2, two materials known for their particularly harmonic and anharmonic character. Eventually, we demonstrate how these techniques facilitate reasonable estimates of κ from existing MD calculations at virtually no additional computational cost.
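For reference, the Green-Kubo expression underlying such simulations (standard form, not quoted from the abstract) obtains the thermal conductivity from the heat-flux autocorrelation function:

```latex
\kappa_{\alpha\beta} \;=\; \frac{V}{k_{B}\,T^{2}} \int_{0}^{\infty} \bigl\langle J_{\alpha}(t)\, J_{\beta}(0) \bigr\rangle \, dt,
```

where V is the cell volume and J the heat flux; the slow convergence of this time integral in small ab initio cells is precisely the cost that the described acceleration targets.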
NASA Astrophysics Data System (ADS)
Kees, C. E.; Farthing, M. W.; Terrel, A.; Certik, O.; Seljebotn, D.
2013-12-01
This presentation will focus on two barriers to progress in the hydrological modeling community and on research and development conducted to lessen or eliminate them. The first is a barrier to sharing hydrological models among specialized scientists, caused by the intertwining of the implementation of numerical methods with the implementation of abstract numerical modeling information. In the Proteus toolkit for computational methods and simulation, we have decoupled these two important parts of a computational model through separate "physics" and "numerics" interfaces. More recently we have begun developing the Strong Form Language for easy and direct representation of the mathematical model formulation in a domain-specific language embedded in Python. The second major barrier is sharing ANY scientific software tools that have complex library or module dependencies, as most parallel, multi-physics hydrological models must have. In this setting, users and developers are dependent on an entire distribution, possibly depending on multiple compilers and special instructions specific to the environment of the target machine. To solve these problems we have developed hashdist, a stateless package management tool, and a resulting portable, open-source scientific software distribution.
Time-Accurate Computational Fluid Dynamics Simulation of a Pair of Moving Solid Rocket Boosters
NASA Technical Reports Server (NTRS)
Strutzenberg, Louise L.; Williams, Brandon R.
2011-01-01
Since the Columbia accident, the threat to the Shuttle launch vehicle from debris during the liftoff timeframe has been assessed by the Liftoff Debris Team at NASA/MSFC. In addition to engineering methods of analysis, CFD-generated flow fields during the liftoff timeframe have been used in conjunction with 3-DOF debris transport methods to predict the motion of liftoff debris. Early models made use of a quasi-steady flow field approximation with the vehicle positioned at a fixed location relative to the ground; however, a moving overset mesh capability has recently been developed for the Loci/CHEM CFD software which enables higher-fidelity simulation of the Shuttle transient plume startup and liftoff environment. The present work details the simulation of the launch pad and mobile launch platform (MLP) with truncated solid rocket boosters (SRBs) moving in a prescribed liftoff trajectory derived from Shuttle flight measurements. Using Loci/CHEM, time-accurate RANS and hybrid RANS/LES simulations were performed for the timeframe T0+0 to T0+3.5 seconds, which spans SRB startup to a vehicle altitude of approximately 90 feet above the MLP. Analysis of the transient flowfield focuses on the evolution of the SRB plumes in the MLP plume holes and the flame trench, impingement on the flame deflector, and especially impingement on the MLP deck, which produces upward flow that acts as a transport mechanism for debris. The results show excellent qualitative agreement with the visual record from past Shuttle flights, and comparisons to pressure measurements in the flame trench and on the MLP provide confidence in these simulation capabilities.
A model for the accurate computation of the lateral scattering of protons in water.
Bellinzona, E V; Ciocca, M; Embriaco, A; Ferrari, A; Fontana, A; Mairani, A; Parodi, K; Rotondi, A; Sala, P; Tessonnier, T
2016-02-21
A pencil beam model for the calculation of the lateral scattering in water of protons for any therapeutic energy and depth is presented. It is based on the full Molière theory, taking into account the energy loss and the effects of mixtures and compounds. Concerning the electromagnetic part, the model has no free parameters and is in very good agreement with the FLUKA Monte Carlo (MC) code. The effects of the nuclear interactions are parametrized with a two-parameter tail function, adjusted to MC data calculated with FLUKA. The model, after convolution with the beam and the detector response, is in agreement with recent proton data in water from HIT. The model gives results with the same accuracy as the MC codes based on Molière theory, with a much shorter computing time. PMID:26808380
A model for the accurate computation of the lateral scattering of protons in water
NASA Astrophysics Data System (ADS)
Bellinzona, E. V.; Ciocca, M.; Embriaco, A.; Ferrari, A.; Fontana, A.; Mairani, A.; Parodi, K.; Rotondi, A.; Sala, P.; Tessonnier, T.
2016-02-01
A pencil beam model for the calculation of the lateral scattering in water of protons for any therapeutic energy and depth is presented. It is based on the full Molière theory, taking into account the energy loss and the effects of mixtures and compounds. Concerning the electromagnetic part, the model has no free parameters and is in very good agreement with the FLUKA Monte Carlo (MC) code. The effects of the nuclear interactions are parametrized with a two-parameter tail function, adjusted to MC data calculated with FLUKA. The model, after convolution with the beam and the detector response, is in agreement with recent proton data in water from HIT. The model gives results with the same accuracy as the MC codes based on Molière theory, with a much shorter computing time.
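As a rough companion to the full Molière treatment used here, the Highland parameterization gives a quick Gaussian-core estimate of the multiple-scattering angle. The snippet below is a simplified illustration under that approximation, not the authors' model, and the example numbers are only indicative.

```python
import math

X0_WATER = 36.08  # approximate radiation length of water [cm]

def highland_theta0(p_mev, beta, x_cm, X0=X0_WATER):
    """Highland estimate of the projected multiple-scattering angle [rad].

    p_mev: proton momentum in MeV/c; beta: v/c; x_cm: water path length in cm.
    Captures the Gaussian core only; the full Moliere theory used in the
    paper also describes the single-scattering tails.
    """
    t = x_cm / X0
    return (13.6 / (beta * p_mev)) * math.sqrt(t) * (1 + 0.038 * math.log(t))

# A ~100 MeV therapeutic proton has p ~ 444 MeV/c and beta ~ 0.43
theta0 = highland_theta0(p_mev=444.0, beta=0.43, x_cm=5.0)
print("theta0 ~ %.1f mrad after 5 cm of water" % (1e3 * theta0))
```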
NASA Technical Reports Server (NTRS)
Kemp, James Herbert (Inventor); Talukder, Ashit (Inventor); Lambert, James (Inventor); Lam, Raymond (Inventor)
2008-01-01
A computer-implemented system and method of intra-oral analysis for measuring plaque removal is disclosed. The system includes hardware for real-time image acquisition and software to store the acquired images on a patient-by-patient basis. The system implements algorithms to segment teeth of interest from surrounding gum, and uses a real-time image-based morphing procedure to automatically overlay a grid onto each segmented tooth. Pattern recognition methods are used to classify plaque from surrounding gum and enamel, while ignoring glare effects due to the reflection of camera light and ambient light from enamel regions. The system integrates these components into a single software suite with an easy-to-use graphical user interface (GUI) that allows users to do an end-to-end run of a patient record, including tooth segmentation of all teeth, grid morphing of each segmented tooth, and plaque classification of each tooth image.
Quick, Accurate, Smart: 3D Computer Vision Technology Helps Assessing Confined Animals' Behaviour.
Barnard, Shanis; Calderara, Simone; Pistocchi, Simone; Cucchiara, Rita; Podaliri-Vulpiani, Michele; Messori, Stefano; Ferri, Nicola
2016-01-01
Mankind directly controls the environment and lifestyles of several domestic species for purposes ranging from production and research to conservation and companionship. These environments and lifestyles may not offer these animals the best quality of life. Behaviour is a direct reflection of how the animal is coping with its environment. Behavioural indicators are thus among the preferred parameters to assess welfare. However, behavioural recording (usually from video) can be very time consuming and the accuracy and reliability of the output rely on the experience and background of the observers. The outburst of new video technology and computer image processing gives the basis for promising solutions. In this pilot study, we present a new prototype software able to automatically infer the behaviour of dogs housed in kennels from 3D visual data and through structured machine learning frameworks. Depth information acquired through 3D features, body part detection and training are the key elements that allow the machine to recognise postures, trajectories inside the kennel and patterns of movement that can be later labelled at convenience. The main innovation of the software is its ability to automatically cluster frequently observed temporal patterns of movement without any pre-set ethogram. Conversely, when common patterns are defined through training, a deviation from normal behaviour in time or between individuals could be assessed. The software accuracy in correctly detecting the dogs' behaviour was checked through a validation process. An automatic behaviour recognition system, independent from human subjectivity, could add scientific knowledge on animals' quality of life in confinement as well as saving time and resources. This 3D framework was designed to be invariant to the dog's shape and size and could be extended to farm, laboratory and zoo quadrupeds in artificial housing. The computer vision technique applied to this software is innovative in non
Tiwari, Saumya; Reddy, Vijaya B.; Bhargava, Rohit; Raman, Jaishankar
2015-01-01
Rejection is a common problem after cardiac transplants leading to significant number of adverse events and deaths, particularly in the first year of transplantation. The gold standard to identify rejection is endomyocardial biopsy. This technique is complex, cumbersome and requires a lot of expertise in the correct interpretation of stained biopsy sections. Traditional histopathology cannot be used actively or quickly during cardiac interventions or surgery. Our objective was to develop a stain-less approach using an emerging technology, Fourier transform infrared (FT-IR) spectroscopic imaging to identify different components of cardiac tissue by their chemical and molecular basis aided by computer recognition, rather than by visual examination using optical microscopy. We studied this technique in assessment of cardiac transplant rejection to evaluate efficacy in an example of complex cardiovascular pathology. We recorded data from human cardiac transplant patients’ biopsies, used a Bayesian classification protocol and developed a visualization scheme to observe chemical differences without the need of stains or human supervision. Using receiver operating characteristic curves, we observed probabilities of detection greater than 95% for four out of five histological classes at 10% probability of false alarm at the cellular level while correctly identifying samples with the hallmarks of the immune response in all cases. The efficacy of manual examination can be significantly increased by observing the inherent biochemical changes in tissues, which enables us to achieve greater diagnostic confidence in an automated, label-free manner. We developed a computational pathology system that gives high contrast images and seems superior to traditional staining procedures. This study is a prelude to the development of real time in situ imaging systems, which can assist interventionists and surgeons actively during procedures. PMID:25932912
NASA Astrophysics Data System (ADS)
Osei-Kuffuor, Daniel; Fattebert, Jean-Luc
2014-03-01
We present a truly scalable First-Principles Molecular Dynamics algorithm with O(N) complexity and fully controllable accuracy, capable of simulating systems of sizes that were previously impossible with this degree of accuracy. By avoiding global communication, we have extended W. Kohn's condensed matter "nearsightedness" principle to a practical computational scheme capable of extreme scalability. Accuracy is controlled by the mesh spacing of the finite difference discretization, the size of the localization regions in which the electronic wavefunctions are confined, and a cutoff beyond which the components of the overlap matrix can be omitted when computing selected elements of its inverse. We demonstrate the algorithm's excellent parallel scaling for up to 100,000 atoms on 100,000 processors, with a wall-clock time of the order of one minute per molecular dynamics time step. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.
Popescu-Rohrlich correlations imply efficient instantaneous nonlocal quantum computation
NASA Astrophysics Data System (ADS)
Broadbent, Anne
2016-08-01
In instantaneous nonlocal quantum computation, two parties cooperate in order to perform a quantum computation on their joint inputs, while being restricted to a single round of simultaneous communication. Previous results showed that instantaneous nonlocal quantum computation is possible, at the cost of an amount of prior shared entanglement that is exponential in the size of the input. Here, we show that an amount of entanglement linear in the size of the computation suffices, as long as the parties share nonlocal correlations as given by the Popescu-Rohrlich box. This means that communication is not required for efficient instantaneous nonlocal quantum computation. Exploiting the well-known relation to position-based cryptography, our result also implies the impossibility of secure position-based cryptography against adversaries with nonsignaling correlations. Furthermore, our construction establishes a quantum analog of the classical communication complexity collapse under nonsignaling correlations.
Efficient Turing-Universal Computation with DNA Polymers
NASA Astrophysics Data System (ADS)
Qian, Lulu; Soloveichik, David; Winfree, Erik
Bennett's proposed chemical Turing machine is one of the most important thought experiments in the study of the thermodynamics of computation. Yet the sophistication of molecular engineering required to physically construct Bennett's hypothetical polymer substrate and enzymes has deterred experimental implementations. Here we propose a chemical implementation of stack machines - a Turing-universal model of computation similar to Turing machines - using DNA strand displacement cascades as the underlying chemical primitive. More specifically, the mechanism described herein is the addition and removal of monomers from the end of a DNA polymer, controlled by strand displacement logic. We capture the motivating feature of Bennett's scheme: that physical reversibility corresponds to logically reversible computation, and arbitrarily little energy per computation step is required. Further, as a method of embedding logic control into chemical and biological systems, polymer-based chemical computation is significantly more efficient than geometry-free chemical reaction networks.
Communication-efficient parallel architectures and algorithms for image computations
Alnuweiri, H.M.
1989-01-01
The main purpose of this dissertation is the design of efficient parallel techniques for image computations which require global operations on image pixels, as well as the development of parallel architectures with special communication features which can support global data movement efficiently. The class of image problems considered in this dissertation involves global operations on image pixels, and irregular (data-dependent) data movement operations. Such problems include histogramming, component labeling, proximity computations, computing the Hough Transform, computing convexity of regions and related properties such as computing the diameter and a smallest area enclosing rectangle for each region. Images with multiple figures and multiple labeled-sets of pixels are also considered. Efficient solutions to such problems involve integer sorting, graph theoretic techniques, and techniques from computational geometry. Although such solutions are not computationally intensive (they all require O(n²) operations to be performed on an n × n image), they require global communications. The emphasis here is on developing parallel techniques for data movement, reduction, and distribution, which lead to processor-time optimal solutions for such problems on the proposed organizations. The proposed parallel architectures are based on a memory array which can be viewed as an arrangement of memory modules in a k-dimensional space such that the modules are connected to buses placed parallel to the orthogonal axes of the space, and each bus is connected to one processor or a group of processors. It will be shown that such organizations are communication-efficient and are thus highly suited to the image problems considered here, and also to several other classes of problems. The proposed organizations have p processors and O(n²) words of memory to process n × n images.
NASA Astrophysics Data System (ADS)
Zhang, Bin; Liang, Chunlei
2015-08-01
This paper presents a simple, efficient, and high-order accurate sliding-mesh interface approach to the spectral difference (SD) method. We demonstrate the approach by solving the two-dimensional compressible Navier-Stokes equations on quadrilateral grids. This approach is an extension of the straight mortar method originally designed for stationary domains [7,8]. Our sliding method creates curved dynamic mortars on sliding-mesh interfaces to couple rotating and stationary domains. On the nonconforming sliding-mesh interfaces, the related variables are first projected from cell faces to mortars to compute common fluxes, and then the common fluxes are projected back from the mortars to the cell faces to ensure conservation. To verify the spatial order of accuracy of the sliding-mesh spectral difference (SSD) method, both inviscid and viscous flow cases are tested. It is shown that the SSD method preserves the high-order accuracy of the SD method. Meanwhile, the SSD method is found to be very efficient in terms of computational cost. This novel sliding-mesh interface method is very suitable for parallel processing with domain decomposition. It can be applied to a wide range of problems, such as the hydrodynamics of marine propellers, the aerodynamics of rotorcraft, wind turbines, and oscillating wing power generators, etc.
NASA Astrophysics Data System (ADS)
Maloney, James G.; Smith, Glenn S.; Scott, Waymond R., Jr.
1990-07-01
Two antennas are considered, a cylindrical monopole and a conical monopole. Both are driven through an image plane from a coaxial transmission line. Each of these antennas corresponds to a well-posed theoretical electromagnetic boundary value problem and a realizable experimental model. These antennas are analyzed by a straightforward application of the time-domain finite-difference method. The computed results for these antennas are shown to be in excellent agreement with accurate experimental measurements for both the time domain and the frequency domain. The graphical displays presented for the transient near-zone and far-zone radiation from these antennas provide physical insight into the radiation process.
NASA Astrophysics Data System (ADS)
Meng, Qingyong; Chen, Jun; Zhang, Dong H.
2016-04-01
To compute rate coefficients of the H/D + CH4 → H2/HD + CH3 reactions quickly and accurately, we propose a segmented strategy for fitting a suitable potential energy surface (PES), on which ring-polymer molecular dynamics (RPMD) simulations are performed. On the basis of the recently developed permutation-invariant polynomial neural-network approach [J. Li et al., J. Chem. Phys. 142, 204302 (2015)], PESs in local configuration spaces are constructed. In this strategy, the global PES is divided into three parts, comprising the asymptotic, intermediate, and interaction regions along the reaction coordinate. Since fewer fitting parameters are involved in the local PESs, the computational efficiency of evaluating the PES routine is enhanced by a factor of ~20 compared with that of the global PES. On the interaction part, the RPMD computational time for the transmission coefficient can be further reduced by cutting off the redundant part of the child trajectories. For H + CH4, good agreement is found among the present RPMD rates and those from previous simulations as well as experimental results. For D + CH4, on the other hand, qualitative agreement between the present RPMD and experimental results is predicted.
Cobb, J.W.
1995-02-01
There is an increasing need for more accurate numerical methods for large-scale nonlinear magneto-fluid turbulence calculations. These methods should not only increase the current state of the art in terms of accuracy, but should also continue to optimize other desired properties such as simplicity, minimized computation, minimized memory requirements, and robust stability. This includes the ability to stably solve stiff problems with long time-steps. This work discusses a general methodology for deriving higher-order numerical methods. It also discusses how the selection of various choices can affect the desired properties. The explicit discussion focuses on third-order Runge-Kutta methods, including general solutions and five examples. The study investigates the linear numerical analysis of these methods, including their accuracy, general stability, and stiff stability. Additional appendices discuss linear multistep methods, discuss directions for further work, and exhibit numerical analysis results for some other commonly used lower-order methods.
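As one concrete member of the family analyzed here, the widely used Shu-Osher strong-stability-preserving third-order Runge-Kutta scheme (chosen for illustration; the report's own example schemes may differ) advances du/dt = f(t, u) as follows.

```python
import math

def ssp_rk3_step(f, t, u, dt):
    """One step of the Shu-Osher SSP third-order Runge-Kutta scheme."""
    u1 = u + dt * f(t, u)
    u2 = 0.75 * u + 0.25 * (u1 + dt * f(t + dt, u1))
    return u / 3.0 + (2.0 / 3.0) * (u2 + dt * f(t + 0.5 * dt, u2))

# Convergence check on u' = -u, u(0) = 1, whose exact solution is exp(-t)
u, t, dt = 1.0, 0.0, 0.01
while t < 1.0 - 1e-12:
    u = ssp_rk3_step(lambda s, y: -y, t, u, dt)
    t += dt
print("error at t = 1:", abs(u - math.exp(-1.0)))  # global error ~ dt**3
```

Low-storage variants of such schemes are often preferred in large turbulence codes because they help minimize the memory requirements mentioned above.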
Highly Accurate Frequency Calculations of Crab Cavities Using the VORPAL Computational Framework
Austin, T.M.; Cary, J.R.; Bellantoni, L.; /Argonne
2009-05-01
We have applied the Werner-Cary method [J. Comp. Phys. 227, 5200-5214 (2008)] for extracting modes and mode frequencies from time-domain simulations of crab cavities, as are needed for the ILC and the beam delivery system of the LHC. This method for frequency extraction relies on a small number of simulations and post-processing using the SVD algorithm with Tikhonov regularization. The time-domain simulations were carried out using the VORPAL computational framework, which is based on the eminently scalable finite-difference time-domain algorithm. A validation study was performed on an aluminum model of the 3.9 GHz RF separators built originally at Fermi National Accelerator Laboratory in the US. Comparisons with measurements of the A15 cavity show that this method can provide accuracy to within 0.01% of experimental results after accounting for manufacturing imperfections. To capture the near degeneracies, two simulations, requiring in total a few hours on 600 processors, were employed. This method has applications across many areas, including obtaining MHD spectra from time-domain simulations.
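A generic sketch of the kind of Tikhonov-regularized least-squares solve via the SVD that such post-processing relies on is given below; it is not the Werner-Cary extraction algorithm itself, and the test problem is an arbitrary assumption.

```python
# Generic Tikhonov-regularized least squares via the SVD, the kind of
# post-processing step referred to above; not the Werner-Cary algorithm.
import numpy as np

def tikhonov_svd(A, b, lam):
    """Solve min ||Ax - b||^2 + lam^2 ||x||^2 using the SVD of A.

    Small singular values are damped by s/(s^2 + lam^2) instead of being
    inverted directly, which stabilizes ill-conditioned fits."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    filt = s / (s**2 + lam**2)        # Tikhonov filter factors
    return Vt.T @ (filt * (U.T @ b))

# Example: an ill-conditioned Vandermonde fit that an undamped solve
# would handle poorly in the presence of noise.
x = np.linspace(0, 1, 50)
A = np.vander(x, 12, increasing=True)
b = np.cos(3 * x) + 1e-3 * np.random.default_rng(0).standard_normal(50)
coef = tikhonov_svd(A, b, lam=1e-6)
```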
Put Your Computers in the Most Efficient Environment.
ERIC Educational Resources Information Center
Yeaman, Andrew R. J.
1984-01-01
Discusses factors that should be considered in selecting video display screens and furniture and designing work spaces for computerized instruction that will provide optimal conditions for student health and learning efficiency. Use of work patterns found to be least stressful by computer workers is also suggested. (MBR)
McNeil, Nikki C; Bridges, Robert A; Iannacone, Michael D; Czejdo, Bogdan; Perez, Nicolas E; Goodall, John R
2013-01-01
Public disclosure of important security information, such as knowledge of vulnerabilities or exploits, often occurs in blogs, tweets, mailing lists, and other online sources significantly before proper classification into structured databases. In order to facilitate timely discovery of such knowledge, we propose a novel semi-supervised learning algorithm, PACE, for identifying and classifying relevant entities in text sources. The main contribution of this paper is an enhancement of the traditional bootstrapping method for entity extraction by employing a time-memory trade-off that simultaneously circumvents a costly corpus search while strengthening pattern nomination, which should increase accuracy. An implementation in the cyber-security domain is discussed as well as challenges to Natural Language Processing imposed by the security domain.
A High-Accurate and High-Efficient Monte Carlo Code by Improved Molière Functions with Ionization
NASA Astrophysics Data System (ADS)
Nakatsuka, Takao; Okei, Kazuhide
2003-07-01
Although the Molière theory of multiple Coulomb scattering is less accurate in tracing solid angles than the Goudsmit and Saunderson theory, due to the small-angle approximation, it still plays a very important role in the development of highly efficient simulation codes for relativistic charged particles such as cosmic-ray particles. The Molière expansion is well explained by a physical model, namely a normal distribution attributed to the frequent moderate scatterings, with subsequent correction terms attributed to the additive large-angle scatterings. Based on these physical concepts, we have developed an accurate and highly efficient Monte Carlo code taking ionization loss into account.
2011-01-01
Background Genes of the Major Histocompatibility Complex (MHC) are very popular genetic markers among evolutionary biologists because of their potential role in pathogen confrontation and sexual selection. However, MHC genotyping still remains challenging and time-consuming in spite of substantial methodological advances. Although computational haplotype inference has brought interesting alternatives into focus, high heterozygosity, extensive genetic variation and population admixture are known to cause inaccuracies. We have investigated the role of sample size, genetic polymorphism and genetic structuring on the performance of the popular Bayesian PHASE algorithm. To this end, we took advantage of a large database of known genotypes (obtained using traditional laboratory-based techniques) at single MHC class I (N = 56 individuals and 50 alleles) and MHC class II B (N = 103 individuals and 62 alleles) loci in the lesser kestrel Falco naumanni. Findings Analyses carried out over real MHC genotypes showed that the accuracy of gametic phase reconstruction improved with sample size as a result of the reduction in the allele-to-individual ratio. We then simulated different data sets introducing variations in this parameter to define an optimal ratio. Conclusions Our results demonstrate a critical influence of the allele-to-individual ratio on PHASE performance. We found that a minimum allele-to-individual ratio (1:2) yielded 100% accuracy for both MHC loci. Sampling effort is therefore a crucial step in obtaining reliable MHC haplotype reconstructions and must be scaled according to the degree of MHC polymorphism. We expect our findings to provide a foothold for the design of straightforward and cost-effective genotyping strategies for those MHC loci for which locus-specific primers are available. PMID:21615903
Accurate micro-computed tomography imaging of pore spaces in collagen-based scaffold.
Zidek, Jan; Vojtova, Lucy; Abdel-Mohsen, A M; Chmelik, Jiri; Zikmund, Tomas; Brtnikova, Jana; Jakubicek, Roman; Zubal, Lukas; Jan, Jiri; Kaiser, Jozef
2016-06-01
In this work we have used X-ray micro-computed tomography (μCT) as a method to observe the morphology of 3D porous pure collagen and collagen-composite scaffolds useful in tissue engineering. Two aspects of visualization were taken into consideration: improvement of the scan and investigation of its sensitivity to the scan parameters. Due to the low material density, some parts of collagen scaffolds are invisible in a μCT scan. Therefore, we present different contrast agents, which increase the contrast of the scanned biopolymeric sample for μCT visualization. The increase of contrast of collagenous scaffolds was achieved with ceramic hydroxyapatite microparticles (HAp), silver ions (Ag(+)) and silver nanoparticles (Ag-NPs). Since a relatively small change in imaging parameters (e.g. in 3D volume rendering, threshold value and μCT acquisition conditions) leads to a completely different visualized pattern, we have optimized these parameters to obtain the most realistic picture for visual and qualitative evaluation of the biopolymeric scaffold. Moreover, scaffold images were stereoscopically visualized in order to better see the 3D biopolymer composite scaffold morphology. However, the optimized visualization has some discontinuities in zoomed view, which can be problematic for further analysis of interconnected pores by commonly used numerical methods. Therefore, we applied a locally adaptive method to resolve the discontinuity issue. The combination of contrast agents and imaging techniques presented in this paper helps us to better understand the structure and morphology of biopolymeric scaffolds, which is crucial in the design of new biomaterials useful in tissue engineering. PMID:27153826
2016-01-01
Tumor metastasis is responsible for 1 in 4 deaths in the United States. Though it has been well documented over the past two decades that circulating tumor cells (CTCs) in blood can be used as a biomarker for metastatic cancer, there are enormous challenges in capturing and identifying CTCs with sufficient sensitivity and specificity. Because of the heterogeneous expression of CTC markers, it is now well understood that a single CTC marker is insufficient to capture all CTCs from the blood. Driven by this clear need, this study reports for the first time highly efficient capture and accurate identification of multiple types of CTCs from infected blood using aptamer-modified porous graphene oxide membranes. The results demonstrate that dye-modified S6, A9, and YJ-1 aptamers attached to 20–40 μm porous graphene oxide membranes are capable of capturing multiple types of tumor cells (SKBR3 breast cancer cells, LNCaP prostate cancer cells, and SW-948 colon cancer cells) selectively and simultaneously from infected blood. Our results show that the capture efficiency of the graphene oxide membranes is ∼95% for multiple types of tumor cells at a concentration of 10 cells per milliliter of blood for each tumor type. The selectivity of our assay for capturing targeted tumor cells has been demonstrated using membranes without an antibody. Blood infected with different cells has also been used to demonstrate the targeted tumor cell capturing ability of aptamer-conjugated membranes. Our data also demonstrate that accurate analysis of multiple types of captured CTCs can be performed using multicolor fluorescence imaging. The aptamer-conjugated membranes reported here have good potential for the early diagnosis of diseases that are currently detected by means of cell capture technologies. PMID:25565372
An overview of energy efficiency techniques in cluster computing systems
Valentini, Giorgio Luigi; Lassonde, Walter; Khan, Samee Ullah; Min-Allah, Nasro; Madani, Sajjad A.; Li, Juan; Zhang, Limin; Wang, Lizhe; Ghani, Nasir; Kolodziej, Joanna; Li, Hongxiang; Zomaya, Albert Y.; Xu, Cheng-Zhong; Balaji, Pavan; Vishnu, Abhinav; Pinel, Fredric; Pecero, Johnatan E.; Kliazovich, Dzmitry; Bouvry, Pascal
2011-09-10
Two major constraints demand more consideration for energy efficiency in cluster computing: (a) operational costs, and (b) system reliability. Increasing energy efficiency in cluster systems will reduce energy consumption and excess heat, lower operational costs, and improve system reliability. Based on the energy-power relationship, and the fact that energy consumption can be reduced with strategic power management, we focus in this survey on the characteristics of two main power management technologies: (a) static power management (SPM) systems that utilize low-power components to save energy, and (b) dynamic power management (DPM) systems that utilize software and power-scalable components to optimize energy consumption. We present the current state of the art in both SPM and DPM techniques, citing representative examples. The survey is concluded with a brief discussion and some assumptions about possible future directions that could be explored to improve energy efficiency in cluster computing.
Corzo, H H; Galano, Annia; Dolgounitcheva, O; Zakrzewski, V G; Ortiz, J V
2015-08-20
Two accurate and computationally efficient electron-propagator (EP) methods for calculating the valence, vertical ionization energies (VIEs) of closed-shell molecules have been identified through comparisons with related approximations. VIEs of a representative set of closed-shell molecules were calculated with EP methods using 10 basis sets. The most easily executed method, the diagonal, second-order (D2) EP approximation, produces results that steadily rise as basis sets are improved toward values based on extrapolated coupled-cluster singles and doubles plus perturbative triples calculations, but its mean errors remain unacceptably large. The outer valence Green function, partial third-order and renormalized partial third-order methods (P3+), which employ the diagonal self-energy approximation, produce markedly better results but have a greater tendency to overestimate VIEs with larger basis sets. The best combination of accuracy and efficiency with a diagonal self-energy matrix is the P3+ approximation, which exhibits the best trends with respect to basis-set saturation. Several renormalized methods with more flexible nondiagonal self-energies also have been examined: the two-particle, one-hole Tamm-Dancoff approximation (2ph-TDA), the third-order algebraic diagrammatic construction or ADC(3), the renormalized third-order (3+) method, and the nondiagonal second-order renormalized (NR2) approximation. Like D2, 2ph-TDA produces steady improvements with basis set augmentation, but its average errors are too large. Errors obtained with 3+ and ADC(3) are smaller on average than those of 2ph-TDA. These methods also have a greater tendency to overestimate VIEs with larger basis sets. The smallest average errors occur for the NR2 approximation; these errors decrease steadily with basis augmentations. As basis sets approach saturation, NR2 becomes the most accurate and efficient method with a nondiagonal self-energy. PMID:26226061
A computationally efficient modelling of laminar separation bubbles
NASA Astrophysics Data System (ADS)
Dini, Paolo; Maughmer, Mark D.
1989-02-01
The goal is to accurately predict the characteristics of the laminar separation bubble and its effects on airfoil performance. Toward this end, a computational model of the separation bubble was developed and incorporated into the Eppler and Somers airfoil design and analysis program. Thus far, the focus of the research was limited to the development of a model which can accurately predict situations in which the interaction between the bubble and the inviscid velocity distribution is weak, the so-called short bubble. A summary of the research performed in the past nine months is presented. The bubble model in its present form is then described. Lastly, the performance of this model in predicting bubble characteristics is shown for a few cases.
NASA Astrophysics Data System (ADS)
Wiktor, Julia; Jomard, Gérald; Torrent, Marc
2015-09-01
Many techniques have been developed in the past to compute positron lifetimes in materials from first principles. However, there is still a lack of a fast and accurate self-consistent scheme that can accurately handle the forces acting on the ions induced by the presence of the positron. We show in this paper that we have reached this goal by developing the two-component density functional theory within the projector augmented-wave (PAW) method in the open-source code abinit. This tool offers the accuracy of all-electron methods with the computational efficiency of plane-wave ones. We can thus deal with supercells containing a few hundred to a few thousand atoms to study point defects as well as more extended defect clusters. Moreover, using the PAW basis set allows us to use techniques able to, for instance, treat strongly correlated systems or spin-orbit coupling, which are necessary to study heavy elements such as the actinides and their compounds.
NASA Astrophysics Data System (ADS)
Yi, Sha-Sha; Pan, Cong; Hu, Zhong-Han
2015-12-01
Modern computer simulations of biological systems often involve an explicit treatment of the complex interactions among a large number of molecules. While it is straightforward to compute the short-ranged van der Waals interaction in classical molecular dynamics simulations, it has been a long-standing challenge to develop accurate methods for the long-ranged Coulomb interaction. In this short review, we discuss three types of methodologies for the accurate treatment of electrostatics in simulations of explicit molecules: truncation-type methods, Ewald-type methods, and mean-field-type methods. Throughout the discussion, we briefly review the formulations and development of these methods, emphasize the intrinsic connections among the three types, and focus on the existing problems, which are often associated with the boundary conditions of electrostatics. This brief survey is summarized with a short perspective on future trends in method development and applications in the field of biological simulations. Project supported by the National Natural Science Foundation of China (Grant Nos. 91127015 and 21522304), the Open Project from the State Key Laboratory of Theoretical Physics, and the Innovation Project from the State Key Laboratory of Supramolecular Structure and Materials.
Chavanon, O; Barbe, C; Troccaz, J; Carrat, L; Ribuot, C; Noirclerc, M; Maitrasse, B; Blin, D
1999-06-01
In the field of percutaneous access to soft tissues, our project was to improve classical pericardiocentesis by performing accurate guidance to a selected target, according to a model of the pericardial effusion acquired through three-dimensional (3D) data recording. The required hardware is an echocardiographic device and a needle, both linked to a 3D localizer, and a computer. After acquiring echographic data, a modeling procedure allows definition of the optimal puncture strategy, taking into consideration the mobility of the heart, by determining a region that remains stable throughout the cardiac cycle. A passive guidance system is then used to reach the planned target accurately, generally a site in the middle of the stable region. After validation on a dynamic phantom and a feasibility study in dogs, an accuracy and reliability analysis protocol was carried out on pigs with experimental pericardial effusion. Ten consecutive successful punctures using various trajectories were performed on eight pigs. Nonbloody liquid was collected from pericardial effusions in the stable region (5 to 9 mm wide) within 10 to 15 minutes from echographic acquisition to drainage. Accuracy of at least 2.5 mm was demonstrated. This study demonstrates the feasibility of computer-assisted pericardiocentesis. Beyond the simple improvement of the current technique, this method could be a new way to reach the heart or a new tool for percutaneous access and image-guided puncture of soft tissues. Further investigation will be necessary before routine human application. PMID:10414543
Fragoso, Margarida; Kawrakow, Iwan; Faddegon, Bruce A.; Solberg, Timothy D.; Chetty, Indrin J.
2009-01-01
When DBS was used with electron splitting and combined with augmented charged particle range rejection, a technique recently introduced in BEAMnrc, relative efficiencies were ∼420 (∼253 min on a single processor) and ∼175 (∼58 min on a single processor) for the 10×10 and 40×40 cm2 field sizes, respectively. Calculations of the Siemens Primus treatment head with VMC++ produced relative efficiencies of ∼1400 (∼6 min on a single processor) and ∼60 (∼4 min on a single processor) for the 10×10 and 40×40 cm2 field sizes, respectively. BEAMnrc PHSP calculations with DBS alone or DBS in combination with charged particle range rejection were more efficient than the other efficiency enhancing techniques used. Using VMC++, accurate simulations of the entire linac treatment head were performed within minutes on a single processor. Noteworthy differences (±1%–3%) in the mean energy, planar fluence, and angular and spectral distributions were observed with the NIST bremsstrahlung cross sections compared with those of Bethe–Heitler (BEAMnrc default bremsstrahlung cross section). However, MC calculated dose distributions in water phantoms (using combinations of VRTs∕AEITs and cross-section data) agreed within 2% of measurements. Furthermore, MC calculated dose distributions in a simulated water∕air∕water phantom, using NIST cross sections, were within 2% agreement with the BEAMnrc Bethe–Heitler default case. PMID:20095258
NASA Astrophysics Data System (ADS)
Farah, A.
The ionospheric delay is still one of the largest sources of error affecting the positioning accuracy of any satellite positioning system. Owing to the dispersive nature of the ionosphere, this error can be eliminated by combining simultaneous measurements of signals at two different frequencies, but it remains for single-frequency users. Much effort has been made to establish models for single-frequency users that make this effect as small as possible. These models vary in accuracy, input data and computational complexity, so the choice between the different models depends on the individual circumstances of the user. From the simulation point of view, the model needed should be accurate, with global coverage and a good description of the ionosphere's variation with both time and location. The author reviews some of these established models: the BENT model, the Klobuchar model and the IRI (International Reference Ionosphere) model. The Klobuchar model has long been the most widely used model in this field, owing to its simplicity and low computational cost; any GPS user can find the Klobuchar model's coefficients in the broadcast navigation message. CODE, the Centre for Orbit Determination in Europe, provides a new set of coefficients for the Klobuchar model, which gives more accurate results for the ionospheric delay computation. IGS (International GPS Service) services include providing the GPS community with global ionospheric maps in IONEX format (IONosphere map EXchange format), which enable computation of the ionospheric delay at the desired location and time. The study was undertaken from the GPS-data simulation point of view. The aim was to select a model for the simulation of GPS data that gives a good description of the ionosphere's nature with a high degree of accuracy in computing the ionospheric delay, yielding better-simulated data. A new model was developed by the author based on IGS global ionospheric maps. A comparison
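For reference, the standard broadcast Klobuchar algorithm discussed above can be sketched as follows (after the GPS interface specification); the coefficient values in the example call are placeholders of typical magnitude, not the broadcast or CODE-derived values the abstract refers to.

```python
# Sketch of the standard broadcast Klobuchar algorithm (GPS interface
# specification); alpha/beta are the 8 broadcast coefficients.
import math

def klobuchar_delay(lat, lon, az, el, gps_sec, alpha, beta):
    """Ionospheric delay (seconds, L1) for a single-frequency user.

    lat, lon, az, el are given in semicircles (angle in radians / pi)."""
    psi = 0.0137 / (el + 0.11) - 0.022                  # Earth-centred angle
    phi_i = lat + psi * math.cos(az * math.pi)          # pierce-point latitude
    phi_i = max(-0.416, min(0.416, phi_i))
    lam_i = lon + psi * math.sin(az * math.pi) / math.cos(phi_i * math.pi)
    phi_m = phi_i + 0.064 * math.cos((lam_i - 1.617) * math.pi)  # geomagnetic lat.
    t = (43200.0 * lam_i + gps_sec) % 86400.0           # local time at pierce point
    F = 1.0 + 16.0 * (0.53 - el) ** 3                   # slant (obliquity) factor
    amp = max(0.0, sum(a * phi_m ** n for n, a in enumerate(alpha)))
    per = max(72000.0, sum(b * phi_m ** n for n, b in enumerate(beta)))
    x = 2.0 * math.pi * (t - 50400.0) / per             # phase of the cosine model
    if abs(x) < 1.57:
        return F * (5e-9 + amp * (1.0 - x * x / 2.0 + x ** 4 / 24.0))
    return F * 5e-9                                     # night-time constant term

# Illustrative call (coefficient values are placeholders of typical magnitude):
alpha = (1.2e-8, 7.5e-9, -6.0e-8, -6.0e-8)
beta = (9.0e4, 0.0, -2.0e5, -3.9e5)
d = klobuchar_delay(0.22, -0.55, 0.35, 0.15, 36000.0, alpha, beta)
print(d * 299792458.0, "metres of L1 delay")            # seconds -> metres
```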
On the Use of Electrooculogram for Efficient Human Computer Interfaces
Usakli, A. B.; Gurkan, S.; Aloise, F.; Vecchiato, G.; Babiloni, F.
2010-01-01
The aim of this study is to present electrooculogram (EOG) signals that can be used efficiently for a human-computer interface. Establishing an efficient alternative channel for communication without overt speech and hand movements is important to increase the quality of life for patients suffering from amyotrophic lateral sclerosis or other illnesses that prevent correct limb and facial muscular responses. We have made several experiments to compare the P300-based BCI speller and the new EOG-based system. A five-letter word can be written in 25 seconds on average with the new system, versus 105 seconds with the EEG-based device. Giving a message such as "clean-up" could be performed in 3 seconds with the new system. The new system is more efficient than the P300-based BCI system in terms of accuracy, speed, applicability, and cost efficiency. Using EOG signals, it is possible to improve the communication abilities of those patients who can move their eyes. PMID:19841687
Efficient computations of quantum canonical Gibbs state in phase space
NASA Astrophysics Data System (ADS)
Bondar, Denys I.; Campos, Andre G.; Cabrera, Renan; Rabitz, Herschel A.
2016-06-01
The Gibbs canonical state, as a maximum-entropy density matrix, represents a quantum system in equilibrium with a thermostat. This state plays an essential role in thermodynamics and serves as the initial condition for nonequilibrium dynamical simulations. We solve a long-standing problem of computing the Gibbs state Wigner function with nearly machine accuracy by solving the Bloch equation directly in the phase space. Furthermore, algorithms are provided that yield high-quality Wigner distributions for pure stationary states as well as for Thomas-Fermi and Bose-Einstein distributions. The developed numerical methods furnish a long-sought efficient computational framework for nonequilibrium quantum simulations directly in the Wigner representation.
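For orientation only, a position-representation analogue of the Gibbs state, obtained by diagonalizing a discretized Hamiltonian, is sketched below; the paper's phase-space Bloch-equation solver is not reproduced here, and the grid, potential and temperature are arbitrary assumptions.

```python
# Position-space analogue of rho = exp(-beta * H) / Z via diagonalization
# of a finite-difference Hamiltonian; the paper instead solves the Bloch
# equation directly in phase space, which this sketch does not reproduce.
import numpy as np

n, L, beta = 256, 12.0, 2.0                 # grid size, box length, 1/kT
x = np.linspace(-L / 2, L / 2, n)
dx = x[1] - x[0]
# kinetic part: second-order finite-difference Laplacian (hbar = m = 1)
T = (-0.5 / dx**2) * (np.diag(np.full(n - 1, 1.0), -1)
                      - 2.0 * np.eye(n)
                      + np.diag(np.full(n - 1, 1.0), 1))
V = np.diag(0.5 * x**2)                     # harmonic test potential
E, U = np.linalg.eigh(T + V)
w = np.exp(-beta * E)
rho = (U * w) @ U.T / w.sum()               # Gibbs density matrix, trace 1
print(E[:3])                                # should approximate 0.5, 1.5, 2.5
```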
A Compute-Efficient Bitmap Compression Index for Database Applications
Wu, Kesheng; Shoshani, Arie
2006-01-01
The Word-Aligned Hybrid (WAH) bitmap compression method and data structure is highly efficient for performing search and retrieval operations on large datasets. The WAH technique is optimized for computational efficiency. The WAH-based bitmap indexing software, called FastBit, is particularly appropriate for infrequently varying databases, including those found in the on-line analytical processing (OLAP) industry. Some commercial database products already include some version of a bitmap index, which could possibly be replaced by the WAH bitmap compression technique for a potentially large operational speedup. Experimental results show performance improvements by an average factor of 10 over bitmap technology used by industry, as well as increased efficiency in constructing compressed bitmaps. FastBit can be used as a stand-alone index or integrated into a database system. When integrated into a database system, this technique may be particularly useful for real-time business analysis applications. Additional FastBit applications may include efficient real-time exploration of scientific models, such as climate and combustion simulations, to minimize search time for analysis and subsequent data visualization. FastBit was proven theoretically to be time-optimal because it provides a search time proportional to the number of elements selected by the index.
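A simplified sketch of the word-aligned idea behind WAH is shown below; real WAH packs literal and fill words into 32-bit machine words, whereas this illustration uses tuples for readability and is not FastBit's implementation.

```python
# Simplified illustration of the word-aligned hybrid (WAH) idea: pack the
# bitmap into 31-bit groups, then collapse runs of identical all-0/all-1
# groups into "fill" words. Real WAH packs these into 32-bit machine words.
def wah_encode(bits):
    pad = (-len(bits)) % 31                      # pad to whole 31-bit groups
    bits = list(bits) + [0] * pad
    groups = [tuple(bits[i:i + 31]) for i in range(0, len(bits), 31)]
    words, i = [], 0
    while i < len(groups):
        g = groups[i]
        if set(g) <= {0} or set(g) <= {1}:       # candidate fill group
            j = i
            while j < len(groups) and groups[j] == g:
                j += 1
            words.append(("fill", g[0], j - i))  # (kind, bit value, run length)
            i = j
        else:
            words.append(("literal", g))         # mixed group stored verbatim
            i += 1
    return words

# A mostly-zero bitmap compresses to fill words plus one literal, which is
# why WAH-style indexes work well on low-cardinality attributes.
bitmap = [0] * 200 + [1, 0, 1] + [0] * 50
print(wah_encode(bitmap))
```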
NASA Astrophysics Data System (ADS)
Zhou, Kan
With the modern trend toward transportation electrification, electric machines are a key component of electric/hybrid electric vehicle (EV/HEV) powertrains. It is therefore important that vehicle powertrain-level and system-level designers and control engineers have access to accurate yet computationally efficient (CE), physics-based modeling tools for the thermal and electromagnetic (EM) behavior of electric machines. In this dissertation, CE yet sufficiently accurate thermal and EM models for electric machines, which are suitable for use in vehicle powertrain design, optimization, and control, are developed. This includes not only creating fast and accurate thermal and EM models for specific machine designs, but also the ability to quickly generate and determine the performance of new machine designs through the application of scaling techniques to existing designs. With the developed techniques, the thermal and EM performance can be accurately and efficiently estimated. Furthermore, powertrain or system designers can easily and quickly adjust the characteristics and the performance of the machine in ways that are favorable to the overall vehicle performance.
A Novel Green Cloud Computing Framework for Improving System Efficiency
NASA Astrophysics Data System (ADS)
Lin, Chen
As the prevalence of Cloud computing continues to rise, the need for power-saving mechanisms within the Cloud also increases. In this paper we have presented a novel Green Cloud framework for improving system efficiency in a data center. To demonstrate the potential of our framework, we have presented new energy-efficient scheduling, VM system image, and image management components that explore new ways to conserve power. Through the research presented in this paper, we have found new ways to save vast amounts of energy while minimally impacting performance.
A procedure for computing accurate ab initio quartic force fields: Application to HO2+ and H2O
NASA Astrophysics Data System (ADS)
Huang, Xinchuan; Lee, Timothy J.
2008-07-01
A procedure for the calculation of molecular quartic force fields (QFFs) is proposed and investigated. The goal is to generate highly accurate ab initio QFFs that include many of the so-called "small" effects that are necessary to achieve high accuracy. The small effects investigated in the present study include correlation of the core electrons (core correlation), extrapolation to the one-particle basis set limit, correction for scalar relativistic contributions, correction for higher-order correlation effects, and inclusion of diffuse functions in the one-particle basis set. The procedure is flexible enough to allow for some effects to be computed directly, while others may be added as corrections. A single grid of points is used and is centered about an initial reference geometry that is designed to be as close as possible to the final ab initio equilibrium structure (with all effects included). It is shown that the least-squares fit of the QFF is not compromised by the added corrections, and the balance between elimination of contamination from higher-order force constants while retaining energy differences large enough to yield meaningful quartic force constants is essentially unchanged from the standard procedures we have used for many years. The initial QFF determined from the least-squares fit is transformed to the exact minimum in order to eliminate gradient terms and allow for the use of second-order perturbation theory for evaluation of spectroscopic constants. It is shown that this step has essentially no effect on the quality of the QFF largely because the initial reference structure is, by design, very close to the final ab initio equilibrium structure. The procedure is used to compute an accurate, purely ab initio QFF for the H2O molecule, which is used as a benchmark test case. The procedure is then applied to the ground and first excited electronic states of the HO2+ molecular cation. Fundamental vibrational frequencies and spectroscopic
Accurate Time-Dependent Traveling-Wave Tube Model Developed for Computational Bit-Error-Rate Testing
NASA Technical Reports Server (NTRS)
Kory, Carol L.
2001-01-01
The phenomenal growth of the satellite communications industry has created a large demand for traveling-wave tubes (TWT's) operating with unprecedented specifications requiring the design and production of many novel devices in record time. To achieve this, the TWT industry heavily relies on computational modeling. However, the TWT industry's computational modeling capabilities need to be improved because there are often discrepancies between measured TWT data and that predicted by conventional two-dimensional helical TWT interaction codes. This limits the analysis and design of novel devices or TWT's with parameters differing from what is conventionally manufactured. In addition, the inaccuracy of current computational tools limits achievable TWT performance because optimized designs require highly accurate models. To address these concerns, a fully three-dimensional, time-dependent, helical TWT interaction model was developed using the electromagnetic particle-in-cell code MAFIA (Solution of MAxwell's equations by the Finite-Integration-Algorithm). The model includes a short section of helical slow-wave circuit with excitation fed by radiofrequency input/output couplers, and an electron beam contained by periodic permanent magnet focusing. A cutaway view of several turns of the three-dimensional helical slow-wave circuit with input/output couplers is shown. This has been shown to be more accurate than conventionally used two-dimensional models. The growth of the communications industry has also imposed a demand for increased data rates for the transmission of large volumes of data. To achieve increased data rates, complex modulation and multiple access techniques are employed requiring minimum distortion of the signal as it is passed through the TWT. Thus, intersymbol interference (ISI) becomes a major consideration, as well as suspected causes such as reflections within the TWT. To experimentally investigate effects of the physical TWT on ISI would be
Gray, Alan; Harlen, Oliver G.; Harris, Sarah A.; Khalid, Syma; Leung, Yuk Ming; Lonsdale, Richard; Mulholland, Adrian J.; Pearson, Arwen R.; Read, Daniel J.; Richardson, Robin A.
2015-01-01
The current computational techniques available for biomolecular simulation are described, and the successes and limitations of each with reference to the experimental biophysical methods that they complement are presented. Despite huge advances in the computational techniques available for simulating biomolecules at the quantum-mechanical, atomistic and coarse-grained levels, there is still a widespread perception amongst the experimental community that these calculations are highly specialist and are not generally applicable by researchers outside the theoretical community. In this article, the successes and limitations of biomolecular simulation and the further developments that are likely in the near future are discussed. A brief overview is also provided of the experimental biophysical methods that are commonly used to probe biomolecular structure and dynamics, and the accuracy of the information that can be obtained from each is compared with that from modelling. It is concluded that progress towards an accurate spatial and temporal model of biomacromolecules requires a combination of all of these biophysical techniques, both experimental and computational.
A Computationally Efficient Method for Polyphonic Pitch Estimation
NASA Astrophysics Data System (ADS)
Zhou, Ruohua; Reiss, Joshua D.; Mattavelli, Marco; Zoia, Giorgio
2009-12-01
This paper presents a computationally efficient method for polyphonic pitch estimation. The method employs the Fast Resonator Time-Frequency Image (RTFI) as the basic time-frequency analysis tool. The approach is composed of two main stages. First, a preliminary pitch estimation is obtained by means of a simple peak-picking procedure in the pitch energy spectrum, which is calculated from the original RTFI energy spectrum according to harmonic grouping principles. Then, incorrect estimations are removed according to spectral irregularity and knowledge of the harmonic structures of the notes played on commonly used musical instruments. The new approach is compared with a variety of other frame-based polyphonic pitch estimation methods, and the results demonstrate the high performance and computational efficiency of the approach.
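A generic sketch of the harmonic-grouping idea is given below, using a plain FFT spectrum in place of the RTFI; the test signal, candidate range and peak threshold are arbitrary assumptions.

```python
# Generic harmonic grouping for pitch estimation: score each candidate f0
# by summing spectral energy at its harmonics, then pick peaks. The paper
# builds its pitch-energy spectrum from the RTFI, not the plain FFT here.
import numpy as np

fs = 16000
t = np.arange(fs) / fs
sig = np.sin(2 * np.pi * 220 * t) + 0.7 * np.sin(2 * np.pi * 330 * t)  # A3 + E4
spec = np.abs(np.fft.rfft(sig))
freqs = np.fft.rfftfreq(len(sig), 1.0 / fs)

def harmonic_score(f0, n_harm=6):
    """Sum of spectral magnitudes at the first n_harm harmonics of f0."""
    idx = [np.argmin(np.abs(freqs - k * f0)) for k in range(1, n_harm + 1)]
    return spec[idx].sum()

candidates = np.arange(80.0, 500.0, 1.0)
scores = np.array([harmonic_score(f) for f in candidates])
# naive peak picking: local maxima above a fraction of the global best
peaks = [f for f, s, sl, sr in zip(candidates[1:-1], scores[1:-1],
                                   scores[:-2], scores[2:])
         if s > sl and s > sr and s > 0.3 * scores.max()]
# Expect peaks near 220 and 330 Hz, plus spurious sub-harmonics (e.g.
# 110/165 Hz) -- exactly the kind of incorrect estimate the paper's second
# stage removes using spectral irregularity.
print(peaks)
```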
Computational methods for efficient structural reliability and reliability sensitivity analysis
NASA Technical Reports Server (NTRS)
Wu, Y.-T.
1993-01-01
This paper presents recent developments in efficient structural reliability analysis methods. The paper proposes an efficient, adaptive importance sampling (AIS) method that can be used to compute reliability and reliability sensitivities. The AIS approach uses a sampling density that is proportional to the joint PDF of the random variables. Starting from an initial approximate failure domain, sampling proceeds adaptively and incrementally with the goal of reaching a sampling domain that is slightly greater than the failure domain to minimize over-sampling in the safe region. Several reliability sensitivity coefficients are proposed that can be computed directly and easily from the above AIS-based failure points. These probability sensitivities can be used for identifying key random variables and for adjusting design to achieve reliability-based objectives. The proposed AIS methodology is demonstrated using a turbine blade reliability analysis problem.
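As a baseline for the adaptive scheme described above, a minimal fixed-density importance-sampling estimate of a failure probability looks as follows; the limit state and shifted sampling density are arbitrary assumptions, and the adaptive, incremental updating of the AIS method is omitted.

```python
# Minimal fixed-density importance sampling for P_f = P[g(X) < 0]; the AIS
# method adapts the sampling density toward the failure domain, which this
# static sketch omits.
import numpy as np

rng = np.random.default_rng(1)
g = lambda x: 4.0 - x[:, 0] - x[:, 1]        # limit state: failure when g < 0
mu_q = np.array([2.0, 2.0])                  # shift sampling toward failure region

n = 100_000
x = rng.standard_normal((n, 2)) + mu_q       # samples from q = N(mu_q, I)
# importance weights p(x)/q(x) for standard-normal p and shifted-normal q
log_w = -0.5 * (x**2).sum(1) + 0.5 * ((x - mu_q)**2).sum(1)
pf = np.mean((g(x) < 0) * np.exp(log_w))
print(pf)   # compare with exact P[X1 + X2 > 4] = Phi(-4/sqrt(2)) ~ 2.3e-3
```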
Dendritic nonlinearities are tuned for efficient spike-based computations in cortical circuits.
Ujfalussy, Balázs B; Makara, Judit K; Branco, Tiago; Lengyel, Máté
2015-01-01
Cortical neurons integrate thousands of synaptic inputs in their dendrites in highly nonlinear ways. It is unknown how these dendritic nonlinearities in individual cells contribute to computations at the level of neural circuits. Here, we show that dendritic nonlinearities are critical for the efficient integration of synaptic inputs in circuits performing analog computations with spiking neurons. We developed a theory that formalizes how a neuron's dendritic nonlinearity that is optimal for integrating synaptic inputs depends on the statistics of its presynaptic activity patterns. Based on their in vivo presynaptic population statistics (firing rates, membrane potential fluctuations, and correlations due to ensemble dynamics), our theory accurately predicted the responses of two different types of cortical pyramidal cells to patterned stimulation by two-photon glutamate uncaging. These results reveal a new computational principle underlying dendritic integration in cortical neurons by suggesting a functional link between cellular and systems-level properties of cortical circuits. PMID:26705334
Computationally efficient, rotational nonequilibrium CW chemical laser model
Sentman, L.H.; Rushmore, W.
1981-10-01
The essential fluid dynamic and kinetic phenomena required for a quantitative, computationally efficient, rotational nonequilibrium model of a CW HF chemical laser are identified. It is shown that, in addition to the pumping, collisional deactivation, and rotational relaxation reactions, F-atom wall recombination, the hot pumping reaction, and multiquantum deactivation reactions play a significant role in determining laser performance. Several problems with the HF kinetics package are identified. The effect of various parameters on run time is discussed.
Efficient MATLAB computations with sparse and factored tensors.
Bader, Brett William; Kolda, Tamara Gibson (Sandia National Lab, Livermore, CA)
2006-12-01
In this paper, the term tensor refers simply to a multidimensional or N-way array, and we consider how specially structured tensors allow for efficient storage and computation. First, we study sparse tensors, which have the property that the vast majority of the elements are zero. We propose storing sparse tensors using coordinate format and describe the computational efficiency of this scheme for various mathematical operations, including those typical to tensor decomposition algorithms. Second, we study factored tensors, which have the property that they can be assembled from more basic components. We consider two specific types: a Tucker tensor can be expressed as the product of a core tensor (which itself may be dense, sparse, or factored) and a matrix along each mode, and a Kruskal tensor can be expressed as the sum of rank-1 tensors. We are interested in the case where the storage of the components is less than the storage of the full tensor, and we demonstrate that many elementary operations can be computed using only the components. All of the efficiencies described in this paper are implemented in the Tensor Toolbox for MATLAB.
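A schematic Python analogue of coordinate-format storage and one tensor-times-vector contraction is sketched below; the Tensor Toolbox itself is a MATLAB package, and the tensor data here are arbitrary.

```python
# Coordinate (COO) storage of a sparse 3-way tensor, plus a tensor-times-
# vector contraction along one mode touching only the nonzeros; a schematic
# analogue of what the Tensor Toolbox provides, not its implementation.
import numpy as np

shape = (4, 5, 6)
subs = np.array([[0, 1, 2],          # nonzero locations (i, j, k)
                 [3, 0, 5],
                 [0, 1, 4]])
vals = np.array([1.5, -2.0, 0.5])    # corresponding nonzero values

def ttv(subs, vals, shape, v, mode):
    """Contract the tensor with vector v along `mode`, returning a dense
    array over the remaining modes; only nonzeros are touched."""
    keep = [m for m in range(len(shape)) if m != mode]
    out = np.zeros([shape[m] for m in keep])
    contrib = vals * v[subs[:, mode]]
    np.add.at(out, tuple(subs[:, m] for m in keep), contrib)
    return out

v = np.arange(6, dtype=float)        # contract along mode 2 (length 6)
print(ttv(subs, vals, shape, v, mode=2))
```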
Improving computational efficiency of Monte Carlo simulations with variance reduction
Turner, A.
2013-07-01
CCFE performs Monte Carlo transport simulations on large and complex tokamak models such as ITER. Such simulations are challenging, since streaming and deep penetration effects are equally important. To make such simulations tractable, both variance reduction (VR) techniques and parallel computing are used. It has been found that the application of VR techniques in such models significantly reduces the efficiency of parallel computation due to 'long histories'. VR in MCNP can be accomplished using energy-dependent weight windows. The weight window represents an 'average behaviour' of particles, and large deviations in the arriving weight of a particle give rise to extreme amounts of splitting and hence a long history. When running on parallel clusters, a long history can have a detrimental effect on parallel efficiency: if one process is computing the long history, the other CPUs complete their batch of histories and wait idle. Furthermore, some long histories have been found to be effectively intractable. To combat this effect, CCFE has developed an adaptation of MCNP which dynamically adjusts the weight window where a large weight deviation is encountered. The method effectively 'de-optimises' the weight window, reducing the VR performance, but this is offset by a significant increase in parallel efficiency. Testing with a simple geometry has shown that the method does not bias the result. This 'long history method' has enabled CCFE to significantly improve the performance of MCNP calculations for ITER on parallel clusters, and will be beneficial for any geometry combining streaming and deep penetration effects.
NASA Technical Reports Server (NTRS)
Daigle, Matthew John; Goebel, Kai Frank
2010-01-01
Model-based prognostics captures system knowledge in the form of physics-based models of components, and how they fail, in order to obtain accurate predictions of end of life (EOL). EOL is predicted based on the estimated current state distribution of a component and expected profiles of future usage. In general, this requires simulations of the component using the underlying models. In this paper, we develop a simulation-based prediction methodology that achieves computational efficiency by performing only the minimal number of simulations needed in order to accurately approximate the mean and variance of the complete EOL distribution. This is performed through the use of the unscented transform, which predicts the means and covariances of a distribution passed through a nonlinear transformation. In this case, the EOL simulation acts as that nonlinear transformation. In this paper, we review the unscented transform, and describe how this concept is applied to efficient EOL prediction. As a case study, we develop a physics-based model of a solenoid valve, and perform simulation experiments to demonstrate improved computational efficiency without sacrificing prediction accuracy.
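The unscented transform underlying the prediction step can be sketched as follows; in the paper the nonlinear map is an end-of-life simulation rather than the toy function used here, and the parameter kappa is a conventional tuning choice.

```python
# Sketch of the unscented transform: sigma points of N(m, P) are pushed
# through a nonlinear map and re-averaged to approximate the transformed
# mean and covariance; in the paper the "map" is an EOL simulation.
import numpy as np

def unscented_transform(f, m, P, kappa=1.0):
    n = len(m)
    S = np.linalg.cholesky((n + kappa) * P)       # scaled matrix square root
    sigma = np.vstack([m, m + S.T, m - S.T])      # 2n + 1 sigma points
    w = np.full(2 * n + 1, 0.5 / (n + kappa))     # sigma-point weights
    w[0] = kappa / (n + kappa)
    y = np.array([f(s) for s in sigma])           # one "simulation" per point
    mean = w @ y
    d = y - mean
    cov = (w * d.T) @ d                           # weighted outer products
    return mean, cov

# Toy example: polar-to-Cartesian transformation of an uncertain state.
f = lambda x: np.array([x[0] * np.cos(x[1]), x[0] * np.sin(x[1])])
m, P = np.array([1.0, 0.5]), np.diag([0.01, 0.04])
mean, cov = unscented_transform(f, m, P)
```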
NASA Astrophysics Data System (ADS)
Joost, William J.
2012-09-01
Transportation accounts for approximately 28% of U.S. energy consumption with the majority of transportation energy derived from petroleum sources. Many technologies such as vehicle electrification, advanced combustion, and advanced fuels can reduce transportation energy consumption by improving the efficiency of cars and trucks. Lightweight materials are another important technology that can improve passenger vehicle fuel efficiency by 6-8% for each 10% reduction in weight while also making electric and alternative vehicles more competitive. Despite the opportunities for improved efficiency, widespread deployment of lightweight materials for automotive structures is hampered by technology gaps most often associated with performance, manufacturability, and cost. In this report, the impact of reduced vehicle weight on energy efficiency is discussed with a particular emphasis on quantitative relationships determined by several researchers. The most promising lightweight materials systems are described along with a brief review of the most significant technical barriers to their implementation. For each material system, the development of accurate material models is critical to support simulation-intensive processing and structural design for vehicles; improved models also contribute to an integrated computational materials engineering (ICME) approach for addressing technical barriers and accelerating deployment. The value of computational techniques is described by considering recent ICME and computational materials science success stories with an emphasis on applying problem-specific methods.
NASA Astrophysics Data System (ADS)
Sangiovanni, D. G.; Hellman, O.; Alling, B.; Abrikosov, I. A.
2016-03-01
We revisit the color-diffusion algorithm [Aeberhard et al., Phys. Rev. Lett. 108, 095901 (2012)] in nonequilibrium ab initio molecular dynamics (NE-AIMD) and propose a simple, efficient approach for estimating monovacancy jump rates in crystalline solids at temperatures well below melting. Color diffusion applied to monovacancy migration entails that one lattice atom (the colored atom) is accelerated toward the neighboring defect site by an external constant force F. Considering bcc molybdenum between 1000 and 2800 K as a model system, NE-AIMD results show that the colored-atom jump rate kNE increases exponentially with the force intensity F, up to F values far beyond the linear-fitting regime employed previously. Using a simple model, we derive an analytical expression which reproduces the observed dependence of kNE on F. Equilibrium rates extrapolated from NE-AIMD results are in excellent agreement with those of unconstrained dynamics. The gain in computational efficiency achieved with our approach increases rapidly with decreasing temperature, reaching four orders of magnitude at the lowest temperature considered in the present study.
Energy Efficient Biomolecular Simulations with FPGA-based Reconfigurable Computing
Hampton, Scott S; Agarwal, Pratul K
2010-05-01
Reconfigurable computing (RC) is being investigated as a hardware solution for improving time-to-solution for biomolecular simulations. A number of popular molecular dynamics (MD) codes are used to study various aspects of biomolecules. These codes are now capable of simulating nanosecond time-scale trajectories per day on conventional microprocessor-based hardware, but biomolecular processes often occur at the microsecond time-scale or longer. A wide gap exists between the desired and achievable simulation capability; therefore, there is considerable interest in alternative algorithms and hardware for improving the time-to-solution of MD codes. The fine-grain parallelism provided by Field Programmable Gate Arrays (FPGA) combined with their low power consumption make them an attractive solution for improving the performance of MD simulations. In this work, we use an FPGA-based coprocessor to accelerate the compute-intensive calculations of LAMMPS, a popular MD code, achieving up to 5.5 fold speed-up on the non-bonded force computations of the particle mesh Ewald method and up to 2.2 fold speed-up in overall time-to-solution, and potentially an increase by a factor of 9 in power-performance efficiencies for the pair-wise computations. The results presented here provide an example of the multi-faceted benefits to an application in a heterogeneous computing environment.
Energy efficient hybrid computing systems using spin devices
NASA Astrophysics Data System (ADS)
Sharad, Mrigank
Emerging spin-devices like magnetic tunnel junctions (MTJ's), spin-valves and domain wall magnets (DWM) have opened new avenues for spin-based logic design. This work explored potential computing applications which can exploit such devices for higher energy efficiency and performance. The proposed applications involve hybrid design schemes, where charge-based devices supplement the spin-devices, to gain large benefits at the system level. As an example, lateral spin valves (LSV) involve switching of nanomagnets using spin-polarized current injection through a metallic channel such as Cu. Such spin-torque based devices possess several interesting properties that can be exploited for ultra-low power computation. The analog characteristic of spin current facilitates non-Boolean computation like majority evaluation that can be used to model a neuron. The magneto-metallic neurons can operate at an ultra-low terminal voltage of ~20 mV, thereby resulting in small computation power. Moreover, since nano-magnets inherently act as memory elements, these devices can facilitate integration of logic and memory in interesting ways. The spin-based neurons can be integrated with CMOS and other emerging devices, leading to different classes of neuromorphic/non-Von-Neumann architectures. The spin-based designs involve 'mixed-mode' processing and hence can provide very compact and ultra-low energy solutions for complex computation blocks, both digital as well as analog. Such low-power, hybrid designs can be suitable for various data processing applications like cognitive computing, associative memory, and current-mode on-chip global interconnects. Simulation results for these applications, based on a device-circuit co-simulation framework, predict more than ~100x improvement in computation energy as compared to state-of-the-art CMOS design, for optimal spin-device parameters.
Chang, Chih-Hao . E-mail: chchang@engineering.ucsb.edu; Liou, Meng-Sing . E-mail: meng-sing.liou@grc.nasa.gov
2007-07-01
In this paper, we propose a new approach to compute compressible multifluid equations. Firstly, a single-pressure compressible multifluid model based on the stratified flow model is proposed. The stratified flow model, which defines different fluids in separated regions, is shown to be amenable to the finite volume method. We can apply the conservation law to each subregion and obtain a set of balance equations. Secondly, the AUSM+ scheme, which was originally designed for compressible gas flow, is extended to solve compressible liquid flows. By introducing additional dissipation terms into the numerical flux, the new scheme, called AUSM+-up, can be applied to both liquid and gas flows. Thirdly, the contribution to the numerical flux due to interactions between different phases is taken into account and solved by the exact Riemann solver. We will show that the proposed approach yields an accurate and robust method for computing compressible multiphase flows involving discontinuities, such as shock waves and fluid interfaces. Several one-dimensional test problems are used to demonstrate the capability of our method, including Ransom's water faucet problem and the air-water shock tube problem. Finally, several two-dimensional problems will show the capability to capture enormous details and complicated wave patterns in flows having large disparities in the fluid density and velocities, such as interactions between a water shock wave and an air bubble, between an air shock wave and water column(s), and an underwater explosion.
Improving robustness and computational efficiency using modern C++
NASA Astrophysics Data System (ADS)
Paterno, M.; Kowalkowski, J.; Green, C.
2014-06-01
For nearly two decades, the C++ programming language has been the dominant programming language for experimental HEP. The publication of ISO/IEC 14882:2011, the current version of the international standard for the C++ programming language, makes available a variety of language and library facilities for improving the robustness, expressiveness, and computational efficiency of C++ code. However, much of the C++ written by the experimental HEP community does not take advantage of the features of the language to obtain these benefits, either due to lack of familiarity with these features or concern that these features must somehow be computationally inefficient. In this paper, we address some of the features of modern C++, and show how they can be used to make programs that are both robust and computationally efficient. We compare and contrast simple yet realistic examples of some common implementation patterns in C, currently-typical C++, and modern C++, and show (when necessary, down to the level of generated assembly language code) the quality of the executable code produced by recent C++ compilers, with the aim of allowing the HEP community to make informed decisions on the costs and benefits of the use of modern C++.
Harb, Moussab
2015-10-14
Using accurate first-principles quantum calculations based on DFT (including the DFPT) with the range-separated hybrid HSE06 exchange-correlation functional, we predict the essential fundamental properties (such as bandgap, optical absorption coefficient, dielectric constant, charge carrier effective masses and exciton binding energy) of two stable monoclinic vanadium oxynitride (VON) semiconductor crystals for solar energy conversion applications. In addition to the predicted band gaps in the optimal range for making single-junction solar cells, both polymorphs exhibit a relatively high absorption efficiency in the visible range, high dielectric constant, high charge carrier mobility and much lower exciton binding energy than the thermal energy at room temperature. Moreover, their optical absorption, dielectric and exciton dissociation properties were found to be better than those obtained for semiconductors frequently utilized in photovoltaic devices such as Si, CdTe and GaAs. These results suggest that this stoichiometric VON material, once properly synthesized, is a promising new candidate for photovoltaic applications. PMID:26351755
A fast and accurate method for computing the Sunyaev-Zel'dovich signal of hot galaxy clusters
NASA Astrophysics Data System (ADS)
Chluba, Jens; Nagai, Daisuke; Sazonov, Sergey; Nelson, Kaylea
2012-10-01
New-generation ground- and space-based cosmic microwave background experiments have ushered in discoveries of massive galaxy clusters via the Sunyaev-Zel'dovich (SZ) effect, providing a new window for studying cluster astrophysics and cosmology. Many of the newly discovered, SZ-selected clusters contain hot intracluster plasma (kTe ≳ 10 keV) and exhibit disturbed morphology, indicative of frequent mergers with large peculiar velocity (v ≳ 1000 km s-1). It is well known that for the interpretation of the SZ signal from hot, moving galaxy clusters, relativistic corrections must be taken into account, and in this work, we present a fast and accurate method for computing these effects. Our approach is based on an alternative derivation of the Boltzmann collision term which provides new physical insight into the sources of different kinematic corrections in the scattering problem. In contrast to previous works, this allows us to obtain a clean separation of kinematic and scattering terms. We also briefly mention additional complications connected with kinematic effects that should be considered when interpreting future SZ data for individual clusters. One of the main outcomes of this work is SZPACK, a numerical library which allows very fast and precise (≲0.001 per cent at frequencies hν ≲ 20kTγ) computation of the SZ signals up to high electron temperature (kTe ≃ 25 keV) and large peculiar velocity (v/c ≃ 0.01). The accuracy is well beyond the current and future precision of SZ observations and practically eliminates uncertainties which are usually overcome with more expensive numerical evaluation of the Boltzmann collision term. Our new approach should therefore be useful for analysing future high-resolution, multifrequency SZ observations as well as computing the predicted SZ effect signals from numerical simulations.
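For orientation, the non-relativistic thermal SZ distortion that SZPACK's relativistic computation refines can be written in a few lines. This classic limit, DeltaT/T = y (x coth(x/2) - 4) with x = h*nu/(k*T_CMB), is a textbook result and is shown here only as a baseline sketch; the Compton-y value is an arbitrary example.

```cpp
// Non-relativistic thermal SZ distortion. SZPACK's point is the relativistic
// temperature/velocity corrections, which are omitted here entirely.
#include <cmath>
#include <cstdio>

double thermalSZ(double nu_GHz, double y) {
    const double h = 6.62607015e-34, kB = 1.380649e-23, Tcmb = 2.7255; // SI, K
    double x = h * (nu_GHz * 1e9) / (kB * Tcmb);       // dimensionless frequency
    return y * (x / std::tanh(0.5 * x) - 4.0);         // DeltaT / T_CMB
}

int main() {
    for (double nu : {30.0, 90.0, 150.0, 217.0, 353.0})  // typical CMB bands, GHz
        std::printf("nu = %5.0f GHz  DeltaT/T = %+.3e\n", nu, thermalSZ(nu, 1e-4));
    // Note the near-null around 217 GHz, the signature of the thermal SZ effect.
}
```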
A computationally efficient modelling of laminar separation bubbles
NASA Technical Reports Server (NTRS)
Maughmer, Mark D.
1988-01-01
The goal of this research is to accurately predict the characteristics of the laminar separation bubble and its effects on airfoil performance. To this end, a model of the bubble is under development and will be incorporated in the analysis section of the Eppler and Somers program. As a first step in this direction, an existing bubble model was inserted into the program. It was decided to address the problem of the short bubble before attempting the prediction of the long bubble. In the second place, an integral boundary-layer method is believed more desirable than a finite difference approach. While these two methods achieve similar prediction accuracy, finite-difference methods tend to involve significantly longer computer run times than the integral methods. Finally, as the boundary-layer analysis in the Eppler and Somers program employs the momentum and kinetic energy integral equations, a short-bubble model compatible with these equations is most preferable.
Exploiting stoichiometric redundancies for computational efficiency and network reduction
Ingalls, Brian P.; Bembenek, Eric
2015-01-01
Analysis of metabolic networks typically begins with construction of the stoichiometry matrix, which characterizes the network topology. This matrix provides, via the balance equation, a description of the potential steady-state flow distribution. This paper begins with the observation that the balance equation depends only on the structure of linear redundancies in the network, and so can be stated in a succinct manner, leading to computational efficiencies in steady-state analysis. This alternative description of steady-state behaviour is then used to provide a novel method for network reduction, which complements existing algorithms for describing intracellular networks in terms of input-output macro-reactions (to facilitate bioprocess optimization and control). Finally, it is demonstrated that this novel reduction method can be used to address elementary mode analysis of large networks: the modes supported by a reduced network can capture the input-output modes of a metabolic module with significantly reduced computational effort. PMID:25547516
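The balance equation referred to above is S v = 0: steady-state flux vectors v lie in the null space of the stoichiometry matrix S. A minimal sketch, assuming the Eigen linear algebra library (the paper does not prescribe one) and a toy three-reaction pathway:

```cpp
// Steady-state flux analysis: admissible flux distributions are the null
// space of the stoichiometry matrix S, i.e. solutions of S*v = 0.
#include <Eigen/Dense>
#include <iostream>

int main() {
    // Toy pathway: R1: -> A,  R2: A -> B,  R3: B -> .
    // Rows = metabolites (A, B), columns = reactions (R1, R2, R3).
    Eigen::MatrixXd S(2, 3);
    S << 1, -1,  0,
         0,  1, -1;
    Eigen::FullPivLU<Eigen::MatrixXd> lu(S);
    Eigen::MatrixXd K = lu.kernel();         // basis of {v : S*v = 0}
    std::cout << "null-space basis:\n" << K << "\n"; // here: all fluxes equal
}
```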
Differential area profiles: decomposition properties and efficient computation.
Ouzounis, Georgios K; Pesaresi, Martino; Soille, Pierre
2012-08-01
Differential area profiles (DAPs) are point-based multiscale descriptors used in pattern analysis and image segmentation. They are defined through sets of size-based connected morphological filters that constitute a joint area opening top-hat and area closing bottom-hat scale-space of the input image. The work presented in this paper explores the properties of this image decomposition through sets of area zones. An area zone defines a single plane of the DAP vector field and contains all the peak components of the input image, whose size is between the zone's attribute extrema. Area zones can be computed efficiently from hierarchical image representation structures, in a way similar to regular attribute filters. Operations on the DAP vector field can then be computed without the need for exporting it first, and an example with the leveling-like convex/concave segmentation scheme is given. This is referred to as the one-pass method and it is demonstrated on the Max-Tree structure. Its computational performance is tested and compared against conventional means for computing differential profiles, relying on iterative application of area openings and closings. Applications making use of the area zone decomposition are demonstrated in problems related to remote sensing and medical image analysis. PMID:22184259
Orenstein, Yaron; Wang, Yuhao; Berger, Bonnie
2016-01-01
Motivation: Protein–RNA interactions, which play vital roles in many processes, are mediated through both RNA sequence and structure. CLIP-based methods, which measure protein–RNA binding in vivo, suffer from experimental noise and systematic biases, whereas in vitro experiments capture a clearer signal of protein–RNA binding. Among them, RNAcompete provides binding affinities of a specific protein to more than 240,000 unstructured RNA probes in one experiment. The computational challenge is to infer RNA structure- and sequence-based binding models from these data. The state-of-the-art in sequence models, DeepBind, does not model structural preferences. RNAcontext models both sequence and structure preferences, but is outperformed by GraphProt. Unfortunately, GraphProt cannot detect structural preferences from RNAcompete data due to the unstructured nature of the data, as noted by its developers, nor can it be tractably run on the full RNAcompete dataset. Results: We develop RCK, an efficient, scalable algorithm that infers both sequence and structure preferences based on a new k-mer based model. Remarkably, even though RNAcompete data are designed to be unstructured, RCK can still learn structural preferences from them. RCK significantly outperforms both RNAcontext and DeepBind in in vitro binding prediction for 244 RNAcompete experiments. Moreover, RCK is also faster and uses less memory, which enables scalability. While currently on par with existing methods in in vivo binding prediction on a small-scale test, we demonstrate that RCK will increasingly benefit from experimentally measured RNA structure profiles as compared to computationally predicted ones. By running RCK on the entire RNAcompete dataset, we generate and provide as a resource a set of protein–RNA structure-based models on an unprecedented scale. Availability and Implementation: Software and models are freely available at http://rck.csail.mit.edu/ Contact: bab@mit.edu
Efficient parallel global garbage collection on massively parallel computers
Kamada, Tomio; Matsuoka, Satoshi; Yonezawa, Akinori
1994-12-31
On distributed-memory high-performance MPPs where processors are interconnected by an asynchronous network, efficient garbage collection (GC) becomes difficult due to inter-node references and references within pending, unprocessed messages. The parallel global GC algorithm (1) takes advantage of reference locality, (2) efficiently traverses references over nodes, (3) admits minimum pause time of ongoing computations, and (4) has been shown to scale up to 1024-node MPPs. The algorithm employs a global weight counting scheme to substantially reduce message traffic. Two methods for confirming the arrival of pending messages are used: one counts the number of messages and the other uses network 'bulldozing'. Performance evaluation of actual implementations on a multicomputer with 32-1024 nodes, the Fujitsu AP1000, reveals various favorable properties of the algorithm.
Computationally efficient strategies to perform anomaly detection in hyperspectral images
NASA Astrophysics Data System (ADS)
Rossi, Alessandro; Acito, Nicola; Diani, Marco; Corsini, Giovanni
2012-11-01
In remote sensing, hyperspectral sensors are effectively used for target detection and recognition because of their high spectral resolution, which allows discrimination of different materials in the sensed scene. When a priori information about the spectrum of the targets of interest is not available, target detection turns into anomaly detection (AD), i.e. searching for objects that are anomalous with respect to the scene background. In the field of AD, anomalies can generally be associated with observations that deviate statistically from the background clutter, where the background is taken to be either a local neighborhood surrounding the observed pixel or a large portion of the image. In this context, much effort has been devoted to reducing the computational load of AD algorithms so as to furnish information for real-time decision making. In this work, a sub-class of AD methods is considered that aims at detecting small rare objects that are anomalous with respect to their local background. Such techniques not only are characterized by mathematical tractability but also allow the design of real-time strategies for AD. Within these methods, one of the most established anomaly detectors is the RX algorithm, which is based on a local Gaussian model of the background. In the literature, the RX decision rule has been employed to develop computationally efficient algorithms implemented in real-time systems. In this work, a survey of computationally efficient methods to implement the RX detector is presented, in which advanced algebraic strategies are exploited to speed up the estimation of the covariance matrix and of its inverse. The overall number of operations required by the different implementations of the RX algorithm is compared and discussed while varying the RX parameters, in order to show the computational improvements achieved with the introduced algebraic strategies.
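For reference, the RX score itself is the Mahalanobis distance of a pixel spectrum to the background statistics; the algebraic speed-ups surveyed in the paper concern how the covariance and its inverse are obtained and updated, which the following Eigen-based sketch (with made-up data) does not attempt:

```cpp
// RX anomaly score: (x - mu)^T Sigma^{-1} (x - mu) against a background model.
#include <Eigen/Dense>
#include <iostream>

double rxScore(const Eigen::VectorXd& x, const Eigen::VectorXd& mu,
               const Eigen::MatrixXd& sigmaInv) {
    Eigen::VectorXd d = x - mu;
    return d.dot(sigmaInv * d);
}

int main() {
    // Background estimated from a set of pixel spectra (rows of B).
    Eigen::MatrixXd B = Eigen::MatrixXd::Random(200, 5);    // 200 pixels, 5 bands
    Eigen::VectorXd mu = B.colwise().mean().transpose();
    Eigen::MatrixXd C = B.rowwise() - mu.transpose();       // centered data
    Eigen::MatrixXd Sigma = (C.adjoint() * C) / double(B.rows() - 1);
    Eigen::MatrixXd SigmaInv = Sigma.ldlt().solve(
        Eigen::MatrixXd::Identity(5, 5));   // SPD solve, not an explicit inverse
    Eigen::VectorXd x = Eigen::VectorXd::Constant(5, 3.0);  // test pixel
    std::cout << "RX score: " << rxScore(x, mu, SigmaInv) << "\n";
}
```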
NASA Astrophysics Data System (ADS)
Rike, Erik R.; Delbalzo, Donald R.
2005-04-01
Transmission Loss (TL) computations in littoral areas require a dense spatial and azimuthal grid to achieve acceptable accuracy and detail. The computational cost of accurate predictions led to a new concept, OGRES (Objective Grid/Radials using Environmentally-sensitive Selection), which produces sparse, irregular acoustic grids with controlled accuracy. Recent work to further increase accuracy and efficiency with better metrics and interpolation led to EAGLE (Efficient Adaptive Gridder for Littoral Environments). On each iteration, EAGLE produces grids with approximately constant spatial uncertainty (hence, iso-deviance), yielding predictions with ever-increasing resolution and accuracy. The EAGLE point-selection mechanism is tested using the predictive error metric and 1-D synthetic data sets created from combinations of simple signal functions (e.g., polynomials, sines, cosines, exponentials), along with white and chromatic noise. The speed, efficiency, fidelity, and iso-deviance of EAGLE are determined for each combination of signal, noise, and interpolator. The results show significant efficiency enhancements compared to uniform grids of the same accuracy. [Work sponsored by ONR under the LADC project.]
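The EAGLE point-selection rule itself is not reproduced here; the sketch below only illustrates the generic idea of error-driven grid refinement in one dimension, inserting each new sample where a linear interpolant deviates most from the sampled field (the test field is an arbitrary stand-in for a TL curve):

```cpp
// Error-driven 1D refinement: repeatedly add a sample at the midpoint of the
// interval where linear interpolation deviates most from the field.
#include <cmath>
#include <cstdio>
#include <iterator>
#include <set>

double field(double x) { return std::sin(6.0 * x) + 0.3 * std::cos(20.0 * x); }

int main() {
    std::set<double> grid = {0.0, 1.0};                // coarsest possible grid
    for (int it = 0; it < 30; ++it) {
        double worstX = 0.5, worstErr = -1.0;
        for (auto a = grid.begin(), b = std::next(a); b != grid.end(); ++a, ++b) {
            double mid = 0.5 * (*a + *b);
            double interp = 0.5 * (field(*a) + field(*b)); // linear interpolant
            double err = std::fabs(field(mid) - interp);   // local deviation
            if (err > worstErr) { worstErr = err; worstX = mid; }
        }
        grid.insert(worstX);        // refine where the uncertainty is largest
    }
    std::printf("placed %zu points\n", grid.size());
}
```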
NASA Astrophysics Data System (ADS)
Liu, P.; Zhang, Y.
2008-04-01
Accurately simulating secondary organic aerosols (SOA) in three-dimensional (3-D) air quality models is challenging due to the complexity of the physics and chemistry involved and the high computational demand required. A computationally-efficient yet accurate SOA module is necessary in 3-D applications for long-term simulations and real-time air quality forecasting. A coupled gas and aerosol box model (i.e., 0-D CMAQ-MADRID 2) is used to optimize relevant processes in order to develop such a SOA module. Solving the partitioning equations for condensable volatile organic compounds (VOCs) and calculating their activity coefficients in the multicomponent mixtures are identified as the most computationally expensive processes. The two processes can be sped up by relaxing the error tolerance levels and reducing the maximum number of iterations of the numerical solver for the partitioning equations for organic species; turning on organic-inorganic interactions only when the water content associated with organic compounds is significant; and parameterizing the calculation of activity coefficients for organic mixtures in the hydrophilic module. The optimal speed-up method can reduce the total CPU cost by up to a factor of 29.7 with ±15% deviation from benchmark results. These speed-up methods are applicable to other SOA modules that are based on partitioning theories.
NASA Astrophysics Data System (ADS)
Liu, P.; Zhang, Y.
2008-07-01
Accurately simulating secondary organic aerosols (SOA) in three-dimensional (3-D) air quality models is challenging due to the complexity of the physics and chemistry involved and the high computational demand required. A computationally-efficient yet accurate SOA module is necessary in 3-D applications for long-term simulations and real-time air quality forecasting. A coupled gas and aerosol box model (i.e., 0-D CMAQ-MADRID 2) is used to optimize relevant processes in order to develop such a SOA module. Solving the partitioning equations for condensable volatile organic compounds (VOCs) and calculating their activity coefficients in the multicomponent mixtures are identified as the most computationally expensive processes. The two processes can be sped up by relaxing the error tolerance levels and reducing the maximum number of iterations of the numerical solver for the partitioning equations for organic species; conditionally activating organic-inorganic interactions; and parameterizing the calculation of activity coefficients for organic mixtures in the hydrophilic module. The optimal speed-up method can reduce the total CPU cost by up to a factor of 31.4 from the benchmark under rural conditions with 2 ppb isoprene, and by factors of 10-71 under various test conditions with 2-10 ppb isoprene and >40% relative humidity, while maintaining ±15% deviation. These speed-up methods are applicable to other SOA modules that are based on partitioning theories.
Computationally efficient sub-band coding of ECG signals.
Husøy, J H; Gjerde, T
1996-03-01
A data compression technique is presented for the compression of discrete-time electrocardiogram (ECG) signals. The compression system is based on sub-band coding, a technique traditionally used for compressing speech and images. The sub-band coder employs quadrature mirror filter banks (QMF) with up to 32 critically sampled sub-bands. Both finite impulse response (FIR) and the more computationally efficient infinite impulse response (IIR) filter banks are considered as candidates in a complete ECG coding system. The sub-bands are thresholded, quantized using uniform quantizers, and run-length coded. The output of the run-length coder is further compressed by a Huffman coder. Extensive simulations indicate that 16 sub-bands are a suitable choice for this application. Furthermore, IIR filter banks are preferable due to their superiority in terms of computational efficiency. We conclude that the present scheme, which is suitable for real-time implementation on a PC, can provide compression ratios between 5 and 15 without loss of clinical information. PMID:8673319
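A minimal sketch of the coder's back end, showing thresholding, uniform quantization, and run-length coding of one sub-band (the QMF analysis bank and the final Huffman stage are omitted; the threshold and step size are arbitrary):

```cpp
// Threshold small sub-band coefficients to zero, quantize uniformly, then
// run-length encode the zero runs as (skip, value) pairs.
#include <cmath>
#include <cstdio>
#include <utility>
#include <vector>

std::vector<std::pair<int, int>> rleEncode(const std::vector<double>& subband,
                                           double threshold, double step) {
    std::vector<std::pair<int, int>> out;   // (zero-run length, quantized value)
    int run = 0;
    for (double c : subband) {
        if (std::fabs(c) < threshold) { ++run; continue; }
        int q = (int)std::lround(c / step);            // uniform quantizer
        out.emplace_back(run, q);
        run = 0;
    }
    if (run > 0) out.emplace_back(run, 0);             // trailing zeros
    return out;
}

int main() {
    std::vector<double> band = {0.01, 0.02, 1.3, 0.0, 0.0, -0.7, 0.03, 2.2};
    for (const auto& pr : rleEncode(band, 0.1, 0.05))
        std::printf("(skip %d, value %d) ", pr.first, pr.second);
    std::printf("\n");
}
```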
Efficient Computation of the Topology of Level Sets
Pascucci, V; Cole-McLaughlin, K
2002-07-19
This paper introduces two efficient algorithms that compute the Contour Tree of a 3D scalar field F and its augmented version with the Betti numbers of each isosurface. The Contour Tree is a fundamental data structure in scientific visualization that is used to pre-process the domain mesh to allow optimal computation of isosurfaces with minimal storage overhead. The Contour Tree can also be used to build user interfaces reporting the complete topological characterization of a scalar field, as shown in Figure 1. In the first part of the paper we present a new scheme that augments the Contour Tree with the Betti numbers of each isocontour in linear time. We show how to extend the scheme introduced in [3] with the Betti number computation without increasing its complexity. Thus we improve on the time complexity of our previous approach [8] from O(m log m) to O(n log n + m), where m is the number of tetrahedra and n is the number of vertices in the domain of F. In the second part of the paper we introduce a new divide-and-conquer algorithm that computes the Augmented Contour Tree for scalar fields defined on rectilinear grids. The central part of the scheme computes the output contour tree by merging two intermediate contour trees and is independent of the interpolant. In this way we confine any knowledge regarding a specific interpolant to an oracle that computes the tree for a single cell. We have implemented this oracle for the trilinear interpolant and plan to replace it with higher-order interpolants when needed. The complexity of the scheme is O(n + t log n), where t is the number of critical points of F. This allows, for the first time, computation of the Contour Tree in linear time in many practical cases, when t = O(n^(1-ε)). We report the running times for a parallel implementation of our algorithm, showing good scalability with the number of processors.
Computationally efficient implementation of combustion chemistry in parallel PDF calculations
NASA Astrophysics Data System (ADS)
Lu, Liuyan; Lantz, Steven R.; Ren, Zhuyin; Pope, Stephen B.
2009-08-01
In parallel calculations of combustion processes with realistic chemistry, the serial in situ adaptive tabulation (ISAT) algorithm [S.B. Pope, Computationally efficient implementation of combustion chemistry using in situ adaptive tabulation, Combustion Theory and Modelling, 1 (1997) 41-63; L. Lu, S.B. Pope, An improved algorithm for in situ adaptive tabulation, Journal of Computational Physics 228 (2009) 361-386] substantially speeds up the chemistry calculations on each processor. To improve the parallel efficiency of large ensembles of such calculations in parallel computations, in this work, the ISAT algorithm is extended to the multi-processor environment, with the aim of minimizing the wall clock time required for the whole ensemble. Parallel ISAT strategies are developed by combining the existing serial ISAT algorithm with different distribution strategies, namely purely local processing (PLP), uniformly random distribution (URAN), and preferential distribution (PREF). The distribution strategies enable the queued load redistribution of chemistry calculations among processors using message passing. They are implemented in the software x2f_mpi, which is a Fortran 95 library for facilitating many parallel evaluations of a general vector function. The relative performance of the parallel ISAT strategies is investigated in different computational regimes via the PDF calculations of multiple partially stirred reactors burning methane/air mixtures. The results show that the performance of ISAT with a fixed distribution strategy strongly depends on certain computational regimes, based on how much memory is available and how much overlap exists between tabulated information on different processors. No one fixed strategy consistently achieves good performance in all the regimes. Therefore, an adaptive distribution strategy, which blends PLP, URAN and PREF, is devised and implemented. It yields consistently good performance in all regimes. In the adaptive parallel
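Stripped to its essence, ISAT is a tabulation cache: store expensive reaction mappings and reuse a stored result when a new query falls close enough to a table entry. The sketch below reduces this to a tolerance-based linear scan in one dimension; the real algorithm uses ellipsoids of accuracy, linear corrections, and a binary search tree, and the parallel strategies of the paper then distribute such tables across processors. All names and the stand-in "chemistry" function are assumptions of this sketch.

```cpp
// In situ tabulation reduced to its simplest form: tolerance-based retrieval
// of stored mappings, growing the table on a miss.
#include <cmath>
#include <cstdio>
#include <vector>

struct Entry { double x, f; };                 // composition -> mapped result

double expensiveChemistry(double x) { return std::exp(-x) * std::sin(x); }

double isatQuery(std::vector<Entry>& table, double x, double tol) {
    for (const Entry& e : table)               // retrieve: tolerance hit
        if (std::fabs(e.x - x) < tol) return e.f;
    double f = expensiveChemistry(x);          // miss: direct evaluation
    table.push_back({x, f});                   // grow the table in situ
    return f;
}

int main() {
    std::vector<Entry> table;
    for (double x : {0.10, 0.11, 0.50, 0.12, 0.51})
        std::printf("f(%.2f) = %.4f  (table size %zu)\n",
                    x, isatQuery(table, x, 0.05), table.size());
}
```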
Recent Algorithmic and Computational Efficiency Improvements in the NIMROD Code
NASA Astrophysics Data System (ADS)
Plimpton, S. J.; Sovinec, C. R.; Gianakon, T. A.; Parker, S. E.
1999-11-01
Extreme anisotropy and temporal stiffness impose severe challenges to simulating low-frequency, nonlinear behavior in magnetized fusion plasmas. To address these challenges in computations of realistic experiment configurations, NIMROD (Glasser et al., Plasma Phys. Control. Fusion 41 (1999) A747) uses a time-split, semi-implicit advance of the two-fluid equations for magnetized plasmas with a finite element/Fourier series spatial representation. The stiffness and anisotropy lead to ill-conditioned linear systems of equations, and they emphasize any truncation errors that may couple different modes of the continuous system. Recent work significantly improves NIMROD's performance in these areas. Implementing a parallel global preconditioning scheme in structured-grid regions permits scaling to large problems and large time steps, which are critical for achieving realistic S-values. In addition, coupling to the AZTEC parallel linear solver package now permits efficient computation with regions of unstructured grid. Changes in the time-splitting scheme improve numerical behavior in simulations with strong flow, and quadratic basis elements are being explored for accuracy. Different numerical forms of anisotropic thermal conduction, critical for slow island evolution, are compared. Algorithms for including gyrokinetic ions in the finite element computations are discussed.
Efficient Homotopy Continuation Algorithms with Application to Computational Fluid Dynamics
NASA Astrophysics Data System (ADS)
Brown, David A.
New homotopy continuation algorithms are developed and applied to a parallel implicit finite-difference Newton-Krylov-Schur external aerodynamic flow solver for the compressible Euler, Navier-Stokes, and Reynolds-averaged Navier-Stokes equations with the Spalart-Allmaras one-equation turbulence model. Many new analysis tools, calculations, and numerical algorithms are presented for the study and design of efficient and robust homotopy continuation algorithms applicable to solving very large and sparse nonlinear systems of equations. Several specific homotopies are presented and studied, and a methodology is presented for assessing the suitability of specific homotopies for homotopy continuation. A new class of homotopy continuation algorithms, referred to as monolithic homotopy continuation algorithms, is developed. These algorithms differ from classical predictor-corrector algorithms by combining the predictor and corrector stages into a single update, significantly reducing the amount of computation and avoiding wasted computational effort resulting from over-solving in the corrector phase. The new algorithms are also simpler from a user perspective, with fewer input parameters, improving the user's ability to choose effective parameters on the first flow solve attempt. Conditional convergence is proved analytically and studied numerically for the new algorithms. The performance of a fully-implicit monolithic homotopy continuation algorithm is evaluated for several inviscid, laminar, and turbulent flows over NACA 0012 airfoils and ONERA M6 wings. The monolithic algorithm is demonstrated to be more efficient than the predictor-corrector algorithm for all applications investigated. It is also demonstrated to be more efficient than the widely-used pseudo-transient continuation algorithm for all inviscid and laminar cases investigated, and good performance scaling with grid refinement is demonstrated for the inviscid cases. Performance is also demonstrated
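For readers unfamiliar with the baseline being improved upon, a classical predictor-corrector continuation on a scalar problem looks as follows: the homotopy H(x, l) = (1 - l)(x - x0) + l F(x) is traced from an easy start problem at l = 0 to the target F(x) = 0 at l = 1, with Newton corrections at each step. The cubic F and all step sizes are arbitrary illustrations, not the thesis's flow-solver setting.

```cpp
// Classical predictor-corrector continuation on a scalar residual F.
#include <cmath>
#include <cstdio>

double F(double x)  { return x * x * x - 2.0 * x - 5.0; }  // target residual
double dF(double x) { return 3.0 * x * x - 2.0; }

int main() {
    const double x0 = 2.0;                 // solution of the trivial start problem
    double x = x0;
    for (double l = 0.0; l < 1.0; ) {
        l = std::fmin(1.0, l + 0.1);       // predictor: step the parameter
        for (int k = 0; k < 20; ++k) {     // corrector: Newton on H(., l)
            double H  = (1.0 - l) * (x - x0) + l * F(x);
            double dH = (1.0 - l) + l * dF(x);
            double dx = H / dH;
            x -= dx;
            if (std::fabs(dx) < 1e-12) break;
        }
    }
    std::printf("root of F: x = %.12f (F(x) = %.2e)\n", x, F(x));
}
```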
NASA Astrophysics Data System (ADS)
Ding, Feizhi
motion. All these developments and applications will open up new computational and theoretical tools to be applied to the development and understanding of chemical reactions, nonlinear optics, electromagnetism, and spintronics. Lastly, we present a new algorithm for large-scale MCSCF calculations that can utilize massively parallel machines while still maintaining optimal performance for each single processor. This will greatly improve the efficiency of MCSCF calculations for studying chemical dissociation and for high-accuracy quantum-mechanical simulations.
The Efficiency of Various Computers and Optimizations in Performing Finite Element Computations
NASA Technical Reports Server (NTRS)
Marcus, Martin H.; Broduer, Steve (Technical Monitor)
2001-01-01
With the advent of computers with many processors, it becomes unclear how best to exploit this advantage. For example, matrices can be inverted by applying several processors to each vector operation, or one processor can be applied to each matrix. The former approach has diminishing returns beyond a handful of processors, but how many depends on the computer architecture. Applying one processor to each matrix is feasible with enough RAM and scratch disk space, but the speed at which this is done is found to vary by a factor of three depending on how it is done. The cost of the computer must also be taken into account. A computer with many processors and fast interprocessor communication is much more expensive than the same computer and processors with slow interprocessor communication. Consequently, for problems that require several matrices to be inverted, the best speed per dollar is found to come from several small workstations networked together, such as in a Beowulf cluster. Since these machines typically have two processors per node, each matrix is most efficiently inverted with no more than two processors assigned to it.
Efficient computer algebra algorithms for polynomial matrices in control design
NASA Technical Reports Server (NTRS)
Baras, J. S.; Macenany, D. C.; Munach, R.
1989-01-01
The theory of polynomial matrices plays a key role in the design and analysis of multi-input multi-output control and communications systems using frequency domain methods. Examples include coprime factorizations of transfer functions, canonical realizations from matrix fraction descriptions, and the transfer function design of feedback compensators. Typically, such problems abstract in a natural way to the need to solve systems of Diophantine equations or systems of linear equations over polynomials. These and other problems involving polynomial matrices can in turn be reduced to polynomial matrix triangularization procedures, a result which is not surprising given the importance of matrix triangularization techniques in numerical linear algebra. For matrices with entries from a field, Gaussian elimination plays a fundamental role in understanding the triangularization process. Polynomial matrices, however, have entries from a ring, for which Gaussian elimination is not defined; triangularization is instead accomplished by what is quite properly called Euclidean elimination. Unfortunately, the numerical stability and sensitivity issues which accompany floating-point approaches to Euclidean elimination are not very well understood. New algorithms are presented which circumvent entirely such numerical issues through the use of exact, symbolic methods in computer algebra. The use of such error-free algorithms guarantees that the results are accurate to within the precision of the model data, the best that can be hoped for. Care must be taken in the design of such algorithms due to the phenomenon of intermediate expression swell.
Optimization of computation efficiency in underwater acoustic navigation system.
Lee, Hua
2016-04-01
This paper presents a technique for the estimation of the relative bearing angle between the unmanned underwater vehicle (UUV) and the base station for homing and docking operations. The key requirements of this project include computational efficiency and estimation accuracy for direct implementation on the UUV electronic hardware, subject to severe constraints: the physical size of the UUV housing, the electric power budget set by the required survey duration and range coverage, and the heat dissipation of the hardware. Subsequent to the design and development of the algorithm, two phases of experiments were conducted to illustrate the feasibility and capability of this technique. The presentation of this paper includes system modeling, mathematical analysis, and results from laboratory experiments and full-scale sea tests. PMID:27106337
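The paper's estimator is not reproduced here, but the textbook two-receiver relation underlying acoustic bearing estimation is simple to state: a plane wave arriving at two hydrophones separated by distance d with time difference Δt comes from bearing θ = asin(cΔt/d) off broadside. A sketch with assumed values:

```cpp
// Plane-wave bearing from a two-hydrophone time difference of arrival.
#include <cmath>
#include <cstdio>

int main() {
    const double c  = 1500.0;     // nominal sound speed in water, m/s
    const double d  = 0.5;        // hydrophone separation, m (assumed)
    const double dt = 1.2e-4;     // measured time difference of arrival, s
    double s = c * dt / d;
    if (std::fabs(s) <= 1.0)
        std::printf("bearing = %.2f deg\n",
                    std::asin(s) * 180.0 / std::acos(-1.0));
    else
        std::printf("inconsistent measurement: |c*dt| exceeds d\n");
}
```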
Efficient Computer Network Anomaly Detection by Changepoint Detection Methods
NASA Astrophysics Data System (ADS)
Tartakovsky, Alexander G.; Polunchenko, Aleksey S.; Sokolov, Grigory
2013-02-01
We consider the problem of efficient on-line anomaly detection in computer network traffic. The problem is approached statistically, as that of sequential (quickest) changepoint detection. A multi-cyclic setting of quickest change detection is a natural fit for this problem. We propose a novel score-based multi-cyclic detection algorithm. The algorithm is based on the so-called Shiryaev-Roberts procedure. This procedure is as easy to employ in practice and as computationally inexpensive as the popular Cumulative Sum chart and the Exponentially Weighted Moving Average scheme. The likelihood-ratio-based Shiryaev-Roberts procedure has appealing optimality properties; in particular, it is exactly optimal in a multi-cyclic setting geared to detect a change occurring at a far time horizon. It is therefore expected that an intrusion detection algorithm based on the Shiryaev-Roberts procedure will perform better than other detection schemes. This is confirmed experimentally for real traces. We also discuss the possibility of complementing our anomaly detection algorithm with a spectral-signature intrusion detection system with false alarm filtering and true attack confirmation capability, so as to obtain a synergistic system.
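The Shiryaev-Roberts statistic admits a one-line recursion, R_{n+1} = (1 + R_n) Λ_{n+1}, with an alarm when R_n crosses a threshold A; in the multi-cyclic setting the statistic is simply restarted after each alarm. A self-contained sketch for a Gaussian mean shift (the shift size, threshold, and change point are arbitrary choices for illustration):

```cpp
// Multi-cyclic Shiryaev-Roberts detection of a mean shift in Gaussian data.
#include <cmath>
#include <cstdio>
#include <random>

int main() {
    const double mu1 = 1.0, A = 1000.0;    // post-change mean, alarm threshold
    std::mt19937 gen(42);
    std::normal_distribution<double> pre(0.0, 1.0), post(mu1, 1.0);
    double R = 0.0;
    for (int n = 1; n <= 2000; ++n) {
        double x = (n <= 500) ? pre(gen) : post(gen);   // change occurs at n = 501
        double L = std::exp(mu1 * x - 0.5 * mu1 * mu1); // LR of N(mu1,1) vs N(0,1)
        R = (1.0 + R) * L;                              // Shiryaev-Roberts statistic
        if (R >= A) { std::printf("alarm at n = %d\n", n); R = 0.0; } // restart cycle
    }
}
```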
An efficient parallel algorithm for accelerating computational protein design
Zhou, Yichao; Xu, Wei; Donald, Bruce R.; Zeng, Jianyang
2014-01-01
Motivation: Structure-based computational protein design (SCPR) is an important topic in protein engineering. Under the assumption of a rigid backbone and a finite set of discrete conformations of side-chains, various methods have been proposed to address this problem. A popular method is to combine the dead-end elimination (DEE) and A* tree search algorithms, which provably finds the global minimum energy conformation (GMEC) solution. Results: In this article, we improve the efficiency of computing A* heuristic functions for protein design and propose a variant of the A* algorithm in which the search process can be performed on a single GPU in a massively parallel fashion. In addition, we make some efforts to address the memory-exceeding problem in A* search. As a result, our enhancements can achieve a significant speedup of the A*-based protein design algorithm by four orders of magnitude on large-scale test data through pre-computation and parallelization, while still maintaining an acceptable memory overhead. We also show that our parallel A* search algorithm could be successfully combined with iMinDEE, a state-of-the-art DEE criterion, for rotamer pruning to further improve SCPR with the consideration of continuous side-chain flexibility. Availability: Our software is available and distributed open-source under the GNU Lesser General Public License Version 2.1 (GNU, February 1999). The source code can be downloaded from http://www.cs.duke.edu/donaldlab/osprey.php or http://iiis.tsinghua.edu.cn/~compbio/software.html. Contact: zengjy321@tsinghua.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24931991
Textbook Multigrid Efficiency for Computational Fluid Dynamics Simulations
NASA Technical Reports Server (NTRS)
Brandt, Achi; Thomas, James L.; Diskin, Boris
2001-01-01
Considerable progress over the past thirty years has been made in the development of large-scale computational fluid dynamics (CFD) solvers for the Euler and Navier-Stokes equations. Computations are used routinely to design the cruise shapes of transport aircraft through complex-geometry simulations involving the solution of 25-100 million equations; in this arena the number of wind-tunnel tests for a new design has been substantially reduced. However, simulations of the entire flight envelope of the vehicle, including maximum lift, buffet onset, flutter, and control effectiveness, have not been as successful in eliminating the reliance on wind-tunnel testing. These simulations involve unsteady flows with more separation and stronger shock waves than at cruise. The main reasons limiting further inroads of CFD into the design process are: (1) the reliability of turbulence models; and (2) the time and expense of the numerical simulation. Because of the prohibitive resolution requirements of direct simulations at high Reynolds numbers, transition and turbulence modeling is expected to remain an issue for the near term. The focus of this paper addresses the latter problem by attempting to attain optimal efficiencies in solving the governing equations. Typically, current CFD codes based on the use of multigrid acceleration techniques and multistage Runge-Kutta time-stepping schemes are able to converge lift and drag values for cruise configurations within approximately 1000 residual evaluations. An optimally convergent method is defined as having textbook multigrid efficiency (TME), meaning the solutions to the governing system of equations are attained in a computational work which is a small (less than 10) multiple of the operation count in the discretized system of equations (residual equations). In this paper, a distributed relaxation approach to achieving TME for the Reynolds-averaged Navier-Stokes (RANS) equations is discussed along with the foundations that form the
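To make the TME yardstick concrete, the following is a textbook V-cycle for the 1D Poisson problem -u'' = f, with damped Jacobi smoothing, full-weighting restriction, and linear prolongation. This is only the canonical schoolbook case; the paper's subject, distributed relaxation for the RANS equations, is far beyond this sketch.

```cpp
// Textbook multigrid V-cycle for -u'' = f on [0,1] with u(0) = u(1) = 0.
#include <cstdio>
#include <vector>
using Vec = std::vector<double>;

void relax(Vec& u, const Vec& f, double h, int sweeps) {   // damped Jacobi
    for (int s = 0; s < sweeps; ++s) {
        Vec un = u;
        for (size_t i = 1; i + 1 < u.size(); ++i)
            un[i] = u[i] + 0.67 * (0.5 * (u[i-1] + u[i+1] + h*h*f[i]) - u[i]);
        u = un;
    }
}

void vcycle(Vec& u, const Vec& f, double h) {
    size_t n = u.size() - 1;                 // n intervals, nodes 0..n
    relax(u, f, h, 3);                       // pre-smoothing
    if (n <= 2) { relax(u, f, h, 20); return; }
    Vec r(n + 1, 0.0);                       // residual r = f - A u
    for (size_t i = 1; i < n; ++i)
        r[i] = f[i] + (u[i-1] - 2.0*u[i] + u[i+1]) / (h*h);
    Vec fc(n/2 + 1, 0.0), ec(n/2 + 1, 0.0);  // coarse grid (full weighting)
    for (size_t i = 1; i < n/2; ++i)
        fc[i] = 0.25 * (r[2*i-1] + 2.0*r[2*i] + r[2*i+1]);
    vcycle(ec, fc, 2.0 * h);                 // coarse-grid correction
    for (size_t i = 1; i < n; ++i)           // linear prolongation, then add
        u[i] += (i % 2 == 0) ? ec[i/2] : 0.5 * (ec[i/2] + ec[i/2 + 1]);
    relax(u, f, h, 3);                       // post-smoothing
}

int main() {
    size_t n = 128; double h = 1.0 / n;
    Vec u(n + 1, 0.0), f(n + 1, 1.0);        // -u'' = 1, exact u = x(1-x)/2
    for (int c = 0; c < 10; ++c) vcycle(u, f, h);
    std::printf("u(0.5) = %.6f (exact 0.125)\n", u[n/2]);
}
```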
NASA Astrophysics Data System (ADS)
Hoang, Tuan L.; Marian, Jaime; Bulatov, Vasily V.; Hosemann, Peter
2015-11-01
An improved version of a recently developed stochastic cluster dynamics (SCD) method (Marian and Bulatov, 2012) [6] is introduced as an alternative to rate theory (RT) methods for solving coupled ordinary differential equation (ODE) systems for irradiation damage simulations. SCD circumvents by design the curse of dimensionality of the variable space that renders traditional ODE-based RT approaches inefficient when handling complex defect populations comprising multiple (more than two) defect species. Several improvements introduced here enable efficient and accurate simulations of irradiated materials up to realistic (high) damage doses characteristic of next-generation nuclear systems. The first improvement is a procedure for efficiently updating the defect reaction network and event selection in the context of a dynamically expanding reaction network. Next is a novel implementation of the τ-leaping method that speeds up SCD simulations by advancing the state of the reaction network in large time increments when appropriate. Lastly, a volume rescaling procedure is introduced to control the computational complexity of the expanding reaction network through occasional reductions of the defect population while maintaining accurate statistics. The enhanced SCD method is then applied to model defect cluster accumulation in iron thin films subjected to triple ion-beam (Fe3+, He+ and H+) irradiations, for which standard RT or spatially resolved kinetic Monte Carlo simulations are prohibitively expensive.
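The τ-leaping idea referred to above can be shown in isolation: rather than executing reaction events one at a time, the state is advanced by a time increment τ, with each reaction channel firing a Poisson-distributed number of times at its current propensity. A sketch on a toy dimerization network (the rate constant, τ, and populations are arbitrary):

```cpp
// Tau-leaping on a toy network A + A -> B: fire each channel a Poisson number
// of times per leap instead of simulating individual events (SSA).
#include <cstdio>
#include <random>

int main() {
    std::mt19937 gen(7);
    long A = 10000, B = 0;
    const double k = 1e-4, tau = 0.05;
    for (double t = 0.0; t < 5.0 && A > 1; t += tau) {
        double a = k * A * (A - 1) / 2.0;            // propensity of the channel
        long firings = std::poisson_distribution<long>(a * tau)(gen);
        if (2 * firings > A) firings = A / 2;        // guard against negative counts
        A -= 2 * firings; B += firings;              // apply stoichiometry in bulk
    }
    std::printf("A = %ld, B = %ld\n", A, B);
}
```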
Issa, Naiem T; Peters, Oakland J; Byers, Stephen W; Dakshanamurthy, Sivanesan
2015-01-01
We describe here RepurposeVS for the reliable prediction of drug-target signatures using X-ray protein crystal structures. RepurposeVS is a virtual screening method that incorporates docking, drug-centric and protein-centric 2D/3D fingerprints with a rigorous mathematical normalization procedure to account for the variability in units and provide high-resolution contextual information for drug-target binding. Validity was confirmed by the following: (1) providing the greatest enrichment of known drug binders for multiple protein targets in virtual screening experiments, (2) determining that similarly shaped protein target pockets are predicted to bind drugs of similar 3D shapes when RepurposeVS is applied to 2,335 human protein targets, and (3) determining true biological associations in vitro for mebendazole (MBZ) across many predicted kinase targets for potential cancer repurposing. Since RepurposeVS is a drug repurposing-focused method, benchmarking was conducted on a set of 3,671 FDA approved and experimental drugs rather than the Database of Useful Decoys (DUDE) so as to streamline downstream repurposing experiments. We further apply RepurposeVS to explore the overall potential drug repurposing space for currently approved drugs. RepurposeVS is not computationally intensive and increases performance accuracy, thus serving as an efficient and powerful in silico tool to predict drug-target associations in drug repurposing. PMID:26234515
NASA Astrophysics Data System (ADS)
Coquerelle, Mathieu; Glockner, Stéphane
2016-01-01
We propose an accurate and robust fourth-order curvature extension algorithm in a level set framework for the transport of the interface. The method is based on the Continuum Surface Force approach and is shown to efficiently calculate surface tension forces for two-phase flows. In this framework, the accuracy of the algorithms mostly relies on the precise computation of the surface curvature, which we propose to accomplish using a two-step algorithm: first by computing a reliable fourth-order curvature estimation from the level set function, and second by extending this curvature rigorously in the vicinity of the surface, following the Closest Point principle. The algorithm is easy to implement and to integrate into existing solvers, and can easily be extended to 3D. We propose a detailed analysis of the geometrical and numerical criteria responsible for the appearance of spurious currents, a well-known phenomenon observed in various numerical frameworks. We study the effectiveness of this novel numerical method on state-of-the-art test cases, showing that the resulting curvature estimate significantly reduces parasitic currents. In addition, the proposed approach converges at fourth order with respect to spatial discretization, two orders higher than currently available algorithms. We also show the necessity of high-order transport methods for the surface by studying the case of the 2D advection of a column at equilibrium, thereby proving the robustness of the proposed approach. The algorithm is further validated on more complex test cases such as a rising bubble.
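For context, the standard second-order curvature stencil that fourth-order methods like the one above improve upon evaluates κ = (φ_xx φ_y² - 2 φ_x φ_y φ_xy + φ_yy φ_x²)/|∇φ|³ by centered differences. A sketch on a signed-distance function of a circle, where the exact interface curvature is 1/R:

```cpp
// Second-order finite-difference curvature of a 2D level set function.
#include <cmath>
#include <cstdio>

int main() {
    const int N = 64; const double h = 4.0 / N, R = 1.0;
    static double phi[N][N];
    for (int i = 0; i < N; ++i)                      // signed distance to a circle
        for (int j = 0; j < N; ++j) {
            double x = -2.0 + i * h, y = -2.0 + j * h;
            phi[i][j] = std::sqrt(x * x + y * y) - R;
        }
    int i = N / 2 + N / 4, j = N / 2;                // grid point on the interface
    double px  = (phi[i+1][j] - phi[i-1][j]) / (2*h);
    double py  = (phi[i][j+1] - phi[i][j-1]) / (2*h);
    double pxx = (phi[i+1][j] - 2*phi[i][j] + phi[i-1][j]) / (h*h);
    double pyy = (phi[i][j+1] - 2*phi[i][j] + phi[i][j-1]) / (h*h);
    double pxy = (phi[i+1][j+1] - phi[i+1][j-1]
                - phi[i-1][j+1] + phi[i-1][j-1]) / (4*h*h);
    double g = std::sqrt(px*px + py*py);
    double kappa = (pxx*py*py - 2*px*py*pxy + pyy*px*px) / (g*g*g);
    std::printf("kappa = %.4f (exact 1/R = %.4f)\n", kappa, 1.0 / R);
}
```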
NASA Astrophysics Data System (ADS)
McNamara, Roger P.; Eagle, C. D.
1992-08-01
Planetary Observer High Accuracy Orbit Prediction Program (POHOP), an existing numerical integrator, was modified with the solar and lunar formulae developed by T.C. Van Flandern and K.F. Pulkkinen to provide the accuracy required to evaluate long-term orbit characteristics of objects in the geosynchronous region. The orbit of a 1000 kg class spacecraft is numerically integrated over 50 years using both the original and the more accurate solar and lunar ephemerides methods. Results of this study demonstrate that, over the long term, for an object located in the geosynchronous region, the effects of the more accurate solar and lunar ephemerides on the object's position are significantly different from those obtained using the current POHOP ephemeris.
Efficient and accurate approach to modeling the microstructure and defect properties of LaCoO3
NASA Astrophysics Data System (ADS)
Buckeridge, J.; Taylor, F. H.; Catlow, C. R. A.
2016-04-01
Complex perovskite oxides are promising materials for cathode layers in solid oxide fuel cells. Such materials have intricate electronic, magnetic, and crystalline structures that prove challenging to model accurately. We analyze a wide range of standard density functional theory approaches to modeling a highly promising system, the perovskite LaCoO3, focusing on optimizing the Hubbard U parameter to treat the self-interaction of the B-site cation's d states, in order to determine the most appropriate method to study defect formation and the effect of spin on local structure. By calculating structural and electronic properties for different magnetic states we determine that U =4 eV for Co in LaCoO3 agrees best with available experiments. We demonstrate that the generalized gradient approximation (PBEsol +U ) is most appropriate for studying structure versus spin state, while the local density approximation (LDA +U ) is most appropriate for determining accurate energetics for defect properties.
A computationally efficient particle-simulation method suited to vector-computer architectures
McDonald, J.D.
1990-01-01
Recent interest in a National Aero-Space Plane (NASP) and various Aero-assisted Space Transfer Vehicles (ASTVs) presents the need for a greater understanding of high-speed rarefied flight conditions. Particle simulation techniques such as the Direct Simulation Monte Carlo (DSMC) method are well suited to such problems, but the high cost of computation limits the application of the methods to two-dimensional or very simple three-dimensional problems. This research re-examines the algorithmic structure of existing particle simulation methods and re-structures them to allow efficient implementation on vector-oriented supercomputers. A brief overview of the DSMC method and the Cray-2 vector computer architecture are provided, and the elements of the DSMC method that inhibit substantial vectorization are identified. One such element is the collision selection algorithm. A complete reformulation of the underlying kinetic theory shows that this may be efficiently vectorized for general gas mixtures. The mechanics of collisions are vectorizable in the DSMC method, but several optimizations are suggested that greatly enhance performance. This thesis also proposes a new mechanism for the exchange of energy between vibration and other energy modes. The developed scheme makes use of quantized vibrational states and is used in place of the Borgnakke-Larsen model. Finally, a simplified representation of physical space and boundary conditions is utilized to further reduce the computational cost of the developed method. Comparisons to solutions obtained from the DSMC method for the relaxation of internal energy modes in a homogeneous gas, as well as single and multiple species shock wave profiles, are presented. Additionally, a large-scale simulation of the flow about the proposed Aeroassisted Flight Experiment (AFE) vehicle is included as an example of the new computational capability of the developed particle simulation method.
A computational study of the effect of unstructured mesh quality on solution efficiency
Batdorf, M.; Freitag, L.A.; Ollivier-Gooch, C.
1997-09-01
It is well known that mesh quality affects both efficiency and accuracy of CFD solutions. Meshes with distorted elements make solutions both more difficult to compute and less accurate. We review a recently proposed technique for improving mesh quality as measured by element angle (dihedral angle in three dimensions) using a combination of optimization-based smoothing techniques and local reconnection schemes. Typical results that quantify mesh improvement for a number of application meshes are presented. We then examine effects of mesh quality as measured by the maximum angle in the mesh on the convergence rates of two commonly used CFD solution techniques. Numerical experiments are performed that quantify the cost and benefit of using mesh optimization schemes for incompressible flow over a cylinder and weakly compressible flow over a cylinder.
Modeling weakly-ionized plasmas in magnetic field: A new computationally-efficient approach
NASA Astrophysics Data System (ADS)
Parent, Bernard; Macheret, Sergey O.; Shneider, Mikhail N.
2015-11-01
Despite its success at simulating accurately both non-neutral and quasi-neutral weakly-ionized plasmas, the drift-diffusion model has been observed to be a particularly stiff set of equations. Recently, it was demonstrated that the stiffness of the system could be relieved by rewriting the equations such that the potential is obtained from Ohm's law rather than Gauss's law while adding some source terms to the ion transport equation to ensure that Gauss's law is satisfied in non-neutral regions. Although the latter was applicable to multicomponent and multidimensional plasmas, it could not be used for plasmas in which the magnetic field was significant. This paper hence proposes a new computationally-efficient set of electron and ion transport equations that can be used not only for a plasma with multiple types of positive and negative ions, but also for a plasma in magnetic field. Because the proposed set of equations is obtained from the same physical model as the conventional drift-diffusion equations without introducing new assumptions or simplifications, it results in the same exact solution when the grid is refined sufficiently while being more computationally efficient: not only is the proposed approach considerably less stiff and hence requires fewer iterations to reach convergence but it yields a converged solution that exhibits a significantly higher resolution. The combined faster convergence and higher resolution is shown to result in a hundredfold increase in computational efficiency for some typical steady and unsteady plasma problems including non-neutral cathode and anode sheaths as well as quasi-neutral regions.
Devereux, Mike; Raghunathan, Shampa; Fedorov, Dmitri G; Meuwly, Markus
2014-10-14
A truncated multipole expansion can be re-expressed exactly using an appropriate arrangement of point charges. This means that groups of point charges that are shifted away from nuclear coordinates can be used to achieve accurate electrostatics for molecular systems. We introduce a multipolar electrostatic model formulated in this way for use in computationally efficient multipolar molecular dynamics simulations with well-defined forces and energy conservation in NVE (constant number-volume-energy) simulations. A framework is introduced to distribute torques arising from multipole moments throughout a molecule, and a refined fitting approach is suggested to obtain atomic multipole moments that are optimized for accuracy and numerical stability in a force field context. The formulation of the charge model is outlined as it has been implemented into CHARMM, with application to test systems involving H2O and chlorobenzene. As well as ease of implementation and computational efficiency, the approach can be used to provide snapshots for multipolar QM/MM calculations in QM/MM-MD studies and easily combined with a standard point-charge force field to allow mixed multipolar/point charge simulations of large systems. PMID:26588121
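The identity the model exploits can be verified numerically in a few lines: an ideal point dipole p = qd is reproduced by two charges ±q separated by d, with an error that falls off with distance from the source. The values below are arbitrary illustrative numbers, not parameters from the paper:

```cpp
// Potential of an ideal axial dipole vs. two displaced point charges.
#include <cmath>
#include <cstdio>

int main() {
    const double k = 8.9875517873681764e9;   // 1/(4*pi*eps0), SI
    const double q = 1.0e-19, d = 1.0e-11;   // charge (C) and separation (m)
    const double p = q * d;                  // dipole moment
    double r = 10.0 * d;                     // field point on the dipole axis
    double vExact  = k * p / (r * r);        // ideal axial dipole potential
    double vCharge = k * q * (1.0 / (r - d / 2) - 1.0 / (r + d / 2));
    std::printf("ideal: %.6e V  two charges: %.6e V  rel. err: %.2e\n",
                vExact, vCharge, std::fabs(vCharge - vExact) / vExact);
}
```

At r = 10d the two-charge representation agrees with the ideal dipole to about 0.25%, and the agreement improves quadratically with distance, which is why shifted charge groups can stand in for multipole moments in a force field.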
NASA Astrophysics Data System (ADS)
Sun, Yuansheng; Periasamy, Ammasi
2010-03-01
Förster resonance energy transfer (FRET) microscopy is commonly used to monitor protein interactions with filter-based imaging systems, which require spectral bleedthrough (or cross talk) correction to accurately measure energy transfer efficiency (E). The double-label (donor+acceptor) specimen is excited with the donor wavelength; the acceptor emission provides the uncorrected FRET signal, and the donor emission (the donor channel) represents the quenched donor (qD), the basis for the E calculation. Our results indicate this is not the most accurate determination of the quenched donor signal, as it fails to consider the donor spectral bleedthrough (DSBT) signals in the qD for the E calculation. Our new model addresses this, leading to a more accurate E result. This refinement improves E comparisons made with lifetime and spectral FRET imaging microscopy, as shown here using several genetic (FRET standard) constructs, where cerulean and venus fluorescent proteins are tethered by different amino acid linkers.
NASA Astrophysics Data System (ADS)
Sizov, Gennadi Y.
In this dissertation, a model-based multi-objective optimal design of permanent magnet ac machines, supplied by sine-wave current regulated drives, is developed and implemented. The design procedure uses an efficient electromagnetic finite element-based solver to accurately model nonlinear material properties and complex geometric shapes associated with magnetic circuit design. Application of an electromagnetic finite element-based solver allows for accurate computation of intricate performance parameters and characteristics. The first contribution of this dissertation is the development of a rapid computational method that allows accurate and efficient exploration of large multi-dimensional design spaces in search of optimum design(s). The computationally efficient finite element-based approach developed in this work provides a framework of tools that allow rapid analysis of synchronous electric machines operating under steady-state conditions. In the developed modeling approach, major steady-state performance parameters such as, winding flux linkages and voltages, average, cogging and ripple torques, stator core flux densities, core losses, efficiencies and saturated machine winding inductances, are calculated with minimum computational effort. In addition, the method includes means for rapid estimation of distributed stator forces and three-dimensional effects of stator and/or rotor skew on the performance of the machine. The second contribution of this dissertation is the development of the design synthesis and optimization method based on a differential evolution algorithm. The approach relies on the developed finite element-based modeling method for electromagnetic analysis and is able to tackle large-scale multi-objective design problems using modest computational resources. Overall, computational time savings of up to two orders of magnitude are achievable, when compared to current and prevalent state-of-the-art methods. These computational savings allow
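The optimization loop described above follows the standard DE/rand/1/bin pattern: mutate with a scaled difference of population members, cross over, and keep the trial only if it does not worsen the objective. In the sketch below the expensive finite-element evaluation is replaced by a cheap stand-in objective, and index-distinctness checks are omitted for brevity; the population size, F, and CR are typical but arbitrary values:

```cpp
// Skeleton of DE/rand/1/bin differential evolution with a stand-in objective.
#include <algorithm>
#include <cstdio>
#include <random>
#include <vector>

double objective(const std::vector<double>& x) {     // stand-in for the FE solver
    double s = 0.0; for (double v : x) s += v * v; return s;
}

int main() {
    const int NP = 30, D = 5, GEN = 200;
    const double F = 0.7, CR = 0.9;
    std::mt19937 gen(1);
    std::uniform_real_distribution<double> U(0.0, 1.0);
    std::uniform_int_distribution<int> pick(0, NP - 1), dim(0, D - 1);
    std::vector<std::vector<double>> pop(NP, std::vector<double>(D));
    for (auto& x : pop) for (double& v : x) v = 10.0 * U(gen) - 5.0;
    for (int g = 0; g < GEN; ++g)
        for (int i = 0; i < NP; ++i) {
            int a = pick(gen), b = pick(gen), c = pick(gen);
            std::vector<double> trial = pop[i];
            int jr = dim(gen);                       // guaranteed crossover index
            for (int j = 0; j < D; ++j)
                if (j == jr || U(gen) < CR)          // binomial crossover
                    trial[j] = pop[a][j] + F * (pop[b][j] - pop[c][j]); // mutation
            if (objective(trial) <= objective(pop[i]))  // greedy selection
                pop[i] = trial;
        }
    auto best = *std::min_element(pop.begin(), pop.end(),
        [](const auto& x, const auto& y) { return objective(x) < objective(y); });
    std::printf("best objective: %.3e\n", objective(best));
}
```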
A universal and efficient method to compute maps from image-based prediction models.
Sabuncu, Mert R
2014-01-01
Discriminative supervised learning algorithms, such as Support Vector Machines, are becoming increasingly popular in biomedical image computing. One of their main uses is to construct image-based prediction models, e.g., for computer-aided diagnosis or "mind reading." A major challenge in these applications is the biological interpretation of the machine learning models, which can be arbitrarily complex functions of the input features (e.g., as induced by kernel-based methods). Recent work has proposed several strategies for deriving maps that highlight regions relevant for accurate prediction. Yet most of these methods rely on strong assumptions about the prediction model (e.g., linearity, sparsity) and/or the data (e.g., Gaussianity), or fail to exploit the covariance structure in the data. In this work, we propose a computationally efficient and universal framework for quantifying associations captured by black box machine learning models. Furthermore, our theoretical perspective reveals that examining associations with predictions, in the absence of ground truth labels, can be very informative. We apply the proposed method to machine learning models trained to predict cognitive impairment from structural neuroimaging data. We demonstrate that our approach yields biologically meaningful maps of association. PMID:25320819
Dendritic nonlinearities are tuned for efficient spike-based computations in cortical circuits
Ujfalussy, Balázs B; Makara, Judit K; Branco, Tiago; Lengyel, Máté
2015-01-01
Cortical neurons integrate thousands of synaptic inputs in their dendrites in highly nonlinear ways. It is unknown how these dendritic nonlinearities in individual cells contribute to computations at the level of neural circuits. Here, we show that dendritic nonlinearities are critical for the efficient integration of synaptic inputs in circuits performing analog computations with spiking neurons. We developed a theory that formalizes how a neuron's dendritic nonlinearity that is optimal for integrating synaptic inputs depends on the statistics of its presynaptic activity patterns. Based on their in vivo presynaptic population statistics (firing rates, membrane potential fluctuations, and correlations due to ensemble dynamics), our theory accurately predicted the responses of two different types of cortical pyramidal cells to patterned stimulation by two-photon glutamate uncaging. These results reveal a new computational principle underlying dendritic integration in cortical neurons by suggesting a functional link between cellular and systems-level properties of cortical circuits. DOI: http://dx.doi.org/10.7554/eLife.10056.001 PMID:26705334
The Effect of Computer Automation on Institutional Review Board (IRB) Office Efficiency
ERIC Educational Resources Information Center
Oder, Karl; Pittman, Stephanie
2015-01-01
Companies purchase computer systems to make their processes more efficient through automation. Some academic medical centers (AMC) have purchased computer systems for their institutional review boards (IRB) to increase efficiency and compliance with regulations. IRB computer systems are expensive to purchase, deploy, and maintain. An AMC should…
Building Efficient Wireless Infrastructures for Pervasive Computing Environments
ERIC Educational Resources Information Center
Sheng, Bo
2010-01-01
Pervasive computing is an emerging concept that thoroughly brings computing devices and the consequent technology into people's daily life and activities. Most of these computing devices are very small, sometimes even "invisible", and often embedded into the objects surrounding people. In addition, these devices usually are not isolated, but…
Balancing Accuracy and Computational Efficiency for Ternary Gas Hydrate Systems
NASA Astrophysics Data System (ADS)
White, M. D.
2011-12-01
phase transitions. This paper describes and demonstrates a numerical solution scheme for ternary hydrate systems that seeks a balance between accuracy and computational efficiency. This scheme uses a generalized cubic equation of state, functional forms for the hydrate equilibria and cage occupancies, a variable switching scheme for phase transitions, and kinetic exchange of hydrate formers (i.e., CH4, CO2, and N2) between the mobile phases (i.e., aqueous, liquid CO2, and gas) and the hydrate phase. Accuracy of the scheme will be evaluated by comparing property values and phase equilibria against experimental data. Computational efficiency of the scheme will be evaluated by comparing the base scheme against variants. The application of interest will be the production of a natural gas hydrate deposit from a geologic formation, using the guest-molecule exchange process, where a mixture of CO2 and N2 is injected into the formation. During the guest-molecule exchange, CO2 and N2 will predominantly replace CH4 in the large and small cages of the sI structure, respectively.
Li, Qiang; Zhang, Wei; Guan, Xin; Bai, Yu; Jia, Jing
2014-01-01
The intima-media thickness (IMT) of the common carotid artery (CCA) can serve as an important indicator for the assessment of cardiovascular diseases (CVDs). In this paper, an improved approach for automatic IMT measurement with low complexity and high accuracy is presented. 100 ultrasound images from 100 patients were tested with the proposed approach. The ground truth (GT) of the IMT was manually measured six times and averaged, while the automatically segmented (AS) IMT was computed by the algorithm proposed in this paper. The mean difference ± standard deviation between AS and GT IMT is 0.0231±0.0348 mm, and the correlation coefficient between them is 0.9629. The computational time is 0.3223 s per image with MATLAB under Windows XP on an Intel Core 2 Duo CPU E7500 @2.93 GHz. The proposed algorithm has the potential to achieve real-time measurement under Visual Studio. PMID:25215292
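The reported validation statistics (mean difference ± standard deviation and the correlation coefficient between automatic and manual IMT) are simple to reproduce. A minimal Python sketch follows; the array names and synthetic data are hypothetical.

```python
import numpy as np

def validation_stats(as_imt, gt_imt):
    """Mean difference +/- SD and Pearson correlation between automatic (AS)
    and ground-truth (GT) IMT measurements, as reported in the abstract."""
    diff = np.asarray(as_imt) - np.asarray(gt_imt)
    r = np.corrcoef(as_imt, gt_imt)[0, 1]
    return diff.mean(), diff.std(ddof=1), r

# Hypothetical paired measurements in mm:
rng = np.random.default_rng(1)
gt = rng.uniform(0.5, 1.2, 100)
auto = gt + rng.normal(0.02, 0.03, 100)
print(validation_stats(auto, gt))
```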
Time-Accurate, Unstructured-Mesh Navier-Stokes Computations with the Space-Time CESE Method
NASA Technical Reports Server (NTRS)
Chang, Chau-Lyan
2006-01-01
Application of the newly emerged space-time conservation element solution element (CESE) method to compressible Navier-Stokes equations is studied. In contrast to Euler equations solvers, several issues such as boundary conditions, numerical dissipation, and grid stiffness warrant systematic investigations and validations. Non-reflecting boundary conditions applied at the truncated boundary are also investigated from the standpoint of acoustic wave propagation. Validations of the numerical solutions are performed by comparing with exact solutions for steady-state as well as time-accurate viscous flow problems. The test cases cover a broad speed regime for problems ranging from acoustic wave propagation to 3D hypersonic configurations. Model problems pertinent to hypersonic configurations demonstrate the effectiveness of the CESE method in treating flows with shocks, unsteady waves, and separations. Good agreement with exact solutions suggests that the space-time CESE method provides a viable alternative for time-accurate Navier-Stokes calculations of a broad range of problems.
Kleinschmidt, A M; Pederson, T
1987-01-01
The small nuclear RNAs U1, U2, U4, and U5 are cofactors in mRNA splicing and, like the pre-mRNAs with which they interact, are transcribed by RNA polymerase II. Also like mRNAs, mature U1 and U2 RNAs are generated by 3' processing of their primary transcripts. In this study we have investigated the in vitro processing of an SP6-transcribed human U2 RNA precursor, the 3' end of which matches that of authentic human U2 RNA precursor molecules. Although the SP6-U2 RNA precursor was efficiently processed in an ammonium sulfate-fractionated HeLa cytoplasmic S100 extract, the product RNA was unstable. Further purification of the processing activity on glycerol gradients resolved a 7S activity that nonspecifically cleaved all RNAs tested and a 15S activity that efficiently processed the 3' end of pre-U2 RNA. The 15S activity did not process the 3' end of a tRNA precursor molecule. As demonstrated by RNase protection, the processed 3' end of the SP6-U2 RNA maps to the same nucleotides as does mature HeLa U2 RNA. PMID:3670307
Hashemifar, Somaye; Xu, Jinbo
2014-01-01
Motivation: High-throughput experimental techniques have produced a large amount of protein–protein interaction (PPI) data. The study of PPI networks, for instance through comparative analysis, should benefit our understanding of life processes and diseases at the molecular level. One way of performing comparative analysis is to align PPI networks to identify conserved or species-specific subnetwork motifs. A few methods have been developed for global PPI network alignment, but it remains challenging in terms of both accuracy and efficiency. Results: This paper presents a novel global network alignment algorithm, denoted as HubAlign, that makes use of both network topology and sequence homology information, based upon the observation that topologically important proteins in a PPI network are usually much more conserved and thus more likely to be aligned. HubAlign uses a minimum-degree heuristic algorithm to estimate the topological and functional importance of a protein from the global network topology information. Then HubAlign aligns topologically important proteins first and gradually extends the alignment to the whole network. Extensive tests indicate that HubAlign greatly outperforms several popular methods in terms of both accuracy and efficiency, especially in detecting functionally similar proteins. Availability: HubAlign is available freely for non-commercial purposes at http://ttic.uchicago.edu/∼hashemifar/software/HubAlign.zip Contact: jinboxu@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25161231
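The minimum-degree heuristic can be illustrated with a peeling procedure: repeatedly remove the current minimum-degree node and credit its neighbors, so that long-surviving hubs accumulate high importance. The sketch below is a simplified stand-in only; the published HubAlign weighting scheme also propagates edge weights and blends in sequence homology.

```python
import networkx as nx

def min_degree_importance(g):
    """Score nodes by peeling: repeatedly remove a minimum-degree node and
    credit its remaining neighbors. Nodes that survive longer (hubs) end up
    with higher scores. A simplified stand-in for HubAlign's weighting."""
    g = g.copy()
    score = {v: 0.0 for v in g}
    order = 0
    while g.number_of_nodes() > 0:
        v = min(g.nodes, key=g.degree)        # current minimum-degree node
        order += 1
        for u in g.neighbors(v):
            score[u] += 1.0 / max(g.degree(v), 1)  # share credit upward
        score[v] += order                     # later removal -> more central
        g.remove_node(v)
    return score

g = nx.barbell_graph(5, 2)
print(sorted(min_degree_importance(g).items(), key=lambda kv: -kv[1])[:3])
```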
A Computationally Efficient Multicomponent Equilibrium Solver for Aerosols (MESA)
Zaveri, Rahul A.; Easter, Richard C.; Peters, Len K.
2005-12-23
deliquescence points as well as mass growth factors for the sulfate-rich systems. The MESA-MTEM configuration required only 5 to 10 single-level iterations to obtain the equilibrium solution for ~44% of the 328 multiphase problems solved in the 16 test cases at RH values ranging between 20% and 90%, while ~85% of the problems solved required less than 20 iterations. Based on the accuracy and computational efficiency considerations, the MESA-MTEM configuration is attractive for use in 3-D aerosol/air quality models.
O’Kane, Dermot B.; Lawrentschuk, Nathan; Bolton, Damien M.
2016-01-01
We herein present the case of a 76-year-old man in whom prostate-specific membrane antigen positron emission tomography-computed tomography (PSMA PET-CT) was used to accurately detect prostate cancer (PCa) pelvic lymph node (LN) metastasis in the setting of biochemical recurrence following definitive treatment for PCa. The positive PSMA PET-CT result was confirmed by histological examination of the involved pelvic LNs following pelvic LN dissection. PMID:27141207
Heydari, M.H.; Hooshmandasl, M.R.; Cattani, C.; Maalek Ghaini, F.M.
2015-02-15
Because of their nonlinearity, closed-form solutions of many important stochastic functional equations are virtually impossible to obtain. Thus, numerical solutions are a viable alternative. In this paper, a new computational method based on generalized hat basis functions, together with their stochastic operational matrix of Itô-integration, is proposed for solving nonlinear stochastic Itô integral equations on large intervals. In the proposed method, a new technique for computing nonlinear terms in such problems is presented. The main advantage of the proposed method is that it transforms the problems under consideration into nonlinear systems of algebraic equations which can be simply solved. Error analysis of the proposed method is carried out, and its efficiency is demonstrated on some concrete examples. The obtained results reveal that the proposed method is very accurate and efficient. As two useful applications, the proposed method is applied to obtain approximate solutions of stochastic population growth models and the stochastic pendulum problem.
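The core of the hat-basis construction is that a function sampled at grid nodes supplies its own hat-function coefficients, so the Itô integral of the interpolant collapses to a weighted sum of Brownian increments. The sketch below shows only this quadrature kernel, under that assumption; the paper's full method assembles a stochastic operational matrix and solves the resulting nonlinear algebraic system.

```python
import numpy as np

def ito_integral_hat(f, T=1.0, n=512, seed=0):
    """Approximate int_0^T f(s) dW(s) by expanding f in hat (piecewise
    linear) basis functions: the coefficients are the nodal values, and
    the integral reduces to a weighted sum of Brownian increments (the
    discrete analogue of the stochastic operational matrix idea)."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, T, n + 1)
    dW = rng.normal(0.0, np.sqrt(T / n), n)   # Brownian increments
    c = f(grid)                               # hat coefficients = nodal values
    # left-endpoint (non-anticipating) evaluation, as Ito calculus requires
    return np.sum(c[:-1] * dW)

# Example: int_0^1 s dW(s) has mean 0 and variance int_0^1 s^2 ds = 1/3.
samples = [ito_integral_hat(lambda s: s, seed=k) for k in range(2000)]
print(np.mean(samples), np.var(samples))
```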
Keshavarz, Mohammad Hossein; Gharagheizi, Farhad; Shokrolahi, Arash; Zakinejad, Sajjad
2012-10-30
Most benzoic acid derivatives are toxic, and may cause serious public health and environmental problems. Two novel, simple, and reliable models are introduced for desk calculations of the toxicity of benzoic acid compounds in mice via oral LD(50), with predictions that can be trusted at least as much as the outputs of more complex models. They require only elemental composition and molecular fragments, without using any computer codes. The first model is based on only the number of carbon and hydrogen atoms; it can be improved with several molecular fragments in the second model. For 57 benzoic compounds, for which computed quantitative structure-toxicity relationship (QSTR) results were recently reported, the predictions of the two simple models of the present method are more reliable than the QSTR computations. The present simple method is also tested on a further 324 benzoic acid compounds, including complex molecular structures, which confirms the good forecasting ability of the second model. PMID:22959133
NASA Astrophysics Data System (ADS)
Patchkovskii, Serguei; Muller, H. G.
2016-02-01
Modelling atomic processes in intense laser fields often relies on solving the time-dependent Schrödinger equation (TDSE). For processes involving ionisation, such as above-threshold ionisation (ATI) and high-harmonic generation (HHG), this is a formidable task even if only one electron is active. Several powerful ideas for efficient implementation of atomic TDSE were introduced by H.G. Muller some time ago (Muller, 1999), including: separation of Hamiltonian terms into tri-diagonal parts; implicit representation of the spatial derivatives; and use of a rotating reference frame. Here, we extend these techniques to allow for non-uniform radial grids, arbitrary laser field polarisation, and non-Hermitian terms in the Hamiltonian due to the implicit form of the derivatives (previously neglected). We implement the resulting propagator in a parallel Fortran program, adapted for multi-core execution. Cost of TDSE propagation scales linearly with the problem size, enabling full-dimensional calculations of strong-field ATI and HHG spectra for arbitrary field polarisations on a standard desktop PC.
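The tridiagonal-implicit idea can be demonstrated on a one-dimensional model problem: with a 3-point finite-difference Laplacian, the Crank-Nicolson propagator requires only a banded solve per step, so the cost is linear in the grid size. This is a minimal 1-D sketch, not the paper's spherical, non-uniform-grid propagator.

```python
import numpy as np
from scipy.linalg import solve_banded

def crank_nicolson_tdse(psi, v, dx, dt, steps):
    """Propagate a 1-D TDSE (atomic units) with Crank-Nicolson. The
    Hamiltonian H = -0.5 d2/dx2 + V is tridiagonal after a 3-point
    discretization, so each step is an O(N) banded solve."""
    n = len(psi)
    k = -0.5 / dx**2                        # off-diagonal of H
    diag = -2.0 * k + v                     # main diagonal of H
    a = 1j * dt / 2.0
    ab = np.zeros((3, n), dtype=complex)    # banded storage of (1 + a*H)
    ab[0, 1:] = a * k
    ab[1, :] = 1.0 + a * diag
    ab[2, :-1] = a * k
    for _ in range(steps):
        rhs = (1.0 - a * diag) * psi        # (1 - a*H) psi, diagonal part
        rhs[:-1] -= a * k * psi[1:]         # ... plus off-diagonal parts
        rhs[1:] -= a * k * psi[:-1]
        psi = solve_banded((1, 1), ab, rhs)
    return psi

# Free Gaussian wave packet; Crank-Nicolson is unitary, so the norm stays 1.
x = np.linspace(-40, 40, 2001)
dx = x[1] - x[0]
psi0 = np.exp(-x**2 / 2 + 1j * x)
psi0 /= np.sqrt(np.sum(np.abs(psi0)**2) * dx)
psiT = crank_nicolson_tdse(psi0, np.zeros_like(x), dx, dt=0.01, steps=100)
print(np.sum(np.abs(psiT)**2) * dx)
```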
Qu, Jun; Truhan, Jr., John J
2006-01-01
Point contact is often used in unidirectional pin-on-disk and reciprocating pin-on-flat sliding friction and wear tests. The slider tip could have either a spherical shape or compound curvatures (such as an ellipsoidal shape), and the worn tip usually is not flat but has unknown curvatures. Current methods for determining the wear volumes of sliders suffer from one or more limitations. For example, the gravimetric method is not able to detect small amounts of wear, and the two-dimensional wear scar size measurement is valid only for flat wear scars. More rigorous methods can be very time consuming, such as the 3D surface profiling method that involves obtaining tedious multiple surface profiles and analyzing a large set of data. In this study, a new 'single-trace' analysis is introduced to efficiently evaluate the wear volumes of non-flat worn sliders. This method requires only the measurement of the wear scar size and one trace of profiling to obtain the curvature on the wear cap. The wear volume calculation only involves closed-form algebraic equations. This single-trace method has demonstrated much higher accuracy and fewer limitations than the gravimetric method and 2D method, and has shown good agreement with the 3D method while saving significant surface profiling and data analysis time.
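The closed-form flavor of the single-trace calculation can be illustrated with the volume of a spherical cap determined by the measured scar radius and the cap curvature recovered from one profile trace. The formula below is the generic cap expression only; the paper's equations, which handle compound curvatures and subtract the residual worn cap, are more general.

```python
import math

def cap_volume(scar_radius, cap_radius):
    """Volume of a spherical cap with base (scar) radius a on a surface of
    curvature radius R: h = R - sqrt(R^2 - a^2), V = pi*h^2*(3R - h)/3.
    The basic building block of closed-form wear-volume calculations."""
    a, R = scar_radius, cap_radius
    if a > R:
        raise ValueError("scar radius cannot exceed the radius of curvature")
    h = R - math.sqrt(R * R - a * a)     # cap height
    return math.pi * h * h * (3.0 * R - h) / 3.0

# Example: 0.4 mm scar radius on a tip with 5 mm radius of curvature.
print(cap_volume(0.4, 5.0), "mm^3")
```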
Computationally efficient algorithms for real-time attitude estimation
NASA Technical Reports Server (NTRS)
Pringle, Steven R.
1993-01-01
For many practical spacecraft applications, algorithms for determining spacecraft attitude must combine inputs from diverse sensors and provide redundancy in the event of sensor failure. A Kalman filter is suitable for this task; however, it may impose a computational burden that can be avoided by suboptimal methods. A suboptimal estimator is presented which was implemented successfully on the Delta Star spacecraft, which performed a 9-month SDI flight experiment in 1989. This design sought to minimize algorithm complexity to accommodate the limitations of an 8K guidance computer. The algorithm used is interpreted in the framework of Kalman filtering, and a derivation is given for the computation.
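In the same spirit, a fixed-gain (steady-state-Kalman-like) estimator avoids the filter's covariance propagation entirely. The single-axis sketch below is a generic illustration of that idea, not the Delta Star algorithm itself; the gain and noise levels are made up.

```python
import numpy as np

def fixed_gain_attitude_step(angle_est, gyro_rate, vector_angle, dt, k=0.02):
    """One step of a fixed-gain suboptimal attitude estimator for a single
    axis: propagate with the gyro, then correct toward the attitude implied
    by a reference-vector (e.g., sun sensor) measurement."""
    predicted = angle_est + gyro_rate * dt            # gyro propagation
    return predicted + k * (vector_angle - predicted) # fixed-gain correction

rng = np.random.default_rng(0)
true_angle, est, dt = 0.0, 0.1, 0.01
for _ in range(2000):
    rate = 0.05 + rng.normal(0, 0.001)                # noisy gyro rate
    true_angle += 0.05 * dt
    meas = true_angle + rng.normal(0, 0.01)           # noisy sensor angle
    est = fixed_gain_attitude_step(est, rate, meas, dt)
print(true_angle, est)
```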
NASA Technical Reports Server (NTRS)
Kylling, Arve; Stamnes, Knut
1992-01-01
The present solutions to the linear transport equation pertain to monoenergetic particles' interaction with a multiple scattering/absorbing layered medium with a general anisotropic internal source term. Attention is given to a novel exponential-linear approximation to the internal source, as a function of scattering depth, which furnishes an at-once efficient and accurate solution to the linear transport equation through its reduction of the spatial mesh size. The great superiority of the proposed method is demonstrated by the numerical results obtained in the illustrative cases of (1) an embedded thermal source and (2) a rapidly varying beam pseudosource.
NASA Astrophysics Data System (ADS)
Neese, Frank; Wennmohs, Frank; Hansen, Andreas
2009-03-01
Coupled-electron pair approximations (CEPAs) and coupled-pair functionals (CPFs) were popular in the 1970s and 1980s and yielded excellent results for small molecules. Recently, interest in CEPA and CPF methods has been renewed. It has been shown that these methods lead to competitive thermochemical, kinetic, and structural predictions. They greatly surpass second order Møller-Plesset and popular density functional theory based approaches in accuracy and are intermediate in quality between CCSD and CCSD(T) in extended benchmark studies. In this work an efficient production level implementation of the closed shell CEPA and CPF methods is reported that can be applied to medium-sized molecules in the range of 50-100 atoms and up to about 2000 basis functions. The internal space is spanned by localized internal orbitals. The external space is greatly compressed through the method of pair natural orbitals (PNOs) that was also introduced by the pioneers of the CEPA approaches. Our implementation also makes extended use of density fitting (or resolution of the identity) techniques in order to speed up the laborious integral transformations. The method is called local pair natural orbital CEPA (LPNO-CEPA) (LPNO-CPF). The implementation is centered around the concepts of electron pairs and matrix operations. Altogether three cutoff parameters are introduced that control the size of the significant pair list, the average number of PNOs per electron pair, and the number of contributing basis functions per PNO. With the conservatively chosen default values of these thresholds, the method recovers about 99.8% of the canonical correlation energy. This translates to absolute deviations from the canonical result of only a few kcal mol-1. Extended numerical test calculations demonstrate that LPNO-CEPA (LPNO-CPF) has essentially the same accuracy as parent CEPA (CPF) methods for thermochemistry, kinetics, weak interactions, and potential energy surfaces but is up to 500
Oltean, Gabriel; Ivanciu, Laura-Nicoleta
2016-01-01
The design and verification of complex electronic systems, especially analog and mixed-signal ones, prove to be extremely time-consuming tasks if only circuit-level simulations are involved. A significant amount of time can be saved if a cost-effective solution is used for the extensive analysis of the system under all conceivable conditions. This paper proposes a data-driven method to build fast-to-evaluate but also accurate metamodels capable of generating not-yet-simulated waveforms as a function of different combinations of the parameters of the system. The necessary data are obtained by early-stage simulation of an electronic control system from the automotive industry. The metamodel development is based on three key elements: a wavelet transform for waveform characterization, a genetic algorithm optimization to detect the optimal wavelet transform and to identify the most relevant decomposition coefficients, and an artificial neural network to derive the relevant coefficients of the wavelet transform for any new parameter combination. The resulting metamodels for three different waveform families are fully reliable. They satisfy the required key points: high accuracy (a maximum mean squared error of 7.1×10^-5 for the unity-based normalized waveforms), efficiency (fully affordable computational effort for metamodel build-up: maximum 18 minutes on a general purpose computer), and simplicity (less than 1 second for running the metamodel, the user only provides the parameter combination). The metamodels can be used for very efficient generation of new waveforms, for any possible combination of dependent parameters, offering the possibility to explore the entire design space. A wide range of possibilities becomes achievable for the user, such as: all design corners can be analyzed, possible worst-case situations can be investigated, extreme values of waveforms can be discovered, sensitivity analyses can be performed (the influence of each parameter on the
NASA Astrophysics Data System (ADS)
Xu, Li; Weng, Peifen
2014-02-01
An improved fifth-order weighted essentially non-oscillatory (WENO-Z) scheme combined with the moving overset grid technique has been developed to compute unsteady compressible viscous flows over a helicopter rotor in forward flight. In order to enforce periodic rotation and pitching of the rotor and relative motion between rotor blades, the moving overset grid technique is extended, and a special judgement standard is introduced near the odd surface of the blade grid during the search for donor cells using the Inverse Map method. The WENO-Z scheme is adopted for reconstructing left and right state values, with the Roe Riemann solver updating the inviscid fluxes, and is compared with the monotone upwind scheme for scalar conservation laws (MUSCL) and the classical WENO scheme. Since the WENO schemes require a six-point stencil to build the fifth-order flux, a method of three layers of fringes for hole boundaries and artificial external boundaries is proposed to carry out flow-information exchange between chimera grids. The time advance of the unsteady solution is performed by the fully implicit dual time stepping method with Newton-type LU-SGS subiteration, where the solutions of a pseudo-steady computation serve as the initial fields for the unsteady flow computation. Numerical results on a non-variable pitch rotor and a periodic variable pitch rotor in forward flight reveal that the approach can effectively capture the vortex wake with low dissipation and reach periodic solutions very quickly.
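The WENO-Z reconstruction itself is compact enough to sketch. The following function reconstructs a left interface state from a five-point stencil using the Borges et al. global smoothness indicator τ5 = |β0 − β2|; coupling to the Roe solver and the overset-grid machinery is omitted.

```python
import numpy as np

def weno_z_reconstruct(v, eps=1e-40, p=2):
    """Fifth-order WENO-Z reconstruction of the left interface state from
    a 5-point stencil v = (v[i-2], ..., v[i+2]) (Borges et al., 2008)."""
    # candidate third-order reconstructions
    q0 = (2*v[0] - 7*v[1] + 11*v[2]) / 6.0
    q1 = (-v[1] + 5*v[2] + 2*v[3]) / 6.0
    q2 = (2*v[2] + 5*v[3] - v[4]) / 6.0
    # Jiang-Shu smoothness indicators
    b0 = 13/12*(v[0]-2*v[1]+v[2])**2 + 0.25*(v[0]-4*v[1]+3*v[2])**2
    b1 = 13/12*(v[1]-2*v[2]+v[3])**2 + 0.25*(v[1]-v[3])**2
    b2 = 13/12*(v[2]-2*v[3]+v[4])**2 + 0.25*(3*v[2]-4*v[3]+v[4])**2
    tau5 = abs(b0 - b2)                    # WENO-Z global indicator
    d = np.array([0.1, 0.6, 0.3])          # ideal linear weights
    alpha = d * (1.0 + (tau5 / (np.array([b0, b1, b2]) + eps))**p)
    w = alpha / alpha.sum()
    return w @ np.array([q0, q1, q2])

# On smooth data the result approaches the fifth-order upwind value.
x = np.linspace(0, 0.1, 5)
print(weno_z_reconstruct(np.sin(2*np.pi*x)))
```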
Limits on efficient computation in the physical world
NASA Astrophysics Data System (ADS)
Aaronson, Scott Joel
More than a speculative technology, quantum computing seems to challenge our most basic intuitions about how the physical world should behave. In this thesis I show that, while some intuitions from classical computer science must be jettisoned in the light of modern physics, many others emerge nearly unscathed; and I use powerful tools from computational complexity theory to help determine which are which. In the first part of the thesis, I attack the common belief that quantum computing resembles classical exponential parallelism, by showing that quantum computers would face serious limitations on a wider range of problems than was previously known. In particular, any quantum algorithm that solves the collision problem---that of deciding whether a sequence of n integers is one-to-one or two-to-one---must query the sequence Ω(n^(1/5)) times. This resolves a question that was open for years; previously no lower bound better than constant was known. A corollary is that there is no "black-box" quantum algorithm to break cryptographic hash functions or solve the Graph Isomorphism problem in polynomial time. I also show that relative to an oracle, quantum computers could not solve NP-complete problems in polynomial time, even with the help of nonuniform "quantum advice states"; and that any quantum algorithm needs Ω(2^(n/4)/n) queries to find a local minimum of a black-box function on the n-dimensional hypercube. Surprisingly, the latter result also leads to new classical lower bounds for the local search problem. Finally, I give new lower bounds on quantum one-way communication complexity, and on the quantum query complexity of total Boolean functions and recursive Fourier sampling. The second part of the thesis studies the relationship of the quantum computing model to physical reality. I first examine the arguments of Leonid Levin, Stephen Wolfram, and others who believe quantum computing to be fundamentally impossible. I find their arguments unconvincing without a "Sure
Computationally efficient calibration of WATCLASS Hydrologic models using surrogate optimization
NASA Astrophysics Data System (ADS)
Kamali, M.; Ponnambalam, K.; Soulis, E. D.
2007-07-01
In this approach, exploration of the cost-function space was performed with an inexpensive surrogate function rather than the expensive original function. The Design and Analysis of Computer Experiments (DACE) surrogate, an approximate model that employs a correlation function for the error term, was used. The results for Monte Carlo sampling, Latin hypercube sampling, and the DACE approximate model were compared. The results show that the DACE model has good potential for predicting the trend of simulation results. The case study was the calibration of the WATCLASS hydrologic model on the Smokey-River watershed.
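A DACE-style surrogate loop is short to sketch: sample the expensive cost function at a handful of design sites, fit a kriging (Gaussian-process) surrogate whose correlation kernel models the error, then search the cheap surrogate. The sketch below uses scikit-learn's Gaussian process as a stand-in for the paper's DACE implementation, and the two-parameter cost function is a hypothetical substitute for a WATCLASS run.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def surrogate_calibrate(expensive_cost, bounds, n_samples=30, seed=0):
    """Surrogate optimization: fit a kriging model to a small design and
    optimize the cheap surrogate mean instead of the expensive model."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds[:, 0], bounds[:, 1]
    X = lo + rng.random((n_samples, len(lo))) * (hi - lo)   # design sites
    y = np.array([expensive_cost(x) for x in X])
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0),
                                  normalize_y=True).fit(X, y)
    x0 = X[np.argmin(y)]                  # start from the best sampled site
    res = minimize(lambda x: gp.predict(x.reshape(1, -1))[0], x0,
                   bounds=list(map(tuple, bounds)))
    return res.x, res.fun

# Hypothetical stand-in for an expensive hydrologic-model calibration cost:
cost = lambda p: (p[0] - 0.3)**2 + (p[1] - 0.7)**2 + 0.05*np.sin(15*p[0])
print(surrogate_calibrate(cost, np.array([[0.0, 1.0], [0.0, 1.0]])))
```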
Efficient computational simulation of actin stress fiber remodeling.
Ristori, T; Obbink-Huizer, C; Oomens, C W J; Baaijens, F P T; Loerakker, S
2016-09-01
Understanding collagen and stress fiber remodeling is essential for the development of engineered tissues with good functionality. These processes are complex, highly interrelated, and occur over different time scales. As a result, excessive computational costs are required to computationally predict the final organization of these fibers in response to dynamic mechanical conditions. In this study, an analytical approximation of a stress fiber remodeling evolution law was derived. A comparison of the developed technique with the direct numerical integration of the evolution law showed relatively small differences in results, and the proposed method is one to two orders of magnitude faster. PMID:26823159
Methods for Computationally Efficient Structured CFD Simulations of Complex Turbomachinery Flows
NASA Technical Reports Server (NTRS)
Herrick, Gregory P.; Chen, Jen-Ping
2012-01-01
This research presents more efficient computational methods by which to perform multi-block structured Computational Fluid Dynamics (CFD) simulations of turbomachinery, thus facilitating higher-fidelity solutions of complicated geometries and their associated flows. This computational framework offers flexibility in allocating resources to balance process count and wall-clock computation time, while facilitating research interests of simulating axial compressor stall inception with more complete gridding of the flow passages and rotor tip clearance regions than is typically practiced with structured codes. The paradigm presented herein facilitates CFD simulation of previously impractical geometries and flows. These methods are validated and demonstrate improved computational efficiency when applied to complicated geometries and flows.
Efficient algorithm to compute mutually connected components in interdependent networks.
Hwang, S; Choi, S; Lee, Deokjae; Kahng, B
2015-02-01
Mutually connected components (MCCs) play an important role as a measure of resilience in the study of interdependent networks. Despite their importance, an efficient algorithm to obtain the statistics of all MCCs during the removal of links has thus far been absent. Here, using a well-known fully dynamic graph algorithm, we propose an efficient algorithm to accomplish this task. We show that the time complexity of this algorithm is approximately O(N^1.2) for random graphs, which is more efficient than the O(N^2) of the brute-force algorithm. We confirm the correctness of our algorithm by comparing the behavior of the order parameter as links are removed with existing results for three types of double-layer multiplex networks. We anticipate that this algorithm will be used for simulations of large-size systems that have been previously inaccessible. PMID:25768559
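For reference, the brute-force baseline the authors improve upon can be written directly: refine node blocks until every block is connected within both layers. A sketch using networkx follows; the paper's contribution is replacing this recomputation with a fully dynamic graph algorithm.

```python
import networkx as nx

def mutually_connected_components(g1, g2):
    """Brute-force MCCs of a two-layer multiplex: iteratively split node
    blocks until every block is connected inside both layers."""
    blocks = [set(g1.nodes) & set(g2.nodes)]
    changed = True
    while changed:
        changed = False
        refined = []
        for b in blocks:
            # split by layer-1 components, then split each piece by layer 2
            parts = [set(c) for c in nx.connected_components(g1.subgraph(b))]
            parts = [set(c) for p in parts
                     for c in nx.connected_components(g2.subgraph(p))]
            if len(parts) > 1:
                changed = True
            refined.extend(parts)
        blocks = refined
    return blocks

g1 = nx.erdos_renyi_graph(200, 0.02, seed=1)
g2 = nx.erdos_renyi_graph(200, 0.02, seed=2)
print(max(len(b) for b in mutually_connected_components(g1, g2)))
```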
An Efficient Virtual Machine Consolidation Scheme for Multimedia Cloud Computing
Han, Guangjie; Que, Wenhui; Jia, Gangyong; Shu, Lei
2016-01-01
Cloud computing has innovated the IT industry in recent years, as it can deliver subscription-based services to users in the pay-as-you-go model. Meanwhile, multimedia cloud computing is emerging based on cloud computing to provide a variety of media services on the Internet. However, with the growing popularity of multimedia cloud computing, its large energy consumption not only contributes to greenhouse gas emissions, but also raises cloud users' costs. Therefore, multimedia cloud providers should try to minimize energy consumption as much as possible while satisfying consumers' resource requirements and guaranteeing quality of service (QoS). In this paper, we propose a remaining utilization-aware (RUA) algorithm for virtual machine (VM) placement, and a power-aware algorithm (PA) to find proper hosts to shut down for energy saving. These two algorithms have been combined and applied to cloud data centers to complete the process of VM consolidation. Simulation results show that there exists a trade-off between the cloud data center's energy consumption and service-level agreement (SLA) violations. Besides, the RUA algorithm is able to deal with variable workload to prevent hosts from overloading after VM placement and to reduce SLA violations dramatically. PMID:26901201
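Placement heuristics of this kind are typically greedy bin-packing variants. The sketch below shows a standard best-fit-decreasing placement in the spirit of, but not identical to, the paper's RUA algorithm; capacities and demands are one-dimensional for simplicity.

```python
def place_vms(vm_demands, host_capacities):
    """Best-fit-decreasing placement: sort VMs by demand and put each on
    the active host whose remaining capacity fits most tightly; open a new
    host only when necessary. A generic consolidation baseline."""
    remaining = []                  # remaining capacity per active host
    placement = {}
    for vm, demand in sorted(enumerate(vm_demands), key=lambda kv: -kv[1]):
        candidates = [(rem - demand, h) for h, rem in enumerate(remaining)
                      if rem >= demand]
        if candidates:
            _, h = min(candidates)              # tightest fit wins
        else:
            if not host_capacities:
                raise RuntimeError("out of hosts")
            remaining.append(host_capacities.pop(0))   # power on a host
            h = len(remaining) - 1
        remaining[h] -= demand
        placement[vm] = h
    return placement, remaining

print(place_vms([0.3, 0.5, 0.2, 0.6, 0.1, 0.4], [1.0, 1.0, 1.0, 1.0]))
```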
A Computationally Efficient Model for Multicomponent Activity Coefficients in Aqueous Solutions
Zaveri, Rahul A.; Easter, Richard C.; Wexler, Anthony S.
2004-10-04
Three-dimensional models of atmospheric inorganic aerosols need an accurate yet computationally efficient parameterization of activity coefficients, which are repeatedly updated in aerosol phase equilibrium and gas-aerosol partitioning calculations. In this paper, we describe the development and evaluation of a new mixing rule for estimating multicomponent activity coefficients of electrolytes typically found in atmospheric aerosol systems containing H(+), NH4(+), Na(+), Ca(2+), SO4(2-), HSO4(-), NO3(-), and Cl(-) ions. The new mixing rule, called MTEM (Multicomponent Taylor Expansion Model), estimates the mean activity coefficient of an electrolyte A in a multicomponent solution from a linear combination of its values in ternary solutions of A-A-H2O, A-B-H2O, A-C-H2O, etc., as the amount of A approaches zero in the mixture at the solution water activity, aw, assuming aw is equal to the ambient relative humidity. Predictions from MTEM are found to be within a factor of 0.8 to 1.25 of the comprehensive Pitzer-Simonson-Clegg (PSC) model over a wide range of water activities, and are shown to be significantly more accurate than the widely used Kusik and Meissner (KM) mixing rule, especially for electrolytes in sulfate-rich aerosol systems and for relatively minor but important aerosol components such as HNO3 and HCl acids. Because the ternary activity coefficient polynomials are parameterized as a function of aw, they have to be computed only once at every grid point at the beginning of every 3-D model time step as opposed to repeated evaluations of the ionic strength dependent binary activity coefficient polynomials in the KM method. Additionally, MTEM also yields a non-iterative solution of the bisulfate ion dissociation in sulfate-rich systems, which is a major computational advantage over other iterative methods as will be shown by a comparison of the CPU time requirements of MTEM for both sulfate-poor and sulfate-rich systems relative to other methods.
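The mixing rule reduces to a weighted linear combination of ternary-solution values evaluated at the mixture's water activity. The sketch below shows that combination step only; the ternary log-activity functions are hypothetical placeholders, and the published MTEM parameterizations must be substituted for real use.

```python
import numpy as np

def mtem_mixture_log_gamma(ternary_log_gamma, solute_fractions, aw):
    """Generic MTEM-style mixing step: the log mean activity coefficient of
    electrolyte A in the mixture is a linear combination of its values in
    the ternary A-B-H2O, A-C-H2O, ... solutions at the same water activity
    aw, weighted by the solute fractions."""
    return sum(x_b * f(aw)
               for f, x_b in zip(ternary_log_gamma, solute_fractions))

# Hypothetical ternary polynomials ln(gamma_A) as functions of aw:
ternary = [lambda aw: -0.5 + 0.3 * aw,    # A in A-B-H2O
           lambda aw: -0.8 + 0.6 * aw]    # A in A-C-H2O
x = [0.4, 0.6]                            # solute fractions of B and C
print(np.exp(mtem_mixture_log_gamma(ternary, x, aw=0.8)))
```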
Gritzo, L.A.; Koski, J.A.; Suo-Anttila, A.J.
1999-03-16
The Container Analysis Fire Environment computer code (CAFE) is intended to provide Type B package designers with an enhanced engulfing fire boundary condition when combined with the PATRAN/P-Thermal commercial code. Historically, an engulfing fire boundary condition has been modeled as σT^4, where σ is the Stefan-Boltzmann constant and T is the fire temperature. The CAFE code includes the necessary chemistry, thermal radiation, and fluid mechanics to model an engulfing fire. Effects included are the local cooling of gases that form a protective boundary layer that reduces the incoming radiant heat flux to values lower than expected from a simple σT^4 model. In addition, the effect of object shape on mixing that may increase the local fire temperature is included. Both high and low temperature regions that depend upon the local availability of oxygen are also calculated. Thus the competing effects that can both increase and decrease the local values of radiant heat flux are included in a manner that is not predictable a priori. The CAFE package consists of a group of computer subroutines that can be linked to workstation-based thermal analysis codes in order to predict package performance during regulatory and other accident fire scenarios.
Learning with Computer-Based Multimedia: Gender Effects on Efficiency
ERIC Educational Resources Information Center
Pohnl, Sabine; Bogner, Franz X.
2012-01-01
Up to now, only a few studies in multimedia learning have focused on gender effects. While research has mostly focused on learning success, the effect of gender on instructional efficiency (IE) has not yet been considered. Consequently, we used a quasi-experimental design to examine possible gender differences in the learning success, mental…
Sundaramurthy, Aravind; Alai, Aaron; Ganpule, Shailesh; Holmberg, Aaron; Plougonven, Erwan; Chandra, Namas
2012-09-01
Blast waves generated by improvised explosive devices (IEDs) cause traumatic brain injury (TBI) in soldiers and civilians. In vivo animal models that use shock tubes are extensively used in laboratories to simulate field conditions, to identify mechanisms of injury, and to develop injury thresholds. In this article, we place rats in different locations along the length of the shock tube (i.e., inside, outside, and near the exit), to examine the role of animal placement location (APL) in the biomechanical load experienced by the animal. We found that the biomechanical load on the brain and internal organs in the thoracic cavity (lungs and heart) varied significantly depending on the APL. When the specimen is positioned outside, organs in the thoracic cavity experience a higher pressure for a longer duration, in contrast to APL inside the shock tube. This in turn will possibly alter the injury type, severity, and lethality. We found that the optimal APL is where the Friedlander waveform is first formed inside the shock tube. Once the optimal APL was determined, the effect of the incident blast intensity on the surface and intracranial pressure was measured and analyzed. Noticeably, surface and intracranial pressure increases linearly with the incident peak overpressures, though surface pressures are significantly higher than the other two. Further, we developed and validated an anatomically accurate finite element model of the rat head. With this model, we determined that the main pathway of pressure transmission to the brain was through the skull and not through the snout; however, the snout plays a secondary role in diffracting the incoming blast wave towards the skull. PMID:22620716
Computationally efficient statistical differential equation modeling using homogenization
Hooten, Mevin B.; Garlick, Martha J.; Powell, James A.
2013-01-01
Statistical models using partial differential equations (PDEs) to describe dynamically evolving natural systems are appearing in the scientific literature with some regularity in recent years. Often such studies seek to characterize the dynamics of temporal or spatio-temporal phenomena such as invasive species, consumer-resource interactions, community evolution, and resource selection. Specifically, in the spatial setting, data are often available at varying spatial and temporal scales. Additionally, the necessary numerical integration of a PDE may be computationally infeasible over the spatial support of interest. We present an approach to impose computationally advantageous changes of support in statistical implementations of PDE models and demonstrate its utility through simulation using a form of PDE known as “ecological diffusion.” We also apply a statistical ecological diffusion model to a data set involving the spread of mountain pine beetle (Dendroctonus ponderosae) in Idaho, USA.
Labeled trees and the efficient computation of derivations
NASA Technical Reports Server (NTRS)
Grossman, Robert; Larson, Richard G.
1989-01-01
The effective parallel symbolic computation of operators under composition is discussed. Examples include differential operators under composition and vector fields under the Lie bracket. Data structures consisting of formal linear combinations of rooted labeled trees are discussed. A multiplication on rooted labeled trees is defined, thereby making the set of these data structures into an associative algebra. An algebra homomorphism is defined from the original algebra of operators into this algebra of trees. An algebra homomorphism from the algebra of trees into the algebra of differential operators is then described. The cancellation which occurs when noncommuting operators are expressed in terms of commuting ones occurs naturally when the operators are represented using this data structure. This leads to an algorithm which, for operators which are derivations, speeds up the computation exponentially in the degree of the operator. It is shown that the algebra of trees leads naturally to a parallel version of the algorithm.
Algorithmic and architectural optimizations for computationally efficient particle filtering.
Sankaranarayanan, Aswin C; Srivastava, Ankur; Chellappa, Rama
2008-05-01
In this paper, we analyze the computational challenges in implementing particle filtering, especially as applied to video sequences. Particle filtering is a technique used for filtering nonlinear dynamical systems driven by non-Gaussian noise processes. It has found widespread applications in detection, navigation, and tracking problems. Although, in general, particle filtering methods yield improved results, it is difficult to achieve real-time performance. In this paper, we analyze the computational drawbacks of traditional particle filtering algorithms, and present a method for implementing the particle filter using the independent Metropolis-Hastings sampler, which is highly amenable to pipelined implementations and parallelization. We analyze the implementations of the proposed algorithm, and, in particular, concentrate on implementations that have minimum processing times. It is shown that the design parameters for the fastest implementation can be chosen by solving a set of convex programs. The proposed computational methodology was verified using a cluster of PCs for the application of visual tracking. We demonstrate a linear speed-up of the algorithm using the methodology proposed in the paper. PMID:18390378
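What makes the independent Metropolis-Hastings sampler attractive for pipelining is visible in code: because proposals do not depend on the current state, all candidates and their weights can be generated up front (in parallel), leaving only a cheap sequential accept/reject pass. A minimal sketch:

```python
import numpy as np

def independent_mh(target_pdf, proposal_sample, proposal_pdf, n, x0, seed=0):
    """Independent Metropolis-Hastings with importance weights w = p/q.
    The candidate draws and weights are state-independent, hence
    parallelizable; only the accept/reject pass is sequential."""
    rng = np.random.default_rng(seed)
    xs = np.empty(n)
    x, wx = x0, target_pdf(x0) / proposal_pdf(x0)
    cand = proposal_sample(rng, n)                 # parallelizable stage
    w = target_pdf(cand) / proposal_pdf(cand)      # parallelizable stage
    u = rng.random(n)
    for i in range(n):                             # cheap sequential pass
        if u[i] < min(1.0, w[i] / wx):
            x, wx = cand[i], w[i]
        xs[i] = x
    return xs

# Example: sample N(1, 0.5^2) using a broad N(0, 2^2) independent proposal.
target = lambda x: np.exp(-0.5 * ((x - 1.0) / 0.5) ** 2)
prop_pdf = lambda x: np.exp(-0.5 * (x / 2.0) ** 2)
draw = lambda rng, n: rng.normal(0.0, 2.0, n)
s = independent_mh(target, draw, prop_pdf, 20000, 0.0)
print(s.mean(), s.std())   # approximately 1.0 and 0.5
```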
Chunking as the result of an efficiency computation trade-off.
Ramkumar, Pavan; Acuna, Daniel E; Berniker, Max; Grafton, Scott T; Turner, Robert S; Kording, Konrad P
2016-01-01
How to move efficiently is an optimal control problem, whose computational complexity grows exponentially with the horizon of the planned trajectory. Breaking a compound movement into a series of chunks, each planned over a shorter horizon can thus reduce the overall computational complexity and associated costs while limiting the achievable efficiency. This trade-off suggests a cost-effective learning strategy: to learn new movements we should start with many short chunks (to limit the cost of computation). As practice reduces the impediments to more complex computation, the chunking structure should evolve to allow progressively more efficient movements (to maximize efficiency). Here we show that monkeys learning a reaching sequence over an extended period of time adopt this strategy by performing movements that can be described as locally optimal trajectories. Chunking can thus be understood as a cost-effective strategy for producing and learning efficient movements. PMID:27397420
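The trade-off can be made concrete with a toy cost model in which per-chunk planning cost grows exponentially with chunk length; the functional form below is illustrative only.

```python
import numpy as np

def planning_cost(horizon, chunk_len, compute_price):
    """Total computation for a movement split into chunks: per-chunk cost
    grows exponentially with chunk length, so chunk count trades off
    against per-chunk difficulty. Illustrative model only."""
    n_chunks = np.ceil(horizon / chunk_len)
    return n_chunks * compute_price * np.exp(chunk_len)

horizon = 12.0
for c in [1, 2, 3, 4, 6, 12]:
    print(c, planning_cost(horizon, c, compute_price=1.0))
# As compute gets cheaper with practice, the optimal chunk length grows.
```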
Bedogni, Alberto; Fedele, Stefano; Bedogni, Giorgio; Scoletta, Matteo; Favia, Gianfranco; Colella, Giuseppe; Agrillo, Alessandro; Bettini, Giordana; Di Fede, Olga; Oteri, Giacomo; Fusco, Vittorio; Gabriele, Mario; Ottolenghi, Livia; Valsecchi, Stefano; Porter, Stephen; Petruzzi, Massimo; Arduino, Paolo; D'Amato, Salvatore; Ungari, Claudio; Fung Polly, Pok-Lam; Saia, Giorgia; Campisi, Giuseppina
2014-09-01
Management of osteonecrosis of the jaw associated with antiresorptive agents is challenging, and outcomes are unpredictable. The severity of disease is the main guide to management, and can help to predict prognosis. Most available staging systems for osteonecrosis, including the widely-used American Association of Oral and Maxillofacial Surgeons (AAOMS) system, classify severity on the basis of clinical and radiographic findings. However, clinical inspection and radiography are limited in their ability to identify the extent of necrotic bone disease compared with computed tomography (CT). We have organised a large multicentre retrospective study (known as MISSION) to investigate the agreement between the AAOMS staging system and the extent of osteonecrosis of the jaw (focal compared with diffuse involvement of bone) as detected on CT. We studied 799 patients with detailed clinical phenotyping who had CT images taken. Features of diffuse bone disease were identified on CT within all AAOMS stages (20%, 8%, 48%, and 24% of patients in stages 0, 1, 2, and 3, respectively). Of the patients classified as stage 0, 110/192 (57%) had diffuse disease on CT, and about 1 in 3 with CT evidence of diffuse bone disease was misclassified by the AAOMS system as having stages 0 and 1 osteonecrosis. In addition, more than a third of patients with AAOMS stage 2 (142/405, 35%) had focal bone disease on CT. We conclude that the AAOMS staging system does not correctly identify the extent of bony disease in patients with osteonecrosis of the jaw. PMID:24856927
Yildiz, Dilan; Bozkaya, Uğur
2016-01-30
The extended Koopmans' theorem (EKT) provides a straightforward way to compute ionization potentials and electron affinities from any level of theory. Although it is widely applied to ionization potentials, the EKT approach has not previously been applied to the evaluation of chemical reactivity. We present the first benchmarking study to investigate the performance of the EKT methods for predictions of chemical potentials (μ) (hence electronegativities), chemical hardnesses (η), and electrophilicity indices (ω). We assess the performance of the EKT approaches for post-Hartree-Fock methods, such as Møller-Plesset perturbation theory, the coupled-electron pair theory, and their orbital-optimized counterparts for the evaluation of the chemical reactivity. In particular, results of the orbital-optimized coupled-electron pair theory method (with the aug-cc-pVQZ basis set) for predictions of the chemical reactivity are very promising; the corresponding mean absolute errors are 0.16, 0.28, and 0.09 eV for μ, η, and ω, respectively. PMID:26458329
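Given the EKT ionization potential and electron affinity, the three reactivity descriptors follow from standard conceptual-DFT definitions. A small sketch, using the convention η = IP − EA (some authors include a factor of 1/2 in the hardness):

```python
def reactivity_indices(ip, ea):
    """Conceptual-DFT descriptors from an ionization potential and
    electron affinity (eV), the quantities the EKT supplies directly:
    mu = -(IP + EA)/2, eta = IP - EA, omega = mu**2 / (2 * eta)."""
    mu = -(ip + ea) / 2.0          # chemical potential (neg. electronegativity)
    eta = ip - ea                  # chemical hardness
    omega = mu * mu / (2.0 * eta)  # electrophilicity index
    return mu, eta, omega

# Illustrative values (eV) for a small molecule:
print(reactivity_indices(ip=10.5, ea=0.8))
```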
Alecu, I. M.; Truhlar, D. G.
2011-04-07
The reactions of CH3OH with the HO2 and CH3 radicals are important in the combustion of methanol and are prototypes for reactions of heavier alcohols in biofuels. The reaction energies and barrier heights for these reaction systems are computed with CCSD(T) theory extrapolated to the complete basis set limit using correlation-consistent basis sets, both augmented and unaugmented, and further refined by including a fully coupled treatment of the connected triple excitations, a second-order perturbative treatment of quadruple excitations (by CCSDT(2)_Q), core-valence corrections, and scalar relativistic effects. It is shown that the M08-HX and M08-SO hybrid meta-GGA density functionals can achieve sub-kcal/mol agreement with the high-level ab initio results, identifying these functionals as important potential candidates for direct dynamics studies on the rates of these and homologous reaction systems.
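The basis-set extrapolation step can be illustrated with the standard two-point inverse-cubic formula for correlation energies (Helgaker et al.); the paper's exact extrapolation recipe may differ in detail, and the energies below are illustrative.

```python
def cbs_extrapolate(e_x, e_y, x=3, y=4):
    """Two-point complete-basis-set extrapolation of correlation energies
    using the inverse-cubic form E_X = E_CBS + A/X**3, which gives
    E_CBS = (x**3 * E_x - y**3 * E_y) / (x**3 - y**3)."""
    return (x**3 * e_x - y**3 * e_y) / (x**3 - y**3)

# Illustrative correlation energies (hartree), cc-pVTZ (X=3) and cc-pVQZ (Y=4):
print(cbs_extrapolate(-0.30512, -0.31078))
```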
BINGO: a code for the efficient computation of the scalar bi-spectrum
Hazra, Dhiraj Kumar; Sriramkumar, L.; Martin, Jérôme
2013-05-01
We present a new and accurate Fortran code, the BI-spectra and Non-Gaussianity Operator (BINGO), for the efficient numerical computation of the scalar bi-spectrum and the non-Gaussianity parameter f_NL in single field inflationary models involving the canonical scalar field. The code can calculate all the different contributions to the bi-spectrum and the parameter f_NL for an arbitrary triangular configuration of the wavevectors. Focusing firstly on the equilateral limit, we illustrate the accuracy of BINGO by comparing the results from the code with the spectral dependence of the bi-spectrum expected in power law inflation. Then, considering an arbitrary triangular configuration, we contrast the numerical results with the analytical expression available in the slow roll limit, for, say, the case of the conventional quadratic potential. Considering a non-trivial scenario involving deviations from slow roll, we compare the results from the code with the analytical results that have recently been obtained in the case of the Starobinsky model in the equilateral limit. As an immediate application, we utilize BINGO to examine the power of the non-Gaussianity parameter f_NL to discriminate between various inflationary models that admit departures from slow roll and lead to similar features in the scalar power spectrum. We close with a summary and discussion of the implications of the results we obtain.
Williams, Ian; Constandinou, Timothy G.
2014-01-01
Accurate models of proprioceptive neural patterns could one day play an important role in the creation of an intuitive proprioceptive neural prosthesis for amputees. This paper looks at combining efficient implementations of biomechanical and proprioceptor models in order to generate signals that mimic human muscular proprioceptive patterns for future experimental work in prosthesis feedback. A neuro-musculoskeletal model of the upper limb with 7 degrees of freedom and 17 muscles is presented and generates real time estimates of muscle spindle and Golgi Tendon Organ neural firing patterns. Unlike previous neuro-musculoskeletal models, muscle activation and excitation levels are unknowns in this application, and an inverse dynamics tool (static optimization) is integrated to estimate these variables. A proprioceptive prosthesis will need to be portable, and this is incompatible with the computationally demanding nature of standard biomechanical and proprioceptor modeling. This paper uses and proposes a number of approximations and optimizations to make real time operation on portable hardware feasible. Finally, technical obstacles to mimicking natural feedback for an intuitive proprioceptive prosthesis, as well as issues and limitations with existing models, are identified and discussed. PMID:25009463
Efficient Helicopter Aerodynamic and Aeroacoustic Predictions on Parallel Computers
NASA Technical Reports Server (NTRS)
Wissink, Andrew M.; Lyrintzis, Anastasios S.; Strawn, Roger C.; Oliker, Leonid; Biswas, Rupak
1996-01-01
This paper presents parallel implementations of two codes used in a combined CFD/Kirchhoff methodology to predict the aerodynamic and aeroacoustic properties of helicopters. The rotorcraft Navier-Stokes code, TURNS, computes the aerodynamic flowfield near the helicopter blades and the Kirchhoff acoustics code computes the noise in the far field, using the TURNS solution as input. The overall parallel strategy adds MPI message passing calls to the existing serial codes to allow for communication between processors. As a result, the total code modifications required for parallel execution are relatively small. The biggest bottleneck in running the TURNS code in parallel comes from the LU-SGS algorithm that solves the implicit system of equations. We use a new hybrid domain decomposition implementation of LU-SGS to obtain good parallel performance on the SP-2. TURNS demonstrates excellent parallel speedups for quasi-steady and unsteady three-dimensional calculations of a helicopter blade in forward flight. The execution rate attained by the code on 114 processors is six times faster than the same cases run on one processor of the Cray C-90. The parallel Kirchhoff code also shows excellent parallel speedups and fast execution rates. As a performance demonstration, unsteady acoustic pressures are computed at 1886 far-field observer locations for a sample acoustics problem. The calculation requires over two hundred hours of CPU time on one C-90 processor but takes only a few hours on 80 processors of the SP-2. The resultant far-field acoustic field is analyzed with state-of-the-art audio and video rendering of the propagating acoustic signals.
Nudds, Robert L.; Taylor, Graham K.; Thomas, Adrian L. R.
2004-01-01
The wing kinematics of birds vary systematically with body size, but we still, after several decades of research, lack a clear mechanistic understanding of the aerodynamic selection pressures that shape them. Swimming and flying animals have recently been shown to cruise at Strouhal numbers (St) corresponding to a regime of vortex growth and shedding in which the propulsive efficiency of flapping foils peaks (St ≈ fA/U, where f is wingbeat frequency, U is cruising speed, and A ≈ b·sin(θ/2) is stroke amplitude, in which b is wingspan and θ is stroke angle). We show that St is a simple and accurate predictor of wingbeat frequency in birds. The Strouhal numbers of cruising birds have converged on the lower end of the range 0.2 < St < 0.4 associated with high propulsive efficiency. Stroke angle scales as θ ≈ 67·b^(-0.24), so wingbeat frequency can be predicted as f ≈ St·U/[b·sin(33.5·b^(-0.24))], with St ≈ 0.21 and St ≈ 0.25 for direct and intermittent fliers, respectively. This simple aerodynamic model predicts wingbeat frequency better than any other relationship proposed to date, explaining 90% of the observed variance in a sample of 60 bird species. Avian wing kinematics therefore appear to have been tuned by natural selection for high aerodynamic efficiency: physical and physiological constraints upon wing kinematics must be reconsidered in this light. PMID:15451698
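The abstract's relation is directly computable. A small sketch, assuming the stroke-angle fit θ ≈ 67·b^(-0.24) is in degrees:

```python
import math

def wingbeat_frequency(U, b, St=0.21):
    """Predicted cruising wingbeat frequency from
    f = St * U / (b * sin(33.5 * b**-0.24)), with the half stroke angle
    in degrees; St = 0.21 for direct and 0.25 for intermittent fliers."""
    half_theta_deg = 33.5 * b ** (-0.24)
    return St * U / (b * math.sin(math.radians(half_theta_deg)))

# Example: a direct flier cruising at 10 m/s with a 0.5 m wingspan.
print(wingbeat_frequency(U=10.0, b=0.5), "Hz")
```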
NASA Astrophysics Data System (ADS)
Huang, X.; Schwenke, D.; Lee, T. J.
2014-12-01
Last year we reported a semi-empirical 32S16O2 spectroscopic line list (denoted Ames-296K) for its atmospheric characterization in Venus and other exoplanetary environments. To facilitate the determination of sulfur isotopic ratios and sulfur chemistry models, we now present Ames-296K line lists for both 626 (upgraded) and four other symmetric isotopologues: 636, 646, 666, and 828. The line lists are computed on an ab initio potential energy surface refined with the most reliable high-resolution experimental data, using a high-quality CCSD(T)/aug-cc-pV(Q+d)Z dipole moment surface. The most valuable part of our approach is to provide "truly reliable" predictions (and alternatives) for unknown or hard-to-measure/analyze spectra. This strategy guarantees that the lists are the best available alternative for the wide spectral regions missing from spectroscopic databases such as HITRAN and GEISA, where only very limited data exist for 626/646 and no infrared data at all for 636/666 or other minor isotopologues. Our general line position accuracy up to 5000 cm-1 is 0.01 - 0.02 cm-1 or better. Most transition intensity deviations are less than 5%, compared to experimentally measured quantities. Note that we have solved a convergence issue and further improved the quality and completeness of the main isotopologue 626 list at 296K. We will compare the lists to available models in CDMS/JPL/HITRAN and discuss the future mutually beneficial interactions between theoretical and experimental efforts.
A computationally efficient modelling of laminar separation bubbles
NASA Astrophysics Data System (ADS)
Dini, Paolo; Maughmer, Mark D.
1989-10-01
In order to predict the aerodynamic characteristics of airfoils operating at low Reynolds numbers, it is necessary to accurately account for the effects of laminar (transitional) separation bubbles. Generally, the greatest difficulty comes about when attempting to determine the increase in profile drag that results from the presence of separation bubbles. While a number of empirically based separation bubble models have been introduced in the past, the majority assume that the bubble development is fully predictable from upstream conditions. One way of accounting for laminar separation bubbles in airfoil design is the bubble analog used in the design and analysis program of Eppler and Somers. A locally interactive separation bubble model was developed and incorporated into the Eppler and Somers program. Although unable to account for strong interactions such as the large reduction in suction peak sometimes caused by leading edge bubbles, it is able to predict the increase in drag and the local alteration of the airfoil pressure distribution that is caused by bubbles occurring in the operational range which is of most interest.
Probabilistic Damage Characterization Using the Computationally-Efficient Bayesian Approach
NASA Technical Reports Server (NTRS)
Warner, James E.; Hochhalter, Jacob D.
2016-01-01
This work presents a computationally-efficient approach for damage determination that quantifies uncertainty in the provided diagnosis. Given strain sensor data that are polluted with measurement errors, Bayesian inference is used to estimate the location, size, and orientation of damage. This approach uses Bayes' Theorem to combine any prior knowledge an analyst may have about the nature of the damage with information provided implicitly by the strain sensor data to form a posterior probability distribution over possible damage states. The unknown damage parameters are then estimated based on samples drawn numerically from this distribution using a Markov Chain Monte Carlo (MCMC) sampling algorithm. Several modifications are made to the traditional Bayesian inference approach to provide significant computational speedup. First, an efficient surrogate model is constructed using sparse grid interpolation to replace a costly finite element model that must otherwise be evaluated for each sample drawn with MCMC. Next, the standard Bayesian posterior distribution is modified using a weighted likelihood formulation, which is shown to improve the convergence of the sampling process. Finally, a robust MCMC algorithm, Delayed Rejection Adaptive Metropolis (DRAM), is adopted to sample the probability distribution more efficiently. Numerical examples demonstrate that the proposed framework effectively provides damage estimates with uncertainty quantification and can yield orders of magnitude speedup over standard Bayesian approaches.
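As a rough sketch of the sampling stage described above (a plain Metropolis sampler over invented damage parameters with a made-up surrogate model; not the authors' DRAM/sparse-grid code):

    import numpy as np

    # Hypothetical surrogate: maps damage parameters (x, y, size) to predicted
    # strains at three sensor locations; a cheap stand-in for the sparse-grid
    # surrogate of the finite element model.
    def surrogate_strain(theta):
        x, y, size = theta
        sensors = np.array([[0.2, 0.2], [0.5, 0.8], [0.8, 0.4]])
        d2 = ((sensors - np.array([x, y])) ** 2).sum(axis=1)
        return size / (1.0 + 10.0 * d2)

    def log_posterior(theta, data, sigma=0.01):
        if np.any(theta < 0.0) or np.any(theta > 1.0):
            return -np.inf                       # uniform prior on the unit box
        resid = data - surrogate_strain(theta)
        return -0.5 * np.sum(resid ** 2) / sigma ** 2

    rng = np.random.default_rng(0)
    true_theta = np.array([0.6, 0.5, 0.3])
    data = surrogate_strain(true_theta) + 0.01 * rng.normal(size=3)

    theta = np.array([0.5, 0.5, 0.5])            # initial damage guess
    lp = log_posterior(theta, data)
    samples = []
    for _ in range(20000):                       # plain Metropolis; DRAM adds delayed
        prop = theta + 0.05 * rng.normal(size=3) # rejection and proposal adaptation
        lp_prop = log_posterior(prop, data)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        samples.append(theta.copy())
    print(np.mean(samples[5000:], axis=0))       # posterior mean damage estimate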
Efficient design of direct-binary-search computer-generated holograms
Jennison, B. K.; Allebach, J. P.; Sweeney, D. W.
1991-04-01
Computer-generated holograms (CGH's) synthesized by the iterative direct-binary-search (DBS) algorithm yield lower reconstruction error and higher diffraction efficiency than do CGH's designed by conventional methods, but the DBS algorithm is computationally intensive. A fast algorithm for DBS is developed that recursively computes the error measure to be minimized. For complex amplitude-based error, the computation required for an L-point hologram is examined, and modifications are considered in order to make the algorithm more efficient. An acceleration technique that attempts to increase the rate of convergence of the DBS algorithm is also investigated.
ERIC Educational Resources Information Center
Rom, Mark Carl
2011-01-01
Grades matter. College grading systems, however, are often ad hoc and prone to mistakes. This essay focuses on one factor that contributes to high-quality grading systems: grading accuracy (or "efficiency"). I proceed in several steps. First, I discuss the elements of "efficient" (i.e., accurate) grading. Next, I present analytical results…
NASA Astrophysics Data System (ADS)
Lin, Dejun
2015-09-01
Accurate representation of intermolecular forces has been the central task of classical atomic simulations, known as molecular mechanics. Recent advancements in molecular mechanics models have put forward the explicit representation of permanent and/or induced electric multipole (EMP) moments. The formulas developed so far to calculate EMP interactions tend to have complicated expressions, especially in Cartesian coordinates, which can only be applied to a specific kernel potential function. For example, one needs to develop a new formula each time a new kernel function is encountered. The complication of these formalisms arises from an intriguing and yet obscured mathematical relation between the kernel functions and the gradient operators. Here, I uncover this relation via rigorous derivation and find that the formula to calculate EMP interactions is basically invariant to the potential kernel functions as long as they are of the form f(r), i.e., any Green's function that depends on inter-particle distance. I provide an algorithm for efficient evaluation of EMP interaction energies, forces, and torques for any kernel f(r) up to any arbitrary rank of EMP moments in Cartesian coordinates. The working equations of this algorithm are essentially the same for any kernel f(r). Recently, a few recursive algorithms were proposed to calculate EMP interactions. Depending on the kernel functions, the algorithm here is about 4-16 times faster than these algorithms in terms of the required number of floating point operations and is much more memory efficient. I show that it is even faster than a theoretically ideal recursion scheme, i.e., one that requires 1 floating point multiplication and 1 addition per recursion step. This algorithm has a compact vector-based expression that is optimal for computer programming. The Cartesian nature of this algorithm makes it fit easily into modern molecular simulation packages as compared with spherical coordinate-based algorithms. A
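The kernel-invariance claim can be illustrated symbolically: the same gradient-operator construction yields charge-charge, charge-dipole, and dipole-dipole terms for any radial kernel f(r). The following sympy sketch is my own illustration up to dipole rank, not the paper's optimized recursive algorithm:

    import sympy as sp

    x, y, z = sp.symbols('x y z', real=True)
    r = sp.sqrt(x**2 + y**2 + z**2)

    def multipole_energy(f, q1, p1, q2, p2):
        # Interaction energy up to dipole-dipole rank, built purely from
        # gradient operators acting on the kernel f(r); site 2 sits at the
        # origin and site 1 at (x, y, z).
        grad = lambda g: [sp.diff(g, v) for v in (x, y, z)]
        gf = grad(f)
        e = q1 * q2 * f
        e += q2 * sum(p1[i] * gf[i] for i in range(3))   # dipole 1 - charge 2
        e -= q1 * sum(p2[i] * gf[i] for i in range(3))   # charge 1 - dipole 2
        e -= sum(p1[i] * p2[j] * sp.diff(gf[i], (x, y, z)[j])
                 for i in range(3) for j in range(3))    # dipole 1 - dipole 2
        return sp.simplify(e)

    # Identical working equations for two different kernels:
    print(multipole_energy(1 / r, 1, (0, 0, 1), -1, (0, 0, 1)))          # Coulomb
    print(multipole_energy(sp.exp(-r) / r, 1, (0, 0, 1), -1, (0, 0, 1))) # screened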
Computer modeling of high-efficiency solar cells
NASA Technical Reports Server (NTRS)
Schartz, R. J.; Lundstrom, M. S.
1980-01-01
Transport equations which describe the flow of holes and electrons in the heavily doped regions of a solar cell are presented in a form that is suitable for device modeling. Two experimentally determinable parameters, the effective bandgap shrinkage and the effective asymmetry factor are required to completely model the cell in these regions. Nevertheless, a knowledge of only the effective bandgap shrinkage is sufficient to model the terminal characteristics of the cell. The results of computer simulations of the effects of heavy doping are presented. The insensitivity of the terminal characteristics to the choice of effective asymmetry factor is shown along with the sensitivity of the electric field and quasielectric fields to this parameter. The dependence of the terminal characteristics on the effective bandgap shrinkage is also presented.
Efficient computation of the spectrum of viscoelastic flows
NASA Astrophysics Data System (ADS)
Valério, J. V.; Carvalho, M. S.; Tomei, C.
2009-03-01
The understanding of viscoelastic flows in many situations requires not only the steady state solution of the governing equations, but also its sensitivity to small perturbations. Linear stability analysis leads to a generalized eigenvalue problem (GEVP), whose numerical analysis may be challenging, even for Newtonian liquids, because the incompressibility constraint creates singularities that lead to non-physical eigenvalues at infinity. For viscoelastic flows, the difficulties increase due to the presence of a continuous spectrum, related to the constitutive equations. The Couette flow of upper convected Maxwell (UCM) liquids has been used as a case study of the stability of viscoelastic flows. The spectrum consists of two discrete eigenvalues and a continuous segment with real part equal to -1/We (We is the Weissenberg number). Most of the approximations in the literature were obtained using spectral expansions. The eigenvalues close to the continuous part of the spectrum show very slow convergence. In this work, the linear stability of Couette flow of a UCM liquid is studied using a finite element method. A new procedure to eliminate the eigenvalues at infinity from the GEVP is proposed. The procedure takes advantage of the structure of the matrices involved and avoids the computational overhead of the usual mapping techniques. The GEVP is transformed into a non-degenerate GEVP of dimension five times smaller. The computed eigenfunctions related to the continuous spectrum are in good agreement with the analytic solutions obtained by Graham [M.D. Graham, Effect of axial flow on viscoelastic Taylor-Couette instability, J. Fluid Mech. 360 (1998) 341].
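For readers unfamiliar with the eigenvalues-at-infinity issue, a toy example (invented 3x3 matrices, not the paper's finite-element operators) shows how a singular mass-type matrix produces non-physical infinite eigenvalues that must be filtered out or, as in the paper, eliminated by exploiting matrix structure:

    import numpy as np
    from scipy.linalg import eig

    # Toy generalized eigenvalue problem A v = lambda B v with a singular B,
    # mimicking how the incompressibility constraint pushes eigenvalues to
    # infinity.
    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 1.0]])
    B = np.diag([1.0, 1.0, 0.0])   # singular: the constraint row carries no d/dt

    w, v = eig(A, B)
    print(w)                       # one entry is inf (or huge): non-physical
    print(w[np.isfinite(w)])       # the physically meaningful part of the spectrum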
An efficient network for interconnecting remote monitoring instruments and computers
Halbig, J.K.; Gainer, K.E.; Klosterbuer, S.F.
1994-08-01
Remote monitoring instrumentation must be connected with computers and other instruments. The cost and intrusiveness of installing cables in new and existing plants presents problems for the facility and the International Atomic Energy Agency (IAEA). The authors have tested a network that could accomplish this interconnection using mass-produced commercial components developed for use in industrial applications. Unlike components in the hardware of most networks, the components--manufactured and distributed in North America, Europe, and Asia--lend themselves to small and low-powered applications. The heart of the network is a chip with three microprocessors and proprietary network software contained in Read Only Memory. In addition to all nonuser levels of protocol, the software also contains message authentication capabilities. This chip can be interfaced to a variety of transmission media, for example, RS-485 lines, fiber optic cables, rf waves, and standard ac power lines. The use of power lines as the transmission medium in a facility could significantly reduce cabling costs.
Enabling Efficient Climate Science Workflows in High Performance Computing Environments
NASA Astrophysics Data System (ADS)
Krishnan, H.; Byna, S.; Wehner, M. F.; Gu, J.; O'Brien, T. A.; Loring, B.; Stone, D. A.; Collins, W.; Prabhat, M.; Liu, Y.; Johnson, J. N.; Paciorek, C. J.
2015-12-01
A typical climate science workflow often involves a combination of acquisition of data, modeling, simulation, analysis, visualization, publishing, and storage of results. Each of these tasks presents a myriad of challenges when running in a high performance computing environment such as Hopper or Edison at NERSC. Hurdles such as data transfer and management, job scheduling, parallel analysis routines, and publication require a lot of forethought and planning to ensure that proper quality control mechanisms are in place. These steps require effectively utilizing a combination of well-tested and newly developed functionality to move data, perform analysis, apply statistical routines, and finally, serve results and tools to the greater scientific community. As part of the CAlibrated and Systematic Characterization, Attribution and Detection of Extremes (CASCADE) project, we highlight a stack of tools our team utilizes and has developed to make large-scale simulation and analysis work commonplace, providing operations that assist in everything from generation/procurement of data (HTAR/Globus) to automating publication of results to portals like the Earth Systems Grid Federation (ESGF), all while executing everything in between in a scalable environment in a task-parallel way (MPI). We highlight the use and benefit of these tools by showing several climate science analysis use cases to which they have been applied.
An Efficient Objective Analysis System for Parallel Computers
NASA Technical Reports Server (NTRS)
Stobie, J.
1999-01-01
A new atmospheric objective analysis system designed for parallel computers will be described. The system can produce a global analysis (on a 1 x 1 lat-lon grid with 18 levels of heights and winds and 10 levels of moisture) using 120,000 observations in 17 minutes on 32 CPUs (SGI Origin 2000). No special parallel code is needed (e.g. MPI or multitasking) and the 32 CPUs do not have to be on the same platform. The system is totally portable and can run on several different architectures at once. In addition, the system can easily scale up to 100 or more CPUs. This will allow for much higher resolution and significant increases in input data. The system scales linearly with the number of observations and the number of grid points. The cost overhead in going from 1 to 32 CPUs is 18%. In addition, the analysis results are identical regardless of the number of processors used. This system has all the characteristics of optimal interpolation, combining detailed instrument and first guess error statistics to produce the best estimate of the atmospheric state. Static tests with a 2 x 2.5 resolution version of this system showed its analysis increments are comparable to the latest NASA operational system, including maintenance of mass-wind balance. Results from several months of cycling tests in the Goddard EOS Data Assimilation System (GEOS DAS) show this new analysis retains the same level of agreement between the first guess and observations (O-F statistics) as the current operational system.
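The optimal-interpolation update at the heart of such a system can be sketched in a few lines (toy covariances and synthetic observations; the operational system's error statistics, quality control, and parallel decomposition are far richer):

    import numpy as np

    # Minimal optimal-interpolation update: xa = xb + K (y - H xb), with
    # K = B H^T (H B H^T + R)^(-1). Covariances and data are invented.
    n, m = 50, 10                        # grid points, observations
    rng = np.random.default_rng(1)
    xb = np.zeros(n)                     # first-guess (background) field
    H = np.zeros((m, n))                 # observation operator: point sampling
    H[np.arange(m), rng.choice(n, m, replace=False)] = 1.0
    y = rng.normal(size=m)               # observations

    idx = np.arange(n)
    B = np.exp(-((idx[:, None] - idx[None, :]) / 5.0) ** 2)   # background errors
    R = 0.1 * np.eye(m)                                       # observation errors

    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)              # gain matrix
    xa = xb + K @ (y - H @ xb)                                # analysis
    print(np.round(xa[:10], 3))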
An Efficient Objective Analysis System for Parallel Computers
NASA Technical Reports Server (NTRS)
Stobie, James G.
1999-01-01
A new objective analysis system designed for parallel computers will be described. The system can produce a global analysis (on a 2 x 2.5 lat-lon grid with 20 levels of heights and winds and 10 levels of moisture) using 120,000 observations in less than 3 minutes on 32 CPUs (SGI Origin 2000). No special parallel code is needed (e.g. MPI or multitasking) and the 32 CPUs do not have to be on the same platform. The system is totally portable and can run on several different architectures at once. In addition, the system can easily scale up to 100 or more CPUs. This will allow for much higher resolution and significant increases in input data. The system scales linearly with the number of observations and the number of grid points. The cost overhead in going from 1 to 32 CPUs is 18%. In addition, the analysis results are identical regardless of the number of processors used. This system has all the characteristics of optimal interpolation, combining detailed instrument and first guess error statistics to produce the best estimate of the atmospheric state. It also includes a new quality control (buddy check) system. Static tests with the system showed its analysis increments are comparable to the latest NASA operational system, including maintenance of mass-wind balance. Results from a 2-month cycling test in the Goddard EOS Data Assimilation System (GEOS DAS) show this new analysis retains the same level of agreement between the first guess and observations (O-F statistics) throughout the entire two months.
Time efficient 3-D electromagnetic modeling on massively parallel computers
Alumbaugh, D.L.; Newman, G.A.
1995-08-01
A numerical modeling algorithm has been developed to simulate the electromagnetic response of a three-dimensional earth to a dipole source for frequencies ranging from 100 Hz to 100 MHz. The numerical problem is formulated in terms of a frequency-domain modified vector Helmholtz equation for the scattered electric fields. The resulting differential equation is approximated using a staggered finite difference grid, which results in a linear system of equations for which the matrix is sparse and complex symmetric. The system of equations is solved using a preconditioned quasi-minimum-residual method. Dirichlet boundary conditions are employed at the edges of the mesh by setting the tangential electric fields equal to zero. At frequencies less than 1 MHz, normal grid stretching is employed to mitigate unwanted reflections off the grid boundaries. For frequencies greater than this, absorbing boundary conditions must be employed by making the stretching parameters of the modified vector Helmholtz equation complex, which introduces loss at the boundaries. To allow for faster calculation of realistic models, the original serial version of the code has been modified to run on a massively parallel architecture. This modification involves three distinct tasks: (1) mapping the finite difference stencil to a processor stencil, which allows the necessary information to be exchanged between processors that contain adjacent nodes in the model; (2) determining the most efficient method to input the model, which is accomplished by dividing the input into "global" and "local" data and then reading the two sets in differently; and (3) deciding how to output the data, which is an inherently nonparallel process.
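A one-dimensional toy analogue of the discretized scattered-field equation conveys the matrix structure (all parameters invented; solved directly here for brevity, whereas the paper's much larger 3-D systems call for the preconditioned quasi-minimum-residual iteration):

    import numpy as np
    import scipy.sparse as sparse
    from scipy.sparse.linalg import spsolve

    # 1-D toy analogue: (d2/dx2 + k^2) E = f with E = 0 at both ends
    # (Dirichlet). A complex k^2 models a lossy medium and makes the
    # matrix complex symmetric.
    n, h = 200, 0.05
    k2 = (2.0 + 0.5j) ** 2
    main = np.full(n, -2.0 / h**2 + k2, dtype=complex)
    off = np.full(n - 1, 1.0 / h**2, dtype=complex)
    A = sparse.diags([off, main, off], [-1, 0, 1], format='csc')

    f = np.zeros(n, dtype=complex)
    f[n // 2] = 1.0                      # dipole-like source at the midpoint
    E = spsolve(A, f)                    # direct solve, fine at this toy size
    print(np.abs(E).max())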
Adaptive and Efficient Computing for Subsurface Simulation within ParFlow
Tiedeman, H; Woodward, C S
2010-11-16
This project is concerned with the PF.WRF model as a means to enable more accurate predictions of wind fluctuations and subsurface storage. As developed at LLNL, PF.WRF couples a groundwater (subsurface) and surface water flow model (ParFlow) to a mesoscale atmospheric model (WRF, Weather Research and Forecasting Model). It was developed as a unique tool to address coupled water balance and wind energy questions that occur across traditionally separated research regimes of the atmosphere, land surface, and subsurface. PF.WRF is capable of simulating fluid, mass, and energy transport processes in groundwater, vadose zone, root zone, and land surface systems, including overland flow, and allows for the WRF model to both directly drive and respond to surface and subsurface hydrologic processes and conditions. The current PF.WRF model is constrained to have uniform spatial gridding below the land surface and matching areal grids with the WRF model at the land surface. There are often cases where it is advantageous for land surface, overland flow and subsurface models to have finer gridding than their atmospheric counterparts. Finer vertical discretization is also advantageous near the land surface (to properly capture feedbacks) yet many applications have a large vertical extent. However, the surface flow is strongly dependent on topography leading to a need for greater lateral resolution in some regions and the subsurface flow is tightly coupled to the atmospheric model near the surface leading to a need for finer vertical resolution. In addition, the interactions (e.g. rain) will be highly variable in space and time across the problem domain so an adaptive scheme is preferred to a static strategy to efficiently use computing and memory resources. As a result, this project focussed on algorithmic research required for development of an adaptive simulation capability in the PF.WRF system and its subsequent use in an application problem in the Central Valley of
Introduction: From Efficient Quantum Computation to Nonextensive Statistical Mechanics
NASA Astrophysics Data System (ADS)
Prosen, Tomaz
These few pages will attempt to make a short comprehensive overview of several contributions to this volume which concern rather diverse topics. I shall review the following works, essentially reversing the sequence indicated in my title: • First, by C. Tsallis on the relation of nonextensive statistics to the stability of quantum motion on the edge of quantum chaos. • Second, the contribution by P. Jizba on information theoretic foundations of generalized (nonextensive) statistics. • Third, the contribution by J. Rafelski on a possible generalization of Boltzmann kinetics, again, formulated in terms of nonextensive statistics. • Fourth, the contribution by D.L. Stein on the state-of-the-art open problems in spin glasses and on the notion of complexity there. • Fifth, the contribution by F.T. Arecchi on the quantum-like uncertainty relations and decoherence appearing in the description of perceptual tasks of the brain. • Sixth, the contribution by G. Casati on the measurement and information extraction in the simulation of complex dynamics by a quantum computer. Immediately, the following question arises: What do the topics of these talks have in common? Apart from the variety of questions they address, it is quite obvious that the common denominator of these contributions is an approach to describe and control "the complexity" by simple means. One of the very useful tools to handle such problems, also often used or at least referred to in several of the works presented here, is the concept of Tsallis entropy and nonextensive statistics.
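For orientation, the Tsallis entropy referred to throughout has a one-line definition, S_q = (1 - sum_i p_i^q)/(q - 1), which recovers the Shannon entropy as q -> 1; a minimal numerical sketch (taking k_B = 1):

    import numpy as np

    def tsallis_entropy(p, q):
        # S_q = (1 - sum_i p_i^q) / (q - 1); tends to -sum p ln p as q -> 1
        p = np.asarray(p, dtype=float)
        return (1.0 - np.sum(p ** q)) / (q - 1.0)

    p = [0.5, 0.25, 0.25]
    print(tsallis_entropy(p, 2.0))          # nonextensive case, q = 2
    print(tsallis_entropy(p, 1.000001))     # approaches the Shannon value 1.0397...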
ERIC Educational Resources Information Center
Kablan, Z.; Erden, M.
2008-01-01
This study deals with the instructional efficiency of integrating text and animation into computer-based science instruction. The participants were 84 seventh-grade students in a private primary school in Istanbul. The efficiency of instruction was measured by mental effort and performance level of the learners. The results of the study showed…
Liu, Yang; Yang, Yeran; Tang, Tie-Shan; Zhang, Hui; Wang, Zhifeng; Friedberg, Errol; Yang, Wei; Guo, Caixia
2014-01-01
DNA polymerase κ (Polκ) is the only known Y-family DNA polymerase that bypasses the 10S (+)-trans-anti-benzo[a]pyrene diol epoxide (BPDE)-N2-deoxyguanine adducts efficiently and accurately. The unique features of Polκ, a large structure gap between the catalytic core and little finger domain and a 90-residue addition at the N terminus known as the N-clasp, may give rise to its special translesion capability. We designed and constructed two mouse Polκ variants, which have reduced gap size on both sides [Polκ Gap Mutant (PGM) 1] or one side flanking the template base (PGM2). These Polκ variants are nearly as efficient as WT in normal DNA synthesis, albeit with reduced accuracy. However, PGM1 is strongly blocked by the 10S (+)-trans-anti-BPDE-N2-dG lesion. Steady-state kinetic measurements reveal a significant reduction in efficiency of dCTP incorporation opposite the lesion by PGM1 and a moderate reduction by PGM2. Consistently, Polκ-deficient cells stably complemented with PGM1 GFP-Polκ remained hypersensitive to BPDE treatment, and complementation with WT or PGM2 GFP-Polκ restored BPDE resistance. Furthermore, deletion of the first 51 residues of the N-clasp in mouse Polκ (mPolκ52–516) leads to reduced polymerization activity, and the mutant PGM252–516 but not PGM152–516 can partially compensate the N-terminal deletion and restore the catalytic activity on normal DNA. However, neither WT nor PGM2 mPolκ52–516 retains BPDE bypass activity. We conclude that the structural gap physically accommodates the bulky aromatic adduct and the N-clasp is essential for the structural integrity and flexibility of Polκ during translesion synthesis. PMID:24449898
Adde, Lars; Helbostad, Jorunn; Jensenius, Alexander R; Langaas, Mette; Støen, Ragnhild
2013-08-01
This study evaluates the role of postterm age at assessment and the use of one or two video recordings for the detection of fidgety movements (FMs) and prediction of cerebral palsy (CP) using computer vision software. Recordings between 9 and 17 weeks postterm age from 52 preterm and term infants (24 boys, 28 girls; 26 born preterm) were used. Recordings were analyzed using computer vision software. Movement variables, derived from differences between subsequent video frames, were used for quantitative analysis. Sensitivities, specificities, and areas under the curve were estimated for the first and second recording, or a mean of both. FMs were classified based on the Prechtl approach of general movement assessment. CP status was reported at 2 years. Nine children developed CP, and all of their recordings had absent FMs. The mean variability of the centroid of motion (CSD) from two recordings was more accurate than using only one recording, and identified all children who were diagnosed with CP at 2 years. Age at assessment did not influence the detection of FMs or prediction of CP. The accuracy of computer vision techniques in identifying FMs and predicting CP based on two recordings should be confirmed in future studies. PMID:23343036
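A sketch of the kind of frame-differencing variable such software derives (synthetic frames and a hypothetical threshold; the actual toolchain tracks, e.g., the variability of the motion centroid over whole recordings):

    import numpy as np

    def motion_centroid(prev_frame, curr_frame, thresh=10):
        # Pixels that changed between consecutive frames are treated as motion;
        # their centroid is one movement variable that can be tracked over time.
        moved = np.abs(curr_frame.astype(int) - prev_frame.astype(int)) > thresh
        ys, xs = np.nonzero(moved)
        if xs.size == 0:
            return np.nan, np.nan
        return xs.mean(), ys.mean()

    rng = np.random.default_rng(8)
    f0 = rng.integers(0, 255, size=(120, 160))
    f1 = f0.copy()
    f1[40:60, 70:90] = 255          # synthetic limb movement
    print(motion_centroid(f0, f1))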
Efficient conjugate gradient algorithms for computation of the manipulator forward dynamics
NASA Technical Reports Server (NTRS)
Fijany, Amir; Scheid, Robert E.
1989-01-01
The applicability of conjugate gradient algorithms for computation of the manipulator forward dynamics is investigated. The redundancies in the previously proposed conjugate gradient algorithm are analyzed. A new version is developed which, by avoiding these redundancies, achieves a significantly greater efficiency. A preconditioned conjugate gradient algorithm is also presented. A diagonal matrix whose elements are the diagonal elements of the inertia matrix is proposed as the preconditioner. In order to increase the computational efficiency, an algorithm is developed which exploits the synergism between the computation of the diagonal elements of the inertia matrix and that required by the conjugate gradient algorithm.
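A minimal sketch of conjugate gradients with the diagonal (Jacobi) preconditioner proposed in the abstract, applied to an invented stand-in for a manipulator inertia matrix:

    import numpy as np

    def pcg(A, b, tol=1e-10, max_iter=200):
        # Conjugate gradient with a diagonal preconditioner built from the
        # diagonal of A (here standing in for the inertia matrix diagonal).
        # A must be symmetric positive definite.
        M_inv = 1.0 / np.diag(A)
        x = np.zeros_like(b)
        r = b - A @ x
        z = M_inv * r
        p = z.copy()
        rz = r @ z
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rz / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            if np.linalg.norm(r) < tol:
                break
            z = M_inv * r
            rz_new = r @ z
            p = z + (rz_new / rz) * p
            rz = rz_new
        return x

    # Toy "inertia matrix" solve M qdd = tau (values invented for illustration)
    rng = np.random.default_rng(2)
    J = rng.normal(size=(6, 6))
    M = J @ J.T + 6 * np.eye(6)          # SPD, like a manipulator inertia matrix
    tau = rng.normal(size=6)
    qdd = pcg(M, tau)
    print(np.allclose(M @ qdd, tau))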
NASA Astrophysics Data System (ADS)
Lin, Huimin; Tang, Huazhong; Cai, Wei
2014-02-01
This paper investigates the numerical accuracy and efficiency of computing the electrostatic potential for a finite-height cylinder, used in an explicit/implicit hybrid solvation model for ion channels and embedded in a layered dielectric/electrolyte medium representing a biological membrane and ionic solvents. A charge located inside the cylinder cavity, where ion channel proteins and ions are given explicit atomistic representations, will be influenced by the polarization field of the surrounding implicit dielectric/electrolyte medium. Two numerical techniques, a specially designed boundary integral equation method and an image charge method, are investigated and compared in terms of accuracy and efficiency for computing the electrostatic potential. The boundary integral equation method, based on three-dimensional layered Green's functions, provides a highly accurate solution suitable for producing a benchmark reference solution, while the image charge method is found to give reasonable accuracy with high efficiency, making it viable to use the fast multipole method for interactions of a large number of charges in the atomistic region of the hybrid solvation model.
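The idea behind the image-charge technique is easiest to see in the textbook case of a single planar dielectric interface (far simpler than the layered cylindrical geometry treated here); a minimal sketch in Gaussian units:

    import numpy as np

    def potential_half_space(q, h, eps1, eps2, x, z):
        # Potential in the upper half-space (permittivity eps1) from a point
        # charge q at height h above a planar interface with a medium of
        # permittivity eps2, via a single classical image charge.
        q_img = q * (eps1 - eps2) / (eps1 + eps2)
        r_direct = np.hypot(x, z - h)
        r_image = np.hypot(x, z + h)
        return (q / r_direct + q_img / r_image) / eps1

    print(potential_half_space(1.0, 1.0, 2.0, 80.0, x=0.5, z=2.0))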
A single user efficiency measure for evaluation of parallel or pipeline computer architectures
NASA Technical Reports Server (NTRS)
Jones, W. P.
1978-01-01
A precise statement was developed of the relationship among sequential computation at one rate; parallel or pipeline computation at a much higher rate; the data-movement rate between levels of memory; the fraction of inherently sequential operations or data that must be processed sequentially; the fraction of data movement that cannot be overlapped with computation; and the relative computational complexity of the algorithms for the two processes, scalar and vector. The relationship should be applied to the multirate processes that arise in the employment of various new or proposed computer architectures for computational aerodynamics. The relationship, an efficiency measure as perceived by the single user of the computer system, argues strongly in favor of separating scalar and vector processes, sometimes referred to as loosely coupled processes, to achieve optimum use of hardware.
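In the spirit of the abstract (though not its exact formula), an Amdahl-style sketch of the efficiency a single user perceives; all parameter names here are hypothetical:

    def perceived_efficiency(f_seq, vector_speedup, io_fraction_exposed, io_cost):
        # Amdahl-style estimate: the sequential fraction runs at the scalar
        # rate, the remainder is sped up by vector_speedup, and the exposed
        # (non-overlapped) fraction of data movement adds io_cost time units
        # per unit of scalar compute time. Returns effective speedup over
        # purely scalar execution.
        t = f_seq + (1.0 - f_seq) / vector_speedup
        t += io_fraction_exposed * io_cost
        return 1.0 / t

    print(perceived_efficiency(0.1, 10.0, 0.2, 0.5))   # ~3.4x effective speedup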
Hadley, Kevin R.; McCabe, Clare
2010-01-01
Developing accurate models of water for use in computer simulations is important for the study of many chemical and biological systems, including lipid bilayer self-assembly. The large temporal and spatial scales needed to study such self-assembly have led to the development and application of coarse-grained models for the lipid-lipid, lipid-solvent and solvent-solvent interactions. Unfortunately, popular center-of-mass-based coarse-graining techniques are limited to modeling water with one water per bead. In this work, we have utilized the K-means algorithm to determine the optimal clustering of waters to allow the mapping of multiple waters to single coarse-grained beads. Through the study of a simple mixture of water and an amphiphilic solute (1-pentanol), we find that a 4-water bead model has the optimal balance between computational efficiency and accurate solvation and structural properties when compared to water models ranging from 1 to 9 waters per bead. The 4-water model was subsequently utilized in studies of the solvation of hexadecanoic acid, and the structures, as measured via radial distribution functions, of the hydrophobic tails and the bulk water phase were found to agree well with experimental data and their atomistic targets. PMID:20230012
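A bare-bones K-means sketch of the mapping step (synthetic coordinates; the published work clusters actual simulation configurations):

    import numpy as np

    def kmeans(points, k, iters=50, seed=0):
        # Plain K-means: stand-in for the clustering used to assign several
        # waters to one coarse-grained bead.
        rng = np.random.default_rng(seed)
        centers = points[rng.choice(len(points), k, replace=False)].copy()
        for _ in range(iters):
            d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
            labels = np.argmin(d2, axis=1)
            for j in range(k):
                if np.any(labels == j):
                    centers[j] = points[labels == j].mean(axis=0)
        return centers, labels

    waters = np.random.default_rng(3).uniform(0.0, 30.0, size=(400, 3))
    beads, assignment = kmeans(waters, k=100)   # ~4 waters per bead on average
    print(np.bincount(assignment).mean())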
Liu, Aiping; Li, Junning; Wang, Z. Jane; McKeown, Martin J.
2012-01-01
Graphical models appear well suited for inferring brain connectivity from fMRI data, as they can distinguish between direct and indirect brain connectivity. Nevertheless, biological interpretation requires not only that the multivariate time series are adequately modeled, but also that there is accurate error control of the inferred edges. The PCfdr algorithm was developed by Li and Wang to provide a computationally efficient means of asymptotically controlling the false discovery rate (FDR) of computed edges. The original PCfdr algorithm was unable to accommodate a priori information about connectivity and was designed to infer connectivity from a single subject rather than a group of subjects. Here we extend the original PCfdr algorithm and propose a multisubject, error-rate-controlled brain connectivity modeling approach that allows incorporation of prior knowledge of connectivity. In simulations, we show that the two proposed extensions can still control the FDR around or below a specified threshold. When the proposed approach is applied to fMRI data in a Parkinson's disease study, we find robust group evidence of the disease-related changes, the compensatory changes, and the normalizing effect of L-dopa medication. The proposed method provides a robust, accurate, and practical method for the assessment of brain connectivity patterns from functional neuroimaging data. PMID:23251232
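As background, FDR control over a set of candidate edges can be illustrated with the standard Benjamini-Hochberg procedure (shown only for orientation; the PCfdr algorithm controls the FDR inside the PC-style edge-testing loop itself):

    import numpy as np

    def benjamini_hochberg(pvals, q=0.05):
        # Standard BH step-up procedure: reject the k smallest p-values,
        # where k is the largest i with p_(i) <= q * i / m.
        p = np.asarray(pvals)
        order = np.argsort(p)
        m = len(p)
        thresh = q * np.arange(1, m + 1) / m
        passed = p[order] <= thresh
        k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
        keep = np.zeros(m, dtype=bool)
        keep[order[:k]] = True
        return keep                       # True = edge declared significant

    edge_p = [0.001, 0.012, 0.04, 0.2, 0.6]
    print(benjamini_hochberg(edge_p))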
Brounstein, Anna; Hacihaliloglu, Ilker; Guy, Pierre; Hodgson, Antony; Abugharbieh, Rafeef
2015-12-01
Automatic, accurate and real-time registration is an important step in providing effective guidance and successful anatomic restoration in ultrasound (US)-based computer assisted orthopedic surgery. We propose a method in which local phase-based bone surfaces, extracted from intra-operative US data, are registered to pre-operatively segmented computed tomography data. Extracted bone surfaces are downsampled and reinforced with high curvature features. A novel hierarchical simplification algorithm is used to further optimize the point clouds. The final point clouds are represented as Gaussian mixture models and iteratively matched by minimizing the dissimilarity between them using an L2 metric. For 44 clinical data sets from 25 pelvic fracture patients and 49 phantom data sets, we report mean surface registration accuracies of 0.31 and 0.77 mm, respectively, with an average registration time of 1.41 s. Our results suggest the viability and potential of the chosen method for real-time intra-operative registration in orthopedic surgery. PMID:26365924
NASA Astrophysics Data System (ADS)
Giner, Emmanuel; Angeli, Celestino
2016-03-01
The present work describes a new method to compute accurate spin densities for open shell systems. The proposed approach follows two steps: first, it provides molecular orbitals which correctly take into account the spin delocalization; second, a proper CI treatment allows one to account for the spin polarization effect while keeping a restricted formalism and avoiding spin contamination. The main idea of the optimization procedure is based on the orbital relaxation of the various charge transfer determinants responsible for the spin delocalization. The algorithm is tested and compared to other existing methods on a series of organic and inorganic open shell systems. The results reported here show that the new approach (almost black-box) provides accurate spin densities at a reasonable computational cost, making it suitable for a systematic study of open shell systems.
NASA Astrophysics Data System (ADS)
Hu, X.; Zhang, Y.
2007-05-01
The Weather Research and Forecast/Chemistry Model (WRF/Chem), which simulates chemistry simultaneously with meteorology, has recently been developed for real-time forecasting by the U.S. National Center for Atmospheric Research (NCAR) and the National Oceanic & Atmospheric Administration (NOAA). As one of the six air quality models, WRF/Chem with a modal aerosol module has been applied for ozone and PM2.5 ensemble forecasts over eastern North America as part of the 2004 New England Air Quality Study (NEAQS) program (NEAQS-2004). Significant differences exist in the partitioning of volatile species (e.g., ammonium and nitrate) simulated by the six models. Model biases are partially attributed to the equilibrium assumption used in the gas/particle mass transfer approach in some models. Development of a more accurate, yet computationally-efficient gas/particle mass transfer approach for three-dimensional (3-D) applications, in particular real-time forecasting, is therefore warranted. The Model of Aerosol Dynamics, Reaction, Ionization, and Dissolution (MADRID) has been implemented into WRF/Chem (referred to as WRF/Chem-MADRID). WRF/Chem-MADRID offers three gas/particle partitioning treatments: equilibrium, kinetic, and hybrid approaches. The equilibrium approach is computationally efficient and commonly used in 3-D air quality models but less accurate under certain conditions (e.g., in the presence of coarse, reactive particles such as PM containing sea salts in coastal areas). The kinetic approach is accurate but computationally expensive, limiting its 3-D applications. The hybrid approach attempts to provide a compromise between the merits and drawbacks of the two approaches by treating fine PM (typically < ~1 μm) with the equilibrium approach and coarse PM with the kinetic approach. A computationally-efficient kinetic gas/particle mass transfer approach in MADRID has recently been developed for 3-D applications based on an Analytical Predictor of Condensation (referred
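A toy sketch of the hybrid idea: fine bins are equilibrated instantly while coarse bins relax kinetically (all species, rates, and equilibrium fractions invented for illustration; real schemes solve coupled thermodynamics and condensation physics):

    import numpy as np

    # Hybrid gas/particle mass-transfer toy model for one condensable species.
    dt, n_steps = 60.0, 120                      # time step (s), number of steps
    gas = 10.0                                   # ug/m3 in the gas phase
    fine = np.zeros(2)                           # fine-mode bins (equilibrium)
    coarse = np.zeros(2)                         # coarse-mode bins (kinetic)
    k_coarse = np.array([1e-4, 5e-5])            # 1/s uptake rate coefficients
    eq_split = np.array([0.4, 0.3, 0.2, 0.1])    # hypothetical equilibrium fractions

    for _ in range(n_steps):
        total = gas + fine.sum() + coarse.sum()
        target = eq_split * total                # equilibrium partitioning target
        fine = target[:2]                        # fine bins: instantaneous
        coarse += k_coarse * (target[2:] - coarse) * dt   # coarse: 1st-order kinetics
        gas = total - fine.sum() - coarse.sum()  # mass conservation

    print(gas, fine, coarse)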
Sankaran, Sethuraman; Humphrey, Jay D.; Marsden, Alison L.
2013-01-01
Computational models for vascular growth and remodeling (G&R) are used to predict the long-term response of vessels to changes in pressure, flow, and other mechanical loading conditions. Accurate predictions of these responses are essential for understanding numerous disease processes. Such models require reliable inputs of numerous parameters, including material properties and growth rates, which are often experimentally derived, and inherently uncertain. While earlier methods have used a brute force approach, systematic uncertainty quantification in G&R models promises to provide much better information. In this work, we introduce an efficient framework for uncertainty quantification and optimal parameter selection, and illustrate it via several examples. First, an adaptive sparse grid stochastic collocation scheme is implemented in an established G&R solver to quantify parameter sensitivities, and near-linear scaling with the number of parameters is demonstrated. This non-intrusive and parallelizable algorithm is compared with standard sampling algorithms such as Monte-Carlo. Second, we determine optimal arterial wall material properties by applying robust optimization. We couple the G&R simulator with an adaptive sparse grid collocation approach and a derivative-free optimization algorithm. We show that an artery can achieve optimal homeostatic conditions over a range of alterations in pressure and flow; robustness of the solution is enforced by including uncertainty in loading conditions in the objective function. We then show that homeostatic intramural and wall shear stress is maintained for a wide range of material properties, though the time it takes to achieve this state varies. We also show that the intramural stress is robust and lies within 5% of its mean value for realistic variability of the material parameters. We observe that prestretch of elastin and collagen are most critical to maintaining homeostasis, while values of the material properties are
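The collocation idea, reduced to one uncertain parameter (a stand-in model and Gauss-Legendre nodes rather than an adaptive sparse grid):

    import numpy as np

    # Stochastic collocation in one uncertain parameter: evaluate the
    # expensive model only at quadrature nodes, then integrate.
    def model(kappa):                    # stand-in for the G&R simulator
        return np.exp(-kappa) + 0.1 * kappa ** 2

    nodes, weights = np.polynomial.legendre.leggauss(7)   # nodes on [-1, 1]
    k_lo, k_hi = 0.5, 1.5                # uncertain material parameter range
    kappa = 0.5 * (nodes + 1) * (k_hi - k_lo) + k_lo
    w = weights / 2.0                    # uniform density on [k_lo, k_hi]
    mean = np.sum(w * model(kappa))
    var = np.sum(w * (model(kappa) - mean) ** 2)
    print(mean, var)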
NASA Technical Reports Server (NTRS)
White, C. W.
1981-01-01
The computational efficiency of the impedance type loads prediction method was studied. Three goals were addressed: devise a method to make the impedance method operate more efficiently in the computer; assess the accuracy and convenience of the method for determining the effect of design changes; and investigate the use of the method to identify design changes for reduction of payload loads. The method is suitable for calculation of dynamic response in either the frequency or time domain. It is concluded that: the choice of an orthogonal coordinate system will allow the impedance method to operate more efficiently in the computer; the approximate mode impedance technique is adequate for determining the effect of design changes, and is applicable for both statically determinate and statically indeterminate payload attachments; and beneficial design changes to reduce payload loads can be identified by the combined application of impedance techniques and energy distribution review techniques.
NASA Astrophysics Data System (ADS)
Lin, Y.; O'Malley, D.; Vesselinov, V. V.
2015-12-01
Inverse modeling seeks model parameters given a set of observed state variables. However, for many practical problems, because the observed data sets are often large and the model parameters are often numerous, conventional methods for solving the inverse problem can be computationally expensive. We have developed a new, computationally-efficient Levenberg-Marquardt method for solving large-scale inverse modeling problems. Levenberg-Marquardt methods require the solution of a dense linear system of equations which can be prohibitively expensive to compute for large-scale inverse problems. Our novel method projects the original large-scale linear problem down to a Krylov subspace, such that the dimensionality of the measurements can be significantly reduced. Furthermore, instead of solving the linear system for every Levenberg-Marquardt damping parameter, we store the Krylov subspace computed when solving the first damping parameter and recycle it for all the following damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved by these computational techniques. We apply this new inverse modeling method to invert for a random transmissivity field. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) at each computational node in the model domain. The inversion is also aided by the use of regularization techniques. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Julia is an advanced high-level scientific programming language that allows for efficient memory management and utilization of high-performance computational resources. Compared with a Levenberg-Marquardt method using standard linear inversion techniques, our method yields a speed-up ratio of 15 in a multi-core computational environment and a speed-up ratio of 45 in a single-core computational environment. Therefore, our new inverse modeling method is a
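A sketch of the recycling idea (my illustration, not the MADS implementation): build one Krylov basis for the normal-equations operator, then reuse the small projected system for every damping parameter:

    import numpy as np

    def krylov_lm_steps(J, r, lambdas, k=10):
        # Levenberg-Marquardt trial steps (J^T J + lam I) dx = -J^T r for
        # many damping parameters lam, all taken from one Krylov subspace.
        g = J.T @ r
        A = lambda x: J.T @ (J @ x)            # normal-equations operator
        n = g.size
        V = np.zeros((n, k)); H = np.zeros((k, k))
        V[:, 0] = g / np.linalg.norm(g)
        for j in range(k - 1):                 # Arnoldi orthogonalization
            w = A(V[:, j])
            for i in range(j + 1):
                H[i, j] = V[:, i] @ w
                w = w - H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            V[:, j + 1] = w / H[j + 1, j]
        H[:, k - 1] = V.T @ A(V[:, k - 1])     # complete the projected operator
        gp = V.T @ g                           # projected gradient (computed once)
        steps = []
        for lam in lambdas:                    # recycle V, H, gp for every lam
            y = np.linalg.solve(H + lam * np.eye(k), -gp)
            steps.append(V @ y)
        return steps

    rng = np.random.default_rng(5)
    J = rng.normal(size=(200, 50))
    r = rng.normal(size=200)
    dxs = krylov_lm_steps(J, r, lambdas=[1e-2, 1e-1, 1.0])
    print([np.linalg.norm(dx) for dx in dxs])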
Woodruff, S.B.
1992-05-01
The Transient Reactor Analysis Code (TRAC), which features a two-fluid treatment of thermal-hydraulics, is designed to model transients in water reactors and related facilities. One of the major computational costs associated with TRAC and similar codes is calculating constitutive coefficients. Although the formulations for these coefficients are local, the costs are flow-regime- or data-dependent; i.e., the computations needed for a given spatial node often vary widely as a function of time. Consequently, poor load balancing will degrade efficiency on either vector or data-parallel architectures when the data are organized according to spatial location. Unfortunately, a general automatic solution to the load-balancing problem associated with data-dependent computations is not yet available for massively parallel architectures. This document discusses why developers should consider alternative representations, such as a neural net representation, that do not exhibit load-balancing problems.
Computationally Efficient Use of Derivatives in Emulation of Complex Computational Models
Williams, Brian J.; Marcy, Peter W.
2012-06-07
We will investigate the use of derivative information in complex computer model emulation when the correlation function is of the compactly supported Bohman class. To this end, a Gaussian process model similar to that used by Kaufman et al. (2011) is extended to a situation where first partial derivatives in each dimension are calculated at each input site (i.e. using gradients). A simulation study in the ten-dimensional case is conducted to assess the utility of the Bohman correlation function against strictly positive correlation functions when a high degree of sparsity is induced.
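The Bohman correlation is compactly supported, which is what makes the covariance matrix sparse; a minimal sketch of the function and the resulting sparsity (one input dimension and an invented range parameter):

    import numpy as np

    def bohman(d, theta):
        # Bohman compactly supported correlation: exactly zero beyond range theta.
        x = np.minimum(np.abs(d) / theta, 1.0)
        return np.where(x < 1.0,
                        (1 - x) * np.cos(np.pi * x) + np.sin(np.pi * x) / np.pi,
                        0.0)

    pts = np.sort(np.random.default_rng(6).uniform(0, 10, 40))
    K = bohman(pts[:, None] - pts[None, :], theta=1.5)
    print(np.mean(K == 0.0))          # fraction of exact zeros -> sparsity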
Efficient and Flexible Computation of Many-Electron Wave Function Overlaps
2016-01-01
A new algorithm for the computation of the overlap between many-electron wave functions is described. This algorithm allows for the extensive use of recurring intermediates and thus provides high computational efficiency. Because of the general formalism employed, overlaps can be computed for varying wave function types, molecular orbitals, basis sets, and molecular geometries. This paves the way for efficiently computing nonadiabatic interaction terms for dynamics simulations. In addition, other application areas can be envisaged, such as the comparison of wave functions constructed at different levels of theory. Aside from explaining the algorithm and evaluating the performance, a detailed analysis of the numerical stability of wave function overlaps is carried out, and strategies for overcoming potential severe pitfalls due to displaced atoms and truncated wave functions are presented. PMID:26854874
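The elementary building block of such overlap codes is the overlap of two single determinants over non-orthogonal orbitals; a minimal sketch (up to normalization, and without the recurring-intermediate machinery that makes the actual algorithm fast):

    import numpy as np

    def determinant_overlap(C1, C2, S_ao):
        # Overlap of two single-determinant wave functions with occupied MO
        # coefficient matrices C1, C2 (n_ao x n_occ) over a shared AO overlap
        # matrix S_ao. CI wave functions require sums over many such minors,
        # which is where reusing intermediates pays off.
        return np.linalg.det(C1.T @ S_ao @ C2)

    rng = np.random.default_rng(7)
    n_ao, n_occ = 8, 3
    A = rng.normal(size=(n_ao, n_ao))
    S_ao = A @ A.T / n_ao + np.eye(n_ao)        # symmetric positive definite
    C1 = rng.normal(size=(n_ao, n_occ))
    C2 = rng.normal(size=(n_ao, n_occ))
    print(determinant_overlap(C1, C2, S_ao))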
Development of efficient computer program for dynamic simulation of telerobotic manipulation
NASA Technical Reports Server (NTRS)
Chen, J.; Ou, Y. J.
1989-01-01
Research in robot control has generated interest in computationally efficient forms of dynamic equations for multi-body systems. For a simply connected open-loop linkage, dynamic equations arranged in recursive form were found to be particularly efficient. A general computer program capable of simulating an open-loop manipulator with an arbitrary number of links has been developed based on an efficient recursive form of Kane's dynamic equations. Also included in the program is some of the important dynamics of the joint drive system, i.e., the rotational effect of the motor rotors. Further efficiency is achieved by the use of a symbolic manipulation program to generate the FORTRAN simulation program tailored to a specific manipulator based on the parameter values given. The formulations and the validation of the program are described, and some results are shown.
A new computationally-efficient computer program for simulating spectral gamma-ray logs
Conaway, J.G.
1995-12-31
Several techniques to improve the accuracy of radionuclide concentration estimates as a function of depth from gamma-ray logs have appeared in the literature. Much of that work was driven by interest in uranium as an economic mineral. More recently, the problem of mapping and monitoring artificial gamma-emitting contaminants in the ground has rekindled interest in improving the accuracy of radioelement concentration estimates from gamma-ray logs. We are looking at new approaches to accomplishing such improvements. The first step in this effort has been to develop a new computational model of a spectral gamma-ray logging sonde in a borehole environment. The model supports attenuation in any combination of materials arranged in 2-D cylindrical geometry, including any combination of attenuating materials in the borehole, formation, and logging sonde. The model can also handle any distribution of sources in the formation. The model considers unscattered radiation only, as represented by the background-corrected area under a given spectral photopeak as a function of depth. Benchmark calculations using the standard Monte Carlo model MCNP show excellent agreement with total gamma flux estimates with a computation time of about 0.01% of the time required for the MCNP calculations. This model lacks the flexibility of MCNP, although for this application a great deal can be accomplished without that flexibility.
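A one-dimensional toy version of the unscattered-flux calculation (invented attenuation coefficient and geometry; the actual model handles arbitrary 2-D cylindrical layers):

    import numpy as np

    # Unscattered (photopeak) flux at a detector from a source layer,
    # integrating exponential attenuation along straight ray paths.
    mu = 0.12                                 # 1/cm, linear attenuation (illustrative)
    z_src = np.linspace(-50, 50, 2001)        # source depths along the borehole, cm
    dz = z_src[1] - z_src[0]
    conc = np.where(np.abs(z_src) < 10, 1.0, 0.0)   # layer from -10 to 10 cm
    standoff = 5.0                            # radial source-detector distance, cm
    r = np.hypot(standoff, z_src)             # detector at depth z = 0
    flux = np.sum(conc * np.exp(-mu * r) / (4 * np.pi * r ** 2) * dz)
    print(flux)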
Bubbles, Clusters and Denaturation in Genomic Dna: Modeling, Parametrization, Efficient Computation
NASA Astrophysics Data System (ADS)
Theodorakopoulos, Nikos
2011-08-01
The paper uses mesoscopic, non-linear lattice dynamics based (Peyrard-Bishop-Dauxois, PBD) modeling to describe thermal properties of DNA below and near the denaturation temperature. Computationally efficient notation is introduced for the relevant statistical mechanics. Computed melting profiles of long and short heterogeneous sequences are presented, using a recently introduced reparametrization of the PBD model, and critically discussed. The statistics of extended open bubbles and bound clusters is formulated and results are presented for selected examples.
Energy-Efficient Computational Chemistry: Comparison of x86 and ARM Systems.
Keipert, Kristopher; Mitra, Gaurav; Sunriyal, Vaibhav; Leang, Sarom S; Sosonkina, Masha; Rendell, Alistair P; Gordon, Mark S
2015-11-10
The computational efficiency and energy-to-solution of several applications using the GAMESS quantum chemistry suite of codes are evaluated for 32-bit and 64-bit ARM-based computers, and compared to an x86 machine. The x86 system completes all benchmark computations more quickly than either ARM system and is the best choice to minimize time to solution. The ARM64 and ARM32 computational performances are similar to each other for Hartree-Fock and density functional theory energy calculations. However, for memory-intensive second-order perturbation theory energy and gradient computations the lower ARM32 read/write memory bandwidth results in computation times as much as 86% longer than on the ARM64 system. The ARM32 system is more energy efficient than the x86 and ARM64 CPUs for all benchmarked methods, while the ARM64 CPU is more energy efficient than the x86 CPU for some core counts and molecular sizes. PMID:26574303
Stone, John E.; Hallock, Michael J.; Phillips, James C.; Peterson, Joseph R.; Luthey-Schulten, Zaida; Schulten, Klaus
2016-01-01
Many of the continuing scientific advances achieved through computational biology are predicated on the availability of ongoing increases in computational power required for detailed simulation and analysis of cellular processes on biologically-relevant timescales. A critical challenge facing the development of future exascale supercomputer systems is the development of new computing hardware and associated scientific applications that dramatically improve upon the energy efficiency of existing solutions, while providing increased simulation, analysis, and visualization performance. Mobile computing platforms have recently become powerful enough to support interactive molecular visualization tasks that were previously only possible on laptops and workstations, creating future opportunities for their convenient use for meetings, remote collaboration, and as head mounted displays for immersive stereoscopic viewing. We describe early experiences adapting several biomolecular simulation and analysis applications for emerging heterogeneous computing platforms that combine power-efficient system-on-chip multi-core CPUs with high-performance massively parallel GPUs. We present low-cost power monitoring instrumentation that provides sufficient temporal resolution to evaluate the power consumption of individual CPU algorithms and GPU kernels. We compare the performance and energy efficiency of scientific applications running on emerging platforms with results obtained on traditional platforms, identify hardware and algorithmic performance bottlenecks that affect the usability of these platforms, and describe avenues for improving both the hardware and applications in pursuit of the needs of molecular modeling tasks on mobile devices and future exascale computers. PMID:27516922
Zou, Han; Jiang, Hao; Luo, Yiwen; Zhu, Jianjie; Lu, Xiaoxuan; Xie, Lihua
2016-01-01
The location and contextual status (indoor or outdoor) is fundamental and critical information for upper-layer applications, such as activity recognition and location-based services (LBS) for individuals. In addition, optimizations of building management systems (BMS), such as the pre-cooling or heating process of the air-conditioning system according to the human traffic entering or exiting a building, can utilize the information as well. Emerging mobile devices, which are equipped with various sensors, have become a feasible and flexible platform to perform indoor-outdoor (IO) detection. However, power-hungry sensors, such as GPS and WiFi, should be used with caution due to the constrained battery storage on mobile devices. We propose BlueDetect: an accurate, fast-response and energy-efficient scheme for IO detection and seamless LBS running on the mobile device based on the emerging low-power iBeacon technology. By leveraging the on-board Bluetooth module and our proposed algorithms, BlueDetect provides a precise IO detection service that can turn on/off on-board power-hungry sensors smartly and automatically, optimize their performance and reduce the power consumption of mobile devices simultaneously. Moreover, seamless positioning and navigation services can be realized by it, especially in a semi-outdoor environment, which cannot be achieved by GPS or an indoor positioning system (IPS) easily. We prototype BlueDetect on Android mobile devices and evaluate its performance comprehensively. The experimental results have validated the superiority of BlueDetect in terms of IO detection accuracy, localization accuracy and energy consumption. PMID:26907295
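A toy decision rule conveys the flavor of iBeacon-based IO detection (thresholds and rule invented; BlueDetect's actual logic is richer and fuses additional signals):

    def io_state(visible_beacon_rssi, rssi_floor=-90):
        # Toy rule: the more iBeacons audible above the floor, the deeper
        # indoors the device is likely to be.
        audible = [r for r in visible_beacon_rssi if r > rssi_floor]
        if len(audible) >= 2:
            return "indoor"
        if len(audible) == 1:
            return "semi-outdoor"
        return "outdoor"

    print(io_state([-72, -85]))      # indoor
    print(io_state([-88]))           # semi-outdoor
    print(io_state([]))              # outdoor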
Lee, Hui Sun; Im, Wonpil
2016-04-01
Molecular recognition by protein mostly occurs in a local region on the protein surface. Thus, an efficient computational method for accurate characterization of protein local structural conservation is necessary to better understand biology and drug design. We present a novel local structure alignment tool, G-LoSA. G-LoSA aligns protein local structures in a sequence order independent way and provides a GA-score, a chemical feature-based and size-independent structure similarity score. Our benchmark validation shows the robust performance of G-LoSA to the local structures of diverse sizes and characteristics, demonstrating its universal applicability to local structure-centric comparative biology studies. In particular, G-LoSA is highly effective in detecting conserved local regions on the entire surface of a given protein. In addition, the applications of G-LoSA to identifying template ligands and predicting ligand and protein binding sites illustrate its strong potential for computer-aided drug design. We hope that G-LoSA can be a useful computational method for exploring interesting biological problems through large-scale comparison of protein local structures and facilitating drug discovery research and development. G-LoSA is freely available to academic users at http://im.compbio.ku.edu/GLoSA/. PMID:26813336
Spin-neurons: A possible path to energy-efficient neuromorphic computers
NASA Astrophysics Data System (ADS)
Sharad, Mrigank; Fan, Deliang; Roy, Kaushik
2013-12-01
Recent years have witnessed growing interest in the field of brain-inspired computing based on neural-network architectures. In order to translate the related algorithmic models into powerful, yet energy-efficient cognitive-computing hardware, computing-devices beyond CMOS may need to be explored. The suitability of such devices to this field of computing would strongly depend upon how closely their physical characteristics match with the essential computing primitives employed in such models. In this work, we discuss the rationale of applying emerging spin-torque devices for bio-inspired computing. Recent spin-torque experiments have shown the path to low-current, low-voltage, and high-speed magnetization switching in nano-scale magnetic devices. Such magneto-metallic, current-mode spin-torque switches can mimic the analog summing and "thresholding" operation of an artificial neuron with high energy-efficiency. Comparison with CMOS-based analog circuit-model of a neuron shows that "spin-neurons" (spin based circuit model of neurons) can achieve more than two orders of magnitude lower energy and beyond three orders of magnitude reduction in energy-delay product. The application of spin-neurons can therefore be an attractive option for neuromorphic computers of future.
NREL's Building-Integrated Supercomputer Provides Heating and Efficient Computing (Fact Sheet)
Not Available
2014-09-01
NREL's Energy Systems Integration Facility (ESIF) is meant to investigate new ways to integrate energy sources so they work together efficiently, and one of the key tools to that investigation, a new supercomputer, is itself a prime example of energy systems integration. NREL teamed with Hewlett-Packard (HP) and Intel to develop the innovative warm-water, liquid-cooled Peregrine supercomputer, which not only operates efficiently but also serves as the primary source of building heat for ESIF offices and laboratories. This innovative high-performance computer (HPC) can perform more than a quadrillion calculations per second as part of the world's most energy-efficient HPC data center.
NASA Astrophysics Data System (ADS)
Je, U. K.; Cho, H. M.; Cho, H. S.; Park, Y. O.; Park, C. K.; Lim, H. W.; Kim, K. S.; Kim, G. A.; Park, S. Y.; Woo, T. H.; Choi, S. I.
2016-02-01
In this paper, we propose a new, next-generation type of CT examination, the so-called interior computed tomography (ICT), which may presumably lead to dose reduction to the patient outside the target region-of-interest (ROI) in dental x-ray imaging. Here, the x-ray beam from each projection position covers only a relatively small ROI containing the diagnostic target within the examined structure, leading to imaging benefits such as reduced scatter and system cost as well as reduced imaging dose. We considered the compressed-sensing (CS) framework, rather than common filtered-backprojection (FBP)-based algorithms, for more accurate ICT reconstruction. We implemented a CS-based ICT algorithm and performed a systematic simulation to investigate the imaging characteristics. Simulation conditions with two ROI ratios of 0.28 and 0.14 between the target and the whole phantom size and four projection numbers of 360, 180, 90, and 45 were tested. We successfully reconstructed ICT images of substantially high image quality by using the CS framework even with few-view projection data, while still preserving sharp edges in the images.
Framework for computationally efficient optimal irrigation scheduling using ant colony optimization
Technology Transfer Automated Retrieval System (TEKTRAN)
A general optimization framework is introduced with the overall goal of reducing search space size and increasing the computational efficiency of evolutionary algorithm application for optimal irrigation scheduling. The framework achieves this goal by representing the problem in the form of a decisi...
Using Neural Net Technology To Enhance the Efficiency of a Computer Adaptive Testing Application.
ERIC Educational Resources Information Center
Van Nelson, C.; Henriksen, Larry W.
The potential for computer adaptive testing (CAT) has been well documented. In order to improve the efficiency of this process, it may be possible to utilize a neural network, or more specifically, a back propagation neural network. The paper asserts that in order to accomplish this end, it must be shown that grouping examinees by ability as…
The Improvement of Efficiency in the Numerical Computation of Orbit Trajectories
NASA Technical Reports Server (NTRS)
Dyer, J.; Danchick, R.; Pierce, S.; Haney, R.
1972-01-01
An analysis, system design, programming, and evaluation of results are described for numerical computation of orbit trajectories. Evaluation of generalized methods, interaction of different formulations for satellite motion, transformation of equations of motion and integrator loads, and development of efficient integrators are also considered.
ERIC Educational Resources Information Center
Anglin, Linda; Anglin, Kenneth; Schumann, Paul L.; Kaliski, John A.
2008-01-01
This study tests the use of computer-assisted grading rubrics compared to other grading methods with respect to the efficiency and effectiveness of different grading processes for subjective assignments. The test was performed on a large Introduction to Business course. The students in this course were randomly assigned to four treatment groups…
Efficient shortest-path-tree computation in network routing based on pulse-coupled neural networks.
Qu, Hong; Yi, Zhang; Yang, Simon X
2013-06-01
Shortest path tree (SPT) computation is a critical issue for routers using link-state routing protocols, such as the most commonly used open shortest path first and intermediate system to intermediate system. Each router needs to recompute a new SPT rooted from itself whenever a change happens in the link state. Most commercial routers do this computation by deleting the current SPT and building a new one from scratch using static algorithms such as Dijkstra's algorithm. Such recomputation of an entire SPT is inefficient and may consume a considerable amount of CPU time and result in a time delay in the network. Some dynamic updating methods using the information in the updated SPT have been proposed in recent years. However, there are still many limitations in those dynamic algorithms. In this paper, a new modified model of pulse-coupled neural networks (M-PCNNs) is proposed for the SPT computation. It is rigorously proved that the proposed model is capable of solving some optimization problems, such as the SPT. A static algorithm is proposed based on the M-PCNNs to compute the SPT efficiently for large-scale problems. In addition, a dynamic algorithm that makes use of the structure of the previously computed SPT is proposed, which significantly improves the efficiency of the algorithm. Simulation results demonstrate the effective and efficient performance of the proposed approach. PMID:23144039
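For reference, the static baseline that such dynamic schemes improve upon is a full Dijkstra recomputation of the SPT. A minimal sketch follows; the adjacency-list encoding is an illustrative assumption, and the M-PCNN and dynamic-update algorithms of the paper are not reproduced here.

```python
import heapq

def shortest_path_tree(graph, root):
    """Static SPT computation with Dijkstra's algorithm.

    graph: {node: [(neighbor, positive_link_cost), ...]}
    Returns (dist, parent); parent encodes the SPT rooted at root.
    """
    dist, parent = {root: 0}, {root: None}
    heap = [(0, root)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                   # stale heap entry, skip
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], parent[v] = nd, u
                heapq.heappush(heap, (nd, v))
    return dist, parent
```

Rebuilding this tree from scratch after every link-state change is exactly the cost that dynamic SPT updates try to avoid.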
A computationally efficient OMP-based compressed sensing reconstruction for dynamic MRI
NASA Astrophysics Data System (ADS)
Usman, M.; Prieto, C.; Odille, F.; Atkinson, D.; Schaeffter, T.; Batchelor, P. G.
2011-04-01
Compressed sensing (CS) methods in MRI are computationally intensive. Thus, designing novel CS algorithms that can perform faster reconstructions is crucial for everyday applications. We propose a computationally efficient orthogonal matching pursuit (OMP)-based reconstruction, specifically suited to cardiac MR data. According to the energy distribution of a y-f space obtained from a sliding window reconstruction, we label the y-f space as static or dynamic. For static y-f space images, a computationally efficient masked OMP reconstruction is performed, whereas for dynamic y-f space images, standard OMP reconstruction is used. The proposed method was tested on a dynamic numerical phantom and two cardiac MR datasets. Depending on the field of view composition of the imaging data, compared to the standard OMP method, reconstruction speedup factors ranging from 1.5 to 2.5 are achieved.
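As background, plain OMP greedily grows the support of the sparse solution, re-solving a least-squares problem on the current support at each step. The sketch below is the textbook algorithm in numpy; the paper's masked variant, applied selectively according to the static/dynamic y-f labeling, is not shown.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: recover a k-sparse x with y ~ A @ x.

    A: (m, n) sensing matrix, ideally with unit-norm columns.
    """
    residual, support = y.astype(complex), []
    for _ in range(k):
        # Greedy step: pick the column most correlated with the residual.
        j = int(np.argmax(np.abs(A.conj().T @ residual)))
        if j not in support:
            support.append(j)
        # Projection step: least squares on the chosen support.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = coef
    return x
```

MR data are complex-valued, hence the conjugate transpose in the correlation step.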
NASA Astrophysics Data System (ADS)
Yanai, Takeshi; Nakajima, Takahito; Ishikawa, Yasuyuki; Hirao, Kimihiko
2001-04-01
A highly efficient computational scheme for four-component relativistic ab initio molecular orbital (MO) calculations over generally contracted spherical harmonic Gaussian-type spinors (GTSs) is presented. Benchmark calculations for the ground states of the group IB hydrides, MH, and dimers, M2 (M=Cu, Ag, and Au), by the Dirac-Hartree-Fock (DHF) method were performed with a new four-component relativistic ab initio MO program package oriented toward contracted GTSs. The relativistic electron repulsion integrals (ERIs), the major bottleneck in routine DHF calculations, are calculated efficiently employing the fast ERI routine SPHERICA, exploiting the general contraction scheme, and the accompanying coordinate expansion method developed by Ishida. Illustrative calculations clearly show the efficiency of our computational scheme.
Hough, Patricia Diane (Sandia National Laboratories, Livermore, CA); Gray, Genetha Anne (Sandia National Laboratories, Livermore, CA); Castro, Joseph Pete Jr.; Giunta, Anthony Andrew
2006-01-01
Many engineering application problems use optimization algorithms in conjunction with numerical simulators to search for solutions. The formulation of relevant objective functions and constraints dictate possible optimization algorithms. Often, a gradient based approach is not possible since objective functions and constraints can be nonlinear, nonconvex, non-differentiable, or even discontinuous and the simulations involved can be computationally expensive. Moreover, computational efficiency and accuracy are desirable and also influence the choice of solution method. With the advent and increasing availability of massively parallel computers, computational speed has increased tremendously. Unfortunately, the numerical and model complexities of many problems still demand significant computational resources. Moreover, in optimization, these expenses can be a limiting factor since obtaining solutions often requires the completion of numerous computationally intensive simulations. Therefore, we propose a multifidelity optimization algorithm (MFO) designed to improve the computational efficiency of an optimization method for a wide range of applications. In developing the MFO algorithm, we take advantage of the interactions between multi fidelity models to develop a dynamic and computational time saving optimization algorithm. First, a direct search method is applied to the high fidelity model over a reduced design space. In conjunction with this search, a specialized oracle is employed to map the design space of this high fidelity model to that of a computationally cheaper low fidelity model using space mapping techniques. Then, in the low fidelity space, an optimum is obtained using gradient or non-gradient based optimization, and it is mapped back to the high fidelity space. In this paper, we describe the theory and implementation details of our MFO algorithm. We also demonstrate our MFO method on some example problems and on two applications: earth penetrators and
A uniform algebraically-based approach to computational physics and efficient programming
NASA Astrophysics Data System (ADS)
Raynolds, James; Mullin, Lenore
2007-03-01
We present an approach to computational physics in which a common formalism is used both to express the physical problem and to describe the underlying details of how computation is realized on arbitrary multiprocessor/memory computer architectures. This formalism is the embodiment of a generalized algebra of multi-dimensional arrays (A Mathematics of Arrays), and an efficient computational implementation is obtained through the composition of array indices (the psi-calculus) of algorithms defined using matrices, tensors, and arrays in general. The power of this approach arises from the fact that multiple computational steps (e.g. Fourier transform followed by convolution, etc.) can be algebraically composed and reduced to a simplified expression (i.e. Operational Normal Form) that, when directly translated into computer code, can be mathematically proven to be the most efficient implementation with the least number of temporary variables, etc. This approach will be illustrated in the context of a cache-optimized FFT that outperforms or is competitive with established library routines: ESSL, FFTW, IMSL, NAG.
Accurate Finite Difference Algorithms
NASA Technical Reports Server (NTRS)
Goodrich, John W.
1996-01-01
Two families of finite difference algorithms for computational aeroacoustics are presented and compared. All of the algorithms are single step explicit methods, they have the same order of accuracy in both space and time, with examples up to eleventh order, and they have multidimensional extensions. One of the algorithm families has spectral like high resolution. Propagation with high order and high resolution algorithms can produce accurate results after O(10(exp 6)) periods of propagation with eight grid points per wavelength.
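As a concrete illustration of one explicit finite-difference building block, here is a fourth-order central difference for a first derivative on a periodic grid, together with a quick accuracy check. This is only a low-order orientation example; the families in the paper reach eleventh order and include the matching time integration.

```python
import numpy as np

def d1_fourth_order(u, h):
    """Fourth-order central difference for du/dx on a periodic grid."""
    return (-np.roll(u, -2) + 8 * np.roll(u, -1)
            - 8 * np.roll(u, 1) + np.roll(u, 2)) / (12.0 * h)

# Accuracy check on u = sin(x): the error should drop ~16x per grid
# doubling, consistent with O(h^4).
x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
h = x[1] - x[0]
err = np.max(np.abs(d1_fourth_order(np.sin(x), h) - np.cos(x)))
print(f"max pointwise error: {err:.2e}")
```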
A computationally efficient model for turbulent droplet dispersion in spray combustion
NASA Technical Reports Server (NTRS)
Litchford, Ron J.; Jeng, San-Mou
1990-01-01
A novel model for turbulent droplet dispersion is formulated having significantly improved computational efficiency in comparison to the conventional point-source stochastic sampling methodology. In the proposed model, a computational parcel representing a group of physical particles is considered to have a normal (Gaussian) probability density function (PDF) in three-dimensional space. The mean of each PDF is determined by Lagrangian tracking of each computational parcel, either deterministically or stochastically. The variance is represented by a turbulence-induced mean squared dispersion which is based on statistical inferences from the linearized direct modeling formulation for particle/eddy interactions. Convolution of the computational parcel PDFs produces a single PDF for the physical particle distribution profile. The validity of the new model is established by comparison with the conventional stochastic sampling method, wherein each parcel is represented by a delta-function distribution, for non-evaporating particles injected into simple turbulent air flows.
A computationally efficient approach for hidden-Markov model-augmented fingerprint-based positioning
NASA Astrophysics Data System (ADS)
Roth, John; Tummala, Murali; McEachen, John
2016-09-01
This paper presents a computationally efficient approach for mobile subscriber position estimation in wireless networks. A method of data scaling assisted by timing adjust is introduced in fingerprint-based location estimation under a framework which allows for minimising computational cost. The proposed method maintains a comparable level of accuracy to the traditional case where no data scaling is used and is evaluated in a simulated environment under varying channel conditions. The proposed scheme is studied when it is augmented by a hidden-Markov model to match the internal parameters to the channel conditions that are present, thus minimising computational cost while maximising accuracy. Furthermore, the timing adjust quantity, available in modern wireless signalling messages, is shown to be able to further reduce computational cost and increase accuracy when available. The results may be seen as a significant step towards integrating advanced position-based modelling with power-sensitive mobile devices.
NASA Astrophysics Data System (ADS)
Camacho, Miguel; Boix, Rafael R.; Medina, Francisco
2016-06-01
The authors present a computationally efficient technique for the analysis of extraordinary transmission through both infinite and truncated periodic arrays of slots in perfect conductor screens of negligible thickness. An integral equation is obtained for the tangential electric field in the slots both in the infinite case and in the truncated case. The unknown functions are expressed as linear combinations of known basis functions, and the unknown weight coefficients are determined by means of Galerkin's method. The coefficients of Galerkin's matrix are obtained in the spatial domain in terms of double finite integrals containing the Green's functions (which, in the infinite case, is efficiently computed by means of Ewald's method) times cross-correlations between both the basis functions and their divergences. The computation in the spatial domain is an efficient alternative to the direct computation in the spectral domain since this latter approach involves the determination of either slowly convergent double infinite summations (infinite case) or slowly convergent double infinite integrals (truncated case). The results obtained are validated by means of commercial software, and it is found that the integral equation technique presented in this paper is at least two orders of magnitude faster than commercial software for a similar accuracy. It is also shown that the phenomena related to periodicity such as extraordinary transmission and Wood's anomaly start to appear in the truncated case for arrays with more than 100 (10 ×10 ) slots.
An efficient sparse matrix multiplication scheme for the CYBER 205 computer
NASA Technical Reports Server (NTRS)
Lambiotte, Jules J., Jr.
1988-01-01
This paper describes the development of an efficient algorithm for computing the product of a matrix and vector on a CYBER 205 vector computer. The desire to provide software which allows the user to choose between the often conflicting goals of minimizing central processing unit (CPU) time or storage requirements has led to a diagonal-based algorithm in which one of four types of storage is selected for each diagonal. The candidate storage types employed were chosen to be efficient on the CYBER 205 for diagonals which have nonzero structure which is dense, moderately sparse, very sparse and short, or very sparse and long; however, for many densities, no diagonal type is most efficient with respect to both resource requirements, and a trade-off must be made. For each diagonal, an initialization subroutine estimates the CPU time and storage required for each storage type based on results from previously performed numerical experimentation. These requirements are adjusted by weights provided by the user which reflect the relative importance the user places on the two resources. The adjusted resource requirements are then compared to select the most efficient storage and computational scheme.
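The heart of any diagonal-based scheme is a matrix-vector product accumulated one diagonal at a time. A dense-diagonal sketch in numpy follows; the storage convention (diags[i][j] holds A[j, j+offset]) is an assumption made for illustration, and the CYBER 205 cost model that chooses among the four storage types is not modeled.

```python
import numpy as np

def dia_matvec(offsets, diags, x):
    """y = A @ x with A stored by diagonals.

    offsets[i]: diagonal offset (0 = main, +k above, -k below).
    diags[i]:   length-n array with diags[i][j] = A[j, j + offsets[i]]
                (entries falling outside the matrix are ignored).
    """
    n, y = x.size, np.zeros_like(x, dtype=float)
    for off, d in zip(offsets, diags):
        if off >= 0:
            y[:n - off] += d[:n - off] * x[off:]
        else:
            y[-off:] += d[-off:] * x[:n + off]
    return y

# Tridiagonal example: A = tridiag(-1, 2, -1), so A @ ones = [1,0,...,0,1].
n = 5
main, off1 = 2.0 * np.ones(n), -1.0 * np.ones(n)
print(dia_matvec([0, 1, -1], [main, off1, off1], np.ones(n)))
```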
Efficient scatter model for simulation of ultrasound images from computed tomography data
NASA Astrophysics Data System (ADS)
D'Amato, J. P.; Lo Vercio, L.; Rubi, P.; Fernandez Vera, E.; Barbuzza, R.; Del Fresno, M.; Larrabide, I.
2015-12-01
Background and motivation: Real-time ultrasound simulation refers to the process of computationally creating fully synthetic ultrasound images instantly. Due to the high value of specialized low-cost training for healthcare professionals, there is a growing interest in the use of this technology and in the development of high-fidelity systems that simulate the acquisition of echographic images. The objective is to create an efficient and reproducible simulator that can run either on notebooks or desktops using low-cost devices. Materials and methods: We present an interactive ultrasound simulator based on CT data. This simulator is based on ray-casting and provides real-time interaction capabilities. The simulation of scattering that is coherent with the transducer position in real time is also introduced. Such noise is produced using a simplified model of multiplicative noise and convolution with point spread functions (PSF) tailored for this purpose. Results: The computational efficiency of scattering-map generation was revised, with improved performance. This allowed a more efficient simulation of coherent scattering in the synthetic echographic images while providing highly realistic results. We describe some quality and performance metrics to validate these results, where a performance of up to 55 fps was achieved. Conclusion: The proposed technique for real-time scattering modeling provides realistic yet computationally efficient scatter distributions. The error between the original image and the simulated scattering image was compared for the proposed method and the state of the art, showing negligible differences in its distribution.
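The multiplicative-noise-plus-PSF construction can be captured in a few lines. The toy version below assumes a Rayleigh-distributed multiplicative field and a separable Gaussian PSF elongated along one axis; both are illustrative stand-ins for the tuned model in the paper.

```python
import numpy as np
from scipy.signal import convolve2d

def simulate_speckle(echo_map, psf, seed=0):
    """Toy scatter model: multiplicative noise convolved with a PSF.

    echo_map: 2-D echogenicity map (e.g. derived from CT intensities).
    psf: small 2-D point spread function of the simulated transducer.
    """
    rng = np.random.default_rng(seed)
    noise = rng.rayleigh(scale=1.0, size=echo_map.shape)   # multiplicative
    return convolve2d(echo_map * noise, psf, mode="same", boundary="symm")

# Separable Gaussian PSF, broader axially than laterally.
ax = np.arange(-3, 4, dtype=float)
psf = np.exp(-ax[:, None] ** 2 / 8.0) * np.exp(-ax[None, :] ** 2 / 2.0)
psf /= psf.sum()
```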
Can computational efficiency alone drive the evolution of modularity in neural networks?
Tosh, Colin R.
2016-01-01
Some biologists have abandoned the idea that computational efficiency in processing multipart tasks or input sets alone drives the evolution of modularity in biological networks. A recent study confirmed that small modular (neural) networks are relatively computationally inefficient but that large modular networks are slightly more efficient than non-modular ones. The present study determines whether these efficiency advantages with network size can drive the evolution of modularity in networks whose connective architecture can evolve. The answer is no, but the reason why is interesting. All simulations (run in a wide variety of parameter states) involving gradualistic connective evolution end in non-modular local attractors. Thus, while a high-performance modular attractor exists, such regions cannot be reached by gradualistic evolution. Non-gradualistic evolutionary simulations in which multi-modularity is obtained through duplication of existing architecture appear viable. Fundamentally, this study indicates that computational efficiency alone does not drive the evolution of modularity, even in large biological networks, but it may still be a viable mechanism when networks evolve by non-gradualistic means. PMID:27573614
Unified commutation-pruning technique for efficient computation of composite DFTs
NASA Astrophysics Data System (ADS)
Castro-Palazuelos, David E.; Medina-Melendrez, Modesto Gpe.; Torres-Roman, Deni L.; Shkvarko, Yuriy V.
2015-12-01
An efficient computation of a composite length discrete Fourier transform (DFT), as well as a fast Fourier transform (FFT) of both time and space data sequences in uncertain (non-sparse or sparse) computational scenarios, requires specific processing algorithms. Traditional algorithms typically employ some pruning methods without any commutations, which prevents them from attaining the potential computational efficiency. In this paper, we propose an alternative unified approach with automatic commutations between three computational modalities aimed at efficient computations of the pruned DFTs adapted for variable composite lengths of the non-sparse input-output data. The first modality is an implementation of the direct computation of a composite length DFT, the second one employs the second-order recursive filtering method, and the third one performs the new pruned decomposed transform. The pruned decomposed transform algorithm performs the decimation in time or space (DIT) in the data acquisition domain and, then, decimation in frequency (DIF). The unified combination of these three algorithms is addressed as the DFTCOMM technique. Based on the treatment of the combinational-type hypotheses testing optimization problem of preferable allocations between all feasible commuting-pruning modalities, we have found the global optimal solution to the pruning problem, one that always requires fewer or, at most, the same number of arithmetic operations than any other feasible modality. The DFTCOMM method therefore outperforms the existing competing pruning techniques in the sense of attainable savings in the number of required arithmetic operations. Finally, we provide the comparison of the DFTCOMM with the recently developed sparse fast Fourier transform (SFFT) algorithmic family. We feature that, in the sensing scenarios with
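Output pruning itself is easy to state: when only a handful of DFT bins are needed, a direct evaluation of just those bins beats a full FFT. The sketch below shows that baseline only; the commutation logic among the three DFTCOMM modalities is well beyond a few lines.

```python
import numpy as np

def pruned_dft(x, wanted_bins):
    """Direct DFT restricted to the output bins actually required.

    Cost is O(N * len(wanted_bins)) versus O(N log N) for a full FFT,
    a win whenever very few bins are needed.
    """
    N = x.size
    n = np.arange(N)
    return {k: np.sum(x * np.exp(-2j * np.pi * k * n / N))
            for k in wanted_bins}

x = np.random.default_rng(1).standard_normal(1200)  # composite N = 1200
out = pruned_dft(x, wanted_bins=[0, 5, 17])
```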
Efficient O(N) recursive computation of the operational space inertial matrix
Lilly, K.W.; Orin, D.E.
1993-09-01
The operational space inertia matrix Λ reflects the dynamic properties of a robot manipulator to its tip. In the control domain, it may be used to decouple force and/or motion control about the manipulator workspace axes. The matrix Λ also plays an important role in the development of efficient algorithms for the dynamic simulation of closed-chain robotic mechanisms, including simple closed-chain mechanisms such as multiple manipulator systems and walking machines. The traditional approach used to compute Λ has a computational complexity of O(N^3) for an N degree-of-freedom manipulator. This paper presents the development of a recursive algorithm for computing the operational space inertia matrix (OSIM) that reduces the computational complexity to O(N). This algorithm, the inertia propagation method, is based on a single recursion that begins at the base of the manipulator and progresses out to the last link. Also applicable to redundant systems and mechanisms with multiple-degree-of-freedom joints, the inertia propagation method is the most efficient method known for computing Λ for N >= 6. The numerical accuracy of the algorithm is discussed for a PUMA 560 robot with a fixed base.
Seny, Bruno; Lambrechts, Jonathan; Toulorge, Thomas; Legat, Vincent; Remacle, Jean-François
2014-01-01
Although explicit time integration schemes require small computational efforts per time step, their efficiency is severely restricted by their stability limits. Indeed, the multi-scale nature of some physical processes combined with highly unstructured meshes can lead some elements to impose a severely small stable time step for a global problem. Multirate methods offer a way to increase the global efficiency by gathering grid cells in appropriate groups under local stability conditions. These methods are well suited to the discontinuous Galerkin framework. The parallelization of the multirate strategy is challenging because grid cells have different workloads. The computational cost is different for each sub-time step depending on the elements involved and a classical partitioning strategy is not adequate any more. In this paper, we propose a solution that makes use of multi-constraint mesh partitioning. It tends to minimize the inter-processor communications, while ensuring that the workload is almost equally shared by every computer core at every stage of the algorithm. Particular attention is given to the simplicity of the parallel multirate algorithm while minimizing computational and communication overheads. Our implementation makes use of the MeTiS library for mesh partitioning and the Message Passing Interface for inter-processor communication. Performance analyses for two and three-dimensional practical applications confirm that multirate methods preserve important computational advantages of explicit methods up to a significant number of processors.
Serang, Oliver; MacCoss, Michael J.; Noble, William Stafford
2010-01-01
The problem of identifying proteins from a shotgun proteomics experiment has not been definitively solved. Identifying the proteins in a sample requires ranking them, ideally with interpretable scores. In particular, “degenerate” peptides, which map to multiple proteins, have made such a ranking difficult to compute. The problem of computing posterior probabilities for the proteins, which can be interpreted as confidence in a protein’s presence, has been especially daunting. Previous approaches have either ignored the peptide degeneracy problem completely, addressed it by computing a heuristic set of proteins or heuristic posterior probabilities, or by estimating the posterior probabilities with sampling methods. We present a probabilistic model for protein identification in tandem mass spectrometry that recognizes peptide degeneracy. We then introduce graph-transforming algorithms that facilitate efficient computation of protein probabilities, even for large data sets. We evaluate our identification procedure on five different well-characterized data sets and demonstrate our ability to efficiently compute high-quality protein posteriors. PMID:20712337
Efficient numerical method for computation of the thermohydrodynamics of laminar lubricating films
NASA Technical Reports Server (NTRS)
Elrod, H. G.
1991-01-01
The purpose of this paper is to describe an accurate, yet economical, method for computing temperature effects in laminar lubricating films in two dimensions. Because of the marked dependence of lubricant viscosity on temperature, the effect of viscosity variation both across and along a lubricating film can dwarf other deviations from ideal constant-property lubrication. In practice, a thermohydrodynamics program will involve simultaneous solution of the film lubrication problem, together with heat conduction in a solid, complex structure. In pursuit of computational economy, techniques similar to those for Gaussian quadrature are used; it is shown that, for many purposes, the use of just two properly positioned temperatures (Lobatto points) characterizes the transverse temperature distribution.
Efficient numerical method for computation of thermohydrodynamics of laminar lubricating films
NASA Technical Reports Server (NTRS)
Elrod, Harold G.
1989-01-01
The purpose of this paper is to describe an accurate, yet economical, method for computing temperature effects in laminar lubricating films in two dimensions. The procedure presented here is a sequel to one presented in Leeds in 1986 that was carried out for the one-dimensional case. Because of the marked dependence of lubricant viscosity on temperature, the effect of viscosity variation both across and along a lubricating film can dwarf other deviations from ideal constant-property lubrication. In practice, a thermohydrodynamics program will involve simultaneous solution of the film lubrication problem, together with heat conduction in a solid, complex structure. The extent of computation required makes economy in numerical processing of utmost importance. In pursuit of such economy, we here use techniques similar to those for Gaussian quadrature. We show that, for many purposes, the use of just two properly positioned temperatures (Lobatto points) characterizes well the transverse temperature distribution.
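The "two properly positioned temperatures" can be read as the interior abscissae of the four-point Gauss-Lobatto rule, which places nodes at the film surfaces and at ±1/√5 across the film; that reading is an assumption made here to make the idea concrete. The sketch below shows only the quadrature rule on an arbitrary test temperature profile, not the coupled thermohydrodynamic solution.

```python
import numpy as np

# Four-point Gauss-Lobatto rule on [-1, 1]: endpoints plus the two
# interior points +-1/sqrt(5); exact for polynomials up to degree 5.
nodes = np.array([-1.0, -1.0 / np.sqrt(5.0), 1.0 / np.sqrt(5.0), 1.0])
weights = np.array([1.0, 5.0, 5.0, 1.0]) / 6.0

def lobatto_integrate(f, a, b):
    """Integrate f over [a, b] (e.g. across the film thickness)."""
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    return half * np.sum(weights * f(mid + half * nodes))

# Cross-film average of a sample cubic temperature profile T(y).
T = lambda y: 40.0 + 25.0 * y - 10.0 * y ** 3
avg_T = lobatto_integrate(T, -1.0, 1.0) / 2.0   # exactly 40.0 here
```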
NASA Technical Reports Server (NTRS)
Kaiser, Mary K.; Proffitt, Dennis R.
1992-01-01
Recent developments in microelectronics have encouraged the use of 3D data bases to create compelling volumetric renderings of graphical objects. However, even with the computational capabilities of current-generation graphical systems, real-time displays of such objects are difficult, particularly when dynamic spatial transformations are involved. In this paper we discuss a type of visual stimulus (the stereokinetic effect display) that is computationally far less complex than a true three-dimensional transformation but yields an equally compelling depth impression, often perceptually indiscriminable from the true spatial transformation. Several possible applications for this technique are discussed (e.g., animating contour maps and air traffic control displays so as to evoke accurate depth percepts).
Efficient computation of PDF-based characteristics from diffusion MR signal.
Assemlal, Haz-Edine; Tschumperlé, David; Brun, Luc
2008-01-01
We present a general method for the computation of PDF-based characteristics of the tissue micro-architecture in MR imaging. The approach relies on the approximation of the MR signal by a series expansion based on Spherical Harmonics and Laguerre-Gaussian functions, followed by a simple projection step that is efficiently done in a finite dimensional space. The resulting algorithm is generic, flexible and is able to compute a large set of useful characteristics of the local tissues structure. We illustrate the effectiveness of this approach by showing results on synthetic and real MR datasets acquired in a clinical time-frame. PMID:18982591
NASA Astrophysics Data System (ADS)
Chung, Vera Y. Y.; Bergmann, Neil W.
1998-12-01
This paper presents how to implement the block-matching motion estimation algorithm efficiently on a Field Programmable Gate Array (FPGA)-based Custom Computing Machine (CCM) for video compression. The SPACE2 custom computing board consists of up to eight Xilinx XC6216 fine-grain, sea-of-gates FPGA chips. The results show that two Xilinx XC6216 FPGAs can perform at 960 MOPs; hence a real-time full-search motion estimation encoder can easily be implemented on our SPACE2 CCM system.
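The kernel being accelerated is simple to state in software: for each block of the current frame, find the displacement within a search window of the reference frame that minimizes the sum of absolute differences (SAD). A reference Python version follows; the 16x16 block and ±8 search range are conventional illustrative defaults, not parameters taken from the paper.

```python
import numpy as np

def full_search_sad(ref, cur, bx, by, bsize=16, srange=8):
    """Full-search block matching for one block; returns (mv, sad)."""
    h, w = ref.shape
    block = cur[by:by + bsize, bx:bx + bsize].astype(np.int32)
    best_mv, best_sad = (0, 0), np.inf
    for dy in range(-srange, srange + 1):
        for dx in range(-srange, srange + 1):
            y, x = by + dy, bx + dx
            if 0 <= y and y + bsize <= h and 0 <= x and x + bsize <= w:
                cand = ref[y:y + bsize, x:x + bsize].astype(np.int32)
                sad = int(np.abs(block - cand).sum())
                if sad < best_sad:
                    best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad
```

The exhaustive double loop is what makes full search regular and hardware-friendly: every candidate runs the identical SAD datapath.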
Efficient path-based computations on pedigree graphs with compact encodings
2012-01-01
A pedigree is a diagram of family relationships, and it is often used to determine the mode of inheritance (dominant, recessive, etc.) of genetic diseases. Along with rapidly growing knowledge of genetics and accumulation of genealogy information, pedigree data are becoming increasingly important. In large pedigree graphs, path-based methods for efficiently computing genealogical measurements, such as inbreeding and kinship coefficients of individuals, depend on efficient identification and processing of paths. In this paper, we propose a new compact path encoding scheme on large pedigrees, accompanied by an efficient algorithm for identifying paths. We demonstrate the utilization of our proposed method by applying it to the inbreeding coefficient computation. We present a time and space complexity analysis, and also demonstrate the efficiency of our method for evaluating inbreeding coefficients, as compared to previous methods, by experimental results using pedigree graphs with real and synthetic data. Both theoretical and experimental results demonstrate that our method is more scalable and efficient than previous methods in terms of time and space requirements. PMID:22536898
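For orientation, the target quantity has a compact recursive definition: the kinship coefficient of two individuals follows from their parents, and an individual's inbreeding coefficient is the kinship of its parents. The sketch below implements this classic recursive (tabular) method, not the paper's path-encoding scheme; the toy pedigree and the assumption that parents carry smaller ids than their children are illustrative.

```python
from functools import lru_cache

# pedigree: id -> (sire, dam); founders have (None, None).
# Assumes ids increase down the generations.
pedigree = {1: (None, None), 2: (None, None),
            3: (1, 2), 4: (1, 2), 5: (3, 4)}

@lru_cache(maxsize=None)
def kinship(a, b):
    """Kinship coefficient phi(a, b) by the recursive method."""
    if a is None or b is None:
        return 0.0
    if a == b:
        s, d = pedigree[a]
        return 0.5 * (1.0 + kinship(s, d))
    if a < b:                       # recurse through the younger one
        a, b = b, a
    s, d = pedigree[a]
    return 0.5 * (kinship(s, b) + kinship(d, b))

# Inbreeding coefficient of 5 = kinship of its parents (full sibs): 0.25.
F5 = kinship(*pedigree[5])
```

This recursion visits ancestors rather than enumerating paths, which is precisely the kind of repeated traversal that compact path encodings aim to avoid on very large pedigrees.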
Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.
2016-08-19
Inverse modeling seeks model parameters given a set of observations. However, for practical problems, because the number of measurements is often large and the model parameters are also numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace, such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2D and a random hydraulic conductivity field in 3D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multi-core computational environment. Furthermore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate- to large-scale problems.
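The expensive step being attacked is the damped normal-equation solve inside each Levenberg-Marquardt iteration. A minimal dense sketch using a conjugate-gradient (Krylov) solver in place of QR/SVD is given below; the subspace recycling across damping parameters and the Julia/MADS implementation described above are not reproduced.

```python
import numpy as np
from scipy.sparse.linalg import cg

def lm_step(J, r, damping):
    """One Levenberg-Marquardt step: solve (J^T J + damping I) dx = -J^T r
    with conjugate gradients instead of a dense QR/SVD factorization."""
    n = J.shape[1]
    A = J.T @ J + damping * np.eye(n)   # SPD for damping > 0
    dx, info = cg(A, -J.T @ r)
    if info != 0:
        raise RuntimeError("CG did not converge")
    return dx
```

For large n one would supply A as a LinearOperator so that J^T (J v) is applied matrix-free; reusing the Krylov subspace across damping values, as the abstract describes, then avoids rebuilding it from scratch.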
An efficient method for computing the QTAIM topology of a scalar field: the electron density case.
Rodríguez, Juan I
2013-03-30
An efficient method for computing the quantum theory of atoms in molecules (QTAIM) topology of the electron density (or another scalar field) is presented. A modified Newton-Raphson algorithm was implemented for finding the critical points (CP) of the electron density. Bond paths were constructed with the second-order Runge-Kutta method. Vectorization of the present algorithm makes it scale linearly with the system size. The parallel efficiency decreases with the number of processors (from 70% to 50%), with an average of 54%. The accuracy and performance of the method are demonstrated by computing the QTAIM topology of the electron density of a series of representative molecules. Our results show that our algorithm might allow QTAIM analysis to be applied to large systems (carbon nanotubes, polymers, fullerenes) considered unreachable until now. PMID:23175458
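The CP search is a root-finding problem on the gradient field: iterate x <- x - H(x)^-1 grad(x) until the gradient vanishes, then classify the CP by the signs of the Hessian eigenvalues. Below is a plain (unmodified) Newton-Raphson sketch on an analytic toy field; the paper's modified iteration and the actual electron-density evaluation are not included.

```python
import numpy as np

def find_critical_point(grad, hess, x0, tol=1e-10, max_iter=50):
    """Newton-Raphson search for a point where grad = 0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            return x                          # converged to a CP
        x = x - np.linalg.solve(hess(x), g)
    raise RuntimeError("no convergence from this starting point")

# Toy field rho(x, y) = (x^2 - 1)^2 + y^2: minima at (+-1, 0), saddle at 0.
grad = lambda p: np.array([4.0 * p[0] * (p[0] ** 2 - 1.0), 2.0 * p[1]])
hess = lambda p: np.array([[12.0 * p[0] ** 2 - 4.0, 0.0], [0.0, 2.0]])

cp = find_critical_point(grad, hess, x0=[0.8, 0.3])   # -> close to (1, 0)
eigs = np.linalg.eigvalsh(hess(cp))                   # all > 0: a minimum
```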
A Power Efficient Exaflop Computer Design for Global Cloud System Resolving Climate Models.
NASA Astrophysics Data System (ADS)
Wehner, M. F.; Oliker, L.; Shalf, J.
2008-12-01
Exascale computers would allow routine ensemble modeling of the global climate system at the cloud system resolving scale. Power and cost requirements of traditional architecture systems are likely to delay such capability for many years. We present an alternative route to the exascale using embedded processor technology to design a system optimized for ultra high resolution climate modeling. These power efficient processors, used in consumer electronic devices such as mobile phones, portable music players, cameras, etc., can be tailored to the specific needs of scientific computing. We project that a system capable of integrating a kilometer scale climate model a thousand times faster than real time could be designed and built in a five year time scale for US$75M with a power consumption of 3MW. This is cheaper, more power efficient and sooner than any other existing technology.
NASA Astrophysics Data System (ADS)
Kramer, Alex; Thumm, Uwe
2016-05-01
We discuss a class of window-transform-based ``virtual detector'' methods for computing momentum-resolved dissociation and ionization spectra by numerically analyzing the motion of nuclear or electronic quantum-mechanical wavepackets at the periphery of their numerical grids. While prior applications of such surface-flux methods considered semi-classical limits to derive ionization and dissociation spectra, we systematically include quantum-mechanical corrections and extensions to higher dimensions, discussing numerical convergence properties and the computational efficiency of our method in comparison with alternative schemes for obtaining momentum distributions. Using the example of atomic ionization by co- and counter-rotating circularly polarized laser pulses, we scrutinize the efficiency of common finite-difference schemes for solving the time-dependent Schrödinger equation in virtual detection and standard Fourier-transformation methods for extracting momentum spectra. Supported by the DoE, NSF, and Alexander von Humboldt foundation.
Computational efficient segmentation of cell nuclei in 2D and 3D fluorescent micrographs
NASA Astrophysics Data System (ADS)
De Vylder, Jonas; Philips, Wilfried
2011-02-01
This paper proposes a new segmentation technique developed for the segmentation of cell nuclei in both 2D and 3D fluorescent micrographs. The proposed method can deal with both blurred edges and touching nuclei. Using a dual scan-line algorithm, it is both memory- and computationally efficient, making it interesting for the analysis of images coming from high-throughput systems or for the analysis of 3D microscopic images. Experiments show good results, i.e. a recall of over 0.98.
Mitchell, Scott A.; Ebeida, Mohamed Salah; Romero, Vicente J.; Swiler, Laura Painton; Rushdi, Ahmad A.; Abdelkader, Ahmad
2015-09-01
This SAND report summarizes our work on the Sandia National Laboratory LDRD project titled "Efficient Probability of Failure Calculations for QMU using Computational Geometry" which was project #165617 and proposal #13-0144. This report merely summarizes our work. Those interested in the technical details are encouraged to read the full published results, and contact the report authors for the status of the software and follow-on projects.
Step-by-step magic state encoding for efficient fault-tolerant quantum computation
Goto, Hayato
2014-01-01
Quantum error correction allows one to make quantum computers fault-tolerant against unavoidable errors due to decoherence and imperfect physical gate operations. However, the fault-tolerant quantum computation requires impractically large computational resources for useful applications. This is a current major obstacle to the realization of a quantum computer. In particular, magic state distillation, which is a standard approach to universality, consumes the most resources in fault-tolerant quantum computation. For the resource problem, here we propose step-by-step magic state encoding for concatenated quantum codes, where magic states are encoded step by step from the physical level to the logical one. To manage errors during the encoding, we carefully use error detection. Since the sizes of intermediate codes are small, it is expected that the resource overheads will become lower than previous approaches based on the distillation at the logical level. Our simulation results suggest that the resource requirements for a logical magic state will become comparable to those for a single logical controlled-NOT gate. Thus, the present method opens a new possibility for efficient fault-tolerant quantum computation. PMID:25511387
NASA Astrophysics Data System (ADS)
Kamalzare, Mahmoud; Johnson, Erik A.; Wojtkiewicz, Steven F.
2014-05-01
Designing control strategies for smart structures, such as those with semiactive devices, is complicated by the nonlinear nature of the feedback control, secondary clipping control and other additional requirements such as device saturation. The usual design approach resorts to large-scale simulation parameter studies that are computationally expensive. The authors have previously developed an approach for state-feedback semiactive clipped-optimal control design, based on a nonlinear Volterra integral equation that provides for the computationally efficient simulation of such systems. This paper expands the applicability of the approach by demonstrating that it can also be adapted to accommodate more realistic cases when, instead of full state feedback, only a limited set of noisy response measurements is available to the controller. This extension requires incorporating a Kalman filter (KF) estimator, which is linear, into the nominal model of the uncontrolled system. The efficacy of the approach is demonstrated by a numerical study of a 100-degree-of-freedom frame model, excited by a filtered Gaussian random excitation, with noisy acceleration sensor measurements to determine the semiactive control commands. The results show that the proposed method can improve computational efficiency by more than two orders of magnitude relative to a conventional solver, while retaining a comparable level of accuracy. Further, the proposed approach is shown to be similarly efficient for an extensive Monte Carlo simulation to evaluate the effects of sensor noise levels and KF tuning on the accuracy of the response.
An efficient surrogate-based method for computing rare failure probability
NASA Astrophysics Data System (ADS)
Li, Jing; Li, Jinglai; Xiu, Dongbin
2011-10-01
In this paper, we present an efficient numerical method for evaluating rare failure probabilities. The method is based on a recently developed surrogate-based method from Li and Xiu [J. Li, D. Xiu, Evaluation of failure probability via surrogate models, J. Comput. Phys. 229 (2010) 8966-8980] for failure probability computation. The method by Li and Xiu is of hybrid nature, in the sense that samples of both the surrogate model and the true physical model are used, and its efficiency gain relies on using only very few samples of the true model. Here we extend the capability of the method to rare-probability computation by using the idea of importance sampling (IS). In particular, we employ the cross-entropy (CE) method, which is an effective method to determine the biasing distribution in IS. We demonstrate that, by combining with the CE method, a surrogate-based IS algorithm can be constructed that is highly efficient for rare failure probability computation: it incurs much reduced simulation effort compared to the traditional CE-IS method. In many cases, the new method is capable of capturing failure probabilities as small as 10^-12 to 10^-6 with only several hundred samples.
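The CE ingredient can be illustrated on a scalar Gaussian toy problem: the mean of the biasing density is pulled toward the failure region over a few rounds, and the final estimate reweights samples by the likelihood ratio. The sketch below is the plain CE-IS baseline, without the surrogate-model acceleration that is the paper's contribution; the 10% elite fraction is a conventional choice.

```python
import numpy as np
from scipy import stats

def ce_rare_prob(g, gamma, n=2000, rho=0.1, iters=20, seed=2):
    """Cross-entropy importance sampling for p = P[g(X) > gamma], X ~ N(0,1)."""
    rng = np.random.default_rng(seed)
    mu = 0.0
    for _ in range(iters):
        x = rng.normal(mu, 1.0, n)
        elite = x[np.argsort(g(x))[-int(rho * n):]]   # best rho fraction
        if g(elite).min() >= gamma:
            break                                     # biasing density found
        mu = elite.mean()                             # CE mean update
    x = rng.normal(mu, 1.0, n)
    w = stats.norm.pdf(x, 0.0, 1.0) / stats.norm.pdf(x, mu, 1.0)
    return np.mean((g(x) > gamma) * w)

p = ce_rare_prob(lambda x: x, gamma=5.0)   # exact: 1 - Phi(5) ~ 2.9e-7
```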
Computing the energy of a water molecule using multideterminants: A simple, efficient algorithm
NASA Astrophysics Data System (ADS)
Clark, Bryan K.; Morales, Miguel A.; McMinis, Jeremy; Kim, Jeongnim; Scuseria, Gustavo E.
2011-12-01
Quantum Monte Carlo (QMC) methods such as variational Monte Carlo and fixed node diffusion Monte Carlo depend heavily on the quality of the trial wave function. Although Slater-Jastrow wave functions are the most commonly used variational ansatz in electronic structure, more sophisticated wave functions are critical to ascertaining new physics. One such wave function is the multi-Slater-Jastrow wave function which consists of a Jastrow function multiplied by the sum of Slater determinants. In this paper we describe a method for working with these wave functions in QMC codes that is easy to implement, efficient both in computational speed as well as memory, and easily parallelized. The computational cost scales quadratically with particle number making this scaling no worse than the single determinant case and linear with the total number of excitations. Additionally, we implement this method and use it to compute the ground state energy of a water molecule.
An efficient FPGA architecture for integer ƞth root computation
NASA Astrophysics Data System (ADS)
Rangel-Valdez, Nelson; Barron-Zambrano, Jose Hugo; Torres-Huitzil, Cesar; Torres-Jimenez, Jose
2015-10-01
In embedded computing, it is common to find applications such as signal processing, image processing, computer graphics or data compression that might benefit from hardware implementation for the computation of integer roots of order N. However, the scientific literature lacks architectural designs that implement such operations for different values of N using a low amount of resources. This article presents a parameterisable field programmable gate array (FPGA) architecture for an efficient Nth root calculator that uses only adders/subtractors and N location memory elements. The architecture was tested for different values of N, using 64-bit number representation. The results show a consumption up to 10% of the logical resources of a Xilinx XC6SLX45-CSG324C device, depending on the value of N. The hardware implementation improved the performance of its corresponding software implementations by one order of magnitude. The architecture performance varies from several thousand to seven million root operations per second.
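A software reference for the operation is short: the largest integer r with r**N <= x can be bracketed and bisected in pure integer arithmetic. Note that this sketch uses multiplication for the power test, whereas the article's datapath is built from adders/subtractors only, so it serves as a functional reference rather than a model of the hardware.

```python
def inth_root(x, n):
    """Largest integer r with r**n <= x (requires x >= 0, n >= 1)."""
    lo, hi = 0, 1
    while hi ** n <= x:            # grow an upper bracket
        hi <<= 1
    while lo < hi - 1:             # invariant: lo**n <= x < hi**n
        mid = (lo + hi) >> 1
        if mid ** n <= x:
            lo = mid
        else:
            hi = mid
    return lo

assert inth_root(2 ** 64 - 1, 5) == 7131   # 64-bit input, N = 5
```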
Redundancy management for efficient fault recovery in NASA's distributed computing system
NASA Technical Reports Server (NTRS)
Malek, Miroslaw; Pandya, Mihir; Yau, Kitty
1991-01-01
The management of redundancy in computer systems was studied and guidelines were provided for the development of NASA's fault-tolerant distributed systems. Fault recovery and reconfiguration mechanisms were examined. A theoretical foundation was laid for redundancy management by efficient reconfiguration methods and algorithmic diversity. Algorithms were developed to optimize the resources for embedding of computational graphs of tasks in the system architecture and reconfiguration of these tasks after a failure has occurred. The computational structure represented by a path and the complete binary tree was considered and the mesh and hypercube architectures were targeted for their embeddings. The innovative concept of Hybrid Algorithm Technique was introduced. This new technique provides a mechanism for obtaining fault tolerance while exhibiting improved performance.
Efficient computation of the stability of three-dimensional compressible boundary layers
NASA Technical Reports Server (NTRS)
Malik, M. R.; Orszag, S. A.
1981-01-01
Methods for the computer analysis of the stability of three-dimensional compressible boundary layers are discussed and the user-oriented Compressible Stability Analysis (COSAL) computer code is described. The COSAL code uses a matrix finite-difference method for local eigenvalue solution when a good guess for the eigenvalue is available and is significantly more computationally efficient than the commonly used initial-value approach. The local eigenvalue search procedure also results in eigenfunctions and, at little extra work, group velocities. A globally convergent eigenvalue procedure is also developed which may be used when no guess for the eigenvalue is available. The global problem is formulated in such a way that no unstable spurious modes appear so that the method is suitable for use in a black-box stability code. Sample stability calculations are presented for the boundary layer profiles of an LFC swept wing.
NASA Astrophysics Data System (ADS)
Louboutin, Stephane R.
2007-03-01
Let {K_m} be a parametrized family of simplest real cyclic cubic, quartic, quintic or sextic number fields of known regulators, e.g., the so-called simplest cubic and quartic fields associated with the polynomials P_m(x) = x^3 - mx^2 - (m+3)x + 1 and P_m(x) = x^4 - mx^3 - 6x^2 + mx + 1. We give explicit formulas for powers of the Gaussian sums attached to the characters associated with these simplest number fields. We deduce a method for computing the exact values of these Gaussian sums. These values are then used to efficiently compute class numbers of simplest fields. Finally, such class number computations yield many examples of real cyclotomic fields Q(zeta_p)^+ of prime conductors p >= 3 and class numbers h_p^+ greater than or equal to p. However, in accordance with Vandiver's conjecture, we found no example of p for which p divides h_p^+.
Fukuda, Ikuo; Kamiya, Narutoshi; Yonezawa, Yasushige; Nakamura, Haruki
2012-08-01
treated. We discussed the origin of this difference between the two schemes and the dependence of this fact on the physical system. The use of the zero damping-factor will enhance the efficiency of practical computations, since the complementary error function is not employed. In addition, utilizing the zero damping-factor provides freedom from the parameter choice, which is not trivial in the zero-charge scheme, and eliminates the error function term, which corresponds to the time-consuming Fourier part under the periodic boundary conditions. PMID:22894355
NASA Technical Reports Server (NTRS)
Seltzer, S. M.
1974-01-01
Some means of combining both computer simulation and analytical techniques are indicated in order to mutually enhance their efficiency as design tools and to motivate those involved in engineering design to consider using such combinations. While the idea is not new, heavy reliance on computers often seems to overshadow the potential utility of analytical tools. Although the example used is drawn from the area of dynamics and control, the principles espoused are applicable to other fields. In the example, the parameter plane stability analysis technique is described briefly and extended beyond that reported in the literature to increase its utility (through a simple set of recursive formulas) and its applicability (through the portrayal of the effect of varying the sampling period of the computer). The numerical values that were rapidly selected by analysis were found to be correct for the hybrid computer simulation for which they were needed. This obviated the need for cut-and-try methods to choose the numerical values, thereby saving both time and computer utilization.
Efficient curve-skeleton computation for the analysis of biomedical 3d images - biomed 2010.
Brun, Francesco; Dreossi, Diego
2010-01-01
Advances in three-dimensional (3D) biomedical imaging techniques, such as magnetic resonance (MR) and computed tomography (CT), make it easy to reconstruct high-quality 3D models of portions of the human body and other biological specimens. A major challenge lies in the quantitative analysis of the resulting models, thus allowing a more comprehensive characterization of the object under investigation. An interesting approach is based on curve-skeleton (or medial axis) extraction, which gives basic information concerning the topology and the geometry. Curve-skeletons have been applied in the analysis of vascular networks and the diagnosis of tracheal stenoses, as well as to 3D flight-path planning in virtual endoscopy. However, curve-skeleton computation is a crucial task. An effective skeletonization algorithm was introduced by N. Cornea in [1], but it lacks computational performance. Thanks to advances in imaging techniques, the resolution of 3D images is increasing more and more; therefore there is a need for efficient algorithms in order to analyze significant Volumes of Interest (VOIs). In the present paper an improved skeletonization algorithm based on the idea proposed in [1] is presented. A computational comparison between the original and the proposed method is also reported. The obtained results show that the proposed method allows a significant computational improvement, making the adoption of the skeleton representation in biomedical image analysis applications more appealing. PMID:20467122
Toward Efficient Computation of the Dempster-Shafer Belief Theoretic Conditionals.
Wickramarathne, Thanuka L; Premaratne, Kamal; Murthi, Manohar N
2013-04-01
Dempster-Shafer (DS) belief theory provides a convenient framework for the development of powerful data fusion engines by allowing for a convenient representation of a wide variety of data imperfections. The recent work on the DS theoretic (DST) conditional approach, which is based on the Fagin-Halpern (FH) DST conditionals, appears to demonstrate the suitability of DS theory for incorporating both soft (generated by human-based sensors) and hard (generated by physics-based sources) evidence into the fusion process. However, the computation of the FH conditionals imposes a significant computational burden. One reason for this is the difficulty in identifying the FH conditional core, i.e., the set of propositions receiving nonzero support after conditioning. The conditional core theorem (CCT) in this paper redresses this shortcoming by explicitly identifying the conditional focal elements with no recourse to numerical computations, thereby providing a complete characterization of the conditional core. In addition, we derive explicit results to identify those conditioning propositions that may have generated a given conditional core. This "converse" to the CCT is of significant practical value for studying the sensitivity of the updated knowledge base with respect to the evidence received. Based on the CCT, we also develop an algorithm to efficiently compute the conditional masses (generated by FH conditionals), provide bounds on its computational complexity, and employ extensive simulations to analyze its behavior. PMID:23033433
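The Fagin-Halpern conditionals referenced here have a closed form in terms of belief and plausibility, Bel(A|B) = Bel(A∩B) / [Bel(A∩B) + Pl(B∖A)]. As a point of reference for what the CCT speeds up, the following minimal Python sketch evaluates this quantity by brute-force enumeration over the focal elements of a toy mass function; the frame, masses, and sets are made up for illustration, and the paper's contribution is precisely avoiding this kind of blind numerical search for the conditional core.

```python
def belief(mass, A):
    # Bel(A): total mass of focal elements wholly contained in A
    return sum(m for F, m in mass.items() if F <= A)

def plausibility(mass, A):
    # Pl(A): total mass of focal elements that intersect A
    return sum(m for F, m in mass.items() if F & A)

def fh_conditional_belief(mass, A, B):
    # Fagin-Halpern conditional: Bel(A|B) = Bel(A&B) / (Bel(A&B) + Pl(B-A))
    num = belief(mass, A & B)
    den = num + plausibility(mass, B - A)
    return num / den if den else 0.0

# Toy body of evidence over the frame {x, y, z} (values illustrative only).
mass = {frozenset('x'): 0.5, frozenset('xy'): 0.3, frozenset('xyz'): 0.2}
print(fh_conditional_belief(mass, frozenset('x'), frozenset('xy')))  # -> 0.5
```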
NASA Astrophysics Data System (ADS)
Schaefer, Bastian; Goedecker, Stefan
2016-07-01
An analysis of the network defined by the potential energy minima of multi-atomic systems and their connectivity via reaction pathways that go through transition states allows us to understand important characteristics like thermodynamic, dynamic, and structural properties. Unfortunately computing the transition states and reaction pathways in addition to the significant energetically low-lying local minima is a computationally demanding task. We here introduce a computationally efficient method that is based on a combination of the minima hopping global optimization method and the insight that uphill barriers tend to increase with increasing structural distances of the educt and product states. This method allows us to replace the exact connectivity information and transition state energies with alternative and approximate concepts. Without adding any significant additional cost to the minima hopping global optimization approach, this method allows us to generate an approximate network of the minima, their connectivity, and a rough measure for the energy needed for their interconversion. This can be used to obtain a first qualitative idea on important physical and chemical properties by means of a disconnectivity graph analysis. Besides the physical insight obtained by such an analysis, the gained knowledge can be used to make a decision if it is worthwhile or not to invest computational resources for an exact computation of the transition states and the reaction pathways. Furthermore it is demonstrated that the here presented method can be used for finding physically reasonable interconversion pathways that are promising input pathways for methods like transition path sampling or discrete path sampling.
NASA Astrophysics Data System (ADS)
Shoemaker, Christine; Espinet, Antoine; Pang, Min
2015-04-01
Models of complex environmental systems can be computationally expensive in order to describe the dynamic interactions of the many components over a sizeable time period. Diagnostics of these systems can include forward simulations of calibrated models under uncertainty and analysis of alternatives for systems management. This discussion will focus on applications of new surrogate optimization and uncertainty analysis methods to environmental models that can enhance our ability to extract information and understanding. For complex models, optimization and especially uncertainty analysis can require a large number of model simulations, which is not feasible for computationally expensive models. Surrogate response surfaces can be used in global optimization and uncertainty methods to obtain accurate answers with far fewer model evaluations, which makes the methods practical for computationally expensive models for which conventional methods are not feasible. In this paper we will discuss the application of the SOARS surrogate method for estimating Bayesian posterior density functions for model parameters for a TOUGH2 model of geologic carbon sequestration. We will also briefly discuss a new parallel surrogate global optimization algorithm applied to two groundwater remediation sites, implemented on a supercomputer with up to 64 processors. The applications will illustrate the use of these methods to predict the impact of monitoring and management on subsurface contaminants.
NASA Technical Reports Server (NTRS)
Lee, C. S. G.; Chen, C. L.
1989-01-01
Two efficient mapping algorithms are presented for scheduling the robot inverse dynamics computation, consisting of m computational modules with precedence relationships, on a multiprocessor system of p identical homogeneous processors with processor and communication costs, so as to achieve minimum computation time. An objective function is defined in terms of the sum of the processor finishing time and the interprocessor communication time. The minimax optimization is performed on the objective function to obtain the best mapping. This mapping problem can be formulated as a combination of the graph partitioning and scheduling problems, both of which are known to be NP-complete. Thus, to speed up the search for a solution, two heuristic algorithms were proposed to obtain fast but suboptimal mapping solutions. The first algorithm utilizes the level and the communication intensity of the task modules to construct an ordered priority list of ready modules, and module assignment is performed by a weighted bipartite matching algorithm. For a near-optimal mapping solution, the problem can be solved by the heuristic algorithm with simulated annealing. These proposed optimization algorithms can solve various large-scale problems within a reasonable time. Computer simulations were performed to evaluate and verify the performance and the validity of the proposed mapping algorithms. Finally, experiments for computing the inverse dynamics of a six-jointed PUMA-like manipulator based on the Newton-Euler dynamic equations were implemented on an NCUBE/ten hypercube computer to verify the proposed mapping algorithms. Computer simulation and experimental results are compared and discussed.
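For intuition, here is a toy, level-based list scheduler in the spirit of the first heuristic: module levels are computed from the precedence DAG, ready modules are dispatched highest-level-first, and a uniform communication cost is charged whenever a predecessor was mapped to a different processor. All names and the uniform-cost model are illustrative assumptions; the paper's algorithm additionally uses communication intensity and a weighted bipartite matching step, which this sketch omits.

```python
def list_schedule(cost, succ, comm, p):
    """Toy level-based list scheduling of a task DAG onto p identical processors."""
    pred = {v: set() for v in cost}
    for u, vs in succ.items():
        for v in vs:
            pred[v].add(u)
    level = {}                      # level = longest cost-weighted path to an exit module
    def lv(v):
        if v not in level:
            level[v] = cost[v] + max((lv(w) for w in succ.get(v, ())), default=0)
        return level[v]
    for v in cost:
        lv(v)
    free = [0.0] * p                # time at which each processor becomes idle
    place, finish = {}, {}
    ready = {v for v in cost if not pred[v]}
    while ready:
        v = max(ready, key=level.get)            # dispatch highest level first
        def start_on(q):                         # earliest start of v on processor q
            return max([free[q]] + [finish[u] + (comm if place[u] != q else 0)
                                    for u in pred[v]])
        q = min(range(p), key=start_on)
        place[v], finish[v] = q, start_on(q) + cost[v]
        free[q] = finish[v]
        ready.discard(v)
        ready |= {w for w in succ.get(v, ()) if all(u in finish for u in pred[w])}
    return place, max(finish.values())

cost = {'a': 2, 'b': 3, 'c': 2, 'd': 1}          # module execution times (made up)
succ = {'a': ['c', 'd'], 'b': ['d']}             # precedence edges
print(list_schedule(cost, succ, comm=1, p=2))    # mapping and makespan
```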
NASA Astrophysics Data System (ADS)
Lunnoo, Thodsaphon; Puangmali, Theerapong
2015-10-01
The primary limitation of magnetic drug targeting (MDT) relates to the strength of an external magnetic field, which decreases with increasing distance. Small nanoparticles (NPs) displaying superparamagnetic behaviour are also required in order to reduce embolization in the blood vessel. The small NPs, however, make it difficult to vector NPs and keep them in the desired location. The aims of this work were to investigate parameters influencing the capture efficiency of the drug carriers in mimicked arterial flow. In this work, we computationally modelled and evaluated capture efficiency in MDT with COMSOL Multiphysics 4.4. The studied parameters were (i) magnetic nanoparticle size, (ii) three classes of magnetic cores (Fe3O4, Fe2O3, and Fe), and (iii) the thickness of biocompatible coating materials (Au, SiO2, and PEG). It was found that the capture efficiency of small particles decreased with decreasing size and was less than 5% for magnetic particles in the superparamagnetic regime. The thickness of non-magnetic coating materials did not significantly influence the capture efficiency of MDT. It was difficult to capture small drug carriers (D < 200 nm) in the arterial flow. We suggest that high capture efficiency in MDT can be obtained in small vessels with low blood velocities, such as micro-capillary vessels.
Park, Won Young; Phadke, Amol; Shah, Nihar
2012-06-29
Displays account for a significant portion of electricity consumed in personal computer (PC) use, and global PC monitor shipments are expected to continue to increase. We assess the market trends in the energy efficiency of PC monitors that are likely to occur without any additional policy intervention and estimate that display efficiency will likely improve by over 40% by 2015 compared to today’s technology. We evaluate the cost effectiveness of a key technology which further improves efficiency beyond this level by at least 20% and find that its adoption is cost effective. We assess the potential for further improving efficiency taking into account the recent development of universal serial bus (USB) powered liquid crystal display (LCD) monitors and find that the current technology available and deployed in USB powered monitors has the potential to deeply reduce energy consumption by as much as 50%. We provide insights for policies and programs that can be used to accelerate the adoption of efficient technologies to capture global energy saving potential from PC monitors which we estimate to be 9.2 terawatt-hours (TWh) per year in 2015.
Anthony, T. Renée
2013-01-01
Computational fluid dynamics (CFD) has been used to estimate particle inhalability in low velocity freestreams, in studies where realistic faces but simplified, truncated, cylindrical human torsos were used. When compared to wind tunnel velocity studies, the truncated models were found to underestimate the air’s upward velocity near the humans, raising questions about aspiration estimation. This work compares aspiration efficiencies for particles ranging from 7 to 116 µm using three torso geometries: (i) a simplified truncated cylinder, (ii) a non-truncated cylinder, and (iii) an anthropometrically realistic humanoid body. The primary aim of this work is to (i) quantify the errors introduced by using a simplified geometry and (ii) determine the required level of detail to adequately represent a human form in CFD studies of aspiration efficiency. Fluid simulations used the standard k-epsilon turbulence model, with freestream velocities of 0.1, 0.2, and 0.4 m s−1 and breathing velocities of 1.81 and 12.11 m s−1 to represent at-rest and heavy breathing rates, respectively. Laminar particle trajectory simulations were used to determine the upstream area, also known as the critical area, where particles would be inhaled. These areas were used to compute aspiration efficiencies for facing the wind. Significant differences were found in both vertical velocity estimates and the location of the critical area between the three models. However, differences in aspiration efficiencies between the three forms were <8.8% over all particle sizes, indicating that there is little difference in aspiration efficiency between torso models. PMID:23006817
An efficient and general numerical method to compute steady uniform vortices
NASA Astrophysics Data System (ADS)
Luzzatto-Fegiz, Paolo; Williamson, Charles H. K.
2011-07-01
Steady uniform vortices are widely used to represent high Reynolds number flows, yet their efficient computation still presents some challenges. Existing Newton iteration methods become inefficient as the vortices develop fine-scale features; in addition, these methods cannot, in general, find solutions with specified Casimir invariants. On the other hand, available relaxation approaches are computationally inexpensive, but can fail to converge to a solution. In this paper, we overcome these limitations by introducing a new discretization, based on an inverse-velocity map, which radically increases the efficiency of Newton iteration methods. In addition, we introduce a procedure to prescribe Casimirs and remove the degeneracies in the steady vorticity equation, thus ensuring convergence for general vortex configurations. We illustrate our methodology by considering several unbounded flows involving one or two vortices. Our method enables the computation, for the first time, of steady vortices that do not exhibit any geometric symmetry. In addition, we discover that, as the limiting vortex state for each flow is approached, each family of solutions traces a clockwise spiral in a bifurcation plot consisting of a velocity-impulse diagram. By the recently introduced "IVI diagram" stability approach [Phys. Rev. Lett. 104 (2010) 044504], each turn of this spiral is associated with a loss of stability for the steady flows. Such spiral structure is suggested to be a universal feature of steady, uniform-vorticity flows.
NASA Astrophysics Data System (ADS)
Razavi, S.; Anderson, D.; Martin, P.; MacMillan, G.; Tolson, B.; Gabriel, C.; Zhang, B.
2012-12-01
Many sophisticated groundwater models tend to be computationally intensive as they rigorously represent detailed scientific knowledge about the groundwater systems. Calibration (model inversion), which is a vital step of groundwater model development, can require hundreds or thousands of model evaluations (runs) for different sets of parameters and as such demand prohibitively large computational time and resources. One common strategy to circumvent this computational burden is surrogate modelling which is concerned with developing and utilizing fast-to-run surrogates of the original computationally intensive models (also called fine models). Surrogates can be either based on statistical and data-driven models such as kriging and neural networks or simplified physically-based models with lower fidelity to the original system (also called coarse models). Fidelity in this context refers to the degree of the realism of a simulation model. This research initially investigates different strategies for developing lower-fidelity surrogates of a fine groundwater model and their combinations. These strategies include coarsening the fine model, relaxing the numerical convergence criteria, and simplifying the model geological conceptualisation. Trade-offs between model efficiency and fidelity (accuracy) are of special interest. A methodological framework is developed for coordinating the original fine model with its lower-fidelity surrogates with the objective of efficiently calibrating the parameters of the original model. This framework is capable of mapping the original model parameters to the corresponding surrogate model parameters and also mapping the surrogate model response for the given parameters to the original model response. This framework is general in that it can be used with different optimization and/or uncertainty analysis techniques available for groundwater model calibration and parameter/predictive uncertainty assessment. A real-world computationally
Sampling efficiency of modified 37-mm sampling cassettes using computational fluid dynamics.
Anthony, T Renée; Sleeth, Darrah; Volckens, John
2016-01-01
In the U.S., most industrial hygiene practitioners continue to rely on the closed-face cassette (CFC) to assess worker exposures to hazardous dusts, primarily because of ease of use, cost, and familiarity. However, mass concentrations measured with this classic sampler underestimate exposures to larger particles throughout the inhalable particulate mass (IPM) size range (up to aerodynamic diameters of 100 μm). To investigate whether the current 37-mm inlet cap can be redesigned to better meet the IPM sampling criterion, computational fluid dynamics (CFD) models were developed, and particle sampling efficiencies associated with various modifications to the CFC inlet cap were determined. Simulations of fluid flow (standard k-epsilon turbulence model) and particle transport (laminar trajectories, 1-116 μm) were conducted using sampling flow rates of 10 L min−1 in slow moving air (0.2 m s−1) in the facing-the-wind orientation. Combinations of seven inlet shapes and three inlet diameters were evaluated as candidates to replace the current 37-mm inlet cap. For a given inlet geometry, differences in sampler efficiency between inlet diameters averaged less than 1% for particles through 100 μm, but the largest opening was found to increase the efficiency for the 116 μm particles by 14% for the flat inlet cap. A substantial reduction in sampler efficiency was identified for sampler inlets with side walls extending beyond the dimension of the external lip of the current 37-mm CFC. The inlet cap based on the 37-mm CFC dimensions with an expanded 15-mm entry provided the best agreement with facing-the-wind human aspiration efficiency. The sampler efficiency was increased with a flat entry or with a thin central lip adjacent to the new enlarged entry. This work provides a substantial body of sampling efficiency estimates as a function of particle size and inlet geometry for personal aerosol samplers. PMID:26513395
Zaunders, John; Jing, Junmei; Leipold, Michael; Maecker, Holden; Kelleher, Anthony D; Koch, Inge
2016-01-01
Many methods have been described for automated clustering analysis of complex flow cytometry data, but so far the goal to efficiently estimate multivariate densities and their modes for a moderate number of dimensions and potentially millions of data points has not been attained. We have devised a novel approach to describing modes using second order polynomial histogram estimators (SOPHE). The method divides the data into multivariate bins and determines the shape of the data in each bin based on second order polynomials, which is an efficient computation. These calculations yield local maxima and allow joining of adjacent bins to identify clusters. The use of second order polynomials also optimally uses wide bins, such that in most cases each parameter (dimension) need only be divided into 4-8 bins, again reducing computational load. We have validated this method using defined mixtures of up to 17 fluorescent beads in 16 dimensions, correctly identifying all populations in data files of 100,000 beads in <10 s, on a standard laptop. The method also correctly clustered granulocytes, lymphocytes, including standard T, B, and NK cell subsets, and monocytes in 9-color stained peripheral blood, within seconds. SOPHE successfully clustered up to 36 subsets of memory CD4 T cells using differentiation and trafficking markers, in 14-color flow analysis, and up to 65 subpopulations of PBMC in 33-dimensional CyTOF data, showing its usefulness in discovery research. SOPHE has the potential to greatly increase efficiency of analysing complex mixtures of cells in higher dimensions. PMID:26097104
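As a one-dimensional illustration of the idea (wide bins plus a local second-order fit, with the vertex of the fitted parabola refining the mode location), consider the sketch below. The real SOPHE operates on multivariate bins and joins adjacent bins into clusters; this toy, with made-up data and bin counts, omits both.

```python
import numpy as np

def sophe_1d_modes(x, nbins=8):
    """Toy 1-D analogue of SOPHE: wide bins + local quadratic fits to find modes."""
    counts, edges = np.histogram(x, bins=nbins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    modes = []
    for i in range(1, nbins - 1):
        if counts[i] >= counts[i - 1] and counts[i] >= counts[i + 1]:
            # quadratic through the three neighbouring bin counts
            a, b, c = np.polyfit(centers[i - 1:i + 2], counts[i - 1:i + 2], 2)
            if a < 0:                       # concave fit: vertex is a local maximum
                modes.append(-b / (2 * a))  # vertex of the fitted parabola
    return modes

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3, 0.5, 5000), rng.normal(2, 0.7, 5000)])
print(sophe_1d_modes(x))   # two mode estimates, near -3 and 2
```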
Li, Dongsheng; Sun, Xin; Khaleel, Mohammad A.
2011-09-28
This study evaluated different upscaling methods for predicting the thermal conductivity of loaded nuclear waste form, a heterogeneous material system, and compared their efficiency and accuracy. Thermal conductivity of loaded nuclear waste form is an important property for researchers working on the waste form Integrated Performance and Safety Code (IPSC). The effective thermal conductivity, obtained from microstructure information and the local thermal conductivity of the different components, is critical in predicting the life and performance of waste form during storage: the heat generated during storage is directly related to thermal conductivity, which in turn determines the mechanical deformation behavior, corrosion resistance, and aging performance. Several methods, including the Taylor model, Sachs model, self-consistent model, and statistical upscaling models, were developed and implemented. In the absence of experimental data, predictions from the finite element method (FEM) were used as a reference to determine the accuracy of the different upscaling models. Micrographs from different loadings of nuclear waste were used in the prediction of thermal conductivity. The results demonstrated that, in terms of efficiency, the boundary models (Taylor and Sachs) are better than the self-consistent model, the statistical upscaling method, and FEM. Balancing computational resources and accuracy, statistical upscaling is a computationally efficient method for predicting the effective thermal conductivity of nuclear waste form.
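The two boundary models mentioned here reduce to simple volume-fraction averages: the Taylor (uniform-field, Voigt-type) model gives the arithmetic mean of the phase conductivities, and the Sachs (uniform-flux, Reuss-type) model gives the harmonic mean, together bracketing the effective conductivity. A minimal sketch, with made-up phase properties:

```python
import numpy as np

def taylor_bound(k, f):
    """Taylor (uniform-field / Voigt-type) upper bound: arithmetic volume average."""
    return np.dot(f, k)

def sachs_bound(k, f):
    """Sachs (uniform-flux / Reuss-type) lower bound: harmonic volume average."""
    return 1.0 / np.dot(f, 1.0 / k)

# Hypothetical two-phase waste form: glass matrix plus ceramic inclusions.
k = np.array([1.1, 3.0])   # W m^-1 K^-1, assumed phase conductivities
f = np.array([0.7, 0.3])   # volume fractions (must sum to 1)
print(sachs_bound(k, f), "<= k_eff <=", taylor_bound(k, f))
```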
An Efficient Computational Approach for the Calculation of the Vibrational Density of States.
Aieta, Chiara; Gabas, Fabio; Ceotto, Michele
2016-07-14
We present an optimized approach for the calculation of the density of fully coupled vibrational states in high-dimensional systems. This task is of paramount importance, because partition functions and several thermodynamic properties can be accurately estimated once the density of states is known. A new code, called paradensum, based on the implementation of the Wang-Landau Monte Carlo algorithm for parallel architectures is described and applied to real complex systems. We test the accuracy of paradensum on several molecular systems, including some benchmarks for which an exact evaluation of the vibrational density of states is doable by direct counting. In addition, we find a significant computational speedup with respect to standard approaches when applying our code to molecules up to 66 degrees of freedom. The new code can easily handle 150 degrees of freedom. These features make paradensum a very promising tool for future calculations of thermodynamic properties and thermal rate constants of complex systems. PMID:26840098
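For readers unfamiliar with the underlying sampler, the following is a minimal serial Wang-Landau sketch for a toy system (N oscillators, with the energy taken as the total number of quanta), in which the running estimate of ln g(E) biases the walk until the energy histogram is flat and the modification factor has been repeatedly halved. It is only meant to show the mechanics that paradensum parallelizes; the system and all parameters are illustrative.

```python
import numpy as np

def wang_landau(N=4, M=10, lnf_final=1e-3, flat=0.8, seed=1):
    """Toy Wang-Landau estimate of ln g(E) for N oscillators with 0..M quanta each."""
    rng = np.random.default_rng(seed)
    Emax = N * M
    lng = np.zeros(Emax + 1)              # running estimate of ln g(E)
    hist = np.zeros(Emax + 1)
    state, E, lnf = np.zeros(N, dtype=int), 0, 1.0
    while lnf > lnf_final:
        for _ in range(20000):
            i = rng.integers(N)
            new = int(rng.integers(M + 1))           # propose new quanta for oscillator i
            Enew = E - state[i] + new
            # accept with min(1, g(E)/g(Enew)): flattens the visited-energy histogram
            if rng.random() < np.exp(min(0.0, lng[E] - lng[Enew])):
                state[i], E = new, Enew
            lng[E] += lnf
            hist[E] += 1
        if hist.min() > flat * hist.mean():          # histogram flat -> refine
            hist[:] = 0.0
            lnf *= 0.5
    return lng - lng[0]                              # normalize so g(0) = 1

print(np.round(np.exp(wang_landau())[:5]))  # ~ [1, 4, 10, 20, 35] by direct counting
```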
Childress, E M; Kleinstreuer, C
2014-01-01
injection position could be adjusted in vivo using biodegradable mock-spheres to ensure that patient-specific optimal tumor-targeting is achieved. In general, the methodology described could generate computationally very efficient and sufficiently accurate solutions for the transient fluid-particle dynamics problem. However, future work should test this methodology in patient-specific geometries subject to various flow waveforms. PMID:24190601
An efficient numerical method for computing dynamics of spin F = 2 Bose-Einstein condensates
Wang Hanquan
2011-07-01
In this paper, we extend the efficient time-splitting Fourier pseudospectral method to solve the generalized Gross-Pitaevskii (GP) equations, which model the dynamics of spin F = 2 Bose-Einstein condensates at extremely low temperature. Using the time-splitting technique, we split the generalized GP equations into one linear part and two nonlinear parts: the linear part is solved with the Fourier pseudospectral method; one of nonlinear parts is solved analytically while the other one is reformulated into a matrix formulation and solved by diagonalization. We show that the method keeps well the conservation laws related to generalized GP equations in 1D and 2D. We also show that the method is of second-order in time and spectrally accurate in space through a one-dimensional numerical test. We apply the method to investigate the dynamics of spin F = 2 Bose-Einstein condensates confined in a uniform/nonuniform magnetic field.
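To make the splitting concrete, here is the scalar one-dimensional analogue (a single-component GP/nonlinear Schrödinger equation) advanced by Strang splitting: the potential-plus-nonlinear part is solved analytically because it leaves |psi| unchanged, and the kinetic part is solved exactly in Fourier space. The spin F = 2 case treated in the paper has five coupled components and an extra nonlinear part handled by diagonalization, which this sketch does not attempt; all parameters are illustrative.

```python
import numpy as np

def strang_step(psi, dt, k2, V, g):
    """One Strang step for i psi_t = -0.5 psi_xx + V psi + g |psi|^2 psi (hbar = m = 1)."""
    psi = psi * np.exp(-0.5j * dt * (V + g * np.abs(psi) ** 2))   # half nonlinear step, analytic
    psi = np.fft.ifft(np.exp(-0.5j * dt * k2) * np.fft.fft(psi))  # full kinetic step, spectral
    return psi * np.exp(-0.5j * dt * (V + g * np.abs(psi) ** 2))  # half nonlinear step

L, n, dt, g = 32.0, 256, 1e-3, 1.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
k2 = (2 * np.pi * np.fft.fftfreq(n, d=L / n)) ** 2
V = 0.5 * x ** 2                                   # harmonic trap
psi = np.exp(-x ** 2 / 2) / np.pi ** 0.25          # normalized Gaussian initial state
for _ in range(1000):
    psi = strang_step(psi, dt, k2, V, g)
print(np.sum(np.abs(psi) ** 2) * (L / n))          # norm stays ~1 (conservation law)
```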
NASA Technical Reports Server (NTRS)
Pulliam, T. H.; Steger, J. L.
1985-01-01
In 1977 and 1978, general purpose, centrally space-differenced implicit finite difference codes in two and three dimensions were introduced. These codes, now called ARC2D and ARC3D, can run either in inviscid or viscous mode for steady or unsteady flow. Since the introduction of the ARC2D and ARC3D codes, overall computational efficiency has been improved through a number of algorithmic changes. These changes involve the use of a spatially varying time step, the use of a sequence of mesh refinements to establish approximate solutions, implementation of various ways to reduce inversion work, improved numerical dissipation terms, and more implicit treatment of terms. The objective of the present investigation is to describe these improvements and to quantify their advantages and disadvantages. It is found that, using established and simple procedures, a computer code can be maintained which is competitive with specialized codes.
NASA Astrophysics Data System (ADS)
Niedermeier, Dennis; Ervens, Barbara; Clauss, Tina; Voigtländer, Jens; Wex, Heike; Hartmann, Susan; Stratmann, Frank
2014-01-01
In a recent study, the Soccer ball model (SBM) was introduced for modeling and/or parameterizing heterogeneous ice nucleation processes. The model applies classical nucleation theory. It allows for a consistent description of both apparently singular and stochastic ice nucleation behavior, by distributing contact angles over the nucleation sites of a particle population assuming a Gaussian probability density function. The original SBM utilizes the Monte Carlo technique, which hampers its usage in atmospheric models, as fairly time-consuming calculations must be performed to obtain statistically significant results. Thus, we have developed a simplified and computationally more efficient version of the SBM. We successfully used the new SBM to parameterize experimental nucleation data of, e.g., bacterial ice nucleation. Both SBMs give identical results; however, the new model is computationally less expensive as confirmed by cloud parcel simulations. Therefore, it is a suitable tool for describing heterogeneous ice nucleation processes in atmospheric models.
Poloni, Roberta; Íñiguez, Jorge; García, Alberto; Canadell, Enric
2010-10-20
We present a computationally efficient semi-empirical method, based on standard first-principles techniques and the so-called virtual crystal approximation, for determining the average atomic structure of crystals with substitutional disorder. We show that, making use of a minimal amount of experimental information, it is possible to define convenient figures of merit that allow us to recast the determination of the average atomic ordering within the unit cell as a minimization problem. We have tested our approach by applying it to a wide variety of materials, ranging from oxynitrides to borocarbides and transition-metal perovskite oxides. In all the cases we were able to reproduce the experimental solution, when it exists, or the first-principles result obtained by means of much more computationally intensive approaches. PMID:21386597
NASA Technical Reports Server (NTRS)
Almroth, B. O.; Stehlin, P.; Brogan, F. A.
1981-01-01
A method for improving the efficiency of nonlinear structural analysis by the use of global displacement functions is presented. The computer programs include options to define the global functions as input or let the program automatically select and update these functions. The program was applied to a number of structures: (1) 'pear-shaped cylinder' in compression, (2) bending of a long cylinder, (3) spherical shell subjected to point force, (4) panel with initial imperfections, (5) cylinder with cutouts. The sample cases indicate the usefulness of the procedure in the solution of nonlinear structural shell problems by the finite element method. It is concluded that the use of global functions for extrapolation will lead to savings in computer time.
Efficient computation of Hamiltonian matrix elements between non-orthogonal Slater determinants
NASA Astrophysics Data System (ADS)
Utsuno, Yutaka; Shimizu, Noritaka; Otsuka, Takaharu; Abe, Takashi
2013-01-01
We present an efficient numerical method for computing Hamiltonian matrix elements between non-orthogonal Slater determinants, focusing on the most time-consuming component of the calculation that involves a sparse array. In the usual case where many matrix elements should be calculated, this computation can be transformed into a multiplication of dense matrices. It is demonstrated that the present method based on the matrix-matrix multiplication attains ∼80% of the theoretical peak performance measured on systems equipped with modern microprocessors, a factor of 5-10 better than the normal method using indirectly indexed arrays to treat a sparse array. The reason for such different performances is discussed from the viewpoint of memory access.
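The key transformation, evaluating many bilinear forms at once as dense matrix products rather than one element at a time, can be illustrated in a few lines. The real kernel involves a sparse array and Slater-determinant structure; the dense toy below only shows why batching into GEMM reaches near-peak throughput.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 50
S = rng.standard_normal((n, n))   # stands in for the (sparse) coupling array
X = rng.standard_normal((m, n))   # m "bra" vectors, one per row
Y = rng.standard_normal((n, m))   # m "ket" vectors, one per column

# One element at a time: many small, memory-bound operations.
M_loop = np.array([[X[a] @ S @ Y[:, b] for b in range(m)] for a in range(m)])

# All m*m elements from two dense multiplications: compute-bound GEMM.
M_gemm = X @ S @ Y
print(np.allclose(M_loop, M_gemm))   # True
```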
A network of spiking neurons for computing sparse representations in an energy efficient way
Hu, Tao; Genkin, Alexander; Chklovskii, Dmitri B.
2013-01-01
Computing sparse redundant representations is an important problem both in applied mathematics and neuroscience. In many applications, this problem must be solved in an energy efficient way. Here, we propose a hybrid distributed algorithm (HDA), which solves this problem on a network of simple nodes communicating via low-bandwidth channels. HDA nodes perform both gradient-descent-like steps on analog internal variables and coordinate-descent-like steps via quantized external variables communicated to each other. Interestingly, such operation is equivalent to a network of integrate-and-fire neurons, suggesting that HDA may serve as a model of neural computation. We compare the numerical performance of HDA with existing algorithms and show that in the asymptotic regime the representation error of HDA decays with time, t, as 1/t. We show that HDA is stable against time-varying noise, specifically, the representation error decays as 1/t for Gaussian white noise. PMID:22920853
Efficient Computation of Closed-loop Frequency Response for Large Order Flexible Systems
NASA Technical Reports Server (NTRS)
Maghami, Peiman G.; Giesy, Daniel P.
1997-01-01
An efficient and robust computational scheme is given for the calculation of the frequency response function of a large order, flexible system implemented with a linear, time invariant control system. Advantage is taken of the highly structured sparsity of the system matrix of the plant based on a model of the structure using normal mode coordinates. The computational time per frequency point of the new computational scheme is a linear function of system size, a significant improvement over traditional, full-matrix techniques whose computational times per frequency point range from quadratic to cubic functions of system size. This permits the practical frequency domain analysis of systems of much larger order than by traditional, full-matrix techniques. Formulations are given for both open- and closed-loop systems. Numerical examples are presented showing the advantages of the present formulation over traditional approaches, both in speed and in accuracy. Using a model with 703 structural modes, a speed-up of almost two orders of magnitude was observed while accuracy improved by up to 5 decimal places.
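The sparsity being exploited is the diagonal structure of the plant in normal-mode coordinates: each mode contributes an independent second-order term, so the FRF is a sum over modes costing O(n) per frequency instead of an O(n^3) dense solve. A sketch of the open-loop case with made-up modal data (the paper's scheme also covers the closed-loop system, which this omits):

```python
import numpy as np

def modal_frf(omega, wn, zeta, B, C):
    """Open-loop FRF of a flexible structure in normal-mode coordinates:
    H(w) = sum_k C[:,k] B[k,:] / (wn_k^2 - w^2 + 2j zeta_k wn_k w), O(n) per frequency."""
    den = wn ** 2 - omega ** 2 + 2j * zeta * wn * omega   # one scalar per mode
    return (C / den) @ B                                   # broadcasts over modes

n, ni, no = 703, 2, 3                      # modes, inputs, outputs
rng = np.random.default_rng(0)
wn = np.sort(rng.uniform(1.0, 500.0, n))   # modal frequencies (rad/s), illustrative
zeta = np.full(n, 0.005)                   # light modal damping
B = rng.standard_normal((n, ni))           # modal input gains
C = rng.standard_normal((no, n))           # modal output gains
freqs = np.linspace(0.1, 600.0, 2000)
H = np.array([modal_frf(w, wn, zeta, B, C) for w in freqs])  # shape (2000, no, ni)
```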
Algorithms for Efficient Computation of Transfer Functions for Large Order Flexible Systems
NASA Technical Reports Server (NTRS)
Maghami, Peiman G.; Giesy, Daniel P.
1998-01-01
An efficient and robust computational scheme is given for the calculation of the frequency response function of a large order, flexible system implemented with a linear, time invariant control system. Advantage is taken of the highly structured sparsity of the system matrix of the plant based on a model of the structure using normal mode coordinates. The computational time per frequency point of the new computational scheme is a linear function of system size, a significant improvement over traditional, full-matrix techniques whose computational times per frequency point range from quadratic to cubic functions of system size. This permits the practical frequency domain analysis of systems of much larger order than by traditional, full-matrix techniques. Formulations are given for both open- and closed-loop systems. Numerical examples are presented showing the advantages of the present formulation over traditional approaches, both in speed and in accuracy. Using a model with 703 structural modes, the present method was up to two orders of magnitude faster than a traditional method. The present method generally showed good to excellent accuracy throughout the range of test frequencies, while traditional methods gave adequate accuracy for lower frequencies, but generally deteriorated in performance at higher frequencies with worst case errors being many orders of magnitude times the correct values.
Efficient solid state NMR powder simulations using SMP and MPP parallel computation.
Kristensen, Jørgen Holm; Farnan, Ian
2003-04-01
Methods for parallel simulation of solid state NMR powder spectra are presented for both shared and distributed memory parallel supercomputers. For shared memory architectures the performance of simulation programs implementing the OpenMP application programming interface is evaluated. It is demonstrated that the design of correct and efficient shared memory parallel programs is difficult as the performance depends on data locality and cache memory effects. The distributed memory parallel programming model is examined for simulation programs using the MPI message passing interface. The results reveal that both shared and distributed memory parallel computation are very efficient with an almost perfect application speedup and may be applied to the most advanced powder simulations. PMID:12713968
Hierarchy of Efficiently Computable and Faithful Lower Bounds to Quantum Discord
NASA Astrophysics Data System (ADS)
Piani, Marco
2016-08-01
Quantum discord expresses a fundamental nonclassicality of correlations that is more general than entanglement, but that, in its standard definition, is not easily evaluated. We derive a hierarchy of computationally efficient lower bounds to the standard quantum discord. Every nontrivial element of the hierarchy constitutes by itself a valid discordlike measure, based on a fundamental feature of quantum correlations: their lack of shareability. Our approach emphasizes how the difference between entanglement and discord depends on whether shareability is intended as a static property or as a dynamical process.
NASA Astrophysics Data System (ADS)
Ramos-Mendez, J. A.; Perl, J.; Faddegon, B.; Paganetti, H.
2012-10-01
In this work, the well accepted particle splitting technique has been adapted to proton therapy and implemented in a new Monte Carlo simulation tool (TOPAS) for modeling the gantry mounted treatment nozzles at the Northeast Proton Therapy Center (NPTC) at Massachusetts General Hospital (MGH). Gains up to a factor of 14.5 in computational efficiency were reached with respect to a reference simulation in the generation of the phase space data in the cylindrically symmetric region of the nozzle. Comparisons between dose profiles in a water tank for several configurations show agreement between the simulations done with and without particle splitting within the statistical precision.
PVT: An Efficient Computational Procedure to Speed up Next-generation Sequence Analysis
2014-01-01
Background High-throughput Next-Generation Sequencing (NGS) techniques are advancing genomics and molecular biology research. This technology generates substantially large data, which poses a major challenge to scientists seeking an efficient, cost- and time-effective solution for analysing such data. Further, across the different types of NGS data, certain challenging steps are common to the analysis. Spliced alignment is one such fundamental step in NGS data analysis which is extremely computationally intensive as well as time consuming. Serious problems exist even with the most widely used spliced alignment tools. TopHat is one such widely used spliced alignment tool which, although it supports multithreading, does not efficiently utilize computational resources in terms of CPU utilization and memory. Here we have introduced PVT (Pipelined Version of TopHat), in which we take a modular approach by breaking TopHat’s serial execution into a pipeline of multiple stages, thereby increasing the degree of parallelization and computational resource utilization. Thus we address the discrepancies in TopHat so as to analyse large NGS data efficiently. Results We analysed the SRA dataset (SRX026839 and SRX026838) consisting of single end reads and SRA data SRR1027730 consisting of paired-end reads. We used TopHat v2.0.8 to analyse these datasets and noted the CPU usage, memory footprint and execution time during spliced alignment. With this basic information, we designed PVT, a pipelined version of TopHat that removes the redundant computational steps during ‘spliced alignment’ and breaks the job into a pipeline of multiple stages (each comprising of different step(s)) to improve its resource utilization, thus reducing the execution time. Conclusions PVT provides an improvement over TopHat for spliced alignment of NGS data analysis. PVT thus resulted in the reduction of the execution time to ~23% for the single end read dataset. Further, PVT designed
Efficient Computation of Info-Gap Robustness for Finite Element Models
Stull, Christopher J.; Hemez, Francois M.; Williams, Brian J.
2012-07-05
A recent research effort at LANL proposed info-gap decision theory as a framework by which to measure the predictive maturity of numerical models. Info-gap theory explores the trade-offs between accuracy, that is, the extent to which predictions reproduce the physical measurements, and robustness, that is, the extent to which predictions are insensitive to modeling assumptions. Both accuracy and robustness are necessary to demonstrate predictive maturity. However, conducting an info-gap analysis can present a formidable challenge, from the standpoint of the required computational resources. This is because a robustness function requires the resolution of multiple optimization problems. This report offers an alternative, adjoint methodology to assess the info-gap robustness of Ax = b-like numerical models solved for a solution x. Two situations that can arise in structural analysis and design are briefly described and contextualized within the info-gap decision theory framework. The treatments of the info-gap problems, using the adjoint methodology are outlined in detail, and the latter problem is solved for four separate finite element models. As compared to statistical sampling, the proposed methodology offers highly accurate approximations of info-gap robustness functions for the finite element models considered in the report, at a small fraction of the computational cost. It is noted that this report considers only linear systems; a natural follow-on study would extend the methodologies described herein to include nonlinear systems.
NASA Astrophysics Data System (ADS)
Allphin, Devin
Computational fluid dynamics (CFD) solution approximations for complex fluid flow problems have become a common and powerful engineering analysis technique. These tools, though qualitatively useful, remain limited in practice by their underlying inverse relationship between simulation accuracy and overall computational expense. While a great volume of research has focused on remedying these issues inherent to CFD, one traditionally overlooked area of resource reduction for engineering analysis concerns the basic definition and determination of functional relationships for the studied fluid flow variables. This artificial relationship-building technique, called meta-modeling or surrogate/offline approximation, uses design of experiments (DOE) theory to efficiently approximate non-physical coupling between the variables of interest in a fluid flow analysis problem. By mathematically approximating these variables, DOE methods can effectively reduce the required quantity of CFD simulations, freeing computational resources for other analytical focuses. An idealized interpretation of a fluid flow problem can also be employed to create suitably accurate approximations of fluid flow variables for the purposes of engineering analysis. When used in parallel with a meta-modeling approximation, a closed-form approximation can provide useful feedback concerning proper construction, suitability, or even necessity of an offline approximation tool. It also provides a short-circuit pathway for further reducing the overall computational demands of a fluid flow analysis, again freeing resources for otherwise unsuitable resource expenditures. To validate these inferences, a design optimization problem was presented requiring the inexpensive estimation of aerodynamic forces applied to a valve operating on a simulated piston-cylinder heat engine. The determination of these forces was to be found using parallel surrogate and exact approximation methods, thus evidencing the comparative
NASA Astrophysics Data System (ADS)
Zimmermann, Anke; Kuhn, Sandra; Richter, Marten
2016-01-01
Often, the calculation of Coulomb coupling elements for quantum dynamical treatments, e.g., in cluster or correlation expansion schemes, requires the evaluation of a six dimensional spatial integral. Therefore, it represents a significant limiting factor in quantum mechanical calculations. If the size or the complexity of the investigated system increases, many coupling elements need to be determined. The resulting computational constraints require an efficient method for a fast numerical calculation of the Coulomb coupling. We present a computational method to reduce the numerical complexity by decreasing the number of spatial integrals for arbitrary geometries. We use a Green's function formulation of the Coulomb coupling and introduce a generalized scalar potential as solution of a generalized Poisson equation with a generalized charge density as the inhomogeneity. That enables a fast calculation of Coulomb coupling elements and, additionally, a straightforward inclusion of boundary conditions and arbitrarily spatially dependent dielectrics through the Coulomb Green's function. Particularly, if many coupling elements are included, the presented method, which is not restricted to specific symmetries of the model, presents a promising approach for increasing the efficiency of numerical calculations of the Coulomb interaction. To demonstrate the wide range of applications, we calculate internanostructure couplings, such as the Förster coupling, and illustrate the inclusion of symmetry considerations in the method for the Coulomb coupling between bound quantum dot states and unbound continuum states.
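The computational point, replacing a six-dimensional Coulomb integral with one solve of a Poisson equation followed by a single three-dimensional integral, can be sketched with a plain FFT Poisson solver on a periodic grid. Vacuum dielectric and Gaussian units are assumed here; the paper's Green's-function formulation additionally handles boundary conditions and spatially dependent dielectrics, which this toy does not. The example densities are hypothetical.

```python
import numpy as np

def coulomb_coupling(rho1, rho2, L):
    """Coulomb coupling <rho2 | Phi[rho1]> on a periodic cubic grid of side L:
    solve the Poisson equation in Fourier space, then do ONE 3-D integral
    instead of a 6-D one (vacuum, Gaussian units)."""
    n = rho1.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)
    kx, ky, kz = np.meshgrid(k, k, k, indexing='ij')
    k2 = kx ** 2 + ky ** 2 + kz ** 2
    k2[0, 0, 0] = 1.0                       # avoid division by zero at k = 0
    phi_hat = 4 * np.pi * np.fft.fftn(rho1) / k2
    phi_hat[0, 0, 0] = 0.0                  # drop the mean (neutralizing background)
    phi = np.fft.ifftn(phi_hat).real
    return np.sum(rho2 * phi) * (L / n) ** 3

# Two hypothetical Gaussian charge densities (e.g., two localized dot states).
n, L = 64, 20.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
X, Y, Z = np.meshgrid(x, x, x, indexing='ij')
g = lambda x0: np.exp(-((X - x0) ** 2 + Y ** 2 + Z ** 2))
rho1, rho2 = g(-2.0), g(2.0)
print(coulomb_coupling(rho1, rho2, L))  # ~ q1*q2/d for well-separated charges
```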
An efficient algorithm to compute row and column counts for sparse Cholesky factorization
Gilbert, J.R.; Ng, E.G.; Peyton, B.W.
1992-09-01
Let an undirected graph G be given, along with a specified depth-first spanning tree T. We give almost-linear-time algorithms to solve the following two problems: First, for every vertex v, compute the number of descendants w of v for which some descendant of w is adjacent (in G) to v. Second, for every vertex v, compute the number of ancestors of v that are adjacent (in G) to at least one descendant of v. These problems arise in Cholesky and QR factorizations of sparse matrices. Our algorithms can be used to determine the number of nonzero entries in each row and column of the triangular factor of a matrix from the zero/nonzero structure of the matrix. Such a prediction makes storage allocation for sparse matrix factorizations more efficient. Our algorithms run in time linear in the size of the input times a slowly-growing inverse of Ackermann's function. The best previously known algorithms for these problems ran in time linear in the sum of the nonzero counts, which is usually much larger. We give experimental results demonstrating the practical efficiency of the new algorithms.
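For contrast with the paper's almost-linear-time algorithms, here is the simpler classical approach: build the elimination tree (Liu's construction) and then obtain row and column counts of the factor by walking row subtrees, which costs time proportional to the number of nonzeros in the factor rather than in the input matrix. This is offered as a hedged reference implementation, not the authors' algorithm.

```python
def cholesky_counts(rows):
    """Row/column nonzero counts of the Cholesky factor L of a symmetric matrix.
    rows[i] = set of column indices j < i with A[i, j] != 0 (strict lower triangle).
    Classical O(|L|) method via the elimination tree; the paper's algorithms
    achieve almost-linear time in the size of A instead."""
    n = len(rows)
    parent, ancestor = [-1] * n, [-1] * n
    for i in range(n):                    # Liu's elimination-tree construction
        for j in sorted(rows[i]):
            while j != -1 and j < i:
                nxt = ancestor[j]
                ancestor[j] = i
                if nxt == -1:
                    parent[j] = i
                j = nxt
    rowcount, colcount = [1] * n, [1] * n # start with the diagonal entries
    mark = list(range(n))                 # mark[j] == i  <=>  j already seen for row i
    for i in range(n):
        for j in rows[i]:
            while mark[j] != i:           # walk up the tree to the visited part
                mark[j] = i
                rowcount[i] += 1
                colcount[j] += 1
                j = parent[j]
    return rowcount, colcount

# 4x4 example: nonzeros below the diagonal at (1,0), (2,1), (3,0), (3,2).
print(cholesky_counts([set(), {0}, {1}, {0, 2}]))  # -> ([1, 2, 2, 4], [3, 3, 2, 1])
```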
NASA Astrophysics Data System (ADS)
McMillin, L. M.; Crone, L. J.; Goldberg, M. D.; Kleespies, T. J.
1995-09-01
A fast and accurate method for the generation of atmospheric transmittances, optical path transmittance (OPTRAN), is described. Results from OPTRAN are compared with those produced by other currently used methods. OPTRAN produces transmittances that can be used to generate brightness temperatures that are accurate to better than 0.2 K, well over 10 times as accurate as the current methods. This is significant because it brings the accuracy of transmittance computation to a level at which it will not adversely affect atmospheric retrievals. OPTRAN is the product of an evolution of approaches developed earlier at the National Environmental Satellite, Data, and Information Service. A major feature of OPTRAN that contributes to its accuracy is that transmittance is obtained as a function of the absorber amount rather than the pressure.
NASA Astrophysics Data System (ADS)
MacDonald, Christopher L.; Bhattacharya, Nirupama; Sprouse, Brian P.; Silva, Gabriel A.
2015-09-01
Computing numerical solutions to fractional differential equations can be computationally intensive due to the effect of non-local derivatives in which all previous time points contribute to the current iteration. In general, numerical approaches that depend on truncating part of the system history while efficient, can suffer from high degrees of error and inaccuracy. Here we present an adaptive time step memory method for smooth functions applied to the Grünwald-Letnikov fractional diffusion derivative. This method is computationally efficient and results in smaller errors during numerical simulations. Sampled points along the system's history at progressively longer intervals are assumed to reflect the values of neighboring time points. By including progressively fewer points backward in time, a temporally 'weighted' history is computed that includes contributions from the entire past of the system, maintaining accuracy, but with fewer points actually calculated, greatly improving computational efficiency.
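For context, the baseline that adaptive-memory methods accelerate is the full-history Grünwald-Letnikov sum, in which the weights follow a simple recurrence on binomial coefficients and every past sample contributes to every new time point. A minimal sketch of that baseline follows; the paper's adaptive scheme, which subsamples the distant past at progressively longer intervals, is not reproduced here.

```python
import numpy as np
from math import gamma

def gl_derivative(f, h, alpha):
    """Grünwald-Letnikov fractional derivative of sampled f (full, untruncated history).
    D^a f(t_n) ~= h^-a * sum_k w_k f(t_{n-k}),  w_0 = 1,  w_k = w_{k-1} (1 - (a+1)/k)."""
    n = len(f)
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (1.0 - (alpha + 1.0) / k)
    # every previous time point contributes to every output point (the non-locality)
    return np.array([h ** -alpha * np.dot(w[:i + 1], f[i::-1]) for i in range(n)])

# Check against the exact result D^a t = t^(1-a) / Gamma(2-a) for f(t) = t.
h, alpha = 1e-3, 0.5
t = np.arange(0, 1, h)
approx = gl_derivative(t, h, alpha)
exact = t ** (1 - alpha) / gamma(2 - alpha)
print(np.max(np.abs(approx[1:] - exact[1:])))  # small discretization error, O(h)
```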
Lee, Sangyun; Liang, Ruibin; Voth, Gregory A; Swanson, Jessica M J
2016-02-01
An important challenge in the simulation of biomolecular systems is a quantitative description of the protonation and deprotonation process of amino acid residues. Despite the seeming simplicity of adding or removing a positively charged hydrogen nucleus, simulating the actual protonation/deprotonation process is inherently difficult. It requires both the explicit treatment of the excess proton, including its charge defect delocalization and Grotthuss shuttling through inhomogeneous moieties (water and amino residues), and extensive sampling of coupled condensed phase motions. In a recent paper (J. Chem. Theory Comput. 2014, 10, 2729-2737), a multiscale approach was developed to map high-level quantum mechanics/molecular mechanics (QM/MM) data into a multiscale reactive molecular dynamics (MS-RMD) model in order to describe amino acid deprotonation in bulk water. In this article, we extend the fitting approach (called FitRMD) to create MS-RMD models for ionizable amino acids within proteins. The resulting models are shown to faithfully reproduce the free energy profiles of the reference QM/MM Hamiltonian for PT inside an example protein, the ClC-ec1 H(+)/Cl(-) antiporter. Moreover, we show that the resulting MS-RMD models are computationally efficient enough to then characterize more complex 2-dimensional free energy surfaces due to slow degrees of freedom such as water hydration of internal protein cavities that can be inherently coupled to the excess proton charge translocation. The FitRMD method is thus shown to be an effective way to map ab initio level accuracy into a much more computationally efficient reactive MD method in order to explicitly simulate and quantitatively describe amino acid protonation/deprotonation in proteins. PMID:26734942
Sundareshan, Malur K; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen
2002-12-10
Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. For substantiating this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the