The solution of large multi-dimensional Poisson problems
NASA Technical Reports Server (NTRS)
Stone, H. S.
1974-01-01
The Buneman algorithm for solving Poisson problems can be adapted to solve large Poisson problems on computers with a rotating drum memory so that the computation is done with very little time lost due to rotational latency of the drum.
Development and Applications of a Modular Parallel Process for Large Scale Fluid/Structures Problems
NASA Technical Reports Server (NTRS)
Guruswamy, Guru P.; Byun, Chansup; Kwak, Dochan (Technical Monitor)
2001-01-01
A modular process that can efficiently solve large scale multidisciplinary problems using massively parallel supercomputers is presented. The process integrates disciplines with diverse physical characteristics by retaining the efficiency of individual disciplines. Computational domain independence of individual disciplines is maintained using a meta programming approach. The process integrates disciplines without affecting the combined performance. Results are demonstrated for large scale aerospace problems on several supercomputers. The super scalability and portability of the approach are demonstrated on several parallel computers.
An evaluation of superminicomputers for thermal analysis
NASA Technical Reports Server (NTRS)
Storaasli, O. O.; Vidal, J. B.; Jones, G. K.
1962-01-01
The feasibility and cost effectiveness of solving thermal analysis problems on superminicomputers is demonstrated. Conventional thermal analysis and the changing computer environment, computer hardware and software used, six thermal analysis test problems, performance of superminicomputers (CPU time, accuracy, turnaround, and cost) and comparison with large computers are considered. Although the CPU times for superminicomputers were 15 to 30 times greater than the fastest mainframe computer, the minimum cost to obtain the solutions on superminicomputers was from 11 percent to 59 percent of the cost of mainframe solutions. The turnaround (elapsed) time is highly dependent on the computer load, but for large problems, superminicomputers produced results in less elapsed time than a typically loaded mainframe computer.
Workshop report on large-scale matrix diagonalization methods in chemistry theory institute
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bischof, C.H.; Shepard, R.L.; Huss-Lederman, S.
The Large-Scale Matrix Diagonalization Methods in Chemistry theory institute brought together 41 computational chemists and numerical analysts. The goal was to understand the needs of the computational chemistry community in problems that utilize matrix diagonalization techniques. This was accomplished by reviewing the current state of the art and looking toward future directions in matrix diagonalization techniques. This institute occurred about 20 years after a related meeting of similar size. During those 20 years the Davidson method continued to dominate the problem of finding a few extremal eigenvalues for many computational chemistry problems. Work on non-diagonally dominant and non-Hermitian problems as well as parallel computing has also brought new methods to bear. The changes and similarities in problems and methods over the past two decades offered an interesting viewpoint for the success in this area. One important area covered by the talks was overviews of the source and nature of the chemistry problems. The numerical analysts were uniformly grateful for the efforts to convey a better understanding of the problems and issues faced in computational chemistry. An important outcome was an understanding of the wide range of eigenproblems encountered in computational chemistry. The workshop covered problems involving self-consistent-field (SCF), configuration interaction (CI), intramolecular vibrational relaxation (IVR), and scattering problems. In atomic structure calculations using the Hartree-Fock method (SCF), the symmetric matrices can range from order hundreds to thousands. These matrices often include large clusters of eigenvalues which can be as much as 25% of the spectrum. However, if CI methods are also used, the matrix size can be between 10^4 and 10^9, where only one or a few extremal eigenvalues and eigenvectors are needed. Working with very large matrices has led to the development of ...
The Role of the Goal in Solving Hard Computational Problems: Do People Really Optimize?
ERIC Educational Resources Information Center
Carruthers, Sarah; Stege, Ulrike; Masson, Michael E. J.
2018-01-01
The role that the mental, or internal, representation plays when people are solving hard computational problems has largely been overlooked to date, despite the reality that this internal representation drives problem solving. In this work we investigate how performance on versions of two hard computational problems differs based on what internal…
Development and Applications of a Modular Parallel Process for Large Scale Fluid/Structures Problems
NASA Technical Reports Server (NTRS)
Guruswamy, Guru P.; Kwak, Dochan (Technical Monitor)
2002-01-01
A modular process that can efficiently solve large scale multidisciplinary problems using massively parallel supercomputers is presented. The process integrates disciplines with diverse physical characteristics by retaining the efficiency of individual disciplines. Computational domain independence of individual disciplines is maintained using a meta programming approach. The process integrates disciplines without affecting the combined performance. Results are demonstrated for large scale aerospace problems on several supercomputers. The super scalability and portability of the approach are demonstrated on several parallel computers.
The nonequilibrium quantum many-body problem as a paradigm for extreme data science
NASA Astrophysics Data System (ADS)
Freericks, J. K.; Nikolić, B. K.; Frieder, O.
2014-12-01
Generating big data pervades much of physics. But some problems, which we call extreme data problems, are too large to be treated within big data science. The nonequilibrium quantum many-body problem on a lattice is just such a problem, where the Hilbert space grows exponentially with system size and rapidly becomes too large to fit on any computer (and can be effectively thought of as an infinite-sized data set). Nevertheless, much progress has been made with computational methods on this problem, which serve as a paradigm for how one can approach and attack extreme data problems. In addition, viewing these physics problems from a computer-science perspective leads to new approaches that can be tried to solve these problems more accurately and for longer times. We review a number of these different ideas here.
Exact parallel algorithms for some members of the traveling salesman problem family
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pekny, J.F.
1989-01-01
The traveling salesman problem and its many generalizations comprise one of the best known combinatorial optimization problem families. Most members of the family are NP-complete problems, so that exact algorithms require an unpredictable and sometimes large computational effort. Parallel computers offer hope for providing the power required to meet these demands. A major barrier to applying parallel computers is the lack of parallel algorithms. The contributions presented in this thesis center around new exact parallel algorithms for the asymmetric traveling salesman problem (ATSP), prize collecting traveling salesman problem (PCTSP), and resource constrained traveling salesman problem (RCTSP). The RCTSP is a particularly difficult member of the family since finding a feasible solution is an NP-complete problem. An exact sequential algorithm is also presented for the directed Hamiltonian cycle problem (DHCP). The DHCP algorithm is superior to current heuristic approaches and represents the first exact method applicable to large graphs. Computational results presented for each of the algorithms demonstrate the effectiveness of combining efficient algorithms with parallel computing methods. Performance statistics are reported for randomly generated ATSPs with 7,500 cities, PCTSPs with 200 cities, RCTSPs with 200 cities, DHCPs with 3,500 vertices, and assignment problems of size 10,000. Sequential results were collected on a Sun 4/260 engineering workstation, while parallel results were collected using a 14 and 100 processor BBN Butterfly Plus computer. The computational results represent the largest instances ever solved to optimality on any type of computer.
Predicting protein structures with a multiplayer online game.
Cooper, Seth; Khatib, Firas; Treuille, Adrien; Barbero, Janos; Lee, Jeehyung; Beenen, Michael; Leaver-Fay, Andrew; Baker, David; Popović, Zoran; Players, Foldit
2010-08-05
People exert large amounts of problem-solving effort playing computer games. Simple image- and text-recognition tasks have been successfully 'crowd-sourced' through games, but it is not clear if more complex scientific problems can be solved with human-directed computing. Protein structure prediction is one such problem: locating the biologically relevant native conformation of a protein is a formidable computational challenge given the very large size of the search space. Here we describe Foldit, a multiplayer online game that engages non-scientists in solving hard prediction problems. Foldit players interact with protein structures using direct manipulation tools and user-friendly versions of algorithms from the Rosetta structure prediction methodology, while they compete and collaborate to optimize the computed energy. We show that top-ranked Foldit players excel at solving challenging structure refinement problems in which substantial backbone rearrangements are necessary to achieve the burial of hydrophobic residues. Players working collaboratively develop a rich assortment of new strategies and algorithms; unlike computational approaches, they explore not only the conformational space but also the space of possible search strategies. The integration of human visual problem-solving and strategy development capabilities with traditional computational algorithms through interactive multiplayer games is a powerful new approach to solving computationally-limited scientific problems.
NASA Astrophysics Data System (ADS)
Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.
2016-09-01
Inverse modeling seeks model parameters given a set of observations. However, for practical problems, because the number of measurements is often large and the model parameters are numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2-D and a random hydraulic conductivity field in 3-D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multicore computational environment. Therefore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate to large-scale problems.
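The central idea in the record above, replacing the dense QR/SVD solve inside each Levenberg-Marquardt step with a Krylov-subspace solve, can be illustrated with a minimal sketch. The Python code below is not the authors' Julia/MADS implementation and omits their recycling of the Krylov subspace across damping parameters; the toy exponential-decay model, the parameter names, and the simple halve/double damping update are assumptions made purely for illustration.

```python
import numpy as np
from scipy.sparse.linalg import lsqr

def levenberg_marquardt_krylov(residual, jacobian, p0, lam=1.0, iters=50):
    """Minimal LM iteration whose damped linear step is solved with LSQR,
    a Krylov-subspace method, instead of a dense factorization."""
    p = np.asarray(p0, dtype=float)
    for _ in range(iters):
        r = residual(p)
        J = jacobian(p)
        # LSQR solves min ||J d + r||^2 + lam^2 ||d||^2 without forming J^T J.
        d = lsqr(J, -r, damp=lam)[0]
        if np.linalg.norm(residual(p + d)) < np.linalg.norm(r):
            p, lam = p + d, lam * 0.5      # accept the step, reduce damping
        else:
            lam *= 2.0                     # reject the step, increase damping
    return p

# Toy problem (illustrative assumption): fit y = a * exp(-b * t) to noisy data.
rng = np.random.default_rng(1)
t = np.linspace(0, 4, 60)
y = 2.5 * np.exp(-1.3 * t) + 0.01 * rng.standard_normal(t.size)
residual = lambda p: p[0] * np.exp(-p[1] * t) - y
jacobian = lambda p: np.column_stack([np.exp(-p[1] * t),
                                      -p[0] * t * np.exp(-p[1] * t)])
print(levenberg_marquardt_krylov(residual, jacobian, [1.0, 1.0]))
```

Because LSQR only touches the Jacobian through matrix-vector products, the same pattern scales to the highly parameterized, sparse problems the abstract describes.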
NASA Astrophysics Data System (ADS)
Debnath, Lokenath
2010-09-01
This article is essentially devoted to a brief historical introduction to Euler's formula for polyhedra, topology, the theory of graphs, and networks, with many examples from the real world. The celebrated Königsberg seven-bridge problem and some of the basic properties of graphs and networks, needed for some understanding of the macroscopic behaviour of real physical systems, are included. We also mention some important and modern applications of graph theory or network problems, from transportation to telecommunications. Graphs or networks are effectively used as powerful tools in industrial, electrical and civil engineering, and in communication networks and the planning of business and industry. Graph theory and combinatorics can be used to understand the changes that occur in many large and complex scientific, technical and medical systems. With the advent of fast large computers and the ubiquitous Internet consisting of a very large network of computers, large-scale complex optimization problems can be modelled in terms of graphs or networks and then solved by algorithms available in graph theory. Many large and more complex combinatorial problems dealing with the possible arrangements of situations of various kinds, and computing the number and properties of such arrangements, can be formulated in terms of networks. The Knight's tour problem, Hamilton's tour problem, the problem of magic squares, the Euler Graeco-Latin squares problem and their modern developments in the twentieth century are also included.
Supercomputer optimizations for stochastic optimal control applications
NASA Technical Reports Server (NTRS)
Chung, Siu-Leung; Hanson, Floyd B.; Xu, Huihuang
1991-01-01
Supercomputer optimizations for a computational method of solving stochastic, multibody, dynamic programming problems are presented. The computational method is valid for a general class of optimal control problems that are nonlinear, multibody dynamical systems, perturbed by general Markov noise in continuous time, i.e., nonsmooth Gaussian as well as jump Poisson random white noise. Optimization techniques for vector multiprocessors or vectorizing supercomputers include advanced data structures, loop restructuring, loop collapsing, blocking, and compiler directives. These advanced computing techniques and supercomputing hardware help alleviate Bellman's curse of dimensionality in dynamic programming computations by permitting the solution of large multibody problems. Possible applications include lumped flight dynamics models for uncertain environments, such as large scale and background random aerospace fluctuations.
Computational Issues in Damping Identification for Large Scale Problems
NASA Technical Reports Server (NTRS)
Pilkey, Deborah L.; Roe, Kevin P.; Inman, Daniel J.
1997-01-01
Two damping identification methods are tested for efficiency in large-scale applications. One is an iterative routine, and the other a least squares method. Numerical simulations have been performed on multiple degree-of-freedom models to test the effectiveness of the algorithm and the usefulness of parallel computation for the problems. High Performance Fortran is used to parallelize the algorithm. Tests were performed using the IBM-SP2 at NASA Ames Research Center. The least squares method tested incurs high communication costs, which reduces the benefit of high performance computing. This method's memory requirement grows at a very rapid rate, meaning that larger problems can quickly exceed available computer memory. The iterative method's memory requirement grows at a much slower pace and is able to handle problems with 500+ degrees of freedom on a single processor. This method benefits from parallelization, and significant speedup can be seen for problems of 100+ degrees of freedom.
Computers as an Instrument for Data Analysis. Technical Report No. 11.
ERIC Educational Resources Information Center
Muller, Mervin E.
A review of statistical data analysis involving computers as a multi-dimensional problem provides the perspective for consideration of the use of computers in statistical analysis and the problems associated with large data files. An overall description of STATJOB, a particular system for doing statistical data analysis on a digital computer,…
Large-scale structural optimization
NASA Technical Reports Server (NTRS)
Sobieszczanski-Sobieski, J.
1983-01-01
Problems encountered by aerospace designers in attempting to optimize whole aircraft are discussed, along with possible solutions. Large-scale optimization, as opposed to component-by-component optimization, is hindered by computational costs, software inflexibility, concentration on a single design methodology rather than on trade-offs, and the incompatibility of large-scale optimization with single-program, single-computer methods. The software problem can be approached by placing the full analysis outside of the optimization loop. Full analysis is then performed only periodically. Problem-dependent software can be removed from the generic code using a systems programming technique, and can then embody the definitions of design variables, objective function, and design constraints. Trade-off algorithms can be used at the design points to obtain quantitative answers. Finally, decomposing the large-scale problem into independent subproblems allows systematic optimization of the problems by an organization of people and machines.
Effect of virtual memory on efficient solution of two model problems
NASA Technical Reports Server (NTRS)
Lambiotte, J. J., Jr.
1977-01-01
Computers with virtual memory architecture allow programs to be written as if they were small enough to be contained in memory. Two types of problems are investigated to show that this luxury can lead to quite inefficient performance if the programmer does not interact strongly with the characteristics of the operating system when developing the program. The two problems considered are the simultaneous solution of a large linear system of equations by Gaussian elimination and a model three-dimensional finite-difference problem. Runs on the Control Data STAR-100 computer are made to demonstrate the inefficiencies of programming the problems in the manner one would naturally use if the problems were indeed small enough to be contained in memory. Program redesigns are presented which achieve large improvements in performance through changes in the computational procedure and the data base arrangement.
Cloud Computing for Complex Performance Codes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Appel, Gordon John; Hadgu, Teklu; Klein, Brandon Thorin
This report describes the use of cloud computing services for running complex public domain performance assessment problems. The work consisted of two phases: Phase 1 was to demonstrate that complex codes, on several differently configured servers, could run and compute trivial small-scale problems in a commercial cloud infrastructure. Phase 2 focused on proving that non-trivial large-scale problems could be computed in the commercial cloud environment. The cloud computing effort was successfully applied using codes of interest to the geohydrology and nuclear waste disposal modeling community.
Information Power Grid Posters
NASA Technical Reports Server (NTRS)
Vaziri, Arsi
2003-01-01
This document is a summary of the accomplishments of the Information Power Grid (IPG). Grids are an emerging technology that provide seamless and uniform access to the geographically dispersed computational, data storage, networking, instrument, and software resources needed for solving large-scale scientific and engineering problems. The goal of the NASA IPG is to use NASA's remotely located computing and data system resources to build distributed systems that can address problems that are too large or complex for a single site. The accomplishments outlined in this poster presentation are: access to distributed data, IPG heterogeneous computing, integration of large-scale computing nodes into a distributed environment, remote access to high data rate instruments, and an exploratory grid environment.
DOE Office of Scientific and Technical Information (OSTI.GOV)
None, None
The Second SIAM Conference on Computational Science and Engineering was held in San Diego from February 10-12, 2003. Total conference attendance was 553. This is a 23% increase in attendance over the first conference. The focus of this conference was to draw attention to the tremendous range of major computational efforts on large problems in science and engineering, to promote the interdisciplinary culture required to meet these large-scale challenges, and to encourage the training of the next generation of computational scientists. Computational Science & Engineering (CS&E) is now widely accepted, along with theory and experiment, as a crucial third mode of scientific investigation and engineering design. Aerospace, automotive, biological, chemical, semiconductor, and other industrial sectors now rely on simulation for technical decision support. For federal agencies also, CS&E has become an essential support for decisions on resources, transportation, and defense. CS&E is, by nature, interdisciplinary. It grows out of physical applications and it depends on computer architecture, but at its heart are powerful numerical algorithms and sophisticated computer science techniques. From an applied mathematics perspective, much of CS&E has involved analysis, but the future surely includes optimization and design, especially in the presence of uncertainty. Another mathematical frontier is the assimilation of very large data sets through such techniques as adaptive multi-resolution, automated feature search, and low-dimensional parameterization. The themes of the 2003 conference included, but were not limited to: Advanced Discretization Methods; Computational Biology and Bioinformatics; Computational Chemistry and Chemical Engineering; Computational Earth and Atmospheric Sciences; Computational Electromagnetics; Computational Fluid Dynamics; Computational Medicine and Bioengineering; Computational Physics and Astrophysics; Computational Solid Mechanics and Materials; CS&E Education; Meshing and Adaptivity; Multiscale and Multiphysics Problems; Numerical Algorithms for CS&E; Discrete and Combinatorial Algorithms for CS&E; Inverse Problems; Optimal Design, Optimal Control, and Inverse Problems; Parallel and Distributed Computing; Problem-Solving Environments; Software and Middleware Systems; Uncertainty Estimation and Sensitivity Analysis; and Visualization and Computer Graphics.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, Youzuo; O'Malley, Daniel; Vesselinov, Velimir V.
Inverse modeling seeks model parameters given a set of observations. However, for practical problems, because the number of measurements is often large and the model parameters are numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally efficient parallel Levenberg-Marquardt method for solving inverse modeling problems with a highly parameterized model space. Levenberg-Marquardt methods require the solution of a linear system of equations which can be prohibitively expensive to compute for moderate to large-scale problems. Our novel method projects the original linear problem down to a Krylov subspace, such that the dimensionality of the problem can be significantly reduced. Furthermore, we store the Krylov subspace computed when using the first damping parameter and recycle the subspace for the subsequent damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved using these computational techniques. We apply this new inverse modeling method to invert for random transmissivity fields in 2D and a random hydraulic conductivity field in 3D. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) in the model domain. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). By comparing with Levenberg-Marquardt methods using standard linear inversion techniques such as QR or SVD methods, our Levenberg-Marquardt method yields a speed-up ratio on the order of ~10^1 to ~10^2 in a multi-core computational environment. Furthermore, our new inverse modeling method is a powerful tool for characterizing subsurface heterogeneity for moderate- to large-scale problems.
Solving optimization problems on computational grids.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wright, S. J.; Mathematics and Computer Science
2001-05-01
Multiprocessor computing platforms, which have become more and more widely available since the mid-1980s, are now heavily used by organizations that need to solve very demanding computational problems. Parallel computing is now central to the culture of many research communities. Novel parallel approaches were developed for global optimization, network optimization, and direct-search methods for nonlinear optimization. Activity was particularly widespread in parallel branch-and-bound approaches for various problems in combinatorial and network optimization. As the cost of personal computers and low-end workstations has continued to fall, while the speed and capacity of processors and networks have increased dramatically, 'cluster' platforms have become popular in many settings. A somewhat different type of parallel computing platform known as a computational grid (alternatively, metacomputer) has arisen in comparatively recent times. Broadly speaking, this term refers not to a multiprocessor with identical processing nodes but rather to a heterogeneous collection of devices that are widely distributed, possibly around the globe. The advantage of such platforms is obvious: they have the potential to deliver enormous computing power. Just as obviously, however, the complexity of grids makes them very difficult to use. The Condor team, headed by Miron Livny at the University of Wisconsin, was among the pioneers in providing infrastructure for grid computations. More recently, the Globus project has developed technologies to support computations on geographically distributed platforms consisting of high-end computers, storage and visualization devices, and other scientific instruments. In 1997, we started the metaneos project as a collaborative effort between optimization specialists and the Condor and Globus groups. Our aim was to address complex, difficult optimization problems in several areas, designing and implementing the algorithms and the software infrastructure needed to solve these problems on computational grids. This article describes some of the results we have obtained during the first three years of the metaneos project. Our efforts have led to the development of the runtime support library MW for implementing algorithms with a master-worker control structure on Condor platforms. This work is discussed here, along with work on algorithms and codes for integer linear programming, the quadratic assignment problem, and stochastic linear programming. Our experiences in the metaneos project have shown that cheap, powerful computational grids can be used to tackle large optimization problems of various types. In an industrial or commercial setting, the results demonstrate that one may not have to buy powerful computational servers to solve many of the large problems arising in areas such as scheduling, portfolio optimization, or logistics; the idle time on employee workstations (or, at worst, an investment in a modest cluster of PCs) may do the job. For the optimization research community, our results motivate further work on parallel, grid-enabled algorithms for solving very large problems of other types. The fact that very large problems can be solved cheaply allows researchers to better understand issues of 'practical' complexity and of the role of heuristics.
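As a rough, single-machine analogy of the master-worker control structure described above (not the Condor MW library or its API), the sketch below uses Python's multiprocessing pool: a master process generates candidate points for a toy minimization task, hands them to whichever worker is free, and folds results back in as they arrive. The task, pool size, and function names are illustrative assumptions.

```python
import random
from multiprocessing import Pool

def evaluate(candidate):
    """Worker task: score a single candidate point of a toy objective."""
    x, y = candidate
    return (x - 1.0) ** 2 + (y + 2.0) ** 2, candidate

def master(n_tasks=2000, n_workers=4, seed=0):
    """Master: generate work, dispatch it to workers, keep the best result."""
    random.seed(seed)
    tasks = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(n_tasks)]
    best = None
    with Pool(n_workers) as pool:
        # imap_unordered yields results as workers finish, so one slow task
        # does not hold up the rest -- the essence of the master-worker pattern.
        for score, candidate in pool.imap_unordered(evaluate, tasks, chunksize=50):
            if best is None or score < best[0]:
                best = (score, candidate)
    return best

if __name__ == "__main__":
    print(master())   # best candidate should lie near (1, -2)
```

On a real grid the pool would be replaced by geographically distributed, heterogeneous workers, but the control flow seen by the master is essentially the same.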
Digital Maps, Matrices and Computer Algebra
ERIC Educational Resources Information Center
Knight, D. G.
2005-01-01
The way in which computer algebra systems, such as Maple, have made the study of complex problems accessible to undergraduate mathematicians with modest computational skills is illustrated by some large matrix calculations, which arise from representing the Earth's surface by digital elevation models. Such problems are often considered to lie in…
Parallel Algorithm Solves Coupled Differential Equations
NASA Technical Reports Server (NTRS)
Hayashi, A.
1987-01-01
Numerical methods adapted to concurrent processing. Algorithm solves set of coupled partial differential equations by numerical integration. Adapted to run on hypercube computer, algorithm separates problem into smaller problems solved concurrently. Increase in computing speed with concurrent processing over that achievable with conventional sequential processing appreciable, especially for large problems.
Exact solution of large asymmetric traveling salesman problems.
Miller, D L; Pekny, J F
1991-02-15
The traveling salesman problem is one of a class of difficult problems in combinatorial optimization that is representative of a large number of important scientific and engineering problems. A survey is given of recent applications and methods for solving large problems. In addition, an algorithm for the exact solution of the asymmetric traveling salesman problem is presented along with computational results for several classes of problems. The results show that the algorithm performs remarkably well for some classes of problems, determining an optimal solution even for problems with large numbers of cities, yet for other classes, even small problems thwart determination of a provably optimal solution.
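To make the notion of an exact solution concrete, the sketch below shows a toy depth-first branch-and-bound for the asymmetric TSP in Python. It is not the authors' algorithm, which relies on much stronger assignment-problem lower bounds and parallel search; the simple outgoing-arc lower bound and the instance size are assumptions for illustration, and the code is only practical for small problems.

```python
import math, random

def atsp_branch_and_bound(cost):
    """Toy exact solver for the asymmetric TSP via depth-first branch and bound."""
    n = len(cost)
    best = {"cost": math.inf, "tour": None}

    def lower_bound(visited, current, cost_so_far):
        # Every city that still needs an outgoing arc (the current city and all
        # unvisited ones) must pay at least its cheapest outgoing arc.
        bound = cost_so_far
        for i in (set(range(n)) - visited) | {current}:
            bound += min(cost[i][j] for j in range(n) if j != i)
        return bound

    def dfs(current, visited, cost_so_far, path):
        if len(visited) == n:                        # all cities visited: close the tour
            total = cost_so_far + cost[current][0]
            if total < best["cost"]:
                best["cost"], best["tour"] = total, path + [0]
            return
        if lower_bound(visited, current, cost_so_far) >= best["cost"]:
            return                                   # prune: cannot beat the incumbent
        for nxt in range(n):
            if nxt not in visited:
                dfs(nxt, visited | {nxt},
                    cost_so_far + cost[current][nxt], path + [nxt])

    dfs(0, {0}, 0.0, [0])
    return best["cost"], best["tour"]

# Small random instance; the asymmetric costs make cost[i][j] != cost[j][i].
random.seed(3)
n = 7
cost = [[0 if i == j else random.randint(1, 20) for j in range(n)] for i in range(n)]
print(atsp_branch_and_bound(cost))
```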
New design for interfacing computers to the Octopus network
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sloan, L.J.
1977-03-14
The Lawrence Livermore Laboratory has several large-scale computers which are connected to the Octopus network. Several difficulties arise in providing adequate resources along with reliable performance. To alleviate some of these problems a new method of bringing large computers into the Octopus environment is proposed.
Data-driven indexing mechanism for the recognition of polyhedral objects
NASA Astrophysics Data System (ADS)
McLean, Stewart; Horan, Peter; Caelli, Terry M.
1992-02-01
This paper is concerned with the problem of searching large model databases. To date, most object recognition systems have concentrated on the problem of matching using simple searching algorithms. This is quite acceptable when the number of object models is small. However, in the future, general purpose computer vision systems will be required to recognize hundreds or perhaps thousands of objects and, in such circumstances, efficient searching algorithms will be needed. The problem of searching a large model database is one which must be addressed if future computer vision systems are to be at all effective. In this paper we present a method we call data-driven feature-indexed hypothesis generation as one solution to the problem of searching large model databases.
NASA Astrophysics Data System (ADS)
Lin, Y.; O'Malley, D.; Vesselinov, V. V.
2015-12-01
Inverse modeling seeks model parameters given a set of observed state variables. However, for many practical problems, because the observed data sets are often large and model parameters are often numerous, conventional methods for inverse modeling can be computationally expensive. We have developed a new, computationally efficient Levenberg-Marquardt method for large-scale inverse modeling. Levenberg-Marquardt methods require the solution of a dense linear system of equations which can be prohibitively expensive to compute for large-scale inverse problems. Our novel method projects the original large-scale linear problem down to a Krylov subspace, such that the dimensionality of the measurements can be significantly reduced. Furthermore, instead of solving the linear system for every Levenberg-Marquardt damping parameter, we store the Krylov subspace computed when solving for the first damping parameter and recycle it for all the following damping parameters. The efficiency of our new inverse modeling algorithm is significantly improved by using these computational techniques. We apply this new inverse modeling method to invert for a random transmissivity field. Our algorithm is fast enough to solve for the distributed model parameters (transmissivity) at each computational node in the model domain. The inversion is also aided by the use of regularization techniques. The algorithm is coded in Julia and implemented in the MADS computational framework (http://mads.lanl.gov). Julia is an advanced high-level scientific programming language that allows for efficient memory management and utilization of high-performance computational resources. By comparing with a Levenberg-Marquardt method using standard linear inversion techniques, our Levenberg-Marquardt method yields a speed-up ratio of 15 in a multi-core computational environment and a speed-up ratio of 45 in a single-core computational environment. Therefore, our new inverse modeling method is a powerful tool for large-scale applications.
NASA Technical Reports Server (NTRS)
Iida, H. T.
1966-01-01
Computational procedure reduces the numerical effort whenever the method of finite differences is used to solve ablation problems for which the surface recession is large relative to the initial slab thickness. The number of numerical operations required for a given maximum space mesh size is reduced.
Active Subspace Methods for Data-Intensive Inverse Problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Qiqi
2017-04-27
The project has developed theory and computational tools to exploit active subspaces to reduce the dimension in statistical calibration problems. This dimension reduction enables MCMC methods to calibrate otherwise intractable models. The same theoretical and computational tools can also reduce the measurement dimension for calibration problems that use large stores of data.
Computing Evans functions numerically via boundary-value problems
NASA Astrophysics Data System (ADS)
Barker, Blake; Nguyen, Rose; Sandstede, Björn; Ventura, Nathaniel; Wahl, Colin
2018-03-01
The Evans function has been used extensively to study spectral stability of travelling-wave solutions in spatially extended partial differential equations. To compute Evans functions numerically, several shooting methods have been developed. In this paper, an alternative scheme for the numerical computation of Evans functions is presented that relies on an appropriate boundary-value problem formulation. Convergence of the algorithm is proved, and several examples, including the computation of eigenvalues for a multi-dimensional problem, are given. The main advantage of the scheme proposed here compared with earlier methods is that the scheme is linear and scalable to large problems.
Lexical Problems in Large Distributed Information Systems.
ERIC Educational Resources Information Center
Berkovich, Simon Ya; Shneiderman, Ben
1980-01-01
Suggests a unified concept of a lexical subsystem as part of an information system to deal with lexical problems in local and network environments. The linguistic and control functions of the lexical subsystems in solving problems for large computer systems are described, and references are included. (Author/BK)
Parallel block schemes for large scale least squares computations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Golub, G.H.; Plemmons, R.J.; Sameh, A.
1986-04-01
Large scale least squares computations arise in a variety of scientific and engineering problems, including geodetic adjustments and surveys, medical image analysis, molecular structures, partial differential equations and substructuring methods in structural engineering. In each of these problems, matrices often arise which possess a block structure which reflects the local connection nature of the underlying physical problem. For example, such super-large nonlinear least squares computations arise in geodesy. Here the coordinates of positions are calculated by iteratively solving overdetermined systems of nonlinear equations by the Gauss-Newton method. The US National Geodetic Survey will complete this year (1986) the readjustment of the North American Datum, a problem which involves over 540 thousand unknowns and over 6.5 million observations (equations). The observation matrix for these least squares computations has a block angular form with 161 diagonal blocks, each containing 3 to 4 thousand unknowns. In this paper parallel schemes are suggested for the orthogonal factorization of matrices in block angular form and for the associated backsubstitution phase of the least squares computations. In addition, a parallel scheme for the calculation of certain elements of the covariance matrix for such problems is described. It is shown that these algorithms are ideally suited for multiprocessors with three levels of parallelism such as the Cedar system at the University of Illinois. 20 refs., 7 figs.
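The block-angular structure mentioned above is what makes per-block orthogonal factorization attractive. The following Python sketch, under the assumptions of full-rank diagonal blocks and a dense coupling block, reduces each block independently with a QR factorization and then solves a small coupled problem for the shared unknowns; the per-block loop is exactly the part such schemes run in parallel, which is omitted here. Function and variable names are illustrative, not the paper's notation.

```python
import numpy as np
from scipy.linalg import solve_triangular

def block_angular_lstsq(blocks):
    """Solve min sum_i ||A_i x_i + B_i y - b_i||^2 for local x_i and coupling y.

    blocks is a list of (A_i, B_i, b_i). The per-block QR factorizations are
    mutually independent, which is what a parallel scheme would exploit.
    """
    reduced_T, reduced_d, factors = [], [], []
    for A_i, B_i, b_i in blocks:
        m_i, n_i = A_i.shape
        Q_i, R_i = np.linalg.qr(A_i, mode="complete")    # Q_i: m_i x m_i
        S = Q_i.T @ B_i
        c = Q_i.T @ b_i
        factors.append((R_i[:n_i, :], S[:n_i, :], c[:n_i]))  # for back-substitution
        reduced_T.append(S[n_i:, :])                          # rows involving only y
        reduced_d.append(c[n_i:])
    # Reduced least-squares problem in the coupling variables y.
    y = np.linalg.lstsq(np.vstack(reduced_T), np.concatenate(reduced_d), rcond=None)[0]
    # Back-substitute each block's local unknowns from its triangular factor.
    xs = [solve_triangular(R, c - S @ y) for R, S, c in factors]
    return xs, y

# Demo: three blocks with 4 local unknowns each and 2 coupling unknowns.
rng = np.random.default_rng(0)
blocks = [(rng.standard_normal((30, 4)), rng.standard_normal((30, 2)),
           rng.standard_normal(30)) for _ in range(3)]
xs, y = block_angular_lstsq(blocks)

# Cross-check against an assembled dense least-squares solve; should agree.
A = np.zeros((90, 3 * 4 + 2))
b = np.concatenate([blk[2] for blk in blocks])
for k, (A_k, B_k, _) in enumerate(blocks):
    A[30 * k:30 * (k + 1), 4 * k:4 * (k + 1)] = A_k
    A[30 * k:30 * (k + 1), -2:] = B_k
print(np.allclose(np.concatenate(xs + [y]), np.linalg.lstsq(A, b, rcond=None)[0]))
```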
Some Thoughts Regarding Practical Quantum Computing
NASA Astrophysics Data System (ADS)
Ghoshal, Debabrata; Gomez, Richard; Lanzagorta, Marco; Uhlmann, Jeffrey
2006-03-01
Quantum computing has become an important area of research in computer science because of its potential to provide more efficient algorithmic solutions to certain problems than are possible with classical computing. The ability to perform parallel operations over an exponentially large computational space has proved to be the main advantage of the quantum computing model. In this regard, we are particularly interested in the potential applications of quantum computers to enhance real software systems of interest to the defense, industrial, scientific and financial communities. However, while much has been written in the popular and scientific literature about the benefits of the quantum computational model, several of the problems associated with the practical implementation of real-life complex software systems on quantum computers are often ignored. In this presentation we will argue that practical quantum computation is not as straightforward as commonly advertised, even if the technological problems associated with the manufacturing and engineering of large-scale quantum registers were solved overnight. We will discuss some of the frequently overlooked difficulties that plague quantum computing in the areas of memories, I/O, addressing schemes, compilers, oracles, approximate information copying, logical debugging, error correction and fault-tolerant computing protocols.
Cloud-based large-scale air traffic flow optimization
NASA Astrophysics Data System (ADS)
Cao, Yi
The ever-increasing traffic demand makes the efficient use of airspace an imperative mission, and this paper presents an effort in response to this call. Firstly, a new aggregate model, called the Link Transmission Model (LTM), is proposed, which models the nationwide traffic as a network of flight routes identified by origin-destination pairs. The traversal time of a flight route is assumed to be the mode of the distribution of historical flight records, and the mode is estimated by using Kernel Density Estimation. As this simplification abstracts away physical trajectory details, the complexity of modeling is drastically decreased, resulting in efficient traffic forecasting. The predictive capability of the LTM is validated against recorded traffic data. Secondly, a nationwide traffic flow optimization problem with airport and en route capacity constraints is formulated based on the LTM. The optimization problem aims at alleviating traffic congestion with minimal global delays. This problem is intractable due to millions of variables. A dual decomposition method is applied to decompose the large-scale problem such that the subproblems are solvable. However, the whole problem is still computationally expensive to solve, since each subproblem is a smaller integer programming problem that pursues integer solutions. Solving an integer programming problem is known to be far more time-consuming than solving its linear relaxation. In addition, sequential execution on a standalone computer leads to a linear runtime increase when the problem size increases. To address the computational efficiency problem, a parallel computing framework is designed which accommodates concurrent executions via multithreading programming. The multithreaded version is compared with its monolithic version to show decreased runtime. Finally, an open-source cloud computing framework, Hadoop MapReduce, is employed for better scalability and reliability. This framework is an "off-the-shelf" parallel computing model that can be used for both offline historical traffic data analysis and online traffic flow optimization. It provides an efficient and robust platform for easy deployment and implementation. A small cloud consisting of five workstations was configured and used to demonstrate the advantages of cloud computing in dealing with large-scale parallelizable traffic problems.
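The dual decomposition step mentioned above can be illustrated on a toy resource-allocation problem (not the thesis's Link Transmission Model or its integer program): a single coupling capacity constraint is priced with a Lagrange multiplier, the per-route subproblems are then solved independently, and the multiplier is updated by a subgradient step. The quadratic subproblem form, names, and step size below are assumptions chosen so that each subproblem has a closed-form solution.

```python
import numpy as np

def dual_decomposition(demands, capacity, step=0.05, iters=2000):
    """Toy dual decomposition: minimize sum_i (x_i - d_i)^2
    subject to sum_i x_i <= capacity and x_i >= 0."""
    demands = np.asarray(demands, dtype=float)
    lam = 0.0
    for _ in range(iters):
        # With the coupling constraint priced at lam, each subproblem
        # min_x (x - d_i)^2 + lam * x decouples; in a parallel setting
        # these independent solves are what gets farmed out to workers.
        x = np.maximum(0.0, demands - lam / 2.0)
        # Subgradient ascent on the (concave) dual function.
        lam = max(0.0, lam + step * (x.sum() - capacity))
    return x, lam

x, lam = dual_decomposition([3.0, 4.0, 5.0], capacity=6.0)
print(x, lam)   # x approaches [1, 2, 3] with lam near 4 for this toy instance
```

In the thesis the subproblems are integer programs rather than closed-form updates, which is why their solution dominates the runtime and motivates the multithreaded and MapReduce implementations.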
DOE Office of Scientific and Technical Information (OSTI.GOV)
Blanford, M.
1997-12-31
Most commercially-available quasistatic finite element programs assemble element stiffnesses into a global stiffness matrix, then use a direct linear equation solver to obtain nodal displacements. However, for large problems (greater than a few hundred thousand degrees of freedom), the memory size and computation time required for this approach become prohibitive. Moreover, direct solution does not lend itself to the parallel processing needed for today's multiprocessor systems. This talk gives an overview of the iterative solution strategy of JAS3D, the nonlinear large-deformation quasistatic finite element program. Because its architecture is derived from an explicit transient-dynamics code, it does not ever assemble a global stiffness matrix. The author describes the approach he used to implement the solver on multiprocessor computers, and shows examples of problems run on hundreds of processors and more than a million degrees of freedom. Finally, he describes some of the work he is presently doing to address the challenges of iterative convergence for ill-conditioned problems.
Enhancing Large-Group Problem-Based Learning in Veterinary Medical Education.
ERIC Educational Resources Information Center
Pickrell, John A.
This project for large-group, problem-based learning at Kansas State University College of Veterinary Medicine developed 47 case-based videotapes that are used to model clinical conditions and also involved veterinary practitioners to formulate true practice cases into student learning opportunities. Problem-oriented, computer-assisted diagnostic…
Sensitivity analysis for large-scale problems
NASA Technical Reports Server (NTRS)
Noor, Ahmed K.; Whitworth, Sandra L.
1987-01-01
The development of efficient techniques for calculating sensitivity derivatives is studied. The objective is to present a computational procedure for calculating sensitivity derivatives as part of performing structural reanalysis for large-scale problems. The scope is limited to framed type structures. Both linear static analysis and free-vibration eigenvalue problems are considered.
Parallel Optimization of Polynomials for Large-scale Problems in Stability and Control
NASA Astrophysics Data System (ADS)
Kamyar, Reza
In this thesis, we focus on some of the NP-hard problems in control theory. Thanks to the converse Lyapunov theory, these problems can often be modeled as optimization over polynomials. To avoid the problem of intractability, we establish a trade-off between accuracy and complexity. In particular, we develop a sequence of tractable optimization problems --- in the form of Linear Programs (LPs) and/or Semi-Definite Programs (SDPs) --- whose solutions converge to the exact solution of the NP-hard problem. However, the computational and memory complexity of these LPs and SDPs grow exponentially with the progress of the sequence - meaning that improving the accuracy of the solutions requires solving SDPs with tens of thousands of decision variables and constraints. Setting up and solving such problems is a significant challenge. The existing optimization algorithms and software are only designed to use desktop computers or small cluster computers --- machines which do not have sufficient memory for solving such large SDPs. Moreover, the speed-up of these algorithms does not scale beyond dozens of processors. This in fact is the reason we seek parallel algorithms for setting-up and solving large SDPs on large cluster- and/or super-computers. We propose parallel algorithms for stability analysis of two classes of systems: 1) Linear systems with a large number of uncertain parameters; 2) Nonlinear systems defined by polynomial vector fields. First, we develop a distributed parallel algorithm which applies Polya's and/or Handelman's theorems to some variants of parameter-dependent Lyapunov inequalities with parameters defined over the standard simplex. The result is a sequence of SDPs which possess a block-diagonal structure. We then develop a parallel SDP solver which exploits this structure in order to map the computation, memory and communication to a distributed parallel environment. Numerical tests on a supercomputer demonstrate the ability of the algorithm to efficiently utilize hundreds and potentially thousands of processors, and analyze systems with 100+ dimensional state-space. Furthermore, we extend our algorithms to analyze robust stability over more complicated geometries such as hypercubes and arbitrary convex polytopes. Our algorithms can be readily extended to address a wide variety of problems in control such as H-infinity synthesis for systems with parametric uncertainty and computing control Lyapunov functions.
ERIC Educational Resources Information Center
Lord, Frederic M.; Stocking, Martha
A general computer program is described that will compute asymptotic standard errors and carry out significance tests for an endless variety of (standard and) nonstandard large-sample statistical problems, without requiring the statistician to derive asymptotic standard error formulas. The program assumes that the observations have a multinormal…
Reasoning by analogy as an aid to heuristic theorem proving.
NASA Technical Reports Server (NTRS)
Kling, R. E.
1972-01-01
When heuristic problem-solving programs are faced with large data bases that contain numbers of facts far in excess of those needed to solve any particular problem, their performance rapidly deteriorates. In this paper, the correspondence between a new unsolved problem and a previously solved analogous problem is computed and invoked to tailor large data bases to manageable sizes. This paper outlines the design of an algorithm for generating and exploiting analogies between theorems posed to a resolution-logic system. These algorithms are believed to be the first computationally feasible development of reasoning by analogy to be applied to heuristic theorem proving.
Penas, David R; González, Patricia; Egea, Jose A; Doallo, Ramón; Banga, Julio R
2017-01-21
The development of large-scale kinetic models is one of the current key issues in computational systems biology and bioinformatics. Here we consider the problem of parameter estimation in nonlinear dynamic models. Global optimization methods can be used to solve this type of problem, but the associated computational cost is very large. Moreover, many of these methods need the tuning of a number of adjustable search parameters, requiring a number of initial exploratory runs and therefore further increasing the computation times. Here we present a novel parallel method, self-adaptive cooperative enhanced scatter search (saCeSS), to accelerate the solution of this class of problems. The method is based on the scatter search optimization metaheuristic and incorporates several key new mechanisms: (i) asynchronous cooperation between parallel processes, (ii) coarse and fine-grained parallelism, and (iii) self-tuning strategies. The performance and robustness of saCeSS is illustrated by solving a set of challenging parameter estimation problems, including medium and large-scale kinetic models of the bacterium E. coli, baker's yeast S. cerevisiae, the vinegar fly D. melanogaster, Chinese Hamster Ovary cells, and a generic signal transduction network. The results consistently show that saCeSS is a robust and efficient method, allowing very significant reduction of computation times with respect to several previous state-of-the-art methods (from days to minutes, in several cases) even when only a small number of processors is used. The new parallel cooperative method presented here allows the solution of medium and large-scale parameter estimation problems in reasonable computation times and with small hardware requirements. Further, the method includes self-tuning mechanisms which facilitate its use by non-experts. We believe that this new method can play a key role in the development of large-scale and even whole-cell dynamic models.
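To make the underlying estimation problem concrete, the sketch below fits a toy logistic-growth ODE to noisy synthetic data with a stock global optimizer from SciPy. It is a single-process illustration of the problem class saCeSS targets, not the cooperative scatter-search method itself; the model, parameter bounds, and noise level are assumptions.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import differential_evolution

# Toy dynamic model (illustrative assumption): logistic growth dx/dt = r*x*(1 - x/K).
def simulate(params, t_eval, x0=0.1):
    r, K = params
    sol = solve_ivp(lambda t, x: r * x * (1 - x / K),
                    (t_eval[0], t_eval[-1]), [x0], t_eval=t_eval)
    return sol.y[0]

# Synthetic "measurements" generated from known parameters plus noise.
t_obs = np.linspace(0, 10, 25)
true_params = (0.8, 5.0)
rng = np.random.default_rng(42)
data = simulate(true_params, t_obs) + 0.05 * rng.standard_normal(t_obs.size)

def cost(params):
    # Sum-of-squares mismatch between simulated and observed trajectories.
    return np.sum((simulate(params, t_obs) - data) ** 2)

result = differential_evolution(cost, bounds=[(0.01, 3.0), (0.5, 20.0)], seed=1)
print(result.x)   # should land close to the true (0.8, 5.0)
```

Every cost evaluation requires an ODE solve, which is why large kinetic models with many parameters become so expensive and why parallel, cooperative strategies such as saCeSS matter.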
Correlation between Academic and Skills-Based Tests in Computer Networks
ERIC Educational Resources Information Center
Buchanan, William
2006-01-01
Computing-related programmes and modules have many problems, especially related to large class sizes, large-scale plagiarism, module franchising, and an increased requirement from students for increased amounts of hands-on, practical work. This paper presents a practical computer networks module which uses a mixture of online examinations and a…
Parallel solution of sparse one-dimensional dynamic programming problems
NASA Technical Reports Server (NTRS)
Nicol, David M.
1989-01-01
Parallel computation offers the potential for quickly solving large computational problems. However, it is often a non-trivial task to effectively use parallel computers. Solution methods must sometimes be reformulated to exploit parallelism; the reformulations are often more complex than their slower serial counterparts. We illustrate these points by studying the parallelization of sparse one-dimensional dynamic programming problems, those which do not obviously admit substantial parallelization. We propose a new method for parallelizing such problems, develop analytic models which help us to identify problems which parallelize well, and compare the performance of our algorithm with existing algorithms on a multiprocessor.
NASA Astrophysics Data System (ADS)
Takasaki, Koichi
This paper presents a program for the multidisciplinary optimization and identification problem of the nonlinear model of large aerospace vehicle structures. The program constructs the global matrix of the dynamic system in the time direction by the p-version finite element method (pFEM), and the basic matrix for each pFEM node in the time direction is described by a sparse matrix, similar to the static finite element problem. The algorithm used by the program does not require the Hessian matrix of the objective function and so has low memory requirements. It also has a relatively low computational cost, and is suited to parallel computation. The program was integrated as a solver module of the multidisciplinary analysis system CUMuLOUS (Computational Utility for Multidisciplinary Large scale Optimization of Undense System), which is under development by the Aerospace Research and Development Directorate (ARD) of the Japan Aerospace Exploration Agency (JAXA).
Numerical methods on some structured matrix algebra problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jessup, E.R.
1996-06-01
This proposal concerned the design, analysis, and implementation of serial and parallel algorithms for certain structured matrix algebra problems. It emphasized large order problems and so focused on methods that can be implemented efficiently on distributed-memory MIMD multiprocessors. Such machines supply the computing power and extensive memory demanded by the large order problems. We proposed to examine three classes of matrix algebra problems: the symmetric and nonsymmetric eigenvalue problems (especially the tridiagonal cases) and the solution of linear systems with specially structured coefficient matrices. As all of these are of practical interest, a major goal of this work was to translate our research in linear algebra into useful tools for use by the computational scientists interested in these and related applications. Thus, in addition to software specific to the linear algebra problems, we proposed to produce a programming paradigm and library to aid in the design and implementation of programs for distributed-memory MIMD computers. We now report on our progress on each of the problems and on the programming tools.
He, Qiang; Hu, Xiangtao; Ren, Hong; Zhang, Hongqi
2015-11-01
A novel artificial fish swarm algorithm (NAFSA) is proposed for solving the large-scale reliability-redundancy allocation problem (RAP). In NAFSA, the social behaviors of the fish swarm are classified in three ways: foraging behavior, reproductive behavior, and random behavior. The foraging behavior employs two position-updating strategies, and the selection and crossover operators are applied to define the reproductive ability of an artificial fish. For the random behavior, which is essentially a mutation strategy, the basic cloud generator is used as the mutation operator. Finally, numerical results of four benchmark problems and a large-scale RAP are reported and compared. NAFSA shows good performance in terms of computational accuracy and computational efficiency for the large-scale RAP. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
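A highly simplified, continuous-domain sketch of the foraging and random behaviors is given below; it is not the NAFSA of the abstract, which adds selection/crossover reproduction, a cloud-model mutation operator, and targets the discrete reliability-redundancy allocation problem. Parameter names and default values are assumptions.

```python
import numpy as np

def simple_fish_swarm(obj, bounds, n_fish=20, visual=0.5, step=0.2,
                      try_number=5, iters=200, seed=0):
    """Highly simplified fish-swarm-style optimizer (maximization)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    fish = rng.uniform(lo, hi, size=(n_fish, len(lo)))
    fitness = np.array([obj(f) for f in fish])
    best_idx = int(np.argmax(fitness))
    best_x, best_f = fish[best_idx].copy(), fitness[best_idx]
    for _ in range(iters):
        for i in range(n_fish):
            moved = False
            # Foraging behavior: probe a few points inside the visual range and
            # move toward the first one that improves the fitness.
            for _ in range(try_number):
                probe = np.clip(fish[i] + rng.uniform(-visual, visual, len(lo)), lo, hi)
                if obj(probe) > fitness[i]:
                    fish[i] = np.clip(fish[i] + step * (probe - fish[i]), lo, hi)
                    moved = True
                    break
            # Random behavior (acts as a mutation) when foraging fails.
            if not moved:
                fish[i] = np.clip(fish[i] + rng.uniform(-step, step, len(lo)), lo, hi)
            fitness[i] = obj(fish[i])
            if fitness[i] > best_f:
                best_x, best_f = fish[i].copy(), fitness[i]
    return best_x, best_f

best_x, best_f = simple_fish_swarm(lambda v: -float(np.sum(v ** 2)),
                                   bounds=[(-5, 5), (-5, 5)])
print(best_x, best_f)   # should approach the origin, where this toy objective peaks
```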
Introduction to bioinformatics.
Can, Tolga
2014-01-01
Bioinformatics is an interdisciplinary field mainly involving molecular biology and genetics, computer science, mathematics, and statistics. Data intensive, large-scale biological problems are addressed from a computational point of view. The most common problems are modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution usually involves the following steps: Collect statistics from biological data. Build a computational model. Solve a computational modeling problem. Test and evaluate a computational algorithm. This chapter gives a brief introduction to bioinformatics by first providing an introduction to biological terminology and then discussing some classical bioinformatics problems organized by the types of data sources. Sequence analysis is the analysis of DNA and protein sequences for clues regarding function and includes subproblems such as identification of homologs, multiple sequence alignment, searching sequence patterns, and evolutionary analyses. Protein structures are three-dimensional data and the associated problems are structure prediction (secondary and tertiary), analysis of protein structures for clues regarding function, and structural alignment. Gene expression data is usually represented as matrices and analysis of microarray data mostly involves statistics analysis, classification, and clustering approaches. Biological networks such as gene regulatory networks, metabolic pathways, and protein-protein interaction networks are usually modeled as graphs and graph theoretic approaches are used to solve associated problems such as construction and analysis of large-scale networks.
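As a concrete example of the sequence-analysis problems listed above, the sketch below implements a minimal global pairwise alignment (Needleman-Wunsch) by dynamic programming. The scoring values are arbitrary illustrative choices; real tools add substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment by dynamic programming."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Traceback to recover one optimal alignment.
    out_a, out_b, i, j = [], [], n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append('-'); i -= 1
        else:
            out_a.append('-'); out_b.append(b[j - 1]); j -= 1
    return score[n][m], ''.join(reversed(out_a)), ''.join(reversed(out_b))

print(needleman_wunsch("GATTACA", "GCATGCU"))
```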
On the role of minicomputers in structural design
NASA Technical Reports Server (NTRS)
Storaasli, O. O.
1977-01-01
Results are presented of exploratory studies on the use of a minicomputer in conjunction with large-scale computers to perform structural design tasks, including data and program management, use of interactive graphics, and computations for structural analysis and design. An assessment is made of minicomputer use for the structural model definition and checking and for interpreting results. Included are results of computational experiments demonstrating the advantages of using both a minicomputer and a large computer to solve a large aircraft structural design problem.
Xia, Youshen; Sun, Changyin; Zheng, Wei Xing
2012-05-01
There is growing interest in solving linear L1 estimation problems for the sparsity of the solution and robustness against non-Gaussian noise. This paper proposes a discrete-time neural network which can solve large linear L1 estimation problems quickly. The proposed neural network has a fixed computational step length and is proved to be globally convergent to an optimal solution. The proposed neural network is then efficiently applied to image restoration. Numerical results show that the proposed neural network is not only efficient in solving degenerate problems resulting from the nonunique solutions of the linear L1 estimation problems but also needs much less computational time than related algorithms in solving both linear L1 estimation and image restoration problems.
Applications of large-scale density functional theory in biology
NASA Astrophysics Data System (ADS)
Cole, Daniel J.; Hine, Nicholas D. M.
2016-10-01
Density functional theory (DFT) has become a routine tool for the computation of electronic structure in the physics, materials and chemistry fields. Yet the application of traditional DFT to problems in the biological sciences is hindered, to a large extent, by the unfavourable scaling of the computational effort with system size. Here, we review some of the major software and functionality advances that enable insightful electronic structure calculations to be performed on systems comprising many thousands of atoms. We describe some of the early applications of large-scale DFT to the computation of the electronic properties and structure of biomolecules, as well as to paradigmatic problems in enzymology, metalloproteins, photosynthesis and computer-aided drug design. With this review, we hope to demonstrate that first principles modelling of biological structure-function relationships is approaching a reality.
Fundamental organometallic reactions: Applications on the CYBER 205
NASA Technical Reports Server (NTRS)
Rappe, A. K.
1984-01-01
Two of the most challenging problems of Organometallic chemistry (loosely defined) are pollution control with the large space velocities needed and nitrogen fixation, a process so capably done by nature and so relatively poorly done by man (industry). For a computational chemist these problems are on the fringe of what is possible with conventional computers (large models needed and accurate energetics required). A summary of the algorithmic modification needed to address these problems on a vector processor such as the CYBER 205 and a sketch of findings to date on deNOx catalysis and nitrogen fixation are presented.
DEP : a computer program for evaluating lumber drying costs and investments
Stewart Holmes; George B. Harpole; Edward Bilek
1983-01-01
The DEP computer program is a modified discounted cash flow computer program designed for analysis of problems involving economic analysis of wood drying processes. Wood drying processes are different from other processes because of the large amounts of working capital required to finance inventories, and because of relatively large shares of costs charged to inventory...
Parallel computing method for simulating hydrological processesof large rivers under climate change
NASA Astrophysics Data System (ADS)
Wang, H.; Chen, Y.
2016-12-01
Climate change is one of the most widely recognized global environmental problems. It has altered the temporal and spatial distribution of watershed hydrological processes, especially in the world's large rivers. Watershed hydrological process simulation based on physically based distributed hydrological models can give better results than lumped models. However, such simulation involves a large amount of computation, especially for large rivers, and thus requires huge computing resources that may not be steadily available to researchers, or are available only at high expense; this has seriously restricted research and application. The current parallel methods mostly parallelize the computation in the space and time dimensions, calculating the natural features of the distributed hydrological model orderly by grid (unit, basin) from upstream to downstream. This article proposes a high-performance computing method for hydrological process simulation with a high speedup ratio and parallel efficiency. It combines the temporal and spatial runoff characteristics of the distributed hydrological model with distributed data storage, an in-memory database, distributed computing, and parallel computing based on computing power units. The method has strong adaptability and extensibility, which means it can make full use of the available computing and storage resources under the condition of limited computing resources, and its computing efficiency improves linearly with the increase of computing resources. This method can satisfy the parallel computing requirements of hydrological process simulation in small, medium, and large rivers.
Computer programs: Operational and mathematical, a compilation
NASA Technical Reports Server (NTRS)
1973-01-01
Several computer programs which are available through the NASA Technology Utilization Program are outlined. Presented are: (1) Computer operational programs which can be applied to resolve procedural problems swiftly and accurately. (2) Mathematical applications for the resolution of problems encountered in numerous industries. Although the functions which these programs perform are not new and similar programs are available in many large computer center libraries, this collection may be of use to centers with limited systems libraries and for instructional purposes for new computer operators.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Justin; Karra, Satish; Nakshatrala, Kalyana B.
2016-07-26
It is well-known that the standard Galerkin formulation, which is often the formulation of choice under the finite element method for solving self-adjoint diffusion equations, does not meet maximum principles and the non-negative constraint for anisotropic diffusion equations. Recently, optimization-based methodologies that satisfy maximum principles and the non-negative constraint for steady-state and transient diffusion-type equations have been proposed. To date, these methodologies have been tested only on small-scale academic problems. The purpose of this paper is to systematically study the performance of the non-negative methodology in the context of high performance computing (HPC). PETSc and TAO libraries are, respectively, used for the parallel environment and optimization solvers. For large-scale problems, it is important for computational scientists to understand the computational performance of current algorithms available in these scientific libraries. The numerical experiments are conducted on the state-of-the-art HPC systems, and a single-core performance model is used to better characterize the efficiency of the solvers. Furthermore, our studies indicate that the proposed non-negative computational framework for diffusion-type equations exhibits excellent strong scaling for real-world large-scale problems.
Exploring quantum computing application to satellite data assimilation
NASA Astrophysics Data System (ADS)
Cheung, S.; Zhang, S. Q.
2015-12-01
This is an exploratory study of the potential application of quantum computing to a scientific data optimization problem. On classical computational platforms, the physical domain of a satellite data assimilation problem is represented by a discrete variable transform, and classical minimization algorithms are employed to find the optimal solution of the analysis cost function. The computation becomes intensive and time-consuming when the problem involves a large number of variables and observations. Quantum computers open a very different approach, both in conceptual programming and in hardware architecture, for solving optimization problems. In order to explore whether we can utilize the quantum computing machine architecture, we formulate a satellite data assimilation experimental case as a quadratic programming optimization problem. We find a transformation of the problem that maps it into the Quadratic Unconstrained Binary Optimization (QUBO) framework. The Binary Wavelet Transform (BWT) is applied to the data assimilation variables for its invertible decomposition, and all calculations in the BWT are performed by Boolean operations. The transformed problem will be used experimentally to solve QUBO instances defined on the Chimera graphs of the quantum computer.
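As an illustration of the QUBO framework mentioned above, the following minimal sketch (not taken from the paper; the toy matrix and sizes are invented) enumerates a three-variable QUBO instance by brute force. A quantum annealer searches the same binary energy landscape in hardware.

```python
# Minimal illustrative sketch (not from the paper): a tiny Quadratic
# Unconstrained Binary Optimization (QUBO) instance, min_x x^T Q x with
# x in {0,1}^n, solved by brute-force enumeration.
import itertools
import numpy as np

Q = np.array([[-1.0,  2.0,  0.0],   # toy QUBO matrix, chosen arbitrarily
              [ 0.0, -1.0,  2.0],
              [ 0.0,  0.0, -1.0]])

best_x, best_e = None, np.inf
for bits in itertools.product([0, 1], repeat=Q.shape[0]):
    x = np.array(bits, dtype=float)
    e = x @ Q @ x                     # QUBO energy of this assignment
    if e < best_e:
        best_x, best_e = bits, e

print("optimal assignment:", best_x, "energy:", best_e)
```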
Information Systems and Performance Measures in Schools.
ERIC Educational Resources Information Center
Coleman, James S.; Karweit, Nancy L.
Large school systems bring various administrative problems in handling scheduling and records and in avoiding making red-tape casualties of students. The authors review a portion of the current use of computers to handle these problems and examine the range of activities for which computer processing could provide aid. Since automation always brings…
Large-scale inverse model analyses employing fast randomized data reduction
NASA Astrophysics Data System (ADS)
Lin, Youzuo; Le, Ellen B.; O'Malley, Daniel; Vesselinov, Velimir V.; Bui-Thanh, Tan
2017-08-01
When the number of observations is large, it is computationally challenging to apply classical inverse modeling techniques. We have developed a new computationally efficient technique for solving inverse problems with a large number of observations (e.g., on the order of 10^7 or greater). Our method, which we call the randomized geostatistical approach (RGA), is built upon the principal component geostatistical approach (PCGA). We employ a data reduction technique combined with the PCGA to improve the computational efficiency and reduce the memory usage. Specifically, we employ a randomized numerical linear algebra technique based on a so-called "sketching" matrix to effectively reduce the dimension of the observations without losing the information content needed for the inverse analysis. In this way, the computational and memory costs for RGA scale with the information content rather than the size of the calibration data. Our algorithm is coded in Julia and implemented in the MADS open-source high-performance computational framework (http://mads.lanl.gov). We apply our new inverse modeling method to invert for a synthetic transmissivity field. Compared to a standard geostatistical approach (GA), our method is more efficient when the number of observations is large. Most importantly, our method is capable of solving larger inverse problems than the standard GA and PCGA approaches. Therefore, our new model inversion method is a powerful tool for solving large-scale inverse problems. The method can be applied in any field and is not limited to hydrogeological applications such as the characterization of aquifer heterogeneity.
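The sketching step described above can be illustrated with a small, hedged example (not the RGA/MADS implementation; all sizes and names are invented): a Gaussian sketching matrix compresses a tall set of observations before an ordinary least-squares inversion.

```python
# Hedged sketch of the data-reduction idea behind "sketching": a random matrix
# S compresses a large observation set d = G m + noise before inversion.
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_par, k = 20_000, 50, 200            # many observations, few parameters

G = rng.standard_normal((n_obs, n_par))      # toy forward operator
m_true = rng.standard_normal(n_par)
d = G @ m_true + 0.01 * rng.standard_normal(n_obs)

S = rng.standard_normal((k, n_obs)) / np.sqrt(k)   # Gaussian sketching matrix

# Solve the much smaller sketched least-squares problem: min ||S G m - S d||
m_est, *_ = np.linalg.lstsq(S @ G, S @ d, rcond=None)
print("relative error:", np.linalg.norm(m_est - m_true) / np.linalg.norm(m_true))
```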
Multi-GPU implementation of a VMAT treatment plan optimization algorithm.
Tian, Zhen; Peng, Fei; Folkerts, Michael; Tan, Jun; Jia, Xun; Jiang, Steve B
2015-06-01
Volumetric modulated arc therapy (VMAT) optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units (GPUs) have been used to speed up the computations. However, a GPU's relatively small memory cannot handle cases with a large dose-deposition coefficient (DDC) matrix, e.g., those with a large target size, multiple targets, multiple arcs, and/or small beamlet size. The main purpose of this paper is to report an implementation of a column-generation-based VMAT algorithm, previously developed in the authors' group, on a multi-GPU platform to solve the memory limitation problem. While the column-generation-based VMAT algorithm has been previously developed, the GPU implementation details have not been reported. Hence, another purpose is to present detailed techniques employed for GPU implementation. The authors also would like to utilize this particular problem as an example problem to study the feasibility of using a multi-GPU platform to solve large-scale problems in medical physics. The column-generation approach generates VMAT apertures sequentially by solving a pricing problem (PP) and a master problem (MP) iteratively. In the authors' method, the sparse DDC matrix is first stored on a CPU in coordinate list (COO) format. On the GPU side, this matrix is split into four submatrices according to beam angles, which are stored on four GPUs in compressed sparse row (CSR) format. Computation of beamlet price, the first step in PP, is accomplished using multiple GPUs. A fast inter-GPU data transfer scheme is accomplished using peer-to-peer access. The remaining steps of the PP and MP problems are implemented on a CPU or a single GPU due to their modest problem scale and computational loads. The Barzilai-Borwein algorithm with a subspace step scheme is adopted here to solve the MP problem. A head and neck (H&N) cancer case is then used to validate the authors' method. The authors also compare their multi-GPU implementation with three different single-GPU implementation strategies, i.e., truncating the DDC matrix (S1), repeatedly transferring the DDC matrix between CPU and GPU (S2), and porting computations involving the DDC matrix to the CPU (S3), in terms of both plan quality and computational efficiency. Two more H&N patient cases and three prostate cases are used to demonstrate the advantages of the authors' method. The authors' multi-GPU implementation can finish the optimization process within ∼1 min for the H&N patient case. S1 leads to an inferior plan quality, although its total time was 10 s shorter than the multi-GPU implementation due to the reduced matrix size. S2 and S3 yield the same plan quality as the multi-GPU implementation but take ∼4 and ∼6 min, respectively. High computational efficiency was consistently achieved for the other five patient cases tested, with VMAT plans of clinically acceptable quality obtained within 23-46 s. Conversely, to obtain clinically comparable or acceptable plans for all six of the VMAT cases tested in this paper, the optimization time needed in a commercial TPS system on CPU was found to be on the order of several minutes. The results demonstrate that the multi-GPU implementation of the authors' column-generation-based VMAT optimization can handle the large-scale VMAT optimization problem efficiently without sacrificing plan quality.
The authors' study may serve as an example to shed some light on other large-scale medical physics problems that require multi-GPU techniques.
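A hedged sketch of the matrix-partitioning step described above (CPU-only, with scipy standing in for the GPU transfers; the sizes and the angle-grouping rule are invented for illustration):

```python
# Illustrative sketch: store a sparse dose-deposition-style matrix in COO
# format, then split it by beam-angle groups into CSR submatrices, mirroring
# the per-GPU partitioning described in the abstract.
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(1)
n_voxels, n_beamlets, n_groups = 5000, 1200, 4

ddc = sp.random(n_voxels, n_beamlets, density=0.01, format="coo", random_state=1)
ddc_csr = ddc.tocsr()

# Assign each beamlet (column) to one of n_groups beam-angle groups.
group_of_col = rng.integers(0, n_groups, size=n_beamlets)

submatrices = []
for g in range(n_groups):
    cols = np.where(group_of_col == g)[0]
    submatrices.append(ddc_csr[:, cols].tocsr())   # per-group CSR block

for g, block in enumerate(submatrices):
    print(f"group {g}: shape={block.shape}, nnz={block.nnz}")
```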
Parallel computation with molecular-motor-propelled agents in nanofabricated networks.
Nicolau, Dan V; Lard, Mercy; Korten, Till; van Delft, Falco C M J M; Persson, Malin; Bengtsson, Elina; Månsson, Alf; Diez, Stefan; Linke, Heiner; Nicolau, Dan V
2016-03-08
The combinatorial nature of many important mathematical problems, including nondeterministic-polynomial-time (NP)-complete problems, places a severe limitation on the problem size that can be solved with conventional, sequentially operating electronic computers. There have been significant efforts in conceiving parallel-computation approaches in the past, for example: DNA computation, quantum computation, and microfluidics-based computation. However, these approaches have not proven, so far, to be scalable and practical from a fabrication and operational perspective. Here, we report the foundations of an alternative parallel-computation system in which a given combinatorial problem is encoded into a graphical, modular network that is embedded in a nanofabricated planar device. Exploring the network in a parallel fashion using a large number of independent, molecular-motor-propelled agents then solves the mathematical problem. This approach uses orders of magnitude less energy than conventional computers, thus addressing issues related to power consumption and heat dissipation. We provide a proof-of-concept demonstration of such a device by solving, in a parallel fashion, the small instance {2, 5, 9} of the subset sum problem, which is a benchmark NP-complete problem. Finally, we discuss the technical advances necessary to make our system scalable with presently available technology.
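For readers unfamiliar with the benchmark, the following conventional (electronic) sketch enumerates the same small subset-sum instance {2, 5, 9}; the device explores this combinatorial tree physically, one molecular agent per path.

```python
# Conventional sketch of the benchmark solved in the paper: enumerate all
# subsets of {2, 5, 9} and record which target sums are reachable.
from itertools import combinations

numbers = (2, 5, 9)
reachable = {}
for r in range(len(numbers) + 1):
    for subset in combinations(numbers, r):
        reachable.setdefault(sum(subset), []).append(subset)

for total in sorted(reachable):
    print(total, "<-", reachable[total])
```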
Equation solvers for distributed-memory computers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.
1994-01-01
A large number of scientific and engineering problems require the rapid solution of large systems of simultaneous equations. The performance of parallel computers in this area now dwarfs that of traditional vector computers by nearly an order of magnitude. This talk describes the major issues involved in parallel equation solvers, with particular emphasis on the Intel Paragon, IBM SP-1 and SP-2 processors.
Solving LP Relaxations of Large-Scale Precedence Constrained Problems
NASA Astrophysics Data System (ADS)
Bienstock, Daniel; Zuckerberg, Mark
We describe new algorithms for solving linear programming relaxations of very large precedence constrained production scheduling problems. We present theory that motivates a new set of algorithmic ideas that can be employed on a wide range of problems; on data sets arising in the mining industry our algorithms prove effective on problems with many millions of variables and constraints, obtaining provably optimal solutions in a few minutes of computation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moryakov, A. V., E-mail: sailor@orc.ru
2016-12-15
An algorithm for solving the linear Cauchy problem for large systems of ordinary differential equations is presented. The algorithm for systems of first-order differential equations is implemented in the EDELWEISS code with the possibility of parallel computations on supercomputers employing the MPI (Message Passing Interface) standard for the data exchange between parallel processes. The solution is represented by a series of orthogonal polynomials on the interval [0, 1]. The algorithm is characterized by simplicity and by the ability to solve nonlinear problems by correcting the operator in accordance with the solution obtained in the previous iteration.
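The polynomial representation mentioned above can be illustrated with a hedged sketch (not the EDELWEISS algorithm itself): a known solution of a simple Cauchy problem is expanded in shifted Legendre polynomials on [0, 1] using numpy.

```python
# Hedged illustration of representing a Cauchy-problem solution as a series of
# orthogonal polynomials on [0, 1]. Here the known solution y(t) = exp(-5 t)
# of y' = -5 y, y(0) = 1 is fitted with shifted Legendre polynomials.
import numpy as np
from numpy.polynomial import Legendre

t = np.linspace(0.0, 1.0, 400)
y = np.exp(-5.0 * t)

series = Legendre.fit(t, y, deg=8, domain=[0.0, 1.0])   # shifted Legendre basis
print("max abs error of the degree-8 expansion:",
      np.max(np.abs(series(t) - y)))
```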
Software environment for implementing engineering applications on MIMD computers
NASA Technical Reports Server (NTRS)
Lopez, L. A.; Valimohamed, K. A.; Schiff, S.
1990-01-01
In this paper the concept for a software environment for developing engineering application systems for multiprocessor hardware (MIMD) is presented. The philosophy employed is to solve the largest problems possible in a reasonable amount of time, rather than solve existing problems faster. In the proposed environment most of the problems concerning parallel computation and handling of large distributed data spaces are hidden from the application program developer, thereby facilitating the development of large-scale software applications. Applications developed under the environment can be executed on a variety of MIMD hardware; it protects the application software from the effects of a rapidly changing MIMD hardware technology.
Application of computational aero-acoustics to real world problems
NASA Technical Reports Server (NTRS)
Hardin, Jay C.
1996-01-01
The application of computational aeroacoustics (CAA) to real-world problems is discussed, with the aim of assessing the various techniques. The applications are considered to be limited by the inability of computational resources to resolve the large range of scales involved in high-Reynolds-number flows. Possible simplifications are discussed. Problems remain to be solved concerning the efficient use of the power of parallel computers and the development of turbulence modeling schemes. The goal of CAA is stated as the implementation of acoustic design studies on a computer terminal with reasonable run times.
Multicategory Composite Least Squares Classifiers
Park, Seo Young; Liu, Yufeng; Liu, Dacheng; Scholl, Paul
2010-01-01
Classification is a very useful statistical tool for information extraction. In particular, multicategory classification is commonly seen in various applications. Although binary classification problems are heavily studied, extensions to the multicategory case are much less studied. In view of the increased complexity and volume of modern statistical problems, it is desirable to have multicategory classifiers that are able to handle problems with high dimensions and a large number of classes. Moreover, it is necessary for multicategory classifiers to have sound theoretical properties. In the literature, there exist several different versions of simultaneous multicategory Support Vector Machines (SVMs). However, the computation of the SVM can be difficult for large-scale problems, especially for problems with a large number of classes. Furthermore, the SVM cannot produce class probability estimates directly. In this article, we propose a novel efficient multicategory composite least squares classifier (CLS classifier), which utilizes a new composite squared loss function. The proposed CLS classifier has several important merits: efficient computation for problems with a large number of classes, asymptotic consistency, ability to handle high dimensional data, and simple conditional class probability estimation. Our simulated and real examples demonstrate competitive performance of the proposed approach. PMID:21218128
Students' Explanations in Complex Learning of Disciplinary Programming
ERIC Educational Resources Information Center
Vieira, Camilo
2016-01-01
Computational Science and Engineering (CSE) has been denominated as the third pillar of science and as a set of important skills to solve the problems of a global society. Along with the theoretical and the experimental approaches, computation offers a third alternative to solve complex problems that require processing large amounts of data, or…
Effect of local minima on adiabatic quantum optimization.
Amin, M H S
2008-04-04
We present a perturbative method to estimate the spectral gap for adiabatic quantum optimization, based on the structure of the energy levels in the problem Hamiltonian. We show that, for problems that have an exponentially large number of local minima close to the global minimum, the gap becomes exponentially small making the computation time exponentially long. The quantum advantage of adiabatic quantum computation may then be accessed only via the local adiabatic evolution, which requires phase coherence throughout the evolution and knowledge of the spectrum. Such problems, therefore, are not suitable for adiabatic quantum computation.
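The scaling argument can be summarized with the standard adiabatic condition (textbook form, not quoted from the paper): if the minimum gap closes exponentially with problem size N, the required evolution time grows exponentially,

$$
T \;\gtrsim\; \frac{\max_{s}\,\bigl|\langle 1(s)|\,\partial_s H(s)\,|0(s)\rangle\bigr|}{g_{\min}^{2}},
\qquad
g_{\min} \sim e^{-\alpha N} \;\Longrightarrow\; T \sim e^{2\alpha N}.
$$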
NASA Technical Reports Server (NTRS)
1973-01-01
Techniques are considered which would be used to characterize aerospace computers, with the space shuttle application as the end usage. The system-level digital problems which have been encountered and documented are surveyed. From the large cross section of tests, an optimum set is recommended that has a high probability of discovering documented system-level digital problems within laboratory environments. A baseline hardware/software system required as a laboratory tool to test aerospace computers is defined. Hardware and software baselines and additions necessary to interface the UTE to aerospace computers for test purposes are outlined.
Solution of a large hydrodynamic problem using the STAR-100 computer
NASA Technical Reports Server (NTRS)
Weilmuenster, K. J.; Howser, L. M.
1976-01-01
A representative hydrodynamics problem, the shock-initiated flow over a flat plate, was used for exploring the data organizations and program structures needed to exploit the STAR-100 vector processing computer. A brief description of the problem is followed by a discussion of how each portion of the computational process was vectorized. Finally, timings of different portions of the program are compared with equivalent operations on serial machines. The speedup of the STAR-100 over the CDC 6600 is shown to increase as the problem size increases. All computations were carried out on a CDC 6600 and a CDC STAR-100, with code written in FORTRAN for the 6600 and in STAR FORTRAN for the STAR-100.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aguilo Valentin, Miguel Alejandro
2016-07-01
This study presents a new nonlinear programming formulation for the solution of inverse problems. First, a general inverse problem formulation based on the compliance error functional is presented. The proposed error functional enables the computation of the Lagrange multipliers, and thus the first order derivative information, at the expense of just one model evaluation. Therefore, the calculation of the Lagrange multipliers does not require the solution of the computationally intensive adjoint problem. This leads to significant speedups for large-scale, gradient-based inverse problems.
Large scale systems : a study of computer organizations for air traffic control applications.
DOT National Transportation Integrated Search
1971-06-01
Based on current sizing estimates and tracking algorithms, some computer organizations applicable to future air traffic control computing systems are described and assessed. Hardware and software problem areas are defined and solutions are outlined.
Advanced computer architecture for large-scale real-time applications.
DOT National Transportation Integrated Search
1973-04-01
Air traffic control automation is identified as a crucial problem which provides a complex, real-time computer application environment. A novel computer architecture in the form of a pipeline associative processor is conceived to achieve greater perf...
NASA Astrophysics Data System (ADS)
O'Malley, D.; Vesselinov, V. V.
2017-12-01
Classical microprocessors have had a dramatic impact on hydrology for decades, due largely to the exponential growth in computing power predicted by Moore's law. However, this growth is not expected to continue indefinitely and has already begun to slow. Quantum computing is an emerging alternative to classical microprocessors. Here, we demonstrated cutting edge inverse model analyses utilizing some of the best available resources in both worlds: high-performance classical computing and a D-Wave quantum annealer. The classical high-performance computing resources are utilized to build an advanced numerical model that assimilates data from O(10^5) observations, including water levels, drawdowns, and contaminant concentrations. The developed model accurately reproduces the hydrologic conditions at a Los Alamos National Laboratory contamination site, and can be leveraged to inform decision-making about site remediation. We demonstrate the use of a D-Wave 2X quantum annealer to solve hydrologic inverse problems. This work can be seen as an early step in quantum-computational hydrology. We compare and contrast our results with an early inverse approach in classical-computational hydrology that is comparable to the approach we use with quantum annealing. Our results show that quantum annealing can be useful for identifying regions of high and low permeability within an aquifer. While the problems we consider are small-scale compared to the problems that can be solved with modern classical computers, they are large compared to the problems that could be solved with early classical CPUs. Further, the binary nature of the high/low permeability problem makes it well-suited to quantum annealing, but challenging for classical computers.
Efficiently modeling neural networks on massively parallel computers
NASA Technical Reports Server (NTRS)
Farber, Robert M.
1993-01-01
Neural networks are a very useful tool for analyzing and modeling complex real world systems. Applying neural network simulations to real world problems generally involves large amounts of data and massive amounts of computation. To efficiently handle the computational requirements of large problems, we have implemented at Los Alamos a highly efficient neural network compiler for serial computers, vector computers, vector parallel computers, and fine grain SIMD computers such as the CM-2 connection machine. This paper describes the mapping used by the compiler to implement feed-forward backpropagation neural networks for a SIMD (Single Instruction Multiple Data) architecture parallel computer. Thinking Machines Corporation has benchmarked our code at 1.3 billion interconnects per second (approximately 3 gigaflops) on a 64,000 processor CM-2 connection machine (Singer 1990). This mapping is applicable to other SIMD computers and can be implemented on MIMD computers such as the CM-5 connection machine. Our mapping has virtually no communications overhead with the exception of the communications required for a global summation across the processors (which has a sub-linear runtime growth on the order of O(log(number of processors))). We can efficiently model very large neural networks which have many neurons and interconnects, and our mapping can extend to arbitrarily large networks (within memory limitations) by merging the memory space of separate processors with fast adjacent-processor communications. This paper considers the simulation of only feed-forward neural networks, although the method is extendable to recurrent networks.
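The only communication the mapping requires is the global summation noted above; the following pure-Python stand-in (not the CM-2 code) shows why a pairwise tree reduction needs only about log2(P) communication steps.

```python
# Minimal stand-in for the O(log P) global summation: a pairwise (binary-tree)
# reduction over P partial sums.
import math

def tree_sum(partial_sums):
    """Reduce a list of per-processor partial sums in ceil(log2(P)) steps."""
    values = list(partial_sums)
    steps = 0
    while len(values) > 1:
        # Each "communication step" combines pairs of neighbouring values.
        values = [values[i] + values[i + 1] if i + 1 < len(values) else values[i]
                  for i in range(0, len(values), 2)]
        steps += 1
    return values[0], steps

total, steps = tree_sum(range(64000))        # e.g. 64,000 processor partial sums
print(total, steps, math.ceil(math.log2(64000)))
```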
Linear static structural and vibration analysis on high-performance computers
NASA Technical Reports Server (NTRS)
Baddourah, M. A.; Storaasli, O. O.; Bostic, S. W.
1993-01-01
Parallel computers offer the opportunity to significantly reduce the computation time necessary to analyze large-scale aerospace structures. This paper presents algorithms developed for and implemented on massively parallel computers, hereafter referred to as Scalable High-Performance Computers (SHPC), for the most computationally intensive tasks involved in structural analysis, namely, generation and assembly of system matrices, solution of systems of equations, and calculation of the eigenvalues and eigenvectors. Results on SHPC are presented for large-scale structural problems (i.e. models for High-Speed Civil Transport). The goal of this research is to develop a new, efficient technique which extends structural analysis to SHPC and makes large-scale structural analyses tractable.
Active subspace: toward scalable low-rank learning.
Liu, Guangcan; Yan, Shuicheng
2012-12-01
We address the scalability issues in low-rank matrix learning problems. Usually these problems resort to solving nuclear norm regularized optimization problems (NNROPs), which often suffer from high computational complexities if based on existing solvers, especially in large-scale settings. Based on the fact that the optimal solution matrix to an NNROP is often low rank, we revisit the classic mechanism of low-rank matrix factorization, based on which we present an active subspace algorithm for efficiently solving NNROPs by transforming large-scale NNROPs into small-scale problems. The transformation is achieved by factorizing the large solution matrix into the product of a small orthonormal matrix (active subspace) and another small matrix. Although such a transformation generally leads to nonconvex problems, we show that a suboptimal solution can be found by the augmented Lagrange alternating direction method. For the robust PCA (RPCA) (Candès, Li, Ma, & Wright, 2009 ) problem, a typical example of NNROPs, theoretical results verify the suboptimality of the solution produced by our algorithm. For the general NNROPs, we empirically show that our algorithm significantly reduces the computational complexity without loss of optimality.
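A hedged sketch of the factorization idea (not the authors' active subspace algorithm): the nuclear-norm term is replaced by the standard bilinear surrogate and the small factors are updated by alternating ridge regressions. The sizes and regularization weight are invented.

```python
# Hedged sketch: replace a nuclear-norm-regularized problem with a small
# bilinear one, min ||A - U V^T||_F^2 + lam (||U||_F^2 + ||V||_F^2),
# solved by alternating closed-form ridge-regression updates.
import numpy as np

rng = np.random.default_rng(2)
m, n, r, lam = 300, 200, 5, 0.1

A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # low-rank data
A += 0.01 * rng.standard_normal((m, n))                          # small noise

U = rng.standard_normal((m, r))
V = rng.standard_normal((n, r))
for _ in range(50):
    U = A @ V @ np.linalg.inv(V.T @ V + lam * np.eye(r))
    V = A.T @ U @ np.linalg.inv(U.T @ U + lam * np.eye(r))

print("relative fit error:", np.linalg.norm(A - U @ V.T) / np.linalg.norm(A))
```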
Changing from computing grid to knowledge grid in life-science grid.
Talukdar, Veera; Konar, Amit; Datta, Ayan; Choudhury, Anamika Roy
2009-09-01
Grid computing has a great potential to become a standard cyberinfrastructure for the life sciences, which often require high-performance computing and large data handling that exceed the computing capacity of a single institution. Grid computing applies the resources of many computers in a network to a single problem at the same time. It is useful for scientific problems that require a great number of computer processing cycles or access to a large amount of data. As biologists, we are constantly discovering millions of genes and genome features, which are assembled in a library and distributed on computers around the world. This means that new, innovative methods must be developed that exploit the resources available for extensive calculations - for example, grid computing. This survey reviews the latest grid technologies from the viewpoints of computing grid, data grid and knowledge grid. Computing grid technologies have matured enough to solve high-throughput real-world life scientific problems. Data grid technologies are strong candidates for realizing a "resourceome" for bioinformatics. Knowledge grids should be designed not only for sharing explicit knowledge on computers but also for community formation to share tacit knowledge among a community. By extending the concept of grid from computing grid to knowledge grid, it is possible to make use of a grid not only as sharable computing resources, but also as time and place in which people work together, create knowledge, and share knowledge and experiences in a community.
Parallel Simulation of Three-Dimensional Free Surface Fluid Flow Problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
BAER,THOMAS A.; SACKINGER,PHILIP A.; SUBIA,SAMUEL R.
1999-10-14
Simulation of viscous three-dimensional fluid flow typically involves a large number of unknowns. When free surfaces are included, the number of unknowns increases dramatically. Consequently, this class of problem is an obvious application of parallel high performance computing. We describe parallel computation of viscous, incompressible, free surface, Newtonian fluid flow problems that include dynamic contact lines. The Galerkin finite element method was used to discretize the fully-coupled governing conservation equations, and a "pseudo-solid" mesh mapping approach was used to determine the shape of the free surface. In this approach, the finite element mesh is allowed to deform to satisfy quasi-static solid mechanics equations subject to geometric or kinematic constraints on the boundaries. As a result, nodal displacements must be included in the set of unknowns. Other issues discussed are the proper constraints appearing along the dynamic contact line in three dimensions. Issues affecting efficient parallel simulations include problem decomposition to equally distribute computational work across an SPMD computer and determination of robust, scalable preconditioners for the distributed matrix systems that must be solved. Solution continuation strategies important for serial simulations have an enhanced relevance in a parallel computing environment due to the difficulty of solving large scale systems. Parallel computations will be demonstrated on an example taken from the coating flow industry: flow in the vicinity of a slot coater edge. This is a three dimensional free surface problem possessing a contact line that advances at the web speed in one region but transitions to static behavior in another region. As such, a significant fraction of the computational time is devoted to processing boundary data. Discussion focuses on parallel speed ups for fixed problem size, a class of problems of immediate practical importance.
Optimistic barrier synchronization
NASA Technical Reports Server (NTRS)
Nicol, David M.
1992-01-01
Barrier synchronization is a fundamental operation in parallel computation. In many contexts, at the point a processor enters a barrier it knows that it has already processed all the work required of it prior to synchronization. The alternative case, when a processor cannot enter a barrier with the assurance that it has already performed all the necessary pre-synchronization computation, is treated here. The problem arises when the number of pre-synchronization messages to be received by a processor is unknown, for example, in a parallel discrete simulation or any other computation that is largely driven by an unpredictable exchange of messages. We describe an optimistic O(log^2 P) barrier algorithm for such problems, study its performance on a large-scale parallel system, and consider extensions to general associative reductions as well as associative parallel prefix computations.
Artificial intelligence issues related to automated computing operations
NASA Technical Reports Server (NTRS)
Hornfeck, William A.
1989-01-01
Large data processing installations represent target systems for effective applications of artificial intelligence (AI) constructs. The system organization of a large data processing facility at the NASA Marshall Space Flight Center is presented. The methodology and the issues which are related to AI application to automated operations within a large-scale computing facility are described. Problems to be addressed and initial goals are outlined.
COPS: Large-scale nonlinearly constrained optimization problems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bondarenko, A.S.; Bortz, D.M.; More, J.J.
2000-02-10
The authors have started the development of COPS, a collection of large-scale nonlinearly Constrained Optimization Problems. The primary purpose of this collection is to provide difficult test cases for optimization software. Problems in the current version of the collection come from fluid dynamics, population dynamics, optimal design, and optimal control. For each problem they provide a short description of the problem, notes on the formulation of the problem, and results of computational experiments with general optimization solvers. They currently have results for DONLP2, LANCELOT, MINOS, SNOPT, and LOQO.
A linear decomposition method for large optimization problems. Blueprint for development
NASA Technical Reports Server (NTRS)
Sobieszczanski-Sobieski, J.
1982-01-01
A method is proposed for decomposing large optimization problems encountered in the design of engineering systems such as an aircraft into a number of smaller subproblems. The decomposition is achieved by organizing the problem and the subordinated subproblems in a tree hierarchy and optimizing each subsystem separately. Coupling of the subproblems is accounted for by subsequent optimization of the entire system based on sensitivities of the suboptimization problem solutions at each level of the tree to variables of the next higher level. A formalization of the procedure suitable for computer implementation is developed and the state of readiness of the implementation building blocks is reviewed showing that the ingredients for the development are on the shelf. The decomposition method is also shown to be compatible with the natural human organization of the design process of engineering systems. The method is also examined with respect to the trends in computer hardware and software progress to point out that its efficiency can be amplified by network computing using parallel processors.
Parallel Computation of Unsteady Flows on a Network of Workstations
NASA Technical Reports Server (NTRS)
1997-01-01
Parallel computation of unsteady flows requires significant computational resources. The utilization of a network of workstations appears to be an efficient solution, since large problems can be treated at a reasonable cost. This approach requires the solution of several problems: (1) partitioning and distributing the problem over a network of workstations, (2) providing efficient communication tools, and (3) managing the system efficiently for a given problem. There is also the question of the efficiency of any given numerical algorithm on such a computing system. The NPARC code was chosen as a sample application. For the explicit version of the NPARC code, both two- and three-dimensional problems were studied, and both steady and unsteady problems were investigated. The issues studied as part of the research program were: (1) how to distribute the data between the workstations, (2) how to compute and communicate efficiently at each node, and (3) how to balance the load distribution. In the following, a summary of these activities is presented. Details of the work have been presented and published as referenced.
Distributed Computation of the knn Graph for Large High-Dimensional Point Sets
Plaku, Erion; Kavraki, Lydia E.
2009-01-01
High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k nearest neighbor (knn) graphs. The knn graph of a data set is obtained by connecting each point to its k closest points. As the research in the above-mentioned fields progressively addresses problems of unprecedented complexity, the demand for computing knn graphs based on arbitrary distance metrics and large high-dimensional data sets increases, exceeding resources available to a single machine. In this work we efficiently distribute the computation of knn graphs for clusters of processors with message passing. Extensions to our distributed framework include the computation of graphs based on other proximity queries, such as approximate knn or range queries. Our experiments show nearly linear speedup with over one hundred processors and indicate that similar speedup can be obtained with several hundred processors. PMID:19847318
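A single-machine sketch of knn-graph construction follows (the sizes are hypothetical); the loop over query chunks is the natural unit of work that the paper distributes across processors with message passing.

```python
# Single-machine sketch of knn-graph construction. Each chunk of query points
# is, conceptually, the work assigned to one processor.
import numpy as np

rng = np.random.default_rng(3)
points = rng.standard_normal((1000, 16))      # 1000 points in 16 dimensions
k = 5

knn_graph = np.empty((points.shape[0], k), dtype=int)
chunk = 250                                    # one chunk per worker, conceptually
for start in range(0, points.shape[0], chunk):
    block = points[start:start + chunk]
    # Pairwise squared Euclidean distances between the chunk and all points.
    d2 = ((block[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    # Exclude self-neighbours, then keep the k closest points per row.
    np.fill_diagonal(d2[:, start:start + chunk], np.inf)
    knn_graph[start:start + chunk] = np.argsort(d2, axis=1)[:, :k]

print(knn_graph[:3])
```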
Near-Optimal Guidance Method for Maximizing the Reachable Domain of Gliding Aircraft
NASA Astrophysics Data System (ADS)
Tsuchiya, Takeshi
This paper proposes a guidance method for gliding aircraft that uses onboard computers to calculate a near-optimal trajectory in real time, thereby expanding the reachable domain. The results are applicable to advanced aircraft and future space transportation systems that require high safety. The computational load of the optimal control problem used to maximize the reachable domain is too large for current computers to handle in real time. Thus the optimal control problem is divided into two problems: a gliding-distance maximization problem in which the aircraft motion is limited to a vertical plane, and an optimal turning-flight problem in the horizontal direction. First, the former problem is solved using a shooting method. It can be solved easily because its scale is smaller than that of the original problem and because some features of the optimal solution are obtained in the first part of this paper. Next, in the latter problem, the optimal bank angle is computed from the solution of the former; this is an analytical computation rather than an iterative one. Finally, the reachable domain obtained from the proposed near-optimal guidance method is compared with that obtained from the original optimal control problem.
Computational nuclear quantum many-body problem: The UNEDF project
NASA Astrophysics Data System (ADS)
Bogner, S.; Bulgac, A.; Carlson, J.; Engel, J.; Fann, G.; Furnstahl, R. J.; Gandolfi, S.; Hagen, G.; Horoi, M.; Johnson, C.; Kortelainen, M.; Lusk, E.; Maris, P.; Nam, H.; Navratil, P.; Nazarewicz, W.; Ng, E.; Nobre, G. P. A.; Ormand, E.; Papenbrock, T.; Pei, J.; Pieper, S. C.; Quaglioni, S.; Roche, K. J.; Sarich, J.; Schunck, N.; Sosonkina, M.; Terasaki, J.; Thompson, I.; Vary, J. P.; Wild, S. M.
2013-10-01
The UNEDF project was a large-scale collaborative effort that applied high-performance computing to the nuclear quantum many-body problem. The primary focus of the project was on constructing, validating, and applying an optimized nuclear energy density functional, which entailed a wide range of pioneering developments in microscopic nuclear structure and reactions, algorithms, high-performance computing, and uncertainty quantification. UNEDF demonstrated that close associations among nuclear physicists, mathematicians, and computer scientists can lead to novel physics outcomes built on algorithmic innovations and computational developments. This review showcases a wide range of UNEDF science results to illustrate this interplay.
Advantages of Parallel Processing and the Effects of Communications Time
NASA Technical Reports Server (NTRS)
Eddy, Wesley M.; Allman, Mark
2000-01-01
Many computing tasks involve heavy mathematical calculations, or analyzing large amounts of data. These operations can take a long time to complete using only one computer. Networks such as the Internet provide many computers with the ability to communicate with each other. Parallel or distributed computing takes advantage of these networked computers by arranging them to work together on a problem, thereby reducing the time needed to obtain the solution. The drawback to using a network of computers to solve a problem is the time wasted in communicating between the various hosts. The application of distributed computing techniques to a space environment or to use over a satellite network would therefore be limited by the amount of time needed to send data across the network, which would typically take much longer than on a terrestrial network. This experiment shows how much faster a large job can be performed by adding more computers to the task, what role communications time plays in the total execution time, and the impact a long-delay network has on a distributed computing system.
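The trade-off measured in this experiment can be summarized by a simple hedged model (an assumed form, not taken from the report): with N hosts, the run time is roughly the per-host compute share plus the communication time, so the speedup saturates once communication dominates,

$$
T(N) \;\approx\; \frac{T_{\mathrm{comp}}}{N} + T_{\mathrm{comm}}(N),
\qquad
S(N) \;=\; \frac{T(1)}{T(N)} .
$$

On a long-delay (e.g. satellite) network, $T_{\mathrm{comm}}$ is large, so $S(N)$ levels off well below $N$.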
Parallel Preconditioning for CFD Problems on the CM-5
NASA Technical Reports Server (NTRS)
Simon, Horst D.; Kremenetsky, Mark D.; Richardson, John; Lasinski, T. A. (Technical Monitor)
1994-01-01
Up to today, preconditioning methods on massively parallel systems have faced a major difficulty. The most successful preconditioning methods in terms of accelerating the convergence of the iterative solver such as incomplete LU factorizations are notoriously difficult to implement on parallel machines for two reasons: (1) the actual computation of the preconditioner is not very floating-point intensive, but requires a large amount of unstructured communication, and (2) the application of the preconditioning matrix in the iteration phase (i.e. triangular solves) are difficult to parallelize because of the recursive nature of the computation. Here we present a new approach to preconditioning for very large, sparse, unsymmetric, linear systems, which avoids both difficulties. We explicitly compute an approximate inverse to our original matrix. This new preconditioning matrix can be applied most efficiently for iterative methods on massively parallel machines, since the preconditioning phase involves only a matrix-vector multiplication, with possibly a dense matrix. Furthermore the actual computation of the preconditioning matrix has natural parallelism. For a problem of size n, the preconditioning matrix can be computed by solving n independent small least squares problems. The algorithm and its implementation on the Connection Machine CM-5 are discussed in detail and supported by extensive timings obtained from real problem data.
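A hedged, dense, serial sketch of the column-wise construction described above (not the CM-5 implementation; the matrix and the sparsity-pattern rule are invented): each column of the approximate inverse solves an independent small least-squares problem, which is the source of the natural parallelism.

```python
# Sketch: column j of the approximate inverse M solves an independent
# least-squares problem min ||A m_j - e_j|| restricted to a chosen sparsity
# pattern (here, the nonzero pattern of A's column j, one simple option).
import numpy as np

rng = np.random.default_rng(4)
n = 60
A = np.eye(n) + 0.1 * (rng.random((n, n)) < 0.05) * rng.standard_normal((n, n))

M = np.zeros((n, n))
for j in range(n):                             # each column is independent work
    pattern = np.nonzero(A[:, j])[0]           # allowed nonzeros for column j
    e_j = np.zeros(n)
    e_j[j] = 1.0
    sol, *_ = np.linalg.lstsq(A[:, pattern], e_j, rcond=None)
    M[pattern, j] = sol

# Frobenius norms: how far A is from the identity vs. how well M inverts A.
print("||I - A||_F:", np.linalg.norm(np.eye(n) - A))
print("||I - A M||_F:", np.linalg.norm(np.eye(n) - A @ M))
```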
Bridging Social and Semantic Computing - Design and Evaluation of User Interfaces for Hybrid Systems
ERIC Educational Resources Information Center
Bostandjiev, Svetlin Alex I.
2012-01-01
The evolution of the Web brought new interesting problems to computer scientists that we loosely classify in the fields of social and semantic computing. Social computing is related to two major paradigms: computations carried out by a large amount of people in a collective intelligence fashion (i.e. wikis), and performing computations on social…
Penas, David R.; Henriques, David; González, Patricia; Doallo, Ramón; Saez-Rodriguez, Julio; Banga, Julio R.
2017-01-01
We consider a general class of global optimization problems dealing with nonlinear dynamic models. Although this class is relevant to many areas of science and engineering, here we are interested in applying this framework to the reverse engineering problem in computational systems biology, which yields very large mixed-integer dynamic optimization (MIDO) problems. In particular, we consider the framework of logic-based ordinary differential equations (ODEs). We present saCeSS2, a parallel method for the solution of this class of problems. This method is based on a parallel cooperative scatter search metaheuristic, with new mechanisms of self-adaptation and specific extensions to handle large mixed-integer problems. We have paid special attention to the avoidance of convergence stagnation using adaptive cooperation strategies tailored to this class of problems. We illustrate its performance with a set of three very challenging case studies from the domain of dynamic modelling of cell signaling. The simplest case study considers a synthetic signaling pathway and has 84 continuous and 34 binary decision variables. A second case study considers the dynamic modeling of signaling in liver cancer using high-throughput data, and has 135 continuous and 109 binary decision variables. The third case study is an extremely difficult problem related to breast cancer, involving 690 continuous and 138 binary decision variables. We report computational results obtained on different infrastructures, including a local cluster, a large supercomputer and a public cloud platform. Interestingly, the results show how the cooperation of individual parallel searches modifies the systemic properties of the sequential algorithm, achieving superlinear speedups compared to an individual search (e.g. speedups of 15 with 10 cores) and significantly improving (by more than 60%) the performance with respect to a non-cooperative parallel scheme. The scalability of the method is also good (tests were performed using up to 300 cores). These results demonstrate that saCeSS2 can be used to successfully reverse engineer large dynamic models of complex biological pathways. Further, these results open up new possibilities for other MIDO-based large-scale applications in the life sciences such as metabolic engineering, synthetic biology, and drug scheduling. PMID:28813442
Centralized Authentication with Kerberos 5, Part I
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wachsmann, A
Account administration in a distributed Unix/Linux environment can become very complicated and messy if done by hand. Large sites use special tools to deal with this problem. I will describe how even very small installations like your three-computer network at home can take advantage of the very same tools. The problem in a distributed environment is that password and shadow files need to be changed individually on each machine if an account change occurs. Account changes include: password change, addition/removal of accounts, name change of an account (UID/GID changes are a big problem in any case), additional or removed login privileges to a (group of) computer(s), etc. In this article, I will show how Kerberos 5 solves the authentication problem in a distributed computing environment. A second article will describe a solution for the authorization problem.
Fast Combinatorial Algorithm for the Solution of Linearly Constrained Least Squares Problems
Van Benthem, Mark H.; Keenan, Michael R.
2008-11-11
A fast combinatorial algorithm can significantly reduce the computational burden when solving general equality and inequality constrained least squares problems with large numbers of observation vectors. The combinatorial algorithm provides a mathematically rigorous solution and operates at great speed by reorganizing the calculations to take advantage of the combinatorial nature of the problems to be solved. The combinatorial algorithm exploits the structure that exists in large-scale problems in order to minimize the number of arithmetic operations required to obtain a solution.
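For context, the following sketch shows the naive baseline that the combinatorial algorithm accelerates: one independent non-negative least-squares solve per observation vector (scipy's nnls). The fast combinatorial algorithm gains its speed by reorganizing these solves so that columns sharing the same active set reuse factorizations; that reorganization is not shown here, and all sizes are invented.

```python
# Baseline sketch: solve min ||A x_i - b_i||, x_i >= 0 independently for every
# observation vector b_i (one nnls call per column of B).
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(5)
A = rng.random((40, 8))                 # mixing matrix
X_true = rng.random((8, 100))           # 100 non-negative observation vectors
B = A @ X_true

X_est = np.column_stack([nnls(A, B[:, i])[0] for i in range(B.shape[1])])
print("max abs error:", np.abs(X_est - X_true).max())
```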
Domain decomposition in time for PDE-constrained optimization
Barker, Andrew T.; Stoll, Martin
2015-08-28
Here, PDE-constrained optimization problems have a wide range of applications, but they lead to very large and ill-conditioned linear systems, especially if the problems are time dependent. In this paper we outline an approach for dealing with such problems by decomposing them in time and applying an additive Schwarz preconditioner in time, so that we can take advantage of parallel computers to deal with the very large linear systems. We then illustrate the performance of our method on a variety of problems.
Iterative framework radiation hybrid mapping
USDA-ARS?s Scientific Manuscript database
Building comprehensive radiation hybrid maps for large sets of markers is a computationally expensive process, since the basic mapping problem is equivalent to the traveling salesman problem. The mapping problem is also susceptible to noise, and as a result, it is often beneficial to remove markers ...
Solving satisfiability problems using a novel microarray-based DNA computer.
Lin, Che-Hsin; Cheng, Hsiao-Ping; Yang, Chang-Biau; Yang, Chia-Ning
2007-01-01
An algorithm based on a modified sticker model, accompanied by an advanced MEMS-based microarray technology, is demonstrated to solve the SAT problem, which has long served as a benchmark in DNA computing. Unlike conventional DNA computing algorithms, which need an initial data pool covering correct and incorrect answers and then execute a series of separation procedures to destroy the unwanted ones, we build solutions in parts to satisfy one clause in each step, and eventually solve the entire Boolean formula step by step. No time-consuming sample preparation procedures and no delicate sample-applying equipment are required for the computing process. Moreover, experimental results show that the bound DNA sequences can withstand the chemical solutions used during the computing process, so the proposed method should be useful in dealing with large-scale problems.
NASA Astrophysics Data System (ADS)
Zlotnik, Sergio
2017-04-01
The information provided by visualisation environments can be greatly increased if the data shown are combined with relevant physical processes and the user is allowed to interact with those processes. This is particularly interesting in VR environments, where the user has a deep interplay with the data. For example, a geological seismic line in a 3D "cave" shows information about the geological structure of the subsoil. The available information could be enhanced with the thermal state of the region under study, with water-flow patterns in porous rocks, or with rock displacements under some stress conditions. The information added by the physical processes is usually the output of some numerical technique applied to solve a Partial Differential Equation (PDE) that describes the underlying physics. Many techniques are available to obtain numerical solutions of PDEs (e.g. Finite Elements, Finite Volumes, Finite Differences, etc.). However, all these traditional techniques require very large computational resources (particularly in 3D), making them unusable in a real-time visualization environment such as VR, because the time required to compute a solution is measured in minutes or even hours. We present here a novel alternative for the resolution of PDE-based problems that is able to provide 3D solutions for a very large family of problems in real time. That is, the solution is evaluated in about one thousandth of a second, making the solver ideal to be embedded in VR environments. Based on Model Order Reduction ideas, the proposed technique divides the computational work into a computationally intensive "offline" phase, which is run only once, and an "online" phase that allows the real-time evaluation of any solution within a family of problems. Preliminary examples of real-time solutions of complex PDE-based problems will be presented, including thermal problems, flow problems, wave problems and some simple coupled problems.
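The offline/online split described above can be sketched with a plain proper-orthogonal-decomposition reduced basis; the parameterized system, the assemble_A helper and the snapshot sampling below are illustrative assumptions, not the solver used in this work.

```python
# Hedged sketch of an offline/online reduced-order model.  The toy
# parameterized matrix and sampling choices are assumptions for illustration.
import numpy as np

def offline(assemble_A, b, mus, n_modes=10):
    """Expensive phase, run once: full solves at sampled parameters + SVD."""
    snapshots = np.column_stack([np.linalg.solve(assemble_A(mu), b) for mu in mus])
    U, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :n_modes]                      # reduced basis V

def online(assemble_A, b, V, mu):
    """Cheap phase: project onto the reduced space and solve a tiny system."""
    A = assemble_A(mu)
    x_r = np.linalg.solve(V.T @ A @ V, V.T @ b)
    return V @ x_r                             # lift back to the full space

# Toy usage: a 1D reaction-diffusion-like matrix parameterized by mu.
n = 200
L = (np.diag(np.full(n, 2.0)) - np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1))
assemble_A = lambda mu: L + mu * np.eye(n)
b = np.ones(n)
V = offline(assemble_A, b, mus=np.linspace(0.1, 2.0, 20))
x_fast = online(assemble_A, b, V, mu=0.7)
```

A production reduced-order model would additionally exploit affine parameter dependence so the online phase never touches full-size operators; the sketch reassembles the full matrix only to keep the example short.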
Cost-effective use of minicomputers to solve structural problems
NASA Technical Reports Server (NTRS)
Storaasli, O. O.; Foster, E. P.
1978-01-01
Minicomputers are receiving increased use throughout the aerospace industry. Until recently, their use focused primarily on process control and numerically controlled tooling applications, while their exposure to and the opportunity for structural calculations has been limited. With the increased availability of this computer hardware, the question arises as to the feasibility and practicality of carrying out comprehensive structural analysis on a minicomputer. This paper presents results on the potential for using minicomputers for structural analysis by (1) selecting a comprehensive, finite-element structural analysis system in use on large mainframe computers; (2) implementing the system on a minicomputer; and (3) comparing the performance of the minicomputers with that of a large mainframe computer for the solution to a wide range of finite element structural analysis problems.
WE-D-303-00: Computational Phantoms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lewis, John; Brigham and Women’s Hospital and Dana-Farber Cancer Institute, Boston, MA
2015-06-15
Modern medical physics deals with complex problems such as 4D radiation therapy and imaging quality optimization. Such problems involve a large number of radiological parameters, and anatomical and physiological breathing patterns. A major challenge is how to develop, test, evaluate and compare various new imaging and treatment techniques, which often involves testing over a large range of radiological parameters as well as varying patient anatomies and motions. It would be extremely challenging, if not impossible, both ethically and practically, to test every combination of parameters and every task on every type of patient under clinical conditions. Computer-based simulation using computational phantoms offers a practical technique with which to evaluate, optimize, and compare imaging technologies and methods. Within simulation, the computerized phantom provides a virtual model of the patient’s anatomy and physiology. Imaging data can be generated from it as if it were a live patient using accurate models of the physics of the imaging and treatment process. With sophisticated simulation algorithms, it is possible to perform virtual experiments entirely on the computer. By serving as virtual patients, computational phantoms hold great promise in solving some of the most complex problems in modern medical physics. In this proposed symposium, we will present the history and recent developments of computational phantom models, share experiences in their application to advanced imaging and radiation applications, and discuss their promises and limitations. Learning Objectives: Understand the need and requirements of computational phantoms in medical physics research; Discuss the developments and applications of computational phantoms; Know the promises and limitations of computational phantoms in solving complex problems.
The rid-redundant procedure in C-Prolog
NASA Technical Reports Server (NTRS)
Chen, Huo-Yan; Wah, Benjamin W.
1987-01-01
C-Prolog can conveniently be used for logical inferences on knowledge bases. However, as with many search methods using backward chaining, a large number of redundant computations may be produced in recursive calls. To overcome this problem, the 'rid-redundant' procedure was designed to eliminate all redundant computations in running multi-recursive procedures. Experimental results obtained for C-Prolog on the VAX 11/780 computer show an order of magnitude improvement in the running time and solvable problem size.
Principal Component Geostatistical Approach for large-dimensional inverse problems
Kitanidis, P K; Lee, J
2014-01-01
The quasi-linear geostatistical approach is a method for weakly nonlinear underdetermined inverse problems, such as Hydraulic Tomography and Electrical Resistivity Tomography. It provides best estimates as well as measures for uncertainty quantification. However, in its textbook implementation, the approach involves iterations to reach an optimum and requires the determination of the Jacobian matrix, i.e., the derivative of the observation function with respect to the unknown. Although there are elegant methods for the determination of the Jacobian, the cost is high when the number of unknowns, m, and the number of observations, n, are large. It is also wasteful to compute the Jacobian for points away from the optimum. Irrespective of the issue of computing derivatives, the computational cost of implementing the method is generally of order m^2 n, though there are methods to reduce it. In this work, we present an implementation that utilizes a Gauss-Newton method that is matrix-free with respect to the Jacobian and improves the scalability of the geostatistical inverse problem. For each iteration, it is required to perform K runs of the forward problem, where K is not just much smaller than m but can be smaller than n. The computational and storage cost of the inverse procedure scales roughly linearly with m instead of m^2 as in the textbook approach. For problems with very large m, this implementation constitutes a dramatic reduction in computational cost compared to the textbook approach. Results illustrate the validity of the approach and provide insight into the conditions under which this method performs best. PMID:25558113
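The key computational trick, performing only K forward runs per iteration instead of forming the full Jacobian, can be sketched as follows. The forward model h() and the prior basis Phi are stand-ins, and the simple finite-difference step is an assumption rather than the authors' exact implementation.

```python
# Hedged sketch: the Jacobian is never formed; only its action on K basis
# vectors is approximated, at the cost of one extra forward run per vector.
import numpy as np

def jacobian_times_basis(h, x, Phi, eps=1e-6):
    """Approximate J @ Phi by forward differences: K extra forward runs."""
    hx = h(x)
    cols = []
    for k in range(Phi.shape[1]):
        v = Phi[:, k]
        step = eps / max(np.linalg.norm(v), 1e-12)   # scale step to the basis vector
        cols.append((h(x + step * v) - hx) / step)
    return np.column_stack(cols)                     # shape (n_obs, K)
```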
Multi-GPU implementation of a VMAT treatment plan optimization algorithm
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tian, Zhen, E-mail: Zhen.Tian@UTSouthwestern.edu, E-mail: Xun.Jia@UTSouthwestern.edu, E-mail: Steve.Jiang@UTSouthwestern.edu; Folkerts, Michael; Tan, Jun
Purpose: Volumetric modulated arc therapy (VMAT) optimization is a computationally challenging problem due to its large data size, high degrees of freedom, and many hardware constraints. High-performance graphics processing units (GPUs) have been used to speed up the computations. However, a GPU’s relatively small memory cannot hold the large dose-deposition coefficient (DDC) matrix that arises in cases with, e.g., a large target size, multiple targets, multiple arcs, and/or a small beamlet size. The main purpose of this paper is to report an implementation of a column-generation-based VMAT algorithm, previously developed in the authors’ group, on a multi-GPU platform to solve the memory limitation problem. While the column-generation-based VMAT algorithm has been previously developed, the GPU implementation details have not been reported. Hence, another purpose is to present detailed techniques employed for GPU implementation. The authors also would like to utilize this particular problem as an example problem to study the feasibility of using a multi-GPU platform to solve large-scale problems in medical physics. Methods: The column-generation approach generates VMAT apertures sequentially by solving a pricing problem (PP) and a master problem (MP) iteratively. In the authors’ method, the sparse DDC matrix is first stored on a CPU in coordinate list format (COO). On the GPU side, this matrix is split into four submatrices according to beam angles, which are stored on four GPUs in compressed sparse row format. Computation of beamlet price, the first step in PP, is accomplished using multiple GPUs. A fast inter-GPU data transfer scheme is accomplished using peer-to-peer access. The remaining steps of the PP and MP problems are implemented on a CPU or a single GPU due to their modest problem scale and computational loads. The Barzilai-Borwein algorithm with a subspace step scheme is adopted here to solve the MP problem. A head and neck (H and N) cancer case is then used to validate the authors’ method. The authors also compare their multi-GPU implementation with three different single-GPU implementation strategies, i.e., truncating the DDC matrix (S1), repeatedly transferring the DDC matrix between CPU and GPU (S2), and porting computations involving the DDC matrix to the CPU (S3), in terms of both plan quality and computational efficiency. Two more H and N patient cases and three prostate cases are used to demonstrate the advantages of the authors’ method. Results: The authors’ multi-GPU implementation can finish the optimization process within ∼1 min for the H and N patient case. S1 leads to an inferior plan quality although its total time was 10 s shorter than the multi-GPU implementation due to the reduced matrix size. S2 and S3 yield the same plan quality as the multi-GPU implementation but take ∼4 and ∼6 min, respectively. High computational efficiency was consistently achieved for the other five patient cases tested, with VMAT plans of clinically acceptable quality obtained within 23–46 s. Conversely, to obtain clinically comparable or acceptable plans for all six of these VMAT cases that the authors have tested in this paper, the optimization time needed in a commercial TPS system on CPU was found to be on the order of several minutes. Conclusions: The results demonstrate that the multi-GPU implementation of the authors’ column-generation-based VMAT optimization can handle the large-scale VMAT optimization problem efficiently without sacrificing plan quality.
The authors’ study may serve as an example to shed some light on other large-scale medical physics problems that require multi-GPU techniques.
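The CPU-side data-layout step can be sketched as follows; the equal four-way split over beamlet columns and the random stand-in for the DDC matrix are assumptions made only for illustration.

```python
# Hedged sketch of the data-layout step: a sparse DDC matrix held in COO on
# the CPU is split into per-GPU CSR blocks by beam-angle group.  The 4-way
# split and the equal partition of beamlet columns are assumptions.
import numpy as np
import scipy.sparse as sp

def split_ddc_by_beam_group(ddc_coo, n_gpus=4):
    """Partition columns (beamlets, assumed ordered by beam angle) into
    contiguous groups and return one CSR block per GPU."""
    csr = ddc_coo.tocsr()
    bounds = np.linspace(0, csr.shape[1], n_gpus + 1, dtype=int)
    return [csr[:, bounds[g]:bounds[g + 1]] for g in range(n_gpus)]

# Toy usage with a random sparse matrix standing in for the DDC matrix.
ddc = sp.random(10000, 4000, density=1e-3, format="coo")
blocks = split_ddc_by_beam_group(ddc)
print([b.shape for b in blocks])
```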
ERIC Educational Resources Information Center
Dennis, J. Richard; Thomson, David
This paper is concerned with a low cost alternative for providing computer experience to secondary school students. The brief discussion covers the programmable calculator and its relevance for teaching the concepts and the rudiments of computer programming and for computer problem solving. A list of twenty-five programming activities related to…
Solving Fuzzy Optimization Problem Using Hybrid Ls-Sa Method
NASA Astrophysics Data System (ADS)
Vasant, Pandian
2011-06-01
Fuzzy optimization has been one of the most prominent topics in the broad area of computational intelligence. It is especially relevant in the field of fuzzy non-linear programming, and its applications and practical realizations can be seen in many real-world problems. In this paper, a large-scale non-linear fuzzy programming problem is solved by a hybrid optimization technique combining Line Search (LS), Simulated Annealing (SA) and Pattern Search (PS). An industrial production planning problem with a cubic objective function, 8 decision variables and 29 constraints has been solved successfully using the LS-SA-PS hybrid optimization technique. The computational results for the objective function with respect to the vagueness factor and the level of satisfaction are provided in the form of 2D and 3D plots. The outcome is very promising and strongly suggests that the hybrid LS-SA-PS algorithm is efficient and productive in solving large-scale non-linear fuzzy programming problems.
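As a hedged illustration of the simulated-annealing ingredient only (not the paper's LS-SA-PS hybrid or its industrial model), the following sketch applies scipy's dual_annealing to a toy cubic objective with a penalized linear constraint.

```python
# Hedged sketch: a simulated-annealing stage on a toy cubic objective with a
# penalty for one linear constraint.  Objective, constraint and bounds are
# illustrative assumptions, not the paper's production planning model.
import numpy as np
from scipy.optimize import dual_annealing

def objective(x):
    cubic = np.sum(x**3 - 3.0 * x)              # toy cubic objective
    penalty = 1e3 * max(0.0, np.sum(x) - 10.0)  # penalize sum(x) > 10
    return cubic + penalty

bounds = [(0.0, 5.0)] * 8                       # 8 decision variables
result = dual_annealing(objective, bounds)
print(result.x, result.fun)
```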
Proceedings of the 1977 MACSYMA users' conference (NASA)
NASA Technical Reports Server (NTRS)
1977-01-01
The MACSYMA program for symbolic and algebraic manipulation enables exact, symbolic mathematical computations to be performed on a computer. This program is rather large, and various approaches to the hardware and software problems are examined.
Framework Resources Multiply Computing Power
NASA Technical Reports Server (NTRS)
2010-01-01
As an early proponent of grid computing, Ames Research Center awarded Small Business Innovation Research (SBIR) funding to 3DGeo Development Inc., of Santa Clara, California, (now FusionGeo Inc., of The Woodlands, Texas) to demonstrate a virtual computer environment that linked geographically dispersed computer systems over the Internet to help solve large computational problems. By adding to an existing product, FusionGeo enabled access to resources for calculation- or data-intensive applications whenever and wherever they were needed. Commercially available as Accelerated Imaging and Modeling, the product is used by oil companies and seismic service companies, which require large processing and data storage capacities.
Polynomial complexity despite the fermionic sign
NASA Astrophysics Data System (ADS)
Rossi, R.; Prokof'ev, N.; Svistunov, B.; Van Houcke, K.; Werner, F.
2017-04-01
It is commonly believed that in unbiased quantum Monte Carlo approaches to fermionic many-body problems, the infamous sign problem generically implies prohibitively large computational times for obtaining thermodynamic-limit quantities. We point out that for convergent Feynman diagrammatic series evaluated with a recently introduced Monte Carlo algorithm (see Rossi R., arXiv:1612.05184), the computational time increases only polynomially with the inverse error on thermodynamic-limit quantities.
GRADIENT: Graph Analytic Approach for Discovering Irregular Events, Nascent and Temporal
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hogan, Emilie
2015-03-31
Finding a time-ordered signature within large graphs is a computationally complex problem due to the combinatorial explosion of potential patterns. GRADIENT is designed to search and understand that problem space.
GRADIENT: Graph Analytic Approach for Discovering Irregular Events, Nascent and Temporal
Hogan, Emilie
2018-01-16
Finding a time-ordered signature within large graphs is a computationally complex problem due to the combinatorial explosion of potential patterns. GRADIENT is designed to search and understand that problem space.
Vectorization of transport and diffusion computations on the CDC Cyber 205
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abu-Shumays, I.K.
1986-01-01
The development and testing of alternative numerical methods and computational algorithms specifically designed for the vectorization of transport and diffusion computations on a Control Data Corporation (CDC) Cyber 205 vector computer are described. Two solution methods for the discrete ordinates approximation to the transport equation are summarized and compared. Factors of 4 to 7 reduction in run times for certain large transport problems were achieved on a Cyber 205 as compared with run times on a CDC-7600. The solution of tridiagonal systems of linear equations, central to several efficient numerical methods for multidimensional diffusion computations and essential for fluid flow and other physics and engineering problems, is also dealt with. Among the methods tested, a combined odd-even cyclic reduction and modified Cholesky factorization algorithm for solving linear symmetric positive definite tridiagonal systems is found to be the most effective for these systems on a Cyber 205. For large tridiagonal systems, computation with this algorithm is an order of magnitude faster on a Cyber 205 than computation with the best algorithm for tridiagonal systems on a CDC-7600.
Comparative Evaluation of Different Optimization Algorithms for Structural Design Applications
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.
1996-01-01
Non-linear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Center, a project was initiated to assess the performance of eight different optimizers through the development of a computer code CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using the eight different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from the performance of these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems, however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with the Sequential Unconstrained Minimization Technique (SUMT)) outperformed others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization, and alleviating this discrepancy can improve the efficiency of optimizers.
Performance Trend of Different Algorithms for Structural Design Optimization
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Coroneos, Rula M.; Guptill, James D.; Hopkins, Dale A.
1996-01-01
Nonlinear programming algorithms play an important role in structural design optimization. Fortunately, several algorithms with computer codes are available. At NASA Lewis Research Center, a project was initiated to assess the performance of different optimizers through the development of a computer code CometBoards. This paper summarizes the conclusions of that research. CometBoards was employed to solve sets of small, medium and large structural problems, using different optimizers on a Cray-YMP8E/8128 computer. The reliability and efficiency of the optimizers were determined from the performance of these problems. For small problems, the performance of most of the optimizers could be considered adequate. For large problems, however, three optimizers (two sequential quadratic programming routines, DNCONG of IMSL and SQP of IDESIGN, along with the Sequential Unconstrained Minimization Technique (SUMT)) outperformed others. At optimum, most optimizers captured an identical number of active displacement and frequency constraints but the number of active stress constraints differed among the optimizers. This discrepancy can be attributed to singularity conditions in the optimization, and alleviating this discrepancy can improve the efficiency of optimizers.
NASA Technical Reports Server (NTRS)
Johnston, William E.; Gannon, Dennis; Nitzberg, Bill; Feiereisen, William (Technical Monitor)
2000-01-01
The term "Grid" refers to distributed, high performance computing and data handling infrastructure that incorporates geographically and organizationally dispersed, heterogeneous resources that are persistent and supported. The vision for NASN's Information Power Grid - a computing and data Grid - is that it will provide significant new capabilities to scientists and engineers by facilitating routine construction of information based problem solving environments / frameworks that will knit together widely distributed computing, data, instrument, and human resources into just-in-time systems that can address complex and large-scale computing and data analysis problems. IPG development and deployment is addressing requirements obtained by analyzing a number of different application areas, in particular from the NASA Aero-Space Technology Enterprise. This analysis has focussed primarily on two types of users: The scientist / design engineer whose primary interest is problem solving (e.g., determining wing aerodynamic characteristics in many different operating environments), and whose primary interface to IPG will be through various sorts of problem solving frameworks. The second type of user if the tool designer: The computational scientists who convert physics and mathematics into code that can simulate the physical world. These are the two primary users of IPG, and they have rather different requirements. This paper describes the current state of IPG (the operational testbed), the set of capabilities being put into place for the operational prototype IPG, as well as some of the longer term R&D tasks.
Sparse deconvolution for the large-scale ill-posed inverse problem of impact force reconstruction
NASA Astrophysics Data System (ADS)
Qiao, Baijie; Zhang, Xingwu; Gao, Jiawei; Liu, Ruonan; Chen, Xuefeng
2017-01-01
Most previous regularization methods for the inverse problem of force reconstruction minimize the l2-norm of the desired force. However, these traditional regularization methods, such as Tikhonov regularization and truncated singular value decomposition, commonly fail to solve the large-scale ill-posed inverse problem at moderate computational cost. In this paper, taking into account the sparse character of impact forces, the idea of sparse deconvolution is first introduced to the field of impact force reconstruction and a general sparse deconvolution model of impact force is constructed. Second, a novel impact force reconstruction method based on the primal-dual interior point method (PDIPM) is proposed to solve such a large-scale sparse deconvolution model, where minimizing the l2-norm is replaced by minimizing the l1-norm. Meanwhile, the preconditioned conjugate gradient algorithm is used to compute the search direction of PDIPM with high computational efficiency. Finally, two experiments, including small- and medium-scale single impact force reconstruction and relatively large-scale consecutive impact force reconstruction, are conducted on a composite wind turbine blade and a shell structure to illustrate the advantage of PDIPM. Compared with Tikhonov regularization, PDIPM is more efficient, accurate and robust, whether in single impact force reconstruction or in consecutive impact force reconstruction.
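To illustrate the sparse (l1) formulation, and not the authors' PDIPM solver, the following sketch reconstructs a sparse force with a simple proximal-gradient (ISTA) iteration on a toy Toeplitz convolution model; the impulse response, regularization weight and noise level are assumptions.

```python
# Hedged sketch of l1-based impact-force deconvolution via ISTA (a stand-in
# for the paper's primal-dual interior point method).  H, lam and the noise
# are toy assumptions used only to show the sparse formulation.
import numpy as np
from scipy.linalg import toeplitz

def ista_deconvolve(H, y, lam=0.1, n_iter=500):
    """Minimize 0.5*||H f - y||_2^2 + lam*||f||_1 by proximal gradient."""
    step = 1.0 / np.linalg.norm(H, 2) ** 2        # 1/L with L = ||H||_2^2
    f = np.zeros(H.shape[1])
    for _ in range(n_iter):
        grad = H.T @ (H @ f - y)
        z = f - step * grad
        f = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return f

# Toy impulse response and two sparse impacts.
h = np.exp(-np.arange(200) / 20.0) * np.sin(np.arange(200) / 5.0)
H = toeplitz(np.r_[h, np.zeros(200)], np.zeros(200))
f_true = np.zeros(200)
f_true[[40, 120]] = [1.0, 0.6]
y = H @ f_true + 0.01 * np.random.default_rng(0).standard_normal(H.shape[0])
f_hat = ista_deconvolve(H, y)
```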
NASA Technical Reports Server (NTRS)
Nguyen, D. T.; Watson, Willie R. (Technical Monitor)
2005-01-01
The overall objectives of this research are to formulate and validate efficient parallel algorithms, and to efficiently design and implement computer software, for solving large-scale acoustic problems arising from the unified framework of finite element procedures. The adopted parallel Finite Element (FE) Domain Decomposition (DD) procedures should fully take advantage of the multiple processing capabilities offered by most modern high-performance computing platforms for efficient parallel computation. To achieve this objective, the formulation needs to integrate efficient sparse (and dense) assembly techniques, hybrid (or mixed) direct and iterative equation solvers, appropriate preconditioning strategies, unrolling strategies, and effective inter-processor communication schemes. Finally, the numerical performance of the developed parallel finite element procedures will be evaluated by solving a series of structural and acoustic (symmetric and unsymmetric) problems on different computing platforms. Comparisons with existing "commercial" and/or "public domain" software are also included, whenever possible.
Multiscale solvers and systematic upscaling in computational physics
NASA Astrophysics Data System (ADS)
Brandt, A.
2005-07-01
Multiscale algorithms can overcome the scale-born bottlenecks that plague most computations in physics. These algorithms employ separate processing at each scale of the physical space, combined with interscale iterative interactions, in ways which use finer scales very sparingly. Having been developed first and well known as multigrid solvers for partial differential equations, highly efficient multiscale techniques have more recently been developed for many other types of computational tasks, including: inverse PDE problems; highly indefinite (e.g., standing wave) equations; Dirac equations in disordered gauge fields; fast computation and updating of large determinants (as needed in QCD); fast integral transforms; integral equations; astrophysics; molecular dynamics of macromolecules and fluids; many-atom electronic structures; global and discrete-state optimization; practical graph problems; image segmentation and recognition; tomography (medical imaging); fast Monte-Carlo sampling in statistical physics; and general, systematic methods of upscaling (accurate numerical derivation of large-scale equations from microscopic laws).
Linear solver performance in elastoplastic problem solution on GPU cluster
NASA Astrophysics Data System (ADS)
Khalevitsky, Yu. V.; Konovalov, A. V.; Burmasheva, N. V.; Partin, A. S.
2017-12-01
Applying the finite element method to severe plastic deformation problems involves solving linear equation systems. While the solution procedure is relatively hard to parallelize and computationally intensive by itself, a long series of large-scale systems needs to be solved for each problem. When dealing with fine computational meshes, such as in simulations of three-dimensional metal matrix composite microvolume deformation, tens to hundreds of hours may be needed to complete the whole solution procedure, even using modern supercomputers. In general, one of the preconditioned Krylov subspace methods is used in a linear solver for such problems. The convergence of these methods depends strongly on the spectrum of the problem's stiffness matrix. In order to choose the appropriate method, a series of computational experiments is used. Different methods may be preferable on different computational systems for the same problem. In this paper we present experimental data obtained by solving linear equation systems from an elastoplastic problem on a GPU cluster. The data can be used to substantiate the choice of the appropriate method for a linear solver to use in severe plastic deformation simulations.
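The kind of benchmarking experiment described can be sketched on a CPU as follows, timing one candidate combination (conjugate gradients with an incomplete-LU preconditioner) on a toy Poisson-like matrix; the GPU-cluster setting and the elastoplastic stiffness matrices of the paper are not reproduced here.

```python
# Hedged CPU-side sketch: time a preconditioned Krylov solve and count its
# iterations for one candidate method.  The test matrix is a 2D Poisson-like
# system, a stand-in for an elastoplastic stiffness matrix.
import time
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 100
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.eye(n), T) + sp.kron(T, sp.eye(n))).tocsc()   # 2D Laplacian
b = np.ones(A.shape[0])

ilu = spla.spilu(A, drop_tol=1e-3)
M = spla.LinearOperator(A.shape, matvec=ilu.solve)            # ILU preconditioner

iterations = []
t0 = time.perf_counter()
x, info = spla.cg(A, b, M=M, callback=lambda xk: iterations.append(1))
print(info, len(iterations), time.perf_counter() - t0)
```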
Computer problem-solving coaches for introductory physics: Design and usability studies
NASA Astrophysics Data System (ADS)
Ryan, Qing X.; Frodermann, Evan; Heller, Kenneth; Hsu, Leonardo; Mason, Andrew
2016-06-01
The combination of modern computing power, the interactivity of web applications, and the flexibility of object-oriented programming may finally be sufficient to create computer coaches that can help students develop metacognitive problem-solving skills, an important competence in our rapidly changing technological society. However, no matter how effective such coaches might be, they will only be useful if they are attractive to students. We describe the design and testing of a set of web-based computer programs that act as personal coaches to students while they practice solving problems from introductory physics. The coaches are designed to supplement regular human instruction, giving students access to effective forms of practice outside class. We present results from large-scale usability tests of the computer coaches and discuss their implications for future versions of the coaches.
The role of metadata in managing large environmental science datasets. Proceedings
DOE Office of Scientific and Technical Information (OSTI.GOV)
Melton, R.B.; DeVaney, D.M.; French, J. C.
1995-06-01
The purpose of this workshop was to bring together computer science researchers and environmental sciences data management practitioners to consider the role of metadata in managing large environmental sciences datasets. The objectives included: establishing a common definition of metadata; identifying categories of metadata; defining problems in managing metadata; and defining problems related to linking metadata with primary data.
NASA Astrophysics Data System (ADS)
Rolla, L. Barrera; Rice, H. J.
2006-09-01
In this paper a "forward-advancing" field discretization method suitable for solving the Helmholtz equation in large-scale problems is proposed. The forward wave expansion method (FWEM) is derived from a highly efficient discretization procedure based on interpolation of wave functions known as the wave expansion method (WEM). The FWEM computes the propagated sound field by means of an exclusively forward advancing solution, neglecting the backscattered field. It is thus analogous to methods such as the (one way) parabolic equation method (PEM) (usually discretized using standard finite difference or finite element methods). These techniques do not require the inversion of large system matrices and thus enable the solution of large-scale acoustic problems where backscatter is not of interest. Calculations using FWEM are presented for two propagation problems and comparisons to data computed with analytical and theoretical solutions and show this forward approximation to be highly accurate. Examples of sound propagation over a screen in upwind and downwind refracting atmospheric conditions at low nodal spacings (0.2 per wavelength in the propagation direction) are also included to demonstrate the flexibility and efficiency of the method.
Reconfigurable Computing for Computational Science: A New Focus in High Performance Computing
2006-11-01
... in the past decade. Researchers are regularly employing the power of large computing systems and parallel processing to tackle larger and more complex problems in all of the physical sciences. For the past decade or so, most of this growth in computing power has been "free" with increased ... the scientific computing community as a means to continued growth in computing capability. This paper offers a glimpse of the hardware and ...
GLAD: a system for developing and deploying large-scale bioinformatics grid.
Teo, Yong-Meng; Wang, Xianbing; Ng, Yew-Kwong
2005-03-01
Grid computing is used to solve large-scale bioinformatics problems with gigabyte-scale databases by distributing the computation across multiple platforms. Until now, in developing bioinformatics grid applications, it has been extremely tedious to design and implement the component algorithms and parallelization techniques for different classes of problems, and to access remotely located sequence database files of varying formats across the grid. In this study, we propose a grid programming toolkit, GLAD (Grid Life sciences Applications Developer), which facilitates the development and deployment of bioinformatics applications on a grid. GLAD has been developed using ALiCE (Adaptive scaLable Internet-based Computing Engine), a Java-based grid middleware that exploits task-based parallelism. Two bioinformatics benchmark applications, distributed sequence comparison and distributed progressive multiple sequence alignment, have been developed using GLAD.
Applications of Parallel Process HiMAP for Large Scale Multidisciplinary Problems
NASA Technical Reports Server (NTRS)
Guruswamy, Guru P.; Potsdam, Mark; Rodriguez, David; Kwak, Dochan (Technical Monitor)
2000-01-01
HiMAP is a three level parallel middleware that can be interfaced to a large scale global design environment for code independent, multidisciplinary analysis using high fidelity equations. Aerospace technology needs are rapidly changing. Computational tools compatible with the requirements of national programs such as space transportation are needed. Conventional computation tools are inadequate for modern aerospace design needs. Advanced, modular computational tools are needed, such as those that incorporate the technology of massively parallel processors (MPP).
Exploring the Universe with WISE and Cloud Computing
NASA Technical Reports Server (NTRS)
Benford, Dominic J.
2011-01-01
WISE is a recently-completed astronomical survey mission that has imaged the entire sky in four infrared wavelength bands. The large quantity of science images returned consists of 2,776,922 individual snapshots in various locations in each band which, along with ancillary data, totals around 110TB of raw, uncompressed data. Making the most use of this data requires advanced computing resources. I will discuss some initial attempts in the use of cloud computing to make this large problem tractable.
Discrete square root smoothing.
NASA Technical Reports Server (NTRS)
Kaminski, P. G.; Bryson, A. E., Jr.
1972-01-01
The basic techniques applied in the square root least squares and square root filtering solutions are applied to the smoothing problem. Both conventional and square root solutions are obtained by computing the filtered solutions, then modifying the results to include the effect of all measurements. A comparison of computation requirements indicates that the square root information smoother (SRIS) is more efficient than conventional solutions in a large class of fixed interval smoothing problems.
1986-08-01
... (for a large subset of real images), and so most of the algorithms fail when applied to real images. (2) Usually the constraints from the geometry and the physics of the problem are not enough to guarantee uniqueness of the computed parameters. In this case, strong ...
Architecture independent environment for developing engineering software on MIMD computers
NASA Technical Reports Server (NTRS)
Valimohamed, Karim A.; Lopez, L. A.
1990-01-01
Engineers are constantly faced with solving problems of increasing complexity and detail. Multiple Instruction stream Multiple Data stream (MIMD) computers have been developed to overcome the performance limitations of serial computers. The hardware architectures of MIMD computers vary considerably and are much more sophisticated than serial computers. Developing large scale software for a variety of MIMD computers is difficult and expensive. There is a need to provide tools that facilitate programming these machines. First, the issues that must be considered to develop those tools are examined. The two main areas of concern were architecture independence and data management. Architecture independent software facilitates software portability and improves the longevity and utility of the software product. It provides some form of insurance for the investment of time and effort that goes into developing the software. The management of data is a crucial aspect of solving large engineering problems. It must be considered in light of the new hardware organizations that are available. Second, the functional design and implementation of a software environment that facilitates developing architecture independent software for large engineering applications are described. The topics of discussion include: a description of the model that supports the development of architecture independent software; identifying and exploiting concurrency within the application program; data coherence; engineering data base and memory management.
Self-Scheduling Parallel Methods for Multiple Serial Codes with Application to WOPWOP
NASA Technical Reports Server (NTRS)
Long, Lyle N.; Brentner, Kenneth S.
2000-01-01
This paper presents a scheme for efficiently running a large number of serial jobs on parallel computers. Two examples are given of computer programs that run relatively quickly, but often they must be run numerous times to obtain all the results needed. It is very common in science and engineering to have codes that are not massive computing challenges in themselves, but due to the number of instances that must be run, they do become large-scale computing problems. The two examples given here represent common problems in aerospace engineering: aerodynamic panel methods and aeroacoustic integral methods. The first example simply solves many systems of linear equations. This is representative of an aerodynamic panel code where someone would like to solve for numerous angles of attack. The complete code for this first example is included in the appendix so that it can be readily used by others as a template. The second example is an aeroacoustics code (WOPWOP) that solves the Ffowcs Williams Hawkings equation to predict the far-field sound due to rotating blades. In this example, one quite often needs to compute the sound at numerous observer locations, hence parallelization is utilized to automate the noise computation for a large number of observers.
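The self-scheduling idea can be sketched with a worker pool pulling independent serial jobs until the queue is empty; multiprocessing stands in for the MPI-based scheduler used in the paper, and the random "influence matrix" below is a placeholder for a real panel code.

```python
# Hedged sketch: many independent serial jobs (panel-method-like solves at
# different angles of attack) farmed out to a pool of workers.  The linear
# system is a random stand-in, not an aerodynamic panel code.
import numpy as np
from multiprocessing import Pool

N = 500                                        # number of panels (toy size)
rng = np.random.default_rng(0)
A = rng.random((N, N)) + N * np.eye(N)         # diagonally dominant stand-in

def solve_one_alpha(alpha_deg):
    """One 'serial job': the right-hand side depends on the angle of attack."""
    alpha = np.deg2rad(alpha_deg)
    rhs = np.sin(alpha) * np.ones(N)           # placeholder boundary condition
    gamma = np.linalg.solve(A, rhs)
    return alpha_deg, float(gamma.sum())       # e.g. a lift-like summary value

if __name__ == "__main__":
    with Pool() as pool:                       # workers self-schedule from the queue
        results = pool.map(solve_one_alpha, range(-10, 21))
    print(results[:3])
```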
Parallel computing in enterprise modeling.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goldsby, Michael E.; Armstrong, Robert C.; Shneider, Max S.
2008-08-01
This report presents the results of our efforts to apply high-performance computing to entity-based simulations with a multi-use plugin for parallel computing. We use the term 'Entity-based simulation' to describe a class of simulation which includes both discrete event simulation and agent based simulation. What simulations of this class share, and what differs from more traditional models, is that the result sought is emergent from a large number of contributing entities. Logistic, economic and social simulations are members of this class where things or people are organized or self-organize to produce a solution. Entity-based problems never have an a priori ergodic principle that will greatly simplify calculations. Because the results of entity-based simulations can only be realized at scale, scalable computing is de rigueur for large problems. Having said that, the absence of a spatial organizing principal makes the decomposition of the problem onto processors problematic. In addition, practitioners in this domain commonly use the Java programming language which presents its own problems in a high-performance setting. The plugin we have developed, called the Parallel Particle Data Model, overcomes both of these obstacles and is now being used by two Sandia frameworks: the Decision Analysis Center, and the Seldon social simulation facility. While the ability to engage U.S.-sized problems is now available to the Decision Analysis Center, this plugin is central to the success of Seldon. Because Seldon relies on computationally intensive cognitive sub-models, this work is necessary to achieve the scale necessary for realistic results. With the recent upheavals in the financial markets, and the inscrutability of terrorist activity, this simulation domain will likely need a capability with ever greater fidelity. High-performance computing will play an important part in enabling that greater fidelity.
NASA Technical Reports Server (NTRS)
Merchant, D. H.; Gates, R. M.; Straayer, J. W.
1975-01-01
The effect of localized structural damping on the excitability of higher-order large space telescope spacecraft modes is investigated. A preprocessor computer program is developed to incorporate Voigt structural joint damping models in a finite-element dynamic model. A postprocessor computer program is developed to select critical modes for low-frequency attitude control problems and for higher-frequency fine-stabilization problems. The selection is accomplished by ranking the flexible modes based on coefficients for rate gyro, position gyro, and optical sensor, and on image-plane motions due to sinusoidal or random PSD force and torque inputs.
Distributed intrusion detection system based on grid security model
NASA Astrophysics Data System (ADS)
Su, Jie; Liu, Yahui
2008-03-01
Grid computing has developed rapidly with the development of network technology, and it can solve the problem of large-scale complex computing by sharing large-scale computing resources. In a grid environment, we can realize a distributed and load-balanced intrusion detection system. This paper first discusses the security mechanism in grid computing and the function of PKI/CA in the grid security system, then describes how grid computing characteristics can be applied in a distributed intrusion detection system (IDS) based on an Artificial Immune System. Finally, it presents a distributed intrusion detection system based on the grid security model that can reduce the processing delay and assure the detection rates.
GAP Noise Computation By The CE/SE Method
NASA Technical Reports Server (NTRS)
Loh, Ching Y.; Chang, Sin-Chung; Wang, Xiao Y.; Jorgenson, Philip C. E.
2001-01-01
A typical gap noise problem is considered in this paper using the new space-time conservation element and solution element (CE/SE) method. Implementation of the computation is straightforward. No turbulence model, LES (large eddy simulation) or a preset boundary layer profile is used, yet the computed frequency agrees well with the experimental one.
Steinwand, Daniel R.; Maddox, Brian; Beckmann, Tim; Hamer, George
2003-01-01
Beowulf clusters can provide a cost-effective way to compute numerical models and process large amounts of remote sensing image data. Usually a Beowulf cluster is designed to accomplish a specific set of processing goals, and processing is very efficient when the problem remains inside the constraints of the original design. There are cases, however, when one might wish to compute a problem that is beyond the capacity of the local Beowulf system. In these cases, spreading the problem to multiple clusters or to other machines on the network may provide a cost-effective solution.
DESCRIPTION OF THE ENIAC CONVERTER CODE
The report is intended as a working manual for personnel preparing problems for the ENIAC. It should also serve as a guide to those groups who have ... computing problems that could be solved on the ENIAC. The report discusses the ENIAC from the point of view of the coder, describing its memory as well ... accomplishes as well as how to use each instruction. A few remarks are made on the more general subject of problem preparation for large-scale computers in general, based on the experience of operating the ENIAC. (Author)
NASA Astrophysics Data System (ADS)
Manfredi, Sabato
2016-06-01
Large-scale dynamic systems are becoming highly pervasive, with applications ranging from systems biology, environment monitoring and sensor networks to power systems. They are characterised by high dimensionality, complexity, and uncertainty in the node dynamics/interactions, and they require increasingly computationally demanding methods for their analysis and control design as the network size and the node system/interaction complexity increase. Therefore, it is a challenging problem to find scalable computational methods for the distributed control design of large-scale networks. In this paper, we investigate the robust distributed stabilisation problem of large-scale nonlinear multi-agent systems (MASs) composed of non-identical (heterogeneous) linear dynamical systems coupled by uncertain nonlinear time-varying interconnections. By employing Lyapunov stability theory and the linear matrix inequality (LMI) technique, new conditions are given for the distributed control design of large-scale MASs that can be easily solved with the MATLAB toolbox. Stabilisability of each node dynamic is a sufficient assumption to design a globally stabilising distributed control. The proposed approach improves some of the existing LMI-based results on MASs by both overcoming their computational limits and extending the applicative scenario to large-scale nonlinear heterogeneous MASs. Additionally, the proposed LMI conditions are further reduced in terms of computational requirements in the case of weakly heterogeneous MASs, which is a common scenario in real applications where the network nodes and links are affected by parameter uncertainties. One of the main advantages of the proposed approach is that it allows moving from a centralised towards a distributed computing architecture, so that the expensive computational workload spent solving LMIs may be shared among processors located at the networked nodes, thus increasing the scalability of the approach with the network size. Finally, a numerical example shows the applicability of the proposed method and its advantage in terms of computational complexity when compared with the existing approaches.
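A minimal example of the class of LMI feasibility problems referred to above (not the paper's distributed MAS conditions, and using CVXPY rather than the MATLAB toolbox) is the Lyapunov condition: find P > 0 such that A^T P + P A < 0.

```python
# Hedged sketch: a minimal Lyapunov LMI feasibility problem solved with CVXPY.
# It only illustrates the class of LMIs involved; the paper's distributed MAS
# conditions are larger and solved with the MATLAB LMI toolbox.
import numpy as np
import cvxpy as cp

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])                   # a stable test matrix (assumption)
n = A.shape[0]
eps = 1e-6

P = cp.Variable((n, n), symmetric=True)
constraints = [P >> eps * np.eye(n),                      # P positive definite
               A.T @ P + P @ A << -eps * np.eye(n)]       # Lyapunov inequality
prob = cp.Problem(cp.Minimize(0), constraints)
prob.solve()
print(prob.status, P.value)
```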
Statistical physics of hard combinatorial optimization: Vertex cover problem
NASA Astrophysics Data System (ADS)
Zhao, Jin-Hua; Zhou, Hai-Jun
2014-07-01
Typical-case computational complexity is a research topic at the boundary of computer science, applied mathematics, and statistical physics. In the last twenty years, the replica-symmetry-breaking mean field theory of spin glasses and the associated message-passing algorithms have greatly deepened our understanding of typical-case computational complexity. In this paper, we use the vertex cover problem, a basic nondeterministic-polynomial (NP)-complete combinatorial optimization problem of wide application, as an example to introduce the statistical physics methods and algorithms. We do not go into the technical details but emphasize mainly the intuitive physical meanings of the message-passing equations. An unfamiliar reader should be able to understand, to a large extent, the physics behind the mean field approaches and to adapt the mean field methods to solving other optimization problems.
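For readers unfamiliar with the problem itself, a minimal baseline is the classical maximal-matching 2-approximation, shown below; it is not the message-passing machinery discussed in the paper.

```python
# Hedged illustration of the vertex cover problem: the classical maximal-
# matching 2-approximation (take both endpoints of any uncovered edge).
def vertex_cover_2approx(edges):
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))        # cover this edge with both endpoints
    return cover

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
print(vertex_cover_2approx(edges))      # a cover of size at most 2x the optimum
```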
AI tools in computer based problem solving
NASA Technical Reports Server (NTRS)
Beane, Arthur J.
1988-01-01
The use of computers to solve value-oriented, deterministic, algorithmic problems has evolved a structured life-cycle model of the software process. The symbolic processing techniques used, primarily in research, for solving nondeterministic problems, and those for which an algorithmic solution is unknown, have evolved a different, much less structured model. Traditionally, the two approaches have been used completely independently. With the advent of low-cost, high-performance 32-bit workstations executing the same software as large minicomputers and mainframes, it became possible to begin to merge both models into a single extended model of computer problem solving. The implementation of such an extended model on a VAX family of micro/mini/mainframe systems is described. Examples in both development and deployment of applications involving a blending of AI and traditional techniques are given.
Computation of Pressurized Gas Bearings Using CE/SE Method
NASA Technical Reports Server (NTRS)
Cioc, Sorin; Dimofte, Florin; Keith, Theo G., Jr.; Fleming, David P.
2003-01-01
The space-time conservation element and solution element (CE/SE) method is extended to compute compressible viscous flows in pressurized thin fluid films. This numerical scheme has previously been used successfully to solve a wide variety of compressible flow problems, including flows with large and small discontinuities. In this paper, the method is applied to calculate the pressure distribution in a hybrid gas journal bearing. The formulation of the problem is presented, including the modeling of the feeding system. The numerical results obtained are compared with experimental data. Good agreement between the computed results and the test data was obtained, thus validating the CE/SE method for solving such problems.
Portable parallel stochastic optimization for the design of aeropropulsion components
NASA Technical Reports Server (NTRS)
Sues, Robert H.; Rhodes, G. S.
1994-01-01
This report presents the results of Phase 1 research to develop a methodology for performing large-scale Multi-disciplinary Stochastic Optimization (MSO) for the design of aerospace systems ranging from aeropropulsion components to complete aircraft configurations. The current research recognizes that such design optimization problems are computationally expensive and require the use of either massively parallel or multiple-processor computers. The methodology also recognizes that many operational and performance parameters are uncertain, and that uncertainty must be considered explicitly to achieve optimum performance and cost. The objective of this Phase 1 research was to initiate the development of an MSO methodology that is portable to a wide variety of hardware platforms, while achieving efficient, large-scale parallelism when multiple processors are available. The first effort in the project was a literature review of available computer hardware, as well as of portable, parallel programming environments. The second effort was to implement the MSO methodology for an example problem using the portable parallel programming environment Parallel Virtual Machine (PVM). The third and final effort was to demonstrate the example on a variety of computers, including a distributed-memory multiprocessor, a distributed-memory network of workstations, and a single-processor workstation. Results indicate that the MSO methodology is well suited to large-scale aerospace design problems. Nearly perfect linear speedup was demonstrated for computation of optimization sensitivity coefficients on both a 128-node distributed-memory multiprocessor (the Intel iPSC/860) and a network of workstations (speedups of almost 19 times achieved for 20 workstations). Very high parallel efficiencies (75 percent for 31 processors and 60 percent for 50 processors) were also achieved for computation of aerodynamic influence coefficients on the Intel. Finally, the multi-level parallelization strategy that will be needed for large-scale MSO problems was demonstrated to be highly efficient. The same parallel code instructions were used on both platforms, demonstrating portability. There are many applications for which MSO can be applied, including NASA's High-Speed Civil Transport and advanced propulsion systems. The use of MSO will reduce design and development time and testing costs dramatically.
NASA Astrophysics Data System (ADS)
Titeux, Isabelle; Li, Yuming M.; Debray, Karl; Guo, Ying-Qiao
2004-11-01
This Note deals with an efficient algorithm to carry out the plastic integration and compute the stresses due to large strains for materials satisfying Hill's anisotropic yield criterion. Classical plastic-integration algorithms such as the 'Return Mapping Method' are widely used for nonlinear analyses of structures and numerical simulations of forming processes, but they require an iterative scheme and may have convergence problems. A new direct algorithm based on a scalar method is developed which allows the plastic multiplier to be obtained directly without an iterative procedure; thus the computation time is greatly reduced and the numerical problems are avoided. To cite this article: I. Titeux et al., C. R. Mecanique 332 (2004).
A Comparison of Solver Performance for Complex Gastric Electrophysiology Models
Sathar, Shameer; Cheng, Leo K.; Trew, Mark L.
2016-01-01
Computational techniques for solving the systems of equations arising in gastric electrophysiology have not been systematically studied for solution efficiency. We present a computationally challenging problem of simulating gastric electrophysiology in anatomically realistic stomach geometries with multiple intracellular and extracellular domains. The multiscale nature of the problem and the mesh resolution required to capture geometric and functional features necessitate efficient solution methods if the problem is to be tractable. In this study, we investigated and compared several parallel preconditioners for the linear systems arising from tetrahedral discretisation of electrically isotropic and anisotropic problems, with and without stimuli. The results showed that the isotropic problem was computationally less challenging than the anisotropic problem and that the application of extracellular stimuli increased the workload considerably. Preconditioners based on block Jacobi and algebraic multigrid were found to give the best overall solution times and the lowest iteration counts, respectively. The algebraic multigrid preconditioner would be expected to perform better on large problems. PMID:26736543
Madni, Syed Hamid Hussain; Abd Latiff, Muhammad Shafie; Abdullahi, Mohammed; Abdulhamid, Shafi'i Muhammad; Usman, Mohammed Joda
2017-01-01
Cloud computing infrastructure is suitable for meeting computational needs of large task sizes. Optimal scheduling of tasks in cloud computing environment has been proved to be an NP-complete problem, hence the need for the application of heuristic methods. Several heuristic algorithms have been developed and used in addressing this problem, but choosing the appropriate algorithm for solving task assignment problem of a particular nature is difficult since the methods are developed under different assumptions. Therefore, six rule based heuristic algorithms are implemented and used to schedule autonomous tasks in homogeneous and heterogeneous environments with the aim of comparing their performance in terms of cost, degree of imbalance, makespan and throughput. First Come First Serve (FCFS), Minimum Completion Time (MCT), Minimum Execution Time (MET), Max-min, Min-min and Sufferage are the heuristic algorithms considered for the performance comparison and analysis of task scheduling in cloud computing.
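One of the listed heuristics, Min-min, can be sketched as follows on a random expected-time-to-compute (ETC) matrix; the task/machine sizes and the makespan metric are toy assumptions rather than the experimental setup of this work.

```python
# Hedged sketch of the Min-min heuristic on a random ETC matrix.
import numpy as np

def min_min(etc):
    """etc[i, j]: expected run time of task i on machine j."""
    n_tasks, n_machines = etc.shape
    ready = np.zeros(n_machines)                  # machine ready times
    unscheduled = set(range(n_tasks))
    assignment = {}
    while unscheduled:
        # minimum completion time of every unscheduled task over all machines
        best = [(min((ready[j] + etc[i, j], j) for j in range(n_machines)), i)
                for i in unscheduled]
        (ct, j), i = min(best)                    # task with the smallest minimum completion time
        assignment[i] = j
        ready[j] = ct
        unscheduled.remove(i)
    return assignment, ready.max()                # schedule and makespan

etc = np.random.default_rng(0).uniform(1, 10, size=(20, 4))   # 20 tasks, 4 machines
schedule, makespan = min_min(etc)
print(makespan)
```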
Large-eddy simulation of a boundary layer with concave streamwise curvature
NASA Technical Reports Server (NTRS)
Lund, Thomas S.
1994-01-01
Turbulence modeling continues to be one of the most difficult problems in fluid mechanics. Existing prediction methods are well developed for certain classes of simple equilibrium flows, but are still not entirely satisfactory for a large category of complex non-equilibrium flows found in engineering practice. Direct and large-eddy simulation (LES) approaches have long been believed to have great potential for the accurate prediction of difficult turbulent flows, but the associated computational cost has been prohibitive for practical problems. This remains true for direct simulation but is no longer clear for large-eddy simulation. Advances in computer hardware, numerical methods, and subgrid-scale modeling have made it possible to conduct LES for flows of practical interest at Reynolds numbers in the range of laboratory experiments. The objective of this work is to apply LES and the dynamic subgrid-scale model to the flow of a boundary layer over a concave surface.
Optimization of Angular-Momentum Biases of Reaction Wheels
NASA Technical Reports Server (NTRS)
Lee, Clifford; Lee, Allan
2008-01-01
RBOT [RWA Bias Optimization Tool (wherein RWA signifies Reaction Wheel Assembly)] is a computer program designed for computing angular-momentum biases for the reaction wheels used to point a spacecraft in the various directions required for scientific observations. RBOT is currently deployed to support the Cassini mission to prevent operation of reaction wheels at unsafely high speeds while minimizing time spent in the undesirable low-speed range, where elasto-hydrodynamic lubrication films in the bearings become ineffective, leading to premature bearing failure. The problem is formulated as a constrained optimization in which the maximum wheel speed limit is a hard constraint and a cost functional penalizes operation below a low-speed threshold, increasing as the speed decreases. The optimization problem is solved using a parametric search routine known as the Nelder-Mead simplex algorithm. To increase computational efficiency for extended operation involving large quantities of data, the algorithm is designed to (1) use large time increments during intervals when spacecraft attitudes or rates of rotation are nearly stationary, (2) use sinusoidal-approximation sampling to model repeated long periods of Earth-point rolling maneuvers to reduce computational loads, and (3) utilize an efficient equation to obtain wheel-rate profiles as functions of initial wheel biases based on conservation of angular momentum (in an inertial frame) using pre-computed terms.
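The following sketch illustrates the penalty-plus-Nelder-Mead idea in SciPy rather than RBOT itself; the wheel-rate model, speed thresholds, and momentum histories are invented placeholders.

```python
import numpy as np
from scipy.optimize import minimize

W_MAX = 2000.0   # hard wheel-speed limit, rpm (illustrative)
W_LOW = 300.0    # undesirable low-speed threshold, rpm (illustrative)

def wheel_rates(bias, momentum_history):
    """Hypothetical stand-in for the conservation-of-momentum wheel-rate profile:
    rate(t) = initial bias + projected momentum history (rpm)."""
    return bias + momentum_history

def cost(biases, histories):
    total = 0.0
    for bias, hist in zip(biases, histories):
        rates = np.abs(wheel_rates(bias, hist))
        over = rates > W_MAX
        if np.any(over):                      # hard constraint handled as a large penalty
            total += 1e6 * np.sum(rates[over] - W_MAX)
        low = rates < W_LOW                   # soft cost grows as speed drops below the threshold
        total += np.sum((W_LOW - rates[low]) ** 2)
    return total

# Made-up momentum histories for three wheels over a maneuver timeline.
rng = np.random.default_rng(0)
histories = [500.0 * np.cumsum(rng.standard_normal(200)) / 20.0 for _ in range(3)]
result = minimize(cost, x0=np.array([800.0, 800.0, 800.0]),
                  args=(histories,), method="Nelder-Mead")
print(result.x, result.fun)
```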
NASA Technical Reports Server (NTRS)
Vanderplaats, Garrett; Townsend, James C. (Technical Monitor)
2002-01-01
The purpose of this research under the NASA Small Business Innovative Research program was to develop algorithms and associated software to solve very large nonlinear, constrained optimization tasks. Key issues included efficiency, reliability, memory, and gradient calculation requirements. This report describes the general optimization problem, ten candidate methods, and detailed evaluations of four candidates. The algorithm chosen for final development is a modern recreation of a 1960s external penalty function method that uses very limited computer memory and computational time. Although of lower efficiency, the new method can solve problems orders of magnitude larger than current methods. The resulting BIGDOT software has been demonstrated on problems with 50,000 variables and about 50,000 active constraints. For unconstrained optimization, it has solved a problem in excess of 135,000 variables. The method includes a technique for solving discrete variable problems that finds a "good" design, although a theoretical optimum cannot be guaranteed. It is very scalable in that the number of function and gradient evaluations does not change significantly with increased problem size. Test cases are provided to demonstrate the efficiency and reliability of the methods and software.
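To make the exterior penalty idea concrete, here is a minimal sketch on a toy two-variable constrained problem; it is not the BIGDOT implementation, and it uses SciPy's BFGS routine as the inner unconstrained solver rather than the memory-limited scheme described in the report.

```python
import numpy as np
from scipy.optimize import minimize

def objective(x):
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2

def constraints(x):
    # Constraints written as g_i(x) <= 0: x0 + x1 <= 1 and x0 >= 0.
    return np.array([x[0] + x[1] - 1.0, -x[0]])

def exterior_penalty(x0, r0=1.0, growth=10.0, outer_iters=8):
    """Classic exterior penalty: minimize f + r * sum(max(0, g_i)^2) for increasing r."""
    x, r = np.asarray(x0, float), r0
    for _ in range(outer_iters):
        pseudo = lambda x: objective(x) + r * np.sum(np.maximum(0.0, constraints(x)) ** 2)
        x = minimize(pseudo, x, method="BFGS").x   # inner unconstrained solve
        r *= growth                                # tighten the penalty
    return x

print(exterior_penalty([0.0, 0.0]))   # approaches the constrained optimum near (2.5, -1.5)
```

The penalty parameter grows geometrically, so the iterates approach the constrained optimum from the infeasible side, which is the defining behavior of exterior penalty methods.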
Knutsson, A
1986-01-01
For 10 years computerization in industry has advanced at a rapid pace. A problem which has not received attention is that of people with reading and writing difficulties, who face severe problems when they must communicate with a computer via a monitor screen. These individuals are often embarrassed by their difficulties and conceal them from their fellow workers. A number of case studies are described which show the form the problems can take. In one case, an employee was compelled to move from department to department as each was computerized in turn. Computers transform a large number of manual tasks in industry into jobs which call for reading and writing skills. Better education at elementary school and at the workplace in connection with computerization is the most important means of overcoming this problem. Moreover, computer programs could be written in a more human way.
French Meteor Network for High Precision Orbits of Meteoroids
NASA Technical Reports Server (NTRS)
Atreya, P.; Vaubaillon, J.; Colas, F.; Bouley, S.; Gaillard, B.; Sauli, I.; Kwon, M. K.
2011-01-01
There is a lack of precise meteoroid orbits from video observations, as most meteor stations use off-the-shelf CCD cameras. Few meteoroid orbits with precise semi-major axes are available from film photographic methods. Precise orbits are necessary to compute the dust flux in the Earth's vicinity, and to estimate the ejection time of the meteoroids accurately by comparing them with theoretical evolution models. We investigate the use of large CCD sensors to observe multi-station meteors and to compute precise orbits of these meteoroids. The spatial and temporal resolution needed to obtain an accuracy similar to that of photographic plates is discussed. Various problems arising from the use of large CCDs, such as increasing the spatial and the temporal resolution at the same time, and the computational cost of finding the meteor position, are illustrated.
WE-D-303-01: Development and Application of Digital Human Phantoms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Segars, P.
2015-06-15
Modern medical physics deals with complex problems such as 4D radiation therapy and imaging quality optimization. Such problems involve a large number of radiological parameters, and anatomical and physiological breathing patterns. A major challenge is how to develop, test, evaluate and compare various new imaging and treatment techniques, which often involves testing over a large range of radiological parameters as well as varying patient anatomies and motions. It would be extremely challenging, if not impossible, both ethically and practically, to test every combination of parameters and every task on every type of patient under clinical conditions. Computer-based simulation using computational phantoms offers a practical technique with which to evaluate, optimize, and compare imaging technologies and methods. Within simulation, the computerized phantom provides a virtual model of the patient's anatomy and physiology. Imaging data can be generated from it as if it were a live patient using accurate models of the physics of the imaging and treatment process. With sophisticated simulation algorithms, it is possible to perform virtual experiments entirely on the computer. By serving as virtual patients, computational phantoms hold great promise in solving some of the most complex problems in modern medical physics. In this proposed symposium, we will present the history and recent developments of computational phantom models, share experiences in their application to advanced imaging and radiation applications, and discuss their promises and limitations. Learning Objectives: Understand the need and requirements of computational phantoms in medical physics research; Discuss the developments and applications of computational phantoms; Know the promises and limitations of computational phantoms in solving complex problems.
Ordering Unstructured Meshes for Sparse Matrix Computations on Leading Parallel Systems
NASA Technical Reports Server (NTRS)
Oliker, Leonid; Li, Xiaoye; Heber, Gerd; Biswas, Rupak
2000-01-01
The ability of computers to solve hitherto intractable problems and simulate complex processes using mathematical models makes them an indispensable part of modern science and engineering. Computer simulations of large-scale realistic applications usually require solving a set of non-linear partial differential equations (PDEs) over a finite region. For example, one thrust area in the DOE Grand Challenge projects is to design future accelerators such as the Spallation Neutron Source (SNS). Our colleagues at SLAC need to model complex RFQ cavities with large aspect ratios. Unstructured grids are currently used to resolve the small features in a large computational domain; dynamic mesh adaptation will be added in the future for additional efficiency. The PDEs for electromagnetics are discretized by the finite element method (FEM), which leads to a generalized eigenvalue problem Kx = λMx, where K and M are the stiffness and mass matrices, and are very sparse. In a typical cavity model, the number of degrees of freedom is about one million. For such large eigenproblems, direct solution techniques quickly reach the memory limits. Instead, the most widely-used methods are Krylov subspace methods, such as Lanczos or Jacobi-Davidson. In all the Krylov-based algorithms, sparse matrix-vector multiplication (SPMV) must be performed repeatedly. Therefore, the efficiency of SPMV usually determines the eigensolver speed. SPMV is also one of the most heavily used kernels in large-scale numerical simulations.
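Since SPMV dominates the cost of the Krylov iterations mentioned above, a minimal sketch of the compressed sparse row (CSR) kernel may help; the 3 x 3 matrix is a toy example, not a cavity model.

```python
import numpy as np

def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x for A stored in compressed sparse row (CSR) form."""
    n_rows = len(row_ptr) - 1
    y = np.zeros(n_rows)
    for i in range(n_rows):
        start, stop = row_ptr[i], row_ptr[i + 1]
        y[i] = np.dot(values[start:stop], x[col_idx[start:stop]])
    return y

# Toy 3x3 sparse matrix: [[4, 0, 1], [0, 3, 0], [2, 0, 5]]
values  = np.array([4.0, 1.0, 3.0, 2.0, 5.0])
col_idx = np.array([0, 2, 1, 0, 2])
row_ptr = np.array([0, 2, 3, 5])
x = np.array([1.0, 2.0, 3.0])
print(spmv_csr(values, col_idx, row_ptr, x))   # expect [ 7.  6. 17.]
```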
Analysis of Computer Network Information Based on "Big Data"
NASA Astrophysics Data System (ADS)
Li, Tianli
2017-11-01
With the development of the current era, computer networks and big data have gradually become part of people's lives. People use computers to bring convenience to their lives, but at the same time there are many network information security problems that demand attention. This paper analyzes the information security of computer networks based on "big data" and puts forward some solutions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Finch, D.R.; Chandler, J.R.; Church, J.P.
1979-01-01
The SHIELD system is a powerful new computational tool for calculation of isotopic inventory, radiation sources, decay heat, and shielding assessment in part of the nuclear fuel cycle. The integrated approach used in this system permits efficient communication and management of large fields of numbers, thus allowing the user to address the technical rather than the computer aspects of a problem. The emphasis on graphical output permits large fields of resulting numbers to be displayed efficiently.
Approximate l-fold cross-validation with Least Squares SVM and Kernel Ridge Regression
DOE Office of Scientific and Technical Information (OSTI.GOV)
Edwards, Richard E; Zhang, Hao; Parker, Lynne Edwards
2013-01-01
Kernel methods have difficulty scaling to large modern data sets. The scalability issues stem from the computational and memory requirements of working with a large kernel matrix. These requirements have been addressed over the years by using low-rank kernel approximations or by improving the scalability of the solvers. However, Least Squares Support Vector Machines (LS-SVM), a popular SVM variant, and Kernel Ridge Regression still have several scalability issues. In particular, the O(n^3) computational complexity for solving a single model, and the overall computational complexity associated with tuning hyperparameters, are still major problems. We address these problems by introducing an O(n log n) approximate l-fold cross-validation method that uses a multi-level circulant matrix to approximate the kernel. In addition, we prove our algorithm's computational complexity and present empirical runtimes on data sets with approximately 1 million data points. We also validate our approximate method's effectiveness at selecting hyperparameters on real world and standard benchmark data sets. Lastly, we provide experimental results on using a multi-level circulant kernel approximation to solve LS-SVM problems with hyperparameters selected using our method.
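For context, the sketch below shows the exact l-fold cross-validation baseline for kernel ridge regression that the circulant-matrix method approximates; it is not the O(n log n) algorithm of the paper, and the data set and hyperparameters are synthetic.

```python
import numpy as np

def rbf_kernel(X1, X2, gamma):
    sq = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kfold_cv_krr(X, y, lam, gamma, k=5, seed=0):
    """Exact k-fold cross-validation error for kernel ridge regression.
    Each fold requires an O(n^3) dense solve, which is what makes exact tuning expensive."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        K = rbf_kernel(X[train], X[train], gamma)
        alpha = np.linalg.solve(K + lam * np.eye(len(train)), y[train])
        pred = rbf_kernel(X[test], X[train], gamma) @ alpha
        errors.append(np.mean((pred - y[test]) ** 2))
    return float(np.mean(errors))

# Toy regression data; hyperparameters would be tuned by repeating this over a grid.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
print(kfold_cv_krr(X, y, lam=1e-2, gamma=0.5))
```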
DOE Office of Scientific and Technical Information (OSTI.GOV)
Munro, J.K. Jr.
1980-05-01
The advent of large, fast computers has opened the way to modeling more complex physical processes and to handling very large quantities of experimental data. The amount of information that can be processed in a short period of time is so great that use of graphical displays assumes greater importance as a means of displaying this information. Information from dynamical processes can be displayed conveniently by use of animated graphics. This guide presents the basic techniques for generating black and white animated graphics, with consideration of aesthetic, mechanical, and computational problems. The guide is intended for use by someone who wants to make movies on the National Magnetic Fusion Energy Computing Center (NMFECC) CDC-7600. Problems encountered by a geographically remote user are given particular attention. Detailed information is given that will allow a remote user to do some file checking and diagnosis before giving graphics files to the system for processing into film, in order to spot problems without having to wait for film to be delivered. Source listings of some useful software are given in appendices along with descriptions of how to use it. 3 figures, 5 tables.
NASA Astrophysics Data System (ADS)
Newman, Gregory A.
2014-01-01
Many geoscientific applications exploit electrostatic and electromagnetic fields to interrogate and map subsurface electrical resistivity—an important geophysical attribute for characterizing mineral, energy, and water resources. In complex three-dimensional geologies, where many of these resources remain to be found, resistivity mapping requires large-scale modeling and imaging capabilities, as well as the ability to treat significant data volumes, which can easily overwhelm single-core and modest multicore computing hardware. To treat such problems requires large-scale parallel computational resources, necessary for reducing the time to solution to a time frame acceptable to the exploration process. The recognition that significant parallel computing processes must be brought to bear on these problems gives rise to choices that must be made in parallel computing hardware and software. In this review, some of these choices are presented, along with the resulting trade-offs. We also discuss future trends in high-performance computing and the anticipated impact on electromagnetic (EM) geophysics. Topics discussed in this review article include a survey of parallel computing platforms, from graphics processing units to multicore CPUs with a fast interconnect, along with parallel solvers and associated solver libraries effective for inductive EM modeling and imaging.
Quantum Heterogeneous Computing for Satellite Positioning Optimization
NASA Astrophysics Data System (ADS)
Bass, G.; Kumar, V.; Dulny, J., III
2016-12-01
Hard optimization problems occur in many fields of academic study and practical situations. We present results in which quantum heterogeneous computing is used to solve a real-world optimization problem: satellite positioning. Optimization problems like this can scale very rapidly with problem size, and become unsolvable with traditional brute-force methods. Typically, such problems have been approximately solved with heuristic approaches; however, these methods can take a long time to calculate and are not guaranteed to find optimal solutions. Quantum computing offers the possibility of producing significant speed-up and improved solution quality. There are now commercially available quantum annealing (QA) devices that are designed to solve difficult optimization problems. These devices have 1000+ quantum bits, but they have significant hardware size and connectivity limitations. We present a novel heterogeneous computing stack that combines QA and classical machine learning and allows the use of QA on problems larger than the quantum hardware could solve in isolation. We begin by analyzing the satellite positioning problem with a heuristic solver, the genetic algorithm. The classical computer's comparatively large available memory can explore the full problem space and converge to a solution relatively close to the true optimum. The QA device can then evolve directly to the optimal solution within this more limited space. Preliminary experiments, using the Quantum Monte Carlo (QMC) algorithm to simulate QA hardware, have produced promising results. Working with problem instances with known global minima, we find a solution within 8% in a matter of seconds, and within 5% in a few minutes. Future studies include replacing QMC with commercially available quantum hardware and exploring more problem sets and model parameters. Our results have important implications for how heterogeneous quantum computing can be used to solve difficult optimization problems in any field.
NASA Astrophysics Data System (ADS)
Hreniuc, V.; Hreniuc, A.; Pescaru, A.
2017-08-01
Solving a general strength problem of a ship hull may be done using analytical approaches, which are useful to deduce the buoyancy force distribution and the weight force distribution along the hull, as well as the geometrical characteristics of the sections. These data are used to draw the free body diagrams and to compute the stresses. General strength problems require a large amount of calculation, so it is of interest how a computer may be used to solve such problems. Using computer programming, an engineer may conceive software instruments based on analytical approaches. However, before developing the computer code the research topic must be thoroughly analysed; in this way a meta-level of understanding of the problem is reached. The following stage is to conceive an appropriate development strategy for the original software instruments useful for the rapid development of computer-aided analytical models. The geometrical characteristics of the sections may be computed using a Boolean algebra that operates with 'simple' geometrical shapes. By 'simple' we mean that direct calculation formulas are available for these shapes. The set of 'simple' shapes also includes geometrical entities bounded by curves approximated as spline functions or as polygons. To conclude, computer programming offers the necessary support to solve general strength ship hull problems using analytical methods.
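As a small illustration of the 'simple shapes with direct formulas' idea, the sketch below composes rectangles to get the area, centroid, and second moment of a hypothetical hull girder cross-section; the dimensions are invented and Boolean subtraction is reduced to a sign flag.

```python
from dataclasses import dataclass

@dataclass
class Rect:
    """Axis-aligned rectangle: a 'simple' shape with direct formulas.
    Width b, height h, centroid at (xc, yc); sign=-1 subtracts a cut-out."""
    b: float
    h: float
    xc: float
    yc: float
    sign: int = 1

    @property
    def area(self):
        return self.sign * self.b * self.h

    @property
    def i_own(self):   # second moment about the rectangle's own horizontal centroidal axis
        return self.sign * self.b * self.h ** 3 / 12.0

def section_properties(shapes):
    """Area, vertical centroid and second moment (parallel-axis theorem) of a composite section."""
    area = sum(s.area for s in shapes)
    y_bar = sum(s.area * s.yc for s in shapes) / area
    inertia = sum(s.i_own + s.area * (s.yc - y_bar) ** 2 for s in shapes)
    return area, y_bar, inertia

# Hypothetical I-shaped hull girder section built from three rectangles (units: m).
flange_top = Rect(b=2.0,   h=0.02, xc=0.0, yc=1.99)
web        = Rect(b=0.015, h=1.96, xc=0.0, yc=1.00)
flange_bot = Rect(b=2.5,   h=0.02, xc=0.0, yc=0.01)
print(section_properties([flange_top, web, flange_bot]))
```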
NASA Technical Reports Server (NTRS)
Bless, Robert R.
1991-01-01
A time-domain finite element method is developed for optimal control problems. The theory derived is general enough to handle a large class of problems including optimal control problems that are continuous in the states and controls, problems with discontinuities in the states and/or system equations, problems with control inequality constraints, problems with state inequality constraints, or problems involving any combination of the above. The theory is developed in such a way that no numerical quadrature is necessary regardless of the degree of nonlinearity in the equations. Also, the same shape functions may be employed for every problem because all strong boundary conditions are transformed into natural or weak boundary conditions. In addition, the resulting nonlinear algebraic equations are very sparse. Use of sparse matrix solvers allows for the rapid and accurate solution of very difficult optimization problems. The formulation is applied to launch-vehicle trajectory optimization problems, and results show that real-time optimal guidance is realizable with this method. Finally, a general problem solving environment is created for solving a large class of optimal control problems. The algorithm uses both FORTRAN and a symbolic computation program to solve problems with a minimum of user interaction. The use of symbolic computation eliminates the need for user-written subroutines which greatly reduces the setup time for solving problems.
NASA Technical Reports Server (NTRS)
Johnston, William E.; Gannon, Dennis; Nitzberg, Bill
2000-01-01
We use the term "Grid" to refer to distributed, high performance computing and data handling infrastructure that incorporates geographically and organizationally dispersed, heterogeneous resources that are persistent and supported. This infrastructure includes: (1) Tools for constructing collaborative, application oriented Problem Solving Environments / Frameworks (the primary user interfaces for Grids); (2) Programming environments, tools, and services providing various approaches for building applications that use aggregated computing and storage resources, and federated data sources; (3) Comprehensive and consistent set of location independent tools and services for accessing and managing dynamic collections of widely distributed resources: heterogeneous computing systems, storage systems, real-time data sources and instruments, human collaborators, and communications systems; (4) Operational infrastructure including management tools for distributed systems and distributed resources, user services, accounting and auditing, strong and location independent user authentication and authorization, and overall system security services The vision for NASA's Information Power Grid - a computing and data Grid - is that it will provide significant new capabilities to scientists and engineers by facilitating routine construction of information based problem solving environments / frameworks. Such Grids will knit together widely distributed computing, data, instrument, and human resources into just-in-time systems that can address complex and large-scale computing and data analysis problems. Examples of these problems include: (1) Coupled, multidisciplinary simulations too large for single systems (e.g., multi-component NPSS turbomachine simulation); (2) Use of widely distributed, federated data archives (e.g., simultaneous access to metrological, topological, aircraft performance, and flight path scheduling databases supporting a National Air Space Simulation systems}; (3) Coupling large-scale computing and data systems to scientific and engineering instruments (e.g., realtime interaction with experiments through real-time data analysis and interpretation presented to the experimentalist in ways that allow direct interaction with the experiment (instead of just with instrument control); (5) Highly interactive, augmented reality and virtual reality remote collaborations (e.g., Ames / Boeing Remote Help Desk providing field maintenance use of coupled video and NDI to a remote, on-line airframe structures expert who uses this data to index into detailed design databases, and returns 3D internal aircraft geometry to the field); (5) Single computational problems too large for any single system (e.g. the rotocraft reference calculation). Grids also have the potential to provide pools of resources that could be called on in extraordinary / rapid response situations (such as disaster response) because they can provide common interfaces and access mechanisms, standardized management, and uniform user authentication and authorization, for large collections of distributed resources (whether or not they normally function in concert). IPG development and deployment is addressing requirements obtained by analyzing a number of different application areas, in particular from the NASA Aero-Space Technology Enterprise. This analysis has focussed primarily on two types of users: the scientist / design engineer whose primary interest is problem solving (e.g. 
determining wing aerodynamic characteristics in many different operating environments), and whose primary interface to IPG will be through various sorts of problem solving frameworks. The second type of user is the tool designer: the computational scientists who convert physics and mathematics into code that can simulate the physical world. These are the two primary users of IPG, and they have rather different requirements. The results of the analysis of the needs of these two types of users provides a broad set of requirements that gives rise to a general set of required capabilities. The IPG project is intended to address all of these requirements. In some cases the required computing technology exists, and in some cases it must be researched and developed. The project is using available technology to provide a prototype set of capabilities in a persistent distributed computing testbed. Beyond this, there are required capabilities that are not immediately available, and whose development spans the range from near-term engineering development (one to two years) to much longer term R&D (three to six years). Additional information is contained in the original.
Online Operation Guidance of Computer System Used in Real-Time Distance Education Environment
ERIC Educational Resources Information Center
He, Aiguo
2011-01-01
Computer systems are useful for improving real-time and interactive distance education activities, especially when a large number of students participate in one distance lecture together and every student uses their own computer to share teaching materials or control discussions in the virtual classroom. The problem is that within…
On the minimum orbital intersection distance computation: a new effective method
NASA Astrophysics Data System (ADS)
Hedo, José M.; Ruíz, Manuel; Peláez, Jesús
2018-06-01
The computation of the Minimum Orbital Intersection Distance (MOID) is an old, but increasingly relevant problem. Fast and precise methods for MOID computation are needed to select potentially hazardous asteroids from a large catalogue. The same applies to debris with respect to spacecraft. An iterative method that strictly meets these two premises is presented.
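A brute-force baseline helps make the problem concrete: the sketch below scans the true anomalies of two Keplerian ellipses on a grid to approximate their MOID. It is not the iterative method of the paper, and the asteroid elements are only rough, Eros-like values used for illustration.

```python
import numpy as np

def orbit_point(a, e, inc, raan, argp, nu):
    """Heliocentric position (AU) on a Keplerian ellipse at true anomaly nu (angles in radians)."""
    r = a * (1.0 - e ** 2) / (1.0 + e * np.cos(nu))
    x_p, y_p = r * np.cos(nu), r * np.sin(nu)          # perifocal coordinates
    cO, sO = np.cos(raan), np.sin(raan)
    ci, si = np.cos(inc), np.sin(inc)
    cw, sw = np.cos(argp), np.sin(argp)
    # Rotation perifocal -> ecliptic: R_z(raan) R_x(inc) R_z(argp)
    x = (cO * cw - sO * sw * ci) * x_p + (-cO * sw - sO * cw * ci) * y_p
    y = (sO * cw + cO * sw * ci) * x_p + (-sO * sw + cO * cw * ci) * y_p
    z = (sw * si) * x_p + (cw * si) * y_p
    return np.array([x, y, z])

def moid_brute_force(el1, el2, n=720):
    """Coarse MOID estimate by scanning both true anomalies on an n x n grid.
    A production method would refine the best grid cell with a local minimiser."""
    nus = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    pts1 = np.array([orbit_point(*el1, nu) for nu in nus])
    pts2 = np.array([orbit_point(*el2, nu) for nu in nus])
    d2 = ((pts1[:, None, :] - pts2[None, :, :]) ** 2).sum(-1)
    return float(np.sqrt(d2.min()))

# Elements (a [AU], e, i, raan, argp); the asteroid values are approximate, Eros-like.
earth    = (1.000, 0.0167, np.radians(0.0), 0.0, np.radians(103.0))
asteroid = (1.458, 0.2227, np.radians(10.8), np.radians(304.3), np.radians(178.8))
print(moid_brute_force(earth, asteroid))
```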
Teaching Business Statistics in a Computer Lab: Benefit or Distraction?
ERIC Educational Resources Information Center
Martin, Linda R.
2011-01-01
Teaching in a classroom configured with computers has been heralded as an aid to learning. Students receive the benefits of working with large data sets and real-world problems. However, with the advent of network and wireless connections, students can now use the computer for alternating tasks, such as emailing, web browsing, and social…
NASA Technical Reports Server (NTRS)
Mckay, Charles W.; Feagin, Terry; Bishop, Peter C.; Hallum, Cecil R.; Freedman, Glenn B.
1987-01-01
The principal focus of one of the RICIS (Research Institute for Computing and Information Systems) components is computer systems and software engineering in-the-large over the lifecycle of large, complex, distributed systems which: (1) evolve incrementally over a long time; (2) contain non-stop components; and (3) must simultaneously satisfy a prioritized balance of mission- and safety-critical requirements at run time. This focus is extremely important because of the contribution of the scaling-direction problem to the current software crisis. The Computer Systems and Software Engineering (CSSE) component addresses the lifecycle issues of three environments: host, integration, and target.
Serang, Oliver; MacCoss, Michael J.; Noble, William Stafford
2010-01-01
The problem of identifying proteins from a shotgun proteomics experiment has not been definitively solved. Identifying the proteins in a sample requires ranking them, ideally with interpretable scores. In particular, “degenerate” peptides, which map to multiple proteins, have made such a ranking difficult to compute. The problem of computing posterior probabilities for the proteins, which can be interpreted as confidence in a protein’s presence, has been especially daunting. Previous approaches have either ignored the peptide degeneracy problem completely, addressed it by computing a heuristic set of proteins or heuristic posterior probabilities, or by estimating the posterior probabilities with sampling methods. We present a probabilistic model for protein identification in tandem mass spectrometry that recognizes peptide degeneracy. We then introduce graph-transforming algorithms that facilitate efficient computation of protein probabilities, even for large data sets. We evaluate our identification procedure on five different well-characterized data sets and demonstrate our ability to efficiently compute high-quality protein posteriors. PMID:20712337
Thin Client Architecture: The Promise and the Problems.
ERIC Educational Resources Information Center
Machovec, George S.
1997-01-01
Describes thin clients, a networking technology that allows organizations to provide software applications over networked workstations connected to a central server. Topics include corporate settings; major advantages, including cost effectiveness and increased computer security; problems; and possible applications for large public and academic…
A Stabilized Sparse-Matrix U-D Square-Root Implementation of a Large-State Extended Kalman Filter
NASA Technical Reports Server (NTRS)
Boggs, D.; Ghil, M.; Keppenne, C.
1995-01-01
The full nonlinear Kalman filter sequential algorithm is, in theory, well-suited to the four-dimensional data assimilation problem in large-scale atmospheric and oceanic applications. However, it was later discovered that this algorithm can be very sensitive to computer roundoff, and that results may cease to be meaningful as time advances. Implementations of a modified Kalman filter are given.
GPU-accelerated element-free reverse-time migration with Gauss points partition
NASA Astrophysics Data System (ADS)
Zhou, Zhen; Jia, Xiaofeng; Qiang, Xiaodong
2018-06-01
An element-free method (EFM) has been demonstrated successfully in elasticity, heat conduction and fatigue crack growth problems. We present the theory of EFM and its numerical applications in seismic modelling and reverse time migration (RTM). Compared with the finite difference method and the finite element method, the EFM has unique advantages: (1) independence of grids in computation and (2) lower expense and more flexibility (because only the information of the nodes and the boundary of the concerned area is required). However, in EFM, due to improper computation and storage of some large sparse matrices, such as the mass matrix and the stiffness matrix, the method is difficult to apply to seismic modelling and RTM for a large velocity model. To solve the problem of storage and computation efficiency, we propose a concept of Gauss points partition and utilise the graphics processing unit to improve the computational efficiency. We employ the compressed sparse row format to compress the intermediate large sparse matrices and attempt to simplify the operations by solving the linear equations with CULA solver. To improve the computation efficiency further, we introduce the concept of the lumped mass matrix. Numerical experiments indicate that the proposed method is accurate and more efficient than the regular EFM.
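As a small aside on the lumping step mentioned above, the sketch below shows the common row-sum lumping of a consistent mass matrix; the 3 x 3 matrix is a toy example and the EFM-specific assembly is not reproduced.

```python
import numpy as np
import scipy.sparse as sp

def lump_mass_matrix(M_consistent):
    """Row-sum lumping: replace the consistent mass matrix with a diagonal matrix
    whose entries are the row sums; its inverse then becomes trivial to apply."""
    lumped = np.asarray(M_consistent.sum(axis=1)).ravel()
    return sp.diags(lumped, format="csr")

# Toy consistent mass matrix for a pair of 1D linear elements (illustrative only).
M = sp.csr_matrix(np.array([[2.0, 1.0, 0.0],
                            [1.0, 4.0, 1.0],
                            [0.0, 1.0, 2.0]]) / 6.0)
print(lump_mass_matrix(M).toarray())
```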
Parallelization of Finite Element Analysis Codes Using Heterogeneous Distributed Computing
NASA Technical Reports Server (NTRS)
Ozguner, Fusun
1996-01-01
Performance gains in computer design are quickly consumed as users seek to analyze larger problems to a higher degree of accuracy. Innovative computational methods, such as parallel and distributed computing, seek to multiply the power of existing hardware technology to satisfy the computational demands of large applications. In the early stages of this project, experiments were performed using two large, coarse-grained applications, CSTEM and METCAN. These applications were parallelized on an Intel iPSC/860 hypercube. It was found that the overall speedup was very low, due to large, inherently sequential code segments present in the applications. The overall parallel execution time T_par of the application is dependent on these sequential segments. If these segments make up a significant fraction of the overall code, the application will have a poor speedup measure.
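The observation about sequential segments is the behavior quantified by Amdahl's law; the bound below is a standard result and is included only as context, with the sequential fraction f being illustrative rather than a figure measured for CSTEM or METCAN.

```latex
S(p) \;=\; \frac{T_{\mathrm{seq}}}{T_{\mathrm{par}}(p)} \;=\; \frac{1}{f + \frac{1-f}{p}},
\qquad \lim_{p \to \infty} S(p) \;=\; \frac{1}{f}.
```

For instance, a sequential fraction of f = 0.25 caps the attainable speedup at 4, regardless of the number of hypercube nodes.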
Efficient ICCG on a shared memory multiprocessor
NASA Technical Reports Server (NTRS)
Hammond, Steven W.; Schreiber, Robert
1989-01-01
Different approaches are discussed for exploiting parallelism in the ICCG (Incomplete Cholesky Conjugate Gradient) method for solving large sparse symmetric positive definite systems of equations on a shared memory parallel computer. Techniques for efficiently solving triangular systems and computing sparse matrix-vector products are explored. Three methods for scheduling the tasks in solving triangular systems are implemented on the Sequent Balance 21000. Sample problems that are representative of a large class of problems solved using iterative methods are used. We show that a static analysis to determine data dependences in the triangular solve can greatly improve its parallel efficiency. We also show that ignoring symmetry and storing the whole matrix can reduce solution time substantially.
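One common form of the static dependence analysis mentioned above is level scheduling of the sparse triangular solve; the sketch below computes the level sets for a toy lower-triangular factor (rows within the same level are independent and can be processed in parallel). The matrix is illustrative, not one of the paper's test problems.

```python
import numpy as np
import scipy.sparse as sp

def triangular_levels(L):
    """Level sets for a sparse lower-triangular solve: unknown i can be computed as soon
    as every unknown j < i with L[i, j] != 0 is known."""
    L = L.tocsr()
    n = L.shape[0]
    level = np.zeros(n, dtype=int)
    for i in range(n):
        start, stop = L.indptr[i], L.indptr[i + 1]
        deps = [j for j in L.indices[start:stop] if j != i]   # off-diagonal dependencies
        level[i] = 1 + max(level[j] for j in deps) if deps else 0
    levels = {}
    for i, lv in enumerate(level):
        levels.setdefault(int(lv), []).append(i)
    return levels

# Toy lower-triangular factor (e.g. from an incomplete Cholesky factorization).
L = sp.csr_matrix(np.array([[1.0, 0.0, 0.0, 0.0],
                            [0.5, 1.0, 0.0, 0.0],
                            [0.0, 0.0, 1.0, 0.0],
                            [0.0, 0.3, 0.2, 1.0]]))
print(triangular_levels(L))   # rows 0 and 2 form level 0; row 1 is level 1; row 3 is level 2
```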
Simulator for multilevel optimization research
NASA Technical Reports Server (NTRS)
Padula, S. L.; Young, K. C.
1986-01-01
A computer program designed to simulate and improve multilevel optimization techniques is described. By using simple analytic functions to represent complex engineering analyses, the simulator can generate and test a large variety of multilevel decomposition strategies in a relatively short time. This type of research is an essential step toward routine optimization of large aerospace systems. The paper discusses the types of optimization problems handled by the simulator and gives input and output listings and plots for a sample problem. It also describes multilevel implementation techniques which have value beyond the present computer program. Thus, this document serves as a user's manual for the simulator and as a guide for building future multilevel optimization applications.
NASA Technical Reports Server (NTRS)
Housner, J. M.; Mcgowan, P. E.; Abrahamson, A. L.; Powell, M. G.
1986-01-01
The LATDYN User's Manual presents the capabilities and instructions for the LATDYN (Large Angle Transient DYNamics) computer program. The LATDYN program is a tool for analyzing the controlled or uncontrolled dynamic transient behavior of interconnected deformable multi-body systems which can undergo large angular motions of each body relative to other bodies. The program accommodates large structural deformation as well as large rigid body rotations and is applicable, but not limited to, the following areas: (1) development of large flexible space structures; (2) slewing of large space structure components; (3) mechanisms with rigid or elastic components; and (4) robotic manipulation of beam members. Presently the program is limited to two-dimensional problems, but in many cases three-dimensional problems can be exactly or approximately reduced to two dimensions. The program uses convected finite elements to effect the large angular motions involved in the analysis. General geometry is permitted. Detailed user input and output specifications are provided and discussed with example runstreams. To date, LATDYN has been configured for CDC/NOS and DEC VAX/VMS machines. All coding is in ANSI-77 FORTRAN. Detailed instructions regarding interfaces with particular computer operating systems and file structures are provided.
Research on OpenStack of open source cloud computing in colleges and universities’ computer room
NASA Astrophysics Data System (ADS)
Wang, Lei; Zhang, Dandan
2017-06-01
In recent years, cloud computing technology has developed rapidly, especially open source cloud computing. Open source cloud computing has attracted a large number of user groups through the advantages of open source code and low cost, and has now been promoted and applied on a large scale. In this paper, we first briefly introduce the main functions and architecture of the open source cloud computing tool OpenStack, and then discuss in depth the core problems of computer labs in colleges and universities. Building on this analysis, the specific application and deployment of OpenStack in university computer rooms is described. The experimental results show that the OpenStack tool can deploy a university computer room cloud efficiently and conveniently, with stable performance and good functional value.
A visual programming environment for the Navier-Stokes computer
NASA Technical Reports Server (NTRS)
Tomboulian, Sherryl; Crockett, Thomas W.; Middleton, David
1988-01-01
The Navier-Stokes computer is a high-performance, reconfigurable, pipelined machine designed to solve large computational fluid dynamics problems. Due to the complexity of the architecture, development of effective, high-level language compilers for the system appears to be a very difficult task. Consequently, a visual programming methodology has been developed which allows users to program the system at an architectural level by constructing diagrams of the pipeline configuration. These schematic program representations can then be checked for validity and automatically translated into machine code. The visual environment is illustrated by using a prototype graphical editor to program an example problem.
A reduced successive quadratic programming strategy for errors-in-variables estimation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tjoa, I.-B.; Biegler, L. T.; Carnegie-Mellon Univ.
Parameter estimation problems in process engineering represent a special class of nonlinear optimization problems, because the maximum likelihood structure of the objective function can be exploited. Within this class, the errors in variables method (EVM) is particularly interesting. Here we seek a weighted least-squares fit to the measurements with an underdetermined process model. Thus, both the number of variables and degrees of freedom available for optimization increase linearly with the number of data sets. Large optimization problems of this type can be particularly challenging and expensive to solve because, for general-purpose nonlinear programming (NLP) algorithms, the computational effort increases at least quadratically with problem size. In this study we develop a tailored NLP strategy for EVM problems. The method is based on a reduced Hessian approach to successive quadratic programming (SQP), but with the decomposition performed separately for each data set. This leads to the elimination of all variables but the model parameters, which are determined by a QP coordination step. In this way the computational effort remains linear in the number of data sets. Moreover, unlike previous approaches to the EVM problem, global and superlinear properties of the SQP algorithm apply naturally. Also, the method directly incorporates inequality constraints on the model parameters (although not on the fitted variables). This approach is demonstrated on five example problems with up to 102 degrees of freedom. Compared to general-purpose NLP algorithms, large improvements in computational performance are observed.
Dynamic optimization of chemical processes using ant colony framework.
Rajesh, J; Gupta, K; Kusumakar, H S; Jayaraman, V K; Kulkarni, B D
2001-11-01
The ant colony framework is illustrated by considering dynamic optimization of six important benchmarking examples. This new computational tool is simple to implement and can tackle problems with state as well as terminal constraints in a straightforward fashion. It requires fewer grid points to reach the global optimum at relatively low computational effort. The examples, with varying degrees of complexity, analyzed here illustrate its potential for solving a large class of process optimization problems in chemical engineering.
ERIC Educational Resources Information Center
DOLBY, J.L.; AND OTHERS
The study is concerned with the linguistic problem involved in text compression--extracting, indexing, and the automatic creation of special-purpose citation dictionaries. In spite of early success in using large-scale computers to automate certain human tasks, these problems remain among the most difficult to solve. Essentially, the problem is to…
Saa, Pedro A.; Nielsen, Lars K.
2016-01-01
Motivation: Computation of steady-state flux solutions in large metabolic models is routinely performed using flux balance analysis based on a simple LP (Linear Programming) formulation. A minimal requirement for thermodynamic feasibility of the flux solution is the absence of internal loops, which are enforced using ‘loopless constraints’. The resulting loopless flux problem is a substantially harder MILP (Mixed Integer Linear Programming) problem, which is computationally expensive for large metabolic models. Results: We developed a pre-processing algorithm that significantly reduces the size of the original loopless problem into an easier and equivalent MILP problem. The pre-processing step employs a fast matrix sparsification algorithm—Fast-SNP—inspired by recent results on sparse null-space pursuit (SNP). By finding a reduced feasible ‘loop-law’ matrix subject to known directionalities, Fast-SNP considerably improves the computational efficiency in several metabolic models running different loopless optimization problems. Furthermore, analysis of the topology encoded in the reduced loop matrix enabled identification of key directional constraints for the potential permanent elimination of infeasible loops in the underlying model. Overall, Fast-SNP is an effective and simple algorithm for efficient formulation of loop-law constraints, making loopless flux optimization feasible and numerically tractable at large scale. Availability and Implementation: Source code for MATLAB including examples is freely available for download at http://www.aibn.uq.edu.au/cssb-resources under Software. Optimization uses Gurobi, CPLEX or GLPK (the latter is included with the algorithm). Contact: lars.nielsen@uq.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27559155
NASA Technical Reports Server (NTRS)
Nguyen, Duc T.
1990-01-01
Practical engineering applications can often be formulated in the form of a constrained optimization problem. There are several solution algorithms for solving a constrained optimization problem. One approach is to convert a constrained problem into a series of unconstrained problems. Furthermore, unconstrained solution algorithms can be used as part of the constrained solution algorithms. Structural optimization is an iterative process where one starts with an initial design, and a finite element structural analysis is then performed to calculate the response of the system (such as displacements, stresses, eigenvalues, etc.). Based upon the sensitivity information on the objective and constraint functions, an optimizer such as ADS or IDESIGN can be used to find the new, improved design. For the structural analysis phase, the equation solver for the system of simultaneous linear equations plays a key role since it is needed for static, eigenvalue, or dynamic analysis. For practical, large-scale structural analysis-synthesis applications, computational time can be excessively large. Thus, it is necessary to have a new structural analysis-synthesis code which employs new solution algorithms to exploit both the parallel and vector capabilities offered by modern, high-performance computers such as the Convex, Cray-2 and Cray-YMP computers. The objective of this research project is, therefore, to incorporate the latest developments in the parallel-vector equation solver PVSOLVE into a widely used finite-element production code, such as SAP-4. Furthermore, several nonlinear unconstrained optimization subroutines have also been developed and tested in a parallel computer environment. The unconstrained optimization subroutines are not only useful in their own right, but they can also be incorporated into a more popular constrained optimization code, such as ADS.
Optimal dynamic remapping of parallel computations
NASA Technical Reports Server (NTRS)
Nicol, David M.; Reynolds, Paul F., Jr.
1987-01-01
A large class of computations is characterized by a sequence of phases, with phase changes occurring unpredictably. The decision problem was considered regarding the remapping of workload to processors in a parallel computation when the utility of remapping and the future behavior of the workload are uncertain: the workload exhibits stable execution requirements during a given phase, but requirements may change radically between phases. For these problems a workload assignment generated for one phase may hinder performance during the next phase. This problem is treated formally for a probabilistic model of computation with at most two phases. The fundamental problem of balancing the expected remapping performance gain against the delay cost was addressed. Stochastic dynamic programming is used to show that the remapping decision policy minimizing the expected running time of the computation has an extremely simple structure. Because the gain may not be predictable, the performance of a heuristic policy that does not require estimation of the gain is examined. The heuristic method's feasibility is demonstrated by its use on an adaptive fluid dynamics code on a multiprocessor. The results suggest that, except in extreme cases, the remapping decision problem is essentially that of dynamically determining whether gain can be achieved by remapping after a phase change. The results also suggest that this heuristic is applicable to computations with more than two phases.
Gottschlich, Carsten; Schuhmacher, Dominic
2014-01-01
Finding solutions to the classical transportation problem is of great importance, since this optimization problem arises in many engineering and computer science applications. Especially the Earth Mover's Distance is used in a plethora of applications ranging from content-based image retrieval, shape matching, fingerprint recognition, object tracking and phishing web page detection to computing color differences in linguistics and biology. Our starting point is the well-known revised simplex algorithm, which iteratively improves a feasible solution to optimality. The Shortlist Method that we propose substantially reduces the number of candidates inspected for improving the solution, while at the same time balancing the number of pivots required. Tests on simulated benchmarks demonstrate a considerable reduction in computation time for the new method as compared to the usual revised simplex algorithm implemented with state-of-the-art initialization and pivot strategies. As a consequence, the Shortlist Method facilitates the computation of large scale transportation problems in viable time. In addition we describe a novel method for finding an initial feasible solution which we coin Modified Russell's Method.
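For reference, the transportation problem itself can be written directly as a linear program; the sketch below solves a small balanced instance with SciPy's LP solver. This is a baseline formulation, not the Shortlist Method, and the costs, supplies, and demands are made up.

```python
import numpy as np
from scipy.optimize import linprog

def solve_transportation(cost, supply, demand):
    """Solve the classical transportation problem as an LP: minimize total shipping cost
    subject to row sums equal to supply and column sums equal to demand."""
    m, n = cost.shape
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):
        A_eq[i, i * n:(i + 1) * n] = 1.0          # row i of the plan sums to supply[i]
    for j in range(n):
        A_eq[m + j, j::n] = 1.0                   # column j of the plan sums to demand[j]
    b_eq = np.concatenate([supply, demand])
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    return res.x.reshape(m, n), res.fun

# Small balanced instance with made-up data (supplies and demands both sum to 235).
cost = np.array([[ 4.0,  8.0,  8.0],
                 [16.0, 24.0, 16.0],
                 [ 8.0, 16.0, 24.0]])
supply = np.array([76.0, 82.0, 77.0])
demand = np.array([72.0, 102.0, 61.0])
plan, total = solve_transportation(cost, supply, demand)
print(total)
```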
Precision Parameter Estimation and Machine Learning
NASA Astrophysics Data System (ADS)
Wandelt, Benjamin D.
2008-12-01
I discuss the strategy of "Acceleration by Parallel Precomputation and Learning" (APPLe), which can vastly accelerate parameter estimation in high-dimensional parameter spaces with costly likelihood functions, using trivially parallel computing to speed up sequential exploration of parameter space. This strategy combines the power of distributed computing with machine learning and Markov-Chain Monte Carlo techniques to explore a likelihood function, posterior distribution or χ2-surface efficiently. This strategy is particularly successful in cases where computing the likelihood is costly and the number of parameters is moderate or large. We apply this technique to two central problems in cosmology: the solution of the cosmological parameter estimation problem with sufficient accuracy for the Planck data using PICo; and the detailed calculation of cosmological helium and hydrogen recombination with RICO. Since the APPLe approach is designed to be able to use massively parallel resources to speed up problems that are inherently serial, we can bring the power of distributed computing to bear on parameter estimation problems. We have demonstrated this with the CosmologyatHome project.
Simulating chemistry using quantum computers.
Kassal, Ivan; Whitfield, James D; Perdomo-Ortiz, Alejandro; Yung, Man-Hong; Aspuru-Guzik, Alán
2011-01-01
The difficulty of simulating quantum systems, well known to quantum chemists, prompted the idea of quantum computation. One can avoid the steep scaling associated with the exact simulation of increasingly large quantum systems on conventional computers, by mapping the quantum system to another, more controllable one. In this review, we discuss to what extent the ideas in quantum computation, now a well-established field, have been applied to chemical problems. We describe algorithms that achieve significant advantages for the electronic-structure problem, the simulation of chemical dynamics, protein folding, and other tasks. Although theory is still ahead of experiment, we outline recent advances that have led to the first chemical calculations on small quantum information processors.
Analysis of Algorithms: Coping with Hard Problems
ERIC Educational Resources Information Center
Kolata, Gina Bari
1974-01-01
Although today's computers can perform as many as one million operations per second, there are many problems that are still too large to be solved in a straightforward manner. Recent work indicates that many approximate solutions are useful and more efficient than exact solutions. (Author/RH)
Optimal processor assignment for pipeline computations
NASA Technical Reports Server (NTRS)
Nicol, David M.; Simha, Rahul; Choudhury, Alok N.; Narahari, Bhagirath
1991-01-01
The availability of large-scale multitasked parallel architectures introduces the following processor assignment problem for pipelined computations. Given a set of tasks and their precedence constraints, along with their experimentally determined individual response times for different processor sizes, find an assignment of processors to tasks. Two objectives are of interest: minimal response time given a throughput requirement, and maximal throughput given a response time requirement. These assignment problems differ considerably from the classical mapping problem, in which several tasks share a processor; instead, it is assumed that a large number of processors are to be assigned to a relatively small number of tasks. Efficient assignment algorithms were developed for different classes of task structures. For a p-processor system and a series-parallel precedence graph with n constituent tasks, an O(np^2) algorithm is provided that finds the optimal assignment for the response time optimization problem; the assignment optimizing the constrained throughput can be found in O(np^2 log p) time. Special cases of linear, independent, and tree graphs are also considered.
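The sketch below illustrates the flavor of such an assignment algorithm for the simplest case of a linear chain of pipeline stages: a dynamic program that picks a processor count per task to minimize total response time subject to a stage-time (throughput) bound. It is a simplified illustration, not the O(np^2) series-parallel algorithm of the paper, and the response times are made up.

```python
import functools

def min_response_linear_chain(times, total_procs, max_stage_time=float("inf")):
    """DP for the linear-chain special case: times[i][p-1] = response time of task i on p
    processors. Choose a processor count for every task so the summed response time is
    minimal while no stage exceeds max_stage_time (the throughput requirement).
    Returns (best total response time, processors per task)."""
    n = len(times)

    @functools.lru_cache(maxsize=None)
    def best(i, procs_left):
        if i == n:
            return (0.0, ())
        options = []
        # Leave at least one processor for every remaining task; times[i] must cover
        # all processor counts explored here.
        for p in range(1, procs_left - (n - i - 1) + 1):
            t = times[i][p - 1]
            if t > max_stage_time:
                continue
            tail, alloc = best(i + 1, procs_left - p)
            if tail != float("inf"):
                options.append((t + tail, (p,) + alloc))
        return min(options) if options else (float("inf"), ())

    return best(0, total_procs)

# Made-up response times: times[i][p-1] is task i's time on p processors.
times = [[10.0, 6.0, 4.5, 4.0],
         [ 8.0, 5.0, 3.8, 3.5],
         [ 6.0, 3.5, 2.8, 2.5]]
print(min_response_linear_chain(times, total_procs=6, max_stage_time=7.0))
# expect roughly (14.5, (2, 2, 2)) for this data
```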
NASA Astrophysics Data System (ADS)
Agata, Ryoichiro; Ichimura, Tsuyoshi; Hori, Takane; Hirahara, Kazuro; Hashimoto, Chihiro; Hori, Muneo
2018-04-01
The simultaneous estimation of the asthenosphere's viscosity and coseismic slip/afterslip is expected to greatly improve the consistency of the estimation results with crustal deformation observations collected at widely spread observation points, compared to estimations of slips only. Such an estimate can be formulated as a non-linear inverse problem for the material property of viscosity and an input force equivalent to fault slips, based on large-scale finite-element (FE) modeling of crustal deformation, in which the number of degrees of freedom is of the order of 10^9. We formulated and developed a computationally efficient adjoint-based estimation method for this inverse problem, together with a fast and scalable FE solver for the associated forward and adjoint problems. In a numerical experiment that imitates the 2011 Tohoku-Oki earthquake, the advantage of the proposed method is confirmed by comparing the estimated results with those obtained using simplified estimation methods. The computational cost required for the optimization shows that the proposed method enables the targeted estimation to be completed with a moderate amount of computational resources.
A novel heuristic algorithm for capacitated vehicle routing problem
NASA Astrophysics Data System (ADS)
Kır, Sena; Yazgan, Harun Reşit; Tüncel, Emre
2017-09-01
The vehicle routing problem with capacity constraints was considered in this paper. It is quite difficult to achieve an optimal solution with traditional optimization methods because of the high computational complexity for large-scale problems. Consequently, new heuristic or metaheuristic approaches have been developed to solve this problem. In this paper, we constructed a new heuristic algorithm based on tabu search and adaptive large neighborhood search (ALNS) with several specifically designed operators and features to solve the capacitated vehicle routing problem (CVRP). The effectiveness of the proposed algorithm is illustrated on benchmark problems. The algorithm provides better performance on large-scale instances and gains an advantage in terms of CPU time. In addition, we solved a real-life CVRP using the proposed algorithm and found encouraging results in comparison with the company's current practice.
Structure preserving parallel algorithms for solving the Bethe–Salpeter eigenvalue problem
Shao, Meiyue; da Jornada, Felipe H.; Yang, Chao; ...
2015-10-02
The Bethe–Salpeter eigenvalue problem is a dense structured eigenvalue problem arising from the discretized Bethe–Salpeter equation in the context of computing exciton energies and states. A computational challenge is that at least half of the eigenvalues and the associated eigenvectors are desired in practice. In this paper, we establish the equivalence between Bethe–Salpeter eigenvalue problems and real Hamiltonian eigenvalue problems. Based on theoretical analysis, structure preserving algorithms for a class of Bethe–Salpeter eigenvalue problems are proposed. We also show that for this class of problems all eigenvalues obtained from the Tamm–Dancoff approximation are overestimated. In order to solve large-scale problems of practical interest, we discuss parallel implementations of our algorithms targeting distributed memory systems. Finally, several numerical examples are presented to demonstrate the efficiency and accuracy of our algorithms.
Analyzing Team Based Engineering Design Process in Computer Supported Collaborative Learning
ERIC Educational Resources Information Center
Lee, Dong-Kuk; Lee, Eun-Sang
2016-01-01
The engineering design process has been largely implemented in a collaborative project format. Recently, technological advancement has helped collaborative problem solving processes such as engineering design to have efficient implementation using computers or online technology. In this study, we investigated college students' interaction and…
Challenges and opportunities of cloud computing for atmospheric sciences
NASA Astrophysics Data System (ADS)
Pérez Montes, Diego A.; Añel, Juan A.; Pena, Tomás F.; Wallom, David C. H.
2016-04-01
Cloud computing is an emerging technological solution widely used in many fields. Initially developed as a flexible way of managing peak demand, it has begun to make its way into scientific research. One of the greatest advantages of cloud computing for scientific research is independence from having to have access to a large cyberinfrastructure in order to fund or perform a research project. Cloud computing can avoid the maintenance expenses of large supercomputers and has the potential to 'democratize' access to high-performance computing, giving funding bodies flexibility in allocating budgets for the computational costs associated with a project. Two of the most challenging problems in the atmospheric sciences are the computational cost of, and the uncertainty in, meteorological forecasting and climate projections. Both problems are closely related: uncertainty can usually be reduced when computational resources are available to better reproduce a phenomenon or to perform a larger number of experiments. Here we present results of the application of cloud computing resources to climate modeling, using the cloud computing infrastructures of three major vendors and two climate models. We show how the cloud infrastructure compares in performance to traditional supercomputers and how it provides the capability to complete experiments in shorter periods of time. The associated monetary cost is also analyzed. Finally, we discuss the future potential of this technology for meteorological and climatological applications, from the point of view of both operational use and research.
Probabilistic structural mechanics research for parallel processing computers
NASA Technical Reports Server (NTRS)
Sues, Robert H.; Chen, Heh-Chyun; Twisdale, Lawrence A.; Martin, William R.
1991-01-01
Aerospace structures and spacecraft are a complex assemblage of structural components that are subjected to a variety of complex, cyclic, and transient loading conditions. Significant modeling uncertainties are present in these structures, in addition to the inherent randomness of material properties and loads. To properly account for these uncertainties in evaluating and assessing the reliability of these components and structures, probabilistic structural mechanics (PSM) procedures must be used. Much research has focused on basic theory development and the development of approximate analytic solution methods in random vibrations and structural reliability. Practical application of PSM methods was hampered by their computationally intense nature. Solution of PSM problems requires repeated analyses of structures that are often large, and exhibit nonlinear and/or dynamic response behavior. These methods are all inherently parallel and ideally suited to implementation on parallel processing computers. New hardware architectures and innovative control software and solution methodologies are needed to make solution of large scale PSM problems practical.
NASA Technical Reports Server (NTRS)
Krasteva, Denitza T.
1998-01-01
Multidisciplinary design optimization (MDO) for large-scale engineering problems poses many challenges (e.g., the design of an efficient concurrent paradigm for global optimization based on disciplinary analyses, expensive computations over vast data sets, etc.) This work focuses on the application of distributed schemes for massively parallel architectures to MDO problems, as a tool for reducing computation time and solving larger problems. The specific problem considered here is configuration optimization of a high speed civil transport (HSCT), and the efficient parallelization of the embedded paradigm for reasonable design space identification. Two distributed dynamic load balancing techniques (random polling and global round robin with message combining) and two necessary termination detection schemes (global task count and token passing) were implemented and evaluated in terms of effectiveness and scalability to large problem sizes and a thousand processors. The effect of certain parameters on execution time was also inspected. Empirical results demonstrated stable performance and effectiveness for all schemes, and the parametric study showed that the selected algorithmic parameters have a negligible effect on performance.
Estimation of the Thermal Process in the Honeycomb Panel by a Monte Carlo Method
NASA Astrophysics Data System (ADS)
Gusev, S. A.; Nikolaev, V. N.
2018-01-01
A new Monte Carlo method for estimating the thermal state of heat insulation containing honeycomb panels is proposed in the paper. The heat transfer in the honeycomb panel is described by a boundary value problem for a parabolic equation with a discontinuous diffusion coefficient and boundary conditions of the third kind. To obtain an approximate solution, it is proposed to smooth the diffusion coefficient. The resulting problem is then solved on the basis of a probabilistic representation: the solution is the expectation of a functional of the diffusion process corresponding to the boundary value problem. Solving the problem thus reduces to the numerical statistical modelling of a large number of trajectories of the diffusion process corresponding to the parabolic problem. The Euler method was used earlier for this purpose, but it requires a large computational effort. In this paper the method is modified by using a combination of the Euler method and the random walk on moving spheres method. The new approach allows us to significantly reduce the computational costs.
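As an illustration of the probabilistic representation described above, a minimal Euler–Maruyama sketch for simulating trajectories of a diffusion process and averaging a functional over them is shown below; diagonal noise is assumed, and the combination with the random walk on moving spheres used in the paper is not reproduced here.

    import numpy as np

    def euler_maruyama(x0, drift, sigma, dt, n_steps, rng):
        """Simulate one trajectory of dX = drift(X) dt + sigma(X) dW (diagonal noise assumed)."""
        x = np.array(x0, dtype=float)
        for _ in range(n_steps):
            dw = rng.normal(0.0, np.sqrt(dt), size=x.shape)
            x = x + drift(x) * dt + sigma(x) * dw
        return x

    def mc_functional(phi, n_paths, x0, drift, sigma, dt, n_steps, seed=0):
        """Monte Carlo estimate of E[phi(X_T)] over many independent trajectories."""
        rng = np.random.default_rng(seed)
        samples = [phi(euler_maruyama(x0, drift, sigma, dt, n_steps, rng)) for _ in range(n_paths)]
        return float(np.mean(samples))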
On the performance of exponential integrators for problems in magnetohydrodynamics
NASA Astrophysics Data System (ADS)
Einkemmer, Lukas; Tokman, Mayya; Loffeld, John
2017-02-01
Exponential integrators have been introduced as an efficient alternative to explicit and implicit methods for integrating large stiff systems of differential equations. Over the past decades these methods have been studied theoretically and their performance was evaluated using a range of test problems. While the results of these investigations showed that exponential integrators can provide significant computational savings, the research on validating this hypothesis for large scale systems and understanding what classes of problems can particularly benefit from the use of the new techniques is in its initial stages. Resistive magnetohydrodynamic (MHD) modeling is widely used in studying large scale behavior of laboratory and astrophysical plasmas. In many problems the numerical solution of the MHD equations is a challenging task due to the temporal stiffness of this system in the parameter regimes of interest. In this paper we evaluate the performance of exponential integrators on large MHD problems and compare them to a state-of-the-art implicit time integrator. Both variable and constant time step exponential methods of EPIRK type are used to simulate magnetic reconnection and the Kelvin–Helmholtz instability in plasma. Performance of these methods, which are part of the EPIC software package, is compared to the variable time step, variable order BDF scheme included in the CVODE library (part of SUNDIALS). We study the performance of the methods on parallel architectures and with respect to the magnitudes of important parameters such as the Reynolds, Lundquist, and Prandtl numbers. We find that the exponential integrators provide superior or equal performance in most circumstances and conclude that further development of exponential methods for MHD problems is warranted and can lead to significant computational advantages for large scale stiff systems of differential equations such as MHD.
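For context, the simplest member of this family, the exponential Euler method for a stiff system u' = F(u), reads as follows; the EPIRK schemes evaluated in the paper are higher-order generalizations built from the same phi-functions.

    u_{n+1} = u_n + h\,\varphi_1(h J_n)\,F(u_n),
    \qquad
    \varphi_1(z) = \frac{e^{z} - 1}{z},
    \qquad
    J_n = \frac{\partial F}{\partial u}(u_n).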
NASA Technical Reports Server (NTRS)
Denning, P. J.
1986-01-01
Virtual memory was conceived as a way to automate overlaying of program segments. Modern computers have very large main memories, but need automatic solutions to the relocation and protection problems. Virtual memory serves this need as well and is thus useful in computers of all sizes. The history of the idea is traced, showing how it has become a widespread, little noticed feature of computers today.
ERIC Educational Resources Information Center
Grandell, Linda
2005-01-01
Computer science is becoming increasingly important in our society. Meta skills, such as problem solving and logical and algorithmic thinking, are emphasized in every field, not only in the natural sciences. Still, largely due to gaps in tuition, common misunderstandings exist about the true nature of computer science. These are especially…
Automation of multi-agent control for complex dynamic systems in heterogeneous computational network
NASA Astrophysics Data System (ADS)
Oparin, Gennady; Feoktistov, Alexander; Bogdanova, Vera; Sidorov, Ivan
2017-01-01
The rapid progress of high-performance computing entails new challenges related to solving large scientific problems for various subject domains in a heterogeneous distributed computing environment (e.g., a network, Grid system, or Cloud infrastructure). Specialists in the field of parallel and distributed computing pay special attention to the scalability of applications for problem solving. Effective management of a scalable application in a heterogeneous distributed computing environment is still a non-trivial issue. Control systems that operate in networks are especially affected by this issue. We propose a new approach to multi-agent management of scalable applications in a heterogeneous computational network. The fundamentals of our approach are the integrated use of conceptual programming, simulation modeling, network monitoring, multi-agent management, and service-oriented programming. We developed a special framework for automating the problem solving. The advantages of the proposed approach are demonstrated on an example of the parametric synthesis of a static linear regulator for complex dynamic systems. Benefits of the scalable application for solving this problem include automation of the multi-agent control of the systems in a parallel mode with various degrees of detailed elaboration.
An Overview of Computational Aeroacoustic Modeling at NASA Langley
NASA Technical Reports Server (NTRS)
Lockard, David P.
2001-01-01
The use of computational techniques in the area of acoustics is known as computational aeroacoustics and has shown great promise in recent years. Although an ultimate goal is to use computational simulations as a virtual wind tunnel, the problem is so complex that blind applications of traditional algorithms are typically unable to produce acceptable results. The phenomena of interest are inherently unsteady and cover a wide range of frequencies and amplitudes. Nonetheless, with appropriate simplifications and special care to resolve specific phenomena, currently available methods can be used to solve important acoustic problems. These simulations can be used to complement experiments, and often give much more detailed information than can be obtained in a wind tunnel. The use of acoustic analogy methods to inexpensively determine far-field acoustics from near-field unsteadiness has greatly reduced the computational requirements. A few examples of current applications of computational aeroacoustics at NASA Langley are given. There remains a large class of problems that require more accurate and efficient methods. Research to develop more advanced methods that are able to handle the geometric complexity of realistic problems using block-structured and unstructured grids are highlighted.
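As background for the acoustic analogy idea mentioned above, Lighthill's analogy, given here as a representative formulation rather than necessarily the one used at Langley, rewrites the flow equations as a wave equation for the density perturbation driven by a quadrupole source computed from near-field unsteadiness:

    \frac{\partial^{2}\rho'}{\partial t^{2}} - c_{0}^{2}\nabla^{2}\rho'
      = \frac{\partial^{2} T_{ij}}{\partial x_{i}\,\partial x_{j}},
    \qquad
    T_{ij} = \rho u_{i} u_{j} + \left(p' - c_{0}^{2}\rho'\right)\delta_{ij} - \tau_{ij}.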
Convergence acceleration of the Proteus computer code with multigrid methods
NASA Technical Reports Server (NTRS)
Demuren, A. O.; Ibraheem, S. O.
1995-01-01
This report presents the results of a study to implement convergence acceleration techniques based on the multigrid concept in the two-dimensional and three-dimensional versions of the Proteus computer code. The first section presents a review of the relevant literature on the implementation of the multigrid methods in computer codes for compressible flow analysis. The next two sections present detailed stability analysis of numerical schemes for solving the Euler and Navier-Stokes equations, based on conventional von Neumann analysis and the bi-grid analysis, respectively. The next section presents details of the computational method used in the Proteus computer code. Finally, the multigrid implementation and applications to several two-dimensional and three-dimensional test problems are presented. The results of the present study show that the multigrid method always leads to a reduction in the number of iterations (or time steps) required for convergence. However, there is an overhead associated with the use of multigrid acceleration. The overhead is higher in 2-D problems than in 3-D problems, thus overall multigrid savings in CPU time are in general better in the latter. Savings of about 40-50 percent are typical in 3-D problems, but they are about 20-30 percent in large 2-D problems. The present multigrid method is applicable to steady-state problems and is therefore ineffective in problems with inherently unstable solutions.
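The multigrid acceleration discussed above follows the standard smooth, restrict, correct pattern. A minimal recursive V-cycle sketch, with the smoother, transfer operators, and coarse operators supplied as callables and with no claim to match the Proteus implementation, is:

    import numpy as np

    def v_cycle(A, b, x, smooth, restrict, prolong, coarsen, levels, nu=3):
        """Recursive multigrid V-cycle sketch (operators assumed given by the caller)."""
        if levels == 1:
            return np.linalg.solve(A, b)          # direct solve on the coarsest grid
        x = smooth(A, b, x, nu)                   # pre-smoothing (e.g. damped Jacobi)
        r = b - A @ x                             # fine-grid residual
        Ac, rc = coarsen(A), restrict(r)          # coarse operator and restricted residual
        ec = v_cycle(Ac, rc, np.zeros_like(rc), smooth, restrict, prolong, coarsen, levels - 1, nu)
        x = x + prolong(ec)                       # coarse-grid correction
        return smooth(A, b, x, nu)                # post-smoothing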
NASA Astrophysics Data System (ADS)
Puzyrkov, Dmitry; Polyakov, Sergey; Podryga, Viktoriia; Markizov, Sergey
2018-02-01
At the present stage of computer technology development it is possible to study the properties and processes of complex systems at the molecular and even atomic levels, for example by means of molecular dynamics methods. The most interesting problems are those related to the study of complex processes under real physical conditions. Solving such problems requires the use of high-performance computing systems of various types, for example GRID systems and HPC clusters. Given how time consuming these computational tasks are, the need arises for software that provides automatic and unified monitoring of such computations. A complex computational task can be performed over different HPC systems; this requires synchronization of output data between the storage chosen by the scientist and the HPC system used for the computations. The design of the computational domain is also quite a problem, requiring complex software tools and algorithms for proper atomistic data generation on HPC systems. The paper describes the prototype of a cloud service intended for the design of large atomistic systems for further detailed molecular dynamics calculations and for the management of these calculations, and presents the part of its concept aimed at initial data generation on HPC systems.
An electromagnetism-like metaheuristic for open-shop problems with no buffer
NASA Astrophysics Data System (ADS)
Naderi, Bahman; Najafi, Esmaeil; Yazdani, Mehdi
2012-12-01
This paper considers open-shop scheduling with no intermediate buffer to minimize total tardiness. This problem occurs in many production settings, such as the plastic molding, chemical, and food processing industries. The paper mathematically formulates the problem as a mixed integer linear program, with which the problem can be solved optimally. The paper also develops a novel metaheuristic based on an electromagnetism algorithm to solve large-sized problems. The paper conducts two computational experiments. The first includes small-sized instances by which the mathematical model and the general performance of the proposed metaheuristic are evaluated. The second evaluates the metaheuristic on its performance in solving some large-sized instances. The results show that the model and the algorithm are effective in dealing with the problem.
Towards large scale multi-target tracking
NASA Astrophysics Data System (ADS)
Vo, Ba-Ngu; Vo, Ba-Tuong; Reuter, Stephan; Lam, Quang; Dietmayer, Klaus
2014-06-01
Multi-target tracking is intrinsically an NP-hard problem, and the complexity of multi-target tracking solutions usually does not scale gracefully with problem size. Multi-target tracking for on-line applications involving a large number of targets is extremely challenging. This article demonstrates the capability of the random finite set approach to provide large scale multi-target tracking algorithms. In particular, it is shown that an approximate filter known as the labeled multi-Bernoulli filter can simultaneously track one thousand five hundred targets in clutter on a standard laptop computer.
Job Scheduling in a Heterogeneous Grid Environment
NASA Technical Reports Server (NTRS)
Shan, Hong-Zhang; Smith, Warren; Oliker, Leonid; Biswas, Rupak
2004-01-01
Computational grids have the potential for solving large-scale scientific problems using heterogeneous and geographically distributed resources. However, a number of major technical hurdles must be overcome before this potential can be realized. One problem that is critical to effective utilization of computational grids is the efficient scheduling of jobs. This work addresses this problem by describing and evaluating a grid scheduling architecture and three job migration algorithms. The architecture is scalable and does not assume control of local site resources. The job migration policies use the availability and performance of computer systems, the network bandwidth available between systems, and the volume of input and output data associated with each job. An extensive performance comparison is presented using real workloads from leading computational centers. The results, based on several key metrics, demonstrate that the performance of our distributed migration algorithms is significantly greater than that of a local scheduling framework and comparable to a non-scalable global scheduling approach.
GPU acceleration of Dock6's Amber scoring computation.
Yang, Hailong; Zhou, Qiongqiong; Li, Bo; Wang, Yongjian; Luan, Zhongzhi; Qian, Depei; Li, Hanlu
2010-01-01
Addressing the problem of virtual screening is a long-term goal in the drug discovery field which, if properly solved, can significantly shorten the R&D cycle of new drugs. The scoring functionality that evaluates the fitness of the docking result is one of the major challenges in virtual screening. In general, scoring functionality in docking requires a large amount of floating-point calculations and usually takes several weeks or even months to finish. This time-consuming procedure is unacceptable, especially when a highly fatal and infectious virus such as SARS or H1N1 arises, which forces the scoring task to be done in a limited time. This paper presents how to leverage the computational power of the GPU to accelerate Dock6's (http://dock.compbio.ucsf.edu/DOCK_6/) Amber (J. Comput. Chem. 25: 1157-1174, 2004) scoring with the NVIDIA CUDA (Compute Unified Device Architecture; NVIDIA Corporation Technical Staff, Compute Unified Device Architecture - Programming Guide, NVIDIA Corporation, 2008) platform. We also discuss many factors that greatly influence the performance after porting the Amber scoring to the GPU, including thread management, data transfer, and divergence hiding. Our experiments show that the GPU-accelerated Amber scoring achieves a 6.5× speedup with respect to the original version running on an AMD dual-core CPU for the same problem size. This acceleration makes the Amber scoring more competitive and efficient for large-scale virtual screening problems.
GPU accelerated fuzzy connected image segmentation by using CUDA.
Zhuge, Ying; Cao, Yong; Miller, Robert W
2009-01-01
Image segmentation techniques using fuzzy connectedness principles have shown their effectiveness in segmenting a variety of objects in several large applications in recent years. However, one problem of these algorithms has been their excessive computational requirements when processing large image datasets. Nowadays commodity graphics hardware provides high parallel computing power. In this paper, we present a parallel fuzzy connected image segmentation algorithm on Nvidia's Compute Unified Device Architecture (CUDA) platform for segmenting large medical image data sets. Our experiments based on three data sets of small, medium, and large size demonstrate the efficiency of the parallel algorithm, which achieves speed-up factors of 7.2x, 7.3x, and 14.4x, respectively, for the three data sets over the sequential implementation of the fuzzy connected image segmentation algorithm on the CPU.
Hybrid Quantum-Classical Approach to Quantum Optimal Control.
Li, Jun; Yang, Xiaodong; Peng, Xinhua; Sun, Chang-Pu
2017-04-14
A central challenge in quantum computing is to identify more computational problems for which utilization of quantum resources can offer significant speedup. Here, we propose a hybrid quantum-classical scheme to tackle the quantum optimal control problem. We show that the most computationally demanding part of gradient-based algorithms, namely, computing the fitness function and its gradient for a control input, can be accomplished by the process of evolution and measurement on a quantum simulator. By posing queries to and receiving answers from the quantum simulator, classical computing devices update the control parameters until an optimal control solution is found. To demonstrate the quantum-classical scheme in experiment, we use a seven-qubit nuclear magnetic resonance system, on which we have succeeded in optimizing state preparation without involving classical computation of the large Hilbert space evolution.
Smith, Rob; Mathis, Andrew D; Ventura, Dan; Prince, John T
2014-01-01
For decades, mass spectrometry data has been analyzed to investigate a wide array of research interests, including disease diagnostics, biological and chemical theory, genomics, and drug development. Progress towards solving any of these disparate problems depends upon overcoming the common challenge of interpreting the large data sets generated. Despite interim successes, many data interpretation problems in mass spectrometry are still challenging. Further, though these challenges are inherently interdisciplinary in nature, the significant domain-specific knowledge gap between disciplines makes interdisciplinary contributions difficult. This paper provides an introduction to the burgeoning field of computational mass spectrometry. We illustrate key concepts, vocabulary, and open problems in MS-omics, as well as provide invaluable resources such as open data sets and key search terms and references. This paper will facilitate contributions from mathematicians, computer scientists, and statisticians to MS-omics that will fundamentally improve results over existing approaches and inform novel algorithmic solutions to open problems.
Implementing and Assessing Computational Modeling in Introductory Mechanics
ERIC Educational Resources Information Center
Caballero, Marcos D.; Kohlmyer, Matthew A.; Schatz, Michael F.
2012-01-01
Students taking introductory physics are rarely exposed to computational modeling. In a one-semester large lecture introductory calculus-based mechanics course at Georgia Tech, students learned to solve physics problems using the VPython programming environment. During the term, 1357 students in this course solved a suite of 14 computational…
ERIC Educational Resources Information Center
Bork, Alfred M.
An introduction to the problems involved in conversion of computer dialogues from one computer language to another is presented. Conversion of individual dialogues by complete rewriting is straightforward, if tedious. To make a general conversion of a large group of heterogeneous dialogue material from one language to another at one step is more…
Moore, R. Davis; Drollette, Eric S.; Scudder, Mark R.; Bharij, Aashiv; Hillman, Charles H.
2014-01-01
The current study investigated the influence of cardiorespiratory fitness on arithmetic cognition in forty 9–10 year old children. Measures included a standardized mathematics achievement test to assess conceptual and computational knowledge, self-reported strategy selection, and an experimental arithmetic verification task (including small and large addition problems), which afforded the measurement of event-related brain potentials (ERPs). No differences in math achievement were observed as a function of fitness level, but all children performed better on math concepts relative to math computation. Higher fit children reported using retrieval more often to solve large arithmetic problems, relative to lower fit children. During the arithmetic verification task, higher fit children exhibited superior performance for large problems, as evidenced by greater d' scores, while all children exhibited decreased accuracy and longer reaction time for large relative to small problems, and incorrect relative to correct solutions. On the electrophysiological level, modulations of early (P1, N170) and late ERP components (P3, N400) were observed as a function of problem size and solution correctness. Higher fit children exhibited selective modulations for N170, P3, and N400 amplitude relative to lower fit children, suggesting that fitness influences symbolic encoding, attentional resource allocation and semantic processing during arithmetic tasks. The current study contributes to the fitness-cognition literature by demonstrating that the benefits of cardiorespiratory fitness extend to arithmetic cognition, which has important implications for the educational environment and the context of learning. PMID:24829556
Structural design using equilibrium programming formulations
NASA Technical Reports Server (NTRS)
Scotti, Stephen J.
1995-01-01
Solutions to increasingly larger structural optimization problems are desired. However, computational resources are strained to meet this need. New methods will be required to solve increasingly larger problems. The present approaches to solving large-scale problems involve approximations for the constraints of structural optimization problems and/or decomposition of the problem into multiple subproblems that can be solved in parallel. An area of game theory, equilibrium programming (also known as noncooperative game theory), can be used to unify these existing approaches from a theoretical point of view (considering the existence and optimality of solutions), and be used as a framework for the development of new methods for solving large-scale optimization problems. Equilibrium programming theory is described, and existing design techniques such as fully stressed design and constraint approximations are shown to fit within its framework. Two new structural design formulations are also derived. The first new formulation is another approximation technique which is a general updating scheme for the sensitivity derivatives of design constraints. The second new formulation uses a substructure-based decomposition of the structure for analysis and sensitivity calculations. Significant computational benefits of the new formulations compared with a conventional method are demonstrated.
Big data mining analysis method based on cloud computing
NASA Astrophysics Data System (ADS)
Cai, Qing Qiu; Cui, Hong Gang; Tang, Hao
2017-08-01
In the era of information explosion, data are extremely large, discrete, and unstructured or semi-structured, far beyond what traditional data management approaches can handle. With the arrival of the cloud computing era, cloud computing provides a new technical way to analyze massive data by mining, which can effectively solve the problem that traditional data mining methods cannot adapt to massive data. This paper introduces the meaning and characteristics of cloud computing, analyzes the advantages of using cloud computing technology to realize data mining, designs a mining algorithm for association rules based on the MapReduce parallel processing architecture, and carries out experimental verification. The parallel association rule mining algorithm based on a cloud computing platform can greatly improve the execution speed of data mining.
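The abstract does not spell out the MapReduce decomposition, but the support-counting round of an Apriori-style association rule miner typically maps transactions to (itemset, 1) pairs and reduces them by summation. A local, single-process stand-in for one such round might look like this; the sample data and threshold are made up for illustration.

    from collections import Counter
    from itertools import combinations

    def map_phase(transactions, k):
        """Emit (itemset, 1) pairs for every k-itemset contained in each transaction."""
        for t in transactions:
            for itemset in combinations(sorted(t), k):
                yield itemset, 1

    def reduce_phase(pairs, min_support):
        """Sum the counts per itemset and keep the frequent ones."""
        counts = Counter()
        for itemset, c in pairs:
            counts[itemset] += c
        return {s: c for s, c in counts.items() if c >= min_support}

    # local stand-in for one MapReduce round of support counting
    frequent = reduce_phase(map_phase([{"a", "b", "c"}, {"a", "c"}], k=2), min_support=2)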
Parallel Computing Strategies for Irregular Algorithms
NASA Technical Reports Server (NTRS)
Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)
2002-01-01
Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
Statistical mechanics of complex neural systems and high dimensional data
NASA Astrophysics Data System (ADS)
Advani, Madhu; Lahiri, Subhaneil; Ganguli, Surya
2013-03-01
Recent experimental advances in neuroscience have opened new vistas into the immense complexity of neuronal networks. This proliferation of data challenges us on two parallel fronts. First, how can we form adequate theoretical frameworks for understanding how dynamical network processes cooperate across widely disparate spatiotemporal scales to solve important computational problems? Second, how can we extract meaningful models of neuronal systems from high dimensional datasets? To aid in these challenges, we give a pedagogical review of a collection of ideas and theoretical methods arising at the intersection of statistical physics, computer science and neurobiology. We introduce the interrelated replica and cavity methods, which originated in statistical physics as powerful ways to quantitatively analyze large highly heterogeneous systems of many interacting degrees of freedom. We also introduce the closely related notion of message passing in graphical models, which originated in computer science as a distributed algorithm capable of solving large inference and optimization problems involving many coupled variables. We then show how both the statistical physics and computer science perspectives can be applied in a wide diversity of contexts to problems arising in theoretical neuroscience and data analysis. Along the way we discuss spin glasses, learning theory, illusions of structure in noise, random matrices, dimensionality reduction and compressed sensing, all within the unified formalism of the replica method. Moreover, we review recent conceptual connections between message passing in graphical models, and neural computation and learning. Overall, these ideas illustrate how statistical physics and computer science might provide a lens through which we can uncover emergent computational functions buried deep within the dynamical complexities of neuronal networks.
A modular approach to large-scale design optimization of aerospace systems
NASA Astrophysics Data System (ADS)
Hwang, John T.
Gradient-based optimization and the adjoint method form a synergistic combination that enables the efficient solution of large-scale optimization problems. Though the gradient-based approach struggles with non-smooth or multi-modal problems, the capability to efficiently optimize up to tens of thousands of design variables provides a valuable design tool for exploring complex tradeoffs and finding unintuitive designs. However, the widespread adoption of gradient-based optimization is limited by the implementation challenges for computing derivatives efficiently and accurately, particularly in multidisciplinary and shape design problems. This thesis addresses these difficulties in two ways. First, to deal with the heterogeneity and integration challenges of multidisciplinary problems, this thesis presents a computational modeling framework that solves multidisciplinary systems and computes their derivatives in a semi-automated fashion. This framework is built upon a new mathematical formulation developed in this thesis that expresses any computational model as a system of algebraic equations and unifies all methods for computing derivatives using a single equation. The framework is applied to two engineering problems: the optimization of a nanosatellite with 7 disciplines and over 25,000 design variables; and simultaneous allocation and mission optimization for commercial aircraft involving 330 design variables, 12 of which are integer variables handled using the branch-and-bound method. In both cases, the framework makes large-scale optimization possible by reducing the implementation effort and code complexity. The second half of this thesis presents a differentiable parametrization of aircraft geometries and structures for high-fidelity shape optimization. Existing geometry parametrizations are not differentiable, or they are limited in the types of shape changes they allow. This is addressed by a novel parametrization that smoothly interpolates aircraft components, providing differentiability. An unstructured quadrilateral mesh generation algorithm is also developed to automate the creation of detailed meshes for aircraft structures, and a mesh convergence study is performed to verify that the quality of the mesh is maintained as it is refined. As a demonstration, high-fidelity aerostructural analysis is performed for two unconventional configurations with detailed structures included, and aerodynamic shape optimization is applied to the truss-braced wing, which finds and eliminates a shock in the region bounded by the struts and the wing.
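For reference, the standard discrete adjoint formulation underlying this kind of gradient-based optimization (not necessarily the unified derivatives equation developed in the thesis) computes the total derivative of an objective f(x, u) subject to the residual equations R(x, u) = 0 as

    \left(\frac{\partial R}{\partial u}\right)^{T}\psi = \left(\frac{\partial f}{\partial u}\right)^{T},
    \qquad
    \frac{\mathrm{d}f}{\mathrm{d}x} = \frac{\partial f}{\partial x} - \psi^{T}\frac{\partial R}{\partial x},

so the cost of the full gradient is essentially one extra linear solve per objective or constraint, independent of the number of design variables x.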
Lanczos eigensolution method for high-performance computers
NASA Technical Reports Server (NTRS)
Bostic, Susan W.
1991-01-01
The theory, computational analysis, and applications of a Lanczos algorithm on high performance computers are presented. The computationally intensive steps of the algorithm are identified as: the matrix factorization, the forward/backward equation solution, and the matrix-vector multiplies. These computational steps are optimized to exploit the vector and parallel capabilities of high performance computers. The savings in computational time from applying optimization techniques such as variable-band and sparse data storage and access, loop unrolling, use of local memory, and compiler directives are presented. Two large scale structural analysis applications are described: the buckling of a composite blade-stiffened panel with a cutout, and the vibration analysis of a high speed civil transport. The sequential computational time for the panel problem executed on a CONVEX computer of 181.6 seconds was decreased to 14.1 seconds with the optimized vector algorithm. The best computational time of 23 seconds for the transport problem, with 17,000 degrees of freedom, was obtained on the Cray-YMP using an average of 3.63 processors.
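For context, the core of the Lanczos algorithm referenced above is the three-term recurrence that builds an orthonormal basis and a small tridiagonal matrix whose extremal eigenvalues (Ritz values) approximate those of the large matrix A:

    \beta_{j+1} q_{j+1} = A q_{j} - \alpha_{j} q_{j} - \beta_{j} q_{j-1},
    \qquad
    \alpha_{j} = q_{j}^{T} A q_{j},
    \qquad
    \beta_{j+1} = \lVert A q_{j} - \alpha_{j} q_{j} - \beta_{j} q_{j-1} \rVert_{2},

with T_m = tridiag(beta, alpha, beta). The matrix-vector products A q_j are the step that the report vectorizes and parallelizes.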
NASA Technical Reports Server (NTRS)
Morgan, Philip E.
2004-01-01
This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications." The discussion of Scalable High Performance Computing reports on three objectives: validate, assess the scalability of, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhancement of an electromagnetics code (CHARGE) to effectively model antenna problems; application of lessons learned in the high-order/spectral solution of swirling 3D jets to the electromagnetics project; transition of a high-order fluids code, FDL3DI, to solve Maxwell's equations using compact differencing; development and demonstration of improved radiation absorbing boundary conditions for high-order CEM; and extension of the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.
Indirect addressing and load balancing for faster solution to Mandelbrot Set on SIMD architectures
NASA Technical Reports Server (NTRS)
Tomboulian, Sherryl
1989-01-01
SIMD computers with local indirect addressing allow programs to use queues and buffers, making certain kinds of problems much more efficient. Examined here is a class of problems characterized by computations on data points where the computation is identical but the convergence rate is data dependent. Normally, in this situation, the algorithm time is governed by the maximum number of iterations required by any point. Using indirect addressing allows a processor to proceed to the next data point when it is done, reducing the overall number of iterations required toward the mean convergence rate when a sufficiently large problem set is solved. Load balancing techniques can be applied for additional performance improvement. Simulations of this technique applied to computing Mandelbrot sets indicate significant performance gains.
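A serial emulation of the idea is sketched below, assuming each SIMD processor owns a local queue of points and pulls the next one as soon as its current point escapes or hits the iteration cap; details such as the static round-robin distribution are illustrative, not taken from the report.

    from collections import deque

    def mandelbrot_with_queues(points, max_iter=1000, n_proc=4):
        """Each 'processor' dequeues a new point as soon as its current one finishes."""
        queues = [deque(points[i::n_proc]) for i in range(n_proc)]   # static round-robin split
        active = [q.popleft() if q else None for q in queues]        # current point per processor
        z = [0j] * n_proc
        it = [0] * n_proc
        iters = {}
        while any(c is not None for c in active):
            for p in range(n_proc):                                  # one lockstep SIMD cycle
                c = active[p]
                if c is None:
                    continue
                z[p] = z[p] * z[p] + c
                it[p] += 1
                if abs(z[p]) > 2.0 or it[p] >= max_iter:             # point finished
                    iters[c] = it[p]
                    active[p] = queues[p].popleft() if queues[p] else None
                    z[p], it[p] = 0j, 0
            # without indirect addressing, every processor would instead idle until
            # the slowest point in the batch reached max_iter
        return iters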
NASA Astrophysics Data System (ADS)
Negrello, Camille; Gosselet, Pierre; Rey, Christian
2018-05-01
An efficient method for solving large nonlinear problems combines Newton solvers and Domain Decomposition Methods (DDM). In the DDM framework, the boundary conditions can be chosen to be primal, dual or mixed. The mixed approach has the advantage of permitting the search for an optimal interface parameter (often called impedance) which can increase the convergence rate. The optimal value of this parameter is often too expensive to compute exactly in practice: an approximate version has to be sought, along with a compromise between efficiency and computational cost. In the context of parallel algorithms for solving nonlinear structural mechanics problems, we propose a new heuristic for the impedance which combines short- and long-range effects at a low computational cost.
NASA Astrophysics Data System (ADS)
Aishah Syed Ali, Sharifah
2017-09-01
This paper considers the economic lot sizing problem in remanufacturing with separate setups (ELSRs), where remanufactured and new products are produced on dedicated production lines. Since this problem is NP-hard in general, which leads to computational inefficiency and low-quality solutions, we present (a) a multicommodity formulation and (b) a strengthened formulation based on the a priori addition of valid inequalities in the space of the original variables, which are then compared with the Wagner-Whitin based formulation available in the literature. Computational experiments on a large number of test data sets are performed to evaluate the different approaches. The numerical results show that our strengthened formulation outperforms all the other tested approaches in terms of linear relaxation bounds. Finally, we conclude with future research directions.
[Problem list in computer-based patient records].
Ludwig, C A
1997-01-14
Computer-based clinical information systems are capable of effectively processing even large amounts of patient-related data. However, physicians depend on rapid access to summarized, clearly laid out data on the computer screen to inform themselves about a patient's current clinical situation. In introducing a clinical workplace system, we therefore transformed the problem list, which for decades has been successfully used in clinical information management, into an electronic equivalent and integrated it into the medical record. The table contains a concise overview of diagnoses and problems as well as related findings. Graphical information can also be integrated into the table, and an additional space is provided for a summary of planned examinations or interventions. The digital form of the problem list makes it possible to use the entire list or selected text elements for generating medical documents. Diagnostic terms for medical reports are transferred automatically to the corresponding documents. Computer technology has an immense potential for the further development of problem list concepts. With multimedia applications, sound and images will be included in the problem list. Through hyperlinks, the problem list could become a central information board and table of contents of the medical record, serving as the starting point for database searches and supporting the user in navigating through the medical record.
a Novel Discrete Optimal Transport Method for Bayesian Inverse Problems
NASA Astrophysics Data System (ADS)
Bui-Thanh, T.; Myers, A.; Wang, K.; Thiery, A.
2017-12-01
We present the Augmented Ensemble Transform (AET) method for generating approximate samples from a high-dimensional posterior distribution as a solution to Bayesian inverse problems. Solving large-scale inverse problems is critical for some of the most relevant and impactful scientific endeavors of our time. Therefore, constructing novel methods for solving the Bayesian inverse problem in more computationally efficient ways can have a profound impact on the science community. This research derives the novel AET method for exploring a posterior by solving a sequence of linear programming problems, resulting in a series of transport maps which map prior samples to posterior samples, allowing for the computation of moments of the posterior. We show both theoretical and numerical results, indicating this method can offer superior computational efficiency when compared to other SMC methods. Most of this efficiency is derived from matrix scaling methods to solve the linear programming problem and derivative-free optimization for particle movement. We use this method to determine inter-well connectivity in a reservoir and the associated uncertainty related to certain parameters. The attached file shows the difference between the true parameter and the AET parameter in an example 3D reservoir problem. The error is within the Morozov discrepancy allowance with lower computational cost than other particle methods.
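The transport maps mentioned above are obtained from discrete optimal transport linear programs, which in generic form (weights p for the source samples, q for the target, C for the pairwise cost; not the specific augmented formulation of the AET method) read:

    \min_{T \ge 0} \; \sum_{i,j} C_{ij} T_{ij}
    \quad \text{subject to} \quad
    \sum_{j} T_{ij} = p_{i}, \qquad \sum_{i} T_{ij} = q_{j}.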
Decentralized Optimal Dispatch of Photovoltaic Inverters in Residential Distribution Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dall'Anese, Emiliano; Dhople, Sairaj V.; Johnson, Brian B.
Summary form only given. Decentralized methods for computing optimal real and reactive power setpoints for residential photovoltaic (PV) inverters are developed in this paper. It is known that conventional PV inverter controllers, which are designed to extract maximum power at unity power factor, cannot address secondary performance objectives such as voltage regulation and network loss minimization. Optimal power flow techniques can be utilized to select which inverters will provide ancillary services, and to compute their optimal real and reactive power setpoints according to well-defined performance criteria and economic objectives. Leveraging advances in sparsity-promoting regularization techniques and semidefinite relaxation, this paper shows how such problems can be solved with reduced computational burden and optimality guarantees. To enable large-scale implementation, a novel algorithmic framework is introduced - based on the so-called alternating direction method of multipliers - by which optimal power flow-type problems in this setting can be systematically decomposed into sub-problems that can be solved in a decentralized fashion by the utility and customer-owned PV systems with limited exchanges of information. Since the computational burden is shared among multiple devices and the requirement of all-to-all communication can be circumvented, the proposed optimization approach scales favorably to large distribution networks.
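For reference, the alternating direction method of multipliers mentioned above splits a problem of the form min f(x) + g(z) subject to Ax + Bz = c into the iteration below (scaled dual form); how the utility and the customer-owned PV systems map onto f, g, and the coupling constraint is specific to the paper and not reproduced here.

    x^{k+1} = \arg\min_{x}\; f(x) + \tfrac{\rho}{2}\lVert Ax + Bz^{k} - c + u^{k} \rVert_{2}^{2},
    \qquad
    z^{k+1} = \arg\min_{z}\; g(z) + \tfrac{\rho}{2}\lVert Ax^{k+1} + Bz - c + u^{k} \rVert_{2}^{2},
    \qquad
    u^{k+1} = u^{k} + Ax^{k+1} + Bz^{k+1} - c.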
Vieira, J; Cunha, M C
2011-01-01
This article describes a method for solving large nonlinear problems in two steps. The two-step solution approach takes advantage of handling smaller and simpler models and of having better starting points to improve solution efficiency. The set of nonlinear constraints (named the complicating constraints) which makes the solution of the model rather complex and time consuming is eliminated from step one. The complicating constraints are added only in the second step, so that a solution of the complete model is then found. The solution method is applied to a large-scale problem of conjunctive use of surface water and groundwater resources. The results obtained are compared with solutions determined by directly solving the complete model in a single step. In all examples the two-step solution approach allowed a significant reduction of the computation time. This potential efficiency gain of the two-step solution approach can be extremely important for work in progress, and it can be particularly useful for cases where the computation time would be a critical factor for obtaining an optimized solution in due time.
Decomposition and model selection for large contingency tables.
Dahinden, Corinne; Kalisch, Markus; Bühlmann, Peter
2010-04-01
Large contingency tables summarizing categorical variables arise in many areas. One example is in biology, where large numbers of biomarkers are cross-tabulated according to their discrete expression level. Interactions of the variables are of great interest and are generally studied with log-linear models. The structure of a log-linear model can be visually represented by a graph from which the conditional independence structure can then be easily read off. However, since the number of parameters in a saturated model grows exponentially in the number of variables, this generally comes with a heavy computational burden. Even if we restrict ourselves to models of lower-order interactions or other sparse structures, we are faced with the problem of a large number of cells which play the role of sample size. This is in sharp contrast to high-dimensional regression or classification procedures because, in addition to a high-dimensional parameter, we also have to deal with the analogue of a huge sample size. Furthermore, high-dimensional tables naturally feature a large number of sampling zeros which often leads to the nonexistence of the maximum likelihood estimate. We therefore present a decomposition approach, where we first divide the problem into several lower-dimensional problems and then combine these to form a global solution. Our methodology is computationally feasible for log-linear interaction models with many categorical variables each or some of them having many levels. We demonstrate the proposed method on simulated data and apply it to a bio-medical problem in cancer research.
NASA Technical Reports Server (NTRS)
Morrell, R. A.; Odoherty, R. J.; Ramsey, H. R.; Reynolds, C. C.; Willoughby, J. K.; Working, R. D.
1975-01-01
Data and analyses related to a variety of algorithms for solving typical large-scale scheduling and resource allocation problems are presented. The capabilities and deficiencies of various alternative problem solving strategies are discussed from the viewpoint of computer system design.
Interaction Network Estimation: Predicting Problem-Solving Diversity in Interactive Environments
ERIC Educational Resources Information Center
Eagle, Michael; Hicks, Drew; Barnes, Tiffany
2015-01-01
Intelligent tutoring systems and computer aided learning environments aimed at developing problem solving produce large amounts of transactional data which make it a challenge for both researchers and educators to understand how students work within the environment. Researchers have modeled student-tutor interactions using complex networks in…
Analysis, preliminary design and simulation systems for control-structure interaction problems
NASA Technical Reports Server (NTRS)
Park, K. C.; Alvin, Kenneth F.
1991-01-01
Software aspects of control-structure interaction (CSI) analysis are discussed. The following subject areas are covered: (1) implementation of a partitioned algorithm for simulation of large CSI problems; (2) second-order discrete Kalman filtering equations for CSI simulations; and (3) parallel computations and control of adaptive structures.
Algorithm For Optimal Control Of Large Structures
NASA Technical Reports Server (NTRS)
Salama, Moktar A.; Garba, John A..; Utku, Senol
1989-01-01
The cost of computation appears competitive with other methods. The problem is to compute the optimal control of the forced response of a structure with n degrees of freedom identified in terms of a smaller number, r, of vibrational modes. The article begins with the Hamilton-Jacobi formulation of mechanics and the use of a quadratic cost functional. Complexity is reduced by an alternative approach in which the quadratic cost functional is expressed in terms of the control variables only. This leads to the iterative solution of a second-order time-integral matrix Volterra equation of the second kind containing the optimal control vector. The cost of the algorithm, measured in terms of the number of computations required, is of the order of, or less than, the cost of prior algorithms applied to similar problems.
Parallel Computational Protein Design.
Zhou, Yichao; Donald, Bruce R; Zeng, Jianyang
2017-01-01
Computational structure-based protein design (CSPD) is an important problem in computational biology, which aims to design or improve a prescribed protein function based on a protein structure template. It provides a practical tool for real-world protein engineering applications. A popular CSPD method that guarantees to find the global minimum energy solution (GMEC) is to combine both dead-end elimination (DEE) and A* tree search algorithms. However, in this framework, the A* search algorithm can run in exponential time in the worst case, which may become the computational bottleneck of large-scale computational protein design. To address this issue, we extend and add a new module to the OSPREY program that was previously developed in the Donald lab (Gainza et al., Methods Enzymol 523:87, 2013) to implement a GPU-based massively parallel A* algorithm for improving the protein design pipeline. By exploiting the modern GPU computational framework and optimizing the computation of the heuristic function for A* search, our new program, called gOSPREY, can provide up to four orders of magnitude speedup in large protein design cases with a small memory overhead compared to the traditional A* search algorithm implementation, while still guaranteeing optimality. In addition, gOSPREY can be configured to run in a bounded-memory mode to tackle problems in which the conformation space is too large and the global optimal solution could not be computed previously. Furthermore, the GPU-based A* algorithm implemented in the gOSPREY program can be combined with state-of-the-art rotamer pruning algorithms such as iMinDEE (Gainza et al., PLoS Comput Biol 8:e1002335, 2012) and DEEPer (Hallen et al., Proteins 81:18-39, 2013) to also consider continuous backbone and side-chain flexibility.
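The GPU-parallel A* in gOSPREY is not reproduced here, but the generic A* skeleton it builds on, a best-first search guided by an admissible heuristic, can be sketched as follows; the successor, heuristic, and goal functions are placeholders supplied by the caller.

    import heapq
    import itertools

    def a_star(start, successors, heuristic, is_goal):
        """Generic A*; with an admissible heuristic the first goal popped is optimal."""
        tie = itertools.count()                                  # tie-breaker for the heap
        frontier = [(heuristic(start), next(tie), 0.0, start)]   # (f = g + h, tie, g, node)
        best_g = {start: 0.0}
        while frontier:
            _, _, g, node = heapq.heappop(frontier)
            if is_goal(node):
                return node, g
            for child, step_cost in successors(node):
                g_child = g + step_cost
                if g_child < best_g.get(child, float("inf")):
                    best_g[child] = g_child
                    heapq.heappush(frontier, (g_child + heuristic(child), next(tie), g_child, child))
        return None, float("inf")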
Domain decomposition algorithms and computation fluid dynamics
NASA Technical Reports Server (NTRS)
Chan, Tony F.
1988-01-01
In the past several years, domain decomposition has been a very popular topic, partly motivated by the potential for parallelization. While a large body of theory and algorithms has been developed for model elliptic problems, these methods are only recently starting to be tested on realistic applications. The application of some of these methods to two model problems in computational fluid dynamics is investigated: two-dimensional convection-diffusion problems and the incompressible driven cavity flow problem. The construction and analysis of efficient preconditioners for the interface operator, to be used in the iterative solution of the interface problem, is described. For the convection-diffusion problems, the effect of the convection term and its discretization on the performance of some of the preconditioners is discussed. For the driven cavity problem, the effectiveness of a class of boundary probe preconditioners is discussed.
System balance analysis for vector computers
NASA Technical Reports Server (NTRS)
Knight, J. C.; Poole, W. G., Jr.; Voight, R. G.
1975-01-01
The availability of vector processors capable of sustaining computing rates of 10^8 arithmetic results per second raised the question of whether peripheral storage devices representing current technology can keep such processors supplied with data. By examining the solution of a large banded linear system on these computers, it was found that even under ideal conditions, the processors will frequently be waiting for problem data.
PETSc Users Manual Revision 3.7
DOE Office of Scientific and Technical Information (OSTI.GOV)
Balay, Satish; Abhyankar, S.; Adams, M.
This manual describes the use of PETSc for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication.
PETSc Users Manual Revision 3.8
DOE Office of Scientific and Technical Information (OSTI.GOV)
Balay, S.; Abhyankar, S.; Adams, M.
This manual describes the use of PETSc for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc uses the MPI standard for all message-passing communication.
Eigensolver for a Sparse, Large Hermitian Matrix
NASA Technical Reports Server (NTRS)
Tisdale, E. Robert; Oyafuso, Fabiano; Klimeck, Gerhard; Brown, R. Chris
2003-01-01
A parallel-processing computer program finds a few eigenvalues in a sparse Hermitian matrix that contains as many as 100 million diagonal elements. This program finds the eigenvalues faster, using less memory, than do other, comparable eigensolver programs. This program implements a Lanczos algorithm in the American National Standards Institute/ International Organization for Standardization (ANSI/ISO) C computing language, using the Message Passing Interface (MPI) standard to complement an eigensolver in PARPACK. [PARPACK (Parallel Arnoldi Package) is an extension, to parallel-processing computer architectures, of ARPACK (Arnoldi Package), which is a collection of Fortran 77 subroutines that solve large-scale eigenvalue problems.] The eigensolver runs on Beowulf clusters of computers at the Jet Propulsion Laboratory (JPL).
Fast Solution in Sparse LDA for Binary Classification
NASA Technical Reports Server (NTRS)
Moghaddam, Baback
2010-01-01
An algorithm that performs sparse linear discriminant analysis (Sparse-LDA) finds near-optimal solutions in far less time than the prior art when specialized to binary classification (of 2 classes). Sparse-LDA is a type of feature- or variable-selection problem with numerous applications in statistics, machine learning, computer vision, computational finance, operations research, and bio-informatics. Because of their combinatorial nature, feature- or variable-selection problems are NP-hard or computationally intractable in cases involving more than 30 variables or features. Therefore, one typically seeks approximate solutions by means of greedy search algorithms. The prior Sparse-LDA algorithm was a greedy algorithm that considered the best variable or feature to add to or delete from its subsets in order to maximally discriminate between multiple classes of data. The present algorithm is designed for the special but prevalent case of 2-class or binary classification (e.g., 1 vs. 0, functioning vs. malfunctioning, or change vs. no change). The present algorithm provides near-optimal solutions on large real-world datasets having hundreds or even thousands of variables or features (e.g., selecting the fewest wavelength bands in a hyperspectral sensor to do terrain classification) and does so in typical computation times of minutes, as compared to the days or weeks taken by the prior art. Sparse-LDA requires solving generalized eigenvalue problems for a large number of variable subsets (represented by the submatrices of the input within-class and between-class covariance matrices). In the general (full-rank) case, the amount of computation scales at least cubically with the number of variables, and thus the size of the problems that can be solved is limited accordingly. However, in binary classification, the principal eigenvalues can be found using a special analytic formula, without resorting to costly iterative techniques. The present algorithm exploits this analytic form along with the inherent sequential nature of greedy search itself. Together, this enables the use of highly efficient partitioned-matrix-inverse techniques that result in large speedups of computation in both the forward-selection and backward-elimination stages of greedy algorithms in general.
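For the two-class case discussed above, the underlying (non-sparse) Fisher discriminant has a closed-form direction, which is the kind of analytic structure the algorithm exploits when scoring candidate variable subsets:

    w^{\star} = \arg\max_{w} \frac{w^{T} S_{B}\, w}{w^{T} S_{W}\, w} \;\propto\; S_{W}^{-1}\left(\mu_{1} - \mu_{0}\right),

where S_W and S_B are the within-class and between-class scatter (covariance) matrices and mu_0, mu_1 are the class means.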
2014-01-01
Background: Network-based learning algorithms for automated function prediction (AFP) are negatively affected by the limited coverage of experimental data and limited a priori known functional annotations. As a consequence, their application to model organisms is often restricted to well characterized biological processes and pathways, and their effectiveness with poorly annotated species is relatively limited. A possible solution to this problem might consist in the construction of big networks including multiple species, but this in turn poses challenging computational problems, due to the scalability limitations of existing algorithms and the main memory requirements induced by the construction of big networks. Distributed computation or the usage of big computers could in principle respond to these issues, but raises further algorithmic problems and requires resources that cannot be satisfied by simple off-the-shelf computers. Results: We propose a novel framework for scalable network-based learning of multi-species protein functions based on both a local implementation of existing algorithms and the adoption of innovative technologies: we solve the AFP problem “locally”, by designing “vertex-centric” implementations of network-based algorithms, but we do not give up thinking “globally” by exploiting the overall topology of the network. This is made possible by the adoption of secondary memory-based technologies that allow the efficient use of the large memory available on disks, thus overcoming the main memory limitations of modern off-the-shelf computers. This approach has been applied to the analysis of a large multi-species network including more than 300 species of bacteria and to a network with more than 200,000 proteins belonging to 13 Eukaryotic species. To our knowledge this is the first work where secondary-memory based network analysis has been applied to multi-species function prediction using biological networks with hundreds of thousands of proteins. Conclusions: The combination of these algorithmic and technological approaches makes feasible the analysis of large multi-species networks using ordinary computers with limited speed and primary memory, and, in perspective, could enable the analysis of huge networks (e.g. the whole proteomes available in SwissProt), using well-equipped stand-alone machines. PMID:24843788
NASA Astrophysics Data System (ADS)
Castro, María Eugenia; Díaz, Javier; Muñoz-Caro, Camelia; Niño, Alfonso
2011-09-01
We present a system of classes, SHMatrix, to deal in a unified way with the computation of eigenvalues and eigenvectors in real symmetric and Hermitian matrices. Thus, two descendant classes, one for the real symmetric case and the other for the Hermitian case, override the abstract methods defined in a base class. The use of the inheritance relationship and polymorphism allows handling objects of any descendant class using a single reference of the base class. The system of classes is intended to be the core element of more sophisticated methods to deal with large eigenvalue problems, such as those arising in the variational treatment of realistic quantum mechanical problems. The present system of classes allows computing a subset of all the possible eigenvalues and, optionally, the corresponding eigenvectors. Comparison with well-established solutions for analogous eigenvalue problems, such as those included in LAPACK, shows that the present solution is competitive with them. Program summary: Program title: SHMatrix. Catalogue identifier: AEHZ_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEHZ_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html. No. of lines in distributed program, including test data, etc.: 2616. No. of bytes in distributed program, including test data, etc.: 127 312. Distribution format: tar.gz. Programming language: Standard ANSI C++. Computer: PCs and workstations. Operating system: Linux, Windows. Classification: 4.8. Nature of problem: The treatment of problems involving eigensystems is a central topic in the quantum mechanical field. Here, the use of the variational approach leads to the computation of eigenvalues and eigenvectors of real symmetric and Hermitian Hamiltonian matrices. Realistic models with several degrees of freedom lead to large (sometimes very large) matrices. Different techniques, such as divide and conquer, can be used to factorize the matrices in order to apply a parallel computing approach. However, it is still interesting to have a core procedure able to tackle the computation of eigenvalues and eigenvectors once the matrix has been factorized into pieces of sufficiently small size. Several available software packages, such as LAPACK, have tackled this problem under the traditional imperative programming paradigm. In order to ease the modelling of complex quantum mechanical models, it could be interesting to apply an object-oriented approach to the treatment of the eigenproblem. This approach offers the advantage of a single, uniform treatment for the real symmetric and Hermitian cases. Solution method: To reach the above goals, we have developed a system of classes: SHMatrix. SHMatrix is composed of an abstract base class and two descendant classes, one for real symmetric matrices and the other for the Hermitian case. The object-oriented characteristics of inheritance and polymorphism allow handling both cases using a single reference of the base class. The basic computing strategy applied in SHMatrix allows computing subsets of eigenvalues and (optionally) eigenvectors. The tests performed show that SHMatrix is competitive with, and for large matrices more efficient than, the equivalent routines of the LAPACK package. Running time: The examples included in the distribution take only a couple of seconds to run.
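SHMatrix itself is written in C++; the following is only a Python analogue of the described class design (abstract base class, real-symmetric and Hermitian descendants, subset eigenvalue extraction delegated to LAPACK), meant to illustrate how inheritance and polymorphism give a single uniform interface.

# Python analogue of the SHMatrix class design (the actual package is C++);
# illustrates inheritance + polymorphism with subset eigenvalue extraction.
from abc import ABC, abstractmethod
import numpy as np
from scipy.linalg import eigh

class EigenMatrix(ABC):
    def __init__(self, a):
        self.a = np.asarray(a)
        self._check()

    @abstractmethod
    def _check(self):
        """Validate the matrix type handled by the descendant class."""

    def eigen(self, lo, hi, vectors=True):
        """Eigenvalues (and optionally eigenvectors) with indices lo..hi."""
        return eigh(self.a, subset_by_index=[lo, hi], eigvals_only=not vectors)

class RealSymmetricMatrix(EigenMatrix):
    def _check(self):
        assert np.isrealobj(self.a) and np.allclose(self.a, self.a.T)

class HermitianMatrix(EigenMatrix):
    def _check(self):
        assert np.allclose(self.a, self.a.conj().T)

# A single base-class reference handles either descendant.
for m in (RealSymmetricMatrix(np.diag([3.0, 1.0, 2.0])),
          HermitianMatrix(np.array([[2.0, 1j], [-1j, 2.0]]))):
    print(m.eigen(0, 1, vectors=False))   # two lowest eigenvalues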
Internet computer coaches for introductory physics problem solving
NASA Astrophysics Data System (ADS)
Xu Ryan, Qing
The ability to solve problems in a variety of contexts is becoming increasingly important in our rapidly changing technological society. Problem-solving is a complex process that is important for everyday life and crucial for learning physics. Although there is a great deal of effort to improve student problem solving skills throughout the educational system, national studies have shown that the majority of students emerge from such courses having made little progress toward developing good problem-solving skills. The Physics Education Research Group at the University of Minnesota has been developing Internet computer coaches to help students become more expert-like problem solvers. During the Fall 2011 and Spring 2013 semesters, the coaches were introduced into large sections (200+ students) of the calculus-based introductory mechanics course at the University of Minnesota. This dissertation will address the research background of the project, including the pedagogical design of the coaches and the assessment of problem solving. The methodological framework of conducting experiments will be explained. The data collected from the large-scale experimental studies will be discussed from the following aspects: the usage and usability of these coaches; the usefulness perceived by students; and the usefulness measured by the final exam and a problem-solving rubric. It will also address the implications drawn from this study, including using these data to direct future coach design, and the difficulties in conducting authentic assessment of problem-solving.
Lattice gas methods for computational aeroacoustics
NASA Technical Reports Server (NTRS)
Sparrow, Victor W.
1995-01-01
This paper presents the lattice gas solution to the category 1 problems of the ICASE/LaRC Workshop on Benchmark Problems in Computational Aeroacoustics. The first and second problems were solved for Delta t = Delta x = 1, and additionally the second problem was solved for Delta t = 1/4 and Delta x = 1/2. The results are striking: even for these large time and space grids the lattice gas numerical solutions are almost indistinguishable from the analytical solutions. A simple bug in the Mathematica code was found in the solutions submitted for comparison, and the comparison plots shown at the end of this volume show the bug. An Appendix to the present paper shows an example lattice gas solution with and without the bug.
The Next Frontier in Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sarrao, John
2016-11-16
Exascale computing refers to computing systems capable of at least one exaflop, or a billion billion (10^18) calculations per second. That is 50 times faster than the most powerful supercomputers being used today and represents a thousand-fold increase over the first petascale computer that came into operation in 2008. How we use these large-scale simulation resources is the key to solving some of today's most pressing problems, including clean energy production, nuclear reactor lifetime extension and nuclear stockpile aging.
He, Bo; Zhang, Shujing; Yan, Tianhong; Zhang, Tao; Liang, Yan; Zhang, Hongjin
2011-01-01
Mobile autonomous systems are very important for marine scientific investigation and military applications. Many algorithms have been studied to deal with the computational efficiency problem required for large scale simultaneous localization and mapping (SLAM) and its related accuracy and consistency. Among these methods, submap-based SLAM is one of the more effective. By combining the strengths of two popular mapping algorithms, the Rao-Blackwellised particle filter (RBPF) and the extended information filter (EIF), this paper presents combined SLAM, an efficient submap-based solution to the SLAM problem in a large scale environment. RBPF-SLAM is used to produce local maps, which are periodically fused into an EIF-SLAM algorithm. RBPF-SLAM avoids linearization of the robot model during operation and provides robust data association, while EIF-SLAM improves the overall computational speed and avoids the tendency of RBPF-SLAM to be over-confident. In order to further improve the computational speed in a real time environment, a binary-tree-based decision-making strategy is introduced. Simulation experiments show that the proposed combined SLAM algorithm significantly outperforms currently existing algorithms in terms of accuracy and consistency, as well as computing efficiency. Finally, the combined SLAM algorithm is experimentally validated in a real environment by using the Victoria Park dataset.
Efficient Implementation of an Optimal Interpolator for Large Spatial Data Sets
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.
2007-01-01
Interpolating scattered data points is a problem of wide ranging interest. A number of approaches for interpolation have been proposed both from theoretical domains such as computational geometry and in application fields such as geostatistics. Our motivation arises from geological and mining applications. In many instances data can be costly to compute and are available only at nonuniformly scattered positions. Because of the high cost of collecting measurements, high accuracy is required in the interpolants. One of the most popular interpolation methods in this field is called ordinary kriging. It is popular because it is a best linear unbiased estimator. The price for its statistical optimality is that the estimator is computationally very expensive. This is because the value of each interpolant is given by the solution of a large dense linear system. In practice, kriging problems have been solved approximately by restricting the domain to a small local neighborhood of points that lie near the query point. Determining the proper size for this neighborhood is solved by ad hoc methods, and it has been shown that this approach leads to undesirable discontinuities in the interpolant. Recently a more principled approach to approximating kriging has been proposed based on a technique called covariance tapering. This process achieves its efficiency by replacing the large dense kriging system with a much sparser linear system. This technique has been applied to a restriction of our problem, called simple kriging, which is not unbiased for general data sets. In this paper we generalize these results by showing how to apply covariance tapering to the more general problem of ordinary kriging. Through experimentation we demonstrate the space and time efficiency and accuracy of approximating ordinary kriging through the use of covariance tapering combined with iterative methods for solving large sparse systems. We demonstrate our approach on large data sizes arising both from synthetic sources and from real applications.
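A compact sketch of the tapering idea described above, not the paper's implementation: an exponential covariance is multiplied by a compactly supported Wendland taper so that the ordinary-kriging system (with its Lagrange-multiplier row for unbiasedness) becomes sparse. A sparse direct factorization is used here for brevity, whereas the paper pairs tapering with iterative solvers; all covariance parameters are hypothetical.

# Sketch of ordinary kriging with covariance tapering (not the paper's code).
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu
from scipy.spatial import cKDTree

def tapered_cov(pts, sill=1.0, rng_=10.0, taper=15.0):
    tree = cKDTree(pts)
    pairs = tree.sparse_distance_matrix(tree, taper, output_type='coo_matrix')
    h = pairs.data
    c = sill * np.exp(-h / rng_)                       # exponential covariance
    t = (1 - h / taper) ** 4 * (4 * h / taper + 1)     # Wendland taper, zero beyond 'taper'
    M = sp.coo_matrix((c * t, (pairs.row, pairs.col)), shape=pairs.shape).tolil()
    M.setdiag(sill)                                    # ensure the diagonal is present
    return M.tocsr()

def ordinary_kriging(pts, z, query, sill=1.0, rng_=10.0, taper=15.0):
    n = len(pts)
    C = tapered_cov(pts, sill, rng_, taper)
    # Ordinary-kriging system: [[C, 1], [1^T, 0]] [w; mu] = [c0; 1]
    A = sp.bmat([[C, np.ones((n, 1))], [np.ones((1, n)), None]], format='csc')
    lu = splu(A)                                       # one sparse factorization, many queries
    d = np.linalg.norm(pts - query, axis=1)
    t0 = np.where(d < taper, (1 - d / taper) ** 4 * (4 * d / taper + 1), 0.0)
    c0 = sill * np.exp(-d / rng_) * t0
    w = lu.solve(np.r_[c0, 1.0])[:n]
    return w @ z                                       # BLUE estimate at the query point

rng = np.random.default_rng(2)
pts = rng.uniform(0, 100, (500, 2))                    # scattered sample locations (synthetic)
z = np.sin(pts[:, 0] / 20.0) + 0.1 * rng.standard_normal(500)
print(ordinary_kriging(pts, z, np.array([50.0, 50.0])))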
Hesford, Andrew J.; Chew, Weng C.
2010-01-01
The distorted Born iterative method (DBIM) computes iterative solutions to nonlinear inverse scattering problems through successive linear approximations. By decomposing the scattered field into a superposition of scattering by an inhomogeneous background and by a material perturbation, large or high-contrast variations in medium properties can be imaged through iterations that are each subject to the distorted Born approximation. However, the need to repeatedly compute forward solutions still imposes a very heavy computational burden. To ameliorate this problem, the multilevel fast multipole algorithm (MLFMA) has been applied as a forward solver within the DBIM. The MLFMA computes forward solutions in linear time for volumetric scatterers. The typically regular distribution and shape of scattering elements in the inverse scattering problem allow the method to take advantage of data redundancy and reduce the computational demands of the normally expensive MLFMA setup. Additional benefits are gained by employing Kaczmarz-like iterations, where partial measurements are used to accelerate convergence. Numerical results demonstrate both the efficiency of the forward solver and the successful application of the inverse method to imaging problems with dimensions in the neighborhood of ten wavelengths. PMID:20707438
Application of NASA General-Purpose Solver to Large-Scale Computations in Aeroacoustics
NASA Technical Reports Server (NTRS)
Watson, Willie R.; Storaasli, Olaf O.
2004-01-01
Of several iterative and direct equation solvers evaluated previously for computations in aeroacoustics, the most promising was the NASA-developed General-Purpose Solver (winner of NASA's 1999 software of the year award). This paper presents detailed, single-processor statistics of the performance of this solver, which has been tailored and optimized for large-scale aeroacoustic computations. The statistics, compiled using an SGI ORIGIN 2000 computer with 12 GB of available memory (RAM) and eight available processors, are the central processing unit time, RAM requirements, and solution error. The equation solver is capable of solving 10 thousand complex unknowns in as little as 0.01 sec using 0.02 GB of RAM, and 8.4 million complex unknowns in slightly less than 3 hours using all 12 GB. This latter solution is the largest aeroacoustics problem solved to date with this technique. The study was unable to detect any noticeable error in the solution, since noise levels predicted from these solution vectors are in excellent agreement with the noise levels computed from the exact solution. The equation solver provides a means for obtaining numerical solutions to aeroacoustics problems in three dimensions.
ERIC Educational Resources Information Center
Richardson, William H., Jr.
2006-01-01
Computational precision is sometimes given short shrift in a first programming course. Treating this topic requires discussing integer and floating-point number representations and inaccuracies that may result from their use. An example of a moderately simple programming problem from elementary statistics was examined. It forced students to…
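The abstract does not say which statistics problem was used; a standard illustration of the same point (offered here only as an assumption) is the textbook one-pass variance formula, whose catastrophic cancellation in single precision is easy to demonstrate.

# The article's exact example is not given; a common stand-in is the one-pass
# variance formula, which loses precision through catastrophic cancellation
# when the mean is large relative to the spread.
import numpy as np

x = np.float32(1.0e6) + np.float32(0.1) * np.arange(10, dtype=np.float32)

one_pass = (np.sum(x * x) - np.sum(x) ** 2 / len(x)) / (len(x) - 1)   # E[x^2] - mean^2 form
two_pass = np.sum((x - np.mean(x)) ** 2) / (len(x) - 1)               # subtract the mean first

print(one_pass, two_pass)   # the float32 one-pass result is wildly wrong, possibly negative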
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pieper, Andreas; Kreutzer, Moritz; Alvermann, Andreas, E-mail: alvermann@physik.uni-greifswald.de
2016-11-15
We study Chebyshev filter diagonalization as a tool for the computation of many interior eigenvalues of very large sparse symmetric matrices. In this technique the subspace projection onto the target space of wanted eigenvectors is approximated with filter polynomials obtained from Chebyshev expansions of window functions. After the discussion of the conceptual foundations of Chebyshev filter diagonalization we analyze the impact of the choice of the damping kernel, search space size, and filter polynomial degree on the computational accuracy and effort, before we describe the necessary steps towards a parallel high-performance implementation. Because Chebyshev filter diagonalization avoids the need for matrix inversion it can deal with matrices and problem sizes that are presently not accessible with rational function methods based on direct or iterative linear solvers. To demonstrate the potential of Chebyshev filter diagonalization for large-scale problems of this kind we include as an example the computation of the 10^2 innermost eigenpairs of a topological insulator matrix with dimension 10^9 derived from quantum physics applications.
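A greatly simplified serial sketch of the technique, not the authors' high-performance implementation: the indicator function of the target interval is expanded in Chebyshev polynomials, the filter is applied to a random block through the three-term recurrence, and a Rayleigh-Ritz step extracts the interior eigenpairs. Kernel (Jackson) damping and repeated filtering, both discussed in the paper, are omitted here; they sharpen the window and improve accuracy.

# Simplified serial sketch of Chebyshev filter diagonalization.
import numpy as np
import scipy.sparse as sp

def cheb_filter_diag(A, a, b, lmin, lmax, block=16, degree=500, seed=0):
    n = A.shape[0]
    c, e = 0.5 * (lmax + lmin), 0.5 * (lmax - lmin)      # map spectrum of A into [-1, 1]
    alpha, beta = (a - c) / e, (b - c) / e               # mapped target window
    ta, tb = np.arccos(alpha), np.arccos(beta)
    coef = np.empty(degree + 1)
    coef[0] = (ta - tb) / np.pi                          # Chebyshev moments of the
    k = np.arange(1, degree + 1)                         # window's indicator function
    coef[1:] = 2.0 * (np.sin(k * ta) - np.sin(k * tb)) / (k * np.pi)

    Aop = lambda X: (A @ X - c * X) / e                  # action of the mapped matrix
    V = np.random.default_rng(seed).standard_normal((n, block))
    T_prev, T_curr = V, Aop(V)                           # T_0(B)V, T_1(B)V
    W = coef[0] * T_prev + coef[1] * T_curr
    for j in range(2, degree + 1):                       # three-term recurrence
        T_prev, T_curr = T_curr, 2.0 * Aop(T_curr) - T_prev
        W += coef[j] * T_curr

    Q, _ = np.linalg.qr(W)                               # orthonormal filtered basis
    theta, Y = np.linalg.eigh(Q.T @ (A @ Q))             # Rayleigh-Ritz step
    keep = (theta >= a) & (theta <= b)
    return theta[keep], Q @ Y[:, keep]

# Small test: interior eigenvalues of a 1D Laplacian (spectrum in [0, 4]).
n = 2000
A = sp.diags([-np.ones(n - 1), 2 * np.ones(n), -np.ones(n - 1)], [-1, 0, 1], format='csr')
vals, vecs = cheb_filter_diag(A, a=1.99, b=2.01, lmin=0.0, lmax=4.0)
print(np.sort(vals))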
NASA Technical Reports Server (NTRS)
DeBonis, James R.
2013-01-01
A computational fluid dynamics code that solves the compressible Navier-Stokes equations was applied to the Taylor-Green vortex problem to examine the code's ability to accurately simulate the vortex decay and subsequent turbulence. The code, WRLES (Wave Resolving Large-Eddy Simulation), uses explicit central-differencing to compute the spatial derivatives and explicit Low Dispersion Runge-Kutta methods for the temporal discretization. The flow was first studied and characterized using Bogey & Bailley's 13-point dispersion relation preserving (DRP) scheme. The kinetic energy dissipation rate, computed both directly and from the enstrophy field, vorticity contours, and the energy spectra are examined. Results are in excellent agreement with a reference solution obtained using a spectral method and provide insight into computations of turbulent flows. In addition the following studies were performed: a comparison of 4th-, 8th-, 12th- and DRP spatial differencing schemes, the effect of the solution filtering on the results, the effect of large-eddy simulation sub-grid scale models, and the effect of high-order discretization of the viscous terms.
NASA Astrophysics Data System (ADS)
Cartarius, Holger; Musslimani, Ziad H.; Schwarz, Lukas; Wunner, Günter
2018-03-01
The spectral renormalization method was introduced in 2005 as an effective way to compute ground states of nonlinear Schrödinger and Gross-Pitaevskii type equations. In this paper, we introduce an orthogonal spectral renormalization (OSR) method to compute ground and excited states (and their respective eigenvalues) of linear and nonlinear eigenvalue problems. The implementation of the algorithm follows four simple steps: (i) reformulate the underlying eigenvalue problem as a fixed-point equation, (ii) introduce a renormalization factor that controls the convergence properties of the iteration, (iii) perform a Gram-Schmidt orthogonalization process in order to prevent the iteration from converging to an unwanted mode, and (iv) compute the solution sought using a fixed-point iteration. The advantages of the OSR scheme over other known methods (such as Newton's and self-consistency) are (i) it allows the flexibility to choose large varieties of initial guesses without diverging, (ii) it is easy to implement especially at higher dimensions, and (iii) it can easily handle problems with complex and random potentials. The OSR method is implemented on benchmark Hermitian linear and nonlinear eigenvalue problems as well as linear and nonlinear non-Hermitian PT-symmetric models.
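The exact OSR iteration is not reproduced here; the sketch below only illustrates its three ingredients on a linear Hermitian problem: a fixed-point map, a renormalization step, and Gram-Schmidt deflation against previously converged modes so the iteration does not fall back to the ground state.

# Generic illustration (not the paper's exact OSR scheme) of a renormalized
# fixed-point iteration with Gram-Schmidt deflation for a Hermitian matrix H.
import numpy as np

def renormalized_fixed_point(H, n_states=3, shift=None, tol=1e-10, max_iter=5000, seed=0):
    n = H.shape[0]
    shift = shift if shift is not None else np.linalg.norm(H, 1)   # makes (shift*I - H) >= 0
    rng = np.random.default_rng(seed)
    states, energies = [], []
    for _ in range(n_states):
        v = rng.standard_normal(n)
        for _ in range(max_iter):
            for u in states:                       # Gram-Schmidt against converged modes
                v -= (u @ v) * u
            v /= np.linalg.norm(v)
            w = shift * v - H @ v                  # fixed-point map
            w /= np.linalg.norm(w)                 # renormalization
            if np.linalg.norm(w - v) < tol:
                break
            v = w
        states.append(v)
        energies.append(v @ H @ v)                 # eigenvalue from the Rayleigh quotient
    return np.array(energies), np.column_stack(states)

H = np.diag(np.arange(1.0, 11.0)) + 0.05 * np.ones((10, 10))      # small Hermitian test matrix
E, V = renormalized_fixed_point(H, n_states=3)
print(E)                                            # three lowest eigenvalues, in order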
IETI – Isogeometric Tearing and Interconnecting
Kleiss, Stefan K.; Pechstein, Clemens; Jüttler, Bert; Tomar, Satyendra
2012-01-01
Finite Element Tearing and Interconnecting (FETI) methods are a powerful approach to designing solvers for large-scale problems in computational mechanics. The numerical simulation problem is subdivided into a number of independent sub-problems, which are then coupled in appropriate ways. NURBS- (Non-Uniform Rational B-spline) based isogeometric analysis (IGA) applied to complex geometries requires representing the computational domain as a collection of several NURBS geometries. Since there is a natural decomposition of the computational domain into several subdomains, NURBS-based IGA is particularly well suited for using FETI methods. This paper proposes the new IsogEometric Tearing and Interconnecting (IETI) method, which combines the advanced solver design of FETI with the exact geometry representation of IGA. We describe the IETI framework for two classes of simple model problems (Poisson and linearized elasticity) and discuss the coupling of the subdomains along interfaces (both for matching interfaces and for interfaces with T-joints, i.e. hanging nodes). Special attention is paid to the construction of a suitable preconditioner for the iterative linear solver used for the interface problem. We report several computational experiments to demonstrate the performance of the proposed IETI method. PMID:24511167
Optimal estimation and scheduling in aquifer management using the rapid feedback control method
NASA Astrophysics Data System (ADS)
Ghorbanidehno, Hojat; Kokkinaki, Amalia; Kitanidis, Peter K.; Darve, Eric
2017-12-01
Management of water resources systems often involves a large number of parameters, as in the case of large, spatially heterogeneous aquifers, and a large number of "noisy" observations, as in the case of pressure observation in wells. Optimizing the operation of such systems requires both searching among many possible solutions and utilizing new information as it becomes available. However, the computational cost of this task increases rapidly with the size of the problem to the extent that textbook optimization methods are practically impossible to apply. In this paper, we present a new computationally efficient technique as a practical alternative for optimally operating large-scale dynamical systems. The proposed method, which we term Rapid Feedback Controller (RFC), provides a practical approach for combined monitoring, parameter estimation, uncertainty quantification, and optimal control for linear and nonlinear systems with a quadratic cost function. For illustration, we consider the case of a weakly nonlinear uncertain dynamical system with a quadratic objective function, specifically a two-dimensional heterogeneous aquifer management problem. To validate our method, we compare our results with the linear quadratic Gaussian (LQG) method, which is the basic approach for feedback control. We show that the computational cost of the RFC scales only linearly with the number of unknowns, a great improvement compared to the basic LQG control with a computational cost that scales quadratically. We demonstrate that the RFC method can obtain the optimal control values at a greatly reduced computational cost compared to the conventional LQG algorithm with small and controllable losses in the accuracy of the state and parameter estimation.
Optimal pre-scheduling of problem remappings
NASA Technical Reports Server (NTRS)
Nicol, David M.; Saltz, Joel H.
1987-01-01
A large class of scientific computational problems can be characterized as a sequence of steps where a significant amount of computation occurs each step, but the work performed at each step is not necessarily identical. Two good examples of this type of computation are: (1) regridding methods which change the problem discretization during the course of the computation, and (2) methods for solving sparse triangular systems of linear equations. Recent work has investigated a means of mapping such computations onto parallel processors; the method defines a family of static mappings with differing degrees of importance placed on the conflicting goals of good load balance and low communication/synchronization overhead. The performance tradeoffs are controllable by adjusting the parameters of the mapping method. To achieve good performance it may be necessary to dynamically change these parameters at run-time, but such changes can impose additional costs. If the computation's behavior can be determined prior to its execution, it is possible to construct an optimal parameter schedule using a low-order-polynomial-time dynamic programming algorithm. Since the latter can be expensive, the effect of a linear-time scheduling heuristic on performance is studied for one of the model problems, and the heuristic is shown to be effective and nearly optimal.
Translational bioinformatics in the cloud: an affordable alternative
2010-01-01
With the continued exponential expansion of publicly available genomic data and access to low-cost, high-throughput molecular technologies for profiling patient populations, computational technologies and informatics are becoming vital considerations in genomic medicine. Although cloud computing technology is being heralded as a key enabling technology for the future of genomic research, available case studies are limited to applications in the domain of high-throughput sequence data analysis. The goal of this study was to evaluate the computational and economic characteristics of cloud computing in performing a large-scale data integration and analysis representative of research problems in genomic medicine. We find that the cloud-based analysis compares favorably in both performance and cost in comparison to a local computational cluster, suggesting that cloud computing technologies might be a viable resource for facilitating large-scale translational research in genomic medicine. PMID:20691073
The application of dynamic programming in production planning
NASA Astrophysics Data System (ADS)
Wu, Run
2017-05-01
Nowadays, with the popularity of computers, various industries and fields widely apply computer information technology, which creates a huge demand for a variety of application software. In order to develop software that meets various needs at the most economical cost and with the best quality, programmers must design efficient algorithms. A superior algorithm not only solves the problem at hand, but also maximizes the benefits and generates the smallest overhead. As one of the common algorithmic techniques, dynamic programming is used to solve problems with certain optimality properties. When solving problems with a large number of sub-problems that require repetitive calculation, the ordinary recursive method consumes exponential time, whereas a dynamic programming algorithm can reduce the time complexity to the polynomial level; dynamic programming is therefore very efficient compared with other algorithms, reducing the computational complexity and enriching the computational results. In this paper, we expound the concept, basic elements, properties, core ideas, solution steps and difficulties of the dynamic programming algorithm and, in addition, establish a dynamic programming model of the production planning problem.
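As a concrete illustration of the recurrence such a model rests on (the paper's own formulation is not reproduced, and all costs and demands below are hypothetical), a minimal capacitated production-planning DP over periods and inventory states looks like this:

# Minimal production-planning DP (hypothetical data; not the paper's model).
# State: (period, on-hand inventory).  Decision: units produced this period.
# Recurrence: f(t, i) = min over p of [ setup + unit*p + hold*i' + f(t+1, i') ],
# where i' = i + p - demand[t] must stay non-negative and p <= capacity.
from functools import lru_cache

demand   = [3, 2, 4, 1, 5]    # demand per period (assumed)
capacity = 5                  # maximum units producible per period
setup, unit, hold = 10.0, 2.0, 1.0

@lru_cache(maxsize=None)
def f(t, inv):
    if t == len(demand):
        return 0.0
    best = float('inf')
    for p in range(capacity + 1):
        left = inv + p - demand[t]
        if left < 0:
            continue                                  # demand must be met
        cost = (setup if p > 0 else 0.0) + unit * p + hold * left
        best = min(best, cost + f(t + 1, left))
    return best

print("minimum total cost:", f(0, 0))

Memoization is what turns the exponential recursion into a polynomial-time computation over the (period, inventory) state space, which is the point the abstract makes.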
Evolving Non-Dominated Parameter Sets for Computational Models from Multiple Experiments
NASA Astrophysics Data System (ADS)
Lane, Peter C. R.; Gobet, Fernand
2013-03-01
Creating robust, reproducible and optimal computational models is a key challenge for theorists in many sciences. Psychology and cognitive science face particular challenges as large amounts of data are collected and many models are not amenable to analytical techniques for calculating parameter sets. Particular problems are to locate the full range of acceptable model parameters for a given dataset, and to confirm the consistency of model parameters across different datasets. Resolving these problems will provide a better understanding of the behaviour of computational models, and so support the development of general and robust models. In this article, we address these problems using evolutionary algorithms to develop parameters for computational models against multiple sets of experimental data; in particular, we propose the `speciated non-dominated sorting genetic algorithm' for evolving models in several theories. We discuss the problem of developing a model of categorisation using twenty-nine sets of data and models drawn from four different theories. We find that the evolutionary algorithms generate high quality models, adapted to provide a good fit to all available data.
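The speciated variant proposed in the article is not reproduced here; the sketch below shows only the core ingredient shared by NSGA-style methods, sorting candidate parameter sets into Pareto fronts when each set is scored against several datasets (objectives to be minimized; values are synthetic).

# Core building block of NSGA-style algorithms: non-dominated sorting of
# candidate parameter sets scored on several objectives (lower is better).
def non_dominated_sort(objectives):
    """objectives: list of tuples, one per candidate.  Returns a list of fronts (index lists)."""
    n = len(objectives)
    dominates = lambda a, b: all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
    dominated_by = [0] * n                 # how many candidates dominate i
    dominating = [[] for _ in range(n)]    # candidates that i dominates
    for i in range(n):
        for j in range(n):
            if i != j and dominates(objectives[i], objectives[j]):
                dominating[i].append(j)
            elif i != j and dominates(objectives[j], objectives[i]):
                dominated_by[i] += 1
    fronts = [[i for i in range(n) if dominated_by[i] == 0]]
    while fronts[-1]:
        nxt = []
        for i in fronts[-1]:
            for j in dominating[i]:
                dominated_by[j] -= 1
                if dominated_by[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
    return fronts[:-1]

# Example: misfits of four hypothetical parameter sets against two datasets.
print(non_dominated_sort([(0.9, 0.2), (0.3, 0.8), (0.5, 0.5), (0.8, 0.9)]))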
NASA Astrophysics Data System (ADS)
Li, Yuzhong
Using a genetic algorithm (GA) to solve the winner determination problem (WDP) with large numbers of bids and items, run under different distributions, is difficult: the search space is large, the constraints are complex, and infeasible solutions are easily produced, all of which affect the efficiency and quality of the algorithm. This paper presents an improved MKGA, including three operators (preprocessing, bid insertion, and exchange recombination) and a Monkey-King elite preservation strategy. Experimental results show that the improved MKGA is better than the SGA in required population size and computation. Problems that the traditional branch-and-bound algorithm finds hard to solve can be solved by the improved MKGA with better results.
Using Mosix for Wide-Area Computational Resources
Maddox, Brian G.
2004-01-01
One of the problems with using traditional Beowulf-type distributed processing clusters is that they require an investment in dedicated computer resources. These resources are usually needed in addition to pre-existing ones such as desktop computers and file servers. Mosix is a series of modifications to the Linux kernel that creates a virtual computer, featuring automatic load balancing by migrating processes from heavily loaded nodes to less used ones. An extension of the Beowulf concept is to run a Mosix-enabled Linux kernel on a large number of computer resources in an organization. This configuration would provide a very large amount of computational resources based on pre-existing equipment. The advantage of this method is that it provides much more processing power than a traditional Beowulf cluster without the added costs of dedicating resources.
On Undecidability Aspects of Resilient Computations and Implications to Exascale
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rao, Nageswara S
2014-01-01
Future Exascale computing systems with a large number of processors, memory elements and interconnection links are expected to experience multiple, complex faults, which affect both applications and operating-runtime systems. A variety of algorithms, frameworks and tools are being proposed to realize and/or verify the resilience properties of computations that guarantee correct results on failure-prone computing systems. We analytically show that certain resilient computation problems in the presence of general classes of faults are undecidable, that is, no algorithms exist for solving them. We first show that the membership verification in a generic set of resilient computations is undecidable. We describe classes of faults that can create infinite loops or non-halting computations, whose detection in general is undecidable. We then show certain resilient computation problems to be undecidable by using reductions from the loop detection and halting problems under two formulations, namely, an abstract programming language and Turing machines, respectively. These two reductions highlight different failure effects: the former represents program and data corruption, and the latter illustrates incorrect program execution. These results call for broad-based, well-characterized resilience approaches that complement purely computational solutions using methods such as hardware monitors, co-designs, and system- and application-specific diagnosis codes.
Compiling Planning into Quantum Optimization Problems: A Comparative Study
2015-06-07
and Sipser, M. 2000. Quantum computation by adiabatic evolution. arXiv:quant-ph/0001106. Fikes, R. E., and Nilsson, N. J. 1972. STRIPS: A new...become available: quantum annealing. Quantum annealing is one of the most accessible quantum algorithms for a computer science audience not versed...in quantum computing because of its close ties to classical optimization algorithms such as simulated annealing. While large-scale universal quantum
NASA Technical Reports Server (NTRS)
Hofmann, R.
1980-01-01
The STEALTH code system, which solves large strain, nonlinear continuum mechanics problems, was rigorously structured in both overall design and programming standards. The design is based on the theoretical elements of analysis while the programming standards attempt to establish a parallelism between physical theory, programming structure, and documentation. These features have made it easy to maintain, modify, and transport the codes. It has also guaranteed users a high level of quality control and quality assurance.
Algorithms in nature: the convergence of systems biology and computational thinking
Navlakha, Saket; Bar-Joseph, Ziv
2011-01-01
Computer science and biology have enjoyed a long and fruitful relationship for decades. Biologists rely on computational methods to analyze and integrate large data sets, while several computational methods were inspired by the high-level design principles of biological systems. Recently, these two directions have been converging. In this review, we argue that thinking computationally about biological processes may lead to more accurate models, which in turn can be used to improve the design of algorithms. We discuss the similar mechanisms and requirements shared by computational and biological processes and then present several recent studies that apply this joint analysis strategy to problems related to coordination, network analysis, and tracking and vision. We also discuss additional biological processes that can be studied in a similar manner and link them to potential computational problems. With the rapid accumulation of data detailing the inner workings of biological systems, we expect this direction of coupling biological and computational studies to greatly expand in the future. PMID:22068329
Applied Distributed Model Predictive Control for Energy Efficient Buildings and Ramp Metering
NASA Astrophysics Data System (ADS)
Koehler, Sarah Muraoka
Industrial large-scale control problems present an interesting algorithmic design challenge. A number of controllers must cooperate in real-time on a network of embedded hardware with limited computing power in order to maximize system efficiency while respecting constraints and despite communication delays. Model predictive control (MPC) can automatically synthesize a centralized controller which optimizes an objective function subject to a system model, constraints, and predictions of disturbance. Unfortunately, the computations required by model predictive controllers for large-scale systems often limit its industrial implementation only to medium-scale slow processes. Distributed model predictive control (DMPC) enters the picture as a way to decentralize a large-scale model predictive control problem. The main idea of DMPC is to split the computations required by the MPC problem amongst distributed processors that can compute in parallel and communicate iteratively to find a solution. Some popularly proposed solutions are distributed optimization algorithms such as dual decomposition and the alternating direction method of multipliers (ADMM). However, these algorithms ignore two practical challenges: substantial communication delays present in control systems and also problem non-convexity. This thesis presents two novel and practically effective DMPC algorithms. The first DMPC algorithm is based on a primal-dual active-set method which achieves fast convergence, making it suitable for large-scale control applications which have a large communication delay across its communication network. In particular, this algorithm is suited for MPC problems with a quadratic cost, linear dynamics, forecasted demand, and box constraints. We measure the performance of this algorithm and show that it significantly outperforms both dual decomposition and ADMM in the presence of communication delay. The second DMPC algorithm is based on an inexact interior point method which is suited for nonlinear optimization problems. The parallel computation of the algorithm exploits iterative linear algebra methods for the main linear algebra computations in the algorithm. We show that the splitting of the algorithm is flexible and can thus be applied to various distributed platform configurations. The two proposed algorithms are applied to two main energy and transportation control problems. The first application is energy efficient building control. Buildings represent 40% of energy consumption in the United States. Thus, it is significant to improve the energy efficiency of buildings. The goal is to minimize energy consumption subject to the physics of the building (e.g. heat transfer laws), the constraints of the actuators as well as the desired operating constraints (thermal comfort of the occupants), and heat load on the system. In this thesis, we describe the control systems of forced air building systems in practice. We discuss the "Trim and Respond" algorithm which is a distributed control algorithm that is used in practice, and show that it performs similarly to a one-step explicit DMPC algorithm. Then, we apply the novel distributed primal-dual active-set method and provide extensive numerical results for the building MPC problem. The second main application is the control of ramp metering signals to optimize traffic flow through a freeway system. This application is particularly important since urban congestion has more than doubled in the past few decades. 
The ramp metering problem is to maximize freeway throughput subject to freeway dynamics (derived from mass conservation), actuation constraints, freeway capacity constraints, and predicted traffic demand. In this thesis, we develop a hybrid model predictive controller for ramp metering that is guaranteed to be persistently feasible and stable. This contrasts to previous work on MPC for ramp metering where such guarantees are absent. We apply a smoothing method to the hybrid model predictive controller and apply the inexact interior point method to this nonlinear non-convex ramp metering problem.
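For contrast with the thesis's two algorithms, the following is a minimal sketch of the consensus ADMM baseline mentioned above, applied to a separable quadratic objective; the local x-updates are the pieces that would run in parallel on separate processors, and the averaging step is the communication round. It is not the primal-dual active-set or inexact interior-point method the thesis develops.

# Minimal consensus ADMM for  minimize sum_i 0.5 x^T A_i x - b_i^T x  over a shared x.
import numpy as np

def consensus_admm(As, bs, rho=1.0, iters=200):
    m, n = len(As), As[0].shape[0]
    x = np.zeros((m, n))           # local copies (one per subsystem / processor)
    u = np.zeros((m, n))           # scaled dual variables
    z = np.zeros(n)                # consensus variable
    for _ in range(iters):
        for i in range(m):         # local updates: can run in parallel on separate nodes
            x[i] = np.linalg.solve(As[i] + rho * np.eye(n), bs[i] + rho * (z - u[i]))
        z = (x + u).mean(axis=0)   # gather/average step (the communication round)
        u += x - z                 # dual update
    return z

rng = np.random.default_rng(3)
n, m = 4, 3
As = [np.eye(n) + 0.1 * (lambda M: M @ M.T)(rng.standard_normal((n, n))) for _ in range(m)]
bs = [rng.standard_normal(n) for _ in range(m)]
z = consensus_admm(As, bs)
print("ADMM:  ", z)
print("direct:", np.linalg.solve(sum(As), sum(bs)))   # centralized solution for comparison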
Supersonic reacting internal flow fields
NASA Technical Reports Server (NTRS)
Drummond, J. Philip
1989-01-01
The national program to develop a trans-atmospheric vehicle has kindled a renewed interest in the modeling of supersonic reacting flows. A supersonic combustion ramjet, or scramjet, has been proposed to provide the propulsion system for this vehicle. The development of computational techniques for modeling supersonic reacting flow fields, and the application of these techniques to an increasingly difficult set of combustion problems are studied. Since the scramjet problem has been largely responsible for motivating this computational work, a brief history is given of hypersonic vehicles and their propulsion systems. A discussion is also given of some early modeling efforts applied to high speed reacting flows. Current activities to develop accurate and efficient algorithms and improved physical models for modeling supersonic combustion is then discussed. Some new problems where computer codes based on these algorithms and models are being applied are described.
Computer-Based Assessment of Complex Problem Solving: Concept, Implementation, and Application
ERIC Educational Resources Information Center
Greiff, Samuel; Wustenberg, Sascha; Holt, Daniel V.; Goldhammer, Frank; Funke, Joachim
2013-01-01
Complex Problem Solving (CPS) skills are essential to successfully deal with environments that change dynamically and involve a large number of interconnected and partially unknown causal influences. The increasing importance of such skills in the 21st century requires appropriate assessment and intervention methods, which in turn rely on adequate…
Fast reconstruction of optical properties for complex segmentations in near infrared imaging
NASA Astrophysics Data System (ADS)
Jiang, Jingjing; Wolf, Martin; Sánchez Majos, Salvador
2017-04-01
The intrinsic ill-posed nature of the inverse problem in near infrared imaging makes the reconstruction of fine details of objects deeply embedded in turbid media challenging even for the large amounts of data provided by time-resolved cameras. In addition, most reconstruction algorithms for this type of measurements are only suitable for highly symmetric geometries and rely on a linear approximation to the diffusion equation since a numerical solution of the fully non-linear problem is computationally too expensive. In this paper, we will show that a problem of practical interest can be successfully addressed making efficient use of the totality of the information supplied by time-resolved cameras. We set aside the goal of achieving high spatial resolution for deep structures and focus on the reconstruction of complex arrangements of large regions. We show numerical results based on a combined approach of wavelength-normalized data and prior geometrical information, defining a fully parallelizable problem in arbitrary geometries for time-resolved measurements. Fast reconstructions are obtained using a diffusion approximation and Monte-Carlo simulations, parallelized in a multicore computer and a GPU respectively.
Numerical simulation using vorticity-vector potential formulation
NASA Technical Reports Server (NTRS)
Tokunaga, Hiroshi
1993-01-01
An accurate and efficient computational method is needed for three-dimensional incompressible viscous flows in engineering applications. When solving turbulent shear flows directly, or when using a subgrid-scale model, it is indispensable to resolve the small-scale fluid motions as well as the large-scale motions. From this point of view, the pseudo-spectral method has so far been used as the computational method. However, finite difference or finite element methods are widely applied to flows of practical importance since these methods are easily adapted to flows with complex geometric configurations. Several problems arise, though, in applying the finite difference method to direct and large eddy simulations. Accuracy is one of the most important. This point was already addressed by the present author in direct simulations of the instability of the plane Poiseuille flow and of the transition to turbulence. In order to obtain high efficiency, a multi-grid Poisson solver is combined with the higher-order, accurate finite difference method. The formulation is also one of the most important issues in applying the finite difference method to incompressible turbulent flows. The three-dimensional Navier-Stokes equations have so far been solved in the primitive-variables formulation. One of the major difficulties of this approach is the rigorous satisfaction of the equation of continuity. In general, a staggered grid is used to satisfy the solenoidal condition for the velocity field at the wall boundary. In the vorticity-vector potential formulation, however, the velocity field satisfies the equation of continuity automatically. For this reason, the vorticity-vector potential method was extended to the generalized coordinate system. In the present article, we adopt the vorticity-vector potential formulation, the generalized coordinate system, and a 4th-order accurate difference method as the computational method. We present the computational method and apply it to computations of flows in a square cavity at large Reynolds number in order to investigate its effectiveness.
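The key property motivating the formulation, automatic satisfaction of continuity, can be shown in a few lines. The 2D sketch below (not the author's 4th-order generalized-coordinate code) solves the Poisson equation for the scalar potential from a prescribed vorticity with plain Jacobi sweeps (a multigrid solver would be used in practice), recovers the velocity by central differences, and checks that the discrete divergence vanishes to round-off.

# 2D illustration of the vorticity-vector potential idea: solve
# laplacian(psi) = -omega, recover u = d(psi)/dy, v = -d(psi)/dx, and verify
# that the central-difference divergence is identically zero in the interior.
import numpy as np

n, h = 65, 1.0 / 64
x = np.linspace(0.0, 1.0, n)
X, Y = np.meshgrid(x, x, indexing='ij')
omega = np.sin(np.pi * X) * np.sin(np.pi * Y)        # prescribed vorticity field

psi = np.zeros((n, n))                               # psi = 0 on the boundary
for _ in range(5000):                                # Jacobi sweeps for laplacian(psi) = -omega
    psi[1:-1, 1:-1] = 0.25 * (psi[2:, 1:-1] + psi[:-2, 1:-1] +
                              psi[1:-1, 2:] + psi[1:-1, :-2] + h * h * omega[1:-1, 1:-1])

u = (psi[1:-1, 2:] - psi[1:-1, :-2]) / (2 * h)       # u =  d(psi)/dy
v = -(psi[2:, 1:-1] - psi[:-2, 1:-1]) / (2 * h)      # v = -d(psi)/dx

div = ((u[2:, 1:-1] - u[:-2, 1:-1]) + (v[1:-1, 2:] - v[1:-1, :-2])) / (2 * h)
print("max |div u| =", np.abs(div).max())            # zero to round-off: continuity is automatic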
Structural performance analysis and redesign
NASA Technical Reports Server (NTRS)
Whetstone, W. D.
1978-01-01
Program performs stress, buckling, and vibrational analysis of large, linear finite-element systems in excess of 50,000 degrees of freedom. Cost, execution time, and storage requirements are kept reasonable through use of sparse matrix solution techniques and other computational and data management procedures designed for problems of very large size.
Remote access to very large image repositories, a high performance computing perspective
NASA Technical Reports Server (NTRS)
Plesea, Lucian
2005-01-01
The main challenges of using the increasingly large repositories of remote imagery data can be summarized in one word: efficiency. In this paper, a number of concrete problems and the chosen solutions are described, based on the construction of a 5TB global Landsat 7 mosaic.
Parallel Computing for Probabilistic Response Analysis of High Temperature Composites
NASA Technical Reports Server (NTRS)
Sues, R. H.; Lua, Y. J.; Smith, M. D.
1994-01-01
The objective of this Phase I research was to establish the required software and hardware strategies to achieve large scale parallelism in solving PCM problems. To meet this objective, several investigations were conducted. First, we identified the multiple levels of parallelism in PCM and the computational strategies to exploit these parallelisms. Next, several software and hardware efficiency investigations were conducted. These involved the use of three different parallel programming paradigms and solution of two example problems on both a shared-memory multiprocessor and a distributed-memory network of workstations.
NASA Astrophysics Data System (ADS)
Gross, Lutz; Altinay, Cihan; Fenwick, Joel; Smith, Troy
2014-05-01
The program package escript has been designed for solving mathematical modeling problems using Python (see Gross et al. (2013)). Its development and maintenance have been funded by the Australian Commonwealth to provide open source software infrastructure for the Australian Earth Science community (recent funding by the Australian Geophysical Observing System EIF (AGOS) and the AuScope Collaborative Research Infrastructure Scheme (CRIS)). The key concepts of escript are based on the terminology of spatial functions and partial differential equations (PDEs) - an approach providing abstraction from the underlying spatial discretization method (i.e. the finite element method (FEM)). This feature presents a programming environment to the user which is easy to use even for complex models. Because implementations are independent of the underlying data structures, simulations are easily portable across desktop computers and scalable compute clusters without modifications to the program code. escript has been successfully applied in a variety of applications including modeling mantle convection, melting processes, volcanic flow, earthquakes, faulting, multi-phase flow, block caving and mineralization (see Poulet et al. 2013). The recent escript release (see Gross et al. (2013)) provides an open framework for solving joint inversion problems for geophysical data sets (potential field, seismic and electro-magnetic). The strategy is based on the idea of formulating the inversion problem as an optimization problem with PDE constraints where the cost function is defined by the data defect and the regularization term for the rock properties, see Gross & Kemp (2013). This first-optimize-then-discretize approach avoids assembling the (in general dense) sensitivity matrix used in conventional approaches where discrete programming techniques are applied to the discretized problem (first-discretize-then-optimize). In this paper we will discuss the mathematical framework for inversion and appropriate solution schemes in escript. We will also give a brief introduction into escript's open framework for defining and solving geophysical inversion problems. Finally we will show some benchmark results to demonstrate the computational scalability of the inversion method across a large number of cores and compute nodes in a parallel computing environment. References: - L. Gross et al. (2013): Escript Solving Partial Differential Equations in Python Version 3.4, The University of Queensland, https://launchpad.net/escript-finley - L. Gross and C. Kemp (2013) Large Scale Joint Inversion of Geophysical Data using the Finite Element Method in escript. ASEG Extended Abstracts 2013, http://dx.doi.org/10.1071/ASEG2013ab306 - T. Poulet, L. Gross, D. Georgiev, J. Cleverley (2012): escript-RT: Reactive transport simulation in Python using escript, Computers & Geosciences, Volume 45, 168-176. http://dx.doi.org/10.1016/j.cageo.2011.11.005.
A Genetic Algorithm for Flow Shop Scheduling with Assembly Operations to Minimize Makespan
NASA Astrophysics Data System (ADS)
Bhongade, A. S.; Khodke, P. M.
2014-04-01
Manufacturing systems in which several parts are processed through machining workstations and later assembled to form final products are common. Though the scheduling of such problems is solved using heuristics, available solution approaches can provide solutions only for moderately sized problems due to the large computation time required. In this work, a scheduling approach is developed for such a flow-shop manufacturing system having machining workstations followed by assembly workstations. The initial schedule is generated using the Disjunctive method, and a genetic algorithm (GA) is applied further for generating schedules for large-sized problems. GA is found to give near-optimal solutions based on the deviation of the makespan from the lower bound. The lower bound of the makespan of such problems is estimated, and the percent deviation of the makespan from the lower bound is used as a performance measure to evaluate the schedules. Computational experiments are conducted on problems developed using a fractional factorial orthogonal array, varying the number of parts per product, the number of products, and the number of workstations (ranging up to 1,520 operations). A statistical analysis indicated the significance of all three factors considered. It is concluded that the GA method can obtain the optimal makespan.
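Two evaluation pieces used in studies of this kind are sketched below with synthetic data: the permutation flow-shop makespan, which serves as the GA fitness, and the standard machine-based lower bound used for the percent-deviation measure. The assembly stage of the paper's problem is omitted here.

# Flow-shop makespan (GA fitness) and a machine-based lower bound (synthetic data).
import numpy as np

def makespan(p, order):
    """p[j][m]: processing time of job j on machine m; order: job sequence."""
    n_mach = p.shape[1]
    c = np.zeros(n_mach)                       # completion time front on each machine
    for j in order:
        c[0] += p[j, 0]
        for m in range(1, n_mach):
            c[m] = max(c[m], c[m - 1]) + p[j, m]
    return c[-1]

def machine_lower_bound(p):
    n_jobs, n_mach = p.shape
    lb = 0.0
    for m in range(n_mach):
        before = p[:, :m].sum(axis=1).min() if m > 0 else 0.0
        after = p[:, m + 1:].sum(axis=1).min() if m < n_mach - 1 else 0.0
        lb = max(lb, before + p[:, m].sum() + after)
    return lb

rng = np.random.default_rng(4)
p = rng.integers(1, 10, size=(8, 4)).astype(float)     # 8 jobs, 4 machines (synthetic)
best = min(makespan(p, rng.permutation(8)) for _ in range(2000))
print("best of 2000 random sequences:", best, " lower bound:", machine_lower_bound(p))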
Parallel scalability of Hartree-Fock calculations
NASA Astrophysics Data System (ADS)
Chow, Edmond; Liu, Xing; Smelyanskiy, Mikhail; Hammond, Jeff R.
2015-03-01
Quantum chemistry is increasingly performed using large cluster computers consisting of multiple interconnected nodes. For a fixed molecular problem, the efficiency of a calculation usually decreases as more nodes are used, due to the cost of communication between the nodes. This paper empirically investigates the parallel scalability of Hartree-Fock calculations. The construction of the Fock matrix and the density matrix calculation are analyzed separately. For the former, we use a parallelization of Fock matrix construction based on a static partitioning of work followed by a work stealing phase. For the latter, we use density matrix purification from the linear scaling methods literature, but without using sparsity. When using large numbers of nodes for moderately sized problems, density matrix computations are network-bandwidth bound, making purification methods potentially faster than eigendecomposition methods.
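A serial sketch of the purification idea referred to above (grand-canonical McWeeny form, not the paper's distributed implementation): the density matrix is driven to the spectral projector below the chemical potential using only matrix-matrix products, with no eigendecomposition.

# Serial sketch of density-matrix purification (grand-canonical McWeeny form).
import numpy as np

def purify_density(F, mu, iters=60):
    """Return the density matrix P = step(mu*I - F) by McWeeny purification."""
    n = F.shape[0]
    lam = np.linalg.norm(F - mu * np.eye(n), 2)              # bound on |spectrum of F - mu|
    P = 0.5 * np.eye(n) - (F - mu * np.eye(n)) / (2 * lam)   # eigenvalues mapped into [0, 1]
    for _ in range(iters):
        P2 = P @ P
        P = 3 * P2 - 2 * P2 @ P                              # McWeeny step: eigenvalues -> 0 or 1
    return P

# Toy "Fock" matrix; compare against an explicit eigendecomposition.
rng = np.random.default_rng(5)
A = rng.standard_normal((50, 50))
F = 0.5 * (A + A.T)
mu = 0.0
P = purify_density(F, mu)
w, V = np.linalg.eigh(F)
P_exact = (V * (w < mu)) @ V.T
print("max |P - P_exact| =", np.abs(P - P_exact).max())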
OpenMP parallelization of a gridded SWAT (SWATG)
NASA Astrophysics Data System (ADS)
Zhang, Ying; Hou, Jinliang; Cao, Yongpan; Gu, Juan; Huang, Chunlin
2017-12-01
Large-scale, long-term and high spatial resolution simulation is a common issue in environmental modeling. A Gridded Hydrologic Response Unit (HRU)-based Soil and Water Assessment Tool (SWATG) that integrates a grid modeling scheme with different spatial representations also presents such problems. The long computation times limit applications of very high resolution, large-scale watershed modeling. The OpenMP (Open Multi-Processing) parallel application interface is integrated with SWATG (called SWATGP) to accelerate grid modeling at the HRU level. Such a parallel implementation takes better advantage of the computational power of a shared-memory computer system. We conducted two experiments at multiple temporal and spatial scales of hydrological modeling using SWATG and SWATGP on a high-end server. At 500-m resolution, SWATGP was found to be up to nine times faster than SWATG in modeling a roughly 2000 km2 watershed with a 1-CPU, 15-thread configuration. The study results demonstrate that parallel models save considerable time relative to traditional sequential simulation runs. Parallel computation of environmental models is beneficial for model applications, especially at large spatial and temporal scales and at high resolutions. The proposed SWATGP model is thus a promising tool for large-scale and high-resolution water resources research and management, in addition to offering data fusion and model coupling ability.
Dynamic remapping of parallel computations with varying resource demands
NASA Technical Reports Server (NTRS)
Nicol, D. M.; Saltz, J. H.
1986-01-01
A large class of computational problems is characterized by frequent synchronization, and computational requirements which change as a function of time. When such a problem must be solved on a message passing multiprocessor machine, the combination of these characteristics lead to system performance which decreases in time. Performance can be improved with periodic redistribution of computational load; however, redistribution can exact a sometimes large delay cost. We study the issue of deciding when to invoke a global load remapping mechanism. Such a decision policy must effectively weigh the costs of remapping against the performance benefits. We treat this problem by constructing two analytic models which exhibit stochastically decreasing performance. One model is quite tractable; we are able to describe the optimal remapping algorithm, and the optimal decision policy governing when to invoke that algorithm. However, computational complexity prohibits the use of the optimal remapping decision policy. We then study the performance of a general remapping policy on both analytic models. This policy attempts to minimize a statistic W(n) which measures the system degradation (including the cost of remapping) per computation step over a period of n steps. We show that as a function of time, the expected value of W(n) has at most one minimum, and that when this minimum exists it defines the optimal fixed-interval remapping policy. Our decision policy appeals to this result by remapping when it estimates that W(n) is minimized. Our performance data suggests that this policy effectively finds the natural frequency of remapping. We also use the analytic models to express the relationship between performance and remapping cost, number of processors, and the computation's stochastic activity.
State estimation of spatio-temporal phenomena
NASA Astrophysics Data System (ADS)
Yu, Dan
This dissertation addresses the state estimation problem of spatio-temporal phenomena which can be modeled by partial differential equations (PDEs), such as pollutant dispersion in the atmosphere. After discretizing the PDE, the dynamical system has a large number of degrees of freedom (DOF). State estimation using the Kalman Filter (KF) is computationally intractable, and hence, a reduced order model (ROM) needs to be constructed first. Moreover, the nonlinear terms, external disturbances or unknown boundary conditions can be modeled as unknown inputs, which leads to an unknown input filtering problem. Furthermore, the performance of the KF could be improved by placing sensors at feasible locations. Therefore, the sensor scheduling problem to place multiple mobile sensors is of interest. The first part of the dissertation focuses on model reduction for large scale systems with a large number of inputs/outputs. A commonly used model reduction algorithm, the balanced proper orthogonal decomposition (BPOD) algorithm, is not computationally tractable for large systems with a large number of inputs/outputs. Inspired by the BPOD and randomized algorithms, we propose a randomized proper orthogonal decomposition (RPOD) algorithm and a computationally optimal RPOD (RPOD*) algorithm, which construct an ROM to capture the input-output behaviour of the full order model, while reducing the computational cost of BPOD by orders of magnitude. It is demonstrated that the proposed RPOD* algorithm can construct the ROM in real time, and the performance of the proposed algorithms is illustrated on different advection-diffusion equations. Next, we consider the state estimation problem of linear discrete-time systems with unknown inputs which can be treated as a wide-sense stationary process with rational power spectral density, while no other prior information needs to be known. We propose an autoregressive (AR) model based unknown input realization technique which allows us to recover the input statistics from the output data by solving an appropriate least squares problem, then fit an AR model to the recovered input statistics and construct an innovations model of the unknown inputs using the eigensystem realization algorithm. The proposed algorithm is shown in several examples to outperform the augmented two-stage Kalman Filter (ASKF) and the unbiased minimum-variance (UMV) algorithm. Finally, we propose a framework to place multiple mobile sensors to optimize the long-term performance of the KF in the estimation of the state of a PDE. The major challenges are that placing multiple sensors is an NP-hard problem, and the optimization problem is non-convex in general. In this dissertation, first, we construct an ROM using the RPOD* algorithm, and then reduce the feasible sensor locations into a subset using the ROM. The Information Space Receding Horizon Control (I-RHC) approach and a modified Monte Carlo Tree Search (MCTS) approach are applied to solve the sensor scheduling problem using the subset. Various applications have been provided to demonstrate the performance of the proposed approach.
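The RPOD* algorithm itself is not reproduced here; the sketch below is a generic randomized range-finder applied to a snapshot matrix, which conveys why random projections cut the cost of extracting dominant modes compared with a full decomposition (all dimensions and data are synthetic).

# Generic randomized range-finder / POD sketch (not the dissertation's RPOD*).
import numpy as np

def randomized_pod(snapshots, r, oversample=10, seed=0):
    """Return the leading r POD modes and singular values of 'snapshots' (columns = states)."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((snapshots.shape[1], r + oversample))
    Y = snapshots @ Omega                     # sample the range of the snapshot matrix
    Q, _ = np.linalg.qr(Y)                    # orthonormal basis for the sampled range
    B = Q.T @ snapshots                       # small projected matrix
    Ub, s, _ = np.linalg.svd(B, full_matrices=False)
    return Q @ Ub[:, :r], s[:r]               # approximate POD modes and singular values

# Synthetic low-rank-plus-noise snapshot matrix (hypothetical data).
rng = np.random.default_rng(6)
n_state, n_snap, true_rank = 20000, 300, 12
snaps = rng.standard_normal((n_state, true_rank)) @ rng.standard_normal((true_rank, n_snap))
snaps += 1e-3 * rng.standard_normal((n_state, n_snap))
modes, sing = randomized_pod(snaps, r=12)
print(sing)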
Jackin, Boaz Jessie; Watanabe, Shinpei; Ootsu, Kanemitsu; Ohkawa, Takeshi; Yokota, Takashi; Hayasaki, Yoshio; Yatagai, Toyohiko; Baba, Takanobu
2018-04-20
A parallel computation method for large-size Fresnel computer-generated hologram (CGH) is reported. The method was introduced by us in an earlier report as a technique for calculating Fourier CGH from 2D object data. In this paper we extend the method to compute Fresnel CGH from 3D object data. The scale of the computation problem is also expanded to 2 gigapixels, making it closer to real application requirements. The significant feature of the reported method is its ability to avoid communication overhead and thereby fully utilize the computing power of parallel devices. The method exhibits three layers of parallelism that favor small to large scale parallel computing machines. Simulation and optical experiments were conducted to demonstrate the workability and to evaluate the efficiency of the proposed technique. A two-times improvement in computation speed has been achieved compared to the conventional method, on a 16-node cluster (one GPU per node) utilizing only one layer of parallelism. A 20-times improvement in computation speed has been estimated utilizing two layers of parallelism on a very large-scale parallel machine with 16 nodes, where each node has 16 GPUs.
NASA Technical Reports Server (NTRS)
Anderson, B. H.; Putt, C. W.; Giamati, C. C.
1981-01-01
Color coding techniques used in the processing of remote sensing imagery were adapted and applied to the fluid dynamics problems associated with turbofan mixer nozzles. The computer generated color graphics were found to be useful in reconstructing the measured flow field from low resolution experimental data to give more physical meaning to this information and in scanning and interpreting the large volume of computer generated data from the three dimensional viscous computer code used in the analysis.
Leahy, P.P.
1982-01-01
The Trescott computer program for modeling groundwater flow in three dimensions has been modified to (1) treat aquifer and confining bed pinchouts more realistically and (2) reduce the computer memory requirements needed for the input data. Using the original program, simulation of aquifer systems with nonrectangular external boundaries may result in a large number of nodes that are not involved in the numerical solution of the problem, but require computer storage. (USGS)
The Next Frontier in Computing
Sarrao, John
2018-06-13
Exascale computing refers to computing systems capable of at least one exaflop, or a billion billion (10^18) calculations per second. That is 50 times faster than the most powerful supercomputers in use today and represents a thousand-fold increase over the first petascale computer, which came into operation in 2008. How we use these large-scale simulation resources is the key to solving some of today's most pressing problems, including clean energy production, nuclear reactor lifetime extension and nuclear stockpile aging.
Infrared Ship Classification Using A New Moment Pattern Recognition Concept
NASA Astrophysics Data System (ADS)
Casasent, David; Pauly, John; Fetterly, Donald
1982-03-01
An analysis of the statistics of the moments and the conventional invariant moments shows that the variance of the latter becomes quite large as the order of the moments and the degree of invariance increase. Moreover, the need to whiten the error volume increases with the order and degree, but so does the computational load associated with computing the whitening operator. We thus advance a new estimation approach to the use of moments in pattern recognition that overcomes these problems. This work is supported by experimental verification and demonstration on an infrared ship pattern recognition problem. The computational load associated with our new algorithm is also shown to be very low.
NASA Technical Reports Server (NTRS)
Tam, Christopher K. W.; Fang, Jun; Kurbatskii, Konstantin A.
1996-01-01
A set of nonhomogeneous radiation and outflow conditions which automatically generate prescribed incoming acoustic or vorticity waves and, at the same time, are transparent to outgoing sound waves produced internally in a finite computation domain is proposed. This type of boundary condition is needed for the numerical solution of many exterior aeroacoustics problems. In computational aeroacoustics, the computation scheme must be as nondispersive and nondissipative as possible. It must also support waves with wave speeds which are nearly the same as those of the original linearized Euler equations. To meet these requirements, a high-order/large-stencil scheme is necessary. The proposed nonhomogeneous radiation and outflow boundary conditions are designed primarily for use in conjunction with such high-order/large-stencil finite difference schemes.
Variational approach to probabilistic finite elements
NASA Technical Reports Server (NTRS)
Belytschko, T.; Liu, W. K.; Mani, A.; Besterfield, G.
1991-01-01
Probabilistic finite element methods (PFEM), synthesizing the power of finite element methods with second-moment techniques, are formulated for various classes of problems in structural and solid mechanics. Time-invariant random materials, geometric properties and loads are incorporated in terms of their fundamental statistics viz. second-moments. Analogous to the discretization of the displacement field in finite element methods, the random fields are also discretized. Preserving the conceptual simplicity, the response moments are calculated with minimal computations. By incorporating certain computational techniques, these methods are shown to be capable of handling large systems with many sources of uncertainties. By construction, these methods are applicable when the scale of randomness is not very large and when the probabilistic density functions have decaying tails. The accuracy and efficiency of these methods, along with their limitations, are demonstrated by various applications. Results obtained are compared with those of Monte Carlo simulation and it is shown that good accuracy can be obtained for both linear and nonlinear problems. The methods are amenable to implementation in deterministic FEM based computer codes.
Variational approach to probabilistic finite elements
NASA Astrophysics Data System (ADS)
Belytschko, T.; Liu, W. K.; Mani, A.; Besterfield, G.
1991-08-01
Probabilistic finite element methods (PFEM), synthesizing the power of finite element methods with second-moment techniques, are formulated for various classes of problems in structural and solid mechanics. Time-invariant random materials, geometric properties and loads are incorporated in terms of their fundamental statistics viz. second-moments. Analogous to the discretization of the displacement field in finite element methods, the random fields are also discretized. Preserving the conceptual simplicity, the response moments are calculated with minimal computations. By incorporating certain computational techniques, these methods are shown to be capable of handling large systems with many sources of uncertainties. By construction, these methods are applicable when the scale of randomness is not very large and when the probabilistic density functions have decaying tails. The accuracy and efficiency of these methods, along with their limitations, are demonstrated by various applications. Results obtained are compared with those of Monte Carlo simulation and it is shown that good accuracy can be obtained for both linear and nonlinear problems. The methods are amenable to implementation in deterministic FEM based computer codes.
Variational approach to probabilistic finite elements
NASA Technical Reports Server (NTRS)
Belytschko, T.; Liu, W. K.; Mani, A.; Besterfield, G.
1987-01-01
Probabilistic finite element methods (PFEM), synthesizing the power of finite element methods with second-moment techniques, are formulated for various classes of problems in structural and solid mechanics. Time-invariant random materials, geometric properties, and loads are incorporated in terms of their fundamental statistics viz. second-moments. Analogous to the discretization of the displacement field in finite element methods, the random fields are also discretized. Preserving the conceptual simplicity, the response moments are calculated with minimal computations. By incorporating certain computational techniques, these methods are shown to be capable of handling large systems with many sources of uncertainties. By construction, these methods are applicable when the scale of randomness is not very large and when the probabilistic density functions have decaying tails. The accuracy and efficiency of these methods, along with their limitations, are demonstrated by various applications. Results obtained are compared with those of Monte Carlo simulation and it is shown that good accuracy can be obtained for both linear and nonlinear problems. The methods are amenable to implementation in deterministic FEM based computer codes.
Studies in nonlinear problems of energy. Final report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matkowsky, B.J.
1998-12-01
The author completed a successful research program on Nonlinear Problems of Energy, with emphasis on combustion and flame propagation. A total of 183 papers associated with the grant have appeared in the literature, and the efforts have twice been recognized by DOE's Basic Science Division for Top Accomplishment. In the research program the author concentrated on modeling, analysis and computation of combustion phenomena, with particular emphasis on the transition from laminar to turbulent combustion. Thus he investigated the nonlinear dynamics and pattern formation in the successive stages of transition. He described the stability of combustion waves, and transitions to waves exhibiting progressively higher degrees of spatio-temporal complexity. Combustion waves are characterized by large activation energies, so that chemical reactions are significant only in thin layers, termed reaction zones. In the limit of infinite activation energy, the zones shrink to moving surfaces, termed fronts, which must be found during the course of the analysis, so that the problems are moving free boundary problems. The analytical studies were carried out for the limiting case with fronts, while the numerical studies were carried out for the case of finite, though large, activation energy. Accurate resolution of the solution in the reaction zone(s) is essential, otherwise false predictions of dynamical behavior are possible. Since the reaction zones move, and their location is not known a priori, the author has developed adaptive pseudo-spectral methods, which have proven to be very useful for the accurate, efficient computation of solutions of combustion, and other, problems. The approach is based on a combination of analytical and numerical methods. The numerical computations built on and extended the information obtained analytically. Furthermore, the solutions obtained analytically served as benchmarks for testing the accuracy of the solutions determined computationally. Finally, the computational results suggested new analysis to be considered. A cumulative list of publications citing the grant makes up the contents of this report.
Styopin, Nikita E; Vershinin, Anatoly V; Zingerman, Konstantin M; Levin, Vladimir A
2016-09-01
Different variants of the Uzawa algorithm are compared with one another. The comparison is performed for the case in which this algorithm is applied to large-scale systems of linear algebraic equations. These systems arise in the finite-element solution of the problems of elasticity theory for incompressible materials. A modification of the Uzawa algorithm is proposed. Computational experiments show that this modification improves the convergence of the Uzawa algorithm for the problems of solid mechanics. The results of computational experiments show that each variant of the Uzawa algorithm considered has its advantages and disadvantages and may be convenient in one case or another.
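For reference, the classical Uzawa iteration for the saddle-point systems mentioned above (A u + B^T p = f, B u = g, with A symmetric positive definite) alternates a primal solve with a dual ascent step; the specific modification proposed by the authors is not reproduced here, and the small dense matrices are illustrative only.

```python
# Classical Uzawa iteration sketch for a saddle-point system (illustrative example).
import numpy as np

def uzawa(A, B, f, g, rho=0.5, tol=1e-8, max_iter=2000):
    p = np.zeros(B.shape[0])
    u = np.zeros(A.shape[0])
    for _ in range(max_iter):
        u = np.linalg.solve(A, f - B.T @ p)   # primal update (A is SPD)
        r = B @ u - g                         # constraint residual
        if np.linalg.norm(r) < tol:
            break
        p = p + rho * r                       # dual (pressure) ascent step
    return u, p

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    M = rng.standard_normal((8, 8))
    A = M @ M.T + 8 * np.eye(8)               # synthetic SPD stiffness block
    B = rng.standard_normal((3, 8))           # synthetic constraint block
    f, g = rng.standard_normal(8), rng.standard_normal(3)
    u, p = uzawa(A, B, f, g)
    print("constraint residual:", np.linalg.norm(B @ u - g))
```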
Afshar, Yaser; Sbalzarini, Ivo F.
2016-01-01
Modern fluorescence microscopy modalities, such as light-sheet microscopy, are capable of acquiring large three-dimensional images at high data rate. This creates a bottleneck in computational processing and analysis of the acquired images, as the rate of acquisition outpaces the speed of processing. Moreover, images can be so large that they do not fit the main memory of a single computer. We address both issues by developing a distributed parallel algorithm for segmentation of large fluorescence microscopy images. The method is based on the versatile Discrete Region Competition algorithm, which has previously proven useful in microscopy image segmentation. The present distributed implementation decomposes the input image into smaller sub-images that are distributed across multiple computers. Using network communication, the computers orchestrate the collective solution of the global segmentation problem. This not only enables segmentation of large images (we test images of up to 10^10 pixels), but also accelerates segmentation to match the time scale of image acquisition. Such acquisition-rate image segmentation is a prerequisite for the smart microscopes of the future and enables online data compression and interactive experiments. PMID:27046144
Afshar, Yaser; Sbalzarini, Ivo F
2016-01-01
Modern fluorescence microscopy modalities, such as light-sheet microscopy, are capable of acquiring large three-dimensional images at high data rate. This creates a bottleneck in computational processing and analysis of the acquired images, as the rate of acquisition outpaces the speed of processing. Moreover, images can be so large that they do not fit the main memory of a single computer. We address both issues by developing a distributed parallel algorithm for segmentation of large fluorescence microscopy images. The method is based on the versatile Discrete Region Competition algorithm, which has previously proven useful in microscopy image segmentation. The present distributed implementation decomposes the input image into smaller sub-images that are distributed across multiple computers. Using network communication, the computers orchestrate the collective solution of the global segmentation problem. This not only enables segmentation of large images (we test images of up to 10^10 pixels), but also accelerates segmentation to match the time scale of image acquisition. Such acquisition-rate image segmentation is a prerequisite for the smart microscopes of the future and enables online data compression and interactive experiments.
Fast Katz and Commuters: Efficient Estimation of Social Relatedness in Large Networks
NASA Astrophysics Data System (ADS)
Esfandiar, Pooya; Bonchi, Francesco; Gleich, David F.; Greif, Chen; Lakshmanan, Laks V. S.; On, Byung-Won
Motivated by social network data mining problems such as link prediction and collaborative filtering, significant research effort has been devoted to computing topological measures including the Katz score and the commute time. Existing approaches typically approximate all pairwise relationships simultaneously. In this paper, we are interested in computing two quantities: the score for a single pair of nodes, and the top-k nodes with the best scores from a given source node. For the pairwise problem, we apply an iterative algorithm that computes upper and lower bounds for the measures we seek. This algorithm exploits a relationship between the Lanczos process and a quadrature rule. For the top-k problem, we propose an algorithm that only accesses a small portion of the graph and is related to techniques used in personalized PageRank computing. To test the scalability and accuracy of our algorithms we experiment with three real-world networks and find that these algorithms run in milliseconds to seconds without any preprocessing.
Fast katz and commuters : efficient estimation of social relatedness in large networks.
DOE Office of Scientific and Technical Information (OSTI.GOV)
On, Byung-Won; Lakshmanan, Laks V. S.; Greif, Chen
Motivated by social network data mining problems such as link prediction and collaborative filtering, significant research effort has been devoted to computing topological measures including the Katz score and the commute time. Existing approaches typically approximate all pairwise relationships simultaneously. In this paper, we are interested in computing: the score for a single pair of nodes, and the top-k nodes with the best scores from a given source node. For the pairwise problem, we apply an iterative algorithm that computes upper and lower bounds for the measures we seek. This algorithm exploits a relationship between the Lanczos process and a quadrature rule. For the top-k problem, we propose an algorithm that only accesses a small portion of the graph and is related to techniques used in personalized PageRank computing. To test the scalability and accuracy of our algorithms we experiment with three real-world networks and find that these algorithms run in milliseconds to seconds without any preprocessing.
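For context, the single-pair Katz score discussed in these two records is the (i, j) entry of (I - alpha*A)^{-1} - I. The papers bound this quantity with a Lanczos/quadrature procedure; the sketch below simply evaluates the definition with one sparse linear solve per query, which is adequate only for modest graphs and uses an arbitrary synthetic adjacency matrix.

```python
# Single-pair Katz score via a sparse linear solve (illustrative, not the paper's method).
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def katz_pair(A, i, j, alpha):
    """Katz score between nodes i and j of sparse adjacency A (needs alpha < 1/lambda_max)."""
    n = A.shape[0]
    e_j = np.zeros(n)
    e_j[j] = 1.0
    x = spla.spsolve(sp.eye(n, format="csc") - alpha * A.tocsc(), e_j)
    return x[i] - (1.0 if i == j else 0.0)     # subtract the identity term

if __name__ == "__main__":
    A = sp.random(500, 500, density=0.01, random_state=0)
    A = ((A + A.T) > 0).astype(float)          # symmetric 0/1 adjacency matrix
    lam_max = spla.eigsh(A, k=1, return_eigenvectors=False)[0]
    print(katz_pair(A, 3, 7, alpha=0.5 / lam_max))
```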
Final Report: Large-Scale Optimization for Bayesian Inference in Complex Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ghattas, Omar
2013-10-15
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focuses on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. Our research is directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. Our efforts are integrated in the context of a challenging testbed problem that considers subsurface reacting flow and transport. The MIT component of the SAGUARO Project addresses the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.
Jiang, Yuyi; Shao, Zhiqing; Guo, Yi
2014-01-01
A complex computing problem can be solved efficiently on a system with multiple computing nodes by dividing its implementation code into several parallel processing modules or tasks that can be formulated as directed acyclic graph (DAG) problems. The DAG jobs may be mapped to and scheduled on the computing nodes to minimize the total execution time. Searching for an optimal DAG scheduling solution is considered to be NP-complete. This paper proposes a tuple molecular structure-based chemical reaction optimization (TMSCRO) method for DAG scheduling on heterogeneous computing systems, based on a very recently proposed metaheuristic method, chemical reaction optimization (CRO). Compared with other CRO-based algorithms for DAG scheduling, the design of the tuple reaction molecular structure and the four elementary reaction operators of TMSCRO is more reasonable. TMSCRO also applies the concepts of constrained critical paths (CCPs), the constrained-critical-path directed acyclic graph (CCPDAG) and the super molecule for accelerating convergence. In this paper, we have also conducted simulation experiments to verify the effectiveness and efficiency of TMSCRO upon a large set of randomly generated graphs and graphs for real world problems. PMID:25143977
Jiang, Yuyi; Shao, Zhiqing; Guo, Yi
2014-01-01
A complex computing problem can be solved efficiently on a system with multiple computing nodes by dividing its implementation code into several parallel processing modules or tasks that can be formulated as directed acyclic graph (DAG) problems. The DAG jobs may be mapped to and scheduled on the computing nodes to minimize the total execution time. Searching for an optimal DAG scheduling solution is considered to be NP-complete. This paper proposes a tuple molecular structure-based chemical reaction optimization (TMSCRO) method for DAG scheduling on heterogeneous computing systems, based on a very recently proposed metaheuristic method, chemical reaction optimization (CRO). Compared with other CRO-based algorithms for DAG scheduling, the design of the tuple reaction molecular structure and the four elementary reaction operators of TMSCRO is more reasonable. TMSCRO also applies the concepts of constrained critical paths (CCPs), the constrained-critical-path directed acyclic graph (CCPDAG) and the super molecule for accelerating convergence. In this paper, we have also conducted simulation experiments to verify the effectiveness and efficiency of TMSCRO upon a large set of randomly generated graphs and graphs for real world problems.
Addressing the computational cost of large EIT solutions.
Boyle, Alistair; Borsic, Andrea; Adler, Andy
2012-05-01
Electrical impedance tomography (EIT) is a soft field tomography modality based on the application of electric current to a body and measurement of voltages through electrodes at the boundary. The interior conductivity is reconstructed on a discrete representation of the domain using a finite-element method (FEM) mesh and a parametrization of that domain. The reconstruction requires a sequence of numerically intensive calculations. There is strong interest in reducing the cost of these calculations. An improvement in the compute time for current problems would encourage further exploration of computationally challenging problems such as the incorporation of time series data, wide-spread adoption of three-dimensional simulations and correlation of other modalities such as CT and ultrasound. Multicore processors offer an opportunity to reduce EIT computation times but may require some restructuring of the underlying algorithms to maximize the use of available resources. This work profiles two EIT software packages (EIDORS and NDRM) to experimentally determine where the computational costs arise in EIT as problems scale. Sparse matrix solvers, a key component for the FEM forward problem and sensitivity estimates in the inverse problem, are shown to take a considerable portion of the total compute time in these packages. A sparse matrix solver performance measurement tool, Meagre-Crowd, is developed to interface with a variety of solvers and compare their performance over a range of two- and three-dimensional problems of increasing node density. Results show that distributed sparse matrix solvers that operate on multiple cores are advantageous up to a limit that increases as the node density increases. We recommend a selection procedure to find a solver and hardware arrangement matched to the problem and provide guidance and tools to perform that selection.
Playing with Technology: Is It All Bad?
ERIC Educational Resources Information Center
Slutsky, Ruslan; Slutsky, Mindy; DeShelter, Lori M.
2014-01-01
Technology now plays a very large role in the way children of all ages play. Children want access to technology, so parents and teachers must determine the best ways to present it to them. Computers are a popular form of technology for children as young as age three. With that in mind, computer games should be problem-solving oriented and…
ERIC Educational Resources Information Center
Kent, Thomas H.; And Others
The advantages, feasibility and problems associated with a student-paced course were investigated, and a computer managed evaluation system compared to paper and pencil testing mode. The development of a self-paced course was facilitated by explicit behavior objectives, a variety of learning materials referenced to the objectives and a large pool…
Simultaneous analysis and design
NASA Technical Reports Server (NTRS)
Haftka, R. T.
1984-01-01
Optimization techniques are increasingly being used for performing nonlinear structural analysis. The development of element by element (EBE) preconditioned conjugate gradient (CG) techniques is expected to extend this trend to linear analysis. Under these circumstances the structural design problem can be viewed as a nested optimization problem. There are computational benefits to treating this nested problem as a large single optimization problem. The response variables (such as displacements) and the structural parameters are all treated as design variables in a unified formulation which performs simultaneously the design and analysis. Two examples are used for demonstration. A seventy-two bar truss is optimized subject to linear stress constraints and a wing box structure is optimized subject to nonlinear collapse constraints. Both examples show substantial computational savings with the unified approach as compared to the traditional nested approach.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Heydari, M.H., E-mail: heydari@stu.yazd.ac.ir; The Laboratory of Quantum Information Processing, Yazd University, Yazd; Hooshmandasl, M.R., E-mail: hooshmandasl@yazd.ac.ir
Because of the nonlinearity, closed-form solutions of many important stochastic functional equations are virtually impossible to obtain. Thus, numerical solutions are a viable alternative. In this paper, a new computational method based on the generalized hat basis functions together with their stochastic operational matrix of Itô-integration is proposed for solving nonlinear stochastic Itô integral equations in large intervals. In the proposed method, a new technique for computing nonlinear terms in such problems is presented. The main advantage of the proposed method is that it transforms problems under consideration into nonlinear systems of algebraic equations which can be simply solved. Error analysis of the proposed method is investigated and also the efficiency of this method is shown on some concrete examples. The obtained results reveal that the proposed method is very accurate and efficient. As two useful applications, the proposed method is applied to obtain approximate solutions of the stochastic population growth models and stochastic pendulum problem.
VALIDATION OF ANSYS FINITE ELEMENT ANALYSIS SOFTWARE
DOE Office of Scientific and Technical Information (OSTI.GOV)
HAMM, E.R.
2003-06-27
This document provides a record of the verification and validation of the ANSYS Version 7.0 software that is installed on selected CH2M HILL computers. The issues addressed include: software verification, installation, validation, configuration management and error reporting. The ANSYS® computer program is a large scale multi-purpose finite element program which may be used for solving several classes of engineering analysis. The analysis capabilities of ANSYS Full Mechanical Version 7.0 installed on selected CH2M Hill Hanford Group (CH2M HILL) Intel processor based computers include the ability to solve static and dynamic structural analyses, steady-state and transient heat transfer problems, mode-frequency and buckling eigenvalue problems, static or time-varying magnetic analyses and various types of field and coupled-field applications. The program contains many special features which allow nonlinearities or secondary effects to be included in the solution, such as plasticity, large strain, hyperelasticity, creep, swelling, large deflections, contact, stress stiffening, temperature dependency, material anisotropy, and thermal radiation. The ANSYS program has been in commercial use since 1970, and has been used extensively in the aerospace, automotive, construction, electronic, energy services, manufacturing, nuclear, plastics, oil and steel industries.
Spectral Regularization Algorithms for Learning Large Incomplete Matrices.
Mazumder, Rahul; Hastie, Trevor; Tibshirani, Robert
2010-03-01
We use convex relaxation techniques to provide a sequence of regularized low-rank solutions for large-scale matrix completion problems. Using the nuclear norm as a regularizer, we provide a simple and very efficient convex algorithm for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm Soft-Impute iteratively replaces the missing elements with those obtained from a soft-thresholded SVD. With warm starts this allows us to efficiently compute an entire regularization path of solutions on a grid of values of the regularization parameter. The computationally intensive part of our algorithm is in computing a low-rank SVD of a dense matrix. Exploiting the problem structure, we show that the task can be performed with a complexity linear in the matrix dimensions. Our semidefinite-programming algorithm is readily scalable to large matrices: for example it can obtain a rank-80 approximation of a 10^6 × 10^6 incomplete matrix with 10^5 observed entries in 2.5 hours, and can fit a rank 40 approximation to the full Netflix training set in 6.6 hours. Our methods show very good performance both in training and test error when compared to other competitive state-of-the-art techniques.
Spectral Regularization Algorithms for Learning Large Incomplete Matrices
Mazumder, Rahul; Hastie, Trevor; Tibshirani, Robert
2010-01-01
We use convex relaxation techniques to provide a sequence of regularized low-rank solutions for large-scale matrix completion problems. Using the nuclear norm as a regularizer, we provide a simple and very efficient convex algorithm for minimizing the reconstruction error subject to a bound on the nuclear norm. Our algorithm Soft-Impute iteratively replaces the missing elements with those obtained from a soft-thresholded SVD. With warm starts this allows us to efficiently compute an entire regularization path of solutions on a grid of values of the regularization parameter. The computationally intensive part of our algorithm is in computing a low-rank SVD of a dense matrix. Exploiting the problem structure, we show that the task can be performed with a complexity linear in the matrix dimensions. Our semidefinite-programming algorithm is readily scalable to large matrices: for example it can obtain a rank-80 approximation of a 10^6 × 10^6 incomplete matrix with 10^5 observed entries in 2.5 hours, and can fit a rank 40 approximation to the full Netflix training set in 6.6 hours. Our methods show very good performance both in training and test error when compared to other competitive state-of-the-art techniques. PMID:21552465
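A compact sketch of the Soft-Impute iteration described above: repeatedly fill the missing entries with the current estimate and soft-threshold the singular values of the result. A dense full SVD is used here for clarity, whereas the paper exploits problem structure to compute a low-rank SVD; the synthetic low-rank matrix and the regularization value are illustrative.

```python
# Soft-Impute sketch: soft-thresholded SVD on the completed matrix (illustrative).
import numpy as np

def soft_impute(X, observed_mask, lam, n_iter=100):
    Z = np.zeros_like(X)
    for _ in range(n_iter):
        filled = np.where(observed_mask, X, Z)        # observed data + current fill-in
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s = np.maximum(s - lam, 0.0)                  # soft-threshold the singular values
        Z = (U * s) @ Vt
    return Z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.standard_normal((100, 8)) @ rng.standard_normal((8, 100))  # rank-8 matrix
    mask = rng.random(truth.shape) < 0.3              # 30% of entries observed
    Z = soft_impute(truth, mask, lam=5.0)
    err = np.linalg.norm((Z - truth)[~mask]) / np.linalg.norm(truth[~mask])
    print("relative error on missing entries:", round(err, 3))
```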
GPU-accelerated computing for Lagrangian coherent structures of multi-body gravitational regimes
NASA Astrophysics Data System (ADS)
Lin, Mingpei; Xu, Ming; Fu, Xiaoyu
2017-04-01
Based on a well-established theoretical foundation, Lagrangian Coherent Structures (LCSs) have elicited widespread research on the intrinsic structures of dynamical systems in many fields, including the field of astrodynamics. Although the application of LCSs in dynamical problems seems straightforward theoretically, its associated computational cost is prohibitive. We propose a block decomposition algorithm developed on Compute Unified Device Architecture (CUDA) platform for the computation of the LCSs of multi-body gravitational regimes. In order to take advantage of GPU's outstanding computing properties, such as Shared Memory, Constant Memory, and Zero-Copy, the algorithm utilizes a block decomposition strategy to facilitate computation of finite-time Lyapunov exponent (FTLE) fields of arbitrary size and timespan. Simulation results demonstrate that this GPU-based algorithm can satisfy double-precision accuracy requirements and greatly decrease the time needed to calculate final results, increasing speed by approximately 13 times. Additionally, this algorithm can be generalized to various large-scale computing problems, such as particle filters, constellation design, and Monte-Carlo simulation.
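A serial CPU sketch of the FTLE computation that the GPU block-decomposition algorithm accelerates: integrate a grid of tracers to obtain the flow map, differentiate it with finite differences to form the Cauchy-Green tensor, and take FTLE = ln(sqrt(lambda_max))/|T|. The simple saddle velocity field is a stand-in for the multi-body gravitational dynamics treated in the paper.

```python
# FTLE field sketch on a 2-D grid (illustrative saddle flow, not the paper's dynamics).
import numpy as np

def flow_map(x, y, T=2.0, steps=200):
    dt = T / steps
    for _ in range(steps):                 # forward Euler on the field u = (x, -y)
        x, y = x + dt * x, y - dt * y
    return x, y

def ftle_field(n=200, T=2.0):
    x, y = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n))
    fx, fy = flow_map(x, y, T)
    dx = dy = 2.0 / (n - 1)
    # Flow-map Jacobian by central differences on the grid
    J11 = np.gradient(fx, dx, axis=1); J12 = np.gradient(fx, dy, axis=0)
    J21 = np.gradient(fy, dx, axis=1); J22 = np.gradient(fy, dy, axis=0)
    # Largest eigenvalue of the Cauchy-Green tensor C = J^T J (2x2, closed form)
    a = J11**2 + J21**2; b = J11*J12 + J21*J22; d = J12**2 + J22**2
    lam_max = 0.5 * (a + d + np.sqrt((a - d)**2 + 4 * b**2))
    return np.log(np.sqrt(lam_max)) / abs(T)

if __name__ == "__main__":
    print("max FTLE on the grid:", ftle_field().max())   # ~1 for this saddle flow
```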
The application of artificial intelligence techniques to large distributed networks
NASA Technical Reports Server (NTRS)
Dubyah, R.; Smith, T. R.; Star, J. L.
1985-01-01
Data accessibility and information transfer efforts, including the land resources information system pilot, are structured as large computer information networks. The goals of these pilot efforts include reducing the difficulty of finding and using data, reducing processing costs, and minimizing incompatibility between data sources. Artificial Intelligence (AI) techniques were suggested to achieve these goals. The applicability of certain AI techniques is explored in the context of distributed problem solving systems and the pilot land data system (PLDS). The topics discussed include: PLDS and its data processing requirements, expert systems and PLDS, distributed problem solving systems, AI problem solving paradigms, query processing, and distributed data bases.
NASA Astrophysics Data System (ADS)
Jiang, Xikai; Li, Jiyuan; Zhao, Xujun; Qin, Jian; Karpeev, Dmitry; Hernandez-Ortiz, Juan; de Pablo, Juan J.; Heinonen, Olle
2016-08-01
Large classes of materials systems in physics and engineering are governed by magnetic and electrostatic interactions. Continuum or mesoscale descriptions of such systems can be cast in terms of integral equations, whose direct computational evaluation requires O(N^2) operations, where N is the number of unknowns. Such a scaling, which arises from the many-body nature of the relevant Green's function, has precluded wide-spread adoption of integral methods for solution of large-scale scientific and engineering problems. In this work, a parallel computational approach is presented that relies on using scalable open source libraries and utilizes a kernel-independent Fast Multipole Method (FMM) to evaluate the integrals in O(N) operations, with O(N) memory cost, thereby substantially improving the scalability and efficiency of computational integral methods. We demonstrate the accuracy, efficiency, and scalability of our approach in the context of two examples. In the first, we solve a boundary value problem for a ferroelectric/ferromagnetic volume in free space. In the second, we solve an electrostatic problem involving polarizable dielectric bodies in an unbounded dielectric medium. The results from these test cases show that our proposed parallel approach, which is built on a kernel-independent FMM, can enable highly efficient and accurate simulations and allow for considerable flexibility in a broad range of applications.
Munoz, F. D.; Hobbs, B. F.; Watson, J. -P.
2016-02-01
A novel two-phase bounding and decomposition approach to compute optimal and near-optimal solutions to large-scale mixed-integer investment planning problems is proposed that considers a large number of operating subproblems, each of which is a convex optimization. Our motivating application is the planning of power transmission and generation in which policy constraints are designed to incentivize high amounts of intermittent generation in electric power systems. The bounding phase exploits Jensen’s inequality to define a lower bound, which we extend to stochastic programs that use expected-value constraints to enforce policy objectives. The decomposition phase, in which the bounds are tightened, improves upon the standard Benders’ algorithm by accelerating the convergence of the bounds. The lower bound is tightened by using a Jensen’s inequality-based approach to introduce an auxiliary lower bound into the Benders master problem. Upper bounds for both phases are computed using a sub-sampling approach executed on a parallel computer system. Numerical results show that only the bounding phase is necessary if loose optimality gaps are acceptable, but the decomposition phase is required to attain tighter optimality gaps. Moreover, use of both phases performs better, in terms of convergence speed, than attempting to solve the problem using just the bounding phase or regular Benders decomposition separately.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Munoz, F. D.; Hobbs, B. F.; Watson, J. -P.
A novel two-phase bounding and decomposition approach to compute optimal and near-optimal solutions to large-scale mixed-integer investment planning problems is proposed that considers a large number of operating subproblems, each of which is a convex optimization. Our motivating application is the planning of power transmission and generation in which policy constraints are designed to incentivize high amounts of intermittent generation in electric power systems. The bounding phase exploits Jensen’s inequality to define a lower bound, which we extend to stochastic programs that use expected-value constraints to enforce policy objectives. The decomposition phase, in which the bounds are tightened, improves upon the standard Benders’ algorithm by accelerating the convergence of the bounds. The lower bound is tightened by using a Jensen’s inequality-based approach to introduce an auxiliary lower bound into the Benders master problem. Upper bounds for both phases are computed using a sub-sampling approach executed on a parallel computer system. Numerical results show that only the bounding phase is necessary if loose optimality gaps are acceptable, but the decomposition phase is required to attain tighter optimality gaps. Moreover, use of both phases performs better, in terms of convergence speed, than attempting to solve the problem using just the bounding phase or regular Benders decomposition separately.
Jiang, Xikai; Li, Jiyuan; Zhao, Xujun; ...
2016-08-10
Large classes of materials systems in physics and engineering are governed by magnetic and electrostatic interactions. Continuum or mesoscale descriptions of such systems can be cast in terms of integral equations, whose direct computational evaluation requires O(N^2) operations, where N is the number of unknowns. Such a scaling, which arises from the many-body nature of the relevant Green's function, has precluded wide-spread adoption of integral methods for solution of large-scale scientific and engineering problems. In this work, a parallel computational approach is presented that relies on using scalable open source libraries and utilizes a kernel-independent Fast Multipole Method (FMM) to evaluate the integrals in O(N) operations, with O(N) memory cost, thereby substantially improving the scalability and efficiency of computational integral methods. We demonstrate the accuracy, efficiency, and scalability of our approach in the context of two examples. In the first, we solve a boundary value problem for a ferroelectric/ferromagnetic volume in free space. In the second, we solve an electrostatic problem involving polarizable dielectric bodies in an unbounded dielectric medium. Lastly, the results from these test cases show that our proposed parallel approach, which is built on a kernel-independent FMM, can enable highly efficient and accurate simulations and allow for considerable flexibility in a broad range of applications.
Comprehensive Materials and Morphologies Study of Ion Traps (COMMIT) for Scalable Quantum Computation - Final Report
2012-04-21
Trapped ion systems are extremely promising for large-scale quantum computation, but face a vexing problem with motional quantum ... the photoelectric effect. The typical shortest wavelengths needed for ion traps range from 194 nm for Hg+ to 493 nm for Ba+, corresponding to 6.4-2.5 ...
Daxini, S D; Prajapati, J M
2014-01-01
Meshfree methods are viewed as next generation computational techniques. With the evident limitations of conventional grid based methods, like FEM, in dealing with problems of fracture mechanics, large deformation, and simulation of manufacturing processes, meshfree methods have gained much attention from researchers. A number of meshfree methods have been proposed to date for analyzing complex problems in various fields of engineering. The present work attempts to review recent developments and some earlier applications of well-known meshfree methods like EFG and MLPG to various types of structure mechanics and fracture mechanics applications like bending, buckling, free vibration analysis, sensitivity analysis and topology optimization, single and mixed mode crack problems, fatigue crack growth, and dynamic crack analysis, and some typical applications like vibration of cracked structures, thermoelastic crack problems, and failure transition in impact problems. Due to the complex nature of meshfree shape functions and the evaluation of integrals over the domain, meshless methods are computationally expensive compared to conventional mesh based methods. Some improved versions of the original meshfree methods and other techniques suggested by researchers to improve the computational efficiency of meshfree methods are also reviewed here.
Input-output oriented computation algorithms for the control of large flexible structures
NASA Technical Reports Server (NTRS)
Minto, K. D.
1989-01-01
An overview is given of work in progress aimed at developing computational algorithms addressing two important aspects in the control of large flexible space structures; namely, the selection and placement of sensors and actuators, and the resulting multivariable control law design problem. The issue of sensor/actuator set selection is particularly crucial to obtaining a satisfactory control design, as clearly a poor choice will inherently limit the degree to which good control can be achieved. With regard to control law design, the researchers are driven by concerns stemming from the practical issues associated with eventual implementation of multivariable control laws, such as reliability, limit protection, multimode operation, sampling rate selection, processor throughput, etc. Naturally, the burden imposed by dealing with these aspects of the problem can be reduced by ensuring that the complexity of the compensator is minimized. The approach to these problems is based on extensions to input/output oriented techniques that have proven useful in the design of multivariable control systems for aircraft engines. In particular, the researchers are exploring the use of relative gain analysis and the condition number as a means of quantifying the process of sensor/actuator selection and placement for shape control of a large space platform.
Scaling up to address data science challenges
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wendelberger, Joanne R.
Statistics and Data Science provide a variety of perspectives and technical approaches for exploring and understanding Big Data. Partnerships between scientists from different fields such as statistics, machine learning, computer science, and applied mathematics can lead to innovative approaches for addressing problems involving increasingly large amounts of data in a rigorous and effective manner that takes advantage of advances in computing. Here, this article will explore various challenges in Data Science and will highlight statistical approaches that can facilitate analysis of large-scale data including sampling and data reduction methods, techniques for effective analysis and visualization of large-scale simulations, and algorithms and procedures for efficient processing.
Recursive partitioned inversion of large (1500 x 1500) symmetric matrices
NASA Technical Reports Server (NTRS)
Putney, B. H.; Brownd, J. E.; Gomez, R. A.
1976-01-01
A recursive algorithm was designed to invert large, dense, symmetric, positive definite matrices using small amounts of computer core, i.e., a small fraction of the core needed to store the complete matrix. The described algorithm is a generalized Gaussian elimination technique. Other algorithms are also discussed for the Cholesky decomposition and step inversion techniques. The purpose of the inversion algorithm is to solve large linear systems of normal equations generated by working geodetic problems. The algorithm was incorporated into a computer program called SOLVE. In the past the SOLVE program has been used in obtaining solutions published as the Goddard earth models.
Scaling up to address data science challenges
Wendelberger, Joanne R.
2017-04-27
Statistics and Data Science provide a variety of perspectives and technical approaches for exploring and understanding Big Data. Partnerships between scientists from different fields such as statistics, machine learning, computer science, and applied mathematics can lead to innovative approaches for addressing problems involving increasingly large amounts of data in a rigorous and effective manner that takes advantage of advances in computing. Here, this article will explore various challenges in Data Science and will highlight statistical approaches that can facilitate analysis of large-scale data including sampling and data reduction methods, techniques for effective analysis and visualization of large-scale simulations, and algorithms and procedures for efficient processing.
Clocks in Feynman's computer and Kitaev's local Hamiltonian: Bias, gaps, idling, and pulse tuning
NASA Astrophysics Data System (ADS)
Caha, Libor; Landau, Zeph; Nagaj, Daniel
2018-06-01
We present a collection of results about the clock in Feynman's computer construction and Kitaev's local Hamiltonian problem. First, by analyzing the spectra of quantum walks on a line with varying end-point terms, we find a better lower bound on the gap of the Feynman Hamiltonian, which translates into a less strict promise gap requirement for the quantum-Merlin-Arthur-complete local Hamiltonian problem. We also translate this result into the language of adiabatic quantum computation. Second, introducing an idling clock construction with a large state space but fast Cesaro mixing, we provide a way for achieving an arbitrarily high success probability of computation with Feynman's computer with only a logarithmic increase in the number of clock qubits. Finally, we tune and thus improve the costs (locality and gap scaling) of implementing a (pulse) clock with a single excitation.
GPU-based High-Performance Computing for Radiation Therapy
Jia, Xun; Ziegenhein, Peter; Jiang, Steve B.
2014-01-01
Recent developments in radiotherapy demand high computation powers to solve challenging problems in a timely fashion in a clinical environment. The graphics processing unit (GPU), as an emerging high-performance computing platform, has been introduced to radiotherapy. It is particularly attractive due to its high computational power, small size, and low cost for facility deployment and maintenance. Over the past few years, GPU-based high-performance computing in radiotherapy has experienced rapid development. A tremendous number of studies have been conducted, in which large acceleration factors compared with the conventional CPU platform have been observed. In this article, we will first give a brief introduction to the GPU hardware structure and programming model. We will then review the current applications of GPU in major imaging-related and therapy-related problems encountered in radiotherapy. A comparison of GPU with other platforms will also be presented. PMID:24486639
A CLIPS based personal computer hardware diagnostic system
NASA Technical Reports Server (NTRS)
Whitson, George M.
1991-01-01
Often the person designated to repair personal computers has little or no knowledge of how to repair a computer. Described here is a simple expert system to aid these inexperienced repair people. The first component of the system leads the repair person through a number of simple system checks such as making sure that all cables are tight and that the dip switches are set correctly. The second component of the system assists the repair person in evaluating error codes generated by the computer. The final component of the system applies a large knowledge base to attempt to identify the component of the personal computer that is malfunctioning. We have implemented and tested our design with a full system to diagnose problems for an IBM compatible system based on the 8088 chip. In our tests, the inexperienced repair people found the system very useful in diagnosing hardware problems.
Price schedules coordination for electricity pool markets
NASA Astrophysics Data System (ADS)
Legbedji, Alexis Motto
2002-04-01
We consider the optimal coordination of a class of mathematical programs with equilibrium constraints, which is formally interpreted as a resource-allocation problem. Many decomposition techniques were proposed to circumvent the difficulty of solving large systems with limited computer resources. The considerable improvement in computer architecture has allowed the solution of large-scale problems with increasing speed. Consequently, interest in decomposition techniques has waned. Nonetheless, there is an important class of applications for which decomposition techniques will still be relevant, among others, distributed systems---the Internet, perhaps, being the most conspicuous example---and competitive economic systems. Conceptually, a competitive economic system is a collection of agents that have similar or different objectives while sharing the same system resources. In theory, such systems of agents can be optimized by constructing a large-scale mathematical program and solving it centrally using currently available computing power. In practice, however, because agents are self-interested and not willing to reveal some sensitive corporate data, one cannot solve these kinds of coordination problems by simply maximizing the sum of the agents' objective functions with respect to their constraints. An iterative price decomposition or Lagrangian dual method is considered best suited because it can operate with limited information. A price-directed strategy, however, can only work successfully when coordinating or equilibrium prices exist, which is not generally the case when a weak duality is unavoidable. Showing when such prices exist and how to compute them is the main subject of this thesis. Among our results, we show that, if the Lagrangian function of a primal program is additively separable, price schedules coordination may be attained. The prices are Lagrange multipliers, and are also the decision variables of a dual program. In addition, we propose a new form of augmented or nonlinear pricing, which is an example of the use of penalty functions in mathematical programming. Applications are drawn from mathematical programming problems of the form arising in electric power system scheduling under competition.
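A stylized sketch of the price-directed (Lagrangian dual) coordination the thesis analyzes: each agent maximizes its own concave utility at the posted price, and a coordinator adjusts the price by a subgradient step on the aggregate capacity violation. The quadratic utilities and capacity value are hypothetical illustration data, and strict concavity is assumed so that coordinating prices exist.

```python
# Price-directed coordination via dual subgradient updates (hypothetical illustration).
import numpy as np

def agent_response(price, a, b):
    """Agent demand maximizing a*x - 0.5*b*x^2 - price*x, constrained to x >= 0."""
    return max((a - price) / b, 0.0)

def coordinate(a, b, capacity, step=0.05, iters=500):
    price, demand = 0.0, 0.0
    for _ in range(iters):
        demand = sum(agent_response(price, ai, bi) for ai, bi in zip(a, b))
        price = max(price + step * (demand - capacity), 0.0)   # dual subgradient step
    return price, demand

if __name__ == "__main__":
    a = np.array([10.0, 8.0, 12.0])     # agents' marginal utilities at zero demand
    b = np.array([1.0, 2.0, 1.5])
    price, demand = coordinate(a, b, capacity=10.0)
    print(f"coordinating price ~ {price:.2f}, total demand ~ {demand:.2f}")
```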
The linear regulator problem for parabolic systems
NASA Technical Reports Server (NTRS)
Banks, H. T.; Kunisch, K.
1983-01-01
An approximation framework is presented for computation (in finite dimensional spaces) of Riccati operators that can be guaranteed to converge to the Riccati operator in feedback controls for abstract evolution systems in a Hilbert space. It is shown how these results may be used in the linear optimal regulator problem for a large class of parabolic systems.
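A finite-dimensional illustration of where the approximation framework leads: once the parabolic system is discretized to x' = Ax + Bu, the Riccati operator becomes a matrix P solving the algebraic Riccati equation and the feedback law is u = -R^{-1} B^T P x. The matrices below are arbitrary stand-ins for a discretized heat-equation model, not taken from the paper.

```python
# LQR feedback from a finite-dimensional algebraic Riccati equation (illustrative model).
import numpy as np
from scipy.linalg import solve_continuous_are

n = 20
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)   # 1-D diffusion stencil
B = np.zeros((n, 1)); B[n // 2, 0] = 1.0                  # single interior actuator
Q, R = np.eye(n), np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)                      # Riccati solution
K = np.linalg.solve(R, B.T @ P)                           # feedback gain u = -K x
print("closed-loop spectral abscissa:",
      np.max(np.linalg.eigvals(A - B @ K).real))          # negative => stable
```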
Parallel-vector solution of large-scale structural analysis problems on supercomputers
NASA Technical Reports Server (NTRS)
Storaasli, Olaf O.; Nguyen, Duc T.; Agarwal, Tarun K.
1989-01-01
A direct linear equation solution method based on the Choleski factorization procedure is presented which exploits both parallel and vector features of supercomputers. The new equation solver is described, and its performance is evaluated by solving structural analysis problems on three high-performance computers. The method has been implemented using Force, a generic parallel FORTRAN language.
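The solver described above is a parallel/vector Choleski factorization; as a serial reference point for the same factor-once, solve-many pattern, a sketch with SciPy's LAPACK wrappers (synthetic stiffness matrix and load cases) looks like this.

```python
# Choleski factor-and-solve pattern for a symmetric positive definite system (illustrative).
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
M = rng.standard_normal((2000, 2000))
K = M @ M.T + 2000 * np.eye(2000)            # synthetic SPD "stiffness" matrix
loads = rng.standard_normal((2000, 5))       # several right-hand-side load cases

c, low = cho_factor(K)                       # factor once (the expensive step)
displacements = cho_solve((c, low), loads)   # cheap triangular solves per load case
print("residual norm:", np.linalg.norm(K @ displacements - loads))
```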
Multilevel decomposition of complete vehicle configuration in a parallel computing environment
NASA Technical Reports Server (NTRS)
Bhatt, Vinay; Ragsdell, K. M.
1989-01-01
This research summarizes various approaches to multilevel decomposition to solve large structural problems. A linear decomposition scheme based on the Sobieski algorithm is selected as a vehicle for automated synthesis of a complete vehicle configuration in a parallel processing environment. The research is in a developmental state. Preliminary numerical results are presented for several example problems.
An efficient dynamic load balancing algorithm
NASA Astrophysics Data System (ADS)
Lagaros, Nikos D.
2014-01-01
In engineering problems, randomness and uncertainties are inherent. Robust design procedures, formulated in the framework of multi-objective optimization, have been proposed in order to take into account sources of randomness and uncertainty. These design procedures require orders of magnitude more computational effort than conventional analysis or optimum design processes, since a very large number of finite element analyses must be performed. It is therefore imperative to exploit the capabilities of computing resources in order to deal with this kind of problem. In particular, parallel computing can be implemented at the level of metaheuristic optimization, by exploiting the physical parallelization feature of the nondominated sorting evolution strategies method, as well as at the level of repeated structural analyses required for assessing the behavioural constraints and for calculating the objective functions. In this study an efficient dynamic load balancing algorithm for optimum exploitation of available computing resources is proposed and, without loss of generality, is applied for computing the desired Pareto front. In such problems the computation of the complete Pareto front with feasible designs only constitutes a very challenging task. The proposed algorithm achieves linear speedup factors, with almost 100% efficiency relative to the sequential procedure.
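A minimal dynamic load-balancing sketch in the spirit of the abstract, not the proposed algorithm itself: a pool of workers pulls analysis tasks of unequal cost from a shared queue, so faster workers automatically absorb more work. The `fake_analysis` function is a hypothetical stand-in for a finite element run.

```python
# Pull-based dynamic load balancing with a process pool (illustrative stand-in tasks).
import multiprocessing as mp
import random
import time

def fake_analysis(design_id):
    time.sleep(random.uniform(0.01, 0.2))   # unequal, unpredictable task cost
    return design_id, random.random()       # e.g. an objective value for this design

if __name__ == "__main__":
    designs = range(64)
    with mp.Pool(processes=4) as pool:
        # imap_unordered hands out tasks one at a time as workers become free,
        # which is the dynamic (pull-based) balancing strategy.
        results = dict(pool.imap_unordered(fake_analysis, designs, chunksize=1))
    print(len(results), "designs evaluated")
```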
Plagianakos, V P; Magoulas, G D; Vrahatis, M N
2006-03-01
Distributed computing is a process through which a set of computers connected by a network is used collectively to solve a single problem. In this paper, we propose a distributed computing methodology for training neural networks for the detection of lesions in colonoscopy. Our approach is based on partitioning the training set across multiple processors using a parallel virtual machine. In this way, interconnected computers of varied architectures can be used for the distributed evaluation of the error function and gradient values, and thus for training neural networks utilizing various learning methods. The proposed methodology has large granularity and low synchronization, and has been implemented and tested. Our results indicate that the parallel virtual machine implementation of the training algorithms developed leads to considerable speedup, especially when large network architectures and training sets are used.
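A sketch of the data-partitioning idea in the abstract: each "processor" computes the gradient of the error on its own slice of the training set, and the partial gradients are combined before a single weight update. A one-layer logistic model stands in for the neural network, and everything runs in one process here, whereas the paper distributes the partial evaluations over PVM.

```python
# Data-parallel gradient evaluation with a central update (illustrative logistic model).
import numpy as np

def partial_gradient(w, X_part, y_part):
    p = 1.0 / (1.0 + np.exp(-X_part @ w))          # sigmoid predictions on this slice
    return X_part.T @ (p - y_part) / len(y_part)   # cross-entropy gradient for the slice

rng = np.random.default_rng(0)
X = rng.standard_normal((4000, 20))
y = (X @ rng.standard_normal(20) > 0).astype(float)
w = np.zeros(20)

parts = np.array_split(np.arange(len(y)), 4)       # partition the training set 4 ways
for epoch in range(100):
    grads = [partial_gradient(w, X[idx], y[idx]) for idx in parts]  # distributable step
    w -= 0.5 * np.mean(grads, axis=0)              # combine and update centrally
print("training accuracy:", np.mean(((X @ w) > 0).astype(float) == y))
```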
Parallelization of the preconditioned IDR solver for modern multicore computer systems
NASA Astrophysics Data System (ADS)
Bessonov, O. A.; Fedoseyev, A. I.
2012-10-01
This paper presents the analysis, parallelization and optimization approach for the large sparse matrix solver CNSPACK on modern multicore microprocessors. CNSPACK is an advanced solver successfully used for the coupled solution of stiff problems arising in multiphysics applications such as CFD, semiconductor transport, kinetic and quantum problems. It employs an iterative IDR algorithm with ILU preconditioning (user-chosen ILU preconditioning order). CNSPACK has been successfully used during the last decade for solving problems in several application areas, including fluid dynamics and semiconductor device simulation. However, there has been a dramatic change in processor architectures and computer system organization in recent years. Due to this, performance criteria and methods have been revisited, together with the parallelization of the solver and preconditioner using the OpenMP environment. Results of the successful implementation for efficient parallelization are presented for the most advanced computer systems (Intel Core i7-9xx or two-processor Xeon 55xx/56xx).
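SciPy does not expose an IDR implementation, so the sketch below illustrates the same ingredient the abstract highlights, an ILU preconditioner wrapped around a Krylov iteration, using BiCGStab in place of IDR on a synthetic sparse system.

```python
# ILU-preconditioned Krylov solve (BiCGStab standing in for IDR; synthetic system).
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 100
T = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.eye(n), T) + sp.kron(T, sp.eye(n))).tocsc()  # 2-D Laplacian-like matrix
b = np.ones(A.shape[0])

ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)           # incomplete LU factors
M = spla.LinearOperator(A.shape, matvec=ilu.solve)           # preconditioner as operator

x, info = spla.bicgstab(A, b, M=M)
print("converged:", info == 0, " residual:", np.linalg.norm(A @ x - b))
```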
Computational biology in the cloud: methods and new insights from computing at scale.
Kasson, Peter M
2013-01-01
The past few years have seen both explosions in the size of biological data sets and the proliferation of new, highly flexible on-demand computing capabilities. The sheer amount of information available from genomic and metagenomic sequencing, high-throughput proteomics, experimental and simulation datasets on molecular structure and dynamics affords an opportunity for greatly expanded insight, but it creates new challenges of scale for computation, storage, and interpretation of petascale data. Cloud computing resources have the potential to help solve these problems by offering a utility model of computing and storage: near-unlimited capacity, the ability to burst usage, and cheap and flexible payment models. Effective use of cloud computing on large biological datasets requires dealing with non-trivial problems of scale and robustness, since performance-limiting factors can change substantially when a dataset grows by a factor of 10,000 or more. New computing paradigms are thus often needed. The use of cloud platforms also creates new opportunities to share data, reduce duplication, and to provide easy reproducibility by making the datasets and computational methods easily available.
Visser, Marco D.; McMahon, Sean M.; Merow, Cory; Dixon, Philip M.; Record, Sydne; Jongejans, Eelke
2015-01-01
Computation has become a critical component of research in biology. A risk has emerged that computational and programming challenges may limit research scope, depth, and quality. We review various solutions to common computational efficiency problems in ecological and evolutionary research. Our review pulls together material that is currently scattered across many sources and emphasizes those techniques that are especially effective for typical ecological and environmental problems. We demonstrate how straightforward it can be to write efficient code and implement techniques such as profiling or parallel computing. We supply a newly developed R package (aprof) that helps to identify computational bottlenecks in R code and determine whether optimization can be effective. Our review is complemented by a practical set of examples and detailed Supporting Information material (S1–S3 Texts) that demonstrate large improvements in computational speed (ranging from 10.5 times to 14,000 times faster). By improving computational efficiency, biologists can feasibly solve more complex tasks, ask more ambitious questions, and include more sophisticated analyses in their research. PMID:25811842
NASA Technical Reports Server (NTRS)
Navon, I. M.
1984-01-01
A Lagrange multiplier method using techniques developed by Bertsekas (1982) was applied to solving the problem of enforcing simultaneous conservation of the nonlinear integral invariants of the shallow water equations on a limited area domain. This application of nonlinear constrained optimization is of the large dimensional type and the conjugate gradient method was found to be the only computationally viable method for the unconstrained minimization. Several conjugate-gradient codes were tested and compared for increasing accuracy requirements. Robustness and computational efficiency were the principal criteria.
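For readers unfamiliar with the multiplier approach, here is a minimal sketch of an augmented Lagrangian loop whose unconstrained subproblem is solved with a conjugate-gradient method, in the spirit of (but not reproducing) the scheme above; the objective and the single equality constraint are toy stand-ins.

import numpy as np
from scipy.optimize import minimize

def f(x):                       # objective: a simple quadratic
    return float(x @ x)

def c(x):                       # single equality constraint c(x) = 0
    return np.array([x.sum() - 1.0])

def augmented_lagrangian(x0, mu=10.0, n_outer=20):
    x, lam = np.asarray(x0, float), np.zeros(1)
    for _ in range(n_outer):
        def L(z):
            cz = c(z)
            return f(z) + float(lam @ cz) + 0.5 * mu * float(cz @ cz)
        x = minimize(L, x, method="CG").x      # unconstrained inner solve by CG
        lam = lam + mu * c(x)                  # first-order multiplier update
    return x, lam

if __name__ == "__main__":
    x, lam = augmented_lagrangian(np.zeros(4))
    print(x, c(x))              # x should approach (0.25, 0.25, 0.25, 0.25)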
Using Computing and Data Grids for Large-Scale Science and Engineering
NASA Technical Reports Server (NTRS)
Johnston, William E.
2001-01-01
We use the term "Grid" to refer to a software system that provides uniform and location independent access to geographically and organizationally dispersed, heterogeneous resources that are persistent and supported. These emerging data and computing Grids promise to provide a highly capable and scalable environment for addressing large-scale science problems. We describe the requirements for science Grids, the resulting services and architecture of NASA's Information Power Grid (IPG) and DOE's Science Grid, and some of the scaling issues that have come up in their implementation.
User's guide for a large signal computer model of the helical traveling wave tube
NASA Technical Reports Server (NTRS)
Palmer, Raymond W.
1992-01-01
The use of a successful large-signal, two-dimensional (axisymmetric), deformable-disk computer model of the helical traveling wave tube amplifier is described; the model is an extensively revised and operationally simplified version. We also discuss program input and output and the auxiliary files necessary for operation. Included is a sample problem with its input data and output results. Interested parties may now obtain from the author the FORTRAN source code, auxiliary files, and sample input data on a standard floppy diskette, the contents of which are described herein.
On the impact of approximate computation in an analog DeSTIN architecture.
Young, Steven; Lu, Junjie; Holleman, Jeremy; Arel, Itamar
2014-05-01
Deep machine learning (DML) holds the potential to revolutionize machine learning by automating rich feature extraction, which has become the primary bottleneck of human engineering in pattern recognition systems. However, the heavy computational burden renders DML systems implemented on conventional digital processors impractical for large-scale problems. The highly parallel computations required to implement large-scale deep learning systems are well suited to custom hardware. Analog computation has demonstrated power efficiency advantages of multiple orders of magnitude relative to digital systems while performing nonideal computations. In this paper, we investigate typical error sources introduced by analog computational elements and their impact on system-level performance in DeSTIN--a compositional deep learning architecture. These inaccuracies are evaluated on a pattern classification benchmark, clearly demonstrating the robustness of the underlying algorithm to the errors introduced by analog computational elements. A clear understanding of the impacts of nonideal computations is necessary to fully exploit the efficiency of analog circuits.
Suplatov, Dmitry; Popova, Nina; Zhumatiy, Sergey; Voevodin, Vladimir; Švedas, Vytas
2016-04-01
Rapid expansion of online resources providing access to genomic, structural, and functional information associated with biological macromolecules opens an opportunity to gain a deeper understanding of the mechanisms of biological processes through the systematic analysis of large datasets. This, however, requires novel strategies to optimally utilize computer processing power. Some methods in bioinformatics and molecular modeling require extensive computational resources. Other algorithms have fast implementations which take at most several hours to analyze a common input on a modern desktop station; however, due to multiple invocations for a large number of subtasks, the full task requires significant computing power. Therefore, an efficient computational solution to large-scale biological problems requires both a wise parallel implementation of resource-hungry methods and a smart workflow to manage multiple invocations of relatively fast algorithms. In this work, a new computer software package, mpiWrapper, has been developed to accommodate non-parallel implementations of scientific algorithms within the parallel supercomputing environment. The Message Passing Interface has been implemented to exchange information between nodes. Two specialized threads - one for task management and communication, and another for subtask execution - are invoked on each processing unit to avoid deadlock while using blocking calls to MPI. The mpiWrapper can be used to launch all conventional Linux applications without the need to modify their original source codes and supports resubmission of subtasks on node failure. We show that this approach can be used to process huge amounts of biological data efficiently by running non-parallel programs in parallel mode on a supercomputer. The C++ source code and documentation are available from http://biokinet.belozersky.msu.ru/mpiWrapper .
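The sketch below is a hypothetical, much-simplified analogue of the scheme described above (it is not the mpiWrapper source): an MPI master rank hands shell commands to worker ranks, which run them as ordinary non-parallel subprocesses and report back for more work. Run with, e.g., mpirun -n 8 python wrapper_sketch.py.

import subprocess
from mpi4py import MPI

TAG_TASK, TAG_STOP = 1, 2

def master(comm, commands):
    status = MPI.Status()
    results = {}
    queue = list(enumerate(commands))
    n_workers = comm.Get_size() - 1
    stopped = 0
    # Workers request work by sending their previous result (or None at start).
    while stopped < n_workers:
        msg = comm.recv(source=MPI.ANY_SOURCE, tag=MPI.ANY_TAG, status=status)
        if msg is not None:
            task_id, returncode = msg
            results[task_id] = returncode
        if queue:
            comm.send(queue.pop(0), dest=status.Get_source(), tag=TAG_TASK)
        else:
            comm.send(None, dest=status.Get_source(), tag=TAG_STOP)
            stopped += 1
    return results

def worker(comm):
    comm.send(None, dest=0)                            # ask for the first task
    status = MPI.Status()
    while True:
        task = comm.recv(source=0, tag=MPI.ANY_TAG, status=status)
        if status.Get_tag() == TAG_STOP:
            break
        task_id, cmd = task
        returncode = subprocess.call(cmd, shell=True)  # run the non-parallel program
        comm.send((task_id, returncode), dest=0)       # report back and ask for more

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    if comm.Get_rank() == 0:
        commands = [f"echo subtask {i}" for i in range(10)]   # placeholder commands
        print(master(comm, commands))
    else:
        worker(comm)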
NASA Astrophysics Data System (ADS)
Stork, David G.; Furuichi, Yasuo
2011-03-01
David Hockney has argued that the right hand of the disciple, thrust to the rear in Caravaggio's Supper at Emmaus (1606), is anomalously large as a result of the artist refocusing a putative secret lens-based optical projector and tracing the image it projected onto his canvas. We show through rigorous optical analysis that to achieve such an anomalously large hand image, Caravaggio would have needed to make extremely large, conspicuous and implausible alterations to his studio setup, moving both his purported lens and his canvas nearly two meters between "exposing" the disciple's left hand and then his right hand. Such major disruptions to his studio would have impeded -not aided- Caravaggio in his work. Our optical analysis quantifies these problems and our computer graphics reconstruction of Caravaggio's studio illustrates these problems. In this way we conclude that Caravaggio did not use optical projections in the way claimed by Hockney, but instead most likely set the sizes of these hands "by eye" for artistic reasons.
MacDoctor: The Macintosh diagnoser
NASA Technical Reports Server (NTRS)
Lavery, David B.; Brooks, William D.
1990-01-01
When the Macintosh computer was first released, the primary user was a computer hobbyist who typically had a significant technical background and was highly motivated to understand the internal structure and operational intricacies of the computer. In recent years the Macintosh computer has become a widely-accepted general purpose computer which is being used by an ever-increasing non-technical audience. This has led to a large base of users which has neither the interest nor the background to understand what is happening 'behind the scenes' when the Macintosh is put to use - or what should be happening when something goes wrong. Additionally, the Macintosh itself has evolved from a simple closed design to a complete family of processor platforms and peripherals with a tremendous number of possible configurations. With the increasing popularity of the Macintosh series, software and hardware developers are producing a product for every user's need. As the complexity of configuration possibilities grows, experienced or even expert knowledge is required to diagnose problems. This presents a problem to uneducated or casual users and indicates a new Macintosh consumer need: a diagnostic tool able to determine the problem for the user. As the volume of Macintosh products has increased, this need has also increased.
Laghari, Samreen; Niazi, Muaz A
2016-01-01
Computer networks have a tendency to grow at an unprecedented scale. Modern networks involve not only computers but also a wide variety of other interconnected devices ranging from mobile phones to household items fitted with sensors. This vision of the "Internet of Things" (IoT) implies an inherent difficulty in modeling problems: it is practically impossible to implement and test all scenarios for large-scale and complex adaptive communication networks as part of Complex Adaptive Communication Networks and Environments (CACOONS). The goal of this study is to explore the use of Agent-based Modeling as part of the Cognitive Agent-based Computing (CABC) framework to model a complex communication network problem. We use Exploratory Agent-based Modeling (EABM), as part of the CABC framework, to develop an autonomous multi-agent architecture for managing carbon footprint in a corporate network. To evaluate the application of complexity in practical scenarios, we have also introduced a company-defined computer usage policy. The conducted experiments demonstrated two important results: first, a CABC-based modeling approach such as Agent-based Modeling can be effective for modeling complex problems in the domain of IoT; second, the specific problem of managing the carbon footprint can be solved using a multi-agent system approach.
Analysis on the security of cloud computing
NASA Astrophysics Data System (ADS)
He, Zhonglin; He, Yuhua
2011-02-01
Cloud computing is a new technology arising from the fusion of computer technology and Internet development, and it will lead a revolution in the IT and information fields. However, in cloud computing, data and application software are stored at large data centers, and the management of data and services is not completely trustworthy. This gives rise to security problems, which are the main obstacle to improving the quality of cloud services. This paper briefly introduces the concept of cloud computing. Considering the characteristics of cloud computing, it constructs a security architecture for cloud computing. At the same time, with an eye toward the security threats cloud computing faces, several corresponding strategies are provided from the perspectives of cloud computing users and service providers.
3D Data Acquisition Platform for Human Activity Understanding
2016-03-02
address fundamental research problems of representation and invariant description of 3D data, human motion modeling and applications of human activity analysis, and computational optimization of large-scale 3D data.
cOSPREY: A Cloud-Based Distributed Algorithm for Large-Scale Computational Protein Design
Pan, Yuchao; Dong, Yuxi; Zhou, Jingtian; Hallen, Mark; Donald, Bruce R.; Xu, Wei
2016-01-01
Finding the global minimum energy conformation (GMEC) of a huge combinatorial search space is the key challenge in computational protein design (CPD) problems. Traditional algorithms lack a scalable and efficient distributed design scheme, preventing researchers from taking full advantage of current cloud infrastructures. We design cloud OSPREY (cOSPREY), an extension to a widely used protein design software OSPREY, to allow the original design framework to scale to the commercial cloud infrastructures. We propose several novel designs to integrate both algorithm and system optimizations, such as GMEC-specific pruning, state search partitioning, asynchronous algorithm state sharing, and fault tolerance. We evaluate cOSPREY on three different cloud platforms using different technologies and show that it can solve a number of large-scale protein design problems that have not been possible with previous approaches. PMID:27154509
Unsteady flow simulations around complex geometries using stationary or rotating unstructured grids
NASA Astrophysics Data System (ADS)
Sezer-Uzol, Nilay
In this research, the computational analysis of three-dimensional, unsteady, separated, vortical flows around complex geometries is studied by using stationary or moving unstructured grids. Two main engineering problems are investigated. The first problem is the unsteady simulation of a ship airwake, where helicopter operations become even more challenging, by using stationary unstructured grids. The second problem is the unsteady simulation of wind turbine rotor flow fields by using moving unstructured grids which are rotating with the whole three-dimensional rigid rotor geometry. The three dimensional, unsteady, parallel, unstructured, finite volume flow solver, PUMA2, is used for the computational fluid dynamics (CFD) simulations considered in this research. The code is modified to have a moving grid capability to perform three-dimensional, time-dependent rotor simulations. An instantaneous log-law wall model for Large Eddy Simulations is also implemented in PUMA2 to investigate the very large Reynolds number flow fields of rotating blades. To verify the code modifications, several sample test cases are also considered. In addition, interdisciplinary studies, which are aiming to provide new tools and insights to the aerospace and wind energy scientific communities, are done during this research by focusing on the coupling of ship airwake CFD simulations with the helicopter flight dynamics and control analysis, the coupling of wind turbine rotor CFD simulations with the aeroacoustic analysis, and the analysis of these time-dependent and large-scale CFD simulations with the help of a computational monitoring, steering and visualization tool, POSSE.
Advanced Computational Methods for Security Constrained Financial Transmission Rights
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kalsi, Karanjit; Elbert, Stephen T.; Vlachopoulou, Maria
Financial Transmission Rights (FTRs) are financial insurance tools to help power market participants reduce price risks associated with transmission congestion. FTRs are issued based on a process of solving a constrained optimization problem with the objective to maximize the FTR social welfare under power flow security constraints. Security constraints for different FTR categories (monthly, seasonal or annual) are usually coupled, and the number of constraints increases exponentially with the number of categories. Commercial software for FTR calculation can only provide limited categories of FTRs due to the inherent computational challenges mentioned above. In this paper, first an innovative mathematical reformulation of the FTR problem is presented which dramatically improves the computational efficiency of the optimization problem. After having re-formulated the problem, a novel non-linear dynamic system (NDS) approach is proposed to solve the optimization problem. The new formulation and performance of the NDS solver are benchmarked against widely used linear programming (LP) solvers like CPLEX™ and tested on both standard IEEE test systems and large-scale systems using data from the Western Electricity Coordinating Council (WECC). The performance of the NDS is demonstrated to be comparable to, and in some cases to outperform, the widely used CPLEX algorithms. The proposed formulation and NDS based solver are also easily parallelizable, enabling further computational improvement.
High-performance biocomputing for simulating the spread of contagion over large contact networks
2012-01-01
Background Many important biological problems can be modeled as contagion diffusion processes over interaction networks. This article shows how the EpiSimdemics interaction-based simulation system can be applied to the general contagion diffusion problem. Two specific problems, computational epidemiology and human immune system modeling, are given as examples. We then show how the graphics processing unit (GPU) within each compute node of a cluster can effectively be used to speed-up the execution of these types of problems. Results We show that a single GPU can accelerate the EpiSimdemics computation kernel by a factor of 6 and the entire application by a factor of 3.3, compared to the execution time on a single core. When 8 CPU cores and 2 GPU devices are utilized, the speed-up of the computational kernel increases to 9.5. When combined with effective techniques for inter-node communication, excellent scalability can be achieved without significant loss of accuracy in the results. Conclusions We show that interaction-based simulation systems can be used to model disparate and highly relevant problems in biology. We also show that offloading some of the work to GPUs in distributed interaction-based simulations can be an effective way to achieve increased intra-node efficiency. PMID:22537298
A Computer-Assisted Personalized Approach in an Undergraduate Plant Physiology Class
Artus, Nancy N.; Nadler, Kenneth D.
1999-01-01
We used Computer-Assisted Personalized Approach (CAPA), a networked teaching and learning tool that generates computer individualized homework problem sets, in our large-enrollment introductory plant physiology course. We saw significant improvement in student examination performance with regular homework assignments, with CAPA being an effective and efficient substitute for hand-graded homework. Using CAPA, each student received a printed set of similar but individualized problems of a conceptual (qualitative) and/or quantitative nature with quality graphics. Because each set of problems is unique, students were encouraged to work together to clarify concepts but were required to do their own work for credit. Students could enter answers multiple times without penalty, and they were able to obtain immediate feedback and hints until the due date. These features increased student time on task, allowing higher course standards and student achievement in a diverse student population. CAPA handles routine tasks such as grading, recording, summarizing, and posting grades. In anonymous surveys, students indicated an overwhelming preference for homework in CAPA format, citing several features such as immediate feedback, multiple tries, and on-line accessibility as reasons for their preference. We wrote and used more than 170 problems on 17 topics in introductory plant physiology, cataloging them in a computer library for general access. Representative problems are compared and discussed. PMID:10198076
Parallel computing for probabilistic fatigue analysis
NASA Technical Reports Server (NTRS)
Sues, Robert H.; Lua, Yuan J.; Smith, Mark D.
1993-01-01
This paper presents the results of Phase I research to investigate the most effective parallel processing software strategies and hardware configurations for probabilistic structural analysis. We investigate the efficiency of both shared and distributed-memory architectures via a probabilistic fatigue life analysis problem. We also present a parallel programming approach, the virtual shared-memory paradigm, that is applicable across both types of hardware. Using this approach, problems can be solved on a variety of parallel configurations, including networks of single or multiprocessor workstations. We conclude that it is possible to effectively parallelize probabilistic fatigue analysis codes; however, special strategies will be needed to achieve large-scale parallelism, to keep a large number of processors busy, and to treat problems with the large memory requirements encountered in practice. We also conclude that distributed-memory architecture is preferable to shared-memory for achieving large-scale parallelism; however, in the future, the currently emerging hybrid-memory architectures will likely be optimal.
Sierra/Solid Mechanics 4.48 User's Guide.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Merewether, Mark Thomas; Crane, Nathan K; de Frias, Gabriel Jose
Sierra/SolidMechanics (Sierra/SM) is a Lagrangian, three-dimensional code for finite element analysis of solids and structures. It provides capabilities for explicit dynamic, implicit quasistatic and dynamic analyses. The explicit dynamics capabilities allow for the efficient and robust solution of models with extensive contact subjected to large, suddenly applied loads. For implicit problems, Sierra/SM uses a multi-level iterative solver, which enables it to effectively solve problems with large deformations, nonlinear material behavior, and contact. Sierra/SM has a versatile library of continuum and structural elements, and a large library of material models. The code is written for parallel computing environments, enabling scalable solutions of extremely large problems for both implicit and explicit analyses. It is built on the SIERRA Framework, which facilitates coupling with other SIERRA mechanics codes. This document describes the functionality and input syntax for Sierra/SM.
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment.
Meng, Bowen; Pratx, Guillem; Xing, Lei
2011-12-01
Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. In this work, we accelerated the Feldkamp-Davis-Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scan were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function to aggregate those partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Speedup of reconstruction time was found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10(-7). Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment.
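A minimal sketch of the Map/Reduce structure described above, with the filtering and backprojection replaced by a trivial placeholder (this is not an FDK implementation and not the authors' Hadoop code):

from functools import reduce
import numpy as np

VOLUME_SHAPE = (64, 64, 64)

def map_backproject(projection_subset):
    partial = np.zeros(VOLUME_SHAPE, dtype=np.float32)
    for proj in projection_subset:
        # Placeholder for "filter + backproject one projection" into the volume.
        partial += proj.mean()
    return partial

def reduce_volumes(vol_a, vol_b):
    return vol_a + vol_b                                 # aggregation is a plain sum

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    projections = [rng.random((64, 64)) for _ in range(40)]
    subsets = [projections[i::4] for i in range(4)]      # 4 Map tasks
    partials = map(map_backproject, subsets)             # "Map" phase
    volume = reduce(reduce_volumes, partials)            # "Reduce" phase
    print(volume.shape, float(volume.mean()))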
Ultrafast and scalable cone-beam CT reconstruction using MapReduce in a cloud computing environment
Meng, Bowen; Pratx, Guillem; Xing, Lei
2011-01-01
Purpose: Four-dimensional CT (4DCT) and cone beam CT (CBCT) are widely used in radiation therapy for accurate tumor target definition and localization. However, high-resolution and dynamic image reconstruction is computationally demanding because of the large amount of data processed. Efficient use of these imaging techniques in the clinic requires high-performance computing. The purpose of this work is to develop a novel ultrafast, scalable and reliable image reconstruction technique for 4D CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large-scale medical physics problems in a cloud computing environment. Methods: In this work, we accelerated the Feldkamp–Davis–Kress (FDK) algorithm by porting it to Hadoop, an open-source MapReduce implementation. Gated phases from a 4DCT scan were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and backproject subsets of projections, and a Reduce function to aggregate those partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large cluster of computer nodes. As a validation, reconstruction of a digital phantom and an acquired CatPhan 600 phantom was performed on a commercial cloud computing environment using the proposed 4D CBCT/CT reconstruction algorithm. Results: Speedup of reconstruction time was found to be roughly linear with the number of nodes employed. For instance, greater than 10 times speedup was achieved using 200 nodes for all cases, compared to the same code executed on a single machine. Without modifying the code, faster reconstruction is readily achievable by allocating more nodes in the cloud computing environment. Root mean square error between the images obtained using MapReduce and a single-threaded reference implementation was on the order of 10−7. Our study also proved that cloud computing with MapReduce is fault tolerant: the reconstruction completed successfully with identical results even when half of the nodes were manually terminated in the middle of the process. Conclusions: An ultrafast, reliable and scalable 4D CBCT/CT reconstruction method was developed using the MapReduce framework. Unlike other parallel computing approaches, the parallelization and speedup required little modification of the original reconstruction code. MapReduce provides an efficient and fault tolerant means of solving large-scale computing problems in a cloud computing environment. PMID:22149842
DOE Office of Scientific and Technical Information (OSTI.GOV)
Spencer, Benjamin Whiting; Crane, Nathan K.; Heinstein, Martin W.
2011-03-01
Adagio is a Lagrangian, three-dimensional, implicit code for the analysis of solids and structures. It uses a multi-level iterative solver, which enables it to solve problems with large deformations, nonlinear material behavior, and contact. It also has a versatile library of continuum and structural elements, and an extensive library of material models. Adagio is written for parallel computing environments, and its solvers allow for scalable solutions of very large problems. Adagio uses the SIERRA Framework, which allows for coupling with other SIERRA mechanics codes. This document describes the functionality and input structure for Adagio.
Applications of multiple-constraint matrix updates to the optimal control of large structures
NASA Technical Reports Server (NTRS)
Smith, S. W.; Walcott, B. L.
1992-01-01
Low-authority control or vibration suppression in large, flexible space structures can be formulated as a linear feedback control problem requiring computation of displacement and velocity feedback gain matrices. To ensure stability in the uncontrolled modes, these gain matrices must be symmetric and positive definite. In this paper, efficient computation of symmetric, positive-definite feedback gain matrices is accomplished through the use of multiple-constraint matrix update techniques originally developed for structural identification applications. Two systems were used to illustrate the application: a simple spring-mass system and a planar truss. From these demonstrations, use of this multiple-constraint technique is seen to provide a straightforward approach for computing the low-authority gains.
Gradient gravitational search: An efficient metaheuristic algorithm for global optimization.
Dash, Tirtharaj; Sahu, Prabhat K
2015-05-30
The adaptation of novel techniques developed in the field of computational chemistry to solve the problems concerned with large and flexible molecules is taking center stage with regard to algorithmic efficiency, computational cost and accuracy. In this article, the gradient-based gravitational search (GGS) algorithm, which uses analytical gradients for fast minimization to the next local minimum, is reported. Its efficiency as a metaheuristic approach has also been compared with Gradient Tabu Search and others, such as the Gravitational Search, Cuckoo Search, and Back Tracking Search algorithms, for global optimization. Moreover, the GGS approach has also been applied to computational chemistry problems for finding the minimal potential energy of two-dimensional and three-dimensional off-lattice protein models. The simulation results reveal the relative stability and physical accuracy of the protein models with efficient computational cost. © 2015 Wiley Periodicals, Inc.
Improved Quasi-Newton method via PSB update for solving systems of nonlinear equations
NASA Astrophysics Data System (ADS)
Mamat, Mustafa; Dauda, M. K.; Waziri, M. Y.; Ahmad, Fadhilah; Mohamad, Fatma Susilawati
2016-10-01
The Newton method has some shortcomings, which include the computation of the Jacobian matrix (which may be difficult or even impossible to obtain) and the solution of the Newton system in every iteration. Also, the common setback with some quasi-Newton methods is that they need to compute and store an n × n matrix at each iteration, which is computationally costly for large-scale problems. To overcome such drawbacks, an improved method for solving systems of nonlinear equations via the PSB (Powell-Symmetric-Broyden) update is proposed. In the proposed method, the approximate Jacobian inverse Hk is updated via the PSB formula; its efficiency is improved and it requires low memory storage, which is the main aim of this paper. The preliminary numerical results show that the proposed method is practically efficient when applied to some benchmark problems.
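For reference, a minimal sketch of one Powell-Symmetric-Broyden update of a Jacobian approximation B (the paper works with an approximate Jacobian inverse; the direct form is shown here for brevity):

import numpy as np

def psb_update(B, s, y):
    """PSB update of B given step s and residual difference y = F(x + s) - F(x)."""
    r = y - B @ s                               # residual of the secant condition
    ss = float(s @ s)
    return (B + (np.outer(r, s) + np.outer(s, r)) / ss
              - (float(r @ s) / ss**2) * np.outer(s, s))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    B = np.eye(3)                               # symmetric start; symmetry is preserved
    s, y = rng.normal(size=3), rng.normal(size=3)
    B1 = psb_update(B, s, y)
    print(np.allclose(B1 @ s, y))               # secant condition B1 s = y holds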
NASA Technical Reports Server (NTRS)
Newman, P. A.; Hou, G. J.-W.; Jones, H. E.; Taylor, A. C., III; Korivi, V. M.
1992-01-01
How a combination of various computational methodologies could reduce the enormous computational costs envisioned in using advanced CFD codes in gradient-based optimized multidisciplinary design (MdD) procedures is briefly outlined. Implications of these MdD requirements upon advanced CFD codes are somewhat different from those imposed by a single-discipline design. A means for satisfying these MdD requirements for gradient information is presented which appears to permit: (1) some leeway in the CFD solution algorithms which can be used; (2) an extension to 3-D problems; and (3) straightforward use of other computational methodologies. Many of these observations have previously been discussed as possibilities for doing parts of the problem more efficiently; the contribution here is observing how they fit together in a mutually beneficial way.
Finite-difference computations of rotor loads
NASA Technical Reports Server (NTRS)
Caradonna, F. X.; Tung, C.
1985-01-01
This paper demonstrates the current and future potential of finite-difference methods for solving real rotor problems which now rely largely on empiricism. The demonstration consists of a simple means of combining existing finite-difference, integral, and comprehensive loads codes to predict real transonic rotor flows. These computations are performed for hover and high-advance-ratio flight. Comparisons are made with experimental pressure data.
Finite-difference computations of rotor loads
NASA Technical Reports Server (NTRS)
Caradonna, F. X.; Tung, C.
1985-01-01
The current and future potential of finite-difference methods for solving real rotor problems, which now rely largely on empiricism, is demonstrated. The demonstration consists of a simple means of combining existing finite-difference, integral, and comprehensive loads codes to predict real transonic rotor flows. These computations are performed for hover and high-advance-ratio flight. Comparisons are made with experimental pressure data.
On some Aitken-like acceleration of the Schwarz method
NASA Astrophysics Data System (ADS)
Garbey, M.; Tromeur-Dervout, D.
2002-12-01
In this paper we present a family of domain decomposition methods based on Aitken-like acceleration of the Schwarz method, seen as an iterative procedure with a linear rate of convergence. We first present the so-called Aitken-Schwarz procedure for linear differential operators. The solver can be a direct solver when applied to the Helmholtz problem with a five-point finite difference scheme on regular grids. We then introduce the Steffensen-Schwarz variant, which is an iterative domain decomposition solver that can be applied to linear and nonlinear problems. We show that these solvers have reasonable numerical efficiency compared to classical fast solvers for the Poisson problem or to multigrid for more general linear and nonlinear elliptic problems. However, the salient feature of our method is that our algorithm has high tolerance to slow networks in the context of distributed parallel computing and is attractive, generally speaking, for use with computer architectures whose performance is limited by memory bandwidth rather than by the floating-point performance of the CPU. This is nowadays the case for most parallel computers using the RISC processor architecture. We will illustrate this highly desirable property of our algorithm with large-scale computing experiments.
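As a hedged illustration of the underlying idea, the sketch below applies Aitken's Delta-squared extrapolation to a scalar fixed-point iteration with a linear rate of convergence; the actual Aitken-Schwarz procedure operates on interface traces of the Schwarz iterates, which is not reproduced here.

def aitken_accelerate(g, x0, n_cycles=5):
    x = x0
    for _ in range(n_cycles):
        x1 = g(x)
        x2 = g(x1)
        denom = x2 - 2.0 * x1 + x
        if abs(denom) < 1e-15:
            return x2                          # iteration has already converged
        x = x - (x1 - x) ** 2 / denom          # Aitken Delta-squared step
    return x

if __name__ == "__main__":
    g = lambda x: 0.5 * x + 1.0                # fixed point at x = 2, linear rate 0.5
    print(aitken_accelerate(g, 0.0))           # reaches 2.0 in very few cycles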
NASA Technical Reports Server (NTRS)
Banks, H. T.; Ito, K.
1991-01-01
A hybrid method for computing the feedback gains in the linear quadratic regulator problem is proposed. The method, which combines the use of a Chandrasekhar type system with an iteration of the Newton-Kleinman form with variable acceleration parameter Smith schemes, is formulated to efficiently compute the feedback gains directly rather than solutions of an associated Riccati equation. The hybrid method is particularly appropriate when used with large dimensional systems such as those arising in approximating infinite-dimensional (distributed parameter) control systems (e.g., those governed by delay-differential and partial differential equations). Computational advantages of the proposed algorithm over the standard eigenvector (Potter, Laub-Schur) based techniques are discussed, and numerical evidence of the efficacy of these ideas is presented.
A numerical algorithm for optimal feedback gains in high dimensional LQR problems
NASA Technical Reports Server (NTRS)
Banks, H. T.; Ito, K.
1986-01-01
A hybrid method for computing the feedback gains in linear quadratic regulator problems is proposed. The method, which combines the use of a Chandrasekhar type system with an iteration of the Newton-Kleinman form with variable acceleration parameter Smith schemes, is formulated so as to efficiently compute the feedback gains directly rather than solutions of an associated Riccati equation. The hybrid method is particularly appropriate when used with large dimensional systems such as those arising in approximating infinite dimensional (distributed parameter) control systems (e.g., those governed by delay-differential and partial differential equations). Computational advantages of the proposed algorithm over the standard eigenvector (Potter, Laub-Schur) based techniques are discussed, and numerical evidence of the efficacy of these ideas is presented.
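A minimal sketch of the Newton-Kleinman ingredient of such a scheme, assuming a stabilizing initial gain and using SciPy's Lyapunov solver (the Chandrasekhar system and the Smith acceleration of the papers above are not reproduced):

import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def newton_kleinman(A, B, Q, R, K0, n_iter=20):
    K = K0                                       # K0 must stabilize A - B K0
    for _ in range(n_iter):
        Ak = A - B @ K
        # Solve Ak^T P + P Ak = -(Q + K^T R K) for P at each Newton step.
        P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
        K = np.linalg.solve(R, B.T @ P)          # K = R^{-1} B^T P
    return K, P

if __name__ == "__main__":
    A = np.array([[0.0, 1.0], [0.0, -0.5]])
    B = np.array([[0.0], [1.0]])
    Q, R = np.eye(2), np.eye(1)
    K0 = np.array([[1.0, 1.0]])                  # stabilizing initial gain
    K, P = newton_kleinman(A, B, Q, R, K0)
    print(K)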
Computational complexities and storage requirements of some Riccati equation solvers
NASA Technical Reports Server (NTRS)
Utku, Senol; Garba, John A.; Ramesh, A. V.
1989-01-01
The linear optimal control problem of an nth-order time-invariant dynamic system with a quadratic performance functional is usually solved by the Hamilton-Jacobi approach. This leads to the solution of the differential matrix Riccati equation with a terminal condition. The bulk of the computation for the optimal control problem is related to the solution of this equation. There are various algorithms in the literature for solving the matrix Riccati equation. However, computational complexities and storage requirements as a function of numbers of state variables, control variables, and sensors are not available for all these algorithms. In this work, the computational complexities and storage requirements for some of these algorithms are given. These expressions show the immensity of the computational requirements of the algorithms in solving the Riccati equation for large-order systems such as the control of highly flexible space structures. The expressions are also needed to compute the speedup and efficiency of any implementation of these algorithms on concurrent machines.
NASA Technical Reports Server (NTRS)
Shishir, Pandya; Chaderjian, Neal; Ahmad, Jsaim; Kwak, Dochan (Technical Monitor)
2001-01-01
Flow simulations using the time-dependent Navier-Stokes equations remain a challenge for several reasons. Principal among them are the difficulty of accurately modeling complex flows and the time needed to perform the computations. A parametric study of such complex problems is not considered practical because of the large cost of computing many time-dependent solutions. The computation time for each solution must be reduced in order to make a parametric study possible. With a successful reduction in computation time, the issues of accuracy and the appropriateness of turbulence models will become more tractable.
NASA Astrophysics Data System (ADS)
Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng
2018-02-01
De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we propose a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPU). We then design an imaging-point parallel strategy to achieve optimal parallel computing performance, and adopt an asynchronous double buffering scheme for multi-stream GPU/CPU parallel computing. Moreover, several key optimization strategies for computation and storage based on the compute unified device architecture (CUDA) were adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improved the efficiency. A numerical example employing real large-scale, 3D seismic data showed that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significantly reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.
Aeroelastic analysis of versatile thermal insulation (VTI) panels with pinched boundary conditions
NASA Astrophysics Data System (ADS)
Carrera, Erasmo; Zappino, Enrico; Patočka, Karel; Komarek, Martin; Ferrarese, Adriano; Montabone, Mauro; Kotzias, Bernhard; Huermann, Brian; Schwane, Richard
2014-03-01
Launch vehicle design and analysis is a crucial problem in space engineering. The large range of external conditions and the complexity of space vehicles make the solution of the problem very challenging. The problem considered in the present work deals with the versatile thermal insulation (VTI) panel. This thermal protection system is designed to reduce heat fluxes on the LH2 tank during long coasting phases. Because of the unconventional boundary conditions and the large-scale geometry of the panel, the aeroelastic behaviour of VTI is investigated in the present work. Known results from the literature on similar problems are reviewed, considering the effect of various Mach regimes, including boundary-layer thickness effects, in-plane mechanical and thermal loads, non-linear effects and the amplitude of limit-cycle oscillations. A dedicated finite element model is developed for the supersonic regime. The models used for coupling the orthotropic layered structural model with Piston Theory aerodynamic models allow the calculation of flutter conditions for curved panels supported at a discrete number of points. An advanced computational aeroelasticity tool is developed using several dedicated commercial software packages (CFX, ZAERO, EDGE). A wind tunnel test campaign is carried out to assess the computational tool in the analysis of this type of problem.
Gong, Pinghua; Zhang, Changshui; Lu, Zhaosong; Huang, Jianhua Z; Ye, Jieping
2013-01-01
Non-convex sparsity-inducing penalties have recently received considerable attention in sparse learning. Recent theoretical investigations have demonstrated their superiority over the convex counterparts in several sparse learning settings. However, solving the non-convex optimization problems associated with non-convex penalties remains a big challenge. A commonly used approach is the Multi-Stage (MS) convex relaxation (or DC programming), which relaxes the original non-convex problem to a sequence of convex problems. This approach is usually not very practical for large-scale problems because its computational cost is a multiple of solving a single convex problem. In this paper, we propose a General Iterative Shrinkage and Thresholding (GIST) algorithm to solve the non-convex optimization problem for a large class of non-convex penalties. The GIST algorithm iteratively solves a proximal operator problem, which in turn has a closed-form solution for many commonly used penalties. At each outer iteration of the algorithm, we use a line search initialized by the Barzilai-Borwein (BB) rule that allows finding an appropriate step size quickly. The paper also presents a detailed convergence analysis of the GIST algorithm. The efficiency of the proposed algorithm is demonstrated by extensive experiments on large-scale data sets.
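The sketch below illustrates the iterative-shrinkage structure with a BB-initialized step size and a simple monotone backtracking rule; the l1 penalty (soft-thresholding) stands in for the paper's non-convex penalties, and the non-monotone line search is not reproduced.

import numpy as np

def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def gist_like(grad_f, f, reg, x0, lam=0.1, n_iter=100):
    x = np.asarray(x0, float)
    x_prev, g_prev = x, grad_f(x)
    t = 1.0
    for _ in range(n_iter):
        g = grad_f(x)
        s, ydiff = x - x_prev, g - g_prev
        if float(s @ ydiff) > 1e-12:
            t = float(s @ s) / float(s @ ydiff)       # BB initialization of the step
        step = t
        while True:                                   # simple monotone backtracking
            x_new = soft_threshold(x - step * g, lam * step)
            decrease = (f(x) + lam * reg(x)) - (f(x_new) + lam * reg(x_new))
            if decrease >= 1e-4 / (2.0 * step) * float((x_new - x) @ (x_new - x)) or step < 1e-12:
                break
            step *= 0.5
        x_prev, g_prev, x = x, g, x_new
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, b = rng.normal(size=(50, 20)), rng.normal(size=50)
    f = lambda x: 0.5 * float(np.linalg.norm(A @ x - b) ** 2)
    grad_f = lambda x: A.T @ (A @ x - b)
    reg = lambda x: float(np.abs(x).sum())
    x = gist_like(grad_f, f, reg, np.zeros(20))
    print(np.count_nonzero(np.abs(x) > 1e-8))         # number of active coefficients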
ERIC Educational Resources Information Center
Atwood, Nancy K.
School districts have begun examining the feasibility of, and in some cases are developing and implementing automated systems for, managing and evaluating instructional programs. This paper describes and analyzes the issues and problems that emerged over the course of three projects--a large suburban school in the West, a consortium of five small…
Adaptive probabilistic collocation based Kalman filter for unsaturated flow problem
NASA Astrophysics Data System (ADS)
Man, J.; Li, W.; Zeng, L.; Wu, L.
2015-12-01
The ensemble Kalman filter (EnKF) has gained popularity in hydrological data assimilation problems. As a Monte Carlo based method, it usually requires a relatively large ensemble size to guarantee accuracy. As an alternative approach, the probabilistic collocation based Kalman filter (PCKF) employs Polynomial Chaos to approximate the original system; in this way, the sampling error can be reduced. However, PCKF suffers from the so-called "curse of dimensionality": when the system nonlinearity is strong and the number of parameters is large, PCKF is even more computationally expensive than EnKF. Motivated by recent developments in uncertainty quantification, we propose a restart adaptive probabilistic collocation based Kalman filter (RAPCKF) for data assimilation in unsaturated flow problems. During the implementation of RAPCKF, the important parameters are identified and active PCE basis functions are adaptively selected. The "restart" technique is used to alleviate the inconsistency between model parameters and states. The performance of RAPCKF is tested on numerical cases of unsaturated flow. It is shown that, at the same computational cost, RAPCKF outperforms EnKF. Compared with the traditional PCKF, RAPCKF is more applicable to strongly nonlinear and high-dimensional problems.
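For context, a minimal sketch of the standard EnKF analysis step that PCKF/RAPCKF aim to make cheaper, assuming a linear observation operator and additive Gaussian observation error (illustrative only, not the authors' code):

import numpy as np

def enkf_analysis(X, y, H, R, rng):
    """X: (n_state, n_ens) forecast ensemble; y: (n_obs,) observation vector."""
    n_state, n_ens = X.shape
    A = X - X.mean(axis=1, keepdims=True)             # ensemble anomalies
    P = A @ A.T / (n_ens - 1)                         # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)      # Kalman gain
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, n_ens).T  # perturbed obs
    return X + K @ (Y - H @ X)                        # analysis (updated) ensemble

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(3, 50))                      # 3 states, 50 ensemble members
    H = np.array([[1.0, 0.0, 0.0]])                   # observe the first state only
    R = np.array([[0.1]])
    print(enkf_analysis(X, np.array([0.5]), H, R, rng).mean(axis=1))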
NASA Technical Reports Server (NTRS)
Lee, C. S. G.; Chen, C. L.
1989-01-01
Two efficient mapping algorithms for scheduling the robot inverse dynamics computation consisting of m computational modules with precedence relationship to be executed on a multiprocessor system consisting of p identical homogeneous processors with processor and communication costs to achieve minimum computation time are presented. An objective function is defined in terms of the sum of the processor finishing time and the interprocessor communication time. The minimax optimization is performed on the objective function to obtain the best mapping. This mapping problem can be formulated as a combination of the graph partitioning and the scheduling problems; both have been known to be NP-complete. Thus, to speed up the searching for a solution, two heuristic algorithms were proposed to obtain fast but suboptimal mapping solutions. The first algorithm utilizes the level and the communication intensity of the task modules to construct an ordered priority list of ready modules and the module assignment is performed by a weighted bipartite matching algorithm. For a near-optimal mapping solution, the problem can be solved by the heuristic algorithm with simulated annealing. These proposed optimization algorithms can solve various large-scale problems within a reasonable time. Computer simulations were performed to evaluate and verify the performance and the validity of the proposed mapping algorithms. Finally, experiments for computing the inverse dynamics of a six-jointed PUMA-like manipulator based on the Newton-Euler dynamic equations were implemented on an NCUBE/ten hypercube computer to verify the proposed mapping algorithms. Computer simulation and experimental results are compared and discussed.
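As a hedged illustration of the assignment step described above, the sketch below matches ready task modules to processors by minimum-cost bipartite matching; the cost entries (a processing cost plus a communication term) are illustrative, not the paper's exact objective.

import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_modules(proc_cost, comm_cost):
    """proc_cost, comm_cost: (n_modules, n_processors) cost matrices."""
    total = proc_cost + comm_cost
    rows, cols = linear_sum_assignment(total)          # minimum-cost matching
    return list(zip(rows.tolist(), cols.tolist())), float(total[rows, cols].sum())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    proc = rng.uniform(1.0, 5.0, size=(4, 4))          # module-on-processor costs
    comm = rng.uniform(0.0, 2.0, size=(4, 4))          # interprocessor communication costs
    pairs, cost = assign_modules(proc, comm)
    print(pairs, cost)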
Fast solver for large scale eddy current non-destructive evaluation problems
NASA Astrophysics Data System (ADS)
Lei, Naiguang
Eddy current testing plays a very important role in the non-destructive evaluation of conducting test samples. Based on Faraday's law, an alternating magnetic field source generates induced currents, called eddy currents, in an electrically conducting test specimen. The eddy currents generate induced magnetic fields that oppose the direction of the inducing magnetic field in accordance with Lenz's law. In the presence of discontinuities in material property or defects in the test specimen, the induced eddy current paths are perturbed and the associated magnetic fields can be detected by coils or magnetic field sensors, such as Hall elements or magneto-resistance sensors. Due to the complexity of the test specimens and the inspection environments, the availability of theoretical simulation models is extremely valuable for studying the basic field/flaw interactions in order to obtain a fuller understanding of non-destructive testing phenomena. Theoretical models of the forward problem are also useful for training and validation of automated defect detection systems, since they generate defect signatures that are expensive to replicate experimentally. In general, modelling methods can be classified into two categories: analytical and numerical. Although analytical approaches offer closed-form solutions, such solutions are generally not obtainable, largely due to the complex sample and defect geometries, especially in three-dimensional space. Numerical modelling has become popular with advances in computer technology and computational methods. However, due to the huge time consumption in the case of large-scale problems, accelerations/fast solvers are needed to enhance numerical models. This dissertation describes a numerical simulation model for eddy current problems using finite element analysis. Validation of the accuracy of this model is demonstrated via comparison with experimental measurements of steam generator tube wall defects. These simulations, which generate two-dimensional raster-scan data, typically take one to two days on a dedicated eight-core PC. A novel direct integral solver for eddy current problems and a GPU-based implementation are also investigated in this research to reduce the computational time.
Error Detection and Recovery for Robot Motion Planning with Uncertainty.
1987-07-01
plans for these problems. This intuition - which is a heuristic claim, so the reader is advised to proceed with caution - should be verified or disproven ... that might work, but fail in a "reasonable" way when they cannot. While EDR is largely motivated by the problems of uncertainty and model error, its ... definition for EDR strategies and show how they can be computed. This theory represents what is perhaps the first systematic attack on the problem of
Accelerating Large Scale Image Analyses on Parallel, CPU-GPU Equipped Systems
Teodoro, George; Kurc, Tahsin M.; Pan, Tony; Cooper, Lee A.D.; Kong, Jun; Widener, Patrick; Saltz, Joel H.
2014-01-01
The past decade has witnessed a major paradigm shift in high performance computing with the introduction of accelerators as general purpose processors. These computing devices make available very high parallel computing power at low cost and power consumption, transforming current high performance platforms into heterogeneous CPU-GPU equipped systems. Although the theoretical performance achieved by these hybrid systems is impressive, taking practical advantage of this computing power remains a very challenging problem. Most applications are still deployed to either GPU or CPU, leaving the other resource under- or un-utilized. In this paper, we propose, implement, and evaluate a performance aware scheduling technique along with optimizations to make efficient collaborative use of CPUs and GPUs on a parallel system. In the context of feature computations in large scale image analysis applications, our evaluations show that intelligently co-scheduling CPUs and GPUs can significantly improve performance over GPU-only or multi-core CPU-only approaches. PMID:25419545
NASA Technical Reports Server (NTRS)
Johnson, D. R.; Uccellini, L. W.
1983-01-01
In connection with the employment of the sigma coordinates introduced by Phillips (1957), problems can arise regarding an accurate finite-difference computation of the pressure gradient force. Over steeply sloped terrain, the calculation of the sigma-coordinate pressure gradient force involves computing the difference between two large terms of opposite sign, which results in large truncation error. To reduce the truncation error, several finite-difference methods have been designed and implemented. The objective of the present investigation is to provide another method of computing the sigma-coordinate pressure gradient force. Phillips' method of eliminating a hydrostatic component is applied to a flux formulation. The new technique is compared with four other methods for computing the pressure gradient force. The work is motivated by the desire to use an isentropic and sigma-coordinate hybrid model for experiments designed to study flow near mountainous terrain.
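For context, a standard textbook form of the sigma-coordinate pressure gradient force (not quoted from the paper) shows the two nearly cancelling terms in question, with σ = p/p_s, φ the geopotential, T the temperature, R the gas constant and p_s the surface pressure:

\[
-\frac{1}{\rho}\,\nabla_z p \;=\; -\nabla_\sigma \phi \;-\; R\,T\,\nabla_\sigma \ln p_s .
\]

Over steeply sloped terrain the two right-hand terms are individually large and of opposite sign, so their finite-difference evaluation is dominated by cancellation, which is the truncation-error problem the abstract addresses.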
The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update
Afgan, Enis; Baker, Dannon; van den Beek, Marius; Blankenberg, Daniel; Bouvier, Dave; Čech, Martin; Chilton, John; Clements, Dave; Coraor, Nate; Eberhard, Carl; Grüning, Björn; Guerler, Aysam; Hillman-Jackson, Jennifer; Von Kuster, Greg; Rasche, Eric; Soranzo, Nicola; Turaga, Nitesh; Taylor, James; Nekrutenko, Anton; Goecks, Jeremy
2016-01-01
High-throughput data production technologies, particularly ‘next-generation’ DNA sequencing, have ushered in widespread and disruptive changes to biomedical research. Making sense of the large datasets produced by these technologies requires sophisticated statistical and computational methods, as well as substantial computational power. This has led to an acute crisis in life sciences, as researchers without informatics training attempt to perform computation-dependent analyses. Since 2005, the Galaxy project has worked to address this problem by providing a framework that makes advanced computational tools usable by non experts. Galaxy seeks to make data-intensive research more accessible, transparent and reproducible by providing a Web-based environment in which users can perform computational analyses and have all of the details automatically tracked for later inspection, publication, or reuse. In this report we highlight recently added features enabling biomedical analyses on a large scale. PMID:27137889
Cooley, R.L.; Hill, M.C.
1992-01-01
Three methods of solving nonlinear least-squares problems were compared for robustness and efficiency using a series of hypothetical and field problems. A modified Gauss-Newton/full Newton hybrid method (MGN/FN) and an analogous method for which part of the Hessian matrix was replaced by a quasi-Newton approximation (MGN/QN) solved some of the problems with appreciably fewer iterations than required using only a modified Gauss-Newton (MGN) method. In these problems, model nonlinearity and a large variance for the observed data apparently caused MGN to converge more slowly than MGN/FN or MGN/QN after the sum of squared errors had almost stabilized. Other problems were solved as efficiently with MGN as with MGN/FN or MGN/QN. Because MGN/FN can require significantly more computer time per iteration and more computer storage for transient problems, it is less attractive for a general purpose algorithm than MGN/QN.
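For readers unfamiliar with the Gauss-Newton family compared above, here is a minimal sketch of a damped Gauss-Newton iteration on a toy exponential-decay fit; it is not the MGN/FN/QN implementation used in the study.

import numpy as np

def gauss_newton(residual, jacobian, x0, n_iter=20, damping=1.0):
    x = np.asarray(x0, float)
    for _ in range(n_iter):
        r = residual(x)
        J = jacobian(x)
        dx = np.linalg.solve(J.T @ J, -J.T @ r)    # Gauss-Newton direction from the normal equations
        x = x + damping * dx
    return x

if __name__ == "__main__":
    # Toy fit of m(t) = a * exp(-b t) to noiseless data generated with a = 2, b = 0.5.
    t = np.linspace(0.0, 5.0, 30)
    y = 2.0 * np.exp(-0.5 * t)
    residual = lambda x: x[0] * np.exp(-x[1] * t) - y
    jacobian = lambda x: np.column_stack([np.exp(-x[1] * t),
                                          -x[0] * t * np.exp(-x[1] * t)])
    print(gauss_newton(residual, jacobian, np.array([1.0, 1.0])))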
Novel metaheuristic for parameter estimation in nonlinear dynamic biological systems
Rodriguez-Fernandez, Maria; Egea, Jose A; Banga, Julio R
2006-01-01
Background We consider the problem of parameter estimation (model calibration) in nonlinear dynamic models of biological systems. Due to the frequent ill-conditioning and multi-modality of many of these problems, traditional local methods usually fail (unless initialized with very good guesses of the parameter vector). In order to surmount these difficulties, global optimization (GO) methods have been suggested as robust alternatives. Currently, deterministic GO methods cannot solve problems of realistic size within this class in reasonable computation times. In contrast, certain types of stochastic GO methods have shown promising results, although the computational cost remains large. Rodriguez-Fernandez and coworkers have presented hybrid stochastic-deterministic GO methods which could reduce computation time by one order of magnitude while guaranteeing robustness. Our goal here was to further reduce the computational effort without losing robustness. Results We have developed a new procedure based on the scatter search methodology for nonlinear optimization of dynamic models of arbitrary (or even unknown) structure (i.e. black-box models). In this contribution, we describe and apply this novel metaheuristic, inspired by recent developments in the field of operations research, to a set of complex identification problems and we make a critical comparison with respect to the previous (above mentioned) successful methods. Conclusion Robust and efficient methods for parameter estimation are of key importance in systems biology and related areas. The new metaheuristic presented in this paper aims to ensure the proper solution of these problems by adopting a global optimization approach, while keeping the computational effort under reasonable values. This new metaheuristic was applied to a set of three challenging parameter estimation problems of nonlinear dynamic biological systems, outperforming very significantly all the methods previously used for these benchmark problems. PMID:17081289
A new tool called DISSECT for analysing large genomic data sets using a Big Data approach
Canela-Xandri, Oriol; Law, Andy; Gray, Alan; Woolliams, John A.; Tenesa, Albert
2015-01-01
Large-scale genetic and genomic data are increasingly available and the major bottleneck in their analysis is a lack of sufficiently scalable computational tools. To address this problem in the context of complex traits analysis, we present DISSECT. DISSECT is a new and freely available software that is able to exploit the distributed-memory parallel computational architectures of compute clusters, to perform a wide range of genomic and epidemiologic analyses, which currently can only be carried out on reduced sample sizes or under restricted conditions. We demonstrate the usefulness of our new tool by addressing the challenge of predicting phenotypes from genotype data in human populations using mixed-linear model analysis. We analyse simulated traits from 470,000 individuals genotyped for 590,004 SNPs in ∼4 h using the combined computational power of 8,400 processor cores. We find that prediction accuracies in excess of 80% of the theoretical maximum could be achieved with large sample sizes. PMID:26657010
Hypercube matrix computation task
NASA Technical Reports Server (NTRS)
Calalo, R.; Imbriale, W.; Liewer, P.; Lyons, J.; Manshadi, F.; Patterson, J.
1987-01-01
The Hypercube Matrix Computation (Year 1986-1987) task investigated the applicability of a parallel computing architecture to the solution of large scale electromagnetic scattering problems. Two existing electromagnetic scattering codes were selected for conversion to the Mark III Hypercube concurrent computing environment. They were selected so that the underlying numerical algorithms utilized would be different, thereby providing a more thorough evaluation of the appropriateness of the parallel environment for these types of problems. The first code was a frequency domain method of moments solution, NEC-2, developed at Lawrence Livermore National Laboratory. The second code was a time domain finite difference solution of Maxwell's equations to solve for the scattered fields. Once the codes were implemented on the hypercube and verified to obtain correct solutions by comparing the results with those from sequential runs, several measures were used to evaluate the performance of the two codes. First, a comparison was provided of the problem size possible on the hypercube with 128 megabytes of memory for a 32-node configuration with that available in a typical sequential user environment of 4 to 8 megabytes. Then, the performance of the codes was analyzed for the computational speedup attained by the parallel architecture.
Large-scale structural analysis: The structural analyst, the CSM Testbed and the NAS System
NASA Technical Reports Server (NTRS)
Knight, Norman F., Jr.; Mccleary, Susan L.; Macy, Steven C.; Aminpour, Mohammad A.
1989-01-01
The Computational Structural Mechanics (CSM) activity is developing advanced structural analysis and computational methods that exploit high-performance computers. Methods are developed in the framework of the CSM testbed software system and applied to representative complex structural analysis problems from the aerospace industry. An overview of the CSM testbed methods development environment is presented and some numerical methods developed on a CRAY-2 are described. Selected application studies performed on the NAS CRAY-2 are also summarized.
Communication-Efficient Arbitration Models for Low-Resolution Data Flow Computing
1988-12-01
… phase can be formally described as the Graph Partitioning Problem, which is NP-complete (Garey & Johnson): given a graph G = (V, E) with weights w(v) for each v ∈ V …
Coupled Finite Volume and Finite Element Method Analysis of a Complex Large-Span Roof Structure
NASA Astrophysics Data System (ADS)
Szafran, J.; Juszczyk, K.; Kamiński, M.
2017-12-01
The main goal of this paper is to present coupled Computational Fluid Dynamics and structural analysis for the precise determination of wind impact on internal forces and deformations of structural elements of a long-span roof structure. The Finite Volume Method (FVM) serves for a solution of the fluid flow problem to model the air flow around the structure, whose results are applied in turn as the boundary tractions in the Finite Element Method problem structural solution for the linear elastostatics with small deformations. The first part is carried out with the use of the ANSYS 15.0 computer system, whereas the FEM system Robot supports the stress analysis of particular roof members. A comparison of the wind pressure distribution throughout the roof surface shows some differences with respect to that available in engineering design codes like Eurocode, which deserves separate further numerical studies. Coupling of these two separate numerical techniques appears to be promising in view of future computational models of stochastic nature in large scale structural systems due to the stochastic perturbation method.
Matpar: Parallel Extensions for MATLAB
NASA Technical Reports Server (NTRS)
Springer, P. L.
1998-01-01
Matpar is a set of client/server software that allows a MATLAB user to take advantage of a parallel computer for very large problems. The user can replace calls to certain built-in MATLAB functions with calls to Matpar functions.
ERIC Educational Resources Information Center
Goldhammer, Frank; Naumann, Johannes; Stelter, Annette; Tóth, Krisztina; Rölke, Heiko; Klieme, Eckhard
2014-01-01
Computer-based assessment can provide new insights into behavioral processes of task completion that cannot be uncovered by paper-based instruments. Time is a major characteristic of the task completion process. Psychologically, time on task has 2 different interpretations, suggesting opposing associations with task outcome: Spending more…
Determination of aerodynamic sensitivity coefficients for wings in transonic flow
NASA Technical Reports Server (NTRS)
Carlson, Leland A.; El-Banna, Hesham M.
1992-01-01
The quasianalytical approach is applied to the 3-D full potential equation to compute wing aerodynamic sensitivity coefficients in the transonic regime. Symbolic manipulation is used to reduce the effort associated with obtaining the sensitivity equations, and the large sensitivity system is solved using 'state of the art' routines. The quasianalytical approach is believed to be reasonably accurate and computationally efficient for 3-D problems.
Erol Sarigul; A. Lynn Abbott; Daniel L. Schmoldt; Philip A. Araman
2005-01-01
This paper describes recent progress in the analysis of computed tomography (CT) images of hardwood logs. The long-term goal of the work is to develop a system that is capable of autonomous (or semiautonomous) detection of internal defects, so that log breakdown decisions can be optimized based on defect locations. The problem is difficult because wood exhibits large...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Petra, C.; Gavrea, B.; Anitescu, M.
2009-01-01
The present work aims at comparing the performance of several quadratic programming (QP) solvers for simulating large-scale frictional rigid-body systems. Traditional time-stepping schemes for simulation of multibody systems are formulated as linear complementarity problems (LCPs) with copositive matrices. Such LCPs are generally solved by means of Lemke-type algorithms and solvers such as the PATH solver proved to be robust. However, for large systems, the PATH solver or any other pivotal algorithm becomes unpractical from a computational point of view. The convex relaxation proposed by one of the authors allows the formulation of the integration step as a QPD, for which a wide variety of state-of-the-art solvers are available. In what follows we report the results obtained solving that subproblem when using the QP solvers MOSEK, OOQP, TRON, and BLMVM. OOQP is presented with both the symmetric indefinite solver MA27 and our Cholesky reformulation using the CHOLMOD package. We investigate computational performance and address the correctness of the results from a modeling point of view. We conclude that the OOQP solver, particularly with the CHOLMOD linear algebra solver, has predictable performance and memory use patterns and is far more competitive for these problems than are the other solvers.
NASA Technical Reports Server (NTRS)
Noor, A. K.
1983-01-01
Advances in continuum modeling, progress in reduction methods, and analysis and modeling needs for large space structures are covered with specific attention given to repetitive lattice trusses. As far as continuum modeling is concerned, an effective and verified analysis capability exists for linear thermoelastic stress, bifurcation buckling, and free vibration problems of repetitive lattices. However, application of continuum modeling to nonlinear analysis needs more development. Reduction methods are very effective for bifurcation buckling and static (steady-state) nonlinear analysis. However, more work is needed to realize their full potential for nonlinear dynamic and time-dependent problems. As far as analysis and modeling needs are concerned, three areas are identified: loads determination, modeling and nonclassical behavior characteristics, and computational algorithms. The impact of new advances in computer hardware, software, integrated analysis, CAD/CAM systems, and materials technology is also discussed.
A k-Vector Approach to Sampling, Interpolation, and Approximation
NASA Astrophysics Data System (ADS)
Mortari, Daniele; Rogers, Jonathan
2013-12-01
The k-vector search technique is a method designed to perform extremely fast range searching of large databases at computational cost independent of the size of the database. k-vector search algorithms have historically found application in satellite star-tracker navigation systems which index very large star catalogues repeatedly in the process of attitude estimation. Recently, the k-vector search algorithm has been applied to numerous other problem areas including non-uniform random variate sampling, interpolation of 1-D or 2-D tables, nonlinear function inversion, and solution of systems of nonlinear equations. This paper presents algorithms in which the k-vector search technique is used to solve each of these problems in a computationally-efficient manner. In instances where these tasks must be performed repeatedly on a static (or nearly-static) data set, the proposed k-vector-based algorithms offer an extremely fast solution technique that outperforms standard methods.
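A much simplified, hypothetical sketch of the k-vector idea is given below (the real k-vector builds its index around a best-fit line through the sorted data rather than the uniform bins used here, but the preprocess-once, constant-cost-query pattern is the same):

```python
import numpy as np

# Simplified k-vector-style range search (illustrative only). After an
# O(N log N) preprocessing step, each range query costs index arithmetic that
# is independent of the database size, plus the cost of returning the answers.

class KVectorLite:
    def __init__(self, data, n_bins=1024):
        self.vals = np.sort(np.asarray(data, dtype=float))
        self.lo, self.hi = self.vals[0], self.vals[-1]
        self.n_bins = n_bins
        self.step = (self.hi - self.lo) / n_bins
        edges = np.linspace(self.lo, self.hi, n_bins + 1)
        # first index at/after each bin edge, and first index strictly past it
        self.k_lo = np.searchsorted(self.vals, edges, side="left")
        self.k_hi = np.searchsorted(self.vals, edges, side="right")

    def _bin(self, x):
        return int(np.clip((x - self.lo) // self.step, 0, self.n_bins - 1))

    def range_query(self, a, b):
        """Return all stored values v with a <= v <= b."""
        cand = self.vals[self.k_lo[self._bin(a)]:self.k_hi[self._bin(b) + 1]]
        return cand[(cand >= a) & (cand <= b)]   # trim the two boundary bins

rng = np.random.default_rng(0)
kv = KVectorLite(rng.uniform(0.0, 1.0, 100_000))
print(len(kv.range_query(0.25, 0.26)))   # ~1000 hits, found without a full scan
```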
Systems of Inhomogeneous Linear Equations
NASA Astrophysics Data System (ADS)
Scherer, Philipp O. J.
Many problems in physics and especially computational physics involve systems of linear equations which arise e.g. from linearization of a general nonlinear problem or from discretization of differential equations. If the dimension of the system is not too large standard methods like Gaussian elimination or QR decomposition are sufficient. Systems with a tridiagonal matrix are important for cubic spline interpolation and numerical second derivatives. They can be solved very efficiently with a specialized Gaussian elimination method. Practical applications often involve very large dimensions and require iterative methods. Convergence of Jacobi and Gauss-Seidel methods is slow and can be improved by relaxation or over-relaxation. An alternative for large systems is the method of conjugate gradients.
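As a concrete example of the specialized Gaussian elimination mentioned above for tridiagonal systems, here is a minimal Thomas-algorithm sketch in Python (illustrative; a production code would typically call a banded solver such as scipy.linalg.solve_banded):

```python
import numpy as np

# Thomas algorithm for a tridiagonal system A x = d with sub-diagonal a,
# diagonal b, super-diagonal c and right-hand side d (a[0], c[-1] are ignored).

def thomas(a, b, c, d):
    n = len(d)
    cp, dp = np.empty(n), np.empty(n)
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):                       # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = np.empty(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

# Second-derivative matrix from a 1-D finite-difference discretization.
n = 6
a = np.full(n, 1.0); b = np.full(n, -2.0); c = np.full(n, 1.0)
print(thomas(a, b, c, np.ones(n)))
```

The same O(n) elimination underlies cubic spline interpolation and numerical second derivatives mentioned in the abstract above.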
FastMag: Fast micromagnetic simulator for complex magnetic structures (invited)
NASA Astrophysics Data System (ADS)
Chang, R.; Li, S.; Lubarda, M. V.; Livshitz, B.; Lomakin, V.
2011-04-01
A fast micromagnetic simulator (FastMag) for general problems is presented. FastMag solves the Landau-Lifshitz-Gilbert equation and can handle multiscale problems with a high computational efficiency. The simulator derives its high performance from efficient methods for evaluating the effective field and from implementations on massively parallel graphics processing unit (GPU) architectures. FastMag discretizes the computational domain into tetrahedral elements and therefore is highly flexible for general problems. The magnetostatic field is computed via the superposition principle for both volume and surface parts of the computational domain. This is accomplished by implementing efficient quadrature rules and analytical integration for overlapping elements in which the integral kernel is singular. Thus, discretized superposition integrals are computed using a nonuniform grid interpolation method, which evaluates the field from N sources at N collocated observers in O(N) operations. This approach allows handling objects of arbitrary shape, allows easy calculation of the field outside the magnetized domains, does not require solving a linear system of equations, and requires little memory. FastMag is implemented on GPUs, with GPU to central processing unit speed-ups of 2 orders of magnitude. Simulations are shown of a large array of magnetic dots and a recording head fully discretized down to the exchange length, with over a hundred million tetrahedral elements on an inexpensive desktop computer.
Extending Strong Scaling of Quantum Monte Carlo to the Exascale
NASA Astrophysics Data System (ADS)
Shulenburger, Luke; Baczewski, Andrew; Luo, Ye; Romero, Nichols; Kent, Paul
Quantum Monte Carlo is one of the most accurate and most computationally expensive methods for solving the electronic structure problem. In spite of its significant computational expense, its massively parallel nature is ideally suited to petascale computers, which have enabled a wide range of applications to relatively large molecular and extended systems. Exascale capabilities have the potential to enable the application of QMC to significantly larger systems, capturing much of the complexity of real materials such as defects and impurities. However, both memory and computational demands will require significant changes to current algorithms to realize this possibility. This talk will detail both the causes of the problem and potential solutions. Sandia National Laboratories is a multi-mission laboratory managed and operated by Sandia Corp, a wholly owned subsidiary of Lockheed Martin Corp, for the US Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
An immersed boundary method for modeling a dirty geometry data
NASA Astrophysics Data System (ADS)
Onishi, Keiji; Tsubokura, Makoto
2017-11-01
We present a robust, fast, and low preparation cost immersed boundary method (IBM) for simulating an incompressible high Re flow around highly complex geometries. The method is achieved by the dispersion of the momentum by the axial linear projection and an approximate domain assumption that satisfies mass conservation around the wall-containing cells. This methodology has been verified against an analytical theory and wind tunnel experiment data. Next, we simulate the problem of flow around a rotating object and demonstrate the ability of this methodology to handle moving-geometry problems. This methodology offers a promising route to obtaining quick solutions on next-generation large-scale supercomputers. This research was supported by MEXT as ``Priority Issue on Post-K computer'' (Development of innovative design and production processes) and used computational resources of the K computer provided by the RIKEN Advanced Institute for Computational Science.
Cloud computing for comparative genomics
2010-01-01
Background Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. Results We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. Conclusions The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems. PMID:20482786
Discrete is it enough? The revival of Piola-Hencky keynotes to analyze three-dimensional Elastica
NASA Astrophysics Data System (ADS)
Turco, Emilio
2018-04-01
Complex problems such as those concerning the mechanics of materials can be confronted only by considering numerical simulations. Analytical methods are useful for building guidelines or reference solutions but, for general cases of technical interest, the problems have to be solved numerically, especially in the case of large displacements and deformations. Continuous models probably arose to produce inspiring examples and stemmed from homogenization techniques. These techniques allowed for the solution of some paradigmatic examples but, in general, always require a discretization method for solving problems dictated by the applications. Therefore, and also taking into account that computing power is nowadays more widely available and cheap, the question arises: why not use a discrete model for 3D beams directly? In other words, it could be interesting to formulate a discrete model without using an intermediate continuum one, as the latter has to be discretized in the end anyway. These simple considerations immediately evoke some very basic models developed many years ago, when computing power was practically nonexistent but the problem of finding simple solutions to beam deformation problems was already emerging. Actually, in recent years, the keynotes of Hencky and Piola have attracted renewed attention [see, one for all, the work (Turco et al. in Zeitschrift für Angewandte Mathematik und Physik 67(4):1-28, 2016)]: generalizing their results, in the present paper a novel, directly discrete, three-dimensional beam model is presented and discussed in the framework of geometrically nonlinear analysis. Using a stepwise algorithm based essentially on Newton's method to compute the extrapolations and on Riks' arc-length method to perform the corrections, we obtain numerical simulations showing the computational effectiveness of the presented model: indeed, it offers a convenient balance between accuracy and computational cost.
A blended continuous–discontinuous finite element method for solving the multi-fluid plasma model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sousa, E.M., E-mail: sousae@uw.edu; Shumlak, U., E-mail: shumlak@uw.edu
The multi-fluid plasma model represents electrons, multiple ion species, and multiple neutral species as separate fluids that interact through short-range collisions and long-range electromagnetic fields. The model spans a large range of temporal and spatial scales, which renders the model stiff and presents numerical challenges. To address the large range of timescales, a blended continuous and discontinuous Galerkin method is proposed, where the massive ion and neutral species are modeled using an explicit discontinuous Galerkin method while the electrons and electromagnetic fields are modeled using an implicit continuous Galerkin method. This approach is able to capture large-gradient ion and neutral physics like shock formation, while resolving high-frequency electron dynamics in a computationally efficient manner. The details of the Blended Finite Element Method (BFEM) are presented. The numerical method is benchmarked for accuracy and tested using a two-fluid one-dimensional soliton problem and an electromagnetic shock problem. The results are compared to conventional finite volume and finite element methods, and demonstrate that the BFEM is particularly effective in resolving physics in stiff problems involving realistic physical parameters, including realistic electron mass and speed of light. The benefit is illustrated by computing a three-fluid plasma application that demonstrates species separation in multi-component plasmas.
Computational solutions to large-scale data management and analysis
Schadt, Eric E.; Linderman, Michael D.; Sorenson, Jon; Lee, Lawrence; Nolan, Garry P.
2011-01-01
Today we can generate hundreds of gigabases of DNA and RNA sequencing data in a week for less than US$5,000. The astonishing rate of data generation by these low-cost, high-throughput technologies in genomics is being matched by that of other technologies, such as real-time imaging and mass spectrometry-based flow cytometry. Success in the life sciences will depend on our ability to properly interpret the large-scale, high-dimensional data sets that are generated by these technologies, which in turn requires us to adopt advances in informatics. Here we discuss how we can master the different types of computational environments that exist — such as cloud and heterogeneous computing — to successfully tackle our big data problems. PMID:20717155
GPU accelerated FDTD solver and its application in MRI.
Chi, J; Liu, F; Jin, J; Mason, D G; Crozier, S
2010-01-01
The finite difference time domain (FDTD) method is a popular technique for computational electromagnetics (CEM). The large computational power often required, however, has been a limiting factor for its applications. In this paper, we will present a graphics processing unit (GPU)-based parallel FDTD solver and its successful application to the investigation of a novel B1 shimming scheme for high-field magnetic resonance imaging (MRI). The optimized shimming scheme exhibits considerably improved transmit B(1) profiles. The GPU implementation dramatically shortened the runtime of FDTD simulation of electromagnetic field compared with its CPU counterpart. The acceleration in runtime has made such investigation possible, and will pave the way for other studies of large-scale computational electromagnetic problems in modern MRI which were previously impractical.
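For readers unfamiliar with FDTD, a minimal normalized 1-D update loop is sketched below (illustrative only; the GPU solver described above implements the full 3-D scheme with material models). Each field update depends only on nearest neighbours, which is exactly the data-parallel structure that maps well onto a GPU:

```python
import numpy as np

# Minimal 1-D FDTD (Yee) update in vacuum, with fields normalized so that
# c = 1 and the Courant number S = c*dt/dx. Illustrative values throughout.

nx, nt, S = 400, 600, 0.5
ez = np.zeros(nx)          # electric field at integer grid points
hy = np.zeros(nx - 1)      # magnetic field at half grid points

for n in range(nt):
    hy += S * (ez[1:] - ez[:-1])               # update H from the curl of E
    ez[1:-1] += S * (hy[1:] - hy[:-1])         # update E from the curl of H
    ez[50] += np.exp(-((n - 60) / 15.0) ** 2)  # soft Gaussian source

print("peak |Ez| after propagation:", np.abs(ez).max())
```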
Approaches to eliminate waste and reduce cost for recycling glass.
Chao, Chien-Wen; Liao, Ching-Jong
2011-12-01
In recent years, the issue of environmental protection has received considerable attention. This paper adds to the literature by investigating a scheduling problem in the manufacturing of a glass recycling factory in Taiwan. The objective is to minimize the sum of the total holding cost and loss cost. We first represent the problem as an integer programming (IP) model, and then develop two heuristics based on the IP model to find near-optimal solutions for the problem. To validate the proposed heuristics, comparisons between optimal solutions from the IP model and solutions from the current method are conducted. The comparisons involve two problem sizes, small and large, where the small problems range from 15 to 45 jobs, and the large problems from 50 to 100 jobs. Finally, a genetic algorithm is applied to evaluate the proposed heuristics. Computational experiments show that the proposed heuristics can find good solutions in a reasonable time for the considered problem. Copyright © 2011 Elsevier Ltd. All rights reserved.
2016-01-01
Background Computer networks have a tendency to grow at an unprecedented scale. Modern networks involve not only computers but also a wide variety of other interconnected devices ranging from mobile phones to other household items fitted with sensors. This vision of the "Internet of Things" (IoT) implies an inherent difficulty in modeling problems. Purpose It is practically impossible to implement and test all scenarios for large-scale and complex adaptive communication networks as part of Complex Adaptive Communication Networks and Environments (CACOONS). The goal of this study is to explore the use of Agent-based Modeling as part of the Cognitive Agent-based Computing (CABC) framework to model a complex communication network problem. Method We use Exploratory Agent-based Modeling (EABM), as part of the CABC framework, to develop an autonomous multi-agent architecture for managing carbon footprint in a corporate network. To evaluate the application of complexity in practical scenarios, we have also introduced a company-defined computer usage policy. Results The conducted experiments demonstrated two important results. First, a CABC-based modeling approach such as Agent-based Modeling can be effective for modeling complex problems in the domain of IoT. Second, the specific problem of managing the carbon footprint can be solved using a multi-agent system approach. PMID:26812235
Computational complexity of ecological and evolutionary spatial dynamics
Ibsen-Jensen, Rasmus; Chatterjee, Krishnendu; Nowak, Martin A.
2015-01-01
There are deep, yet largely unexplored, connections between computer science and biology. Both disciplines examine how information proliferates in time and space. Central results in computer science describe the complexity of algorithms that solve certain classes of problems. An algorithm is deemed efficient if it can solve a problem in polynomial time, which means the running time of the algorithm is a polynomial function of the length of the input. There are classes of harder problems for which the fastest possible algorithm requires exponential time. Another criterion is the space requirement of the algorithm. There is a crucial distinction between algorithms that can find a solution, verify a solution, or list several distinct solutions in given time and space. The complexity hierarchy that is generated in this way is the foundation of theoretical computer science. Precise complexity results can be notoriously difficult. The famous question whether polynomial time equals nondeterministic polynomial time (i.e., P = NP) is one of the hardest open problems in computer science and all of mathematics. Here, we consider simple processes of ecological and evolutionary spatial dynamics. The basic question is: What is the probability that a new invader (or a new mutant) will take over a resident population? We derive precise complexity results for a variety of scenarios. We therefore show that some fundamental questions in this area cannot be answered by simple equations (assuming that P is not equal to NP). PMID:26644569
NASA Astrophysics Data System (ADS)
Buddala, Raviteja; Mahapatra, Siba Sankar
2017-11-01
Flexible flow shop (or hybrid flow shop) scheduling is an extension of the classical flow shop scheduling problem. In a simple flow shop configuration, a job having `g' operations is performed on `g' operation centres (stages), with each stage having only one machine. If any stage contains more than one machine to provide an alternative processing facility, then the problem becomes a flexible flow shop problem (FFSP). FFSP, which contains all the complexities involved in simple flow shop and parallel machine scheduling problems, is a well-known NP-hard (non-deterministic polynomial time) problem. Owing to the high computational complexity involved in solving these problems, it is not always possible to obtain an optimal solution in a reasonable computation time. To obtain near-optimal solutions in a reasonable computation time, a large variety of meta-heuristics have been proposed in the past. However, tuning algorithm-specific parameters for solving FFSP is rather tricky and time consuming. To address this limitation, teaching-learning-based optimization (TLBO) and the JAYA algorithm are chosen for the study because they are not only recent meta-heuristics but also do not require tuning of algorithm-specific parameters. Although these algorithms seem to be elegant, they lose solution diversity after a few iterations and get trapped at local optima. To alleviate this drawback, a new local search procedure is proposed in this paper to improve the solution quality. Further, a mutation strategy (inspired by the genetic algorithm) is incorporated in the basic algorithm to maintain solution diversity in the population. Computational experiments have been conducted on standard benchmark problems to calculate makespan and computational time. It is found that the rate of convergence of TLBO is superior to that of JAYA. From the results, it is found that TLBO and JAYA outperform many algorithms reported in the literature and can be treated as efficient methods for solving the FFSP.
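The JAYA update rule mentioned above is simple enough to sketch in a few lines; the following Python example applies it to a continuous test function (illustrative only; for the FFSP the position vectors would additionally be decoded into job sequences, e.g. by a random-key rule, with makespan as the objective):

```python
import numpy as np

# Minimal JAYA sketch on the sphere function. Note that, beyond population
# size and iteration count, there are no algorithm-specific parameters to tune.

def jaya(objective, dim, pop_size=20, iters=200, lo=-5.0, hi=5.0, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.uniform(lo, hi, (pop_size, dim))
    fit = np.apply_along_axis(objective, 1, pop)
    for _ in range(iters):
        best, worst = pop[fit.argmin()], pop[fit.argmax()]
        r1, r2 = rng.random((2, pop_size, dim))
        # JAYA move: toward the best solution, away from the worst
        cand = np.clip(pop + r1 * (best - np.abs(pop))
                           - r2 * (worst - np.abs(pop)), lo, hi)
        cand_fit = np.apply_along_axis(objective, 1, cand)
        improved = cand_fit < fit                 # greedy replacement
        pop[improved], fit[improved] = cand[improved], cand_fit[improved]
    return pop[fit.argmin()], fit.min()

x_best, f_best = jaya(lambda x: np.sum(x * x), dim=5)
print(f_best)   # close to 0 for the sphere function
```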
A Computer Assisted Program for the Management of Acute Dental Pain. User’s Manual
1989-07-28
… tooth, unfavorable prognosis; neurologic injury; endo/perio combined problem; osseous sequestrum; enamel fracture; occlusal trauma; food impaction … of tooth, guarded prognosis; enamel fracture; endodontic/periodontic combined problem; food impaction; fractured alveolar bone; fractured crown, large … sensitivity of dentin, which is the light yellowish calcific tissue underlying the cementum or enamel that forms the body of a tooth. Clinically …
Methods for High-Order Multi-Scale and Stochastic Problems Analysis, Algorithms, and Applications
2016-10-17
… finite volume schemes, discontinuous Galerkin finite element methods, and related methods for solving computational fluid dynamics (CFD) problems and … approximation for finite element methods. (3) The development of methods of simulation and analysis for the study of large-scale stochastic systems of … laws, finite element method, Bernstein-Bezier finite elements, weakly interacting particle systems, accelerated Monte Carlo, stochastic networks.
Information Based Numerical Practice.
1987-02-01
… characterization by comparative computational studies of various benchmark problems; see e.g. [MacNeal, Harder (1985)], [Robinson, Blackham (1981)] … The simplest example studied in detail in the literature is the problem of the optimal quadrature … formulae and the functional-analytic prerequisites for the study of optimal formulae; we refer to the large monograph (808 pp.) of [Sobolev (1974)] …
A Game Based e-Learning System to Teach Artificial Intelligence in the Computer Sciences Degree
ERIC Educational Resources Information Center
de Castro-Santos, Amable; Fajardo, Waldo; Molina-Solana, Miguel
2017-01-01
Our students taking the Artificial Intelligence and Knowledge Engineering courses often encounter a large number of problems to solve which are not directly related to the subject to be learned. To solve this problem, we have developed a game-based e-learning system. The selected game, which has been implemented as an e-learning system, allows to…
Two-machine flow shop scheduling integrated with preventive maintenance planning
NASA Astrophysics Data System (ADS)
Wang, Shijin; Liu, Ming
2016-02-01
This paper investigates an integrated optimisation problem of production scheduling and preventive maintenance (PM) in a two-machine flow shop with the time to failure of each machine subject to a Weibull probability distribution. The objective is to find the optimal job sequence and the optimal PM decisions before each job such that the expected makespan is minimised. To investigate the value of the integrated scheduling solution, computational experiments on small-scale problems with different configurations are conducted with a total enumeration method, and the results are compared with those of scheduling without maintenance but with machine degradation, and of individual job scheduling combined with independent PM planning. Then, for large-scale problems, four genetic algorithm (GA) based heuristics are proposed. The numerical results for several large problem sizes and different configurations indicate the potential benefits of the integrated scheduling solution, and also show that the proposed GA-based heuristics are efficient for the integrated problem.
An interior-point method-based solver for simulation of aircraft parts riveting
NASA Astrophysics Data System (ADS)
Stefanova, Maria; Yakunin, Sergey; Petukhova, Margarita; Lupuleac, Sergey; Kokkolaras, Michael
2018-05-01
The particularities of the aircraft parts riveting process simulation necessitate the solution of a large number of contact problems. A primal-dual interior-point method-based solver is proposed for solving such problems efficiently. The proposed method features a worst-case polynomial bound on the number of iterations in terms of the problem dimension n and a threshold ε related to the desired accuracy. In practice, the convergence is often faster than this worst-case bound, which makes the method applicable to large-scale problems. The computational challenge is solving the system of linear equations, because the associated matrix is ill conditioned. To that end, the authors introduce a preconditioner and a strategy for determining effective initial guesses based on the physics of the problem. Numerical results are compared with ones obtained using the Goldfarb-Idnani algorithm. The results demonstrate the efficiency of the proposed method.
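For illustration, a toy primal-dual interior-point iteration for a nonnegativity-constrained quadratic program is sketched below (this is not the authors' solver: the preconditioner and the physics-based initial guesses described above are omitted, and the 2-variable problem is made up):

```python
import numpy as np

# Toy primal-dual interior-point method for  min 0.5*x'Qx + c'x  s.t.  x >= 0.

def ipm_qp(Q, c, tol=1e-8, max_iter=50):
    n = len(c)
    x = np.ones(n)            # primal variables, kept strictly positive
    z = np.ones(n)            # dual variables for the bounds x >= 0
    for _ in range(max_iter):
        mu = x @ z / n                                # duality measure
        r_dual = Q @ x + c - z                        # stationarity residual
        if max(np.linalg.norm(r_dual), mu) < tol:
            break
        sigma = 0.2                                   # centering parameter
        # Newton step on the perturbed KKT conditions
        K = np.block([[Q, -np.eye(n)],
                      [np.diag(z), np.diag(x)]])
        rhs = -np.concatenate([r_dual, x * z - sigma * mu])
        d = np.linalg.solve(K, rhs)
        dx, dz = d[:n], d[n:]
        # fraction-to-the-boundary rule keeps x and z strictly positive
        alpha = 1.0
        for v, dv in ((x, dx), (z, dz)):
            neg = dv < 0
            if neg.any():
                alpha = min(alpha, 0.99 * np.min(-v[neg] / dv[neg]))
        x, z = x + alpha * dx, z + alpha * dz
    return x

Q = np.array([[4.0, 1.0], [1.0, 2.0]])
c = np.array([-1.0, 2.0])
print(ipm_qp(Q, c))   # x[1] should sit on the bound x >= 0
```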
The potential benefits of photonics in the computing platform
NASA Astrophysics Data System (ADS)
Bautista, Jerry
2005-03-01
The increase in computational requirements for real-time image processing, complex computational fluid dynamics, very large scale data mining in the health industry/Internet, and predictive models for financial markets is driving computer architects to consider new paradigms that rely upon very high speed interconnects within and between computing elements. Further challenges result from reduced power requirements, reduced transmission latency, and greater interconnect density. Optical interconnects may solve many of these problems, with the added benefit of extended reach. In addition, photonic interconnects provide relative EMI immunity, which is becoming an increasing issue with a greater dependence on wireless connectivity. However, to be truly functional, the optical interconnect mesh should be able to support arbitration, addressing, etc. completely in the optical domain with a BER that is more stringent than "traditional" communication requirements. Outlined are challenges in the advanced computing environment, some possible optical architectures and relevant platform technologies, as well as a rough sizing of these opportunities, which are quite large relative to the more "traditional" optical markets.
Big Data, Deep Learning and Tianhe-2 at Sun Yat-Sen University, Guangzhou
NASA Astrophysics Data System (ADS)
Yuen, D. A.; Dzwinel, W.; Liu, J.; Zhang, K.
2014-12-01
In this decade the big data revolution has permeated many fields, ranging from financial transactions to medical surveys and scientific endeavors, because of the big opportunities people see ahead. What to do with all this data remains an intriguing question. This is where computer scientists, together with applied mathematicians, have made significant inroads in developing deep learning techniques for unraveling new relationships among the different variables by means of correlation analysis and data-assimilation methods. Deep learning and big data taken together pose a grand challenge task in high-performance computing, which demands both ultrafast speed and large memory. The Tianhe-2, recently installed at Sun Yat-Sen University in Guangzhou, is well positioned to take up this challenge because it is currently the world's fastest computer at 34 Petaflops. Each compute node of Tianhe-2 has two Intel Xeon E5-2600 CPUs and three Xeon Phi accelerators. The Tianhe-2 has a very large fast RAM of 88 Gigabytes on each node. The system has a total memory of 1,375 Terabytes. All of these technical features will allow very high dimensional (more than 10) problems in deep learning to be explored carefully on the Tianhe-2. Problems in seismology which can be solved include three-dimensional seismic wave simulations of the whole Earth with a few km resolution and the recognition of new phases in seismic waveforms from assemblages of large data sets.
Bicriteria Network Optimization Problem using Priority-based Genetic Algorithm
NASA Astrophysics Data System (ADS)
Gen, Mitsuo; Lin, Lin; Cheng, Runwei
Network optimization is an increasingly important and fundamental issue in fields such as engineering, computer science, operations research, transportation, telecommunication, decision support systems, manufacturing, and airline scheduling. In many applications, however, there are several criteria associated with traversing each edge of a network. For example, cost and flow measures are both important in networks. As a result, there has been recent interest in solving the Bicriteria Network Optimization Problem, which is known to be NP-hard. The efficient set of paths may be very large, possibly exponential in size, so the computational effort required to solve the problem can increase exponentially with the problem size in the worst case. In this paper, we propose a genetic algorithm (GA) approach using a priority-based chromosome for solving the bicriteria network optimization problem, including the maximum flow (MXF) model and the minimum cost flow (MCF) model. The objective is to find the set of Pareto optimal solutions that give the possible maximum flow with minimum cost. This paper also incorporates the Adaptive Weight Approach (AWA), which utilizes useful information from the current population to readjust weights and obtain a search pressure toward a positive ideal point. Computer simulations on several difficult-to-solve network design problems show the effectiveness of the proposed method.
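Two of the building blocks described above, priority-based decoding of a chromosome into a path and the Adaptive Weight Approach, can be sketched as follows (illustrative only: the 6-node graph, the random population, and the surrogate objective values are all made up):

```python
import random

# Sketch of priority-based decoding and AWA weight computation (hypothetical data).

ADJ = {1: [2, 3], 2: [3, 4], 3: [4, 5], 4: [6], 5: [6], 6: []}   # made-up network

def decode_path(priorities, source=1, sink=6):
    """Walk from source to sink, always moving to the unvisited neighbour
    with the highest priority gene; the chromosome is indexed by node id."""
    path, node, visited = [source], source, {source}
    while node != sink:
        candidates = [n for n in ADJ[node] if n not in visited]
        if not candidates:
            return None                      # dead end -> infeasible chromosome
        node = max(candidates, key=lambda n: priorities[n])
        path.append(node)
        visited.add(node)
    return path

def adaptive_weights(population_objs):
    """AWA: weights derived from the extreme objective values in the current
    population, pushing the search toward the ideal point."""
    z_max = [max(o[k] for o in population_objs) for k in (0, 1)]
    z_min = [min(o[k] for o in population_objs) for k in (0, 1)]
    return [1.0 / (z_max[k] - z_min[k] + 1e-12) for k in (0, 1)]

random.seed(1)
pop = [{n: random.random() for n in ADJ} for _ in range(4)]   # priority chromosomes
paths = [decode_path(p) for p in pop]
objs = [(len(p), sum(p)) for p in paths if p]   # dummy stand-ins for flow and cost
print(paths, adaptive_weights(objs))
```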
Enhanced algorithms for stochastic programming
DOE Office of Scientific and Technical Information (OSTI.GOV)
Krishna, Alamuru S.
1993-09-01
In this dissertation, we present some of the recent advances made in solving two-stage stochastic linear programming problems of large size and complexity. Decomposition and sampling are two fundamental components of techniques to solve stochastic optimization problems. We describe improvements to the current techniques in both these areas. We studied different ways of using importance sampling techniques in the context of stochastic programming by varying the choice of approximation functions used in this method. We have concluded that approximating the recourse function by a computationally inexpensive piecewise-linear function is highly efficient. This reduced the problem from finding the mean of a computationally expensive function to finding that of a computationally inexpensive function. Then we implemented various variance reduction techniques to estimate the mean of the piecewise-linear function. This method achieved similar variance reductions in orders of magnitude less time than directly applying variance-reduction techniques to the given problem. In solving a stochastic linear program, the expected value problem is usually solved before the stochastic problem, both to provide a starting solution and to speed up the algorithm by making use of the information obtained from the expected value solution. We have devised a new decomposition scheme to improve the convergence of this algorithm.
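A toy version of the sampling idea described above is sketched below (illustrative only: the "expensive" recourse-like function and its piecewise-linear surrogate are stand-ins, and the scenario set is synthetic). The cheap surrogate is used to build an importance distribution over scenarios, and the likelihood ratios keep the estimator unbiased:

```python
import numpy as np

# Importance sampling guided by a cheap piecewise-linear surrogate (toy example).

rng = np.random.default_rng(42)
scenarios = rng.normal(size=10_000)                 # equally likely scenarios

expensive = lambda w: np.maximum(0.0, w) ** 2       # stand-in for a costly recourse function
surrogate = lambda w: np.maximum(0.0, w)            # cheap piecewise-linear proxy

true_mean = expensive(scenarios).mean()             # reference (full evaluation)

n = 200                                             # budget of "expensive" evaluations
crude = expensive(rng.choice(scenarios, n)).mean()  # crude Monte Carlo

g = surrogate(scenarios) + 1e-3                     # keep strictly positive
q = g / g.sum()                                     # importance distribution over scenarios
idx = rng.choice(len(scenarios), n, p=q)
weights = (1.0 / len(scenarios)) / q[idx]           # likelihood ratios p/q
importance = (expensive(scenarios[idx]) * weights).mean()

print(f"true {true_mean:.4f}  crude {crude:.4f}  importance {importance:.4f}")
```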
NASA Astrophysics Data System (ADS)
Naumov, D.; Fischer, T.; Böttcher, N.; Watanabe, N.; Walther, M.; Rink, K.; Bilke, L.; Shao, H.; Kolditz, O.
2014-12-01
OpenGeoSys (OGS) is a scientific open-source code for numerical simulation of thermo-hydro-mechanical-chemical processes in porous and fractured media. Its basic concept is to provide a flexible numerical framework for solving multi-field problems in geoscience and hydrology, e.g. for CO2 storage, geothermal power plant forecast simulation, salt water intrusion, and water resources management. Advances in computational mathematics have revolutionized the variety and nature of the problems that environmental scientists and engineers can address, and intensive code development in recent years now enables the solution of much larger numerical problems and applications. However, simulating environmental processes along the water cycle at large scales, such as for complete catchments or reservoirs, remains a computationally challenging task. We have therefore started a new OGS code development with a focus on execution speed and parallelization. In the new version, a local data structure concept improves instruction and data cache performance by tightly bundling data with an element-wise numerical integration loop. Dedicated analysis methods enable the investigation of memory-access patterns in the local and global assembler routines, which leads to further data structure optimization for an additional performance gain. The concept is presented together with a technical code analysis of the recent development and a large case study including transient flow simulation in the unsaturated/saturated zone of the Thuringian Syncline, Germany. The analysis is performed on a high-resolution mesh (up to 50M elements) with embedded fault structures.
A machine learning approach for efficient uncertainty quantification using multiscale methods
NASA Astrophysics Data System (ADS)
Chan, Shing; Elsheikh, Ahmed H.
2018-02-01
Several multiscale methods account for sub-grid scale features using coarse scale basis functions. For example, in the Multiscale Finite Volume method the coarse scale basis functions are obtained by solving a set of local problems over dual-grid cells. We introduce a data-driven approach for the estimation of these coarse scale basis functions. Specifically, we employ a neural network predictor fitted using a set of solution samples from which it learns to generate subsequent basis functions at a lower computational cost than solving the local problems. The computational advantage of this approach is realized for uncertainty quantification tasks where a large number of realizations has to be evaluated. We attribute the ability to learn these basis functions to the modularity of the local problems and the redundancy of the permeability patches between samples. The proposed method is evaluated on elliptic problems yielding very promising results.
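A minimal sketch of this data-driven surrogate idea is given below (illustrative only: the "local solver" is a hypothetical stand-in for the Multiscale Finite Volume local problem, and the synthetic permeability patches and network size are assumptions):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Learn a map from a local (log-)permeability patch to its coarse basis
# function, so new realizations reuse the predictor instead of a local solve.

rng = np.random.default_rng(0)
patch_size, n_train, n_test = 8, 500, 50

def local_basis(log_perm_patch):
    """Hypothetical stand-in for solving one local problem on a dual-grid cell."""
    k = np.exp(log_perm_patch)
    return np.cumsum(k) / k.sum()          # monotone 0..1 profile, basis-function-like

X = rng.normal(size=(n_train + n_test, patch_size))      # log-permeability patches
Y = np.array([local_basis(x) for x in X])

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(X[:n_train], Y[:n_train])

err = np.abs(model.predict(X[n_train:]) - Y[n_train:]).mean()
print("mean absolute error of predicted basis functions:", err)
```

In an uncertainty-quantification loop the training cost is paid once, and each additional permeability realization only requires cheap predictor evaluations.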
NASA Astrophysics Data System (ADS)
Wang, Wenlong; Mandrà, Salvatore; Katzgraber, Helmut
We propose a patch planting heuristic that allows us to create arbitrarily-large Ising spin-glass instances on any topology and with any type of disorder, and where the exact ground-state energy of the problem is known by construction. By breaking up the problem into patches that can be treated either with exact or heuristic solvers, we can reconstruct the optimum of the original, considerably larger, problem. The scaling of the computational complexity of these instances with various patch numbers and sizes is investigated and compared with random instances using population annealing Monte Carlo and quantum annealing on the D-Wave 2X quantum annealer. The method can be useful for benchmarking of novel computing technologies and algorithms. NSF-DMR-1208046 and the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA), via MIT Lincoln Laboratory Air Force Contract No. FA8721-05-C-0002.
The design of multiplayer online video game systems
NASA Astrophysics Data System (ADS)
Hsu, Chia-chun A.; Ling, Jim; Li, Qing; Kuo, C.-C. J.
2003-11-01
The distributed Multiplayer Online Game (MOG) system is complex since it involves technologies in computer graphics, multimedia, artificial intelligence, computer networking, embedded systems, etc. Due to the large scope of this problem, the design of MOG systems has not yet been widely addressed in the literature. In this paper, we review and evaluate current MOG system architectures. Furthermore, we propose a clustered-server architecture together with a region-oriented allocation strategy to provide a scalable solution. Two key issues, i.e. interest management and synchronization, are discussed in depth. Some preliminary ideas to deal with the identified problems are described.
Approximations of thermoelastic and viscoelastic control systems
NASA Technical Reports Server (NTRS)
Burns, J. A.; Liu, Z. Y.; Miller, R. E.
1990-01-01
Well-posed models and computational algorithms are developed and analyzed for control of a class of partial differential equations that describe the motions of thermo-viscoelastic structures. An abstract (state space) framework and a general well-posedness result are presented that can be applied to a large class of thermo-elastic and thermo-viscoelastic models. This state space framework is used in the development of a computational scheme to be used in the solution of a linear quadratic regulator (LQR) control problem. A detailed convergence proof is provided for the viscoelastic model and several numerical results are presented to illustrate the theory and to analyze problems for which the theory is incomplete.
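Once a PDE model has been reduced to a finite-dimensional state-space approximation, the LQR gain computation itself is standard; a minimal sketch is given below (illustrative only: the 2x2 system is a made-up stand-in, not a thermo-viscoelastic discretization):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# LQR gain for a state-space approximation x' = A x + B u (hypothetical system).

A = np.array([[0.0, 1.0],
              [-2.0, -0.1]])        # lightly damped oscillator (stand-in)
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)                       # state weighting
R = np.array([[0.5]])               # control weighting

P = solve_continuous_are(A, B, Q, R)        # solves A'P + PA - PBR^{-1}B'P + Q = 0
K = np.linalg.solve(R, B.T @ P)             # optimal feedback u = -K x
print("LQR gain:", K)
print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))
```

In a convergence study of the kind described above, one would verify that the gains computed from successively refined approximations of the PDE converge.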
ERIC Educational Resources Information Center
Bati, Tesfaye Bayu; Gelderblom, Helene; van Biljon, Judy
2014-01-01
The challenge of teaching programming in higher education is complicated by problems associated with large class teaching, a prevalent situation in many developing countries. This paper reports on an investigation into the use of a blended learning approach to teaching and learning of programming in a class of more than 200 students. A course and…
Introducing Computational Approaches in Intermediate Mechanics
NASA Astrophysics Data System (ADS)
Cook, David M.
2006-12-01
In the winter of 2003, we at Lawrence University moved Lagrangian mechanics and rigid body dynamics from a required sophomore course to an elective junior/senior course, freeing 40% of the time for computational approaches to ordinary differential equations (trajectory problems, the large amplitude pendulum, non-linear dynamics); evaluation of integrals (finding centers of mass and moment of inertia tensors, calculating gravitational potentials for various sources); finding eigenvalues and eigenvectors of matrices (diagonalizing the moment of inertia tensor, finding principal axes); and generating graphical displays of computed results. Further, students begin to use LaTeX to prepare some of their submitted problem solutions. Placed in the middle of the sophomore year, this course provides the background that permits faculty members, as appropriate, to assign computer-based exercises in subsequent courses. Further, students are encouraged to use our Computational Physics Laboratory on their own initiative whenever that use seems appropriate. (Curricular development supported in part by the W. M. Keck Foundation, the National Science Foundation, and Lawrence University.)
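A representative exercise of the kind listed above is the large-amplitude pendulum, sketched here in Python (illustrative; the specific parameter values are arbitrary):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Large-amplitude pendulum: the small-angle period formula no longer applies.

g_over_L = 9.81 / 1.0                      # pendulum with L = 1 m

def pendulum(t, y):
    theta, omega = y
    return [omega, -g_over_L * np.sin(theta)]   # full nonlinear equation of motion

theta0 = np.radians(170.0)                 # nearly inverted initial angle
sol = solve_ivp(pendulum, (0.0, 10.0), [theta0, 0.0],
                rtol=1e-9, atol=1e-9, dense_output=True)

# Estimate the period from the first downward zero crossing of omega after t = 0.
t = np.linspace(0.0, 10.0, 100_000)
theta, omega = sol.sol(t)
crossings = t[:-1][(omega[:-1] > 0) & (omega[1:] <= 0)]
small_angle_T = 2 * np.pi / np.sqrt(g_over_L)
print("small-angle period:", small_angle_T, " numerical period:", crossings[0])
```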
High-frequency CAD-based scattering model: SERMAT
NASA Astrophysics Data System (ADS)
Goupil, D.; Boutillier, M.
1991-09-01
Specifications for an industrial radar cross section (RCS) calculation code are given: it must be able to exchange data with many computer-aided design (CAD) systems, it must be fast, and it must have powerful graphic tools. Classical physical optics (PO) and equivalent currents (EC) techniques have long proven their efficiency on simple objects. Difficult geometric problems occur when objects with very complex shapes have to be computed. Only a specific geometric code can solve these problems. We have established that, once these problems have been solved: (1) PO and EC give good results on complex objects of large size compared to wavelength; and (2) the implementation of these objects in a software package (SERMAT) allows fast and sufficiently precise RCS calculations to meet industry requirements in the domain of stealth.
Model Order Reduction Algorithm for Estimating the Absorption Spectrum
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van Beeumen, Roel; Williams-Young, David B.; Kasper, Joseph M.
The ab initio description of the spectral interior of the absorption spectrum poses both a theoretical and computational challenge for modern electronic structure theory. Due to the often spectrally dense character of this domain in the quantum propagator's eigenspectrum for medium-to-large sized systems, traditional approaches based on the partial diagonalization of the propagator often encounter oscillatory and stagnating convergence. Electronic structure methods which solve the molecular response problem through the solution of spectrally shifted linear systems, such as the complex polarization propagator, offer an alternative approach which is agnostic to the underlying spectral density or domain location. This generality comes at a seemingly high computational cost associated with solving a large linear system for each spectral shift in some discretization of the spectral domain of interest. In this work, we present a novel, adaptive solution to this high computational overhead based on model order reduction techniques via interpolation. Model order reduction reduces the computational complexity of mathematical models and is ubiquitous in the simulation of dynamical systems and control theory. The efficiency and effectiveness of the proposed algorithm in the ab initio prediction of X-ray absorption spectra is demonstrated using a test set of challenging water clusters which are spectrally dense in the neighborhood of the oxygen K-edge. On the basis of a single, user defined tolerance we automatically determine the order of the reduced models and approximate the absorption spectrum up to the given tolerance. We also illustrate that, for the systems studied, the automatically determined model order increases logarithmically with the problem dimension, compared to a linear increase of the number of eigenvalues within the energy window. Furthermore, we observed that the computational cost of the proposed algorithm only scales quadratically with respect to the problem dimension.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nakhai, B.
A new method for solving radiation transport problems is presented. The heart of the technique is a new cross section processing procedure for the calculation of group-to-point and point-to-group cross section sets. The method is ideally suited for problems which involve media with highly fluctuating cross sections, where the results of the traditional multigroup calculations are beclouded by the group averaging procedures employed. Extensive computational efforts, which would be required to evaluate double integrals in the multigroup treatment numerically, prohibit iteration to optimize the energy boundaries. On the other hand, use of point-to-point techniques (as in the stochastic technique) is often prohibitively expensive due to the large computer storage requirement. The pseudo-point code is a hybrid of the two aforementioned methods (group-to-group and point-to-point) - hence the name pseudo-point - that reduces the computational efforts of the former and the large core requirements of the latter. The pseudo-point code generates the group-to-point or the point-to-group transfer matrices, and can be coupled with the existing transport codes to calculate pointwise energy-dependent fluxes. This approach yields much more detail than is available from the conventional energy-group treatments. Due to the speed of this code, several iterations could be performed (in affordable computing efforts) to optimize the energy boundaries and the weighting functions. The pseudo-point technique is demonstrated by solving six problems, each depicting a certain aspect of the technique. The results are presented as flux vs energy at various spatial intervals. The sensitivity of the technique to the energy grid and the savings in computational effort are clearly demonstrated.
An assessment of future computer system needs for large-scale computation
NASA Technical Reports Server (NTRS)
Lykos, P.; White, J.
1980-01-01
Data ranging from specific computer capability requirements to opinions about the desirability of a national computer facility are summarized. It is concluded that considerable attention should be given to improving the user-machine interface. Otherwise, increased computer power may not improve the overall effectiveness of the machine user. Significant improvement in throughput requires highly concurrent systems plus the willingness of the user community to develop problem solutions for that kind of architecture. An unanticipated result was the expression of need for an on-going cross-disciplinary users group/forum in order to share experiences and to more effectively communicate needs to the manufacturers.
NASA Astrophysics Data System (ADS)
Huang, Daniel Z.; De Santis, Dante; Farhat, Charbel
2018-07-01
The Finite Volume method with Exact two-material Riemann Problems (FIVER) is both a computational framework for multi-material flows characterized by large density jumps, and an Embedded Boundary Method (EBM) for computational fluid dynamics and highly nonlinear Fluid-Structure Interaction (FSI) problems. This paper deals with the EBM aspect of FIVER. For FSI problems, this EBM has already demonstrated the ability to address viscous effects along wall boundaries, and large deformations and topological changes of such boundaries. However, like for most EBMs - also known as immersed boundary methods - the performance of FIVER in the vicinity of a wall boundary can be sensitive to the position and orientation of this boundary relative to the embedding mesh. This is mainly due to ill-conditioning issues that arise when an embedded interface becomes too close to a node of the embedding mesh, which may lead to spurious oscillations in the computed solution gradients at the wall boundary. This paper resolves these issues by introducing an alternative definition of the active/inactive status of a mesh node that leads to the removal of all sources of potential ill-conditioning from all spatial approximations performed by FIVER in the vicinity of a fluid-structure interface. It also makes two additional contributions. The first one is a new procedure for constructing the fluid-structure half Riemann problem underlying the semi-discretization by FIVER of the convective fluxes. This procedure eliminates one extrapolation from the conventional treatment of the wall boundary conditions and replaces it by an interpolation, which improves robustness. The second contribution is a post-processing algorithm for computing quantities of interest at the wall that achieves smoothness in the computed solution and its gradients. Lessons learned from these enhancements and contributions that are triggered by the new definition of the status of a mesh node are then generalized and exploited to eliminate from the original version of the FIVER method its sensitivities with respect to both the position and orientation of the wall boundary relative to the embedding mesh, while maintaining the original definition of the status of a mesh node. This leads to a family of second-generation FIVER methods whose performance is illustrated in this paper for several flow and FSI problems. These include a challenging flow problem over a bird wing characterized by a feather-induced surface roughness, and a complex flexible flapping wing problem for which experimental data is available.
Veliz-Cuba, Alan; Aguilar, Boris; Hinkelmann, Franziska; Laubenbacher, Reinhard
2014-06-26
A key problem in the analysis of mathematical models of molecular networks is the determination of their steady states. The present paper addresses this problem for Boolean network models, an increasingly popular modeling paradigm for networks lacking detailed kinetic information. For small models, the problem can be solved by exhaustive enumeration of all state transitions. But for larger models this is not feasible, since the size of the phase space grows exponentially with the dimension of the network. The dimension of published models is growing to over 100, so that efficient methods for steady state determination are essential. Several methods have been proposed for large networks, some of them heuristic. While these methods represent a substantial improvement in scalability over exhaustive enumeration, the problem for large networks is still unsolved in general. This paper presents an algorithm that consists of two main parts. The first is a graph theoretic reduction of the wiring diagram of the network, while preserving all information about steady states. The second part formulates the determination of all steady states of a Boolean network as a problem of finding all solutions to a system of polynomial equations over the finite number system with two elements. This problem can be solved with existing computer algebra software. This algorithm compares favorably with several existing algorithms for steady state determination. One advantage is that it is not heuristic or reliant on sampling, but rather determines algorithmically and exactly all steady states of a Boolean network. The code for the algorithm, as well as the test suite of benchmark networks, is available upon request from the corresponding author. The algorithm presented in this paper reliably determines all steady states of sparse Boolean networks with up to 1000 nodes. The algorithm is effective at analyzing virtually all published models even those of moderate connectivity. The problem for large Boolean networks with high average connectivity remains an open problem.
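As a concrete, hedged illustration of the fixed-point formulation described above (not the authors' algebraic algorithm, which works with computer algebra over the field with two elements), the Python sketch below enumerates the steady states of a toy three-node Boolean network; exhaustive enumeration like this is feasible only for small networks, which is exactly the limitation the paper's method is designed to overcome.

from itertools import product

def F(x):
    # Update functions of a hypothetical 3-node network.
    x1, x2, x3 = x
    return (x2 & x3,        # f1 = x2 AND x3
            x1 | x3,        # f2 = x1 OR x3
            x1 ^ x2)        # f3 = x1 XOR x2 (addition over GF(2))

# Steady states are exactly the fixed points x = F(x), i.e. solutions of F(x) + x = 0 over GF(2).
steady_states = [x for x in product((0, 1), repeat=3) if F(x) == x]
print(steady_states)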
Graph theory approach to the eigenvalue problem of large space structures
NASA Technical Reports Server (NTRS)
Reddy, A. S. S. R.; Bainum, P. M.
1981-01-01
Graph theory is used to obtain numerical solutions to eigenvalue problems of large space structures (LSS) characterized by a state vector of large dimensions. The LSS are considered as large, flexible systems requiring both orientation and surface shape control. Graphic interpretation of the determinant of a matrix is employed to reduce a higher dimensional matrix into combinations of smaller dimensional sub-matrices. The reduction is implemented by means of a Boolean equivalent of the original matrices formulated to obtain smaller dimensional equivalents of the original numerical matrix. Computation time is reduced, and more accurate solutions are possible. An example is provided in the form of a free-free square plate. Linearized system equations and numerical values of a stiffness matrix are presented, featuring a state vector with 16 components.
CSM solutions of rotating blade dynamics using integrating matrices
NASA Technical Reports Server (NTRS)
Lakin, William D.
1992-01-01
The dynamic behavior of flexible rotating beams continues to receive considerable research attention as it constitutes a fundamental problem in applied mechanics. Further, beams comprise parts of many rotating structures of engineering significance. A topic of particular interest at the present time involves the development of techniques for obtaining the behavior in both space and time of a rotor acted upon by a simple airload loading. Most current work on problems of this type uses solution techniques based on normal modes. It is certainly true that normal modes cannot be disregarded, as knowledge of natural blade frequencies is always important. However, the present work has considered a computational structural mechanics (CSM) approach to rotor blade dynamics problems in which the physical properties of the rotor blade provide input for a direct numerical solution of the relevant boundary-and-initial-value problem. Analysis of the dynamics of a given rotor system may require solution of the governing equations over a long time interval corresponding to many revolutions of the loaded flexible blade. For this reason, most of the common techniques in computational mechanics, which treat the space-time behavior concurrently, cannot be applied to the rotor dynamics problem without a large expenditure of computational resources. By contrast, the integrating matrix technique of computational mechanics has the ability to consistently incorporate boundary conditions and 'remove' dependence on a space variable. For problems involving both space and time, this feature of the integrating matrix approach thus can generate a 'splitting' which forms the basis of an efficient CSM method for numerical solution of rotor dynamics problems.
SAMSAN- MODERN NUMERICAL METHODS FOR CLASSICAL SAMPLED SYSTEM ANALYSIS
NASA Technical Reports Server (NTRS)
Frisch, H. P.
1994-01-01
SAMSAN was developed to aid the control system analyst by providing a self consistent set of computer algorithms that support large order control system design and evaluation studies, with an emphasis placed on sampled system analysis. Control system analysts have access to a vast array of published algorithms to solve an equally large spectrum of controls related computational problems. The analyst usually spends considerable time and effort bringing these published algorithms to an integrated operational status and often finds them less general than desired. SAMSAN reduces the burden on the analyst by providing a set of algorithms that have been well tested and documented, and that can be readily integrated for solving control system problems. Algorithm selection for SAMSAN has been biased toward numerical accuracy for large order systems with computational speed and portability being considered important but not paramount. In addition to containing relevant subroutines from EISPACK for eigen-analysis and from LINPACK for the solution of linear systems and related problems, SAMSAN contains the following not so generally available capabilities: 1) Reduction of a real non-symmetric matrix to block diagonal form via a real similarity transformation matrix which is well conditioned with respect to inversion, 2) Solution of the generalized eigenvalue problem with balancing and grading, 3) Computation of all zeros of the determinant of a matrix of polynomials, 4) Matrix exponentiation and the evaluation of integrals involving the matrix exponential, with option to first block diagonalize, 5) Root locus and frequency response for single variable transfer functions in the S, Z, and W domains, 6) Several methods of computing zeros for linear systems, and 7) The ability to generate documentation "on demand". All matrix operations in the SAMSAN algorithms assume non-symmetric matrices with real double precision elements. There is no fixed size limit on any matrix in any SAMSAN algorithm; however, it is generally agreed by experienced users, and in the numerical error analysis literature, that computation with non-symmetric matrices of order greater than about 200 should be avoided or treated with extreme care. SAMSAN attempts to support the needs of application oriented analysis by providing: 1) a methodology with unlimited growth potential, 2) a methodology to insure that associated documentation is current and available "on demand", 3) a foundation of basic computational algorithms that most controls analysis procedures are based upon, 4) a set of check out and evaluation programs which demonstrate usage of the algorithms on a series of problems which are structured to expose the limits of each algorithm's applicability, and 5) capabilities which support both a priori and a posteriori error analysis for the computational algorithms provided. The SAMSAN algorithms are coded in FORTRAN 77 for batch or interactive execution and have been implemented on a DEC VAX computer under VMS 4.7. An effort was made to assure that the FORTRAN source code was portable and thus SAMSAN may be adaptable to other machine environments. The documentation is included on the distribution tape or can be purchased separately at the price below. SAMSAN version 2.0 was developed in 1982 and updated to version 3.0 in 1988.
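For readers who want a feel for two of the capabilities listed above, the short sketch below uses standard SciPy routines rather than SAMSAN itself (which is FORTRAN 77): it computes the matrix exponential and the integral of the matrix exponential that appears in sampled (discrete-time) system models, via the well-known Van Loan block trick. The example matrices are arbitrary.

import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-2.0, -0.5]])       # continuous-time state matrix (example values)
B = np.array([[0.0], [1.0]])
h = 0.1                                         # sample period
# expm of the padded block matrix packs e^{Ah} and the integral of e^{At} B over [0, h].
M = expm(np.block([[A, B], [np.zeros((1, 2)), np.zeros((1, 1))]]) * h)
Ad, Bd = M[:2, :2], M[:2, 2:]                   # discrete-time (sampled) system matrices
print(Ad, Bd)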
Application of high-performance computing to numerical simulation of human movement
NASA Technical Reports Server (NTRS)
Anderson, F. C.; Ziegler, J. M.; Pandy, M. G.; Whalen, R. T.
1995-01-01
We have examined the feasibility of using massively-parallel and vector-processing supercomputers to solve large-scale optimization problems for human movement. Specifically, we compared the computational expense of determining the optimal controls for the single support phase of gait using a conventional serial machine (SGI Iris 4D25), a MIMD parallel machine (Intel iPSC/860), and a parallel-vector-processing machine (Cray Y-MP 8/864). With the human body modeled as a 14 degree-of-freedom linkage actuated by 46 musculotendinous units, computation of the optimal controls for gait could take up to 3 months of CPU time on the Iris. Both the Cray and the Intel are able to reduce this time to practical levels. The optimal solution for gait can be found with about 77 hours of CPU on the Cray and with about 88 hours of CPU on the Intel. Although the overall speeds of the Cray and the Intel were found to be similar, the unique capabilities of each machine are better suited to different portions of the computational algorithm used. The Intel was best suited to computing the derivatives of the performance criterion and the constraints whereas the Cray was best suited to parameter optimization of the controls. These results suggest that the ideal computer architecture for solving very large-scale optimal control problems is a hybrid system in which a vector-processing machine is integrated into the communication network of a MIMD parallel machine.
Yang, C L; Wei, H Y; Adler, A; Soleimani, M
2013-06-01
Electrical impedance tomography (EIT) is a fast and cost-effective technique to provide a tomographic conductivity image of a subject from boundary current-voltage data. This paper proposes a time and memory efficient method for solving a large scale 3D EIT inverse problem using a parallel conjugate gradient (CG) algorithm. A 3D EIT system with a large number of measurement data can produce a very large Jacobian matrix; this can cause difficulties in computer storage and in the inversion process. One of the challenges in 3D EIT is to decrease the reconstruction time and memory usage while at the same time retaining the image quality. Firstly, a sparse matrix reduction technique is proposed using thresholding to set very small values of the Jacobian matrix to zero. By converting the Jacobian matrix into a sparse format, the zero elements are eliminated, which results in a saving of memory. Secondly, a block-wise CG method for parallel reconstruction has been developed. The proposed method has been tested using simulated data as well as experimental test samples. A sparse Jacobian with block-wise CG enables the large scale EIT problem to be solved efficiently. Image quality measures are presented to quantify the effect of sparse matrix reduction on the reconstruction results.
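The two ideas in this abstract can be sketched in a few lines of Python; the snippet below is an illustrative stand-in (not the authors' code), combining a thresholded sparse Jacobian with a matrix-free conjugate-gradient solve of regularized normal equations. The threshold and damping values are arbitrary assumptions.

import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import cg, LinearOperator

def sparse_eit_step(J, dv, threshold=1e-6, lam=1e-3):
    Js = csr_matrix(np.where(np.abs(J) > threshold, J, 0.0))   # sparse Jacobian after thresholding
    n = Js.shape[1]
    # Apply (J^T J + lam I) matrix-free so the dense normal matrix is never formed.
    Aop = LinearOperator((n, n), matvec=lambda x: Js.T @ (Js @ x) + lam * x)
    dx, info = cg(Aop, Js.T @ dv, maxiter=200)
    return dx

rng = np.random.default_rng(1)
J = rng.standard_normal((300, 500)) * (rng.random((300, 500)) < 0.05)  # mostly-zero test Jacobian
dx = sparse_eit_step(J, rng.standard_normal(300))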
Underworld results as a triple (shopping list, posterior, priors)
NASA Astrophysics Data System (ADS)
Quenette, S. M.; Moresi, L. N.; Abramson, D.
2013-12-01
When studying long-term lithosphere deformation and other such large-scale, spatially distinct and behaviour rich problems, there is a natural trade-off between the meaning of a model, the observations used to validate the model and the ability to compute over this space. For example, many models of varying lithologies, rheological properties and underlying physics may reasonably match (or not match) observables. To compound this problem, each realisation is computationally intensive, requiring high resolution, algorithm tuning and code tuning to contemporary computer hardware. It is often intractable to use sampling based assimilation methods, but with better optimisation, the window of tractability becomes wider. The ultimate goal is to find a sweet-spot where a formal assimilation method is used, and where a model conforms to the observations. It is natural to think of this as an inverse problem, in which the underlying physics may be fixed and the rheological properties and possibly the lithologies themselves are unknown. What happens when we push this approach and treat some portion of the underlying physics as an unknown? At its extreme this is an intractable problem. However, there is an analogy here with how we develop software for these scientific problems. What happens when we treat the changing part of a largely complete code as an unknown, where the changes are working towards this sweet-spot? When posed as a Bayesian inverse problem the result is a triple - the model changes, the real priors and the real posterior. Not only does this give meaning to the process by which a code changes, it forms a mathematical bridge from an inverse problem to compiler optimisations given such changes. As a stepping stone example we show a regional scale heat flow model with constraining observations, and the inverse process with increasing complexity in the software. The implementation uses Underworld-GT (Underworld plus research extras to import geology and export geothermic measures, etc.). Underworld uses StGermain, an early (partial) implementation of the theories described here.
Reliable semiclassical computations in QCD
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dine, Michael; Department of Physics, Stanford University Stanford, California 94305-4060; Festuccia, Guido
We revisit the question of whether or not one can perform reliable semiclassical QCD computations at zero temperature. We study correlation functions with no perturbative contributions, and organize the problem by means of the operator product expansion, establishing a precise criterion for the validity of a semiclassical calculation. For N_f > N, a systematic computation is possible; for N_f
Multi-objective optimization of GENIE Earth system models.
Price, Andrew R; Myerscough, Richard J; Voutchkov, Ivan I; Marsh, Robert; Cox, Simon J
2009-07-13
The tuning of parameters in climate models is essential to provide reliable long-term forecasts of Earth system behaviour. We apply a multi-objective optimization algorithm to the problem of parameter estimation in climate models. This optimization process involves the iterative evaluation of response surface models (RSMs), followed by the execution of multiple Earth system simulations. These computations require an infrastructure that provides high-performance computing for building and searching the RSMs and high-throughput computing for the concurrent evaluation of a large number of models. Grid computing technology is therefore essential to make this algorithm practical for members of the GENIE project.
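A minimal sketch of the surrogate loop described above, with a toy objective standing in for an Earth system run (illustrative only; the GENIE workflow, its parameters, and its RSM machinery are far richer than this): fit a quadratic response-surface model to the evaluations collected so far, search it cheaply, and spend the next expensive evaluation at the surrogate's best point.

import numpy as np

def expensive_model(p):                          # stand-in for a long climate simulation
    return (p[0] - 0.3) ** 2 + (p[1] + 0.1) ** 2

def quad_features(P):                            # quadratic RSM basis in two parameters
    x, y = P[:, 0], P[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * y, x ** 2, y ** 2])

rng = np.random.default_rng(6)
P = rng.uniform(-1, 1, (20, 2))                  # initial design points
f = np.array([expensive_model(p) for p in P])
for _ in range(5):                               # surrogate-assisted iterations
    beta, *_ = np.linalg.lstsq(quad_features(P), f, rcond=None)
    cand = rng.uniform(-1, 1, (2000, 2))         # cheap search over the RSM
    best = cand[np.argmin(quad_features(cand) @ beta)]
    P, f = np.vstack([P, best]), np.append(f, expensive_model(best))
print(P[np.argmin(f)], f.min())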
Navier-Stokes computations useful in aircraft design
NASA Technical Reports Server (NTRS)
Holst, Terry L.
1990-01-01
Large scale Navier-Stokes computations about aircraft components as well as reasonably complete aircraft configurations are presented and discussed. Speed and memory requirements are described for various general problem classes, which in some cases are already being used in the industrial design environment. Recent computed results, with experimental comparisons when available, are included to highlight the presentation. Finally, prospects for the future are described and recommendations for areas of concentrated research are indicated. The future of Navier-Stokes computations is seen to be rapidly expanding across a broad front of applications, which includes the entire subsonic-to-hypersonic speed regime.
CREW CHIEF: A computer graphics simulation of an aircraft maintenance technician
NASA Technical Reports Server (NTRS)
Aume, Nilss M.
1990-01-01
Approximately 35 percent of the lifetime cost of a military system is spent for maintenance. Excessive repair time is caused by not considering maintenance during design. Problems are usually discovered only after a mock-up has been constructed, when it is too late to make changes. CREW CHIEF will reduce the incidence of such problems by catching design defects in the early design stages. CREW CHIEF is a computer graphic human factors evaluation system interfaced to commercial computer aided design (CAD) systems. It creates a three dimensional man model, either male or female, large or small, with various types of clothing and in several postures. It can perform analyses for physical accessibility, strength capability with tools, visual access, and strength capability for manual materials handling. The designer would produce a drawing on his CAD system and introduce CREW CHIEF in it. CREW CHIEF's analyses would then indicate places where problems could be foreseen and corrected before the design is frozen.
NASA Astrophysics Data System (ADS)
Mundis, Nathan L.; Mavriplis, Dimitri J.
2017-09-01
The time-spectral method applied to the Euler and coupled aeroelastic equations theoretically offers significant computational savings for purely periodic problems when compared to standard time-implicit methods. However, attaining superior efficiency with time-spectral methods over traditional time-implicit methods hinges on the ability rapidly to solve the large non-linear system resulting from time-spectral discretizations which become larger and stiffer as more time instances are employed or the period of the flow becomes especially short (i.e. the maximum resolvable wave-number increases). In order to increase the efficiency of these solvers, and to improve robustness, particularly for large numbers of time instances, the Generalized Minimal Residual Method (GMRES) is used to solve the implicit linear system over all coupled time instances. The use of GMRES as the linear solver makes time-spectral methods more robust, allows them to be applied to a far greater subset of time-accurate problems, including those with a broad range of harmonic content, and vastly improves the efficiency of time-spectral methods. In previous work, a wave-number independent preconditioner that mitigates the increased stiffness of the time-spectral method when applied to problems with large resolvable wave numbers has been developed. This preconditioner, however, directly inverts a large matrix whose size increases in proportion to the number of time instances. As a result, the computational time of this method scales as the cube of the number of time instances. In the present work, this preconditioner has been reworked to take advantage of an approximate-factorization approach that effectively decouples the spatial and temporal systems. Once decoupled, the time-spectral matrix can be inverted in frequency space, where it has entries only on the main diagonal and therefore can be inverted quite efficiently. This new GMRES/preconditioner combination is shown to be over an order of magnitude more efficient than the previous wave-number independent preconditioner for problems with large numbers of time instances and/or large reduced frequencies.
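A small numerical illustration of why inverting the temporal coupling in frequency space is cheap, which is the property the reworked preconditioner exploits (this is a generic fact about circulant matrices, not a reproduction of the paper's preconditioner): a periodic time-coupling matrix is diagonalized by the discrete Fourier transform, so solving with it reduces to a pointwise division.

import numpy as np

N = 8                                              # number of time instances
c = np.zeros(N); c[1], c[-1] = 0.5, -0.5           # first column of a central-difference-like coupling
D = np.array([[c[(i - j) % N] for j in range(N)] for i in range(N)])  # circulant coupling matrix

lam = np.fft.fft(c)                                # its eigenvalues, via the DFT of the first column
b = np.random.default_rng(5).standard_normal(N)
x_freq = np.fft.ifft(np.fft.fft(b) / (lam + 1.0)).real   # solve (D + I) x = b in frequency space
x_direct = np.linalg.solve(D + np.eye(N), b)             # reference dense solve
print(np.allclose(x_freq, x_direct))                     # True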
NASA Astrophysics Data System (ADS)
Volkov, D.
2017-12-01
We introduce an algorithm for the simultaneous reconstruction of faults and slip fields on those faults. We define a regularized functional to be minimized for the reconstruction. We prove that the minimum of that functional converges to the unique solution of the related fault inverse problem. Due to inherent uncertainties in measurements, rather than seeking a deterministic solution to the fault inverse problem, we consider a Bayesian approach. The advantage of such an approach is that we obtain a way of quantifying uncertainties as part of our final answer. On the downside, this Bayesian approach leads to a very large computation. To contend with the size of this computation we developed an algorithm for the numerical solution to the stochastic minimization problem which can be easily implemented on a parallel multi-core platform and we discuss techniques to save on computational time. After showing how this algorithm performs on simulated data and assessing the effect of noise, we apply it to measured data. The data was recorded during a slow slip event in Guerrero, Mexico.
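A hedged sketch of the Bayesian step described above, for a toy linear forward model rather than the elastic fault model (the misfit weights, prior strength, and step size are invented for the example): random-walk Metropolis sampling of a posterior that combines a Gaussian data misfit with a regularizing prior. Independent chains of this kind are trivially distributed across cores, which is the kind of parallelism the abstract alludes to.

import numpy as np

def log_post(m, A, d, sigma=0.05, alpha=1.0):
    # Gaussian likelihood plus Gaussian (Tikhonov-like) prior, up to an additive constant.
    return -0.5 * np.sum((A @ m - d) ** 2) / sigma ** 2 - 0.5 * alpha * np.sum(m ** 2)

def metropolis(A, d, n_steps=5000, step=0.05, seed=0):
    rng = np.random.default_rng(seed)
    m = np.zeros(A.shape[1]); lp = log_post(m, A, d); samples = []
    for _ in range(n_steps):
        prop = m + step * rng.standard_normal(m.size)
        lp_prop = log_post(prop, A, d)
        if np.log(rng.random()) < lp_prop - lp:     # Metropolis accept/reject
            m, lp = prop, lp_prop
        samples.append(m.copy())
    return np.array(samples)

rng = np.random.default_rng(8)
A = rng.standard_normal((30, 5)); m_true = rng.standard_normal(5)
d = A @ m_true + 0.05 * rng.standard_normal(30)
print(metropolis(A, d).mean(axis=0), m_true)        # posterior mean vs. true toy model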
NMESys: An expert system for network fault detection
NASA Technical Reports Server (NTRS)
Nelson, Peter C.; Warpinski, Janet
1991-01-01
Network management is becoming an increasingly difficult and challenging task. It is very common today to find heterogeneous networks consisting of many different types of computers, operating systems, and protocols. Implementing a network with this many components is difficult enough, while the maintenance of such a network is an even larger problem. A prototype network management expert system, NMESys, implemented in the C Language Integrated Production System (CLIPS), is presented. NMESys concentrates on solving some of the critical problems encountered in managing a large network. The major goal of NMESys is to provide a network operator with an expert system tool to quickly and accurately detect hard failures, detect potential failures, and minimize or eliminate user down time in a large network.
PALP: A Package for Analysing Lattice Polytopes with applications to toric geometry
NASA Astrophysics Data System (ADS)
Kreuzer, Maximilian; Skarke, Harald
2004-02-01
We describe our package PALP of C programs for calculations with lattice polytopes and applications to toric geometry, which is freely available on the internet. It contains routines for vertex and facet enumeration, computation of incidences and symmetries, as well as completion of the set of lattice points in the convex hull of a given set of points. In addition, there are procedures specialized to reflexive polytopes such as the enumeration of reflexive subpolytopes, and applications to toric geometry and string theory, like the computation of Hodge data and fibration structures for toric Calabi-Yau varieties. The package is well tested and optimized in speed as it was used for time consuming tasks such as the classification of reflexive polyhedra in 4 dimensions and the creation and manipulation of very large lists of 5-dimensional polyhedra. While originally intended for low-dimensional applications, the algorithms work in any dimension and our key routine for vertex and facet enumeration compares well with existing packages. Program summary: Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland Title of program: PALP Catalogue identifier: ADSQ Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADSQ Computer for which the program is designed: Any computer featuring C Computers on which it has been tested: PCs, SGI Origin 2000, IBM RS/6000, COMPAQ GS140 Operating systems under which the program has been tested: Linux, IRIX, AIX, OSF1 Programming language used: C Memory required to execute with typical data: Negligible for most applications; highly variable for analysis of large polytopes; no minimum but strong effects on calculation time for some tasks Number of bits in a word: arbitrary Number of processors used: 1 Has the code been vectorised or parallelized?: No Number of bytes in distributed program, including test data, etc.: 138 098 Distribution format: tar gzip file Keywords: Lattice polytopes, facet enumeration, reflexive polytopes, toric geometry, Calabi-Yau manifolds, string theory, conformal field theory Nature of problem: Certain lattice polytopes called reflexive polytopes afford a combinatorial description of a very large class of Calabi-Yau manifolds in terms of toric geometry. These manifolds play an essential role for compactifications of string theory. While originally designed to handle and classify reflexive polytopes, with particular emphasis on problems relevant to string theory applications [M. Kreuzer and H. Skarke, Rev. Math. Phys. 14 (2002) 343], the package also handles standard questions (facet enumeration and similar problems) about arbitrary lattice polytopes very efficiently. Method of solution: Much of the code is straightforward programming, but certain key routines are optimized with respect to calculation time and the handling of large sets of data. A double description method (see, e.g., [D. Avis et al., Comput. Geometry 7 (1997) 265]) is used for the facet enumeration problem, lattice basis reduction for extended gcd and a binary database structure for tasks involving large numbers of polytopes, such as classification problems. Restrictions on the complexity of the program: The only hard limitation comes from the fact that fixed integer arithmetic (32 or 64 bit) is used, allowing for input data (polytope coordinates) of roughly up to 10^9. Other parameters (dimension, numbers of points and vertices, etc.) can be set before compilation.
Typical running time: Most tasks (typically: analysis of a four dimensional reflexive polytope) can be performed interactively within milliseconds. The classification of all reflexive polytopes in four dimensions takes several processor years. The facet enumeration problem for higher (e.g., 12-20) dimensional polytopes varies strongly with the dimension and structure of the polytope; here PALP's performance is similar to that of existing packages [Avis et al., Comput. Geometry 7 (1997) 265]. Unusual features of the program: None
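For orientation only, the snippet below shows the kind of vertex/facet enumeration question PALP answers, done here for a tiny example with SciPy's convex-hull routine; PALP's own optimized C implementation and its double description method are not reproduced.

import numpy as np
from scipy.spatial import ConvexHull

# Lattice points of the unit cube in 3 dimensions.
points = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1],
                   [1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]])
hull = ConvexHull(points)
print("vertex indices:", hull.vertices)
print("triangulated facets:", len(hull.simplices))
print("facet inequalities (normal, offset):", hull.equations.shape)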
Experimental Program to Stimulate Competitive Research (EPSCoR)
NASA Technical Reports Server (NTRS)
Dingerson, Michael R.
1997-01-01
Report includes: (1) CLUSTER: "Studies in Macromolecular Behavior in Microgravity Environment": The Role of Protein Oligomers in Protein Crystallization; Phase Separation Phenomena in Microgravity; Traveling Front Polymerizations; Investigating Mechanisms Affecting Phase Transition Response and Changes in Thermal Transport Properties in ER-Fluids under Normal and Microgravity Conditions. (2) CLUSTER: "Computational/Parallel Processing Studies": Flows in Local Chemical Equilibrium; A Computational Method for Solving Very Large Problems; Modeling of Cavitating Flows.
NASA Astrophysics Data System (ADS)
Reiter, D. T.; Rodi, W. L.
2015-12-01
Constructing 3D Earth models through the joint inversion of large geophysical data sets presents numerous theoretical and practical challenges, especially when diverse types of data and model parameters are involved. Among the challenges are the computational complexity associated with large data and model vectors and the need to unify differing model parameterizations, forward modeling methods and regularization schemes within a common inversion framework. The challenges can be addressed in part by decomposing the inverse problem into smaller, simpler inverse problems that can be solved separately, providing one knows how to merge the separate inversion results into an optimal solution of the full problem. We have formulated an approach to the decomposition of large inverse problems based on the augmented Lagrangian technique from optimization theory. As commonly done, we define a solution to the full inverse problem as the Earth model minimizing an objective function motivated, for example, by a Bayesian inference formulation. Our decomposition approach recasts the minimization problem equivalently as the minimization of component objective functions, corresponding to specified data subsets, subject to the constraints that the minimizing models be equal. A standard optimization algorithm solves the resulting constrained minimization problems by alternating between the separate solution of the component problems and the updating of Lagrange multipliers that serve to steer the individual solution models toward a common model solving the full problem. We are applying our inversion method to the reconstruction of the crust and upper-mantle seismic velocity structure across Eurasia. Data for the inversion comprise a large set of P and S body-wave travel times and fundamental and first-higher mode Rayleigh-wave group velocities.
NASA Technical Reports Server (NTRS)
Kizhner, Semion; Hunter, Stanley D.; Hanu, Andrei R.; Sheets, Teresa B.
2016-01-01
Richard O. Duda and Peter E. Hart of Stanford Research Institute in [1] described the recurring problem in computer image processing as the detection of straight lines in digitized images. The problem is to detect the presence of groups of collinear or almost collinear figure points. It is clear that the problem can be solved to any desired degree of accuracy by testing the lines formed by all pairs of points. However, the computation required for an image of n = NxM points is approximately proportional to n^2, or O(n^2), becoming prohibitive for large images or when data processing cadence time is in milliseconds. Rosenfeld in [2] described an ingenious method due to Hough [3] for replacing the original problem of finding collinear points by a mathematically equivalent problem of finding concurrent lines. This method involves transforming each of the figure points into a straight line in a parameter space. Hough chose to use the familiar slope-intercept parameters, and thus his parameter space was the two-dimensional slope-intercept plane. A parallel Hough transform running on multi-core processors was elaborated in [4]. There are many other proposed methods of solving a similar problem, such as the sampling-up-the-ramp algorithm (SUTR) [5] and algorithms involving artificial swarm intelligence techniques [6]. However, all state-of-the-art algorithms lack real-time performance. Namely, they are slow for large images that require a performance cadence of a few dozen milliseconds (50 ms). This problem arises in spaceflight applications such as near real-time analysis of gamma ray measurements contaminated by an overwhelming amount of cosmic ray (CR) traces. Future spaceflight instruments such as the Advanced Energetic Pair Telescope instrument (AdEPT) [7-9] for cosmic gamma ray surveys employ large detector readout planes registering multitudes of cosmic ray interference events and sparse science gamma ray event traces' projections. The AdEPT science of interest is in the gamma ray events, and the problem is to detect and reject the much more voluminous cosmic ray projections, so that the remaining science data can be telemetered to the ground over the constrained communication link. The state-of-the-art in cosmic ray detection and rejection does not provide an adequate computational solution. This paper presents a novel approach to the AdEPT on-board data processing burdened with the CR detection bottleneck problem. The paper introduces the data processing object, demonstrates object segmentation and distribution for processing among many processing elements (PEs), and presents a solution algorithm for the processing bottleneck - the CR-Algorithm. The algorithm is based on the a priori knowledge that a CR pierces the entire instrument pressure vessel. This phenomenon is also the basis for a straightforward CR simulator, allowing the CR-Algorithm performance testing. Parallel processing of the readout image's (2(N+M) - 4) peripheral voxels detects all CRs, resulting in O(n) computational complexity. This algorithm's near real-time performance makes AdEPT-class spaceflight instruments feasible.
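For reference, a textbook rho-theta Hough accumulator is sketched below in Python (a generic illustration, not the CR-Algorithm, which instead exploits the peripheral voxels of the readout to reach O(n) complexity):

import numpy as np

def hough_lines(binary_img, n_theta=180, n_rho=200):
    ys, xs = np.nonzero(binary_img)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    diag = np.hypot(*binary_img.shape)
    rhos = np.linspace(-diag, diag, n_rho)
    acc = np.zeros((n_rho, n_theta), dtype=np.int32)
    for x, y in zip(xs, ys):
        r = x * np.cos(thetas) + y * np.sin(thetas)      # rho of the line through (x, y) at each theta
        idx = np.digitize(r, rhos) - 1
        acc[idx, np.arange(n_theta)] += 1                # vote in the accumulator
    return acc, rhos, thetas

img = np.zeros((64, 64), dtype=bool)
img[np.arange(64), np.arange(64)] = True                 # a synthetic diagonal line
acc, rhos, thetas = hough_lines(img)
peak = np.unravel_index(acc.argmax(), acc.shape)          # (rho, theta) bin of the strongest line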
Acuña, Daniel E; Parada, Víctor
2010-07-29
Humans need to solve computationally intractable problems such as visual search, categorization, and simultaneous learning and acting, yet an increasing body of evidence suggests that their solutions to instantiations of these problems are near optimal. Computational complexity advances an explanation to this apparent paradox: (1) only a small portion of instances of such problems are actually hard, and (2) successful heuristics exploit structural properties of the typical instance to selectively improve parts that are likely to be sub-optimal. We hypothesize that these two ideas largely account for the good performance of humans on computationally hard problems. We tested part of this hypothesis by studying the solutions of 28 participants to 28 instances of the Euclidean Traveling Salesman Problem (TSP). Participants were provided feedback on the cost of their solutions and were allowed unlimited solution attempts (trials). We found a significant improvement between the first and last trials and that solutions are significantly different from random tours that follow the convex hull and do not have self-crossings. More importantly, we found that participants modified their current better solutions in such a way that edges belonging to the optimal solution ("good" edges) were significantly more likely to stay than other edges ("bad" edges), a hallmark of structural exploitation. We found, however, that more trials harmed the participants' ability to tell good from bad edges, suggesting that after too many trials the participants "ran out of ideas." In sum, we provide the first demonstration of significant performance improvement on the TSP under repetition and feedback and evidence that human problem-solving may exploit the structure of hard problems paralleling behavior of state-of-the-art heuristics.
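As a point of comparison with the human edge-swapping behaviour described above, the following minimal 2-opt sketch (a generic local-improvement heuristic, not a model of the participants) repeatedly reverses tour segments and keeps only improving swaps:

import numpy as np

def tour_length(tour, pts):
    return sum(np.linalg.norm(pts[tour[i]] - pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(pts, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    tour = list(range(len(pts)))
    best = tour_length(tour, pts)
    for _ in range(iters):
        i, j = sorted(rng.choice(len(pts), size=2, replace=False))
        cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]   # reverse one segment (a 2-opt move)
        cl = tour_length(cand, pts)
        if cl < best:                                          # keep only improving swaps
            tour, best = cand, cl
    return tour, best

pts = np.random.default_rng(1).random((28, 2))                 # 28 random cities, as in the study size
print(two_opt(pts)[1])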
Large biases in regression-based constituent flux estimates: causes and diagnostic tools
Hirsch, Robert M.
2014-01-01
It has been documented in the literature that, in some cases, widely used regression-based models can produce severely biased estimates of long-term mean river fluxes of various constituents. These models, estimated using sample values of concentration, discharge, and date, are used to compute estimated fluxes for a multiyear period at a daily time step. This study compares results of the LOADEST seven-parameter model, LOADEST five-parameter model, and the Weighted Regressions on Time, Discharge, and Season (WRTDS) model using subsampling of six very large datasets to better understand this bias problem. This analysis considers sample datasets for dissolved nitrate and total phosphorus. The results show that LOADEST-7 and LOADEST-5, although they often produce very nearly unbiased results, can produce highly biased results. This study identifies three conditions that can give rise to these severe biases: (1) lack of fit of the log of concentration vs. log discharge relationship, (2) substantial differences in the shape of this relationship across seasons, and (3) severely heteroscedastic residuals. The WRTDS model is more resistant to the bias problem than the LOADEST models but is not immune to them. Understanding the causes of the bias problem is crucial to selecting an appropriate method for flux computations. Diagnostic tools for identifying the potential for bias problems are introduced, and strategies for resolving bias problems are described.
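A hedged sketch of the retransformation issue discussed above: the snippet fits a simple log-log rating curve and applies Duan's smearing factor to correct the bias introduced by exponentiating log-space predictions. It is a simplified stand-in, not LOADEST or WRTDS, and the synthetic data are invented.

import numpy as np

def rating_curve_flux(conc, q_sample, q_daily):
    # Fit log(C) = b0 + b1 * log(Q) on the sampled days.
    X = np.column_stack([np.ones_like(q_sample), np.log(q_sample)])
    beta, *_ = np.linalg.lstsq(X, np.log(conc), rcond=None)
    resid = np.log(conc) - X @ beta
    smear = np.mean(np.exp(resid))                  # Duan smearing correction factor
    c_hat = smear * np.exp(beta[0] + beta[1] * np.log(q_daily))
    return c_hat * q_daily                          # daily flux = concentration x discharge

rng = np.random.default_rng(2)
q_s = np.exp(rng.normal(2, 0.8, 120))               # sampled discharges
c_s = 0.3 * q_s ** 0.7 * np.exp(rng.normal(0, 0.4, 120))  # synthetic concentrations
q_d = np.exp(rng.normal(2, 0.8, 3650))              # ten years of daily discharge
print(rating_curve_flux(c_s, q_s, q_d).mean())      # long-term mean daily flux estimate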
Verifiable fault tolerance in measurement-based quantum computation
NASA Astrophysics Data System (ADS)
Fujii, Keisuke; Hayashi, Masahito
2017-09-01
Quantum systems, in general, cannot be simulated efficiently by a classical computer, and hence are useful for solving certain mathematical problems and simulating quantum many-body systems. This also implies, unfortunately, that verification of the output of the quantum systems is not so trivial, since predicting the output is exponentially hard. As another problem, quantum systems are very sensitive to noise and thus need error correction. Here, we propose a framework for verification of the output of fault-tolerant quantum computation in a measurement-based model. In contrast to existing analyses on fault tolerance, we do not assume any noise model on the resource state, but an arbitrary resource state is tested by using only single-qubit measurements to verify whether or not the output of measurement-based quantum computation on it is correct. Verifiability is provided by a constant-time repetition of the original measurement-based quantum computation in appropriate measurement bases. Since full characterization of quantum noise is exponentially hard for large-scale quantum computing systems, our framework provides an efficient way to practically verify the experimental quantum error correction.
In vitro molecular machine learning algorithm via symmetric internal loops of DNA.
Lee, Ji-Hoon; Lee, Seung Hwan; Baek, Christina; Chun, Hyosun; Ryu, Je-Hwan; Kim, Jin-Woo; Deaton, Russell; Zhang, Byoung-Tak
2017-08-01
Programmable biomolecules, such as DNA strands, deoxyribozymes, and restriction enzymes, have been used to solve computational problems, construct large-scale logic circuits, and program simple molecular games. Although studies have shown the potential of molecular computing, the capability of computational learning with DNA molecules, i.e., molecular machine learning, has yet to be experimentally verified. Here, we present a novel molecular learning in vitro model in which symmetric internal loops of double-stranded DNA are exploited to measure the differences between training instances, thus enabling the molecules to learn from small errors. The model was evaluated on a data set of twenty dialogue sentences obtained from the television shows Friends and Prison Break. The wet DNA-computing experiments confirmed that the molecular learning machine was able to generalize the dialogue patterns of each show and successfully identify the show from which the sentences originated. The molecular machine learning model described here opens the way for solving machine learning problems in computer science and biology using in vitro molecular computing with the data encoded in DNA molecules. Copyright © 2017. Published by Elsevier B.V.
Network Community Detection based on the Physarum-inspired Computational Framework.
Gao, Chao; Liang, Mingxin; Li, Xianghua; Zhang, Zili; Wang, Zhen; Zhou, Zhili
2016-12-13
Community detection is a crucial and essential problem in the structure analytics of complex networks, which can help us understand and predict the characteristics and functions of complex networks. Many methods, ranging from the optimization-based algorithms to the heuristic-based algorithms, have been proposed for solving such a problem. Due to the inherent complexity of identifying network structure, how to design an effective algorithm with a higher accuracy and a lower computational cost still remains an open problem. Inspired by the computational capability and positive feedback mechanism in the wake of foraging process of Physarum, which is a large amoeba-like cell consisting of a dendritic network of tube-like pseudopodia, a general Physarum-based computational framework for community detection is proposed in this paper. Based on the proposed framework, the inter-community edges can be identified from the intra-community edges in a network and the positive feedback of solving process in an algorithm can be further enhanced, which are used to improve the efficiency of original optimization-based and heuristic-based community detection algorithms, respectively. Some typical algorithms (e.g., genetic algorithm, ant colony optimization algorithm, and Markov clustering algorithm) and real-world datasets have been used to estimate the efficiency of our proposed computational framework. Experiments show that the algorithms optimized by Physarum-inspired computational framework perform better than the original ones, in terms of accuracy and computational cost. Moreover, a computational complexity analysis verifies the scalability of our framework.
Quantum communication and information processing
NASA Astrophysics Data System (ADS)
Beals, Travis Roland
Quantum computers enable dramatically more efficient algorithms for solving certain classes of computational problems, but, in doing so, they create new problems. In particular, Shor's Algorithm allows for efficient cryptanalysis of many public-key cryptosystems. As public key cryptography is a critical component of present-day electronic commerce, it is crucial that a working, secure replacement be found. Quantum key distribution (QKD), first developed by C.H. Bennett and G. Brassard, offers a partial solution, but many challenges remain, both in terms of hardware limitations and in designing cryptographic protocols for a viable large-scale quantum communication infrastructure. In Part I, I investigate optical lattice-based approaches to quantum information processing. I look at details of a proposal for an optical lattice-based quantum computer, which could potentially be used for both quantum communications and for more sophisticated quantum information processing. In Part III, I propose a method for converting and storing photonic quantum bits in the internal state of periodically-spaced neutral atoms by generating and manipulating a photonic band gap and associated defect states. In Part II, I present a cryptographic protocol which allows for the extension of present-day QKD networks over much longer distances without the development of new hardware. I also present a second, related protocol which effectively solves the authentication problem faced by a large QKD network, thus making QKD a viable, information-theoretic secure replacement for public key cryptosystems.
Visual Analytics for Power Grid Contingency Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wong, Pak C.; Huang, Zhenyu; Chen, Yousu
2014-01-20
Contingency analysis is the process of employing different measures to model scenarios, analyze them, and then derive the best response to remove the threats. This application paper focuses on a class of contingency analysis problems found in the power grid management system. A power grid is a geographically distributed interconnected transmission network that transmits and delivers electricity from generators to end users. The power grid contingency analysis problem is increasingly important because of both the growing size of the underlying raw data that need to be analyzed and the urgency to deliver working solutions in an aggressive timeframe. Failure to do so may bring significant financial, economic, and security impacts to all parties involved and the society at large. The paper presents a scalable visual analytics pipeline that transforms about 100 million contingency scenarios to a manageable size and form for grid operators to examine different scenarios and come up with preventive or mitigation strategies to address the problems in a predictive and timely manner. Great attention is given to the computational scalability, information scalability, visual scalability, and display scalability issues surrounding the data analytics pipeline. Most of the large-scale computation requirements of our work are conducted on a Cray XMT multi-threaded parallel computer. The paper demonstrates a number of examples using western North American power grid models and data.
Advanced Variance Reduction Strategies for Optimizing Mesh Tallies in MAVRIC
DOE Office of Scientific and Technical Information (OSTI.GOV)
Peplow, Douglas E.; Blakeman, Edward D; Wagner, John C
2007-01-01
More often than in the past, Monte Carlo methods are being used to compute fluxes or doses over large areas using mesh tallies (a set of region tallies defined on a mesh that overlays the geometry). For problems that demand that the uncertainty in each mesh cell be less than some set maximum, computation time is controlled by the cell with the largest uncertainty. This issue becomes quite troublesome in deep-penetration problems, and advanced variance reduction techniques are required to obtain reasonable uncertainties over large areas. The CADIS (Consistent Adjoint Driven Importance Sampling) methodology has been shown to very efficiently optimize the calculation of a response (flux or dose) for a single point or a small region using weight windows and a biased source based on the adjoint of that response. This has been incorporated into codes such as ADVANTG (based on MCNP) and the new sequence MAVRIC, which will be available in the next release of SCALE. In an effort to compute lower uncertainties everywhere in the problem, Larsen's group has also developed several methods to help distribute particles more evenly, based on forward estimates of flux. This paper focuses on the use of a forward estimate to weight the placement of the source in the adjoint calculation used by CADIS, which we refer to as a forward-weighted CADIS (FW-CADIS).
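For context, the standard CADIS relations as given in the published literature (the notation here is generic and not copied from this report): the adjoint flux for the response of interest defines both the biased source and the weight-window targets,

\[
  R = \int \phi^{\dagger}(\vec r, E)\, q(\vec r, E)\, dV\, dE, \qquad
  \hat{q}(\vec r, E) = \frac{\phi^{\dagger}(\vec r, E)\, q(\vec r, E)}{R}, \qquad
  \bar{w}(\vec r, E) = \frac{R}{\phi^{\dagger}(\vec r, E)},
\]

and FW-CADIS, as described above, builds the adjoint source from a forward flux estimate (weighting regions roughly by the inverse of the forward flux) so that the resulting importance map evens out the uncertainty across the whole mesh tally rather than optimizing a single detector response.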
Seeing beyond Computer Science and Software Engineering
NASA Astrophysics Data System (ADS)
Nori, Kesav Vithal
The boundaries of computer science are defined by what symbolic computation can accomplish. Software Engineering is concerned with effective use of computing technology to support automatic computation on a large scale so as to construct desirable solutions to worthwhile problems. Both focus on what happens within the machine. In contrast, most practical applications of computing support end-users in realizing (often unsaid) objectives. It is often said that such objectives cannot even be specified, e.g., what is the specification of MS Word, or for that matter, any flavour of UNIX? This situation points to the need for architecting what people do with computers. Based on Systems Thinking and Cybernetics, we present such a viewpoint, which hinges on Human Responsibility and means of living up to it.
Large-Scale Optimization for Bayesian Inference in Complex Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Willcox, Karen; Marzouk, Youssef
2013-11-12
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT-Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.
Sub-problem Optimization With Regression and Neural Network Approximators
NASA Technical Reports Server (NTRS)
Guptill, James D.; Hopkins, Dale A.; Patnaik, Surya N.
2003-01-01
Design optimization of large systems can be attempted through a sub-problem strategy. In this strategy, the original problem is divided into a number of smaller problems that are clustered together to obtain a sequence of sub-problems. Solution to the large problem is attempted iteratively through repeated solutions to the modest sub-problems. This strategy is applicable to structures and to multidisciplinary systems. For structures, clustering the substructures generates the sequence of sub-problems. For a multidisciplinary system, individual disciplines, accounting for coupling, can be considered as sub-problems. A sub-problem, if required, can be further broken down to accommodate sub-disciplines. The sub-problem strategy is being implemented into the NASA design optimization test bed, referred to as "CometBoards." Neural network and regression approximators are employed for reanalysis and sensitivity analysis calculations at the sub-problem level. The strategy has been implemented in sequential as well as parallel computational environments. This strategy, which attempts to alleviate algorithmic and reanalysis deficiencies, has the potential to become a powerful design tool. However, several issues have to be addressed before its full potential can be harnessed. This paper illustrates the strategy and addresses some issues.
Ordinal optimization and its application to complex deterministic problems
NASA Astrophysics Data System (ADS)
Yang, Mike Shang-Yu
1998-10-01
We present in this thesis a new perspective to approach a general class of optimization problems characterized by large deterministic complexities. Many problems of real-world concern today lack analyzable structures and almost always involve a high level of difficulty and complexity in the evaluation process. Advances in computer technology allow us to build computer models to simulate the evaluation process through numerical means, but the burden of high complexity remains, taxing the simulation with an exorbitant computing cost for each evaluation. Such a resource requirement makes local fine-tuning of a known design difficult under most circumstances, let alone global optimization. Kolmogorov equivalence of complexity and randomness in computation theory is introduced to resolve this difficulty by converting the complex deterministic model to a stochastic pseudo-model composed of a simple deterministic component and a white-noise-like stochastic term. The resulting randomness is then dealt with by a noise-robust approach called Ordinal Optimization. Ordinal Optimization utilizes Goal Softening and Ordinal Comparison to achieve an efficient and quantifiable selection of designs in the initial search process. The approach is substantiated by a case study in the turbine blade manufacturing process. The problem involves the optimization of the manufacturing process of the integrally bladed rotor in the turbine engines of U.S. Air Force fighter jets. The intertwining interactions among the material, thermomechanical, and geometrical changes make the current FEM approach prohibitively uneconomical in the optimization process. The generalized OO approach to complex deterministic problems is applied here with great success. Empirical results indicate a saving of nearly 95% in the computing cost.
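A small illustration of the ordinal-comparison idea, under the assumption that a crude, noisy model is used only to rank designs: even with evaluation noise comparable to the spread in performance, the observed top set overlaps strongly with the true top set. The numbers below are arbitrary, not taken from the case study.

```python
import numpy as np

rng = np.random.default_rng(1)

n_designs = 1000
true_perf = rng.standard_normal(n_designs)        # unknown "exact" performance
noise = 1.0 * rng.standard_normal(n_designs)      # crude-model evaluation error
observed = true_perf + noise                      # cheap, noisy estimates

g, s = 50, 50                                     # "good enough" set and selected set
true_top = set(np.argsort(true_perf)[:g])         # smaller is better
selected = set(np.argsort(observed)[:s])

overlap = len(true_top & selected)
print(f"{overlap} of the selected {s} designs are in the true top {g}")
# Ordinal comparison is robust: ranking converges much faster than the value
# estimates themselves, which is what goal softening exploits.
```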
An evaluation of superminicomputers for thermal analysis
NASA Technical Reports Server (NTRS)
Storaasli, O. O.; Vidal, J. B.; Jones, G. K.
1982-01-01
The use of superminicomputers for solving a series of increasingly complex thermal analysis problems is investigated. The approach involved (1) installation and verification of the SPAR thermal analyzer software on superminicomputers at Langley Research Center and Goddard Space Flight Center, (2) solution of six increasingly complex thermal problems on this equipment, and (3) comparison of solution (accuracy, CPU time, turnaround time, and cost) with solutions on large mainframe computers.
1994-06-01
algorithms for large, irreducibly coupled systems iteratively solve concurrent problems within different subspaces of a Hilbert space, or within different... effective on problems amenable to SIMD solution. Together with researchers at AT&T Bell Labs (Boris Lubachevsky, Albert Greenberg) we have developed... reasonable measurement. In the study of different speedups, various causes of superlinear speedup are also presented. Greenberg, Albert G., Boris D
A Pipeline Software Architecture for NMR Spectrum Data Translation
Ellis, Heidi J.C.; Weatherby, Gerard; Nowling, Ronald J.; Vyas, Jay; Fenwick, Matthew; Gryk, Michael R.
2012-01-01
The problem of formatting data so that it conforms to the required input for scientific data processing tools pervades scientific computing. The CONNecticut Joint University Research Group (CONNJUR) has developed a data translation tool based on a pipeline architecture that partially solves this problem. The CONNJUR Spectrum Translator supports data format translation for experiments that use Nuclear Magnetic Resonance to determine the structure of large protein molecules. PMID:24634607
Ishihara, Koji; Morimoto, Jun
2018-03-01
Humans use multiple muscles to generate such joint movements as an elbow motion. With multiple lightweight and compliant actuators, joint movements can also be efficiently generated. Similarly, robots can use multiple actuators to efficiently generate a one-degree-of-freedom movement. For this movement, the desired joint torque must be properly distributed to each actuator. One approach to cope with this torque distribution problem is an optimal control method. However, solving the optimal control problem at each control time step has not been deemed a practical approach due to its large computational burden. In this paper, we propose a computationally efficient method to derive an optimal control strategy for a hybrid actuation system composed of multiple actuators, where each actuator has different dynamical properties. We investigated a singularly perturbed system of the hybrid actuator model that subdivided the original large-scale control problem into smaller subproblems so that the optimal control outputs for each actuator can be derived at each control time step, and we applied the proposed method to our pneumatic-electric hybrid actuator system. Our method derived a torque distribution strategy for the hybrid actuator by dealing with the difficulty of solving real-time optimal control problems. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
Numerical Technology for Large-Scale Computational Electromagnetics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sharpe, R; Champagne, N; White, D
The key bottleneck of implicit computational electromagnetics tools for large complex geometries is the solution of the resulting linear system of equations. The goal of this effort was to research and develop critical numerical technology that alleviates this bottleneck for large-scale computational electromagnetics (CEM). The mathematical operators and numerical formulations used in this arena of CEM yield linear equations that are complex valued, unstructured, and indefinite. Also, simultaneously applying multiple mathematical modeling formulations to different portions of a complex problem (hybrid formulations) results in a mixed structure linear system, further increasing the computational difficulty. Typically, these hybrid linear systems are solved using a direct solution method, which was acceptable for Cray-class machines but does not scale adequately for ASCI-class machines. Additionally, LLNL's previously existing linear solvers were not well suited for the linear systems that are created by hybrid implicit CEM codes. Hence, a new approach was required to make effective use of ASCI-class computing platforms and to enable the next generation design capabilities. Multiple approaches were investigated, including the latest sparse-direct methods developed by our ASCI collaborators. In addition, approaches that combine domain decomposition (or matrix partitioning) with general-purpose iterative methods and special purpose pre-conditioners were investigated. Special-purpose pre-conditioners that take advantage of the structure of the matrix were adapted and developed based on intimate knowledge of the matrix properties. Finally, new operator formulations were developed that radically improve the conditioning of the resulting linear systems thus greatly reducing solution time. The goal was to enable the solution of CEM problems that are 10 to 100 times larger than our previous capability.
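As a generic point of reference for the approach described (combining matrix partitioning with general-purpose iterative methods and preconditioners), here is a SciPy sketch of block-Jacobi preconditioned GMRES on a complex-valued test system. It is not the LLNL solver stack; the matrix, block count, and keyword names (which follow recent SciPy releases) are assumptions.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(2)
n, nb = 400, 4                                   # system size, number of blocks

# Complex-valued, non-Hermitian sparse test system (illustrative only).
A = (sp.random(n, n, density=0.02, random_state=2)
     + 1j * sp.random(n, n, density=0.02, random_state=3)
     + (2.0 + 0.5j) * sp.eye(n)).tocsr()
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Block-Jacobi preconditioner: factor each diagonal block independently.
edges = np.linspace(0, n, nb + 1, dtype=int)
blocks = list(zip(edges[:-1], edges[1:]))
lu = [spla.splu(sp.csc_matrix(A[s:e, s:e])) for s, e in blocks]

def apply_prec(r):
    z = np.empty_like(r)
    for (s, e), f in zip(blocks, lu):
        z[s:e] = f.solve(r[s:e])                 # independent block solves
    return z

M = spla.LinearOperator((n, n), matvec=apply_prec, dtype=complex)
x, info = spla.gmres(A, b, M=M, rtol=1e-8, restart=50)
print("converged:", info == 0, " residual:", np.linalg.norm(b - A @ x))
```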
NASA Astrophysics Data System (ADS)
Varini, Nicola; Ceresoli, Davide; Martin-Samos, Layla; Girotto, Ivan; Cavazzoni, Carlo
2013-08-01
One of the most promising techniques used for studying the electronic properties of materials is based on the Density Functional Theory (DFT) approach and its extensions. DFT has been widely applied in traditional solid state physics problems where periodicity and symmetry play a crucial role in reducing the computational workload. With growing compute power capability and the development of improved DFT methods, the range of potential applications now includes other scientific areas such as Chemistry and Biology. However, cross-disciplinary combinations of traditional Solid-State Physics, Chemistry and Biology drastically increase the system complexity while reducing the degree of periodicity and symmetry. Large simulation cells containing hundreds or even thousands of atoms are needed to model these kinds of physical systems. The treatment of those systems still remains a computational challenge even with modern supercomputers. In this paper we describe our work to improve the scalability of Quantum ESPRESSO (Giannozzi et al., 2009 [3]) for treating very large cells and huge numbers of electrons. To this end we have introduced an extra level of parallelism, over electronic bands, in three kernels for solving computationally expensive problems: the Sternheimer equation solver (Nuclear Magnetic Resonance, package QE-GIPAW), the Fock operator builder (electronic ground-state, package PWscf) and most of the Car-Parrinello routines (Car-Parrinello dynamics, package CP). Final benchmarks show our success in computing the Nuclear Magnetic Resonance (NMR) chemical shift of a large biological assembly, the electronic structure of defected amorphous silica with hybrid exchange-correlation functionals and the equilibrium atomic structure of eight Porphyrins anchored to a Carbon Nanotube, on many thousands of CPU cores.
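A minimal mpi4py sketch of what an extra level of parallelism over electronic bands can look like at the communicator level; the group count and the reduction are illustrative and do not reproduce Quantum ESPRESSO's internal implementation.

```python
from mpi4py import MPI

world = MPI.COMM_WORLD
n_band_groups = 4                        # hypothetical number of band groups

# Extra level of parallelism: ranks are divided into band groups, and a second
# communicator links ranks that hold the same slice of the remaining workload.
band_color = world.rank % n_band_groups
band_comm = world.Split(color=band_color, key=world.rank)       # within a band group
pw_comm = world.Split(color=world.rank // n_band_groups, key=world.rank)

# Each band group works on its own subset of electronic bands; partial results
# are then combined with reductions over the complementary communicator.
local_contribution = float(world.rank)
total = pw_comm.allreduce(local_contribution, op=MPI.SUM)
if world.rank == 0:
    print("band groups:", n_band_groups, "ranks per group:", band_comm.Get_size())
```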
Strategies for Large Scale Implementation of a Multiscale, Multiprocess Integrated Hydrologic Model
NASA Astrophysics Data System (ADS)
Kumar, M.; Duffy, C.
2006-05-01
Distributed models simulate hydrologic state variables in space and time while taking into account the heterogeneities in terrain, surface, subsurface properties and meteorological forcings. The computational cost and complexity associated with these models increase with their tendency to accurately simulate the large number of interacting physical processes at fine spatio-temporal resolution in a large basin. A hydrologic model run on a coarse spatial discretization of the watershed with a limited number of physical processes carries a smaller computational load. But this negatively affects the accuracy of model results and restricts physical realization of the problem. So it is imperative to have an integrated modeling strategy (a) which can be universally applied at various scales in order to study the tradeoffs between computational complexity (determined by spatio-temporal resolution), accuracy and predictive uncertainty in relation to various approximations of physical processes, (b) which can be applied at adaptively different spatial scales in the same domain by taking into account the local heterogeneity of topography and hydrogeologic variables, and (c) which is flexible enough to incorporate different numbers and approximations of process equations depending on model purpose and computational constraint. An efficient implementation of this strategy becomes all the more important for the Great Salt Lake river basin, which is relatively large (~89,000 sq. km) and complex in terms of hydrologic and geomorphic conditions. Also, the types and the time scales of hydrologic processes which are dominant in different parts of the basin are different. Part of the snowmelt runoff generated in the Uinta Mountains infiltrates and contributes as base flow to the Great Salt Lake over a time scale of decades to centuries. The adaptive strategy helps capture the steep topographic and climatic gradient along the Wasatch front. Here we present the aforesaid modeling strategy along with an associated hydrologic modeling framework which facilitates a seamless, computationally efficient and accurate integration of the process model with the data model. The flexibility of this framework leads to implementation of multiscale, multiresolution, adaptive refinement/de-refinement and nested modeling simulations with the least computational burden. However, performing these simulations and the related calibration of these models over a large basin at higher spatio-temporal resolutions is computationally intensive and requires increasing computing power. With the advent of parallel processing architectures, high computing performance can be achieved by parallelization of existing serial integrated-hydrologic-model code. This translates to running the same model simulation on a network of a large number of processors, thereby reducing the time needed to obtain a solution. The paper also discusses the implementation of the integrated model on parallel processors, including the mapping of the problem onto a multi-processor environment, methods to incorporate coupling between hydrologic processes using interprocessor communication models, the model data structure, and parallel numerical algorithms to obtain high performance.
CPU architecture for a fast and energy-saving calculation of convolution neural networks
NASA Astrophysics Data System (ADS)
Knoll, Florian J.; Grelcke, Michael; Czymmek, Vitali; Holtorf, Tim; Hussmann, Stephan
2017-06-01
One of the most difficult problems in the use of artificial neural networks is the required computational capacity. Although large search engine companies own specially developed hardware to provide the necessary computing power, the conventional user is left with the state-of-the-art method of using a graphics processing unit (GPU) as the computational basis. Although these processors are well suited for large matrix computations, they consume massive amounts of energy. Therefore, a new processor based on a field programmable gate array (FPGA) has been developed and optimized for the application of deep learning. This processor is presented in this paper. The processor can be adapted for a particular application (in this paper, to an organic farming application). The power consumption is only a fraction of that of a GPU implementation, and the processor should therefore be well suited for energy-saving applications.
Fast calculation method for computer-generated cylindrical holograms.
Yamaguchi, Takeshi; Fujii, Tomohiko; Yoshikawa, Hiroshi
2008-07-01
Since a general flat hologram has a limited viewable area, we usually cannot see the other side of a reconstructed object. There are some holograms that can solve this problem. A cylindrical hologram is well known to be viewable in 360 deg. Most cylindrical holograms are optical holograms, but there are few reports of computer-generated cylindrical holograms. The lack of computer-generated cylindrical holograms is because the spatial resolution of output devices is not high enough; therefore, we have to make a large hologram or use a small object to satisfy the sampling theorem. In addition, when calculating the large fringe pattern, the amount of computation increases in proportion to the hologram size. Therefore, we propose what we believe to be a new method for fast fringe calculation. Then, we print these fringes with our prototype fringe printer. As a result, we obtain a good reconstructed image from a computer-generated cylindrical hologram.
A class of hybrid finite element methods for electromagnetics: A review
NASA Technical Reports Server (NTRS)
Volakis, J. L.; Chatterjee, A.; Gong, J.
1993-01-01
Integral equation methods have generally been the workhorse for antenna and scattering computations. In the case of antennas, they continue to be the prominent computational approach, but for scattering applications the requirement for large-scale computations has turned researchers' attention to near neighbor methods such as the finite element method, which has low O(N) storage requirements and is readily adaptable in modeling complex geometrical features and material inhomogeneities. In this paper, we review three hybrid finite element methods for simulating composite scatterers, conformal microstrip antennas, and finite periodic arrays. Specifically, we discuss the finite element method and its application to electromagnetic problems when combined with the boundary integral, absorbing boundary conditions, and artificial absorbers for terminating the mesh. Particular attention is given to large-scale simulations, methods, and solvers for achieving low memory requirements and code performance on parallel computing architectures.
Colak, Recep; Moser, Flavia; Chu, Jeffrey Shih-Chieh; Schönhuth, Alexander; Chen, Nansheng; Ester, Martin
2010-10-25
Computational prediction of functionally related groups of genes (functional modules) from large-scale data is an important issue in computational biology. Gene expression experiments and interaction networks are well studied large-scale data sources, available for many not yet exhaustively annotated organisms. It has been well established that, when these two data sources are analyzed jointly, modules are often reflected by highly interconnected (dense) regions in the interaction networks whose participating genes are co-expressed. However, the tractability of the problem had remained unclear, and methods by which to exhaustively search for such constellations had not been presented. We provide an algorithmic framework, referred to as Densely Connected Biclustering (DECOB), by which the aforementioned search problem becomes tractable. To benchmark the predictive power inherent to the approach, we computed all co-expressed, dense regions in physical protein and genetic interaction networks from human and yeast. An automated filtering procedure reduces our output, resulting in smaller collections of modules comparable to state-of-the-art approaches. Our results performed favorably in a fair benchmarking competition which adheres to standard criteria. We demonstrate the usefulness of an exhaustive module search by using the unreduced output to more quickly perform GO-term-related function prediction tasks. We point out the advantages of our exhaustive output by predicting functional relationships using two examples. We demonstrate that the computation of all densely connected and co-expressed regions in interaction networks is an approach to module discovery of considerable value. Beyond confirming the well-settled hypothesis that such co-expressed, densely connected interaction network regions reflect functional modules, we open up novel computational ways to comprehensively analyze the modular organization of an organism based on prevalent and largely available large-scale datasets. Software and data sets are available at http://www.sfu.ca/~ester/software/DECOB.zip.
Model verification of large structural systems
NASA Technical Reports Server (NTRS)
Lee, L. T.; Hasselman, T. K.
1977-01-01
A methodology was formulated, and a general computer code implemented for processing sinusoidal vibration test data to simultaneously make adjustments to a prior mathematical model of a large structural system, and resolve measured response data to obtain a set of orthogonal modes representative of the test model. The derivation of estimator equations is shown along with example problems. A method for improving the prior analytic model is included.
Local CD-ROM in interaction with HTML documents over the Internet.
Mattheos, N; Nattestad, A; Attström, R
2000-08-01
The Internet and computer-assisted learning have enhanced the possibilities of providing quality distance learning in dentistry. The use of multimedia material is an essential part of such distance learning courses. However, the Internet technology available has limitations regarding the transmission of large multimedia files. Therefore, especially when addressing undergraduate students or geographically isolated professionals, large download times make distance learning unattractive. This problem was technically solved in a distance learning course for undergraduate students from all over Europe. The present communication describes a method to bypass the problem of transmitting large multimedia files by the use of a specially designed CD-ROM. This CD-ROM was run locally on the students' PC, interacting with HTML documents sent over the Internet.
Special issue of Computers and Fluids in honor of Cecil E. (Chuck) Leith
Zhou, Ye; Herring, Jackson
2017-05-12
This special issue of Computers and Fluids is dedicated to Cecil E. (Chuck) Leith in honor of his research contributions and leadership in the areas of statistical fluid mechanics, computational fluid dynamics, and climate theory. Leith's contribution to these fields emerged from his interest in solving complex fluid flow problems--even those at high Mach numbers--in an era well before large scale supercomputing became the dominant mode of inquiry into these fields. Yet the issues raised and solved by his research effort are still of vital interest today.
CSM Testbed Development and Large-Scale Structural Applications
NASA Technical Reports Server (NTRS)
Knight, Norman F., Jr.; Gillian, R. E.; Mccleary, Susan L.; Lotts, C. G.; Poole, E. L.; Overman, A. L.; Macy, S. C.
1989-01-01
A research activity called Computational Structural Mechanics (CSM) conducted at the NASA Langley Research Center is described. This activity is developing advanced structural analysis and computational methods that exploit high-performance computers. Methods are developed in the framework of the CSM Testbed software system and applied to representative complex structural analysis problems from the aerospace industry. An overview of the CSM Testbed methods development environment is presented and some new numerical methods developed on a CRAY-2 are described. Selected application studies performed on the NAS CRAY-2 are also summarized.
Terascale Optimal PDE Simulations (TOPS) Center
DOE Office of Scientific and Technical Information (OSTI.GOV)
Professor Olof B. Widlund
2007-07-09
Our work has focused on the development and analysis of domain decomposition algorithms for a variety of problems arising in continuum mechanics modeling. In particular, we have extended and analyzed FETI-DP and BDDC algorithms; these iterative solvers were first introduced and studied by Charbel Farhat and his collaborators, see [11, 45, 12], and by Clark Dohrmann of SANDIA, Albuquerque, see [43, 2, 1], respectively. These two closely related families of methods are of particular interest since they are used more extensively than other iterative substructuring methods to solve very large and difficult problems. Thus, the FETI algorithms are part of the SALINAS system developed by the SANDIA National Laboratories for very large scale computations, and as already noted, BDDC was first developed by a SANDIA scientist, Dr. Clark Dohrmann. The FETI algorithms are also making inroads in commercial engineering software systems. We also note that the analysis of these algorithms poses very real mathematical challenges. The success in developing this theory has, in several instances, led to significant improvements in the performance of these algorithms. A very desirable feature of these iterative substructuring and other domain decomposition algorithms is that they respect the memory hierarchy of modern parallel and distributed computing systems, which is essential for approaching peak floating point performance. The development of improved methods, together with more powerful computer systems, is making it possible to carry out simulations in three dimensions, with quite high resolution, relatively easily. This work is supported by high quality software systems, such as Argonne's PETSc library, which facilitates code development as well as the access to a variety of parallel and distributed computer systems. The success in finding scalable and robust domain decomposition algorithms for very large numbers of processors and very large finite element problems is, e.g., illustrated in [24, 25, 26]. This work is based on [29, 31]. Our work over these five and a half years has, in our opinion, helped advance the knowledge of domain decomposition methods significantly. We see these methods as providing valuable alternatives to other iterative methods, in particular, those based on multi-grid. In our opinion, our accomplishments also match the goals of the TOPS project quite closely.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yamazaki, Ichitaro; Wu, Kesheng; Simon, Horst
2008-10-27
The original software package TRLan, [TRLan User Guide], page 24, implements the thick restart Lanczos method, [Wu and Simon 2001], page 24, for computing eigenvalues {lambda} and their corresponding eigenvectors v of a symmetric matrix A: Av = {lambda}v. Its effectiveness in computing the exterior eigenvalues of a large matrix has been demonstrated, [LBNL-42982], page 24. However, its performance strongly depends on the user-specified dimension of a projection subspace. If the dimension is too small, TRLan suffers from slow convergence. If it is too large, the computational and memory costs become expensive. Therefore, to balance the solution convergence and costs, users must select an appropriate subspace dimension for each eigenvalue problem at hand. To free users from this difficult task, nu-TRLan, [LNBL-1059E], page 23, adjusts the subspace dimension at every restart such that optimal performance in solving the eigenvalue problem is automatically obtained. This document provides a user guide to the nu-TRLan software package. The original TRLan software package was implemented in Fortran 90 to solve symmetric eigenvalue problems using static projection subspace dimensions. nu-TRLan was developed in C and extended to solve Hermitian eigenvalue problems. It can be invoked using either a static or an adaptive subspace dimension. In order to simplify its use for TRLan users, nu-TRLan has interfaces and features similar to those of TRLan: (1) Solver parameters are stored in a single data structure called trl-info, Chapter 4 [trl-info structure], page 7. (2) Most of the numerical computations are performed by BLAS, [BLAS], page 23, and LAPACK, [LAPACK], page 23, subroutines, which allow nu-TRLan to achieve optimized performance across a wide range of platforms. (3) To solve eigenvalue problems on distributed memory systems, the message passing interface (MPI), [MPI forum], page 23, is used. The rest of this document is organized as follows. In Chapter 2 [Installation], page 2, we provide an installation guide of the nu-TRLan software package. In Chapter 3 [Example], page 3, we present a simple nu-TRLan example program. In Chapter 4 [trl-info structure], page 7, and Chapter 5 [trlan subroutine], page 14, we describe the solver parameters and interfaces in detail. In Chapter 6 [Solver parameters], page 21, we discuss the selection of the user-specified parameters. In Chapter 7 [Contact information], page 22, we give the acknowledgements and contact information of the authors. In Chapter 8 [References], page 23, we list references to related works.
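nu-TRLan's own C interface is not reproduced here; as an analogy for the trade-off the guide describes, SciPy's ARPACK wrapper (an implicitly restarted Lanczos method) exposes the projection-subspace dimension through its ncv argument. The sketch below, on an assumed diagonal test matrix, simply varies that parameter.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

# Simple symmetric test matrix with well-separated eigenvalues 1, 2, ..., n.
n = 5000
A = sp.diags(np.arange(1.0, n + 1.0), format="csr")

# The projection-subspace dimension (ncv, analogous to TRLan's user parameter)
# trades convergence speed against memory and re-orthogonalization cost.
for ncv in (12, 25, 50):
    vals = eigsh(A, k=5, which="SA", ncv=ncv, return_eigenvectors=False)
    print(f"ncv={ncv:3d}  five smallest eigenvalues: {np.sort(vals)}")
```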
Developing science gateways for drug discovery in a grid environment.
Pérez-Sánchez, Horacio; Rezaei, Vahid; Mezhuyev, Vitaliy; Man, Duhu; Peña-García, Jorge; den-Haan, Helena; Gesing, Sandra
2016-01-01
Methods for in silico screening of large databases of molecules increasingly complement and replace experimental techniques to discover novel compounds to combat diseases. As these techniques become more complex and computationally costly, we face a growing challenge in providing the life sciences research community with a convenient tool for high-throughput virtual screening on distributed computing resources. To this end, we recently integrated the biophysics-based drug-screening program FlexScreen into a service, applicable for large-scale parallel screening and reusable in the context of scientific workflows. Our implementation is based on Pipeline Pilot and Simple Object Access Protocol and provides an easy-to-use graphical user interface to construct complex workflows, which can be executed on distributed computing resources, thus accelerating the throughput by several orders of magnitude.
Pre-Hardware Optimization of Spacecraft Image Processing Algorithms and Hardware Implementation
NASA Technical Reports Server (NTRS)
Kizhner, Semion; Petrick, David J.; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Day, John H. (Technical Monitor)
2002-01-01
Spacecraft telemetry rates and telemetry product complexity have steadily increased over the last decade presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image data processing and color picture generation application. Although large super-computer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The proposed solution is based on a Personal Computer (PC) platform and synergy of optimized software algorithms, and reconfigurable computing hardware (RC) technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processors (DSP). It has been shown that this approach can provide superior inexpensive performance for a chosen application on the ground station or on-board a spacecraft.
Parallel processing for scientific computations
NASA Technical Reports Server (NTRS)
Alkhatib, Hasan S.
1991-01-01
The main contribution of the effort in the last two years is the introduction of the MOPPS system. After an extensive literature search, we introduced the system, which is described next. MOPPS employs a new solution to the problem of managing programs which solve scientific and engineering applications on a distributed processing environment. Autonomous computers cooperate efficiently in solving large scientific problems with this solution. MOPPS has the advantage of not assuming the presence of any particular network topology or configuration, computer architecture, or operating system. It imposes little overhead on network and processor resources while efficiently managing programs concurrently. The core of MOPPS is an intelligent program manager that builds a knowledge base of the execution performance of the parallel programs it is managing under various conditions. The manager applies this knowledge to improve the performance of future runs. The program manager learns from experience.
NASA Astrophysics Data System (ADS)
Plattner, A.; Maurer, H. R.; Vorloeper, J.; Dahmen, W.
2010-08-01
Despite the ever-increasing power of modern computers, realistic modelling of complex 3-D earth models is still a challenging task and requires substantial computing resources. The overwhelming majority of current geophysical modelling approaches includes either finite difference or non-adaptive finite element algorithms and variants thereof. These numerical methods usually require the subsurface to be discretized with a fine mesh to accurately capture the behaviour of the physical fields. However, this may result in excessive memory consumption and computing times. A common feature of most of these algorithms is that the modelled data discretizations are independent of the model complexity, which may be wasteful when there are only minor to moderate spatial variations in the subsurface parameters. Recent developments in the theory of adaptive numerical solvers have the potential to overcome this problem. Here, we consider an adaptive wavelet-based approach that is applicable to a large range of problems, also including nonlinear problems. In comparison with earlier applications of adaptive solvers to geophysical problems we employ here a new adaptive scheme whose core ingredients arose from a rigorous analysis of the overall asymptotically optimal computational complexity, including in particular, an optimal work/accuracy rate. Our adaptive wavelet algorithm offers several attractive features: (i) for a given subsurface model, it allows the forward modelling domain to be discretized with a quasi minimal number of degrees of freedom, (ii) sparsity of the associated system matrices is guaranteed, which makes the algorithm memory efficient and (iii) the modelling accuracy scales linearly with computing time. We have implemented the adaptive wavelet algorithm for solving 3-D geoelectric problems. To test its performance, numerical experiments were conducted with a series of conductivity models exhibiting varying degrees of structural complexity. Results were compared with a non-adaptive finite element algorithm, which incorporates an unstructured mesh to best-fitting subsurface boundaries. Such algorithms represent the current state-of-the-art in geoelectric modelling. An analysis of the numerical accuracy as a function of the number of degrees of freedom revealed that the adaptive wavelet algorithm outperforms the finite element solver for simple and moderately complex models, whereas the results become comparable for models with high spatial variability of electrical conductivities. The linear dependence of the modelling error and the computing time proved to be model-independent. This feature will allow very efficient computations using large-scale models as soon as our experimental code is optimized in terms of its implementation.
Making automated computer program documentation a feature of total system design
NASA Technical Reports Server (NTRS)
Wolf, A. W.
1970-01-01
It is pointed out that in large-scale computer software systems, program documents are too often fraught with errors, out of date, poorly written, and sometimes nonexistent in whole or in part. The means are described by which many of these typical system documentation problems were overcome in a large and dynamic software project. A systems approach was employed which encompassed such items as: (1) configuration management; (2) standards and conventions; (3) collection of program information into central data banks; (4) interaction among executive, compiler, central data banks, and configuration management; and (5) automatic documentation. A complete description of the overall system is given.
Stream computing for biomedical signal processing: A QRS complex detection case-study.
Murphy, B M; O'Driscoll, C; Boylan, G B; Lightbody, G; Marnane, W P
2015-01-01
Recent developments in "Big Data" have brought significant gains in the ability to process large amounts of data on commodity server hardware. Stream computing is a relatively new paradigm in this area, addressing the need to process data in real time with very low latency. While this approach has been developed for dealing with large scale data from the world of business, security and finance, there is a natural overlap with clinical needs for physiological signal processing. In this work we present a case study of streams processing applied to a typical physiological signal processing problem: QRS detection from ECG data.
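A hedged sketch of a sample-by-sample QRS detector in the spirit of this case study: a simplified Pan-Tompkins-style chain (derivative, squaring, moving-window integration, adaptive threshold) applied to a synthetic ECG-like signal. The thresholds, window lengths, and test signal are assumptions, not the authors' streams implementation.

```python
import numpy as np

def stream_qrs(samples, fs=250):
    """Very simplified streaming QRS detector: derivative -> squaring ->
    moving-window integration -> adaptive threshold. Illustrative only."""
    win = int(0.15 * fs)                       # 150 ms integration window
    refractory = int(0.25 * fs)                # 250 ms lock-out after a beat
    buf = np.zeros(win)
    prev, threshold, last_beat = 0.0, 0.0, -refractory
    for i, x in enumerate(samples):            # one sample at a time, low latency
        deriv = x - prev
        prev = x
        buf[i % win] = deriv * deriv           # squared slope
        feature = buf.mean()                   # moving-window integration
        threshold = 0.995 * threshold + 0.005 * feature   # slow running estimate
        if feature > 2.5 * threshold and i - last_beat > refractory:
            last_beat = i
            yield i                            # emit a detection event downstream

# Synthetic ECG-like test signal: baseline noise plus sharp spikes once per second.
fs, seconds = 250, 10
ecg = 0.05 * np.random.default_rng(3).standard_normal(fs * seconds)
ecg[::fs] += 1.5
beats = list(stream_qrs(ecg, fs))
print(f"detected {len(beats)} beats, first few at samples {beats[:4]}")
```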
CELES: CUDA-accelerated simulation of electromagnetic scattering by large ensembles of spheres
NASA Astrophysics Data System (ADS)
Egel, Amos; Pattelli, Lorenzo; Mazzamuto, Giacomo; Wiersma, Diederik S.; Lemmer, Uli
2017-09-01
CELES is a freely available MATLAB toolbox to simulate light scattering by many spherical particles. Aiming at high computational performance, CELES leverages block-diagonal preconditioning, a lookup-table approach to evaluate costly functions and massively parallel execution on NVIDIA graphics processing units using the CUDA computing platform. The combination of these techniques makes it possible to efficiently address large electrodynamic problems (>10^4 scatterers) on inexpensive consumer hardware. In this paper, we validate near- and far-field distributions against the well-established multi-sphere T-matrix (MSTM) code and discuss the convergence behavior for ensembles of different sizes, including an exemplary system comprising 10^5 particles.
NASA Astrophysics Data System (ADS)
Puckett, E. G.; Turcotte, D. L.; He, Y.; Lokavarapu, H. V.; Robey, J.; Kellogg, L. H.
2017-12-01
Geochemical observations of mantle-derived rocks favor a nearly homogeneous upper mantle, the source of mid-ocean ridge basalts (MORB), and heterogeneous lower mantle regions. Plumes that generate ocean island basalts are thought to sample the lower mantle regions and exhibit more heterogeneity than MORB. These regions have been associated with lower mantle structures known as large low shear velocity provinces below Africa and the South Pacific. The isolation of these regions is attributed to compositional differences and density stratification that, consequently, have been the subject of computational and laboratory modeling designed to determine the parameter regime in which layering is stable and to understand how layering evolves. Mathematical models of persistent compositional interfaces in the Earth's mantle may be inherently unstable, at least in some regions of the parameter space relevant to the mantle. Computing approximations to solutions of such problems presents severe challenges, even to state-of-the-art numerical methods. Some numerical algorithms for modeling the interface between distinct compositions smear the interface at the boundary between compositions, such as methods that add numerical diffusion or `artificial viscosity' in order to stabilize the algorithm. We present two new algorithms for maintaining high-resolution and sharp computational boundaries in computations of these types of problems: a discontinuous Galerkin method with a bound preserving limiter and a Volume-of-Fluid interface tracking algorithm. We compare these new methods with two approaches widely used for modeling the advection of two distinct thermally driven compositional fields in mantle convection computations: a high-order accurate finite element advection algorithm with entropy viscosity and a particle method. We compare the performance of these four algorithms on three problems, including computing an approximation to the solution of an initially compositionally stratified fluid at Ra = 10^5 with buoyancy numbers B that vary from no stratification at B = 0 to stratified flow at large B.
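A 1-D finite-volume sketch of why a bound-preserving treatment keeps a composition field within its physical bounds, using a minmod-limited scheme versus an unlimited second-order scheme. This illustrates the general idea only; it is not the discontinuous Galerkin or Volume-of-Fluid algorithms of the paper, and the grid, Courant number, and initial blob are assumptions.

```python
import numpy as np

def minmod(a, b):
    return np.where(a * b > 0.0, np.sign(a) * np.minimum(np.abs(a), np.abs(b)), 0.0)

def advect(C, nu, steps, limited=True):
    """Advect a composition field with velocity u > 0 on a periodic 1-D grid;
    nu = u*dt/dx <= 1. With the minmod limiter the update obeys a maximum
    principle, so C stays within its initial bounds; without it (an unlimited
    second-order slope) the field over- and undershoots near the interface."""
    C = C.copy()
    for _ in range(steps):
        dL = C - np.roll(C, 1)                  # C_i - C_{i-1}
        dR = np.roll(C, -1) - C                 # C_{i+1} - C_i
        slope = minmod(dL, dR) if limited else 0.5 * (dL + dR)
        face = C + 0.5 * (1.0 - nu) * slope     # upwind face value at i+1/2
        C = C - nu * (face - np.roll(face, 1))  # conservative update
    return C

n = 200
x = np.linspace(0.0, 1.0, n, endpoint=False)
C0 = (np.abs(x - 0.3) < 0.1).astype(float)      # sharp blob of composition 1
nu, steps = 0.5, 400

C_lim = advect(C0, nu, steps, limited=True)
C_unl = advect(C0, nu, steps, limited=False)
print("limited   min/max:", C_lim.min(), C_lim.max())   # stays within [0, 1]
print("unlimited min/max:", C_unl.min(), C_unl.max())   # violates the bounds
```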
Computational Challenges in the Analysis of Petrophysics Using Microtomography and Upscaling
NASA Astrophysics Data System (ADS)
Liu, J.; Pereira, G.; Freij-Ayoub, R.; Regenauer-Lieb, K.
2014-12-01
Microtomography provides detailed 3D internal structures of rocks in micro- to tens of nano-meter resolution and is quickly turning into a new technology for studying petrophysical properties of materials. An important step is the upscaling of these properties as micron or sub-micron resolution can only be done on the sample-scale of millimeters or even less than a millimeter. We present here a recently developed computational workflow for the analysis of microstructures including the upscaling of material properties. Computations of properties are first performed using conventional material science simulations at micro to nano-scale. The subsequent upscaling of these properties is done by a novel renormalization procedure based on percolation theory. We have tested the workflow using different rock samples, biological and food science materials. We have also applied the technique on high-resolution time-lapse synchrotron CT scans. In this contribution we focus on the computational challenges that arise from the big data problem of analyzing petrophysical properties and its subsequent upscaling. We discuss the following challenges: 1) Characterization of microtomography for extremely large data sets - our current capability. 2) Computational fluid dynamics simulations at pore-scale for permeability estimation - methods, computing cost and accuracy. 3) Solid mechanical computations at pore-scale for estimating elasto-plastic properties - computational stability, cost, and efficiency. 4) Extracting critical exponents from derivative models for scaling laws - models, finite element meshing, and accuracy. Significant progress in each of these challenges is necessary to transform microtomography from the current research problem into a robust computational big data tool for multi-scale scientific and engineering problems.
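A toy sketch of renormalization-style upscaling on a binary pore/solid grid: 2x2 blocks are repeatedly coarse-grained with a simple left-to-right percolation rule, and the conducting fraction either dies out or saturates depending on porosity. The rule and grid sizes are assumptions for illustration, not the authors' renormalization procedure.

```python
import numpy as np

def renormalize_once(grid):
    """Coarse-grain a binary (1 = pore/conducting, 0 = solid) grid by 2x2 blocks.
    A block becomes conducting if it percolates from its left edge to its right
    edge, which for a 2x2 block means its top row or its bottom row is fully
    conducting. Simple percolation-style rule, purely illustrative."""
    tl = grid[0::2, 0::2]
    tr = grid[0::2, 1::2]
    bl = grid[1::2, 0::2]
    br = grid[1::2, 1::2]
    return ((tl & tr) | (bl & br)).astype(int)

rng = np.random.default_rng(4)
for porosity in (0.4, 0.6, 0.8):
    g = (rng.random((256, 256)) < porosity).astype(int)
    fractions = []
    while g.shape[0] > 1:                   # upscale 256 -> 128 -> ... -> 1
        g = renormalize_once(g)
        fractions.append(g.mean())
    print(f"porosity {porosity:.1f}: conducting fraction per level",
          np.round(fractions, 2))
```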
Toward a virtual building laboratory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klems, J.H.; Finlayson, E.U.; Olsen, T.H.
1999-03-01
In order to achieve in a timely manner the large energy and dollar savings technically possible through improvements in building energy efficiency, it will be necessary to solve the problem of design failure risk. The most economical method of doing this would be to learn to calculate building performance with sufficient detail, accuracy and reliability to avoid design failure. Existing building simulation models (BSM) are a large step in this direction, but are still not capable of this level of modeling. Developments in computational fluid dynamics (CFD) techniques now allow one to construct a road map from present BSM's tomore » a complete building physical model. The most useful first step is a building interior model (BIM) that would allow prediction of local conditions affecting occupant health and comfort. To provide reliable prediction a BIM must incorporate the correct physical boundary conditions on a building interior. Doing so raises a number of specific technical problems and research questions. The solution of these within a context useful for building research and design is not likely to result from other research on CFD, which is directed toward the solution of different types of problems. A six-step plan for incorporating the correct boundary conditions within the context of the model problem of a large atrium has been outlined. A promising strategy for constructing a BIM is the overset grid technique for representing a building space in a CFD calculation. This technique promises to adapt well to building design and allows a step-by-step approach. A state-of-the-art CFD computer code using this technique has been adapted to the problem and can form the departure point for this research.« less
Fast, Nonlinear, Fully Probabilistic Inversion of Large Geophysical Problems
NASA Astrophysics Data System (ADS)
Curtis, A.; Shahraeeni, M.; Trampert, J.; Meier, U.; Cho, G.
2010-12-01
Almost all Geophysical inverse problems are in reality nonlinear. Fully nonlinear inversion including non-approximated physics, and solving for probability distribution functions (pdf’s) that describe the solution uncertainty, generally requires sampling-based Monte-Carlo style methods that are computationally intractable in most large problems. In order to solve such problems, physical relationships are usually linearized leading to efficiently-solved, (possibly iterated) linear inverse problems. However, it is well known that linearization can lead to erroneous solutions, and in particular to overly optimistic uncertainty estimates. What is needed across many Geophysical disciplines is a method to invert large inverse problems (or potentially tens of thousands of small inverse problems) fully probabilistically and without linearization. This talk shows how very large nonlinear inverse problems can be solved fully probabilistically and incorporating any available prior information using mixture density networks (driven by neural network banks), provided the problem can be decomposed into many small inverse problems. In this talk I will explain the methodology, compare multi-dimensional pdf inversion results to full Monte Carlo solutions, and illustrate the method with two applications: first, inverting surface wave group and phase velocities for a fully-probabilistic global tomography model of the Earth’s crust and mantle, and second inverting industrial 3D seismic data for petrophysical properties throughout and around a subsurface hydrocarbon reservoir. The latter problem is typically decomposed into 10^4 to 10^5 individual inverse problems, each solved fully probabilistically and without linearization. The results in both cases are sufficiently close to the Monte Carlo solution to exhibit realistic uncertainty, multimodality and bias. This provides far greater confidence in the results, and in decisions made on their basis.
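The mixture density network itself is not implemented here; the sketch below only shows the final step the abstract relies on: the network's outputs for one small inverse problem (mixture weights, means, and standard deviations, all hypothetical numbers) define a full posterior pdf from which statistics such as the mean and mode can be read off.

```python
import numpy as np

def mixture_pdf(x, weights, means, sigmas):
    """Posterior pdf represented as a 1-D Gaussian mixture, as produced per data
    point by a mixture density network. All inputs here are illustrative."""
    x = np.atleast_1d(x)[:, None]
    w, mu, s = (np.asarray(v, dtype=float)[None, :] for v in (weights, means, sigmas))
    comp = w * np.exp(-0.5 * ((x - mu) / s) ** 2) / (np.sqrt(2.0 * np.pi) * s)
    return comp.sum(axis=1)

# Hypothetical MDN outputs for one voxel of a reservoir model: a bimodal
# posterior over porosity, reflecting genuine non-uniqueness of the inversion.
weights, means, sigmas = [0.65, 0.35], [0.12, 0.27], [0.02, 0.03]

grid = np.linspace(0.0, 0.4, 801)
dx = grid[1] - grid[0]
pdf = mixture_pdf(grid, weights, means, sigmas)
pdf /= pdf.sum() * dx                           # numerical normalization
mean = (grid * pdf).sum() * dx
mode = grid[np.argmax(pdf)]
print(f"posterior mean {mean:.3f}, mode {mode:.3f} (bimodal, so mean != mode)")
```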
Recognition Using Hybrid Classifiers.
Osadchy, Margarita; Keren, Daniel; Raviv, Dolev
2016-04-01
A canonical problem in computer vision is category recognition (e.g., find all instances of human faces, cars etc., in an image). Typically, the input for training a binary classifier is a relatively small sample of positive examples, and a huge sample of negative examples, which can be very diverse, consisting of images from a large number of categories. The difficulty of the problem sharply increases with the dimension and size of the negative example set. We propose to alleviate this problem by applying a "hybrid" classifier, which replaces the negative samples by a prior, and then finds a hyperplane which separates the positive samples from this prior. The method is extended to kernel space and to an ensemble-based approach. The resulting binary classifiers achieve an identical or better classification rate than SVM, while requiring far smaller memory and lower computational complexity to train and apply.
Microcomputers in the Anesthesia Library.
ERIC Educational Resources Information Center
Wright, A. J.
The combination of computer technology and library operation is helping to alleviate such library problems as escalating costs, increasing collection size, deteriorating materials, unwieldy arrangement schemes, poor subject control, and the acquisition and processing of large numbers of rarely used documents. Small special libraries such as…
Concurrent electromagnetic scattering analysis
NASA Technical Reports Server (NTRS)
Patterson, Jean E.; Cwik, Tom; Ferraro, Robert D.; Jacobi, Nathan; Liewer, Paulett C.; Lockhart, Thomas G.; Lyzenga, Gregory A.; Parker, Jay
1989-01-01
The computational power of the hypercube parallel computing architecture is applied to the solution of large-scale electromagnetic scattering and radiation problems. Three analysis codes have been implemented. A Hypercube Electromagnetic Interactive Analysis Workstation was developed to aid in the design and analysis of metallic structures such as antennas and to facilitate the use of these analysis codes. The workstation provides a general user environment for specification of the structure to be analyzed and graphical representations of the results.
Application of artificial neural networks to gaming
NASA Astrophysics Data System (ADS)
Baba, Norio; Kita, Tomio; Oda, Kazuhiro
1995-04-01
Recently, neural network technology has been applied to various actual problems. It has succeeded in producing a large number of intelligent systems. In this article, we suggest that it could be applied to the field of gaming. In particular, we suggest that the neural network model could be used to mimic players' characters. Several computer simulation results using a computer gaming system which is a modified version of the COMMONS GAME confirm our idea.
Computing Systems Configuration for Highly Integrated Guidance and Control Systems
1988-06-01
conmmunication ear lea imlustrielaiservenant dais an projet. Cela eat renda , possible entre auies par l’adoption dene mibodologie do travai coammune, par...computed graph results to data processors for post processing, or commnicating with system I/O modules. The ESU PI- Bus interface logic includes extra ...the extra constraint checking helps to find more problems at compile time), and it is especially well- suited for large software systems written by a
Demonstration of quantum advantage in machine learning
NASA Astrophysics Data System (ADS)
Ristè, Diego; da Silva, Marcus P.; Ryan, Colm A.; Cross, Andrew W.; Córcoles, Antonio D.; Smolin, John A.; Gambetta, Jay M.; Chow, Jerry M.; Johnson, Blake R.
2017-04-01
The main promise of quantum computing is to efficiently solve certain problems that are prohibitively expensive for a classical computer. Most problems with a proven quantum advantage involve the repeated use of a black box, or oracle, whose structure encodes the solution. One measure of the algorithmic performance is the query complexity, i.e., the scaling of the number of oracle calls needed to find the solution with a given probability. Few-qubit demonstrations of quantum algorithms, such as Deutsch-Jozsa and Grover, have been implemented across diverse physical systems such as nuclear magnetic resonance, trapped ions, optical systems, and superconducting circuits. However, at the small scale, these problems can already be solved classically with a few oracle queries, limiting the obtained advantage. Here we solve an oracle-based problem, known as learning parity with noise, on a five-qubit superconducting processor. Executing classical and quantum algorithms using the same oracle, we observe a large gap in query count in favor of quantum processing. We find that this gap grows by orders of magnitude as a function of the error rates and the problem size. This result demonstrates that, while complex fault-tolerant architectures will be required for universal quantum computing, a significant quantum advantage already emerges in existing noisy systems.
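For intuition about the classical side of the comparison, here is a hedged sketch of the learning-parity-with-noise problem at the 5-bit scale: a noisy oracle is queried and the secret is recovered by brute-force consistency checking, showing how the success rate for a fixed query budget degrades as the noise rate grows. The query counts and noise levels are illustrative, not those of the experiment.

```python
import numpy as np

rng = np.random.default_rng(5)

def make_oracle(secret, noise):
    """LPN oracle: returns (a, <a, secret> mod 2, possibly flipped by noise)."""
    n = len(secret)
    def query():
        a = rng.integers(0, 2, n)
        return a, (int(a @ secret) % 2) ^ int(rng.random() < noise)
    return query

def learn_parity(n, query, n_queries):
    """Brute force over all 2^n candidate secrets; keep the most consistent one.
    Feasible only for small n, such as the 5-qubit regime discussed above."""
    data = [query() for _ in range(n_queries)]
    best, best_score = None, -1
    for s in range(2 ** n):
        cand = np.array([(s >> i) & 1 for i in range(n)])
        score = sum(((int(a @ cand) % 2) == y) for a, y in data)
        if score > best_score:
            best, best_score = cand, score
    return best

n = 5
secret = rng.integers(0, 2, n)
for noise in (0.0, 0.1, 0.3):
    hits = sum(np.array_equal(learn_parity(n, make_oracle(secret, noise), 30), secret)
               for _ in range(20))
    print(f"noise {noise:.1f}: secret recovered in {hits}/20 trials with 30 queries")
```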
WPS mediation: An approach to process geospatial data on different computing backends
NASA Astrophysics Data System (ADS)
Giuliani, Gregory; Nativi, Stefano; Lehmann, Anthony; Ray, Nicolas
2012-10-01
The OGC Web Processing Service (WPS) specification allows generating information by processing distributed geospatial data made available through Spatial Data Infrastructures (SDIs). However, current SDIs have limited analytical capacities, and various problems emerge when trying to use them in data- and computing-intensive domains such as environmental sciences. These problems are usually not or only partially solvable using single computing resources. Therefore, the Geographic Information (GI) community is trying to benefit from the superior storage and computing capabilities offered by distributed computing methods and technologies (e.g., Grids, Clouds). Currently, there is no commonly agreed approach to grid-enable WPS. No implementation allows one to seamlessly execute a geoprocessing calculation following user requirements on different computing backends, ranging from a stand-alone GIS server up to computer clusters and large Grid infrastructures. Considering this issue, this paper presents a proof of concept by mediating different geospatial and Grid software packages, and by proposing an extension of the WPS specification through two optional parameters. The applicability of this approach will be demonstrated using a Normalized Difference Vegetation Index (NDVI) mediated WPS process, highlighting benefits and issues that need to be further investigated to improve performance.
Vecharynski, Eugene; Yang, Chao; Pask, John E.
2015-02-25
Here, we present an iterative algorithm for computing an invariant subspace associated with the algebraically smallest eigenvalues of a large sparse or structured Hermitian matrix A. We are interested in the case in which the dimension of the invariant subspace is large (e.g., over several hundreds or thousands) even though it may still be small relative to the dimension of A. These problems arise from, for example, density functional theory (DFT) based electronic structure calculations for complex materials. The key feature of our algorithm is that it performs fewer Rayleigh–Ritz calculations compared to existing algorithms such as the locally optimal block preconditioned conjugate gradient or the Davidson algorithm. It is a block algorithm, and hence can take advantage of efficient BLAS3 operations and be implemented with multiple levels of concurrency. We discuss a number of practical issues that must be addressed in order to implement the algorithm efficiently on a high performance computer.
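The authors' algorithm is not reproduced here; as a point of reference, the sketch below invokes the baseline they compare against, LOBPCG (available in SciPy), on an assumed sparse Hermitian test operator with a large block of starting vectors and a simple diagonal preconditioner.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lobpcg

rng = np.random.default_rng(7)

# Sparse symmetric test operator; we want a fairly large invariant subspace at
# the low end of its spectrum (here the 100 algebraically smallest eigenpairs).
n, k = 2000, 100
d = np.concatenate([np.linspace(1.0, 2.0, k), np.linspace(10.0, 100.0, n - k)])
P = sp.random(n, n, density=5e-4, random_state=7)
A = sp.diags(d) + 0.01 * (P + P.T)              # small symmetric perturbation

X = rng.standard_normal((n, k))                 # random starting block
M = sp.diags(1.0 / d)                           # simple Jacobi-style preconditioner

vals, vecs = lobpcg(A, X, M=M, tol=1e-5, maxiter=200, largest=False)
print("ten smallest Ritz values:", np.round(np.sort(vals)[:10], 4))
```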
MDTS: automatic complex materials design using Monte Carlo tree search.
M Dieb, Thaer; Ju, Shenghong; Yoshizoe, Kazuki; Hou, Zhufeng; Shiomi, Junichiro; Tsuda, Koji
2017-01-01
Complex materials design is often represented as a black-box combinatorial optimization problem. In this paper, we present a novel Python library called MDTS (Materials Design using Tree Search). Our algorithm employs a Monte Carlo tree search approach, which has shown exceptional performance in the game of computer Go. Unlike evolutionary algorithms that require user intervention to set parameters appropriately, MDTS has no tuning parameters and works autonomously in various problems. In comparison to a Bayesian optimization package, our algorithm showed competitive search efficiency and superior scalability. We succeeded in designing large Silicon-Germanium (Si-Ge) alloy structures that Bayesian optimization could not deal with due to excessive computational cost. MDTS is available at https://github.com/tsudalab/MDTS.
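A compact, self-contained Monte Carlo tree search sketch in the spirit of the library described: sites of a toy 1-D Si-Ge chain are assigned one at a time, rollouts are scored with a stand-in objective, and UCB guides the search. The objective, chain length, and budget are assumptions; this is not the MDTS code.

```python
import math
import random

random.seed(8)
L = 12                                     # number of lattice sites (Si=0, Ge=1)

def score(structure):
    """Toy surrogate objective standing in for an expensive property
    calculation: reward Si/Ge interfaces (purely illustrative)."""
    return sum(structure[i] != structure[i + 1] for i in range(L - 1))

class Node:
    def __init__(self, prefix):
        self.prefix = prefix               # partial assignment of site types
        self.children = {}                 # choice (0/1) -> Node
        self.visits = 0
        self.value = 0.0                   # accumulated rollout scores

def rollout(prefix):
    s = prefix + [random.randint(0, 1) for _ in range(L - len(prefix))]
    return score(s), s

def mcts(budget=2000, c=1.4):
    root, best = Node([]), (-1, None)
    for _ in range(budget):
        node, path = root, [root]
        # Selection: descend while the node is fully expanded and not terminal.
        while len(node.prefix) < L and len(node.children) == 2:
            node = max(node.children.values(),
                       key=lambda ch: ch.value / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
            path.append(node)
        # Expansion: add one untried child if the node is not terminal.
        if len(node.prefix) < L:
            choice = 0 if 0 not in node.children else 1
            child = Node(node.prefix + [choice])
            node.children[choice] = child
            node, path = child, path + [child]
        # Simulation and backpropagation.
        value, structure = rollout(node.prefix)
        if value > best[0]:
            best = (value, structure)
        for n in path:
            n.visits += 1
            n.value += value
    return best

value, structure = mcts()
print("best score", value, "structure", structure)
```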
A Thick-Restart Lanczos Algorithm with Polynomial Filtering for Hermitian Eigenvalue Problems
Li, Ruipeng; Xi, Yuanzhe; Vecharynski, Eugene; ...
2016-08-16
Polynomial filtering can provide a highly effective means of computing all eigenvalues of a real symmetric (or complex Hermitian) matrix that are located in a given interval, anywhere in the spectrum. This paper describes a technique for tackling this problem by combining a thick-restart version of the Lanczos algorithm with deflation ("locking'') and a new type of polynomial filter obtained from a least-squares technique. Furthermore, the resulting algorithm can be utilized in a “spectrum-slicing” approach whereby a very large number of eigenvalues and associated eigenvectors of the matrix are computed by extracting eigenpairs located in different subintervals independently from one another.
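The least-squares filter of the paper is not reproduced here; the sketch applies the closely related Chebyshev filter (a three-term recurrence in the matrix) that damps an unwanted part of the spectrum and amplifies the target slice, followed by one Rayleigh-Ritz step. The test matrix, spectrum bounds, and polynomial degree are assumptions chosen so the example converges in a single pass.

```python
import numpy as np

def chebyshev_filter(A, X, degree, a, b):
    """Three-term recurrence applying a Chebyshev polynomial of A to the block X.
    The interval [a, b] (unwanted part of the spectrum) is mapped to [-1, 1] and
    damped; eigenvalues below a are amplified exponentially with the degree."""
    center, half = 0.5 * (a + b), 0.5 * (b - a)
    T_prev, T = X, (A @ X - center * X) / half
    for _ in range(2, degree + 1):
        T_prev, T = T, 2.0 * (A @ T - center * T) / half - T_prev
    return T

# Symmetric test matrix with 8 eigenvalues in the target slice [0, 1], well
# separated from the rest of the spectrum (illustrative construction).
rng = np.random.default_rng(9)
n = 500
d = np.concatenate([np.linspace(0.0, 1.0, 8), np.linspace(5.0, 50.0, n - 8)])
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(d) @ Q.T

cut, upper = 2.0, 50.5          # slice boundary and an upper bound on the spectrum
V = chebyshev_filter(A, rng.standard_normal((n, 10)), degree=60, a=cut, b=upper)
V, _ = np.linalg.qr(V)          # re-orthonormalize the filtered block

theta, U = np.linalg.eigh(V.T @ A @ V)        # Rayleigh-Ritz on the filtered block
print("Ritz values in the slice:", np.round(theta[theta < cut], 5))
print("exact eigenvalues there: ", np.round(np.sort(d)[:8], 5))
```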
Artificial Intelligence In Computational Fluid Dynamics
NASA Technical Reports Server (NTRS)
Vogel, Alison Andrews
1991-01-01
Paper compares four first-generation artificial-intelligence (AI) software systems for computational fluid dynamics. Includes: Expert Cooling Fan Design System (EXFAN), PAN AIR Knowledge System (PAKS), grid-adaptation program MITOSIS, and Expert Zonal Grid Generation (EZGrid). Focuses on knowledge-based ("expert") software systems. Analyzes intended tasks, kinds of knowledge possessed, magnitude of effort required to codify knowledge, how quickly constructed, performances, and return on investment. On basis of comparison, concludes AI most successful when applied to well-formulated problems solved by classifying or selecting preenumerated solutions. In contrast, application of AI to poorly understood or poorly formulated problems generally results in long development time and large investment of effort, with no guarantee of success.
ARPA surveillance technology for detection of targets hidden in foliage
NASA Astrophysics Data System (ADS)
Hoff, Lawrence E.; Stotts, Larry B.
1994-02-01
The processing of large quantities of synthetic aperture radar data in real time is a complex problem. Even the image formation process taxes today's most advanced computers. The use of complex algorithms with multiple channels adds another dimension to the computational problem. The Advanced Research Projects Agency (ARPA) currently plans to use the Paragon parallel processor for this task. The Paragon is small enough to allow its use in a sensor aircraft. Candidate algorithms will be implemented on the Paragon to evaluate their suitability for real-time processing. In this paper, ARPA technology developments for detecting targets hidden in foliage are reviewed and examples of signal processing techniques applied to field-collected data are presented.
Reduced-Order Modeling: New Approaches for Computational Physics
NASA Technical Reports Server (NTRS)
Beran, Philip S.; Silva, Walter A.
2001-01-01
In this paper, we review the development of new reduced-order modeling techniques and discuss their applicability to various problems in computational physics. Emphasis is given to methods based on Volterra series representations and the proper orthogonal decomposition. Results are reported for different nonlinear systems to provide clear examples of the construction and use of reduced-order models, particularly in the multi-disciplinary field of computational aeroelasticity. Unsteady aerodynamic and aeroelastic behaviors of two-dimensional and three-dimensional geometries are described. Large increases in computational efficiency are obtained through the use of reduced-order models, thereby justifying the initial computational expense of constructing these models and motivating their use for multi-disciplinary design analysis.
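As a generic illustration of one technique named above, the sketch below builds a proper orthogonal decomposition (POD) basis from snapshots of a toy full-order system and projects the dynamics onto the leading modes via a Galerkin projection. The 1-D diffusion model, the mode count, and the time-stepping parameters are illustrative assumptions unrelated to the aeroelastic applications of the paper.

```python
# Minimal POD/Galerkin reduced-order-model sketch on a toy linear system.
import numpy as np

n, r, steps, dt = 400, 10, 200, 1e-3
# Full-order model x' = A x: a scaled discrete 1-D diffusion operator
# (the 1e-3 diffusivity keeps explicit Euler stable at this step size).
A = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) * (n + 1) ** 2 * 1e-3

x0 = np.exp(-((np.linspace(0.0, 1.0, n) - 0.5) ** 2) / 0.01)  # initial bump
x, snapshots = x0.copy(), [x0.copy()]
for _ in range(steps):                      # explicit Euler time stepping
    x = x + dt * (A @ x)
    snapshots.append(x.copy())

# POD basis: leading left singular vectors of the snapshot matrix.
U, s, _ = np.linalg.svd(np.column_stack(snapshots), full_matrices=False)
Phi = U[:, :r]                              # r dominant spatial modes

# Galerkin projection onto the POD basis gives an r x r reduced model.
Ar = Phi.T @ A @ Phi
xr = Phi.T @ x0
for _ in range(steps):                      # time stepping in the reduced space
    xr = xr + dt * (Ar @ xr)

# Relative error of the reconstructed final state vs. the full-order result.
print(np.linalg.norm(Phi @ xr - snapshots[-1]) / np.linalg.norm(snapshots[-1]))
```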
SIMS: A Hybrid Method for Rapid Conformational Analysis
Gipson, Bryant; Moll, Mark; Kavraki, Lydia E.
2013-01-01
Proteins are at the root of many biological functions, often performing complex tasks as the result of large changes in their structure. Describing the exact details of these conformational changes, however, remains a central challenge for computational biology due to the enormous computational requirements of the problem. This has engendered the development of a rich variety of useful methods designed to answer specific questions at different levels of spatial, temporal, and energetic resolution. These methods fall largely into two classes: physically accurate, but computationally demanding methods, and fast, approximate methods. We introduce here a new hybrid modeling tool, the Structured Intuitive Move Selector (SIMS), designed to bridge the divide between these two classes while allowing the benefits of both to be seamlessly integrated into a single framework. This is achieved by applying a modern motion planning algorithm, borrowed from the field of robotics, in tandem with a well-established protein modeling library. SIMS can combine precise energy calculations with approximate or specialized conformational sampling routines to produce rapid, yet accurate, analysis of the large-scale conformational variability of protein systems. Several key advancements are shown, including the abstract use of generically defined moves (conformational sampling methods) and an expansive probabilistic conformational exploration. We present three example problems to which SIMS is applied and demonstrate a rapid solution for each. These include the automatic determination of “active” residues for the hinge-based system Cyanovirin-N, exploring conformational changes involving long-range coordinated motion between non-sequential residues in Ribose-Binding Protein, and the rapid discovery of a transient conformational state of Maltose-Binding Protein, previously only determined by Molecular Dynamics. For all cases we provide energetic validations using well-established energy fields, demonstrating this framework as a fast and accurate tool for the analysis of a wide range of protein flexibility problems. PMID:23935893
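The toy sketch below is only meant to illustrate the general "abstract moves plus explicit energy evaluation" architecture described above. It is not the SIMS code or its API: a simple Metropolis-style acceptance loop stands in for the motion-planning exploration, and the conformation representation, moves, and energy function are made-up placeholders.

```python
# Illustrative "generic move + energy check" loop (not the SIMS API).
import math
import random
from typing import Callable, List

Conformation = List[float]                 # e.g., a vector of dihedral angles

def energy(conf: Conformation) -> float:   # hypothetical scoring function
    return sum(c * c for c in conf)

def gaussian_move(conf: Conformation) -> Conformation:
    """One generically defined move: perturb a single randomly chosen angle."""
    i = random.randrange(len(conf))
    out = conf.copy()
    out[i] += random.gauss(0.0, 0.3)
    return out

def explore(start: Conformation,
            moves: List[Callable[[Conformation], Conformation]],
            steps: int = 5000, beta: float = 1.0) -> Conformation:
    """Probabilistic exploration: apply abstract moves and accept or reject
    each candidate by a Metropolis criterion on the energy evaluation."""
    current, e_current = start, energy(start)
    for _ in range(steps):
        candidate = random.choice(moves)(current)
        e_candidate = energy(candidate)
        if (e_candidate < e_current
                or random.random() < math.exp(-beta * (e_candidate - e_current))):
            current, e_current = candidate, e_candidate
    return current

print(explore([1.0] * 10, [gaussian_move])[:3])   # a few refined coordinates
```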