performance computing modernization: Topics by Science.gov

Sample records for performance computing modernization

Modernization and optimization of a legacy open-source CFD code for high-performance computing architectures

NASA Astrophysics Data System (ADS)

Gel, Aytekin; Hu, Jonathan; Ould-Ahmed-Vall, ElMoustapha; Kalinkin, Alexander A.

2017-02-01

Legacy codes remain a crucial element of today's simulation-based engineering ecosystem due to the extensive validation process and investment in such software. The rapid evolution of high-performance computing architectures necessitates the modernization of these codes. One approach to modernization is a complete overhaul of the code. However, this could require extensive investments, such as rewriting in modern languages, new data constructs, etc., which will necessitate systematic verification and validation to re-establish the credibility of the computational models. The current study advocates using a more incremental approach and is a culmination of several modernization efforts of the legacy code MFIX, an open-source computational fluid dynamics code that has evolved over several decades, is widely used in multiphase flows, and is still being developed by the National Energy Technology Laboratory. Two different modernization approaches, 'bottom-up' and 'top-down', are illustrated. Preliminary results show up to 8.5x improvement at the selected kernel level with the first approach and up to 50% improvement in total simulated time with the latter for the demonstration cases and target HPC systems employed.
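The gap between a kernel-level speedup and a whole-run improvement can be illustrated with Amdahl's law. The following is a minimal sketch, not the authors' analysis: the runtime fractions are hypothetical values chosen for illustration (the abstract does not report them), and `overall_speedup` is a name introduced here.

```python
def overall_speedup(kernel_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: whole-run speedup when only one kernel is accelerated.

    kernel_fraction is the share of the original runtime spent in the kernel;
    kernel_speedup is how many times faster that kernel becomes.
    """
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be in [0, 1]")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    return 1.0 / ((1.0 - kernel_fraction) + kernel_fraction / kernel_speedup)


# Hypothetical runtime shares for a kernel that is made 8.5x faster.
for fraction in (0.2, 0.5, 0.8):
    speedup = overall_speedup(fraction, 8.5)
    print(f"kernel share {fraction:.0%}: {speedup:.2f}x overall")
```

The sketch only shows why a large kernel-level gain need not carry over to the full simulation: the untouched part of the runtime bounds the total benefit.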
Modernization and optimization of a legacy open-source CFD code for high-performance computing architectures

DOE PAGES

Gel, Aytekin; Hu, Jonathan; Ould-Ahmed-Vall, ElMoustapha; ...

2017-03-20

Legacy codes remain a crucial element of today's simulation-based engineering ecosystem due to the extensive validation process and investment in such software. The rapid evolution of high-performance computing architectures necessitates the modernization of these codes. One approach to modernization is a complete overhaul of the code. However, this could require extensive investments, such as rewriting in modern languages, new data constructs, etc., which will necessitate systematic verification and validation to re-establish the credibility of the computational models. The current study advocates using a more incremental approach and is a culmination of several modernization efforts of the legacy code MFIX, an open-source computational fluid dynamics code that has evolved over several decades, is widely used in multiphase flows, and is still being developed by the National Energy Technology Laboratory. Two different modernization approaches, ‘bottom-up’ and ‘top-down’, are illustrated. Here, preliminary results show up to 8.5x improvement at the selected kernel level with the first approach and up to 50% improvement in total simulated time with the latter for the demonstration cases and target HPC systems employed.
Modernization and optimization of a legacy open-source CFD code for high-performance computing architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gel, Aytekin; Hu, Jonathan; Ould-Ahmed-Vall, ElMoustapha

Legacy codes remain a crucial element of today's simulation-based engineering ecosystem due to the extensive validation process and investment in such software. The rapid evolution of high-performance computing architectures necessitates the modernization of these codes. One approach to modernization is a complete overhaul of the code. However, this could require extensive investments, such as rewriting in modern languages, new data constructs, etc., which will necessitate systematic verification and validation to re-establish the credibility of the computational models. The current study advocates using a more incremental approach and is a culmination of several modernization efforts of the legacy code MFIX, an open-source computational fluid dynamics code that has evolved over several decades, is widely used in multiphase flows, and is still being developed by the National Energy Technology Laboratory. Two different modernization approaches, ‘bottom-up’ and ‘top-down’, are illustrated. Here, preliminary results show up to 8.5x improvement at the selected kernel level with the first approach and up to 50% improvement in total simulated time with the latter for the demonstration cases and target HPC systems employed.
Optimizing Engineering Tools Using Modern Ground Architectures

DTIC Science & Technology

2017-12-01

Considerations,” International Journal of Computer Science & Engineering Survey , vol. 5, no. 4, 2014. [10] R. Bell. (n.d). A beginner’s guide to big O notation...scientific community. Traditional computing architectures were not capable of processing the data efficiently, or in some cases, could not process the...thesis investigates how these modern computing architectures could be leveraged by industry and academia to improve the performance and capabilities of
Code Modernization of VPIC

NASA Astrophysics Data System (ADS)

Bird, Robert; Nystrom, David; Albright, Brian

2017-10-01

The ability of scientific simulations to effectively deliver performant computation is increasingly being challenged by successive generations of high-performance computing architectures. Code development to support efficient computation on these modern architectures is both expensive and highly complex; if it is approached without due care, it may also not be directly transferable between subsequent hardware generations. Previous works have discussed techniques to support the process of adapting a legacy code for modern hardware generations, but despite the breakthroughs in the areas of mini-app development, portable performance, and cache-oblivious algorithms, the problem still remains largely unsolved. In this work we demonstrate how a focus on platform-agnostic modern code development can be applied to Particle-in-Cell (PIC) simulations to facilitate effective scientific delivery. This work builds directly on our previous work optimizing VPIC, in which we replaced intrinsic-based vectorisation with compiler-generated auto-vectorization to improve the performance and portability of VPIC. In this work we present the use of a specialized SIMD queue for processing some particle operations, and also preview a GPU-capable OpenMP variant of VPIC. Finally, we include lessons learnt. Work performed under the auspices of the U.S. Dept. of Energy by the Los Alamos National Security, LLC Los Alamos National Laboratory under contract DE-AC52-06NA25396 and supported by the LANL LDRD program.
ESIF 2016: Modernizing Our Grid and Energy System

DOE Office of Scientific and Technical Information (OSTI.GOV)

Van Becelaere, Kimberly

This 2016 annual report highlights work conducted at the Energy Systems Integration Facility (ESIF) in FY 2016, including grid modernization, high-performance computing and visualization, and INTEGRATE projects.
The DoD's High Performance Computing Modernization Program - Ensuring the National Earth Systems Prediction Capability Becomes Operational

NASA Astrophysics Data System (ADS)

Burnett, W.

2016-12-01

The Department of Defense's (DoD) High Performance Computing Modernization Program (HPCMP) provides high performance computing to address the most significant challenges in computational resources, software application support and nationwide research and engineering networks. Today, the HPCMP has a critical role in ensuring the National Earth System Prediction Capability (N-ESPC) achieves initial operational status in 2019. A 2015 study commissioned by the HPCMP found that N-ESPC computational requirements will exceed interconnect bandwidth capacity due to the additional load from data assimilation and passing connecting data between ensemble codes. Memory bandwidth and I/O bandwidth will continue to be significant bottlenecks for the Navy's Hybrid Coordinate Ocean Model (HYCOM) scalability - by far the major driver of computing resource requirements in the N-ESPC. The study also found that few of the N-ESPC model developers have detailed plans to ensure their respective codes scale through 2024. Three HPCMP initiatives are designed to directly address and support these issues: Productivity Enhancement, Technology Transfer and Training (PETTT), the HPCMP Applications Software Initiative (HASI), and Frontier Projects. PETTT supports code conversion by providing assistance, expertise and training in scalable and high-end computing architectures. HASI addresses the continuing need for modern application software that executes effectively and efficiently on next-generation high-performance computers. Frontier Projects enable research and development that could not be achieved using typical HPCMP resources by providing multi-disciplinary teams access to exceptional amounts of high performance computing resources. Finally, the Navy's DoD Supercomputing Resource Center (DSRC) currently operates a 6 Petabyte system, of which Naval Oceanography receives 15% of operational computational system use, or approximately 1 Petabyte of the processing capability.
The DSRC will provide the DoD with future computing assets to initially operate the N-ESPC in 2019. This talk will further describe how DoD's HPCMP will ensure N-ESPC becomes operational, efficiently and effectively, using next-generation high performance computing.
Modern Computational Techniques for the HMMER Sequence Analysis

PubMed Central

2013-01-01

This paper focuses on the latest research and critical reviews on modern computing architectures and on software- and hardware-accelerated algorithms for bioinformatics data analysis, with an emphasis on one of the most important sequence analysis applications, hidden Markov models (HMMs). We show a detailed performance comparison of sequence analysis tools on various computing platforms recently developed in the bioinformatics community. The characteristics of sequence analysis, such as its data- and compute-intensive nature, make it very attractive to optimize and parallelize using both traditional software approaches and innovative hardware acceleration technologies. PMID:25937944
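To make the HMM workload concrete, here is a minimal sketch of Viterbi decoding on a toy two-state model. It is not HMMER's profile-HMM implementation: the state names, probabilities, and the `viterbi` function are assumptions made up for illustration. The nested loops over positions and states form the dynamic-programming core that accelerated implementations typically target.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_log_prob, best_path) for a discrete HMM, in log space."""
    # scores[s]: best log-probability of any state path ending in state s.
    scores = {
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }
    backpointers = []
    for symbol in observations[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Best predecessor for state s at this position.
            prev = max(states, key=lambda p: scores[p] + math.log(trans_p[p][s]))
            new_scores[s] = (
                scores[prev] + math.log(trans_p[prev][s]) + math.log(emit_p[s][symbol])
            )
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path back from the best final state.
    last = max(states, key=lambda s: scores[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return scores[last], path


# Toy model (hypothetical parameters): AT-rich versus GC-rich DNA regions.
states = ("AT_rich", "GC_rich")
start_p = {"AT_rich": 0.5, "GC_rich": 0.5}
trans_p = {
    "AT_rich": {"AT_rich": 0.9, "GC_rich": 0.1},
    "GC_rich": {"AT_rich": 0.1, "GC_rich": 0.9},
}
emit_p = {
    "AT_rich": {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1},
    "GC_rich": {"A": 0.1, "T": 0.1, "C": 0.4, "G": 0.4},
}

log_prob, path = viterbi("AATTGCGC", states, start_p, trans_p, emit_p)
print(path)
```

Each position depends only on the previous one, while the work across states (and across independent sequences) can run in parallel, which is one reason this class of algorithm maps well to SIMD units, GPUs, and FPGAs.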
Improvements in the efficiency of turboexpanders in cryogenic applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Agahi, R.R.; Lin, M.C.; Ershaghi, B.

1996-12-31

Process designers have utilized turboexpanders in cryogenic processes because of their higher thermal efficiencies when compared with conventional refrigeration cycles. Process design and equipment performance have improved substantially through the utilization of modern technologies. Turboexpander manufacturers have also adopted Computational Fluid Dynamic Software, Computer Numerical Control Technology and Holography Techniques to further improve an already impressive turboexpander efficiency performance. In this paper, the authors explain the design process of the turboexpander utilizing modern technology. Two cases of turboexpanders processing helium (4.35 °K) and hydrogen (56 °K) will be presented.
Case for a field-programmable gate array multicore hybrid machine for an image-processing application

NASA Astrophysics Data System (ADS)

Rakvic, Ryan N.; Ives, Robert W.; Lira, Javier; Molina, Carlos

2011-01-01

General purpose computer designers have recently begun adding cores to their processors in order to increase performance. For example, Intel has adopted a homogeneous quad-core processor as a base for general purpose computing. PlayStation3 (PS3) game consoles contain a multicore heterogeneous processor known as the Cell, which is designed to perform complex image processing algorithms at a high level. Can modern image-processing algorithms utilize these additional cores? On the other hand, modern advancements in configurable hardware, most notably field-programmable gate arrays (FPGAs) have created an interesting question for general purpose computer designers. Is there a reason to combine FPGAs with multicore processors to create an FPGA multicore hybrid general purpose computer? Iris matching, a repeatedly executed portion of a modern iris-recognition algorithm, is parallelized on an Intel-based homogeneous multicore Xeon system, a heterogeneous multicore Cell system, and an FPGA multicore hybrid system. Surprisingly, the cheaper PS3 slightly outperforms the Intel-based multicore on a core-for-core basis. However, both multicore systems are beaten by the FPGA multicore hybrid system by >50%.
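The abstract does not spell out the matching kernel. A common formulation of iris matching, assumed here, compares binary iris codes by fractional Hamming distance over the bits that both occlusion masks mark as valid. The sketch below uses hypothetical 8-bit codes and invented names (`fractional_hamming`, `best_match`), and a thread pool only to show that each comparison is independent.

```python
from concurrent.futures import ThreadPoolExecutor


def popcount(value: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(value).count("1")


def fractional_hamming(code_a: int, code_b: int, mask_a: int, mask_b: int) -> float:
    """Share of disagreeing bits among the bits valid in both masks."""
    valid = mask_a & mask_b
    n_valid = popcount(valid)
    if n_valid == 0:
        raise ValueError("no overlapping valid bits")
    return popcount((code_a ^ code_b) & valid) / n_valid


def best_match(probe, gallery, workers=4):
    """Return (distance, name) of the closest gallery entry.

    Every comparison is independent, so the map below could be spread
    across cores, Cell SPEs, or FPGA pipelines; threads only show the shape.
    """
    probe_code, probe_mask = probe

    def score(item):
        name, (code, mask) = item
        return fractional_hamming(probe_code, code, probe_mask, mask), name

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return min(pool.map(score, gallery.items()))


# Hypothetical 8-bit codes; real iris codes are far longer.
gallery = {
    "alice": (0b10110010, 0b11111111),
    "bob": (0b01001101, 0b11111111),
}
probe = (0b10110011, 0b11111110)  # lowest bit masked out as unreliable
print(best_match(probe, gallery))
```

Because the kernel is only XOR, AND, and bit counting over fixed-width words, it is a natural fit for wide bit-level hardware, which is consistent with the FPGA result reported above.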
Laboratory Mathematics

ERIC Educational Resources Information Center

de Mestre, Neville

2004-01-01

Computers were invented to help mathematicians perform long and complicated calculations more efficiently. By the time that a computing area became a familiar space in primary and secondary schools, the initial motivation for computer use had been submerged in the many other functions that modern computers now accomplish. Not only the mathematics…
High performance flight computer developed for deep space applications

NASA Technical Reports Server (NTRS)

Bunker, Robert L.

1993-01-01

The development of an advanced space flight computer for real time embedded deep space applications which embodies the lessons learned on Galileo and modern computer technology is described. The requirements are listed and the design implementation that meets those requirements is described. The development of SPACE-16 (Spaceborne Advanced Computing Engine) (where 16 designates the databus width) was initiated to support the MM2 (Mariner Mark II) project. The computer is based on a radiation hardened emulation of a modern 32 bit microprocessor and its family of support devices including a high performance floating point accelerator. Additional custom devices, which include a coprocessor to improve input/output capabilities, a memory interface chip, and an additional support chip that provides management of all fault tolerant features, are described. Detailed supporting analyses and rationale that justify specific design and architectural decisions are provided. The six chip types were designed and fabricated. Testing and evaluation of a brassboard was initiated.
Design Trade-off Between Performance and Fault-Tolerance of Space Onboard Computers

NASA Astrophysics Data System (ADS)

Gorbunov, M. S.; Antonov, A. A.

2017-01-01

It is well known that there is a trade-off between performance and power consumption in onboard computers. Fault tolerance is another important factor affecting performance, chip area and power consumption. Using special SRAM cells and error-correcting codes is often too expensive relative to the performance needed. We discuss the possibility of finding optimal solutions for modern onboard computers for scientific apparatus, focusing on multi-level cache memory design.
Performance of High-Reliability Space-Qualified Processors Implementing Software Defined Radios

DTIC Science & Technology

2014-03-01

ADDRESS(ES) AND ADDRESS(ES) Naval Postgraduate School, Department of Electrical and Computer Engineering, 833 Dyer Road, Monterey, CA 93943-5121 8...Chairman Jeffrey D. Paduan Electrical and Computer Engineering Dean of Research iii THIS PAGE...capability. Radiation in space poses a considerable threat to modern microelectronic devices, in particular to the high-performance low-cost computing
Oscillatory threshold logic.

PubMed

Borresen, Jon; Lynch, Stephen

2012-01-01

In the 1940s, the first generation of modern computers used vacuum tube oscillators as their principal components; however, with the development of the transistor, such oscillator-based computers quickly became obsolete. As the demand for faster and lower power computers continues, transistors are themselves approaching their theoretical limit and emerging technologies must eventually supersede them. With the development of optical oscillators and Josephson junction technology, we are again presented with the possibility of using oscillators as the basic components of computers, and it is possible that the next generation of computers will be composed almost entirely of oscillatory devices. Here, we demonstrate how coupled threshold oscillators may be used to perform binary logic in a manner entirely consistent with modern computer architectures. We describe a variety of computational circuitry and demonstrate working oscillator models of both computation and memory.
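The logic layer of this idea can be sketched with static threshold units, abstracting away the oscillator dynamics entirely. This is an assumption-laden illustration, not the authors' coupled-oscillator model: a unit here simply outputs 1 when the weighted sum of its inputs reaches a threshold, and the weights and gate constructions are chosen for the example.

```python
def threshold_gate(inputs, weights, threshold):
    """Output 1 if the weighted input sum reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def and_gate(a, b):
    return threshold_gate((a, b), (1, 1), 2)


def or_gate(a, b):
    return threshold_gate((a, b), (1, 1), 1)


def not_gate(a):
    # An inhibitory (negative) weight turns the unit into an inverter.
    return threshold_gate((a,), (-1,), 0)


def xor_gate(a, b):
    # XOR is not linearly separable, so it needs two layers of units.
    return and_gate(or_gate(a, b), not_gate(and_gate(a, b)))


def half_adder(a, b):
    """Return (sum_bit, carry_bit) built purely from threshold units."""
    return xor_gate(a, b), and_gate(a, b)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```

In an oscillator-based realization, "firing" versus "not firing" would play the role of 1 and 0, and excitatory or inhibitory coupling would play the role of the positive and negative weights.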
Web Based Parallel Programming Workshop for Undergraduate Education.

ERIC Educational Resources Information Center

Marcus, Robert L.; Robertson, Douglass

Central State University (Ohio), under a contract with Nichols Research Corporation, has developed a World Wide Web-based workshop on high performance computing entitled "IBM SP2 Parallel Programming Workshop." The research is part of the DoD (Department of Defense) High Performance Computing Modernization Program. The research…
Computers and neurosurgery.

PubMed

Shaikhouni, Ammar; Elder, J Bradley

2012-11-01

At the turn of the twentieth century, the only computational device used in neurosurgical procedures was the brain of the surgeon. Today, most neurosurgical procedures rely at least in part on the use of a computer to help perform surgeries accurately and safely. The techniques that revolutionized neurosurgery were mostly developed after the 1950s. Just before that era, the transistor was invented in the late 1940s, and the integrated circuit was invented in the late 1950s. During this time, the first automated, programmable computational machines were introduced. The rapid progress in the field of neurosurgery not only occurred hand in hand with the development of modern computers, but one also can state that modern neurosurgery would not exist without computers. The focus of this article is the impact modern computers have had on the practice of neurosurgery. Neuroimaging, neuronavigation, and neuromodulation are examples of tools in the armamentarium of the modern neurosurgeon that owe each step in their evolution to progress made in computer technology. Advances in computer technology central to innovations in these fields are highlighted, with particular attention to neuroimaging. Developments over the last 10 years in areas of sensors and robotics that promise to transform the practice of neurosurgery further are discussed. Potential impacts of advances in computers related to neurosurgery in developing countries and underserved regions are also discussed. As this article illustrates, the computer, with its underlying and related technologies, is central to advances in neurosurgery over the last half century. Copyright © 2012 Elsevier Inc. All rights reserved.
Role of HPC in Advancing Computational Aeroelasticity

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.

2004-01-01

On behalf of the High Performance Computing and Modernization Program (HPCMP) and NASA Advanced Supercomputing Division (NAS) a study is conducted to assess the role of supercomputers on computational aeroelasticity of aerospace vehicles. The study is mostly based on the responses to a web based questionnaire that was designed to capture the nuances of high performance computational aeroelasticity, particularly on parallel computers. A procedure is presented to assign a fidelity-complexity index to each application. Case studies based on major applications using HPCMP resources are presented.
A cross-sectional study of the effects of load carriage on running characteristics and tibial mechanical stress: implications for stress fracture injuries in women

DTIC Science & Technology

2017-03-23

performance computing resources made available by the US Department of Defense High Performance Computing Modernization Program at the Air Force...1Department of Defense Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, United...States Army Medical Research and Materiel Command, Fort Detrick, Maryland, USA Full list of author information is available at the end of the article
Numerical Prediction of Pitch Damping Stability Derivatives for Finned Projectiles

DTIC Science & Technology

2013-11-01

in part by a grant of high-performance computing time from the U.S. DOD High Performance Computing Modernization Program (HPCMP) at the Army...to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data...12 3.3.2 Time-Accurate Simulations

Oscillatory Threshold Logic

PubMed Central

Borresen, Jon; Lynch, Stephen

2012-01-01

In the 1940s, the first generation of modern computers used vacuum tube oscillators as their principal components; however, with the development of the transistor, such oscillator-based computers quickly became obsolete. As the demand for faster and lower power computers continues, transistors are themselves approaching their theoretical limit and emerging technologies must eventually supersede them. With the development of optical oscillators and Josephson junction technology, we are again presented with the possibility of using oscillators as the basic components of computers, and it is possible that the next generation of computers will be composed almost entirely of oscillatory devices. Here, we demonstrate how coupled threshold oscillators may be used to perform binary logic in a manner entirely consistent with modern computer architectures. We describe a variety of computational circuitry and demonstrate working oscillator models of both computation and memory. PMID:23173034
Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kozacik, Stephen

Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.
Biological and Environmental Research Exascale Requirements Review. An Office of Science review sponsored jointly by Advanced Scientific Computing Research and Biological and Environmental Research, March 28-31, 2016, Rockville, Maryland

DOE Office of Scientific and Technical Information (OSTI.GOV)

Arkin, Adam; Bader, David C.; Coffey, Richard

Understanding the fundamentals of genomic systems or the processes governing impactful weather patterns are examples of the types of simulation and modeling performed on the most advanced computing resources in America. High-performance computing and computational science together provide a necessary platform for the mission science conducted by the Biological and Environmental Research (BER) office at the U.S. Department of Energy (DOE). This report reviews BER’s computing needs and their importance for solving some of the toughest problems in BER’s portfolio. BER’s impact on science has been transformative. Mapping the human genome, including the U.S.-supported international Human Genome Project that DOE began in 1987, initiated the era of modern biotechnology and genomics-based systems biology. And since the 1950s, BER has been a core contributor to atmospheric, environmental, and climate science research, beginning with atmospheric circulation studies that were the forerunners of modern Earth system models (ESMs) and by pioneering the implementation of climate codes onto high-performance computers. See http://exascaleage.org/ber/ for more information.
DoD High Performance Computing Modernization Program Users Group Conference (HPCMP UGC 2011) Held in Portland, Oregon on June 20-23, 2011

DTIC Science & Technology

2011-06-01

4. Conclusion The Web-based AGeS system described in this paper is a computationally-efficient and scalable system for high-throughput genome...method for protecting web services involves making them more resilient to attack using autonomic computing techniques. This paper presents our initial...20–23, 2011 2011 DoD High Performance Computing Modernization Program Users Group Conference HPCMP UGC 2011 The papers in this book comprise the
Modern Electronic Devices: An Increasingly Common Cause of Skin Disorders in Consumers.

PubMed

Corazza, Monica; Minghetti, Sara; Bertoldi, Alberto Maria; Martina, Emanuela; Virgili, Annarosa; Borghi, Alessandro

2016-01-01

The modern conveniences and enjoyment brought about by electronic devices bring with them some health concerns. In particular, personal electronic devices are responsible for rising cases of several skin disorders, including pressure, friction, contact dermatitis, and other physical dermatitis. The universal use of such devices, either for work or recreational purposes, will probably increase the occurrence of polymorphous skin manifestations over time. It is important for clinicians to consider electronics as potential sources of dermatological ailments, for proper patient management. We performed a literature review on skin disorders associated with the personal use of modern technology, including personal computers and laptops, personal computer accessories, mobile phones, tablets, video games, and consoles.
Goal Orientation Framing and Its Influence on Performance

DTIC Science & Technology

2012-12-01

first-person shooter computer games Call of Duty: Modern Warfare 2 and Call of Duty: Modern Warfare 3. During the simulation, participants were...working understanding of social expectations and norms (Duda & Nicholis, 1992). It is true that obese people, drug addicts and abusive parents exist in...Monterey Student Activity Center. B. MEASURES Performance was assessed in two tests, a math test and a first-person shooter game. It was the intent of
Computer simulation of a single pilot flying a modern high-performance helicopter

NASA Technical Reports Server (NTRS)

Zipf, Mark E.; Vogt, William G.; Mickle, Marlin H.; Hoelzeman, Ronald G.; Kai, Fei; Mihaloew, James R.

1988-01-01

Presented is a computer simulation of a human response pilot model able to execute operational flight maneuvers and vehicle stabilization of a modern high-performance helicopter. Low-order, single-variable, human response mechanisms, integrated to form a multivariable pilot structure, provide a comprehensive operational control over the vehicle. Evaluations of the integrated pilot were performed by direct insertion into a nonlinear, total-force simulation environment provided by NASA Lewis. Comparisons between the integrated pilot structure and single-variable pilot mechanisms are presented. Static and dynamically alterable configurations of the pilot structure are introduced to simulate pilot activities during vehicle maneuvers. These configurations, in conjunction with higher level, decision-making processes, are considered for use where guidance and navigational procedures, operational mode transfers, and resource sharing are required.
Real-time implementation of an interactive jazz accompaniment system

NASA Astrophysics Data System (ADS)

Deshpande, Nikhil

Modern computational algorithms and digital signal processing (DSP) are able to combine with human performers without forced or predetermined structure in order to create dynamic and real-time accompaniment systems. With modern computing power and intelligent algorithm layout and design, it is possible to achieve more detailed auditory analysis of live music. Using this information, computer code can follow and predict how a human's musical performance evolves, and use this to react in a musical manner. This project builds a real-time accompaniment system to perform together with live musicians, with a focus on live jazz performance and improvisation. The system utilizes a new polyphonic pitch detector and embeds it in an Ableton Live system - combined with Max for Live - to perform elements of audio analysis, generation, and triggering. The system also relies on tension curves and information rate calculations from the Creative Artificially Intuitive and Reasoning Agent (CAIRA) system to help understand and predict human improvisation. These metrics are vital to the core system and allow for extrapolated audio analysis. The system is able to react dynamically to a human performer, and can successfully accompany the human as an entire rhythm section.
High-performance equation solvers and their impact on finite element analysis

NASA Technical Reports Server (NTRS)

Poole, Eugene L.; Knight, Norman F., Jr.; Davis, D. Dale, Jr.

1990-01-01

The role of equation solvers in modern structural analysis software is described. Direct and iterative equation solvers which exploit vectorization on modern high-performance computer systems are described and compared. The direct solvers are two Cholesky factorization methods. The first method utilizes a novel variable-band data storage format to achieve very high computation rates and the second method uses a sparse data storage format designed to reduce the number of operations. The iterative solvers are preconditioned conjugate gradient methods. Two different preconditioners are included; the first uses a diagonal matrix storage scheme to achieve high computation rates and the second requires a sparse data storage scheme and converges to the solution in fewer iterations than the first. The impact of using all of the equation solvers in a common structural analysis software system is demonstrated by solving several representative structural analysis problems.
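As an illustration of the iterative family described above, here is a minimal sketch of a conjugate gradient solver with a diagonal (Jacobi) preconditioner. It assumes a small dense symmetric positive-definite matrix stored as nested lists; it is not the paper's vectorized implementation, does not reproduce its storage formats, and may not match the preconditioners the authors used. The name `jacobi_pcg` and the test system are invented for the example.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A.

    Returns (x, iterations). The preconditioner is M = diag(A), so applying
    M^-1 is an elementwise scaling of the residual.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)  # residual b - A x for the initial guess x = 0
    b_norm = dot(b, b) ** 0.5
    if b_norm == 0.0:
        return x, 0  # the zero vector solves A x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]
    p = list(z)
    rz = dot(r, z)
    for iteration in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, iteration
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Small hypothetical SPD system, standing in for a stiffness matrix.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x, iterations = jacobi_pcg(A, b)
print(x, iterations)
```

The trade-off mentioned in the abstract shows up here in miniature: a diagonal preconditioner costs almost nothing per iteration and vectorizes trivially, whereas a stronger preconditioner costs more per iteration but usually needs fewer of them.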
High-performance equation solvers and their impact on finite element analysis

NASA Technical Reports Server (NTRS)

Poole, Eugene L.; Knight, Norman F., Jr.; Davis, D. D., Jr.

1992-01-01

The role of equation solvers in modern structural analysis software is described. Direct and iterative equation solvers which exploit vectorization on modern high-performance computer systems are described and compared. The direct solvers are two Cholesky factorization methods. The first method utilizes a novel variable-band data storage format to achieve very high computation rates and the second method uses a sparse data storage format designed to reduce the number of operations. The iterative solvers are preconditioned conjugate gradient methods. Two different preconditioners are included; the first uses a diagonal matrix storage scheme to achieve high computation rates and the second requires a sparse data storage scheme and converges to the solution in fewer iterations than the first. The impact of using all of the equation solvers in a common structural analysis software system is demonstrated by solving several representative structural analysis problems.
Computer simulation of multiple pilots flying a modern high performance helicopter

NASA Technical Reports Server (NTRS)

Zipf, Mark E.; Vogt, William G.; Mickle, Marlin H.; Hoelzeman, Ronald G.; Kai, Fei; Mihaloew, James R.

1988-01-01

A computer simulation of a human response pilot mechanism within the flight control loop of a high-performance modern helicopter is presented. A human response mechanism, implemented by a low order, linear transfer function, is used in a decoupled single variable configuration that exploits the dominant vehicle characteristics by associating cockpit controls and instrumentation with specific vehicle dynamics. Low order helicopter models obtained from evaluations of the time and frequency domain responses of a nonlinear simulation model, provided by NASA Lewis Research Center, are presented and considered in the discussion of the pilot development. Pilot responses and reactions to test maneuvers are presented and discussed. Higher level implementations, using the pilot mechanisms, are discussed and considered for their use in a comprehensive control structure.
A view of Kanerva's sparse distributed memory

NASA Technical Reports Server (NTRS)

Denning, P. J.

1986-01-01

Pentti Kanerva is working on a new class of computers, which are called pattern computers. Pattern computers may close the gap between capabilities of biological organisms to recognize and act on patterns (visual, auditory, tactile, or olfactory) and capabilities of modern computers. Combinations of numeric, symbolic, and pattern computers may one day be capable of sustaining robots. The overview of the requirements for a pattern computer, a summary of Kanerva's Sparse Distributed Memory (SDM), and examples of tasks this computer can be expected to perform well are given.
Integration of Computational Chemistry into the Undergraduate Organic Chemistry Laboratory Curriculum

ERIC Educational Resources Information Center

Esselman, Brian J.; Hill, Nicholas J.

2016-01-01

Advances in software and hardware have promoted the use of computational chemistry in all branches of chemical research to probe important chemical concepts and to support experimentation. Consequently, it has become imperative that students in the modern undergraduate curriculum become adept at performing simple calculations using computational…
A performance model for GPUs with caches

DOE PAGES

Dao, Thanh Tuan; Kim, Jungwon; Seo, Sangmin; ...

2014-06-24

To exploit the abundant computational power of the world's fastest supercomputers, an even workload distribution to the typically heterogeneous compute devices is necessary. While relatively accurate performance models exist for conventional CPUs, accurate performance estimation models for modern GPUs do not exist. This paper presents two accurate models for modern GPUs: a sampling-based linear model, and a model based on machine-learning (ML) techniques which improves the accuracy of the linear model and is applicable to modern GPUs with and without caches. We first construct the sampling-based linear model to predict the runtime of an arbitrary OpenCL kernel. Based on an analysis of NVIDIA GPUs' scheduling policies we determine the earliest sampling points that allow an accurate estimation. The linear model cannot capture well the significant effects that memory coalescing or caching as implemented in modern GPUs have on performance. We therefore propose a model based on ML techniques that takes several compiler-generated statistics about the kernel as well as the GPU's hardware performance counters as additional inputs to obtain a more accurate runtime performance estimation for modern GPUs. We demonstrate the effectiveness and broad applicability of the model by applying it to three different NVIDIA GPU architectures and one AMD GPU architecture. On an extensive set of OpenCL benchmarks, on average, the proposed model estimates the runtime performance with less than 7 percent error for a second-generation GTX 280 with no on-chip caches and less than 5 percent for the Fermi-based GTX 580 with hardware caches. On the Kepler-based GTX 680, the linear model has an error of less than 10 percent. On an AMD GPU architecture, Radeon HD 6970, the model estimates the runtime with an error rate of 8 percent. As a result, the proposed technique outperforms existing models by a factor of 5 to 6 in terms of accuracy.
HPC Annual Report 2017

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dennig, Yasmin

Sandia National Laboratories has a long history of significant contributions to the high performance community and industry. Our innovative computer architectures allowed the United States to become the first to break the teraFLOP barrier, propelling us to the international spotlight. Our advanced simulation and modeling capabilities have been integral in high consequence US operations such as Operation Burnt Frost. Strong partnerships with industry leaders, such as Cray, Inc. and Goodyear, have enabled them to leverage our high performance computing (HPC) capabilities to gain a tremendous competitive edge in the marketplace. As part of our continuing commitment to providing modern computing infrastructure and systems in support of Sandia missions, we made a major investment in expanding Building 725 to serve as the new home of HPC systems at Sandia. Work is expected to be completed in 2018 and will result in a modern facility of approximately 15,000 square feet of computer center space. The facility will be ready to house the newest National Nuclear Security Administration/Advanced Simulation and Computing (NNSA/ASC) Prototype platform being acquired by Sandia, with delivery in late 2019 or early 2020. This new system will enable continuing advances by Sandia science and engineering staff in the areas of operating system R&D, operation cost effectiveness (power and innovative cooling technologies), user environment and application code performance.
Join the Center for Applied Scientific Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gamblin, Todd; Bremer, Timo; Van Essen, Brian

The Center for Applied Scientific Computing serves as Livermore Lab’s window to the broader computer science, computational physics, applied mathematics, and data science research communities. In collaboration with academic, industrial, and other government laboratory partners, we conduct world-class scientific research and development on problems critical to national security. CASC applies the power of high-performance computing and the efficiency of modern computational methods to the realms of stockpile stewardship, cyber and energy security, and knowledge discovery for intelligence applications.
Turbulence Model Effects on Cold-Gas Lateral Jet Interaction in a Supersonic Crossflow

DTIC Science & Technology

2014-06-01

performance computing time from the U.S. Department of Defense (DOD) High Performance Computing Modernization program at the U.S. Army Research Laboratory... time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and reviewing the...thanks Dr. Ross Chaplin , Defence Science Technology Laboratory, United Kingdom (UK), and Dr. David MacManus and Robert Christie, Cranfield University, UK
Development of Modern Performance Assessment Tools and Capabilities for Underground Disposal of Transuranic Waste at WIPP

NASA Astrophysics Data System (ADS)

Zeitler, T.; Kirchner, T. B.; Hammond, G. E.; Park, H.

2014-12-01

The Waste Isolation Pilot Plant (WIPP) has been developed by the U.S. Department of Energy (DOE) for the geologic (deep underground) disposal of transuranic (TRU) waste. Containment of TRU waste at the WIPP is regulated by the U.S. Environmental Protection Agency (EPA). The DOE demonstrates compliance with the containment requirements by means of performance assessment (PA) calculations. WIPP PA calculations estimate the probability and consequence of potential radionuclide releases from the repository to the accessible environment for a regulatory period of 10,000 years after facility closure. The long-term performance of the repository is assessed using a suite of sophisticated computational codes. In a broad modernization effort, the DOE has overseen the transfer of these codes to modern hardware and software platforms. Additionally, there is a current effort to establish new performance assessment capabilities through the further development of the PFLOTRAN software, a state-of-the-art massively parallel subsurface flow and reactive transport code. Improvements to the current computational environment will result in greater detail in the final models due to the parallelization afforded by the modern code. Parallelization will allow for faster calculations, as well as a move from a two-dimensional calculation grid to a three-dimensional grid. The result of the modernization effort will be a state-of-the-art subsurface flow and transport capability that will serve WIPP PA into the future. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000. This research is funded by WIPP programs administered by the Office of Environmental Management (EM) of the U.S. Department of Energy.
Computer Simulation Performed for Columbia Project Cooling System

NASA Technical Reports Server (NTRS)

Ahmad, Jasim

2005-01-01

This demo shows a high-fidelity simulation of the air flow in the main computer room housing the Columbia (10,024 Intel Itanium processors) system. The simulation assessed the performance of the cooling system, identified deficiencies, and recommended modifications to eliminate them. It used two in-house software packages on NAS supercomputers: Chimera Grid Tools to generate a geometric model of the computer room, and the OVERFLOW-2 code for fluid and thermal simulation. This state-of-the-art technology can be easily extended to provide a general capability for air flow analyses on any modern computer room.
Department of Defense High Performance Computing Modernization Program. 2008 Annual Report

DTIC Science & Technology

2009-04-01

place to another on the network. Without it, a computer could only talk to itself - no email, no web browsing, and no iTunes . Most of the Internet...Your SecurID Card ), Ken Renard Secure Wireless, Rob Scott and Stephen Bowman Securing Today’s Networks, Rich Whittney, Juniper Networks, Federal

Running R Statistical Computing Environment Software on the Peregrine

Science.gov Websites

for the development of new statistical methodologies and enjoys a large user base. Please consult the distribution details. Natural language support but running in an English locale R is a collaborative project programming paradigms to better leverage modern HPC systems. The CRAN task view for High Performance Computing
Application of Game Theory to Improve the Defense of the Smart Grid

DTIC Science & Technology

2012-03-01

Computer Systems and Networks ...............................................22 2.4.2 Trust Models ...systems. In this environment, developers assumed deterministic communications mediums rather than the “best effort” models provided in most modern... models or computational models to validate the SPSs design. Finally, the study reveals concerns about the performance of load rejection schemes
Effects of Turbulence Model on Prediction of Hot-Gas Lateral Jet Interaction in a Supersonic Crossflow

DTIC Science & Technology

2015-07-01

performance computing time from the US Department of Defense (DOD) High Performance Computing Modernization program at the US Army Research Laboratory...Approved OMB No. 0704-0188 Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time ...dimensional, compressible, Reynolds-averaged Navier-Stokes (RANS) equations are solved using a finite volume method. A point-implicit time - integration
QCDLoop: A comprehensive framework for one-loop scalar integrals

NASA Astrophysics Data System (ADS)

Carrazza, Stefano; Ellis, R. Keith; Zanderighi, Giulia

2016-12-01

We present a new release of the QCDLoop library based on a modern object-oriented framework. We discuss the available new features such as the extension to the complex masses, the possibility to perform computations in double and quadruple precision simultaneously, and useful caching mechanisms to improve the computational speed. We benchmark the performance of the new library, and provide practical examples of phenomenological implementations by interfacing this new library to Monte Carlo programs.
Computer software.

PubMed

Rosenthal, L E

1986-10-01

Software is the component in a computer system that permits the hardware to perform the various functions that a computer system is capable of doing. The history of software and its development can be traced to the early nineteenth century. All computer systems are designed to utilize the "stored program concept" as first developed by Charles Babbage in the 1850s. The concept was lost until the mid-1940s, when modern computers made their appearance. Today, because of the complex and myriad tasks that a computer system can perform, there has been a differentiation of types of software. There is software designed to perform specific business applications. There is software that controls the overall operation of a computer system. And there is software that is designed to carry out specialized tasks. Regardless of types, software is the most critical component of any computer system. Without it, all one has is a collection of circuits, transistors, and silicon chips.
The efficiency of geophysical adjoint codes generated by automatic differentiation tools

NASA Astrophysics Data System (ADS)

Vlasenko, A. V.; Köhl, A.; Stammer, D.

2016-02-01

The accuracy of numerical models that describe complex physical or chemical processes depends on the choice of model parameters. Estimating an optimal set of parameters by optimization algorithms requires knowledge of the sensitivity of the process of interest to model parameters. Typically the sensitivity computation involves differentiation of the model, which can be performed by applying algorithmic differentiation (AD) tools to the underlying numerical code. However, existing AD tools differ substantially in design, legibility and computational efficiency. In this study we show that, for geophysical data assimilation problems of varying complexity, the performance of adjoint codes generated by the existing AD tools (i) Open_AD, (ii) Tapenade, (iii) NAGWare and (iv) Transformation of Algorithms in Fortran (TAF) can be vastly different. Based on simple test problems, we evaluate the efficiency of each AD tool with respect to computational speed, accuracy of the adjoint, the efficiency of memory usage, and the capability of each AD tool to handle modern FORTRAN 90-95 elements such as structures and pointers, which are new elements that either combine groups of variables or provide aliases to memory addresses, respectively. We show that, while operator overloading tools are the only ones suitable for modern codes written in object-oriented programming languages, their computational efficiency lags behind source transformation by orders of magnitude, rendering the application of these modern tools to practical assimilation problems prohibitive. In contrast, the application of source transformation tools appears to be the most efficient choice, allowing handling even large geophysical data assimilation problems. However, they can only be applied to numerical models written in earlier generations of programming languages. 
Our study indicates that applying existing AD tools to realistic geophysical problems faces limitations that urgently need to be solved to allow the continuous use of AD tools for solving geophysical problems on modern computer architectures.
Incorporating modern neuroscience findings to improve brain-computer interfaces: tracking auditory attention.

PubMed

Wronkiewicz, Mark; Larson, Eric; Lee, Adrian Kc

2016-10-01

Brain-computer interface (BCI) technology allows users to generate actions based solely on their brain signals. However, current non-invasive BCIs generally classify brain activity recorded from surface electroencephalography (EEG) electrodes, which can hinder the application of findings from modern neuroscience research. In this study, we use source imaging, a neuroimaging technique that projects EEG signals onto the surface of the brain, in a BCI classification framework. This allowed us to incorporate prior research from functional neuroimaging to target activity from a cortical region involved in auditory attention. Classifiers trained to detect attention switches performed better with source imaging projections than with EEG sensor signals. Within source imaging, including subject-specific anatomical MRI information (instead of using a generic head model) further improved classification performance. This source-based strategy also reduced accuracy variability across three dimensionality reduction techniques, a major design choice in most BCIs. Our work shows that source imaging provides clear quantitative and qualitative advantages to BCIs and highlights the value of incorporating modern neuroscience knowledge and methods into BCI systems.
Using Modeling and Simulation to Complement Testing for Increased Understanding of Weapon Subassembly Response.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wong, Michael K.; Davidson, Megan

As part of Sandia’s nuclear deterrence mission, the B61-12 Life Extension Program (LEP) aims to modernize the aging weapon system. Modernization requires requalification and Sandia is using high performance computing to perform advanced computational simulations to better understand, evaluate, and verify weapon system performance in conjunction with limited physical testing. The Nose Bomb Subassembly (NBSA) of the B61-12 is responsible for producing a fuzing signal upon ground impact. The fuzing signal is dependent upon electromechanical impact sensors producing valid electrical fuzing signals at impact. Computer generated models were used to assess the timing between the impact sensor’s response to the deceleration of impact and damage to major components and system subassemblies. The modeling and simulation team worked alongside the physical test team to design a large-scale reverse ballistic test to not only assess system performance, but to also validate their computational models. The reverse ballistic test conducted at Sandia’s sled test facility sent a rocket sled with a representative target into a stationary B61-12 (NBSA) to characterize the nose crush and functional response of NBSA components. Data obtained from data recorders and high-speed photometrics were integrated with previously generated computer models in order to refine and validate the model’s ability to reliably simulate real-world effects. Large-scale tests are impractical to conduct for every single impact scenario. By creating reliable computer models, we can perform simulations that identify trends and produce estimates of outcomes over the entire range of required impact conditions. Sandia’s HPCs enable geometric resolution that was unachievable before, allowing for more fidelity and detail, and creating simulations that can provide insight to support evaluation of requirements and performance margins. 
As computing resources continue to improve, researchers at Sandia are hoping to improve these simulations so they provide increasingly credible analysis of the system response and performance over the full range of conditions.
SUPREM-DSMC: A New Scalable, Parallel, Reacting, Multidimensional Direct Simulation Monte Carlo Flow Code

NASA Technical Reports Server (NTRS)

Campbell, David; Wysong, Ingrid; Kaplan, Carolyn; Mott, David; Wadsworth, Dean; VanGilder, Douglas

2000-01-01

An AFRL/NRL team has recently been selected to develop a scalable, parallel, reacting, multidimensional (SUPREM) Direct Simulation Monte Carlo (DSMC) code for the DoD user community under the High Performance Computing Modernization Office (HPCMO) Common High Performance Computing Software Support Initiative (CHSSI). This paper will introduce the JANNAF Exhaust Plume community to this three-year development effort and present the overall goals, schedule, and current status of this new code.
Geospace simulations using modern accelerator processor technology

NASA Astrophysics Data System (ADS)

Germaschewski, K.; Raeder, J.; Larson, D. J.

2009-12-01

OpenGGCM (Open Geospace General Circulation Model) is a well-established numerical code simulating the Earth's space environment. The most computing intensive part is the MHD (magnetohydrodynamics) solver that models the plasma surrounding Earth and its interaction with Earth's magnetic field and the solar wind flowing in from the sun. Like other global magnetosphere codes, OpenGGCM's realism is currently limited by computational constraints on grid resolution. OpenGGCM has been ported to make use of the added computational power of modern accelerator based processor architectures, in particular the Cell processor. The Cell architecture is a novel inhomogeneous multicore architecture capable of achieving up to 230 GFLOPS on a single chip. The University of New Hampshire recently acquired a PowerXCell 8i based computing cluster, and here we will report initial performance results of OpenGGCM. Realizing the high theoretical performance of the Cell processor is a programming challenge, though. We implemented the MHD solver using a multi-level parallelization approach: On the coarsest level, the problem is distributed to processors based upon the usual domain decomposition approach. Then, on each processor, the problem is divided into 3D columns, each of which is handled by the memory limited SPEs (synergistic processing elements) slice by slice. Finally, SIMD instructions are used to fully exploit the SIMD FPUs in each SPE. Memory management needs to be handled explicitly by the code, using DMA to move data from main memory to the per-SPE local store and vice versa. We use a modern technique, automatic code generation, which shields the application programmer from having to deal with all of the implementation details just described, keeping the code much more easily maintainable. Our preliminary results indicate excellent performance, a speed-up of a factor of 30 compared to the unoptimized version.
Modern multicore and manycore architectures: Modelling, optimisation and benchmarking a multiblock CFD code

NASA Astrophysics Data System (ADS)

Hadade, Ioan; di Mare, Luca

2016-08-01

Modern multicore and manycore processors exhibit multiple levels of parallelism through a wide range of architectural features such as SIMD for data parallel execution or threads for core parallelism. The exploitation of multi-level parallelism is therefore crucial for achieving superior performance on current and future processors. This paper presents the performance tuning of a multiblock CFD solver on Intel SandyBridge and Haswell multicore CPUs and the Intel Xeon Phi Knights Corner coprocessor. Code optimisations have been applied on two computational kernels exhibiting different computational patterns: the update of flow variables and the evaluation of the Roe numerical fluxes. We discuss at great length the code transformations required for achieving efficient SIMD computations for both kernels across the selected devices including SIMD shuffles and transpositions for flux stencil computations and global memory transformations. Core parallelism is expressed through threading based on a number of domain decomposition techniques together with optimisations pertaining to alleviating NUMA effects found in multi-socket compute nodes. Results are correlated with the Roofline performance model in order to assert their efficiency for each distinct architecture. We report significant speedups for single thread execution across both kernels: 2-5X on the multicore CPUs and 14-23X on the Xeon Phi coprocessor. Computations at full node and chip concurrency deliver a factor of three speedup on the multicore processors and up to 24X on the Xeon Phi manycore coprocessor.
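The Roofline model mentioned in this abstract bounds attainable performance by the lesser of peak compute throughput and memory bandwidth multiplied by arithmetic intensity. The sketch below shows that bound in its textbook form; the function names and the machine numbers are hypothetical and are not taken from the paper.

```python
def roofline_gflops(peak_gflops, bandwidth_gb_s, intensity_flop_per_byte):
    """Attainable performance (GFLOP/s) under the basic Roofline bound."""
    return min(peak_gflops, bandwidth_gb_s * intensity_flop_per_byte)


def ridge_point(peak_gflops, bandwidth_gb_s):
    """Arithmetic intensity (FLOP/byte) above which a kernel is compute bound."""
    return peak_gflops / bandwidth_gb_s


# Hypothetical machine: 100 GFLOP/s peak, 50 GB/s memory bandwidth.
memory_bound = roofline_gflops(100.0, 50.0, 0.5)   # low-intensity kernel
compute_bound = roofline_gflops(100.0, 50.0, 4.0)  # high-intensity kernel
ridge = ridge_point(100.0, 50.0)
```

A kernel whose measured rate sits well below this bound at its arithmetic intensity still has headroom for tuning, which is how a model of this kind is typically used to judge optimisation results.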
Method and computer program product for maintenance and modernization backlogging

DOEpatents

Mattimore, Bernard G; Reynolds, Paul E; Farrell, Jill M

2013-02-19

According to one embodiment, a computer program product for determining future facility conditions includes a computer readable medium having computer readable program code stored therein. The computer readable program code includes computer readable program code for calculating a time period specific maintenance cost, for calculating a time period specific modernization factor, and for calculating a time period specific backlog factor. Future facility conditions equal the time period specific maintenance cost plus the time period specific modernization factor plus the time period specific backlog factor. In another embodiment, a computer-implemented method for calculating future facility conditions includes calculating a time period specific maintenance cost, calculating a time period specific modernization factor, and calculating a time period specific backlog factor. Future facility conditions equal the time period specific maintenance cost plus the time period specific modernization factor plus the time period specific backlog factor. Other embodiments are also presented.
Research on Influence of Cloud Environment on Traditional Network Security

NASA Astrophysics Data System (ADS)

Ming, Xiaobo; Guo, Jinhua

2018-02-01

Cloud computing is a symbol of the progress of modern information networks. It provides a great deal of convenience to Internet users, but it also exposes them to considerable risk. One of the main reasons Internet users choose cloud computing is its strong network security performance, which is also the cornerstone of cloud computing applications. This paper briefly explores the impact of the cloud environment on traditional cybersecurity and puts forward corresponding solutions.
Real-time optical flow estimation on a GPU for a skid-steered mobile robot

NASA Astrophysics Data System (ADS)

Kniaz, V. V.

2016-04-01

Accurate egomotion estimation is required for mobile robot navigation. Often the egomotion is estimated using optical flow algorithms. For an accurate estimation of optical flow, most modern algorithms require high memory resources and processor speed. However, the simple single-board computers that control the motion of the robot usually do not provide such resources. On the other hand, most modern single-board computers are equipped with an embedded GPU that could be used in parallel with a CPU to improve the performance of the optical flow estimation algorithm. This paper presents a new Z-flow algorithm for efficient computation of optical flow using an embedded GPU. The algorithm is based on phase correlation optical flow estimation and provides real-time performance on a low-cost embedded GPU. A layered optical flow model is used. Layer segmentation is performed using a graph-cut algorithm with a time-derivative-based energy function. This approach makes the algorithm both fast and robust in low-light and low-texture conditions. The algorithm implementation for a Raspberry Pi Model B computer is discussed. For evaluation of the algorithm, the computer was mounted on a Hercules mobile skid-steered robot equipped with a monocular camera. The evaluation was performed using a hardware-in-the-loop simulation and experiments with the Hercules mobile robot. The algorithm was also evaluated using the KITTI Optical Flow 2015 dataset. The resulting endpoint error of the optical flow calculated with the developed algorithm was low enough for navigation of the robot along the desired trajectory.
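To make the phase correlation idea in this abstract concrete, here is a minimal one-dimensional sketch: the normalised cross-power spectrum of two signals is inverted, and the location of the peak gives the circular shift between them. This is not the Z-flow algorithm, which works on 2D images on a GPU; the naive O(n²) DFT, the function names `dft` and `phase_correlation_shift`, and the sample signal are assumptions chosen so the example runs with the standard library alone.

```python
import cmath


def dft(signal, inverse=False):
    """Naive discrete Fourier transform (illustrative, O(n^2))."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
            for t in range(n)
        )
        out.append(total / n if inverse else total)
    return out


def phase_correlation_shift(reference, moved):
    """Estimate the circular shift s such that moved[t] == reference[t - s]."""
    spectrum_ref = dft(reference)
    spectrum_mov = dft(moved)
    cross_power = []
    for f, g in zip(spectrum_ref, spectrum_mov):
        product = g * f.conjugate()
        magnitude = abs(product)
        # Keep only the phase; skip bins that carry no energy.
        cross_power.append(product / magnitude if magnitude > 1e-12 else 0j)
    correlation = dft(cross_power, inverse=True)
    # The correlation surface peaks at the shift between the two signals.
    return max(range(len(correlation)), key=lambda i: correlation[i].real)


# Hypothetical 1D "image row" and a copy shifted right by three samples.
reference = [0.0, 1.0, 3.0, 7.0, 2.0, 0.0, 0.0, 0.0]
moved = [reference[(t - 3) % len(reference)] for t in range(len(reference))]
shift = phase_correlation_shift(reference, moved)
```

Because only the phase of the spectrum is kept, the estimate is largely insensitive to uniform brightness changes between the two signals.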
Performance of the fusion code GYRO on four generations of Cray computers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fahey, Mark R

2014-01-01

GYRO is a code used for the direct numerical simulation of plasma microturbulence. It has been ported to a variety of modern MPP platforms including several modern commodity clusters, IBM SPs, and Cray XC, XT, and XE series machines. We briefly describe the mathematical structure of the equations, the data layout, and the redistribution scheme. Also, while the performance and scaling of GYRO on many of these systems has been shown before, here we show the comparative performance and scaling on four generations of Cray supercomputers including the newest addition - the Cray XC30. The more recently added hybrid OpenMP/MPI implementation also shows a great deal of promise on custom HPC systems that utilize fast CPUs and proprietary interconnects. Four machines of varying sizes were used in the experiment, all of which are located at the National Institute for Computational Sciences at the University of Tennessee at Knoxville and Oak Ridge National Laboratory. The advantages, limitations, and performance of using each system are discussed.
SISYPHUS: A high performance seismic inversion factory

NASA Astrophysics Data System (ADS)

Gokhberg, Alexey; Simutė, Saulė; Boehm, Christian; Fichtner, Andreas

2016-04-01

In recent years, massively parallel high performance computers have become the standard instruments for solving forward and inverse problems in seismology. The software packages for forward and inverse waveform modelling designed specifically for such computers (SPECFEM3D, SES3D) have become mature and widely available. These packages achieve significant computational performance and provide researchers with an opportunity to solve problems of bigger size at higher resolution within a shorter time. However, a typical seismic inversion process contains various activities that are beyond the common solver functionality. They include management of information on seismic events and stations, 3D models, observed and synthetic seismograms, pre-processing of the observed signals, computation of misfits and adjoint sources, minimization of misfits, and process workflow management. These activities are time consuming, seldom sufficiently automated, and therefore represent a bottleneck that can substantially offset performance benefits provided by even the most powerful modern supercomputers. Furthermore, a typical system architecture of modern supercomputing platforms is oriented towards the maximum computational performance and provides limited standard facilities for automation of the supporting activities. We present a prototype solution that automates all aspects of the seismic inversion process and is tuned for the modern massively parallel high performance computing systems. We address several major aspects of the solution architecture, which include (1) design of an inversion state database for tracing all relevant aspects of the entire solution process, (2) design of an extensible workflow management framework, (3) integration with wave propagation solvers, (4) integration with optimization packages, (5) computation of misfits and adjoint sources, and (6) process monitoring. 
The inversion state database represents a hierarchical structure with branches for the static process setup, inversion iterations, and solver runs, each branch specifying information at the event, station and channel levels. The workflow management framework is based on an embedded scripting engine that allows definition of various workflow scenarios using a high-level scripting language and provides access to all available inversion components represented as standard library functions. At present the SES3D wave propagation solver is integrated in the solution; the work is in progress for interfacing with SPECFEM3D. A separate framework is designed for interoperability with an optimization module; the workflow manager and optimization process run in parallel and cooperate by exchanging messages according to a specially designed protocol. A library of high-performance modules implementing signal pre-processing, misfit and adjoint computations according to established good practices is included. Monitoring is based on information stored in the inversion state database and at present implements a command line interface; design of a graphical user interface is in progress. The software design fits well into the common massively parallel system architecture featuring a large number of computational nodes running distributed applications under control of batch-oriented resource managers. The solution prototype has been implemented on the "Piz Daint" supercomputer provided by the Swiss Supercomputing Centre (CSCS).
Optimizing high performance computing workflow for protein functional annotation.

PubMed

Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene

2014-09-10

Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool, the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data.
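For readers unfamiliar with the two quality measures quoted above, the following minimal sketch shows how sensitivity and specificity are computed from classification counts. The counts are invented for illustration and are not taken from the study; the function name is likewise an assumption.

```python
def sensitivity_specificity(tp, fp, tn, fn):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of true group members that were assigned.
    specificity = TN / (TN + FP): share of non-members that were correctly rejected.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for one orthologous group (illustrative only).
sens, spec = sensitivity_specificity(tp=90, fp=15, tn=85, fn=10)
meets_threshold = sens >= 0.8 and spec >= 0.8  # the 80% floor quoted in the abstract
```

With these made-up counts both measures clear the 80% floor; a classifier can satisfy one measure and fail the other, which is why the abstract reports both.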
Optimizing high performance computing workflow for protein functional annotation

PubMed Central

Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene

2014-01-01

Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optimized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool, the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data. PMID:25313296
Using the Eclipse Parallel Tools Platform to Assist Earth Science Model Development and Optimization on High Performance Computers

NASA Astrophysics Data System (ADS)

Alameda, J. C.

2011-12-01

Development and optimization of computational science models, particularly on high performance computers and, with the advent of ubiquitous multicore processors, on practically every system, has been accomplished with basic software tools, typically command-line compilers, debuggers, and performance tools that have not changed substantially since the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as OpenMP and MPI) to take full advantage of high performance computers with an increasing core count per shared memory node, have made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC), seeks to improve the Eclipse Parallel Tools Platform (PTP), an environment designed to support scientific code development targeted at a diverse set of high performance computing systems. Our WHPC project takes an application-centric view to improving Eclipse PTP. We are using a set of scientific applications, each with a variety of challenges, and using PTP both to drive further improvements to the scientific applications and to understand shortcomings in Eclipse PTP from an application developer's perspective, which in turn drives the list of improvements we seek to make. We are also partnering with performance tool providers to drive higher quality performance tool integration.
We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into computational science and engineering codes. Finally, we are partnering with the lead PTP developers at IBM to ensure we are as effective as possible within the Eclipse community development. We are also conducting training and outreach to our user community, including conference BOF sessions, monthly user calls, and an annual user meeting, so that we can best inform the improvements we make to Eclipse PTP. With these activities we endeavor to encourage use of modern software engineering practices, as enabled through the Eclipse IDE, with computational science and engineering applications. These practices include proper use of source code repositories, tracking and rectifying issues, measuring and monitoring code performance changes against both optimizations and ever-changing software stacks and configurations on HPC systems, and ultimately encouraging development and maintenance of testing suites -- things that have become commonplace in many software endeavors, but have lagged in the development of science applications. We believe that the increasing complexity of both HPC systems and science applications demands the use of better software engineering methods, preferably enabled by modern tools such as Eclipse PTP, to help the computational science community thrive as we evolve the HPC landscape.
Computational Science and Innovation

NASA Astrophysics Data System (ADS)

Dean, D. J.

2011-09-01

Simulations - utilizing computers to solve complicated science and engineering problems - are a key ingredient of modern science. The U.S. Department of Energy (DOE) is a world leader in the development of high-performance computing (HPC), the development of applied math and algorithms that utilize the full potential of HPC platforms, and the application of computing to science and engineering problems. An interesting general question is whether the DOE can strategically utilize its capability in simulations to advance innovation more broadly. In this article, I will argue that this is certainly possible.

Overview of computational structural methods for modern military aircraft

NASA Technical Reports Server (NTRS)

Kudva, J. N.

1992-01-01

Computational structural methods are essential for designing modern military aircraft. This briefing deals with computational structural methods (CSM) currently used. First a brief summary of modern day aircraft structural design procedures is presented. Following this, several ongoing CSM related projects at Northrop are discussed. Finally, shortcomings in this area, future requirements, and summary remarks are given.
Parallel-vector unsymmetric Eigen-Solver on high performance computers

NASA Technical Reports Server (NTRS)

Nguyen, Duc T.; Jiangning, Qin

1993-01-01

The popular QR algorithm for computing all eigenvalues of an unsymmetric matrix is reviewed. This study concludes that, among the basic components of the QR algorithm, the reduction of an unsymmetric matrix to Hessenberg form (before applying the QR algorithm itself) can be done effectively by exploiting the vector speed and multiple processors offered by modern high-performance computers. Numerical examples of several test cases indicate that the proposed parallel-vector algorithm for converting a given unsymmetric matrix to Hessenberg form offers computational advantages over the existing algorithm. The time savings obtained by the proposed method increase with problem size.
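To make the Hessenberg reduction step concrete, here is a minimal serial sketch using textbook Householder reflections. It is not the parallel-vector algorithm proposed in the report, whose details the abstract does not give; the function name and the test matrix are illustrative assumptions.

```python
import math


def hessenberg(a):
    """Reduce a square matrix (list of lists) to upper Hessenberg form.

    Uses Householder reflections applied as a similarity transform P*A*P,
    so the eigenvalues of the result equal those of the input.
    Serial textbook version, shown only to illustrate the step.
    """
    n = len(a)
    h = [[float(x) for x in row] for row in a]
    for k in range(n - 2):
        # Part of column k that lies below the subdiagonal target.
        x = [h[i][k] for i in range(k + 1, n)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue  # column is already in the required form
        # Pick the sign that avoids cancellation in v[0].
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        norm_v = math.sqrt(sum(t * t for t in v))
        v = [t / norm_v for t in v]
        m = len(v)
        # Left multiplication by P = I - 2 v v^T (affects rows k+1..n-1).
        for j in range(n):
            dot = sum(v[i] * h[k + 1 + i][j] for i in range(m))
            for i in range(m):
                h[k + 1 + i][j] -= 2.0 * v[i] * dot
        # Right multiplication by P (affects columns k+1..n-1).
        for i in range(n):
            dot = sum(h[i][k + 1 + j] * v[j] for j in range(m))
            for j in range(m):
                h[i][k + 1 + j] -= 2.0 * dot * v[j]
    return h


A = [[4, 1, 2, 3],
     [3, 4, 1, 2],
     [2, 3, 4, 1],
     [1, 2, 3, 4]]
H = hessenberg(A)
```

The two inner loops over rows and columns are independent across `j` and `i` respectively, which is the kind of structure that vector units and multiple processors can exploit.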
Aerodynamic Characterization of a Modern Launch Vehicle

NASA Technical Reports Server (NTRS)

Hall, Robert M.; Holland, Scott D.; Blevins, John A.

2011-01-01

A modern launch vehicle is by necessity an extremely integrated design. The accurate characterization of its aerodynamic characteristics is essential to determine design loads, to design flight control laws, and to establish performance. The NASA Ares Aerodynamics Panel has been responsible for technical planning, execution, and vetting of the aerodynamic characterization of the Ares I vehicle. An aerodynamics team supporting the Panel consists of wind tunnel engineers, computational engineers, database engineers, and other analysts that address topics such as uncertainty quantification. The team resides at three NASA centers: Langley Research Center, Marshall Space Flight Center, and Ames Research Center. The Panel has developed strategies to synergistically combine both the wind tunnel efforts and the computational efforts with the goal of validating the computations. Selected examples highlight key flow physics and, where possible, the fidelity of the comparisons between wind tunnel results and the computations. Lessons learned summarize what has been gleaned during the project and can be useful for other vehicle development projects.
Information Systems at Enterprise. Design of Secure Network of Enterprise

NASA Astrophysics Data System (ADS)

Saigushev, N. Y.; Mikhailova, U. V.; Vedeneeva, O. A.; Tsaran, A. A.

2018-05-01

No enterprise or company can do without designing its own corporate network in today's information society. Such a network accelerates and facilitates the work of employees at any level, but poses a serious threat to the company's confidential information. In addition to attackers seeking to steal data, there are plenty of information threats posed by modern malware. In this regard, the security of corporate networks is an important component of modern computer security technologies for any enterprise. This article describes the design of a protected corporate network that provides the computers on the network with access to the Internet, as well as interoperability with the branch. High-speed Internet access is provided through the use of high-speed access channels and load balancing between devices. The security of the designed network is ensured through the use of VLAN technology as well as access lists and an AAA server.
Parallelization of the preconditioned IDR solver for modern multicore computer systems

NASA Astrophysics Data System (ADS)

Bessonov, O. A.; Fedoseyev, A. I.

2012-10-01

This paper presents the analysis, parallelization and optimization approach for the large sparse matrix solver CNSPACK on modern multicore microprocessors. CNSPACK is an advanced solver successfully used for coupled solution of stiff problems arising in multiphysics applications such as CFD, semiconductor transport, kinetic and quantum problems. It employs the iterative IDR algorithm with ILU preconditioning (user-chosen ILU preconditioning order). CNSPACK has been successfully used during the last decade for solving problems in several application areas, including fluid dynamics and semiconductor device simulation. However, processor architectures and computer system organization have changed dramatically in recent years. Because of this, performance criteria and methods have been revisited, and the solver and preconditioner have been parallelized using the OpenMP environment. Results of the successful implementation of efficient parallelization are presented for the most advanced computer systems (Intel Core i7-9xx or two-processor Xeon 55xx/56xx).
Intelligent redundant actuation system requirements and preliminary system design

NASA Technical Reports Server (NTRS)

Defeo, P.; Geiger, L. J.; Harris, J.

1985-01-01

Several redundant actuation system configurations were designed and demonstrated to satisfy the stringent operational requirements of advanced flight control systems. However, this has been accomplished largely through brute force hardware redundancy, resulting in significantly increased computational requirements on the flight control computers which perform the failure analysis and reconfiguration management. Modern technology now provides powerful, low-cost microprocessors which are effective in performing failure isolation and configuration management at the local actuator level. One such concept, called an Intelligent Redundant Actuation System (IRAS), significantly reduces the flight control computer requirements and performs the local tasks more comprehensively than previously feasible. The requirements and preliminary design of an experimental laboratory system capable of demonstrating the concept and sufficiently flexible to explore a variety of configurations are discussed.
Development of STOLAND, a versatile navigation, guidance and control system

NASA Technical Reports Server (NTRS)

Young, L. S.; Hansen, Q. M.; Rouse, W. E.; Osder, S. S.

1972-01-01

STOLAND has been developed to perform navigation, guidance, control, and flight management experiments in advanced V/STOL aircraft. The experiments have broad requirements and have dictated that STOLAND be capable of providing performance that would be realistic and equivalent to a wide range of current and future avionics systems. An integrated digital concept using modern avionics components was selected as the simplest approach to maximizing versatility and growth potential. Unique flexibility has been obtained by use of a single, general-purpose digital computer for all navigation, guidance, control, and displays computation.
Intricacies of modern supercomputing illustrated with recent advances in simulations of strongly correlated electron systems

NASA Astrophysics Data System (ADS)

Schulthess, Thomas C.

2013-03-01

The continued thousand-fold improvement in sustained application performance per decade on modern supercomputers keeps opening new opportunities for scientific simulations. But supercomputers have become very complex machines, built with thousands or tens of thousands of complex nodes consisting of multiple CPU cores or, most recently, a combination of CPU and GPU processors. Efficient simulations on such high-end computing systems require tailored algorithms that optimally map numerical methods to particular architectures. These intricacies will be illustrated with simulations of strongly correlated electron systems, where the development of quantum cluster methods, Monte Carlo techniques, as well as their optimal implementation by means of algorithms with improved data locality and high arithmetic density have gone hand in hand with evolving computer architectures. The present work would not have been possible without continued access to computing resources at the National Center for Computational Science of Oak Ridge National Laboratory, which is funded by the Facilities Division of the Office of Advanced Scientific Computing Research, and the Swiss National Supercomputing Center (CSCS) that is funded by ETH Zurich.
Breast tumor malignancy modelling using evolutionary neural logic networks.

PubMed

Tsakonas, Athanasios; Dounias, Georgios; Panagi, Georgia; Panourgias, Evangelia

2006-01-01

The present work proposes a computer-assisted methodology for the effective modelling of the diagnostic decision for breast tumor malignancy. The suggested approach is based on innovative hybrid computational intelligence algorithms applied to related cytological data contained in past medical records. The experimental data used in this study were gathered in the early 1990s at the University of Wisconsin, based on post-diagnostic cytological observations performed by expert medical staff. Data were encoded in a computer database, and various alternative modelling techniques were then applied to them in an attempt to form diagnostic models. Previous methods included standard optimisation techniques as well as artificial intelligence approaches, so that a variety of related publications exists in the modern literature on the subject. In this report, a hybrid computational intelligence approach is suggested, which effectively combines modern mathematical logic principles, neural computation and genetic programming. The approach proves promising both in terms of diagnostic accuracy and generalization capabilities, and in terms of comprehensibility and practical importance for the related medical staff.
Petascale supercomputing to accelerate the design of high-temperature alloys

DOE PAGES

Shin, Dongwon; Lee, Sangkeun; Shyam, Amit; ...

2017-10-25

Recent progress in high-performance computing and data informatics has opened up numerous opportunities to aid the design of advanced materials. Herein, we demonstrate a computational workflow that includes rapid population of high-fidelity materials datasets via petascale computing and subsequent analyses with modern data science techniques. We use a first-principles approach based on density functional theory to derive the segregation energies of 34 microalloying elements at the coherent and semi-coherent interfaces between the aluminium matrix and the θ'-Al2Cu precipitate, which requires several hundred supercell calculations. We also perform extensive correlation analyses to identify materials descriptors that affect the segregation behaviour of solutes at the interfaces. Finally, we show an example of leveraging machine learning techniques to predict segregation energies without performing computationally expensive physics-based simulations. As a result, the approach demonstrated in the present work can be applied to any high-temperature alloy system for which key materials data can be obtained using high-performance computing.
Petascale supercomputing to accelerate the design of high-temperature alloys

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shin, Dongwon; Lee, Sangkeun; Shyam, Amit

Recent progress in high-performance computing and data informatics has opened up numerous opportunities to aid the design of advanced materials. Herein, we demonstrate a computational workflow that includes rapid population of high-fidelity materials datasets via petascale computing and subsequent analyses with modern data science techniques. We use a first-principles approach based on density functional theory to derive the segregation energies of 34 microalloying elements at the coherent and semi-coherent interfaces between the aluminium matrix and the θ'-Al2Cu precipitate, which requires several hundred supercell calculations. We also perform extensive correlation analyses to identify materials descriptors that affect the segregation behaviour of solutes at the interfaces. Finally, we show an example of leveraging machine learning techniques to predict segregation energies without performing computationally expensive physics-based simulations. As a result, the approach demonstrated in the present work can be applied to any high-temperature alloy system for which key materials data can be obtained using high-performance computing.
Petascale supercomputing to accelerate the design of high-temperature alloys

NASA Astrophysics Data System (ADS)

Shin, Dongwon; Lee, Sangkeun; Shyam, Amit; Haynes, J. Allen

2017-12-01

Recent progress in high-performance computing and data informatics has opened up numerous opportunities to aid the design of advanced materials. Herein, we demonstrate a computational workflow that includes rapid population of high-fidelity materials datasets via petascale computing and subsequent analyses with modern data science techniques. We use a first-principles approach based on density functional theory to derive the segregation energies of 34 microalloying elements at the coherent and semi-coherent interfaces between the aluminium matrix and the θ′-Al2Cu precipitate, which requires several hundred supercell calculations. We also perform extensive correlation analyses to identify materials descriptors that affect the segregation behaviour of solutes at the interfaces. Finally, we show an example of leveraging machine learning techniques to predict segregation energies without performing computationally expensive physics-based simulations. The approach demonstrated in the present work can be applied to any high-temperature alloy system for which key materials data can be obtained using high-performance computing.
Department of Defense High Performance Computing Modernization Program. 2007 Annual Report

DTIC Science & Technology

2008-03-01

Directorate, Kirtland AFB, NM Applications of Time-Accurate CFD in Order to Account for Blade-Row Interactions and Distortion Transfer in the Design of...Patterson AFB, OH Direct Numerical Simulations of Active Control for Low-Pressure Turbine Blades Herman Fasel, University of Arizona, Tucson, AZ (Air Force...interactions with the rotor wake. These HI-ARMS computations compare favorably with available wind tunnel test measurements of surface and flowfield
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers

DOE PAGES

Wang, Bei; Ethier, Stephane; Tang, William; ...

2017-06-29

The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability of the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Bei; Ethier, Stephane; Tang, William

The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability of the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon Phi (MIC) co-processors and performance comparisons with state-of-the-art homogeneous HPC systems such as Blue Gene/Q. New discovery science capabilities in the magnetic fusion energy application domain are enabled, including investigations of Ion-Temperature-Gradient (ITG) driven turbulence simulations with unprecedented spatial resolution and long temporal duration. Performance studies with realistic fusion experimental parameters are carried out on multiple supercomputing systems spanning a wide range of cache capacities, cache-sharing configurations, memory bandwidth, interconnects and network topologies. These performance comparisons using a realistic discovery-science-capable domain application code provide valuable insights on optimization techniques across one of the broadest sets of current high-end computing platforms worldwide.
High-performance computing with quantum processing units

DOE PAGES

Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.; ...

2017-03-01

The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for functional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well as a loose integration that assumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.
High-performance computing with quantum processing units

DOE Office of Scientific and Technical Information (OSTI.GOV)

Britt, Keith A.; Oak Ridge National Lab.; Humble, Travis S.

The prospects of quantum computing have driven efforts to realize fully functional quantum processing units (QPUs). Recent success in developing proof-of-principle QPUs has prompted the question of how to integrate these emerging processors into modern high-performance computing (HPC) systems. We examine how QPUs can be integrated into current and future HPC system architectures by accounting for functional and physical design requirements. We identify two integration pathways that are differentiated by infrastructure constraints on the QPU and the use cases expected for the HPC system. This includes a tight integration that assumes infrastructure bottlenecks can be overcome as well as a loose integration that assumes they cannot. We find that the performance of both approaches is likely to depend on the quantum interconnect that serves to entangle multiple QPUs. As a result, we also identify several challenges in assessing QPU performance for HPC, and we consider new metrics that capture the interplay between system architecture and the quantum parallelism underlying computational performance.
Performance of GeantV EM Physics Models

NASA Astrophysics Data System (ADS)

Amadio, G.; Ananya, A.; Apostolakis, J.; Aurora, A.; Bandieramonte, M.; Bhattacharyya, A.; Bianchini, C.; Brun, R.; Canal, P.; Carminati, F.; Cosmo, G.; Duhem, L.; Elvira, D.; Folger, G.; Gheata, A.; Gheata, M.; Goulas, I.; Iope, R.; Jun, S. Y.; Lima, G.; Mohanty, A.; Nikitina, T.; Novak, M.; Pokorski, W.; Ribon, A.; Seghal, R.; Shadura, O.; Vallecorsa, S.; Wenzel, S.; Zhang, Y.

2017-10-01

The recent progress in parallel hardware architectures with deeper vector pipelines or many-cores technologies brings opportunities for HEP experiments to take advantage of SIMD and SIMT computing models. Launched in 2013, the GeantV project studies performance gains in propagating multiple particles in parallel, improving instruction throughput and data locality in HEP event simulation on modern parallel hardware architecture. Due to the complexity of geometry description and physics algorithms of a typical HEP application, performance analysis is indispensable in identifying factors limiting parallel execution. In this report, we will present design considerations and preliminary computing performance of GeantV physics models on coprocessors (Intel Xeon Phi and NVidia GPUs) as well as on mainstream CPUs.
Using Words Instead of Jumbled Characters as Stimuli in Keyboard Training Facilitates Fluent Performance

ERIC Educational Resources Information Center

DeFulio, Anthony; Crone-Todd, Darlene E.; Long, Lauren V.; Nuzzo, Paul A.; Silverman, Kenneth

2011-01-01

Keyboarding skill is an important target for adult education programs due to the ubiquity of computers in modern work environments. A previous study showed that novice typists learned key locations quickly but that fluency took a relatively long time to develop. In the present study, novice typists achieved fluent performance in nearly half the…
A survey of GPU-based acceleration techniques in MRI reconstructions

PubMed Central

Wang, Haifeng; Peng, Hanchuan; Chang, Yuchou

2018-01-01

Image reconstruction in magnetic resonance imaging (MRI) clinical applications has become increasingly complicated. However, diagnosis and treatment require very fast computational procedures. Modern competitive graphics processing unit (GPU) platforms have been used to make high-performance parallel computation available and attractive to common consumers for computing massively parallel reconstruction problems at commodity prices. GPUs have also become more and more important for reconstruction computations, especially as deep learning starts to be applied to MRI reconstruction. The motivation of this survey is to review the image reconstruction schemes of GPU computing for MRI applications and provide a summary reference for researchers in the MRI community. PMID:29675361

A survey of GPU-based acceleration techniques in MRI reconstructions.

PubMed

Wang, Haifeng; Peng, Hanchuan; Chang, Yuchou; Liang, Dong

2018-03-01

Image reconstruction in magnetic resonance imaging (MRI) clinical applications has become increasingly complicated. However, diagnosis and treatment require very fast computational procedures. Modern competitive graphics processing unit (GPU) platforms have been used to make high-performance parallel computation available and attractive to common consumers for computing massively parallel reconstruction problems at commodity prices. GPUs have also become more and more important for reconstruction computations, especially as deep learning starts to be applied to MRI reconstruction. The motivation of this survey is to review the image reconstruction schemes of GPU computing for MRI applications and provide a summary reference for researchers in the MRI community.
Shor's factoring algorithm and modern cryptography. An illustration of the capabilities inherent in quantum computers

NASA Astrophysics Data System (ADS)

Gerjuoy, Edward

2005-06-01

The security of messages encoded via the widely used RSA public key encryption system rests on the enormous computational effort required to find the prime factors of a large number N using classical (conventional) computers. In 1994 Peter Shor showed that for sufficiently large N, a quantum computer could perform the factoring with much less computational effort. This paper endeavors to explain, in a fashion comprehensible to the nonexpert, the RSA encryption protocol; the various quantum computer manipulations constituting the Shor algorithm; how the Shor algorithm performs the factoring; and the precise sense in which a quantum computer employing Shor's algorithm can be said to accomplish the factoring of very large numbers with less computational effort than a classical computer. It is made apparent that factoring N generally requires many successive runs of the algorithm. Our analysis reveals that the probability of achieving a successful factorization on a single run is about twice as large as commonly quoted in the literature.
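The link between period finding and factoring described in this abstract can be sketched classically. The snippet below is a minimal illustration, not the paper's presentation: the function names `order` and `try_factor` are hypothetical, and the period is found by brute force, which is exactly the step a quantum computer running Shor's algorithm would accelerate. It also shows why several runs may be needed, since some choices of the base `a` fail.

```python
from math import gcd


def order(a, n):
    """Smallest r > 0 with a**r % n == 1, found by brute force.

    Assumes gcd(a, n) == 1. This is the expensive step that the
    quantum part of Shor's algorithm replaces.
    """
    r, x = 1, a % n
    while x != 1:
        x = (x * a) % n
        r += 1
    return r


def try_factor(n, a):
    """One run of the classical reduction from factoring to period finding.

    Returns a pair of nontrivial factors of n, or None if this
    choice of a does not work and another run is needed.
    """
    g = gcd(a, n)
    if g != 1:
        return g, n // g          # a already shares a factor with n
    r = order(a, n)
    if r % 2 == 1:
        return None               # odd period: this run fails
    y = pow(a, r // 2, n)
    if y == n - 1:
        return None               # a**(r/2) is -1 mod n: this run fails
    p = gcd(y - 1, n)
    return p, n // p


if __name__ == "__main__":
    print(try_factor(15, 7))      # a successful run
    print(try_factor(15, 14))     # a failed run: returns None
```

With a toy RSA modulus such as N = 3233 = 61 × 53, the base a = 2 fails while a = 3 succeeds, which mirrors the abstract's point that factoring N generally requires more than one run.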
The Development of Sociocultural Competence with the Help of Computer Technology

ERIC Educational Resources Information Center

Rakhimova, Alina E.; Yashina, Marianna E.; Mukhamadiarova, Albina F.; Sharipova, Astrid V.

2017-01-01

The article describes the process of developing sociocultural knowledge and competences using computer technologies. On the whole, the development of modern computer technologies allows teachers to broaden trainees' sociocultural outlook and trace their progress online. Observation of modern computer technologies and estimation…
Some Observations on the Current Status of Performing Finite Element Analyses

NASA Technical Reports Server (NTRS)

Raju, Ivatury S.; Knight, Norman F., Jr; Shivakumar, Kunigal N.

2015-01-01

Aerospace structures are complex high-performance structures. Advances in reliable and efficient computing and modeling tools are enabling analysts to consider complex configurations, build complex finite element models, and perform analysis rapidly. Many of the early career engineers of today are very proficient in the usage of modern computers, computing engines, complex software systems, and visualization tools. These young engineers are becoming increasingly efficient in building complex 3D models of complicated aerospace components. However, the current trends demonstrate blind acceptance of the results of finite element analyses. This paper is aimed at raising awareness of this situation. Examples of the common encounters are presented. To overcome the current trends, some guidelines and suggestions for analysts, senior engineers, and educators are offered.
DOE Office of Scientific and Technical Information (OSTI.GOV)

McCarthy, J.M.

The theory and methodology of design of general-purpose machines that may be controlled by a computer to perform all the tasks of a set of special-purpose machines is the focus of modern machine design research. These seventeen contributions chronicle recent activity in the analysis and design of robot manipulators that are the prototype of these general-purpose machines. They focus particularly on kinematics, the geometry of rigid-body motion, which is an integral part of machine design theory. The challenges to kinematics researchers presented by general-purpose machines such as the manipulator are leading to new perspectives in the design and control of simpler machines with two, three, and more degrees of freedom. Researchers are rethinking the uses of gear trains, planar mechanisms, adjustable mechanisms, and computer controlled actuators in the design of modern machines.
An S_N Algorithm for Modern Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Randal Scott

2016-08-29

LANL discrete ordinates transport packages are required to perform large, computationally intensive time-dependent calculations on massively parallel architectures, where even a single such calculation may need many months to complete. While KBA methods scale out well to very large numbers of compute nodes, we are limited by practical constraints on the number of such nodes we can actually apply to any given calculation. Instead, we describe a modified KBA algorithm that allows realization of the reductions in solution time offered by both the current, and future, architectural changes within a compute node.
Manycore Performance-Portability: Kokkos Multidimensional Array Library

DOE PAGES

Edwards, H. Carter; Sunderland, Daniel; Porter, Vicki; ...

2012-01-01

Large, complex scientific and engineering application codes have a significant investment in computational kernels to implement their mathematical models. Porting these computational kernels to the collection of modern manycore accelerator devices is a major challenge in that these devices have diverse programming models, application programming interfaces (APIs), and performance requirements. The Kokkos Array programming model provides a library-based approach to implementing computational kernels that are performance-portable to CPU-multicore and GPGPU accelerator devices. This programming model is based upon three fundamental concepts: (1) manycore compute devices each with its own memory space, (2) data parallel kernels and (3) multidimensional arrays. Kernel execution performance is, especially for NVIDIA® devices, extremely dependent on data access patterns. The optimal data access pattern can be different for different manycore devices, potentially leading to different implementations of computational kernels specialized for different devices. The Kokkos Array programming model supports performance-portable kernels by (1) separating data access patterns from computational kernels through a multidimensional array API and (2) introducing device-specific data access mappings when a kernel is compiled. An implementation of Kokkos Array is available through Trilinos [Trilinos website, http://trilinos.sandia.gov/, August 2011].
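The idea of separating data access patterns from kernels can be illustrated with a small sketch. This is not the Kokkos API: the class `Array2D`, its `layout` argument, and the kernel `row_sums` are hypothetical names chosen for illustration. The kernel is written once against `a[i, j]`, and only the index-to-offset mapping changes between a row-major ("right") and a column-major ("left") layout.

```python
class Array2D:
    """Toy 2-D array whose memory layout is chosen at construction time."""

    def __init__(self, rows, cols, layout="right"):
        if layout not in ("left", "right"):
            raise ValueError("layout must be 'left' or 'right'")
        self.rows, self.cols, self.layout = rows, cols, layout
        self.data = [0.0] * (rows * cols)   # flat backing storage

    def _offset(self, i, j):
        # The only layout-specific code: map (i, j) to a flat offset.
        if self.layout == "right":          # row-major
            return i * self.cols + j
        return j * self.rows + i            # column-major

    def __getitem__(self, ij):
        i, j = ij
        return self.data[self._offset(i, j)]

    def __setitem__(self, ij, value):
        i, j = ij
        self.data[self._offset(i, j)] = value


def row_sums(a):
    """Kernel written against the (i, j) interface only, not the layout."""
    return [sum(a[i, j] for j in range(a.cols)) for i in range(a.rows)]


if __name__ == "__main__":
    for layout in ("right", "left"):
        a = Array2D(2, 3, layout)
        for i in range(2):
            for j in range(3):
                a[i, j] = float(i * 10 + j)
        print(layout, a.data, row_sums(a))
```

Both layouts give the same kernel result while storing the elements in a different order, which is the property that lets a library pick the mapping per device without touching the kernel source.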
Articles on Practical Cybernetics. Computer-Developed Computers; Heuristics and Modern Sciences; Linguistics and Practice; Cybernetics and Moral-Ethical Considerations; and Men and Machines at the Chessboard.

ERIC Educational Resources Information Center

Berg, A. I.; And Others

Five articles which were selected from a Russian language book on cybernetics and then translated are presented here. They deal with the topics of: computer-developed computers, heuristics and modern sciences, linguistics and practice, cybernetics and moral-ethical considerations, and computer chess programs. (Author/JY)
FY07 NRL DoD High Performance Computing Modernization Program Annual Reports

DTIC Science & Technology

2008-09-05

performed. Implicit and explicit solutions methods are used as appropriate. The primary finite element codes used are ABAQUS and ANSYS. User subroutines ...geometric complexities, loading path dependence, rate dependence, and interaction between loading types (electrical, thermal and mechanical). Work is not...are used for specialized material constitutive response. Coupled material responses, such as electrical- thermal for capacitor materials or electrical
On the Edge: Intelligent CALL in the 1990s.

ERIC Educational Resources Information Center

Underwood, John

1989-01-01

Examines the possibilities of developing computer-assisted language learning (CALL) based on the best of modern technology, arguing that artificial intelligence (AI) strategies will radically improve the kinds of exercises that can be performed. Recommends combining AI technology with other tools for delivering instruction, such as simulation and…
Developing and Assessing E-Learning Techniques for Teaching Forecasting

ERIC Educational Resources Information Center

Gel, Yulia R.; O'Hara Hines, R. Jeanette; Chen, He; Noguchi, Kimihiro; Schoner, Vivian

2014-01-01

In the modern business environment, managers are increasingly required to perform decision making and evaluate related risks based on quantitative information in the face of uncertainty, which in turn increases demand for business professionals with sound skills and hands-on experience with statistical data analysis. Computer-based training…
Measurement of flows around modern commercial ship models

NASA Astrophysics Data System (ADS)

Kim, W. J.; Van, S. H.; Kim, D. H.

To document the details of flow characteristics around modern commercial ships, global force, wave pattern, and local mean velocity components were measured in the towing tank. Three modern commercial hull models of a container ship (KRISO container ship = KCS) and of two very large crude-oil carriers (VLCCs) with the same forebody and slightly different afterbody (KVLCC and KVLCC2) having bow and stern bulbs were selected for the test. Uncertainty analysis was performed for the measured data using the procedure recommended by the ITTC. Obtained experimental data will provide a good opportunity to explore integrated flow phenomena around practical hull forms of today. Those can be also used as the validation data for the computational fluid dynamics (CFD) code of both inviscid and viscous flow calculations.
Computational Chemistry Using Modern Electronic Structure Methods

ERIC Educational Resources Information Center

Bell, Stephen; Dines, Trevor J.; Chowdhry, Babur Z.; Withnall, Robert

2007-01-01

Various modern electronic structure methods are nowadays used to teach computational chemistry to undergraduate students. Such quantum calculations can now be easily used even for large molecules.
Microelectromechanical reprogrammable logic device.

PubMed

Hafiz, M A A; Kosuru, L; Younis, M I

2016-03-29

In modern computing, the Boolean logic operations are set by interconnect schemes between the transistors. As the miniaturization in the component level to enhance the computational power is rapidly approaching physical limits, alternative computing methods are vigorously pursued. One of the desired aspects in the future computing approaches is the provision for hardware reconfigurability at run time to allow enhanced functionality. Here we demonstrate a reprogrammable logic device based on the electrothermal frequency modulation scheme of a single microelectromechanical resonator, capable of performing all the fundamental 2-bit logic functions as well as n-bit logic operations. Logic functions are performed by actively tuning the linear resonance frequency of the resonator operated at room temperature and under modest vacuum conditions, reprogrammable by the a.c.-driving frequency. The device is fabricated using complementary metal oxide semiconductor compatible mass fabrication process, suitable for on-chip integration, and promises an alternative electromechanical computing scheme.
Microelectromechanical reprogrammable logic device

PubMed Central

Hafiz, M. A. A.; Kosuru, L.; Younis, M. I.

2016-01-01

In modern computing, the Boolean logic operations are set by interconnect schemes between the transistors. As the miniaturization in the component level to enhance the computational power is rapidly approaching physical limits, alternative computing methods are vigorously pursued. One of the desired aspects in the future computing approaches is the provision for hardware reconfigurability at run time to allow enhanced functionality. Here we demonstrate a reprogrammable logic device based on the electrothermal frequency modulation scheme of a single microelectromechanical resonator, capable of performing all the fundamental 2-bit logic functions as well as n-bit logic operations. Logic functions are performed by actively tuning the linear resonance frequency of the resonator operated at room temperature and under modest vacuum conditions, reprogrammable by the a.c.-driving frequency. The device is fabricated using complementary metal oxide semiconductor compatible mass fabrication process, suitable for on-chip integration, and promises an alternative electromechanical computing scheme. PMID:27021295
FY16 NRL DoD High Performance Computing Modernization Program

DTIC Science & Technology

2017-09-15

explored both wind and wave forcing in the numerical wave tank. The model uses high spatial and temporal resolution and a multi-phase formulation to...Results: The ADVED_NS code was used to predict the effect of the standoff distance between micron- diameter wires and flow frequency on the total...contours for a flow over 3D wire mesh. Figure 2 shows verifications comparing computed and theoretical drag forces for the flow over two cylinders in an
A Parallel Numerical Micromagnetic Code Using FEniCS

NASA Astrophysics Data System (ADS)

Nagy, L.; Williams, W.; Mitchell, L.

2013-12-01

Many problems in the geosciences depend on understanding the ability of magnetic minerals to provide stable paleomagnetic recordings. Numerical micromagnetic modelling allows us to calculate the domain structures found in naturally occurring magnetic materials. However the computational cost rises exceedingly quickly with respect to the size and complexity of the geometries that we wish to model. This problem is compounded by the fact that the modern processor design no longer focuses on the speed at which calculations are performed, but rather on the number of computational units amongst which we may distribute our calculations. Consequently to better exploit modern computational resources our micromagnetic simulations must "go parallel". We present a parallel and scalable micromagnetics code written using FEniCS. FEniCS is a multinational collaboration involving several institutions (University of Cambridge, University of Chicago, The Simula Research Laboratory, etc.) that aims to provide a set of tools for writing scientific software; in particular software that employs the finite element method. The advantages of this approach are the leveraging of pre-existing projects from the world of scientific computing (PETSc, Trilinos, Metis/Parmetis, etc.) and exposing these so that researchers may pose problems in a manner closer to the mathematical language of their domain. Our code provides a scriptable interface (in Python) that allows users to not only run micromagnetic models in parallel, but also to perform pre/post processing of data.
LASER APPLICATIONS AND OTHER TOPICS IN QUANTUM ELECTRONICS: Methods of computational physics in the problem of mathematical interpretation of laser investigations

NASA Astrophysics Data System (ADS)

Brodyn, M. S.; Starkov, V. N.

2007-07-01

It is shown that in laser experiments performed using an 'imperfect' setup, in which instrumental distortions are considerable, sufficiently accurate results can be obtained by the modern methods of computational physics. It is found for the first time that a new instrumental function, the 'cap' function, a 'sister' of the Gaussian curve, is called for specifically in laser experiments. A new mathematical model of a measurement path and a carefully performed computational experiment show that a light beam transmitted through a mesoporous film actually has a narrower intensity distribution than the detected beam, and the amplitude of the real intensity distribution is twice as large as that of the measured intensity distributions.
Fast neural net simulation with a DSP processor array.

PubMed

Muller, U A; Gunzinger, A; Guggenbuhl, W

1995-01-01

This paper describes the implementation of a fast neural net simulator on a novel parallel distributed-memory computer. A 60-processor system, named MUSIC (multiprocessor system with intelligent communication), is operational and runs the backpropagation algorithm at a speed of 330 million connection updates per second (continuous weight update) using 32-b floating-point precision. This is equal to 1.4 Gflops sustained performance. The complete system with 3.8 Gflops peak performance consumes less than 800 W of electrical power and fits into a 19-in rack. While reaching the speed of modern supercomputers, MUSIC still can be used as a personal desktop computer at a researcher's own disposal. In neural net simulation, this gives a computing performance to a single user which was unthinkable before. The system's real-time interfaces make it especially useful for embedded applications.
Global Magnetohydrodynamic Simulation Using High Performance FORTRAN on Parallel Computers

NASA Astrophysics Data System (ADS)

Ogino, T.

High Performance Fortran (HPF) is one of the modern and common techniques for achieving high performance parallel computation. We have translated a 3-dimensional magnetohydrodynamic (MHD) simulation code of the Earth's magnetosphere from VPP Fortran to HPF/JA on the Fujitsu VPP5000/56 vector-parallel supercomputer; the MHD code had been fully vectorized and fully parallelized in VPP Fortran. The overall performance and capability of the HPF MHD code were shown to be almost comparable to those of the VPP Fortran code. A 3-dimensional global MHD simulation of the Earth's magnetosphere was performed at a speed of over 400 Gflops with an efficiency of 76.5% on the VPP5000/56 in vector and parallel computation, which permitted comparison with catalog values. We have concluded that fluid and MHD codes that are fully vectorized and fully parallelized in VPP Fortran can be translated with relative ease to HPF/JA, and a code in HPF/JA may be expected to perform comparably to the same code written in VPP Fortran.

Computer-assisted learning in critical care: from ENIAC to HAL.

PubMed

Tegtmeyer, K; Ibsen, L; Goldstein, B

2001-08-01

Computers are commonly used to serve many functions in today's modern intensive care unit. One of the most intriguing and perhaps most challenging applications of computers has been to attempt to improve medical education. With the introduction of the first computer, medical educators began looking for ways to incorporate their use into the modern curriculum. Prior limitations of cost and complexity of computers have consistently decreased since their introduction, making it increasingly feasible to incorporate computers into medical education. Simultaneously, the capabilities and capacities of computers have increased. Combining the computer with other modern digital technology has allowed the development of more intricate and realistic educational tools. The purpose of this article is to briefly describe the history and use of computers in medical education with special reference to critical care medicine. In addition, we will examine the role of computers in teaching and learning and discuss the types of interaction between the computer user and the computer.
Development of seismic tomography software for hybrid supercomputers

NASA Astrophysics Data System (ADS)

Nikitin, Alexandr; Serdyukov, Alexandr; Duchkov, Anton

2015-04-01

Seismic tomography is a technique used for computing a velocity model of a geologic structure from first-arrival travel times of seismic waves. The technique is used in processing of regional and global seismic data, in seismic exploration for prospecting and exploration of mineral and hydrocarbon deposits, and in seismic engineering for monitoring the condition of engineering structures and the surrounding host medium. As a consequence of the development of seismic monitoring systems and the increasing volume of seismic data, there is a growing need for new, more effective computational algorithms for use in seismic tomography applications with improved performance, accuracy and resolution. To achieve this goal, it is necessary to use modern high performance computing systems, such as supercomputers with hybrid architecture that use not only CPUs, but also accelerators and co-processors for computation. The goal of this research is the development of parallel seismic tomography algorithms and a software package for such systems, to be used in processing of large volumes of seismic data (hundreds of gigabytes and more). These algorithms and software package will be optimized for the most common computing devices used in modern hybrid supercomputers, such as Intel Xeon CPUs, NVIDIA Tesla accelerators and Intel Xeon Phi co-processors. In this work, the following general scheme of seismic tomography is utilized. Using the eikonal equation solver, arrival times of seismic waves are computed based on the assumed velocity model of the geologic structure being analyzed. In order to solve the linearized inverse problem, a tomographic matrix is computed that connects model adjustments with travel time residuals, and the resulting system of linear equations is regularized and solved to adjust the model. The effectiveness of parallel implementations of existing algorithms on target architectures is considered. During the first stage of this work, algorithms were developed for execution on supercomputers using multicore CPUs only, with preliminary performance tests showing good parallel efficiency on large numerical grids. Porting of the algorithms to hybrid supercomputers is currently ongoing.
Extending a Flight Management Computer for Simulation and Flight Experiments

NASA Technical Reports Server (NTRS)

Madden, Michael M.; Sugden, Paul C.

2005-01-01

In modern transport aircraft, the flight management computer (FMC) has evolved from a flight planning aid to an important hub for pilot information and origin-to-destination optimization of flight performance. Current trends indicate increasing roles of the FMC in aviation safety, aviation security, increasing airport capacity, and improving environmental impact from aircraft. Related research conducted at the Langley Research Center (LaRC) often requires functional extension of a modern, full-featured FMC. Ideally, transport simulations would include an FMC simulation that could be tailored and extended for experiments. However, due to the complexity of a modern FMC, a large investment (millions of dollars over several years) and scarce domain knowledge are needed to create such a simulation for transport aircraft. As an intermediate alternative, the Flight Research Services Directorate (FRSD) at LaRC created a set of reusable software products to extend flight management functionality upstream of a Boeing-757 FMC, transparently simulating or sharing its operator interfaces. The paper details the design of these products and highlights their use on NASA projects.
A simple modern correctness condition for a space-based high-performance multiprocessor

NASA Technical Reports Server (NTRS)

Probst, David K.; Li, Hon F.

1992-01-01

A number of U.S. national programs, including space-based detection of ballistic missile launches, envisage putting significant computing power into space. Given sufficient progress in low-power VLSI, multichip-module packaging and liquid-cooling technologies, we will see design of high-performance multiprocessors for individual satellites. In very high speed implementations, performance depends critically on tolerating large latencies in interprocessor communication; without latency tolerance, performance is limited by the vastly differing time scales in processor and data-memory modules, including interconnect times. The modern approach to tolerating remote-communication cost in scalable, shared-memory multiprocessors is to use a multithreaded architecture, and alter the semantics of shared memory slightly, at the price of forcing the programmer either to reason about program correctness in a relaxed consistency model or to agree to program in a constrained style. The literature on multiprocessor correctness conditions has become increasingly complex, and sometimes confusing, which may hinder its practical application. We propose a simple modern correctness condition for a high-performance, shared-memory multiprocessor; the correctness condition is based on a simple interface between the multiprocessor architecture and the parallel programming system.
Resolutions of the Coulomb operator: VIII. Parallel implementation using the modern programming language X10.

PubMed

Limpanuparb, Taweetham; Milthorpe, Josh; Rendell, Alistair P

2014-10-30

Use of the modern parallel programming language X10 for computing long-range Coulomb and exchange interactions is presented. By using X10, a partitioned global address space language with support for task parallelism and the explicit representation of data locality, the resolution of the Ewald operator can be parallelized in a straightforward manner including use of both intranode and internode parallelism. We evaluate four different schemes for dynamic load balancing of integral calculation using X10's work stealing runtime, and report performance results for long-range HF energy calculation of large molecule/high quality basis running on up to 1024 cores of a high performance cluster machine. Copyright © 2014 Wiley Periodicals, Inc.
Legacy Code Modernization

NASA Technical Reports Server (NTRS)

Hribar, Michelle R.; Frumkin, Michael; Jin, Haoqiang; Waheed, Abdul; Yan, Jerry; Saini, Subhash (Technical Monitor)

1998-01-01

Over the past decade, high performance computing has evolved rapidly; systems based on commodity microprocessors have been introduced in quick succession from at least seven vendors/families. Porting codes to every new architecture is a difficult problem; in particular, here at NASA, there are many large CFD applications that are very costly to port to new machines by hand. The LCM ("Legacy Code Modernization") Project is the development of an integrated parallelization environment (IPE) which performs the automated mapping of legacy CFD (Fortran) applications to state-of-the-art high performance computers. While most projects to port codes focus on the parallelization of the code, we consider porting to be an iterative process consisting of several steps: 1) code cleanup, 2) serial optimization, 3) parallelization, 4) performance monitoring and visualization, 5) intelligent tools for automated tuning using performance prediction, and 6) machine-specific optimization. The approach for building this parallelization environment is to build the components for each of the steps simultaneously and then integrate them together. The demonstration will exhibit our latest research in building this environment: 1. Parallelizing tools and compiler evaluation. 2. Code cleanup and serial optimization using automated scripts. 3. Development of a code generator for performance prediction. 4. Automated partitioning. 5. Automated insertion of directives. These demonstrations will exhibit the effectiveness of an automated approach for all the steps involved with porting and tuning a legacy code application for a new architecture.
An evaluation of multi-probe locality sensitive hashing for computing similarities over web-scale query logs.

PubMed

Cormode, Graham; Dasgupta, Anirban; Goyal, Amit; Lee, Chi Hoon

2018-01-01

Many modern applications of AI such as web search, mobile browsing, image processing, and natural language processing rely on finding similar items from a large database of complex objects. Due to the very large scale of data involved (e.g., users' queries from commercial search engines), computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (LSH) methods and evaluate four variants in a distributed computing environment (specifically, Hadoop). We identify several optimizations which improve performance and are suitable for deployment in very large scale settings. The experimental results demonstrate that our variants of LSH achieve robust performance with better recall compared with "vanilla" LSH, even when using the same amount of space.
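To make the multi-probe idea concrete, here is a minimal single-machine sketch. It assumes random-hyperplane hashing for cosine similarity and probes only buckets whose signature differs from the query's in at most one bit; the abstract does not say which hash family or probing sequence the authors used, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def make_planes(dim, bits, seed=0):
    """Draw `bits` random hyperplanes in `dim` dimensions (fixed seed)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def signature(vec, planes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(vectors, planes):
    """Hash table from signature to the indices of vectors in that bucket."""
    table = defaultdict(list)
    for i, vec in enumerate(vectors):
        table[signature(vec, planes)].append(i)
    return table


def probes(sig):
    """The query's own bucket, then every bucket at Hamming distance 1."""
    yield sig
    for i in range(len(sig)):
        yield sig[:i] + (1 - sig[i],) + sig[i + 1:]


def query(vec, table, planes):
    """Candidate neighbours gathered from all probed buckets."""
    found = []
    for key in probes(signature(vec, planes)):
        found.extend(table.get(key, []))
    return found


if __name__ == "__main__":
    vectors = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]
    planes = make_planes(dim=3, bits=4)
    table = build_index(vectors, planes)
    print(query(vectors[0], table, planes))
```

Probing neighbouring buckets raises recall from a single hash table, which is why multi-probe variants can match plain LSH while using fewer tables and hence less space.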
A study of workstation computational performance for real-time flight simulation

NASA Technical Reports Server (NTRS)

Maddalon, Jeffrey M.; Cleveland, Jeff I., II

1995-01-01

With recent advances in microprocessor technology, some have suggested that modern workstations provide enough computational power to properly operate a real-time simulation. This paper presents the results of a computational benchmark, based on actual real-time flight simulation code used at Langley Research Center, which was executed on various workstation-class machines. The benchmark was executed on different machines from several companies including: CONVEX Computer Corporation, Cray Research, Digital Equipment Corporation, Hewlett-Packard, Intel, International Business Machines, Silicon Graphics, and Sun Microsystems. The machines are compared by their execution speed, computational accuracy, and porting effort. The results of this study show that the raw computational power needed for real-time simulation is now offered by workstations.
ECG R-R peak detection on mobile phones.

PubMed

Sufi, F; Fang, Q; Cosic, I

2007-01-01

Mobile phones have become an integral part of modern life. Due to their ever-increasing processing power, mobile phones are rapidly expanding their role from a pure telecommunication device to organizer, calculator, gaming device, web browser, music player, audio/video recording device, navigator, etc. The processing power of modern mobile phones has been utilized for many innovative purposes. In this paper, we propose the utilization of mobile phones for monitoring and analysis of biosignals. The computation performed inside the mobile phone's processor will now be exploited for healthcare delivery. We performed a literature review on RR interval detection from ECG and selected a few PC-based algorithms. Then, three of those existing RR interval detection algorithms were programmed on the Java platform. Performance monitoring and comparison studies were carried out on three different mobile devices to determine their applicability in a real-time telemonitoring scenario.
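As a rough illustration of what RR interval detection involves, the sketch below uses a simple threshold-plus-refractory-period peak picker. It is not one of the three algorithms evaluated in the paper (the abstract does not name them); the function names, the fixed `threshold`, and the 0.2 s refractory period are all assumptions made for the example.

```python
def detect_r_peaks(signal, fs, threshold, refractory_s=0.2):
    """Indices of local maxima at or above `threshold`.

    `fs` is the sampling rate in Hz. Peaks closer than `refractory_s`
    seconds to the previously accepted peak are ignored.
    """
    peaks = []
    min_gap = int(refractory_s * fs)
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if is_local_max and signal[i] >= threshold:
            if not peaks or i - peaks[-1] >= min_gap:
                peaks.append(i)
    return peaks


def rr_intervals(peaks, fs):
    """Time in seconds between consecutive detected R peaks."""
    return [(b - a) / fs for a, b in zip(peaks, peaks[1:])]


if __name__ == "__main__":
    fs = 100                       # assumed sampling rate, Hz
    ecg = [0.0] * 300              # synthetic trace with three spikes
    for idx in (50, 130, 210):
        ecg[idx] = 1.0
    peaks = detect_r_peaks(ecg, fs, threshold=0.5)
    print(peaks, rr_intervals(peaks, fs))
```

Real ECG traces need filtering and adaptive thresholds before a detector like this is usable; the point here is only the shape of the computation a phone would have to run in real time.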
Irregular Applications: Architectures & Algorithms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Feo, John T.; Villa, Oreste; Tumeo, Antonino

Irregular applications are characterized by irregular data structures, control and communication patterns. Novel irregular high performance applications which deal with large data sets and require have recently appeared. Unfortunately, current high performance systems and software infrastructures execute irregular algorithms poorly. Only coordinated efforts by end users, area specialists and computer scientists that consider both the architecture and the software stack may be able to provide solutions to the challenges of modern irregular applications.
On the concept of the interactive information and simulation system for gas dynamics and multiphysics problems

NASA Astrophysics Data System (ADS)

Bessonov, O.; Silvestrov, P.

2017-02-01

This paper describes the general idea and the first implementation of the Interactive information and simulation system - an integrated environment that combines computational modules for modeling the aerodynamics and aerothermodynamics of re-entry space vehicles with the large collection of different information materials on this topic. The internal organization and the composition of the system are described and illustrated. Examples of the computational and information output are presented. The system has the unified implementation for Windows and Linux operation systems and can be deployed on any modern high-performance personal computer.
FY16 NRL DoD High Performance Computing Modernization Program Annual Reports

DTIC Science & Technology

2017-09-15

explored both wind and wave forcing in the numerical wave tank. The model uses high spatial and temporal resolution and a multi-phase formulation to...Results: The ADVED_NS code was used to predict the effect of the standoff distance between micron- diameter wires and flow frequency on the total...contours for a flow over 3D wire mesh. Figure 2 shows verifications comparing computed and theoretical drag forces for the flow over two cylinders in an
Open chemistry: RESTful web APIs, JSON, NWChem and the modern web application.

PubMed

Hanwell, Marcus D; de Jong, Wibe A; Harris, Christopher J

2017-10-30

An end-to-end platform for chemical science research has been developed that integrates data from computational and experimental approaches through a modern web-based interface. The platform offers an interactive visualization and analytics environment that functions well on mobile, laptop and desktop devices. It offers pragmatic solutions to ensure that large and complex data sets are more accessible. Existing desktop applications/frameworks were extended to integrate with high-performance computing resources, and offer command-line tools to automate interaction - connecting distributed teams to this software platform on their own terms. The platform was developed openly, and all source code is hosted on the GitHub platform, with automated deployment possible using Ansible coupled with standard Ubuntu-based machine images deployed to cloud machines. The platform is designed to enable teams to reap the benefits of the connected web - going beyond what conventional search and analytics platforms offer in this area. It also has the goal of offering federated instances that can be customized to the sites/research performed. Data are stored using JSON, extending previous approaches that used XML, building structures that support computational chemistry calculations. These structures were developed to make it easy to process data across different languages, and send data to a JavaScript-based web client.
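The JSON-based storage described in this abstract can be illustrated with a minimal sketch. The record layout and field names below (name, atoms, elements, coords, properties) are hypothetical and chosen only for illustration; they are not the platform's actual schema, and the coordinates are placeholder values.

```python
import json

# Hypothetical molecule record: field names and values are illustrative only,
# not the schema used by the platform described above.
record = {
    "name": "hydrogen molecule",
    "atoms": {
        "elements": ["H", "H"],
        "coords": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.74]],
    },
    "properties": {"charge": 0, "multiplicity": 1},
}


def to_wire(rec):
    """Serialize a record to a JSON string that any language can parse."""
    return json.dumps(rec, sort_keys=True)


def from_wire(text):
    """Parse a JSON string and check that every atom has a coordinate triple."""
    rec = json.loads(text)
    elements = rec["atoms"]["elements"]
    coords = rec["atoms"]["coords"]
    if len(elements) != len(coords):
        raise ValueError("each element needs exactly one coordinate triple")
    if any(len(xyz) != 3 for xyz in coords):
        raise ValueError("coordinates must be 3D")
    return rec
```

Because the payload is plain JSON, the same string could be consumed by a JavaScript client with JSON.parse, which is the cross-language benefit the abstract points to.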
Open chemistry: RESTful web APIs, JSON, NWChem and the modern web application

DOE PAGES

Hanwell, Marcus D.; de Jong, Wibe A.; Harris, Christopher J.

2017-10-30

An end-to-end platform for chemical science research has been developed that integrates data from computational and experimental approaches through a modern web-based interface. The platform offers an interactive visualization and analytics environment that functions well on mobile, laptop and desktop devices. It offers pragmatic solutions to ensure that large and complex data sets are more accessible. Existing desktop applications/frameworks were extended to integrate with high-performance computing resources, and offer command-line tools to automate interaction - connecting distributed teams to this software platform on their own terms. The platform was developed openly, and all source code is hosted on the GitHub platform, with automated deployment possible using Ansible coupled with standard Ubuntu-based machine images deployed to cloud machines. The platform is designed to enable teams to reap the benefits of the connected web - going beyond what conventional search and analytics platforms offer in this area. It also has the goal of offering federated instances that can be customized to the sites/research performed. Data are stored using JSON, extending previous approaches that used XML, building structures that support computational chemistry calculations. These structures were developed to make it easy to process data across different languages, and send data to a JavaScript-based web client.
Open chemistry: RESTful web APIs, JSON, NWChem and the modern web application

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hanwell, Marcus D.; de Jong, Wibe A.; Harris, Christopher J.

An end-to-end platform for chemical science research has been developed that integrates data from computational and experimental approaches through a modern web-based interface. The platform offers an interactive visualization and analytics environment that functions well on mobile, laptop and desktop devices. It offers pragmatic solutions to ensure that large and complex data sets are more accessible. Existing desktop applications/frameworks were extended to integrate with high-performance computing resources, and offer command-line tools to automate interaction - connecting distributed teams to this software platform on their own terms. The platform was developed openly, and all source code is hosted on the GitHub platform, with automated deployment possible using Ansible coupled with standard Ubuntu-based machine images deployed to cloud machines. The platform is designed to enable teams to reap the benefits of the connected web - going beyond what conventional search and analytics platforms offer in this area. It also has the goal of offering federated instances that can be customized to the sites/research performed. Data are stored using JSON, extending previous approaches that used XML, building structures that support computational chemistry calculations. These structures were developed to make it easy to process data across different languages, and send data to a JavaScript-based web client.
NREL Evaluates Aquarius Liquid-Cooled High-Performance Computing Technology

Science.gov Websites

HPC and influence the modern data center designer towards adoption of liquid cooling. Our shared technology. Aquila and Sandia chose NREL's HPC Data Center for the initial installation and evaluation because the data center is configured for liquid cooling, along with the required instrumentation to
First Encounters of the Close Kind: The Formation Process of Airline Flight Crews

DTIC Science & Technology

1987-01-01

process and aircrew performance, Foushee notes an interesting etymological parallel: "Webster’s New Collegiate Dictionary (1961) defines cockpit as ’a...here combines applications from the physical science of chemistry and the modern science of computers. In chemistry, a shell is a space occupied by
Bridging the Gap between Basic and Clinical Sciences: A Description of a Radiological Anatomy Course

ERIC Educational Resources Information Center

Torres, Anna; Staskiewicz, Grzegorz J.; Lisiecka, Justyna; Pietrzyk, Lukasz; Czekajlo, Michael; Arancibia, Carlos U.; Maciejewski, Ryszard; Torres, Kamil

2016-01-01

A wide variety of medical imaging techniques pervade modern medicine, and the changing portability and performance of tools like ultrasound imaging have brought these medical imaging techniques into the everyday practice of many specialties outside of radiology. However, proper interpretation of ultrasonographic and computed tomographic images…
Remote control system for high-performance computer simulation of crystal growth by the PFC method

NASA Astrophysics Data System (ADS)

Pavlyuk, Evgeny; Starodumov, Ilya; Osipov, Sergei

2017-04-01

Modeling of the crystallization process by the phase field crystal (PFC) method is one of the important directions of modern computational materials science. In this paper, the practical side of the computer simulation of the crystallization process by the PFC method is investigated. To solve problems using this method, it is necessary to use high-performance computing clusters, data storage systems and other often expensive complex computer systems. Access to such resources is often limited, unstable and accompanied by various administrative problems. In addition, the variety of software and settings of different computing clusters sometimes does not allow researchers to use unified program code, so the program code has to be adapted for each configuration of the computer complex. The practical experience of the authors has shown that the creation of a special control system for computing with the possibility of remote use can greatly simplify the implementation of simulations and increase the performance of scientific research. In the current paper we show the principal idea of such a system and justify its efficiency.
Parallel-hierarchical processing and classification of laser beam profile images based on the GPU-oriented architecture

NASA Astrophysics Data System (ADS)

Yarovyi, Andrii A.; Timchenko, Leonid I.; Kozhemiako, Volodymyr P.; Kokriatskaia, Nataliya I.; Hamdi, Rami R.; Savchuk, Tamara O.; Kulyk, Oleksandr O.; Surtel, Wojciech; Amirgaliyev, Yedilkhan; Kashaganova, Gulzhan

2017-08-01

The paper deals with the insufficient productivity of existing computing tools for large image processing, which do not meet the modern requirements posed by the resource-intensive computing tasks of laser beam profiling. The research concentrated on one of the profiling problems, namely, real-time processing of spot images of the laser beam profile. Development of a theory of parallel-hierarchic transformation made it possible to produce models for high-performance parallel-hierarchical processes, as well as algorithms and software for their implementation based on the GPU-oriented architecture using GPGPU technologies. The analyzed performance of the suggested computerized tools for processing and classification of laser beam profile images allows real-time processing of dynamic images of various sizes.

Neuromorphic Computing – From Materials Research to Systems Architecture Roundtable

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schuller, Ivan K.; Stevens, Rick; Pino, Robinson

2015-10-29

Computation in its many forms is the engine that fuels our modern civilization. Modern computation, based on the von Neumann architecture, has allowed, until now, the development of continuous improvements, as predicted by Moore’s law. However, computation using current architectures and materials will inevitably, within the next 10 years, reach a limit because of fundamental scientific reasons. DOE convened a roundtable of experts in neuromorphic computing systems, materials science, and computer science in Washington on October 29-30, 2015 to address the following basic questions: Can brain-like (“neuromorphic”) computing devices based on new material concepts and systems be developed to dramatically outperform conventional CMOS-based technology? If so, what are the basic research challenges for materials science and computing? The overarching answer that emerged was: The development of novel functional materials and devices incorporated into unique architectures will allow a revolutionary technological leap toward the implementation of a fully “neuromorphic” computer. To address this challenge, the following issues were considered: the main differences between neuromorphic and conventional computing as related to signaling models, timing/clock, non-volatile memory, architecture, fault tolerance, integrated memory and compute, noise tolerance, analog vs. digital, and in situ learning; new neuromorphic architectures needed to produce lower energy consumption, potential novel nanostructured materials, and enhanced computation; device and materials properties needed to implement functions such as hysteresis, stability, and fault tolerance; and comparisons of different implementations (spin torque, memristors, resistive switching, phase change, and optical schemes) for enhanced breakthroughs in performance, cost, fault tolerance, and/or manufacturability.
High Performance Object-Oriented Scientific Programming in Fortran 90

NASA Technical Reports Server (NTRS)

Norton, Charles D.; Decyk, Viktor K.; Szymanski, Boleslaw K.

1997-01-01

We illustrate how Fortran 90 supports object-oriented concepts by example of plasma particle computations on the IBM SP. Our experience shows that Fortran 90 and object-oriented methodology give high performance while providing a bridge from Fortran 77 legacy codes to modern programming principles. All of our object-oriented Fortran 90 codes execute more quickly than the equivalent C++ versions, yet the abstraction modelling capabilities used for scientific programming are comparably powerful.
The computational challenges of Earth-system science.

PubMed

O'Neill, Alan; Steenman-Clark, Lois

2002-06-15

The Earth system--comprising atmosphere, ocean, land, cryosphere and biosphere--is an immensely complex system, involving processes and interactions on a wide range of space- and time-scales. To understand and predict the evolution of the Earth system is one of the greatest challenges of modern science, with success likely to bring enormous societal benefits. High-performance computing, along with the wealth of new observational data, is revolutionizing our ability to simulate the Earth system with computer models that link the different components of the system together. There are, however, considerable scientific and technical challenges to be overcome. This paper will consider four of them: complexity, spatial resolution, inherent uncertainty and time-scales. Meeting these challenges requires a significant increase in the power of high-performance computers. The benefits of being able to make reliable predictions about the evolution of the Earth system should, on their own, amply repay this investment.
Parallel Domain Decomposition Formulation and Software for Large-Scale Sparse Symmetrical/Unsymmetrical Aeroacoustic Applications

NASA Technical Reports Server (NTRS)

Nguyen, D. T.; Watson, Willie R. (Technical Monitor)

2005-01-01

The overall objectives of this research work are to formulate and validate efficient parallel algorithms, and to efficiently design and implement computer software for solving large-scale acoustic problems arising from the unified frameworks of the finite element procedures. The adopted parallel Finite Element (FE) Domain Decomposition (DD) procedures should take full advantage of the multiple processing capabilities offered by most modern high performance computing platforms for efficient parallel computation. To achieve this objective, the formulation needs to integrate efficient sparse (and dense) assembly techniques, hybrid (or mixed) direct and iterative equation solvers, proper preconditioning strategies, unrolling strategies, and effective inter-processor communication schemes. Finally, the numerical performance of the developed parallel finite element procedures will be evaluated by solving a series of structural and acoustic (symmetrical and unsymmetrical) problems on different computing platforms. Comparisons with existing "commercialized" and/or "public domain" software are also included, whenever possible.
An evaluation of multi-probe locality sensitive hashing for computing similarities over web-scale query logs

PubMed Central

2018-01-01

Many modern applications of AI such as web search, mobile browsing, image processing, and natural language processing rely on finding similar items from a large database of complex objects. Due to the very large scale of data involved (e.g., users’ queries from commercial search engines), computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (LSH) methods and evaluate four variants in a distributed computing environment (specifically, Hadoop). We identify several optimizations which improve performance and are suitable for deployment in very large scale settings. The experimental results demonstrate that our variants of LSH achieve robust performance with better recall compared with “vanilla” LSH, even when using the same amount of space. PMID:29346410
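To make the multi-probe idea concrete, here is a minimal single-process sketch. It assumes sign-random-projection hashing and probes every bucket at Hamming distance 1 from the query's home bucket; this is one simple form of multi-probing and is not claimed to be any of the four variants evaluated in the paper, nor does it use Hadoop. All names (LSHIndex, probe_keys, and so on) are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplane normals with Gaussian components."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0.0 else 0
        for h in hyperplanes
    )


def probe_keys(sig):
    """Home bucket first, then every bucket that differs in exactly one bit."""
    keys = [sig]
    for i in range(len(sig)):
        flipped = list(sig)
        flipped[i] ^= 1
        keys.append(tuple(flipped))
    return keys


class LSHIndex:
    """Single hash table; multi-probing trades extra lookups for recall."""

    def __init__(self, n_bits, dim, seed=0):
        self.hyperplanes = make_hyperplanes(n_bits, dim, seed)
        self.buckets = defaultdict(list)

    def add(self, item_id, vec):
        self.buckets[signature(vec, self.hyperplanes)].append(item_id)

    def query(self, vec, multi_probe=True):
        sig = signature(vec, self.hyperplanes)
        keys = probe_keys(sig) if multi_probe else [sig]
        found = []
        for key in keys:
            found.extend(self.buckets.get(key, []))
        return found
```

Probing neighboring buckets of one table, instead of building more tables, is what lets a multi-probe scheme raise recall without using more space, which matches the trade-off the abstract reports.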
Extending Clause Learning of SAT Solvers with Boolean Gröbner Bases

NASA Astrophysics Data System (ADS)

Zengler, Christoph; Küchlin, Wolfgang

We extend clause learning as performed by most modern SAT Solvers by integrating the computation of Boolean Gröbner bases into the conflict learning process. Instead of learning only one clause per conflict, we compute and learn additional binary clauses from a Gröbner basis of the current conflict. We used the Gröbner basis engine of the logic package Redlog contained in the computer algebra system Reduce to extend the SAT solver MiniSAT with Gröbner basis learning. Our approach shows a significant reduction of conflicts and a reduction of restarts and computation time on many hard problems from the SAT 2009 competition.
Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

NASA Astrophysics Data System (ADS)

Shao, Meiyue; Aktulga, H. Metin; Yang, Chao; Ng, Esmond G.; Maris, Pieter; Vary, James P.

2018-01-01

We describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of the special structure of the nuclear configuration interaction problem, which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. We also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.
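The contrast between a single-vector method and a block method can be sketched with plain block (subspace) iteration. This is a deliberately simplified stand-in: it has no preconditioner, it converges to the largest eigenvalues of a small dense symmetric matrix, and it is not the solver described in the paper. Function names are hypothetical.

```python
def matmul_block(a, block):
    """Apply the matrix a (list of rows) to every column vector in block."""
    n = len(a)
    return [[sum(a[i][j] * col[j] for j in range(n)) for i in range(n)]
            for col in block]


def orthonormalize(block):
    """Modified Gram-Schmidt over the column vectors of the block."""
    basis = []
    for col in block:
        v = list(col)
        for q in basis:
            proj = sum(x * y for x, y in zip(q, v))
            v = [x - proj * y for x, y in zip(v, q)]
        norm = sum(x * x for x in v) ** 0.5
        basis.append([x / norm for x in v])
    return basis


def block_iteration(a, start_block, n_iter=200):
    """Iterate several vectors at once; return Rayleigh quotients and vectors."""
    block = orthonormalize(start_block)
    for _ in range(n_iter):
        # One matrix pass serves every vector in the block, which is the
        # source of the higher arithmetic intensity of block methods.
        block = orthonormalize(matmul_block(a, block))
    values = [
        sum(x * y for x, y in zip(col, matmul_block(a, [col])[0]))
        for col in block
    ]
    return values, block
```

The quality of start_block matters here just as the abstract says it does for the real solver: starting vectors that already overlap strongly with the wanted eigenvectors reduce the number of iterations needed.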
MIC-SVM: Designing A Highly Efficient Support Vector Machine For Advanced Modern Multi-Core and Many-Core Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

You, Yang; Song, Shuaiwen; Fu, Haohuan

2014-08-16

Support Vector Machine (SVM) has been widely used in data-mining and Big Data applications as modern commercial databases start to attach an increasing importance to the analytic capabilities. In recent years, SVM was adapted to the field of High Performance Computing for power/performance prediction, auto-tuning, and runtime scheduling. However, even at the risk of losing prediction accuracy due to insufficient runtime information, researchers can only afford to apply offline model training to avoid significant runtime training overhead. To address the challenges above, we designed and implemented MIC-SVM, a highly efficient parallel SVM for x86 based multi-core and many-core architectures, such as the Intel Ivy Bridge CPUs and Intel Xeon Phi coprocessor (MIC).
CSP: A Multifaceted Hybrid Architecture for Space Computing

NASA Technical Reports Server (NTRS)

Rudolph, Dylan; Wilson, Christopher; Stewart, Jacob; Gauvin, Patrick; George, Alan; Lam, Herman; Crum, Gary Alex; Wirthlin, Mike; Wilson, Alex; Stoddard, Aaron

2014-01-01

Research on the CHREC Space Processor (CSP) takes a multifaceted hybrid approach to embedded space computing. Working closely with the NASA Goddard SpaceCube team, researchers at the National Science Foundation (NSF) Center for High-Performance Reconfigurable Computing (CHREC) at the University of Florida and Brigham Young University are developing hybrid space computers that feature an innovative combination of three technologies: commercial-off-the-shelf (COTS) devices, radiation-hardened (RadHard) devices, and fault-tolerant computing. Modern COTS processors provide the utmost in performance and energy-efficiency but are susceptible to ionizing radiation in space, whereas RadHard processors are virtually immune to this radiation but are more expensive, larger, less energy-efficient, and generations behind in speed and functionality. By featuring COTS devices to perform the critical data processing, supported by simpler RadHard devices that monitor and manage the COTS devices, and augmented with novel uses of fault-tolerant hardware, software, information, and networking within and between COTS devices, the resulting system can maximize performance and reliability while minimizing energy consumption and cost. NASA Goddard has adopted the CSP concept and technology with plans underway to feature flight-ready CSP boards on two upcoming space missions.
The path toward HEP High Performance Computing

NASA Astrophysics Data System (ADS)

Apostolakis, John; Brun, René; Carminati, Federico; Gheata, Andrei; Wenzel, Sandro

2014-06-01

High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results, and recent studies have shown that, on modern architectures, it achieves between 10% and 50% of peak performance. Although several successful attempts have been made to port selected codes to GPUs, no major HEP code suite has a "High Performance" implementation. With LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP can no longer neglect the less-than-optimal performance of its code and has to try to make the best use of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the Root and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on the development of a high-performance prototype for particle transport. Achieving a good concurrency level on the emerging parallel architectures without a complete redesign of the framework can only be done by parallelizing at event level, or with a much larger effort at track level. Apart from the shareable data structures, this typically implies a multiplication factor in terms of memory consumption compared to the single threaded version, together with sub-optimal handling of event processing tails. Besides this, the low level instruction pipelining of modern processors cannot be used efficiently to speed up the program. We have implemented a framework that allows scheduling vectors of particles to an arbitrary number of computing resources in a fine grain parallel approach.
The talk will review the current optimisation activities within the SFT group with a particular emphasis on the development perspectives towards a simulation framework able to profit best from the recent technology evolution in computing.
SoAx: A generic C++ Structure of Arrays for handling particles in HPC codes

NASA Astrophysics Data System (ADS)

Homann, Holger; Laenen, Francois

2018-03-01

The numerical study of physical problems often requires integrating the dynamics of a large number of particles evolving according to a given set of equations. Particles are characterized by the information they are carrying, such as an identity, a position, and others. There are generally speaking two different possibilities for handling particles in high performance computing (HPC) codes. The concept of an Array of Structures (AoS) is in the spirit of the object-oriented programming (OOP) paradigm in that the particle information is implemented as a structure. Here, an object (realization of the structure) represents one particle and a set of many particles is stored in an array. In contrast, using the concept of a Structure of Arrays (SoA), a single structure holds several arrays each representing one property (such as the identity) of the whole set of particles. The AoS approach is often implemented in HPC codes due to its handiness and flexibility. For a class of problems, however, it is known that the performance of SoA is much better than that of AoS. We confirm this observation for our particle problem. Using a benchmark we show that on modern Intel Xeon processors the SoA implementation is typically several times faster than the AoS one. On Intel's MIC co-processors the performance gap even attains a factor of ten. The same is true for GPU computing, using both computational and multi-purpose GPUs. Combining performance and handiness, we present the library SoAx that has optimal performance (on CPUs, MICs, and GPUs) while providing the same handiness as AoS. For this, SoAx uses modern C++ design techniques such as template metaprogramming, which allows code to be generated automatically for user-defined heterogeneous data structures.
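The two layouts contrasted in this abstract can be sketched side by side. The example is only a layout illustration in Python, using a hypothetical one-dimensional particle with an identity, a position and a velocity; it does not reproduce the SoAx API (a C++ library) and says nothing about the performance figures quoted above.

```python
from array import array
from dataclasses import dataclass


@dataclass
class Particle:
    """AoS element: one object carries every property of one particle."""
    ident: int
    x: float
    v: float


def advance_aos(particles, dt):
    """Array of Structures: walk the objects, touching all fields of each."""
    for p in particles:
        p.x += p.v * dt


class ParticlesSoA:
    """Structure of Arrays: one contiguous typed array per property."""

    def __init__(self, idents, xs, vs):
        self.ident = array("q", idents)  # signed 64-bit integers
        self.x = array("d", xs)          # double-precision positions
        self.v = array("d", vs)          # double-precision velocities

    def advance(self, dt):
        """Only the x and v arrays are read; ident is never loaded."""
        x, v = self.x, self.v
        for i in range(len(x)):
            x[i] += v[i] * dt
```

In the SoA form each property sits in its own contiguous array, so a loop that updates positions streams through exactly the data it needs; in a compiled language this is what makes vectorized loads possible, whereas the AoS form interleaves unused fields with the used ones.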
High performance in silico virtual drug screening on many-core processors.

PubMed

McIntosh-Smith, Simon; Price, James; Sessions, Richard B; Ibarra, Amaurys A

2015-05-01

Drug screening is an important part of the drug development pipeline for the pharmaceutical industry. Traditional, lab-based methods are increasingly being augmented with computational methods, ranging from simple molecular similarity searches through more complex pharmacophore matching to more computationally intensive approaches, such as molecular docking. The latter simulates the binding of drug molecules to their targets, typically protein molecules. In this work, we describe BUDE, the Bristol University Docking Engine, which has been ported to the OpenCL industry standard parallel programming language in order to exploit the performance of modern many-core processors. Our highly optimized OpenCL implementation of BUDE sustains 1.43 TFLOP/s on a single Nvidia GTX 680 GPU, or 46% of peak performance. BUDE also exploits OpenCL to deliver effective performance portability across a broad spectrum of different computer architectures from different vendors, including GPUs from Nvidia and AMD, Intel's Xeon Phi and multi-core CPUs with SIMD instruction sets.
High performance in silico virtual drug screening on many-core processors

PubMed Central

Price, James; Sessions, Richard B; Ibarra, Amaurys A

2015-01-01

Drug screening is an important part of the drug development pipeline for the pharmaceutical industry. Traditional, lab-based methods are increasingly being augmented with computational methods, ranging from simple molecular similarity searches through more complex pharmacophore matching to more computationally intensive approaches, such as molecular docking. The latter simulates the binding of drug molecules to their targets, typically protein molecules. In this work, we describe BUDE, the Bristol University Docking Engine, which has been ported to the OpenCL industry standard parallel programming language in order to exploit the performance of modern many-core processors. Our highly optimized OpenCL implementation of BUDE sustains 1.43 TFLOP/s on a single Nvidia GTX 680 GPU, or 46% of peak performance. BUDE also exploits OpenCL to deliver effective performance portability across a broad spectrum of different computer architectures from different vendors, including GPUs from Nvidia and AMD, Intel’s Xeon Phi and multi-core CPUs with SIMD instruction sets. PMID:25972727
OpenCluster: A Flexible Distributed Computing Framework for Astronomical Data Processing

NASA Astrophysics Data System (ADS)

Wei, Shoulin; Wang, Feng; Deng, Hui; Liu, Cuiyin; Dai, Wei; Liang, Bo; Mei, Ying; Shi, Congming; Liu, Yingbo; Wu, Jingping

2017-02-01

The volume of data generated by modern astronomical telescopes is extremely large and rapidly growing. However, current high-performance data processing architectures/frameworks are not well suited for astronomers because of their limitations and programming difficulties. In this paper, we therefore present OpenCluster, an open-source distributed computing framework to support rapidly developing high-performance processing pipelines of astronomical big data. We first detail the OpenCluster design principles and implementations and present the APIs facilitated by the framework. We then demonstrate a case in which OpenCluster is used to resolve complex data processing problems for developing a pipeline for the Mingantu Ultrawide Spectral Radioheliograph. Finally, we present our OpenCluster performance evaluation. Overall, OpenCluster provides not only high fault tolerance and simple programming interfaces, but also a flexible means of scaling up the number of interacting entities. OpenCluster thereby provides an easily integrated distributed computing framework for quickly developing a high-performance data processing system of astronomical telescopes and for significantly reducing software development expenses.
Robotics in medicine

NASA Astrophysics Data System (ADS)

Kuznetsov, D. N.; Syryamkin, V. I.

2015-11-01

Modern technologies play a very important role in our lives. It is hard to imagine how people could get along without personal computers, or companies without powerful computer centers. Nowadays, many devices make modern medicine more effective. Medicine is developing constantly, so the introduction of robots in this sector is a very promising activity. Advances in technology have influenced medicine greatly. Robotic surgery is now actively developing worldwide. Scientists have been carrying out research and practical attempts to create robotic surgeons for more than 20 years, since the mid-80s of the last century. Robotic assistants play an important role in modern medicine. This industry is fairly new and is at an early stage of development; despite this, some developments already have worldwide application; they function successfully and bring invaluable help to employees of medical institutions. Today, doctors can perform operations that seemed impossible a few years ago. Such progress in medicine is due to many factors. First, modern operating rooms are equipped with up-to-date equipment, allowing doctors to perform operations more accurately and with less risk to the patient. Second, technology has made it possible to improve the quality of doctors' training. Various types of robots exist now: assistants, military robots, space, household and medical, of course. Further, we should make a detailed analysis of existing types of robots and their application. The purpose of the article is to illustrate the most popular types of robots used in medicine.
Results of comparative RBMK neutron computation using VNIIEF codes (cell computation, 3D statics, 3D kinetics). Final report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grebennikov, A.N.; Zhitnik, A.K.; Zvenigorodskaya, O.A.

1995-12-31

In conformity with the protocol of the Workshop under Contract "Assessment of RBMK reactor safety using modern Western Codes", VNIIEF performed a neutronics computation series to compare western and VNIIEF codes and assess whether VNIIEF codes are suitable for RBMK type reactor safety assessment computation. The work was carried out in close collaboration with M.I. Rozhdestvensky and L.M. Podlazov, NIKIET employees. The effort involved: (1) cell computations with the WIMS, EKRAN codes (improved modification of the LOMA code) and the S-90 code (VNIIEF Monte Carlo). Cell, polycell, burnup computation; (2) 3D computation of static states with the KORAT-3D and NEU codes and comparison with results of computation with the NESTLE code (USA). The computations were performed in the geometry and using the neutron constants presented by the American party; (3) 3D computation of neutron kinetics with the KORAT-3D and NEU codes. These computations were performed in two formulations, both being developed in collaboration with NIKIET. Formulation of the first problem agrees as closely as possible with one of the NESTLE problems and imitates gas bubble travel through a core. The second problem is a model of the RBMK as a whole with imitation of control and protection system controls (CPS) movement in a core.
Unstructured Adaptive (UA) NAS Parallel Benchmark. Version 1.0

NASA Technical Reports Server (NTRS)

Feng, Huiyu; VanderWijngaart, Rob; Biswas, Rupak; Mavriplis, Catherine

2004-01-01

We present a complete specification of a new benchmark for measuring the performance of modern computer systems when solving scientific problems featuring irregular, dynamic memory accesses. It complements the existing NAS Parallel Benchmark suite. The benchmark involves the solution of a stylized heat transfer problem in a cubic domain, discretized on an adaptively refined, unstructured mesh.
Software Tools for Development on the Peregrine System | High-Performance Computing | NREL

Science.gov Websites

Software Tools for ... and manage software at the source code level. Cross-Platform Make and SCons: The "Cross-Platform Make" (CMake) package is from Kitware, and SCons is a modern software build tool based on Python
Modern Design of Resonant Edge-Slot Array Antennas

NASA Technical Reports Server (NTRS)

Gosselin, R. B.

2006-01-01

Resonant edge-slot (slotted-waveguide) array antennas can now be designed very accurately following a modern computational approach like that followed for some other microwave components. This modern approach makes it possible to design superior antennas at lower cost than was previously possible. Heretofore, the physical and engineering knowledge of resonant edge-slot array antennas had remained immature since they were introduced during World War II. This is because despite their mechanical simplicity, high reliability, and potential for operation with high efficiency, the electromagnetic behavior of resonant edge-slot antennas is very complex. Because engineering design formulas and curves for such antennas are not available in the open literature, designers have been forced to implement iterative processes of fabricating and testing multiple prototypes to derive design databases, each unique for a specific combination of operating frequency and set of waveguide tube dimensions. The expensive, time-consuming nature of these processes has inhibited the use of resonant edge-slot antennas. The present modern approach reduces costs by making it unnecessary to build and test multiple prototypes. As an additional benefit, this approach affords a capability to design an array of slots having different dimensions to taper the antenna illumination to reduce the amplitudes of unwanted side lobes. The heart of the modern approach is the use of the latest commercially available microwave-design software, which implements finite-element models of electromagnetic fields in and around waveguides, antenna elements, and similar components. Instead of building and testing prototypes, one builds a database and constructs design curves from the results of computational simulations for sets of design parameters. The figure shows a resonant edge-slot antenna designed following this approach. 
Intended for use as part of a radiometer operating at a frequency of 10.7 GHz, this antenna was fabricated from dimensions defined exclusively by results of computational simulations. The final design was found to be well optimized and to yield performance exceeding that initially required.
CUDA Optimization Strategies for Compute- and Memory-Bound Neuroimaging Algorithms

PubMed Central

Lee, Daren; Dinov, Ivo; Dong, Bin; Gutman, Boris; Yanovsky, Igor; Toga, Arthur W.

2011-01-01

As neuroimaging algorithms and technology continue to grow faster than CPU performance in complexity and image resolution, data-parallel computing methods will be increasingly important. The high performance, data-parallel architecture of modern graphical processing units (GPUs) can reduce computational times by orders of magnitude. However, its massively threaded architecture introduces challenges when GPU resources are exceeded. This paper presents optimization strategies for compute- and memory-bound algorithms for the CUDA architecture. For compute-bound algorithms, the registers are reduced through variable reuse via shared memory and the data throughput is increased through heavier thread workloads and maximizing the thread configuration for a single thread block per multiprocessor. For memory-bound algorithms, fitting the data into the fast but limited GPU resources is achieved through reorganizing the data into self-contained structures and employing a multi-pass approach. Memory latencies are reduced by selecting memory resources whose cache performance are optimized for the algorithm's access patterns. We demonstrate the strategies on two computationally expensive algorithms and achieve optimized GPU implementations that perform up to 6× faster than unoptimized ones. Compared to CPU implementations, we achieve peak GPU speedups of 129× for the 3D unbiased nonlinear image registration technique and 93× for the non-local means surface denoising algorithm. PMID:21159404

CUDA optimization strategies for compute- and memory-bound neuroimaging algorithms.

PubMed

Lee, Daren; Dinov, Ivo; Dong, Bin; Gutman, Boris; Yanovsky, Igor; Toga, Arthur W

2012-06-01

As neuroimaging algorithms and technology continue to grow faster than CPU performance in complexity and image resolution, data-parallel computing methods will be increasingly important. The high performance, data-parallel architecture of modern graphical processing units (GPUs) can reduce computational times by orders of magnitude. However, its massively threaded architecture introduces challenges when GPU resources are exceeded. This paper presents optimization strategies for compute- and memory-bound algorithms for the CUDA architecture. For compute-bound algorithms, the registers are reduced through variable reuse via shared memory and the data throughput is increased through heavier thread workloads and maximizing the thread configuration for a single thread block per multiprocessor. For memory-bound algorithms, fitting the data into the fast but limited GPU resources is achieved through reorganizing the data into self-contained structures and employing a multi-pass approach. Memory latencies are reduced by selecting memory resources whose cache performance are optimized for the algorithm's access patterns. We demonstrate the strategies on two computationally expensive algorithms and achieve optimized GPU implementations that perform up to 6× faster than unoptimized ones. Compared to CPU implementations, we achieve peak GPU speedups of 129× for the 3D unbiased nonlinear image registration technique and 93× for the non-local means surface denoising algorithm. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
An introduction to real-time graphical techniques for analyzing multivariate data

NASA Astrophysics Data System (ADS)

Friedman, Jerome H.; McDonald, John Alan; Stuetzle, Werner

1987-08-01

Orion I is a graphics system used to study applications of computer graphics - especially interactive motion graphics - in statistics. Orion I is the newest of a family of "Prim" systems, whose most striking common feature is the use of real-time motion graphics to display three dimensional scatterplots. Orion I differs from earlier Prim systems through the use of modern and relatively inexpensive raster graphics and microprocessor technology. It also delivers more computing power to its user; Orion I can perform more sophisticated real-time computations than were possible on previous such systems. We demonstrate some of Orion I's capabilities in our film: "Exploring data with Orion I".
Accelerating scientific computations with mixed precision algorithms

NASA Astrophysics Data System (ADS)

Baboulin, Marc; Buttari, Alfredo; Dongarra, Jack; Kurzak, Jakub; Langou, Julie; Langou, Julien; Luszczek, Piotr; Tomov, Stanimire

2009-12-01

On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution. The approach presented here can apply not only to conventional processors but also to other technologies such as Field Programmable Gate Arrays (FPGA), Graphical Processing Units (GPU), and the STI Cell BE processor. Results on modern processor architectures and the STI Cell BE are presented.

Program summary
Program title: ITER-REF
Catalogue identifier: AECO_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AECO_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 7211
No. of bytes in distributed program, including test data, etc.: 41 862
Distribution format: tar.gz
Programming language: FORTRAN 77
Computer: desktop, server
Operating system: Unix/Linux
RAM: 512 Mbytes
Classification: 4.8
External routines: BLAS (optional)
Nature of problem: On modern architectures, the performance of 32-bit operations is often at least twice as fast as the performance of 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse linear algebra algorithms can be significantly enhanced while maintaining the 64-bit accuracy of the resulting solution.
Solution method: Mixed precision algorithms stem from the observation that, in many cases, a single precision solution of a problem can be refined to the point where double precision accuracy is achieved. A common approach to the solution of linear systems, either dense or sparse, is to perform the LU factorization of the coefficient matrix using Gaussian elimination. First, the coefficient matrix A is factored into the product of a lower triangular matrix L and an upper triangular matrix U. Partial row pivoting is in general used to improve numerical stability resulting in a factorization PA=LU, where P is a permutation matrix. The solution for the system is achieved by first solving Ly=Pb (forward substitution) and then solving Ux=y (backward substitution). Due to round-off errors, the computed solution, x, carries a numerical error magnified by the condition number of the coefficient matrix A. In order to improve the computed solution, an iterative process can be applied, which produces a correction to the computed solution at each iteration, which then yields the method that is commonly known as the iterative refinement algorithm. Provided that the system is not too ill-conditioned, the algorithm produces a solution correct to the working precision.
Running time: seconds/minutes
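The solution method described in this record can be illustrated with a small sketch. It is not the ITER-REF program (which is written in FORTRAN 77) and makes several assumptions: 32-bit arithmetic is emulated in pure Python by rounding each result through `struct`, the matrix is small and dense, and the names `f32`, `lu_factor_f32`, `lu_solve_f32`, `residual`, and `solve_mixed_precision` are hypothetical. The factorization PA=LU and the triangular solves run in emulated single precision, while the residual and the solution update use Python's native 64-bit floats.

```python
import struct


def f32(x):
    """Round a 64-bit Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(A):
    """Factor PA = LU with partial pivoting, rounding every result to 32 bits."""
    n = len(A)
    LU = [[f32(v) for v in row] for row in A]
    perm = list(range(n))  # perm[i] = original index of the row now at position i
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(LU[i][k]))
        if LU[p][k] == 0.0:
            raise ValueError("matrix is singular in 32-bit arithmetic")
        LU[k], LU[p] = LU[p], LU[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            LU[i][k] = f32(LU[i][k] / LU[k][k])  # multiplier, stored in L
            for j in range(k + 1, n):
                LU[i][j] = f32(LU[i][j] - f32(LU[i][k] * LU[k][j]))
    return LU, perm


def lu_solve_f32(LU, perm, b):
    """Solve Ly = Pb, then Ux = y, in emulated 32-bit arithmetic."""
    n = len(LU)
    x = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):  # forward substitution (L has a unit diagonal)
        for j in range(i):
            x[i] = f32(x[i] - f32(LU[i][j] * x[j]))
    for i in reversed(range(n)):  # backward substitution
        for j in range(i + 1, n):
            x[i] = f32(x[i] - f32(LU[i][j] * x[j]))
        x[i] = f32(x[i] / LU[i][i])
    return x


def residual(A, x, b):
    """Return b - Ax, computed in 64-bit arithmetic."""
    return [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]


def solve_mixed_precision(A, b, max_iter=10, tol=1e-14):
    """Iterative refinement: 32-bit factorization, 64-bit residual and update."""
    LU, perm = lu_factor_f32(A)        # expensive step, done once in 32-bit
    x = lu_solve_f32(LU, perm, b)      # initial 32-bit solution
    scale = max(abs(v) for v in b) or 1.0
    for _ in range(max_iter):
        r = residual(A, x, b)          # 64-bit residual
        if max(abs(v) for v in r) <= tol * scale:
            break
        d = lu_solve_f32(LU, perm, r)  # 32-bit correction, reusing the factors
        x = [xi + di for xi, di in zip(x, d)]  # 64-bit update
    return x
```

The design point is that the cubic-cost factorization happens only once and in the cheaper precision; each refinement step reuses the factors and costs only quadratic work. The stopping tolerance and iteration cap are illustrative choices, not values taken from the record.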
Final Project Report: Data Locality Enhancement of Dynamic Simulations for Exascale Computing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shen, Xipeng

The goal of this project is to develop a set of techniques and software tools to enhance the matching between memory accesses in dynamic simulations and the prominent features of modern and future manycore systems, alleviating the memory performance issues for exascale computing. In the first three years, the PI and his group have achieved significant progress towards the goal, producing a set of novel techniques for improving the memory performance and data locality in manycore systems, yielding 18 conference and workshop papers and 4 journal papers and graduating 6 Ph.Ds. This report summarizes the research results of this project through that period.
Thin client performance for remote 3-D image display.

PubMed

Lai, Albert; Nieh, Jason; Laine, Andrew; Starren, Justin

2003-01-01

Several trends in biomedical computing are converging in a way that will require new approaches to telehealth image display. Image viewing is becoming an "anytime, anywhere" activity. In addition, organizations are beginning to recognize that healthcare providers are highly mobile and optimal care requires providing information wherever the provider and patient are. Thin-client computing is one way to support image viewing in this complex environment. However, little is known about the behavior of thin-client systems in supporting image transfer in modern heterogeneous networks. Our results show that thin clients can deliver acceptable performance over conditions commonly seen in wireless networks if newer protocols optimized for these conditions are used.
Vectorization for Molecular Dynamics on Intel Xeon Phi Coprocessors

NASA Astrophysics Data System (ADS)

Yi, Hongsuk

2014-03-01

Many modern processors are capable of exploiting data-level parallelism through the use of single instruction multiple data (SIMD) execution. The new Intel Xeon Phi coprocessor supports 512-bit vector registers for high-performance computing. In this paper, we have developed a hierarchical parallelization scheme for accelerated molecular dynamics simulations with the Tersoff potentials for covalently bonded solid crystals on Intel Xeon Phi coprocessor systems. The scheme exploits multi-level parallelism: we combine tightly coupled thread-level and task-level parallelism with 512-bit vector registers. The simulation results show that the parallel performance of the SIMD implementations on Xeon Phi is apparently superior to that on the x86 CPU architecture.
Analysis of lead twist in modern high-performance grinding methods

NASA Astrophysics Data System (ADS)

Kundrák, J.; Gyáni, K.; Felhő, C.; Markopoulos, AP; Deszpoth, I.

2016-11-01

According to the quality requirements for road vehicle shafts that bear dynamic seals, twisted-pattern micro-geometrical topography is not allowed. It is a question whether newer grinding methods, such as quick-point grinding and peel grinding, could provide twist-free topography. According to industrial experience, twist-free surfaces can be made; however, with certain settings, some twist occurs. In this paper it is proved by detailed chip-geometrical analysis that the topography generated by the new procedures is theoretically twist-patterned because of the feeding motion of the CBN tool. The presented investigation was carried out with a single-grain wheel model and computer simulation.
Application of modern tools and techniques to maximize engineering productivity in the development of orbital operations plans for the space station program

NASA Technical Reports Server (NTRS)

Manford, J. S.; Bennett, G. R.

1985-01-01

The Space Station Program will incorporate analysis of operations constraints and considerations in the early design phases to avoid the need for later modifications to the Space Station for operations. The application of modern tools and administrative techniques to minimize the cost of performing effective orbital operations planning and design analysis in the preliminary design phase of the Space Station Program is discussed. Tools and techniques discussed include: approach for rigorous analysis of operations functions, use of the resources of a large computer network, and providing for efficient research and access to information.
A computer analysis of the RF performance of a ground-mounted, air-supported radome

NASA Astrophysics Data System (ADS)

Punnett, M. B.; Joy, E. B.

Several reports and actual operating experience have highlighted the degradation of RF performance which can occur when SSR or IFF antennas are mounted above primary search antennas within metal space frame or dielectric space frame radomes. These effects are usually attributed to both the high incidence angles and the sensitivity of the low-gain antennas to sidelobe changes due to scattered energy. Although it has been widely accepted that thin membrane radomes would provide superior performance for this application, there has been little supporting documentation. A plane-wave-spectrum (PWS) computer-based radome analysis was conducted to assess the performance of a specific air-supported radome for the SSR application. In conducting the analysis, a mathematical model of a modern SSR antenna was combined with a model of an existing Birdair radome design.
Assessment of CFD capability for prediction of hypersonic shock interactions

NASA Astrophysics Data System (ADS)

Knight, Doyle; Longo, José; Drikakis, Dimitris; Gaitonde, Datta; Lani, Andrea; Nompelis, Ioannis; Reimann, Bodo; Walpot, Louis

2012-01-01

The aerothermodynamic loadings associated with shock wave boundary layer interactions (shock interactions) must be carefully considered in the design of hypersonic air vehicles. The capability of Computational Fluid Dynamics (CFD) software to accurately predict hypersonic shock wave laminar boundary layer interactions is examined. A series of independent computations performed by researchers in the US and Europe are presented for two generic configurations (double cone and cylinder) and compared with experimental data. The results illustrate the current capabilities and limitations of modern CFD methods for these flows.
Merging the Machines of Modern Science

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wolf, Laura; Collins, Jim

Two recent projects have harnessed supercomputing resources at the US Department of Energy’s Argonne National Laboratory in a novel way to support major fusion science and particle collider experiments. Using leadership computing resources, one team ran fine-grid analysis of real-time data to make near-real-time adjustments to an ongoing experiment, while a second team is working to integrate Argonne’s supercomputers into the Large Hadron Collider/ATLAS workflow. Together these efforts represent a new paradigm of the high-performance computing center as a partner in experimental science.
New Trends in E-Science: Machine Learning and Knowledge Discovery in Databases

NASA Astrophysics Data System (ADS)

Brescia, Massimo

2012-11-01

Data mining, or Knowledge Discovery in Databases (KDD), while being the main methodology to extract the scientific information contained in Massive Data Sets (MDS), needs to tackle crucial problems since it has to orchestrate complex challenges posed by transparent access to different computing environments, scalability of algorithms, reusability of resources. To achieve a leap forward for the progress of e-science in the data avalanche era, the community needs to implement an infrastructure capable of performing data access, processing and mining in a distributed but integrated context. The increasing complexity of modern technologies carried out a huge production of data, whose related warehouse management and the need to optimize analysis and mining procedures lead to a change in concept on modern science. Classical data exploration, based on local user own data storage and limited computing infrastructures, is no more efficient in the case of MDS, worldwide spread over inhomogeneous data centres and requiring teraflop processing power. In this context modern experimental and observational science requires a good understanding of computer science, network infrastructures, Data Mining, etc. i.e. of all those techniques which fall into the domain of the so called e-science (recently assessed also by the Fourth Paradigm of Science). Such understanding is almost completely absent in the older generations of scientists and this reflects in the inadequacy of most academic and research programs. A paradigm shift is needed: statistical pattern recognition, object oriented programming, distributed computing, parallel programming need to become an essential part of scientific background. A possible practical solution is to provide the research community with easy-to understand, easy-to-use tools, based on the Web 2.0 technologies and Machine Learning methodology. 
In such tools almost all the complexity is hidden from the final user, yet they remain flexible and able to produce efficient and reliable scientific results. All these considerations are described in detail in the chapter. Moreover, examples are introduced of modern applications that offer a wide variety of e-science communities a large spectrum of computational facilities for exploiting the wealth of available massive data sets and powerful machine learning and statistical algorithms.
DNS of Flow in a Low-Pressure Turbine Cascade Using a Discontinuous-Galerkin Spectral-Element Method

NASA Technical Reports Server (NTRS)

Garai, Anirban; Diosady, Laslo Tibor; Murman, Scott; Madavan, Nateri

2015-01-01

A new computational capability under development for accurate and efficient high-fidelity direct numerical simulation (DNS) and large eddy simulation (LES) of turbomachinery is described. This capability is based on an entropy-stable Discontinuous-Galerkin spectral-element approach that extends to arbitrarily high orders of spatial and temporal accuracy and is implemented in a computationally efficient manner on a modern high performance computer architecture. A validation study using this method to perform DNS of flow in a low-pressure turbine airfoil cascade is presented. Preliminary results indicate that the method captures the main features of the flow. Discrepancies between the predicted results and the experiments are likely due to the effects of freestream turbulence not being included in the simulation and will be addressed in the final paper.
Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures

PubMed Central

Manolakos, Elias S.

2015-01-01

Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub. PMID:26605332
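The reported figures can be cross-checked with a short calculation. The abstract does not define efficiency, so the sketch below assumes the common definition of parallel efficiency as speedup divided by the number of cores; the function name `parallel_efficiency` is hypothetical.

```python
def parallel_efficiency(speedup, cores):
    """Parallel efficiency under the assumed definition: speedup / core count."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Figures quoted in the abstract: speedup of 42 on the 48-core SCC.
efficiency = parallel_efficiency(42, 48)
print(f"{efficiency:.3f}")
```

Under that assumption the ratio is 0.875, which agrees with the quoted efficiency of 0.9 after rounding to one decimal place.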
Efficient Multicriteria Protein Structure Comparison on Modern Processor Architectures.

PubMed

Sharma, Anuj; Manolakos, Elias S

2015-01-01

Fast increasing computational demand for all-to-all protein structures comparison (PSC) is a result of three confounding factors: rapidly expanding structural proteomics databases, high computational complexity of pairwise protein comparison algorithms, and the trend in the domain towards using multiple criteria for protein structures comparison (MCPSC) and combining results. We have developed a software framework that exploits many-core and multicore CPUs to implement efficient parallel MCPSC in modern processors based on three popular PSC methods, namely, TMalign, CE, and USM. We evaluate and compare the performance and efficiency of the two parallel MCPSC implementations using Intel's experimental many-core Single-Chip Cloud Computer (SCC) as well as Intel's Core i7 multicore processor. We show that the 48-core SCC is more efficient than the latest generation Core i7, achieving a speedup factor of 42 (efficiency of 0.9), making many-core processors an exciting emerging technology for large-scale structural proteomics. We compare and contrast the performance of the two processors on several datasets and also show that MCPSC outperforms its component methods in grouping related domains, achieving a high F-measure of 0.91 on the benchmark CK34 dataset. The software implementation for protein structure comparison using the three methods and combined MCPSC, along with the developed underlying rckskel algorithmic skeletons library, is available via GitHub.
Some observations on computer lip-reading: moving from the dream to the reality

NASA Astrophysics Data System (ADS)

Bear, Helen L.; Owen, Gari; Harvey, Richard; Theobald, Barry-John

2014-10-01

In the quest for greater computer lip-reading performance there are a number of tacit assumptions which are either present in the datasets (high resolution for example) or in the methods (recognition of spoken visual units called "visemes" for example). Here we review these and other assumptions and show the surprising result that computer lip-reading is not heavily constrained by video resolution, pose, lighting and other practical factors. However, the working assumption that visemes, which are the visual equivalent of phonemes, are the best unit for recognition does need further examination. We conclude that visemes, which were defined over a century ago, are unlikely to be optimal for a modern computer lip-reading system.
Hands-on approach to teaching Earth system sciences using an information-computational web-GIS portal "Climate"

NASA Astrophysics Data System (ADS)

Gordova, Yulia; Gorbatenko, Valentina; Martynova, Yulia; Shulgina, Tamara

2014-05-01

Making education relevant to workplace tasks is a key problem of higher education, because old-school training programs are not keeping pace with the rapidly changing situation in the professional field of environmental sciences. A joint group of specialists from Tomsk State University and the Siberian Center for Environmental Research and Training/IMCES SB RAS developed several new courses for students of the "Climatology" and "Meteorology" specialties, which combine theoretical knowledge from up-to-date environmental sciences with practical tasks. To organize the educational process we use the open-source course management system Moodle (www.moodle.org). It gave us an opportunity to combine text and multimedia in the theoretical part of the educational courses. The hands-on approach is realized through the development of innovative trainings, which are performed within the information-computational platform "Climate" (http://climate.scert.ru/) using web GIS tools. These trainings contain practical tasks on climate modeling and on the assessment and analysis of climate change, and are to be performed using the tools that scientists typically use for this kind of research. Thus, students are engaged in the use of modern tools for geophysical data analysis, which makes their professional learning more dynamic. The hands-on approach can help us fill this gap because it is the only approach that offers experience, increases student involvement, and advances the use of modern information and communication tools. The courses are implemented at Tomsk State University and help to form a modern curriculum in the Earth system science area. This work is partially supported by SB RAS project VIII.80.2.1 and RFBR grants 13-05-12034 and 14-05-00502.
Virtual reality computer simulation.

PubMed

Grantcharov, T P; Rosenberg, J; Pahle, E; Funch-Jensen, P

2001-03-01

Objective assessment of psychomotor skills should be an essential component of a modern surgical training program. There are computer systems that can be used for this purpose, but their wide application is not yet generally accepted. The aim of this study was to validate the role of virtual reality computer simulation as a method for evaluating surgical laparoscopic skills. The study included 14 surgical residents. On day 1, they performed two runs of all six tasks on the Minimally Invasive Surgical Trainer, Virtual Reality (MIST VR). On day 2, they performed a laparoscopic cholecystectomy on living pigs; afterward, they were tested again on the MIST VR. A group of experienced surgeons evaluated the trainees' performance on the animal operation, giving scores for total performance error and economy of motion. During the tasks on the MIST VR, errors and noneconomy of movements for the left and right hand were also recorded. There were significant correlations between error scores in vivo and three of the six in vitro tasks (p < 0.05). In vivo economy scores correlated significantly with non-economy right-hand scores for five of the six tasks and with non-economy left-hand scores for one of the six tasks (p < 0.05). In this study, laparoscopic performance in the animal model correlated significantly with performance on the computer simulator. Thus, the computer model seems to be a promising objective method for the assessment of laparoscopic psychomotor skills.
GeauxDock: Accelerating Structure-Based Virtual Screening with Heterogeneous Computing

PubMed Central

Fang, Ye; Ding, Yun; Feinstein, Wei P.; Koppelman, David M.; Moreno, Juana; Jarrell, Mark; Ramanujam, J.; Brylinski, Michal

2016-01-01

Computational modeling of drug binding to proteins is an integral component of direct drug design. Particularly, structure-based virtual screening is often used to perform large-scale modeling of putative associations between small organic molecules and their pharmacologically relevant protein targets. Because of a large number of drug candidates to be evaluated, an accurate and fast docking engine is a critical element of virtual screening. Consequently, highly optimized docking codes are of paramount importance for the effectiveness of virtual screening methods. In this communication, we describe the implementation, tuning and performance characteristics of GeauxDock, a recently developed molecular docking program. GeauxDock is built upon the Monte Carlo algorithm and features a novel scoring function combining physics-based energy terms with statistical and knowledge-based potentials. Developed specifically for heterogeneous computing platforms, the current version of GeauxDock can be deployed on modern, multi-core Central Processing Units (CPUs) as well as massively parallel accelerators, Intel Xeon Phi and NVIDIA Graphics Processing Unit (GPU). First, we carried out a thorough performance tuning of the high-level framework and the docking kernel to produce a fast serial code, which was then ported to shared-memory multi-core CPUs yielding a near-ideal scaling. Further, using Xeon Phi gives 1.9× performance improvement over a dual 10-core Xeon CPU, whereas the best GPU accelerator, GeForce GTX 980, achieves a speedup as high as 3.5×. On that account, GeauxDock can take advantage of modern heterogeneous architectures to considerably accelerate structure-based virtual screening applications. GeauxDock is open-sourced and publicly available at www.brylinski.org/geauxdock and https://figshare.com/articles/geauxdock_tar_gz/3205249. PMID:27420300
Combining Distributed and Shared Memory Models: Approach and Evolution of the Global Arrays Toolkit

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nieplocha, Jarek; Harrison, Robert J.; Kumar, Mukul

2002-07-29

Both shared memory and distributed memory models have advantages and shortcomings. The shared memory model is much easier to use, but it ignores data locality/placement. Given the hierarchical nature of the memory subsystems in modern computers, this characteristic might have a negative impact on performance and scalability. Various techniques, such as code restructuring to increase data reuse and introducing blocking in data accesses, can address the problem and yield performance competitive with message passing [Singh], however at the cost of compromising the ease-of-use feature. Distributed memory models such as message passing or one-sided communication offer performance and scalability, but they compromise the ease of use. In this context, the message-passing model is sometimes referred to as "assembly programming for the scientific computing". The Global Arrays toolkit [GA1, GA2] attempts to offer the best features of both models. It implements a shared-memory programming model in which data locality is managed explicitly by the programmer. This management is achieved by explicit calls to functions that transfer data between a global address space (a distributed array) and local storage. In this respect, the GA model has similarities to the distributed shared-memory models that provide an explicit acquire/release protocol. However, the GA model acknowledges that remote data is slower to access than local data and allows data locality to be explicitly specified and hence managed. The GA model exposes to the programmer the hierarchical memory of modern high-performance computer systems, and by recognizing the communication overhead for remote data transfer, it promotes data reuse and locality of reference. This paper describes the characteristics of the Global Arrays programming model and the capabilities of the toolkit, and discusses its evolution.
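The programming model described above, explicit transfers between a global address space and local storage, can be pictured with a toy sketch. This is a single-process illustration only, not the Global Arrays API: the class `ToyGlobalArray` and its methods `owner`, `distribution`, `get`, and `put` are hypothetical names, and the "processes" are simulated as separate Python lists holding contiguous blocks.

```python
class ToyGlobalArray:
    """1-D array block-distributed over simulated processes (hypothetical, not the GA API)."""

    def __init__(self, size, nprocs):
        if size <= 0 or nprocs <= 0:
            raise ValueError("size and nprocs must be positive")
        self.size = size
        self.nprocs = nprocs
        self.block = -(-size // nprocs)  # ceiling division: elements per process
        # One list per simulated process; trailing processes may hold fewer elements.
        self.chunks = [
            [0.0] * max(0, min(self.block, size - p * self.block))
            for p in range(nprocs)
        ]

    def owner(self, i):
        """Return the simulated process that stores global index i."""
        return i // self.block

    def distribution(self, proc):
        """Return the half-open global index range [lo, hi) stored on proc."""
        lo = min(proc * self.block, self.size)
        hi = min(lo + self.block, self.size)
        return lo, hi

    def get(self, lo, hi):
        """Copy global elements [lo, hi) into a new local list."""
        if not 0 <= lo <= hi <= self.size:
            raise IndexError("range outside the global array")
        return [self.chunks[self.owner(i)][i % self.block] for i in range(lo, hi)]

    def put(self, lo, values):
        """Copy a local list into global elements starting at index lo."""
        if not 0 <= lo <= lo + len(values) <= self.size:
            raise IndexError("range outside the global array")
        for k, v in enumerate(values):
            i = lo + k
            self.chunks[self.owner(i)][i % self.block] = v


# Usage: fetch the block owned by process 1, update it locally, write it back.
ga = ToyGlobalArray(size=10, nprocs=4)
ga.put(0, [float(i) for i in range(10)])
lo, hi = ga.distribution(1)
local = ga.get(lo, hi)                 # explicit transfer into local storage
ga.put(lo, [2.0 * v for v in local])   # explicit transfer back to the global array
```

Because a program can ask which index range a process owns before transferring data, it can arrange to work mostly on local blocks, which is the locality awareness the abstract attributes to the GA model.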

GeauxDock: Accelerating Structure-Based Virtual Screening with Heterogeneous Computing.

PubMed

Fang, Ye; Ding, Yun; Feinstein, Wei P; Koppelman, David M; Moreno, Juana; Jarrell, Mark; Ramanujam, J; Brylinski, Michal

2016-01-01

Computational modeling of drug binding to proteins is an integral component of direct drug design. Particularly, structure-based virtual screening is often used to perform large-scale modeling of putative associations between small organic molecules and their pharmacologically relevant protein targets. Because of a large number of drug candidates to be evaluated, an accurate and fast docking engine is a critical element of virtual screening. Consequently, highly optimized docking codes are of paramount importance for the effectiveness of virtual screening methods. In this communication, we describe the implementation, tuning and performance characteristics of GeauxDock, a recently developed molecular docking program. GeauxDock is built upon the Monte Carlo algorithm and features a novel scoring function combining physics-based energy terms with statistical and knowledge-based potentials. Developed specifically for heterogeneous computing platforms, the current version of GeauxDock can be deployed on modern, multi-core Central Processing Units (CPUs) as well as massively parallel accelerators, Intel Xeon Phi and NVIDIA Graphics Processing Unit (GPU). First, we carried out a thorough performance tuning of the high-level framework and the docking kernel to produce a fast serial code, which was then ported to shared-memory multi-core CPUs yielding a near-ideal scaling. Further, using Xeon Phi gives 1.9× performance improvement over a dual 10-core Xeon CPU, whereas the best GPU accelerator, GeForce GTX 980, achieves a speedup as high as 3.5×. On that account, GeauxDock can take advantage of modern heterogeneous architectures to considerably accelerate structure-based virtual screening applications. GeauxDock is open-sourced and publicly available at www.brylinski.org/geauxdock and https://figshare.com/articles/geauxdock_tar_gz/3205249.
Impedance computations and beam-based measurements: A problem of discrepancy

NASA Astrophysics Data System (ADS)

Smaluk, Victor

2018-04-01

High intensity of particle beams is crucial for high-performance operation of modern electron-positron storage rings, both colliders and light sources. The beam intensity is limited by the interaction of the beam with self-induced electromagnetic fields (wake fields) proportional to the vacuum chamber impedance. For a new accelerator project, the total broadband impedance is computed by element-wise wake-field simulations using computer codes. For a machine in operation, the impedance can be measured experimentally using beam-based techniques. In this article, a comparative analysis of impedance computations and beam-based measurements is presented for 15 electron-positron storage rings. The measured data and the predictions based on the computed impedance budgets show a significant discrepancy. Three possible reasons for the discrepancy are discussed: interference of the wake fields excited by a beam in adjacent components of the vacuum chamber, effect of computation mesh size, and effect of insufficient bandwidth of the computed impedance.
Supercomputing with toys: harnessing the power of NVIDIA 8800GTX and playstation 3 for bioinformatics problem.

PubMed

Wilson, Justin; Dai, Manhong; Jakupovic, Elvis; Watson, Stanley; Meng, Fan

2007-01-01

Modern video cards and game consoles typically have much better performance-to-price ratios than general-purpose CPUs. The parallel processing capabilities of game hardware are well suited for high-throughput biomedical data analysis. Our initial results suggest that game hardware is a cost-effective platform for some computationally demanding bioinformatics problems.
WE-D-303-00: Computational Phantoms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lewis, John; Brigham and Women’s Hospital and Dana-Farber Cancer Institute, Boston, MA

2015-06-15

Modern medical physics deals with complex problems such as 4D radiation therapy and imaging quality optimization. Such problems involve a large number of radiological parameters, and anatomical and physiological breathing patterns. A major challenge is how to develop, test, evaluate and compare various new imaging and treatment techniques, which often involves testing over a large range of radiological parameters as well as varying patient anatomies and motions. It would be extremely challenging, if not impossible, both ethically and practically, to test every combination of parameters and every task on every type of patient under clinical conditions. Computer-based simulation using computational phantoms offers a practical technique with which to evaluate, optimize, and compare imaging technologies and methods. Within simulation, the computerized phantom provides a virtual model of the patient’s anatomy and physiology. Imaging data can be generated from it as if it was a live patient using accurate models of the physics of the imaging and treatment process. With sophisticated simulation algorithms, it is possible to perform virtual experiments entirely on the computer. By serving as virtual patients, computational phantoms hold great promise in solving some of the most complex problems in modern medical physics. In this proposed symposium, we will present the history and recent developments of computational phantom models, share experiences in their application to advanced imaging and radiation applications, and discuss their promises and limitations. Learning Objectives: Understand the need and requirements of computational phantoms in medical physics research. Discuss the developments and applications of computational phantoms. Know the promises and limitations of computational phantoms in solving complex problems.
How accurately does high tibial osteotomy correct the mechanical axis of an arthritic varus knee? A systematic review.

PubMed

Van den Bempt, Maxim; Van Genechten, Wouter; Claes, Toon; Claes, Steven

2016-12-01

The aim of this study was to give an overview of the accuracy of coronal limb alignment correction after high tibial osteotomy (HTO) for the arthritic varus knee by performing a systematic review of the literature. The databases PubMed, MEDLINE and Cochrane Library were screened for relevant articles. Only prospective clinical studies with the accuracy of alignment correction by performing HTO as primary or secondary objective were included. Fifteen studies were included in this systematic review and were subdivided into 23 cohorts. A total of 966 procedures were considered. Nine cohorts used computer navigation during HTO and the other 14 cohorts used a conventional method. In seven computer navigation cohorts, at least 75% of the study population fell into the accepted "range of accuracy" (AR) as proposed by the different studies, but only six out of 14 conventional cohorts reached this percentage. Four out of eight conventional cohorts that provided data on under- and overcorrection showed a tendency toward undercorrection. The accuracy of coronal alignment corrections using conventional HTO falls short. The number of procedures outside the proposed AR is surprising and exposes a critical concern for modern HTO. Computer navigation might improve the accuracy of correction, but its use is not widespread among orthopedic surgeons. Although HTO procedures have been shown to be successful in the treatment of unicompartmental knee arthritis when performed accurately, the results of this review stress the importance of ongoing efforts in order to improve correction accuracy in modern HTO. Copyright © 2016 Elsevier B.V. All rights reserved.
Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

DOE PAGES

Shao, Meiyue; Aktulga, H. Metin; Yang, Chao; ...

2017-09-14

In this paper, we describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of the special structure of the nuclear configuration interaction problem, which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. Finally, we also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.
Accelerating nuclear configuration interaction calculations through a preconditioned block iterative eigensolver

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shao, Meiyue; Aktulga, H. Metin; Yang, Chao

In this paper, we describe a number of recently developed techniques for improving the performance of large-scale nuclear configuration interaction calculations on high performance parallel computers. We show the benefit of using a preconditioned block iterative method to replace the Lanczos algorithm that has traditionally been used to perform this type of computation. The rapid convergence of the block iterative method is achieved by a proper choice of starting guesses of the eigenvectors and the construction of an effective preconditioner. These acceleration techniques take advantage of the special structure of the nuclear configuration interaction problem, which we discuss in detail. The use of a block method also allows us to improve the concurrency of the computation, and take advantage of the memory hierarchy of modern microprocessors to increase the arithmetic intensity of the computation relative to data movement. Finally, we also discuss the implementation details that are critical to achieving high performance on massively parallel multi-core supercomputers, and demonstrate that the new block iterative solver is two to three times faster than the Lanczos based algorithm for problems of moderate sizes on a Cray XC30 system.
High-Productivity Computing in Computational Physics Education

NASA Astrophysics Data System (ADS)

Tel-Zur, Guy

2011-03-01

We describe the development of a new course in Computational Physics at the Ben-Gurion University. This elective course for 3rd year undergraduates and MSc. students is taught over one semester. Computational Physics is by now well accepted as the Third Pillar of Science. This paper's claim is that modern Computational Physics education should also deal with High-Productivity Computing. The traditional approach of teaching Computational Physics emphasizes "Correctness" and then "Accuracy", and we also add "Performance." Along with topics in Mathematical Methods and case studies in Physics, the course devotes a significant amount of time to "Mini-Courses" in topics such as: High-Throughput Computing - Condor, Parallel Programming - MPI and OpenMP, How to build a Beowulf, Visualization, and Grid and Cloud Computing. The course does not intend to teach either new physics or new mathematics; rather, it is focused on an integrated approach for solving problems, starting from the physics problem and proceeding through the corresponding mathematical solution, the numerical scheme, and writing an efficient computer code, to analysis and visualization.
Dense and Sparse Matrix Operations on the Cell Processor

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williams, Samuel W.; Shalf, John; Oliker, Leonid

2005-05-01

The slowing pace of commodity microprocessor performance improvements combined with ever-increasing chip power demands has become of utmost concern to computational scientists. Therefore, the high performance computing community is examining alternative architectures that address the limitations of modern superscalar designs. In this work, we examine STI's forthcoming Cell processor: a novel, low-power architecture that combines a PowerPC core with eight independent SIMD processing units coupled with a software-controlled memory to offer high FLOP/s/Watt. Since neither Cell hardware nor cycle-accurate simulators are currently publicly available, we develop an analytic framework to predict Cell performance on dense and sparse matrix operations, using a variety of algorithmic approaches. Results demonstrate Cell's potential to deliver more than an order of magnitude better GFLOP/s per watt performance, when compared with the Intel Itanium2 and Cray X1 processors.
Double-negative metamaterial for mobile phone application

NASA Astrophysics Data System (ADS)

Hossain, M. I.; Faruque, M. R. I.; Islam, M. T.

2017-01-01

In this paper, a new design and analysis of a metamaterial and its applications to modern handsets are presented. The proposed metamaterial unit-cell design consists of two connected square spiral structures, which increases the effective medium ratio. The finite integration technique of Computer Simulation Technology Microwave Studio is utilized in this investigation, and the measurement is taken in an anechoic chamber. Good agreement is observed between simulated and measured results. The results indicate that the proposed metamaterial can successfully cover cellular phone frequency bands. Moreover, the uses of the proposed metamaterial in modern handset antennas are also analyzed. The results reveal that the proposed metamaterial attachment significantly reduces specific absorption rate values without reducing the antenna performance.
Developing and applying modern methods of leakage monitoring and state estimation of fuel at the Novovoronezh nuclear power plant

NASA Astrophysics Data System (ADS)

Povarov, V. P.; Tereshchenko, A. B.; Kravchenko, Yu. N.; Pozychanyuk, I. V.; Gorobtsov, L. I.; Golubev, E. I.; Bykov, V. I.; Likhanskii, V. V.; Evdokimov, I. A.; Zborovskii, V. G.; Sorokin, A. A.; Kanyukova, V. D.; Aliev, T. N.

2014-02-01

The results of developing and implementing the modernized fuel leakage monitoring methods at the shut-down and running reactor of the Novovoronezh nuclear power plant (NPP) are presented. An automated computerized expert system integrated with an in-core monitoring system (ICMS) and installed at the Novovoronezh NPP unit no. 5 is described. If leaky fuel elements appear in the core, the system allows one to perform on-line assessment of the parameters of leaky fuel assemblies (FAs). The computer expert system units designed for optimizing the operating regimes and enhancing the fuel usage efficiency at the Novovoronezh NPP unit no. 5 are now being developed.
Turbulent heat transfer performance of single stage turbine

DOE Office of Scientific and Technical Information (OSTI.GOV)

Amano, R.S.; Song, B.

1999-07-01

To increase the efficiency and the power of modern power plant gas turbines, designers are continually trying to raise the maximum turbine inlet temperature. Here, a numerical study based on the Navier-Stokes equations of a three-dimensional turbulent flow in a single-stage turbine stator/rotor passage is reported. The full Reynolds-stress closure model (RSM) was used for the computations, and the results were also compared with computations made using the Launder-Sharma low-Reynolds-number κ-ε model. The computational results obtained using these models were compared in order to investigate the turbulence effect in the near-wall region. The set of governing equations in a generalized curvilinear coordinate system was discretized using the finite volume method with non-staggered grids. The numerical modeling was performed to capture the interaction between the stator and rotor blades.
Contemporary imaging of mild TBI: the journey toward diffusion tensor imaging to assess neuronal damage.

PubMed

Fox, W Christopher; Park, Min S; Belverud, Shawn; Klugh, Arnett; Rivet, Dennis; Tomlin, Jeffrey M

2013-04-01

To follow the progression of neuroimaging as a means of non-invasive evaluation of mild traumatic brain injury (mTBI) in order to provide recommendations based on reproducible, defined imaging findings. A comprehensive literature review and analysis of contemporary published articles was performed to study the progression of neuroimaging findings as a non-invasive 'biomarker' for mTBI. Multiple imaging modalities exist to support the evaluation of patients with mTBI, including ultrasound (US), computed tomography (CT), single photon emission computed tomography (SPECT), positron emission tomography (PET), and magnetic resonance imaging (MRI). These techniques continue to evolve with the development of fractional anisotropy (FA), fiber tractography (FT), and diffusion tensor imaging (DTI). Modern imaging techniques, when applied in the appropriate clinical setting, may serve as a valuable tool for diagnosis and management of patients with mTBI. An understanding of modern neuroanatomical imaging will enhance our ability to analyse injury and recognize the manifestations of mTBI.
Methodology of problem-based learning engineering and technology and of its implementation with modern computer resources

NASA Astrophysics Data System (ADS)

Lebedev, A. A.; Ivanova, E. G.; Komleva, V. A.; Klokov, N. M.; Komlev, A. A.

2017-01-01

The considered method of learning the basics of microelectronic amplifier circuits and systems enables one to understand electrical processes more deeply, to understand the relationship between static and dynamic characteristics and, finally, to turn the learning process into a cognitive process. The scheme of problem-based learning can be represented by the following sequence of procedures: the contradiction is perceived and revealed; cognitive motivation is provided by creating a problematic situation (the mental state of the student) that stimulates the desire to solve the problem and to raise the question "why?"; a hypothesis is made; searches for solutions are implemented; the answer is sought. Because of the complexity of the circuit architectures, modern methods of computer analysis and synthesis are considered in the work. Examples are given of analog circuits with improved performance engineered by students, as part of their scientific and research work, using standard software and software developed at the Department of Microelectronics, MEPhI.
A task-based parallelism and vectorized approach to 3D Method of Characteristics (MOC) reactor simulation for high performance computing architectures

NASA Astrophysics Data System (ADS)

Tramm, John R.; Gunow, Geoffrey; He, Tim; Smith, Kord S.; Forget, Benoit; Siegel, Andrew R.

2016-05-01

In this study we present and analyze a formulation of the 3D Method of Characteristics (MOC) technique applied to the simulation of full core nuclear reactors. Key features of the algorithm include a task-based parallelism model that allows independent MOC tracks to be assigned to threads dynamically, ensuring load balancing, and a wide vectorizable inner loop that takes advantage of modern SIMD computer architectures. The algorithm is implemented in a set of highly optimized proxy applications in order to investigate its performance characteristics on CPU, GPU, and Intel Xeon Phi architectures. Speed, power, and hardware cost efficiencies are compared. Additionally, performance bottlenecks are identified for each architecture in order to determine the prospects for continued scalability of the algorithm on next generation HPC architectures.
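
The two features named in this abstract, dynamic assignment of independent tracks to threads and an inner loop over a fixed-width vector, can be sketched as follows. This is a minimal illustration and not the authors' proxy application: the flat-source attenuation update, the four-group cross sections in `SIGMA_T` and `SOURCE`, the zero incoming flux, and the synthetic `tracks` list are all assumptions made for the example. A thread pool is used here only to show the scheduling pattern; it hands the next track to whichever worker becomes free.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Hypothetical 4-group data: total cross sections and flat sources.
SIGMA_T = [0.2, 0.5, 1.0, 2.0]
SOURCE = [1.0, 0.8, 0.5, 0.1]


def sweep_track(segment_lengths):
    """Attenuate the angular flux along one track, segment by segment."""
    psi = [0.0] * len(SIGMA_T)  # assumed vacuum (zero) incoming flux
    for s in segment_lengths:
        # Inner loop over energy groups: every group does the same
        # arithmetic, which is the kind of loop a SIMD unit can vectorize.
        for g in range(len(SIGMA_T)):
            att = math.exp(-SIGMA_T[g] * s)
            psi[g] = psi[g] * att + (SOURCE[g] / SIGMA_T[g]) * (1.0 - att)
    return psi


def sweep_all(tracks, workers=4):
    """Sweep independent tracks; idle workers pick up the next track."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, whatever the finish order.
        return list(pool.map(sweep_track, tracks))


# Synthetic tracks with very different segment counts (uneven work).
tracks = [[0.1 * (k + 1)] * (1 + 3 * (k % 5)) for k in range(20)]
angular_flux = sweep_all(tracks)
```

Because each track is independent, the parallel result matches a serial sweep exactly, and tracks with many segments do not hold up workers that have finished shorter ones.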
Re-Computation of Numerical Results Contained in NACA Report No. 496

NASA Technical Reports Server (NTRS)

Perry, Boyd, III

2015-01-01

An extensive examination of NACA Report No. 496 (NACA 496), "General Theory of Aerodynamic Instability and the Mechanism of Flutter," by Theodore Theodorsen, is described. The examination included checking equations and solution methods and re-computing interim quantities and all numerical examples in NACA 496. The checks revealed that NACA 496 contains computational shortcuts (time- and effort-saving devices for engineers of the time) and clever artifices (employed in its solution methods), but, unfortunately, also contains numerous tripping points (aspects of NACA 496 that have the potential to cause confusion) and some errors. The re-computations were performed employing the methods and procedures described in NACA 496, but using modern computational tools. With some exceptions, the magnitudes and trends of the original results were in fair-to-very-good agreement with the re-computed results. The exceptions included what are speculated to be computational errors in the original in some instances and transcription errors in the original in others. Independent flutter calculations were performed and, in all cases, including those where the original and re-computed results differed significantly, were in excellent agreement with the re-computed results. Appendix A contains NACA 496; Appendix B contains a Matlab® program that performs the re-computation of results; Appendix C presents three alternate solution methods, with examples, for the two-degree-of-freedom solution method of NACA 496; Appendix D contains the three-degree-of-freedom solution method (outlined in NACA 496 but never implemented), with examples.
GPU applications for data processing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Vladymyrov, Mykhailo, E-mail: mykhailo.vladymyrov@cern.ch; Aleksandrov, Andrey; INFN sezione di Napoli, I-80125 Napoli

2015-12-31

Modern experiments that use nuclear photoemulsion rely on fast and efficient data acquisition from the emulsion. New approaches to developing scanning systems require real-time processing of large amounts of data. Methods that use Graphical Processing Unit (GPU) computing power for emulsion data processing are presented here. It is shown how GPU-accelerated emulsion processing helped us to raise the scanning speed by a factor of nine.
Fast Computation on the Modern Battlefield

DTIC Science & Technology

2015-04-01

the performance of offloading systems in current and future scenarios. The modularity of this model allows system designers to replace model...goals were simplicity and modularity. We wanted the model to not necessarily answer every question for every scenario, but rather expose easy to...acquisitions for future systems. Again, because of the modularity of the model, it is possible for designers to substitute the most accurate value for
Analysis of Satellite Communications Antenna Patterns

NASA Technical Reports Server (NTRS)

Rahmat-Samii, Y.

1985-01-01

Computer program accurately and efficiently predicts far-field patterns of offset, or symmetric, parabolic reflector antennas. Antenna designer uses program to study effects of varying geometrical and electrical (RF) parameters of parabolic reflector and its feed system. Accurate predictions of far-field patterns help designer predict overall performance of antenna. These reflectors used extensively in modern communications satellites and in multiple-beam and low side-lobe antenna systems.
Department of Defense High Performance Computing Modernization Program. 2006 Annual Report

DTIC Science & Technology

2007-03-01

Department. We successfully completed several software development projects that introduced parallel, scalable production software now in use across the...imagined. They are developing and deploying weather and ocean models that allow our soldiers, sailors, marines and airmen to plan missions more effectively...and to navigate adverse environments safely. They are modeling molecular interactions leading to the development of higher energy fuels, munitions

Airborne Network Optimization with Dynamic Network Update

DTIC Science & Technology

2015-03-26

Faculty Department of Electrical and Computer Engineering Graduate School of Engineering and Management Air Force Institute of Technology Air University...Member Dr. Barry E. Mullins Member AFIT-ENG-MS-15-M-030 Abstract Modern networks employ congestion and routing management algorithms that can perform...airborne networks. Intelligent agents can make use of Kalman filter predictions to make informed decisions to manage communication in airborne networks. The
Automatic Blocking Of QR and LU Factorizations for Locality

DOE Office of Scientific and Technical Information (OSTI.GOV)

Yi, Q; Kennedy, K; You, H

2004-03-26

QR and LU factorizations for dense matrices are important linear algebra computations that are widely used in scientific applications. To efficiently perform these computations on modern computers, the factorization algorithms need to be blocked when operating on large matrices to effectively exploit the deep cache hierarchy prevalent in today's computer memory systems. Because both QR (based on Householder transformations) and LU factorization algorithms contain complex loop structures, few compilers can fully automate the blocking of these algorithms. Though linear algebra libraries such as LAPACK provide manually blocked implementations of these algorithms, automatically generating blocked versions of the computations can bring further benefits, such as automatic adaptation of different blocking strategies. This paper demonstrates how to apply an aggressive loop transformation technique, dependence hoisting, to produce efficient blockings for both QR and LU with partial pivoting. We present different blocking strategies that can be generated by our optimizer and compare the performance of auto-blocked versions with manually tuned versions in LAPACK, both using reference BLAS, ATLAS BLAS and native BLAS specially tuned for the underlying machine architectures.
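
To make concrete what "blocking" a factorization means, the sketch below contrasts a plain column-by-column LU with a hand-blocked, right-looking variant. It is an illustration under stated assumptions, not the paper's dependence-hoisting transformation and not LAPACK's routine: pivoting is omitted for brevity (the abstract's LU uses partial pivoting), so the sketch is only safe for matrices such as the diagonally dominant example used here. The function names and the block size `b` are hypothetical.

```python
def lu_unblocked(a):
    """In-place style LU without pivoting; returns L (unit) and U packed."""
    n = len(a)
    lu = [row[:] for row in a]
    for j in range(n):
        for i in range(j + 1, n):
            lu[i][j] /= lu[j][j]
            for c in range(j + 1, n):
                lu[i][c] -= lu[i][j] * lu[j][c]
    return lu


def lu_blocked(a, b):
    """Right-looking blocked LU without pivoting, block size b (assumed)."""
    n = len(a)
    lu = [row[:] for row in a]
    for k in range(0, n, b):
        kb = min(k + b, n)
        # 1. Factor the panel (columns k..kb-1, rows k..n-1).
        for j in range(k, kb):
            for i in range(j + 1, n):
                lu[i][j] /= lu[j][j]
                for c in range(j + 1, kb):
                    lu[i][c] -= lu[i][j] * lu[j][c]
        # 2. Triangular solve for the block row of U right of the panel.
        for j in range(k, kb):
            for i in range(j + 1, kb):
                for c in range(kb, n):
                    lu[i][c] -= lu[i][j] * lu[j][c]
        # 3. Update the trailing submatrix with one matrix-matrix product;
        #    this step dominates the work and reuses the panel from cache.
        for i in range(kb, n):
            for j in range(k, kb):
                for c in range(kb, n):
                    lu[i][c] -= lu[i][j] * lu[j][c]
    return lu


def multiply_factors(lu):
    """Rebuild L @ U from the packed factors to check the factorization."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for p in range(min(i, j) + 1):
                l_ip = 1.0 if p == i else lu[i][p]
                total += l_ip * lu[p][j]
            out[i][j] = total
    return out


# Diagonally dominant test matrix, so skipping pivoting is safe here.
n = 6
matrix = [[1.0 / (1 + abs(i - j)) + (n if i == j else 0.0) for j in range(n)]
          for i in range(n)]
packed_blocked = lu_blocked(matrix, 2)
packed_plain = lu_unblocked(matrix)
```

Both routines perform the same arithmetic; the blocked version only reorders it so that most of the work falls in step 3, a matrix-matrix update that a tuned BLAS or a cache-aware compiler can execute efficiently.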
Intermediate Palomar Transient Factory: Realtime Image Subtraction Pipeline

DOE PAGES

Cao, Yi; Nugent, Peter E.; Kasliwal, Mansi M.

2016-09-28

A fast-turnaround pipeline for realtime data reduction plays an essential role in discovering and permitting followup observations to young supernovae and fast-evolving transients in modern time-domain surveys. In this paper, we present the realtime image subtraction pipeline in the intermediate Palomar Transient Factory. By using high-performance computing, efficient databases, and machine-learning algorithms, this pipeline manages to reliably deliver transient candidates within 10 minutes of images being taken. Our experience in using high-performance computing resources to process big data in astronomy serves as a trailblazer to dealing with data from large-scale time-domain facilities in the near future.
Intermediate Palomar Transient Factory: Realtime Image Subtraction Pipeline

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cao, Yi; Nugent, Peter E.; Kasliwal, Mansi M.

A fast-turnaround pipeline for realtime data reduction plays an essential role in discovering and permitting followup observations to young supernovae and fast-evolving transients in modern time-domain surveys. In this paper, we present the realtime image subtraction pipeline in the intermediate Palomar Transient Factory. By using high-performance computing, efficient databases, and machine-learning algorithms, this pipeline manages to reliably deliver transient candidates within 10 minutes of images being taken. Our experience in using high-performance computing resources to process big data in astronomy serves as a trailblazer to dealing with data from large-scale time-domain facilities in the near future.
KAGLVis - On-line 3D Visualisation of Earth-observing-satellite Data

NASA Astrophysics Data System (ADS)

Szuba, Marek; Ameri, Parinaz; Grabowski, Udo; Maatouki, Ahmad; Meyer, Jörg

2015-04-01

One of the goals of the Large-Scale Data Management and Analysis project is to provide a high-performance framework facilitating management of data acquired by Earth-observing satellites such as Envisat. On the client-facing facet of this framework, we strive to provide a visualisation and basic analysis tool which can be used by scientists with minimal to no knowledge of the underlying infrastructure. Our tool, KAGLVis, is a JavaScript client-server Web application which leverages modern Web technologies to provide three-dimensional visualisation of satellite observables on a wide range of client systems. It takes advantage of the WebGL API to employ locally available GPU power for 3D rendering; this approach has been demonstrated to perform well even on relatively weak hardware such as the integrated graphics chipsets found in modern laptop computers and, with some user-interface tuning, could even be usable on embedded devices such as smartphones or tablets. Data is fetched from the database back-end using a ReST API and cached locally, both in memory and using HTML5 Web Storage, to minimise network use. Computations, for instance the calculation of cloud altitude from cloud-index measurements, can be performed on either the client or the server side, depending on configuration. Keywords: satellite data, Envisat, visualisation, 3D graphics, Web application, WebGL, MEAN stack.
Visualizing staggered fields and analyzing electromagnetic data with PerceptEM

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shasharina, Svetlana

This project resulted in VSimSP: software for simulating large photonic devices on high-performance computers. It includes: GUI for Photonics Simulations; High-Performance Meshing Algorithm; 2nd-Order Multimaterials Algorithm; Mode Solver for Waveguides; 2nd-Order Material Dispersion Algorithm; S Parameters Calculation; High-Performance Workflow at NERSC; and Large Photonic Devices Simulation Setups. We believe we became the only company in the world that can simulate large photonic devices in 3D on modern supercomputers without the need to split them into subparts or do low-fidelity modeling. We started commercial engagement with a manufacturing company.
Studying an Eulerian Computer Model on Different High-performance Computer Platforms and Some Applications

NASA Astrophysics Data System (ADS)

Georgiev, K.; Zlatev, Z.

2010-11-01

The Danish Eulerian Model (DEM) is an Eulerian model for studying the transport of air pollutants on a large scale. Originally, the model was developed at the National Environmental Research Institute of Denmark. The model's computational domain covers Europe and some neighbouring parts of the Atlantic Ocean, Asia and Africa. If the DEM model is to be applied using fine grids, then its discretization leads to a huge computational problem. This implies that a model such as DEM must be run only on high-performance computer architectures. The implementation and tuning of such a complex large-scale model on each different computer is a non-trivial task. Here, some comparison results from running this model on different kinds of vector computers (CRAY C92A, Fujitsu, etc.), parallel computers with distributed memory (IBM SP, CRAY T3E, Beowulf clusters, Macintosh G4 clusters, etc.), parallel computers with shared memory (SGI Origin, SUN, etc.) and parallel computers with two levels of parallelism (IBM SMP, IBM BlueGene/P, clusters of multiprocessor nodes, etc.) will be presented. The main idea in the parallel version of DEM is a domain partitioning approach. The effective use of the cache and hierarchical memories of modern computers, as well as the performance, speed-ups and efficiency achieved, will be discussed. The parallel code of DEM, created using the MPI standard library, appears to be highly portable and shows good efficiency and scalability on different kinds of vector and parallel computers. Some important applications of the computer model output are presented in short.
New developments in supra-threshold perimetry.

PubMed

Henson, David B; Artes, Paul H

2002-09-01

To describe a series of recent enhancements to supra-threshold perimetry. Computer simulations were used to develop an improved algorithm (HEART) for the setting of the supra-threshold test intensity at the beginning of a field test, and to evaluate the relationship between various pass/fail criteria and the test's performance (sensitivity and specificity) and how they compare with modern threshold perimetry. Data were collected in optometric practices to evaluate HEART and to assess how the patient's response times can be analysed to detect false positive response errors in visual field test results. The HEART algorithm shows improved performance (reduced between-eye differences) over current algorithms. A pass/fail criterion of '3 stimuli seen of 3-5 presentations' at each test location reduces test/retest variability and combines high sensitivity and specificity. A large percentage of false positive responses can be detected by comparing their latencies to the average response time of a patient. Optimised supra-threshold visual field tests can perform as well as modern threshold techniques. Such tests may be easier to perform for novice patients, compared with the more demanding threshold tests.
Securing Secrets and Managing Trust in Modern Computing Applications

ERIC Educational Resources Information Center

Sayler, Andy

2016-01-01

The amount of digital data generated and stored by users increases every day. In order to protect this data, modern computing systems employ numerous cryptographic and access control solutions. Almost all of such solutions, however, require the keeping of certain secrets as the basis of their security models. How best to securely store and control…
Implementation and Evaluation of Flipped Classroom as IoT Element into Learning Process of Computer Network Education

ERIC Educational Resources Information Center

Zhamanov, Azamat; Yoo, Seong-Moo; Sakhiyeva, Zhulduz; Zhaparov, Meirambek

2018-01-01

Students nowadays are hard to be motivated to study lessons with traditional teaching methods. Computers, smartphones, tablets and other smart devices disturb students' attentions. Nevertheless, those smart devices can be used as auxiliary tools of modern teaching methods. In this article, the authors review two popular modern teaching methods:…
Software designs of image processing tasks with incremental refinement of computation.

PubMed

Anastasia, Davide; Andreopoulos, Yiannis

2010-08-01

Software realizations of computationally-demanding image processing tasks (e.g., image transforms and convolution) do not currently provide graceful degradation when their clock-cycles budgets are reduced, e.g., when delay deadlines are imposed in a multitasking environment to meet throughput requirements. This is an important obstacle in the quest for full utilization of modern programmable platforms' capabilities since worst-case considerations must be in place for reasonable quality of results. In this paper, we propose (and make available online) platform-independent software designs performing bitplane-based computation combined with an incremental packing framework in order to realize block transforms, 2-D convolution and frame-by-frame block matching. The proposed framework realizes incremental computation: progressive processing of input-source increments improves the output quality monotonically. Comparisons with the equivalent nonincremental software realization of each algorithm reveal that, for the same precision of the result, the proposed approach can lead to comparable or faster execution, while it can be arbitrarily terminated and provide the result up to the computed precision. Application examples with region-of-interest based incremental computation, task scheduling per frame, and energy-distortion scalability verify that our proposal provides significant performance scalability with graceful degradation.
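The bitplane-based incremental computation described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' framework: it omits their packing scheme, uses a 1-D sliding weighted sum instead of a 2-D convolution, and the names `filter_valid` and `incremental_filter` are hypothetical. The input is processed one bitplane at a time, most significant bit first, so the loop can be stopped after any plane and still return a usable approximation.

```python
def filter_valid(signal, kernel):
    """Sliding weighted sum of `signal` with `kernel` (no padding)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def incremental_filter(signal, kernel, bits=8):
    """Yield successively refined outputs, one per processed bitplane.

    `signal` holds unsigned integers below 2**bits. Because the filter is
    linear, summing the per-bitplane results weighted by 2**b reproduces
    the full-precision output after the last plane.
    """
    acc = [0] * (len(signal) - len(kernel) + 1)
    for b in reversed(range(bits)):               # most significant bit first
        plane = [(x >> b) & 1 for x in signal]    # extract bitplane b
        partial = filter_valid(plane, kernel)
        acc = [a + p * (1 << b) for a, p in zip(acc, partial)]
        yield list(acc)


if __name__ == "__main__":
    signal = [200, 13, 255, 7, 64]
    kernel = [1, 2, 1]
    stages = list(incremental_filter(signal, kernel))
    print(stages[0])    # coarse result from the top bitplane only
    print(stages[-1])   # equals filter_valid(signal, kernel)
```

With a non-negative kernel, each stage is element-wise no smaller than the previous one, which gives a simple instance of output quality that improves monotonically as more input increments are processed.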
Performance Modeling in CUDA Streams - A Means for High-Throughput Data Processing.

PubMed

Li, Hao; Yu, Di; Kumar, Anand; Tu, Yi-Cheng

2014-10-01

Push-based database management system (DBMS) is a new type of data processing software that streams large volumes of data to concurrent query operators. The high data rate of such systems requires large computing power provided by the query engine. In our previous work, we built a push-based DBMS named G-SDMS to harness the unrivaled computational capabilities of modern GPUs. A major design goal of G-SDMS is to support concurrent processing of heterogeneous query processing operations and enable resource allocation among such operations. Understanding the performance of operations as a result of resource consumption is thus a premise in the design of G-SDMS. With NVIDIA's CUDA framework as the system implementation platform, we present our recent work on performance modeling of CUDA kernels running concurrently under a runtime mechanism named CUDA stream. Specifically, we explore the connection between performance and resource occupancy of compute-bound kernels and develop a model that can predict the performance of such kernels. Furthermore, we provide an in-depth anatomy of the CUDA stream mechanism and summarize the main kernel scheduling disciplines in it. Our models and derived scheduling disciplines are verified by extensive experiments using synthetic and real-world CUDA kernels.
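To make the notion of resource occupancy concrete, here is a rough sketch of how the number of resident thread blocks per streaming multiprocessor (SM) is bounded by threads, registers and shared memory. It is not the performance model from the paper. The hardware limits in `SmLimits` are hypothetical placeholder values, and the calculation ignores the allocation granularity that real GPUs apply.

```python
from dataclasses import dataclass


@dataclass
class SmLimits:
    """Per-SM resource limits. All default values are hypothetical."""
    max_threads: int = 2048
    max_blocks: int = 16
    registers: int = 65536
    shared_mem_bytes: int = 49152


def resident_blocks(threads_per_block, regs_per_thread, smem_per_block,
                    limits=SmLimits()):
    """Blocks that fit on one SM: the tightest of the resource bounds."""
    by_threads = limits.max_threads // threads_per_block
    by_regs = limits.registers // (regs_per_thread * threads_per_block)
    # A kernel that uses no shared memory is not limited by it.
    by_smem = (limits.shared_mem_bytes // smem_per_block
               if smem_per_block else limits.max_blocks)
    return min(by_threads, by_regs, by_smem, limits.max_blocks)


def occupancy(threads_per_block, regs_per_thread, smem_per_block,
              limits=SmLimits()):
    """Fraction of the SM's thread slots occupied by resident blocks."""
    blocks = resident_blocks(threads_per_block, regs_per_thread,
                             smem_per_block, limits)
    return blocks * threads_per_block / limits.max_threads


if __name__ == "__main__":
    # 256 threads/block, 32 registers/thread, 8 KiB shared memory/block:
    # shared memory is the tightest bound here (6 blocks).
    print(resident_blocks(256, 32, 8192))   # 6
    print(occupancy(256, 32, 8192))         # 0.75
```

When two kernels run concurrently in different streams, a bound of this kind would apply to their combined resource use, which is one reason occupancy is a natural input to a performance model.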
Reliability based design optimization: Formulations and methodologies

NASA Astrophysics Data System (ADS)

Agarwal, Harish

Modern products ranging from simple components to complex systems should be designed to be optimal and reliable. The challenge of modern engineering is to ensure that manufacturing costs are reduced and design cycle times are minimized while achieving requirements for performance and reliability. If the market for the product is competitive, improved quality and reliability can generate very strong competitive advantages. Simulation based design plays an important role in designing almost any kind of automotive, aerospace, and consumer products under these competitive conditions. Single discipline simulations used for analysis are being coupled together to create complex coupled simulation tools. This investigation focuses on the development of efficient and robust methodologies for reliability based design optimization in a simulation based design environment. Original contributions of this research are the development of a novel efficient and robust unilevel methodology for reliability based design optimization, the development of an innovative decoupled reliability based design optimization methodology, the application of homotopy techniques in unilevel reliability based design optimization methodology, and the development of a new framework for reliability based design optimization under epistemic uncertainty. The unilevel methodology for reliability based design optimization is shown to be mathematically equivalent to the traditional nested formulation. Numerical test problems show that the unilevel methodology can reduce computational cost by at least 50% as compared to the nested approach. The decoupled reliability based design optimization methodology is an approximate technique to obtain consistent reliable designs at lesser computational expense. Test problems show that the methodology is computationally efficient compared to the nested approach. 
A framework for performing reliability based design optimization under epistemic uncertainty is also developed. A trust region managed sequential approximate optimization methodology is employed for this purpose. Results from numerical test studies indicate that the methodology can be used for performing design optimization under severe uncertainty.
Modern Theories of Pelvic Floor Support: A Topical Review of Modern Studies on Structural and Functional Pelvic Floor Support from Medical Imaging, Computational Modeling, and Electromyographic Perspectives.

PubMed

Peng, Yun; Miller, Brandi D; Boone, Timothy B; Zhang, Yingchun

2018-02-12

Weakened pelvic floor support is believed to be the main cause of various pelvic floor disorders. Modern theories of pelvic floor support stress the structural and functional integrity of multiple structures and their interplay to maintain normal pelvic floor functions. Connective tissues provide passive pelvic floor support while pelvic floor muscles provide active support through voluntary contraction. Advanced modern medical technologies allow us to comprehensively and thoroughly evaluate the interaction of supporting structures and assess both active and passive support functions. The pathophysiology of various pelvic floor disorders associated with pelvic floor weakness is now under scrutiny from the combination of (1) morphological, (2) dynamic (through computational modeling), and (3) neurophysiological perspectives. This topical review aims to summarize newly emerged studies assessing pelvic floor support function among these three categories. A literature search was performed with emphasis on (1) medical imaging studies that assess pelvic floor muscle architecture, (2) subject-specific computational modeling studies that address new topics such as modeling muscle contractions, and (3) pelvic floor neurophysiology studies that report novel devices or findings such as high-density surface electromyography techniques. We found that recent computational modeling studies feature more realistic soft tissue constitutive models (e.g., active muscle contraction) as well as an increasing interest in simulating surgical interventions (e.g., artificial sphincter). Diffusion tensor imaging provides a useful non-invasive tool to characterize pelvic floor muscles at the microstructural level, which can be potentially used to improve the accuracy of the simulation of muscle contraction. Studies using high-density surface electromyography anal and vaginal probes on large patient cohorts have been recently reported.
Influences of vaginal delivery on the distribution of innervation zones of pelvic floor muscles are clarified, providing useful guidance for a better protection of women during delivery. We are now in a period of transition to advanced diagnostic and predictive pelvic floor medicine. Our findings highlight the application of diffusion tensor imaging, computational models with consideration of active pelvic floor muscle contraction, high-density surface electromyography, and their potential integration, as tools to push the boundary of our knowledge in pelvic floor support and better shape current clinical practice.
QMachine: commodity supercomputing in web browsers.

PubMed

Wilkinson, Sean R; Almeida, Jonas S

2014-06-09

Ongoing advancements in cloud computing provide novel opportunities in scientific computing, especially for distributed workflows. Modern web browsers can now be used as high-performance workstations for querying, processing, and visualizing genomics' "Big Data" from sources like The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) without local software installation or configuration. The design of QMachine (QM) was driven by the opportunity to use this pervasive computing model in the context of the Web of Linked Data in Biomedicine. QM is an open-sourced, publicly available web service that acts as a messaging system for posting tasks and retrieving results over HTTP. The illustrative application described here distributes the analyses of 20 Streptococcus pneumoniae genomes for shared suffixes. Because all analytical and data retrieval tasks are executed by volunteer machines, few server resources are required. Any modern web browser can submit those tasks and/or volunteer to execute them without installing any extra plugins or programs. A client library provides high-level distribution templates including MapReduce. This stark departure from the current reliance on expensive server hardware running "download and install" software has already gathered substantial community interest, as QM received more than 2.2 million API calls from 87 countries in 12 months. QM was found adequate to deliver the sort of scalable bioinformatics solutions that computation- and data-intensive workflows require. Paradoxically, the sandboxed execution of code by web browsers was also found to enable them, as compute nodes, to address critical privacy concerns that characterize biomedical environments.
Extreme-Scale De Novo Genome Assembly

DOE Office of Scientific and Technical Information (OSTI.GOV)

Georganas, Evangelos; Hofmeyr, Steven; Egan, Rob

De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMer, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.
Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters.

PubMed

Lan, Haidong; Chan, Yuandong; Xu, Kai; Schmidt, Bertil; Peng, Shaoliang; Liu, Weiguo

2016-07-19

Computing alignments between two or more sequences is a common operation frequently performed in computational molecular biology. The continuing growth of biological sequence databases establishes the need for their efficient parallel implementation on modern accelerators. This paper presents new approaches to high performance biological sequence database scanning with the Smith-Waterman algorithm and the first stage of progressive multiple sequence alignment based on the ClustalW heuristic on a Xeon Phi-based compute cluster. Our approach uses a three-level parallelization scheme to take full advantage of the compute power available on this type of architecture; i.e. cluster-level data parallelism, thread-level coarse-grained parallelism, and vector-level fine-grained parallelism. Furthermore, we re-organize the sequence datasets and use Xeon Phi shuffle operations to improve I/O efficiency. Evaluations show that our method achieves a peak overall performance up to 220 GCUPS for scanning real protein sequence databanks on a single node consisting of two Intel E5-2620 CPUs and two Intel Xeon Phi 7110P cards. It also exhibits good scalability in terms of sequence length and size, and number of compute nodes for both database scanning and multiple sequence alignment. Furthermore, the achieved performance is highly competitive in comparison to optimized Xeon Phi and GPU implementations. Our implementation is available at https://github.com/turbo0628/LSDBS-mpi .
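For readers unfamiliar with the kernel being parallelized, a minimal scalar sketch of Smith-Waterman local alignment scoring follows. It is not the paper's implementation: it uses a simple match/mismatch score and a linear gap penalty (protein database scanning normally uses a substitution matrix and affine gaps), and the scoring parameters are assumptions. The helper `gcups` shows how the throughput metric quoted in the abstract is usually computed: billions of dynamic-programming cell updates per second.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of strings a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)      # previous row of the DP matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def gcups(query_len, db_residues, seconds):
    """Giga cell updates per second for one query scanned against a database."""
    return query_len * db_residues / seconds / 1e9


if __name__ == "__main__":
    print(smith_waterman_score("ACACACTA", "AGCACACA"))   # 12
    print(gcups(1000, 2_000_000, 1.0))                    # 2.0
```

Each cell depends only on its left, upper and upper-left neighbours, so the many independent query-versus-database comparisons can be spread across nodes and threads, while the inner loop is the natural target for vector instructions.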
Internal Flow

NASA Astrophysics Data System (ADS)

Greitzer, E. M.; Tan, C. S.; Graf, M. B.

2004-06-01

Focusing on phenomena important in implementing the performance of a broad range of fluid devices, this work describes the behavior of internal flows encountered in propulsion systems, fluid machinery (compressors, turbines, and pumps) and ducts (diffusers, nozzles and combustion chambers). The book equips students and practicing engineers with a range of new analytical tools. These tools offer enhanced interpretation and application of both experimental measurements and the computational procedures that characterize modern fluids engineering.
FY06 NRL DoD High Performance Computing Modernization Program Annual Reports

DTIC Science & Technology

2007-10-31

our simulations yield important new information on the amount and form of the energy that is released by these explosive events. These results...coupled with the ideal-gas equation of state and a one-step Arrhenius kinetics of energy release. The equations are solved using the explicit...practical applications, including hydrogen safety and pulse-detonation engines (PDE). For example, the results summarizing the effect of obstacle
Preliminary Evaluation of MapReduce for High-Performance Climate Data Analysis

NASA Technical Reports Server (NTRS)

Duffy, Daniel Q.; Schnase, John L.; Thompson, John H.; Freeman, Shawn M.; Clune, Thomas L.

2012-01-01

MapReduce is an approach to high-performance analytics that may be useful to data intensive problems in climate research. It offers an analysis paradigm that uses clusters of computers and combines distributed storage of large data sets with parallel computation. We are particularly interested in the potential of MapReduce to speed up basic operations common to a wide range of analyses. In order to evaluate this potential, we are prototyping a series of canonical MapReduce operations over a test suite of observational and climate simulation datasets. Our initial focus has been on averaging operations over arbitrary spatial and temporal extents within Modern Era Retrospective-Analysis for Research and Applications (MERRA) data. Preliminary results suggest this approach can improve efficiencies within data intensive analytic workflows.
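The averaging operation mentioned in this abstract maps naturally onto the MapReduce pattern, as the following single-process sketch suggests. It is an illustration only: the record layout `(month, lat, lon, value)`, the bounding-box filter and the function names are assumptions, and no Hadoop or MERRA-specific API is used.

```python
from collections import defaultdict


def map_record(record, lat_range, lon_range):
    """Map step: emit (month, (value, 1)) for records inside the box."""
    month, lat, lon, value = record
    if lat_range[0] <= lat <= lat_range[1] and lon_range[0] <= lon <= lon_range[1]:
        yield month, (value, 1)


def reduce_pairs(pairs):
    """Reduce step: sum values and counts per key, then divide."""
    totals = defaultdict(lambda: [0.0, 0])
    for key, (value, count) in pairs:
        totals[key][0] += value
        totals[key][1] += count
    return {key: total / count for key, (total, count) in totals.items()}


def average_over_extent(records, lat_range, lon_range):
    """Average `value` per month over a spatial bounding box."""
    pairs = (pair for rec in records
             for pair in map_record(rec, lat_range, lon_range))
    return reduce_pairs(pairs)


if __name__ == "__main__":
    # Hypothetical sample records: (month, latitude, longitude, value).
    records = [
        ("1980-01", 10.0, 20.0, 1.0),
        ("1980-01", 12.0, 22.0, 3.0),
        ("1980-02", 10.0, 20.0, 5.0),
        ("1980-01", 80.0, 20.0, 100.0),   # outside the box, ignored
    ]
    print(average_over_extent(records, (0.0, 30.0), (0.0, 30.0)))
    # {'1980-01': 2.0, '1980-02': 5.0}
```

Emitting `(sum, count)` pairs instead of averages keeps the reduce step associative, so partial results computed on different nodes can be combined in any order.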

SAPNEW: Parallel finite element code for thin shell structures on the Alliant FX/80

NASA Astrophysics Data System (ADS)

Kamat, Manohar P.; Watson, Brian C.

1992-02-01

The results of a research activity aimed at providing a finite element capability for analyzing turbo-machinery bladed-disk assemblies in a vector/parallel processing environment are summarized. Analysis of aircraft turbofan engines is very computationally intensive. The performance limit of modern day computers with a single processing unit was estimated at 3 billion floating-point operations per second (3 gigaflops). In view of this limit of a sequential unit, performance rates higher than 3 gigaflops can be achieved only through vectorization and/or parallelization, as on the Alliant FX/80. Accordingly, the efforts of this critically needed research were geared towards developing and evaluating parallel finite element methods for static and vibration analysis. A special purpose code, named with the acronym SAPNEW, performs static and eigen analysis of multi-degree-of-freedom blade models built up from flat thin shell elements.
SAPNEW: Parallel finite element code for thin shell structures on the Alliant FX/80

NASA Technical Reports Server (NTRS)

Kamat, Manohar P.; Watson, Brian C.

1992-01-01

The results of a research activity aimed at providing a finite element capability for analyzing turbo-machinery bladed-disk assemblies in a vector/parallel processing environment are summarized. Analysis of aircraft turbofan engines is very computationally intensive. The performance limit of modern day computers with a single processing unit was estimated at 3 billion floating-point operations per second (3 gigaflops). In view of this limit of a sequential unit, performance rates higher than 3 gigaflops can be achieved only through vectorization and/or parallelization, as on the Alliant FX/80. Accordingly, the efforts of this critically needed research were geared towards developing and evaluating parallel finite element methods for static and vibration analysis. A special purpose code, named with the acronym SAPNEW, performs static and eigen analysis of multi-degree-of-freedom blade models built up from flat thin shell elements.
Impedance computations and beam-based measurements: A problem of discrepancy

DOE PAGES

Smaluk, Victor

2018-04-21

High intensity of particle beams is crucial for high-performance operation of modern electron-positron storage rings, both colliders and light sources. The beam intensity is limited by the interaction of the beam with self-induced electromagnetic fields (wake fields) proportional to the vacuum chamber impedance. For a new accelerator project, the total broadband impedance is computed by element-wise wake-field simulations using computer codes. For a machine in operation, the impedance can be measured experimentally using beam-based techniques. In this article, a comparative analysis of impedance computations and beam-based measurements is presented for 15 electron-positron storage rings. The measured data and the predictions based on the computed impedance budgets show a significant discrepancy. For this article, three possible reasons for the discrepancy are discussed: interference of the wake fields excited by a beam in adjacent components of the vacuum chamber, effect of computation mesh size, and effect of insufficient bandwidth of the computed impedance.
Advanced Scientific Computing Research Exascale Requirements Review. An Office of Science review sponsored by Advanced Scientific Computing Research, September 27-29, 2016, Rockville, Maryland

DOE Office of Scientific and Technical Information (OSTI.GOV)

Almgren, Ann; DeMar, Phil; Vetter, Jeffrey

The widespread use of computing in the American economy would not be possible without a thoughtful, exploratory research and development (R&D) community pushing the performance edge of operating systems, computer languages, and software libraries. These are the tools and building blocks (the hammers, chisels, bricks, and mortar) of the smartphone, the cloud, and the computing services on which we rely. Engineers and scientists need ever-more specialized computing tools to discover new material properties for manufacturing, make energy generation safer and more efficient, and provide insight into the fundamentals of the universe, for example. The research division of the U.S. Department of Energy’s (DOE’s) Office of Advanced Scientific Computing and Research (ASCR Research) ensures that these tools and building blocks are being developed and honed to meet the extreme needs of modern science. See also http://exascaleage.org/ascr/ for additional information.
Impedance computations and beam-based measurements: A problem of discrepancy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Smaluk, Victor

High intensity of particle beams is crucial for high-performance operation of modern electron-positron storage rings, both colliders and light sources. The beam intensity is limited by the interaction of the beam with self-induced electromagnetic fields (wake fields) proportional to the vacuum chamber impedance. For a new accelerator project, the total broadband impedance is computed by element-wise wake-field simulations using computer codes. For a machine in operation, the impedance can be measured experimentally using beam-based techniques. In this article, a comparative analysis of impedance computations and beam-based measurements is presented for 15 electron-positron storage rings. The measured data and the predictions based on the computed impedance budgets show a significant discrepancy. For this article, three possible reasons for the discrepancy are discussed: interference of the wake fields excited by a beam in adjacent components of the vacuum chamber, effect of computation mesh size, and effect of insufficient bandwidth of the computed impedance.
The Analog Revolution and Its On-Going Role in Modern Analytical Measurements.

PubMed

Enke, Christie G

2015-12-15

The electronic revolution in analytical instrumentation began when we first exceeded the two-digit resolution of panel meters and chart recorders and then took the first steps into automated control. It started with the first uses of operational amplifiers (op amps) in the analog domain 20 years before the digital computer entered the analytical lab. Their application greatly increased both accuracy and precision in chemical measurement and they provided an elegant means for the electronic control of experimental quantities. Later, laboratory and personal computers provided an unlimited readout resolution and enabled programmable control of instrument parameters as well as storage and computation of acquired data. However, digital computers did not replace the op amp's critical role of converting the analog sensor's output to a robust and accurate voltage. Rather it added a new role: converting that voltage into a number. These analog operations are generally the limiting portions of our computerized instrumentation systems. Operational amplifier performance in gain, input current and resistance, offset voltage, and rise time have improved by a remarkable 3-4 orders of magnitude since their first implementations. Each 10-fold improvement has opened the doors for the development of new techniques in all areas of chemical analysis. Along with some interesting history, the multiple roles op amps play in modern instrumentation are described along with a number of examples of new areas of analysis that have been enabled by their improvements.
Comparing performance of many-core CPUs and GPUs for static and motion compensated reconstruction of C-arm CT data.

PubMed

Hofmann, Hannes G; Keck, Benjamin; Rohkohl, Christopher; Hornegger, Joachim

2011-01-01

Interventional reconstruction of 3-D volumetric data from C-arm CT projections is a computationally demanding task. Hardware optimization is not an option but mandatory for interventional image processing and, in particular, for image reconstruction due to the high demands on performance. Several groups have published fast analytical 3-D reconstruction on highly parallel hardware such as GPUs to mitigate this issue. The authors show that the performance of modern CPU-based systems is in the same order as current GPUs for static 3-D reconstruction and outperforms them for a recent motion compensated (3-D+time) image reconstruction algorithm. This work investigates two algorithms: Static 3-D reconstruction as well as a recent motion compensated algorithm. The evaluation was performed using a standardized reconstruction benchmark, RABBITCT, to get comparable results and two additional clinical data sets. The authors demonstrate for a parametric B-spline motion estimation scheme that the derivative computation, which requires many write operations to memory, performs poorly on the GPU and can highly benefit from modern CPU architectures with large caches. Moreover, on a 32-core Intel Xeon server system, the authors achieve linear scaling with the number of cores used and reconstruction times almost in the same range as current GPUs. Algorithmic innovations in the field of motion compensated image reconstruction may lead to a shift back to CPUs in the future. For analytical 3-D reconstruction, the authors show that the gap between GPUs and CPUs became smaller. It can be performed in less than 20 s (on-the-fly) using a 32-core server.
Improving robustness and computational efficiency using modern C++

DOE Office of Scientific and Technical Information (OSTI.GOV)

Paterno, M.; Kowalkowski, J.; Green, C.

2014-01-01

For nearly two decades, the C++ programming language has been the dominant programming language for experimental HEP. The publication of ISO/IEC 14882:2011, the current version of the international standard for the C++ programming language, makes available a variety of language and library facilities for improving the robustness, expressiveness, and computational efficiency of C++ code. However, much of the C++ written by the experimental HEP community does not take advantage of the features of the language to obtain these benefits, either due to lack of familiarity with these features or concern that these features must somehow be computationally inefficient. In this paper, we address some of the features of modern C++, and show how they can be used to make programs that are both robust and computationally efficient. We compare and contrast simple yet realistic examples of some common implementation patterns in C, currently-typical C++, and modern C++, and show (when necessary, down to the level of generated assembly language code) the quality of the executable code produced by recent C++ compilers, with the aim of allowing the HEP community to make informed decisions on the costs and benefits of the use of modern C++.
On the Impact of Execution Models: A Case Study in Computational Chemistry

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chavarría-Miranda, Daniel; Halappanavar, Mahantesh; Krishnamoorthy, Sriram

2015-05-25

Efficient utilization of high-performance computing (HPC) platforms is an important and complex problem. Execution models, abstract descriptions of the dynamic runtime behavior of the execution stack, have significant impact on the utilization of HPC systems. Using a computational chemistry kernel as a case study and a wide variety of execution models combined with load balancing techniques, we explore the impact of execution models on the utilization of an HPC system. We demonstrate a 50 percent improvement in performance by using work stealing relative to a more traditional static scheduling approach. We also use a novel semi-matching technique for load balancing that has comparable performance to a traditional hypergraph-based partitioning implementation, which is computationally expensive. Using this study, we found that execution model design choices and assumptions can limit critical optimizations such as global, dynamic load balancing and finding the correct balance between available work units and different system and runtime overheads. With the emergence of multi- and many-core architectures and the consequent growth in the complexity of HPC platforms, we believe that these lessons will be beneficial to researchers tuning diverse applications on modern HPC platforms, especially on emerging dynamic platforms with energy-induced performance variability.
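Why dynamic load balancing can beat static scheduling when task costs are uneven can be seen in a toy simulation. This sketch is not the study's work-stealing runtime: it substitutes a simpler dynamic policy, in which each task goes to whichever worker becomes free first, and the task costs are made-up numbers. Both functions return the makespan, i.e. the time at which the last worker finishes.

```python
import heapq


def static_makespan(costs, workers):
    """Contiguous block partition: worker w gets a fixed slice of tasks."""
    base, extra = divmod(len(costs), workers)
    loads, start = [], 0
    for w in range(workers):
        size = base + (1 if w < extra else 0)
        loads.append(sum(costs[start:start + size]))
        start += size
    return max(loads)


def dynamic_makespan(costs, workers):
    """Greedy dynamic scheduling: next task goes to the earliest-free worker.

    Used here as a simplified stand-in for work stealing, which reaches a
    similar balance by letting idle workers take tasks from busy ones.
    """
    heap = [(0, w) for w in range(workers)]   # (time worker is free, id)
    heapq.heapify(heap)
    for cost in costs:
        free_at, w = heapq.heappop(heap)
        heapq.heappush(heap, (free_at + cost, w))
    return max(free_at for free_at, _ in heap)


if __name__ == "__main__":
    costs = [8, 7, 6, 5, 1, 1, 1, 1]     # hypothetical, unevenly sized tasks
    print(static_makespan(costs, 2))     # 26
    print(dynamic_makespan(costs, 2))    # 15
```

In this made-up case the static split leaves one worker idle for most of the run, whereas the dynamic policy reaches the ideal makespan of 15, which is half the total work.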
On the Emergence of Modern Humans

ERIC Educational Resources Information Center

Amati, Daniele; Shallice, Tim

2007-01-01

The emergence of modern humans with their extraordinary cognitive capacities is ascribed to a novel type of cognitive computational process (sustained non-routine multi-level operations) required for abstract projectuality, held to be the common denominator of the cognitive capacities specific to modern humans. A brain operation (latching) that…
GeoBuilder: a geometric algorithm visualization and debugging system for 2D and 3D geometric computing.

PubMed

Wei, Jyh-Da; Tsai, Ming-Hung; Lee, Gen-Cher; Huang, Jeng-Hung; Lee, Der-Tsai

2009-01-01

Algorithm visualization is a unique research topic that integrates engineering skills such as computer graphics, system programming, database management, computer networks, etc., to facilitate algorithmic researchers in testing their ideas, demonstrating new findings, and teaching algorithm design in the classroom. Within the broad applications of algorithm visualization, there still remain performance issues that deserve further research, e.g., system portability, collaboration capability, and animation effect in 3D environments. Using modern technologies of Java programming, we develop an algorithm visualization and debugging system, dubbed GeoBuilder, for geometric computing. The GeoBuilder system features Java's promising portability, engagement of collaboration in algorithm development, and automatic camera positioning for tracking 3D geometric objects. In this paper, we describe the design of the GeoBuilder system and demonstrate its applications.
Through Kazan ASPERA to Modern Projects

NASA Astrophysics Data System (ADS)

Gusev, Alexander; Kitiashvili, Irina; Petrova, Natasha

The European Union is now forming the Sixth Framework Programme. One of the objectives of the EU Programme is opening national research and training programmes. Russian PhD students and young astronomers have business and financial difficulties in accessing modern databases and astronomical projects, and so they have not been included in the European overview of priorities. Modern requirements for organizing observational projects on powerful telescopes assume painstaking scientific computer preparation of the application. Rigid competition for observation time assumes preliminary computer modeling of the target object for the application to succeed. Kazan AstroGeoPhysics Partnership
Dynamic provisioning of local and remote compute resources with OpenStack

NASA Astrophysics Data System (ADS)

Giffels, M.; Hauth, T.; Polgart, F.; Quast, G.

2015-12-01

Modern high-energy physics experiments rely on the extensive usage of computing resources, both for the reconstruction of measured events and for Monte-Carlo simulation. The Institut für Experimentelle Kernphysik (EKP) at KIT is participating in both the CMS and Belle experiments with computing and storage resources. In the upcoming years, these requirements are expected to increase due to the growing amount of recorded data and the rise in complexity of the simulated events. It is therefore essential to increase the available computing capabilities by tapping into all resource pools. At the EKP institute, powerful desktop machines are available to users. Due to the multi-core nature of modern CPUs, vast amounts of CPU time are not utilized by common desktop usage patterns. Other important providers of compute capabilities are classical HPC data centers at universities or national research centers. Due to the shared nature of these installations, the standardized software stack required by HEP applications cannot be installed. A viable way to overcome this constraint and offer a standardized software environment in a transparent manner is the usage of virtualization technologies. The OpenStack project has become a widely adopted solution to virtualize hardware and offer additional services like storage and virtual machine management. This contribution will report on the incorporation of the institute's desktop machines into a private OpenStack Cloud. The additional compute resources provisioned via the virtual machines have been used for Monte-Carlo simulation and data analysis. Furthermore, a concept to integrate shared, remote HPC centers into regular HEP job workflows will be presented. In this approach, local and remote resources are merged to form a uniform, virtual compute cluster with a single point-of-entry for the user. Evaluations of the performance and stability of this setup and operational experiences will be discussed.
WE-D-303-01: Development and Application of Digital Human Phantoms

DOE Office of Scientific and Technical Information (OSTI.GOV)

Segars, P.

2015-06-15

Modern medical physics deals with complex problems such as 4D radiation therapy and imaging quality optimization. Such problems involve a large number of radiological parameters, and anatomical and physiological breathing patterns. A major challenge is how to develop, test, evaluate and compare various new imaging and treatment techniques, which often involves testing over a large range of radiological parameters as well as varying patient anatomies and motions. It would be extremely challenging, if not impossible, both ethically and practically, to test every combination of parameters and every task on every type of patient under clinical conditions. Computer-based simulation using computational phantoms offers a practical technique with which to evaluate, optimize, and compare imaging technologies and methods. Within simulation, the computerized phantom provides a virtual model of the patient’s anatomy and physiology. Imaging data can be generated from it as if it was a live patient using accurate models of the physics of the imaging and treatment process. With sophisticated simulation algorithms, it is possible to perform virtual experiments entirely on the computer. By serving as virtual patients, computational phantoms hold great promise in solving some of the most complex problems in modern medical physics. In this proposed symposium, we will present the history and recent developments of computational phantom models, share experiences in their application to advanced imaging and radiation applications, and discuss their promises and limitations. Learning Objectives: (1) Understand the need and requirements of computational phantoms in medical physics research; (2) Discuss the developments and applications of computational phantoms; (3) Know the promises and limitations of computational phantoms in solving complex problems.
Multicore Architecture-aware Scientific Applications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Srinivasa, Avinash

Modern high performance systems are becoming increasingly complex and powerful due to advancements in processor and memory architecture. In order to keep up with this increasing complexity, applications have to be augmented with certain capabilities to fully exploit such systems. These may be at the application level, such as static or dynamic adaptations, or at the system level, like having strategies in place to override some of the default operating system policies, the main objective being to improve computational performance of the application. The current work proposes two such capabilities with respect to multi-threaded scientific applications, in particular a large scale physics application computing ab-initio nuclear structure. The first involves using a middleware tool to invoke dynamic adaptations in the application, so as to be able to adjust to the changing computational resource availability at run-time. The second involves a strategy for effective placement of data in main memory, to optimize memory access latencies and bandwidth. These capabilities, when included, were found to have a significant impact on the application performance, resulting in average speedups of as much as two to four times.
Digital image processing for information extraction.

NASA Technical Reports Server (NTRS)

Billingsley, F. C.

1973-01-01

The modern digital computer has made practical image processing techniques for handling nonlinear operations in both the geometrical and the intensity domains, various types of nonuniform noise cleanup, and the numerical analysis of pictures. An initial requirement is that a number of anomalies caused by the camera (e.g., geometric distortion, MTF roll-off, vignetting, and nonuniform intensity response) must be taken into account or removed to avoid their interference with the information extraction process. Examples illustrating these operations are discussed along with computer techniques used to emphasize details, perform analyses, classify materials by multivariate analysis, detect temporal differences, and aid in human interpretation of photos.
Fast Photon Monte Carlo for Water Cherenkov Detectors

NASA Astrophysics Data System (ADS)

Latorre, Anthony; Seibert, Stanley

2012-03-01

We present Chroma, a high performance optical photon simulation for large particle physics detectors, such as the water Cerenkov far detector option for LBNE. This software takes advantage of the CUDA parallel computing platform to propagate photons using modern graphics processing units. In a computer model of a 200 kiloton water Cerenkov detector with 29,000 photomultiplier tubes, Chroma can propagate 2.5 million photons per second, around 200 times faster than the same simulation with Geant4. Chroma uses a surface based approach to modeling geometry which offers many benefits over a solid based modelling approach which is used in other simulations like Geant4.
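The abstract says only that geometry is modeled by surfaces rather than solids. As a rough illustration of what a surface-based photon step can look like, the sketch below assumes the surface is a triangle mesh and tests one photon ray against one triangle with the textbook Möller-Trumbore method. The function name, the triangle-mesh assumption, and the sample coordinates are illustrative and are not taken from Chroma itself.

```python
# Ray-triangle intersection (Moller-Trumbore), pure Python.
# Assumption: the detector surface is a triangle mesh; this is a generic
# textbook routine, not code from Chroma or Geant4.

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_distance(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the distance along the ray to the triangle, or None on a miss."""
    edge1 = _sub(v1, v0)
    edge2 = _sub(v2, v0)
    p = _cross(direction, edge2)
    det = _dot(edge1, p)
    if abs(det) < eps:            # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    t_vec = _sub(origin, v0)
    u = _dot(t_vec, p) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = _cross(t_vec, edge1)
    v = _dot(direction, q) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = _dot(edge2, q) * inv_det
    return t if t > eps else None     # ignore hits behind the ray origin


# Hypothetical triangle in the z = 0 plane and a photon travelling in -z.
triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
hit = ray_triangle_distance((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *triangle)
miss = ray_triangle_distance((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), *triangle)
print(hit, miss)
```

Because each photon's test is independent of every other photon's, this kind of routine maps naturally onto one GPU thread per photon, which is the style of parallelism the abstract attributes to CUDA.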
Propulsion system/flight control integration for supersonic aircraft

NASA Technical Reports Server (NTRS)

Reukauf, P. J.; Burcham, F. W., Jr.

1976-01-01

Digital integrated control systems are studied. Such systems allow minimization of undesirable interactions while maximizing performance at all flight conditions. One such program is the YF-12 cooperative control program. The existing analog air data computer, autothrottle, autopilot, and inlet control systems are converted to digital systems by using a general purpose airborne computer and interface unit. Existing control laws are programed and tested in flight. Integrated control laws, derived using accurate mathematical models of the airplane and propulsion system in conjunction with modern control techniques, are tested in flight. Analysis indicates that an integrated autothrottle autopilot gives good flight path control and that observers are used to replace failed sensors.
Evaluating Modern Defenses Against Control Flow Hijacking

DTIC Science & Technology

2015-09-01

unsound and could introduce false negatives (opening up another possible set of attacks). CFG Construction using DSA We next evaluate the precision of CFG...Evaluating Modern Defenses Against Control Flow Hijacking by Ulziibayar Otgonbaatar Submitted to the Department of Electrical Engineering and...Computer Science in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering at the MASSACHUSETTS
Translations on Eastern Europe, Scientific Affairs, Number 542.

DTIC Science & Technology

1977-04-18

transplanting human tissue has not as yet been given a final juridical approval like euthanasia, artificial insemination, abortion, birth control, and others...and data teleprocessing. This computer may also be used as a satellite computer for complex systems. The IZOT 310 has a large instruction...a well-known truth that modern science is using the most modern and leading technical facilities—from bathyscaphes to satellites, from gigantic

Resiliency in Future Cyber Combat

DTIC Science & Technology

2016-04-04

including the Internet, telecommunications networks, computer systems, and embedded processors and controllers.”6 One important point emerging from the...definition is that while the Internet is part of cyberspace, it is not all of cyberspace. Any computer processor capable of communicating with a...central processor on a modern car are all part of cyberspace, although only some of them are routinely connected to the Internet. Most modern
Feasibility study for the implementation of NASTRAN on the ILLIAC 4 parallel processor

NASA Technical Reports Server (NTRS)

Field, E. I.

1975-01-01

The ILLIAC IV, a fourth generation multiprocessor using parallel processing hardware concepts, is operational at Moffett Field, California. Its capability to excel at matrix manipulation, makes the ILLIAC well suited for performing structural analyses using the finite element displacement method. The feasibility of modifying the NASTRAN (NASA structural analysis) computer program to make effective use of the ILLIAC IV was investigated. The characteristics are summarized of the ILLIAC and the ARPANET, a telecommunications network which spans the continent making the ILLIAC accessible to nearly all major industrial centers in the United States. Two distinct approaches are studied: retaining NASTRAN as it now operates on many of the host computers of the ARPANET to process the input and output while using the ILLIAC only for the major computational tasks, and installing NASTRAN to operate entirely in the ILLIAC environment. Though both alternatives offer similar and significant increases in computational speed over modern third generation processors, the full installation of NASTRAN on the ILLIAC is recommended. Specifications are presented for performing that task with manpower estimates and schedules to correspond.
Computer Technology: State of the Art.

ERIC Educational Resources Information Center

Withington, Frederic G.

1981-01-01

Describes the nature of modern general-purpose computer systems, including hardware, semiconductor electronics, microprocessors, computer architecture, input output technology, and system control programs. Seven suggested readings are cited. (FM)
Performance Analysis, Design Considerations, and Applications of Extreme-Scale In Situ Infrastructures

DOE PAGES

Ayachit, Utkarsh; Bauer, Andrew; Duque, Earl P. N.; ...

2016-11-01

A key trend facing extreme-scale computational science is the widening gap between computational and I/O rates, and the challenge that follows is how to best gain insight from simulation data when it is increasingly impractical to save it to persistent storage for subsequent visual exploration and analysis. One approach to this challenge is centered around the idea of in situ processing, where visualization and analysis processing is performed while data is still resident in memory. Our paper examines several key design and performance issues related to the idea of in situ processing at extreme scale on modern platforms: scalability, overhead, performance measurement and analysis, comparison and contrast with a traditional post hoc approach, and interfacing with simulation codes. We illustrate these principles in practice with studies, conducted on large-scale HPC platforms, that include a miniapplication and multiple science application codes, one of which demonstrates in situ methods in use at greater than 1M-way concurrency.
High Performance Radiation Transport Simulations on TITAN

DOE Office of Scientific and Technical Information (OSTI.GOV)

Baker, Christopher G; Davidson, Gregory G; Evans, Thomas M

2012-01-01

In this paper we describe the Denovo code system. Denovo solves the six-dimensional, steady-state, linear Boltzmann transport equation, of central importance to nuclear technology applications such as reactor core analysis (neutronics), radiation shielding, nuclear forensics and radiation detection. The code features multiple spatial differencing schemes, state-of-the-art linear solvers, the Koch-Baker-Alcouffe (KBA) parallel-wavefront sweep algorithm for inverting the transport operator, a new multilevel energy decomposition method scaling to hundreds of thousands of processing cores, and a modern, novel code architecture that supports straightforward integration of new features. In this paper we discuss the performance of Denovo on the 10-20 petaflop ORNL GPU-based system, Titan. We describe algorithms and techniques used to exploit the capabilities of Titan's heterogeneous compute node architecture and the challenges of obtaining good parallel performance for this sparse hyperbolic PDE solver containing inherently sequential computations. Numerical results demonstrating Denovo performance on early Titan hardware are presented.
Modernizing and transforming medical education at the Kilimanjaro Christian Medical University College.

PubMed

Lisasi, Esther; Kulanga, Ahaz; Muiruri, Charles; Killewo, Lucy; Fadhili, Ndimangwa; Mimano, Lucy; Kapanda, Gibson; Tibyampansha, Dativa; Ibrahim, Glory; Nyindo, Mramba; Mteta, Kien; Kessi, Egbert; Ntabaye, Moshi; Bartlett, John

2014-08-01

The Kilimanjaro Christian Medical University (KCMU) College and the Medical Education Partnership Initiative (MEPI) are addressing the crisis in Tanzanian health care manpower by modernizing the college's medical education with new tools and techniques. With a $10 million MEPI grant and the participation of its partner, Duke University, KCMU is harnessing the power of information technology (IT) to upgrade tools for students and faculty. Initiatives in eLearning have included bringing fiber-optic connectivity to the campus, offering campus-wide wireless access, opening student and faculty computer laboratories, and providing computer tablets to all incoming medical students. Beyond IT, the college is also offering wet laboratory instruction for hands-on diagnostic skills, team-based learning, and clinical skills workshops. In addition, modern teaching tools and techniques address the challenges posed by increasing numbers of students. To provide incentives for instructors, a performance-based compensation plan and teaching awards have been established. Also for faculty, IT tools and training have been made available, and a medical education course management system is now being widely employed. Student and faculty responses have been favorable, and the rapid uptake of these interventions by students, faculty, and the college's administration suggests that the KCMU College MEPI approach has addressed unmet needs. This enabling environment has transformed the culture of learning and teaching at KCMU College, where a path to sustainability is now being pursued.
Modernizing and Transforming Medical Education at the Kilimanjaro Christian Medical University College

PubMed Central

Lisasi, Esther; Kulanga, Ahaz; Muiruri, Charles; Killewo, Lucy; Fadhili, Ndimangwa; Mimano, Lucy; Kapanda, Gibson; Tibyampansha, Dativa; Ibrahim, Glory; Nyindo, Mramba; Mteta, Kien; Kessi, Egbert; Ntabaye, Moshi; Bartlett, John

2014-01-01

The Kilimanjaro Christian Medical University (KCMU) College and the Medical Education Partnership Initiative (MEPI) are addressing the crisis in Tanzanian health care manpower by modernizing the college’s medical education with new tools and techniques. With a $10 million MEPI grant and the participation of its partner, Duke University, KCMU is harnessing the power of information technology (IT) to upgrade tools for students and faculty. Initiatives in eLearning have included bringing fiber-optic connectivity to the campus, offering campus-wide wireless access, opening student and faculty computer laboratories, and providing computer tablets to all incoming medical students. Beyond IT, the college is also offering wet laboratory instruction for hands-on diagnostic skills, team-based learning, and clinical skills workshops. In addition, modern teaching tools and techniques address the challenges posed by increasing numbers of students. To provide incentives for instructors, a performance-based compensation plan and teaching awards have been established. Also for faculty, IT tools and training have been made available, and a medical education course management system is now being widely employed. Student and faculty responses have been favorable, and the rapid uptake of these interventions by students, faculty, and the college’s administration suggests that the KCMU College MEPI approach has addressed unmet needs. This enabling environment has transformed the culture of learning and teaching at KCMU College, where a path to sustainability is now being pursued. PMID:25072581
Extreme Scale Computing to Secure the Nation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brown, D L; McGraw, J R; Johnson, J R

2009-11-10

Since the dawn of modern electronic computing in the mid-1940s, U.S. national security programs have been dominant users of every new generation of high-performance computer. Indeed, the first general-purpose electronic computer, ENIAC (the Electronic Numerical Integrator and Computer), was used to calculate the expected explosive yield of early thermonuclear weapons designs. Even the U.S. numerical weather prediction program, another early application for high-performance computing, was initially funded jointly by sponsors that included the U.S. Air Force and Navy, agencies interested in accurate weather predictions to support U.S. military operations. For the decades of the cold war, national security requirements continued to drive the development of high performance computing (HPC), including advancement of the computing hardware and development of sophisticated simulation codes to support weapons and military aircraft design, numerical weather prediction as well as data-intensive applications such as cryptography and cybersecurity. U.S. national security concerns continue to drive the development of high-performance computers and software in the U.S. and in fact, events following the end of the cold war have driven an increase in the growth rate of computer performance at the high-end of the market. This mainly derives from our nation's observance of a moratorium on underground nuclear testing beginning in 1992, followed by our voluntary adherence to the Comprehensive Test Ban Treaty (CTBT) beginning in 1995. The CTBT prohibits further underground nuclear tests, which in the past had been a key component of the nation's science-based program for assuring the reliability, performance and safety of U.S. nuclear weapons. In response to this change, the U.S.
Department of Energy (DOE) initiated the Science-Based Stockpile Stewardship (SBSS) program in response to the Fiscal Year 1994 National Defense Authorization Act, which requires, 'in the absence of nuclear testing, a program to: (1) Support a focused, multifaceted program to increase the understanding of the enduring stockpile; (2) Predict, detect, and evaluate potential problems of the aging of the stockpile; (3) Refurbish and re-manufacture weapons and components, as required; and (4) Maintain the science and engineering institutions needed to support the nation's nuclear deterrent, now and in the future'. This program continues to fulfill its national security mission by adding significant new capabilities for producing scientific results through large-scale computational simulation coupled with careful experimentation, including sub-critical nuclear experiments permitted under the CTBT. To develop the computational science and the computational horsepower needed to support its mission, SBSS initiated the Accelerated Strategic Computing Initiative, later renamed the Advanced Simulation & Computing (ASC) program (sidebar: 'History of ASC Computing Program Computing Capability'). The modern 3D computational simulation capability of the ASC program supports the assessment and certification of the current nuclear stockpile through calibration with past underground test (UGT) data. While an impressive accomplishment, continued evolution of national security mission requirements will demand computing resources at a significantly greater scale than we have today. In particular, continued observance and potential Senate confirmation of the Comprehensive Test Ban Treaty (CTBT) together with the U.S. administration's promise for a significant reduction in the size of the stockpile and the inexorable aging and consequent refurbishment of the stockpile all demand increasing refinement of our computational simulation capabilities.
Assessment of the present and future stockpile with increased confidence of the safety and reliability without reliance upon calibration with past or future test data is a long-term goal of the ASC program. This will be accomplished through significant increases in the scientific bases that underlie the computational tools. Computer codes must be developed that replace phenomenology with increased levels of scientific understanding together with an accompanying quantification of uncertainty. These advanced codes will place significantly higher demands on the computing infrastructure than do the current 3D ASC codes. This article discusses not only the need for a future computing capability at the exascale for the SBSS program, but also considers high performance computing requirements for broader national security questions. For example, the increasing concern over potential nuclear terrorist threats demands a capability to assess threats and potential disablement technologies as well as a rapid forensic capability for determining a nuclear weapons design from post-detonation evidence (nuclear counterterrorism).
Cosmological neutrino simulations at extreme scale

DOE PAGES

Emberson, J. D.; Yu, Hao-Ran; Inman, Derek; ...

2017-08-01

Constraining neutrino mass remains an elusive challenge in modern physics. Precision measurements are expected from several upcoming cosmological probes of large-scale structure. Achieving this goal relies on an equal level of precision from theoretical predictions of neutrino clustering. Numerical simulations of the non-linear evolution of cold dark matter and neutrinos play a pivotal role in this process. We incorporate neutrinos into the cosmological N-body code CUBEP3M and discuss the challenges associated with pushing to the extreme scales demanded by the neutrino problem. We highlight code optimizations made to exploit modern high performance computing architectures and present a novel method of data compression that reduces the phase-space particle footprint from 24 bytes in single precision to roughly 9 bytes. We scale the neutrino problem to the Tianhe-2 supercomputer and provide details of our production run, named TianNu, which uses 86% of the machine (13,824 compute nodes). With a total of 2.97 trillion particles, TianNu is currently the world’s largest cosmological N-body simulation and improves upon previous neutrino simulations by two orders of magnitude in scale. We finish with a discussion of the unanticipated computational challenges that were encountered during the TianNu runtime.
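As a back-of-the-envelope check on the compression figures quoted in this abstract, the sketch below computes the aggregate particle storage before and after compression. It assumes the 24-byte figure corresponds to six single-precision (4-byte) values per particle, three position and three velocity components. That breakdown, the decimal terabyte convention, and all names are illustrative assumptions rather than details stated in the abstract.

```python
# Rough memory estimate from the figures quoted in the abstract.
# Assumption: 24 bytes = 6 single-precision values (3 position + 3 velocity).

N_PARTICLES = 2.97e12        # total particles quoted for TianNu
BYTES_UNCOMPRESSED = 6 * 4   # six 4-byte floats per particle (assumed layout)
BYTES_COMPRESSED = 9         # "roughly 9 bytes" per particle after compression
N_NODES = 13_824             # compute nodes used in the production run


def footprint_terabytes(n_particles, bytes_per_particle):
    """Return aggregate particle storage in terabytes (1 TB = 1e12 bytes)."""
    return n_particles * bytes_per_particle / 1e12


uncompressed_tb = footprint_terabytes(N_PARTICLES, BYTES_UNCOMPRESSED)
compressed_tb = footprint_terabytes(N_PARTICLES, BYTES_COMPRESSED)
reduction = BYTES_UNCOMPRESSED / BYTES_COMPRESSED
# Average share per node, ignoring load imbalance and any non-particle data.
per_node_gb = compressed_tb * 1e3 / N_NODES

print(f"uncompressed: {uncompressed_tb:.2f} TB")
print(f"compressed:   {compressed_tb:.2f} TB")
print(f"reduction:    {reduction:.2f}x")
print(f"per node:     {per_node_gb:.2f} GB")
```

Under these assumptions the particle data shrinks from roughly 71 TB to roughly 27 TB, which gives a sense of why per-particle compression matters at this particle count.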
Determination of Static and Dynamic Stability Derivatives Using Beggar

DTIC Science & Technology

2008-03-01

applies a symmetric Gauss-Seidel method, which solves the generic equation [A]x = b for x by dividing [A] into [A] = ([D][L])[U] (2.68) 37 where D is...oscillations. Convergence studies on each of these parameters were performed to ensure both convergence and solution independence. Roll stability derivatives...aerodynamic stability parame- ters made dynamic solutions more difficult to compute and less reliable, but modern methods and resources are making this process
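The record above is truncated, so the exact splitting in its equation (2.68) cannot be recovered here. For orientation only, the following is a generic textbook sketch of symmetric Gauss-Seidel iteration for [A]x = b: each iteration performs a forward sweep over the rows followed by a backward sweep. The function name and the 3x3 test system are illustrative assumptions and do not represent the Beggar implementation.

```python
# Symmetric Gauss-Seidel: a forward sweep followed by a backward sweep per
# iteration. Pure-Python sketch for small dense systems (textbook form).

def symmetric_gauss_seidel(A, b, iterations=50, tol=1e-12):
    """Approximately solve A x = b, with A given as a list of rows."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        x_old = x[:]
        # Forward sweep (rows 0..n-1), then backward sweep (rows n-1..0).
        for order in (range(n), range(n - 1, -1, -1)):
            for i in order:
                off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
                x[i] = (b[i] - off_diag) / A[i][i]
        # Stop once a full symmetric iteration no longer changes x.
        if max(abs(x[i] - x_old[i]) for i in range(n)) < tol:
            break
    return x


# Small diagonally dominant system chosen for illustration; its exact
# solution is x = (1, 2, 3).
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = symmetric_gauss_seidel(A, b)
print(x)
```

Convergence is guaranteed here because the test matrix is strictly diagonally dominant. A production flow solver would apply the sweeps to a sparse block system rather than a dense list of rows.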
DoD High Performance Computing Modernization Program FY16 Annual Report

DTIC Science & Technology

2018-05-02

vortex shedding from rotor blade tips using adaptive mesh refinement gives Helios the unique capability to assess the interaction of these vortices...with the fuselage and nearby rotor blades . Helios provides all the benefits for rotary-winged aircraft that Kestrel does for fixed-wing aircraft...rotor blade upgrade of the CH-47F Chinook helicopter to achieve up to an estimated 2,000 pounds increase in hover thrust (~10%) with limited
2009 High Performance Computing Modernization Program Users Group Conference

DTIC Science & Technology

2009-06-17

Asymmetric Threats Future Peer GWoT / ungoverned areas Irregular Warfare Low-end Asymmetric 1-4-2-1 (State-to-State War) Disruptive technologies Superiority...2008 “As changes in this century’s threat environment create strategic challenges – irregular warfare, weapons of mass destruction, disruptive ... technologies – this request places greater emphasis on basic research, which in recent years has not kept pace with other parts of the budget.” • Personnel
Analysis of impact of general-purpose graphics processor units in supersonic flow modeling

NASA Astrophysics Data System (ADS)

Emelyanov, V. N.; Karpenko, A. G.; Kozelkov, A. S.; Teterina, I. V.; Volkov, K. N.; Yalozo, A. V.

2017-06-01

Computational methods are widely used to predict complex flowfields associated with off-normal situations in aerospace engineering. Modern graphics processing units (GPUs) provide architectures and new programming models that make it possible to harness their large processing power and to design computational fluid dynamics (CFD) simulations at both high performance and low cost. The possibilities of using GPUs for the simulation of external and internal flows on unstructured meshes are discussed. The finite volume method is applied to solve three-dimensional unsteady compressible Euler and Navier-Stokes equations on unstructured meshes with high-resolution numerical schemes. CUDA technology is used to implement the parallel computational algorithms. Solutions of some benchmark test cases on GPUs are reported, and the computed results are compared with experimental and computational data. Approaches to optimizing the CFD code related to the use of different types of memory are considered. The speedup of the GPU solution relative to the solution on a central processing unit (CPU) is evaluated. Performance measurements show that the numerical schemes developed achieve a 20-50x speedup on GPU hardware compared to the CPU reference implementation. The results obtained are promising for designing a GPU-based software framework for applications in CFD.
Fast neuromimetic object recognition using FPGA outperforms GPU implementations.

PubMed

Orchard, Garrick; Martin, Jacob G; Vogelstein, R Jacob; Etienne-Cummings, Ralph

2013-08-01

Recognition of objects in still images has traditionally been regarded as a difficult computational problem. Although modern automated methods for visual object recognition have achieved steadily increasing recognition accuracy, even the most advanced computational vision approaches are unable to obtain performance equal to that of humans. This has led to the creation of many biologically inspired models of visual object recognition, among them the hierarchical model and X (HMAX) model. HMAX is traditionally known to achieve high accuracy in visual object recognition tasks at the expense of significant computational complexity. Increasing complexity, in turn, increases computation time, reducing the number of images that can be processed per unit time. In this paper we describe how the computationally intensive and biologically inspired HMAX model for visual object recognition can be modified for implementation on a commercial field-programmable gate array, specifically the Xilinx Virtex 6 ML605 evaluation board with XC6VLX240T FPGA. We show that with minor modifications to the traditional HMAX model we can perform recognition on images of size 128 × 128 pixels at a rate of 190 images per second with a less than 1% loss in recognition accuracy in both binary and multiclass visual object recognition tasks.
Orthorectification by Using Gpgpu Method

NASA Astrophysics Data System (ADS)

Sahin, H.; Kulur, S.

2012-07-01

Owing to the nature of graphics processing, newly released products offer highly parallel processing units with high memory bandwidth and computational power of more than a teraflop. Modern GPUs are not only powerful graphics engines but also highly parallel programmable processors with very fast computing capabilities and high memory bandwidth compared to central processing units (CPUs). Data-parallel computation can be briefly described as mapping data elements to parallel processing threads. The rapid development of GPU programmability and capabilities has attracted the attention of researchers dealing with complex problems that need intensive calculations. This interest has given rise to the concepts of "General Purpose Computation on Graphics Processing Units (GPGPU)" and "stream processing". Graphics processors are powerful hardware that is cheap and affordable, so they have become an alternative to computer processors. Graphics chips, which were once standard application hardware, have been transformed into modern, powerful, programmable processors to meet overall needs. Especially in recent years, the use of graphics processing units in general-purpose computation has led researchers and developers to this point. The biggest problem is that graphics processing units use programming models that differ from current programming methods. Therefore, efficient GPU programming requires re-coding the current program algorithm with the limitations and structure of the graphics hardware in mind. Currently, multi-core processors cannot be programmed using traditional programming methods, and the event procedure programming method cannot be used for programming multi-core processors. GPUs are especially effective at repeating the same computing steps over many data elements when high accuracy is needed, and thus carry out the computation more quickly and accurately. Compared to GPUs, CPUs, which perform just one computation at a time according to the flow control, are slower. This structure can be evaluated for various applications of computer technology. This study covers how general-purpose parallel programming and the computational power of GPUs can be used in photogrammetric applications, especially direct georeferencing. The direct georeferencing algorithm is coded using the GPGPU method and the CUDA (Compute Unified Device Architecture) programming language. Results provided by this method were compared with traditional CPU programming. In the other application, projective rectification is coded using the GPGPU method and the CUDA programming language. Sample images of various sizes were evaluated and compared with the results of the program. The GPGPU method can be used especially for repeating the same computations on highly dense data, thus finding the solution quickly.
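The idea of mapping data elements to independent threads can be illustrated with the coordinate step of projective rectification. The sketch below is a plain Python stand-in, not the authors' CUDA code: it assumes the rectification is expressed as a 3x3 homography H applied to each pixel coordinate, and the matrix values, function names, and grid size are all hypothetical. What matters is that no pixel depends on any other pixel's result, which is what would allow one GPU thread per pixel.

```python
# Projective mapping applied independently to every pixel coordinate.
# Each output depends only on its own (col, row); on a GPU each call to
# apply_homography could run in its own thread. Here map() plays that role.

def apply_homography(H, point):
    """Map (x, y) through the 3x3 matrix H using homogeneous coordinates."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def rectify_coordinates(H, width, height):
    """Return the mapped coordinate for every pixel of a width x height grid."""
    pixels = [(col, row) for row in range(height) for col in range(width)]
    # No pixel reads another pixel's result, so evaluation order is free.
    return list(map(lambda p: apply_homography(H, p), pixels))


# Hypothetical homography: scale by 2, shift by (10, 20), no perspective terms.
H = [[2.0, 0.0, 10.0],
     [0.0, 2.0, 20.0],
     [0.0, 0.0, 1.0]]
coords = rectify_coordinates(H, width=3, height=2)
print(coords[0], coords[-1])
```

A full rectification would also resample image intensities at the mapped coordinates, which is likewise a per-pixel operation.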
The Helicopter Antenna Radiation Prediction Code (HARP)

NASA Technical Reports Server (NTRS)

Klevenow, F. T.; Lynch, B. G.; Newman, E. H.; Rojas, R. G.; Scheick, J. T.; Shamansky, H. T.; Sze, K. Y.

1990-01-01

The first nine months' effort in the development of a user-oriented computer code, referred to as the HARP code, for analyzing the radiation from helicopter antennas is described. The HARP code uses modern computer graphics to aid in the description and display of the helicopter geometry. At low frequencies the helicopter is modeled by polygonal plates, and the method of moments is used to compute the desired patterns. At high frequencies the helicopter is modeled by a composite ellipsoid and flat plates, and computations are made using the geometrical theory of diffraction. The HARP code will provide a user-friendly interface, employing modern computer graphics, to help the user describe the helicopter geometry, select the method of computation, construct the desired high or low frequency model, and display the results.
Space Shuttle Debris Impact Tool Assessment Using the Modern Design of Experiments

NASA Technical Reports Server (NTRS)

DeLoach, Richard; Rayos, Elonsio M.; Campbell, Charles H.; Rickman, Steven L.; Larsen, Curtis E.

2007-01-01

Complex computer codes are used to estimate thermal and structural reentry loads on the Shuttle Orbiter induced by ice and foam debris impact during ascent. Such debris can create cavities in the Shuttle Thermal Protection System. The sizes and shapes of these cavities are approximated to accommodate a code limitation that requires simple "shoebox" geometries to describe the cavities -- rectangular areas and planar walls that are at constant angles with respect to vertical. These approximations induce uncertainty in the code results. The Modern Design of Experiments (MDOE) has recently been applied to develop a series of resource-minimal computational experiments designed to generate low-order polynomial graduating functions to approximate the more complex underlying codes. These polynomial functions were then used to propagate cavity geometry errors to estimate the uncertainty they induce in the reentry load calculations performed by the underlying code. This paper describes a methodological study focused on evaluating the application of MDOE to future operational codes in a rapid and low-cost way to assess the effects of cavity geometry uncertainty.
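The idea of propagating cavity geometry errors through a low-order polynomial can be sketched as follows. The example assumes a quadratic response surface in two hypothetical geometry variables (cavity width `w` and depth `d`), independent input errors, and first-order (linearized) error propagation. None of these choices, nor the coefficient values, come from the paper.

```python
import math


def surrogate(coef, w, d):
    """Quadratic response surface standing in for the expensive code."""
    b0, bw, bd, bwd, bww, bdd = coef
    return b0 + bw * w + bd * d + bwd * w * d + bww * w * w + bdd * d * d


def gradient(coef, w, d):
    """Analytic partial derivatives of the surrogate with respect to w and d."""
    _, bw, bd, bwd, bww, bdd = coef
    return bw + bwd * d + 2.0 * bww * w, bd + bwd * w + 2.0 * bdd * d


def propagated_sigma(coef, w, d, sigma_w, sigma_d):
    """First-order standard deviation of the output for independent input errors."""
    gw, gd = gradient(coef, w, d)
    return math.sqrt((gw * sigma_w) ** 2 + (gd * sigma_d) ** 2)


# Hypothetical coefficients and geometry uncertainties, for illustration only.
COEF = (10.0, 1.5, 0.8, 0.1, 0.02, 0.05)
sigma_load = propagated_sigma(COEF, w=2.0, d=1.0, sigma_w=0.1, sigma_d=0.05)
```

The polynomial is cheap to differentiate and evaluate, so the uncertainty estimate needs no further runs of the underlying code once the surface has been fitted.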
Highly parallel implementation of non-adiabatic Ehrenfest molecular dynamics

NASA Astrophysics Data System (ADS)

Kanai, Yosuke; Schleife, Andre; Draeger, Erik; Anisimov, Victor; Correa, Alfredo

2014-03-01

While the adiabatic Born-Oppenheimer approximation tremendously lowers computational effort, many questions in modern physics, chemistry, and materials science require an explicit description of coupled non-adiabatic electron-ion dynamics. Electronic stopping, i.e. the energy transfer of a fast projectile atom to the electronic system of the target material, is a notorious example. We recently implemented real-time time-dependent density functional theory based on the plane-wave pseudopotential formalism in the Qbox/qb@ll codes. We demonstrate that explicit integration using a fourth-order Runge-Kutta scheme is very suitable for modern highly parallelized supercomputers. Applying the new implementation to systems with hundreds of atoms and thousands of electrons, we achieved excellent performance and scalability on a large number of nodes both on the BlueGene-based "Sequoia" system at LLNL and on the Cray architecture of "Blue Waters" at NCSA. As an example, we discuss our work on computing the electronic stopping power of aluminum and gold for hydrogen projectiles, showing an excellent agreement with experiment. These first-principles calculations allow us to gain important insight into the fundamental physics of electronic stopping.
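The explicit fourth-order Runge-Kutta scheme mentioned in this abstract can be illustrated on a toy problem. The sketch below propagates a single complex amplitude under i dψ/dt = Eψ (atomic units, with ħ = 1), whose exact solution is ψ(t) = exp(-iEt). The scalar model, the energy value and the step size are assumptions chosen for illustration and have nothing to do with the Qbox/qb@ll implementation.

```python
import cmath


def rk4_step(f, t, y, h):
    """Advance y by one explicit fourth-order Runge-Kutta step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(f, t0, y0, h, steps):
    """Apply rk4_step repeatedly and return the final state."""
    t, y = t0, y0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


ENERGY = 1.0  # hypothetical eigenvalue, atomic units


def schrodinger_rhs(t, psi):
    """Right-hand side of d(psi)/dt = -i * E * psi for one eigenstate."""
    return -1j * ENERGY * psi


psi_final = integrate(schrodinger_rhs, 0.0, 1.0 + 0.0j, h=0.1, steps=10)
psi_exact = cmath.exp(-1j * ENERGY * 1.0)
```

Each step needs only four evaluations of the right-hand side and no linear solve, which is one reason explicit schemes of this kind map well onto large parallel machines.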
Effects of Inlet Distortion on Aeromechanical Stability of a Forward-Swept High-Speed Fan

NASA Technical Reports Server (NTRS)

Herrick, Gregory P.

2011-01-01

Concerns regarding noise, propulsive efficiency, and fuel burn are inspiring aircraft designs wherein the propulsive turbomachines are partially (or fully) embedded within the airframe; such designs present serious concerns with regard to aerodynamic and aeromechanic performance of the compression system in response to inlet distortion. Separately, a forward-swept high-speed fan was developed to address noise concerns of modern podded turbofans; however this fan encounters aeroelastic instability (flutter) as it approaches stall. A three-dimensional, unsteady, Navier-Stokes computational fluid dynamics code is applied to analyze and corroborate fan performance with clean inlet flow. This code, already validated in its application to assess aerodynamic damping of vibrating blades at various flow conditions, is modified and then applied in a computational study to preliminarily assess the effects of inlet distortion on aeroelastic stability of the fan. Computational engineering application and implementation issues are discussed, followed by an investigation into the aeroelastic behavior of the fan with clean and distorted inlets.
Comparative Performance Analysis of Intel Xeon Phi, GPU, and CPU: A Case Study from Microscopy Image Analysis

PubMed Central

Teodoro, George; Kurc, Tahsin; Kong, Jun; Cooper, Lee; Saltz, Joel

2014-01-01

We study and characterize the performance of operations in an important class of applications on GPUs and Many Integrated Core (MIC) architectures. Our work is motivated by applications that analyze low-dimensional spatial datasets captured by high resolution sensors, such as image datasets obtained from whole slide tissue specimens using microscopy scanners. Common operations in these applications involve the detection and extraction of objects (object segmentation), the computation of features of each extracted object (feature computation), and characterization of objects based on these features (object classification). In this work, we identify the data access and computation patterns of operations in the object segmentation and feature computation categories. We systematically implement and evaluate the performance of these operations on modern CPUs, GPUs, and MIC systems for a microscopy image analysis application. Our results show that the performance on a MIC of operations that perform regular data access is comparable to, or sometimes better than, that on a GPU. On the other hand, GPUs are significantly more efficient than MICs for operations that access data irregularly. This is a result of the low performance of MICs when it comes to random data access. We have also examined the coordinated use of MICs and CPUs. Our experiments show that using a performance-aware task strategy for scheduling application operations improves performance by about 1.29× over a first-come-first-served strategy. This allows applications to obtain high performance efficiency on CPU-MIC systems; the example application attained an efficiency of 84% on 192 nodes (3072 CPU cores and 192 MICs). PMID:25419088
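The difference between first-come-first-served and performance-aware scheduling can be sketched with a small simulation. The model below is an assumption made for illustration, not the scheduler evaluated in the paper: there is one CPU and one accelerator, each task has a known run time on both, and whichever device becomes idle first picks the next task. The task timings are invented.

```python
def simulate(tasks, pick):
    """Return the makespan for a list of (cpu_time, acc_time) tasks.

    pick(device, pending) returns the index of the task the idle device takes.
    """
    pending = list(tasks)
    free_at = {"cpu": 0.0, "acc": 0.0}            # time at which each device is idle
    while pending:
        device = min(free_at, key=free_at.get)    # device that becomes idle first
        cpu_time, acc_time = pending.pop(pick(device, pending))
        free_at[device] += cpu_time if device == "cpu" else acc_time
    return max(free_at.values())


def fcfs(device, pending):
    """First-come-first-served: always take the task at the head of the queue."""
    return 0


def performance_aware(device, pending):
    """Give the accelerator the task it speeds up most, the CPU the one it speeds up least."""
    speedups = [cpu_time / acc_time for cpu_time, acc_time in pending]
    best = max(speedups) if device == "acc" else min(speedups)
    return speedups.index(best)


# Hypothetical workload: two tasks run 8x faster on the accelerator, two gain nothing.
TASKS = [(8, 1), (2, 2), (8, 1), (2, 2)]
fcfs_makespan = simulate(TASKS, fcfs)
aware_makespan = simulate(TASKS, performance_aware)
```

In this invented workload the makespan falls from 8 to 4 time units, because the CPU is no longer handed a task that the accelerator would finish eight times faster.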

Accelerating epistasis analysis in human genetics with consumer graphics hardware.

PubMed

Sinnott-Armstrong, Nicholas A; Greene, Casey S; Cancare, Fabio; Moore, Jason H

2009-07-24

Human geneticists are now capable of measuring more than one million DNA sequence variations from across the human genome. The new challenge is to develop computationally feasible methods capable of analyzing these data for associations with common human disease, particularly in the context of epistasis. Epistasis describes the situation where multiple genes interact in a complex non-linear manner to determine an individual's disease risk and is thought to be ubiquitous for common diseases. Multifactor Dimensionality Reduction (MDR) is an algorithm capable of detecting epistasis. An exhaustive analysis with MDR is often computationally expensive, particularly for high order interactions. This challenge has previously been met with parallel computation and expensive hardware. The option we examine here exploits commodity hardware designed for computer graphics. In modern computers Graphics Processing Units (GPUs) have more memory bandwidth and computational capability than Central Processing Units (CPUs) and are well suited to this problem. Advances in the video game industry have led to an economy of scale creating a situation where these powerful components are readily available at very low cost. Here we implement and evaluate the performance of the MDR algorithm on GPUs. Of primary interest are the time required for an epistasis analysis and the price to performance ratio of available solutions. We found that using MDR on GPUs consistently increased performance per machine over both a feature rich Java software package and a C++ cluster implementation. The performance of a GPU workstation running a GPU implementation reduces computation time by a factor of 160 compared to an 8-core workstation running the Java implementation on CPUs. This GPU workstation performs similarly to 150 cores running an optimized C++ implementation on a Beowulf cluster. 
Furthermore this GPU system provides extremely cost effective performance while leaving the CPU available for other tasks. The GPU workstation containing three GPUs costs $2000 while obtaining similar performance on a Beowulf cluster requires 150 CPU cores which, including the added infrastructure and support cost of the cluster system, cost approximately $82,500. Graphics hardware based computing provides a cost effective means to perform genetic analysis of epistasis using MDR on large datasets without the infrastructure of a computing cluster.
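The price-to-performance comparison in this abstract reduces to simple arithmetic, worked through below with the figures quoted above. The helper name is hypothetical, and the calculation takes the stated equivalence of the two systems at face value.

```python
def price_performance(cluster_cost, workstation_cost, cluster_cores, gpus):
    """Return (cost ratio, CPU cores matched per GPU) for equal throughput."""
    return cluster_cost / workstation_cost, cluster_cores / gpus


# Figures from the abstract: $82,500 cluster with 150 cores vs $2000 workstation with 3 GPUs.
cost_ratio, cores_per_gpu = price_performance(82_500, 2_000, 150, 3)
```

With these inputs the cluster costs 41.25 times as much as the workstation, and each GPU stands in for 50 cluster cores.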
Overset grid applications on distributed memory MIMD computers

NASA Technical Reports Server (NTRS)

Chawla, Kalpana; Weeratunga, Sisira

1994-01-01

Analysis of modern aerospace vehicles requires the computation of flowfields about complex three dimensional geometries composed of regions with varying spatial resolution requirements. Overset grid methods allow the use of proven structured grid flow solvers to address the twin issues of geometrical complexity and the resolution variation by decomposing the complex physical domain into a collection of overlapping subdomains. This flexibility is accompanied by the need for irregular intergrid boundary communication among the overlapping component grids. This study investigates a strategy for implementing such a static overset grid implicit flow solver on distributed memory, MIMD computers; i.e., the 128 node Intel iPSC/860 and the 208 node Intel Paragon. Performance data for two composite grid configurations characteristic of those encountered in present day aerodynamic analysis are also presented.
Imaging in anatomy: a comparison of imaging techniques in embalmed human cadavers

PubMed Central

2013-01-01

Background: A large variety of imaging techniques is an integral part of modern medicine. Introducing radiological imaging techniques into the dissection course serves as a basis for improved learning of anatomy and multidisciplinary learning in pre-clinical medical education. Methods: Four different imaging techniques (ultrasound, radiography, computed tomography, and magnetic resonance imaging) were performed in embalmed human body donors to analyse possibilities and limitations of the respective techniques in this peculiar setting. Results: The quality of ultrasound and radiography images was poor; images of computed tomography and magnetic resonance imaging were of good quality. Conclusion: Computed tomography and magnetic resonance imaging have a superior image quality in comparison to ultrasound and radiography and offer suitable methods for imaging embalmed human cadavers as a valuable addition to the dissection course. PMID:24156510
Integrating Computer Architectures into the Design of High-Performance Controllers

NASA Technical Reports Server (NTRS)

Jacklin, Stephen A.; Leyland, Jane A.; Warmbrodt, William

1986-01-01

Modern control systems must typically perform real-time identification and control, as well as coordinate a host of other activities related to user interaction, on-line graphics, and file management. This paper discusses five global design considerations that are useful to integrate array processor, multimicroprocessor, and host computer system architecture into versatile, high-speed controllers. Such controllers are capable of very high control throughput, and can maintain constant interaction with the non-real-time or user environment. As an application example, the architecture of a high-speed, closed-loop controller used to actively control helicopter vibration will be briefly discussed. Although this system has been designed for use as the controller for real-time rotorcraft dynamics and control studies in a wind-tunnel environment, the control architecture can generally be applied to a wide range of automatic control applications.
The Archive Solution for Distributed Workflow Management Agents of the CMS Experiment at LHC

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kuznetsov, Valentin; Fischer, Nils Leif; Guo, Yuyi

The CMS experiment at the CERN LHC developed the Workflow Management Archive system to persistently store unstructured framework job report documents produced by distributed workflow management agents. In this paper we present its architecture, implementation, deployment, and integration with the CMS and CERN computing infrastructures, such as central HDFS and Hadoop Spark cluster. The system leverages modern technologies such as a document oriented database and the Hadoop eco-system to provide the necessary flexibility to reliably process, store, and aggregate O(1M) documents on a daily basis. We describe the data transformation, the short and long term storage layers, the query language, along with the aggregation pipeline developed to visualize various performance metrics to assist CMS data operators in assessing the performance of the CMS computing system.
The Archive Solution for Distributed Workflow Management Agents of the CMS Experiment at LHC

DOE PAGES

Kuznetsov, Valentin; Fischer, Nils Leif; Guo, Yuyi

2018-03-19

The CMS experiment at the CERN LHC developed the Workflow Management Archive system to persistently store unstructured framework job report documents produced by distributed workflow management agents. In this paper we present its architecture, implementation, deployment, and integration with the CMS and CERN computing infrastructures, such as central HDFS and Hadoop Spark cluster. The system leverages modern technologies such as a document oriented database and the Hadoop eco-system to provide the necessary flexibility to reliably process, store, and aggregate O(1M) documents on a daily basis. We describe the data transformation, the short and long term storage layers, the query language, along with the aggregation pipeline developed to visualize various performance metrics to assist CMS data operators in assessing the performance of the CMS computing system.
LARCRIM user's guide, version 1.0

NASA Technical Reports Server (NTRS)

Davis, John S.; Heaphy, William J.

1993-01-01

LARCRIM is a relational database management system (RDBMS) which performs the conventional duties of an RDBMS with the added feature that it can store attributes which consist of arrays or matrices. This makes it particularly valuable for scientific data management. It is accessible as a stand-alone system and through an application program interface. The stand-alone system may be executed in two modes: menu or command. The menu mode prompts the user for the input required to create, update, and/or query the database. The command mode requires the direct input of LARCRIM commands. Although LARCRIM is an update of an old database family, its performance on modern computers is quite satisfactory. LARCRIM is written in FORTRAN 77 and runs under the UNIX operating system. Versions have been released for the following computers: SUN (3 & 4), Convex, IRIS, Hewlett-Packard, CRAY 2 & Y-MP.
WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code

NASA Astrophysics Data System (ADS)

Mendygral, P. J.; Radcliffe, N.; Kandalla, K.; Porter, D.; O'Neill, B. J.; Nolting, C.; Edmon, P.; Donnert, J. M. F.; Jones, T. W.

2017-02-01

We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.
GPU-Accelerated Molecular Modeling Coming Of Age

PubMed Central

Stone, John E.; Hardy, David J.; Ufimtsev, Ivan S.

2010-01-01

Graphics processing units (GPUs) have traditionally been used in molecular modeling solely for visualization of molecular structures and animation of trajectories resulting from molecular dynamics simulations. Modern GPUs have evolved into fully programmable, massively parallel co-processors that can now be exploited to accelerate many scientific computations, typically providing about one order of magnitude speedup over CPU code and in special cases providing speedups of two orders of magnitude. This paper surveys the development of molecular modeling algorithms that leverage GPU computing, the advances already made and remaining issues to be resolved, and the continuing evolution of GPU technology that promises to become even more useful to molecular modeling. Hardware acceleration with commodity GPUs is expected to benefit the overall computational biology community by bringing teraflops performance to desktop workstations and in some cases potentially changing what were formerly batch-mode computational jobs into interactive tasks. PMID:20675161
GPU-accelerated molecular modeling coming of age.

PubMed

Stone, John E; Hardy, David J; Ufimtsev, Ivan S; Schulten, Klaus

2010-09-01

Graphics processing units (GPUs) have traditionally been used in molecular modeling solely for visualization of molecular structures and animation of trajectories resulting from molecular dynamics simulations. Modern GPUs have evolved into fully programmable, massively parallel co-processors that can now be exploited to accelerate many scientific computations, typically providing about one order of magnitude speedup over CPU code and in special cases providing speedups of two orders of magnitude. This paper surveys the development of molecular modeling algorithms that leverage GPU computing, the advances already made and remaining issues to be resolved, and the continuing evolution of GPU technology that promises to become even more useful to molecular modeling. Hardware acceleration with commodity GPUs is expected to benefit the overall computational biology community by bringing teraflops performance to desktop workstations and in some cases potentially changing what were formerly batch-mode computational jobs into interactive tasks. (c) 2010 Elsevier Inc. All rights reserved.
The Computational Ecologist’s Toolbox

EPA Science Inventory

Computational ecology, nestled in the broader field of data science, is an interdisciplinary field that attempts to improve our understanding of complex ecological systems through the use of modern computational methods. Computational ecology is based on a union of competence in...
Heterogeneous computing architecture for fast detection of SNP-SNP interactions.

PubMed

Sluga, Davor; Curk, Tomaz; Zupan, Blaz; Lotric, Uros

2014-06-25

The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphic Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems.
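The scale of an exhaustive pairwise search follows from the number of unordered SNP pairs, n(n-1)/2. The sketch below computes that count and converts it to a run time for a given evaluation rate. The SNP count and the rate of one million pairs per second are hypothetical values chosen for illustration; they are not measurements from the paper.

```python
SECONDS_PER_DAY = 86_400


def pair_count(n_snps):
    """Number of unordered SNP-SNP pairs among n_snps markers."""
    return n_snps * (n_snps - 1) // 2


def estimated_days(n_snps, pairs_per_second):
    """Days needed to score every pair at a fixed evaluation rate."""
    return pair_count(n_snps) / pairs_per_second / SECONDS_PER_DAY


# Hypothetical study size and single-threaded evaluation rate.
pairs = pair_count(500_000)
days = estimated_days(500_000, pairs_per_second=1_000_000)
```

Because the pair count grows quadratically, doubling the number of SNPs roughly quadruples the work, which is why an order-of-magnitude speedup from parallel hardware matters for this problem.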
Heterogeneous computing architecture for fast detection of SNP-SNP interactions

PubMed Central

2014-01-01

Background: The extent of data in a typical genome-wide association study (GWAS) poses considerable computational challenges to software tools for gene-gene interaction discovery. Exhaustive evaluation of all interactions among hundreds of thousands to millions of single nucleotide polymorphisms (SNPs) may require weeks or even months of computation. Massively parallel hardware within a modern Graphic Processing Unit (GPU) and Many Integrated Core (MIC) coprocessors can shorten the run time considerably. While the utility of GPU-based implementations in bioinformatics has been well studied, MIC architecture has been introduced only recently and may provide a number of comparative advantages that have yet to be explored and tested. Results: We have developed a heterogeneous, GPU and Intel MIC-accelerated software module for SNP-SNP interaction discovery to replace the previously single-threaded computational core in the interactive web-based data exploration program SNPsyn. We report on differences between these two modern massively parallel architectures and their software environments. Their utility resulted in an order of magnitude shorter execution times when compared to the single-threaded CPU implementation. GPU implementation on a single Nvidia Tesla K20 runs twice as fast as that for the MIC architecture-based Xeon Phi P5110 coprocessor, but also requires considerably more programming effort. Conclusions: General purpose GPUs are a mature platform with large amounts of computing power capable of tackling inherently parallel problems, but can prove demanding for the programmer. On the other hand, the new MIC architecture, albeit lacking in performance, reduces the programming effort and makes up for it with a more general architecture suitable for a wider range of problems. PMID:24964802
QMachine: commodity supercomputing in web browsers

PubMed Central

2014-01-01

Background: Ongoing advancements in cloud computing provide novel opportunities in scientific computing, especially for distributed workflows. Modern web browsers can now be used as high-performance workstations for querying, processing, and visualizing genomics’ “Big Data” from sources like The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC) without local software installation or configuration. The design of QMachine (QM) was driven by the opportunity to use this pervasive computing model in the context of the Web of Linked Data in Biomedicine. Results: QM is an open-sourced, publicly available web service that acts as a messaging system for posting tasks and retrieving results over HTTP. The illustrative application described here distributes the analyses of 20 Streptococcus pneumoniae genomes for shared suffixes. Because all analytical and data retrieval tasks are executed by volunteer machines, few server resources are required. Any modern web browser can submit those tasks and/or volunteer to execute them without installing any extra plugins or programs. A client library provides high-level distribution templates including MapReduce. This stark departure from the current reliance on expensive server hardware running “download and install” software has already gathered substantial community interest, as QM received more than 2.2 million API calls from 87 countries in 12 months. Conclusions: QM was found adequate to deliver the sort of scalable bioinformatics solutions that computation- and data-intensive workflows require. Paradoxically, the sandboxed execution of code by web browsers was also found to enable them, as compute nodes, to address critical privacy concerns that characterize biomedical environments. PMID:24913605
Atherosclerosis in ancient and modern Egyptians: the Horus study.

PubMed

Allam, Adel H; Mandour Ali, Mohamed A; Wann, L Samuel; Thompson, Randall C; Sutherland, M Linda; Sutherland, James D; Frohlich, Bruno; Michalik, David E; Zink, Albert; Lombardi, Guido P; Watson, Lucia; Cox, Samantha L; Finch, Caleb E; Miyamoto, Michael I; Sallam, Sallam L; Narula, Jagat; Thomas, Gregory S

2014-06-01

Although atherosclerosis is usually thought of as a disease of modernity, the Horus Team has previously reported atherosclerotic vascular calcifications on computed tomographic (CT) scans in ancient Egyptians. The purpose of this study was to compare patterns and demographic characteristics of this disease among Egyptians from ancient and modern eras. We compared the presence and extent of vascular calcifications from whole-body CT scans performed on 178 modern Egyptians from Cairo undergoing positron emission tomography (PET)/CT for cancer staging to CT scans of 76 Egyptian mummies (3100 bce to 364 ce). The mean age of the modern Egyptian group was 52.3 ± 15 years (range 14 to 84) versus estimated age at death of ancient Egyptian mummies 36.5 ± 13 years (range 4 to 60); p < 0.0001. Vascular calcification was detected in 108 of 178 (60.7%) of modern patients versus 26 of 76 (38.2%) of mummies, p < 0.001. Vascular calcifications on CT strongly correlated to age in both groups. In addition, the severity of disease by number of involved arterial beds also correlated to age, and there was a very similar pattern between the 2 groups. Calcifications in both modern and ancient Egyptians were seen peripherally in aortoiliac beds almost a decade earlier than in event-related beds (coronary and carotid). The presence and severity of atherosclerotic vascular disease correlates strongly to age in both ancient and modern Egyptians. There is a striking correlation in the distribution of the number of vascular beds involved. Atherosclerotic calcifications are seen in the aortoiliac beds almost a decade earlier than in the coronary and carotid beds. Copyright © 2014. Published by Elsevier B.V.
Topology optimization aided structural design: Interpretation, computational aspects and 3D printing.

PubMed

Kazakis, Georgios; Kanellopoulos, Ioannis; Sotiropoulos, Stefanos; Lagaros, Nikos D

2017-10-01

The construction industry has a major impact on the environment in which we spend most of our lives. Therefore, it is important that the outcome of architectural intuition performs well and complies with the design requirements. Architects usually describe as "optimal design" their choice among a rather limited set of design alternatives, dictated by their experience and intuition. However, modern design of structures requires accounting for a great number of criteria derived from multiple disciplines, often of conflicting nature. Such criteria are derived from structural engineering, eco-design, and bioclimatic and acoustic performance. The resulting vast number of alternatives increases the need for computer-aided architecture in order to raise the possibility of arriving at a more preferable solution. Therefore, the incorporation of smart, automatic tools in the design process, able to further guide the designer's intuition, becomes even more indispensable. The principal aim of this study is to present possibilities for integrating automatic computational techniques related to topology optimization in the intuition phase of civil structures as part of computer-aided architectural design. In this direction, different aspects of a new computer-aided architectural era are covered herein: the interpretation of the optimized designs, difficulties resulting from the increased computational effort, and 3D printing capabilities.
Performance Modeling in CUDA Streams - A Means for High-Throughput Data Processing

PubMed Central

Li, Hao; Yu, Di; Kumar, Anand; Tu, Yi-Cheng

2015-01-01

A push-based database management system (DBMS) is a new type of data processing software that streams large volumes of data to concurrent query operators. The high data rate of such systems requires large computing power provided by the query engine. In our previous work, we built a push-based DBMS named G-SDMS to harness the unrivaled computational capabilities of modern GPUs. A major design goal of G-SDMS is to support concurrent processing of heterogeneous query processing operations and enable resource allocation among such operations. Understanding the performance of operations as a result of resource consumption is thus a premise in the design of G-SDMS. With NVIDIA’s CUDA framework as the system implementation platform, we present our recent work on performance modeling of CUDA kernels running concurrently under a runtime mechanism named CUDA stream. Specifically, we explore the connection between performance and resource occupancy of compute-bound kernels and develop a model that can predict the performance of such kernels. Furthermore, we provide an in-depth anatomy of the CUDA stream mechanism and summarize the main kernel scheduling disciplines in it. Our models and derived scheduling disciplines are verified by extensive experiments using synthetic and real-world CUDA kernels. PMID:26566545
Application of web-GIS approach for climate change study

NASA Astrophysics Data System (ADS)

Okladnikov, Igor; Gordov, Evgeny; Titov, Alexander; Bogomolov, Vasily; Martynova, Yuliya; Shulgina, Tamara

2013-04-01

Georeferenced datasets are currently actively used in numerous applications including modeling, interpretation and forecast of climatic and ecosystem changes for various spatial and temporal scales. Due to the inherent heterogeneity of environmental datasets as well as their huge size, which might reach tens of terabytes for a single dataset, present studies in the area of climate and environmental change require special software support. A dedicated web-GIS information-computational system for analysis of georeferenced climatological and meteorological data has been created. It is based on OGC standards and involves many modern solutions such as an object-oriented programming model, modular composition, and JavaScript libraries based on the GeoExt library, ExtJS Framework and OpenLayers software. The main advantage of the system lies in the possibility to perform mathematical and statistical data analysis, graphical visualization of results with GIS functionality, and preparation of binary output files with only a modern graphical web browser installed on a common desktop computer connected to the Internet. Several geophysical datasets represented by two editions of NCEP/NCAR Reanalysis, JMA/CRIEPI JRA-25 Reanalysis, ECMWF ERA-40 Reanalysis, ECMWF ERA Interim Reanalysis, MRI/JMA APHRODITE's Water Resources Project Reanalysis, DWD Global Precipitation Climatology Centre's data, GMAO Modern Era-Retrospective analysis for Research and Applications, meteorological observational data for the territory of the former USSR for the 20th century, results of modeling by global and regional climatological models, and others are available for processing by the system, and this list is growing. Functionality to run the WRF and "Planet simulator" models was also implemented in the system.
Due to many preset parameters and the limited time and spatial ranges set in the system, these models have low computational power requirements and could be used in educational workflows for better understanding of basic climatological and meteorological processes. The web-GIS information-computational system for geophysical data analysis provides specialists involved in multidisciplinary research projects with reliable and practical instruments for complex analysis of climate and ecosystem changes on global and regional scales. Using it, even an unskilled user without specific knowledge can perform computational processing and visualization of large meteorological, climatological and satellite monitoring datasets through a unified web interface in a common graphical web browser. This work is partially supported by the Ministry of education and science of the Russian Federation (contract #8345), SB RAS project VIII.80.2.1, RFBR grant #11-05-01190a, and integrated project SB RAS #131.
Large-scale neural circuit mapping data analysis accelerated with the graphical processing unit (GPU).

PubMed

Shi, Yulin; Veidenbaum, Alexander V; Nicolau, Alex; Xu, Xiangmin

2015-01-15

Modern neuroscience research demands computing power. Neural circuit mapping studies such as those using laser scanning photostimulation (LSPS) produce large amounts of data and require intensive computation for post hoc processing and analysis. Here we report on the design and implementation of a cost-effective desktop computer system for accelerated experimental data processing with recent GPU computing technology. A new version of Matlab software with GPU enabled functions is used to develop programs that run on Nvidia GPUs to harness their parallel computing power. We evaluated both the central processing unit (CPU) and GPU-enabled computational performance of our system in benchmark testing and practical applications. The experimental results show that the GPU-CPU co-processing of simulated data and actual LSPS experimental data clearly outperformed the multi-core CPU with up to a 22× speedup, depending on computational tasks. Further, we present a comparison of numerical accuracy between GPU and CPU computation to verify the precision of GPU computation. In addition, we show how GPUs can be effectively adapted to improve the performance of commercial image processing software such as Adobe Photoshop. To our best knowledge, this is the first demonstration of GPU application in neural circuit mapping and electrophysiology-based data processing. Together, GPU enabled computation enhances our ability to process large-scale data sets derived from neural circuit mapping studies, allowing for increased processing speeds while retaining data precision. Copyright © 2014 Elsevier B.V. All rights reserved.
Large scale neural circuit mapping data analysis accelerated with the graphical processing unit (GPU)

PubMed Central

Shi, Yulin; Veidenbaum, Alexander V.; Nicolau, Alex; Xu, Xiangmin

2014-01-01

Background: Modern neuroscience research demands computing power. Neural circuit mapping studies such as those using laser scanning photostimulation (LSPS) produce large amounts of data and require intensive computation for post-hoc processing and analysis. New Method: Here we report on the design and implementation of a cost-effective desktop computer system for accelerated experimental data processing with recent GPU computing technology. A new version of Matlab software with GPU enabled functions is used to develop programs that run on Nvidia GPUs to harness their parallel computing power. Results: We evaluated both the central processing unit (CPU) and GPU-enabled computational performance of our system in benchmark testing and practical applications. The experimental results show that the GPU-CPU co-processing of simulated data and actual LSPS experimental data clearly outperformed the multi-core CPU with up to a 22x speedup, depending on computational tasks. Further, we present a comparison of numerical accuracy between GPU and CPU computation to verify the precision of GPU computation. In addition, we show how GPUs can be effectively adapted to improve the performance of commercial image processing software such as Adobe Photoshop. Comparison with Existing Method(s): To our best knowledge, this is the first demonstration of GPU application in neural circuit mapping and electrophysiology-based data processing. Conclusions: Together, GPU enabled computation enhances our ability to process large-scale data sets derived from neural circuit mapping studies, allowing for increased processing speeds while retaining data precision. PMID:25277633

Mobile high-performance computing (HPC) for synthetic aperture radar signal processing

NASA Astrophysics Data System (ADS)

Misko, Joshua; Kim, Youngsoo; Qi, Chenchen; Sirkeci, Birsen

2018-04-01

The importance of mobile high-performance computing has emerged in numerous battlespace applications at the tactical edge in hostile environments. Energy-efficient computing power is a key enabler for diverse areas ranging from real-time big data analytics and atmospheric science to network science. However, the design of tactical mobile data centers is dominated by power, thermal, and physical constraints. At present, the required processing power is very unlikely to be achieved by aggregating emerging heterogeneous many-core platforms consisting of CPUs, Field Programmable Gate Arrays, and graphics processor cores, which are constrained by power and performance. To address these challenges, we performed a Synthetic Aperture Radar case study for Automatic Target Recognition (ATR) using Deep Neural Networks (DNNs). However, these DNN models are typically trained on GPUs with gigabytes of external memory and make heavy use of 32-bit floating-point operations. As a result, DNNs do not run efficiently on hardware appropriate for low-power or mobile applications. To address this limitation, we propose a framework for compressing DNN models for ATR so that they are suited to deployment on resource-constrained hardware. The proposed compression framework uses promising DNN compression techniques, including pruning and weight quantization, while also focusing on processor features common to modern low-power devices. Following this methodology as a guideline produced a DNN for ATR tuned to maximize classification throughput, minimize power consumption, and minimize memory footprint on a low-power device.
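The two compression techniques named in this abstract, pruning and weight quantization, can be illustrated with a minimal sketch. It is not the authors' framework: the function names, the flat weight list, the 50% sparsity target, and the symmetric 8-bit scheme are all assumptions chosen for illustration.

```python
# Illustrative sketch only (assumed names and parameters, not the paper's code):
# magnitude pruning followed by symmetric uniform 8-bit weight quantization.

def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    return [int(round(w / scale)) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]

weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.4, -0.9]  # hypothetical layer
pruned = prune_by_magnitude(weights, sparsity=0.5)
q, scale = quantize_int8(pruned)
restored = dequantize(q, scale)
```

Pruned weights stay exactly zero after quantization, so the two steps compose: sparsity reduces storage and arithmetic, while the integer codes replace 32-bit floats with 8-bit values at the cost of a rounding error of at most half a quantization step per weight.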
Application of Kinect™ and wireless technology for patient data recording and viewing system in the course of surgery

NASA Astrophysics Data System (ADS)

Ong, Aira Patrice R.; Bugtai, Nilo T.; Aldaba, Luis Miguel M.; Madrangca, Astrid Valeska H.; Que, Giselle V.; Que, Miles Frederick L.; Tan, Kean Anderson. S.

2017-02-01

In modern operating room (OR) conditions, a patient's computed tomography (CT) or magnetic resonance imaging (MRI) scans are some of the most important resources during surgical procedures. In practice, the surgeon is compelled to scrub out and back in every time he needs to scroll through scan images in mid-operation. To avoid leaving the operating table, many surgeons rely on assistants or nurses and instruct them to manipulate the computer, which can be cumbersome and frustrating. Motivated by this, the study incorporates a touchless (non-contact) gesture-based interface to allow aseptic interaction with the computer systems and with the patient's data. The system presented in this paper is composed of three main parts: the Trek Ai-Ball Camera, the Microsoft Kinect™, and the computer software. The combination of these components and the developed software allows the user to perform 13 hand gestures, which tested as 100 percent accurate. Based on tests of system performance, the conclusions regarding the time efficiency of the viewing system and the quality and safety of the recording system have received positive feedback from consulting doctors.
Parallel and Portable Monte Carlo Particle Transport

NASA Astrophysics Data System (ADS)

Lee, S. R.; Cummings, J. C.; Nolen, S. D.; Keen, N. D.

1997-08-01

We have developed a multi-group, Monte Carlo neutron transport code in C++ using object-oriented methods and the Parallel Object-Oriented Methods and Applications (POOMA) class library. This transport code, called MC++, currently computes k and α eigenvalues of the neutron transport equation on a rectilinear computational mesh. It is portable to and runs in parallel on a wide variety of platforms, including MPPs, clustered SMPs, and individual workstations. It contains appropriate classes and abstractions for particle transport and, through the use of POOMA, for portable parallelism. Current capabilities are discussed, along with physics and performance results for several test problems on a variety of hardware, including all three Accelerated Strategic Computing Initiative (ASCI) platforms. Current parallel performance indicates the ability to compute α-eigenvalues in seconds or minutes rather than days or weeks. Current and future work on the implementation of a general transport physics framework (TPF) is also described. This TPF employs modern C++ programming techniques to provide simplified user interfaces, generic STL-style programming, and compile-time performance optimization. Physics capabilities of the TPF will be extended to include continuous energy treatments, implicit Monte Carlo algorithms, and a variety of convergence acceleration techniques such as importance combing.
How Computer Graphics Work.

ERIC Educational Resources Information Center

Prosise, Jeff

This document presents the principles behind modern computer graphics without straying into the arcane languages of mathematics and computer science. Illustrations accompany the clear, step-by-step explanations that describe how computers draw pictures. The 22 chapters of the book are organized into 5 sections. "Part 1: Computer Graphics in…
A novel upgrade to Helsinki AMS: Fast switching of isotopes with electrostatic deflectors

NASA Astrophysics Data System (ADS)

Palonen, V.; Tikkanen, P.

2015-10-01

We have developed and installed electrostatic deflectors at the injection magnet entrance and exit to enable fast switching between isotopes in AMS measurements. The fast selection of the injected isotope, stable isotope current measurements, and rare isotope detection are all performed with three synchronized real-time NI-PXI computers. With the improvements, we are able to attain a precision of better than 0.2% for the 14C/13C ratio of modern samples.
High Performance Computing Modernization Program Kerberos Throughput Test Report

DTIC Science & Technology

2017-10-26

functionality as Kerberos plugins. The pre-release production kit was used in these tests to compare against the current release kit. YubiKey support...HPCMP Kerberos Throughput Test Report 3 2. THROUGHPUT TESTING 2.1 Testing Components Throughput testing was done to determine the benefits of the pre ...both the current release kit and the pre-release production kit for a total of 378 individual tests in order to note any improvements. Based on work
Research in digital adaptive flight controllers

NASA Technical Reports Server (NTRS)

Kaufman, H.

1976-01-01

A design study of adaptive control logic suitable for implementation in modern airborne digital flight computers was conducted. Both explicit controllers which directly utilize parameter identification and implicit controllers which do not require identification were considered. Extensive analytical and simulation efforts resulted in the recommendation of two explicit digital adaptive flight controllers. Interface weighted least squares estimation procedures with control logic were developed using either optimal regulator theory or with control logic based upon single stage performance indices.
DIC-CAM recipe for reverse engineering

NASA Astrophysics Data System (ADS)

Romero-Carrillo, P.; Lopez-Alba, E.; Dorado, R.; Diaz-Garrido, F. A.

2012-04-01

Reverse engineering (RE) tries to model and manufacture an object from measurements of a reference object. Modern optical measurement systems and computer-aided engineering software have improved reverse engineering procedures. We detail the main RE steps from 3D digitalization by Digital Image Correlation to manufacturing. This description is complemented with an application example, which portrays the performance of RE. The differences between original and manufactured objects are less than 2 mm (close to the tool radius).
The novel high-performance 3-D MT inverse solver

NASA Astrophysics Data System (ADS)

Kruglyakov, Mikhail; Geraskin, Alexey; Kuvshinov, Alexey

2016-04-01

We present a novel, robust, scalable, and fast 3-D magnetotelluric (MT) inverse solver. The solver is written in a multi-language paradigm to make it as efficient, readable, and maintainable as possible. The separation-of-concerns and single-responsibility principles are applied throughout the implementation. The forward modelling engine is extrEMe, a modern scalable solver based on the contracting integral equation approach. An iterative gradient-type (quasi-Newton) optimization scheme is invoked to search for the (regularized) inverse problem solution, and the adjoint source approach is used to calculate the gradient of the misfit efficiently. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT responses, and supports massive parallelization. Moreover, the different parallelization strategies implemented in the code allow optimal usage of available computational resources for a given problem statement. To parameterize the inverse domain, the so-called mask parameterization is implemented, which means that one can merge any subset of forward modelling cells in order to account for the (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance, and scalability of the code. In particular, our computational experiments, carried out on platforms ranging from modern laptops to the HPC system Piz Daint (6th supercomputer in the world), demonstrate practically linear scalability of the code up to thousands of nodes.
Analysis of Defenses Against Code Reuse Attacks on Modern and New Architectures

DTIC Science & Technology

2015-09-01

soundness or completeness. An incomplete analysis will produce extra edges in the CFG that might allow an attacker to slip through. An unsound analysis...Analysis of Defenses Against Code Reuse Attacks on Modern and New Architectures by Isaac Noah Evans Submitted to the Department of Electrical...Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer
Body metaphors--reading the body in contemporary culture.

PubMed

Skara, Danica

2004-01-01

This paper addresses the linguistic reframing of the human body in contemporary culture. Our aim is to provide a linguistic description of the ways in which the body is represented in modern English. First, we focus on body metaphors in general. We have collected a sample of 300 words and phrases functioning as body metaphors in modern English. Reading the symbolism of the body, we are witnessing changes in the basic metaphorical structuring of the human body. The results show that new vocabulary binds different fields of knowledge associated with machines and human beings according to a shared textual frame: the human-as-computer and computer-as-human metaphors. Humans are almost blended with computers and vice versa. This metaphorical use of the human body and its parts reveals not only currents of unconscious thought but also the structures of modern society and culture.
Federal Aviation Administration : challenges in modernizing the agency

DOT National Transportation Integrated Search

2000-02-01

FAA's efforts to implement initiatives in five key areas-air traffic control modernization, procurement and personnel reform, aviation safety, aviation and computer security, and financial management-have met with limited success. For example, FAA ha...
Preparing Dairy Technologists

ERIC Educational Resources Information Center

Sliva, William R.

1977-01-01

The size of modern dairy plant operations has led to extreme specialization in product manufacturing, milk processing, microbiological analysis, chemical and mathematical computations. Morrisville Agricultural and Technical College, New York, has modernized its curricula to meet these changes. (HD)
Hands-on Approach to Prepare Specialists in Climate Changes Modeling and Analysis Using an Information-Computational Web-GIS Portal "Climate"

NASA Astrophysics Data System (ADS)

Shulgina, T. M.; Gordova, Y. E.; Martynova, Y. V.

2014-12-01

Making education relevant to workplace tasks is a key problem of higher education in the environmental sciences. To answer this challenge, several new courses for students of the "Climatology" and "Meteorology" specialties were developed and implemented at Tomsk State University; they combine theoretical knowledge from up-to-date environmental sciences with computational tasks. To organize the educational process we use the open-source course management system Moodle (www.moodle.org). It gave us an opportunity to combine text and multimedia in the theoretical part of the courses. The hands-on approach is realized through innovative trainings performed within the information-computational web-GIS platform "Climate" (http://climate.scert.ru/). The platform has a set of tools and databases allowing a researcher to analyze climate changes in a selected territory. The tools are also used for student trainings, which contain practical tasks on climate modeling and on climate change assessment and analysis. Laboratory exercises cover three topics: "Analysis of regional climate changes"; "Analysis of climate extreme indices on the regional scale"; and "Analysis of future climate". They are designed to consolidate students' knowledge of the discipline, to instill the skills to work independently with large amounts of geophysical data using the modern processing and analysis tools of the web-GIS platform "Climate", and to train students to present laboratory results as reports with a statement of the problem, the results of calculations, and a logically justified conclusion. Thus, students are engaged in the use of modern tools of geophysical data analysis, which cultivates the dynamics of their professional learning.
The approach can help fill this gap because it is the only approach that offers experience, increases student involvement, and advances the use of modern information and communication tools. Financial support for this research from the RFBR (13-05-12034, 14-05-00502), SB RAS project VIII.80.2.1 and grant of the President of RF (№ 181) is acknowledged.
Computation and analysis of cavitating flow in Francis-class hydraulic turbines

NASA Astrophysics Data System (ADS)

Leonard, Daniel J.

Hydropower is the most proven renewable energy technology, supplying the world with 16% of its electricity. Conventional hydropower generates a vast majority of that percentage. Although a mature technology, hydroelectric generation shows great promise for expansion through new dams and plants in developing hydro countries. Moreover, in developed hydro countries, such as the United States, installing generating units in existing dams and the modern refurbishment of existing plants can greatly expand generating capabilities with little to no further impact on the environment. In addition, modern computational technology and fluid dynamics expertise have led to substantial improvements in modern turbine design and performance. Cavitation has always presented a problem in hydroturbines, causing performance breakdown, erosion, damage, vibration, and noise. While modern turbines are usually designed to be cavitation-free at their best efficiency point, due to the variable demand of the energy market it is fairly common to operate at off-design conditions. Here, cavitation and its deleterious effects are unavoidable, and hence, cavitation is a limiting factor on the design and operation of these turbines. Multiphase Computational Fluid Dynamics (CFD) has been used in recent years to model cavitating flow for a large range of problems, including turbomachinery. However, CFD of cavitating flow in hydroturbines is still in its infancy. This dissertation presents steady-periodic Reynolds-averaged Navier-Stokes simulations of a cavitating Francis-class hydroturbine at model and prototype scales. Computational results of the reduced-scale model and full-scale prototype, undergoing performance breakdown, are compared with empirical model data and prototype performance estimations based on standard industry scalings from the model data. Mesh convergence of the simulations is also displayed.
Comparisons are made between the scales to display that cavitation performance breakdown can occur more abruptly in the model than the prototype, due to lack of Froude similitude between the two. When severe cavitation occurs, clear differences are observed in vapor content between the scales. A stage-by-stage performance decomposition is conducted to analyze the losses within individual components of each scale of the machine. As cavitation becomes more severe, the losses in the draft tube account for an increasing amount of the total losses in the machine. More losses occur in the model draft tube as cavitation formation in the prototype draft tube is prevented by the larger hydrostatic pressure gradient across the machine. Additionally, unsteady Detached Eddy Simulations of the fully-coupled cavitating hydroturbine are performed for both scales. Both mesh and temporal convergence studies are provided. The temporal and spectral content of fluctuations in torque and pressure are monitored and compared between single-phase, cavitating, model, and prototype cases. A shallow draft tube induced runner imbalance results in an asymmetric vapor distribution about the runner, leading to more extensive growth and collapse of vapor on any individual blade as it undergoes a revolution. Unique frequency components manifest and persist through the entire machine only when cavitation is present in the hub vortex. Large maximum pressure spikes, which result from vapor collapse, are observed on the blade surfaces in the multiphase simulations, and these may be a potential source of cavitation damage and erosion. Multiphase CFD is shown to be an accurate and effective technique for simulating and analyzing cavitating flow in Francis-class hydraulic turbines. It is recommended that it be used as an industrial tool to supplement model cavitation experiments for all types of hydraulic turbines. 
Moreover, multiphase CFD can be equally effective as a research tool, to investigate mechanisms of cavitating hydraulic turbines that are not understood, and to uncover unique new phenomena which are currently unknown.
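For readers unfamiliar with the Froude similitude mentioned in this abstract, the standard definitions are sketched below. The symbols (characteristic velocity $U$, characteristic length $L$, gravitational acceleration $g$, fluid density $\rho$, elevation difference $H$) are chosen here for illustration and are not taken from the dissertation.

```latex
% Froude number: ratio of inertial to gravitational effects
\mathrm{Fr} = \frac{U}{\sqrt{g L}}

% Hydrostatic pressure difference over an elevation change H
\Delta p_{\mathrm{hydrostatic}} = \rho \, g \, H
```

Two geometrically similar machines match in Froude number only if $U/\sqrt{gL}$ is equal at both scales. Since the hydrostatic pressure difference grows linearly with $H$, a larger prototype sees a larger hydrostatic pressure variation than a reduced-scale model, which is consistent with the scale-dependent draft tube behavior the abstract describes.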
Student Perceived Importance and Correlations of Selected Computer Literacy Course Topics

ERIC Educational Resources Information Center

Ciampa, Mark

2013-01-01

Traditional college-level courses designed to teach computer literacy are in a state of flux. Today's students have high rates of access to computing technology and computer ownership, leading many policy decision makers to conclude that students already are computer literate and thus computer literacy courses are dinosaurs in a modern digital…
A highly efficient 3D level-set grain growth algorithm tailored for ccNUMA architecture

NASA Astrophysics Data System (ADS)

Mießen, C.; Velinov, N.; Gottstein, G.; Barrales-Mora, L. A.

2017-12-01

A highly efficient simulation model for 2D and 3D grain growth was developed based on the level-set method. The model introduces modern computational concepts to achieve excellent performance on parallel computer architectures. Strong scalability was measured on cache-coherent non-uniform memory access (ccNUMA) architectures. To achieve this, the proposed approach considers the application of local level-set functions at the grain level. Ideal and non-ideal grain growth was simulated in 3D with the objective to study the evolution of statistical representative volume elements in polycrystals. In addition, microstructure evolution in an anisotropic magnetic material affected by an external magnetic field was simulated.
Computational Ecology and Open Science: Tools to Help Manage Lakes for Cyanobacteria in Lakes

EPA Science Inventory

Computational ecology is an interdisciplinary field that takes advantage of modern computation abilities to expand our ecological understanding. As computational ecologists, we use large data sets, which often cover large spatial extents, and advanced statistical/mathematical co...
Quo vadis: Hydrologic inverse analyses using high-performance computing and a D-Wave quantum annealer

NASA Astrophysics Data System (ADS)

O'Malley, D.; Vesselinov, V. V.

2017-12-01

Classical microprocessors have had a dramatic impact on hydrology for decades, due largely to the exponential growth in computing power predicted by Moore's law. However, this growth is not expected to continue indefinitely and has already begun to slow. Quantum computing is an emerging alternative to classical microprocessors. Here, we demonstrate cutting-edge inverse model analyses utilizing some of the best available resources in both worlds: high-performance classical computing and a D-Wave quantum annealer. The classical high-performance computing resources are utilized to build an advanced numerical model that assimilates data from O(10^5) observations, including water levels, drawdowns, and contaminant concentrations. The developed model accurately reproduces the hydrologic conditions at a Los Alamos National Laboratory contamination site, and can be leveraged to inform decision-making about site remediation. We demonstrate the use of a D-Wave 2X quantum annealer to solve hydrologic inverse problems. This work can be seen as an early step in quantum-computational hydrology. We compare and contrast our results with an early inverse approach in classical-computational hydrology that is comparable to the approach we use with quantum annealing. Our results show that quantum annealing can be useful for identifying regions of high and low permeability within an aquifer. While the problems we consider are small-scale compared to the problems that can be solved with modern classical computers, they are large compared to the problems that could be solved with early classical CPUs. Further, the binary nature of the high/low permeability problem makes it well-suited to quantum annealing, but challenging for classical computers.
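To make the link between a binary high/low permeability problem and quantum annealing concrete, the sketch below casts a small least-squares inverse problem with binary unknowns as a quadratic unconstrained binary optimization (QUBO), the problem form that annealers accept. Everything specific here is an assumption for illustration: the linear forward model `A`, the three-cell aquifer, and the function names are hypothetical, and exhaustive enumeration stands in for the annealer. The abstract does not state the authors' actual formulation.

```python
from itertools import product

def build_qubo(A, h):
    """QUBO coefficients for minimizing ||A q - h||^2 over binary q.

    The constant term ||h||^2 is dropped because it does not affect the minimizer.
    """
    m, n = len(A), len(A[0])
    Q = {}
    for i in range(n):
        # Diagonal term: relies on q_i**2 == q_i for binary variables.
        a_ii = sum(A[r][i] ** 2 for r in range(m))
        a_h = sum(A[r][i] * h[r] for r in range(m))
        Q[(i, i)] = a_ii - 2.0 * a_h
        for j in range(i + 1, n):
            # Off-diagonal coupling between cells i and j.
            Q[(i, j)] = 2.0 * sum(A[r][i] * A[r][j] for r in range(m))
    return Q

def qubo_energy(Q, q):
    """Evaluate the QUBO objective for one binary assignment q."""
    return sum(c * q[i] * q[j] for (i, j), c in Q.items())

def brute_force_minimum(Q, n):
    """Classical stand-in for the annealer: try all 2**n assignments."""
    return min(product((0, 1), repeat=n), key=lambda q: qubo_energy(Q, q))

# Hypothetical linear forward model mapping 3 binary cells to 4 observations.
A = [[1.0, 0.5, 0.0],
     [0.0, 1.0, 0.5],
     [0.5, 0.0, 1.0],
     [1.0, 1.0, 1.0]]
true_q = (1, 0, 1)  # 1 = high permeability, 0 = low permeability
h = [sum(a * q for a, q in zip(row, true_q)) for row in A]

Q = build_qubo(A, h)
best = brute_force_minimum(Q, n=3)
```

Enumeration scales as 2^n and is only viable for toy sizes, which is the reason a binary problem of this shape is a natural candidate for annealing hardware.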
Comparison of Acceleration Techniques for Selected Low-Level Bioinformatics Operations

PubMed Central

Langenkämper, Daniel; Jakobi, Tobias; Feld, Dustin; Jelonek, Lukas; Goesmann, Alexander; Nattkemper, Tim W.

2016-01-01

In recent years, clock rates of modern processors have stagnated while the demand for computing power has continued to grow. This applies particularly to the fields of life sciences and bioinformatics, where new technologies keep on creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing, where parallelization techniques are gaining in importance due to the pressing issue of large datasets generated by, e.g., modern genomics. This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore, we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in deciding on a suitable technique for the problem at hand. We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC). Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs) performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead. We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually no additional manual adjustment is required. In contrast, some automatic parallelizers targeting GPUs still lack maturity and are limited to simple statements and structures. PMID:26904094
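The step of transforming a serial application into a parallel one can be sketched for a k-mer counting task like the use case named in this abstract. This is a hand-written illustration, not the output of any of the compared tools, and the function names and the example sequence are assumptions. It shows only the decomposition: start positions are split into independent chunks whose partial counts are merged. In CPython, threads will not speed up this CPU-bound loop because of the global interpreter lock; a process pool or a compiled language would be needed for real gains.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def count_kmers(seq, k):
    """Serial reference: count every length-k substring of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def count_kmers_chunked(seq, k, n_chunks=4):
    """Count k-mers by splitting the start positions into independent chunks."""
    n_starts = max(len(seq) - k + 1, 0)
    # Ceiling division so that every start position falls into some chunk.
    step = -(-n_starts // n_chunks) if n_starts else 1

    def work(lo):
        # Each worker reads k-1 characters past its chunk end, so no k-mer
        # spanning a chunk boundary is lost or counted twice.
        hi = min(lo + step, n_starts)
        return Counter(seq[i:i + k] for i in range(lo, hi))

    total = Counter()
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        for partial in pool.map(work, range(0, n_starts, step)):
            total.update(partial)  # merge step (a reduction)
    return total

seq = "ACGTACGTGACG"  # hypothetical example sequence
serial = count_kmers(seq, 3)
chunked = count_kmers_chunked(seq, 3)
```

The merge of partial counters is the part that needs care in any parallel version: the per-chunk work is independent, while the final combination is a reduction over shared results.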

Comparison of Acceleration Techniques for Selected Low-Level Bioinformatics Operations.

PubMed

Langenkämper, Daniel; Jakobi, Tobias; Feld, Dustin; Jelonek, Lukas; Goesmann, Alexander; Nattkemper, Tim W

2016-01-01

In recent years, clock rates of modern processors have stagnated while the demand for computing power has continued to grow. This applies particularly to the fields of life sciences and bioinformatics, where new technologies keep on creating rapidly growing piles of raw data with increasing speed. The number of cores per processor increased in an attempt to compensate for slight increments of clock rates. This technological shift demands changes in software development, especially in the field of high performance computing, where parallelization techniques are gaining in importance due to the pressing issue of large datasets generated by, e.g., modern genomics. This paper presents an overview of state-of-the-art manual and automatic acceleration techniques and lists some applications employing these in different areas of sequence informatics. Furthermore, we provide examples for automatic acceleration of two use cases to show typical problems and gains of transforming a serial application to a parallel one. The paper should aid the reader in deciding on a suitable technique for the problem at hand. We compare four different state-of-the-art automatic acceleration approaches (OpenMP, PluTo-SICA, PPCG, and OpenACC). Their performance as well as their applicability for selected use cases is discussed. While optimizations targeting the CPU worked better in the complex k-mer use case, optimizers for Graphics Processing Units (GPUs) performed better in the matrix multiplication example. But performance is only superior at a certain problem size due to data migration overhead. We show that automatic code parallelization is feasible with current compiler software and yields significant increases in execution speed. Automatic optimizers for CPU are mature and usually no additional manual adjustment is required. In contrast, some automatic parallelizers targeting GPUs still lack maturity and are limited to simple statements and structures.
JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

PubMed

Brown, David K; Penkler, David L; Musyoka, Thommas M; Bishop, Özlem Tastan

2015-01-01

Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.
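A multi-stage workflow of the kind described here has to be turned into an order in which stages can be handed to a resource manager. The sketch below shows one common way to do that, a topological sort of the stage dependency graph. It is not JMS code and does not use the JMS API: the stage names, the dictionary layout, and the `submission_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each stage maps to the set of stages it depends on.
workflow = {
    "align": {"fetch_sequences"},
    "build_model": {"align"},
    "score_model": {"build_model"},
    "report": {"score_model", "align"},
}

def submission_order(stages):
    """Return the stages in an order that respects every dependency.

    Raises graphlib.CycleError if the workflow contains a circular dependency.
    """
    return list(TopologicalSorter(stages).static_order())

order = submission_order(workflow)
for stage in order:
    # A real system would submit a batch job to the cluster here and wait for
    # its dependencies to finish; this sketch only prints the order.
    print("submit", stage)
```

Keeping the dependency graph separate from the submission logic means the same workflow description can be validated for cycles before any job reaches the cluster.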
JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing

PubMed Central

Brown, David K.; Penkler, David L.; Musyoka, Thommas M.; Bishop, Özlem Tastan

2015-01-01

Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS. PMID:26280450
An anatomy of industrial robots and their controls

NASA Astrophysics Data System (ADS)

Luh, J. Y. S.

1983-02-01

The modernization of manufacturing facilities by means of automation represents an approach for increasing productivity in industry. The three existing types of automation are related to continuous process controls, the use of transfer conveyor methods, and the employment of programmable automation for the low-volume batch production of discrete parts. Industrial robots, which are defined as computer-controlled mechanical manipulators, belong to the area of programmable automation. Typically, the robots perform tasks of arc welding, paint spraying, or foundry operation. One may assign a robot to perform a variety of job assignments simply by changing the appropriate computer program. The present investigation is concerned with an evaluation of the potential of the robot on the basis of its basic structure and controls. It is found that robots function well in limited areas of industry. If the range of tasks which robots can perform is to be expanded, it is necessary to provide multiple-task sensors, or special tooling, or even automatic tooling.
Solving lattice QCD systems of equations using mixed precision solvers on GPUs

NASA Astrophysics Data System (ADS)

Clark, M. A.; Babich, R.; Barros, K.; Brower, R. C.; Rebbi, C.

2010-09-01

Modern graphics hardware is designed for highly parallel numerical tasks and promises significant cost and performance benefits for many scientific applications. One such application is lattice quantum chromodynamics (lattice QCD), where the main computational challenge is to efficiently solve the discretized Dirac equation in the presence of an SU(3) gauge field. Using NVIDIA's CUDA platform we have implemented a Wilson-Dirac sparse matrix-vector product that performs at up to 40, 135 and 212 Gflops for double, single and half precision respectively on NVIDIA's GeForce GTX 280 GPU. We have developed a new mixed precision approach for Krylov solvers using reliable updates which allows for full double precision accuracy while using only single or half precision arithmetic for the bulk of the computation. The resulting BiCGstab and CG solvers run in excess of 100 Gflops and, in terms of iterations until convergence, perform better than the usual defect-correction approach for mixed precision.
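To make the comparison in the abstract above concrete, here is a minimal sketch of the conventional defect-correction (iterative refinement) scheme for mixed precision. This is the baseline the abstract refers to, not the authors' reliable-update method or their CUDA implementation. Everything in it is an assumption for illustration: the names `f32`, `solve_low_precision`, `residual`, and `defect_correction` are hypothetical, single precision is emulated with the standard `struct` module, and a small dense Gaussian elimination stands in for the Krylov solver.

```python
import struct


def f32(x):
    """Round a Python float (double) to the nearest IEEE single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def solve_low_precision(A, b):
    """Gaussian elimination with partial pivoting; every result is rounded to single precision."""
    n = len(b)
    # Augmented matrix [A | b], stored in emulated single precision.
    M = [[f32(v) for v in row] + [f32(rhs)] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            factor = f32(M[i][k] / M[k][k])
            for j in range(k, n + 1):
                M[i][j] = f32(M[i][j] - f32(factor * M[k][j]))
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n]
        for j in range(i + 1, n):
            s = f32(s - f32(M[i][j] * x[j]))
        x[i] = f32(s / M[i][i])
    return x


def residual(A, x, b):
    """Residual b - A x, evaluated in full double precision."""
    return [bi - sum(aij * xj for aij, xj in zip(row, x)) for row, bi in zip(A, b)]


def defect_correction(A, b, tol=1e-12, max_outer=20):
    """Outer loop in double precision; the bulk of the work is the low-precision inner solve."""
    x = [0.0] * len(b)
    for _ in range(max_outer):
        r = residual(A, x, b)
        if max(abs(v) for v in r) < tol:
            break
        e = solve_low_precision(A, r)  # correction computed cheaply
        x = [xi + ei for xi, ei in zip(x, e)]  # accumulated in double
    return x
```

The design point is that only the residual and the solution update need double precision; the expensive solve runs entirely in the lower precision and is restarted on each outer iteration.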
Vids: Version 2.0 Alpha Visualization Engine

DTIC Science & Technology

2018-04-25

fidelity than existing efforts. Vids is a project aimed at producing more dynamic and interactive visualization tools using modern computer game ...move through and interact with the data to improve informational understanding. The Vids software leverages off-the-shelf modern game development...analysis and correlations. Recently, an ARL-pioneered project named Virtual Reality Data Analysis Environment (VRDAE) used VR and a modern game engine
Identification of key ancestors of modern germplasm in a breeding program of maize.

PubMed

Technow, F; Schrag, T A; Schipprack, W; Melchinger, A E

2014-12-01

Probabilities of gene origin computed from the genomic kinship matrix can accurately identify key ancestors of modern germplasms. Identifying the key ancestors of modern plant breeding populations can provide valuable insights into the history of a breeding program and provide reference genomes for next generation whole genome sequencing. In an animal breeding context, a method was developed that employs probabilities of gene origin, computed from the pedigree-based additive kinship matrix, for identifying key ancestors. Because reliable and complete pedigree information is often not available in plant breeding, we replaced the additive kinship matrix with the genomic kinship matrix. As a proof-of-concept, we applied this approach to simulated data sets with known ancestries. The relative contribution of the ancestral lines to later generations could be determined with high accuracy, with and without selection. Our method was subsequently used for identifying the key ancestors of the modern Dent germplasm of the public maize breeding program of the University of Hohenheim. We found that the modern germplasm can be traced back to six or seven key ancestors, with one or two of them having a disproportionately large contribution. These results largely corroborated conjectures based on early records of the breeding program. We conclude that probabilities of gene origin computed from the genomic kinship matrix can be used for identifying key ancestors in breeding programs and estimating the proportion of genes contributed by them.
Supercomputing resources empowering superstack with interactive and integrated systems

NASA Astrophysics Data System (ADS)

Rückemann, Claus-Peter

2012-09-01

This paper presents the results from the development and implementation of Superstack algorithms to be dynamically used with integrated systems and supercomputing resources. Processing of geophysical data, termed geoprocessing, is an essential part of the analysis of geoscientific data. The theory of Superstack algorithms and their practical application on modern computing architectures were inspired by developments introduced with the processing of seismic data on mainframes, which in recent years have led to high-end scientific computing applications. Several stacking algorithms are known, but when the signal-to-noise ratio in seismic data is low, the use of iterative algorithms like the Superstack can support analysis and interpretation. The new Superstack algorithms are in use with wave theory and optical phenomena on high-performance computing resources for huge data sets as well as for sophisticated application scenarios in geosciences and archaeology.
Aerodynamic optimization of supersonic compressor cascade using differential evolution on GPU

NASA Astrophysics Data System (ADS)

Aissa, Mohamed Hasanine; Verstraete, Tom; Vuik, Cornelis

2016-06-01

Differential Evolution (DE) is a powerful stochastic optimization method. Compared to gradient-based algorithms, DE is able to avoid local minima but at the same time requires more function evaluations. In turbomachinery applications, function evaluations are performed with time-consuming CFD simulations, which results in a long, unaffordable design cycle. Modern High Performance Computing systems, especially Graphics Processing Units (GPUs), are able to alleviate this inconvenience by accelerating the design evaluation itself. In this work we present a validated CFD solver running on GPUs, able to accelerate the design evaluation and thus the entire design process. An achieved speedup of 20x to 30x enabled the DE algorithm to run on a high-end computer instead of a costly large cluster. The GPU-enhanced DE was used to optimize the aerodynamics of a supersonic compressor cascade, achieving an aerodynamic loss minimization of 20%.
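As an illustration of why DE needs many function evaluations, the following is a minimal sketch of the classic DE/rand/1/bin variant in plain Python. It is not the authors' GPU-accelerated implementation, and the abstract does not say which DE variant they used; the function name `differential_evolution`, the parameter values (`pop_size`, `f`, `cr`, `generations`), and the `sphere` test objective are all assumptions chosen for illustration. In the setting described above, each call to `objective` would be a full CFD simulation.

```python
import random


def sphere(x):
    """Cheap stand-in objective; a real design loop would run a CFD evaluation here."""
    return sum(v * v for v in x)


def differential_evolution(objective, bounds, pop_size=20, f=0.6, cr=0.9,
                           generations=200, seed=0):
    """DE/rand/1/bin with greedy selection; returns (best_vector, best_cost)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct population members, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated component
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_cost = objective(trial)  # one (expensive) evaluation per trial
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]
```

With these defaults the loop performs `pop_size * (generations + 1)` = 4020 objective evaluations, which is why accelerating the evaluation itself dominates the overall design time.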
Spatiotemporal video deinterlacing using control grid interpolation

NASA Astrophysics Data System (ADS)

Venkatesan, Ragav; Zwart, Christine M.; Frakes, David H.; Li, Baoxin

2015-03-01

With the advent of progressive format display and broadcast technologies, video deinterlacing has become an important video-processing technique. Numerous approaches exist in the literature to accomplish deinterlacing. While most earlier methods were simple linear filtering-based approaches, the emergence of faster computing technologies and even dedicated video-processing hardware in display units has allowed higher quality but also more computationally intense deinterlacing algorithms to become practical. Most modern approaches analyze motion and content in video to select different deinterlacing methods for various spatiotemporal regions. We introduce a family of deinterlacers that employs spectral residue to choose between and weight control grid interpolation based spatial and temporal deinterlacing methods. The proposed approaches perform better than the prior state-of-the-art based on peak signal-to-noise ratio, other visual quality metrics, and simple perception-based subjective evaluations conducted by human viewers. We further study the advantages of using soft and hard decision thresholds on the visual performance.
Load management strategy for Particle-In-Cell simulations in high energy particle acceleration

NASA Astrophysics Data System (ADS)

Beck, A.; Frederiksen, J. T.; Dérouillat, J.

2016-09-01

In the wake of the intense effort made for the experimental CILEX project, numerical simulation campaigns have been carried out in order to finalize the design of the facility and to identify optimal laser and plasma parameters. These simulations bring, of course, important insight into the fundamental physics at play. As a by-product, they also characterize the quality of our theoretical and numerical models. In this paper, we compare the results given by different codes and point out algorithmic limitations both in terms of physical accuracy and computational performance. These limitations are illustrated in the context of electron laser wakefield acceleration (LWFA). The main limitation we identify in state-of-the-art Particle-In-Cell (PIC) codes is computational load imbalance. We propose an innovative algorithm to deal with this specific issue as well as milestones towards a modern, accurate high-performance PIC code for high energy particle acceleration.
Accelerating gravitational microlensing simulations using the Xeon Phi coprocessor

NASA Astrophysics Data System (ADS)

Chen, B.; Kantowski, R.; Dai, X.; Baron, E.; Van der Mark, P.

2017-04-01

Recently Graphics Processing Units (GPUs) have been used to speed up very CPU-intensive gravitational microlensing simulations. In this work, we use the Xeon Phi coprocessor to accelerate such simulations and compare its performance on a microlensing code with that of NVIDIA's GPUs. For the selected set of parameters evaluated in our experiment, we find that the speedup by Intel's Knights Corner coprocessor is comparable to that by NVIDIA's Fermi family of GPUs with compute capability 2.0, but less significant than GPUs with higher compute capabilities such as the Kepler. However, the very recently released second generation Xeon Phi, Knights Landing, is about 5.8 times faster than the Knights Corner, and about 2.9 times faster than the Kepler GPU used in our simulations. We conclude that the Xeon Phi is a very promising alternative to GPUs for modern high performance microlensing simulations.
Aerodynamic optimization of supersonic compressor cascade using differential evolution on GPU

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aissa, Mohamed Hasanine; Verstraete, Tom; Vuik, Cornelis

Differential Evolution (DE) is a powerful stochastic optimization method. Compared to gradient-based algorithms, DE is able to avoid local minima but at the same time requires more function evaluations. In turbomachinery applications, function evaluations are performed with time-consuming CFD simulations, which results in a long, unaffordable design cycle. Modern High Performance Computing systems, especially Graphics Processing Units (GPUs), are able to alleviate this inconvenience by accelerating the design evaluation itself. In this work we present a validated CFD solver running on GPUs, able to accelerate the design evaluation and thus the entire design process. An achieved speedup of 20x to 30x enabled the DE algorithm to run on a high-end computer instead of a costly large cluster. The GPU-enhanced DE was used to optimize the aerodynamics of a supersonic compressor cascade, achieving an aerodynamic loss minimization of 20%.
WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mendygral, P. J.; Radcliffe, N.; Kandalla, K.

2017-02-01

We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Yang, Y M; Bush, K; Han, B

Purpose: Accurate and fast dose calculation is a prerequisite of precision radiation therapy in modern photon and particle therapy. While Monte Carlo (MC) dose calculation provides high dosimetric accuracy, the drastically increased computational time hinders its routine use. Deterministic dose calculation methods are fast, but problematic in the presence of tissue density inhomogeneity. We leverage the useful features of deterministic methods and MC to develop a hybrid dose calculation platform with autonomous utilization of MC and deterministic calculation depending on the local geometry, for optimal accuracy and speed. Methods: Our platform utilizes a Geant4 based “localized Monte Carlo” (LMC) method that isolates MC dose calculations only to volumes that have potential for dosimetric inaccuracy. In our approach, additional structures are created encompassing heterogeneous volumes. Deterministic methods calculate dose and energy fluence up to the volume surfaces, where the energy fluence distribution is sampled into discrete histories and transported using MC. Histories exiting the volume are converted back into energy fluence, and transported deterministically. By matching boundary conditions at both interfaces, deterministic dose calculations account for dose perturbations “downstream” of localized heterogeneities. Hybrid dose calculation was performed for water and anthropomorphic phantoms. Results: We achieved <1% agreement between deterministic and MC calculations in the water benchmark for photon and proton beams, and dose differences of 2%–15% could be observed in heterogeneous phantoms. The saving in computational time (a factor ∼4–7 compared to a full Monte Carlo dose calculation) was found to be approximately proportional to the volume of the heterogeneous region.
Conclusion: Our hybrid dose calculation approach takes advantage of the computational efficiency of deterministic methods and the accuracy of MC, providing a practical tool for high performance dose calculation in modern RT. The approach is generalizable to all modalities where heterogeneities play a large role, notably particle therapy.
A High Performance VLSI Computer Architecture For Computer Graphics

NASA Astrophysics Data System (ADS)

Chin, Chi-Yuan; Lin, Wen-Tai

1988-10-01

A VLSI computer architecture consisting of multiple processors is presented in this paper to satisfy modern computer graphics demands, e.g., high resolution, realistic animation, and real-time display. All processors share a global memory, which is partitioned into multiple banks. Through a crossbar network, data from one memory bank can be broadcast to many processors. Processors are physically interconnected through a hyper-crossbar network (a crossbar-like network). By programming the network, the topology of communication links among processors can be reconfigured to satisfy the specific dataflows of different applications. Each processor consists of a controller, arithmetic operators, local memory, a local crossbar network, and I/O ports to communicate with other processors, memory banks, and a system controller. Operations in each processor are categorized into two modes, i.e., object domain and space domain, to fully utilize the data-independence characteristics of graphics processing. Special graphics features such as 3D-to-2D conversion, shadow generation, texturing, and reflection can be easily handled. With the current high density interconnection (MI) technology, it is feasible to implement a 64-processor system to achieve 2.5 billion operations per second, a performance needed in most advanced graphics applications.
SiMon: Simulation Monitor for Computational Astrophysics

NASA Astrophysics Data System (ADS)

Xuran Qian, Penny; Cai, Maxwell Xu; Portegies Zwart, Simon; Zhu, Ming

2017-09-01

Scientific discovery via numerical simulations is important in modern astrophysics. This relatively new branch of astrophysics has become possible due to the development of reliable numerical algorithms and the high performance of modern computing technologies. These enable the analysis of large collections of observational data and the acquisition of new data via simulations at unprecedented accuracy and resolution. Ideally, simulations run until they reach some pre-determined termination condition, but often other factors cause extensive numerical approaches to break down at an earlier stage. In those cases, processes tend to be interrupted due to unexpected events in the software or the hardware, and the scientist handles the interrupt manually, which is time-consuming and prone to errors. We present the Simulation Monitor (SiMon) to automate the farming of large and extensive simulation processes. Our method is lightweight: it fully automates the entire workflow management, operates concurrently across multiple platforms, and can be installed in user space. Inspired by the process of crop farming, we perceive each simulation as a crop in the field, and running a simulation becomes analogous to growing crops. With the development of SiMon we relax the technical aspects of simulation management. The initial package was developed for extensive parameter searches in numerical simulations, but it turns out to work equally well for automating the computational processing and reduction of observational data.
Parallel-Vector Algorithm For Rapid Structural Analysis

NASA Technical Reports Server (NTRS)

Agarwal, Tarun R.; Nguyen, Duc T.; Storaasli, Olaf O.

1993-01-01

New algorithm developed to overcome deficiency of skyline storage scheme by use of variable-band storage scheme. Exploits both parallel and vector capabilities of modern high-performance computers. Gives engineers and designers opportunity to include more design variables and constraints during optimization of structures. Enables use of more refined finite-element meshes to obtain improved understanding of complex behaviors of aerospace structures leading to better, safer designs. Not only attractive for current supercomputers but also for next generation of shared-memory supercomputers.
DoD HPC Insights Fall 2016: A publication of the Department of Defense High Performance Computing Modernization Program

DTIC Science & Technology

2016-09-01

HPCMP will continue to be a key resource in solving challenging problems for the Department of Defense. High-Fidelity Simulations of...laser interactions. The group had studied plasma expansion experimentally, but this wasn’t sufficient to understand the problem. Feister adapted and...focused on increasing the efficiency of jet turbine engines and extending aircraft flight ranges by changing the shape (articulation) of the turbine
Computing Legacy Software Behavior to Understand Functionality and Security Properties: An IBM/370 Demonstration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Linger, Richard C; Pleszkoch, Mark G; Prowell, Stacy J

Organizations maintaining mainframe legacy software can benefit from code modernization and incorporation of security capabilities to address the current threat environment. Oak Ridge National Laboratory is developing the Hyperion system to compute the behavior of software as a means to gain understanding of software functionality and security properties. Computation of functionality is critical to revealing security attributes, which are in fact specialized functional behaviors of software. Oak Ridge is collaborating with MITRE Corporation to conduct a demonstration project to compute behavior of legacy IBM Assembly Language code for a federal agency. The ultimate goal is to understand functionality and security vulnerabilities as a basis for code modernization. This paper reports on the first phase, to define functional semantics for IBM Assembly instructions and conduct behavior computation experiments.

A survey study on student preferences regarding pathology teaching in Germany: a call for curricular modernization.

PubMed

Herrmann, Florian E M; Lenski, Markus; Steffen, Julius; Kailuweit, Magdalena; Nikolaus, Marc; Koteeswaran, Rajasekaran; Sailer, Andreas; Hanszke, Anna; Wintergerst, Maximilian; Dittmer, Sissi; Mayr, Doris; Genzel-Boroviczény, Orsolya; Eley, Diann S; Fischer, Martin R

2015-06-02

Pathology is a discipline that provides the basis of the understanding of disease in medicine. The past decades have seen a decline in the emphasis laid on pathology teaching in medical schools and outdated pathology curricula have worsened the situation. Student opinions and thoughts are central to the questions of whether and how such curricula should be modernized. A survey was conducted among 1018 German medical students regarding their preferences in pathology teaching modalities and their satisfaction with lecture-based courses. A qualitative analysis was performed comparing a recently modernized pathology curriculum with a traditional lecture-based curriculum. The differences in modalities of teaching used were investigated. Student satisfaction with the lecture-based curriculum positively correlated with student grades (Spearman's correlation coefficient 0.24). Additionally, students with lower grades supported changing the curriculum (Spearman's correlation coefficient 0.47). The majority supported virtual microscopy, autopsies, seminars and podcasts as preferred didactic methods. The data support the implementation of a pathology curriculum where tutorials, autopsies and supplementary computer-based learning tools play important roles.
CBRAIN: a web-based, distributed computing platform for collaborative neuroimaging research

PubMed Central

Sherif, Tarek; Rioux, Pierre; Rousseau, Marc-Etienne; Kassis, Nicolas; Beck, Natacha; Adalat, Reza; Das, Samir; Glatard, Tristan; Evans, Alan C.

2014-01-01

The Canadian Brain Imaging Research Platform (CBRAIN) is a web-based collaborative research platform developed in response to the challenges raised by data-heavy, compute-intensive neuroimaging research. CBRAIN offers transparent access to remote data sources, distributed computing sites, and an array of processing and visualization tools within a controlled, secure environment. Its web interface is accessible through any modern browser and uses graphical interface idioms to reduce the technical expertise required to perform large-scale computational analyses. CBRAIN's flexible meta-scheduling has allowed the incorporation of a wide range of heterogeneous computing sites, currently including nine national research High Performance Computing (HPC) centers in Canada, one in Korea, one in Germany, and several local research servers. CBRAIN leverages remote computing cycles and facilitates resource-interoperability in a transparent manner for the end-user. Compared with typical grid solutions available, our architecture was designed to be easily extendable and deployed on existing remote computing sites with no tool modification, administrative intervention, or special software/hardware configuration. As of October 2013, CBRAIN serves over 200 users spread across 53 cities in 17 countries. The platform is built as a generic framework that can accept data and analysis tools from any discipline. However, its current focus is primarily on neuroimaging research and studies of neurological diseases such as Autism, Parkinson's and Alzheimer's diseases, Multiple Sclerosis as well as on normal brain structure and development. This technical report presents the CBRAIN Platform, its current deployment and usage and future direction. PMID:24904400
CBRAIN: a web-based, distributed computing platform for collaborative neuroimaging research.

PubMed

Sherif, Tarek; Rioux, Pierre; Rousseau, Marc-Etienne; Kassis, Nicolas; Beck, Natacha; Adalat, Reza; Das, Samir; Glatard, Tristan; Evans, Alan C

2014-01-01

The Canadian Brain Imaging Research Platform (CBRAIN) is a web-based collaborative research platform developed in response to the challenges raised by data-heavy, compute-intensive neuroimaging research. CBRAIN offers transparent access to remote data sources, distributed computing sites, and an array of processing and visualization tools within a controlled, secure environment. Its web interface is accessible through any modern browser and uses graphical interface idioms to reduce the technical expertise required to perform large-scale computational analyses. CBRAIN's flexible meta-scheduling has allowed the incorporation of a wide range of heterogeneous computing sites, currently including nine national research High Performance Computing (HPC) centers in Canada, one in Korea, one in Germany, and several local research servers. CBRAIN leverages remote computing cycles and facilitates resource-interoperability in a transparent manner for the end-user. Compared with typical grid solutions available, our architecture was designed to be easily extendable and deployed on existing remote computing sites with no tool modification, administrative intervention, or special software/hardware configuration. As of October 2013, CBRAIN serves over 200 users spread across 53 cities in 17 countries. The platform is built as a generic framework that can accept data and analysis tools from any discipline. However, its current focus is primarily on neuroimaging research and studies of neurological diseases such as Autism, Parkinson's and Alzheimer's diseases, Multiple Sclerosis as well as on normal brain structure and development. This technical report presents the CBRAIN Platform, its current deployment and usage and future direction.
Evaluation of a Multicore-Optimized Implementation for Tomographic Reconstruction

PubMed Central

Agulleiro, Jose-Ignacio; Fernández, José Jesús

2012-01-01

Tomography allows elucidation of the three-dimensional structure of an object from a set of projection images. In life sciences, electron microscope tomography is providing invaluable information about the cell structure at a resolution of a few nanometres. Here, large images are required to combine wide fields of view with high resolution requirements. The computational complexity of the algorithms along with the large image size then turns tomographic reconstruction into a computationally demanding problem. Traditionally, high-performance computing techniques have been applied to cope with such demands on supercomputers, distributed systems and computer clusters. In the last few years, the trend has turned towards graphics processing units (GPUs). Here we present a detailed description and a thorough evaluation of an alternative approach that relies on exploitation of the power available in modern multicore computers. The combination of single-core code optimization, vector processing, multithreading and efficient disk I/O operations succeeds in providing fast tomographic reconstructions on standard computers. The approach turns out to be competitive with the fastest GPU-based solutions thus far. PMID:23139768
Programmable computing with a single magnetoresistive element

NASA Astrophysics Data System (ADS)

Ney, A.; Pampuch, C.; Koch, R.; Ploog, K. H.

2003-10-01

The development of transistor-based integrated circuits for modern computing is a story of great success. However, the proven concept for enhancing computational power by continuous miniaturization is approaching its fundamental limits. Alternative approaches consider logic elements that are reconfigurable at run-time to overcome the rigid architecture of the present hardware systems. Implementation of parallel algorithms on such 'chameleon' processors has the potential to yield a dramatic increase of computational speed, competitive with that of supercomputers. Owing to their functional flexibility, 'chameleon' processors can be readily optimized with respect to any computer application. In conventional microprocessors, information must be transferred to a memory to prevent it from getting lost, because electrically processed information is volatile. Therefore the computational performance can be improved if the logic gate is additionally capable of storing the output. Here we describe a simple hardware concept for a programmable logic element that is based on a single magnetic random access memory (MRAM) cell. It combines the inherent advantage of a non-volatile output with flexible functionality which can be selected at run-time to operate as an AND, OR, NAND or NOR gate.
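The behaviour described in the abstract above, one element whose logic function is selected at run-time and whose output is retained, can be modelled at the purely logical level as follows. This sketch says nothing about how the MRAM cell realizes these functions physically; the names `GateFunction`, `ProgrammableGate`, `program`, `evaluate`, and `truth_table` are hypothetical and exist only for this illustration.

```python
from enum import Enum


class GateFunction(Enum):
    AND = "AND"
    OR = "OR"
    NAND = "NAND"
    NOR = "NOR"


class ProgrammableGate:
    """Logical model only: a two-input gate whose function is chosen at run-time
    and whose last output is retained (standing in for non-volatility)."""

    def __init__(self, function: GateFunction) -> None:
        self.function = function
        self.output = 0  # retained between evaluations

    def program(self, function: GateFunction) -> None:
        """Reconfigure the gate without touching the stored output."""
        self.function = function

    def evaluate(self, a: int, b: int) -> int:
        a, b = bool(a), bool(b)
        if self.function is GateFunction.AND:
            result = a and b
        elif self.function is GateFunction.OR:
            result = a or b
        elif self.function is GateFunction.NAND:
            result = not (a and b)
        else:  # GateFunction.NOR
            result = not (a or b)
        self.output = int(result)  # the result stays stored in the element
        return self.output


def truth_table(function: GateFunction) -> list:
    """Outputs for inputs (0,0), (0,1), (1,0), (1,1), in that order."""
    gate = ProgrammableGate(function)
    return [gate.evaluate(a, b) for a in (0, 1) for b in (0, 1)]
```

Calling `program` between evaluations switches the same object from, say, AND to NOR while `output` still holds the previous result, which mirrors the combination of run-time selection and stored output that the abstract highlights.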
The Center for Computational Biology: resources, achievements, and challenges

PubMed Central

Dinov, Ivo D; Thompson, Paul M; Woods, Roger P; Van Horn, John D; Shattuck, David W; Parker, D Stott

2011-01-01

The Center for Computational Biology (CCB) is a multidisciplinary program where biomedical scientists, engineers, and clinicians work jointly to combine modern mathematical and computational techniques, to perform phenotypic and genotypic studies of biological structure, function, and physiology in health and disease. CCB has developed a computational framework built around the Manifold Atlas, an integrated biomedical computing environment that enables statistical inference on biological manifolds. These manifolds model biological structures, features, shapes, and flows, and support sophisticated morphometric and statistical analyses. The Manifold Atlas includes tools, workflows, and services for multimodal population-based modeling and analysis of biological manifolds. The broad spectrum of biomedical topics explored by CCB investigators includes the study of normal and pathological brain development, maturation and aging, discovery of associations between neuroimaging and genetic biomarkers, and the modeling, analysis, and visualization of biological shape, form, and size. CCB supports a wide range of short-term and long-term collaborations with outside investigators, which drive the center's computational developments and focus the validation and dissemination of CCB resources to new areas and scientific domains. PMID:22081221
The Center for Computational Biology: resources, achievements, and challenges.

PubMed

Toga, Arthur W; Dinov, Ivo D; Thompson, Paul M; Woods, Roger P; Van Horn, John D; Shattuck, David W; Parker, D Stott

2012-01-01

The Center for Computational Biology (CCB) is a multidisciplinary program where biomedical scientists, engineers, and clinicians work jointly to combine modern mathematical and computational techniques, to perform phenotypic and genotypic studies of biological structure, function, and physiology in health and disease. CCB has developed a computational framework built around the Manifold Atlas, an integrated biomedical computing environment that enables statistical inference on biological manifolds. These manifolds model biological structures, features, shapes, and flows, and support sophisticated morphometric and statistical analyses. The Manifold Atlas includes tools, workflows, and services for multimodal population-based modeling and analysis of biological manifolds. The broad spectrum of biomedical topics explored by CCB investigators includes the study of normal and pathological brain development, maturation and aging, discovery of associations between neuroimaging and genetic biomarkers, and the modeling, analysis, and visualization of biological shape, form, and size. CCB supports a wide range of short-term and long-term collaborations with outside investigators, which drive the center's computational developments and focus the validation and dissemination of CCB resources to new areas and scientific domains.
High-performance noncontact thermal diode via asymmetric nanostructures

NASA Astrophysics Data System (ADS)

Shen, Jiadong; Liu, Xianglei; He, Huan; Wu, Weitao; Liu, Baoan

2018-05-01

Electric diodes, though laying the foundation of modern electronics and information processing industries, suffer from ineffectiveness and even failure at high temperatures. Thermal diodes are promising alternatives to relieve the above limitations, but usually possess low rectification ratios, and how to obtain a high-performance thermal rectification effect is still an open question. This paper proposes an efficient contactless thermal diode based on the near-field thermal radiation of asymmetric doped silicon nanostructures. The rectification ratio computed via exact scattering theories is demonstrated to be as high as 10 at a nanoscale gap distance and period, outperforming the counterpart flat-plate diode by more than one order of magnitude. This extraordinary performance mainly lies in the higher forward and lower reverse radiative heat flux within the low frequency band compared with the counterpart flat-plate diode, which is caused by a lower loss and smaller cut-off wavevector of nanostructures for the forward and reversed scheme, respectively. This work opens new routes to realize high performance thermal diodes, and may have wide applications in efficient thermal computing, thermal information processing, and thermal management.
Aerodynamics of High-Lift Configuration Civil Aircraft Model in JAXA

NASA Astrophysics Data System (ADS)

Yokokawa, Yuzuru; Murayama, Mitsuhiro; Ito, Takeshi; Yamamoto, Kazuomi

This paper presents the basic aerodynamics and stall characteristics of the high-lift configuration aircraft model JSM (JAXA Standard Model). During research on developing a high-lift system design method, wind tunnel testing in the JAXA 6.5 m by 5.5 m low-speed wind tunnel and Navier-Stokes computation on an unstructured hybrid mesh were performed for a realistic aircraft configuration model equipped with high-lift devices, fuselage, nacelle-pylon, slat tracks and Flap Track Fairings (FTF), which was assumed to be a 100-passenger-class modern commercial transport aircraft. The testing and the computation aimed to understand the flow physics and then to obtain some guidelines for designing a high-performance high-lift system. As a result of the testing, Reynolds number effects within the linear region and the stall region were observed. Analysis of the static pressure distribution and flow visualization provided the knowledge needed to understand the aerodynamic performance. CFD could capture the overall characteristics of the basic aerodynamics and clarify the flow mechanism that governs the stall characteristics, even for a complicated geometry and its flow field. This collaborative work between wind tunnel testing and CFD has been advantageous for improving the aerodynamic performance.
cudaMap: a GPU accelerated program for gene expression connectivity mapping.

PubMed

McArt, Darragh G; Bankhead, Peter; Dunne, Philip D; Salto-Tellez, Manuel; Hamilton, Peter; Zhang, Shu-Dong

2013-10-11

Modern cancer research often involves large datasets and the use of sophisticated statistical techniques. Together these add a heavy computational load to the analysis, which is often coupled with issues surrounding data accessibility. Connectivity mapping is an advanced bioinformatic and computational technique dedicated to therapeutics discovery and drug re-purposing around differential gene expression analysis. On a normal desktop PC, it is common for the connectivity mapping task with a single gene signature to take > 2h to complete using sscMap, a popular Java application that runs on standard CPUs (Central Processing Units). Here, we describe new software, cudaMap, which has been implemented using CUDA C/C++ to harness the computational power of NVIDIA GPUs (Graphics Processing Units) to greatly reduce processing times for connectivity mapping. cudaMap can identify candidate therapeutics from the same signature in just over thirty seconds when using an NVIDIA Tesla C2050 GPU. Results from the analysis of multiple gene signatures, which would previously have taken several days, can now be obtained in as little as 10 minutes, greatly facilitating candidate therapeutics discovery with high throughput. We are able to demonstrate dramatic speed differentials between GPU assisted performance and CPU executions as the computational load increases for high accuracy evaluation of statistical significance. Emerging 'omics' technologies are constantly increasing the volume of data and information to be processed in all areas of biomedical research. Embracing the multicore functionality of GPUs represents a major avenue of local accelerated computing. cudaMap will make a strong contribution in the discovery of candidate therapeutics by enabling speedy execution of heavy duty connectivity mapping tasks, which are increasingly required in modern cancer research. cudaMap is open source and can be freely downloaded from http://purl.oclc.org/NET/cudaMap.
Applying "Climate" system to teaching basic climatology and raising public awareness of climate change issues

NASA Astrophysics Data System (ADS)

Gordova, Yulia; Okladnikov, Igor; Titov, Alexander; Gordov, Evgeny

2016-04-01

While there is a strong demand for innovation in digital learning, available training programs in the environmental sciences have no time to adapt to rapid changes in the domain content. A joint group of scientists and university teachers develops and implements an educational environment for new learning experiences in the basics of climate science and its applications. This so-called virtual learning laboratory "Climate" contains educational materials and interactive training courses developed to provide undergraduate and graduate students with a profound understanding of changes in regional climate and environment. The main feature of this Laboratory is that students perform their computational tasks on climate modeling and on the evaluation and assessment of climate change using the typical tools of the "Climate" information-computational system, which are usually used by real-life practitioners performing this kind of research. Students have an opportunity to perform computational laboratory work using the information-computational tools of the system and to improve their skills in using them while mastering the subject. We did not create an artificial learning environment for completing the training. On the contrary, the main purpose of combining the educational block with the computational information system was to familiarize students with real, existing technologies for monitoring and analysis of data on the state of the climate. The training is based on technologies and procedures that are typical for Earth system sciences. Educational courses are designed to permit students to conduct their own investigations of ongoing and future climate changes in a manner that is essentially identical to the techniques used by national and international climate research organizations. 
All training is supported by lectures devoted to the basic aspects of modern climatology, including analysis of current climate change and its possible impacts, ensuring effective links between theory and practice. Along with its use in graduate and postgraduate education, "Climate" is used as a framework for a basic information course on climate change developed for the general public. In this course, basic concepts and problems of modern climate change and its possible consequences are described for non-specialists. The course will also include links to relevant information resources on topical issues of Earth Sciences and a number of case studies, which are carried out for a selected region to consolidate the knowledge gained.
Image analysis in modern ophthalmology: from acquisition to computer assisted diagnosis and telemedicine

NASA Astrophysics Data System (ADS)

Marrugo, Andrés G.; Millán, María S.; Cristóbal, Gabriel; Gabarda, Salvador; Sorel, Michal; Sroubek, Filip

2012-06-01

Medical digital imaging has become a key element of modern health care procedures. It provides visual documentation and a permanent record for the patients and, most importantly, the ability to extract information about many diseases. Modern ophthalmology thrives and develops on the advances in digital imaging and computing power. In this work we present an overview of recent image processing techniques proposed by the authors in the area of digital eye fundus photography. Our applications range from retinal image quality assessment to image restoration via blind deconvolution and visualization of structural changes in time between patient visits. All are proposed within a framework for improving and assisting medical practice and the forthcoming scenario of the information chain in telemedicine.
Upgrading of the LGD cluster at JINR to support DLNP experiments

NASA Astrophysics Data System (ADS)

Bednyakov, I. V.; Dolbilov, A. G.; Ivanov, Yu. P.

2017-01-01

Since its construction in 2005, the Computing Cluster of the Dzhelepov Laboratory of Nuclear Problems has been mainly used to perform calculations (data analysis, simulation, etc.) for various scientific collaborations in which DLNP scientists take an active part. The Cluster also serves to train specialists. Much has changed in the past decades, and the necessity has arisen to upgrade the cluster, increasing its power and replacing the outdated equipment to maintain its reliability and modernity. In this work we describe the experience of performing this upgrade, which may help system administrators put new equipment for clusters of this type into operation quickly and efficiently.
STORM WATER MANAGEMENT MODEL (SWMM) MODERNIZATION

EPA Science Inventory

The U.S. Environmental Protection Agency's Water Supply and Water Resources Division in partnership with the consulting firm of CDM to redevelop and modernize the Storm Water Management Model (SWMM). In the initial phase of this project EPA rewrote SWMM's computational engine usi...
A History of Computer Numerical Control.

ERIC Educational Resources Information Center

Haggen, Gilbert L.

Computer numerical control (CNC) has evolved from the first significant counting method--the abacus. Babbage had perhaps the greatest impact on the development of modern day computers with his analytical engine. Hollerith's functioning machine with punched cards was used in tabulating the 1890 U.S. Census. In order for computers to become a…
Web-Based Computational Chemistry Education with CHARMMing I: Lessons and Tutorial

PubMed Central

Miller, Benjamin T.; Singh, Rishi P.; Schalk, Vinushka; Pevzner, Yuri; Sun, Jingjun; Miller, Carrie S.; Boresch, Stefan; Ichiye, Toshiko; Brooks, Bernard R.; Woodcock, H. Lee

2014-01-01

This article describes the development, implementation, and use of web-based “lessons” to introduce students and other newcomers to computer simulations of biological macromolecules. These lessons, i.e., interactive step-by-step instructions for performing common molecular simulation tasks, are integrated into the collaboratively developed CHARMM INterface and Graphics (CHARMMing) web user interface (http://www.charmming.org). Several lessons have already been developed with new ones easily added via a provided Python script. In addition to CHARMMing's new lessons functionality, web-based graphical capabilities have been overhauled and are fully compatible with modern mobile web browsers (e.g., phones and tablets), allowing easy integration of these advanced simulation techniques into coursework. Finally, one of the primary objections to web-based systems like CHARMMing has been that “point and click” simulation set-up does little to teach the user about the underlying physics, biology, and computational methods being applied. In response to this criticism, we have developed a freely available tutorial to bridge the gap between graphical simulation setup and the technical knowledge necessary to perform simulations without user interface assistance. PMID:25057988
Web-based computational chemistry education with CHARMMing I: Lessons and tutorial.

PubMed

Miller, Benjamin T; Singh, Rishi P; Schalk, Vinushka; Pevzner, Yuri; Sun, Jingjun; Miller, Carrie S; Boresch, Stefan; Ichiye, Toshiko; Brooks, Bernard R; Woodcock, H Lee

2014-07-01

This article describes the development, implementation, and use of web-based "lessons" to introduce students and other newcomers to computer simulations of biological macromolecules. These lessons, i.e., interactive step-by-step instructions for performing common molecular simulation tasks, are integrated into the collaboratively developed CHARMM INterface and Graphics (CHARMMing) web user interface (http://www.charmming.org). Several lessons have already been developed with new ones easily added via a provided Python script. In addition to CHARMMing's new lessons functionality, web-based graphical capabilities have been overhauled and are fully compatible with modern mobile web browsers (e.g., phones and tablets), allowing easy integration of these advanced simulation techniques into coursework. Finally, one of the primary objections to web-based systems like CHARMMing has been that "point and click" simulation set-up does little to teach the user about the underlying physics, biology, and computational methods being applied. In response to this criticism, we have developed a freely available tutorial to bridge the gap between graphical simulation setup and the technical knowledge necessary to perform simulations without user interface assistance.
SNAVA-A real-time multi-FPGA multi-model spiking neural network simulation architecture.

PubMed

Sripad, Athul; Sanchez, Giovanny; Zapata, Mireya; Pirrone, Vito; Dorta, Taho; Cambria, Salvatore; Marti, Albert; Krishnamourthy, Karthikeyan; Madrenas, Jordi

2018-01-01

The Spiking Neural Networks (SNN) for Versatile Applications (SNAVA) simulation platform is a scalable and programmable parallel architecture that supports real-time, large-scale, multi-model SNN computation. This parallel architecture is implemented in modern Field-Programmable Gate Array (FPGA) devices to provide high performance execution and flexibility to support large-scale SNN models. Flexibility is defined in terms of programmability, which allows easy synapse and neuron implementation. This has been achieved by using special-purpose Processing Elements (PEs) for computing SNNs, and by analyzing and customizing the instruction set according to the processing needs to achieve maximum performance with minimum resources. The parallel architecture is interfaced with customized Graphical User Interfaces (GUIs) to configure the SNN's connectivity, to compile the neuron-synapse model and to monitor the SNN's activity. Our contribution intends to provide a tool that allows SNNs to be prototyped faster than on CPU/GPU architectures but significantly more cheaply than by fabricating a customized neuromorphic chip. This could be potentially valuable to the computational neuroscience and neuromorphic engineering communities. Copyright © 2017 Elsevier Ltd. All rights reserved.
Computational Approach for Securing Radiology-Diagnostic Data in Connected Health Network using High-Performance GPU-Accelerated AES.

PubMed

Adeshina, A M; Hashim, R

2017-03-01

Diagnostic radiology is a core and integral part of modern medicine, paving the way for primary care physicians in disease diagnosis, treatment and therapy management. All recent standard healthcare procedures have benefitted immensely from the contemporary information technology revolution, which has transformed approaches to acquiring, storing and sharing diagnostic data for efficient and timely diagnosis of diseases. The connected health network was introduced as an alternative to the ageing traditional concept of the healthcare system, improving hospital-physician connectivity and clinical collaboration. The modern medical approach has drastically improved healthcare, but at the expense of high computational cost and possible breaches of diagnostic privacy. Consequently, a number of cryptographic techniques have recently been applied to clinical applications, but the challenge of successfully encrypting both image and textual data persists. Furthermore, achieving encryption and decryption of medical datasets at a considerably lower computational cost, without jeopardizing the required security strength of the encryption algorithm, remains an outstanding issue. This study proposes a secured radiology-diagnostic data framework for connected health networks using the high-performance GPU-accelerated Advanced Encryption Standard. The study was evaluated with radiology image datasets consisting of brain MR and CT datasets obtained from the department of Surgery, University of North Carolina, USA, and the Swedish National Infrastructure for Computing. Sample patients' notes from the University of North Carolina, School of medicine at Chapel Hill were also used to evaluate the framework for its strength in encrypting-decrypting textual data in the form of medical reports. 
Significantly, the framework is not only able to accurately encrypt and decrypt medical image datasets, but it also successfully encrypts and decrypts textual data in Microsoft Word document, Microsoft Excel and Portable Document Format, which are the conventional formats for documenting medical records. Interestingly, the entire encryption and decryption procedures were achieved at a lower computational cost using regular hardware and software resources without compromising either the quality of the decrypted data or the security level of the algorithms.
ChRIS--A web-based neuroimaging and informatics system for collecting, organizing, processing, visualizing and sharing of medical data.

PubMed

Pienaar, Rudolph; Rannou, Nicolas; Bernal, Jorge; Hahn, Daniel; Grant, P Ellen

2015-01-01

The utility of web browsers for general purpose computing, long anticipated, is only now coming to fruition. In this paper we present a web-based medical image data and information management software platform called ChRIS ([Boston] Children's Research Integration System). ChRIS' deep functionality allows for easy retrieval of medical image data from resources typically found in hospitals, organizes and presents information in a modern feed-like interface, provides access to a growing library of plugins that process these data (typically on a connected High Performance Compute Cluster), allows for easy data sharing between users and instances of ChRIS, and provides powerful 3D visualization and real time collaboration.

Wasatch: An architecture-proof multiphysics development environment using a Domain Specific Language and graph theory

DOE Office of Scientific and Technical Information (OSTI.GOV)

Saad, Tony; Sutherland, James C.

To address the coding and software challenges of modern hybrid architectures, we propose an approach to multiphysics code development for high-performance computing. This approach is based on using a Domain Specific Language (DSL) in tandem with a directed acyclic graph (DAG) representation of the problem to be solved that allows runtime algorithm generation. When coupled with a large-scale parallel framework, the result is a portable development framework capable of executing on hybrid platforms and handling the challenges of multiphysics applications. In addition, we share our experience developing a code in such an environment – an effort that spans an interdisciplinary team of engineers and computer scientists.
Wasatch: An architecture-proof multiphysics development environment using a Domain Specific Language and graph theory

DOE PAGES

Saad, Tony; Sutherland, James C.

2016-05-04

To address the coding and software challenges of modern hybrid architectures, we propose an approach to multiphysics code development for high-performance computing. This approach is based on using a Domain Specific Language (DSL) in tandem with a directed acyclic graph (DAG) representation of the problem to be solved that allows runtime algorithm generation. When coupled with a large-scale parallel framework, the result is a portable development framework capable of executing on hybrid platforms and handling the challenges of multiphysics applications. In addition, we share our experience developing a code in such an environment – an effort that spans an interdisciplinary team of engineers and computer scientists.
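The idea of deriving an execution order from a DAG at run time can be illustrated with a minimal sketch. It is not taken from Wasatch: the task names and the helper `build_schedule` are hypothetical, and the ordering relies on Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def build_schedule(dependencies):
    """Return an execution order in which every task follows its prerequisites.

    dependencies maps each task name to the set of tasks it depends on.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical expression graph: each key depends on the names in its set.
deps = {
    "density": set(),
    "velocity": set(),
    "momentum": {"density", "velocity"},
    "convective_flux": {"momentum", "velocity"},
}

order = build_schedule(deps)
for task in order:
    print("evaluate", task)
```

Because the order is computed from the graph rather than written by hand, adding a new quantity only requires declaring what it depends on; a cyclic declaration makes `TopologicalSorter` raise `graphlib.CycleError`.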
ADVANCED COMPUTATIONAL METHODS IN DOSE MODELING

EPA Science Inventory

The overall goal of the EPA-ORD NERL research program on Computational Toxicology (CompTox) is to provide the Agency with the tools of modern chemistry, biology, and computing to improve quantitative risk assessments and reduce uncertainties in the source-to-adverse outcome conti...
A parameter optimization approach to controller partitioning for integrated flight/propulsion control application

NASA Technical Reports Server (NTRS)

Schmidt, Phillip; Garg, Sanjay; Holowecky, Brian

1992-01-01

A parameter optimization framework is presented to solve the problem of partitioning a centralized controller into a decentralized hierarchical structure suitable for integrated flight/propulsion control implementation. The controller partitioning problem is briefly discussed and a cost function to be minimized is formulated, such that the resulting 'optimal' partitioned subsystem controllers will closely match the performance (including robustness) properties of the closed-loop system with the centralized controller while maintaining the desired controller partitioning structure. The cost function is written in terms of parameters in a state-space representation of the partitioned sub-controllers. Analytical expressions are obtained for the gradient of this cost function with respect to parameters, and an optimization algorithm is developed using modern computer-aided control design and analysis software. The capabilities of the algorithm are demonstrated by application to partitioned integrated flight/propulsion control design for a modern fighter aircraft in the short approach to landing task. The partitioning optimization is shown to lead to reduced-order subcontrollers that match the closed-loop command tracking and decoupling performance achieved by a high-order centralized controller.
A parameter optimization approach to controller partitioning for integrated flight/propulsion control application

NASA Technical Reports Server (NTRS)

Schmidt, Phillip H.; Garg, Sanjay; Holowecky, Brian R.

1993-01-01

A parameter optimization framework is presented to solve the problem of partitioning a centralized controller into a decentralized hierarchical structure suitable for integrated flight/propulsion control implementation. The controller partitioning problem is briefly discussed and a cost function to be minimized is formulated, such that the resulting 'optimal' partitioned subsystem controllers will closely match the performance (including robustness) properties of the closed-loop system with the centralized controller while maintaining the desired controller partitioning structure. The cost function is written in terms of parameters in a state-space representation of the partitioned sub-controllers. Analytical expressions are obtained for the gradient of this cost function with respect to parameters, and an optimization algorithm is developed using modern computer-aided control design and analysis software. The capabilities of the algorithm are demonstrated by application to partitioned integrated flight/propulsion control design for a modern fighter aircraft in the short approach to landing task. The partitioning optimization is shown to lead to reduced-order subcontrollers that match the closed-loop command tracking and decoupling performance achieved by a high-order centralized controller.
An investigation into non-invasive physical activity recognition using smartphones.

PubMed

Kelly, Daniel; Caulfield, Brian

2012-01-01

Technology utilized to automatically monitor Activities of Daily Living (ADL) could be a key component in identifying deviations from normal functional profiles and providing feedback on interventions aimed at improving health. However, if activity recognition systems are to be implemented in real world scenarios such as health and wellness monitoring, the activity sensing modality must unobtrusively fit the human environment rather than forcing humans to adhere to sensor specific conditions. Modern smart phones represent a ubiquitous computing device which has already undergone mainstream adoption. In this paper, we investigate the feasibility of using a modern smartphone, with limited placement constraints, as the sensing modality for an activity recognition system. A dataset of 4 subjects performing 7 activities, using varying sensor placement conditions, is utilized to investigate this. Initial experiments show that a decision tree classifier performs activity classification with precision and recall scores of 0.75 and 0.73 respectively. More importantly, as part of this initial experiment, 3 main problems, and subsequently 3 solutions, relating to unconstrained sensor placement were identified. Using our proposed solutions, classification precision and recall scores were improved by +13% and +14.6% respectively.
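For readers unfamiliar with the two scores reported above, the following sketch computes precision and recall for one activity class from true and predicted labels. The label sequences are made-up illustration data, not the study's dataset, and `precision_recall` is a hypothetical helper.

```python
def precision_recall(y_true, y_pred, positive):
    """Precision and recall for one class, treating `positive` as the target."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels for illustration only.
y_true = ["walk", "walk", "sit", "run", "walk"]
y_pred = ["walk", "sit", "sit", "walk", "walk"]

precision, recall = precision_recall(y_true, y_pred, positive="walk")
```

Precision is the fraction of windows predicted as the activity that really were that activity; recall is the fraction of true instances of the activity that the classifier found.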
Role of computational fluid dynamics in unsteady aerodynamics for aeroelasticity

NASA Technical Reports Server (NTRS)

Guruswamy, Guru P.; Goorjian, Peter M.

1989-01-01

In the last two decades there have been extensive developments in computational unsteady transonic aerodynamics. Such developments are essential since the transonic regime plays an important role in the design of modern aircraft. Therefore, there has been a large effort to develop computational tools with which to accurately perform flutter analysis at transonic speeds. In the area of Computational Fluid Dynamics (CFD), unsteady transonic aerodynamics are characterized by the feature of modeling the motion of shock waves over aerodynamic bodies, such as wings. This modeling requires the solution of nonlinear partial differential equations. Most advanced codes such as XTRAN3S use the transonic small perturbation equation. Currently, XTRAN3S is being used for generic research in unsteady aerodynamics and aeroelasticity of almost full aircraft configurations. Use of Euler/Navier Stokes equations for simple typical sections has just begun. A brief history of the development of CFD for aeroelastic applications is summarized. The development of unsteady transonic aerodynamics and aeroelasticity are also summarized.
Strength computation of forged parts taking into account strain hardening and damage

NASA Astrophysics Data System (ADS)

Cristescu, Michel L.

2004-06-01

Modern non-linear simulation software, such as FORGE 3 (registered trademark of TRANSVALOR), is able to compute the residual stresses, the strain hardening and the damage during the forging process. A thermally dependent elasto-visco-plastic law is used to simulate the behavior of the material of the hot forged piece. A modified Lemaitre law coupled with elasticity, plasticity and thermal effects is used to simulate the damage. After the simulation of the different steps of the forging process, the part is cooled and then virtually machined, in order to obtain the finished part. An elastic computation is then performed to equilibrate the residual stresses, so that we obtain the true geometry of the finished part after machining. The response of the part to the loadings it will sustain during its life is then computed, taking into account the residual stresses, the strain hardening and the damage that occur during forging. This process is illustrated by the forging, virtual machining and stress analysis of an aluminium wheel hub.
Trace: a high-throughput tomographic reconstruction engine for large-scale datasets.

PubMed

Bicer, Tekin; Gürsoy, Doğa; Andrade, Vincent De; Kettimuthu, Rajkumar; Scullin, William; Carlo, Francesco De; Foster, Ian T

2017-01-01

Modern synchrotron light sources and detectors produce data at such scale and complexity that large-scale computation is required to unleash their full power. One of the widely used imaging techniques that generates data at tens of gigabytes per second is computed tomography (CT). Although CT experiments result in rapid data generation, the analysis and reconstruction of the collected data may require hours or even days of computation time with a medium-sized workstation, which hinders the scientific progress that relies on the results of analysis. We present Trace, a data-intensive computing engine that we have developed to enable high-performance implementation of iterative tomographic reconstruction algorithms for parallel computers. Trace provides fine-grained reconstruction of tomography datasets using both (thread-level) shared memory and (process-level) distributed memory parallelization. Trace utilizes a special data structure called replicated reconstruction object to maximize application performance. We also present the optimizations that we apply to the replicated reconstruction objects and evaluate them using tomography datasets collected at the Advanced Photon Source. Our experimental evaluations show that our optimizations and parallelization techniques can provide 158× speedup using 32 compute nodes (384 cores) over a single-core configuration and decrease the end-to-end processing time of a large sinogram (with 4501 × 1 × 22,400 dimensions) from 12.5 h to <5 min per iteration. The proposed tomographic reconstruction engine can efficiently process large-scale tomographic data using many compute nodes and minimize reconstruction times.
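The scaling figures quoted above can be put in perspective with a short calculation. The sketch uses only the numbers stated in the abstract; the helper names are hypothetical, and the time reduction is a lower bound because the abstract reports "<5 min".

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear scaling achieved on the given core count."""
    return speedup / cores


def reduction_factor(before_minutes, after_minutes):
    """How many times shorter the processing time became."""
    return before_minutes / after_minutes


# 158x speedup on 384 cores, relative to a single core.
efficiency = parallel_efficiency(158, 384)

# 12.5 h per iteration reduced to under 5 min per iteration.
min_reduction = reduction_factor(12.5 * 60, 5)

print(f"parallel efficiency: {efficiency:.1%}")
print(f"time reduction: at least {min_reduction:.0f}x")
```

An efficiency of roughly 41% on 384 cores is a reminder that the reported speedup includes communication and synchronization overheads, which is typical for iterative reconstruction spread across many nodes.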
A Bitslice Implementation of Anderson's Attack on A5/1

NASA Astrophysics Data System (ADS)

Bulavintsev, Vadim; Semenov, Alexander; Zaikin, Oleg; Kochemazov, Stepan

2018-03-01

The A5/1 keystream generator is a part of Global System for Mobile Communications (GSM) protocol, employed in cellular networks all over the world. Its cryptographic resistance was extensively analyzed in dozens of papers. However, almost all corresponding methods either employ a specific hardware or require an extensive preprocessing stage and significant amounts of memory. In the present study, a bitslice variant of Anderson's Attack on A5/1 is implemented. It requires very little computer memory and no preprocessing. Moreover, the attack can be made even more efficient by harnessing the computing power of modern Graphics Processing Units (GPUs). As a result, using commonly available GPUs this method can quite efficiently recover the secret key using only 64 bits of keystream. To test the performance of the implementation, a volunteer computing project was launched. 10 instances of A5/1 cryptanalysis have been successfully solved in this project in a single week.
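Bitslicing packs the same bit position of many independent cipher instances into one machine word, so that a single bitwise instruction processes all instances at once. The sketch below illustrates this for the majority function that A5/1 uses for irregular clocking (background knowledge, not stated in the abstract). It is an illustration of the technique only: the helper names are hypothetical and the authors' actual kernel is not described in the abstract.

```python
MASK64 = (1 << 64) - 1  # emulate a 64-bit machine word


def majority_bitsliced(a, b, c):
    """Majority of three bits, computed for up to 64 instances in parallel."""
    return ((a & b) | (a & c) | (b & c)) & MASK64


def pack(bits):
    """Pack one 0/1 value per instance into a word; instance i goes to bit i."""
    word = 0
    for i, bit in enumerate(bits):
        word |= (bit & 1) << i
    return word


def unpack(word, count):
    """Recover the per-instance bits from a bitsliced word."""
    return [(word >> i) & 1 for i in range(count)]


# Clocking bits of the three registers for four independent instances.
r1 = pack([0, 1, 1, 0])
r2 = pack([0, 0, 1, 1])
r3 = pack([1, 0, 1, 0])

majorities = unpack(majority_bitsliced(r1, r2, r3), 4)
```

Three AND operations and two OR operations yield the majority bit for every packed instance, which is why the approach needs little memory and maps well onto wide registers and GPUs.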
Vectorization with SIMD extensions speeds up reconstruction in electron tomography.

PubMed

Agulleiro, J I; Garzón, E M; García, I; Fernández, J J

2010-06-01

Electron tomography allows structural studies of cellular structures at molecular detail. Large 3D reconstructions are needed to meet the resolution requirements. The processing time to compute these large volumes may be considerable, and so high-performance computing techniques have traditionally been used. This work presents a vector approach to tomographic reconstruction that relies on the exploitation of the SIMD extensions available in modern processors in combination with other single-processor optimization techniques. This approach succeeds in producing full-resolution tomograms with a substantial reduction in processing time, as evaluated with the most common reconstruction algorithms, namely WBP and SIRT. The main advantage stems from the fact that this approach runs on standard computers without the need for specialized hardware, which facilitates the development, use and management of programs. Future trends in processor design open excellent opportunities for vector processing with processors' SIMD extensions in the field of 3D electron microscopy.
GPU accelerated implementation of NCI calculations using promolecular density.

PubMed

Rubez, Gaëtan; Etancelin, Jean-Matthieu; Vigouroux, Xavier; Krajecki, Michael; Boisson, Jean-Charles; Hénon, Eric

2017-05-30

The NCI approach is a modern tool to reveal chemical noncovalent interactions. It is particularly attractive to describe ligand-protein binding. A custom implementation for NCI using promolecular density is presented. It is designed to leverage the computational power of NVIDIA graphics processing unit (GPU) accelerators through the CUDA programming model. The performance of three versions of the code is examined on a test set of 144 systems. NCI calculations are particularly well suited to the GPU architecture, which drastically reduces the computational time. On a single compute node, the dual-GPU version leads to a 39-fold improvement for the biggest instance compared to the optimal OpenMP parallel run (C code, icc compiler) with 16 CPU cores. Energy consumption measurements carried out on both CPU and GPU NCI tests show that the GPU approach provides substantial energy savings. © 2017 Wiley Periodicals, Inc.
[HYGIENIC REGULATION OF THE USE OF ELECTRONIC EDUCATIONAL RESOURCES IN THE MODERN SCHOOL].

PubMed

Stepanova, M I; Aleksandrova, I E; Sazanyuk, Z I; Voronova, B Z; Lashneva, L P; Shumkova, T V; Berezina, N O

2015-01-01

We studied the effect of academic studies using a notebook computer and an interactive whiteboard on the functional state of schoolchildren. Using a complex of hygienic and physiological methods, we established that regulation of students' computer activity must take into account not only its duration but also its intensity. Design features of a notebook computer were shown both to impede keeping the optimal working posture in primary school children and to increase the risk of disorders of vision and the musculoskeletal system. The interactive whiteboard was found to have an activating influence on performance, with favorable dynamics of indices of the students' functional state, provided that the optimal density of the academic study and the duration of whiteboard use are maintained. Safety regulations for schoolchildren's work with electronic resources in the educational process were determined.
Parallelization of the Coupled Earthquake Model

NASA Technical Reports Server (NTRS)

Block, Gary; Li, P. Peggy; Song, Yuhe T.

2007-01-01

This Web-based tsunami simulation system allows users to remotely run a model on JPL's supercomputers for a given undersea earthquake. At the time of this reporting, tsunami prediction over the Internet had not been done before. This new code directly couples the earthquake model and the ocean model on parallel computers and improves simulation speed. Seismometers can only detect information from earthquakes; they cannot detect whether or not a tsunami may occur as a result of the earthquake. When earthquake-tsunami models are coupled with the improved computational speed of modern, high-performance computers and constrained by remotely sensed data, they are able to provide early warnings for those coastal regions at risk. The software is capable of testing NASA's satellite observations of tsunamis. It has been successfully tested for several historical tsunamis, has passed all alpha and beta testing, and is well documented for users.
Diffraction scattering computed tomography: a window into the structures of complex nanomaterials

PubMed Central

Birkbak, M. E.; Leemreize, H.; Frølich, S.; Stock, S. R.

2015-01-01

Modern functional nanomaterials and devices are increasingly composed of multiple phases arranged in three dimensions over several length scales. Therefore there is a pressing demand for improved methods for structural characterization of such complex materials. An excellent emerging technique that addresses this problem is diffraction/scattering computed tomography (DSCT). DSCT combines the merits of diffraction and/or small angle scattering with computed tomography to allow imaging the interior of materials based on the diffraction or small angle scattering signals. This allows, e.g., one to distinguish the distributions of polymorphs in complex mixtures. Here we review this technique and give examples of how it can shed light on modern nanoscale materials. PMID:26505175
Computational Skills for Biology Students

ERIC Educational Resources Information Center

Gross, Louis J.

2008-01-01

This interview with Distinguished Science Award recipient Louis J. Gross highlights essential computational skills for modern biology, including: (1) teaching concepts listed in the Math & Bio 2010 report; (2) illustrating to students that jobs today require quantitative skills; and (3) resources and materials that focus on computational skills.
Human performance in the modern cockpit

NASA Technical Reports Server (NTRS)

Dismukes, R. K.; Cohen, M. M.

1992-01-01

This panel was organized by the Aerospace Human Factors Committee to illustrate behavioral research on the perceptual, cognitive, and group processes that determine crew effectiveness in modern cockpits. Crew reactions to the introduction of highly automated systems in the cockpit will be reported on. Automation can improve operational capabilities and efficiency and can reduce some types of human error, but may also introduce entirely new opportunities for error. The problem solving and decision making strategies used by crews led by captains with various personality profiles will be discussed. Also presented will be computational approaches to modeling the cognitive demands of cockpit operations and the cognitive capabilities and limitations of crew members. Factors contributing to aircrew deviations from standard operating procedures and misuse of checklist, often leading to violations, incidents, or accidents will be examined. The mechanisms of visual perception pilots use in aircraft control and the implications of these mechanisms for effective design of visual displays will be discussed.
Application of laminar flow control to high-bypass-ratio turbofan engine nacelles

NASA Technical Reports Server (NTRS)

Wie, Y. S.; Collier, F. S., Jr.; Wagner, R. D.

1991-01-01

Recently, the concept of the application of hybrid laminar flow to modern commercial transport aircraft was successfully flight tested on a Boeing 757 aircraft. In this limited demonstration, in which only part of the upper surface of the swept wing was designed for the attainment of laminar flow, significant local drag reduction was measured. This paper addresses the potential application of this technology to laminarize the external surface of large, modern turbofan engine nacelles which may comprise as much as 5-10 percent of the total wetted area of future commercial transports. A hybrid-laminar-flow-control (HLFC) pressure distribution is specified and the corresponding nacelle geometry is computed utilizing a predictor/corrector design method. Linear stability calculations are conducted to provide predictions of the extent of the laminar boundary layer. Performance studies are presented to determine potential benefits in terms of reduced fuel consumption.
Virtopsy: postmortem imaging of laryngeal foreign bodies.

PubMed

Oesterhelweg, Lars; Bolliger, Stephan A; Thali, Michael J; Ross, Steffen

2009-05-01

Death from corpora aliena in the larynx is a well-known entity in forensic pathology. The correct diagnosis of this cause of death is difficult without an autopsy, and misdiagnoses by external examination alone are common. To determine the postmortem usefulness of modern imaging techniques in the diagnosis of foreign bodies in the larynx, multislice computed tomography, magnetic resonance imaging, and postmortem full-body computed tomography-angiography were performed. Three decedents with a suspected foreign body in the larynx underwent the 3 different imaging techniques before medicolegal autopsy. Multislice computed tomography has a high diagnostic value in the noninvasive localization of a foreign body and abnormalities in the larynx. The differentiation between neoplasm or soft foreign bodies (eg, food) is possible, but difficult, by unenhanced multislice computed tomography. By magnetic resonance imaging, the discrimination of the soft tissue structures and soft foreign bodies is much easier. In addition to the postmortem multislice computed tomography, the combination with postmortem angiography will increase the diagnostic value. Postmortem, cross-sectional imaging methods are highly valuable procedures for the noninvasive detection of corpora aliena in the larynx.
Performance comparison of plastic shopping bags in modern and traditional retail

NASA Astrophysics Data System (ADS)

Radini, F. A.; Wulandari, R.; Nasiri, S. J. A.; Winarto, D. A.

2017-07-01

Following the implementation of a paid plastic bag policy in Indonesia’s modern and traditional retail, questions have arisen in the community about plastic shopping bag performance, but little information is available. This study therefore assesses and compares the performance of plastic shopping bags from modern retail and traditional retail. The performance parameters observed were weight-holding capacity, tear resistance and elongation, tested using a Universal Testing Machine. Physical and physico-chemical properties were also identified to determine the factors affecting the performance of the plastic shopping bags. The physical properties were analysed by visual inspection and with a thickness gauge to determine the colour and measure the thickness. The physico-chemical properties were analysed using DSC (Differential Scanning Calorimetry), TGA (Thermogravimetric Analysis), a furnace and FTIR (Fourier Transform Infrared Spectroscopy) to identify the materials and their melting and decomposition temperatures. The results showed that the performance difference between modern retail and traditional retail plastic bags appears only in elongation. The elongation of modern retail plastic bags is 121 - 413%, while that of traditional retail bags is 170 - 609%. According to the physico-chemical test results, both modern retail and traditional retail plastic bags contain polyethylene as the main material and have melting temperatures in the range of High Density Polyethylene (HDPE). However, modern retail plastic bags have an inorganic filler content of 18.31 - 33.87%, whereas traditional retail plastic bags have 0.35 - 9.85%. This inorganic filler content is probably a contributing factor in the difference in elongation performance between modern retail and traditional retail plastic bags.

Porting plasma physics simulation codes to modern computing architectures using the libmrc framework

NASA Astrophysics Data System (ADS)

Germaschewski, Kai; Abbott, Stephen

2015-11-01

Available computing power has continued to grow exponentially even after single-core performance saturated in the last decade. The increase has since been driven by more parallelism, both using more cores and having more parallelism in each core, e.g. in GPUs and Intel Xeon Phi. Adapting existing plasma physics codes is challenging, in particular as there is no single programming model that covers current and future architectures. We will introduce the open-source libmrc framework that has been used to modularize and port three plasma physics codes: the extended MHD code MRCv3 with implicit time integration and curvilinear grids; the OpenGGCM global magnetosphere model; and the particle-in-cell code PSC. libmrc consolidates basic functionality needed for simulations based on structured grids (I/O, load balancing, time integrators), and also introduces a parallel object model that makes it possible to maintain multiple implementations of computational kernels, e.g. on conventional processors and GPUs. It handles data layout conversions and enables us to port performance-critical parts of a code to a new architecture step-by-step, while the rest of the code can remain unchanged. We will show examples of the performance gains and some physics applications.
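The idea of keeping several implementations of one computational kernel side by side, and porting one kernel at a time, can be sketched as a small registry keyed by kernel name and backend. This is not the libmrc API (libmrc is not a Python library); `register`, `run`, the backend labels and the `axpy` kernel are hypothetical names chosen purely for illustration.

```python
KERNELS = {}  # maps (kernel name, backend) -> implementation


def register(name, backend):
    """Decorator that files an implementation under (name, backend)."""
    def wrap(fn):
        KERNELS[(name, backend)] = fn
        return fn
    return wrap


@register("axpy", "reference")
def axpy_reference(a, x, y):
    """Straightforward implementation: a * x + y, element by element."""
    return [a * xi + yi for xi, yi in zip(x, y)]


@register("axpy", "blocked")
def axpy_blocked(a, x, y, block=4):
    """Same arithmetic, processed in fixed-size blocks (a stand-in for a tuned port)."""
    out = []
    for start in range(0, len(x), block):
        xs, ys = x[start:start + block], y[start:start + block]
        out.extend(a * xi + yi for xi, yi in zip(xs, ys))
    return out


def run(name, backend, *args):
    """Use the requested backend if it has been ported, else fall back to the reference."""
    fn = KERNELS.get((name, backend), KERNELS[(name, "reference")])
    return fn(*args)
```

With a fallback of this kind, a kernel that has not yet been ported to a given backend keeps running through its reference version, which is one way to read the abstract's "step-by-step" porting.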
Evolution of computational chemistry, the "launch pad" to scientific computational models: The early days from a personal account, the present status from the TACC-2012 congress, and eventual future applications from the global simulation approach

NASA Astrophysics Data System (ADS)

Clementi, Enrico

2012-06-01

This is the introductory chapter to the AIP Proceedings volume "Theory and Applications of Computational Chemistry: The First Decade of the Second Millennium" where we discuss the evolution of "computational chemistry". Very early variational computational chemistry developments are reported in Sections 1 to 7, and 11, 12 by recalling some of the computational chemistry contributions by the author and his collaborators (from late 1950 to mid 1990); perturbation techniques are not considered in this already extended work. Present-day computational chemistry is partly considered in Sections 8 to 10 where more recent studies by the author and his collaborators are discussed, including the Hartree-Fock-Heitler-London method; a more general discussion on present-day computational chemistry is presented in Section 14. The following chapters of this AIP volume provide a view of modern computational chemistry. Future computational chemistry developments can be extrapolated from the chapters of this AIP volume; further, Sections 13 and 15 present an overall analysis of computational chemistry, obtained from the Global Simulation approach, by considering the evolution of scientific knowledge confronted with the opportunities offered by modern computers.
Workflows for Full Waveform Inversions

NASA Astrophysics Data System (ADS)

Boehm, Christian; Krischer, Lion; Afanasiev, Michael; van Driel, Martin; May, Dave A.; Rietmann, Max; Fichtner, Andreas

2017-04-01

Despite many theoretical advances and the increasing availability of high-performance computing clusters, full seismic waveform inversions still face considerable challenges regarding data and workflow management. While the community has access to solvers which can harness modern heterogeneous computing architectures, the computational bottleneck has fallen to these often manpower-bounded issues that need to be overcome to facilitate further progress. Modern inversions involve huge amounts of data and require a tight integration between numerical PDE solvers, data acquisition and processing systems, nonlinear optimization libraries, and job orchestration frameworks. To this end we created a set of libraries and applications revolving around Salvus (http://salvus.io), a novel software package designed to solve large-scale full waveform inverse problems. This presentation focuses on solving passive source seismic full waveform inversions from local to global scales with Salvus. We discuss (i) design choices for the aforementioned components required for full waveform modeling and inversion, (ii) their implementation in the Salvus framework, and (iii) how it is all tied together by a usable workflow system. We combine state-of-the-art algorithms ranging from high-order finite-element solutions of the wave equation to quasi-Newton optimization algorithms using trust-region methods that can handle inexact derivatives. All is steered by an automated interactive graph-based workflow framework capable of orchestrating all necessary pieces. This naturally facilitates the creation of new Earth models and hopefully sparks new scientific insights. Additionally, and even more importantly, it enhances reproducibility and reliability of the final results.
MPACT Standard Input User's Manual, Version 2.2.0

DOE Office of Scientific and Technical Information (OSTI.GOV)

Collins, Benjamin S.; Downar, Thomas; Fitzgerald, Andrew

The MPACT (Michigan PArallel Characteristics based Transport) code is designed to perform high-fidelity light water reactor (LWR) analysis using whole-core pin-resolved neutron transport calculations on modern parallel-computing hardware. The code consists of several libraries which provide the functionality necessary to solve steady-state eigenvalue problems. Several transport capabilities are available within MPACT including both 2-D and 3-D Method of Characteristics (MOC). A three-dimensional whole core solution based on the 2D-1D solution method provides the capability for full core depletion calculations.
Virtual file system on NoSQL for processing high volumes of HL7 messages.

PubMed

Kimura, Eizen; Ishihara, Ken

2015-01-01

The Standardized Structured Medical Information Exchange (SS-MIX) is intended to be the standard repository for HL7 messages that depend on a local file system. However, its scalability is limited. We implemented a virtual file system using NoSQL to incorporate modern computing technology into SS-MIX and allow the system to integrate local patient IDs from different healthcare systems into a universal system. We discuss its implementation using the database MongoDB and describe its performance in a case study.
Electron tubes for industrial applications

NASA Astrophysics Data System (ADS)

Gellert, Bernd

1994-05-01

This report reviews research and development efforts in recent years for vacuum electron tubes, in particular power grid tubes for industrial applications. Physical and chemical effects are discussed that determine the performance of today's devices. Due to the progress made in the fundamental understanding of materials and newly developed processes, the reliability and reproducibility of power grid tubes could be improved considerably. Modern computer-controlled manufacturing methods ensure a high reproducibility of production, and continuous quality certification according to ISO 9001 guarantees future high quality standards. Some typical applications of these tubes are given as an example.
Applications of an architecture design and assessment system (ADAS)

NASA Technical Reports Server (NTRS)

Gray, F. Gail; Debrunner, Linda S.; White, Tennis S.

1988-01-01

A new Architecture Design and Assessment System (ADAS) tool package is introduced, and a range of possible applications is illustrated. ADAS was used to evaluate the performance of an advanced fault-tolerant computer architecture in a modern flight control application. Bottlenecks were identified and possible solutions suggested. The tool was also used to inject faults into the architecture and evaluate the synchronization algorithm, and improvements are suggested. Finally, ADAS was used as a front end research tool to aid in the design of reconfiguration algorithms in a distributed array architecture.
Global magnetohydrodynamic simulations on multiple GPUs

NASA Astrophysics Data System (ADS)

Wong, Un-Hong; Wong, Hon-Cheng; Ma, Yonghui

2014-01-01

Global magnetohydrodynamic (MHD) models play the major role in investigating the solar wind-magnetosphere interaction. However, the huge computation requirement in global MHD simulations is also the main problem that needs to be solved. With the recent development of modern graphics processing units (GPUs) and the Compute Unified Device Architecture (CUDA), it is possible to perform global MHD simulations in a more efficient manner. In this paper, we present a global magnetohydrodynamic (MHD) simulator on multiple GPUs using CUDA 4.0 with GPUDirect 2.0. Our implementation is based on the modified leapfrog scheme, which is a combination of the leapfrog scheme and the two-step Lax-Wendroff scheme. GPUDirect 2.0 is used in our implementation to drive multiple GPUs. All data transferring and kernel processing are managed with CUDA 4.0 API instead of using MPI or OpenMP. Performance measurements are made on a multi-GPU system with eight NVIDIA Tesla M2050 (Fermi architecture) graphics cards. These measurements show that our multi-GPU implementation achieves a peak performance of 97.36 GFLOPS in double precision.
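The abstract names the two-step Lax-Wendroff scheme as one ingredient of the modified leapfrog scheme. As a minimal sketch of that ingredient only (not the authors' MHD solver, and not the full modified leapfrog scheme), the code below applies the two-step Lax-Wendroff update to 1D linear advection with periodic boundaries. The function name and the choice of test equation are assumptions made for illustration.

```python
def lax_wendroff_step(u, speed, dt, dx):
    """One two-step Lax-Wendroff update for u_t + speed * u_x = 0, periodic in x."""
    n = len(u)
    flux = [speed * value for value in u]
    # Predictor: provisional values at the half time level, located at interfaces i + 1/2.
    half = [0.5 * (u[i] + u[(i + 1) % n])
            - 0.5 * dt / dx * (flux[(i + 1) % n] - flux[i])
            for i in range(n)]
    half_flux = [speed * value for value in half]
    # Corrector: conservative update using the interface fluxes;
    # half_flux[i - 1] wraps to the last interface when i == 0.
    return [u[i] - dt / dx * (half_flux[i] - half_flux[i - 1]) for i in range(n)]
```

For linear advection with `speed * dt / dx` equal to 1, the update reduces to shifting the profile by exactly one cell, which gives a convenient sanity check. Each cell depends only on its immediate neighbours, the kind of stencil locality that GPU kernels exploit.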
A Prospectus for the Future Development of a Speech Lab: Hypertext Applications.

ERIC Educational Resources Information Center

Berube, David M.

This paper presents a plan for the next generation of speech laboratories which integrates technologies of modern communication in order to improve and modernize the instructional process. The paper first examines the application of intermediate technologies including audio-video recording and playback, computer assisted instruction and testing…
MODERN LINGUISTICS, ITS DEVELOPMENT AND SCOPE.

ERIC Educational Resources Information Center

LEVIN, SAMUEL R.

THE DEVELOPMENT OF MODERN LINGUISTICS STARTED WITH JONES' DISCOVERY IN 1786 THAT SANSKRIT IS CLOSELY RELATED TO THE CLASSICAL, GERMANIC, AND CELTIC LANGUAGES, AND HAS ADVANCED TO INCLUDE THE APPLICATION OF COMPUTERS IN LANGUAGE ANALYSIS. THE HIGHLIGHTS OF LINGUISTIC RESEARCH HAVE BEEN DE SAUSSURE'S DISTINCTION BETWEEN THE DIACHRONIC AND THE…
Energy Systems Integration Facility (ESIF): Golden, CO - Energy Integration

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sheppy, Michael; VanGeet, Otto; Pless, Shanti

2015-03-01

At NREL's Energy Systems Integration Facility (ESIF) in Golden, Colo., scientists and engineers work to overcome challenges related to how the nation generates, delivers and uses energy by modernizing the interplay between energy sources, infrastructure, and data. Test facilities include a megawatt-scale ac electric grid, photovoltaic simulators and a load bank. Additionally, a high performance computing data center (HPCDC) is dedicated to advancing renewable energy and energy efficient technologies. A key design strategy is to use waste heat from the HPCDC to heat parts of the building. The ESIF boasts an annual EUI of 168.3 kBtu/ft². This article describes the building's procurement, design and first year of performance.
Modernizing the Federal Government: Paying for Performance

DTIC Science & Technology

2007-01-01

works (Barr, 2007d). Employees are rated on performance measures such as “fair and equitable treatment of taxpayers” and “customer satisfaction ... Performance Act of 2007, Senate bill 1046, Washington, D.C., 2007b. ... Vroom, Victor H...
Quantitative Investigation of the Technologies That Support Cloud Computing

ERIC Educational Resources Information Center

Hu, Wenjin

2014-01-01

Cloud computing is dramatically shaping modern IT infrastructure. It virtualizes computing resources, provides elastic scalability, serves as a pay-as-you-use utility, simplifies the IT administrators' daily tasks, enhances the mobility and collaboration of data, and increases user productivity. We focus on providing generalized black-box…
Graphical User Interface Programming in Introductory Computer Science.

ERIC Educational Resources Information Center

Skolnick, Michael M.; Spooner, David L.

Modern computing systems exploit graphical user interfaces for interaction with users; as a result, introductory computer science courses must begin to teach the principles underlying such interfaces. This paper presents an approach to graphical user interface (GUI) implementation that is simple enough for beginning students to understand, yet…
Coordinating Technological Resources in a Non-Technical Profession: The Administrative Computer User Group.

ERIC Educational Resources Information Center

Rollo, J. Michael; Marmarchev, Helen L.

1999-01-01

The explosion of computer applications in the modern workplace has required student affairs professionals to keep pace with technological advances for office productivity. This article recommends establishing administrative computer user groups, utilizing coordinated web site development, and enhancing working relationships as ways of dealing…
Computer-Based Molecular Modelling: Finnish School Teachers' Experiences and Views

ERIC Educational Resources Information Center

Aksela, Maija; Lundell, Jan

2008-01-01

Modern computer-based molecular modelling opens up new possibilities for chemistry teaching at different levels. This article presents a case study seeking insight into Finnish school teachers' use of computer-based molecular modelling in teaching chemistry, into the different working and teaching methods used, and their opinions about necessary…
Computer Problem-Solving Coaches for Introductory Physics: Design and Usability Studies

ERIC Educational Resources Information Center

Ryan, Qing X.; Frodermann, Evan; Heller, Kenneth; Hsu, Leonardo; Mason, Andrew

2016-01-01

The combination of modern computing power, the interactivity of web applications, and the flexibility of object-oriented programming may finally be sufficient to create computer coaches that can help students develop metacognitive problem-solving skills, an important competence in our rapidly changing technological society. However, no matter how…
Introduction to the theory of machines and languages

DOE Office of Scientific and Technical Information (OSTI.GOV)

Weidhaas, P. P.

1976-04-01

This text is intended to be an elementary ''guided tour'' through some basic concepts of modern computer science. Various models of computing machines and formal languages are studied in detail. Discussions center around questions such as, ''What is the scope of problems that can or cannot be solved by computers.''
Computer-Aided Instruction in Automated Instrumentation.

ERIC Educational Resources Information Center

Stephenson, David T.

1986-01-01

Discusses functions of automated instrumentation systems, i.e., systems which combine electrical measuring instruments and a controlling computer to measure responses of a unit under test. The computer-assisted tutorial then described is programmed for use on such a system--a modern microwave spectrum analyzer--to introduce engineering students to…
Control Law Design in a Computational Aeroelasticity Environment

NASA Technical Reports Server (NTRS)

Newsom, Jerry R.; Robertshaw, Harry H.; Kapania, Rakesh K.

2003-01-01

A methodology for designing active control laws in a computational aeroelasticity environment is given. The methodology involves employing a systems identification technique to develop an explicit state-space model for control law design from the output of a computational aeroelasticity code. The particular computational aeroelasticity code employed in this paper solves the transonic small disturbance aerodynamic equation using a time-accurate, finite-difference scheme. Linear structural dynamics equations are integrated simultaneously with the computational fluid dynamics equations to determine the time responses of the structure. These structural responses are employed as the input to a modern systems identification technique that determines the Markov parameters of an "equivalent linear system". The Eigensystem Realization Algorithm is then employed to develop an explicit state-space model of the equivalent linear system. The Linear Quadratic Gaussian control law design technique is employed to design a control law. The computational aeroelasticity code is modified to accept control laws and perform closed-loop simulations. Flutter control of a rectangular wing model is chosen to demonstrate the methodology. Various cases are used to illustrate the usefulness of the methodology as the nonlinearity of the aeroelastic system is increased through increased angle-of-attack changes.
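For readers unfamiliar with Markov parameters, the sketch below shows what they are for a known discrete-time, single-input single-output system: the sequence C B, C A B, C A² B, and so on, which is also the system's impulse response. It then arranges them into the Hankel matrix that the Eigensystem Realization Algorithm would factor with a singular value decomposition; that factorization step is deliberately omitted here. This is a toy illustration under those assumptions, with hypothetical function names, and is not the identification procedure used in the paper.

```python
def markov_parameters(A, B, C, count):
    """Return [C B, C A B, C A^2 B, ...] for x[k+1] = A x[k] + B u[k], y[k] = C x[k]."""
    n = len(A)
    state = list(B)                      # holds A^k B, starting with k = 0
    params = []
    for _ in range(count):
        params.append(sum(C[i] * state[i] for i in range(n)))
        state = [sum(A[i][j] * state[j] for j in range(n)) for i in range(n)]
    return params


def hankel(params, rows, cols):
    """Hankel matrix with entry (i, j) equal to the Markov parameter at index i + j."""
    return [[params[i + j] for j in range(cols)] for i in range(rows)]
```

For a noise-free linear system the rank of this Hankel matrix equals the order of a minimal state-space model, which is why ERA can recover a state-space model from response data alone.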

Instruction-Level Characterization of Scientific Computing Applications Using Hardware Performance Counters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Luo, Y.; Cameron, K.W.

1998-11-24

Workload characterization has proven to be an essential tool for architecture design and performance evaluation in both scientific and commercial computing areas. Traditional workload characterization techniques include FLOPS rate, cache miss ratios, CPI (cycles per instruction) or IPC (instructions per cycle), etc. With the complexity of sophisticated modern superscalar microprocessors, these traditional characterization techniques are not powerful enough to pinpoint the performance bottleneck of an application on a specific microprocessor. They are also incapable of immediately demonstrating the potential performance benefit of any architectural or functional improvement in a new processor design. To solve these problems, many people rely on simulators, which have substantial constraints especially on large-scale scientific computing applications. This paper presents a new technique of characterizing applications at the instruction level using hardware performance counters. It has the advantage of collecting instruction-level characteristics in a few runs virtually without overhead or slowdown. A variety of instruction counts can be utilized to calculate some average abstract workload parameters corresponding to microprocessor pipelines or functional units. Based on the microprocessor architectural constraints and these calculated abstract parameters, the architectural performance bottleneck for a specific application can be estimated. In particular, the analysis results can provide some insight into the problem that only a small percentage of processor peak performance can be achieved even for many very cache-friendly codes. Meanwhile, the bottleneck estimation can provide suggestions about viable architectural/functional improvement for certain workloads. Eventually, these abstract parameters can lead to the creation of an analytical microprocessor pipeline model and memory hierarchy model.
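The traditional metrics named in the abstract, and simple instruction-mix fractions of the kind that could feed "abstract workload parameters", follow directly from raw counter values. The sketch below assumes the counts have already been collected; the dictionary keys are hypothetical labels, not the event names of any particular processor or tool, and the specific fractions are our illustration rather than the paper's parameter set.

```python
def derived_metrics(counters):
    """Compute CPI, IPC, a cache miss ratio and instruction-mix fractions from raw counts."""
    cycles = counters["cycles"]
    instructions = counters["instructions"]
    return {
        "cpi": cycles / instructions,                       # cycles per instruction
        "ipc": instructions / cycles,                       # instructions per cycle
        "l1_miss_ratio": counters["l1_misses"] / counters["l1_accesses"],
        "fp_fraction": counters["fp_instructions"] / instructions,
        "mem_fraction": (counters["loads"] + counters["stores"]) / instructions,
    }
```

Comparing a fraction such as `mem_fraction` against the number of load/store units a processor can issue to per cycle is one simple way to turn such counts into a bottleneck estimate.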
Hydrodynamic Simulations of Protoplanetary Disks with GIZMO

NASA Astrophysics Data System (ADS)

Rice, Malena; Laughlin, Greg

2018-01-01

Over the past several decades, the field of computational fluid dynamics has rapidly advanced as the range of available numerical algorithms and computationally feasible physical problems has expanded. The development of modern numerical solvers has provided a compelling opportunity to reconsider previously obtained results in search for yet undiscovered effects that may be revealed through longer integration times and more precise numerical approaches. In this study, we compare the results of past hydrodynamic disk simulations with those obtained from modern analytical resources. We focus our study on the GIZMO code (Hopkins 2015), which uses meshless methods to solve the homogeneous Euler equations of hydrodynamics while eliminating problems arising as a result of advection between grid cells. By comparing modern simulations with prior results, we hope to provide an improved understanding of the impact of fluid mechanics upon the evolution of protoplanetary disks.
Flight test results for the Digital Integrated Automatic Landing Systems (DIALS): A modern control full-state feedback design

NASA Technical Reports Server (NTRS)

Hueschen, R. M.

1984-01-01

The Digital Integrated Automatic Landing System (DIALS) is discussed. The DIALS is a modern control theory design performing all the maneuver modes associated with current autoland systems: localizer capture and track, glideslope capture and track, decrab, and flare. The DIALS is an integrated full-state feedback system which was designed using direct-digital methods. The DIALS uses standard aircraft sensors and the digital Microwave Landing System (MLS) signals as measurements. It consists of separately designed longitudinal and lateral channels, although some cross-coupling variables are fed between channels for improved state estimates and trajectory commands. The DIALS was implemented within the 16-bit fixed-point flight computers of the ATOPS research aircraft, a small twin-jet commercial transport outfitted with a second research cockpit and a fly-by-wire system. The DIALS became the first modern control theory design to be successfully flight tested on a commercial-type aircraft. Flight tests were conducted in late 1981 using a wide-coverage MLS on Runway 22 at Wallops Flight Center. All the modes were exercised, including the capture and track of steep glideslopes up to 5 degrees.
Analytical and flight investigation of the influence of rotor and other high-order dynamics on helicopter flight-control system bandwidth

NASA Technical Reports Server (NTRS)

Chen, R. T. N.; Hindson, W. S.

1985-01-01

The increasing use of highly augmented digital flight-control systems in modern military helicopters prompted an examination of the influence of rotor dynamics and other high-order dynamics on control-system performance. A study was conducted at NASA Ames Research Center to correlate theoretical predictions of feedback gain limits in the roll axis with experimental test data obtained from a variable-stability research helicopter. Feedback gains, the break frequency of the presampling sensor filter, and the computational frame time of the flight computer were systematically varied. The results, which showed excellent theoretical and experimental correlation, indicate that the rotor-dynamics, sensor-filter, and digital-data processing delays can severely limit the usable values of the roll-rate and roll-attitude feedback gains.
Techniques of EMG signal analysis: detection, processing, classification and applications

PubMed Central

Hussain, M.S.; Mohd-Yasin, F.

2006-01-01

Electromyography (EMG) signals can be used for clinical/biomedical applications, Evolvable Hardware Chip (EHW) development, and modern human computer interaction. EMG signals acquired from muscles require advanced methods for detection, decomposition, processing, and classification. The purpose of this paper is to illustrate the various methodologies and algorithms for EMG signal analysis to provide efficient and effective ways of understanding the signal and its nature. We further point up some of the hardware implementations using EMG focusing on applications related to prosthetic hand control, grasp recognition, and human computer interaction. A comparison study is also given to show performance of various EMG signal analysis methods. This paper provides researchers a good understanding of EMG signal and its analysis procedures. This knowledge will help them develop more powerful, flexible, and efficient applications. PMID:16799694
Many-core graph analytics using accelerated sparse linear algebra routines

NASA Astrophysics Data System (ADS)

Kozacik, Stephen; Paolini, Aaron L.; Fox, Paul; Kelmelis, Eric

2016-05-01

Graph analytics is a key component in identifying emerging trends and threats in many real-world applications. Large-scale graph analytics frameworks provide a convenient and highly scalable platform for developing algorithms to analyze large datasets. Although conceptually scalable, these techniques exhibit poor performance on modern computational hardware. Another model of graph computation has emerged that promises improved performance and scalability by using abstract linear algebra operations as the basis for graph analysis, as laid out by the GraphBLAS standard. By using sparse linear algebra as the basis, existing highly efficient algorithms can be adapted to perform computations on the graph. This approach, however, is often less intuitive to graph analytics experts, who are accustomed to vertex-centric APIs such as Giraph, GraphX, and Tinkerpop. We are developing an implementation of the high-level operations supported by these APIs in terms of linear algebra operations. This implementation is backed by many-core implementations of the fundamental GraphBLAS operations required, and offers the advantages of both the intuitive programming model of a vertex-centric API and the performance of a sparse linear algebra implementation. This technology can reduce the number of nodes required, as well as the run time for a graph analysis problem, enabling customers to perform more complex analysis with less hardware at lower cost. All of this can be accomplished without requiring customers to make any changes to their analytics code, thanks to the compatibility with existing graph APIs.
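To illustrate the linear-algebra view of graph analysis mentioned in this abstract, the following Python sketch expresses breadth-first search as repeated products of a sparse frontier vector with a sparse adjacency matrix over a Boolean (OR, AND) semiring. It is a plain-Python illustration of the idea only: it does not use the GraphBLAS API or the authors' implementation, and the function name bfs_levels and the example graph are made up here.

```python
def bfs_levels(adj, source):
    """Breadth-first search written as repeated sparse vector-matrix products.

    adj maps each vertex to the set of its out-neighbours, i.e. the nonzero
    columns of that vertex's row in a Boolean adjacency matrix.
    Returns a dict mapping each reachable vertex to its BFS level.
    """
    levels = {source: 0}
    frontier = {source}          # sparse Boolean vector: set of nonzero indices
    depth = 0
    while frontier:
        depth += 1
        # frontier^T * A over the (OR, AND) semiring is the union of the
        # adjacency rows selected by the nonzero entries of the frontier.
        reached = set()
        for u in frontier:
            reached |= adj.get(u, set())
        # Masking step: keep only vertices that have no level yet.
        frontier = {v for v in reached if v not in levels}
        for v in frontier:
            levels[v] = depth
    return levels


# Small made-up directed graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3, 3 -> 4.
graph = {0: {1, 2}, 1: {3}, 2: {3}, 3: {4}, 4: set()}
result = bfs_levels(graph, 0)
print(result)
```

Each loop iteration is one masked vector-matrix product, so swapping in an optimized sparse kernel would change the speed of the traversal but not the algorithm a user writes against.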
Improvement and speed optimization of numerical tsunami modelling program using OpenMP technology

NASA Astrophysics Data System (ADS)

Chernov, A.; Zaytsev, A.; Yalciner, A.; Kurkin, A.

2009-04-01

Currently, the basic problem of tsunami modeling is the low speed of calculations, which is unacceptable for operational warning services. Existing algorithms for numerical modeling of the hydrodynamic processes of tsunami waves were developed without taking advantage of modern computing facilities. Parallel algorithms offer an opportunity to accelerate the calculations considerably. We discuss a new approach to parallelizing tsunami modeling code using OpenMP technology (for shared-memory multiprocessing systems). Nowadays, multiprocessing systems are easily accessible to everyone, and the cost of using such systems is much lower than the cost of clusters. This also lets programmers apply multithreaded algorithms on researchers' desktop computers. Another important advantage of this approach is the shared-memory mechanism: there is no need to send data over slow networks (for example, Ethernet). All memory is common to all computing processes, which yields almost linear scalability of the program. In the new version of NAMI DANCE, OpenMP technology and a multithreaded algorithm provide an 80% gain in speed over the single-threaded version on a dual-processor unit. A 320% gain was attained on a four-core PC processor. Thus, it was possible to reduce calculation time considerably on scientific workstations (desktops) without completely changing the program and user interfaces. Further modernization of the algorithms for preparing initial data and processing results using OpenMP looks reasonable. The final version of NAMI DANCE, with its increased computational speed, can be used not only for research purposes but also in real-time tsunami warning systems.
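The reported gains can be related to the usual speedup and parallel-efficiency measures with a few lines of arithmetic. The Python sketch below assumes that an "N% gain in speed" means the parallel run is (1 + N/100) times as fast as the single-threaded run. That reading is an assumption and is not stated in the abstract; the helper names are made up here.

```python
def speedup_from_gain(gain_percent):
    """Convert an 'N% gain in speed' into a speedup factor (assumed reading)."""
    return 1.0 + gain_percent / 100.0


def efficiency(speedup, workers):
    """Parallel efficiency: achieved speedup divided by the ideal (linear) one."""
    return speedup / workers


# (reported gain in percent, number of processors or cores)
reported = [(80, 2), (320, 4)]
for gain, workers in reported:
    s = speedup_from_gain(gain)
    e = efficiency(s, workers)
    print(f"{workers} workers: speedup {s:.2f}x, efficiency {e:.2f}")
```

Under this reading the four-core figure corresponds to an efficiency slightly above 1. If the 320% figure was instead meant as a 3.2x ratio, the efficiency would be 0.80, so the abstract's wording leaves the exact scaling open.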
New computer and communications environments for light armored vehicles

NASA Astrophysics Data System (ADS)

Rapanotti, John L.; Palmarini, Marc; Dumont, Marc

2002-08-01

Light Armoured Vehicles (LAVs) are being developed to meet the modern requirements of rapid deployment and operations other than war. To achieve these requirements, passive armour is minimized and survivability depends more on sensors, computers and countermeasures to detect and avoid threats. The performance, reliability, and ultimately the cost of these components will be determined by the trends in computing and communications. These trends and their potential impact on DAS (Defensive Aids Suite) development were investigated and are reported in this paper. Vehicle performance is affected by communication with other vehicles and other ISTAR (Intelligence, Surveillance, Target Acquisition and Reconnaissance) battlefield assets. This investigation includes the networking technology Jini, developed by SUN Microsystems, which can be used to interface the vehicle to the ISTAR network. VxWorks by Wind River Systems is a real-time operating system designed for military systems and compatible with Jini. Other technologies affecting computer hardware development include dynamic reconfiguration, hot swap, alternate pathing, CompactPCI, and Fiber Channel serial communication. To achieve the necessary performance at reasonable cost, and over the long service life of the vehicle, a DAS should have two essential features. A "fitted for, but not fitted with" approach will provide the necessary rapid deployment without a need to equip the entire fleet. With an expected vehicle service life of 50 years, 5-year technology upgrades can be used to maintain vehicle performance over the entire service life. A federation of modules instead of integrated fused sensors will provide the capability for incremental upgrades and mission configurability. A plug-and-play capability can be used for both hardware and expendables.
High-Performance Design Patterns for Modern Fortran

DOE PAGES

Haveraaen, Magne; Morris, Karla; Rouson, Damian; ...

2015-01-01

This paper presents ideas for using coordinate-free numerics in modern Fortran to achieve code flexibility in the partial differential equation (PDE) domain. We also show how Fortran, over the last few decades, has changed to become a language well-suited for state-of-the-art software development. Fortran’s new coarray distributed data structure, the language’s class mechanism, and its side-effect-free, pure procedure capability provide the scaffolding on which we implement HPC software. These features empower compilers to organize parallel computations with efficient communication. We present some programming patterns that support asynchronous evaluation of expressions comprised of parallel operations on distributed data. We implemented these patterns using coarrays and the message passing interface (MPI). We compared the codes’ complexity and performance. The MPI code is much more complex and depends on external libraries. The MPI code on Cray hardware using the Cray compiler is 1.5–2 times faster than the coarray code on the same hardware. The Intel compiler implements coarrays atop Intel’s MPI library with the result apparently being 2–2.5 times slower than manually coded MPI despite exhibiting nearly linear scaling efficiency. As compilers mature and further improvements to coarrays come in Fortran 2015, we expect this performance gap to narrow.
A Next Generation Digital Counting System For Low-Level Tritium Studies (Project Report)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bowman, P.

2016-10-03

Since the early seventies, SRNL has pioneered low-level tritium analysis using various nuclear counting technologies and techniques. Since 1999, SRNL has successfully performed routine low-level tritium analyses with counting systems based on digital signal processor (DSP) modules developed in the late 1990s. Each of these counting systems is complex, unique to SRNL, and fully dedicated to performing routine tritium analyses of low-level environmental samples. It is time to modernize these systems due to a variety of issues including (1) age, (2) lack of direct replacement electronics modules and (3) advances in digital signal processing and computer technology. There has been considerable development in many areas associated with the enterprise of performing low-level tritium analyses. The objective of this LDRD project was to design, build, and demonstrate a Next Generation Tritium Counting System (NGTCS), while not disrupting the routine low-level tritium analyses underway in the facility on the legacy counting systems. The work involved (1) developing a test bed for building and testing new counting system hardware that does not interfere with our routine analyses, (2) testing a new counting system based on a modern state-of-the-art DSP module, and (3) evolving the low-level tritium counter design to reflect the state of the science.
A New Network Modeling Tool for the Ground-based Nuclear Explosion Monitoring Community

NASA Astrophysics Data System (ADS)

Merchant, B. J.; Chael, E. P.; Young, C. J.

2013-12-01

Network simulations have long been used to assess the performance of monitoring networks to detect events for such purposes as planning station deployments and network resilience to outages. The standard tool has been the SAIC-developed NetSim package. With correct parameters, NetSim can produce useful simulations; however, the package has several shortcomings: an older language (FORTRAN), an emphasis on seismic monitoring with limited support for other technologies, limited documentation, and a limited parameter set. Thus, we are developing NetMOD (Network Monitoring for Optimal Detection), a Java-based tool designed to assess the performance of ground-based networks. NetMOD's advantages are that it is coded in a modern, multi-platform language, exploits modern computing performance (e.g. multi-core processors), incorporates monitoring technologies other than seismic, and includes a well-validated default parameter set for the IMS stations. NetMOD is designed to be extendable through a plugin infrastructure, so new phenomenological models can be added. Development of the Seismic Detection Plugin is being pursued first. Seismic location and infrasound and hydroacoustic detection plugins will follow. As an open-release package, NetMOD can hopefully provide a common tool that the monitoring community can use to produce assessments of monitoring networks and to verify assessments made by others.
On-call service of neurosurgeons in Germany: organization, use of communication services, and personal acceptance of modern technologies.

PubMed

Brenke, Christopher; Lassel, Elke A; Terris, Darcey; Kurt, Aysel; Schmieder, Kirsten; Schoenberg, Stefan O; Weisser, Gerald

2014-05-01

A significant proportion of acute care neurosurgical patients present to hospital outside regular working hours. The objective of our study was to evaluate the structure of neurosurgical on-call services in Germany, the use of modern communication devices and teleradiology services, and the personal acceptance of modern technologies by neurosurgeons. A nationwide survey of all 141 neurosurgical departments in Germany was performed. The questionnaire consisted of two parts: one for neurosurgical departments and one for individual neurosurgeons. The questionnaire, available online and mailed in paper form, included 21 questions about on-call service structure; the availability and use of communication devices, teleradiology services, and other information services; and neurosurgeons' personal acceptance of modern technologies. The questionnaire return rate from departments was 63.1% (89/141), whereas 187 individual neurosurgeons responded. For 57.3% of departments, teleradiology services were available and were frequently used by 62.2% of neurosurgeons. A further 23.6% of departments described using smartphone screenshots of computed tomography (CT) images transmitted by multimedia messaging service (MMS), and 8.6% of images were described as sent by unencrypted email. Although 47.0% of neurosurgeons reported owning a smartphone, only 1.1% used their phone for on-call image communication. Teleradiology services were observed to be widely used by on-call neurosurgeons in Germany. Nevertheless, a significant number of departments appear to use outdated techniques or techniques that leave patient data unprotected. On-call neurosurgeons in Germany report a willingness to adopt more modern approaches, utilizing readily available smartphones or tablet technology. Georg Thieme Verlag KG Stuttgart · New York.
WASP (Write a Scientific Paper) using Excel - 1: Data entry and validation.

PubMed

Grech, Victor

2018-02-01

Data collection for the purposes of analysis, after the planning and execution of a research study, commences with data input and validation. The process of data entry and analysis may appear daunting to the uninitiated, but as pointed out in the 1970s in a series of papers by British Medical Journal Deputy Editor TDV Swinscow, modern hardware and software (he was then referring to the availability of hand calculators) permit the performance of statistical testing outside a computer laboratory. In this day and age, modern software, such as the ubiquitous and almost universally familiar Microsoft Excel™, greatly facilitates this process. This paper is the first of a collection of papers which will emulate Swinscow's series, in his own words, "addressed to readers who want to start at the beginning, not to those who are already skilled statisticians." These papers will have less focus on the actual arithmetic, and more emphasis on how to actually implement simple statistics, step by step, using Excel, thereby constituting the equivalent of Swinscow's papers in the personal computer age. Data entry can be facilitated by several underutilised features in Excel. This paper will explain Excel's little-known form function, data validation implementation at input stage, simple coding tips and data cleaning tools. Copyright © 2018 Elsevier B.V. All rights reserved.
Petascale Many Body Methods for Complex Correlated Systems

NASA Astrophysics Data System (ADS)

Pruschke, Thomas

2012-02-01

Correlated systems constitute an important class of materials in modern condensed matter physics. Correlations among electrons are at the heart of all ordering phenomena and many intriguing novel aspects, such as quantum phase transitions or topological insulators, observed in a variety of compounds. Yet, theoretically describing these phenomena is still a formidable task, even if one restricts the models used to the smallest possible set of degrees of freedom. Here, modern computer architectures play an essential role, and the joint effort to devise efficient algorithms and implement them on state-of-the-art hardware has become an extremely active field in condensed-matter research. To tackle this task single-handedly is quite obviously not possible. The NSF-OISE funded PIRE collaboration "Graduate Education and Research in Petascale Many Body Methods for Complex Correlated Systems" is a successful initiative to bring together leading experts around the world to form a virtual international organization for addressing these emerging challenges and educating the next generation of computational condensed matter physicists. The collaboration includes research groups developing novel theoretical tools to reliably and systematically study correlated solids, experts in efficient computational algorithms needed to solve the emerging equations, and those able to use modern heterogeneous computer architectures to make them working tools for the growing community.
An Adaptable Seismic Data Format for Modern Scientific Workflows

NASA Astrophysics Data System (ADS)

Smith, J. A.; Bozdag, E.; Krischer, L.; Lefebvre, M.; Lei, W.; Podhorszki, N.; Tromp, J.

2013-12-01

Data storage, exchange, and access play a critical role in modern seismology. Current seismic data formats, such as SEED, SAC, and SEG-Y, were designed with specific applications in mind and are frequently a major bottleneck in implementing efficient workflows. We propose a new modern parallel format that can be adapted for a variety of seismic workflows. The Adaptable Seismic Data Format (ASDF) features high-performance parallel read and write support and the ability to store an arbitrary number of traces of varying sizes. Provenance information is stored inside the file so that users know the origin of the data as well as the precise operations that have been applied to the waveforms. The design of the new format is based on several real-world use cases, including earthquake seismology and seismic interferometry. The metadata is based on the proven XML schemas StationXML and QuakeML. Existing time-series analysis tool-kits are easily interfaced with this new format so that seismologists can use robust, previously developed software packages, such as ObsPy and the SAC library. ADIOS, netCDF4, and HDF5 can be used as the underlying container format. At Princeton University, we have chosen to use ADIOS as the container format because it has shown superior scalability for certain applications, such as dealing with big data on HPC systems. In the context of high-performance computing, we have implemented ASDF into the global adjoint tomography workflow on Oak Ridge National Laboratory's supercomputer Titan.
Efficient parallel implementation of active appearance model fitting algorithm on GPU.

PubMed

Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou

2014-01-01

The active appearance model (AAM) is one of the most powerful model-based object detection and tracking methods and has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine-grained parallelism, in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experimental results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures.
Efficient Parallel Implementation of Active Appearance Model Fitting Algorithm on GPU

PubMed Central

Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou

2014-01-01

The active appearance model (AAM) is one of the most powerful model-based object detection and tracking methods and has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine-grained parallelism, in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experimental results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures. PMID:24723812
Advancing Creative Visual Thinking with Constructive Function-Based Modelling

ERIC Educational Resources Information Center

Pasko, Alexander; Adzhiev, Valery; Malikova, Evgeniya; Pilyugin, Victor

2013-01-01

Modern education technologies are destined to reflect the realities of a modern digital age. The juxtaposition of real and synthetic (computer-generated) worlds as well as a greater emphasis on visual dimension are especially important characteristics that have to be taken into account in learning and teaching. We describe the ways in which an…
EIAGRID: In-field optimization of seismic data acquisition by real-time subsurface imaging using a remote GRID computing environment.

NASA Astrophysics Data System (ADS)

Heilmann, B. Z.; Vallenilla Ferrara, A. M.

2009-04-01

The constant growth of contaminated sites, the unsustainable use of natural resources, and, last but not least, the hydrological risk related to extreme meteorological events and increased climate variability are major environmental issues of today. Finding solutions for these complex problems requires an integrated cross-disciplinary approach, providing a unified basis for environmental science and engineering. In computer science, grid computing is emerging worldwide as a formidable tool allowing distributed computation and data management with administratively distant resources. Utilizing these modern High Performance Computing (HPC) technologies, the GRIDA3 project bundles several applications from different fields of geoscience, aiming to support decision making for reasonable and responsible land use and resource management. In this abstract we present a geophysical application called EIAGRID that uses grid computing facilities to perform real-time subsurface imaging by on-the-fly processing of seismic field data and fast optimization of the processing workflow. Even though seismic reflection profiling has a broad application range, spanning from shallow targets at a depth of a few meters to targets at a depth of several kilometers, it is primarily used by the hydrocarbon industry and hardly ever for environmental purposes. The complexity of data acquisition and processing poses severe problems for environmental and geotechnical engineering: professional seismic processing software is expensive to buy and demands extensive experience from the user. The in-field processing equipment needed for real-time data Quality Control (QC) and immediate optimization of the acquisition parameters is often not available for this kind of study. As a result, the data quality will be suboptimal. In the worst case, a crucial parameter such as receiver spacing, maximum offset, or recording time turns out later to be inappropriate and the complete acquisition campaign has to be repeated.
The EIAGRID portal provides an innovative solution to this problem, combining state-of-the-art data processing methods and modern remote grid computing technology. In-field processing equipment is replaced by remote access to high-performance grid computing facilities. The latter can be controlled from anywhere through a user-friendly web-browser interface accessed from the field by any mobile computer using wireless data transmission technology such as UMTS (Universal Mobile Telecommunications System) or HSUPA/HSDPA (High-Speed Uplink/Downlink Packet Access). The complexity of data manipulation and processing, and thus also the time-consuming user interaction, is minimized by a data-driven and highly automated velocity analysis and imaging approach based on the Common-Reflection-Surface (CRS) stack. Furthermore, the huge computing power provided by the grid deployment allows parallel testing of alternative processing sequences and parameter settings, a feature which considerably reduces turn-around times. Shared data storage using georeferencing tools and data grid technology is currently under development. It will allow completed projects to be published, making results, processing workflows, and parameter settings available in a transparent and reproducible way. Creating a unified database shared by all users will facilitate complex studies and enable the use of data-crossing techniques to incorporate results of other environmental applications hosted on the GRIDA3 portal.
Brønsted acidity of protic ionic liquids: a modern ab initio valence bond theory perspective.

PubMed

Patil, Amol Baliram; Mahadeo Bhanage, Bhalchandra

2016-09-21

Room temperature ionic liquids (ILs), especially protic ionic liquids (PILs), are used in many areas of the chemical sciences. Ionicity, the extent of proton transfer, is a key parameter which determines many physicochemical properties and in turn the suitability of PILs for various applications. The spectrum of computational chemistry techniques applied to investigate ionic liquids includes classical molecular dynamics, Monte Carlo simulations, ab initio molecular dynamics, Density Functional Theory (DFT), and CCSD(T), among others. At the other end of the spectrum is another computational approach: modern ab initio Valence Bond Theory (VBT). VBT differs from molecular orbital theory based methods in the expression of the molecular wave function. The molecular wave function in the valence bond ansatz is expressed as a linear combination of valence bond structures. These structures include covalent and ionic structures explicitly. Modern ab initio valence bond theory calculations of representative primary and tertiary ammonium protic ionic liquids indicate that modern ab initio valence bond theory can be employed to assess the acidity and ionicity of protic ionic liquids a priori.
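In symbols, the valence bond expansion described in this abstract can be written as below. This is a generic sketch and not the authors' specific wave function: the symbols c_K, \Phi_K, S_{KL} and W_K are notation introduced here, real coefficients and a normalized wave function are assumed, and the Chirgwin-Coulson weight is one commonly used way to quantify how much each structure contributes.

```latex
\Psi_{\mathrm{VB}} = \sum_{K} c_K \,\Phi_K , \qquad
S_{KL} = \langle \Phi_K \,|\, \Phi_L \rangle , \qquad
W_K = \sum_{L} c_K \, c_L \, S_{KL} , \qquad
\sum_{K} W_K = 1 .
```

Comparing the summed W_K of the ionic structures with that of the covalent structures is one way such a calculation could be read in terms of ionicity; whether the authors use exactly this measure is not stated in the abstract.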

Factors Affecting Teachers' Adoption of Educational Computer Games: A Case Study

ERIC Educational Resources Information Center

Kebritchi, Mansureh

2010-01-01

Even though computer games hold considerable potential for engaging and facilitating learning among today's children, the adoption of modern educational computer games is still meeting significant resistance in K-12 education. The purpose of this paper is to inform educators and instructional designers on factors affecting teachers' adoption of…
Let Me Share a Secret with You! Teaching with Computers.

ERIC Educational Resources Information Center

de Vasconcelos, Maria

The author describes her experiences teaching a computer-enhanced Modern Poetry course. The author argues that using computers enhances the concept of the classroom as learning community. It was the author's experience that students' postings on the discussion board created an atmosphere that encouraged student involvement, as opposed to the…
The State of the Art in Information Handling. Operation PEP/Executive Information Systems.

ERIC Educational Resources Information Center

Summers, J. K.; Sullivan, J. E.

This document explains recent developments in computer science and information systems of interest to the educational manager. A brief history of computers is included, together with an examination of modern computers' capabilities. Various features of card, tape, and disk information storage systems are presented. The importance of time-sharing…
Computer-Mediated Communication as an Autonomy-Enhancement Tool for Advanced Learners of English

ERIC Educational Resources Information Center

Wach, Aleksandra

2012-01-01

This article examines the relevance of modern technology for the development of learner autonomy in the process of learning English as a foreign language. Computer-assisted language learning and computer-mediated communication (CMC) appear to be particularly conducive to fostering autonomous learning, as they naturally incorporate many elements of…
Redesigning the Quantum Mechanics Curriculum to Incorporate Problem Solving Using a Computer Algebra System

NASA Astrophysics Data System (ADS)

Roussel, Marc R.

1999-10-01

One of the traditional obstacles to learning quantum mechanics is the relatively high level of mathematical proficiency required to solve even routine problems. Modern computer algebra systems are now sufficiently reliable that they can be used as mathematical assistants to alleviate this difficulty. In the quantum mechanics course at the University of Lethbridge, the traditional three lecture hours per week have been replaced by two lecture hours and a one-hour computer-aided problem solving session using a computer algebra system (Maple). While this somewhat reduces the number of topics that can be tackled during the term, students have a better opportunity to familiarize themselves with the underlying theory with this course design. Maple is also available to students during examinations. The use of a computer algebra system expands the class of feasible problems during a time-limited exercise such as a midterm or final examination. A modern computer algebra system is a complex piece of software, so some time needs to be devoted to teaching the students its proper use. However, the advantages to the teaching of quantum mechanics appear to outweigh the disadvantages.
Ocular Tolerance of Contemporary Electronic Display Devices.

PubMed

Clark, Andrew J; Yang, Paul; Khaderi, Khizer R; Moshfeghi, Andrew A

2018-05-01

Electronic displays have become an integral part of life in the developed world since the revolution of mobile computing a decade ago. With the release of multiple consumer-grade virtual reality (VR) and augmented reality (AR) products in the past 2 years utilizing head-mounted displays (HMDs), as well as the development of low-cost, smartphone-based HMDs, the ability to intimately interact with electronic screens is greater than ever. VR/AR HMDs also place the display at much closer ocular proximity than traditional electronic devices while also isolating the user from the ambient environment to create a "closed" system between the user's eyes and the display. Whether the increased interaction with these devices places the user's retina at higher risk of damage is currently unclear. Herein, the authors review the discovery of photochemical damage of the retina from visible light as well as summarize relevant clinical and preclinical data regarding the influence of modern display devices on retinal health. Multiple preclinical studies have been performed with modern light-emitting diode technology demonstrating damage to the retina at modest exposure levels, particularly from blue-light wavelengths. Unfortunately, high-quality in-human studies are lacking, and the small clinical investigations performed to date have failed to keep pace with the rapid evolutions in display technology. Clinical investigations assessing the effect of HMDs on human retinal function are also yet to be performed. From the available data, modern consumer electronic displays do not appear to pose any acute risk to vision with average use; however, future studies with well-defined clinical outcomes and illuminance metrics are needed to better understand the long-term risks of cumulative exposure to electronic displays in general and with "closed" VR/AR HMDs in particular. [Ophthalmic Surg Lasers Imaging Retina. 2018;49:346-354.]. Copyright 2018, SLACK Incorporated.
Modern modeling techniques had limited external validity in predicting mortality from traumatic brain injury.

PubMed

van der Ploeg, Tjeerd; Nieboer, Daan; Steyerberg, Ewout W

2016-10-01

Prediction of medical outcomes may potentially benefit from using modern statistical modeling techniques. We aimed to externally validate modeling strategies for prediction of 6-month mortality of patients suffering from traumatic brain injury (TBI) with predictor sets of increasing complexity. We analyzed individual patient data from 15 different studies including 11,026 TBI patients. We consecutively considered a core set of predictors (age, motor score, and pupillary reactivity), an extended set with computed tomography scan characteristics, and a further extension with two laboratory measurements (glucose and hemoglobin). With each of these sets, we predicted 6-month mortality using default settings with five statistical modeling techniques: logistic regression (LR), classification and regression trees, random forests (RFs), support vector machines (SVM) and neural nets. For external validation, a model developed on one of the 15 data sets was applied to each of the 14 remaining sets. This process was repeated 15 times for a total of 630 validations. The area under the receiver operating characteristic curve (AUC) was used to assess the discriminative ability of the models. For the most complex predictor set, the LR models performed best (median validated AUC value, 0.757), followed by RF and support vector machine models (median validated AUC value, 0.735 and 0.732, respectively). With each predictor set, the classification and regression trees models showed poor performance (median validated AUC value, <0.7). The variability in performance across the studies was smallest for the RF- and LR-based models (interquartile range for validated AUC values from 0.07 to 0.10). In the area of predicting mortality from TBI, nonlinear and nonadditive effects are not pronounced enough to make modern prediction methods beneficial. Copyright © 2016 Elsevier Inc. All rights reserved.
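The cross-study validation scheme in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not spell out how the total of 630 is composed, and reading it as 15 x 14 ordered study pairs times 3 predictor sets (per modeling technique) is our interpretation. The function names `external_validation_pairs` and `auc` are hypothetical, and the AUC is computed with the standard rank-based definition rather than any code from the study.

```python
from itertools import permutations


def external_validation_pairs(n_studies):
    """Return all ordered (development, validation) pairs of distinct studies."""
    return list(permutations(range(n_studies), 2))


def auc(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Assumed reading of the abstract: 15 studies, 3 predictor sets.
N_STUDIES = 15
N_PREDICTOR_SETS = 3

pairs = external_validation_pairs(N_STUDIES)          # 15 * 14 ordered pairs
total_validations = len(pairs) * N_PREDICTOR_SETS     # matches the stated 630
```

In a full analysis, each pair would be used to fit a model on the development study and score the validation study, and `auc` would be applied to the validation labels and predicted risks; the median and interquartile range of those values are the summary statistics the abstract reports.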
Technology and the Modern Library.

ERIC Educational Resources Information Center

Boss, Richard W.

1984-01-01

Overview of the impact of information technology on libraries highlights turnkey vendors, bibliographic utilities, commercial suppliers of records, state and regional networks, computer-to-computer linkages, remote database searching, terminals and microcomputers, building local databases, delivery of information, digital telefacsimile,…
Accelerating Astronomy & Astrophysics in the New Era of Parallel Computing: GPUs, Phi and Cloud Computing

NASA Astrophysics Data System (ADS)

Ford, Eric B.; Dindar, Saleh; Peters, Jorg

2015-08-01

The realism of astrophysical simulations and statistical analyses of astronomical data are set by the available computational resources. Thus, astronomers and astrophysicists are constantly pushing the limits of computational capabilities. For decades, astronomers benefited from massive improvements in computational power that were driven primarily by increasing clock speeds and required relatively little attention to details of the computational hardware. For nearly a decade, increases in computational capabilities have come primarily from increasing the degree of parallelism, rather than increasing clock speeds. Further increases in computational capabilities will likely be led by many-core architectures such as Graphics Processing Units (GPUs) and Intel Xeon Phi. Successfully harnessing these new architectures requires significantly more understanding of the hardware architecture, cache hierarchy, compiler capabilities and network characteristics. I will provide an astronomer's overview of the opportunities and challenges provided by modern many-core architectures and elastic cloud computing. The primary goal is to help an astronomical audience understand what types of problems are likely to yield speed-ups of more than an order of magnitude and which problems are unlikely to parallelize sufficiently efficiently to be worth the development time and/or costs. I will draw on my experience leading a team in developing the Swarm-NG library for parallel integration of large ensembles of small n-body systems on GPUs, as well as several smaller software projects. I will share lessons learned from collaborating with computer scientists, including both technical and soft skills.
Finally, I will discuss the challenges of training the next generation of astronomers to be proficient in this new era of high-performance computing, drawing on experience teaching a graduate class on High-Performance Scientific Computing for Astrophysics and organizing a 2014 advanced summer school on Bayesian Computing for Astronomical Data Analysis with support of the Penn State Center for Astrostatistics and Institute for CyberScience.
Accelerating EPI distortion correction by utilizing a modern GPU-based parallel computation.

PubMed

Yang, Yao-Hao; Huang, Teng-Yi; Wang, Fu-Nien; Chuang, Tzu-Chao; Chen, Nan-Kuei

2013-04-01

The combination of phase demodulation and field mapping is a practical method to correct echo planar imaging (EPI) geometric distortion. However, since phase dispersion accumulates in each phase-encoding step, the calculation complexity of phase modulation is Ny-fold higher than conventional image reconstructions. Thus, correcting EPI images via phase demodulation is generally a time-consuming task. Parallel computing by employing general-purpose calculations on graphics processing units (GPU) can accelerate scientific computing if the algorithm is parallelized. This study proposes a method that incorporates the GPU-based technique into phase demodulation calculations to reduce computation time. The proposed parallel algorithm was applied to a PROPELLER-EPI diffusion tensor data set. The GPU-based phase demodulation method reduced the EPI distortion correctly, and accelerated the computation. The total reconstruction time of the 16-slice PROPELLER-EPI diffusion tensor images with matrix size of 128 × 128 was reduced from 1,754 seconds to 101 seconds by utilizing the parallelized 4-GPU program. GPU computing is a promising method to accelerate EPI geometric correction. The resulting reduction in computation time of phase demodulation should accelerate postprocessing for studies performed with EPI, and should effectuate the PROPELLER-EPI technique for clinical practice. Copyright © 2011 by the American Society of Neuroimaging.
Parietal neural prosthetic control of a computer cursor in a graphical-user-interface task

NASA Astrophysics Data System (ADS)

Revechkis, Boris; Aflalo, Tyson NS; Kellis, Spencer; Pouratian, Nader; Andersen, Richard A.

2014-12-01

Objective. To date, the majority of Brain-Machine Interfaces have been used to perform simple tasks with sequences of individual targets in otherwise blank environments. In this study we developed a more practical and clinically relevant task that approximated modern computers and graphical user interfaces (GUIs). This task could be problematic given the known sensitivity of areas typically used for BMIs to visual stimuli, eye movements, decision-making, and attentional control. Consequently, we sought to assess the effect of a complex, GUI-like task on the quality of neural decoding. Approach. A male rhesus macaque monkey was implanted with two 96-channel electrode arrays in area 5d of the superior parietal lobule. The animal was trained to perform a GUI-like ‘Face in a Crowd’ task on a computer screen that required selecting one cued, icon-like, face image from a group of alternatives (the ‘Crowd’) using a neurally controlled cursor. We assessed whether the crowd affected decodes of intended cursor movements by comparing it to a ‘Crowd Off’ condition in which only the matching target appeared without alternatives. We also examined if training a neural decoder with the Crowd On rather than Off had any effect on subsequent decode quality. Main results. Despite the additional demands of working with the Crowd On, the animal was able to robustly perform the task under Brain Control. The presence of the crowd did not itself affect decode quality. Training the decoder with the Crowd On relative to Off had no negative influence on subsequent decoding performance. Additionally, the subject was able to gaze around freely without influencing cursor position. Significance. Our results demonstrate that area 5d recordings can be used for decoding in a complex, GUI-like task with free gaze. Thus, this area is a promising source of signals for neural prosthetics that utilize computing devices with GUI interfaces, e.g. personal computers, mobile devices, and tablet computers.
Parietal neural prosthetic control of a computer cursor in a graphical-user-interface task.

PubMed

Revechkis, Boris; Aflalo, Tyson N S; Kellis, Spencer; Pouratian, Nader; Andersen, Richard A

2014-12-01

To date, the majority of Brain-Machine Interfaces have been used to perform simple tasks with sequences of individual targets in otherwise blank environments. In this study we developed a more practical and clinically relevant task that approximated modern computers and graphical user interfaces (GUIs). This task could be problematic given the known sensitivity of areas typically used for BMIs to visual stimuli, eye movements, decision-making, and attentional control. Consequently, we sought to assess the effect of a complex, GUI-like task on the quality of neural decoding. A male rhesus macaque monkey was implanted with two 96-channel electrode arrays in area 5d of the superior parietal lobule. The animal was trained to perform a GUI-like 'Face in a Crowd' task on a computer screen that required selecting one cued, icon-like, face image from a group of alternatives (the 'Crowd') using a neurally controlled cursor. We assessed whether the crowd affected decodes of intended cursor movements by comparing it to a 'Crowd Off' condition in which only the matching target appeared without alternatives. We also examined if training a neural decoder with the Crowd On rather than Off had any effect on subsequent decode quality. Despite the additional demands of working with the Crowd On, the animal was able to robustly perform the task under Brain Control. The presence of the crowd did not itself affect decode quality. Training the decoder with the Crowd On relative to Off had no negative influence on subsequent decoding performance. Additionally, the subject was able to gaze around freely without influencing cursor position. Our results demonstrate that area 5d recordings can be used for decoding in a complex, GUI-like task with free gaze. Thus, this area is a promising source of signals for neural prosthetics that utilize computing devices with GUI interfaces, e.g. personal computers, mobile devices, and tablet computers.
Preliminary Axial Flow Turbine Design and Off-Design Performance Analysis Methods for Rotary Wing Aircraft Engines. Part 1; Validation

NASA Technical Reports Server (NTRS)

Chen, Shu-cheng, S.

2009-01-01

For the preliminary design and the off-design performance analysis of axial flow turbines, a pair of intermediate level-of-fidelity computer codes, TD2-2 (design; reference 1) and AXOD (off-design; reference 2), are being evaluated for use in turbine design and performance prediction of modern high-performance aircraft engines. TD2-2 employs a streamline curvature method for design, while AXOD approaches the flow analysis with an equal radius-height domain decomposition strategy. Both methods resolve only the flows in the annulus region while modeling the impact introduced by the blade rows. The mathematical formulations and derivations involved in both methods are documented in references 3 and 4 (for TD2-2) and in reference 5 (for AXOD). The focus of this paper is to discuss the fundamental issues of applicability and compatibility of the two codes as a pair of companion pieces, to perform preliminary design and off-design analysis for modern aircraft engine turbines. Two validation cases for the design and the off-design prediction using TD2-2 and AXOD conducted on two existing high efficiency turbines, developed and tested in the NASA/GE Energy Efficient Engine (GE-E3) Program, the High Pressure Turbine (HPT; two stages, air cooled) and the Low Pressure Turbine (LPT; five stages, un-cooled), are provided in support of the analysis and discussion presented in this paper.
Application of enhanced modern structured analysis techniques to Space Station Freedom electric power system requirements

NASA Technical Reports Server (NTRS)

Biernacki, John; Juhasz, John; Sadler, Gerald

1991-01-01

A team of Space Station Freedom (SSF) system engineers are in the process of extensive analysis of the SSF requirements, particularly those pertaining to the electrical power system (EPS). The objective of this analysis is the development of a comprehensive, computer-based requirements model, using an enhanced modern structured analysis methodology (EMSA). Such a model provides a detailed and consistent representation of the system's requirements. The process outlined in the EMSA methodology is unique in that it allows the graphical modeling of real-time system state transitions, as well as functional requirements and data relationships, to be implemented using modern computer-based tools. These tools permit flexible updating and continuous maintenance of the models. Initial findings resulting from the application of EMSA to the EPS have benefited the space station program by linking requirements to design, providing traceability of requirements, identifying discrepancies, and fostering an understanding of the EPS.
Computed tomographic evidence of atherosclerosis in the mummified remains of humans from around the world.

PubMed

Thompson, Randall C; Allam, Adel H; Zink, Albert; Wann, L Samuel; Lombardi, Guido P; Cox, Samantha L; Frohlich, Bruno; Sutherland, M Linda; Sutherland, James D; Frohlich, Thomas C; King, Samantha I; Miyamoto, Michael I; Monge, Janet M; Valladolid, Clide M; El-Halim Nur El-Din, Abd; Narula, Jagat; Thompson, Adam M; Finch, Caleb E; Thomas, Gregory S

2014-06-01

Although atherosclerosis is widely thought to be a disease of modernity, computed tomographic evidence of atherosclerosis has been found in the bodies of a large number of mummies. This article reviews the findings of atherosclerotic calcifications in the remains of ancient people, humans who lived across a very wide span of human history and over most of the inhabited globe. These people had a wide range of diets and lifestyles and traditional modern risk factors do not thoroughly explain the presence and easy detectability of this disease. Nontraditional risk factors such as the inhalation of cooking fire smoke and chronic infection or inflammation might have been important atherogenic factors in ancient times. Study of the genetic and environmental risk factors for atherosclerosis in ancient people may offer insights into this common modern disease. Copyright © 2014 World Heart Federation (Geneva). Published by Elsevier B.V. All rights reserved.
Intelligent machines in the twenty-first century: foundations of inference and inquiry.

PubMed

Knuth, Kevin H

2003-12-15

The last century saw the application of Boolean algebra to the construction of computing machines, which work by applying logical transformations to information contained in their memory. The development of information theory and the generalization of Boolean algebra to Bayesian inference have enabled these computing machines, in the last quarter of the twentieth century, to be endowed with the ability to learn by making inferences from data. This revolution is just beginning as new computational techniques continue to make difficult problems more accessible. Recent advances in our understanding of the foundations of probability theory have revealed implications for areas other than logic. Of relevance to intelligent machines, we recently identified the algebra of questions as the free distributive algebra, which will now allow us to work with questions in a way analogous to that which Boolean algebra enables us to work with logical statements. In this paper, we examine the foundations of inference and inquiry. We begin with a history of inferential reasoning, highlighting key concepts that have led to the automation of inference in modern machine-learning systems. We then discuss the foundations of inference in more detail using a modern viewpoint that relies on the mathematics of partially ordered sets and the scaffolding of lattice theory. This new viewpoint allows us to develop the logic of inquiry and introduce a measure describing the relevance of a proposed question to an unresolved issue. Last, we will demonstrate the automation of inference, and discuss how this new logic of inquiry will enable intelligent machines to ask questions. 
Automation of both inference and inquiry promises to allow robots to perform science in the far reaches of our solar system and in other star systems by enabling them not only to make inferences from data, but also to decide which question to ask, which experiment to perform, or which measurement to take given what they have learned and what they are designed to understand.
Intelligent machines in the twenty-first century: foundations of inference and inquiry

NASA Technical Reports Server (NTRS)

Knuth, Kevin H.

2003-01-01

The last century saw the application of Boolean algebra to the construction of computing machines, which work by applying logical transformations to information contained in their memory. The development of information theory and the generalization of Boolean algebra to Bayesian inference have enabled these computing machines, in the last quarter of the twentieth century, to be endowed with the ability to learn by making inferences from data. This revolution is just beginning as new computational techniques continue to make difficult problems more accessible. Recent advances in our understanding of the foundations of probability theory have revealed implications for areas other than logic. Of relevance to intelligent machines, we recently identified the algebra of questions as the free distributive algebra, which will now allow us to work with questions in a way analogous to that which Boolean algebra enables us to work with logical statements. In this paper, we examine the foundations of inference and inquiry. We begin with a history of inferential reasoning, highlighting key concepts that have led to the automation of inference in modern machine-learning systems. We then discuss the foundations of inference in more detail using a modern viewpoint that relies on the mathematics of partially ordered sets and the scaffolding of lattice theory. This new viewpoint allows us to develop the logic of inquiry and introduce a measure describing the relevance of a proposed question to an unresolved issue. Last, we will demonstrate the automation of inference, and discuss how this new logic of inquiry will enable intelligent machines to ask questions. 
Automation of both inference and inquiry promises to allow robots to perform science in the far reaches of our solar system and in other star systems by enabling them not only to make inferences from data, but also to decide which question to ask, which experiment to perform, or which measurement to take given what they have learned and what they are designed to understand.
Real-time dose computation: GPU-accelerated source modeling and superposition/convolution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jacques, Robert; Wong, John; Taylor, Russell

Purpose: To accelerate dose calculation to interactive rates using highly parallel graphics processing units (GPUs). Methods: The authors have extended their prior work in GPU-accelerated superposition/convolution with a modern dual-source model and have enhanced performance. The primary source algorithm supports both focused leaf ends and asymmetric rounded leaf ends. The extra-focal algorithm uses a discretized, isotropic area source and models multileaf collimator leaf height effects. The spectral and attenuation effects of static beam modifiers were integrated into each source's spectral function. The authors introduce the concepts of arc superposition and delta superposition. Arc superposition utilizes separate angular sampling for the total energy released per unit mass (TERMA) and superposition computations to increase accuracy and performance. Delta superposition allows single beamlet changes to be computed efficiently. The authors extended their concept of multi-resolution superposition to include kernel tilting. Multi-resolution superposition approximates solid angle ray-tracing, improving performance and scalability with a minor loss in accuracy. Superposition/convolution was implemented using the inverse cumulative-cumulative kernel and exact radiological path ray-tracing. The accuracy analyses were performed using multiple kernel ray samplings, both with and without kernel tilting and multi-resolution superposition. Results: Source model performance was <9 ms (data dependent) for a high resolution (400²) field using an NVIDIA (Santa Clara, CA) GeForce GTX 280. Computation of the physically correct multispectral TERMA attenuation was improved by a material centric approach, which increased performance by over 80%. Superposition performance was improved by ~24% to 0.058 and 0.94 s for 64³ and 128³ water phantoms; a speed-up of 101-144x over the highly optimized Pinnacle³ (Philips, Madison, WI) implementation. 
Pinnacle³ times were 8.3 and 94 s, respectively, on an AMD (Sunnyvale, CA) Opteron 254 (two cores, 2.8 GHz). Conclusions: The authors have completed a comprehensive, GPU-accelerated dose engine in order to provide a substantial performance gain over CPU based implementations. Real-time dose computation is feasible with the accuracy levels of the superposition/convolution algorithm.
Experimental Evidence on the Effects of Home Computers on Academic Achievement among Schoolchildren. National Poverty Center Working Paper Series #13-02

ERIC Educational Resources Information Center

Fairlie, Robert W.; Robinson, Jonathan

2013-01-01

Computers are an important part of modern education, yet large segments of the population--especially low-income and minority children--lack access to a computer at home. Does this impede educational achievement? We test this hypothesis by conducting the largest-ever field experiment involving the random provision of free computers for home use to…
A Computer Model for Red Blood Cell Chemistry

DTIC Science & Technology

1996-10-01

There is a growing need for interactive computational tools for medical education and research. The most exciting...paradigm for interactive education is simulation. Fluid Mod is a simulation based computational tool developed in the late sixties and early seventies at...to a modern Windows, object oriented interface. This development will provide students with a useful computational tool for learning. More important

A Large-Scale Design Integration Approach Developed in Conjunction with the Ares Launch Vehicle Program

NASA Technical Reports Server (NTRS)

Redmon, John W.; Shirley, Michael C.; Kinard, Paul S.

2012-01-01

This paper presents a method for performing large-scale design integration, taking a classical 2D drawing envelope and interface approach and applying it to modern three dimensional computer aided design (3D CAD) systems. Today, the paradigm often used when performing design integration with 3D models involves a digital mockup of an overall vehicle, in the form of a massive, fully detailed, CAD assembly, thereby adding unnecessary burden and overhead to design and product data management processes. While fully detailed data may yield a broad depth of design detail, pertinent integration features are often obscured under the excessive amounts of information, making them difficult to discern. In contrast, the envelope and interface method results in a reduction in both the amount and complexity of information necessary for design integration while yielding significant savings in time and effort when applied to today's complex design integration projects. This approach, combining classical and modern methods, proved advantageous during the complex design integration activities of the Ares I vehicle. Downstream processes, benefiting from this approach by reducing development and design cycle time, include: Creation of analysis models for the Aerodynamic discipline; Vehicle to ground interface development; Documentation development for the vehicle assembly.
Linear solver performance in elastoplastic problem solution on GPU cluster

NASA Astrophysics Data System (ADS)

Khalevitsky, Yu. V.; Konovalov, A. V.; Burmasheva, N. V.; Partin, A. S.

2017-12-01

Applying the finite element method to severe plastic deformation problems involves solving linear equation systems. While the solution procedure is relatively hard to parallelize and computationally intensive by itself, a long series of large scale systems need to be solved for each problem. When dealing with fine computational meshes, such as in the simulations of three-dimensional metal matrix composite microvolume deformation, tens and hundreds of hours may be needed to complete the whole solution procedure, even using modern supercomputers. In general, one of the preconditioned Krylov subspace methods is used in a linear solver for such problems. The method convergence highly depends on the operator spectrum of a problem stiffness matrix. In order to choose the appropriate method, a series of computational experiments is used. Different methods may be preferable for different computational systems for the same problem. In this paper we present experimental data obtained by solving linear equation systems from an elastoplastic problem on a GPU cluster. The data can be used to substantiate the choice of the appropriate method for a linear solver to use in severe plastic deformation simulations.
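As a rough illustration of what a "preconditioned Krylov subspace method" looks like in code, the sketch below implements conjugate gradients with a Jacobi (diagonal) preconditioner in plain Python. Both choices are assumptions made for the example: the abstract does not say which Krylov method or preconditioner the authors selected, and a production solver would use sparse storage and GPU kernels rather than nested lists. The function name `jacobi_pcg` and the helpers are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A using conjugate
    gradients with a Jacobi preconditioner M = diag(A).
    Returns the approximate solution and the number of iterations used."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # applying M^{-1} is a scaling
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    for k in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:               # stop on small residual norm
            return x, k
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # keeps directions A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

For example, `jacobi_pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` solves a small symmetric positive definite system. The iteration count depends on the spectrum of the preconditioned matrix, which is the dependence on the stiffness-matrix spectrum that the abstract describes and the reason the authors compare methods experimentally.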
Personal computer security: part 1. Firewalls, antivirus software, and Internet security suites.

PubMed

Caruso, Ronald D

2003-01-01

Personal computer (PC) security in the era of the Health Insurance Portability and Accountability Act of 1996 (HIPAA) involves two interrelated elements: safeguarding the basic computer system itself and protecting the information it contains and transmits, including personal files. HIPAA regulations have toughened the requirements for securing patient information, requiring every radiologist with such data to take further precautions. Security starts with physically securing the computer. Account passwords and a password-protected screen saver should also be set up. A modern antivirus program can easily be installed and configured. File scanning and updating of virus definitions are simple processes that can largely be automated and should be performed at least weekly. A software firewall is also essential for protection from outside intrusion, and an inexpensive hardware firewall can provide yet another layer of protection. An Internet security suite yields additional safety. Regular updating of the security features of installed programs is important. Obtaining a moderate degree of PC safety and security is somewhat inconvenient but is necessary and well worth the effort. Copyright RSNA, 2003
All biology is computational biology.

PubMed

Markowetz, Florian

2017-03-01

Here, I argue that computational thinking and techniques are so central to the quest of understanding life that today all biology is computational biology. Computational biology brings order into our understanding of life, it makes biological concepts rigorous and testable, and it provides a reference map that holds together individual insights. The next modern synthesis in biology will be driven by mathematical, statistical, and computational methods being absorbed into mainstream biological training, turning biology into a quantitative science.
cudaMap: a GPU accelerated program for gene expression connectivity mapping

PubMed Central

2013-01-01

Background: Modern cancer research often involves large datasets and the use of sophisticated statistical techniques. Together these add a heavy computational load to the analysis, which is often coupled with issues surrounding data accessibility. Connectivity mapping is an advanced bioinformatic and computational technique dedicated to therapeutics discovery and drug re-purposing around differential gene expression analysis. On a normal desktop PC, it is common for the connectivity mapping task with a single gene signature to take > 2h to complete using sscMap, a popular Java application that runs on standard CPUs (Central Processing Units). Here, we describe new software, cudaMap, which has been implemented using CUDA C/C++ to harness the computational power of NVIDIA GPUs (Graphics Processing Units) to greatly reduce processing times for connectivity mapping. Results: cudaMap can identify candidate therapeutics from the same signature in just over thirty seconds when using an NVIDIA Tesla C2050 GPU. Results from the analysis of multiple gene signatures, which would previously have taken several days, can now be obtained in as little as 10 minutes, greatly facilitating candidate therapeutics discovery with high throughput. We are able to demonstrate dramatic speed differentials between GPU assisted performance and CPU executions as the computational load increases for high accuracy evaluation of statistical significance. Conclusion: Emerging ‘omics’ technologies are constantly increasing the volume of data and information to be processed in all areas of biomedical research. Embracing the multicore functionality of GPUs represents a major avenue of local accelerated computing. cudaMap will make a strong contribution in the discovery of candidate therapeutics by enabling speedy execution of heavy duty connectivity mapping tasks, which are increasingly required in modern cancer research. 
cudaMap is open source and can be freely downloaded from http://purl.oclc.org/NET/cudaMap. PMID:24112435
Fast Acceleration of 2D Wave Propagation Simulations Using Modern Computational Accelerators

PubMed Central

Wang, Wei; Xu, Lifan; Cavazos, John; Huang, Howie H.; Kay, Matthew

2014-01-01

Recent developments in modern computational accelerators like Graphics Processing Units (GPUs) and coprocessors provide great opportunities for making scientific applications run faster than ever before. However, efficient parallelization of scientific code using new programming tools like CUDA requires a high level of expertise that is not available to many scientists. This, plus the fact that parallelized code is usually not portable to different architectures, creates major challenges for exploiting the full capabilities of modern computational accelerators. In this work, we sought to overcome these challenges by studying how to achieve both automated parallelization using OpenACC and enhanced portability using OpenCL. We applied our parallelization schemes using GPUs as well as Intel Many Integrated Core (MIC) coprocessor to reduce the run time of wave propagation simulations. We used a well-established 2D cardiac action potential model as a specific case-study. To the best of our knowledge, we are the first to study auto-parallelization of 2D cardiac wave propagation simulations using OpenACC. Our results identify several approaches that provide substantial speedups. The OpenACC-generated GPU code achieved more than speedup above the sequential implementation and required the addition of only a few OpenACC pragmas to the code. An OpenCL implementation provided speedups on GPUs of at least faster than the sequential implementation and faster than a parallelized OpenMP implementation. An implementation of OpenMP on Intel MIC coprocessor provided speedups of with only a few code changes to the sequential implementation. We highlight that OpenACC provides an automatic, efficient, and portable approach to achieve parallelization of 2D cardiac wave simulations on GPUs. 
Our approach of using OpenACC, OpenCL, and OpenMP to parallelize this particular model on modern computational accelerators should be applicable to other computational models of wave propagation in multi-dimensional media. PMID:24497950
ExpoCast: Exposure Science for Prioritization and Toxicity Testing (T)

EPA Science Inventory

The US EPA National Center for Computational Toxicology (NCCT) has a mission to integrate modern computing and information technology with molecular biology to improve Agency prioritization of data requirements and risk assessment of chemicals. Recognizing the critical need for ...
The Jinn and the Computer. Consumption and Identity in Arabic Children's Magazines

ERIC Educational Resources Information Center

Peterson, Mark Allen

2005-01-01

One of the fundamental problems facing middle-class Egyptian parents is the problem of how to ensure that their children are simultaneously modern and Egyptian. Arabic children's magazines offer a window into the processes by which consumption links childhood and modernity in the social imaginations of children and their parents as they construct…
Adaptationism and intuitions about modern criminal justice.

PubMed

Petersen, Michael Bang

2013-02-01

Research indicates that individuals have incoherent intuitions about particular features of the criminal justice system. This could be seen as an argument against the existence of adapted computational systems for counter-exploitation. Here, I outline how the model developed by McCullough et al. readily predicts the production of conflicting intuitions in the context of modern criminal justice issues.
[Some engineering problems on developing production industry of modern traditional Chinese medicine].

PubMed

Qu, Hai-bin; Cheng, Yi-yu; Wang, Yue-sheng

2003-10-01

Based on the review of some engineering problems on developing modern production industry of Traditional Chinese Medicine (TCM), the differences of TCM production industry between China and abroad were pointed out. Accelerating the application and extension of high-tech and computer integrated manufacturing system (CIMS) were suggested to promote the technology advancement of TCM industry.
SIMD Optimization of Linear Expressions for Programmable Graphics Hardware

PubMed Central

Bajaj, Chandrajit; Ihm, Insung; Min, Jungki; Oh, Jinsang

2009-01-01

The increased programmability of graphics hardware allows efficient graphical processing unit (GPU) implementations of a wide range of general computations on commodity PCs. An important factor in such implementations is how to fully exploit the SIMD computing capacities offered by modern graphics processors. Linear expressions in the form of ȳ = Ax̄ + b̄, where A is a matrix, and x̄, ȳ and b̄ are vectors, constitute one of the most basic operations in many scientific computations. In this paper, we propose a SIMD code optimization technique that enables efficient shader codes to be generated for evaluating linear expressions. It is shown that performance can be improved considerably by efficiently packing arithmetic operations into four-wide SIMD instructions through reordering of the operations in linear expressions. We demonstrate that the presented technique can be used effectively for programming both vertex and pixel shaders for a variety of mathematical applications, including integrating differential equations and solving a sparse linear system of equations using iterative methods. PMID:19946569
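The packing idea in this abstract can be illustrated with a minimal sketch. It is not the authors' shader code generator: it only emulates, in plain Python, how the evaluation of y = Ax + b can be grouped so that each step is one four-wide multiply-add. The names `mad4` and `linear_packed` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def mad4(a: Sequence[float], s: float, acc: Sequence[float]) -> List[float]:
    """Emulate one four-wide multiply-add: acc + a * s, lane by lane."""
    return [acc[k] + a[k] * s for k in range(4)]


def linear_packed(A: Sequence[Sequence[float]],
                  x: Sequence[float],
                  b: Sequence[float]) -> List[float]:
    """Evaluate y = A x + b, producing four output rows per group.

    Each inner step packs one column entry from four consecutive rows
    into a 4-vector and issues a single emulated multiply-add, so the
    number of four-wide operations per group equals the column count.
    """
    n_rows, n_cols = len(A), len(x)
    y: List[float] = []
    for r in range(0, n_rows, 4):
        lanes = min(4, n_rows - r)  # the last group may be partially filled
        # Start the accumulator from b so no separate add is needed.
        acc = [b[r + k] if k < lanes else 0.0 for k in range(4)]
        for j in range(n_cols):
            col = [A[r + k][j] if k < lanes else 0.0 for k in range(4)]
            acc = mad4(col, x[j], acc)
        y.extend(acc[:lanes])  # drop the padding lanes
    return y
```

Padding the final group with zeros keeps every operation four lanes wide, at the cost of some unused lanes when the row count is not a multiple of four.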
Thermal Hydraulics Design and Analysis Methodology for a Solid-Core Nuclear Thermal Rocket Engine Thrust Chamber

NASA Technical Reports Server (NTRS)

Wang, Ten-See; Canabal, Francisco; Chen, Yen-Sen; Cheng, Gary; Ito, Yasushi

2013-01-01

Nuclear thermal propulsion is a leading candidate for in-space propulsion for human Mars missions. This chapter describes a thermal hydraulics design and analysis methodology developed at the NASA Marshall Space Flight Center, in support of the nuclear thermal propulsion development effort. The objective of this campaign is to bridge the design methods of the Rover/NERVA era with a modern computational fluid dynamics and heat transfer methodology, to predict thermal, fluid, and hydrogen environments of a hypothetical solid-core nuclear thermal engine, the Small Engine, designed in the 1960s. The computational methodology is based on an unstructured-grid, pressure-based, all-speeds, chemically reacting, computational fluid dynamics and heat transfer platform, while formulations of flow and heat transfer through porous and solid media were implemented to describe those of hydrogen flow channels inside the solid core. Design analyses of a single flow element and the entire solid-core thrust chamber of the Small Engine were performed and the results are presented herein.
Exploiting current-generation graphics hardware for synthetic-scene generation

NASA Astrophysics Data System (ADS)

Tanner, Michael A.; Keen, Wayne A.

2010-04-01

Increasing seeker frame rate and pixel count, as well as the demand for higher levels of scene fidelity, have driven scene generation software for hardware-in-the-loop (HWIL) and software-in-the-loop (SWIL) testing to higher levels of parallelization. Because modern PC graphics cards provide multiple computational cores (240 shader cores for current NVIDIA Corporation GeForce and Quadro cards), implementation of phenomenology codes on graphics processing units (GPUs) offers significant potential for simultaneous enhancement of simulation frame rate and fidelity. Taking advantage of this potential requires an algorithm implementation that is structured to minimize data transfers between the central processing unit (CPU) and the GPU. In this paper, preliminary methodologies developed at the Kinetic Hardware In-The-Loop Simulator (KHILS) will be presented. Included in this paper will be various language tradeoffs between conventional shader programming, Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL), including performance trades and possible pathways for future tool development.
Real-time simulation of large-scale neural architectures for visual features computation based on GPU.

PubMed

Chessa, Manuela; Bianchi, Valentina; Zampetti, Massimo; Sabatini, Silvio P; Solari, Fabio

2012-01-01

The intrinsic parallelism of visual neural architectures based on distributed hierarchical layers is well suited to be implemented on the multi-core architectures of modern graphics cards. The design strategies that allow us to optimally take advantage of such parallelism, in order to efficiently map on GPU the hierarchy of layers and the canonical neural computations, are proposed. Specifically, the advantages of a cortical map-like representation of the data are exploited. Moreover, a GPU implementation of a novel neural architecture for the computation of binocular disparity from stereo image pairs, based on populations of binocular energy neurons, is presented. The implemented neural model achieves good performances in terms of reliability of the disparity estimates and a near real-time execution speed, thus demonstrating the effectiveness of the devised design strategies. The proposed approach is valid in general, since the neural building blocks we implemented are a common basis for the modeling of visual neural functionalities.
Artificial Intelligence in Medical Practice: The Question to the Answer?

PubMed

Miller, D Douglas; Brown, Eric W

2018-02-01

Computer science advances and ultra-fast computing speeds find artificial intelligence (AI) broadly benefitting modern society: forecasting weather, recognizing faces, detecting fraud, and deciphering genomics. AI's future role in medical practice remains an unanswered question. Machines (computers) learn to detect patterns not decipherable using biostatistics by processing massive datasets (big data) through layered mathematical models (algorithms). Correcting algorithm mistakes (training) adds to AI predictive model confidence. AI is being successfully applied for image analysis in radiology, pathology, and dermatology, with diagnostic speed exceeding, and accuracy paralleling, medical experts. While diagnostic confidence never reaches 100%, combining machines plus physicians reliably enhances system performance. Cognitive programs are impacting medical practice by applying natural language processing to read the rapidly expanding scientific literature and collate years of diverse electronic medical records. In this and other ways, AI may optimize the care trajectory of chronic disease patients, suggest precision therapies for complex illnesses, reduce medical errors, and improve subject enrollment into clinical trials.
Computer Synthesis Approaches of Hyperboloid Gear Drives with Linear Contact

NASA Astrophysics Data System (ADS)

Abadjiev, Valentin; Kawasaki, Haruhisa

2014-09-01

Computer-aided design has advanced the development of various types of software for scientific research in gearing theory, as well as for adequate scientific support of gear drive manufacture. The computer programs presented here are based on mathematical models that result from scientific research. Modern gear transmissions require the construction of new mathematical approaches to their geometric, technological and strength analysis. The process of optimization, synthesis and design is based on adequate iteration procedures that find an optimal solution by varying definite parameters. The study is dedicated to the accepted methodology for creating software for the synthesis of a class of high-reduction hyperboloid gears, Spiroid and Helicon ones (Spiroid and Helicon are trademarks registered by the Illinois Tool Works, Chicago, Ill). The developed basic computer products are software based on original mathematical models. They are based on two mathematical models for the synthesis: "upon a pitch contact point" and "upon a mesh region". Computer programs are worked out on the basis of the described mathematical models, and the relations between them are shown. The application of the shown approaches to the synthesis of the gear drives under discussion is illustrated.
Computational Software to Fit Seismic Data Using Epidemic-Type Aftershock Sequence Models and Modeling Performance Comparisons

NASA Astrophysics Data System (ADS)

Chu, A.

2016-12-01

Modern earthquake catalogs are often analyzed using spatial-temporal point process models such as the epidemic-type aftershock sequence (ETAS) models of Ogata (1998). My work implements three of the homogeneous ETAS models described in Ogata (1998). With a model's log-likelihood function, my software finds the Maximum-Likelihood Estimates (MLEs) of the model's parameters to estimate the homogeneous background rate and the temporal and spatial parameters that govern triggering effects. The EM algorithm is employed for its advantages of stability and robustness (Veen and Schoenberg, 2008). My work also presents comparisons among the three models in robustness, convergence speed, and implementation from theory to computing practice. Up-to-date regional seismic data from seismically active areas such as Southern California and Japan are used to demonstrate the comparisons. Data analysis has been done using the programming languages Java and R. Java has the advantages of being strongly typed and making memory resources easy to control, while R has the advantage of numerous available functions for statistical computing. Comparisons are also made between the two programming languages in convergence and stability, computational speed, and ease of implementation. Issues that may affect convergence, such as spatial shapes, are discussed.
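As a rough illustration of the kind of quantity such software evaluates, the sketch below computes a temporal-only ETAS conditional intensity, lambda(t) = mu + sum over earlier events of K * exp(alpha * (m_i - m0)) * (t - t_i + c)^(-p). This is a simplification and an assumption: the models in the abstract are space-time models with an additional spatial kernel, and the function name and parameter names here are illustrative, not taken from the author's Java or R code.

```python
import math
from typing import Iterable, Tuple


def etas_intensity(t: float,
                   events: Iterable[Tuple[float, float]],
                   mu: float, K: float, alpha: float,
                   c: float, p: float, m0: float) -> float:
    """Temporal ETAS conditional intensity at time t.

    events: (time, magnitude) pairs; only events strictly before t contribute.
    mu:     homogeneous background rate.
    K, alpha, c, p: triggering parameters (productivity and Omori-type decay).
    m0:     reference (cutoff) magnitude.
    """
    rate = mu
    for t_i, m_i in events:
        if t_i < t:
            productivity = K * math.exp(alpha * (m_i - m0))
            rate += productivity * (t - t_i + c) ** (-p)
    return rate
```

A maximum-likelihood fit would evaluate this intensity at every event time inside the log-likelihood, which is where most of the computational cost of fitting arises.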
Computations of Axisymmetric Flows in Hypersonic Shock Tubes

NASA Technical Reports Server (NTRS)

Sharma, Surendra P.; Wilson, Gregory J.

1995-01-01

A time-accurate two-dimensional fluid code is used to compute test times in shock tubes operated at supersonic speeds. Unlike previous studies, this investigation resolves the finer temporal details of the shock-tube flow by making use of modern supercomputers and state-of-the-art computational fluid dynamic solution techniques. The code, besides solving the time-dependent fluid equations, also accounts for the finite rate chemistry in the hypersonic environment. The flowfield solutions are used to estimate relevant shock-tube parameters for laminar flow, such as test times, and to predict density and velocity profiles. Boundary-layer parameters such as bar-delta(sub u), bar-delta(sup *), and bar-tau(sub w), and test time parameters such as bar-tau and particle time of flight t(sub f), are computed and compared with those evaluated by using Mirels' correlations. This article then discusses in detail the effects of flow nonuniformities on particle time-of-flight behind the normal shock and, consequently, on the interpretation of shock-tube data. This article concludes that for accurate interpretation of shock-tube data, a detailed analysis of flowfield parameters, using a computer code such as used in this study, must be performed.
CompNanoTox2015: novel perspectives from a European conference on computational nanotoxicology on predictive nanotoxicology.

PubMed

Bañares, Miguel A; Haase, Andrea; Tran, Lang; Lobaskin, Vladimir; Oberdörster, Günter; Rallo, Robert; Leszczynski, Jerzy; Hoet, Peter; Korenstein, Rafi; Hardy, Barry; Puzyn, Tomasz

2017-09-01

A first European Conference on Computational Nanotoxicology, CompNanoTox, was held in November 2015 in Benahavís, Spain with the objectives to disseminate and integrate results from the European modeling and database projects (NanoPUZZLES, ModENPTox, PreNanoTox, MembraneNanoPart, MODERN, eNanoMapper and EU COST TD1204 MODENA) as well as to create synergies within the European NanoSafety Cluster. This conference was supported by the COST Action TD1204 MODENA on developing computational methods for toxicological risk assessment of engineered nanoparticles and provided a unique opportunity for cross fertilization among complementary disciplines. The efforts to develop and validate computational models crucially depend on high quality experimental data and relevant assays which will be the basis to identify relevant descriptors. The ambitious overarching goal of this conference was to promote predictive nanotoxicology, which can only be achieved by a close collaboration between the computational scientists (e.g. database experts, modeling experts for structure, (eco) toxicological effects, performance and interaction of nanomaterials) and experimentalists from different areas (in particular toxicologists, biologists, chemists and material scientists, among others). The main outcome and new perspectives of this conference are summarized here.
CompNanoTox2015: novel perspectives from a European conference on computational nanotoxicology on predictive nanotoxicology

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bañares, Miguel A.; Haase, Andrea; Tran, Lang

A first European Conference on Computational Nanotoxicology, CompNanoTox, was held in November 2015 in Benahavís, Spain with the objectives to disseminate and integrate results from the European modeling and database projects (NanoPUZZLES, ModENPTox, PreNanoTox, MembraneNanoPart, MODERN, eNanoMapper and EU COST TD1204 MODENA) as well as to create synergies within the European NanoSafety Cluster. This conference was supported by the COST Action TD1204 MODENA on developing computational methods for toxicological risk assessment of engineered nanoparticles and provided a unique opportunity for cross-fertilization among complementary disciplines. The efforts to develop and validate computational models crucially depend on high quality experimental data and relevant assays which will be the basis to identify relevant descriptors. The ambitious overarching goal of this conference was to promote predictive nanotoxicology, which can only be achieved by a close collaboration between the computational scientists (e.g. database experts, modeling experts for structure, (eco) toxicological effects, performance and interaction of nanomaterials) and experimentalists from different areas (in particular toxicologists, biologists, chemists and material scientists, among others). The main outcome and new perspectives of this conference are summarized here.

Processing-in-Memory Enabled Graphics Processors for 3D Rendering

DOE Office of Scientific and Technical Information (OSTI.GOV)

Xie, Chenhao; Song, Shuaiwen; Wang, Jing

2017-02-06

The performance of 3D rendering on a Graphics Processing Unit, which converts a 3D vector stream into 2D frames with 3D image effects, significantly impacts users’ gaming experience on modern computer systems. Due to the high texture throughput in 3D rendering, main memory bandwidth becomes a critical obstacle for improving the overall rendering performance. 3D stacked memory systems such as Hybrid Memory Cube (HMC) provide opportunities to significantly overcome the memory wall by directly connecting logic controllers to DRAM dies. Based on the observation that texel fetches significantly impact off-chip memory traffic, we propose two architectural designs to enable Processing-In-Memory based GPUs for efficient 3D rendering.
Computing Principal Eigenvectors of Large Web Graphs: Algorithms and Accelerations Related to PageRank and HITS

ERIC Educational Resources Information Center

Nagasinghe, Iranga

2010-01-01

This thesis investigates and develops a few acceleration techniques for the search engine algorithms used in PageRank and HITS computations. PageRank and HITS methods are two highly successful applications of modern Linear Algebra in computer science and engineering. They constitute the essential technologies accounted for the immense growth and…
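For orientation, the baseline that acceleration techniques of this kind start from is the plain power iteration for the PageRank vector, the principal eigenvector of the damped link matrix. The sketch below shows that unaccelerated baseline only; it is not one of the thesis's methods. The damping factor 0.85, the uniform redistribution of dangling-node rank, and the function name `pagerank` are conventional choices assumed here for illustration.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]],
             damping: float = 0.85,
             tol: float = 1e-10,
             max_iter: int = 1000) -> Dict[str, float]:
    """Plain power iteration for PageRank on a small link graph.

    links maps each page to the list of pages it links to. Pages that only
    appear as link targets are included automatically. Rank held by pages
    without out-links (dangling nodes) is spread uniformly over all pages.
    """
    nodes = sorted(set(links) | {v for outs in links.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        # Teleportation term plus the uniformly redistributed dangling rank.
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            outs = links.get(u, [])
            for v in outs:
                new[v] += damping * rank[u] / len(outs)
        delta = sum(abs(new[u] - rank[u]) for u in nodes)  # L1 change
        rank = new
        if delta < tol:
            break
    return rank
```

Each iteration is one sparse matrix-vector product, so the total cost is governed by the iteration count, which is the quantity that acceleration schemes aim to reduce.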
Principles of Tablet Computing for Educators

ERIC Educational Resources Information Center

Katzan, Harry, Jr.

2015-01-01

In the study of modern technology for the 21st century, one of the most popular subjects is tablet computing. Tablet computers are now used in business, government, education, and the personal lives of practically everyone--at least, it seems that way. As of October 2013, Apple has sold 170 million iPads. The success of tablets is enormous and has…
Incorporating Knowledge of Legal and Ethical Aspects into Computing Curricula of South African Universities

ERIC Educational Resources Information Center

Wayman, Ian; Kyobe, Michael

2012-01-01

As students in computing disciplines are introduced to modern information technologies, numerous unethical practices also escalate. With the increase in stringent legislations on use of IT, users of technology could easily be held liable for violation of this legislation. There is however lack of understanding of social aspects of computing, and…
The Effects of Modern Mathematics Computer Games on Mathematics Achievement and Class Motivation

ERIC Educational Resources Information Center

Kebritchi, Mansureh; Hirumi, Atsusi; Bai, Haiyan

2010-01-01

This study examined the effects of a computer game on students' mathematics achievement and motivation, and the role of prior mathematics knowledge, computer skill, and English language skill on their achievement and motivation as they played the game. A total of 193 students and 10 teachers participated in this study. The teachers were randomly…
A Computational Turn: Fiction and the Forms of Invention, 1965-1980

ERIC Educational Resources Information Center

Sims, Matthew

2017-01-01

This dissertation examines the relationship between American fiction and computing. More specifically, it argues that each domain explored similar formal possibilities during a period I refer to as Early Postmodernism (1965-1980). In the case of computing, the 60s and 70s represent a turning point in which modern systems and approaches were…
Investigation of the part-load performance of two 1.12 MW regenerative marine gas turbines

NASA Astrophysics Data System (ADS)

Korakianitis, T.; Beier, K. J.

1994-04-01

Regenerative and intercooled-regenerative gas turbine engines with low pressure ratio have significant efficiency advantages over traditional aero-derivative engines of higher pressure ratios, and can compete with modern diesel engines for marine propulsion. Their performance is extremely sensitive to thermodynamic-cycle parameter choices and the type of components. The performances of two 1.12 MW (1500 hp) regenerative gas turbines are predicted with computer simulations. One engine has a single-shaft configuration, and the other has a gas-generator/power-turbine combination. The latter arrangement is essential for wide off-design operating regime. The performance of each engine driving fixed-pitch and controllable-pitch propellers, or an AC electric bus (for electric-motor-driven propellers) is investigated. For commercial applications the controllable-pitch propeller may have efficiency advantages (depending on engine type and shaft arrangements). For military applications the electric drive provides better operational flexibility.
Accurate reaction-diffusion operator splitting on tetrahedral meshes for parallel stochastic molecular simulations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hepburn, I.; De Schutter, E., E-mail: erik@oist.jp; Theoretical Neurobiology & Neuroengineering, University of Antwerp, Antwerp 2610

Spatial stochastic molecular simulations in biology are limited by the intense computation required to track molecules in space either in a discrete time or discrete space framework, which has led to the development of parallel methods that can take advantage of the power of modern supercomputers in recent years. We systematically test suggested components of stochastic reaction-diffusion operator splitting in the literature and discuss their effects on accuracy. We introduce an operator splitting implementation for irregular meshes that enhances accuracy with minimal performance cost. We test a range of models in small-scale MPI simulations from simple diffusion models to realistic biological models and find that multi-dimensional geometry partitioning is an important consideration for optimum performance. We demonstrate performance gains of 1-3 orders of magnitude in the parallel implementation, with peak performance strongly dependent on model specification.
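The general idea of reaction-diffusion operator splitting can be shown with a much simpler analogue than the method in this record. The sketch below is deterministic, one-dimensional, and on a regular periodic grid, whereas the paper concerns stochastic simulation on tetrahedral meshes; it only illustrates advancing diffusion and reaction as two separate sub-steps within each time step. The equation du/dt = D u_xx - k u, the function name `split_step`, and all parameters are assumptions made for illustration.

```python
import math
from typing import List


def split_step(u: List[float], D: float, k: float,
               dx: float, dt: float) -> List[float]:
    """One first-order (Lie) splitting step for du/dt = D u_xx - k u.

    Sub-step 1: explicit finite-difference diffusion on a periodic grid
                (stable when D * dt / dx**2 <= 0.5).
    Sub-step 2: exact solution of the decay reaction over dt.
    """
    n = len(u)
    r = D * dt / (dx * dx)
    diffused = [
        u[i] + r * (u[(i - 1) % n] - 2.0 * u[i] + u[(i + 1) % n])
        for i in range(n)
    ]
    decay = math.exp(-k * dt)
    return [v * decay for v in diffused]
```

Because the periodic diffusion sub-step conserves total mass and the reaction sub-step scales it by exp(-k dt), the total after any number of steps can be checked against the exact decay, which makes the splitting easy to validate.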
GPU-based parallel algorithm for blind image restoration using midfrequency-based methods

NASA Astrophysics Data System (ADS)

Xie, Lang; Luo, Yi-han; Bao, Qi-liang

2013-08-01

GPU-based general-purpose computing is a new branch of modern parallel computing, so the study of parallel algorithms specially designed for GPU hardware architecture is of great significance. In order to solve the problem of high computational complexity and poor real-time performance in blind image restoration, the midfrequency-based algorithm for blind image restoration was analyzed and improved in this paper. Furthermore, a midfrequency-based filtering method is used to restore the image with hardly any recursion or iteration. Combining the algorithm with data intensiveness, data-parallel computing and the GPU execution model of single instruction, multiple threads, a new parallel midfrequency-based algorithm for blind image restoration is proposed in this paper, which is suitable for stream computing on GPUs. In this algorithm, the GPU is utilized to accelerate the estimation of class-G point spread functions and midfrequency-based filtering. Aiming at better management of the GPU threads, the threads in a grid are scheduled according to the decomposition of the filtering data in the frequency domain after the optimization of data access and of the communication between the host and the device. The kernel parallelism structure is determined by the decomposition of the filtering data to ensure the transmission rate and to get around the memory bandwidth limitation. The results show that, with the new algorithm, the operational speed is significantly increased and the real-time performance of image restoration is effectively improved, especially for high-resolution images.
Distributed GPU Computing in GIScience

NASA Astrophysics Data System (ADS)

Jiang, Y.; Yang, C.; Huang, Q.; Li, J.; Sun, M.

2013-12-01

Geoscientists strive to discover principles and patterns hidden inside ever-growing Big Data for scientific discoveries. To better achieve this objective, more capable computing resources are required to process, analyze and visualize Big Data (Ferreira et al., 2003; Li et al., 2013). Current CPU-based computing techniques cannot promptly meet the computing challenges caused by the increasing volume of datasets from different domains, such as social media, earth observation, and environmental sensing (Li et al., 2013). Meanwhile, CPU-based computing resources structured as clusters or supercomputers are costly. In the past several years, as GPU-based technology has matured in both capability and performance, GPU-based computing has emerged as a new computing paradigm. Compared to the traditional microprocessor, the modern GPU, as a compelling alternative, offers high parallel processing capability with cost-effectiveness and efficiency (Owens et al., 2008), although it was initially designed for graphical rendering in the visualization pipeline. This presentation reports a distributed GPU computing framework for integrating GPU-based computing within a distributed environment. Within this framework, 1) for each single computer, both GPU-based and CPU-based computing resources can be fully utilized to improve the performance of visualizing and processing Big Data; 2) within a network environment, a variety of computers can be used to build up a virtual supercomputer to support CPU-based and GPU-based computing in a distributed computing environment; 3) GPUs, as graphics-targeted devices, are used to greatly improve the rendering efficiency in distributed geo-visualization, especially for 3D/4D visualization. Key words: Geovisualization, GIScience, Spatiotemporal Studies.
References: 1. Ferreira de Oliveira, M. C., & Levkowitz, H. (2003). From visual data exploration to visual data mining: A survey. Visualization and Computer Graphics, IEEE Transactions on, 9(3), 378-394. 2. Li, J., Jiang, Y., Yang, C., Huang, Q., & Rice, M. (2013). Visualizing 3D/4D Environmental Data Using Many-core Graphics Processing Units (GPUs) and Multi-core Central Processing Units (CPUs). Computers & Geosciences, 59(9), 78-89. 3. Owens, J. D., Houston, M., Luebke, D., Green, S., Stone, J. E., & Phillips, J. C. (2008). GPU computing. Proceedings of the IEEE, 96(5), 879-899.
The modern library: lost and found.

PubMed Central

Lindberg, D A

1996-01-01

The modern library, a term that was heard frequently in the mid-twentieth century, has fallen into disuse. The over-promotion of computers and all that their enthusiasts promised probably hastened its demise. Today, networking is transforming how libraries provide--and users seek--information. Although the Internet is the natural environment for the health sciences librarian, it is going through growing pains as we face issues of censorship and standards. Today's "modern librarian" must not only be adept at using the Internet but must become familiar with digital information in all its forms--images, full text, and factual data banks. Most important, to stay "modern," today's librarians must embark on a program of lifelong learning that will enable them to make optimum use of the advantages offered by modern technology. PMID:8938334
Single-chip microprocessor that communicates directly using light

NASA Astrophysics Data System (ADS)

Sun, Chen; Wade, Mark T.; Lee, Yunsup; Orcutt, Jason S.; Alloatti, Luca; Georgas, Michael S.; Waterman, Andrew S.; Shainline, Jeffrey M.; Avizienis, Rimas R.; Lin, Sen; Moss, Benjamin R.; Kumar, Rajesh; Pavanello, Fabio; Atabaki, Amir H.; Cook, Henry M.; Ou, Albert J.; Leu, Jonathan C.; Chen, Yu-Hsin; Asanović, Krste; Ram, Rajeev J.; Popović, Miloš A.; Stojanović, Vladimir M.

2015-12-01

Data transport across short electrical wires is limited by both bandwidth and power density, which creates a performance bottleneck for semiconductor microchips in modern computer systems, from mobile phones to large-scale data centres. These limitations can be overcome by using optical communications based on chip-scale electronic-photonic systems enabled by silicon-based nanophotonic devices. However, combining electronics and photonics on the same chip has proved challenging, owing to microchip manufacturing conflicts between electronics and photonics. Consequently, current electronic-photonic chips are limited to niche manufacturing processes and include only a few optical devices alongside simple circuits. Here we report an electronic-photonic system on a single chip integrating over 70 million transistors and 850 photonic components that work together to provide logic, memory, and interconnect functions. This system is a realization of a microprocessor that uses on-chip photonic devices to directly communicate with other chips using light. To integrate electronics and photonics at the scale of a microprocessor chip, we adopt a ‘zero-change’ approach to the integration of photonics. Instead of developing a custom process to enable the fabrication of photonics, which would complicate or eliminate the possibility of integration with state-of-the-art transistors at large scale and at high yield, we design optical devices using a standard microelectronics foundry process that is used for modern microprocessors. This demonstration could represent the beginning of an era of chip-scale electronic-photonic systems with the potential to transform computing system architectures, enabling more powerful computers, from network infrastructure to data centres and supercomputers.
Single-chip microprocessor that communicates directly using light.

PubMed

Sun, Chen; Wade, Mark T; Lee, Yunsup; Orcutt, Jason S; Alloatti, Luca; Georgas, Michael S; Waterman, Andrew S; Shainline, Jeffrey M; Avizienis, Rimas R; Lin, Sen; Moss, Benjamin R; Kumar, Rajesh; Pavanello, Fabio; Atabaki, Amir H; Cook, Henry M; Ou, Albert J; Leu, Jonathan C; Chen, Yu-Hsin; Asanović, Krste; Ram, Rajeev J; Popović, Miloš A; Stojanović, Vladimir M

2015-12-24

Data transport across short electrical wires is limited by both bandwidth and power density, which creates a performance bottleneck for semiconductor microchips in modern computer systems--from mobile phones to large-scale data centres. These limitations can be overcome by using optical communications based on chip-scale electronic-photonic systems enabled by silicon-based nanophotonic devices. However, combining electronics and photonics on the same chip has proved challenging, owing to microchip manufacturing conflicts between electronics and photonics. Consequently, current electronic-photonic chips are limited to niche manufacturing processes and include only a few optical devices alongside simple circuits. Here we report an electronic-photonic system on a single chip integrating over 70 million transistors and 850 photonic components that work together to provide logic, memory, and interconnect functions. This system is a realization of a microprocessor that uses on-chip photonic devices to directly communicate with other chips using light. To integrate electronics and photonics at the scale of a microprocessor chip, we adopt a 'zero-change' approach to the integration of photonics. Instead of developing a custom process to enable the fabrication of photonics, which would complicate or eliminate the possibility of integration with state-of-the-art transistors at large scale and at high yield, we design optical devices using a standard microelectronics foundry process that is used for modern microprocessors. This demonstration could represent the beginning of an era of chip-scale electronic-photonic systems with the potential to transform computing system architectures, enabling more powerful computers, from network infrastructure to data centres and supercomputers.
Embedded ensemble propagation for improving performance, portability, and scalability of uncertainty quantification on emerging computational architectures

DOE PAGES

Phipps, Eric T.; D'Elia, Marta; Edwards, Harold C.; ...

2017-04-18

In this study, quantifying simulation uncertainties is a critical component of rigorous predictive simulation. A key component of this is forward propagation of uncertainties in simulation input data to output quantities of interest. Typical approaches involve repeated sampling of the simulation over the uncertain input data, and can require numerous samples when accurately propagating uncertainties from large numbers of sources. Often simulation processes from sample to sample are similar and much of the data generated from each sample evaluation could be reused. We explore a new method for implementing sampling methods that simultaneously propagates groups of samples together in an embedded fashion, which we call embedded ensemble propagation. We show how this approach takes advantage of properties of modern computer architectures to improve performance by enabling reuse between samples, reducing memory bandwidth requirements, improving memory access patterns, improving opportunities for fine-grained parallelization, and reducing communication costs. We describe a software technique for implementing embedded ensemble propagation based on the use of C++ templates and describe its integration with various scientific computing libraries within Trilinos. We demonstrate improved performance, portability and scalability for the approach applied to the simulation of partial differential equations on a variety of CPU, GPU, and accelerator architectures, including up to 131,072 cores on a Cray XK7 (Titan).
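The idea of propagating a group of samples through one pass of the simulation code can be illustrated with a small sketch. The abstract describes a C++ template technique inside Trilinos; the Python class below is only an analogy that uses operator overloading, and the names `Ensemble` and `decay_step` are hypothetical, not taken from the paper or from Trilinos.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Ensemble:
    """A fixed-size group of sample values that travel through a computation together."""
    values: Tuple[float, ...]

    def _lift(self, other):
        # Broadcast a plain number to every member, or check that sizes match.
        if isinstance(other, Ensemble):
            if len(other.values) != len(self.values):
                raise ValueError("ensemble sizes differ")
            return other.values
        return (float(other),) * len(self.values)

    def __add__(self, other):
        return Ensemble(tuple(a + b for a, b in zip(self.values, self._lift(other))))

    __radd__ = __add__

    def __mul__(self, other):
        return Ensemble(tuple(a * b for a, b in zip(self.values, self._lift(other))))

    __rmul__ = __mul__


def decay_step(u, k, dt):
    """One explicit Euler step of du/dt = -k*u.

    The same source line serves a single sample (floats) or a whole
    group of samples (Ensemble), which is the point of the sketch.
    """
    return u + (-dt) * (k * u)


# Three uncertain decay rates advanced in one call instead of three.
rates = Ensemble((1.0, 2.0, 4.0))
state = decay_step(1.0, rates, 0.1)
```

Because the control flow runs once for all members, loop overhead and any shared data are amortized across the samples; in a compiled setting that is where the reuse and memory-access benefits mentioned in the abstract would come from.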
Toward Question-Asking Machines: The Logic of Questions and the Inquiry Calculus

NASA Technical Reports Server (NTRS)

Knuth, Kevin H.

2005-01-01

For over a century, the study of logic has focused on the algebra of logical statements. This work, first performed by George Boole, has led to the development of modern computers, and was shown by Richard T. Cox to be the foundation of Bayesian inference. Meanwhile the logic of questions has been much neglected. For our computing machines to be truly intelligent, they need to be able to ask relevant questions. In this paper I will show how the Boolean lattice of logical statements gives rise to the free distributive lattice of questions thus defining their algebra. Furthermore, there exists a quantity analogous to probability, called relevance, which quantifies the degree to which one question answers another. I will show that relevance is not only a natural generalization of information theory, but also forms its foundation.
National Combustion Code: A Multidisciplinary Combustor Design System

NASA Technical Reports Server (NTRS)

Stubbs, Robert M.; Liu, Nan-Suey

1997-01-01

The Internal Fluid Mechanics Division conducts both basic research and technology, and system technology research for aerospace propulsion systems components. The research within the division, which is both computational and experimental, is aimed at improving fundamental understanding of flow physics in inlets, ducts, nozzles, turbomachinery, and combustors. This article and the following three articles highlight some of the work accomplished in 1996. A multidisciplinary combustor design system is critical for optimizing the combustor design process. Such a system should include sophisticated computer-aided design (CAD) tools for geometry creation, advanced mesh generators for creating solid model representations, a common framework for fluid flow and structural analyses, modern postprocessing tools, and parallel processing. The goal of the present effort is to develop some of the enabling technologies and to demonstrate their overall performance in an integrated system called the National Combustion Code.
A full 3D-navigation system in a suitcase.

PubMed

Freysinger, W; Truppe, M J; Gunkel, A R; Thumfart, W F

2001-01-01

To reduce the impact of contemporary 3D-navigation systems on the environment of typical otorhinolaryngologic operating rooms, we demonstrate that a transfer of navigation software to modern high-power notebook computers is feasible and results in a practicable way to provide positional information to a surgeon intraoperatively. The ARTMA Virtual Patient System has been implemented on a Macintosh PowerBook G3 and, in connection with the Polhemus FASTRAK digitizer, provides intraoperative positional information during endoscopic endonasal surgery. Satisfactory intraoperative navigation has been realized in two- and three-dimensional medical image data sets (i.e., X-ray, ultrasound images, CT, and MR) and live video. This proof-of-concept study demonstrates that acceptable ergonomics and excellent performance of the system can be achieved with contemporary high-end notebook computers. Copyright 2001 Wiley-Liss, Inc.
Massively parallel GPU-accelerated minimization of classical density functional theory

NASA Astrophysics Data System (ADS)

Stopper, Daniel; Roth, Roland

2017-08-01

In this paper, we discuss the ability to numerically minimize the grand potential of hard disks in two-dimensional and of hard spheres in three-dimensional space within the framework of classical density functional and fundamental measure theory on modern graphics cards. Our main finding is that a massively parallel minimization leads to an enormous performance gain in comparison to standard sequential minimization schemes. Furthermore, the results indicate that in complex multi-dimensional situations, a heavy parallel minimization of the grand potential seems to be mandatory in order to reach a reasonable balance between accuracy and computational cost.
Large-scale-system effectiveness analysis. Final report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Patton, A.D.; Ayoub, A.K.; Foster, J.W.

1979-11-01

Objective of the research project has been the investigation and development of methods for calculating system reliability indices which have absolute, and measurable, significance to consumers. Such indices are a necessary prerequisite to any scheme for system optimization which includes the economic consequences of consumer service interruptions. A further area of investigation has been joint consideration of generation and transmission in reliability studies. Methods for finding or estimating the probability distributions of some measures of reliability performance have been developed. The application of modern Monte Carlo simulation methods to compute reliability indices in generating systems has been studied.
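As a rough illustration of how Monte Carlo sampling yields a reliability index for a generating system, the sketch below estimates a loss-of-load probability. The choice of index, the two-state unit model, and the function name `estimate_lolp` are assumptions made for this example; the report's own methods and data are not reproduced here.

```python
import random


def estimate_lolp(units, load_mw, n_samples=20000, seed=0):
    """Estimate the probability that available capacity falls below the load.

    units: list of (capacity_mw, forced_outage_rate) pairs; each unit is
    treated as independently either fully available or fully out.
    """
    rng = random.Random(seed)
    shortfalls = 0
    for _ in range(n_samples):
        # A unit is available in this sample when its draw clears the outage rate.
        available = sum(cap for cap, q in units if rng.random() >= q)
        if available < load_mw:
            shortfalls += 1
    return shortfalls / n_samples


# Made-up system: two 50 MW units, each out 10% of the time, serving 60 MW.
lolp = estimate_lolp([(50.0, 0.1), (50.0, 0.1)], load_mw=60.0)
```

For this toy system the exact answer is 1 - 0.9 * 0.9 = 0.19, which gives a simple check on the sampled estimate.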
Numerical and experimental analyses of lighting columns in terms of passive safety

NASA Astrophysics Data System (ADS)

Jedliński, Tomasz Ireneusz; Buśkiewicz, Jacek

2018-01-01

Modern lighting columns have a very beneficial influence on road safety. Currently, the columns are being designed to keep the driver safe in the event of a car collision. The following work compares experimental results of vehicle impact on a lighting column with FEM simulations performed using the Ansys LS-DYNA program. Due to high costs of experiments and time-consuming research process, the computer software seems to be a very useful utility in the development of pole structures, which are to absorb kinetic energy of the vehicle in a precisely prescribed way.

Creating technical heritage object replicas in a virtual environment

NASA Astrophysics Data System (ADS)

Egorova, Olga; Shcherbinin, Dmitry

2016-03-01

The paper presents innovative informatics methods for creating virtual technical heritage replicas, which are of significant scientific and practical importance not only to researchers but to the public in general. By performing 3D modeling and animation of aircraft, spaceships, architectural-engineering buildings, and other technical objects, the process of learning is achieved while promoting the preservation of the replicas for future generations. Modern approaches based on the wide usage of computer technologies attract a greater number of young people to explore the history of science and technology and renew their interest in the field of mechanical engineering.
Injector element characterization methodology

NASA Technical Reports Server (NTRS)

Cox, George B., Jr.

1988-01-01

Characterization of liquid rocket engine injector elements is an important part of the development process for rocket engine combustion devices. Modern nonintrusive instrumentation for flow velocity and spray droplet size measurement, and automated, computer-controlled test facilities allow rapid, low-cost evaluation of injector element performance and behavior. Application of these methods in rocket engine development, paralleling their use in gas turbine engine development, will reduce rocket engine development cost and risk. The Alternate Turbopump (ATP) Hot Gas Systems (HGS) preburner injector elements were characterized using such methods, and the methodology and some of the results obtained will be shown.
Automation based on knowledge modeling theory and its applications in engine diagnostic systems using Space Shuttle Main Engine vibrational data. M.S. Thesis

NASA Technical Reports Server (NTRS)

Kim, Jonnathan H.

1995-01-01

Humans can perform many complicated tasks without explicit rules. This inherent and advantageous capability becomes a hurdle when a task is to be automated. Modern computers and numerical calculations require explicit rules and discrete numerical values. In order to bridge the gap between human knowledge and automating tools, a knowledge model is proposed. Knowledge modeling techniques are discussed and utilized to automate a labor and time intensive task of detecting anomalous bearing wear patterns in the Space Shuttle Main Engine (SSME) High Pressure Oxygen Turbopump (HPOTP).
Rare earth element and rare metal inventory of central Asia

USGS Publications Warehouse

Mihalasky, Mark J.; Tucker, Robert D.; Renaud, Karine; Verstraeten, Ingrid M.

2018-03-06

Rare earth elements (REE), with their unique physical and chemical properties, are an essential part of modern living. REE have enabled development and manufacture of high-performance materials, processes, and electronic technologies commonly used today in computing and communications, clean energy and transportation, medical treatment and health care, glass and ceramics, aerospace and defense, and metallurgy and chemical refining. Central Asia is an emerging REE and rare metals (RM) producing region. A newly compiled inventory of REE-RM-bearing mineral occurrences and delineation of areas-of-interest indicate this region may have considerable undiscovered resources.
Bayesian Software Health Management for Aircraft Guidance, Navigation, and Control

NASA Technical Reports Server (NTRS)

Schumann, Johann; Mbaya, Timmy; Menghoel, Ole

2011-01-01

Modern aircraft, both piloted fly-by-wire commercial aircraft as well as UAVs, more and more depend on highly complex safety critical software systems with many sensors and computer-controlled actuators. Despite careful design and V&V of the software, severe incidents have happened due to malfunctioning software. In this paper, we discuss the use of Bayesian networks (BNs) to monitor the health of the on-board software and sensor system, and to perform advanced on-board diagnostic reasoning. We will focus on the approach to develop reliable and robust health models for the combined software and sensor systems.
Battlefield awareness computers: the engine of battlefield digitization

NASA Astrophysics Data System (ADS)

Ho, Jackson; Chamseddine, Ahmad

1997-06-01

To modernize the army for the 21st century, the U.S. Army Digitization Office (ADO) initiated in 1995 the Force XXI Battle Command Brigade-and-Below (FBCB2) Applique program which became a centerpiece in the U.S. Army's master plan to win future information wars. The Applique team led by TRW fielded a 'tactical Internet' for Brigade and below command to demonstrate the advantages of 'shared situation awareness' and battlefield digitization in advanced war-fighting experiments (AWE) to be conducted in March 1997 at the Army's National Training Center in California. Computing Devices is designated the primary hardware developer for the militarized version of the battlefield awareness computers. The first generation of militarized battlefield awareness computer, designated as the V3 computer, was an integration of off-the-shelf components developed to meet the aggressive delivery requirements of the Task Force XXI AWE. The design efficiency and cost effectiveness of the computer hardware were secondary in importance to delivery deadlines imposed by the March 1997 AWE. However, declining defense budgets will impose cost constraints on the Force XXI production hardware that can only be met by rigorous value engineering to further improve design optimization for battlefield awareness without compromising the level of reliability the military has come to expect in modern military hardened vetronics. To answer the Army's needs for a more cost effective computing solution, Computing Devices developed a second generation 'combat ready' battlefield awareness computer, designated the V3+, which is designed specifically to meet the upcoming demands of Force XXI (FBCB2) and beyond. The primary design objective is to achieve a technologically superior design, value engineered to strike an optimal balance between reliability, life cycle cost, and procurement cost.
Recognizing that the diverse digitization demands of Force XXI cannot be adequately met by any one computer hardware solution, Computing Devices is planning to develop a notebook sized military computer designed for space limited vehicle-mounted applications, as well as a high-performance portable workstation equipped with a 19-inch, full color, ultra-high resolution and high brightness active matrix liquid crystal display (AMLCD) targeting the command posts and tactical operations centers (TOC) applications. Together with the wearable computers Computing Devices developed at the Minneapolis facility for dismounted soldiers, Computing Devices will have a complete suite of interoperable battlefield awareness computers spanning the entire spectrum of battle digitization operating environments. Although this paper's primary focus is on a second generation 'combat ready' battlefield awareness computer or the V3+, this paper also briefly discusses the extension of the V3+ architecture to address the needs of the embedded and command post applications.
Using parallel computing for the display and simulation of the space debris environment

NASA Astrophysics Data System (ADS)

Möckel, M.; Wiedemann, C.; Flegel, S.; Gelhaus, J.; Vörsmann, P.; Klinkrad, H.; Krag, H.

2011-07-01

Parallelism is becoming the leading paradigm in today's computer architectures. In order to take full advantage of this development, new algorithms have to be specifically designed for parallel execution while many old ones have to be upgraded accordingly. One field in which parallel computing has been firmly established for many years is computer graphics. Calculating and displaying three-dimensional computer generated imagery in real time requires complex numerical operations to be performed at high speed on a large number of objects. Since most of these objects can be processed independently, parallel computing is applicable in this field. Modern graphics processing units (GPUs) have become capable of performing millions of matrix and vector operations per second on multiple objects simultaneously. As a side project, a software tool is currently being developed at the Institute of Aerospace Systems that provides an animated, three-dimensional visualization of both actual and simulated space debris objects. Due to the nature of these objects it is possible to process them individually and independently from each other. Therefore, an analytical orbit propagation algorithm has been implemented to run on a GPU. By taking advantage of all its processing power a huge performance increase, compared to its CPU-based counterpart, could be achieved. For several years efforts have been made to harness this computing power for applications other than computer graphics. Software tools for the simulation of space debris are among those that could profit from embracing parallelism. With recently emerged software development tools such as OpenCL it is possible to transfer the new algorithms used in the visualization outside the field of computer graphics and implement them, for example, into the space debris simulation environment. This way they can make use of parallel hardware such as GPUs and Multi-Core-CPUs for faster computation. 
In this paper the visualization software will be introduced, including a comparison between the serial and the parallel method of orbit propagation. Ways of how to use the benefits of the latter method for space debris simulation will be discussed. An introduction to OpenCL will be given as well as an exemplary algorithm from the field of space debris simulation.
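The property the abstract relies on, that each debris object can be propagated independently of all others, can be sketched as follows. This is a plain two-body Kepler propagation in the orbital plane, chosen here as the simplest analytical propagator; it is an assumption for illustration and not the algorithm or the OpenCL kernel from the paper. The function names are hypothetical.

```python
import math

MU_EARTH = 398600.4418  # Earth's gravitational parameter in km^3/s^2


def solve_kepler(mean_anomaly, ecc, tol=1e-12, max_iter=50):
    """Solve Kepler's equation E - e*sin(E) = M for E by Newton iteration."""
    ecc_anomaly = mean_anomaly if ecc < 0.8 else math.pi
    for _ in range(max_iter):
        residual = ecc_anomaly - ecc * math.sin(ecc_anomaly) - mean_anomaly
        step = residual / (1.0 - ecc * math.cos(ecc_anomaly))
        ecc_anomaly -= step
        if abs(step) < tol:
            break
    return ecc_anomaly


def propagate(obj, dt):
    """Advance one object by dt seconds; obj = (semi-major axis km, eccentricity, M0 rad).

    Returns the in-plane position (x, y) in km, with x toward perigee.
    """
    a, ecc, m0 = obj
    mean_motion = math.sqrt(MU_EARTH / a ** 3)
    mean_anomaly = (m0 + mean_motion * dt) % (2.0 * math.pi)
    ecc_anomaly = solve_kepler(mean_anomaly, ecc)
    x = a * (math.cos(ecc_anomaly) - ecc)
    y = a * math.sqrt(1.0 - ecc ** 2) * math.sin(ecc_anomaly)
    return x, y


def propagate_all(objects, dt):
    """No object depends on another, so this loop is the part a GPU kernel
    would run with one work-item per object."""
    return [propagate(obj, dt) for obj in objects]
```

In a serial setting `propagate_all` is an ordinary loop; on parallel hardware the body of `propagate` would become the kernel and the list index would become the work-item id.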
Using parallel computing for the display and simulation of the space debris environment

NASA Astrophysics Data System (ADS)

Moeckel, Marek; Wiedemann, Carsten; Flegel, Sven Kevin; Gelhaus, Johannes; Klinkrad, Heiner; Krag, Holger; Voersmann, Peter

Parallelism is becoming the leading paradigm in today's computer architectures. In order to take full advantage of this development, new algorithms have to be specifically designed for parallel execution while many old ones have to be upgraded accordingly. One field in which parallel computing has been firmly established for many years is computer graphics. Calculating and displaying three-dimensional computer generated imagery in real time requires complex numerical operations to be performed at high speed on a large number of objects. Since most of these objects can be processed independently, parallel computing is applicable in this field. Modern graphics processing units (GPUs) have become capable of performing millions of matrix and vector operations per second on multiple objects simultaneously. As a side project, a software tool is currently being developed at the Institute of Aerospace Systems that provides an animated, three-dimensional visualization of both actual and simulated space debris objects. Due to the nature of these objects it is possible to process them individually and independently from each other. Therefore, an analytical orbit propagation algorithm has been implemented to run on a GPU. By taking advantage of all its processing power a huge performance increase, compared to its CPU-based counterpart, could be achieved. For several years efforts have been made to harness this computing power for applications other than computer graphics. Software tools for the simulation of space debris are among those that could profit from embracing parallelism. With recently emerged software development tools such as OpenCL it is possible to transfer the new algorithms used in the visualization outside the field of computer graphics and implement them, for example, into the space debris simulation environment. This way they can make use of parallel hardware such as GPUs and Multi-Core-CPUs for faster computation. 
In this paper the visualization software will be introduced, including a comparison between the serial and the parallel method of orbit propagation. Ways of how to use the benefits of the latter method for space debris simulation will be discussed. An introduction to OpenCL will be given as well as an exemplary algorithm from the field of space debris simulation.
Evaluating Application Resilience with XRay

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chen, Sui; Bronevetsky, Greg; Li, Bin

2015-05-07

The rising count and shrinking feature size of transistors within modern computers is making them increasingly vulnerable to various types of soft faults. This problem is especially acute in high-performance computing (HPC) systems used for scientific computing, because these systems include many thousands of compute cores and nodes, all of which may be utilized in a single large-scale run. The increasing vulnerability of HPC applications to errors induced by soft faults is motivating extensive work on techniques to make these applications more resilient to such faults, ranging from generic techniques such as replication or checkpoint/restart to algorithm-specific error detection and tolerance techniques. Effective use of such techniques requires a detailed understanding of how a given application is affected by soft faults to ensure that (i) efforts to improve application resilience are spent in the code regions most vulnerable to faults and (ii) the appropriate resilience technique is applied to each code region. This paper presents XRay, a tool to view the application vulnerability to soft errors, and illustrates how XRay can be used in the context of a representative application. In addition to providing actionable insights into application behavior XRay automatically selects the number of fault injection experiments required to provide an informative view of application behavior, ensuring that the information is statistically well-grounded without performing unnecessary experiments.
High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies.

PubMed

Goudey, Benjamin; Abedini, Mani; Hopper, John L; Inouye, Michael; Makalic, Enes; Schmidt, Daniel F; Wagner, John; Zhou, Zeyu; Zobel, Justin; Reumann, Matthias

2015-01-01

Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate a three-way interaction analysis on 1.1 million SNP GWAS data requiring over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.
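The scale of an exhaustive three-way analysis follows from simple counting, sketched below. The SNP count and the 5.8-year figure come from the abstract; the derived tests-per-second rate is only an implied average under the assumption that the whole runtime is spent evaluating triples, and the function names are hypothetical.

```python
import math


def interaction_count(n_snps: int, order: int) -> int:
    """Number of distinct SNP combinations of the given order."""
    return math.comb(n_snps, order)


def implied_rate(n_tests: int, years: float) -> float:
    """Average tests per second if n_tests take the given number of years."""
    seconds = years * 365.25 * 24 * 3600
    return n_tests / seconds


triples = interaction_count(1_100_000, 3)   # roughly 2.2e17 combinations
rate = implied_rate(triples, 5.8)           # on the order of 1e9 triples per second
```

Going from pairs to triples multiplies the work by roughly n/3, which is why the higher-order analysis needs a supercomputer at all.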
Foundational Report Series. Advanced Distribution Management Systems for Grid Modernization (Importance of DMS for Distribution Grid Modernization)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, Jianhui

2015-09-01

Grid modernization is transforming the operation and management of electric distribution systems from manual, paper-driven business processes to electronic, computer-assisted decision-making. At the center of this business transformation is the distribution management system (DMS), which provides a foundation from which optimal levels of performance can be achieved in an increasingly complex business and operating environment. Electric distribution utilities are facing many new challenges that are dramatically increasing the complexity of operating and managing the electric distribution system: growing customer expectations for service reliability and power quality, pressure to achieve better efficiency and utilization of existing distribution system assets, and reduction of greenhouse gas emissions by accommodating high penetration levels of distributed generating resources powered by renewable energy sources (wind, solar, etc.). Recent “storm of the century” events in the northeastern United States and the lengthy power outages and customer hardships that followed have greatly elevated the need to make power delivery systems more resilient to major storm events and to provide a more effective electric utility response during such regional power grid emergencies. Despite these newly emerging challenges for electric distribution system operators, only a small percentage of electric utilities have actually implemented a DMS. This paper discusses reasons why a DMS is needed and why the DMS may emerge as a mission-critical system that will soon be considered essential as electric utilities roll out their grid modernization strategies.
USSR Report, Military Affairs Foreign Military Review No 6, June 1986

DTIC Science & Technology

1986-11-20

computers used for an objective accounting of the difference in current firing conditions from standard hold an important place in integrated fire...control systems of modern tanks of capitalist countries. Mechanical ballistic computers gave way in the early 1970’s to electronic computers, initially...made with analog components. Then digital ballistic computers were created, installed in particular in the Ml Abrams and Leopard-2 tanks. The basic
Fuel Performance Experiments and Modeling: Fission Gas Bubble Nucleation and Growth in Alloy Nuclear Fuels

DOE Office of Scientific and Technical Information (OSTI.GOV)

McDeavitt, Sean; Shao, Lin; Tsvetkov, Pavel

2014-04-07

Advanced fast reactor systems being developed under the DOE's Advanced Fuel Cycle Initiative are designed to destroy TRU isotopes generated in existing and future nuclear energy systems. Over the past 40 years, multiple experiments and demonstrations have been completed using U-Zr, U-Pu-Zr, U-Mo and other metal alloys. As a result, multiple empirical and semi-empirical relationships have been established to develop empirical performance modeling codes. Many mechanistic questions about fission gas mobility, bubble coalescence, and gas release have been answered through industrial experience, research, and empirical understanding. The advent of modern computational materials science, however, opens new doors of development such that physics-based multi-scale models may be developed to enable a new generation of predictive fuel performance codes that are not limited by empiricism.
Cost-Performance Parametrics for Transporting Small Packages to the Mars Vicinity

NASA Technical Reports Server (NTRS)

McCleskey, C.; Lepsch, Roger A.; Martin, J.; Popescu, M.

2015-01-01

This paper explores the costs and performance required to deliver a small-sized payload package (CubeSat-sized, for instance) to various transportation nodes en route to Mars and near-Mars destinations (such as Mars moons, Phobos and Deimos). Needed is a contemporary assessment and summary compilation of transportation metrics that factor both performance and affordability of modern and emerging delivery capabilities. The paper brings together: (a) required mass transport gear ratios in delivering payload from Earth's surface to the Mars vicinity, (b) the cyclical energy required for delivery, and (c) the affordability and availability of various means of transporting material across various Earth-Moon vicinity and Near-Mars vicinity nodes relevant to Mars transportation. Examples for unit deliveries are computed and tabulated, using a CubeSat as a unit, for periodic near-Mars delivery campaign scenarios.
Efficacy of Code Optimization on Cache-Based Processors

NASA Technical Reports Server (NTRS)

VanderWijngaart, Rob F.; Saphir, William C.; Chancellor, Marisa K. (Technical Monitor)

1997-01-01

In this paper a number of techniques for improving the cache performance of a representative piece of numerical software are presented. Target machines are popular processors from several vendors: MIPS R5000 (SGI Indy), MIPS R8000 (SGI PowerChallenge), MIPS R10000 (SGI Origin), DEC Alpha EV4 + EV5 (Cray T3D & T3E), IBM RS6000 (SP Wide-node), Intel PentiumPro (Ames' Whitney), Sun UltraSparc (NERSC's NOW). The optimizations all attempt to increase the locality of memory accesses. But they meet with rather varied and often counterintuitive success on the different computing platforms. We conclude that it may be genuinely impossible to obtain portable performance on the current generation of cache-based machines. At the least, it appears that the performance of modern commodity processors cannot be described with parameters defining the cache alone.
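One common way of increasing the locality of memory accesses is loop blocking (tiling). The sketch below shows the transformation on a matrix transpose; the abstract does not say which optimizations the authors applied, so this is only a generic example, and interpreted Python will not show the timing effect that a compiled version might.

```python
def transpose_naive(a, n):
    """Transpose an n-by-n matrix stored row-major in a flat list.

    Reads walk along a row, but consecutive writes are n elements apart.
    """
    out = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            out[j * n + i] = a[i * n + j]
    return out


def transpose_blocked(a, n, block=32):
    """Same result, visiting the matrix in block-by-block tiles.

    Within one tile both the reads and the writes stay inside a small
    region of memory, which is the locality the blocking aims for.
    """
    out = [0.0] * (n * n)
    for ii in range(0, n, block):
        for jj in range(0, n, block):
            for i in range(ii, min(ii + block, n)):
                for j in range(jj, min(jj + block, n)):
                    out[j * n + i] = a[i * n + j]
    return out
```

The best `block` value depends on cache size, line length, and associativity, which fits the abstract's observation that such tuning does not carry over cleanly from one processor to another.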
Micro-Biomechanics of the Kebara 2 Hyoid and Its Implications for Speech in Neanderthals

PubMed Central

D’Anastasio, Ruggero; Wroe, Stephen; Tuniz, Claudio; Mancini, Lucia; Cesana, Deneb T.; Dreossi, Diego; Ravichandiran, Mayoorendra; Attard, Marie; Parr, William C. H.; Agur, Anne; Capasso, Luigi

2013-01-01

The description of a Neanderthal hyoid from Kebara Cave (Israel) in 1989 fuelled scientific debate on the evolution of speech and complex language. Gross anatomy of the Kebara 2 hyoid differs little from that of modern humans. However, whether Homo neanderthalensis could use speech or complex language remains controversial. Similarity in overall shape does not necessarily demonstrate that the Kebara 2 hyoid was used in the same way as that of Homo sapiens. The mechanical performance of whole bones is partly controlled by internal trabecular geometries, regulated by bone-remodelling in response to the forces applied. Here we show that the Neanderthal and modern human hyoids also present very similar internal architectures and micro-biomechanical behaviours. Our study incorporates detailed analysis of histology, meticulous reconstruction of musculature, and computational biomechanical analysis with models incorporating internal micro-geometry. Because internal architecture reflects the loadings to which a bone is routinely subjected, our findings are consistent with a capacity for speech in the Neanderthals. PMID:24367509
Decision support systems for clinical radiological practice — towards the next generation

PubMed Central

Stivaros, S M; Gledson, A; Nenadic, G; Zeng, X-J; Keane, J; Jackson, A

2010-01-01

The huge amount of information that needs to be assimilated in order to keep pace with the continued advances in modern medical practice can form an insurmountable obstacle to the individual clinician. Within radiology, the recent development of quantitative imaging techniques, such as perfusion imaging, and the development of imaging-based biomarkers in modern therapeutic assessment has highlighted the need for computer systems to provide the radiological community with support for academic as well as clinical/translational applications. This article provides an overview of the underlying design and functionality of radiological decision support systems with examples tracing the development and evolution of such systems over the past 40 years. More importantly, we discuss the specific design, performance and usage characteristics that previous systems have highlighted as being necessary for clinical uptake and routine use. Additionally, we have identified particular failings in our current methodologies for data dissemination within the medical domain that must be overcome if the next generation of decision support systems is to be implemented successfully. PMID:20965900
A new telescope control system for the Telescopio Nazionale Galileo: I - derotators

NASA Astrophysics Data System (ADS)

Ghedina, Adriano; Gonzalez, Manuel; Perez Ventura, Hector; Carmona, Candido; Riverol, Luis

2014-07-01

Telescopio Nazionale Galileo (TNG) is a 4m class active optics telescope at the observatory of Roque de Los Muchachos. In the framework of keeping optimum performances during observation and continuous reliability the telescope control system (TCS) of the TNG is going through a deep upgrade after nearly 20 years of service. The original glass encoders and bulb lamp heads are substituted with modern steel scale drums and scanning units. The obsolete electronic racks and computers for the control loops are replaced with modern and compact commercial drivers with a net improvement in the tracking error RMS. In order to minimize the impact on the number of nights lost during the mechanical and electronic changes in the TCS the new TCS is developed and tested in parallel to the existing one and three steps will be taken to achieve the full upgrade. We describe here the first step affecting the mechanical derotators at the Nasmyth foci.
A new telescope control system for the Telescopio Nazionale Galileo II: azimuth and elevation axes

NASA Astrophysics Data System (ADS)

Ghedina, Adriano; Gonzalez, Manuel; Pérez Ventura, Héctor; Riverol Rodríguez, A. Luis

2016-07-01

TNG is a 4m class active optics telescope at the Observatory of Roque de Los Muchachos. In the framework of keeping optimum performances during observation and continuous reliability the telescope control system (TCS) of the TNG is going through a deep upgrade after nearly 20 years of service. The original glass encoders and bulb lamp heads are substituted with modern steel scale drums and scanning units. The obsolete electronic racks and computers for the control loops are replaced with modern and compact commercial drivers with a net improvement in the motors torque ripple. In order to minimize the impact on the number of nights lost during the mechanical and electronic changes in the TCS the new TCS is developed and tested in parallel to the existing one and three steps will be taken to achieve the full upgrade. We describe here the second step that affected the main axes of the telescope, AZ and EL.
Numerical characteristics of quantum computer simulation

NASA Astrophysics Data System (ADS)

Chernyavskiy, A.; Khamitov, K.; Teplov, A.; Voevodin, V.; Voevodin, Vl.

2016-12-01

The simulation of quantum circuits is important for the implementation of quantum information technologies. The main difficulty of such modeling is the exponential growth of dimensionality, thus the usage of modern high-performance parallel computations is relevant. As is well known, arbitrary quantum computation in the circuit model can be done using only single- and two-qubit gates, and we analyze the computational structure and properties of the simulation of such gates. We investigate how the unique properties of quantum nature determine the computational properties of the considered algorithms: quantum parallelism makes the simulation of quantum gates highly parallel, while quantum entanglement leads to the problem of computational locality during simulation. We use the methodology of the AlgoWiki project (algowiki-project.org) to analyze the algorithm. This methodology consists of theoretical (sequential and parallel complexity, macro structure, and visual informational graph) and experimental (locality and memory access, scalability and more specific dynamic characteristics) parts. The experimental part was carried out on the petascale Lomonosov supercomputer (Moscow State University, Russia). We show that the simulation of quantum gates is a good base for the research and testing of development methods for data-intensive parallel software, and that the considered methodology of analysis can be successfully used for the improvement of algorithms in quantum information science.
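A minimal sketch of what simulating a single-qubit gate on a state vector can look like, under assumptions not stated in the abstract: the state is a flat list of 2^n complex amplitudes, qubit 0 is the least significant bit of the basis-state index, and the function name is hypothetical. Each pair of amplitudes is updated independently of every other pair, which is where the parallelism comes from, while the distance between the two members of a pair grows as 2^target, which illustrates why memory locality becomes a concern.

```python
import math


def apply_single_qubit_gate(state, gate, target):
    """Apply a 2x2 `gate` to qubit `target` of a state vector.

    `state` is a list of 2**n complex amplitudes; qubit 0 is taken to be
    the least significant bit of the index (an assumed convention).
    Returns a new list and leaves `state` unchanged.
    """
    stride = 1 << target          # index distance between paired amplitudes
    new_state = list(state)
    for base in range(0, len(state), 2 * stride):
        for offset in range(stride):
            i0 = base + offset    # index with the target bit equal to 0
            i1 = i0 + stride      # same index with the target bit equal to 1
            a0, a1 = state[i0], state[i1]
            new_state[i0] = gate[0][0] * a0 + gate[0][1] * a1
            new_state[i1] = gate[1][0] * a0 + gate[1][1] * a1
    return new_state


# Hadamard gate, used here only as a familiar example.
_h = 1.0 / math.sqrt(2.0)
HADAMARD = [[_h, _h], [_h, -_h]]
```

For two qubits starting in the basis state with index 0, `apply_single_qubit_gate([1, 0, 0, 0], HADAMARD, 1)` puts equal amplitudes on indices 0 and 2, and applying the same gate again restores the original state up to rounding.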

Analytical description of the modern steam automobile

NASA Technical Reports Server (NTRS)

Peoples, J. A.

1974-01-01

The sensitivity of operating conditions upon performance of the modern steam automobile is discussed. The word modern has been used in the title to indicate that emphasis is upon miles per gallon rather than theoretical thermal efficiency. This has been accomplished by combining classical power analysis with the ideal Pressure-Volume diagram. Several parameters are derived which characterize performance capability of the modern steam car. The report illustrates that performance is dictated by the characteristics of the working medium, and the supply temperature. Performance is nearly independent of pressures above 800 psia. Analysis techniques were developed specifically for reciprocating steam engines suitable for automotive application. Specific performance charts have been constructed on the basis of water as a working medium. The conclusions and data interpretation are therefore limited within this scope.
Shared Memory Parallelism for 3D Cartesian Discrete Ordinates Solver

NASA Astrophysics Data System (ADS)

Moustafa, Salli; Dutka-Malen, Ivan; Plagne, Laurent; Ponçot, Angélique; Ramet, Pierre

2014-06-01

This paper describes the design and the performance of DOMINO, a 3D Cartesian SN solver that implements two nested levels of parallelism (multicore+SIMD) on shared memory computation nodes. DOMINO is written in C++, a multi-paradigm programming language that enables the use of powerful and generic parallel programming tools such as Intel TBB and Eigen. These two libraries allow us to combine multi-thread parallelism with vector operations in an efficient and yet portable way. As a result, DOMINO can exploit the full power of modern multi-core processors and is able to tackle very large simulations, that usually require large HPC clusters, using a single computing node. For example, DOMINO solves a 3D full core PWR eigenvalue problem involving 26 energy groups, 288 angular directions (S16), 46 × 10^6 spatial cells and 1 × 10^12 DoFs within 11 hours on a single 32-core SMP node. This represents a sustained performance of 235 GFlops and 40.74% of the SMP node peak performance for the DOMINO sweep implementation. The very high Flops/Watt ratio of DOMINO makes it a very interesting building block for a future many-nodes nuclear simulation tool.
Application of Pinniped Vibrissae to Aeropropulsion

NASA Technical Reports Server (NTRS)

Shyam, Vikram (Principal Investigator); Ameri, Ali; Poinsatte, Phil; Thurman, Doug; Wroblewski, Adam; Snyder, Chris

2015-01-01

Vibrissae of Phoca vitulina (Harbor Seal) and Mirounga angustirostris (Elephant Seal) possess undulations along their length. Harbor Seal vibrissae were shown to reduce vortex induced vibrations and reduce drag compared to appropriately scaled cylinders and ellipses. Samples of Harbor Seal vibrissae, Elephant Seal vibrissae and California Sea Lion vibrissae were collected from the Marine Mammal Center in California. CT scanning, microscopy and 3D scanning techniques were utilized to characterize the whiskers. Computational fluid dynamics simulations of the whiskers were carried out to compare them to an ellipse and a cylinder. Leading edge parameters from the whiskers were used to create a 3D profile based on a modern power turbine blade. The NASA SW-2 facility was used to perform wind tunnel cascade testing on the 'Seal Blades'. Computational Fluid Dynamics simulations were used to study incidence angles from -37 to +10 degrees on the aerodynamic performance of the Seal Blade. The tests and simulations were conducted at a Reynolds number of 100,000. The Seal Blades showed consistent performance improvements over the baseline configuration. It was determined that a fuel burn reduction of approximately 5 could be achieved for a fixed wing aircraft. Noise reduction potential is also explored.
Achieving energy efficiency during collective communications

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sundriyal, Vaibhav; Sosonkina, Masha; Zhang, Zhao

2012-09-13

Energy consumption has become a major design constraint in modern computing systems. With the advent of petaflops architectures, power-efficient software stacks have become imperative for scalability. Techniques such as dynamic voltage and frequency scaling (called DVFS) and CPU clock modulation (called throttling) are often used to reduce the power consumption of the compute nodes. To avoid significant performance losses, these techniques should be used judiciously during parallel application execution. For example, its communication phases may be good candidates to apply the DVFS and CPU throttling without incurring a considerable performance loss. They are often considered as indivisible operations although little attention is being devoted to the energy saving potential of their algorithmic steps. In this work, two important collective communication operations, all-to-all and allgather, are investigated as to their augmentation with energy saving strategies on the per-call basis. The experiments prove the viability of such a fine-grain approach. They also validate a theoretical power consumption estimate for multicore nodes proposed here. While keeping the performance loss low, the obtained energy savings were always significantly higher than those achieved when DVFS or throttling were switched on across the entire application run.
Off-design Performance Analysis of Multi-Stage Transonic Axial Compressors

NASA Astrophysics Data System (ADS)

Du, W. H.; Wu, H.; Zhang, L.

Because of the complex flow fields and component interaction in modern gas turbine engines, they require extensive experiment to validate performance and stability. The experiment process can become expensive and complex. Modeling and simulation of gas turbine engines are a way to reduce experiment costs, provide fidelity and enhance the quality of essential experiments. The flow field of a transonic compressor contains all the flow aspects which are difficult to present: boundary layer transition and separation, shock-boundary layer interactions, and large flow unsteadiness. Accurate transonic axial compressor off-design performance prediction is especially difficult, due in large part to three-dimensional blade design and the resulting flow field. Although recent advancements in computer capacity have brought computational fluid dynamics to the forefront of turbomachinery design and analysis, the grid and turbulence model still limit Reynolds-averaged Navier-Stokes (RANS) approximations in the multi-stage transonic axial compressor flow field. Streamline curvature methods are still the dominant numerical approach and an important tool for turbomachinery analysis and design, and it is generally accepted that streamline curvature solution techniques will provide satisfactory flow prediction as long as the losses, deviation and blockage are accurately predicted.
A Smart Home Test Bed for Undergraduate Education to Bridge the Curriculum Gap from Traditional Power Systems to Modernized Smart Grids

ERIC Educational Resources Information Center

Hu, Qinran; Li, Fangxing; Chen, Chien-fei

2015-01-01

There is a worldwide trend to modernize old power grid infrastructures to form future smart grids, which will achieve efficient, flexible energy consumption by using the latest technologies in communication, computing, and control. Smart grid initiatives are moving power systems curricula toward smart grids. Although the components of smart grids…
Revisiting Mathematical Problem Solving and Posing in the Digital Era: Toward Pedagogically Sound Uses of Modern Technology

ERIC Educational Resources Information Center

Abramovich, S.

2014-01-01

The availability of sophisticated computer programs such as "Wolfram Alpha" has made many problems found in the secondary mathematics curriculum somewhat obsolete for they can be easily solved by the software. Against this background, an interplay between the power of a modern tool of technology and educational constraints it presents is…
Interscholastic Correspondence Exchanges in Celestin Freinet's Modern School Movement: Implications for Computer-Mediated Intercultural Learning Networks.

ERIC Educational Resources Information Center

Sayers, Dennis

Although the work of Celestin Freinet has exerted considerable influence on European education, it remains largely unknown to English-speaking educators. The Modern School Movement (MSM), which Freinet founded in 1926, is worldwide in scope, and has affiliated organizations in 13 countries with correspondent groups in more than 20 nations. The MSM…
Discussion Forum Interactions: Text and Context

ERIC Educational Resources Information Center

Montero, Begona; Watts, Frances; Garcia-Carbonell, Amparo

2007-01-01

Computer-mediated communication (CMC) is currently used in language teaching as a bridge for the development of written and spoken skills [Kern, R., 1995. "Restructuring classroom interaction with networked computers: effects on quantity and characteristics of language production." "The Modern Language Journal" 79, 457-476]. Within CMC…
Welding--Trade or Profession?

ERIC Educational Resources Information Center

Albright, C. E.; Smith, Kenneth

2006-01-01

This article discusses a collaborative program between schools with the purpose of training and providing advanced education in welding. Modern manufacturing is turning to automation to increase productivity, but it can be a great challenge to program robots and other computer-controlled welding and joining systems. Computer programming and…
Relations between Some Characteristic Lengths in a Triangle

ERIC Educational Resources Information Center

Koepf, Wolfram; Brede, Markus

2005-01-01

The paper's aim is to note a remarkable (and apparently unknown) relation for right triangles, its generalisation to arbitrary triangles and the possibility to derive these and some related relations by elimination using Groebner basis computations with a modern computer algebra system. (Contains 9 figures.)
Scaling Watershed Models: Modern Approaches to Science Computation with MapReduce, Parallelization, and Cloud Optimization

EPA Science Inventory

Environmental models are products of the computer architecture and software tools available at the time of development. Scientifically sound algorithms may persist in their original state even as system architectures and software development approaches evolve and progress. Dating...
Computer and visual display terminals (VDT) vision syndrome (CVDTS).

PubMed

Parihar, J K S; Jain, Vaibhav Kumar; Chaturvedi, Piyush; Kaushik, Jaya; Jain, Gunjan; Parihar, Ashwini K S

2016-07-01

Computer and visual display terminals have become an essential part of modern lifestyle. The use of these devices has made our life simple in household work as well as in offices. However, the prolonged use of these devices is not without complications. Computer and visual display terminals syndrome is a constellation of symptoms, ocular as well as extraocular, associated with prolonged use of visual display terminals. This syndrome is gaining importance in this modern era because of the widespread use of technologies in day-to-day life. It is associated with asthenopic symptoms, visual blurring, dry eyes, musculoskeletal symptoms such as neck pain, back pain, shoulder pain, carpal tunnel syndrome, psychosocial factors, venous thromboembolism, shoulder tendonitis, and elbow epicondylitis. Proper identification of symptoms and causative factors is necessary for accurate diagnosis and management. This article focuses on the various aspects of the computer vision display terminals syndrome described in the previous literature. Further research is needed for a better understanding of the complex pathophysiology and management.
Towards topological quantum computer

NASA Astrophysics Data System (ADS)

Melnikov, D.; Mironov, A.; Mironov, S.; Morozov, A.; Morozov, An.

2018-01-01

Quantum R-matrices, the entangling deformations of non-entangling (classical) permutations, provide a distinguished basis in the space of unitary evolutions and, consequently, a natural choice for a minimal set of basic operations (universal gates) for quantum computation. Yet they play a special role in group theory, integrable systems and modern theory of non-perturbative calculations in quantum field and string theory. Despite recent developments in those fields the idea of topological quantum computing and use of R-matrices, in particular, practically reduce to reinterpretation of standard sets of quantum gates, and subsequently algorithms, in terms of available topological ones. In this paper we summarize a modern view on quantum R-matrix calculus and propose to look at the R-matrices acting in the space of irreducible representations, which are unitary for the real-valued couplings in Chern-Simons theory, as the fundamental set of universal gates for topological quantum computer. Such an approach calls for a more thorough investigation of the relation between topological invariants of knots and quantum algorithms.
Digital pathology in nephrology clinical trials, research, and pathology practice.

PubMed

Barisoni, Laura; Hodgin, Jeffrey B

2017-11-01

In this review, we will discuss (i) how the recent advancements in digital technology and computational engineering are currently applied to nephropathology in the setting of clinical research, trials, and practice; (ii) the benefits of the new digital environment; (iii) how recognizing its challenges provides opportunities for transformation; and (iv) nephropathology in the upcoming era of kidney precision and predictive medicine. Recent studies highlighted how new standardized protocols facilitate the harmonization of digital pathology database infrastructure and morphologic, morphometric, and computer-aided quantitative analyses. Digital pathology enables robust protocols for clinical trials and research, with the potential to identify previously underused or unrecognized clinically useful parameters. The integration of digital pathology with molecular signatures is leading the way to establishing clinically relevant morpho-omic taxonomies of renal diseases. The introduction of digital pathology in clinical research and trials, and the progressive implementation of the modern software ecosystem, opens opportunities for the development of new predictive diagnostic paradigms and computer-aided algorithms, transforming the practice of renal disease into a modern computational science.
The role of modern diagnostic imaging in diagnosing and differentiating kidney diseases in children.

PubMed

Maliborski, Artur; Zegadło, Arkadiusz; Placzyńska, Małgorzata; Sopińska, Małgorzata; Lichosik, Marianna; Jobs, Katarzyna

2018-01-01

Urinary tract diseases are in the group of the most commonly diagnosed medical conditions in pediatric patients. Many diseases with different etiologies are accompanied by pain, fever, hematuria, or urinary tract dysfunction. Those most common ones in children are urinary tract infections and congenital malformation. They can also represent tumors or changes caused by systemic diseases. Clinical tests and even more often additional imaging studies are required to make a proper diagnosis of urinary tract diseases. Just a few decades ago urography, cystography or voiding cystourethrography were the main methods in diagnostic imaging of the urinary tract. Today's imaging methods supported by digital radiographic and fluoroscopy systems, high sensitivity detectors with quantum detection, advanced algorithms eliminating motion artifacts, modern medical imaging monitors with a resolution of three or even eight megapixels significantly differ from conventional radiographic methods. The methods that are currently usually performed are: computed tomography, magnetic resonance imaging, isotopic methods and ultrasonography using elastography and new solutions in Doppler imaging. Modern techniques are currently focused on reducing radiation exposure with better imaging capabilities. The development of these techniques became an essential diagnostic aid in nephrological and urological practice. The aim of this paper is to present the latest solutions that are currently used in the diagnostic imaging of urinary tract diseases.
How Well Can Modern Nonhabitual Barefoot Youth Adapt to Barefoot and Minimalist Barefoot Technology Shoe Walking, in regard to Gait Symmetry.

PubMed

Xu, Y; Hou, Q; Wang, C; Simpson, T; Bennett, B; Russell, S

2017-01-01

We aim to test how well modern nonhabitual barefoot people can adapt to barefoot and Minimalist Bare Foot Technology (MBFT) shoes, in regard to gait symmetry. 28 healthy university students (22 females/6 males) were recruited to walk on a 10-meter walkway barefoot, in MBFT shoes, and in neutral running shoes, in random order, at their comfortable walking speed. Kinetic and kinematic data were collected using an 8-camera motion capture system. Data of joint angles, joint forces, and joint moments were extracted to compute a consecutive symmetry index. Compared to walking in neutral running shoes, walking barefoot led to worse symmetry of the following: ankle joint force in sagittal plane, knee joint moment in transverse plane, and ankle joint moment in frontal plane, while improving the symmetry of joint angle in sagittal plane at ankle joints and global (hip-knee-ankle) level. Walking in MBFT shoes had intermediate gait symmetry performance as compared to walking barefoot/walking in neutral running shoes. We conclude that modern nonhabitual barefoot adults will lose some gait symmetry in joint force/moment if they switch to barefoot walking without fitting in; MBFT shoe might be an ideal compromise for healthy youth as regards gait symmetry in walking.
The Humanistic Duo: The Park/Recreation Professional and the Computer. (Computer-Can I Use It?).

ERIC Educational Resources Information Center

Weiner, Myron E.

This paper states that there are two fundamental reasons for the comparative absence of computer use for parks and recreation at the present time. These are (1) lack of clear cut cost justification and (2) reluctance on the part of recreation professionals to accept their role as managers and, consequently, to utilize modern management tools. The…
Towards a Versatile Tele-Education Platform for Computer Science Educators Based on the Greek School Network

ERIC Educational Resources Information Center

Paraskevas, Michael; Zarouchas, Thomas; Angelopoulos, Panagiotis; Perikos, Isidoros

2013-01-01

Nowadays the growing need for highly qualified computer science educators in modern educational environments is commonplace. This study examines the potential use of Greek School Network (GSN) to provide a robust and comprehensive e-training course for computer science educators in order to efficiently exploit advanced IT services and establish a…
Virtual memory

NASA Technical Reports Server (NTRS)

Denning, P. J.

1986-01-01

Virtual memory was conceived as a way to automate overlaying of program segments. Modern computers have very large main memories, but need automatic solutions to the relocation and protection problems. Virtual memory serves this need as well and is thus useful in computers of all sizes. The history of the idea is traced, showing how it has become a widespread, little noticed feature of computers today.

A modern approach to storing of 3D geometry of objects in machine engineering industry

NASA Astrophysics Data System (ADS)

Sokolova, E. A.; Aslanov, G. A.; Sokolov, A. A.

2017-02-01

3D graphics is a kind of computer graphics which has absorbed a lot from the vector and raster computer graphics. It is used in interior design projects, architectural projects, advertising, while creating educational computer programs, movies, visual images of parts and products in engineering, etc. 3D computer graphics allows one to create 3D scenes along with simulation of light conditions and setting up standpoints.
Development of modern human subadult age and sex estimation standards using multi-slice computed tomography images from medical examiner's offices

NASA Astrophysics Data System (ADS)

Stock, Michala K.; Stull, Kyra E.; Garvin, Heather M.; Klales, Alexandra R.

2016-10-01

Forensic anthropologists are routinely asked to estimate a biological profile (i.e., age, sex, ancestry and stature) from a set of unidentified remains. In contrast to the abundance of collections and techniques associated with adult skeletons, there is a paucity of modern, documented subadult skeletal material, which limits the creation and validation of appropriate forensic standards. Many are forced to use antiquated methods derived from small sample sizes, which given documented secular changes in the growth and development of children, are not appropriate for application in the medico-legal setting. Therefore, the aim of this project is to use multi-slice computed tomography (MSCT) data from a large, diverse sample of modern subadults to develop new methods to estimate subadult age and sex for practical forensic applications. The research sample will consist of over 1,500 full-body MSCT scans of modern subadult individuals (aged birth to 20 years) obtained from two U.S. medical examiner's offices. Statistical analysis of epiphyseal union scores, long bone osteometrics, and os coxae landmark data will be used to develop modern subadult age and sex estimation standards. This project will result in a database of information gathered from the MSCT scans, as well as the creation of modern, statistically rigorous standards for skeletal age and sex estimation in subadults. Furthermore, the research and methods developed in this project will be applicable to dry bone specimens, MSCT scans, and radiographic images, thus providing both tools and continued access to data for forensic practitioners in a variety of settings.
cuSwift --- a suite of numerical integration methods for modelling planetary systems implemented in C/CUDA

NASA Astrophysics Data System (ADS)

Hellmich, S.; Mottola, S.; Hahn, G.; Kührt, E.; Hlawitschka, M.

2014-07-01

Simulations of dynamical processes in planetary systems represent an important tool for studying the orbital evolution of the systems [1--3]. Using modern numerical integration methods, it is possible to model systems containing many thousands of objects over timescales of several hundred million years. However, in general, supercomputers are needed to get reasonable simulation results in acceptable execution times [3]. To exploit the ever-growing computation power of Graphics Processing Units (GPUs) in modern desktop computers, we implemented cuSwift, a library of numerical integration methods for studying long-term dynamical processes in planetary systems. cuSwift can be seen as a re-implementation of the famous SWIFT integrator package written by Hal Levison and Martin Duncan. cuSwift is written in C/CUDA and contains different integration methods for various purposes. So far, we have implemented three algorithms: a 15th-order Radau integrator [4], the Wisdom-Holman Mapping (WHM) integrator [5], and the Regularized Mixed Variable Symplectic (RMVS) Method [6]. These algorithms treat only the planets as mutually gravitationally interacting bodies whereas asteroids and comets (or other minor bodies of interest) are treated as massless test particles which are gravitationally influenced by the massive bodies but do not affect each other or the massive bodies. The main focus of this work is on the symplectic methods (WHM and RMVS) which use a larger time step and thus are capable of integrating many particles over a large time span. As an additional feature, we implemented the non-gravitational Yarkovsky effect as described by M. Brož [7]. With cuSwift, we show that the use of modern GPUs makes it possible to speed up these methods by more than one order of magnitude compared to the single-core CPU implementation, thereby enabling modest workstation computers to perform long-term dynamical simulations. 
We use these methods to study the influence of the Yarkovsky effect on resonant asteroids. We present first results and compare them with integrations done with the original algorithms implemented in SWIFT in order to assess the numerical precision of cuSwift and to demonstrate the speed-up we achieved using the GPU.
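The test-particle idea described above can be sketched with a much simpler symplectic scheme than the ones the abstract names. The following is not the WHM or RMVS method and not cuSwift's code; it is a plain kick-drift-kick leapfrog in two dimensions around a single fixed central mass, with GM = 1 and all function names chosen for illustration. It shows only that massless particles feel the massive body but not each other, so each one can be advanced independently (which is what makes the problem map well to a GPU).

```python
import math


def acceleration(x, y, gm=1.0):
    """Gravitational acceleration from a central mass fixed at the origin."""
    r3 = (x * x + y * y) ** 1.5
    return -gm * x / r3, -gm * y / r3


def leapfrog_step(x, y, vx, vy, dt):
    """One kick-drift-kick leapfrog step for a single massless particle."""
    ax, ay = acceleration(x, y)
    vx += 0.5 * dt * ax           # half kick
    vy += 0.5 * dt * ay
    x += dt * vx                  # full drift
    y += dt * vy
    ax, ay = acceleration(x, y)
    vx += 0.5 * dt * ax           # second half kick
    vy += 0.5 * dt * ay
    return x, y, vx, vy


def integrate(particles, dt, steps):
    """Advance every (x, y, vx, vy) tuple independently for `steps` steps.

    No particle reads another particle's state, so the outer loop could be
    distributed over threads without synchronisation.
    """
    result = []
    for state in particles:
        for _ in range(steps):
            state = leapfrog_step(*state, dt)
        result.append(state)
    return result


def specific_energy(x, y, vx, vy, gm=1.0):
    """Orbital energy per unit mass, a handy check on integration quality."""
    return 0.5 * (vx * vx + vy * vy) - gm / math.hypot(x, y)
```

A quick sanity check is to start a particle on a circular orbit, for example `(1.0, 0.0, 0.0, 1.0)`, and confirm that `specific_energy` stays close to its initial value over many steps.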
Fast matrix multiplication and its algebraic neighbourhood

NASA Astrophysics Data System (ADS)

Pan, V. Ya.

2017-11-01

Matrix multiplication is among the most fundamental operations of modern computations. Until 1969 it was still commonly believed that the classical algorithm was optimal, although the experts already knew that this was not so. Worldwide interest in matrix multiplication instantly exploded in 1969, when Strassen decreased the exponent 3 of cubic time to 2.807. Then everyone expected to see matrix multiplication performed in quadratic or nearly quadratic time very soon. Further progress, however, turned out to be capricious. It was at stalemate for almost a decade, then a combination of surprising techniques (completely independent of Strassen's original ones and much more advanced) enabled a new decrease of the exponent in 1978-1981 and then again in 1986, to 2.376. By 2017 the exponent has still not passed through the barrier of 2.373, but most disturbing was the curse of recursion: even the decrease of exponents below 2.7733 required numerous recursive steps, and each of them squared the problem size. As a result, all algorithms supporting such exponents supersede the classical algorithm only for inputs of immense sizes, far beyond any potential interest for the user. We survey the long study of fast matrix multiplication, focusing on neglected algorithms for feasible matrix multiplication. We comment on their design, the techniques involved, implementation issues, the impact of their study on the modern theory and practice of Algebraic Computations, and perspectives for fast matrix multiplication. Bibliography: 163 titles.
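The exponent 2.807 quoted for Strassen's 1969 result is log2(7): a 2 × 2 product can be formed with seven multiplications instead of eight, and applying the same identities recursively to matrix blocks yields the sub-cubic bound. A small sketch of the base case follows; the function name is chosen for illustration, and the identities are the standard textbook form of Strassen's scheme rather than anything specific to this survey.

```python
import math


def strassen_2x2(a, b):
    """Multiply two 2x2 matrices using seven scalar multiplications.

    `a` and `b` are nested lists [[a11, a12], [a21, a22]]. The entries may
    themselves be matrix-like objects supporting +, - and *, which is what
    the recursive block version relies on; plain numbers are used here.
    """
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b

    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


# Seven multiplications per halving of the dimension give the exponent.
STRASSEN_EXPONENT = math.log2(7)
```

The saving of one multiplication is paid for with extra additions, which is one reason the crossover point against the classical algorithm depends heavily on the implementation and the machine.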
Electrostatics with Computer-Interfaced Charge Sensors

ERIC Educational Resources Information Center

Morse, Robert A.

2006-01-01

Computer interfaced electrostatic charge sensors allow both qualitative and quantitative measurements of electrostatic charge but are quite sensitive to charges accumulating on modern synthetic materials. They need to be used with care so that students can correctly interpret their measurements. This paper describes the operation of the sensors,…
Novel Scalable 3-D MT Inverse Solver

NASA Astrophysics Data System (ADS)

Kuvshinov, A. V.; Kruglyakov, M.; Geraskin, A.

2016-12-01

We present a new, robust and fast three-dimensional (3-D) magnetotelluric (MT) inverse solver. The highly scalable solver extrEMe [1] is used as the forward modelling engine. The (regularized) inversion is based on an iterative gradient-type optimization (quasi-Newton method) and exploits an adjoint-source approach for fast calculation of the gradient of the misfit. The inverse solver is able to deal with highly detailed and contrasting models, allows for working (separately or jointly) with any type of MT (single-site and/or inter-site) responses, and supports massive parallelization. Different parallelization strategies implemented in the code allow for optimal usage of available computational resources for a given problem setup. To parameterize the inverse domain a mask approach is implemented, which means that one can merge any subset of forward modelling cells in order to account for the (usually) irregular distribution of observation sites. We report results of 3-D numerical experiments aimed at analysing the robustness, performance and scalability of the code. In particular, our computational experiments carried out on different platforms ranging from modern laptops to high-performance clusters demonstrate practically linear scalability of the code up to thousands of nodes. 1. Kruglyakov, M., A. Geraskin, A. Kuvshinov, 2016. Novel accurate and scalable 3-D MT forward solver based on a contracting integral equation method, Computers and Geosciences, in press.
Smartphones as image processing systems for prosthetic vision.

PubMed

Zapf, Marc P; Matteucci, Paul B; Lovell, Nigel H; Suaning, Gregg J

2013-01-01

The feasibility of implants for prosthetic vision has been demonstrated by research and commercial organizations. In most devices, an essential forerunner to the internal stimulation circuit is an external electronics solution for capturing, processing and relaying image information as well as extracting useful features from the scene surrounding the patient. The capabilities and multitude of image processing algorithms that can be performed by the device in real time play a major part in the final quality of the prosthetic vision. It is therefore optimal to use powerful hardware yet to avoid bulky, straining solutions. Recent publications have reported portable single-board computers fast enough for computationally intensive image processing. Following the rapid evolution of commercial, ultra-portable ARM (Advanced RISC machine) mobile devices, the authors investigated the feasibility of modern smartphones running complex face detection as external processing devices for vision implants. The role of dedicated graphics processors in speeding up computation was evaluated while performing a demanding noise reduction algorithm (image denoising). The time required for face detection was found to decrease by 95% from 2.5-year-old to recent devices. In denoising, graphics acceleration played a major role, speeding up denoising by a factor of 18. These results demonstrate that the technology has matured sufficiently to be considered as a valid external electronics platform for visual prosthetic research.
Current Capabilities at SNL for the Integration of Small Modular Reactors onto Smart Microgrids Using Sandia's Smart Microgrid Technology High Performance Computing and Advanced Manufacturing.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rodriguez, Salvador B.

Smart grids are a crucial component for enabling the nation’s future energy needs, as part of a modernization effort led by the Department of Energy. Smart grids and smart microgrids are being considered in niche applications, and as part of a comprehensive energy strategy to help manage the nation’s growing energy demands, for critical infrastructures, military installations, small rural communities, and large populations with limited water supplies. As part of a far-reaching strategic initiative, Sandia National Laboratories (SNL) presents herein a unique, three-pronged approach to integrate small modular reactors (SMRs) into microgrids, with the goal of providing economically-competitive, reliable, and secure energy to meet the nation’s needs. SNL’s triad methodology involves an innovative blend of smart microgrid technology, high performance computing (HPC), and advanced manufacturing (AM). In this report, Sandia’s current capabilities in those areas are summarized, as well as paths forward that will enable DOE to achieve its energy goals. In the area of smart grid/microgrid technology, Sandia’s current computational capabilities can model the entire grid, including temporal aspects and cyber security issues. Our tools include system development, integration, testing and evaluation, monitoring, and sustainment.
DNS of Low-Pressure Turbine Cascade Flows with Elevated Inflow Turbulence Using a Discontinuous-Galerkin Spectral-Element Method

NASA Technical Reports Server (NTRS)

Garai, Anirban; Diosady, Laslo T.; Murman, Scott M.; Madavan, Nateri K.

2016-01-01

Recent progress towards developing a new computational capability for accurate and efficient high-fidelity direct numerical simulation (DNS) and large-eddy simulation (LES) of turbomachinery is described. This capability is based on an entropy-stable Discontinuous-Galerkin spectral-element approach that extends to arbitrarily high orders of spatial and temporal accuracy, and is implemented in a computationally efficient manner on a modern high performance computer architecture. An inflow turbulence generation procedure based on a linear forcing approach has been incorporated in this framework, and DNS has been conducted to study the effect of inflow turbulence on the suction-side separation bubble in low-pressure turbine (LPT) cascades. The T106 series of airfoil cascades in both lightly (T106A) and highly loaded (T106C) configurations at exit isentropic Reynolds numbers of 60,000 and 80,000, respectively, are considered. The numerical simulations are performed using 8th-order accurate spatial and 4th-order accurate temporal discretization. The changes in separation bubble topology due to elevated inflow turbulence are captured by the present method and the physical mechanisms leading to the changes are explained. The present results are in good agreement with prior numerical simulations, but some expected discrepancies with the experimental data for the T106C case are noted and discussed.
A principled approach to the measurement of situation awareness in commercial aviation

NASA Technical Reports Server (NTRS)

Tenney, Yvette J.; Adams, Marilyn Jager; Pew, Richard W.; Huggins, A. W. F.; Rogers, William H.

1992-01-01

The issue of how to support situation awareness among crews of modern commercial aircraft is becoming especially important with the introduction of automation in the form of sophisticated flight management computers and expert systems designed to assist the crew. In this paper, cognitive theories are discussed that have relevance for the definition and measurement of situation awareness. These theories suggest that comprehension of the flow of events is an active process that is limited by the modularity of attention and memory constraints, but can be enhanced by expert knowledge and strategies. Three implications of this perspective for assessing and improving situation awareness are considered: (1) Scenario variations are proposed that tax awareness by placing demands on attention; (2) Experimental tasks and probes are described for assessing the cognitive processes that underlie situation awareness; and (3) The use of computer-based human performance models to augment the measures of situation awareness derived from performance data is explored. Finally, two potential example applications of the proposed assessment techniques are described, one concerning spatial awareness using wide field of view displays and the other emphasizing fault management in aircraft systems.
Enabling a high throughput real time data pipeline for a large radio telescope array with GPUs

NASA Astrophysics Data System (ADS)

Edgar, R. G.; Clark, M. A.; Dale, K.; Mitchell, D. A.; Ord, S. M.; Wayth, R. B.; Pfister, H.; Greenhill, L. J.

2010-10-01

The Murchison Widefield Array (MWA) is a next-generation radio telescope currently under construction in the remote Western Australia Outback. Raw data will be generated continuously at 5 GiB s⁻¹, grouped into 8 s cadences. This high throughput motivates the development of on-site, real time processing and reduction in preference to archiving, transport and off-line processing. Each batch of 8 s data must be completely reduced before the next batch arrives. Maintaining real time operation will require a sustained performance of around 2.5 TFLOP s⁻¹ (including convolutions, FFTs, interpolations and matrix multiplications). We describe a scalable heterogeneous computing pipeline implementation, exploiting both the high computing density and FLOP-per-Watt ratio of modern GPUs. The architecture is highly parallel within and across nodes, with all major processing elements performed by GPUs. Necessary scatter-gather operations along the pipeline are loosely synchronized between the nodes hosting the GPUs. The MWA will be a frontier scientific instrument and a pathfinder for planned peta- and exa-scale facilities.
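The real-time constraint in the abstract above can be made concrete with a little arithmetic. The short Python sketch below uses only the figures quoted in the abstract (5 GiB per second, 8 s cadences, about 2.5 TFLOP per second). The variable names are illustrative, and the derived quantities are back-of-the-envelope estimates rather than numbers reported by the authors.

```python
GIB = 2**30  # bytes in one gibibyte

data_rate_bytes_per_s = 5 * GIB   # raw data rate quoted in the abstract
cadence_s = 8                     # length of one batch in seconds
sustained_flops = 2.5e12          # quoted sustained performance, FLOP per second

# Size of one 8 s batch that must be reduced before the next one arrives.
batch_bytes = data_rate_bytes_per_s * cadence_s

# Floating-point work available for that batch if the pipeline just keeps up.
batch_flop_budget = sustained_flops * cadence_s

# Average arithmetic intensity implied by the two quoted rates.
flop_per_byte = sustained_flops / data_rate_bytes_per_s

print(f"batch size: {batch_bytes / GIB:.0f} GiB")
print(f"work per batch: {batch_flop_budget:.1e} FLOP")
print(f"intensity: {flop_per_byte:.0f} FLOP per byte")
```

Under these figures each batch is 40 GiB and the implied intensity is several hundred floating-point operations per input byte, which is consistent with the abstract's emphasis on compute-dense GPUs.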
Job submission and management through web services: the experience with the CREAM service

NASA Astrophysics Data System (ADS)

Aiftimiei, C.; Andreetto, P.; Bertocco, S.; Fina, S. D.; Ronco, S. D.; Dorigo, A.; Gianelle, A.; Marzolla, M.; Mazzucato, M.; Sgaravatto, M.; Verlato, M.; Zangrando, L.; Corvo, M.; Miccio, V.; Sciaba, A.; Cesini, D.; Dongiovanni, D.; Grandi, C.

2008-07-01

Modern Grid middleware is built around components providing basic functionality, such as data storage, authentication, security, job management, resource monitoring and reservation. In this paper we describe the Computing Resource Execution and Management (CREAM) service. CREAM provides a Web service-based job execution and management capability for Grid systems; in particular, it is being used within the gLite middleware. CREAM exposes a Web service interface allowing conforming clients to submit and manage computational jobs to a Local Resource Management System. We developed a special component, called ICE (Interface to CREAM Environment) to integrate CREAM in gLite. ICE transfers job submissions and cancellations from the Workload Management System, allowing users to manage CREAM jobs from the gLite User Interface. This paper describes some recent studies aimed at assessing the performance and reliability of CREAM and ICE; those tests have been performed as part of the acceptance tests for integration of CREAM and ICE in gLite. We also discuss recent work towards enhancing CREAM with a BES and JSDL compliant interface.
Application of Pinniped Vibrissae to Aeropropulsion

NASA Technical Reports Server (NTRS)

Shyam, Vikram; Ameri, Ali; Poinsatte, Philip; Thurman, Douglas; Wroblewski, Adam; Snyder, Christopher

2015-01-01

Vibrissae of Phoca Vitulina (Harbor Seal) and Mirounga Angustirostris (Elephant Seal) possess undulations along their length. Harbor Seal Vibrissae were shown to reduce vortex induced vibrations and reduce drag compared to appropriately scaled cylinders and ellipses. Samples of Harbor Seal vibrissae, Elephant Seal vibrissae and California Sea Lion vibrissae were collected from the Marine Mammal Center in California. CT scanning, microscopy and 3D scanning techniques were utilized to characterize the whiskers. Computational fluid dynamics simulations of the whiskers were carried out to compare them to an ellipse and a cylinder. Leading edge parameters from the whiskers were used to create a 3D profile based on a modern power turbine blade. The NASA SW-2 facility was used to perform wind tunnel cascade testing on the 'Seal Blades'. Computational Fluid Dynamics simulations were used to study incidence angles from -37 to +10 degrees on the aerodynamic performance of the Seal Blade. The tests and simulations were conducted at a Reynolds number of 100,000. The Seal Blades showed consistent performance improvements over the baseline configuration. It was determined that a fuel burn reduction of approximately 5 could be achieved for a fixed wing aircraft. Noise reduction potential is also explored.
Architectural Techniques For Managing Non-volatile Caches

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mittal, Sparsh

As chip power dissipation becomes a critical challenge in scaling processor performance, computer architects are forced to fundamentally rethink the design of modern processors and hence, the chip-design industry is now at a major inflection point in its hardware roadmap. The high leakage power and low density of SRAM pose serious obstacles to its use for designing large on-chip caches and for this reason, researchers are exploring non-volatile memory (NVM) devices, such as spin torque transfer RAM, phase change RAM and resistive RAM. However, since NVMs are not strictly superior to SRAM, effective architectural techniques are required for making them a universal memory solution. This book discusses techniques for designing processor caches using NVM devices. It presents algorithms and architectures for improving their energy efficiency, performance and lifetime. It also provides both qualitative and quantitative evaluation to help readers gain insights and motivate them to explore further. This book will be highly useful for beginners as well as veterans in computer architecture, chip designers, product managers and technical marketing professionals.
VLSI Implementation of Fault Tolerance Multiplier based on Reversible Logic Gate

NASA Astrophysics Data System (ADS)

Ahmad, Nabihah; Hakimi Mokhtar, Ahmad; Othman, Nurmiza binti; Fhong Soon, Chin; Rahman, Ab Al Hadi Ab

2017-08-01

The multiplier is an essential component of digital systems such as digital signal processors, microprocessors and quantum computers, and is widely used in arithmetic units. Because of its complexity, the multiplier is highly prone to errors. This paper aims to design a 2×2-bit fault-tolerant multiplier based on reversible logic gates with low power consumption and high performance. The design was implemented in 90 nm Complementary Metal Oxide Semiconductor (CMOS) technology using Synopsys Electronic Design Automation (EDA) tools. The multiplier architecture is built from reversible logic gates. The fault-tolerant multiplier combines three reversible logic gates, the Double Feynman gate (F2G), the New Fault Tolerance (NFT) gate and the Islam Gate (IG), and occupies an area of 160 μm × 420.3 μm (about 67,250 μm², or 0.067 mm²). The design achieves a power consumption of 122.85 μW and a propagation delay of 16.99 ns. With its low power consumption, high performance and fault-tolerance capability, the proposed multiplier is suitable for modern computing applications.
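To illustrate why gates such as the Double Feynman gate are used in fault-tolerant reversible designs, the Python sketch below models F2G with the definition commonly given in the reversible-logic literature: inputs (A, B, C) map to outputs (A, A XOR B, A XOR C). The abstract itself does not spell out this mapping, so it is stated here as an assumption. The NFT and Islam gates are not modelled, and the helper names are hypothetical.

```python
from itertools import product


def double_feynman(a, b, c):
    """Double Feynman gate (F2G), assumed mapping: (A, B, C) -> (A, A^B, A^C)."""
    return a, a ^ b, a ^ c


def is_reversible(gate, n_bits):
    """A gate is reversible if distinct inputs always give distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n_bits)}
    return len(outputs) == 2**n_bits


def is_parity_preserving(gate, n_bits):
    """Check that the XOR of the inputs equals the XOR of the outputs."""
    for bits in product((0, 1), repeat=n_bits):
        out = gate(*bits)
        if sum(bits) % 2 != sum(out) % 2:
            return False
    return True


print(is_reversible(double_feynman, 3))         # one-to-one on all 8 inputs
print(is_parity_preserving(double_feynman, 3))  # input parity equals output parity
```

Parity preservation is the property usually meant by "fault tolerant" in this context: a single-bit error changes the output parity, so it can be detected by comparing input and output parity.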
Petascale computation of multi-physics seismic simulations

NASA Astrophysics Data System (ADS)

Gabriel, Alice-Agnes; Madden, Elizabeth H.; Ulrich, Thomas; Wollherr, Stephanie; Duru, Kenneth C.

2017-04-01

Capturing the observed complexity of earthquake sources in concurrence with seismic wave propagation simulations is an inherently multi-scale, multi-physics problem. We present simulations of earthquake scenarios resolving high-detail dynamic rupture evolution and high-frequency ground motion. The simulations combine a multitude of representations of model complexity, such as non-linear fault friction, thermal and fluid effects, heterogeneous fault stress and fault strength initial conditions, fault curvature and roughness, and on- and off-fault non-elastic failure, to capture dynamic rupture behavior at the source, together with seismic wave attenuation, 3D subsurface structure and bathymetry, which affect seismic wave propagation. Performing such scenarios at the necessary spatio-temporal resolution requires highly optimized and massively parallel simulation tools which can efficiently exploit HPC facilities. Our up to multi-PetaFLOP simulations are performed with SeisSol (www.seissol.org), an open-source software package based on an ADER-Discontinuous Galerkin (DG) scheme solving the seismic wave equations in velocity-stress formulation in elastic, viscoelastic, and viscoplastic media with high-order accuracy in time and space. Our flux-based implementation of frictional failure remains free of spurious oscillations. Tetrahedral unstructured meshes allow for complicated model geometry. SeisSol has been optimized on all software levels, including: assembler-level DG kernels which obtain 50% of peak performance on some of the largest supercomputers worldwide; an overlapping MPI-OpenMP parallelization shadowing the multiphysics computations; usage of local time stepping; parallel input and output schemes and direct interfaces to community standard data formats. All these factors aim to minimise the time-to-solution.
The results presented highlight the fact that modern numerical methods and hardware-aware optimization for modern supercomputers are essential to further our understanding of earthquake source physics and complement both physics-based ground motion research and empirical approaches in seismic hazard analysis. Lastly, we will conclude with an outlook on future exascale ADER-DG solvers for seismological applications.
A highly efficient multi-core algorithm for clustering extremely large datasets

PubMed Central

2010-01-01

Background In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms rely largely on network communication protocols and require multiple connected computers. One answer to this problem is to utilize the intrinsic capabilities of current multi-core hardware to distribute the tasks among the different cores of one computer. Results We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorical SNP data. Our new shared memory parallel algorithms prove to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. Computation speed of our Java based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network based parallelization. Conclusions Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that, using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer. PMID:20370922
Performance Studies on Distributed Virtual Screening

PubMed Central

Krüger, Jens; de la Garza, Luis; Kohlbacher, Oliver; Nagel, Wolfgang E.

2014-01-01

Virtual high-throughput screening (vHTS) is an invaluable method in modern drug discovery. It permits screening large datasets or databases of chemical structures for those structures that may bind to a drug target. Virtual screening is typically performed by docking code, which often runs sequentially. Processing of huge vHTS datasets can be parallelized by chunking the data because individual docking runs are independent of each other. The goal of this work is to find an optimal splitting that maximizes the speedup while considering overhead and available cores on Distributed Computing Infrastructures (DCIs). We have conducted thorough performance studies accounting not only for the runtime of the docking itself, but also for structure preparation. Performance studies were conducted via the workflow-enabled science gateway MoSGrid (Molecular Simulation Grid). As input we used benchmark datasets for protein kinases. Our performance studies show that docking workflows can be made to scale almost linearly up to 500 concurrent processes distributed even over large DCIs, thus accelerating vHTS campaigns significantly. PMID:25032219
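The trade-off described in the abstract above, splitting independent docking runs into chunks while paying a fixed overhead per chunk, can be sketched with a toy model. The Python code below is an assumption-laden illustration, not the authors' method or the MoSGrid workflow: `chunk` and `estimated_speedup` are hypothetical helpers, every docking run is assumed to take the same time, and the overhead is assumed to be a constant per chunk.

```python
import math


def chunk(items, n_chunks):
    """Split items into n_chunks contiguous chunks of nearly equal size."""
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)  # first `extra` chunks get one more
        chunks.append(items[start:start + size])
        start += size
    return chunks


def estimated_speedup(n_items, t_item, n_chunks, cores, overhead):
    """Toy speedup model: chunks run in waves of at most `cores` at a time.

    t_item   -- assumed runtime of one docking run (seconds)
    overhead -- assumed fixed cost per chunk (staging, preparation, queueing)
    """
    serial_time = n_items * t_item
    largest_chunk = math.ceil(n_items / n_chunks)
    chunk_time = largest_chunk * t_item + overhead
    waves = math.ceil(n_chunks / cores)
    return serial_time / (waves * chunk_time)


# Compare a few chunk counts for 10,000 ligands on 500 cores (made-up numbers).
for n_chunks in (100, 500, 1000, 5000):
    s = estimated_speedup(10_000, t_item=30.0, n_chunks=n_chunks,
                          cores=500, overhead=60.0)
    print(n_chunks, round(s, 1))
```

In this model, too few chunks leave cores idle and too many chunks let the per-chunk overhead dominate, so the best splitting sits near the number of available cores. Real workflows would need measured runtimes and overheads in place of these made-up values.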
Optimizing the Performance of Reactive Molecular Dynamics Simulations for Multi-core Architectures

DOE Office of Scientific and Technical Information (OSTI.GOV)

Aktulga, Hasan Metin; Coffman, Paul; Shan, Tzu-Ray

2015-12-01

Hybrid parallelism allows high performance computing applications to better leverage the increasing on-node parallelism of modern supercomputers. In this paper, we present a hybrid parallel implementation of the widely used LAMMPS/ReaxC package, where the construction of bonded and nonbonded lists and evaluation of complex ReaxFF interactions are implemented efficiently using OpenMP parallelism. Additionally, the performance of the QEq charge equilibration scheme is examined and a dual-solver is implemented. We present the performance of the resulting ReaxC-OMP package on a state-of-the-art multi-core architecture Mira, an IBM BlueGene/Q supercomputer. For system sizes ranging from 32 thousand to 16.6 million particles, speedups in the range of 1.5-4.5x are observed using the new ReaxC-OMP software. Sustained performance improvements have been observed for up to 262,144 cores (1,048,576 processes) of Mira with a weak scaling efficiency of 91.5% in larger simulations containing 16.6 million particles.
Modeling and Simulation of High Dimensional Stochastic Multiscale PDE Systems at the Exascale

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kevrekidis, Ioannis

2017-03-22

The thrust of the proposal was to exploit modern data-mining tools in a way that will create a systematic, computer-assisted approach to the representation of random media -- and also to the representation of the solutions of an array of important physicochemical processes that take place in/on such media. A parsimonious representation/parametrization of the random media links directly (via uncertainty quantification tools) to good sampling of the distribution of random media realizations. It also links directly to modern multiscale computational algorithms (like the equation-free approach that has been developed in our group) and plays a crucial role in accelerating the scientific computation of solutions of nonlinear PDE models (deterministic or stochastic) in such media – both solutions in particular realizations of the random media, and estimation of the statistics of the solutions over multiple realizations (e.g. expectations).
