Science.gov

Sample records for massively parallel electrical

  1. Massively parallel visualization: Parallel rendering

    SciTech Connect

    Hansen, C.D.; Krogh, M.; White, W.

    1995-12-01

    This paper presents rendering algorithms, developed for massively parallel processors (MPPs), for polygonal, sphere, and volumetric data. The polygon algorithm uses a data parallel approach whereas the sphere and volume renderers use a MIMD approach. Implementations of these algorithms are presented for the Thinking Machines Corporation CM-5 MPP.

  2. Massively parallel electrical conductivity imaging of the subsurface: Applications to hydrocarbon exploration

    SciTech Connect

    Newman, G.A.; Commer, M.

    2009-06-01

    Three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. The imaging technology employs controlled source electromagnetic (CSEM) and magnetotelluric (MT) fields and treats geological media exhibiting transverse anisotropy. Moreover, when combined with established seismic methods, direct imaging of reservoir fluids is possible. Because of the size of the 3D conductivity imaging problem, strategies are required that exploit computational parallelism and optimal meshing. The algorithm thus developed has been shown to scale to tens of thousands of processors. In one imaging experiment, 32,768 tasks/processors on the IBM Watson Research Blue Gene/L supercomputer were successfully utilized. Over a 24-hour period we were able to image a large-scale field data set that previously required over four months of processing time on distributed clusters based on Intel or AMD processors utilizing 1024 tasks on an InfiniBand fabric. Electrical conductivity imaging using massively parallel computational resources produces results that cannot be obtained otherwise and are consistent with timeframes required for practical exploration problems.

  3. Massively parallel electrical conductivity imaging of the subsurface: Applications to hydrocarbon exploration

    NASA Astrophysics Data System (ADS)

    Newman, Gregory A.; Commer, Michael

    2009-07-01

    Three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. The imaging technology employs controlled source electromagnetic (CSEM) and magnetotelluric (MT) fields and treats geological media exhibiting transverse anisotropy. Moreover, when combined with established seismic methods, direct imaging of reservoir fluids is possible. Because of the size of the 3D conductivity imaging problem, strategies are required that exploit computational parallelism and optimal meshing. The algorithm thus developed has been shown to scale to tens of thousands of processors. In one imaging experiment, 32,768 tasks/processors on the IBM Watson Research Blue Gene/L supercomputer were successfully utilized. Over a 24-hour period we were able to image a large-scale field data set that previously required over four months of processing time on distributed clusters based on Intel or AMD processors utilizing 1024 tasks on an InfiniBand fabric. Electrical conductivity imaging using massively parallel computational resources produces results that cannot be obtained otherwise and are consistent with timeframes required for practical exploration problems.

  4. Massively parallel mathematical sieves

    SciTech Connect

    Montry, G.R.

    1989-01-01

    The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
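
    As a concrete illustration of the idea (a block-decomposed sieve on a shared-memory machine, not the paper's hypercube implementation or its scattered decomposition), a minimal Python sketch:

        from math import isqrt
        from multiprocessing import Pool

        def base_primes(limit):
            # Serial sieve for the seed primes up to sqrt(n).
            flags = bytearray([1]) * (limit + 1)
            flags[0:2] = b"\x00\x00"
            for p in range(2, isqrt(limit) + 1):
                if flags[p]:
                    flags[p * p::p] = bytearray(len(flags[p * p::p]))
            return [p for p in range(2, limit + 1) if flags[p]]

        def sieve_block(args):
            lo, hi, seeds = args                 # half-open slice [lo, hi)
            flags = bytearray([1]) * (hi - lo)
            for p in seeds:
                start = max(p * p, ((lo + p - 1) // p) * p)
                if start < hi:
                    flags[start - lo::p] = bytearray(len(flags[start - lo::p]))
            return [lo + i for i, f in enumerate(flags) if f]

        def parallel_sieve(n, workers=8):
            seeds = base_primes(isqrt(n))
            step = -(-(n - 1) // workers)        # ceil: one block per worker
            tasks = [(lo, min(lo + step, n + 1), seeds)
                     for lo in range(2, n + 1, step)]
            with Pool(workers) as pool:
                parts = pool.map(sieve_block, tasks)
            return [p for part in parts for p in part]

        if __name__ == "__main__":               # guard required by multiprocessing
            print(len(parallel_sieve(1_000_000)))  # 78498 primes below 10**6

    Fixing the total problem size while adding workers probes the first speedup regime above; fixing the block size per worker probes the second.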

  5. Massively Parallel QCD

    SciTech Connect

    Soltz, R; Vranas, P; Blumrich, M; Chen, D; Gara, A; Giampapa, M; Heidelberger, P; Salapura, V; Sexton, J; Bhanot, G

    2007-04-11

    The theory of the strong nuclear force, Quantum Chromodynamics (QCD), can be numerically simulated from first principles on massively-parallel supercomputers using the method of Lattice Gauge Theory. We describe the special programming requirements of lattice QCD (LQCD) as well as the optimal supercomputer hardware architectures that it suggests. We demonstrate these methods on the BlueGene massively-parallel supercomputer and argue that LQCD and the BlueGene architecture are a natural match. This can be traced to the simple fact that LQCD is a regular lattice discretization of space into lattice sites while the BlueGene supercomputer is a discretization of space into compute nodes, and that both are constrained by requirements of locality. This simple relation is both technologically important and theoretically intriguing. The main result of this paper is the speedup of LQCD using up to 131,072 CPUs on the largest BlueGene/L supercomputer. The speedup is perfect with sustained performance of about 20% of peak. This corresponds to a maximum of 70.5 sustained TFlop/s. At these speeds LQCD and BlueGene are poised to produce the next generation of strong interaction physics theoretical results.

  6. Parallel rendering techniques for massively parallel visualization

    SciTech Connect

    Hansen, C.; Krogh, M.; Painter, J.

    1995-07-01

    As the resolution of simulation models increases, scientific visualization algorithms which take advantage of the large memory and parallelism of Massively Parallel Processors (MPPs) are becoming increasingly important. For large applications rendering on the MPP tends to be preferable to rendering on a graphics workstation due to the MPP's abundant resources: memory, disk, and numerous processors. The challenge becomes developing algorithms that can exploit these resources while minimizing overhead, typically communication costs. This paper will describe recent efforts in parallel rendering for polygonal primitives as well as parallel volumetric techniques. This paper presents rendering algorithms, developed for massively parallel processors (MPPs), for polygonal, sphere, and volumetric data. The polygon algorithm uses a data parallel approach whereas the sphere and volume renderers use a MIMD approach. Implementations of these algorithms are presented for the Thinking Machines Corporation CM-5 MPP.

  7. Massively-parallel electrical-conductivity imaging of hydrocarbons using the Blue Gene/L supercomputer

    SciTech Connect

    Commer, M.; Newman, G.A.; Carazzone, J.J.; Dickens, T.A.; Green, K.E.; Wahrmund, L.A.; Willen, D.E.; Shiu, J.

    2007-05-16

    Large-scale controlled source electromagnetic (CSEM) three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. To cope with the typically large computational requirements of the 3D CSEM imaging problem, our strategies exploit computational parallelism and optimized finite-difference meshing. We report on an imaging experiment, utilizing 32,768 tasks/processors on the IBM Watson Research Blue Gene/L (BG/L) supercomputer. Over a 24-hour period, we were able to image a large-scale marine CSEM field data set that previously required over four months of computing time on distributed clusters utilizing 1024 tasks on an InfiniBand fabric. The total initial data misfit could be decreased by 67 percent within 72 completed inversion iterations, indicating an electrically resistive region in the southern survey area below a depth of 1500 m below the seafloor. The major part of the residual misfit stems from transmitter-parallel receiver components that have an offset from the transmitter sail line (broadside configuration). Modeling confirms that improved broadside data fits can be achieved by considering anisotropic electrical conductivities. While delivering a satisfactory gross-scale image for the depths of interest, the experiment provides important evidence for the necessity of discriminating between horizontal and vertical conductivities for maximally consistent 3D CSEM inversions.

  8. Efficient, massively parallel eigenvalue computation

    NASA Technical Reports Server (NTRS)

    Huo, Yan; Schreiber, Robert

    1993-01-01

    In numerical simulations of disordered electronic systems, one of the most common approaches is to diagonalize random Hamiltonian matrices and to study the eigenvalues and eigenfunctions of a single electron in the presence of a random potential. An effort to implement a matrix diagonalization routine for real symmetric dense matrices on massively parallel SIMD computers, the MasPar MP-1 and MP-2 systems, is described. Results of numerical tests and timings are also presented.

  9. Massively parallel MRI detector arrays

    NASA Astrophysics Data System (ADS)

    Keil, Boris; Wald, Lawrence L.

    2013-04-01

    Originally proposed as a method to increase sensitivity by extending the locally high sensitivity of small surface coil elements to larger areas via reception, the term parallel imaging now includes the use of array coils to perform image encoding. This methodology has impacted clinical imaging to the point where many examinations are performed with an array comprising multiple smaller surface coil elements as the detector of the MR signal. This article reviews the theoretical and experimental basis for the trend towards higher channel counts relying on insights gained from modeling and experimental studies as well as the theoretical analysis of the so-called “ultimate” SNR and g-factor. We also review the methods for optimally combining array data and changes in RF methodology needed to construct massively parallel MRI detector arrays and show some examples of the state of the art for highly accelerated imaging with the resulting highly parallel arrays.

  10. Massively Parallel MRI Detector Arrays

    PubMed Central

    Keil, Boris; Wald, Lawrence L

    2013-01-01

    Originally proposed as a method to increase sensitivity by extending the locally high sensitivity of small surface coil elements to larger areas, the term parallel imaging now includes the use of array coils to perform image encoding. This methodology has impacted clinical imaging to the point where many examinations are performed with an array comprising multiple smaller surface coil elements as the detector of the MR signal. This article reviews the theoretical and experimental basis for the trend towards higher channel counts relying on insights gained from modeling and experimental studies as well as the theoretical analysis of the so-called “ultimate” SNR and g-factor. We also review the methods for optimally combining array data and changes in RF methodology needed to construct massively parallel MRI detector arrays and show some examples of the state of the art for highly accelerated imaging with the resulting highly parallel arrays. PMID:23453758

  11. Massively parallel MRI detector arrays.

    PubMed

    Keil, Boris; Wald, Lawrence L

    2013-04-01

    Originally proposed as a method to increase sensitivity by extending the locally high sensitivity of small surface coil elements to larger areas via reception, the term parallel imaging now includes the use of array coils to perform image encoding. This methodology has impacted clinical imaging to the point where many examinations are performed with an array comprising multiple smaller surface coil elements as the detector of the MR signal. This article reviews the theoretical and experimental basis for the trend towards higher channel counts relying on insights gained from modeling and experimental studies as well as the theoretical analysis of the so-called "ultimate" SNR and g-factor. We also review the methods for optimally combining array data and changes in RF methodology needed to construct massively parallel MRI detector arrays and show some examples of the state of the art for highly accelerated imaging with the resulting highly parallel arrays. PMID:23453758

  12. Merlin - Massively parallel heterogeneous computing

    NASA Technical Reports Server (NTRS)

    Wittie, Larry; Maples, Creve

    1989-01-01

    Hardware and software for Merlin, a new kind of massively parallel computing system, are described. Eight computers are linked as a 300-MIPS prototype to develop system software for a larger Merlin network with 16 to 64 nodes, totaling 600 to 3000 MIPS. These working prototypes help refine a mapped reflective memory technique that offers a new, very general way of linking many types of computer to form supercomputers. Processors share data selectively and rapidly on a word-by-word basis. Fast firmware virtual circuits are reconfigured to match topological needs of individual application programs. Merlin's low-latency memory-sharing interfaces solve many problems in the design of high-performance computing systems. The Merlin prototypes are intended to run parallel programs for scientific applications and to determine hardware and software needs for a future Teraflops Merlin network.

  13. Massively parallel quantum computer simulator

    NASA Astrophysics Data System (ADS)

    De Raedt, K.; Michielsen, K.; De Raedt, H.; Trieu, B.; Arnold, G.; Richter, M.; Lippert, Th.; Watanabe, H.; Ito, N.

    2007-01-01

    We describe portable software to simulate universal quantum computers on massively parallel computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as an IBM BlueGene/L, an IBM Regatta p690+, a Hitachi SR11000/J1, a Cray X1E, an SGI Altix 3700 and clusters of PCs running Windows XP. We study the performance of the software by simulating quantum computers containing up to 36 qubits, using up to 4096 processors and up to 1 TB of memory. Our results demonstrate that the simulator exhibits nearly ideal scaling as a function of the number of processors and suggest that the simulation software described in this paper may also serve as a benchmark for testing high-end parallel computers.
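
    The heart of such a simulator is a state vector of 2**n complex amplitudes with gates applied as tensor contractions; at 16 bytes per amplitude, 36 qubits need 2**36 x 16 B = 1 TiB, consistent with the figures quoted above. A minimal NumPy sketch (illustrative, not the paper's distributed code):

        import numpy as np

        def apply_gate(state, gate, target, n):
            # View the 2**n amplitudes as an n-axis tensor, contract the
            # target qubit's axis with the 2x2 gate, and restore axis order.
            psi = state.reshape([2] * n)
            psi = np.tensordot(gate, psi, axes=([1], [target]))
            return np.moveaxis(psi, 0, target).reshape(-1)

        H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate

        n = 3
        state = np.zeros(2**n, dtype=complex)
        state[0] = 1.0                       # start in |000>
        for q in range(n):                   # H on every qubit
            state = apply_gate(state, H, q, n)
        print(np.abs(state)**2)              # uniform 1/8 over all 8 outcomes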

  14. Massively parallel femtosecond laser processing.

    PubMed

    Hasegawa, Satoshi; Ito, Haruyasu; Toyoda, Haruyoshi; Hayasaki, Yoshio

    2016-08-01

    Massively parallel femtosecond laser processing with more than 1000 beams was demonstrated. Parallel beams were generated by a computer-generated hologram (CGH) displayed on a spatial light modulator (SLM). The key to this technique is to optimize the CGH in the laser processing system using a scheme called in-system optimization. It was analytically demonstrated that the number of beams is determined by the horizontal number of pixels in the SLM, N_SLM, that is imaged at the pupil plane of an objective lens, and a distance parameter, p_d, obtained by dividing the distance between adjacent beams by the diffraction-limited beam diameter. A performance limitation of parallel laser processing in our system was estimated at N_SLM of 250 and p_d of 7.0. Based on these parameters, the maximum number of beams in a hexagonal close-packed structure was calculated to be 1189 by using an analytical equation. PMID:27505815

  15. Multigrid on massively parallel architectures

    SciTech Connect

    Falgout, R D; Jones, J E

    1999-09-17

    The scalable implementation of multigrid methods for machines with several thousands of processors is investigated. Parallel performance models are presented for three different structured-grid multigrid algorithms, and a description is given of how these models can be used to guide implementation. Potential pitfalls are illustrated when moving from moderate-sized parallelism to large-scale parallelism, and results are given from existing multigrid codes to support the discussion. Finally, the use of mixed programming models is investigated for multigrid codes on clusters of SMPs.

  16. Massively parallel neurocomputing for aerospace applications

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Barhen, Jacob; Toomarian, Nikzad

    1993-01-01

    An innovative hybrid, analog-digital charge-domain technology, for the massively parallel VLSI implementation of certain large scale matrix-vector operations, has recently been introduced. It employs arrays of Charge Coupled/Charge Injection Device cells holding an analog matrix of charge, which process digital vectors in parallel by means of binary, non-destructive charge transfer operations. The impact of this technology on massively parallel processing is discussed. Fundamentally new classes of algorithms, specifically designed for this emerging technology, as applied to signal processing, are derived.

  17. Massively Parallel Computing: A Sandia Perspective

    SciTech Connect

    Dosanjh, Sudip S.; Greenberg, David S.; Hendrickson, Bruce; Heroux, Michael A.; Plimpton, Steve J.; Tomkins, James L.; Womble, David E.

    1999-05-06

    The computing power available to scientists and engineers has increased dramatically in the past decade, due in part to progress in making massively parallel computing practical and available. The expectation for these machines has been great. The reality is that progress has been slower than expected. Nevertheless, massively parallel computing is beginning to realize its potential for enabling significant breakthroughs in science and engineering. This paper provides a perspective on the state of the field, colored by the authors' experiences using large scale parallel machines at Sandia National Laboratories. We address trends in hardware, system software and algorithms, and we also offer our view of the forces shaping the parallel computing industry.

  18. Massive parallelism in the future of science

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.

    1988-01-01

    Massive parallelism appears in three domains of action of concern to scientists, where it produces collective action that is not possible from any individual agent's behavior. In the domain of data parallelism, computers comprising very large numbers of processing agents, one for each data item in the result, will be designed. These agents collectively can solve problems thousands of times faster than current supercomputers. In the domain of distributed parallelism, computations comprising large numbers of resources attached to the world network will be designed. The network will support computations far beyond the power of any one machine. In the domain of people parallelism, collaborations among large groups of scientists around the world who participate in projects that endure well past the sojourns of individuals within them will be designed. Computing and telecommunications technology will support the large, long projects that will characterize big science by the turn of the century. Scientists must become masters in these three domains during the coming decade.

  19. Massively parallel sequencing and rare disease

    PubMed Central

    Ng, Sarah B.; Nickerson, Deborah A.; Bamshad, Michael J.; Shendure, Jay

    2010-01-01

    Massively parallel sequencing has enabled the rapid, systematic identification of variants on a large scale. This has, in turn, accelerated the pace of gene discovery and disease diagnosis on a molecular level and has the potential to revolutionize methods, particularly for the analysis of Mendelian disease. Using massively parallel sequencing has enabled investigators to interrogate variants both in the context of linkage intervals and also on a genome-wide scale, in the absence of linkage information entirely. The primary challenge now is to distinguish between background polymorphisms and pathogenic mutations. Recently developed strategies for rare monogenic disorders have met with some early success. These strategies include filtering for potential causal variants based on frequency and function, and also ranking variants based on conservation scores and predicted deleteriousness to protein structure. Here, we review the recent literature in the use of high-throughput sequence data and its analysis in the discovery of causal mutations for rare disorders. PMID:20846941

  20. Associative massively parallel processor for video processing

    NASA Astrophysics Data System (ADS)

    Krikelis, Argy; Tawiah, T.

    1996-03-01

    Massively parallel processing architectures have matured primarily through image processing and computer vision applications. The similarity of processing requirements between these areas and video processing suggests that they should be very appropriate for video processing applications. This research describes the use of an associative massively parallel processing based system for video compression, including an architectural and system description, discussion of the implementation of compression tasks such as DCT/IDCT, motion estimation and quantization, and system evaluation. The core of the processing system is the ASP (Associative String Processor) architecture, a modular, massively parallel, programmable and inherently fault-tolerant fine-grain SIMD processing architecture incorporating a string of identical APEs (Associative Processing Elements), a reconfigurable inter-processor communication network and a Vector Data Buffer for fully-overlapped data input-output. For video compression applications a prototype system was developed which uses ASP modules to implement the required compression tasks. This scheme leads to a linear speed-up of the computation by simply adding more APEs to the modules.

  1. Template based parallel checkpointing in a massively parallel computer system

    DOEpatents

    Archer, Charles Jens; Inglett, Todd Alan

    2009-01-13

    A method and apparatus for a template-based parallel checkpoint save for a massively parallel supercomputer system using a parallel variation of the rsync protocol and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in storage and was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored, for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high-speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
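
    A single-node sketch of the rsync-flavored template comparison (block size, hash, and compression choices are illustrative, not from the patent):

        import hashlib, zlib

        BLOCK = 64 * 1024  # illustrative block size

        def block_digests(data):
            # Checksums of the previously produced template, one per block.
            return [hashlib.md5(data[i:i + BLOCK]).digest()
                    for i in range(0, len(data), BLOCK)]

        def delta_against_template(checkpoint, template_digests):
            # Keep only blocks whose checksum differs from the template's,
            # and compress the survivors before sending or storing them.
            delta = {}
            for off in range(0, len(checkpoint), BLOCK):
                block = checkpoint[off:off + BLOCK]
                i = off // BLOCK
                if (i >= len(template_digests)
                        or hashlib.md5(block).digest() != template_digests[i]):
                    delta[i] = zlib.compress(block)
            return delta

    Each node would compute its delta against template digests broadcast over the interconnect, so only changed blocks travel to storage.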

  2. Efficient communication in massively parallel computers

    SciTech Connect

    Cypher, R.E.

    1989-01-01

    A fundamental operation in parallel computation is sorting. Sorting is important not only because it is required by many algorithms, but also because it can be used to implement irregular, pointer-based communication. The author studies two algorithms for sorting in massively parallel computers. First, he examines Shellsort. Shellsort is a sorting algorithm that is based on a sequence of parameters called increments. Shellsort can be used to create a parallel sorting device known as a sorting network. Researchers have suggested that if the correct increment sequence is used, an optimal size sorting network can be obtained. All published increment sequences have been monotonically decreasing. He shows that no monotonically decreasing increment sequence will yield an optimal size sorting network. Second, he presents a sorting algorithm called Cubesort. Cubesort is the fastest known sorting algorithm for a variety of parallel computers over a wide range of parameters. He also presents a paradigm for developing parallel algorithms that have efficient communication. The paradigm, called the data reduction paradigm, consists of using a divide-and-conquer strategy. Both the division and combination phases of the divide-and-conquer algorithm may require irregular, pointer-based communication between processors. However, the problem is divided so as to limit the amount of data that must be communicated. As a result the communication can be performed efficiently. He presents data reduction algorithms for the image component labeling problem, the closest pair problem and four versions of the parallel prefix problem.
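
    For reference, Shellsort's dependence on its increment sequence is visible directly in code; the increments below are Knuth's (3**k - 1)/2 sequence, one of the monotonically decreasing sequences the result above concerns:

        def shellsort(a, increments=(121, 40, 13, 4, 1)):
            # h-sort the array for each increment h in turn; the final
            # h = 1 pass is plain insertion sort on an almost-sorted array.
            for h in increments:
                for i in range(h, len(a)):
                    x, j = a[i], i
                    while j >= h and a[j - h] > x:
                        a[j] = a[j - h]
                        j -= h
                    a[j] = x
            return a

        print(shellsort([5, 3, 8, 1, 9, 2, 7]))  # [1, 2, 3, 5, 7, 8, 9]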

  3. Massively Parallel Direct Simulation of Multiphase Flow

    SciTech Connect

    COOK,BENJAMIN K.; PREECE,DALE S.; WILLIAMS,J.R.

    2000-08-10

    The authors' understanding of multiphase physics and the associated predictive capability for multi-phase systems are severely limited by current continuum modeling methods and experimental approaches. This research will deliver an unprecedented modeling capability to directly simulate three-dimensional multi-phase systems at the particle-scale. The model solves the fully coupled equations of motion governing the fluid phase and the individual particles comprising the solid phase using a newly discovered, highly efficient coupled numerical method based on the discrete-element method and the Lattice-Boltzmann method. A massively parallel implementation will enable the solution of large, physically realistic systems.

  4. Time sharing massively parallel machines. Draft

    SciTech Connect

    Gorda, B.; Wolski, R.

    1995-03-01

    As part of the Massively Parallel Computing Initiative (MPCI) at the Lawrence Livermore National Laboratory, the authors have developed a simple, effective and portable time sharing mechanism by scheduling gangs of processes on tightly coupled parallel machines. By time-sharing the resources, the system interleaves production and interactive jobs. Immediate priority is given to interactive use, maintaining good response time. Production jobs are scheduled during idle periods, making use of the otherwise unused resources. In this paper the authors discuss their experience with gang scheduling over the 3-year lifetime of the project. In section 2, they motivate the project and discuss some of its details. Section 3.0 describes the general scheduling problem and how gang scheduling addresses it. In section 4.0, they describe the implementation. Section 8.0 presents results culled over the lifetime of the project. They conclude this paper with some observations and possible future directions.

  5. Seismic imaging on massively parallel computers

    SciTech Connect

    Ober, C.C.; Oldfield, R.A.; Womble, D.E.; Mosher, C.C.

    1997-07-01

    A key to reducing the risks and costs associated with oil and gas exploration is the fast, accurate imaging of complex geologies, such as salt domes in the Gulf of Mexico and overthrust regions in US onshore regions. Pre-stack depth migration generally yields the most accurate images, and one approach to this is to solve the scalar-wave equation using finite differences. Current industry computational capabilities are insufficient for the application of finite-difference, 3-D, prestack, depth-migration algorithms. High performance computers and state-of-the-art algorithms and software are required to meet this need. As part of an ongoing ACTI project funded by the US Department of Energy, the authors have developed a finite-difference, 3-D prestack, depth-migration code for massively parallel computer systems. The goal of this work is to demonstrate that massively parallel computers (thousands of processors) can be used efficiently for seismic imaging, and that sufficient computing power exists (or soon will exist) to make finite-difference, prestack, depth migration practical for oil and gas exploration.
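
    The kernel of such a code is the finite-difference update of the scalar wave equation. A minimal serial 2D sketch (periodic edges via np.roll for brevity; a parallel migration code would decompose the grid across processors and exchange halo layers every step):

        import numpy as np

        def wave_step(u_prev, u_curr, c, dt, dx):
            # Leapfrog update of u_tt = c**2 * (u_xx + u_zz); stable when
            # c * dt / dx <= 1 / sqrt(2) in 2D.
            lap = (np.roll(u_curr, 1, 0) + np.roll(u_curr, -1, 0) +
                   np.roll(u_curr, 1, 1) + np.roll(u_curr, -1, 1) - 4 * u_curr)
            return 2 * u_curr - u_prev + (c * dt / dx) ** 2 * lap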

  6. Massive hybrid parallelism for fully implicit multiphysics

    SciTech Connect

    Gaston, D. R.; Permann, C. J.; Andrs, D.; Peterson, J. W.

    2013-07-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design is given. Representative massively parallel results from several application areas are presented, and a brief discussion of future areas of research for the framework is provided.

  7. MASSIVE HYBRID PARALLELISM FOR FULLY IMPLICIT MULTIPHYSICS

    SciTech Connect

    Cody J. Permann; David Andrs; John W. Peterson; Derek R. Gaston

    2013-05-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design is given. Representative massively parallel results from several application areas are presented, and a brief discussion of future areas of research for the framework is provided.

  8. Solid modeling on a massively parallel processor

    SciTech Connect

    Strip, D.; Karasick, M.

    1992-01-01

    Solid modeling underlies many technologies that are key to modern manufacturing. These range from computer-aided design systems to robot simulators, from finite element analysis to integrated circuit process modeling. The accuracy, and hence the utility, of these models is often constrained by the amount of computer time required to perform the desired operations. This paper presents a family of algorithms for solid modeling operations using the Connection Machine, a massively parallel SIMD processor. The authors describe a data structure for representing solid models and algorithms that use the representation to implement efficiently a variety of solid modeling operations. The authors give a sketch of the algorithm for intersecting solids and present computational experience using these algorithms. The data structure and algorithms are contrasted with those of serial architectures, and execution times are compared.

  9. Massively parallel neural network intelligent browse

    NASA Astrophysics Data System (ADS)

    Maxwell, Thomas P.; Zion, Philip M.

    1992-04-01

    A massively parallel neural network architecture is currently being developed as a potential component of a distributed information system in support of NASA's Earth Observing System. This architecture can be trained, via an iterative learning process, to recognize objects in images based on texture features, allowing scientists to search for all patterns which are similar to a target pattern in a database of images. It may facilitate scientific inquiry by allowing scientists to automatically search for physical features of interest in a database through computer pattern recognition, alleviating the need for exhaustive visual searches through possibly thousands of images. The architecture is implemented on a Connection Machine such that each physical processor contains a simulated 'neuron' which views a feature vector derived from a subregion of the input image. Each of these neurons is trained, via the perceptron rule, to identify the same pattern. The network output gives a probability distribution over the input image of finding the target pattern in a given region. In initial tests the architecture was trained to separate regions containing clouds from clear regions in 512 by 512 pixel AVHRR images. We found that in about 10 minutes we can train a network to perform with high accuracy in recognizing clouds which were texturally similar to a target cloud group. These promising results suggest that this type of architecture may play a significant role in coping with the forthcoming flood of data from the Earth-monitoring missions of the major space-faring nations.
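
    The update at each simulated neuron is the classic perceptron rule; a minimal serial statement (in the paper, every Connection Machine processor applies the same update to the feature vector of its own image subregion):

        import numpy as np

        def train_perceptron(features, labels, lr=0.1, epochs=100):
            # Perceptron rule: weights move only on misclassified examples.
            w = np.zeros(features.shape[1])
            b = 0.0
            for _ in range(epochs):
                for x, t in zip(features, labels):   # t in {0, 1}
                    y = 1 if x @ w + b > 0 else 0
                    w += lr * (t - y) * x
                    b += lr * (t - y)
            return w, b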

  10. Seismic imaging on massively parallel computers

    SciTech Connect

    Ober, C.C.; Oldfield, R.; Womble, D.E.; VanDyke, J.; Dosanjh, S.

    1996-03-01

    Fast, accurate imaging of complex, oil-bearing geologies, such as overthrusts and salt domes, is the key to reducing the costs of domestic oil and gas exploration. Geophysicists say that the known oil reserves in the Gulf of Mexico could be significantly increased if accurate seismic imaging beneath salt domes were possible. A range of techniques exist for imaging these regions, but the highly accurate techniques involve the solution of the wave equation and are characterized by large data sets and large computational demands. Massively parallel computers can provide the computational power for these highly accurate imaging techniques. A brief introduction to seismic processing will be presented, and the implementation of a seismic-imaging code for distributed memory computers will be discussed. The portable code, Salvo, performs wave equation-based, 3-D, prestack, depth imaging and currently runs on the Intel Paragon and the Cray T3D. It uses MPI for portability and has sustained 22 Mflops/sec/proc (compiled FORTRAN) on the Intel Paragon.

  11. Multiplexed microsatellite recovery using massively parallel sequencing

    USGS Publications Warehouse

    Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

    2011-01-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356,958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).
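
    The screening step amounts to pattern matching over reads. A minimal sketch with illustrative repeat-count thresholds (not the study's pipeline; note that homopolymers also match the dinucleotide pattern and would be filtered in practice):

        import re

        DI = re.compile(r"([ACGT]{2})\1{5,}")   # dinucleotide, >= 6 repeats
        TRI = re.compile(r"([ACGT]{3})\1{3,}")  # trinucleotide, >= 4 repeats

        def find_ssrs(read):
            hits = []
            for pattern, kind in ((DI, "di"), (TRI, "tri")):
                for m in pattern.finditer(read):
                    hits.append((kind, m.group(1), m.start(), m.group(0)))
            return hits

        print(find_ssrs("TTGACACACACACACACAGGT"))
        # [('di', 'AC', 3, 'ACACACACACACAC')]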

  12. Fault tolerant massively parallel processing architecture

    SciTech Connect

    Balasubramanian, V.; Banerjee, P.

    1987-08-01

    This paper presents two massively parallel processing architectures suitable for solving a wide variety of algorithms of divide-and-conquer type for problems such as the discrete Fourier transform, production systems, design automation, and others. The first architecture, called the Chain-structured Butterfly ARchitecture (CBAR), consists of a two-dimensional array of N = L(log2(L)+1) processing elements (PEs) organized as L levels of log2(L)+1 stages, and which has the butterfly connection between PEs in consecutive stages with straight-through feedback between PEs in the last and first stages. This connection system has the desirable property of allowing thousands of PEs to be connected with O(N) connection cost, O(log2(N/log2(N))) communication paths, and a small number (=4) of I/O ports per PE. However, this architecture is not fault tolerant. The authors, therefore, propose a second architecture, called the REconfigurable Chain-structured Butterfly ARchitecture (RECBAR), which is a modified version of the CBAR. The RECBAR possesses all the desirable features of the CBAR, with the number of I/O ports per PE increased to six, and uses O(log2(N)/N) overhead in PEs and approximately 50% overhead in links to achieve single-level fault tolerance. Reliability improvements of the RECBAR over the CBAR are studied. This paper also presents a distributed diagnostic and structuring algorithm for the RECBAR that enables the architecture to detect faults and structure itself accordingly within 2 log2(L) + 1 time steps, thus making it a truly fault tolerant architecture.

  13. The EMCC / DARPA Massively Parallel Electromagnetic Scattering Project

    NASA Technical Reports Server (NTRS)

    Woo, Alex C.; Hill, Kueichien C.

    1996-01-01

    The Electromagnetic Code Consortium (EMCC) was sponsored by the Advanced Research Projects Agency (ARPA) to demonstrate the effectiveness of massively parallel computing in large scale radar signature predictions. The EMCC/ARPA project consisted of three parts.

  14. Visualization on massively parallel computers using CM/AVS

    SciTech Connect

    Krogh, M.F.; Hansen, C.D.

    1993-09-01

    CM/AVS is a visualization environment for the massively parallel CM-5 from Thinking Machines. It provides a backend to the standard commercially available AVS visualization product. At the Advanced Computing Laboratory at Los Alamos National Laboratory, we have been experimenting with and utilizing this software within our visualization environment. This paper describes our experiences with CM/AVS. The conclusions reached are applicable to any implementation of visualization software within a massively parallel computing environment.

  15. Experimental free-space optical network for massively parallel computers

    NASA Astrophysics Data System (ADS)

    Araki, S.; Kajita, M.; Kasahara, K.; Kubota, K.; Kurihara, K.; Redmond, I.; Schenfeld, E.; Suzaki, T.

    1996-03-01

    A free-space optical interconnection scheme is described for massively parallel processors based on the interconnection-cached network architecture. The optical network operates in a circuit-switching mode. Combined with a packet-switching operation among the circuit-switched optical channels, a high-bandwidth, low-latency network for massively parallel processing results. The design and assembly of a 64-channel experimental prototype are discussed, and operational results are presented.

  16. RAMA: A file system for massively parallel computers

    NASA Technical Reports Server (NTRS)

    Miller, Ethan L.; Katz, Randy H.

    1993-01-01

    This paper describes a file system design for massively parallel computers which makes very efficient use of a few disks per processor. This overcomes the traditional I/O bottleneck of massively parallel machines by storing the data on disks within the high-speed interconnection network. In addition, the file system, called RAMA, requires little inter-node synchronization, removing another common bottleneck in parallel processor file systems. Support for a large tertiary storage system can easily be integrated into the file system; in fact, RAMA runs most efficiently when tertiary storage is used.
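
    One way such a design can avoid synchronization is to place blocks by hashing, so any node can locate any block without a metadata server. An illustrative sketch of that idea (the function and parameters are assumptions, not RAMA's actual layout policy):

        from hashlib import blake2b

        def block_home(file_id, block_no, n_disks):
            # Any node computes a block's disk independently: no metadata
            # server lookup and no inter-node agreement protocol required.
            h = blake2b(f"{file_id}:{block_no}".encode(), digest_size=8)
            return int.from_bytes(h.digest(), "big") % n_disks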

  17. IMPAIR: massively parallel deconvolution on the GPU

    NASA Astrophysics Data System (ADS)

    Sherry, Michael; Shearer, Andy

    2013-02-01

    The IMPAIR software is a high throughput image deconvolution tool for processing large out-of-core datasets of images, varying from large images with spatially varying PSFs to large numbers of images with spatially invariant PSFs. IMPAIR implements a parallel version of the tried and tested Richardson-Lucy deconvolution algorithm regularised via a custom wavelet thresholding library. It exploits the inherently parallel nature of the convolution operation to achieve quality results on consumer grade hardware: through the NVIDIA Tesla GPU implementation, the multi-core OpenMP implementation, and the cluster computing MPI implementation of the software. IMPAIR aims to address the problem of parallel processing in both top-down and bottom-up approaches: by managing the input data at the image level, and by managing the execution at the instruction level. These combined techniques will lead to a scalable solution with minimal resource consumption and maximal load balancing. IMPAIR is being developed as both a stand-alone tool for image processing, and as a library which can be embedded into non-parallel code to transparently provide parallel high throughput deconvolution.
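
    The core iteration is a pair of convolutions per step. A minimal serial sketch of unregularised Richardson-Lucy (IMPAIR adds wavelet-based regularisation and the GPU/OpenMP/MPI parallel machinery), assuming float-valued arrays:

        import numpy as np
        from scipy.signal import fftconvolve

        def richardson_lucy(image, psf, iterations=25):
            # Multiplicative RL update: estimate *= (image / (estimate*psf))
            # correlated with the mirrored PSF.
            estimate = np.full_like(image, image.mean())
            psf_mirror = psf[::-1, ::-1]
            for _ in range(iterations):
                blurred = fftconvolve(estimate, psf, mode="same")
                ratio = image / (blurred + 1e-12)     # guard divide-by-zero
                estimate = estimate * fftconvolve(ratio, psf_mirror, mode="same")
            return estimate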

  18. EFFICIENT SCHEDULING OF PARALLEL JOBS ON MASSIVELY PARALLEL SYSTEMS

    SciTech Connect

    F. PETRINI; W. FENG

    1999-09-01

    We present buffered coscheduling, a new methodology to multitask parallel jobs in a message-passing environment and to develop parallel programs that can pave the way to the efficient implementation of a distributed operating system. Buffered coscheduling is based on three innovative techniques: communication buffering, strobing, and non-blocking communication. By leveraging these techniques, we can perform effective optimizations based on the global status of the parallel machine rather than on the limited knowledge available locally to each processor. The advantages of buffered coscheduling include higher resource utilization, reduced communication overhead, efficient implementation of flow-control strategies and fault-tolerant protocols, accurate performance modeling, and a simplified yet still expressive parallel programming model. Preliminary experimental results show that buffered coscheduling is very effective in increasing the overall performance in the presence of load imbalance and communication-intensive workloads.

  19. Scan line graphics generation on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Dorband, John E.

    1988-01-01

    Described here is how researchers implemented a scan line graphics generation algorithm on the Massively Parallel Processor (MPP). Pixels are computed in parallel and their results are applied to the Z buffer in large groups. To perform pixel value calculations, facilitate load balancing across the processors and apply the results to the Z buffer efficiently in parallel requires special virtual routing (sort computation) techniques developed by the author especially for use on single-instruction multiple-data (SIMD) architectures.

  20. Massively Parallel Sequencing: The Next Big Thing in Genetic Medicine

    PubMed Central

    Tucker, Tracy; Marra, Marco; Friedman, Jan M.

    2009-01-01

    Massively parallel sequencing has reduced the cost and increased the throughput of genomic sequencing by more than three orders of magnitude, and it seems likely that costs will fall and throughput improve even more in the next few years. Clinical use of massively parallel sequencing will provide a way to identify the cause of many diseases of unknown etiology through simultaneous screening of thousands of loci for pathogenic mutations and by sequencing biological specimens for the genomic signatures of novel infectious agents. In addition to providing these entirely new diagnostic capabilities, massively parallel sequencing may also replace arrays and Sanger sequencing in clinical applications where they are currently being used. Routine clinical use of massively parallel sequencing will require higher accuracy, better ways to select genomic subsets of interest, and improvements in the functionality, speed, and ease of use of data analysis software. In addition, substantial enhancements in laboratory computer infrastructure, data storage, and data transfer capacity will be needed to handle the extremely large data sets produced. Clinicians and laboratory personnel will require training to use the sequence data effectively, and appropriate methods will need to be developed to deal with the incidental discovery of pathogenic mutations and variants of uncertain clinical significance. Massively parallel sequencing has the potential to transform the practice of medical genetics and related fields, but the vast amount of personal genomic data produced will increase the responsibility of geneticists to ensure that the information obtained is used in a medically and socially responsible manner. PMID:19679224

  1. Massively parallel neural encoding and decoding of visual stimuli.

    PubMed

    Lazar, Aurel A; Zhou, Yiyin

    2012-08-01

    The massively parallel nature of video Time Encoding Machines (TEMs) calls for scalable, massively parallel decoders that are implemented with neural components. The current generation of decoding algorithms is based on computing the pseudo-inverse of a matrix and does not satisfy these requirements. Here we consider video TEMs with an architecture built using Gabor receptive fields and a population of Integrate-and-Fire neurons. We show how to build a scalable architecture for video Time Decoding Machines using recurrent neural networks. Furthermore, we extend our architecture to handle the reconstruction of visual stimuli encoded with massively parallel video TEMs having neurons with random thresholds. Finally, we discuss in detail our algorithms and demonstrate their scalability and performance on a large scale GPU cluster. PMID:22397951

  2. Staging memory for massively parallel processor

    NASA Technical Reports Server (NTRS)

    Batcher, Kenneth E. (Inventor)

    1988-01-01

    The invention herein relates to a computer organization capable of rapidly processing extremely large volumes of data. A staging memory is provided having a main stager portion consisting of a large number of memory banks which are accessed in parallel to receive, store, and transfer data words simultaneous with each other. Substager portions interconnect with the main stager portion to match input and output data formats with the data format of the main stager portion. An address generator is coded for accessing the data banks for receiving or transferring the appropriate words. Input and output permutation networks arrange the lineal order of data into and out of the memory banks.

  3. The Challenge of Massively Parallel Computing

    SciTech Connect

    WOMBLE,DAVID E.

    1999-11-03

    Since the mid-1980s, there have been a number of commercially available parallel computers with hundreds or thousands of processors. These machines have provided a new capability to the scientific community, and they have been used by scientists and engineers, although with varying degrees of success. One of the reasons for the limited success is the difficulty, or perceived difficulty, in developing code for these machines. In this paper we discuss many of the issues and challenges in developing scalable hardware, system software and algorithms for machines comprising hundreds or thousands of processors.

  4. Design and implementation of a massively parallel version of DIRECT

    SciTech Connect

    He, J.; Verstak, A.; Watson, L.; Sosonkina, M.

    2007-10-24

    This paper describes several massively parallel implementations of the global search algorithm DIRECT. Two parallel schemes take different approaches to address DIRECT's design challenges imposed by memory requirements and data dependency. Three design aspects in topology, data structures, and task allocation are compared in detail. The goal is to analytically investigate the strengths and weaknesses of these parallel schemes, identify several key sources of inefficiency, and experimentally evaluate a number of improvements in the latest parallel DIRECT implementation. The performance studies demonstrate improved data structure efficiency and load balancing on a 2200 processor cluster.

  5. Massively parallel solution of the assignment problem. Technical report

    SciTech Connect

    Wein, J.; Zenios, S.

    1990-12-01

    In this paper we discuss the design, implementation and effectiveness of massively parallel algorithms for the solution of large-scale assignment problems. In particular, we study the auction algorithms of Bertsekas, an algorithm based on the method of multipliers of Hestenes and Powell, and an algorithm based on the alternating direction method of multipliers of Eckstein. We discuss alternative approaches to the massively parallel implementation of the auction algorithm, including Jacobi, Gauss-Seidel and a hybrid scheme. The hybrid scheme, in particular, exploits two different levels of parallelism and an efficient way of communicating the data between them without the need to perform general router operations across the hypercube network. We then study the performance of massively parallel implementations of two methods of multipliers. Implementations are carried out on the Connection Machine CM-2, and the algorithms are evaluated empirically with the solution of large scale problems. The hybrid scheme significantly outperforms all of the other methods and gives the best computational results to date for a massively parallel solution to this problem.
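
    A sketch of the bidding mechanics (sequential Gauss-Seidel bidding, without epsilon scaling or the paper's hybrid parallel decomposition; assumes a square benefit matrix with n >= 2):

        import numpy as np

        def auction(benefit, eps=0.01):
            # One unassigned person bids at a time; the price of the best
            # object rises by the bid increment v1 - v2 + eps.
            n = benefit.shape[0]
            prices = np.zeros(n)
            owner = -np.ones(n, dtype=int)     # owner[j]: person on object j
            assigned = -np.ones(n, dtype=int)  # assigned[i]: object of person i
            while (assigned < 0).any():
                i = int(np.where(assigned < 0)[0][0])
                values = benefit[i] - prices
                j = int(values.argmax())
                v1 = values[j]
                values[j] = -np.inf
                prices[j] += v1 - values.max() + eps
                if owner[j] >= 0:
                    assigned[owner[j]] = -1    # evict the previous holder
                owner[j] = i
                assigned[i] = j
            return assigned, prices

    In a Jacobi variant all unassigned persons bid in the same round, which is what maps naturally onto a massively parallel machine.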

  6. Shift: A Massively Parallel Monte Carlo Radiation Transport Package

    SciTech Connect

    Pandya, Tara M; Johnson, Seth R; Davidson, Gregory G; Evans, Thomas M; Hamilton, Steven P

    2015-01-01

    This paper discusses the massively-parallel Monte Carlo radiation transport package, Shift, developed at Oak Ridge National Laboratory. It reviews the capabilities, implementation, and parallel performance of this code package. Scaling results demonstrate very good strong and weak scaling behavior of the implemented algorithms. Benchmark results from various reactor problems show that Shift results compare well to other contemporary Monte Carlo codes and experimental results.

  7. Efficient parallel global garbage collection on massively parallel computers

    SciTech Connect

    Kamada, Tomio; Matsuoka, Satoshi; Yonezawa, Akinori

    1994-12-31

    On distributed-memory high-performance MPPs where processors are interconnected by an asynchronous network, efficient Garbage Collection (GC) becomes difficult due to inter-node references and references within pending, unprocessed messages. The parallel global GC algorithm (1) takes advantage of reference locality, (2) efficiently traverses references over nodes, (3) admits minimum pause time of ongoing computations, and (4) has been shown to scale up to 1024-node MPPs. The algorithm employs a global weight counting scheme to substantially reduce message traffic. Two methods for confirming the arrival of pending messages are used: one counts the number of messages and the other uses network "bulldozing." Performance evaluation in actual implementations on a multicomputer with 32-1024 nodes, Fujitsu AP1000, reveals various favorable properties of the algorithm.
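
    Generic weighted reference counting gives the flavor of a weight-counting scheme (the paper's global algorithm differs in detail): duplicating a remote reference splits its weight locally, so the owning node is only messaged when weight is returned.

        class Owner:
            # The owning node tracks only the total outstanding weight.
            def __init__(self):
                self.total = 0

        class RemoteRef:
            def __init__(self, owner, weight):
                self.owner, self.weight = owner, weight

            @classmethod
            def create(cls, owner, weight=1 << 16):
                owner.total += weight          # one message, at creation
                return cls(owner, weight)

            def duplicate(self):
                # Copying splits the weight locally: no message to the owner.
                # (Real schemes add an indirection cell once weight hits 1.)
                half = self.weight // 2
                self.weight -= half
                return RemoteRef(self.owner, half)

            def drop(self):
                # Deleting returns weight; the object is garbage at total == 0.
                self.owner.total -= self.weight
                self.weight = 0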

  8. Solving unstructured grid problems on massively parallel computers

    NASA Technical Reports Server (NTRS)

    Hammond, Steven W.; Schreiber, Robert

    1990-01-01

    A highly parallel graph mapping technique that enables one to efficiently solve unstructured grid problems on massively parallel computers is presented. Many implicit and explicit methods for solving discretized partial differential equations require each point in the discretization to exchange data with its neighboring points every time step or iteration. The cost of this communication can negate the high performance promised by massively parallel computing. To eliminate this bottleneck, the graph of the irregular problem is mapped into the graph representing the interconnection topology of the computer such that the sum of the distances that the messages travel is minimized. It is shown that using the heuristic mapping algorithm significantly reduces the communication time compared to a naive assignment of processes to processors.
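
    A toy serial stand-in for such a mapping heuristic: start from an initial placement and greedily swap process pairs whenever the swap lowers the total distance traveled by messages (on a hypercube, dist[p][q] would be the Hamming distance of p and q; all names here are illustrative):

        import itertools

        def hop_cost(mapping, edges, dist):
            # Total network distance, one term per communicating pair.
            return sum(dist[mapping[u]][mapping[v]] for u, v in edges)

        def improve_mapping(mapping, edges, dist, sweeps=10):
            # Greedy pairwise-swap refinement of process -> processor placement.
            for _ in range(sweeps):
                improved = False
                for a, b in itertools.combinations(range(len(mapping)), 2):
                    before = hop_cost(mapping, edges, dist)
                    mapping[a], mapping[b] = mapping[b], mapping[a]
                    if hop_cost(mapping, edges, dist) >= before:
                        mapping[a], mapping[b] = mapping[b], mapping[a]  # undo
                    else:
                        improved = True
                if not improved:
                    break
            return mapping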

  9. Time efficient 3-D electromagnetic modeling on massively parallel computers

    SciTech Connect

    Alumbaugh, D.L.; Newman, G.A.

    1995-08-01

    A numerical modeling algorithm has been developed to simulate the electromagnetic response of a three-dimensional earth to a dipole source for frequencies ranging from 100 Hz to 100 MHz. The numerical problem is formulated in terms of a frequency-domain modified vector Helmholtz equation for the scattered electric fields. The resulting differential equation is approximated using a staggered finite-difference grid, which results in a linear system of equations for which the matrix is sparse and complex symmetric. The system of equations is solved using a preconditioned quasi-minimum-residual method. Dirichlet boundary conditions are employed at the edges of the mesh by setting the tangential electric fields equal to zero. At frequencies less than 1 MHz, normal grid stretching is employed to mitigate unwanted reflections off the grid boundaries. For frequencies greater than this, absorbing boundary conditions must be employed by making the stretching parameters of the modified vector Helmholtz equation complex, which introduces loss at the boundaries. To allow for faster calculation of realistic models, the original serial version of the code has been modified to run on a massively parallel architecture. This modification involves three distinct tasks: (1) mapping the finite-difference stencil to a processor stencil which allows for the necessary information to be exchanged between processors that contain adjacent nodes in the model, (2) determining the most efficient method to input the model, which is accomplished by dividing the input into "global" and "local" data and then reading the two sets in differently, and (3) deciding how to output the data, which is an inherently nonparallel process.

  10. A Programming Model for Massive Data Parallelism with Data Dependencies

    SciTech Connect

    Cui, Xiaohui; Mueller, Frank; Potok, Thomas E; Zhang, Yongpeng

    2009-01-01

    Accelerating processors can often be more cost and energy effective for a wide range of data-parallel computing problems than general-purpose processors. For graphics processor units (GPUs), this is particularly the case when program development is aided by environments such as NVIDIA's Compute Unified Device Architecture (CUDA), which dramatically reduces the gap between domain-specific architectures and general-purpose programming. Nonetheless, general-purpose GPU (GPGPU) programming remains subject to several restrictions. Most significantly, the separation of host (CPU) and accelerator (GPU) address spaces requires explicit management of GPU memory resources, especially for massive data parallelism that well exceeds the memory capacity of GPUs. One solution to this problem is to transfer data between the GPU and host memories frequently. In this work, we investigate another approach. We run massively data-parallel applications on GPU clusters. We further propose a programming model for massive data parallelism with data dependencies for this scenario. Experience from microbenchmarks and real-world applications shows that our model provides not only ease of programming but also significant performance gains.

  11. The Application of a Massively Parallel Computer to the Simulation of Electrical Wave Propagation Phenomena in the Heart Muscle Using Simplified Models

    NASA Technical Reports Server (NTRS)

    Karpoukhin, Mikhii G.; Kogan, Boris Y.; Karplus, Walter J.

    1995-01-01

    The simulation of heart arrhythmia and fibrillation is an important and challenging task. The solution of these problems using sophisticated mathematical models is beyond the capabilities of modern supercomputers. To overcome these difficulties it is proposed to break the whole simulation problem into two tightly coupled stages: generation of the action potential using sophisticated models, and propagation of the action potential using simplified models. The well known simplified models are compared and modified to bring the rate of depolarization and action potential duration restitution closer to reality. The modified method of lines is used to parallelize the computational process. The conditions for the appearance of 2D spiral waves after the application of a premature beat and the subsequent traveling of the spiral wave inside the simulated tissue are studied.
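
    The abstract does not give the authors' modified simplified model, so the sketch below uses the classic FitzHugh-Nagumo equations, discretized by the method of lines (5-point Laplacian in space, explicit Euler in time), with an illustrative wave-break initial condition of the kind used to provoke reentry. All parameter values are assumptions, not the paper's calibrated values.

        import numpy as np

        n, dx, dt = 100, 1.0, 0.05
        a, b, eps, D = 0.1, 0.5, 0.02, 1.0
        v = np.zeros((n, n))               # fast excitation variable
        w = np.zeros((n, n))               # slow recovery variable
        v[:, :5] = 1.0                     # plane-wave stimulus along the left edge
        w[n // 2:, :] = 0.4                # refractory lower half breaks the wave (reentry)

        def laplacian(u):
            p = np.pad(u, 1, mode="edge")  # no-flux boundaries
            return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * u) / dx**2

        for step in range(4000):           # method of lines: ODEs advanced by explicit Euler
            dv = D * laplacian(v) + v * (1.0 - v) * (v - a) - w
            dw = eps * (b * v - w)
            v += dt * dv
            w += dt * dw

        print("fraction of tissue excited:", float((v > 0.5).mean()))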

  12. The language parallel Pascal and other aspects of the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  13. Supercomputing on massively parallel bit-serial architectures

    NASA Technical Reports Server (NTRS)

    Iobst, Ken

    1985-01-01

    Research on the Goodyear Massively Parallel Processor (MPP) suggests that high-level parallel languages are practical and can be designed with powerful new semantics that allow algorithms to be efficiently mapped to the real machines. For the MPP these semantics include parallel/associative array selection for both dense and sparse matrices, variable precision arithmetic to trade accuracy for speed, micro-pipelined train broadcast, and conditional branching at the processing element (PE) control unit level. The preliminary design of a FORTRAN-like parallel language for the MPP has been completed and is being used to write programs to perform sparse matrix array selection, min/max search, matrix multiplication, Gaussian elimination on single bit arrays and other generic algorithms. A description is given of the MPP design. Features of the system and its operation are illustrated in the form of charts and diagrams.

  14. Development of massively parallel quantum chemistry program SMASH

    SciTech Connect

    Ishimura, Kazuya

    2015-12-31

    A massively parallel program for quantum chemistry calculations, SMASH, was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program development simple. The speed-up of the B3LYP energy calculation for (C150H30)2 with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.

  15. Development of massively parallel quantum chemistry program SMASH

    NASA Astrophysics Data System (ADS)

    Ishimura, Kazuya

    2015-12-01

    A massively parallel program for quantum chemistry calculations, SMASH, was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program development simple. The speed-up of the B3LYP energy calculation for (C150H30)2 with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.

  16. TSE computers - A means for massively parallel computations

    NASA Technical Reports Server (NTRS)

    Strong, J. P., III

    1976-01-01

    A description is presented of hardware concepts for building a massively parallel processing system for two-dimensional data. The processing system is to use logic arrays of 128 x 128 elements which perform over 16 thousand operations simultaneously. Attention is given to image data, logic arrays, basic image logic functions, a prototype negator, an interleaver device, image logic circuits, and an image memory circuit.

  17. Computational fluid dynamics on a massively parallel computer

    NASA Technical Reports Server (NTRS)

    Jespersen, Dennis C.; Levit, Creon

    1989-01-01

    A finite difference code was implemented for the compressible Navier-Stokes equations on the Connection Machine, a massively parallel computer. The code is based on the ARC2D/ARC3D program and uses the implicit factored algorithm of Beam and Warming. The code uses odd-even elimination to solve linear systems. Timings and computation rates are given for the code, and a comparison is made with a Cray X-MP.

  18. MIMD massively parallel methods for engineering and science problems

    SciTech Connect

    Camp, W.J.; Plimpton, S.J.

    1993-08-01

    MIMD massively parallel computers promise unique power and flexibility for engineering and scientific simulations. In this paper we review the development of a number of software methods and algorithms for scientific and engineering problems which are helping to realize that promise. We discuss new domain decomposition, load balancing, data layout, and communications methods applicable to simulations in a broad range of technical fields, including signal processing, multi-dimensional structural and fluid mechanics, materials science, and chemical and biological systems.

  19. Massively parallel Wang Landau sampling on multiple GPUs

    SciTech Connect

    Yin, Junqi; Landau, D. P.

    2012-01-01

    Wang Landau sampling is implemented on the Graphics Processing Unit (GPU) with the Compute Unified Device Architecture (CUDA). Performance on three different GPU cards, including the new-generation Fermi architecture card, is compared with that on a Central Processing Unit (CPU). The parameters for massively parallel Wang Landau sampling are tuned in order to achieve fast convergence. For simulations of the water cluster systems, we obtain an average of over 50 times speedup for a given workload.
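
    For readers unfamiliar with the method, the following is a minimal serial Wang Landau sketch for a small 2D Ising model: a random walk in energy accepts moves with probability min(1, g(E_old)/g(E_new)) and accumulates the density of states g(E) until the visit histogram is roughly flat. The paper's GPU version runs many such walkers in parallel; the lattice size, sweep counts, and flatness threshold below are illustrative only.

        import numpy as np

        L = 4                                    # tiny lattice so the demo converges quickly
        rng = np.random.default_rng(0)
        spins = rng.choice([-1, 1], size=(L, L))
        E = -int(np.sum(spins * (np.roll(spins, 1, 0) + np.roll(spins, 1, 1))))

        lnf = 1.0                                # modification factor ln(f)
        log_g, hist = {}, {}                     # running ln g(E) and visit histogram

        for stage in range(100):                 # bounded demo; production pushes lnf to ~1e-8
            for _ in range(5000):
                i, j = rng.integers(L, size=2)
                dE = 2 * spins[i, j] * (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
                                        + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
                # accept with prob min(1, g(E)/g(E+dE)): drives a flat histogram in E
                if np.log(rng.random()) < log_g.get(E, 0.0) - log_g.get(E + dE, 0.0):
                    spins[i, j] *= -1
                    E += dE
                log_g[E] = log_g.get(E, 0.0) + lnf
                hist[E] = hist.get(E, 0) + 1
            counts = np.array(list(hist.values()))
            if counts.min() > 0.8 * counts.mean():   # crude flatness criterion
                lnf, hist = lnf / 2.0, {}            # refine and restart the histogram
                if lnf < 1e-3:
                    break

        print("final ln(f):", lnf, "| energy levels visited:", len(log_g))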

  20. 3D seismic imaging on massively parallel computers

    SciTech Connect

    Womble, D.E.; Ober, C.C.; Oldfield, R.

    1997-02-01

    The ability to image complex geologies such as salt domes in the Gulf of Mexico and thrusts in mountainous regions is a key to reducing the risk and cost associated with oil and gas exploration. Imaging these structures, however, is computationally expensive. Datasets can be terabytes in size, and the processing time required for the multiple iterations needed to produce a velocity model can take months, even with the massively parallel computers available today. Some algorithms, such as 3D finite-difference prestack depth migration, remain beyond the capacity of production seismic processing. Massively parallel processors (MPPs) and algorithms research are the tools that will enable this project to provide new seismic processing capabilities to the oil and gas industry. The goals of this work are to (1) develop finite-difference algorithms for 3D prestack depth migration; (2) develop efficient computational approaches for seismic imaging and for processing terabyte datasets on massively parallel computers; and (3) develop a modular, portable seismic imaging code.

  1. Requirements for supercomputing in energy research: The transition to massively parallel computing

    SciTech Connect

    Not Available

    1993-02-01

    This report discusses: the emergence of a practical path to TeraFlop computing and beyond; the requirements of energy research programs at DOE; implementation of a supercomputer production computing environment on massively parallel computers; and implementation of the user transition to massively parallel computing.

  2. The 2nd Symposium on the Frontiers of Massively Parallel Computations

    NASA Technical Reports Server (NTRS)

    Mills, Ronnie (Editor)

    1988-01-01

    Programming languages, computer graphics, neural networks, massively parallel computers, SIMD architecture, algorithms, digital terrain models, sort computation, simulation of charged particle transport on the massively parallel processor and image processing are among the topics discussed.

  3. MMS Observations of Parallel Electric Fields

    NASA Astrophysics Data System (ADS)

    Ergun, R.; Goodrich, K.; Wilder, F. D.; Sturner, A. P.; Holmes, J.; Stawarz, J. E.; Malaspina, D.; Usanova, M.; Torbert, R. B.; Lindqvist, P. A.; Khotyaintsev, Y. V.; Burch, J. L.; Strangeway, R. J.; Russell, C. T.; Pollock, C. J.; Giles, B. L.; Hesse, M.; Goldman, M. V.; Drake, J. F.; Phan, T.; Nakamura, R.

    2015-12-01

    Parallel electric fields are a necessary condition for magnetic reconnection with non-zero guide field and are ultimately accountable for topological reconfiguration of a magnetic field. Parallel electric fields also play a strong role in charged particle acceleration and turbulence. The Magnetospheric Multiscale (MMS) mission targets these three universal plasma processes. The MMS satellites have an accurate three-dimensional electric field measurement, which can identify parallel electric fields as low as 1 mV/m at four adjacent locations. We present preliminary observations of parallel electric fields from MMS and provide an early interpretation of their impact on magnetic reconnection, in particular, where the topological change occurs. We also examine the role of parallel electric fields in particle acceleration. Direct particle acceleration by parallel electric fields is well established in the auroral region. Observations of double layers by the Van Allen Probes suggest that acceleration by parallel electric fields may be significant in energizing some populations of the radiation belts. THEMIS observations also indicate that some of the largest parallel electric fields are found in regions of strong field-aligned currents associated with turbulence, suggesting a highly non-linear dissipation mechanism. We discuss how the MMS observations extend our understanding of the role of parallel electric fields in some of the most critical processes in the magnetosphere.

  4. Routing performance analysis and optimization within a massively parallel computer

    DOEpatents

    Archer, Charles Jens; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen

    2013-04-16

    An apparatus, program product and method optimize the operation of a massively parallel computer system by, in part, receiving actual performance data concerning an application executed by the plurality of interconnected nodes, and analyzing the actual performance data to identify an actual performance pattern. A desired performance pattern may be determined for the application, and an algorithm may be selected from among a plurality of algorithms stored within a memory, the algorithm being configured to achieve the desired performance pattern based on the actual performance data.

  5. The Massively Parallel Processor and its applications. [for environmental monitoring

    NASA Technical Reports Server (NTRS)

    Strong, J. P.; Schaefer, D. H.; Fischer, J. R.; Wallgren, K. R.; Bracken, P. A.

    1979-01-01

    A long-term experimental development program conducted at Goddard Space Flight Center to implement an ultrahigh-speed data processing system known as the Massively Parallel Processor (MPP) is described. The MPP is a single instruction multiple data stream computer designed to perform logical, integer, and floating point arithmetic operations on variable word length data. Information is presented on system architecture, the system configuration, the array unit architecture, individual processing units, and expected operating rates for several image processing applications (including the processing of Landsat data).

  6. A biconjugate gradient type algorithm on massively parallel architectures

    NASA Technical Reports Server (NTRS)

    Freund, Roland W.; Hochbruck, Marlis

    1991-01-01

    The biconjugate gradient (BCG) method is the natural generalization of the classical conjugate gradient algorithm for Hermitian positive definite matrices to general non-Hermitian linear systems. Unfortunately, the original BCG algorithm is susceptible to possible breakdowns and numerical instabilities. Recently, Freund and Nachtigal have proposed a novel BCG type approach, the quasi-minimal residual method (QMR), which overcomes the problems of BCG. Here, an implementation is presented of QMR based on an s-step version of the nonsymmetric look-ahead Lanczos algorithm. The main feature of the s-step Lanczos algorithm is that, in general, all inner products, except for one, can be computed in parallel at the end of each block; this is unlike the standard Lanczos process, where inner products are generated sequentially. The resulting implementation of QMR is particularly attractive on massively parallel SIMD architectures, such as the Connection Machine.
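
    The payoff of the s-step organization is easy to see in miniature: s separate inner products cost s global reductions (synchronization points on a parallel machine), while the same quantities computed as one block cost a single reduction. A hedged numpy illustration, with sizes chosen arbitrarily:

        import numpy as np

        rng = np.random.default_rng(1)
        n, s = 100000, 5
        V = rng.standard_normal((s, n))    # s Lanczos-type basis vectors
        x = rng.standard_normal(n)

        # Sequential: s separate reductions, i.e. s synchronization points.
        seq = [V[k] @ x for k in range(s)]

        # Blocked: one matrix-vector product, a single global reduction of s partial sums.
        blocked = V @ x

        assert np.allclose(seq, blocked)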

  7. Numerical computation on massively parallel hypercubes. [Connection machine

    SciTech Connect

    McBryan, O.A.

    1986-01-01

    We describe numerical computations on the Connection Machine, a massively parallel hypercube architecture with 65,536 single-bit processors and 32 Mbytes of memory. A parallel extension of COMMON LISP provides access to the processors and network. The rich software environment is further enhanced by a powerful virtual processor capability, which extends the degree of fine-grained parallelism beyond 1,000,000. We briefly describe the hardware and indicate the principal features of the parallel programming environment. We then present implementations of SOR, multigrid, and preconditioned conjugate gradient algorithms for solving partial differential equations on the Connection Machine. Despite the lack of floating point hardware, computation rates above 100 megaflops have been achieved in PDE solution. Virtual processors prove to be a real advantage, easing the effort of software development while improving system performance significantly. The software development effort is also facilitated by the fact that hypercube communications prove to be fast and essentially independent of distance. 29 refs., 4 figs.
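
    Of the kernels named above, SOR illustrates the data-parallel style most directly: with red-black ordering, every point of one color can be updated simultaneously. A minimal numpy sketch for the 2D Poisson equation follows; the grid size, relaxation factor, and boundary treatment are illustrative choices, not details from the report.

        import numpy as np

        n, omega = 64, 1.7
        u = np.zeros((n, n))
        f = np.ones((n, n))            # right-hand side of laplace(u) = f, with h = 1
        red = (np.add.outer(np.arange(n), np.arange(n)) % 2 == 0)

        def sweep(mask):
            nb = (np.roll(u, 1, 0) + np.roll(u, -1, 0) + np.roll(u, 1, 1) + np.roll(u, -1, 1))
            gs = 0.25 * (nb - f)       # Gauss-Seidel target value at every point
            u[mask] += omega * (gs[mask] - u[mask])
            u[0, :] = u[-1, :] = u[:, 0] = u[:, -1] = 0.0   # Dirichlet boundary

        for it in range(200):
            sweep(red)                 # all red points update independently...
            sweep(~red)                # ...then all black points

        res = np.abs(np.roll(u, 1, 0) + np.roll(u, -1, 0) + np.roll(u, 1, 1)
                     + np.roll(u, -1, 1) - 4 * u - f)[1:-1, 1:-1].max()
        print("max interior residual:", res)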

  8. The performance realities of massively parallel processors: A case study

    SciTech Connect

    Lubeck, O.M.; Simmons, M.L.; Wasserman, H.J.

    1992-07-01

    This paper presents the results of an architectural comparison of SIMD massive parallelism, as implemented in the Thinking Machines Corp. CM-2 computer, and vector or concurrent-vector processing, as implemented in the Cray Research Inc. Y-MP/8. The comparison is based primarily upon three application codes that represent Los Alamos production computing. Tests were run by porting optimized CM Fortran codes to the Y-MP, so that the same level of optimization was obtained on both machines. The results for fully-configured systems, using measured data rather than scaled data from smaller configurations, show that the Y-MP/8 is faster than the 64k CM-2 for all three codes. A simple model that accounts for the relative characteristic computational speeds of the two machines, and reduction in overall CM-2 performance due to communication or SIMD conditional execution, is included. The model predicts the performance of two codes well, but fails for the third code, because the proportion of communications in this code is very high. Other factors, such as memory bandwidth and compiler effects, are also discussed. Finally, the paper attempts to show the equivalence of the CM-2 and Y-MP programming models, and also comments on selected future massively parallel processor designs.

  9. Comparison of massively parallel hand-print segmenters

    SciTech Connect

    Wilkinson, R.A.; Garris, M.D.

    1992-09-01

    NIST has developed a massively parallel hand-print recognition system that allows components to be interchanged. Using this system, three different character segmentation algorithms have been developed and studied. They are blob coloring, histogramming, and a hybrid of the two. The blob coloring method uses connected components to isolate characters. The histogramming method locates linear spaces, which may be slanted, to segment characters. The hybrid method is an augmented histogramming method that incorporates statistically adaptive rules to decide when a histogrammed item is too large and applies blob coloring to further segment the difficult item. The hardware configuration is a serial host computer with a 1024 processor Single Instruction Multiple Data (SIMD) machine attached to it. The data used in this comparison is 'NIST Special Database 1' which contains 2100 forms from different writers where each form contains 130 digit characters distributed across 28 fields. This gives a potential 273,000 characters to be segmented. Running the massively parallel system across the 2100 forms, blob coloring required 2.1 seconds per form with an accuracy of 97.5%, histogramming required 14.4 seconds with an accuracy of 95.3%, and the hybrid method required 13.2 seconds with an accuracy of 95.4%. The results of this comparison show that the blob coloring method on a SIMD architecture is superior.

  10. Learning Quantitative Sequence-Function Relationships from Massively Parallel Experiments

    NASA Astrophysics Data System (ADS)

    Atwal, Gurinder S.; Kinney, Justin B.

    2016-03-01

    A fundamental aspect of biological information processing is the ubiquity of sequence-function relationships—functions that map the sequence of DNA, RNA, or protein to a biochemically relevant activity. Most sequence-function relationships in biology are quantitative, but only recently have experimental techniques for effectively measuring these relationships been developed. The advent of such "massively parallel" experiments presents an exciting opportunity for the concepts and methods of statistical physics to inform the study of biological systems. After reviewing these recent experimental advances, we focus on the problem of how to infer parametric models of sequence-function relationships from the data produced by these experiments. Specifically, we retrace and extend recent theoretical work showing that inference based on mutual information, not the standard likelihood-based approach, is often necessary for accurately learning the parameters of these models. Closely connected with this result is the emergence of "diffeomorphic modes"—directions in parameter space that are far less constrained by data than likelihood-based inference would suggest. Analogous to Goldstone modes in physics, diffeomorphic modes arise from an arbitrarily broken symmetry of the inference problem. An analytically tractable model of a massively parallel experiment is then described, providing an explicit demonstration of these fundamental aspects of statistical inference. This paper concludes with an outlook on the theoretical and computational challenges currently facing studies of quantitative sequence-function relationships.

  11. Optimal evaluation of array expressions on massively parallel machines

    NASA Technical Reports Server (NTRS)

    Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert; Teng, Shang-Hua

    1992-01-01

    We investigate the problem of evaluating FORTRAN 90 style array expressions on massively parallel distributed-memory machines. On such machines, an elementwise operation can be performed in constant time for arrays whose corresponding elements are in the same processor. If the arrays are not aligned in this manner, the cost of aligning them is part of the cost of evaluating the expression. The choice of where to perform the operation then affects this cost. We present algorithms based on dynamic programming to solve this problem efficiently for a wide variety of interconnection schemes, including multidimensional grids and rings, hypercubes, and fat-trees. We also consider expressions containing operations that change the shape of the arrays, and show that our approach extends naturally to handle this case.
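
    A toy version of this dynamic program is sketched below for a ring of processors: each tree node's value can be materialized at any candidate offset, and a table cost(node, offset) is built bottom-up from data-movement costs. The cost model (ring distance per shift) and the small expression are assumptions made to keep the example concrete; the paper's algorithms cover far more general interconnects.

        from functools import lru_cache

        OFFSETS = range(4)                       # candidate alignments on a 4-node ring

        def move(a, b):
            d = abs(a - b)
            return min(d, 4 - d)                 # ring distance between alignments

        # expression (A + B) * C, with each leaf stored at a fixed home offset
        tree = ("*", ("+", ("A", 0), ("B", 3)), ("C", 1))

        @lru_cache(maxsize=None)
        def cost(node, offset):
            """Minimal movement cost to have node's value aligned at offset."""
            if len(node) == 2 and isinstance(node[1], int):   # leaf: (name, home offset)
                return move(node[1], offset)
            _, left, right = node
            # evaluate both operands at some common offset o, then shift the result
            return min(cost(left, o) + cost(right, o) + move(o, offset) for o in OFFSETS)

        best = min(OFFSETS, key=lambda o: cost(tree, o))
        print("best final offset:", best, "with cost:", cost(tree, best))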

  12. Applications of massively parallel computers in telemetry processing

    NASA Technical Reports Server (NTRS)

    El-Ghazawi, Tarek A.; Pritchard, Jim; Knoble, Gordon

    1994-01-01

    Telemetry processing refers to the reconstruction of full-resolution raw instrumentation data with the artifacts of space and ground recording and transmission removed. Being the first processing phase of satellite data, this process is also referred to as level-zero processing. This study is aimed at investigating the use of massively parallel computing technology in providing level-zero processing to spaceflights that adhere to the recommendations of the Consultative Committee for Space Data Systems (CCSDS). The workload characteristics of level-zero processing are used to identify processing requirements in high-performance computing systems. An example of level-zero functions on a SIMD MPP, such as the MasPar, is discussed. The requirements in this paper are based in part on the Earth Observing System (EOS) Data and Operations System (EDOS).

  13. A Computational Fluid Dynamics Algorithm on a Massively Parallel Computer

    NASA Technical Reports Server (NTRS)

    Jespersen, Dennis C.; Levit, Creon

    1989-01-01

    The discipline of computational fluid dynamics is demanding ever-increasing computational power to deal with complex fluid flow problems. We investigate the performance of a finite-difference computational fluid dynamics algorithm on a massively parallel computer, the Connection Machine. Of special interest is an implicit time-stepping algorithm; to obtain maximum performance from the Connection Machine, it is necessary to use a nonstandard algorithm to solve the linear systems that arise in the implicit algorithm. We find that the Connection Machine can achieve very high computation rates on both explicit and implicit algorithms. The performance of the Connection Machine puts it in the same class as today's most powerful conventional supercomputers.

  14. MPSim: A Massively Parallel General Simulation Program for Materials

    NASA Astrophysics Data System (ADS)

    Iotov, Mihail; Gao, Guanghua; Vaidehi, Nagarajan; Cagin, Tahir; Goddard, William A., III

    1997-08-01

    In this talk, we describe a general purpose Massively Parallel Simulation (MPSim) program used for computational materials science and the life sciences. We also present scaling aspects of the program along with several case studies. The program incorporates the highly efficient cell multipole method (CMM) to accurately calculate the interactions. For studying bulk materials, the program uses the reduced CMM to account for infinite-range sums. The software embodies various advanced molecular dynamics algorithms and energy and structure optimization techniques, with a set of analysis tools suitable for large-scale structures. The applications of the program range from amorphous polymers, liquid-polymer interfaces, large viruses, million-atom clusters, and surfaces to gas diffusion in polymers. The program was originally developed on the KSR in an object-oriented fashion and has been ported to the SGI-PC and HP-Exemplar. The message-passing version was originally implemented on the Intel Paragon using NX, then MPI, and was later tested on Cray T3D and IBM SP2 platforms.

  15. Beam dynamics calculations and particle tracking using massively parallel processors

    SciTech Connect

    Ryne, R.D.; Habib, S.

    1995-12-31

    During the past decade massively parallel processors (MPPs) have slowly gained acceptance within the scientific community. At present these machines typically contain a few hundred to one thousand off-the-shelf microprocessors and a total memory of up to 32 GBytes. The potential performance of these machines is illustrated by the fact that a month-long job on a high-end workstation might require only a few hours on an MPP. The acceptance of MPPs has been slow for a variety of reasons. For example, some algorithms are not easily parallelizable. Also, in the past these machines were difficult to program. But in recent years the development of Fortran-like languages such as CM Fortran and High Performance Fortran has made MPPs much easier to use. In the following we describe how MPPs can be used for beam dynamics calculations and long-term particle tracking.

  16. Integration of IR focal plane arrays with massively parallel processor

    NASA Astrophysics Data System (ADS)

    Esfandiari, P.; Koskey, P.; Vaccaro, K.; Buchwald, W.; Clark, F.; Krejca, B.; Rekeczky, C.; Zarandy, A.

    2008-04-01

    The intent of this investigation is to replace the low fill factor visible sensor of a Cellular Neural Network (CNN) processor with an InGaAs Focal Plane Array (FPA), using both bump bonding and epitaxial layer transfer techniques, for use in Ballistic Missile Defense System (BMDS) interceptor seekers. The goal is to fabricate a massively parallel digital processor with a local as well as a global interconnect architecture. Currently, this unique CNN processor is capable of processing a target scene in excess of 10,000 frames per second with its visible sensor. What makes the CNN processor so unique is that each processing element includes memory, local data storage, local and global communication devices, and a visible sensor supported by a programmable analog or digital computer program.

  17. Transmissive Nanohole Arrays for Massively-Parallel Optical Biosensing

    PubMed Central

    2015-01-01

    A high-throughput optical biosensing technique is proposed and demonstrated. This hybrid technique combines optical transmission of nanoholes with colorimetric silver staining. The size and spacing of the nanoholes are chosen so that individual nanoholes can be independently resolved in massively parallel fashion using an ordinary transmission optical microscope, and, in place of determining a spectral shift, the brightness of each nanohole is recorded to greatly simplify the readout. Each nanohole then acts as an independent sensor, and the blocking of nanohole optical transmission by enzymatic silver staining defines the specific detection of a biological agent. Nearly 10,000 nanoholes can be simultaneously monitored under the field of view of a typical microscope. As an initial proof of concept, biotinylated lysozyme (biotin-HEL) was used as a model analyte, giving a detection limit as low as 0.1 ng/mL. PMID:25530982

  18. Development of a massively parallel parachute performance prediction code

    SciTech Connect

    Peterson, C.W.; Strickland, J.H.; Wolfe, W.P.; Sundberg, W.D.; McBride, D.D.

    1997-04-01

    The Department of Energy has given Sandia full responsibility for the complete life cycle (cradle to grave) of all nuclear weapon parachutes. Sandia National Laboratories is initiating development of a complete numerical simulation of parachute performance, beginning with parachute deployment and continuing through inflation and steady state descent. The purpose of the parachute performance code is to predict the performance of stockpile weapon parachutes as these parachutes continue to age well beyond their intended service life. A new massively parallel computer will provide unprecedented speed and memory for solving this complex problem, and new software will be written to treat the coupled fluid, structure and trajectory calculations as part of a single code. Verification and validation experiments have been proposed to provide the necessary confidence in the computations.

  19. Efficient Identification of Assembly Neurons within Massively Parallel Spike Trains

    PubMed Central

    Berger, Denise; Borgelt, Christian; Louis, Sebastien; Morrison, Abigail; Grün, Sonja

    2010-01-01

    The chance of detecting assembly activity is expected to increase if the spiking activities of large numbers of neurons are recorded simultaneously. Although such massively parallel recordings are now becoming available, methods able to analyze such data for spike correlation are still rare, as a combinatorial explosion often makes it infeasible to extend methods developed for smaller data sets. By evaluating pattern complexity distributions the existence of correlated groups can be detected, but their member neurons cannot be identified. In this contribution, we present approaches to actually identify the individual neurons involved in assemblies. Our results may complement other methods and also provide a way to reduce data sets to the “relevant” neurons, thus allowing us to carry out a refined analysis of the detailed correlation structure due to reduced computation time. PMID:19809521

  20. Massively parallel high-order combinatorial genetics in human cells

    PubMed Central

    Wong, Alan S L; Choi, Gigi C G; Cheng, Allen A; Purcell, Oliver; Lu, Timothy K

    2016-01-01

    The systematic functional analysis of combinatorial genetics has been limited by the throughput that can be achieved and the order of complexity that can be studied. To enable massively parallel characterization of genetic combinations in human cells, we developed a technology for rapid, scalable assembly of high-order barcoded combinatorial genetic libraries that can be quantified with high-throughput sequencing. We applied this technology, combinatorial genetics en masse (CombiGEM), to create high-coverage libraries of 1,521 two-wise and 51,770 three-wise barcoded combinations of 39 human microRNA (miRNA) precursors. We identified miRNA combinations that synergistically sensitize drug-resistant cancer cells to chemotherapy and/or inhibit cancer cell proliferation, providing insights into complex miRNA networks. More broadly, our method will enable high-throughput profiling of multifactorial genetic combinations that regulate phenotypes of relevance to biomedicine, biotechnology and basic science. PMID:26280411

  1. Massively Parallel Simulations of Diffusion in Dense Polymeric Structures

    SciTech Connect

    Faulon, Jean-Loup; Wilcox, R.T.; Hobbs, J.D.; Ford, D.M.

    1997-11-01

    An original computational technique to generate close-to-equilibrium dense polymeric structures is proposed. Diffusion of small gases is studied on the equilibrated structures using massively parallel molecular dynamics simulations running on the Intel Teraflops (9216 Pentium Pro processors) and Intel Paragon (1840 processors). Compared to the current state-of-the-art equilibration methods, this new technique appears to be faster by some orders of magnitude. The main advantage of the technique is that one can circumvent the bottlenecks in configuration space that inhibit relaxation in molecular dynamics simulations. The technique is based on the fact that tetravalent atoms (such as carbon and silicon) fit in the center of a regular tetrahedron and that regular tetrahedrons can be used to mesh the three-dimensional space. Thus, the problem of polymer equilibration described by continuous equations in molecular dynamics is reduced to a discrete problem where solutions are approximated by simple algorithms. Practical modeling applications include the construction of butyl rubber and ethylene-propylene-diene-monomer (EPDM) models for oxygen and water diffusion calculations. Butyl and EPDM are used in O-ring systems and serve as sealing joints in many manufactured objects. Diffusion coefficients of small gases have been measured experimentally on both polymeric systems, and in general the diffusion coefficients in EPDM are an order of magnitude larger than in butyl. In order to better understand the diffusion phenomena, 10,000-atom models were generated and equilibrated for butyl and EPDM. The models were submitted to a massively parallel molecular dynamics simulation to monitor the trajectories of the diffusing species.
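
    The final step, monitoring trajectories of the diffusing species, typically reduces to estimating a diffusion coefficient from the mean-squared displacement via the Einstein relation MSD(t) = 6 D t in three dimensions. The hedged sketch below applies that analysis to synthetic random-walk trajectories standing in for real MD output; all sizes and step lengths are invented.

        import numpy as np

        rng = np.random.default_rng(2)
        n_mol, n_steps, dt = 50, 2000, 1.0
        # synthetic random-walk trajectories stand in for real MD output
        steps = rng.standard_normal((n_steps, n_mol, 3)) * 0.1
        traj = np.cumsum(steps, axis=0)

        disp = traj - traj[0]                        # displacement from t = 0
        msd = (disp ** 2).sum(axis=2).mean(axis=1)   # average over molecules
        t = np.arange(n_steps) * dt

        # slope of MSD vs t over the diffusive regime gives 6 D
        slope = np.polyfit(t[n_steps // 2:], msd[n_steps // 2:], 1)[0]
        print("estimated D =", slope / 6.0)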

  2. Particle simulation of plasmas on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Gledhill, I. M. A.; Storey, L. R. O.

    1987-01-01

    Particle simulations, in which collective phenomena in plasmas are studied by following the self-consistent motions of many discrete particles, involve several highly repetitive sets of calculations that are readily adaptable to SIMD parallel processing. A fully electromagnetic, relativistic plasma simulation for the massively parallel processor is described. The particle motions are followed in 2 1/2 dimensions on a 128 x 128 grid, with periodic boundary conditions. The two dimensional simulation space is mapped directly onto the processor network; a Fast Fourier Transform is used to solve the field equations. Particle data are stored according to an Eulerian scheme, i.e., the information associated with each particle is moved from one local memory to another as the particle moves across the spatial grid. The method is applied to the study of the nonlinear development of the whistler instability in a magnetospheric plasma model, with an anisotropic electron temperature. The wave distribution function is included as a new diagnostic to allow simulation results to be compared with satellite observations.
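
    The field solve mentioned above is the part that maps most cleanly onto a grid-matched processor array: with periodic boundaries, Poisson's equation for the potential costs one forward and one inverse FFT. A minimal 2D numpy sketch, with grid size and charge layout invented for illustration:

        import numpy as np

        n = 128
        rho = np.zeros((n, n))
        rho[n//4, n//4], rho[3*n//4, 3*n//4] = +1.0, -1.0   # net-neutral charge pair

        k = 2 * np.pi * np.fft.fftfreq(n)
        kx, ky = np.meshgrid(k, k, indexing="ij")
        k2 = kx**2 + ky**2
        k2[0, 0] = 1.0                                      # avoid divide-by-zero (mean mode)

        phi_hat = np.fft.fft2(rho) / k2                     # solve -laplace(phi) = rho
        phi_hat[0, 0] = 0.0                                 # fix the free additive constant
        phi = np.real(np.fft.ifft2(phi_hat))
        E_x, E_y = np.gradient(-phi)                        # field used in the particle push
        print("potential range:", phi.min(), phi.max())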

  3. Massively parallel simulations of multiphase flows using Lattice Boltzmann methods

    NASA Astrophysics Data System (ADS)

    Ahrenholz, Benjamin

    2010-03-01

    In the last two decades the lattice Boltzmann method (LBM) has matured as an alternative and efficient numerical scheme for the simulation of fluid flows and transport problems. Unlike conventional numerical schemes based on discretizations of macroscopic continuum equations, the LBM is based on microscopic models and mesoscopic kinetic equations. The fundamental idea of the LBM is to construct simplified kinetic models that incorporate the essential physics of microscopic or mesoscopic processes so that the macroscopic averaged properties obey the desired macroscopic equations. Applications involving interfacial dynamics, complex and/or changing boundaries, and complicated constitutive relationships that can be derived from a microscopic picture are especially suitable for the LBM. In this talk a modified and optimized version of a Gunstensen color model is presented to describe the dynamics of the fluid/fluid interface, where the flow field is based on a multi-relaxation-time model. Based on that modeling approach, validation studies of contact line motion are shown. Due to the fact that the LB method generally needs only nearest neighbor information, the algorithm is an ideal candidate for parallelization. Hence, it is possible to perform efficient simulations in complex geometries at a large scale by massively parallel computations. Here, the results of drainage and imbibition (over 2x10^11 degrees of freedom) in natural porous media obtained from microtomography methods are presented. Those fully resolved pore scale simulations are essential for a better understanding of the physical processes in porous media and therefore important for the determination of constitutive relationships.
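
    A single-phase sketch of the underlying scheme may help make this concrete: populations stream one lattice link per step (nearest-neighbor communication only, which is what makes the method so parallelizable) and then relax toward a local equilibrium. The D2Q9 lattice, BGK collision, and shear-wave test below are standard textbook choices, not the talk's multiphase color model.

        import numpy as np

        nx, ny, tau = 64, 64, 0.8                       # grid and BGK relaxation time
        c = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],[1,1],[-1,1],[-1,-1],[1,-1]])
        w = np.array([4/9] + [1/9]*4 + [1/36]*4)        # D2Q9 lattice weights

        def equilibrium(rho, ux, uy):
            cu = c[:, 0, None, None] * ux + c[:, 1, None, None] * uy
            u2 = ux**2 + uy**2
            return rho * w[:, None, None] * (1 + 3*cu + 4.5*cu**2 - 1.5*u2)

        rho = np.ones((nx, ny))
        ux = 0.05 * np.sin(2 * np.pi * np.arange(ny) / ny) * np.ones((nx, ny))  # shear wave
        uy = np.zeros((nx, ny))
        f = equilibrium(rho, ux, uy)

        for step in range(500):
            for i in range(9):                          # streaming: one link per step,
                f[i] = np.roll(np.roll(f[i], c[i, 0], axis=0), c[i, 1], axis=1)
            rho = f.sum(axis=0)                         # macroscopic moments
            ux = (f * c[:, 0, None, None]).sum(axis=0) / rho
            uy = (f * c[:, 1, None, None]).sum(axis=0) / rho
            f += (equilibrium(rho, ux, uy) - f) / tau   # BGK collision step

        print("shear amplitude after viscous decay:", float(np.abs(ux).max()))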

  4. Efficiently modeling neural networks on massively parallel computers

    NASA Technical Reports Server (NTRS)

    Farber, Robert M.

    1993-01-01

    Neural networks are a very useful tool for analyzing and modeling complex real world systems. Applying neural network simulations to real world problems generally involves large amounts of data and massive amounts of computation. To efficiently handle the computational requirements of large problems, we have implemented at Los Alamos a highly efficient neural network compiler for serial computers, vector computers, vector parallel computers, and fine grain SIMD computers such as the CM-2 connection machine. This paper describes the mapping used by the compiler to implement feed-forward backpropagation neural networks for a SIMD (Single Instruction Multiple Data) architecture parallel computer. Thinking Machines Corporation has benchmarked our code at 1.3 billion interconnects per second (approximately 3 gigaflops) on a 64,000 processor CM-2 connection machine (Singer 1990). This mapping is applicable to other SIMD computers and can be implemented on MIMD computers such as the CM-5 connection machine. Our mapping has virtually no communications overhead, with the exception of the communications required for a global summation across the processors (which has a sub-linear runtime growth on the order of O(log(number of processors))). We can efficiently model very large neural networks which have many neurons and interconnects, and our mapping can extend to arbitrarily large networks (within memory limitations) by merging the memory space of separate processors with fast adjacent-processor interprocessor communications. This paper considers the simulation of only feed-forward neural networks, although this method is extendable to recurrent networks.
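
    The one unavoidable communication step named above, a global summation, is worth seeing in isolation: combining P partial sums pairwise takes log2(P) rounds rather than P sequential additions. A hedged sketch, with the per-processor partials invented for the example:

        import numpy as np

        def tree_reduce(values):
            """Sum per-processor partial sums in log2(P) pairwise combining rounds."""
            vals = list(values)
            rounds = 0
            while len(vals) > 1:
                paired = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)]
                if len(vals) % 2:                  # odd element rides along to the next round
                    paired.append(vals[-1])
                vals, rounds = paired, rounds + 1
            return vals[0], rounds

        partials = np.random.rand(64)              # e.g. each PE's local weight-gradient sum
        total, rounds = tree_reduce(partials)
        assert np.isclose(total, partials.sum())
        print("combined in", rounds, "rounds for 64 PEs")   # 6 = log2(64)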

  5. Massively Parallel Interrogation of Aptamer Sequence, Structure and Function

    SciTech Connect

    Fischer, N O; Tok, J B; Tarasow, T M

    2008-02-08

    Optimization of high affinity reagents is a significant bottleneck in medicine and the life sciences. The ability to synthetically create thousands of permutations of a lead high-affinity reagent and survey the properties of individual permutations in parallel could potentially relieve this bottleneck. Aptamers are single-stranded oligonucleotide affinity reagents isolated by in vitro selection processes and as a class have been shown to bind a wide variety of target molecules. High density DNA microarray technology was used to synthesize, in situ, arrays of approximately 3,900 aptamer sequence permutations in triplicate. These sequences were interrogated on-chip for their ability to bind the fluorescently-labeled cognate target, immunoglobulin E, resulting in the parallel execution of thousands of experiments. Fluorescence intensity at each array feature was well resolved and shown to be a function of the sequence present. The data demonstrated high intra- and interchip correlation between the same features as well as among the sequence triplicates within a single array. Consistent with aptamer mediated IgE binding, fluorescence intensity correlated strongly with specific aptamer sequences and the concentration of IgE applied to the array. The massively parallel sequence-function analyses provided by this approach confirmed the importance of a consensus sequence found in all 21 of the original IgE aptamer sequences and supported a common stem:loop structure as being the secondary structure underlying IgE binding. The microarray application, data and results presented illustrate an efficient, high information content approach to optimizing aptamer function. It also provides a foundation from which to better understand and manipulate this important class of high affinity biomolecules.

  6. Electrical properties of seafloor massive sulfides

    NASA Astrophysics Data System (ADS)

    Spagnoli, Giovanni; Hannington, Mark; Bairlein, Katharina; Hördt, Andreas; Jegen, Marion; Petersen, Sven; Laurila, Tea

    2016-06-01

    Seafloor massive sulfide (SMS) deposits are increasingly seen as important marine metal resources for the future. A growing number of industrialized nations are involved in the surveying and sampling of such deposits by drilling. Drill ships are expensive and their availability can be limited; seabed drill rigs are a cost-effective alternative and more suitable for obtaining cores for resource evaluation. In order to achieve the objectives of resource evaluations, details are required of the geological, mineralogical, and physical properties of the polymetallic deposits and their host rocks. Electrical properties of the deposits and their ore minerals are distinct from their unmineralized host rocks. Therefore, the use of electrical methods to detect SMS while drilling and recovering drill cores could decrease the costs and accelerate offshore operations by limiting the amount of drilling in unmineralized material. This paper presents new data regarding the electrical properties of SMS cores that can be used in that assessment. Frequency-dependent complex electrical resistivity in the frequency range between 0.002 and 100 Hz was examined in order to potentially discriminate between different types of fresh rocks, alteration and mineralization. Forty mini-cores of SMS and unmineralized host rocks were tested in the laboratory, originating from different tectonic settings such as the intermediate-spreading ridges of the Galapagos and Axial Seamount, and the Pacmanus back-arc basin. The results indicate that there is a clear potential to distinguish between mineralized and non-mineralized samples, with some evidence that even different types of mineralization can be discriminated. This could be achieved using resistivity magnitude alone with appropriate rig-mounted electrical sensors. Exploiting the frequency-dependent behavior of resistivity might amplify the differences and further improve the rock characterization.
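
    To make "frequency-dependent complex resistivity" concrete, the sketch below evaluates the Pelton/Cole-Cole model, a standard empirical description in induced-polarization work, over the 0.002-100 Hz band tested in the paper. The abstract does not say the authors fit this particular model, and all parameter values are invented for illustration.

        import numpy as np

        def cole_cole(freq, rho0, m, tau, c):
            """Complex resistivity (Pelton form): chargeability m, time constant tau."""
            iwt = (1j * 2 * np.pi * freq * tau) ** c
            return rho0 * (1 - m * (1 - 1 / (1 + iwt)))

        freq = np.logspace(np.log10(0.002), 2, 50)              # the 0.002-100 Hz band
        sulfide = cole_cole(freq, rho0=1.0, m=0.8, tau=1.0, c=0.5)   # strongly polarizable
        host = cole_cole(freq, rho0=100.0, m=0.05, tau=0.1, c=0.5)   # barren host rock

        # Mineralized samples separate from host rock in both magnitude and phase.
        print("sulfide |rho| range:", np.abs(sulfide).min(), "-", np.abs(sulfide).max())
        print("host    |rho| range:", np.abs(host).min(), "-", np.abs(host).max())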

  7. Electrical properties of seafloor massive sulfides

    NASA Astrophysics Data System (ADS)

    Spagnoli, Giovanni; Hannington, Mark; Bairlein, Katharina; Hördt, Andreas; Jegen, Marion; Petersen, Sven; Laurila, Tea

    2016-02-01

    Seafloor massive sulfide (SMS) deposits are increasingly seen as important marine metal resources for the future. A growing number of industrialized nations are involved in the surveying and sampling of such deposits by drilling. Drill ships are expensive and their availability can be limited; seabed drill rigs are a cost-effective alternative and more suitable for obtaining cores for resource evaluation. In order to achieve the objectives of resource evaluations, details are required of the geological, mineralogical, and physical properties of the polymetallic deposits and their host rocks. Electrical properties of the deposits and their ore minerals are distinct from their unmineralized host rocks. Therefore, the use of electrical methods to detect SMS while drilling and recovering drill cores could decrease the costs and accelerate offshore operations by limiting the amount of drilling in unmineralized material. This paper presents new data regarding the electrical properties of SMS cores that can be used in that assessment. Frequency-dependent complex electrical resistivity in the frequency range between 0.002 and 100 Hz was examined in order to potentially discriminate between different types of fresh rocks, alteration and mineralization. Forty mini-cores of SMS and unmineralized host rocks were tested in the laboratory, originating from different tectonic settings such as the intermediate-spreading ridges of the Galapagos and Axial Seamount, and the Pacmanus back-arc basin. The results indicate that there is a clear potential to distinguish between mineralized and non-mineralized samples, with some evidence that even different types of mineralization can be discriminated. This could be achieved using resistivity magnitude alone with appropriate rig-mounted electrical sensors. Exploiting the frequency-dependent behavior of resistivity might amplify the differences and further improve the rock characterization.

  8. Analysis of composite ablators using massively parallel computation

    NASA Technical Reports Server (NTRS)

    Shia, David

    1995-01-01

    In this work, the feasibility of using massively parallel computation to study the response of ablative materials is investigated. Explicit and implicit finite difference methods are used on a massively parallel computer, the Thinking Machines CM-5. The governing equations are a set of nonlinear partial differential equations. The governing equations are developed for three sample problems: (1) transpiration cooling, (2) ablative composite plate, and (3) restrained thermal growth testing. The transpiration cooling problem is solved using a solution scheme based solely on the explicit finite difference method. The results are compared with available analytical steady-state through-thickness temperature and pressure distributions and good agreement between the numerical and analytical solutions is found. It is also found that a solution scheme based on the explicit finite difference method has the following advantages: incorporates complex physics easily, results in a simple algorithm, and is easily parallelizable. However, a solution scheme of this kind needs very small time steps to maintain stability. A solution scheme based on the implicit finite difference method has the advantage that it does not require very small time steps to maintain stability. However, this kind of solution scheme has the disadvantages that complex physics cannot be easily incorporated into the algorithm and that the solution scheme is difficult to parallelize. A hybrid solution scheme is then developed to combine the strengths of the explicit and implicit finite difference methods and minimize their weaknesses. This is achieved by identifying the critical time scale associated with the governing equations and applying the appropriate finite difference method according to this critical time scale. The hybrid solution scheme is then applied to the ablative composite plate and restrained thermal growth problems. The gas storage term is included in the explicit pressure calculation of both
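
    The core of the hybrid scheme, switching between explicit and implicit updates according to the critical (stability-limited) time scale, can be sketched for the 1D heat equation as below. The ablation physics and all numerical values are omitted or invented; only the selection logic follows the idea described above.

        import numpy as np

        n, length, alpha = 51, 1.0, 1e-3
        dx = length / (n - 1)
        dt_crit = dx**2 / (2 * alpha)               # explicit stability limit for 1D diffusion

        def step(T, dt):
            if dt <= dt_crit:                        # explicit: cheap and easily parallelized
                Tn = T.copy()
                Tn[1:-1] += alpha * dt / dx**2 * (T[2:] - 2 * T[1:-1] + T[:-2])
                return Tn
            r = alpha * dt / dx**2                   # implicit backward Euler: stable for any dt
            A = (np.diag(np.full(n, 1 + 2 * r)) + np.diag(np.full(n - 1, -r), 1)
                 + np.diag(np.full(n - 1, -r), -1))
            A[0, :] = 0; A[-1, :] = 0
            A[0, 0] = A[-1, -1] = 1.0                # fixed-temperature boundary rows
            return np.linalg.solve(A, T)

        T = np.zeros(n)
        T[0] = 1.0                                   # hot wall
        for _ in range(100):
            T = step(T, dt=5 * dt_crit)              # dt exceeds the critical scale: implicit path
        print("midpoint temperature:", T[n // 2])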

  9. Massively Parallel Processing for Fast and Accurate Stamping Simulations

    NASA Astrophysics Data System (ADS)

    Gress, Jeffrey J.; Xu, Siguang; Joshi, Ramesh; Wang, Chuan-tao; Paul, Sabu

    2005-08-01

    The competitive automotive market drives automotive manufacturers to speed up vehicle development cycles and reduce lead time. Fast tooling development is one of the key areas supporting fast and short vehicle development programs (VDP). In the past ten years, stamping simulation has become the most effective validation tool in predicting and resolving all potential formability and quality problems before the dies are physically made. Stamping simulation and formability analysis have become a critical business segment in the GM math-based die engineering process. As simulation becomes one of the major production tools in the engineering factory, simulation speed and accuracy are two of the most important measures of stamping simulation technology. The speed and time-in-system of forming analysis become even more critical to support fast VDP and tooling readiness. Since 1997, the General Motors Die Center has been working jointly with our software vendor to develop and implement a parallel version of simulation software for mass production analysis applications. By 2001, this technology had matured in the form of distributed memory processing (DMP) of draw die simulations in a networked distributed-memory computing environment. In 2004, this technology was refined to massively parallel processing (MPP) and extended to line die forming analysis (draw, trim, flange, and associated spring-back) running on a dedicated computing environment. The evolution of this technology and the insight gained through the implementation of DMP/MPP technology, as well as performance benchmarks, are discussed in this publication.

  10. Cloud identification using genetic algorithms and massively parallel computation

    NASA Technical Reports Server (NTRS)

    Buckles, Bill P.; Petry, Frederick E.

    1996-01-01

    As a Guest Computational Investigator under the NASA administered component of the High Performance Computing and Communication Program, we implemented a massively parallel genetic algorithm on the MasPar SIMD computer. Experiments were conducted using Earth Science data in the domains of meteorology and oceanography. Results obtained in these domains are competitive with, and in most cases better than, similar problems solved using other methods. In the meteorological domain, we chose to identify clouds using AVHRR spectral data. Four cloud speciations were used although most researchers settle for three. Results were remarkably consistent across all tests (91% accuracy). Refinements of this method may lead to more timely and complete information for Global Circulation Models (GCMs) that are prevalent in weather forecasting and global environment studies. In the oceanographic domain, we chose to identify ocean currents from a spectrometer having similar characteristics to AVHRR. Here the results were mixed (60% to 80% accuracy). Given that one is willing to run the experiment several times (say 10), then it is acceptable to claim the higher accuracy rating. This problem has never been successfully automated. Therefore, these results are encouraging even though less impressive than the cloud experiment. Successful conclusion of an automated ocean current detection system would impact coastal fishing, naval tactics, and the study of micro-climates. Finally we contributed to the basic knowledge of GA (genetic algorithm) behavior in parallel environments. We developed better knowledge of the use of subpopulations in the context of shared breeding pools and the migration of individuals. Rigorous experiments were conducted based on quantifiable performance criteria. While much of the work confirmed current wisdom, for the first time we were able to submit conclusive evidence. The software developed under this grant was placed in the public domain. An extensive user

  11. Comparing current cluster, massively parallel, and accelerated systems

    SciTech Connect

    Barker, Kevin J; Davis, Kei; Hoisie, Adolfy; Kerbyson, Darren J; Pakin, Scott; Lang, Mike; Sancho Pitarch, Jose C

    2010-01-01

    Currently there is large architectural diversity in high performance computing systems. They include 'commodity' cluster systems that optimize per-node performance for small jobs, massively parallel processors (MPPs) that optimize aggregate performance for large jobs, and accelerated systems that optimize both per-node and aggregate performance but only for applications custom-designed to take advantage of such systems. Because of these dissimilarities, meaningful comparisons of achievable performance are not straightforward. In this work we utilize a methodology that combines both empirical analysis and performance modeling to compare clusters (represented by a 4,352-core InfiniBand cluster), MPPs (represented by a 147,456-core BG/P), and accelerated systems (represented by the 129,600-core Roadrunner) across a workload of four applications. Strengths of our approach include the ability to compare architectures (as opposed to specific implementations of an architecture), to attribute each application's performance bottlenecks to characteristics unique to each system, and to explore performance scenarios in advance of their availability for measurement. Our analysis illustrates that application performance is essentially unrelated to relative peak performance, but that application performance can be both predicted and explained using modeling.

  12. Investigation of reflective notching with massively parallel simulation

    NASA Astrophysics Data System (ADS)

    Tadros, Karim H.; Neureuther, Andrew R.; Gamelin, John K.; Guerrieri, Roberto

    1990-06-01

    A massively parallel simulation program, TEMPEST, is used to investigate the role of topography in generating reflective notching and to study the possibility of reducing these effects through the introduction of special properties of resists and antireflection coating materials. The emphasis is on examining physical scattering mechanisms such as focused specular reflections, resist thickness interference effects, reflections from substrate grains, and focusing of incident light by the resist curvature. Specular reflection from topography can focus incident radiation, causing a 10-fold increase in effective exposure. Further complications, such as dimples in the surface of positive resist features, can result from a second reflection of focused energy by the resist/air interface. Variations in line-edge exposure due to substrate grain structure are primarily specular in nature and can become significant for grains larger than the wavelength in the resist. Local exposure variations due to vertical standing waves and changes in energy coupling due to changes in resist thickness are displaced laterally and are significant effects, even though they are slightly less severe than vertical wave propagation theory suggests. Focusing effects due to refraction by the curved surface of the resist produce only minor changes in exposure. Increased resist contrast and resist absorption offer some improvement in reducing notching effects, though minimizing substrate reflectivity is more effective. CPU time using 32 virtual nodes to simulate a 4 µm by 2 µm isolated domain with 13 bleaching steps was 30 minutes.

  13. A high-plex PCR approach for massively parallel sequencing.

    PubMed

    Nguyen-Dumont, Tú; Pope, Bernard J; Hammet, Fleur; Southey, Melissa C; Park, Daniel J

    2013-08-01

    Current methods for targeted massively parallel sequencing (MPS) have several drawbacks, including limited design flexibility, expense, and protocol complexity, which restrict their application to settings involving modest target size and requiring low cost and high throughput. To address this, we have developed Hi-Plex, a PCR-MPS strategy intended for high-throughput screening of multiple genomic target regions that integrates simple, automated primer design software to control product size. Featuring permissive thermocycling conditions and clamp bias reduction, our protocol is simple, cost- and time-effective, uses readily available reagents, does not require expensive instrumentation, and requires minimal optimization. In a 60-plex assay targeting the breast cancer predisposition genes PALB2 and XRCC2, we applied Hi-Plex to 100 ng LCL-derived DNA, and 100 ng and 25 ng FFPE tumor-derived DNA. Altogether, at least 86.94% of the human genome-mapped reads were on target, and 100% of targeted amplicons were represented within 25-fold of the mean. Using 25 ng FFPE-derived DNA, 95.14% of mapped reads were on-target and relative representation ranged from 10.1-fold lower to 5.8-fold higher than the mean. These results were obtained using only the initial automatically-designed primers present in equal concentration. Hi-Plex represents a powerful new approach for screening panels of genomic target regions. PMID:23931594

  14. Wavelet-Based DFT calculations on Massively Parallel Hybrid Architectures

    NASA Astrophysics Data System (ADS)

    Genovese, Luigi

    2011-03-01

    In this contribution, we present an implementation of a full DFT code that can run on massively parallel hybrid CPU-GPU clusters. Our implementation is based on modern GPU architectures which support double-precision floating-point numbers. This DFT code, named BigDFT, is delivered under the GNU-GPL license either in a stand-alone version or integrated in the ABINIT software package. Hybrid BigDFT routines were initially ported with NVidia's CUDA language, and recently more functionalities have been added with new routines written within the Khronos OpenCL standard. The formalism of this code is based on Daubechies wavelets, a systematic real-space basis set. As we will see in the presentation, the properties of this basis set are well suited for an extension to a GPU-accelerated environment. In addition to focusing on the implementation of the operators of the BigDFT code, this presentation also discusses the usage of GPU resources in a complex code with different kinds of operations. The present and expected performance of hybrid architectures in the framework of electronic structure calculations is also discussed.

  15. Massively parallel processor networks with optical express channels

    DOEpatents

    Deri, R.J.; Brooks, E.D. III; Haigh, R.E.; DeGroot, A.J.

    1999-08-24

    An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination. 3 figs.
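
    A toy model of why letting express traffic skip intermediate nodes removes the bucket-brigade latency (the 1-D chain, the station spacing k, and the hop counts are illustrative assumptions, not details from the patent):

        # Local traffic hops node-to-node; express traffic, carried on a
        # separate wavelength, rides directly between express stations
        # placed every k nodes, skipping the nodes in between.
        def local_hops(src, dst):
            return abs(dst - src)

        def express_hops(src, dst, k):
            # walk locally to the nearest station, ride express, walk again
            s = round(src / k) * k
            d = round(dst / k) * k
            return abs(src - s) + abs(s - d) // k + abs(d - dst)

        print(local_hops(0, 96))        # 96 node-by-node hops
        print(express_hops(0, 96, 8))   # 12 express hops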

  16. Massively parallel processor networks with optical express channels

    DOEpatents

    Deri, Robert J.; Brooks, III, Eugene D.; Haigh, Ronald E.; DeGroot, Anthony J.

    1999-01-01

    An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination.

  17. Massively parallel support for a case-based planning system

    NASA Technical Reports Server (NTRS)

    Kettler, Brian P.; Hendler, James A.; Anderson, William A.

    1993-01-01

    Case-based planning (CBP), a kind of case-based reasoning, is a technique in which previously generated plans (cases) are stored in memory and can be reused to solve similar planning problems in the future. CBP can save considerable time over generative planning, in which a new plan is produced from scratch. CBP thus offers a potential (heuristic) mechanism for handling intractable problems. One drawback of CBP systems has been the need for a highly structured memory to reduce retrieval times. This approach requires significant domain engineering and complex memory indexing schemes to make these planners efficient. In contrast, our CBP system, CaPER, uses a massively parallel frame-based AI language (PARKA) and can do extremely fast retrieval of complex cases from a large, unindexed memory. The ability to do fast, frequent retrievals has many advantages: indexing is unnecessary; very large case bases can be used; memory can be probed in numerous alternate ways; and queries can be made at several levels, allowing more specific retrieval of stored plans that better fit the target problem with less adaptation. In this paper we describe CaPER's case retrieval techniques and some experimental results showing its good performance, even on large case bases.
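
    A hypothetical Python sketch of the core idea, brute-force retrieval over an unindexed case base evaluated in parallel; the feature-set similarity and the case data here are invented stand-ins for PARKA's frame matching:

        from multiprocessing import Pool

        CASES = [
            {"id": 1, "features": {"goal:deliver", "vehicle:truck", "terrain:urban"}},
            {"id": 2, "features": {"goal:deliver", "vehicle:plane", "terrain:rural"}},
            {"id": 3, "features": {"goal:survey",  "vehicle:drone", "terrain:urban"}},
        ]
        QUERY = {"goal:deliver", "terrain:urban"}

        def score(case):
            # simple feature-overlap similarity; real systems use richer matching
            return (len(case["features"] & QUERY), case["id"])

        if __name__ == "__main__":
            # every case is scored; no index structure is required
            with Pool() as pool:
                best = max(pool.map(score, CASES))
            print("best case id:", best[1])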

  18. Three-dimensional radiative transfer on a massively parallel computer

    NASA Astrophysics Data System (ADS)

    Vath, H. M.

    1994-04-01

    We perform 3D radiative transfer calculations in non-local thermodynamic equilibrium (NLTE) in the simple two-level atom approximation on the MasPar MP-1, which contains 8192 processors and is a single instruction multiple data (SIMD) machine, an example of the new generation of massively parallel computers. On such a machine, all processors execute the same command at a given time, but on different data. To make radiative transfer calculations efficient, we must reconsider the numerical methods and storage of data. To solve the transfer equation, we adopt the short characteristic method and examine different acceleration methods to obtain the source function. We use the ALI method and test local and non-local operators. Furthermore, we compare the Ng and the Orthomin methods of acceleration. We also investigate the use of multi-grid methods to get fast solutions for the NLTE case. In order to test these numerical methods, we apply them to two problems with and without periodic boundary conditions.

  19. PFLOTRAN: Recent Developments Facilitating Massively-Parallel Reactive Biogeochemical Transport

    NASA Astrophysics Data System (ADS)

    Hammond, G. E.

    2015-12-01

    With the recent shift towards modeling carbon and nitrogen cycling in support of climate-related initiatives, emphasis has been placed on incorporating increasingly mechanistic biogeochemistry within Earth system models to more accurately predict the response of terrestrial processes to natural and anthropogenic climate cycles. PFLOTRAN is an open-source subsurface code that is specialized for simulating multiphase flow and multicomponent biogeochemical transport on supercomputers. The object-oriented code was designed with modularity in mind and has been coupled with several third-party simulators (e.g. CLM to simulate land surface processes and E4D for coupled hydrogeophysical inversion). Central to PFLOTRAN's capabilities is its ability to simulate tightly-coupled reactive transport processes. This presentation focuses on recent enhancements to the code that enable the solution of large parameterized biogeochemical reaction networks with numerous chemical species. PFLOTRAN's "reaction sandbox" is described, which facilitates the implementation of user-defined reaction networks without the need for a comprehensive understanding of PFLOTRAN software infrastructure. The reaction sandbox is written in modern Fortran (2003-2008) and leverages encapsulation, inheritance, and polymorphism to provide the researcher with a flexible workspace for prototyping reactions within a massively parallel flow and transport simulation framework. As these prototypical reactions mature into well-accepted implementations, they can be incorporated into PFLOTRAN as native biogeochemistry capability. Users of the reaction sandbox are encouraged to upload their source code to PFLOTRAN's main source code repository, including the addition of simple regression tests to better ensure the long-term code compatibility and validity of simulation results.
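
    The reaction sandbox itself is written in modern Fortran; the following hypothetical Python analogue only illustrates the described pattern, in which a base class fixes the interface and user-defined reactions plug in polymorphically:

        from abc import ABC, abstractmethod

        class ReactionSandbox(ABC):
            @abstractmethod
            def evaluate(self, concentrations, dt):
                """Return per-species rates for this reaction network."""

        class FirstOrderDecay(ReactionSandbox):
            def __init__(self, species, k):
                self.species, self.k = species, k
            def evaluate(self, c, dt):
                return {self.species: -self.k * c[self.species]}

        def integrate(sandboxes, c, dt):
            # explicit-Euler toy driver; the real code uses implicit solves
            for sb in sandboxes:
                for sp, rate in sb.evaluate(c, dt).items():
                    c[sp] += dt * rate
            return c

        print(integrate([FirstOrderDecay("tracer", 0.1)], {"tracer": 1.0}, 0.5))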

  20. Massively parallel computational fluid dynamics calculations for aerodynamics and aerothermodynamics applications

    SciTech Connect

    Payne, J.L.; Hassan, B.

    1998-09-01

    Massively parallel computers have enabled the analyst to solve complicated flow fields (turbulent, chemically reacting) that were previously intractable. Calculations are presented using a massively parallel CFD code called SACCARA (Sandia Advanced Code for Compressible Aerothermodynamics Research and Analysis) currently under development at Sandia National Laboratories as part of the Department of Energy (DOE) Accelerated Strategic Computing Initiative (ASCI). Computations were made on a generic reentry vehicle in a hypersonic flowfield utilizing three different distributed parallel computers to assess the parallel efficiency of the code with increasing numbers of processors. The parallel efficiencies for the SACCARA code will be presented for cases using 1, 150, 100 and 500 processors. Computations were also made on a subsonic/transonic vehicle using both 236 and 521 processors on a grid containing approximately 14.7 million grid points. Ongoing and future plans to implement a parallel overset grid capability and couple SACCARA with other mechanics codes in a massively parallel environment are discussed.
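
    Parallel efficiency in such assessments is conventionally defined as E(p) = T(1) / (p * T(p)), with speedup S(p) = T(1) / T(p). A small sketch with placeholder timings (not SACCARA measurements):

        # speedup S(p) = T(1)/T(p); efficiency E(p) = S(p)/p
        def efficiency(t1, tp, p):
            return t1 / tp / p

        t1 = 1000.0  # illustrative single-processor wall time (s)
        for p, tp in [(1, 1000.0), (100, 14.0), (500, 3.5)]:
            print(f"p={p:4d}  speedup={t1 / tp:7.1f}  efficiency={efficiency(t1, tp, p):.2f}")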

  1. Three-Dimensional Radiative Transfer on a Massively Parallel Computer.

    NASA Astrophysics Data System (ADS)

    Vath, Horst Michael

    1994-01-01

    We perform three-dimensional radiative transfer calculations on the MasPar MP-1, which contains 8192 processors and is a single instruction multiple data (SIMD) machine, an example of the new generation of massively parallel computers. To make radiative transfer calculations efficient, we must reconsider the numerical methods and methods of storage of data that have been used with serial machines. We developed a numerical code which efficiently calculates images and spectra of astrophysical systems as seen from different viewing directions and at different wavelengths. We use this code to examine a number of different astrophysical systems. First we image the HI distribution of model galaxies. Then we investigate the galaxy NGC 5055, which displays a radial asymmetry in its optical appearance. This can be explained by the presence of dust in the outer HI disk far beyond the optical disk. As the formation of dust is connected to the presence of stars, the existence of dust in outer regions of this galaxy could have consequences for star formation at a time when this galaxy was just forming. Next we use the code for polarized radiative transfer. We first discuss the numerical computation of the required cyclotron opacities and use them to calculate spectra of AM Her systems, binaries containing accreting magnetic white dwarfs. Then we obtain spectra of an extended polar cap. Previous calculations did not consider the three-dimensional extension of the shock. We find that this results in a significant underestimate of the radiation emitted in the shock. Next we calculate the spectrum of the intermediate polar RE 0751+14. For this system we obtain a magnetic field of ~10 MG, which has consequences for the evolution of intermediate polars. Finally we perform 3D radiative transfer in NLTE in the two-level atom approximation. To solve the transfer equation in this case, we adapt the short characteristic method and examine different acceleration methods to obtain the source function.

  2. 3-D readout-electronics packaging for high-bandwidth massively paralleled imager

    DOEpatents

    Kwiatkowski, Kris; Lyke, James

    2007-12-18

    Dense, massively parallel signal processing electronics are co-packaged behind associated sensor pixels. Microchips containing a linear or bilinear arrangement of photo-sensors, together with associated complex electronics, are integrated into a simple 3-D structure (a "mirror cube"). An array of photo-sensitive cells is disposed on a stacked CMOS chip's surface at a 45° angle from light reflecting mirror surfaces formed on a neighboring CMOS chip surface. Image processing electronics are held within the stacked CMOS chip layers. Electrical connections couple each of said stacked CMOS chip layers and a distribution grid, the connections for distributing power and signals to components associated with each stacked CMOS chip layer.

  3. SWAMP+: multiple subsequence alignment using associative massive parallelism

    SciTech Connect

    Steinfadt, Shannon Irene; Baker, Johnnie W

    2010-10-18

    A new parallel algorithm SWAMP+ incorporates the Smith-Waterman sequence alignment on an associative parallel model known as ASC. It is a highly sensitive parallel approach that expands traditional pairwise sequence alignment. This is the first parallel algorithm to provide multiple non-overlapping, non-intersecting subsequence alignments with the accuracy of Smith-Waterman. The efficient algorithm provides multiple alignments similar to BLAST while creating a better workflow for the end users. The parallel portions of the code run in O(m+n) time using m processors. When m = n, the algorithmic analysis becomes O(n) with a coefficient of two, yielding a linear speedup. Implementation of the algorithm on the SIMD ClearSpeed CSX620 confirms this theoretical linear speedup with real timings.
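
    For reference, the sequential Smith-Waterman recurrence that SWAMP+ parallelizes is sketched below (a linear gap penalty and illustrative scoring parameters are assumed); the O(m+n) parallel time comes from the fact that every cell on an anti-diagonal of this table can be evaluated concurrently:

        def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
            # H[i][j] = best local-alignment score ending at a[i-1], b[j-1]
            rows, cols = len(a) + 1, len(b) + 1
            H = [[0] * cols for _ in range(rows)]
            best = 0
            for i in range(1, rows):
                for j in range(1, cols):
                    diag = H[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
                    H[i][j] = max(0, diag, H[i-1][j] + gap, H[i][j-1] + gap)
                    best = max(best, H[i][j])
            return best

        print(smith_waterman("ACACACTA", "AGCACACA"))  # best local alignment score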

  4. QCD on the Massively Parallel Computer AP1000

    NASA Astrophysics Data System (ADS)

    Akemi, K.; Fujisaki, M.; Okuda, M.; Tago, Y.; Hashimoto, T.; Hioki, S.; Miyamura, O.; Takaishi, T.; Nakamura, A.; de Forcrand, Ph.; Hege, C.; Stamatescu, I. O.

    We present the QCD-TARO program of calculations which uses the parallel computer AP1000 of Fujitsu. We discuss the results on scaling, correlation times and hadronic spectrum, some aspects of the implementation and the future prospects.

  5. A development plan for a massively parallel version of the hydrocode CTH

    SciTech Connect

    Robinson, A.C.; Fang, E.; Holdridge, D.; McGlaun, J.M.

    1990-07-01

    Massively parallel computers and computer networks are beginning to appear as an integral part of the scientific computing workplace. This report documents the goals and the corresponding development plan of the massively parallel project of Departments 1530 and 1420. The main goal of the project is to provide a clear understanding of the issues and difficulties involved in making the current production hydrocode CTH portable to a number of currently available parallel computing architectures. In the process of this research, various working versions of the code will be produced. 6 refs., 6 figs.

  6. Numerical analysis of electrical defibrillation. The parallel approach.

    PubMed

    Ng, K T; Hutchinson, S A; Gao, S

    1995-01-01

    Numerical modeling offers a viable tool for studying electrical defibrillation, allowing the behavior of field quantities to be observed easily as the different system parameters are varied. One numerical technique, namely the finite-element method, has been found particularly effective for modeling complex thoracic anatomies. However, an accurate finite-element model of the thorax often requires a large number of elements and nodes, leading to a large set of equations that cannot be solved effectively with the computational power of conventional computers. This is especially true if many finite-element solutions need to be achieved within a reasonable time period (e.g., electrode configuration optimization). In this study, the use of massively parallel computers to provide the memory and reduction in solution time for solving these large finite-element problems is discussed. Both the uniform and unstructured grid approaches are considered. Algorithms that allow efficient mapping of uniform and unstructured grids to data-parallel and message-passing parallel computers are discussed. An automatic iterative procedure for electrode configuration optimization is presented. The procedure is based on the minimization of an objective function using the parallel direct search technique. Computational performance results are presented together with simulation results. PMID:8656104

  7. A Massively Parallel Adaptive Fast Multipole Method on Heterogeneous Architectures

    SciTech Connect

    Lashuk, Ilya; Chandramowlishwaran, Aparna; Langston, Harper; Nguyen, Tuan-Anh; Sampath, Rahul S; Shringarpure, Aashay; Vuduc, Richard; Ying, Lexing; Zorin, Denis; Biros, George

    2012-01-01

    We describe a parallel fast multipole method (FMM) for highly nonuniform distributions of particles. We employ both distributed memory parallelism (via MPI) and shared memory parallelism (via OpenMP and GPU acceleration) to rapidly evaluate two-body nonoscillatory potentials in three dimensions on heterogeneous high performance computing architectures. We have performed scalability tests with up to 30 billion particles on 196,608 cores on the AMD/CRAY-based Jaguar system at ORNL. On a GPU-enabled system (NSF's Keeneland at Georgia Tech/ORNL), we observed 30x speedup over a single core CPU and 7x speedup over a multicore CPU implementation. By combining GPUs with MPI, we achieve less than 10 ns/particle and six digits of accuracy for a run with 48 million nonuniformly distributed particles on 192 GPUs.

  8. Massively parallel switch-level simulation: A feasibility study

    SciTech Connect

    Kravitz, S.A.

    1989-01-01

    This thesis addresses the feasibility of mapping the COSMOS switch-level simulator onto computers with thousands of simple processors. COSMOS preprocesses transistor networks into equivalent Boolean behavioral models, capturing the switch-level behavior of a circuit in a set of Boolean formulas. The author shows that thousand-fold parallelism exists in the formulas derived by COSMOS for some actual circuits. He exposes this parallelism by eliminating the event list from the simulator, and he demonstrates that this represents an attractive tradeoff given sufficient parallelism in the circuit model. To investigate the feasibility of this approach, he has developed a prototype implementation of the COSMOS simulator on a 32k processor Connection Machine.

  9. High density packaging and interconnect of massively parallel image processors

    NASA Technical Reports Server (NTRS)

    Carson, John C.; Indin, Ronald J.

    1991-01-01

    This paper presents conceptual designs for high density packaging of parallel processing systems. The systems fall into two categories: global memory systems where many processors are packaged into a stack, and distributed memory systems where a single processor and many memory chips are packaged into a stack. Thermal behavior and performance are discussed.

  10. Molecular simulation of rheological properties using massively parallel supercomputers

    SciTech Connect

    Bhupathiraju, R.K.; Cui, S.T.; Gupta, S.A.; Cummings, P.T.; Cochran, H.D.

    1996-11-01

    Advances in parallel supercomputing now make possible molecular-based engineering and science calculations that will soon revolutionize many technologies, such as those involving polymers and those involving aqueous electrolytes. We have developed a suite of message-passing codes for classical molecular simulation of such complex fluids and amorphous materials and have completed a number of demonstration calculations of problems of scientific and technological importance with each. In this paper, we will focus on the molecular simulation of rheological properties, particularly viscosity, of simple and complex fluids using parallel implementations of non-equilibrium molecular dynamics. Such calculations represent significant challenges computationally because, in order to reduce the thermal noise in the calculated properties within acceptable limits, large systems and/or long simulated times are required.

  11. Casting Pearls Ballistically: Efficient Massively Parallel Simulation of Particle Deposition

    NASA Astrophysics Data System (ADS)

    Lubachevsky, Boris D.; Privman, Vladimir; Roy, Subhas C.

    1996-06-01

    We simulate ballistic particle deposition wherein a large number of spherical particles are "cast" vertically over a planar horizontal surface. Upon first contact (with the surface or with a previously deposited particle) each particle stops. This model helps material scientists to study the adsorption and sediment formation. The model is sequential, with particles deposited one by one. We have found an equivalent formulation using a continuous time random process and we simulate the latter in parallel using a method similar to the one previously employed for simulating Ising spins. We augment the parallel algorithm for simulating Ising spins with several techniques aimed at the increase of efficiency of producing the particle configuration and statistics collection. Some of these techniques are similar to earlier ones. We implement the resulting algorithm on a 16K PE MasPar MP-1 and a 4K PE MasPar MP-2. The parallel code runs on MasPar computers nearly two orders of magnitude faster than an optimized sequential code runs on a fast workstation.
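
    A much-simplified 1+1-dimensional lattice analogue of the sequential model that the parallel algorithm reproduces (the paper simulates spheres over a plane; the column-height rule and open boundaries here are illustrative assumptions):

        import random

        def deposit(width=50, n_particles=2000, seed=1):
            # a particle dropped on column i sticks at first contact:
            # h[i] <- max(h[i-1], h[i] + 1, h[i+1])
            random.seed(seed)
            h = [0] * width
            for _ in range(n_particles):
                i = random.randrange(width)
                left = h[i - 1] if i > 0 else 0
                right = h[i + 1] if i < width - 1 else 0
                h[i] = max(left, h[i] + 1, right)
            return h

        heights = deposit()
        print("mean height:", sum(heights) / len(heights))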

  12. Casting pearls ballistically: Efficient massively parallel simulation of particle deposition

    SciTech Connect

    Lubachevsky, B.D.; Privman, V.; Roy, S.C.

    1996-06-01

    We simulate ballistic particle deposition wherein a large number of spherical particles are "cast" vertically over a planar horizontal surface. Upon first contact (with the surface or with a previously deposited particle) each particle stops. This model helps material scientists to study the adsorption and sediment formation. The model is sequential, with particles deposited one by one. We have found an equivalent formulation using a continuous time random process and we simulate the latter in parallel using a method similar to the one previously employed for simulating Ising spins. We augment the parallel algorithm for simulating Ising spins with several techniques aimed at the increase of efficiency of producing the particle configuration and statistics collection. Some of these techniques are similar to earlier ones. We implement the resulting algorithm on a 16K PE MasPar MP-1 and a 4K PE MasPar MP-2. The parallel code runs on MasPar computers nearly two orders of magnitude faster than an optimized sequential code runs on a fast workstation. 17 refs., 9 figs.

  13. Performance effects of irregular communications patterns on massively parallel multiprocessors

    NASA Technical Reports Server (NTRS)

    Saltz, Joel; Petiton, Serge; Berryman, Harry; Rifkin, Adam

    1991-01-01

    A detailed study of the performance effects of irregular communications patterns on the CM-2 was conducted. The communications capabilities of the CM-2 were characterized under a variety of controlled conditions. In the process of carrying out the performance evaluation, extensive use was made of a parameterized synthetic mesh. In addition, timings with unstructured meshes generated for aerodynamic codes and with a set of sparse matrices with banded patterns of non-zeroes were performed. This benchmarking suite stresses the communications capabilities of the CM-2 in a range of different ways. Benchmark results demonstrate that it is possible to make effective use of much of the massive concurrency available in the communications network.

  14. Massively parallel spatial light modulation-based optical signal processing

    NASA Astrophysics Data System (ADS)

    Li, Yao

    1993-03-01

    A new optical parallel arithmetic processing scheme using a nonholographic optoelectronic content-addressable memory (CAM) was proposed. The design of a four-bit CAM-based optical carry look-ahead adder was studied. Compared with existing optoelectronic binary addition approaches, this nonholographic CAM scheme offers a number of practical advantages, such as faster processing speed and ease of optical implementation and alignment. For addition of numbers longer than four bits, a number of four-bit CLAs can be cascaded by incorporating the previous stage's carry. Experimental results were also demonstrated. One paper was published in Optics Letters.

  15. A sweep algorithm for massively parallel simulation of circuit-switched networks

    NASA Technical Reports Server (NTRS)

    Gaujal, Bruno; Greenberg, Albert G.; Nicol, David M.

    1992-01-01

    A new massively parallel algorithm is presented for simulating large asymmetric circuit-switched networks, controlled by a randomized-routing policy that includes trunk-reservation. A single instruction multiple data (SIMD) implementation is described, and corresponding experiments on a 16384 processor MasPar parallel computer are reported. A multiple instruction multiple data (MIMD) implementation is also described, and corresponding experiments on an Intel iPSC/860 parallel computer, using 16 processors, are reported. By exploiting parallelism, our algorithm increases the possible execution rate of such complex simulations by as much as an order of magnitude.

  16. Performance of the Wavelet Decomposition on Massively Parallel Architectures

    NASA Technical Reports Server (NTRS)

    El-Ghazawi, Tarek A.; LeMoigne, Jacqueline; Zukor, Dorothy (Technical Monitor)

    2001-01-01

    Traditionally, Fourier Transforms have been utilized for performing signal analysis and representation. But although it is straightforward to reconstruct a signal from its Fourier transform, no local description of the signal is included in its Fourier representation. To alleviate this problem, Windowed Fourier transforms and then wavelet transforms have been introduced, and it has been proven that wavelets give a better localization than traditional Fourier transforms, as well as a better division of the time- or space-frequency plane than Windowed Fourier transforms. Because of these properties and after the development of several fast algorithms for computing the wavelet representation of any signal, in particular the Multi-Resolution Analysis (MRA) developed by Mallat, wavelet transforms have increasingly been applied to signal analysis problems, especially real-life problems, in which speed is critical. In this paper we present and compare efficient wavelet decomposition algorithms on different parallel architectures. We report and analyze experimental measurements, using NASA remotely sensed images. Results show that our algorithms achieve significant performance gains on current high performance parallel systems, and meet scientific applications and multimedia requirements. The extensive performance measurements collected over a number of high-performance computer systems have revealed important architectural characteristics of these systems, in relation to the processing demands of the wavelet decomposition of digital images.
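
    A minimal illustration of the multi-resolution idea using the Haar wavelet, the simplest case (assumes a 1-D signal whose length is a power of two): each level splits the signal into coarse approximations and detail coefficients:

        import numpy as np

        def haar_mra(signal, levels):
            approx, details = np.asarray(signal, dtype=float), []
            for _ in range(levels):
                even, odd = approx[0::2], approx[1::2]
                details.append((even - odd) / np.sqrt(2))   # high-pass (detail)
                approx = (even + odd) / np.sqrt(2)          # low-pass (approximation)
            return approx, details

        a, d = haar_mra([4, 6, 10, 12, 8, 6, 5, 5], levels=3)
        print("coarsest approximation:", a)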

  17. Scientific development of a massively parallel ocean climate model. Final report

    SciTech Connect

    Semtner, A.J.; Chervin, R.M.

    1996-09-01

    Over the last three years, very significant advances have been made in refining the grid resolution of ocean models and in improving the physical and numerical treatments of ocean hydrodynamics. Some of these advances have occurred as a result of the successful transition of ocean models onto massively parallel computers, which has been led by Los Alamos investigators. Major progress has been made in simulating global ocean circulation and in understanding various ocean climatic aspects such as the effect of wind driving on heat and freshwater transports. These steps have demonstrated the capability to conduct realistic decadal to century ocean integrations at high resolution on massively parallel computers.

  18. Signal processing applications of massively parallel charge domain computing devices

    NASA Technical Reports Server (NTRS)

    Fijany, Amir (Inventor); Barhen, Jacob (Inventor); Toomarian, Nikzad (Inventor)

    1999-01-01

    The present invention is embodied in a charge coupled device (CCD)/charge injection device (CID) architecture capable of performing a Fourier transform by simultaneous matrix vector multiplication (MVM) operations in respective plural CCD/CID arrays in parallel in O(1) steps. For example, in one embodiment, a first CCD/CID array stores charge packets representing a first matrix operator based upon permutations of a Hartley transform and computes the Fourier transform of an incoming vector. A second CCD/CID array stores charge packets representing a second matrix operator based upon different permutations of a Hartley transform and computes the Fourier transform of an incoming vector. The incoming vector is applied to the inputs of the two CCD/CID arrays simultaneously, and the real and imaginary parts of the Fourier transform are produced simultaneously in the time required to perform a single MVM operation in a CCD/CID array.

  19. Factorization of large integers on a massively parallel computer

    SciTech Connect

    Davis, J.A.; Holdridge, D.B.

    1988-01-01

    Our interest in integer factorization at Sandia National Laboratories is motivated by cryptographic applications and in particular the security of the RSA encryption-decryption algorithm. We have implemented our version of the quadratic sieve procedure on the NCUBE computer with 1024 processors (nodes). The new code is significantly different in all important aspects from the program used to factor numbers of order 10^70 on a single processor CRAY computer. Capabilities of parallel processing and the limitation of small local memory necessitated this entirely new implementation. This effort involved several restarts, as realizations of program structures that seemed appealing bogged down due to inter-processor communications. We are presently working with integers of magnitude about 10^70 in tuning this code to the novel hardware. 6 refs., 3 figs.

  20. Massively parallel algorithms for trace-driven cache simulations

    NASA Technical Reports Server (NTRS)

    Nicol, David M.; Greenberg, Albert G.; Lubachevsky, Boris D.

    1991-01-01

    Trace driven cache simulation is central to computer design. A trace is a very long sequence of reference lines from main memory. At the t-th instant, reference x_t is hashed into a set of cache locations, the contents of which are then compared with x_t. If at the t-th instant x_t is not present in the cache, then it is said to be a miss, and is loaded into the cache set, possibly forcing the replacement of some other memory line, and making x_t present for the (t+1)-st instant. The problem of parallel simulation of a subtrace of N references directed to a C line cache set is considered, with the aim of determining which references are misses and related statistics. A simulation method is presented for the Least Recently Used (LRU) policy, which regardless of the set size C runs in time O(log N) using N processors on the exclusive read, exclusive write (EREW) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/log N processors. Timings are presented of the second algorithm's implementation on the MasPar MP-1, a machine with 16384 processors. A broad class of reference-based line replacement policies is considered, which includes LRU as well as the Least Frequently Used and Random replacement policies. A simulation method is presented for any such policy that, on any trace of length N directed to a C line set, runs in O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
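
    For reference, the sequential LRU simulation that the parallel algorithms reproduce can be written compactly (a fully associative C-line set is assumed):

        from collections import OrderedDict

        def lru_misses(trace, C):
            cache, misses = OrderedDict(), 0
            for x in trace:
                if x in cache:
                    cache.move_to_end(x)           # x becomes most recently used
                else:
                    misses += 1
                    if len(cache) == C:
                        cache.popitem(last=False)  # evict least recently used
                    cache[x] = True
            return misses

        print(lru_misses([1, 2, 3, 1, 4, 2, 1, 3], C=3))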

  1. MASSIVELY PARALLEL LATENT SEMANTIC ANALYSES USING A GRAPHICS PROCESSING UNIT

    SciTech Connect

    Cavanagh, J.; Cui, S.

    2009-01-01

    Latent Semantic Analysis (LSA) aims to reduce the dimensions of large term-document datasets using Singular Value Decomposition. However, with the ever-expanding size of datasets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. A graphics processing unit (GPU) can solve some highly parallel problems much faster than a traditional sequential processor or central processing unit (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a PC cluster. Due to the GPU's application-specific architecture, harnessing the GPU's computational prowess for LSA is a great challenge. We presented a parallel LSA implementation on the GPU, using NVIDIA® Compute Unified Device Architecture and Compute Unified Basic Linear Algebra Subprograms software. The performance of this implementation is compared to traditional LSA implementation on a CPU using an optimized Basic Linear Algebra Subprograms library. After implementation, we discovered that the GPU version of the algorithm was twice as fast for large matrices (1,000 x 1,000 and above) that had dimensions not divisible by 16. For large matrices that did have dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version. The large variation is due to architectural benefits of the GPU for matrices divisible by 16. It should be noted that the overall speeds for the CPU version did not vary from the norm when the matrix dimensions were divisible by 16. Further research is needed in order to produce a fully implementable version of LSA. With that in mind, the research we presented shows that the GPU is a viable option for increasing the speed of LSA, in terms of cost/performance ratio.
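
    A CPU reference of the LSA core that the paper offloads to the GPU, sketched with NumPy (the toy term-document matrix and the choice of k are illustrative):

        import numpy as np

        A = np.array([[2, 0, 1, 0],    # toy term-document counts
                      [1, 1, 0, 0],
                      [0, 2, 0, 1],
                      [0, 0, 1, 2]], dtype=float)

        # truncated SVD: keep the k largest singular values/vectors
        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        k = 2
        docs_k = np.diag(s[:k]) @ Vt[:k]   # documents in the k-dim latent space
        print(docs_k.round(2))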

  2. Massively parallel computation of RCS with finite elements

    NASA Technical Reports Server (NTRS)

    Parker, Jay

    1993-01-01

    One of the promising combinations of finite element approaches for scattering problems uses Whitney edge elements, spherical vector wave-absorbing boundary conditions, and bi-conjugate gradient solution for the frequency-domain near field. Each of these approaches may be criticized. Low-order elements require high mesh density, but also result in fast, reliable iterative convergence. Spherical wave-absorbing boundary conditions require additional space to be meshed beyond the most minimal near-space region, but result in fully sparse, symmetric matrices which keep storage and solution times low. Iterative solution is somewhat unpredictable and unfriendly to multiple right-hand sides, yet we find it to be uniformly fast on large problems to date, given the other two approaches. Implementation of these approaches on a distributed memory, message passing machine yields huge dividends, as full scalability to the largest machines appears assured and iterative solution times are well-behaved for large problems. We present times and solutions for computed RCS for a conducting cube and composite permeability/conducting sphere on the Intel iPSC/860 with up to 16 processors solving over 200,000 unknowns. We estimate problems of approximately 10 million unknowns, encompassing 1000 cubic wavelengths, may be attempted on a currently available 512 processor machine, but would be exceedingly tedious to prepare. The most severe bottlenecks are due to the slow rate of mesh generation on non-parallel machines and the large transfer time from such a machine to the parallel processor. One solution, in progress, is to create and then distribute a coarse mesh among the processors, followed by systematic refinement within each processor. Elimination of redundant node definitions at the mesh-partition surfaces, snap-to-surface post processing of the resulting mesh for good modelling of curved surfaces, and load-balancing redistribution of new elements after the refinement are auxiliary tasks.

  3. A massively parallel computational approach to coupled thermoelastic/porous gas flow problems

    NASA Technical Reports Server (NTRS)

    Shia, David; Mcmanus, Hugh L.

    1995-01-01

    A new computational scheme for coupled thermoelastic/porous gas flow problems is presented. Heat transfer, gas flow, and dynamic thermoelastic governing equations are expressed in fully explicit form, and solved on a massively parallel computer. The transpiration cooling problem is used as an example problem. The numerical solutions have been verified by comparison to available analytical solutions. Transient temperature, pressure, and stress distributions have been obtained. Small spatial oscillations in pressure and stress have been observed, which would be impractical to predict with previously available schemes. Comparisons between serial and massively parallel versions of the scheme have also been made. The results indicate that for small scale problems the serial and parallel versions use practically the same amount of CPU time. However, as the problem size increases the parallel version becomes more efficient than the serial version.

  4. Solution of large linear systems of equations on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Ida, Nathan; Udawatta, Kapila

    1987-01-01

    The Massively Parallel Processor (MPP) was designed as a special machine for specific applications in image processing. As a parallel machine, with a large number of processors that can be reconfigured in different combinations it is also applicable to other problems that require a large number of processors. The solution of linear systems of equations on the MPP is investigated. The solution times achieved are compared to those obtained with a serial machine and the performance of the MPP is discussed.

  5. Three-dimensional electromagnetic modeling and inversion on massively parallel computers

    SciTech Connect

    Newman, G.A.; Alumbaugh, D.L.

    1996-03-01

    This report has demonstrated techniques that can be used to construct solutions to the 3-D electromagnetic inverse problem using full wave equation modeling. To this point, great progress has been made in developing an inverse solution using the method of conjugate gradients, which employs a 3-D finite difference solver to construct model sensitivities and predicted data. The forward modeling code has been developed to incorporate absorbing boundary conditions for high frequency solutions (radar), as well as complex electrical properties, including electrical conductivity, dielectric permittivity and magnetic permeability. In addition, both forward and inverse codes have been ported to a massively parallel computer architecture, which allows for more realistic solutions than can be achieved with serial machines. While the inversion code has been demonstrated on field data collected at the Richmond field site, techniques for appraising the quality of the reconstructions still need to be developed. Here it is suggested that rather than employing direct matrix inversion to construct the model covariance matrix, which would be impossible because of the size of the problem, one can linearize about the 3-D model achieved in the inversion and use Monte-Carlo simulations to construct it. Using these appraisal and construction tools, it is now necessary to demonstrate 3-D inversion for a variety of EM data sets that span the frequency range from induction sounding to radar: below 100 kHz to 100 MHz. Appraised 3-D images of the earth's electrical properties can provide researchers opportunities to infer the flow paths, flow rates and perhaps the chemistry of fluids in geologic media. It also offers a means to study the frequency-dependent behavior of the properties in situ. This is of significant relevance to the Department of Energy, being paramount to the characterization and monitoring of environmental waste sites and to oil and gas exploration.

  6. Electric and Magnetic Forces between Parallel-Wire Conductors.

    ERIC Educational Resources Information Center

    Morton, N.

    1979-01-01

    Discusses electric and magnetic forces between parallel-wire conductors and derives, in a simple fashion, order of magnitude estimates of the ratio of the likely electrostatic and electromagnetic forces for a simple parallel-wire balance. (Author/HM)

  7. Massively parallel algorithms for real-time wavefront control of a dense adaptive optics system

    SciTech Connect

    Fijany, A.; Milman, M.; Redding, D.

    1994-12-31

    In this paper massively parallel algorithms and architectures for real-time wavefront control of a dense adaptive optics system (SELENE) are presented. The authors have already shown that the computation of a near optimal control algorithm for SELENE can be reduced to the solution of a discrete Poisson equation on a regular domain. Although this represents an optimal computation, due to the large size of the system and the high sampling rate requirement, the implementation of this control algorithm poses a computationally challenging problem, since it demands a sustained computational throughput of the order of 10 GFlops. They develop a novel algorithm, designated the Fast Invariant Imbedding algorithm, which offers a massive degree of parallelism with simple communication and synchronization requirements. Due to these features, this algorithm is significantly more efficient than other fast Poisson solvers for implementation on massively parallel architectures. The authors also discuss two massively parallel, algorithmically specialized architectures for low-cost and optimal implementation of the Fast Invariant Imbedding algorithm.
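
    The paper's Fast Invariant Imbedding algorithm is not reproduced here, but the class of problem it targets, a discrete Poisson equation on a regular domain, can be illustrated with a standard FFT-based fast solver (periodic boundaries and a zero-mean source are assumed):

        import numpy as np

        def poisson_periodic(f, h=1.0):
            n = f.shape[0]
            k = np.fft.fftfreq(n) * n
            kx, ky = np.meshgrid(k, k, indexing="ij")
            # eigenvalues of the 5-point Laplacian on a periodic n x n grid
            lam = (2 * np.cos(2 * np.pi * kx / n) + 2 * np.cos(2 * np.pi * ky / n) - 4) / h**2
            fh = np.fft.fft2(f)
            lam[0, 0] = 1.0          # the zero mode is undetermined; pin it
            uh = fh / lam
            uh[0, 0] = 0.0
            return np.real(np.fft.ifft2(uh))

        f = np.random.rand(32, 32); f -= f.mean()   # solvable: zero-mean source
        u = poisson_periodic(f)
        lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
               np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)
        print("max residual:", np.abs(lap - f).max())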

  8. Large-eddy simulation of the Rayleigh-Taylor instability on a massively parallel computer

    SciTech Connect

    Amala, P.A.K.

    1995-03-01

    A computational model for the solution of the three-dimensional Navier-Stokes equations is developed. This model includes a turbulence model: a modified Smagorinsky eddy-viscosity with a stochastic backscatter extension. The resultant equations are solved using finite difference techniques: a second-order explicit Lax-Wendroff scheme. This computational model is implemented on a massively parallel computer. Programming models on massively parallel computers are next studied. It is desired to determine the best programming model for the developed computational model. To this end, three different codes are tested on a current massively parallel computer: the CM-5 at Los Alamos. Each code uses a different programming model: one is a data parallel code; the other two are message passing codes. Timing studies are done to determine which method is the fastest. The data parallel approach turns out to be the fastest method on the CM-5 by at least an order of magnitude. The resultant code is then used to study a current problem of interest to the computational fluid dynamics community: the Rayleigh-Taylor instability. The Lax-Wendroff methods handle shocks and sharp interfaces poorly. To address this, the Rayleigh-Taylor linear analysis is modified to include a smoothed interface. The linear growth rate problem is then investigated. Finally, the problem of the randomly perturbed interface is examined. Stochastic backscatter breaks the symmetry of the stationary unstable interface and generates a mixing layer growing at the experimentally observed rate. 115 refs., 51 figs., 19 tabs.
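
    For reference, a one-dimensional Lax-Wendroff step for linear advection u_t + a u_x = 0, the prototype of the second-order explicit scheme used here (periodic boundaries assumed; C = a*dt/dx is the Courant number):

        import numpy as np

        def lax_wendroff_step(u, C):
            up, um = np.roll(u, -1), np.roll(u, 1)   # u[j+1], u[j-1]
            return u - 0.5 * C * (up - um) + 0.5 * C**2 * (up - 2 * u + um)

        x = np.linspace(0.0, 1.0, 200, endpoint=False)
        u = np.exp(-200 * (x - 0.3) ** 2)            # Gaussian pulse
        for _ in range(100):
            u = lax_wendroff_step(u, C=0.5)
        print("pulse peak after 100 steps:", u.max())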

  9. Parallel optimization of pixel purity index algorithm for massive hyperspectral images in cloud computing environment

    NASA Astrophysics Data System (ADS)

    Chen, Yufeng; Wu, Zebin; Sun, Le; Wei, Zhihui; Li, Yonglong

    2016-04-01

    With the gradual increase in the spatial and spectral resolution of hyperspectral images, the size of image data becomes larger and larger, and the complexity of processing algorithms is growing, which poses a big challenge to efficient massive hyperspectral image processing. Cloud computing technologies distribute computing tasks to a large number of computing resources for handling large data sets without the limitation of memory and computing resource of a single machine. This paper proposes a parallel pixel purity index (PPI) algorithm for unmixing massive hyperspectral images based on a MapReduce programming model for the first time in the literature. According to the characteristics of hyperspectral images, we describe the design principle of the algorithm, illustrate the main cloud unmixing processes of PPI, and analyze the time complexity of serial and parallel algorithms. Experimental results demonstrate that the parallel implementation of the PPI algorithm on the cloud can effectively process big hyperspectral data and accelerate the algorithm.
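
    A minimal serial sketch of the PPI "random skewer" kernel that the paper distributes with MapReduce (the toy image dimensions and skewer count are illustrative):

        import numpy as np

        def ppi_scores(pixels, n_skewers=1000, seed=0):
            # project every pixel spectrum onto random unit vectors and
            # count how often each pixel lands at an extreme
            rng = np.random.default_rng(seed)
            n, bands = pixels.shape
            scores = np.zeros(n, dtype=int)
            for _ in range(n_skewers):
                skewer = rng.normal(size=bands)
                skewer /= np.linalg.norm(skewer)
                proj = pixels @ skewer
                scores[proj.argmax()] += 1
                scores[proj.argmin()] += 1
            return scores   # high scores mark candidate endmember pixels

        pixels = np.random.default_rng(1).random((500, 50))  # 500 pixels, 50 bands
        print("top candidate pixel:", ppi_scores(pixels).argmax())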

  10. Applications of the massively parallel machine, the MasPar MP-1, to Earth sciences

    NASA Technical Reports Server (NTRS)

    Fischer, James R.; Strong, James P.; Dorband, John E.; Tilton, James C.

    1991-01-01

    The computational workload of upcoming NASA science missions, especially the ground data processing for the Earth Observing System, is projected to be quite large (in the 50 to 100 gigaFLOPS range) and correspondingly very expensive to perform using conventional supercomputer systems. High performance, general purpose massively parallel computer systems such as the MasPar MP-1 are being investigated by NASA as a more cost effective alternative. Massively parallel systems are targeted for accelerated development and maturation by NASA's upcoming five-year High Performance Computing and Communications Program. A summary of the broad range of applications currently running on the MP-1 at NASA/Goddard is presented in this paper along with descriptions of the parallel algorithmic techniques employed in five applications that have bearing on Earth sciences.

  11. Massively parallel implementation of the Penn State/NCAR Mesoscale Model

    SciTech Connect

    Foster, I.; Michalakes, J.

    1992-01-01

    Parallel computing promises significant improvements in both the raw speed and cost performance of mesoscale atmospheric models. On distributed-memory massively parallel computers available today, the performance of a mesoscale model will exceed that of conventional supercomputers; on the teraflops machines expected within the next five years, performance will increase by several orders of magnitude. As a result, scientists will be able to consider larger problems, more complex model processes, and finer resolutions. In this paper, we report on a project at Argonne National Laboratory that will allow scientists to take advantage of parallel computing technology. This Massively Parallel Mesoscale Model (MPMM) will be functionally equivalent to the Penn State/NCAR Mesoscale Model (MM). In a prototype study, we produced a parallel version of MM4 using a static (compile-time) coarse-grained "patch" decomposition. This code achieves one-third the performance of a one-processor CRAY Y-MP on twelve Intel i860 microprocessors. The current version of MPMM is based on MM5 and uses a more fine-grained approach, decomposing the grid as finely as the mesh itself allows so that each horizontal grid cell is a parallel process. This will allow the code to utilize many hundreds of processors. A high-level language for expressing parallel programs is used to implement communication streams between the processes in a way that permits dynamic remapping to the physical processors of a particular parallel computer. This facilitates load balancing, grid nesting, and coupling with graphical systems and other models.

  12. Massively parallel implementation of the Penn State/NCAR Mesoscale Model

    SciTech Connect

    Foster, I.; Michalakes, J.

    1992-12-01

    Parallel computing promises significant improvements in both the raw speed and cost performance of mesoscale atmospheric models. On distributed-memory massively parallel computers available today, the performance of a mesoscale model will exceed that of conventional supercomputers; on the teraflops machines expected within the next five years, performance will increase by several orders of magnitude. As a result, scientists will be able to consider larger problems, more complex model processes, and finer resolutions. In this paper, we report on a project at Argonne National Laboratory that will allow scientists to take advantage of parallel computing technology. This Massively Parallel Mesoscale Model (MPMM) will be functionally equivalent to the Penn State/NCAR Mesoscale Model (MM). In a prototype study, we produced a parallel version of MM4 using a static (compile-time) coarse-grained "patch" decomposition. This code achieves one-third the performance of a one-processor CRAY Y-MP on twelve Intel i860 microprocessors. The current version of MPMM is based on MM5 and uses a more fine-grained approach, decomposing the grid as finely as the mesh itself allows so that each horizontal grid cell is a parallel process. This will allow the code to utilize many hundreds of processors. A high-level language for expressing parallel programs is used to implement communication streams between the processes in a way that permits dynamic remapping to the physical processors of a particular parallel computer. This facilitates load balancing, grid nesting, and coupling with graphical systems and other models.

  13. Massively parallel per-pixel-based zerotree processing architecture for real-time video compression

    NASA Astrophysics Data System (ADS)

    Alagoda, Geoffrey; Rassau, Alexander M.; Eshraghian, Kamran

    2001-11-01

    In the span of a few years, mobile multimedia communication has rapidly become a significant area of research and development constantly challenging boundaries on a variety of technological fronts. Video compression, a fundamental component for most mobile multimedia applications, generally places heavy demands in terms of the required processing capacity. Hardware implementations of typical modern hybrid codecs require realisation of components such as motion compensation, wavelet transform, quantisation, zerotree coding and arithmetic coding in real-time. While the implementation of such codecs using a fast generic processor is possible, undesirable trade-offs in terms of power consumption and speed must generally be made. The improvement in power consumption that is achievable through the use of a slow-clocked massively parallel processing environment, while maintaining real-time processing speeds, should thus not be overlooked. An architecture to realise such a massively parallel solution for a zerotree entropy coder is, therefore, presented in this paper.

  14. Numerical and physical instabilities in massively parallel LES of reacting flows

    NASA Astrophysics Data System (ADS)

    Poinsot, Thierry

    LES of reacting flows is rapidly becoming mature and providing levels of precision which cannot be reached with any RANS (Reynolds Averaged) technique. In addition to the multiple subgrid scale models required for such LES and to the questions raised by the required numerical accuracy of LES solvers, various issues related to the reliability, mesh independence and repeatability of LES must still be addressed, especially when LES is used on massively parallel machines. This talk discusses some of these issues: (1) the existence of non-physical waves (known as 'wiggles' by most LES practitioners) in LES, (2) the effects of mesh size on LES of reacting flows, (3) the growth of rounding errors in LES on massively parallel machines, and more generally (4) the ability to qualify a LES code as 'bug free' and 'accurate'. Examples range from academic cases (minimum non-reacting turbulent channel) to applied configurations (a sector of a helicopter combustion chamber).

  15. Cross-platform compatibility of Hi-Plex, a streamlined approach for targeted massively parallel sequencing.

    PubMed

    Nguyen-Dumont, Tú; Pope, Bernard J; Hammet, Fleur; Mahmoodi, Maryam; Tsimiklis, Helen; Southey, Melissa C; Park, Daniel J

    2013-11-15

    Although per-base sequencing costs have decreased during recent years, library preparation for targeted massively parallel sequencing remains constrained by high reagent cost, limited design flexibility, and protocol complexity. To address these limitations, we previously developed Hi-Plex, a polymerase chain reaction (PCR) massively parallel sequencing strategy for screening panels of genomic target regions. Here, we demonstrate that Hi-Plex applied with hybrid adapters can generate a library suitable for sequencing with both the Ion Torrent and the TruSeq chemistries and that adjusting primer concentrations improves coverage uniformity. These results expand Hi-Plex capabilities as an accurate, affordable, flexible, and rapid approach for various genetic screening applications. PMID:23933242

  16. Cross-platform compatibility of Hi-Plex, a streamlined approach for targeted massively parallel sequencing

    PubMed Central

    Nguyen-Dumont, Tú; Pope, Bernard J.; Hammet, Fleur; Mahmoodi, Maryam; Tsimiklis, Helen; Southey, Melissa C.; Park, Daniel J.

    2013-01-01

    Although per-base sequencing costs have decreased during recent years, library preparation for targeted massively parallel sequencing remains constrained by high reagent cost, limited design flexibility, and protocol complexity. To address these limitations, we previously developed Hi-Plex, a polymerase chain reaction (PCR) massively parallel sequencing strategy for screening panels of genomic target regions. Here, we demonstrate that Hi-Plex applied with hybrid adapters can generate a library suitable for sequencing with both the Ion Torrent and the TruSeq chemistries and that adjusting primer concentrations improves coverage uniformity. These results expand Hi-Plex capabilities as an accurate, affordable, flexible, and rapid approach for various genetic screening applications. PMID:23933242

  17. A domain decomposition study of massively parallel computing in compressible gas dynamics

    NASA Astrophysics Data System (ADS)

    Wong, C. C.; Blottner, F. G.; Payne, J. L.; Soetrisno, M.

    1995-03-01

    The appropriate utilization of massively parallel computers for solving the Navier-Stokes equations is investigated and determined from an engineering perspective. The issues investigated are: (1) Should strip or patch domain decomposition of the spatial mesh be used to reduce computer time? (2) How many computer nodes should be used for a problem with a given sized mesh to reduce computer time? (3) Is the convergence of the Navier-Stokes solution procedure (LU-SGS) adversely influenced by the domain decomposition approach? The results of the paper show that the present Navier-Stokes solution technique has good performance on a massively parallel computer for transient flow problems. For steady-state problems with a large number of mesh cells, the solution procedure will require significant computer time due to an increased number of iterations to achieve a converged solution. There is an optimum number of computer nodes to use for a problem with a given global mesh size.
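
    Question (1) can be framed with a back-of-envelope halo-exchange model (assumptions: an N x N mesh, one-cell-wide halos, p a perfect square for the patch case; counts are cells sent per step by an interior subdomain):

        import math

        def strip_halo_cells(N):
            # an interior strip exchanges a full mesh row with each of 2 neighbors
            return 2 * N

        def patch_halo_cells(N, p):
            side = N // math.isqrt(p)   # assumes p is a perfect square
            # an interior patch exchanges one side with each of 4 neighbors
            return 4 * side

        N, p = 1024, 64
        print("strip:", strip_halo_cells(N), "cells/step;",
              "patch:", patch_halo_cells(N, p), "cells/step")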

  18. Chemical network problems solved on NASA/Goddard's massively parallel processor computer

    NASA Technical Reports Server (NTRS)

    Cho, Seog Y.; Carmichael, Gregory R.

    1987-01-01

    The single instruction stream, multiple data stream Massively Parallel Processor (MPP) unit consists of 16,384 bit serial arithmetic processors configured as a 128 x 128 array whose speed can exceed that of current supercomputers (Cyber 205). The applicability of the MPP for solving reaction network problems is presented and discussed, including the mapping of the calculation to the architecture, and CPU timing comparisons.

  19. Progressive Vector Quantization on a massively parallel SIMD machine with application to multispectral image data

    NASA Technical Reports Server (NTRS)

    Manohar, Mareboyana; Tilton, James C.

    1994-01-01

    A progressive vector quantization (VQ) compression approach is discussed which decomposes image data into a number of levels using full search VQ. The final level is losslessly compressed, enabling lossless reconstruction. The computational difficulties are addressed by implementation on a massively parallel SIMD machine. We demonstrate progressive VQ on multispectral imagery obtained from the Advanced Very High Resolution Radiometer instrument and other Earth observation image data, and investigate the trade-offs in selecting the number of decomposition levels and codebook training method.

  20. Parallel contributing area calculation with granularity control on massive grid terrain datasets

    NASA Astrophysics Data System (ADS)

    Jiang, Ling; Tang, Guoan; Liu, Xuejun; Song, Xiaodong; Yang, Jianyi; Liu, Kai

    2013-10-01

    The calculation of contributing areas from digital elevation models (DEMs) is one of the important tasks in digital terrain analysis (DTA). The computational process usually involves two steps in a real application: (1) calculating flow directions via a flow model, and (2) computing the contributing area for each grid cell in the DEM. The traditional algorithm for calculating contributing areas is coded as a sequential program executed on a single processor. With the increase in scope and resolution of DEMs, the serial algorithm has become increasingly difficult to perform and is often very time-consuming, especially for DEMs of large areas and fine scales. In recent years, parallel computing has become able to meet this challenge with the development of computer technology. However, parallel implementation with granularity control, an efficient strategy to reap the best parallel performance and to break the limitation of computing resources in processing massive grid terrain datasets, has not been found in the DTA research field. This paper develops a message-passing-interface (MPI) parallel approach with granularity control to calculate contributing areas. According to the proposed parallelization strategy, the parallel D8 algorithm with granularity control is designed, as well as the parallel AreaD8 algorithm. Based on the domain decomposition of DEM data, each process can handle multiple partitions decomposed under a given grain size. Following an iterative procedure of reading source data, executing the operator, and writing resulting data, each process completes the calculation for its partitions one by one. The experimental results on a multi-node cluster show that the proposed parallel algorithms with granularity control are powerful tools for processing big datasets; the parallel D8 algorithm is insensitive to granularity, while the parallel AreaD8 algorithm has an optimal grain size that yields the best parallel performance.
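
    A serial reference of the two steps described above, D8 flow directions and contributing-area accumulation, sketched by visiting cells from high to low elevation (the toy DEM is illustrative; real codes also handle flats and depressions, which this sketch ignores):

        import numpy as np

        NBRS = [(-1,-1), (-1,0), (-1,1), (0,-1), (0,1), (1,-1), (1,0), (1,1)]

        def d8_accumulate(dem):
            rows, cols = dem.shape
            area = np.ones_like(dem, dtype=float)     # each cell contributes itself
            order = np.argsort(dem, axis=None)[::-1]  # process highest cells first
            for idx in order:
                r, c = divmod(idx, cols)
                best, target = 0.0, None
                for dr, dc in NBRS:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        drop = (dem[r, c] - dem[rr, cc]) / np.hypot(dr, dc)
                        if drop > best:
                            best, target = drop, (rr, cc)
                if target is not None:                # pass area to the D8 neighbor
                    area[target] += area[r, c]
            return area

        dem = np.array([[5., 4., 3.],
                        [4., 3., 2.],
                        [3., 2., 1.]])
        print(d8_accumulate(dem))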

  1. Massively parallel multifrontal methods for finite element analysis on MIMD computer systems

    SciTech Connect

    Benner, R.E.

    1993-03-01

    The development of highly parallel direct solvers for large, sparse linear systems of equations (e.g., for finite element or finite difference models) is lagging behind progress in parallel direct solvers for dense matrices and iterative methods for sparse matrices. We describe a massively parallel (MP) multifrontal solver for the direct solution of large sparse linear systems, such as those routinely encountered in finite element structural analysis, in an effort to address concerns about the viability of scalable, MP direct methods for sparse systems and to enhance the software base for MP applications. Performance results are presented and future directions are outlined for research and development efforts in parallel multifrontal and related solvers. In particular, parallel efficiencies of 25% on 1024 nCUBE 2 nodes and 36% on 64 Intel iPSC/860 nodes have been demonstrated, and parallel efficiencies of 60--85% are expected once a severe load imbalance is overcome by static mapping and dynamic load balancing techniques previously developed for other parallel solvers and application codes.

  2. Using CLIPS in the domain of knowledge-based massively parallel programming

    NASA Technical Reports Server (NTRS)

    Dvorak, Jiri J.

    1994-01-01

    The Program Development Environment (PDE) is a tool for massively parallel programming of distributed-memory architectures. Adopting a knowledge-based approach, the PDE eliminates the complexity introduced by parallel hardware with distributed memory and offers complete transparency with respect to parallelism exploitation. The knowledge-based part of the PDE is realized in CLIPS. Its principal task is to find an efficient parallel realization of the application specified by the user in a comfortable, abstract, domain-oriented formalism. A large collection of fine-grain parallel algorithmic skeletons, represented as COOL objects in a tree hierarchy, contains the algorithmic knowledge. A hybrid knowledge base with rule modules and procedural parts, encoding expertise about the application domain, parallel programming, software engineering, and parallel hardware, enables a high degree of automation in the software development process. In this paper, important aspects of the implementation of the PDE using CLIPS and COOL are shown, including the embedding of CLIPS with the C++-based parts of the PDE. The appropriateness of the chosen approach and of the CLIPS language for knowledge-based software engineering is discussed.

  3. Massively parallel fast elliptic equation solver for three dimensional hydrodynamics and relativity

    SciTech Connect

    Sholl, P.L.; Wilson, J.R.; Mathews, G.J.; Avila, J.H.

    1995-01-01

    Through the work proposed in this document we expect to advance the forefront of large-scale computational efforts on massively parallel distributed-memory multiprocessors. We will develop tools for effective conversion of sequential numerical methods used to solve large systems of partial differential equations to parallel implementations. The research supported by this work will involve conversion of a program that does state-of-the-art modeling of multi-dimensional hydrodynamics, general relativity, and particle transport in energetic astrophysical environments. The proposed parallel algorithm development, particularly the study and development of fast elliptic equation solvers, could significantly benefit this program and other applications involving solutions to systems of differential equations. We shall develop a data communication manager for distributed-memory computers as an aid in program conversions to a parallel environment and implement it in the three-dimensional relativistic hydrodynamics program discussed below, and develop a concurrent-system/concurrent-subgrid multigrid method. Currently, five systems are approximated sequentially using multigrid successive overrelaxation, with results from an iteration cycle of one multigrid system used in the iterations of the following systems. We shall develop a multigrid algorithm for simultaneous computation of the sets of equations. In addition, we shall implement a method for concurrent processing of the subgrids in each of the multigrid computations. The conditions for convergence of the method will be examined. We'll compare this technique to other parallel multigrid techniques, such as distributed data/sequential subgrids and the Parallel Superconvergent Multigrid of Frederickson and McBryan. We expect the results of these studies to offer insight and tools both for the selection of new algorithms and for the conversion of existing large codes to massively parallel architectures.

  4. ASCI Red -- Experiences and lessons learned with a massively parallel teraFLOP supercomputer

    SciTech Connect

    Christon, M.A.; Crawford, D.A.; Hertel, E.S.; Peery, J.S.; Robinson, A.C.

    1997-06-01

    The Accelerated Strategic Computing Initiative (ASCI) program involves Sandia, Los Alamos and Lawrence Livermore National Laboratories. At Sandia National Laboratories, ASCI applications include large deformation transient dynamics, shock propagation, electromechanics, and abnormal thermal environments. In order to resolve important physical phenomena in these problems, it is estimated that meshes ranging from 10^6 to 10^9 grid points will be required. The ASCI program is relying on the use of massively parallel supercomputers initially capable of delivering over 1 TFLOPs to perform such demanding computations. The ASCI Red machine at Sandia National Laboratories consists of over 4,500 computational nodes with a peak computational rate of 1.8 TFLOPs, 567 GBytes of memory, and 2 TBytes of disk storage. Regardless of the peak FLOP rate, there are many issues surrounding the use of massively parallel supercomputers in a production environment. These issues include parallel I/O, mesh generation, visualization, archival storage, high-bandwidth networking and the development of parallel algorithms. In order to illustrate these issues and their solution with respect to ASCI Red, demonstration calculations of time-dependent buoyancy-dominated plumes, electromechanics, and shock propagation will be presented.

  5. Massively parallel Monte Carlo for many-particle simulations on GPUs

    SciTech Connect

    Anderson, Joshua A.; Jankowski, Eric; Grubb, Thomas L.; Engel, Michael; Glotzer, Sharon C.

    2013-12-01

    Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherent serial nature of the statistical sampling. In this paper, we present a massively parallel method that obeys detailed balance and implement it for a system of hard disks on the GPU. We reproduce results of serial high-precision Monte Carlo runs to verify the method. This is a good test case because the hard disk equation of state over the range where the liquid transforms into the solid is particularly sensitive to small deviations away from the balance conditions. On a Tesla K20, our GPU implementation executes over one billion trial moves per second, which is 148 times faster than on a single Intel Xeon E5540 CPU core, enables 27 times better performance per dollar, and cuts energy usage by a factor of 13. With this improved performance we are able to calculate the equation of state for systems of up to one million hard disks. These large system sizes are required in order to probe the nature of the melting transition, which has been debated for the last forty years. In this paper we present the details of our computational method, and discuss the thermodynamics of hard disks separately in a companion paper.
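    As a rough illustration of how Monte Carlo sampling can be parallelized while preserving detailed balance, the sketch below confines trial moves to checkerboard cells so that disks in same-colored cells can never interact and could be updated concurrently; the box size, cell width, and shuffling scheme are illustrative assumptions, and the paper's GPU implementation differs in detail.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    L, D, DELTA = 20.0, 1.0, 0.1      # box edge, disk diameter, max displacement
    NC = 10                           # cells per side; cell width W = L/NC >= D
    W = L / NC

    def overlaps(pos, i, trial):
        # O(N) overlap check for brevity; a real code checks only the 3x3
        # neighborhood of the cell containing disk i
        d = pos - trial
        d -= L * np.round(d / L)                  # periodic minimum image
        r2 = np.einsum('ij,ij->i', d, d)
        r2[i] = np.inf
        return bool(np.any(r2 < D * D))

    def checkerboard_sweep(pos):
        cell = (pos // W).astype(int)
        # the four checkerboard colors; shuffling their order between sweeps
        # (the paper also shifts the grid) restores detailed balance overall
        for sx, sy in rng.permutation([(0, 0), (0, 1), (1, 0), (1, 1)]):
            active = (cell[:, 0] % 2 == sx) & (cell[:, 1] % 2 == sy)
            for i in np.flatnonzero(active):      # independent across same-color cells
                trial = pos[i] + rng.uniform(-DELTA, DELTA, 2)
                # reject moves that leave the cell: this is what decouples cells
                if np.any((trial // W).astype(int) != cell[i]):
                    continue
                if not overlaps(pos, i, trial):
                    pos[i] = trial
        return pos

    # start from a square lattice so the initial state has no overlaps
    g = (np.arange(NC) + 0.5) * W
    pos = np.array([(x, y) for x in g for y in g])
    for _ in range(10):
        pos = checkerboard_sweep(pos)
    ```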

  6. Massive photons and Dirac monopoles: Electric condensate and magnetic confinement

    NASA Astrophysics Data System (ADS)

    Guimaraes, M. S.; Rougemont, R.; Wotzasek, C.; Zarro, C. A. D.

    2013-06-01

    We use the generalized Julia-Toulouse approach (GJTA) for condensation of topological currents (charges or defects) to argue that massive photons can coexist consistently with Dirac monopoles. The Proca theory is obtained here via GJTA as a low energy effective theory describing an electric condensate and the mass of the vector boson is responsible for generating a Meissner effect which confines the magnetic defects in monopole-antimonopole pairs connected by physical open magnetic vortices described by Dirac brane invariants, instead of Dirac strings.

  7. Molecular Dynamics Simulations from SNL's Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS)

    DOE Data Explorer

    Plimpton, Steve; Thompson, Aidan; Crozier, Paul

    LAMMPS (http://lammps.sandia.gov/index.html) stands for Large-scale Atomic/Molecular Massively Parallel Simulator and is a code that can be used to model atoms or, as the LAMMPS website says, as a parallel particle simulator at the atomic, meso, or continuum scale. This Sandia-based website provides a long list of animations from large simulations. These were created using different visualization packages to read LAMMPS output, and each one provides the name of the PI and a brief description of the work done or visualization package used. See also the static images produced from simulations at http://lammps.sandia.gov/pictures.html. The foundation paper for LAMMPS is: S. Plimpton, Fast Parallel Algorithms for Short-Range Molecular Dynamics, J Comp Phys, 117, 1-19 (1995), but the website also lists other papers describing contributions to LAMMPS over the years.

  8. Medical image processing utilizing neural networks trained on a massively parallel computer.

    PubMed

    Kerr, J P; Bartlett, E B

    1995-07-01

    While finding many applications in science, engineering, and medicine, artificial neural networks (ANNs) have typically been limited to small architectures. In this paper, we demonstrate how very large architecture neural networks can be trained for medical image processing utilizing a massively parallel, single-instruction multiple-data (SIMD) computer. The two to three orders of magnitude improvement in processing time attainable using a parallel computer makes it practical to train very large architecture ANNs. As an example we have trained several ANNs to demonstrate the tomographic reconstruction of 64 x 64 single photon emission computed tomography (SPECT) images from 64 planar views of the images. The potential for these large architecture ANNs lies in the fact that once the neural network is properly trained on the parallel computer, the corresponding interconnection weight file can be loaded on a serial computer. Subsequently, relatively fast processing of all novel images can be performed on a PC or workstation. PMID:7497701

  9. A massively parallel adaptive finite element method with dynamic load balancing

    SciTech Connect

    Devine, K.D.; Flaherty, J.E.; Wheat, S.R.; Maccabe, A.B.

    1993-05-01

    We construct massively parallel, adaptive finite element methods for the solution of hyperbolic conservation laws in one and two dimensions. Spatial discretization is performed by a discontinuous Galerkin finite element method using a basis of piecewise Legendre polynomials. Temporal discretization utilizes a Runge-Kutta method. Dissipative fluxes and projection limiting prevent oscillations near solution discontinuities. The resulting method is of high order and may be parallelized efficiently on MIMD computers. We demonstrate parallel efficiency through computations on a 1024-processor nCUBE/2 hypercube. We also present results using adaptive p-refinement to reduce the computational cost of the method. We describe tiling, a dynamic, element-based data migration system. Tiling dynamically maintains global load balance in the adaptive method by overlapping neighborhoods of processors, where each neighborhood performs local load balancing. We demonstrate the effectiveness of the dynamic load balancing with adaptive p-refinement examples.

  10. A massively parallel adaptive finite element method with dynamic load balancing

    SciTech Connect

    Devine, K.D.; Flaherty, J.E.; Wheat, S.R.; Maccabe, A.B.

    1993-12-31

    The authors construct massively parallel adaptive finite element methods for the solution of hyperbolic conservation laws. Spatial discretization is performed by a discontinuous Galerkin finite element method using a basis of piecewise Legendre polynomials. Temporal discretization utilizes a Runge-Kutta method. Dissipative fluxes and projection limiting prevent oscillations near solution discontinuities. The resulting method is of high order and may be parallelized efficiently on MIMD computers. They demonstrate parallel efficiency through computations on a 1024-processor nCUBE/2 hypercube. They present results using adaptive p-refinement to reduce the computational cost of the method, and tiling, a dynamic, element-based data migration system that maintains global load balance of the adaptive method by overlapping neighborhoods of processors that each perform local balancing.

  11. The use of inexact ODE solver in waveform relaxation methods on a massively parallel computer

    SciTech Connect

    Luk, W.S.; Wing, O.

    1995-12-01

    This paper presents the use of an inexact ordinary differential equation (ODE) solver in waveform relaxation methods for solving initial value problems. Since conventional ODE solvers are inherently sequential, the inexact ODE solver takes time points from only the previous waveform iteration for time integration. As a result, the method is truly massively parallel, as the equation is completely unfolded both in system and in time. Convergence analysis shows that the spectral radius of the iteration equation resulting from the "inexact" solver is the same as that from the standard method, and hence the new method is robust. The parallel implementation issues on the DECmpp 12000/Sx computer are also discussed. Numerical results illustrate that although the number of iterations in the inexact method is increased over the exact method, as expected, the computation time is much reduced because of the large-scale parallelism.
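    A minimal sketch of the "unfolded in system and time" idea, assuming a Picard-style waveform iteration with left-rectangle quadrature: each sweep builds the whole new waveform from the previous iterate only, so every time point (and every component) could be updated in parallel. This illustrates the principle rather than the paper's solver.

    ```python
    import numpy as np

    def waveform_relaxation(f, y0, t, sweeps=50):
        # Y holds the whole waveform: one row per time point
        h = t[1] - t[0]
        Y = np.tile(y0, (len(t), 1))                           # initial guess
        for _ in range(sweeps):
            F = np.array([f(tk, yk) for tk, yk in zip(t, Y)])  # old iterate only
            # Picard integral with left-rectangle quadrature; the cumulative
            # sum depends only on the previous sweep, hence "inexact" and
            # embarrassingly parallel across time points
            Y = y0 + h * np.vstack([np.zeros_like(y0), np.cumsum(F[:-1], axis=0)])
        return Y

    # usage: linear test system y' = A y on [0, 1]
    A = np.array([[-1.0, 0.5],
                  [0.0, -2.0]])
    t = np.linspace(0.0, 1.0, 101)
    Y = waveform_relaxation(lambda tk, y: A @ y, np.array([1.0, 1.0]), t)
    print(Y[-1])   # compare with the exact solution expm(A) @ y0
    ```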

  12. Design and Performance Analysis of a Massively Parallel Atmospheric General Circulation Model

    NASA Technical Reports Server (NTRS)

    Schaffer, Daniel S.; Suarez, Max J.

    1998-01-01

    In the 1990s, computer manufacturers are increasingly turning to the development of parallel processor machines to meet the high performance needs of their customers. Simultaneously, atmospheric scientists study weather and climate phenomena ranging from hurricanes to El Nino to global warming that require increasingly fine resolution models. Here, the implementation of a parallel atmospheric general circulation model (GCM) which exploits the power of massively parallel machines is described. Using the horizontal data domain decomposition methodology, this FORTRAN 90 model is able to integrate a 0.6 deg. longitude by 0.5 deg. latitude problem at a rate of 19 Gigaflops on 512 processors of a Cray T3E-600, corresponding to 280 seconds of wall-clock time per simulated model day. At this resolution, the model has 64 times as many degrees of freedom and performs 400 times as many floating point operations per simulated day as the model it replaces.

  13. Virtual Simulator: An infrastructure for design and performance-prediction of massively parallel codes

    NASA Astrophysics Data System (ADS)

    Perumalla, K.; Fujimoto, R.; Pande, S.; Karimabadi, H.; Driscoll, J.; Omelchenko, Y.

    2005-12-01

    Large parallel/distributed scientific simulations are very complex, and their dynamic behavior is hard to predict. Efficient development of massively parallel codes remains a computational challenge. For example, almost none of the kinetic codes in use in space physics today have dynamic load balancing capability. Here we present a new infrastructure for the design and performance prediction of parallel codes. Performance prediction is useful to analyze, understand and experiment with different partitioning schemes, multiple modeling alternatives and so on, without having to run the application on supercomputers. Instrumentation of the model (with the least perturbation to performance) is useful to glean key metrics and understand application-level behavior. Unfortunately, traditional approaches to virtual execution and instrumentation are limited by either slow execution speed or low resolution or both. We present a new high-resolution framework that provides a virtual CPU abstraction (with a full thread context per CPU), yet scales to thousands of virtual CPUs. The tool, called PDES2, presents different levels of modeling interfaces, from general purpose parallel simulations to parallel grid-based particle-in-cell (PIC) codes. The tool itself runs on multiple processors in order to accommodate the high resolution by distributing the virtual execution across processors. Validation experiments of PIC models in the framework using a 1-D hybrid shock application show close agreement of results from virtual executions with results from actual supercomputer runs. The utility of this tool is further illustrated through an application to a parallel global hybrid code.

  14. LDRD final report on massively-parallel linear programming : the parPCx system.

    SciTech Connect

    Parekh, Ojas; Phillips, Cynthia Ann; Boman, Erik Gunnar

    2005-02-01

    This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project ''Massively-Parallel Linear Programming''. We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and Combinatorial Optimizer).

  15. Overcoming rule-based rigidity and connectionist limitations through massively-parallel case-based reasoning

    NASA Technical Reports Server (NTRS)

    Barnden, John; Srinivas, Kankanahalli

    1990-01-01

    Symbol manipulation as used in traditional Artificial Intelligence has been criticized by neural net researchers for being excessively inflexible and sequential. On the other hand, the application of neural net techniques to the types of high-level cognitive processing studied in traditional artificial intelligence presents major problems as well. A promising way out of this impasse is to build neural net models that accomplish massively parallel case-based reasoning. Case-based reasoning, which has received much attention recently, is essentially the same as analogy-based reasoning, and avoids many of the problems leveled at traditional artificial intelligence. Further problems are avoided by doing many strands of case-based reasoning in parallel, and by implementing the whole system as a neural net. In addition, such a system provides an approach to some aspects of the problems of noise, uncertainty and novelty in reasoning systems. The current neural net system (Conposit), which performs standard rule-based reasoning, is being modified into a massively parallel case-based reasoning version.

  16. A Novel Implementation of Massively Parallel Three Dimensional Monte Carlo Radiation Transport

    NASA Astrophysics Data System (ADS)

    Robinson, P. B.; Peterson, J. D. L.

    2005-12-01

    The goal of our summer project was to implement the difference formulation for radiation transport into Cosmos++, a multidimensional, massively parallel, magnetohydrodynamics code for astrophysical applications (Peter Anninos - AX). The difference formulation is a new method for Symbolic Implicit Monte Carlo thermal transport (Brooks and Szöke - PAT). Formerly, simultaneous implementation of fully implicit Monte Carlo radiation transport in multiple dimensions on multiple processors had not been convincingly demonstrated. We found that a combination of the difference formulation and the inherent structure of Cosmos++ makes such an implementation both accurate and straightforward. We developed a "nearly nearest neighbor physics" technique to allow each processor to work independently, even with a fully implicit code. This technique, coupled with the increased accuracy of an implicit Monte Carlo solution and the efficiency of parallel computing systems, allows us to demonstrate the possibility of massively parallel thermal transport. This work was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract No. W-7405-Eng-48.

  17. Massively parallel regularized 3D inversion of potential fields on CPUs and GPUs

    NASA Astrophysics Data System (ADS)

    Čuma, Martin; Zhdanov, Michael S.

    2014-01-01

    We have recently introduced a massively parallel regularized 3D inversion of potential fields data. This program takes as input gravity or magnetic vector, tensor and Total Magnetic Intensity (TMI) measurements and produces a 3D volume of density, susceptibility, or three-dimensional magnetization vector, the latter also including magnetic remanence information. The code uses a combined MPI and OpenMP approach that maps well onto current multiprocessor multicore clusters and exhibits nearly linear strong and weak parallel scaling. It has been used to invert regional- to continental-size data sets with up to a billion cells of the 3D Earth's volume on large clusters for the interpretation of large airborne gravity and magnetics surveys. In this paper we explain the features that made this massive parallelization feasible and extend the code to add GPU support in the form of OpenACC directives. This implementation resulted in up to a 22x speedup as compared to the scalar multithreaded implementation on a 12-core Intel CPU based computer node. Furthermore, we also introduce a mixed single-double precision approach, which allows us to perform most of the calculation at single floating point precision while keeping the result as precise as if double precision had been used. This approach provides an additional 40% speedup on the GPUs, as compared to the pure double precision implementation. It also has about half the memory footprint of the fully double precision version.
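    A toy sketch of the mixed single-double precision idea: do the bulk arithmetic in float32 but carry the reduction in float64, keeping the result close to a full double-precision computation. This only illustrates the principle; how the inversion code applies it inside its forward-modeling kernels is not shown here.

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    a64 = rng.normal(size=1_000_000)
    b64 = rng.normal(size=1_000_000)
    a32, b32 = a64.astype(np.float32), b64.astype(np.float32)

    ref   = np.dot(a64, b64)                        # full double precision
    naive = np.dot(a32, b32)                        # full single precision
    mixed = np.sum((a32 * b32).astype(np.float64))  # f32 multiply, f64 accumulate

    # the mixed result is typically orders of magnitude closer to ref
    print(abs(naive - ref), abs(mixed - ref))
    ```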

  18. A cost-effective methodology for the design of massively-parallel VLSI functional units

    NASA Technical Reports Server (NTRS)

    Venkateswaran, N.; Sriram, G.; Desouza, J.

    1993-01-01

    In this paper we propose a generalized methodology for the design of cost-effective massively-parallel VLSI Functional Units. This methodology is based on a technique of generating and reducing a massive bit-array on the mask-programmable PAcube VLSI array. This methodology unifies (maintains identical data flow and control for) the execution of complex arithmetic functions on PAcube arrays. It is highly regular, expandable and uniform with respect to problem size and wordlength, thereby reducing the communication complexity. The memory-functional unit interface is regular and expandable. Using this technique, functional units of dedicated processors can be mask-programmed on the naked PAcube arrays, reducing the turn-around time. The production cost of such dedicated processors can be drastically reduced, since the naked PAcube arrays can be mass-produced. Analysis of the performance of functional units designed by our method yields promising results.

  19. Commodity cluster and hardware-based massively parallel implementations of hyperspectral imaging algorithms

    NASA Astrophysics Data System (ADS)

    Plaza, Antonio; Chang, Chein-I.; Plaza, Javier; Valencia, David

    2006-05-01

    The incorporation of hyperspectral sensors aboard airborne/satellite platforms is currently producing a nearly continual stream of multidimensional image data, and this high data volume has introduced new processing challenges. The price paid for the wealth of spatial and spectral information available from hyperspectral sensors is the enormous amount of data that they generate. Several applications exist, however, where having the desired information calculated quickly enough for practical use is highly desirable. High computing performance of algorithm analysis is particularly important in homeland defense and security applications, in which swift decisions often involve detection of (sub-pixel) military targets (including hostile weaponry, camouflage, concealment, and decoys) or chemical/biological agents. In order to speed up the computational performance of hyperspectral imaging algorithms, this paper develops several fast parallel data processing techniques spanning four classes of algorithms: (1) unsupervised classification, (2) spectral unmixing, (3) automatic target recognition, and (4) onboard data compression. A massively parallel Beowulf cluster (Thunderhead) at NASA's Goddard Space Flight Center in Maryland is used to measure the parallel performance of the proposed algorithms. In order to explore the viability of developing onboard, real-time hyperspectral data compression algorithms, a Xilinx Virtex-II field programmable gate array (FPGA) is also used in experiments. Our quantitative and comparative assessment of parallel techniques and strategies may help image analysts in the selection of parallel hyperspectral algorithms for specific applications.

  20. Stochastic simulation of charged particle transport on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Earl, James A.

    1988-01-01

    Computations of cosmic-ray transport based upon finite-difference methods are afflicted by instabilities, inaccuracies, and artifacts. To avoid these problems, researchers developed a Monte Carlo formulation which is closely related not only to the finite-difference formulation, but also to the underlying physics of transport phenomena. Implementations of this approach are currently running on the Massively Parallel Processor at Goddard Space Flight Center, whose enormous computing power overcomes the poor statistical accuracy that usually limits the use of stochastic methods. These simulations have progressed to a stage where they provide a useful and realistic picture of solar energetic particle propagation in interplanetary space.

  1. Block iterative restoration of astronomical images with the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Heap, Sara R.; Lindler, Don J.

    1987-01-01

    A method is described for algebraic image restoration capable of treating astronomical images. For a typical 500 x 500 image, direct algebraic restoration would require the solution of a 250,000 x 250,000 linear system. The block iterative approach is used to reduce the problem to solving 4900 linear systems of size 121 x 121. The algorithm was implemented on the Goddard Massively Parallel Processor, which can solve a 121 x 121 system in approximately 0.06 seconds. Examples are shown of the results for various astronomical images.

  2. Direct methods for banded linear systems on massively parallel processor computers

    SciTech Connect

    Arbenz, P.; Gander, W.

    1995-12-01

    The authors discuss direct methods for solving systems of linear equations Ax = b, A ∈ R^(n x n), on massively parallel processor (MPP) computers. Here, A is a real banded n x n matrix with lower and upper half-bandwidth r and s, respectively. We assume that the matrix A has a narrow band, meaning r + s << n. Only in this case is it worthwhile to take the zero structure of A into account, i.e., to store the matrix by diagonals and modify the algorithms.
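    To make the "store the matrix by diagonals" idea concrete, here is a small sketch using LAPACK-style band storage through SciPy; the bandwidths and matrix entries are arbitrary example values.

    ```python
    import numpy as np
    from scipy.linalg import solve_banded

    n, r, s = 8, 1, 2                      # size, lower/upper half-bandwidths
    ab = np.zeros((r + s + 1, n))          # (r+s+1) rows hold the diagonals
    ab[s]          =  4.0                  # main diagonal
    ab[s - 1, 1:]  = -1.0                  # first superdiagonal
    ab[s - 2, 2:]  =  0.5                  # second superdiagonal
    ab[s + 1, :-1] = -2.0                  # first subdiagonal

    b = np.ones(n)
    x = solve_banded((r, s), ab, b)        # O(n) work for a fixed narrow band
    ```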

  3. Scalable load balancing for massively parallel distributed Monte Carlo particle transport

    SciTech Connect

    O'Brien, M. J.; Brantley, P. S.; Joy, K. I.

    2013-07-01

    In order to run computer simulations efficiently on massively parallel computers with hundreds of thousands or millions of processors, care must be taken that the calculation is load balanced across the processors. Examining the workload of every processor leads to an unscalable algorithm, with run time at least as large as O(N), where N is the number of processors. We present a scalable load balancing algorithm, with run time O(log N), that involves iterated processor-pair-wise balancing steps, ultimately leading to a globally balanced workload. We demonstrate scalability of the algorithm up to 2 million processors on the Sequoia supercomputer at Lawrence Livermore National Laboratory. (authors)
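    The sketch below serially simulates one way iterated pair-wise balancing can reach a global balance in log2(N) rounds: in round k, rank i averages its workload with partner i XOR 2^k. The hypercube pairing schedule is an illustrative assumption, not necessarily the schedule the paper uses.

    ```python
    import numpy as np

    def pairwise_balance(load):
        # assumes the number of ranks is a power of two
        n = len(load)
        load = load.astype(float).copy()
        k = 1
        while k < n:                      # log2(n) rounds
            partner = np.arange(n) ^ k    # each rank pairs with rank XOR k
            load = 0.5 * (load + load[partner])
            k <<= 1
        return load

    load = np.random.default_rng(1).integers(0, 1000, size=16)
    print(pairwise_balance(load))         # every entry equals load.mean()
    ```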

  4. Estimating water flow through a hillslope using the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Devaney, Judy E.; Camillo, P. J.; Gurney, R. J.

    1988-01-01

    A new two-dimensional model of water flow in a hillslope has been implemented on the Massively Parallel Processor at the Goddard Space Flight Center. Flow in the soil, in both the saturated and unsaturated zones, evaporation, and overland flow are all modelled, and the rainfall rates are allowed to vary spatially. Previous models of this type had always been very limited computationally. This model takes less than a minute to model all the components of the hillslope water flow for a day. The model can now be used in sensitivity studies to specify which measurements should be taken and how accurate they should be to describe such flows for environmental studies.

  5. Animated computer graphics models of space and earth sciences data generated via the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Treinish, Lloyd A.; Gough, Michael L.; Wildenhain, W. David

    1987-01-01

    The capability of rapidly producing visual representations of large, complex, multi-dimensional space and earth sciences data sets was developed via the implementation of computer graphics modeling techniques on the Massively Parallel Processor (MPP), employing techniques recently developed for typically non-scientific applications. Such capabilities can provide a new and valuable tool for the understanding of complex scientific data, and a new application of parallel computing via the MPP. A prototype system with such capabilities was developed and integrated into the National Space Science Data Center's (NSSDC) Pilot Climate Data System (PCDS) data-independent environment for computer graphics data display to provide easy access to users. While developing these capabilities, several problems had to be solved independently of the actual use of the MPP, all of which are outlined.

  6. Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

    DOE PAGESBeta

    Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; Davidson, Gregory G.; Hamilton, Steven P.; Godfrey, Andrew T.

    2015-12-21

    This paper discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package developed and maintained at Oak Ridge National Laboratory. It has been developed to scale well from laptop to small computing clusters to advanced supercomputers. Special features of Shift include hybrid capabilities for variance reduction such as CADIS and FW-CADIS, and advanced parallel decomposition and tally methods optimized for scalability on supercomputing architectures. Shift has been validated and verified against various reactor physics benchmarks and compares well to other state-of-the-art Monte Carlo radiation transport codes such as MCNP5, CE KENO-VI, and OpenMC. Some specific benchmarks used for verification and validation include the CASL VERA criticality test suite and several Westinghouse AP1000® problems. These benchmark and scaling studies show promising results.

  7. Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

    SciTech Connect

    Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; Davidson, Gregory G.; Hamilton, Steven P.; Godfrey, Andrew T.

    2015-12-21

    This paper discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package developed and maintained at Oak Ridge National Laboratory. It has been developed to scale well from laptop to small computing clusters to advanced supercomputers. Special features of Shift include hybrid capabilities for variance reduction such as CADIS and FW-CADIS, and advanced parallel decomposition and tally methods optimized for scalability on supercomputing architectures. Shift has been validated and verified against various reactor physics benchmarks and compares well to other state-of-the-art Monte Carlo radiation transport codes such as MCNP5, CE KENO-VI, and OpenMC. Some specific benchmarks used for verification and validation include the CASL VERA criticality test suite and several Westinghouse AP1000® problems. These benchmark and scaling studies show promising results.

  8. Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

    NASA Astrophysics Data System (ADS)

    Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; Davidson, Gregory G.; Hamilton, Steven P.; Godfrey, Andrew T.

    2016-03-01

    This work discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package authored at Oak Ridge National Laboratory. Shift has been developed to scale well from laptops to small computing clusters to advanced supercomputers and includes features such as support for multiple geometry and physics engines, hybrid capabilities for variance reduction methods such as the Consistent Adjoint-Driven Importance Sampling methodology, advanced parallel decompositions, and tally methods optimized for scalability on supercomputing architectures. The scaling studies presented in this paper demonstrate good weak and strong scaling behavior for the implemented algorithms. Shift has also been validated and verified against various reactor physics benchmarks, including the Consortium for Advanced Simulation of Light Water Reactors' Virtual Environment for Reactor Analysis criticality test suite and several Westinghouse AP1000® problems presented in this paper. These benchmark results compare well to those from other contemporary Monte Carlo codes such as MCNP5 and KENO.

  9. Massively Parallel Computation of Soil Surface Roughness Parameters on A Fermi GPU

    NASA Astrophysics Data System (ADS)

    Li, Xiaojie; Song, Changhe

    2016-06-01

    Surface roughness describes the randomness or irregularity of the surface microtopography. The standard deviation of surface height and the surface correlation length describe the statistical variation of the random component of a surface height relative to a reference surface. When the number of data points is large, calculation of surface roughness parameters is time-consuming. With the advent of Graphics Processing Unit (GPU) architectures, inherently parallel problems can be effectively solved using GPUs. In this paper we propose a GPU-based massively parallel computing method for 2D bare soil surface roughness estimation. This method was applied to data collected by a surface roughness tester based on the laser triangulation principle during a field experiment in April 2012. The total number of data points was 52,040. The computation took 47 seconds on a Fermi GTX 590 GPU, whereas the serial CPU version took 5,422 seconds, a significant 115x speedup.
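    For reference, a serial sketch of the two parameters follows: the RMS height is the standard deviation about the mean surface, and the correlation length is taken here as the first lag where the normalized autocorrelation drops below 1/e, a common convention that may differ from the paper's exact definition; the FFT-based autocorrelation stands in for the GPU's parallel lag loop.

    ```python
    import numpy as np

    def roughness(z, dx=1.0):
        z = z - z.mean()                       # detrend (mean removal only)
        s = z.std()                            # RMS height
        n = len(z)
        # autocorrelation for all lags via FFT: O(n log n) instead of O(n^2)
        F = np.fft.rfft(z, 2 * n)
        acf = np.fft.irfft(F * np.conj(F))[:n]
        acf /= acf[0]                          # normalize so acf[0] == 1
        below = np.flatnonzero(acf < 1.0 / np.e)
        l = below[0] * dx if below.size else n * dx   # correlation length
        return s, l

    z = np.random.default_rng(2).normal(size=52040)   # stand-in for profile data
    print(roughness(z, dx=0.5))
    ```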

  10. A Massively Parallel Solver for the Mechanical Harmonic Analysis of Accelerator Cavities

    SciTech Connect

    O. Kononenko

    2015-02-17

    ACE3P is a 3D massively parallel simulation suite developed at SLAC National Accelerator Laboratory that can perform coupled electromagnetic, thermal and mechanical studies. Effectively utilizing supercomputer resources, ACE3P has become a key simulation tool for particle accelerator R and D. A new frequency domain solver to perform mechanical harmonic response analysis of accelerator components has been developed within the existing parallel framework. This solver is designed to determine the frequency response of the mechanical system to external harmonic excitations for time-efficient, accurate analysis of large-scale problems. Coupled with the ACE3P electromagnetic modules, this capability complements a set of multi-physics tools for a comprehensive study of microphonics in superconducting accelerating cavities, in order to understand the RF response and feedback requirements for the operational reliability of a particle accelerator. (auth)

  11. Massively Parallel and Scalable Implicit Time Integration Algorithms for Structural Dynamics

    NASA Technical Reports Server (NTRS)

    Farhat, Charbel

    1997-01-01

    Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because of the following additional facts: (a) explicit schemes are easier to parallelize than implicit ones, and (b) explicit schemes induce short range interprocessor communications that are relatively inexpensive, while the factorization methods used in most implicit schemes induce long range interprocessor communications that often ruin the sought-after speed-up. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet be offset by the speed of the currently available parallel hardware. Therefore, it is essential to develop efficient alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating the low-frequency dynamics of aerospace structures.

  12. Massively parallel simulation of flow and transport in variably saturated porous and fractured media

    SciTech Connect

    Wu, Yu-Shu; Zhang, Keni; Pruess, Karsten

    2002-01-15

    This paper describes a massively parallel simulation method and its application for modeling multiphase flow and multicomponent transport in porous and fractured reservoirs. The parallel-computing method has been implemented into the TOUGH2 code and its numerical performance is tested on a Cray T3E-900 and IBM SP. The efficiency and robustness of the parallel-computing algorithm are demonstrated by completing two simulations with more than one million gridblocks, using site-specific data obtained from a site-characterization study. The first application involves the development of a three-dimensional numerical model for flow in the unsaturated zone of Yucca Mountain, Nevada. The second application is the study of tracer/radionuclide transport through fracture-matrix rocks for the same site. The parallel-computing technique enhances modeling capabilities by achieving several-orders-of-magnitude speedup for large-scale and high resolution modeling studies. The resulting modeling results provide many new insights into flow and transport processes that could not be obtained from simulations using the single-CPU simulator.

  13. DGDFT: A massively parallel method for large scale density functional theory calculations

    SciTech Connect

    Hu, Wei; Yang, Chao; Lin, Lin

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10^-4 Hartree/atom in terms of the error of energy and 6.2 × 10^-4 Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3,500-14,000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.

  14. Seismic waves modeling with the Fourier pseudo-spectral method on massively parallel machines.

    NASA Astrophysics Data System (ADS)

    Klin, Peter

    2015-04-01

    The Fourier pseudo-spectral method (FPSM) is an approach for the 3D numerical modeling of wave propagation, which is based on the discretization of the spatial domain in a structured grid and relies on global spatial differential operators for the solution of the wave equation. This last peculiarity is advantageous from the accuracy point of view but poses difficulties for an efficient implementation of the method on parallel computers with distributed memory architecture. The 1D spatial domain decomposition approach has so far been commonly adopted in parallel implementations of the FPSM, but it implies an intensive data exchange among all the processors involved in the computation, which can degrade performance because of communication latencies. Moreover, the scalability of the 1D domain decomposition is limited, since the number of processors cannot exceed the number of grid points along the directions in which the domain is partitioned. This limitation inhibits an efficient exploitation of computational environments with a very large number of processors. In order to overcome the limitations of the 1D domain decomposition, we implemented a parallel version of the FPSM based on a 2D domain decomposition, which allows us to achieve a higher degree of parallelism and scalability on massively parallel machines with several thousands of processing elements. The parallel programming is essentially achieved using the MPI protocol, but OpenMP parts are also included in order to exploit the multi-threading capabilities of single processors, when available. The developed tool is aimed at the numerical simulation of seismic wave propagation, and in particular is intended for earthquake ground motion research. We show the scalability tests performed up to 16k processing elements on the IBM Blue Gene/Q computer at CINECA (Italy), as well as the application to the simulation of earthquake ground motion in the alluvial plain of the Po river (Italy).

  15. A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets

    SciTech Connect

    Madduri, Kamesh; Ediger, David; Jiang, Karl; Bader, David A.; Chavarría-Miranda, Daniel

    2009-05-29

    We present a new lock-free parallel algorithm for computing betweenness centrality of massive small-world networks. With minor changes to the data structures, our algorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in the HPCS SSCA#2 Graph Analysis benchmark, which has been extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the ThreadStorm processor, and a single-socket Sun multicore server with the UltraSparc T2 processor. For a small-world network of 134 million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.
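    For context, below is the serial Brandes algorithm that parallel betweenness implementations like this one accelerate, written for an unweighted graph given as an adjacency list; this is the textbook kernel, not the authors' lock-free multithreaded code.

    ```python
    from collections import deque

    def betweenness(adj):
        # unnormalized scores; each unordered pair is counted in both directions
        n = len(adj)
        bc = [0.0] * n
        for s in range(n):
            sigma = [0] * n; sigma[s] = 1         # shortest-path counts
            dist = [-1] * n; dist[s] = 0
            preds = [[] for _ in range(n)]
            order = []
            q = deque([s])
            while q:                              # BFS phase
                v = q.popleft(); order.append(v)
                for w in adj[v]:
                    if dist[w] < 0:
                        dist[w] = dist[v] + 1
                        q.append(w)
                    if dist[w] == dist[v] + 1:
                        sigma[w] += sigma[v]
                        preds[w].append(v)
            delta = [0.0] * n
            for w in reversed(order):             # dependency accumulation
                for v in preds[w]:
                    delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
                if w != s:
                    bc[w] += delta[w]
        return bc

    adj = [[1, 2], [0, 2], [0, 1, 3], [2]]        # small undirected graph
    print(betweenness(adj))                       # vertex 2 carries all paths to 3
    ```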

  16. A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets

    SciTech Connect

    Madduri, Kamesh; Ediger, David; Jiang, Karl; Bader, David A.; Chavarria-Miranda, Daniel

    2009-02-15

    We present a new lock-free parallel algorithm for computing betweenness centrality of massive small-world networks. With minor changes to the data structures, our algorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in HPCS SSCA#2, a benchmark extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the Threadstorm processor, and a single-socket Sun multicore server with the UltraSPARC T2 processor. For a small-world network of 134 million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.

  17. On distributed memory MPI-based parallelization of SPH codes in massive HPC context

    NASA Astrophysics Data System (ADS)

    Oger, G.; Le Touzé, D.; Guibert, D.; de Leffe, M.; Biddiscombe, J.; Soumagne, J.; Piccinali, J.-G.

    2016-03-01

    Most particle methods share the problem of high computational cost, and in order to satisfy the demands of solvers, currently available hardware technologies must be fully exploited. Two complementary technologies are now accessible. On the one hand, CPUs, which can be structured into a multi-node framework, allowing massive data exchanges through a high speed network; in this case, each node usually comprises several cores available to perform multithreaded computations. On the other hand, GPUs, which are derived from graphics computing technologies and able to perform highly multi-threaded calculations with hundreds of independent threads connected together through a common shared memory. This paper is primarily dedicated to the distributed memory parallelization of particle methods, targeting several thousands of CPU cores. The experience gained clearly shows that parallelizing a particle-based code on moderate numbers of cores can easily lead to an acceptable scalability, whilst a scalable speedup on thousands of cores is much more difficult to obtain. The discussion revolves around speeding up particle methods as a whole, in a massive HPC context, by making use of the MPI library. We focus on one particular particle method, Smoothed Particle Hydrodynamics (SPH), one of the most widespread today in the literature as well as in engineering.

  18. Massively Parallel Dantzig-Wolfe Decomposition Applied to Traffic Flow Scheduling

    NASA Technical Reports Server (NTRS)

    Rios, Joseph Lucio; Ross, Kevin

    2009-01-01

    Optimal scheduling of air traffic over the entire National Airspace System is a computationally difficult task. To speed computation, Dantzig-Wolfe decomposition is applied to a known linear integer programming approach for assigning delays to flights. The optimization model is proven to have the block-angular structure necessary for Dantzig-Wolfe decomposition. The subproblems for this decomposition are solved in parallel via independent computation threads. Experimental evidence suggests that as the number of subproblems/threads increases (and their respective sizes decrease), the solution quality, convergence, and runtime improve. A demonstration of this is provided by using one flight per subproblem, which is the finest possible decomposition. This results in thousands of subproblems and associated computation threads. This massively parallel approach is compared to one with few threads and to standard (non-decomposed) approaches in terms of solution quality and runtime. Since this method generally provides a non-integral (relaxed) solution to the original optimization problem, two heuristics are developed to generate an integral solution. Dantzig-Wolfe followed by these heuristics can provide a near-optimal (sometimes optimal) solution to the original problem hundreds of times faster than standard (non-decomposed) approaches. In addition, when massive decomposition is employed, the solution is shown to be more likely integral, which obviates the need for an integerization step. These results indicate that nationwide, real-time, high-fidelity, optimal traffic flow scheduling is achievable for (at least) 3-hour planning horizons.

  19. The Fortran-P Translator: Towards Automatic Translation of Fortran 77 Programs for Massively Parallel Processors

    DOE PAGESBeta

    O'keefe, Matthew; Parr, Terence; Edgar, B. Kevin; Anderson, Steve; Woodward, Paul; Dietz, Hank

    1995-01-01

    Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how applications codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.

  20. Massively Parallel Sequencing-Based Clonality Analysis of Synchronous Endometrioid Endometrial and Ovarian Carcinomas.

    PubMed

    Schultheis, Anne M; Ng, Charlotte K Y; De Filippo, Maria R; Piscuoglio, Salvatore; Macedo, Gabriel S; Gatius, Sonia; Perez Mies, Belen; Soslow, Robert A; Lim, Raymond S; Viale, Agnes; Huberman, Kety H; Palacios, Jose C; Reis-Filho, Jorge S; Matias-Guiu, Xavier; Weigelt, Britta

    2016-06-01

    Synchronous early-stage endometrioid endometrial carcinomas (EECs) and endometrioid ovarian carcinomas (EOCs) are associated with a favorable prognosis and have been suggested to represent independent primary tumors rather than metastatic disease. We subjected sporadic synchronous EECs/EOCs from five patients to whole-exome massively parallel sequencing, which revealed that the EEC and EOC of each case displayed strikingly similar repertoires of somatic mutations and gene copy number alterations. Despite the presence of mutations restricted to the EEC or EOC in each case, we observed that the mutational processes that shaped their respective genomes were consistent. High-depth targeted massively parallel sequencing of sporadic synchronous EECs/EOCs from 17 additional patients confirmed that these lesions are clonally related. In an additional Lynch Syndrome case, however, the EEC and EOC were found to constitute independent cancers lacking somatic mutations in common. Taken together, sporadic synchronous EECs/EOCs are clonally related and likely constitute dissemination from one site to the other. PMID:26832770

  1. New strategies and emerging technologies for massively parallel sequencing: applications in medical research.

    PubMed

    Mardis, Elaine R

    2009-01-01

    A variety of techniques that specifically target human gene sequences for differential capture from a genomic sample, coupled with next-generation, massively parallel DNA sequencing instruments, is rapidly supplanting the combination of polymerase chain reaction and capillary sequencing to discover coding variants in medically relevant samples. These studies are most appropriate for the sample numbers necessary to identify both common and rare single nucleotide variants, as well as small insertion or deletion events, which may cause complex inherited diseases. The same massively parallel sequencers are simultaneously being used for whole-genome resequencing and comprehensive, genome-wide variant discovery in studies of somatic diseases such as cancer. Viral and microbial researchers are using next-generation sequencers to identify unknown etiologic agents in human diseases, to study the viral and microbial species that occupy surfaces of the human body, and to inform the clinical management of chronic infectious diseases such as human immunodeficiency virus (HIV). Taken together, these approaches are dramatically accelerating the pace of human disease research and are already impacting patient care. PMID:19435481

  2. Transcriptional analysis of endocrine disruption using zebrafish and massively parallel sequencing

    PubMed Central

    Baker, Michael E.; Hardiman, Gary

    2014-01-01

    Endocrine disrupting chemicals (EDCs), including plasticizers, pesticides, detergents and pharmaceuticals, affect a variety of hormone-regulated physiological pathways in humans and wildlife. Many EDCs are lipophilic molecules and bind to hydrophobic pockets in steroid receptors, such as the estrogen receptor and androgen receptor, which are important in vertebrate reproduction and development. Indeed, health effects attributed to EDCs include reproductive dysfunction (e.g., reduced fertility, reproductive tract abnormalities and skewed male/female sex ratios in fish), early puberty, various cancers and obesity. A major concern is the effects of exposure to low concentrations of endocrine disruptors in utero and post partum, which may increase the incidence of cancer and diabetes in adults. EDCs affect transcription of hundreds and even thousands of genes, which has created the need for new tools to monitor the global effects of EDCs. The emergence of massively parallel sequencing for investigating gene transcription provides a sensitive tool for monitoring the effects of EDCs on humans and other vertebrates as well as elucidating the mechanism of action of EDCs. Zebrafish conserve many developmental pathways found in humans, which makes zebrafish a valuable model system for studying EDCs, especially their effects on early organ development, because zebrafish embryos are translucent. In this article we review recent advances in massively parallel sequencing approaches with a focus on zebrafish. We make the case that zebrafish exposed to EDCs at different stages of development can provide important insights on EDC effects on human health. PMID:24850832

  3. A massively parallel semi-Lagrangian algorithm for solving the transport equation

    SciTech Connect

    Manson, Russell; Wang, Dali

    2010-01-01

    The scalar transport equation underpins many models employed in science, engineering, technology and business. Application areas include, but are not restricted to, pollution transport, weather forecasting, video analysis and encoding (the optical flow equation), options and stock pricing (the Black-Scholes equation) and spatially explicit ecological models. Unfortunately, finding numerical solutions to this equation that are both fast and accurate is not trivial. Moreover, finding such numerical algorithms that can be implemented efficiently on high performance computer architectures is challenging. In this paper the authors describe a massively parallel algorithm for solving the advection portion of the transport equation. We present an approach which is different from that used in most transport models and which we have tried and tested for various scenarios. The approach employs an intelligent domain decomposition based on the vector field of the system equations and thus automatically partitions the computational domain into algorithmically autonomous regions. The solution of a classic pure advection transport problem is shown to be conservative, monotonic and highly accurate at large time steps. Additionally we demonstrate that the algorithm is highly efficient for high performance computer architectures and thus offers a route towards massively parallel application.
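
    The abstract does not reproduce the paper's decomposition scheme, but the semi-Lagrangian family it builds on is easy to illustrate. A minimal one-dimensional sketch, assuming a periodic grid and linear interpolation, showing why such schemes remain usable at Courant numbers above 1:

        import numpy as np

        def semi_lagrangian_step(c, u, dt, dx):
            """Advect field c one step: trace each grid point's characteristic
            back to its departure point x - u*dt and interpolate there."""
            n = c.size
            x = np.arange(n) * dx
            xd = (x - u * dt) % (n * dx)        # departure points (periodic)
            i = np.floor(xd / dx).astype(int)   # left neighbour index
            w = xd / dx - i                     # linear interpolation weight
            return (1 - w) * c[i] + w * c[(i + 1) % n]

        # advect a Gaussian pulse at Courant number u*dt/dx = 2.5
        c = np.exp(-((np.linspace(0, 1, 200) - 0.3) / 0.05) ** 2)
        for _ in range(40):
            c = semi_lagrangian_step(c, u=1.0, dt=0.0125, dx=1.0 / 200)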

  4. MADmap: A Massively Parallel Maximum-Likelihood Cosmic Microwave Background Map-Maker

    SciTech Connect

    Cantalupo, Christopher; Borrill, Julian; Jaffe, Andrew; Kisner, Theodore; Stompor, Radoslaw

    2009-06-09

    MADmap is a software application used to produce maximum-likelihood images of the sky from time-ordered data which include correlated noise, such as those gathered by Cosmic Microwave Background (CMB) experiments. It works efficiently on platforms ranging from small workstations to the most massively parallel supercomputers. Map-making is a critical step in the analysis of all CMB data sets, and the maximum-likelihood approach is the most accurate and widely applicable algorithm; however, it is a computationally challenging task. This challenge will only increase with the next generation of ground-based, balloon-borne and satellite CMB polarization experiments. The faintness of the B-mode signal that these experiments seek to measure requires them to gather enormous data sets. MADmap is already being run on up to O(10^11) time samples, O(10^8) pixels and O(10^4) cores, with ongoing work to scale to the next generation of data sets and supercomputers. We describe MADmap's algorithm based around a preconditioned conjugate gradient solver, fast Fourier transforms and sparse matrix operations. We highlight MADmap's ability to address problems typically encountered in the analysis of realistic CMB data sets and describe its application to simulations of the Planck and EBEX experiments. The massively parallel and distributed implementation is detailed and scaling complexities are given for the resources required. MADmap is capable of analysing the largest data sets now being collected on computing resources currently available, and we argue that, given Moore's Law, MADmap will be capable of reducing the most massive projected data sets.
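
    Maximum-likelihood map-making amounts to solving the normal equations (P^T N^-1 P) m = P^T N^-1 d for the map m, given pointing matrix P, noise covariance N and time-ordered data d. A toy sketch assuming white (diagonal) noise; MADmap itself treats correlated noise with FFTs and adds preconditioning:

        import numpy as np
        from scipy.sparse import csr_matrix
        from scipy.sparse.linalg import cg, LinearOperator

        nt, npix = 10000, 64
        rng = np.random.default_rng(0)
        pix = rng.integers(0, npix, nt)       # pixel hit by each time sample
        P = csr_matrix((np.ones(nt), (np.arange(nt), pix)), shape=(nt, npix))
        sky = rng.normal(size=npix)
        d = P @ sky + 0.1 * rng.normal(size=nt)   # signal + white noise
        ninv = 1.0 / 0.1**2                       # inverse noise variance

        A = LinearOperator((npix, npix),
                           matvec=lambda m: P.T @ (ninv * (P @ m)))
        b = P.T @ (ninv * d)
        m, info = cg(A, b)   # conjugate gradient solve of the normal equations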

  5. ASSET: Analysis of Sequences of Synchronous Events in Massively Parallel Spike Trains.

    PubMed

    Torre, Emiliano; Canova, Carlos; Denker, Michael; Gerstein, George; Helias, Moritz; Grün, Sonja

    2016-07-01

    With the ability to observe the activity from large numbers of neurons simultaneously using modern recording technologies, the chance to identify sub-networks involved in coordinated processing increases. Sequences of synchronous spike events (SSEs) constitute one type of such coordinated spiking that propagates activity in a temporally precise manner. The synfire chain was proposed as one potential model for such network processing. Previous work introduced a method for visualization of SSEs in massively parallel spike trains, based on an intersection matrix that contains in each entry the degree of overlap of active neurons in two corresponding time bins. Repeated SSEs are reflected in the matrix as diagonal structures of high overlap values. The method as such, however, leaves the task of identifying these diagonal structures to visual inspection rather than to a quantitative analysis. Here we present ASSET (Analysis of Sequences of Synchronous EvenTs), an improved, fully automated method which determines diagonal structures in the intersection matrix by a robust mathematical procedure. The method consists of a sequence of steps that i) assess which entries in the matrix potentially belong to a diagonal structure, ii) cluster these entries into individual diagonal structures and iii) determine the neurons composing the associated SSEs. We employ parallel point processes generated by stochastic simulations as test data to demonstrate the performance of the method under a wide range of realistic scenarios, including different types of non-stationarity of the spiking activity and different correlation structures. Finally, the ability of the method to discover SSEs is demonstrated on complex data from large network simulations with embedded synfire chains. Thus, ASSET represents an effective and efficient tool to analyze massively parallel spike data for temporal sequences of synchronous activity. PMID:27420734
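
    A sketch of the intersection matrix at the core of the method, computed from hypothetical spike times; the full ASSET pipeline (probabilistic evaluation and clustering of the diagonal structures) is available in, e.g., the Elephant Python toolbox:

        import numpy as np

        def intersection_matrix(spiketrains, t_stop, binsize):
            """Entry (i, j) counts the neurons active in both bin i and bin j;
            repeated SSEs appear as off-diagonal bands of high overlap."""
            nbins = int(np.ceil(t_stop / binsize))
            active = np.zeros((len(spiketrains), nbins), dtype=int)
            for n, st in enumerate(spiketrains):
                active[n, (np.asarray(st) / binsize).astype(int)] = 1
            return active.T @ active

        # three toy neurons firing the same sequence twice (times in s)
        trains = [[0.01, 0.51], [0.06, 0.56], [0.11, 0.61]]
        I = intersection_matrix(trains, t_stop=1.0, binsize=0.05)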

  6. ASSET: Analysis of Sequences of Synchronous Events in Massively Parallel Spike Trains

    PubMed Central

    Canova, Carlos; Denker, Michael; Gerstein, George; Helias, Moritz

    2016-01-01

    With the ability to observe the activity from large numbers of neurons simultaneously using modern recording technologies, the chance to identify sub-networks involved in coordinated processing increases. Sequences of synchronous spike events (SSEs) constitute one type of such coordinated spiking that propagates activity in a temporally precise manner. The synfire chain was proposed as one potential model for such network processing. Previous work introduced a method for visualization of SSEs in massively parallel spike trains, based on an intersection matrix that contains in each entry the degree of overlap of active neurons in two corresponding time bins. Repeated SSEs are reflected in the matrix as diagonal structures of high overlap values. The method as such, however, leaves the task of identifying these diagonal structures to visual inspection rather than to a quantitative analysis. Here we present ASSET (Analysis of Sequences of Synchronous EvenTs), an improved, fully automated method which determines diagonal structures in the intersection matrix by a robust mathematical procedure. The method consists of a sequence of steps that i) assess which entries in the matrix potentially belong to a diagonal structure, ii) cluster these entries into individual diagonal structures and iii) determine the neurons composing the associated SSEs. We employ parallel point processes generated by stochastic simulations as test data to demonstrate the performance of the method under a wide range of realistic scenarios, including different types of non-stationarity of the spiking activity and different correlation structures. Finally, the ability of the method to discover SSEs is demonstrated on complex data from large network simulations with embedded synfire chains. Thus, ASSET represents an effective and efficient tool to analyze massively parallel spike data for temporal sequences of synchronous activity. PMID:27420734

  7. MPI/OpenMP Hybrid Parallel Algorithm of Resolution of Identity Second-Order Møller-Plesset Perturbation Calculation for Massively Parallel Multicore Supercomputers.

    PubMed

    Katouda, Michio; Nakajima, Takahito

    2013-12-10

    A new algorithm for massively parallel calculations of electron correlation energy of large molecules based on the resolution of identity second-order Møller-Plesset perturbation (RI-MP2) technique is developed and implemented into the quantum chemistry software NTChem. In this algorithm, a Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) hybrid parallel programming model is applied to attain efficient parallel performance on massively parallel supercomputers. An in-core storage scheme for the intermediate data of three-center electron repulsion integrals utilizing the distributed memory is developed to eliminate input/output (I/O) overhead. The parallel performance of the algorithm is tested on massively parallel supercomputers such as the K computer (using up to 45,992 central processing unit (CPU) cores) and a commodity Intel Xeon cluster (using up to 8,192 CPU cores). The parallel RI-MP2/cc-pVTZ calculation of two-layer nanographene sheets (C150H30)2 (number of atomic orbitals is 9640) was performed using 8,991 nodes and 71,288 CPU cores of the K computer. PMID:26592275

  8. User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code

    SciTech Connect

    Earth Sciences Division; Zhang, Keni; Zhang, Keni; Wu, Yu-Shu; Pruess, Karsten

    2008-05-27

    TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one, two, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator is to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on the TOUGH2 Version 1.4 with EOS3, EOS9, and T2R3D modules, a software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0 together with additional capabilities for handling fractured media from V1.4. This report provides a quick starting guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the (standard) version TOUGH2 code. The report also gives a brief technical description of the code, including a discussion of parallel methodology, code structure, as well as mathematical and numerical methods used.

  9. Compact Graph Representations and Parallel Connectivity Algorithms for Massive Dynamic Network Analysis

    SciTech Connect

    Madduri, Kamesh; Bader, David A.

    2009-02-15

    Graph-theoretic abstractions are extensively used to analyze massive data sets. Temporal data streams from socioeconomic interactions, social networking web sites, communication traffic, and scientific computing can be intuitively modeled as graphs. We present the first study of novel high-performance combinatorial techniques for analyzing large-scale information networks, encapsulating dynamic interaction data in the order of billions of entities. We present new data structures to represent dynamic interaction networks, and discuss algorithms for processing parallel insertions and deletions of edges in small-world networks. With these new approaches, we achieve an average performance rate of 25 million structural updates per second and a parallel speedup of nearly 28 on a 64-way Sun UltraSPARC T2 multicore processor, for insertions and deletions to a small-world network of 33.5 million vertices and 268 million edges. We also design parallel implementations of fundamental dynamic graph kernels related to connectivity and centrality queries. Our implementations are freely distributed as part of the open-source SNAP (Small-world Network Analysis and Partitioning) complex network analysis framework.
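
    A much-simplified sketch of the kernel types involved: a dynamic graph taking batched edge insertions and deletions plus a connectivity query. The paper's data structures are considerably more elaborate and process batches in parallel:

        from collections import defaultdict, deque

        class DynamicGraph:
            def __init__(self):
                self.adj = defaultdict(set)     # adjacency sets

            def apply_batch(self, insertions, deletions):
                for u, v in insertions:
                    self.adj[u].add(v); self.adj[v].add(u)
                for u, v in deletions:
                    self.adj[u].discard(v); self.adj[v].discard(u)

            def connected(self, s, t):          # BFS connectivity query
                seen, frontier = {s}, deque([s])
                while frontier:
                    u = frontier.popleft()
                    if u == t:
                        return True
                    for w in self.adj[u] - seen:
                        seen.add(w); frontier.append(w)
                return False

        g = DynamicGraph()
        g.apply_batch(insertions=[(0, 1), (1, 2), (3, 4)], deletions=[])
        assert g.connected(0, 2) and not g.connected(0, 4)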

  10. Microfluidic Reactor Array Device for Massively Parallel In-situ Synthesis of Oligonucleotides

    PubMed Central

    Srivannavit, Onnop; Gulari, Mayurachat; Hua, Zhishan; Gao, Xiaolian; Zhou, Xiaochuan; Hong, Ailing; Zhou, Tiecheng; Gulari, Erdogan

    2009-01-01

    We have designed and fabricated a microfluidic reactor array device for massively parallel in-situ synthesis of oligonucleotides (oDNA). The device is made of glass anodically bonded to silicon and has three levels of features: microreactors, microchannels and through inlet/outlet holes. Main challenges in the design of this device include preventing diffusion of photogenerated reagents upon activation and achieving uniform reagent flow through thousands of parallel reactors. The device embodies a simple and effective dynamic isolation mechanism which prevents the intermixing of active reagents between discrete microreactors. Uniform flow and synthesis reactions can be achieved in all of the reactors by proper design of the microreactors and the microchannels. We demonstrated the use of this device for solution-based, light-directed parallel in-situ oDNA synthesis. We were able to synthesize long oDNA, up to 120 mers, at a stepwise yield of 98%. The quality of our microfluidic oDNA microarray, including sensitivity, signal noise, specificity, spot variation and accuracy, was characterized. Our microfluidic reactor array devices show great potential for genomics and proteomics research. PMID:20161215

  11. Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers

    NASA Technical Reports Server (NTRS)

    Morgan, Philip E.

    2004-01-01

    This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications". The discussion of Scalable High Performance Computing reports on three objectives: validate, assess the scalability of, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulation (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhance an electromagnetics code (CHARGE) to effectively model antenna problems; apply lessons learned from high-order/spectral solutions of swirling 3D jets to the electromagnetics project; transition a high-order fluids code, FDL3DI, to solve Maxwell's equations using compact differencing; develop and demonstrate improved radiation-absorbing boundary conditions for high-order CEM; and extend the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.

  12. Displacement Current and the Generation of Parallel Electric Fields

    SciTech Connect

    Song Yan; Lysak, Robert L.

    2006-04-14

    We show for the first time the dynamical relationship between the generation of the magnetic field-aligned electric field (E_parallel) and the temporal changes and spatial gradients of magnetic and velocity shears, and the plasma density in Earth's magnetosphere. We predict that the signatures of reconnection and auroral particle acceleration should have a correlation with low plasma density, and a localized voltage drop (V_parallel) should often be associated with a localized magnetic stress concentration. Previous interpretations of the E_parallel generation are mostly based on the generalized Ohm's law, causing serious confusion in understanding the nature of reconnection and auroral acceleration.

  13. Climate system modeling on massively parallel systems: LDRD Project 95-ERP-47 final report

    SciTech Connect

    Mirin, A.A.; Dannevik, W.P.; Chan, B.; Duffy, P.B.; Eltgroth, P.G.; Wehner, M.F.

    1996-12-01

    Global warming, acid rain, ozone depletion, and biodiversity loss are some of the major climate-related issues presently being addressed by climate and environmental scientists. Because unexpected changes in the climate could have significant effect on our economy, it is vitally important to improve the scientific basis for understanding and predicting the earth's climate. The impracticality of modeling the earth experimentally in the laboratory together with the fact that the model equations are highly nonlinear has created a unique and vital role for computer-based climate experiments. However, today's computer models, when run at desired spatial and temporal resolution and physical complexity, severely overtax the capabilities of our most powerful computers. Parallel processing offers significant potential for attaining increased performance and making tractable simulations that cannot be performed today. The principal goals of this project have been to develop and demonstrate the capability to perform large-scale climate simulations on high-performance computing systems (using methodology that scales to the systems of tomorrow), and to carry out leading-edge scientific calculations using parallelized models. The demonstration platform for these studies has been the 256-processor Cray-T3D located at Lawrence Livermore National Laboratory. Our plan was to undertake an ambitious program in optimization, proof-of-principle and scientific study. These goals have been met. We are now regularly using massively parallel processors for scientific study of the ocean and atmosphere, and preliminary parallel coupled ocean/atmosphere calculations are being carried out as well. Furthermore, our work suggests that it should be possible to develop an advanced comprehensive climate system model with performance scalable to the teraflops range. 9 refs., 3 figs.

  14. Large-Scale Eigenvalue Calculations for Stability Analysis of Steady Flows on Massively Parallel Computers

    SciTech Connect

    Lehoucq, Richard B.; Salinger, Andrew G.

    1999-08-01

    We present an approach for determining the linear stability of steady states of PDEs on massively parallel computers. Linearizing the transient behavior around a steady state leads to a generalized eigenvalue problem. The eigenvalues with largest real part are calculated using Arnoldi's iteration driven by a novel implementation of the Cayley transformation to recast the problem as an ordinary eigenvalue problem. The Cayley transformation requires the solution of a linear system at each Arnoldi iteration, which must be done iteratively for the algorithm to scale with problem size. A representative model problem of 3D incompressible flow and heat transfer in a rotating disk reactor is used to analyze the effect of algorithmic parameters on the performance of the eigenvalue algorithm. Successful calculations of leading eigenvalues for matrix systems of order up to 4 million were performed, identifying the critical Grashof number for a Hopf bifurcation.
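
    The Cayley transformation recasts the generalized problem J x = lambda M x as an ordinary eigenproblem for C = (J - sigma*M)^-1 (J - mu*M), whose eigenvalues theta = (lambda - mu)/(lambda - sigma) make the rightmost lambda dominant under Arnoldi. A serial toy sketch, with a direct factorization standing in for the iterative solves needed at scale:

        import numpy as np
        from scipy.sparse import diags, identity
        from scipy.sparse.linalg import splu, eigs, LinearOperator

        n = 500
        J = diags([-2.0 * np.ones(n), np.ones(n - 1), np.ones(n - 1)],
                  [0, -1, 1])                 # stand-in Jacobian
        M = identity(n)                       # stand-in mass matrix
        sigma, mu = 1.0, -1.0                 # Cayley poles (problem-dependent)

        lu = splu((J - sigma * M).tocsc())    # one factorization, reused
        B = (J - mu * M).tocsr()
        C = LinearOperator((n, n), matvec=lambda x: lu.solve(B @ x))

        theta, V = eigs(C, k=5, which='LM')   # Arnoldi on the transform
        lam = (sigma * theta - mu) / (theta - 1)   # map back to lambda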

  15. Massively Parallel Interrogation of the Effects of Gene Expression Levels on Fitness.

    PubMed

    Keren, Leeat; Hausser, Jean; Lotan-Pompan, Maya; Vainberg Slutskin, Ilya; Alisar, Hadas; Kaminski, Sivan; Weinberger, Adina; Alon, Uri; Milo, Ron; Segal, Eran

    2016-08-25

    Data of gene expression levels across individuals, cell types, and disease states is expanding, yet our understanding of how expression levels impact phenotype is limited. Here, we present a massively parallel system for assaying the effect of gene expression levels on fitness in Saccharomyces cerevisiae by systematically altering the expression level of ∼100 genes at ∼100 distinct levels spanning a 500-fold range at high resolution. We show that the relationship between expression levels and growth is gene and environment specific and provides information on the function, stoichiometry, and interactions of genes. Wild-type expression levels in some conditions are not optimal for growth, and genes whose fitness is greatly affected by small changes in expression level tend to exhibit lower cell-to-cell variability in expression. Our study addresses a fundamental gap in understanding the functional significance of gene expression regulation and offers a framework for evaluating the phenotypic effects of expression variation. PMID:27545349

  16. The sensitivity of massively parallel sequencing for detecting candidate infectious agents associated with human tissue.

    PubMed

    Moore, Richard A; Warren, René L; Freeman, J Douglas; Gustavsen, Julia A; Chénard, Caroline; Friedman, Jan M; Suttle, Curtis A; Zhao, Yongjun; Holt, Robert A

    2011-01-01

    Massively parallel sequencing technology now provides the opportunity to sample the transcriptome of a given tissue comprehensively. Transcripts at only a few copies per cell are readily detectable, allowing the discovery of low abundance viral and bacterial transcripts in human tissue samples. Here we describe an approach for mining large sequence data sets for the presence of microbial sequences. Further, we demonstrate the sensitivity of this approach by sequencing human RNA-seq libraries spiked with decreasing amounts of an RNA-virus. At a modest depth of sequencing, viral transcripts can be detected at frequencies less than 1 in 1,000,000. With current sequencing platforms approaching outputs of one billion reads per run, this is a highly sensitive method for detecting putative infectious agents associated with human tissues. PMID:21603639

  17. Demonstration of EDA flow for massively parallel e-beam lithography

    NASA Astrophysics Data System (ADS)

    Brandt, P.; Belledent, J.; Tranquillin, C.; Figueiro, T.; Meunier, S.; Bayle, S.; Fay, A.; Milléquant, M.; Icard, B.; Wieland, M.

    2014-03-01

    The soaring complexity of pushing 193nm immersion lithography to its limits is driving the development of alternative technologies. One of these alternatives is mask-less massively parallel electron beam lithography (MP-EBL), a promising candidate that can meet future resolution needs at competitive cost. MAPPER Lithography's MATRIX MP-EBL platform has now entered an advanced stage of development. The first tool in this platform, the FLX 1200, will operate using more than 1,300 beams, each one writing a stripe 2.2μm wide, with a 0.2μm stripe-to-stripe overlap allocated for stitching. Each beam is composed of 49 individual sub-beams that can be blanked independently in order to write pixels onto the wafer in a raster scan.

  18. Simulating massively parallel electron beam inspection for sub-20 nm defects

    NASA Astrophysics Data System (ADS)

    Bunday, Benjamin D.; Mukhtar, Maseeh; Quoi, Kathy; Thiel, Brad; Malloy, Matt

    2015-03-01

    SEMATECH has initiated a program to develop massively-parallel electron beam defect inspection (MPEBI). Here we use JMONSEL simulations to generate expected imaging responses of chosen test cases of patterns and defects, with the ability to vary parameters for beam energy, spot size, pixel size, and/or defect material and form factor. The patterns are representative of the design rules for an aggressively-scaled FinFET-type design. With these simulated images and the resulting shot noise, a signal-to-noise framework is developed which relates to defect detection probabilities. Additionally, with this infrastructure the effects of detection chain noise and frequency-dependent system response can be modeled, allowing the best recipe parameters to be targeted for MPEBI validation experiments, ultimately leading to insights into how such parameters will impact MPEBI tool design, including necessary doses for defect detection and estimates of scanning speeds for achieving high throughput for HVM.
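
    A back-of-envelope version of such a signal-to-noise framework, assuming shot-noise-limited imaging (Poisson electron counts) and a Gaussian approximation for the detection probability; the doses and yields below are illustrative, not from the paper:

        import math

        def detection_snr(dose, yield_bg, yield_def):
            """SNR of a single-pixel defect at `dose` primary electrons."""
            s_bg, s_def = dose * yield_bg, dose * yield_def
            return abs(s_def - s_bg) / math.sqrt(s_bg + s_def)

        def p_detect(snr):
            """P(defect pixel beats a midpoint threshold), Gaussian approx."""
            return 0.5 * (1 + math.erf(snr / (2 * math.sqrt(2))))

        for dose in (50, 200, 800):             # electrons per pixel
            snr = detection_snr(dose, yield_bg=0.8, yield_def=0.6)
            print(dose, round(snr, 2), round(p_detect(snr), 4))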

  19. A Massively Parallel Sparse Eigensolver for Structural Dynamics Finite Element Analysis

    SciTech Connect

    Day, David M.; Reese, G.M.

    1999-05-01

    Eigenanalysis is a critical component of structural dynamics which is essential for determining the vibrational response of systems. This effort addresses the development of numerical algorithms associated with scalable eigensolver techniques suitable for use on massively parallel, distributed memory computers that are capable of solving large scale structural dynamics problems. An iterative Lanczos method was determined to be the best choice for the application. Scalability of the eigenproblem depends on scalability of the underlying linear solver. A multi-level solver (FETI) was selected as most promising for this component. Issues relating to heterogeneous materials, mechanisms and multipoint constraints have been examined, and the linear solver algorithm has been developed to incorporate features that result in a scalable, robust algorithm for practical structural dynamics applications. The resulting tools have been demonstrated on large problems representative of a weapons system.

  20. Inside the intraterrestrials: The deep biosphere seen through massively parallel sequencing

    NASA Astrophysics Data System (ADS)

    Biddle, J.

    2009-12-01

    Deeply buried marine sediments may house a large amount of the Earth’s microbial population. Initial studies based on 16S rRNA clone libraries suggest that these sediments contain unique phylotypes of microorganisms, particularly from the archaeal domain. Since this environment is so difficult to study, microbiologists are challenged to find ways to examine these populations remotely. A major approach taken to study this environment uses massively parallel sequencing to examine the inner genetic workings of these microorganisms after the sediment has been drilled. Both metagenomics and tagged amplicon sequencing have been employed on deep sediments, and initial results show that different geographic regions can be differentiated through genomics and also minor populations may cause major geochemical changes.

  1. Macro-scale phenomena of arterial coupled cells: a massively parallel simulation

    PubMed Central

    Shaikh, Mohsin Ahmed; Wall, David J. N.; David, Tim

    2012-01-01

    Impaired mass transfer characteristics of blood-borne vasoactive species such as adenosine triphosphate in regions such as an arterial bifurcation have been hypothesized as a prospective mechanism in the aetiology of atherosclerotic lesions. Arterial endothelial cells (ECs) and smooth muscle cells (SMCs) respond differentially to altered local haemodynamics and produce coordinated macro-scale responses via intercellular communication. Using a computationally designed arterial segment comprising large populations of mathematically modelled coupled ECs and SMCs, we investigate their response to spatial gradients of blood-borne agonist concentrations and the effect of micro-scale-driven perturbation on the macro-scale. Altering homocellular (between same cell type) and heterocellular (between different cell types) intercellular coupling, we simulated four cases of normal and pathological arterial segments experiencing an identical gradient in the concentration of the agonist. Results show that the heterocellular calcium (Ca2+) coupling between ECs and SMCs is important in eliciting a rapid response when the vessel segment is stimulated by the agonist gradient. In the absence of heterocellular coupling, homocellular Ca2+ coupling between SMCs is necessary for propagation of Ca2+ waves from downstream to upstream cells axially. Desynchronized intracellular Ca2+ oscillations in coupled SMCs are mandatory for this propagation. Upon decoupling the heterocellular membrane potential, the arterial segment loses the inhibitory effect of ECs on the Ca2+ dynamics of the underlying SMCs. The full system comprises hundreds of thousands of coupled nonlinear ordinary differential equations simulated on the massively parallel Blue Gene architecture. The use of massively parallel computational architectures shows the capability of this approach to address macro-scale phenomena driven by elementary micro-scale components of the system. PMID:21920960

  2. Massively parallel cis-regulatory analysis in the mammalian central nervous system

    PubMed Central

    Shen, Susan Q.; Myers, Connie A.; Hughes, Andrew E.O.; Byrne, Leah C.; Flannery, John G.; Corbo, Joseph C.

    2016-01-01

    Cis-regulatory elements (CREs, e.g., promoters and enhancers) regulate gene expression, and variants within CREs can modulate disease risk. Next-generation sequencing has enabled the rapid generation of genomic data that predict the locations of CREs, but a bottleneck lies in functionally interpreting these data. To address this issue, massively parallel reporter assays (MPRAs) have emerged, in which barcoded reporter libraries are introduced into cells, and the resulting barcoded transcripts are quantified by next-generation sequencing. Thus far, MPRAs have been largely restricted to assaying short CREs in a limited repertoire of cultured cell types. Here, we present two advances that extend the biological relevance and applicability of MPRAs. First, we adapt exome capture technology to instead capture candidate CREs, thereby tiling across the targeted regions and markedly increasing the length of CREs that can be readily assayed. Second, we package the library into adeno-associated virus (AAV), thereby allowing delivery to target organs in vivo. As a proof of concept, we introduce a capture library of about 46,000 constructs, corresponding to roughly 3500 DNase I hypersensitive (DHS) sites, into the mouse retina by ex vivo plasmid electroporation and into the mouse cerebral cortex by in vivo AAV injection. We demonstrate tissue-specific cis-regulatory activity of DHSs and provide examples of high-resolution truncation mutation analysis for multiplex parsing of CREs. Our approach should enable massively parallel functional analysis of a wide range of CREs in any organ or species that can be infected by AAV, such as nonhuman primates and human stem cell–derived organoids. PMID:26576614
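
    The quantification step common to MPRAs fits in a few lines: map sequenced barcodes to their CREs and estimate activity as the ratio of RNA (transcribed) to DNA (delivered) counts. Barcode assignments and read counts here are hypothetical:

        from collections import Counter

        barcode_to_cre = {"ACGT": "DHS_1", "TTAG": "DHS_1", "GGCA": "DHS_2"}
        rna_reads = ["ACGT"] * 90 + ["TTAG"] * 70 + ["GGCA"] * 5
        dna_reads = ["ACGT"] * 40 + ["TTAG"] * 45 + ["GGCA"] * 50

        def cre_counts(reads):
            counts = Counter()
            for bc in reads:
                if bc in barcode_to_cre:        # skip unassigned barcodes
                    counts[barcode_to_cre[bc]] += 1
            return counts

        rna, dna = cre_counts(rna_reads), cre_counts(dna_reads)
        activity = {cre: rna[cre] / dna[cre] for cre in dna}
        # DHS_1 (~1.9) behaves as active, DHS_2 (~0.1) as inactive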

  3. Architecture for next-generation massively parallel maskless lithography system (MPML2)

    NASA Astrophysics Data System (ADS)

    Su, Ming-Shing; Tsai, Kuen-Yu; Lu, Yi-Chang; Kuo, Yu-Hsuan; Pei, Ting-Hang; Yen, Jia-Yush

    2010-03-01

    Electron-beam lithography is promising for future manufacturing technology because it does not suffer from the wavelength limits set by light sources. Since single electron-beam lithography systems share a common throughput problem, a multi-electron-beam lithography (MEBL) system is a feasible alternative that exploits massive parallelism. In this paper, we evaluate the advantages and disadvantages of different MEBL system architectures and propose our novel Massively Parallel MaskLess Lithography System, MPML2. The MPML2 system targets cost-effective manufacturing at the 32nm node and beyond. The key structure of the proposed system is its beamlet array cells (BACs). Hundreds of BACs are uniformly arranged over the whole wafer area in the proposed system. Each BAC has a data processor and an array of beamlets, and each beamlet consists of an electron-beam source, a source controller, a set of electron lenses, a blanker, a deflector, and an electron detector. These essential parts of the beamlets are integrated using MEMS technology, which increases the density of beamlets and reduces the system cost. The data processor in a BAC processes layout information coming from off-chamber and dispatches it to the corresponding beamlet to control its ON/OFF status. Maskless lithography avoids the high manufacturing cost of masks; however, immense mask data must be handled and transmitted, so a data compression technique is applied to reduce the required transmission bandwidth. The compression algorithm is fast and efficient, so the real-time decoder can be implemented on-chip. Consequently, the proposed MPML2 can achieve 10 wafers per hour (WPH) throughput for 300mm-wafer systems.
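
    The abstract does not specify the compression codec, so as a stand-in the sketch below uses run-length encoding of a binary pixel stream to illustrate the encode-off-tool / decode-on-chip split:

        def rle_encode(bits):
            """Compress a 0/1 pixel stream to its first value + run lengths."""
            runs, current, length = [], bits[0], 0
            for b in bits:
                if b == current:
                    length += 1
                else:
                    runs.append(length)
                    current, length = b, 1
            runs.append(length)
            return bits[0], runs

        def rle_decode(first, runs):
            """On-chip side: expand run lengths back into pixels."""
            out, value = [], first
            for length in runs:
                out.extend([value] * length)
                value ^= 1                      # binary stream: toggle 0/1
            return out

        row = [0] * 30 + [1] * 10 + [0] * 24    # one raster row of pixels
        assert rle_decode(*rle_encode(row)) == row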

  4. Massively parallel simulation with DOE's ASCI supercomputers : an overview of the Los Alamos Crestone project

    SciTech Connect

    Weaver, R. P.; Gittings, M. L.

    2004-01-01

    The Los Alamos Crestone Project is part of the Department of Energy's (DOE) Accelerated Strategic Computing Initiative, or ASCI Program. The main goal of this software development project is to investigate the use of continuous adaptive mesh refinement (CAMR) techniques for application to problems of interest to the Laboratory. There are many code development efforts in the Crestone Project, both unclassified and classified codes. In this overview I will discuss the unclassified SAGE and RAGE codes. The SAGE (SAIC adaptive grid Eulerian) code is a one-, two-, and three-dimensional multimaterial Eulerian massively parallel hydrodynamics code for use in solving a variety of high-deformation flow problems. The RAGE CAMR code is built from the SAGE code by adding various radiation packages, improved setup utilities and graphics packages, and is used for problems in which radiation transport of energy is important. The goal of these massively-parallel versions of the codes is to run extremely large problems in a reasonable amount of calendar time. Our target is scalable performance to ~10,000 processors on a 1 billion CAMR computational cell problem that requires hundreds of variables per cell, multiple physics packages (e.g. radiation and hydrodynamics), and implicit matrix solves for each cycle. A general description of the RAGE code has been published in [1], [2], [3] and [4]. Currently, the largest simulations we do are three-dimensional, using around 500 million computation cells and running for literally months of calendar time using ~2,000 processors. Current ASCI platforms range from several 3-teraOPS supercomputers to one 12-teraOPS machine at Lawrence Livermore National Laboratory, the White machine, and one 20-teraOPS machine installed at Los Alamos, the Q machine. Each machine is a system comprised of many component parts that must perform in unity for the successful run of these simulations. Key features of any massively parallel system

  5. Massive parallelization of a 3D finite difference electromagnetic forward solution using domain decomposition methods on multiple CUDA enabled GPUs

    NASA Astrophysics Data System (ADS)

    Schultz, A.

    2010-12-01

    3D forward solvers lie at the core of inverse formulations used to image the variation of electrical conductivity within the Earth's interior. This property is associated with variations in temperature, composition, phase, presence of volatiles, and in specific settings, the presence of groundwater, geothermal resources, oil/gas or minerals. The high cost of 3D solutions has been a stumbling block to wider adoption of 3D methods. Parallel algorithms for modeling frequency domain 3D EM problems have not achieved wide scale adoption, with emphasis on fairly coarse grained parallelism using MPI and similar approaches. The communications bandwidth as well as the latency required to send and receive network communication packets is a limiting factor in implementing fine grained parallel strategies, inhibiting wide adoption of these algorithms. Leading Graphics Processor Unit (GPU) companies now produce GPUs with hundreds of GPU processor cores per die. The footprint, in silicon, of the GPU's restricted instruction set is much smaller than the general purpose instruction set required of a CPU. Consequently, the density of processor cores on a GPU can be much greater than on a CPU. GPUs also have local memory, registers and high speed communication with host CPUs, usually through PCIe type interconnects. The extremely low cost and high computational power of GPUs provides the EM geophysics community with an opportunity to achieve fine grained (i.e. massive) parallelization of codes on low cost hardware. The current generation of GPUs (e.g. NVidia Fermi) provides 3 billion transistors per chip die, with nearly 500 processor cores and up to 6 GB of fast (GDDR5) GPU memory. This latest generation of GPU supports fast hardware double precision (64 bit) floating point operations of the type required for frequency domain EM forward solutions. Each Fermi GPU board can sustain nearly 1 TFLOP in double precision, and multiple boards can be installed in the host computer system.

  6. Integration Architecture of Content Addressable Memory and Massive-Parallel Memory-Embedded SIMD Matrix for Versatile Multimedia Processor

    NASA Astrophysics Data System (ADS)

    Kumaki, Takeshi; Ishizaki, Masakatsu; Koide, Tetsushi; Mattausch, Hans Jürgen; Kuroda, Yasuto; Gyohten, Takayuki; Noda, Hideyuki; Dosaka, Katsumi; Arimoto, Kazutami; Saito, Kazunori

    This paper presents an integration architecture of content addressable memory (CAM) and a massive-parallel memory-embedded SIMD matrix for constructing a versatile multimedia processor. The massive-parallel memory-embedded SIMD matrix has 2,048 2-bit processing elements, which are connected by a flexible switching network, and supports 2-bit 2,048-way bit-serial and word-parallel operations with a single command. The SIMD matrix architecture is verified to be well suited to the repeated arithmetic operation types found in multimedia applications. The architecture proposed in this paper additionally exploits CAM technology and therefore enables fast pipelined table-lookup coding operations. Since both arithmetic and table-lookup operations execute extremely fast, the proposed novel architecture can realize efficient and versatile multimedia data processing. Evaluation results of the proposed CAM-enhanced massive-parallel SIMD matrix processor for the frequently used JPEG image-compression application show that the necessary clock cycle count can be reduced by 86% in comparison to a conventional mobile DSP architecture. The determined performance in Mpixel/mm2 is a factor of 3.3 and 4.4 better than a CAM-less massive-parallel memory-embedded SIMD matrix processor and a conventional mobile DSP, respectively.

  7. Massive Exploration of Perturbed Conditions of the Blood Coagulation Cascade through GPU Parallelization

    PubMed Central

    Cazzaniga, Paolo; Nobile, Marco S.; Besozzi, Daniela; Bellini, Matteo; Mauri, Giancarlo

    2014-01-01

    The introduction of general-purpose Graphics Processing Units (GPUs) is boosting scientific applications in Bioinformatics, Systems Biology, and Computational Biology. In these fields, the use of high-performance computing solutions is motivated by the need to perform large numbers of in silico analyses to study the behavior of biological systems in different conditions, which demand computing power that usually exceeds the capability of standard desktop computers. In this work we present coagSODA, a CUDA-powered computational tool that was purposely developed for the analysis of a large mechanistic model of the blood coagulation cascade (BCC), defined according to both mass-action kinetics and Hill functions. coagSODA allows the execution of parallel simulations of the dynamics of the BCC by automatically deriving the system of ordinary differential equations and then exploiting the numerical integration algorithm LSODA. We present the biological results achieved with a massive exploration of perturbed conditions of the BCC, carried out with one-dimensional and bi-dimensional parameter sweep analyses, and show that GPU-accelerated parallel simulations of this model can increase the computational performance up to a 181× speedup compared to the corresponding sequential simulations. PMID:25025072
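
    A CPU stand-in for the GPU sweep described, assuming a toy mass-action model (a reversible A + B <-> C, not the coagulation cascade) integrated with LSODA for many perturbed rate constants in parallel:

        import numpy as np
        from multiprocessing import Pool
        from scipy.integrate import solve_ivp

        def simulate(k1, k2=0.5, y0=(1.0, 1.0, 0.0), t_end=10.0):
            def rhs(t, y):
                a, b, c = y
                v = k1 * a * b - k2 * c         # net forward reaction rate
                return [-v, -v, v]
            sol = solve_ivp(rhs, (0, t_end), y0, method="LSODA")
            return sol.y[2, -1]                 # final concentration of C

        if __name__ == "__main__":
            k1_values = np.linspace(0.1, 5.0, 64)   # 1D parameter sweep
            with Pool() as pool:
                final_c = pool.map(simulate, k1_values)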

  8. Practical Realization of Massively Parallel Fiber - Free-Space Optical Interconnects

    NASA Astrophysics Data System (ADS)

    Gruber, Matthias; Jahns, Jürgen; El Joudi, El Mehdi; Sinzinger, Stefan

    2001-06-01

    We propose a novel approach to realizing massively parallel optical interconnects based on commercially available multifiber ribbons with MT-type connectors and custom-designed planar-integrated free-space components. It combines the advantages of fiber optics, that is, a long range and convenient and flexible installation, with those of (planar-integrated) free-space optics, that is, a wide range of implementable functions and a high potential for integration and parallelization. For the interface between fibers and free-space optical systems a low-cost practical solution is presented. It consists of using a metal connector plate that was manufactured on a computer-controlled milling machine. Channel densities are of the order of 100/mm2 between optoelectronic VLSI chips and the free-space optical systems and 1/mm2 between the free-space optical systems and MT-type fiber connectors. Experiments in combination with specially designed planar-integrated test systems prove that multiple one-to-one and one-to-many interconnects can be established with not more than 10% uniformity error.

  9. Measures of effectiveness for BMD mid-course tracking on MIMD massively parallel computers

    SciTech Connect

    VanDyke, J.P.; Tomkins, J.L.; Furnish, M.D.

    1995-05-01

    The TRC code, a mid-course tracking code for ballistic missiles, has previously been implemented on a 1024-processor MIMD (Multiple Instruction -- Multiple Data) massively parallel computer. Measures of Effectiveness (MOE) for this algorithm have been developed for this computing environment. The MOE code is run in parallel with the TRC code. Particularly useful MOEs include the number of missed objects (real objects for which the TRC algorithm did not construct a track); of ghost tracks (tracks not corresponding to a real object); of redundant tracks (multiple tracks corresponding to a single real object); and of unresolved objects (multiple objects corresponding to a single track). All of these are expressed as a function of time, and tend to maximize during the time in which real objects are spawned (multiple reentry vehicles per post-boost vehicle). As well, it is possible to measure the track-truth separation as a function of time. A set of calculations is presented illustrating these MOEs as a function of time for a case with 99 post-boost vehicles, each of which spawns 9 reentry vehicles.
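
    The listed MOEs follow directly from a track-to-truth assignment at each time step. A minimal sketch over one hypothetical assignment snapshot:

        from collections import Counter

        def track_moes(matches, n_objects, n_tracks):
            """matches: (track_id, object_id) pairs at one time step."""
            objs = Counter(obj for _, obj in matches)   # tracks per object
            trks = Counter(trk for trk, _ in matches)   # objects per track
            missed = n_objects - len(objs)          # objects with no track
            ghosts = n_tracks - len(trks)           # tracks matching nothing
            redundant = sum(c - 1 for c in objs.values())
            unresolved = sum(c - 1 for c in trks.values())
            return missed, ghosts, redundant, unresolved

        matches = [(4, 0), (6, 0), (5, 2), (5, 3)]
        print(track_moes(matches, n_objects=5, n_tracks=4))   # (2, 1, 1, 1)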

  10. Massive parallel IGHV gene sequencing reveals a germinal center pathway in origins of human multiple myeloma

    PubMed Central

    Bryant, Dean; Seckinger, Anja; Hose, Dirk; Zojer, Niklas; Sahota, Surinder S.

    2015-01-01

    Human multiple myeloma (MM) is characterized by accumulation of malignant terminally differentiated plasma cells (PCs) in the bone marrow (BM), raising the question when during maturation neoplastic transformation begins. Immunoglobulin IGHV genes carry imprints of clonal tumor history, delineating somatic hypermutation (SHM) events that generally occur in the germinal center (GC). Here, we examine MM-derived IGHV genes using massive parallel deep sequencing, comparing them with profiles in normal BM PCs. In 4/4 presentation IgG MM, monoclonal tumor-derived IGHV sequences revealed significant evidence for intraclonal variation (ICV) in mutation patterns. IGHV sequences of 2/2 normal PC IgG populations revealed dominant oligoclonal expansions, each expansion also displaying mutational ICV. Clonal expansions in MM and in normal BM PCs reveal common IGHV features. In such MM, the data fit a model of tumor origins in which neoplastic transformation is initiated in a GC B-cell committed to terminal differentiation but still targeted by on-going SHM. Strikingly, the data parallel IGHV clonal sequences in some monoclonal gammopathy of undetermined significance (MGUS) known to display on-going SHM imprints. Since MGUS generally precedes MM, these data suggest origins of MGUS and MM with IGHV gene mutational ICV from the same GC B-cell, arising via a distinctive pathway. PMID:25929340

  11. The divide-expand-consolidate MP2 scheme goes massively parallel

    NASA Astrophysics Data System (ADS)

    Kristensen, Kasper; Kjærgaard, Thomas; Høyvik, Ida-Marie; Ettenhuber, Patrick; Jørgensen, Poul; Jansik, Branislav; Reine, Simen; Jakowski, Jacek

    2013-07-01

    For large molecular systems conventional implementations of second order Møller-Plesset (MP2) theory encounter a scaling wall, both memory- and time-wise. We describe how this scaling wall can be removed. We present a massively parallel algorithm for calculating MP2 energies and densities using the divide-expand-consolidate scheme, where a calculation on a large system is divided into many small fragment calculations employing local orbital spaces. The resulting algorithm is linear-scaling with system size, exhibits near perfect parallel scalability, removes memory bottlenecks and does not involve any I/O. The algorithm employs three levels of parallelisation combined via a dynamic job distribution scheme. Results on two molecular systems containing 528 and 1056 atoms (4278 and 8556 basis functions) using 47,120 and 94,240 cores are presented. The results demonstrate the scalability of the algorithm both with respect to the number of cores and with respect to system size. The presented algorithm is thus highly suited for large supercomputer architectures and allows MP2 calculations on large molecular systems to be carried out within a few hours - for example, the correlated calculation on the molecular system containing 1056 atoms took 2.37 hours using 94,240 cores.
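
    The dynamic job distribution can be sketched as a master-worker loop over independent fragment calculations; run_fragment below is a placeholder for a local-orbital fragment energy, and the script would be launched with, e.g., mpiexec -n 16:

        from mpi4py import MPI

        def run_fragment(i):
            return i * 1.0e-3       # placeholder fragment correlation energy

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        NFRAG, DONE = 1000, -1

        if rank == 0:               # master: hand out fragments on demand
            energy, next_job, active = 0.0, 0, comm.Get_size() - 1
            while active:
                e, src = comm.recv(source=MPI.ANY_SOURCE)
                energy += e
                if next_job < NFRAG:
                    comm.send(next_job, dest=src)
                    next_job += 1
                else:
                    comm.send(DONE, dest=src)
                    active -= 1
            print("total correlation energy:", energy)
        else:                       # worker: request, compute, repeat
            comm.send((0.0, rank), dest=0)
            while (job := comm.recv(source=0)) != DONE:
                comm.send((run_fragment(job), rank), dest=0)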

  12. Rigid body constraints realized in massively-parallel molecular dynamics on graphics processing units

    NASA Astrophysics Data System (ADS)

    Nguyen, Trung Dac; Phillips, Carolyn L.; Anderson, Joshua A.; Glotzer, Sharon C.

    2011-11-01

    Molecular dynamics (MD) methods compute the trajectory of a system of point particles in response to a potential function by numerically integrating Newton's equations of motion. Extending these basic methods with rigid body constraints enables composite particles with complex shapes such as anisotropic nanoparticles, grains, molecules, and rigid proteins to be modeled. Rigid body constraints are added to the GPU-accelerated MD package, HOOMD-blue, version 0.10.0. The software can now simulate systems of particles, rigid bodies, or mixed systems in microcanonical (NVE), canonical (NVT), and isothermal-isobaric (NPT) ensembles. It can also apply the FIRE energy minimization technique to these systems. In this paper, we detail the massively parallel scheme that implements these algorithms and discuss how our design is tuned for the maximum possible performance. Two different case studies are included to demonstrate the performance attained, patchy spheres and tethered nanorods. In typical cases, HOOMD-blue on a single GTX 480 executes 2.5-3.6 times faster than LAMMPS executing the same simulation on any number of CPU cores in parallel. Simulations with rigid bodies may now be run with larger systems and for longer time scales on a single workstation than was previously even possible on large clusters.

  13. A two-phase thermal model for subsurface transport on massively parallel computers

    SciTech Connect

    Martinez, M.J.; Hopkins, P.L.

    1997-12-01

    Many research activities in subsurface transport require the numerical simulation of multiphase flow in porous media. This capability is critical to research in environmental remediation (e.g., contamination with dense non-aqueous-phase liquids), nuclear waste management, reservoir engineering, and to the assessment of the future availability of groundwater in many parts of the world. This paper presents an unstructured grid numerical algorithm for subsurface transport in heterogeneous porous media implemented for use on massively parallel (MP) computers. The mathematical model considers nonisothermal two-phase (liquid/gas) flow, including capillary pressure effects, binary diffusion in the gas phase, and conductive, latent, and sensible heat transport. The Galerkin finite element method is used for spatial discretization, and temporal integration is accomplished via a predictor/corrector scheme. Message-passing and domain decomposition techniques are used for implementing a scalable algorithm for distributed memory parallel computers. Illustrative applications are shown to demonstrate capabilities and performance, one of which is modeling hydrothermal transport at the Yucca Mountain site for a radioactive waste facility.
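
    A generic predictor/corrector time step of the kind mentioned, sketched here as Heun's method (explicit Euler predictor, trapezoidal corrector) on a simple decay equation; the production scheme is of course far more involved:

        def heun_step(f, t, y, dt):
            y_pred = y + dt * f(t, y)                   # predictor
            return y + 0.5 * dt * (f(t, y) + f(t + dt, y_pred))  # corrector

        # integrate dy/dt = -y from y(0) = 1
        y, t, dt = 1.0, 0.0, 0.1
        for _ in range(100):
            y, t = heun_step(lambda t, y: -y, t, y, dt), t + dt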

  14. GPAW - massively parallel electronic structure calculations with Python-based software.

    SciTech Connect

    Enkovaara, J.; Romero, N.; Shende, S.; Mortensen, J.

    2011-01-01

    Electronic structure calculations are a widely used tool in materials science and a large consumer of supercomputing resources. Traditionally, the software packages for these kinds of simulations have been implemented in compiled languages, where Fortran in its different versions has been the most popular choice. While dynamic, interpreted languages such as Python can increase programmer efficiency, they cannot compete directly with the raw performance of compiled languages. However, by using an interpreted language together with a compiled language, it is possible to have most of the productivity-enhancing features together with good numerical performance. We have used this approach in implementing the electronic structure simulation software GPAW using a combination of the Python and C programming languages. While the chosen approach works well on standard workstations and in Unix environments, massively parallel supercomputing systems can present challenges in porting, debugging and profiling the software. In this paper we describe some details of the implementation and discuss the advantages and challenges of the combined Python/C approach. We show that despite the challenges it is possible to obtain good numerical performance and good parallel scalability with Python-based software.
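
    The mixed-language approach can be shown in miniature: Python orchestrates while compiled code does the numerics. Here the system C math library stands in for GPAW's own C extensions (Unix assumed):

        import ctypes
        import ctypes.util

        # load the platform's compiled math library and declare cos()
        libm = ctypes.CDLL(ctypes.util.find_library("m"))
        libm.cos.restype = ctypes.c_double
        libm.cos.argtypes = [ctypes.c_double]

        print(libm.cos(0.0))    # 1.0, computed in compiled C code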

  15. Multi-mode sensor processing on a dynamically reconfigurable massively parallel processor array

    NASA Astrophysics Data System (ADS)

    Chen, Paul; Butts, Mike; Budlong, Brad; Wasson, Paul

    2008-04-01

    This paper introduces a novel computing architecture that can be reconfigured in real time to adapt on demand to multi-mode sensor platforms' dynamic computational and functional requirements. This 1 teraOPS reconfigurable Massively Parallel Processor Array (MPPA) has 336 32-bit processors. The programmable 32-bit communication fabric provides streamlined inter-processor connections with deterministically high performance. Software programmability, scalability, ease of use, and fast reconfiguration time (ranging from microseconds to milliseconds) are the most significant advantages over FPGAs and DSPs. This paper introduces the MPPA architecture, its programming model, and methods of reconfigurability. An MPPA platform for reconfigurable computing is based on a structural object programming model. Objects are software programs running concurrently on hundreds of 32-bit RISC processors and memories. They exchange data and control through a network of self-synchronizing channels. A common application design pattern on this platform, called a work farm, is a parallel set of worker objects, with one input and one output stream. Statically configured work farms with homogeneous and heterogeneous sets of workers have been used in video compression and decompression, network processing, and graphics applications.
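
    The work farm pattern described maps naturally onto ordinary OS processes for illustration: one input stream fans out to a set of identical workers, and results merge back into one ordered output stream:

        from multiprocessing import Pool

        def worker(block):
            return sum(block) / len(block)  # stand-in for, e.g., a block DCT

        if __name__ == "__main__":
            stream_in = ([i] * 64 for i in range(1000))   # input stream
            with Pool(processes=8) as farm:               # the worker farm
                for result in farm.imap(worker, stream_in, chunksize=16):
                    pass                                  # output stream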

  16. A Lightweight Remote Parallel Visualization Platform for Interactive Massive Time-varying Climate Data Analysis

    NASA Astrophysics Data System (ADS)

    Li, J.; Zhang, T.; Huang, Q.; Liu, Q.

    2014-12-01

    Today's climate datasets feature large volumes, a high degree of spatiotemporal complexity, and rapid evolution over time. Because visualizing large distributed climate datasets is computationally intensive, traditional desktop-based visualization applications cannot handle the load. Recently, scientists have developed remote visualization techniques to address this computational issue. Remote visualization techniques usually leverage server-side parallel computing capabilities to perform visualization tasks and deliver the results to clients through the network. In this research, we aim to build a remote parallel visualization platform for visualizing and analyzing massive climate data. Our platform is built on ParaView, one of the most popular open source remote visualization and analysis applications. To further enhance the scalability and stability of the platform, we have employed cloud computing techniques to support its deployment. In this platform, all climate datasets are regular grid data stored in NetCDF format. Three types of data access methods are supported: accessing remote datasets provided by OpenDAP servers, accessing datasets hosted on the web visualization server, and accessing local datasets. Regardless of the access method, all visualization tasks are completed on the server side to reduce the workload of clients. As a proof of concept, we have implemented a set of scientific visualization methods to show the feasibility of the platform. Preliminary results indicate that the framework can address the computational limitations of desktop-based visualization applications.
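
    Of the three access methods, the OPeNDAP path is the easiest to sketch: the netCDF4 library accepts a DAP URL in place of a file name, and slicing pulls only the requested subset over the network. The URL and variable names below are placeholders, and a netCDF build with DAP support is assumed:

        from netCDF4 import Dataset

        url = "http://example.org/opendap/climate/tas_monthly.nc"  # placeholder
        with Dataset(url) as ds:
            tas = ds.variables["tas"]       # (time, lat, lon) temperature
            recent = tas[-12:, :, :]        # server subsets to the last year
            print(recent.shape, float(recent.mean()))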

  17. Automation of Molecular-Based Analyses: A Primer on Massively Parallel Sequencing

    PubMed Central

    Nguyen, Lan; Burnett, Leslie

    2014-01-01

    Recent advances in genetics have been enabled by new genetic sequencing techniques called massively parallel sequencing (MPS) or next-generation sequencing. Through the ability to sequence in parallel hundreds of thousands to millions of DNA fragments, the cost and time required for sequencing have dramatically decreased. There are a number of different MPS platforms currently available and being used in Australia. Although they differ in the underlying technology involved, their overall processes are very similar: DNA fragmentation, adaptor ligation, immobilisation, amplification, sequencing reaction and data analysis. MPS is being used in research, translational and increasingly now also in clinical settings. Common applications include sequencing of whole genomes, whole exomes or targeted genes for disease-causing gene discovery, genetic diagnosis and targeted cancer therapy. Even though the revolution that is occurring with MPS is exciting due to its increasing use, improving and emerging technologies and new applications, significant challenges still exist. Particularly challenging issues are the bioinformatics required for data analysis, interpretation of results and the ethical dilemma of ‘incidental findings’. PMID:25336762

  18. Genetic algorithm based task reordering to improve the performance of batch scheduled massively parallel scientific applications

    DOE PAGES Beta

    Sankaran, Ramanan; Angel, Jordan; Brown, W. Michael

    2015-04-08

    The growth in size of networked high performance computers along with novel accelerator-based node architectures has further emphasized the importance of communication efficiency in high performance computing. The world's largest high performance computers are usually operated as shared user facilities due to the costs of acquisition and operation. Applications are scheduled for execution in a shared environment and are placed on nodes that are not necessarily contiguous on the interconnect. Furthermore, the placement of tasks on the nodes allocated by the scheduler is sub-optimal, leading to performance loss and variability. Here, we investigate the impact of task placement on the performance of two massively parallel application codes on the Titan supercomputer, a turbulent combustion flow solver (S3D) and a molecular dynamics code (LAMMPS). Benchmark studies show a significant deviation from ideal weak scaling and variability in performance. The inter-task communication distance was determined to be one of the significant contributors to the performance degradation and variability. A genetic algorithm-based parallel optimization technique was used to optimize the task ordering. This technique provides an improved placement of the tasks on the nodes, taking into account the application's communication topology and the system interconnect topology. As a result, application benchmarks after task reordering through genetic algorithm show a significant improvement in performance and reduction in variability, therefore enabling the applications to achieve better time to solution and scalability on Titan during production.
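
    A toy version of the permutation search: a mutation-only genetic algorithm placing heavily communicating task pairs close together under a simplified linear-layout cost model (the real cost model uses the interconnect topology):

        import random

        NTASKS = 32
        comm_pairs = [(i, (i + 1) % NTASKS) for i in range(NTASKS)]  # ring

        def cost(perm):             # total communication distance
            return sum(abs(perm[a] - perm[b]) for a, b in comm_pairs)

        def mutate(perm):           # swap two tasks' node assignments
            child = perm[:]
            i, j = random.sample(range(NTASKS), 2)
            child[i], child[j] = child[j], child[i]
            return child

        pop = [random.sample(range(NTASKS), NTASKS) for _ in range(50)]
        for _ in range(200):        # keep the 10 best, refill by mutation
            pop.sort(key=cost)
            pop = pop[:10] + [mutate(random.choice(pop[:10])) for _ in range(40)]
        best = min(pop, key=cost)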

  19. Genetic algorithm based task reordering to improve the performance of batch scheduled massively parallel scientific applications

    SciTech Connect

    Sankaran, Ramanan; Angel, Jordan; Brown, W. Michael

    2015-04-08

    The growth in size of networked high performance computers along with novel accelerator-based node architectures has further emphasized the importance of communication efficiency in high performance computing. The world's largest high performance computers are usually operated as shared user facilities due to the costs of acquisition and operation. Applications are scheduled for execution in a shared environment and are placed on nodes that are not necessarily contiguous on the interconnect. Furthermore, the placement of tasks on the nodes allocated by the scheduler is sub-optimal, leading to performance loss and variability. Here, we investigate the impact of task placement on the performance of two massively parallel application codes on the Titan supercomputer, a turbulent combustion flow solver (S3D) and a molecular dynamics code (LAMMPS). Benchmark studies show a significant deviation from ideal weak scaling and variability in performance. The inter-task communication distance was determined to be one of the significant contributors to the performance degradation and variability. A genetic algorithm-based parallel optimization technique was used to optimize the task ordering. This technique provides an improved placement of the tasks on the nodes, taking into account the application's communication topology and the system interconnect topology. As a result, application benchmarks after task reordering through genetic algorithm show a significant improvement in performance and reduction in variability, therefore enabling the applications to achieve better time to solution and scalability on Titan during production.

  20. Field-Scale, Massively Parallel Simulation of Production from Oceanic Gas Hydrate Deposits

    NASA Astrophysics Data System (ADS)

    Reagan, M. T.; Moridis, G. J.; Freeman, C. M.; Pan, L.; Boyle, K. L.; Johnson, J. N.; Husebo, J. A.

    2012-12-01

    The quantity of hydrocarbon gases trapped in natural hydrate accumulations is enormous, leading to significant interest in the evaluation of their potential as an energy source. It has been shown that large volumes of gas can be readily produced at high rates for long times from some types of methane hydrate accumulations by means of depressurization-induced dissociation, and using conventional technologies with horizontal or vertical well configurations. However, these systems are currently assessed using simplified or reduced-scale 3D or even 2D production simulations. In this study, we use the massively parallel TOUGH+HYDRATE code (pT+H) to assess the production potential of a large, deep-ocean hydrate reservoir and develop strategies for effective production. The simulations model a full 3D system of over 24 km2 extent, examining the productivity of vertical and horizontal wells, single or multiple wells, and exploring variations in reservoir properties. Systems of up to 2.5M gridblocks, running on thousands of supercomputing nodes, are required to simulate such large systems at the highest level of detail. The simulations reveal the challenges inherent in producing from deep, relatively cold systems with extensive water-bearing channels and connectivity to large aquifers, including the difficulty of achieving depressurization, the challenges of high water removal rates, and the complexity of production design. Also highlighted are new frontiers in large-scale reservoir simulation of coupled flow, transport, thermodynamics, and phase behavior, including the construction of large meshes, the use of parallel numerical solvers and MPI, and large-scale, parallel 3D visualization of results.

  1. KINETIC ALFVEN TURBULENCE AND PARALLEL ELECTRIC FIELDS IN FLARE LOOPS

    SciTech Connect

    Zhao, J. S.; Wu, D. J.; Lu, J. Y.

    2013-04-20

    This study investigates the spectral structure of kinetic Alfven turbulence in low-beta plasmas. We consider a strong turbulence resulting from collisions between counterpropagating wave packets with equal energy. Our results show that (1) the spectra of the magnetic and electric field fluctuations display a transition at the electron inertial length scale, (2) the turbulence cascades mainly toward the magnetic field direction when the cascade scale is smaller than the electron inertial length, and (3) the parallel electric field increases as the turbulent scale decreases. We also show that the parallel electric field in solar flare loops can be 10^2-10^4 times the Dreicer field as the turbulence reaches the electron inertial length scale.

  2. MicroRNA transcriptome in the newborn mouse ovaries determined by massive parallel sequencing.

    PubMed

    Ahn, Hyo Won; Morin, Ryan D; Zhao, Han; Harris, Ronald A; Coarfa, Cristian; Chen, Zi-Jiang; Milosavljevic, Aleksandar; Marra, Marco A; Rajkovic, Aleksandar

    2010-07-01

    Small non-coding RNAs, such as microRNAs (miRNAs), are involved in diverse biological processes including organ development and tissue differentiation. Global disruption of miRNA biogenesis in Dicer knockout mice disrupts early embryogenesis and primordial germ cell formation. However, the role of miRNAs in early folliculogenesis is poorly understood. In order to identify a full transcriptome set of small RNAs expressed in the newborn (NB) ovary, we extracted the small RNA fraction from mouse NB ovary tissues and subjected it to massive parallel sequencing using the Genome Analyzer from Illumina. Massive sequencing produced 4 655 992 reads of 33 bp each, representing a total of 154 Mbp of sequence data. The Pash alignment algorithm mapped 50.13% of the reads to the mouse genome. Sequence reads were clustered based on overlapping mapping coordinates and intersected with known miRNAs, small nucleolar RNAs (snoRNAs), piwi-interacting RNA (piRNA) clusters and repetitive genomic regions; 25.2% of the reads mapped to known miRNAs, 25.5% to genomic repeats, 3.5% to piRNAs and 0.18% to snoRNAs. Three hundred and ninety-eight known miRNA species were among the sequenced small RNAs, as well as 118 isomiR sequences not present in the miRBase database. The let-7 family was the most abundantly expressed miRNA family, and the mmu-mir-672, mmu-mir-322, mmu-mir-503 and mmu-mir-465 families were the most abundant X-linked miRNAs detected. The X-linked mmu-mir-503, mmu-mir-672 and mmu-mir-465 families showed preferential expression in testes and ovaries. We also identified four novel miRNAs that are preferentially expressed in gonads. Gonadal selective miRNAs may play important roles in ovarian development, folliculogenesis and female fertility. PMID:20215419

  3. Wideband aperture array using RF channelizers and massively parallel digital 2D IIR filterbank

    NASA Astrophysics Data System (ADS)

    Sengupta, Arindam; Madanayake, Arjuna; Gómez-García, Roberto; Engeberg, Erik D.

    2014-05-01

    Wideband receive-mode beamforming applications in wireless location, electronically-scanned antennas for radar, RF sensing, microwave imaging and wireless communications require digital aperture arrays that offer a relatively constant far-field beam over several octaves of bandwidth. Several beamforming schemes, including the well-known true time-delay and phased array beamformers, have been realized using either finite impulse response (FIR) or fast Fourier transform (FFT) digital filter-sum based techniques. These beamforming algorithms offer the desired selectivity at the cost of high computational complexity and frequency-dependent far-field array patterns. A novel approach to receiver beamforming is the use of massively parallel 2-D infinite impulse response (IIR) fan filterbanks for the synthesis of relatively frequency-independent RF beams at an order of magnitude lower multiplier complexity compared to conventional FFT- or FIR-filter-based algorithms. The 2-D IIR filterbanks demand fast digital processing that can support several octaves of RF bandwidth, and fast analog-to-digital converters (ADCs) for RF-to-bits direct conversion of wideband antenna element signals. Fast digital implementation platforms that can realize the high-precision recursive filter structures necessary for real-time beamforming at RF bandwidths are also desired. We propose a novel technique that combines a passive RF channelizer, multichannel ADC technology, and single-phase massively parallel 2-D IIR digital fan filterbanks, realized at low complexity using FPGA and/or ASIC technology, which natively supports a larger bandwidth than the maximum clock frequency of the digital implementation. We also strive to achieve More-than-Moore throughput by processing a wideband RF signal whose content has N-fold bandwidth (B = N Fclk/2) compared to the maximum clock frequency Fclk Hz of the digital VLSI platform under consideration. Such an increase in bandwidth is …
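
    The bandwidth relation quoted above, B = N Fclk/2, is easy to check numerically; the clock rate below is an assumed placeholder, not a figure from the paper:

      # Processed RF bandwidth of an N-branch channelizer feeding parallel
      # 2-D IIR cores clocked at f_clk (B = N * f_clk / 2 from the abstract).
      f_clk = 500e6                        # assumed digital clock rate, Hz
      for n_channels in (2, 4, 8):
          bandwidth_hz = n_channels * f_clk / 2
          print(f"N = {n_channels}: RF bandwidth = {bandwidth_hz / 1e9:.2f} GHz")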

  4. LiNbO3: A photovoltaic substrate for massive parallel manipulation and patterning of nano-objects

    NASA Astrophysics Data System (ADS)

    Carrascosa, M.; García-Cabañes, A.; Jubera, M.; Ramiro, J. B.; Agulló-López, F.

    2015-12-01

    The application of evanescent photovoltaic (PV) fields, generated by visible illumination of Fe:LiNbO3 substrates, for parallel massive trapping and manipulation of micro- and nano-objects is critically reviewed. The technique has been often referred to as photovoltaic or photorefractive tweezers. The main advantage of the new method is that the involved electrophoretic and/or dielectrophoretic forces do not require any electrodes and large scale manipulation of nano-objects can be easily achieved using the patterning capabilities of light. The paper describes the experimental techniques for particle trapping and the main reported experimental results obtained with a variety of micro- and nano-particles (dielectric and conductive) and different illumination configurations (single beam, holographic geometry, and spatial light modulator projection). The report also pays attention to the physical basis of the method, namely, the coupling of the evanescent photorefractive fields to the dielectric response of the nano-particles. The role of a number of physical parameters such as the contrast and spatial periodicities of the illumination pattern or the particle deposition method is discussed. Moreover, the main properties of the obtained particle patterns in relation to potential applications are summarized, and first demonstrations are reviewed. Finally, the PV method is discussed in comparison to other patterning strategies, such as those based on the pyroelectric response and the electric fields associated with domain poling of ferroelectric materials.
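
    For reference, the time-averaged dielectrophoretic force underlying the trapping mechanism is conventionally written as follows; this is the standard textbook expression, not a formula quoted from the review:

      \langle \mathbf{F}_{\mathrm{DEP}} \rangle
          = 2\pi \varepsilon_m r^3 \,
            \mathrm{Re}\!\left[ \frac{\varepsilon_p^* - \varepsilon_m^*}
                                     {\varepsilon_p^* + 2\varepsilon_m^*} \right]
            \nabla |\mathbf{E}|^2

    where r is the particle radius, ε*_p and ε*_m are the complex permittivities of the particle and the medium, and E is, in this setting, the evanescent photovoltaic field; the bracketed Clausius-Mossotti factor is the "dielectric response of the nano-particles" that the abstract refers to.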

  5. LiNbO3: A photovoltaic substrate for massive parallel manipulation and patterning of nano-objects

    SciTech Connect

    Carrascosa, M.; García-Cabañes, A.; Jubera, M.; Ramiro, J. B.; Agulló-López, F.

    2015-12-15

    The application of evanescent photovoltaic (PV) fields, generated by visible illumination of Fe:LiNbO3 substrates, for parallel massive trapping and manipulation of micro- and nano-objects is critically reviewed. The technique has been often referred to as photovoltaic or photorefractive tweezers. The main advantage of the new method is that the involved electrophoretic and/or dielectrophoretic forces do not require any electrodes and large scale manipulation of nano-objects can be easily achieved using the patterning capabilities of light. The paper describes the experimental techniques for particle trapping and the main reported experimental results obtained with a variety of micro- and nano-particles (dielectric and conductive) and different illumination configurations (single beam, holographic geometry, and spatial light modulator projection). The report also pays attention to the physical basis of the method, namely, the coupling of the evanescent photorefractive fields to the dielectric response of the nano-particles. The role of a number of physical parameters such as the contrast and spatial periodicities of the illumination pattern or the particle deposition method is discussed. Moreover, the main properties of the obtained particle patterns in relation to potential applications are summarized, and first demonstrations are reviewed. Finally, the PV method is discussed in comparison to other patterning strategies, such as those based on the pyroelectric response and the electric fields associated with domain poling of ferroelectric materials.

  6. Switching dynamics of thin film ferroelectric devices - a massively parallel phase field study

    NASA Astrophysics Data System (ADS)

    Ashraf, Md. Khalid

    In this thesis, we investigate the switching dynamics in thin film ferroelectrics. Ferroelectric materials are of inherent interest for low power and multi-functional devices. However, possible device applications of these materials have been limited due to the poorly understood electromagnetic and mechanical response at the nanoscale in arbitrary device structures. The difficulty in understanding switching dynamics mainly arises from the presence of features at multiple length scales and the nonlinearity associated with the strongly coupled states. For example, in a ferroelectric material, the domain walls are nanometers wide whereas the domain pattern forms at the micron scale. The switching is determined by coupled chemical, electrostatic, mechanical and thermal interactions. Thus computational understanding of switching dynamics in thin film ferroelectrics and a direct comparison with experiment pose a significant numerical challenge. We have developed a phase field model that describes the physics of polarization dynamics at the microscopic scale. A number of efficient numerical methods have been applied for achieving massive parallelization of all the calculation steps. Conformally mapped elements, node-wise assembly and prevention of dynamic loading minimized the communication between processors and increased the parallelization efficiency. With these improvements, we have reached the experimental scale - a significant step forward compared to state-of-the-art thin-film ferroelectric switching dynamics models. Using this model, we elucidated the switching dynamics on multiple surfaces of the multiferroic material BFO. We also calculated the switching energy of scaled BFO islands. Finally, we studied the interaction of domain wall propagation with misfit dislocations in the thin film. We believe that the model will be useful in understanding the switching dynamics in many different experimental setups incorporating thin film ferroelectrics.
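
    For orientation, phase-field models of ferroelectric switching typically evolve the polarization with the time-dependent Ginzburg-Landau equation; the generic form below is standard in the field and is not quoted from the thesis:

      \frac{\partial P_i(\mathbf{r}, t)}{\partial t}
          = -L \, \frac{\delta F[P]}{\delta P_i(\mathbf{r}, t)},
      \qquad
      F = \int_V \left( f_{\mathrm{Landau}} + f_{\mathrm{gradient}}
                        + f_{\mathrm{electric}} + f_{\mathrm{elastic}} \right) \mathrm{d}V

    where L is a kinetic coefficient and the free energy F collects the chemical (Landau), domain-wall (gradient), electrostatic and elastic contributions whose coupling the abstract describes.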

  7. Diffuse large B-cell lymphoma: sub-classification by massive parallel quantitative RT-PCR.

    PubMed

    Xue, Xuemin; Zeng, Naiyan; Gao, Zifen; Du, Ming-Qing

    2015-01-01

    Diffuse large B-cell lymphoma (DLBCL) is a heterogeneous entity with remarkably variable clinical outcome. Gene expression profiling (GEP) classifies DLBCL into activated B-cell like (ABC), germinal center B-cell like (GCB), and Type-III subtypes, with ABC-DLBCL characterized by a poor prognosis and constitutive NF-κB activation. A major challenge for the application of this cell of origin (COO) classification in routine clinical practice is to establish a robust clinical assay amenable to routine formalin-fixed paraffin-embedded (FFPE) diagnostic biopsies. In this study, we investigated the possibility of COO classification using FFPE tissue RNA samples by massive parallel quantitative reverse transcription PCR (qRT-PCR). We established a protocol for parallel qRT-PCR using FFPE RNA samples with the Fluidigm BioMark HD system, and quantified the expression of the COO classifier genes and the NF-κB targeted genes that characterize ABC-DLBCL in 143 cases of DLBCL. We also trained and validated a series of basic machine-learning classifiers and their derived meta classifiers, and identified SimpleLogistic as the top classifier, giving excellent performance across various GEP data sets derived from fresh-frozen or FFPE tissues on different microarray platforms. Finally, we applied SimpleLogistic to our data set generated by qRT-PCR, and the assigned ABC- and GCB-DLBCL cases showed the respective characteristics in their clinical outcome and NF-κB target gene expression. The methodology established in this study provides a robust approach for DLBCL sub-classification using FFPE diagnostic biopsies in a routine clinical setting. PMID:25418578
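
    SimpleLogistic is a classifier from the Weka toolkit; as a hedged stand-in, the same idea can be sketched with scikit-learn's logistic regression. The matrix shape mirrors the 143-case study, but the values are synthetic placeholders:

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import cross_val_score

      rng = np.random.default_rng(0)
      X = rng.normal(size=(143, 20))            # placeholder qRT-PCR expression matrix
      y = rng.choice(["ABC", "GCB"], size=143)  # placeholder cell-of-origin labels

      clf = LogisticRegression(max_iter=1000)   # linear-logistic COO classifier
      print(cross_val_score(clf, X, y, cv=5).mean())  # cross-validated accuracy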

  8. Hierarchical Image Segmentation of Remotely Sensed Data using Massively Parallel GNU-LINUX Software

    NASA Technical Reports Server (NTRS)

    Tilton, James C.

    2003-01-01

    A hierarchical set of image segmentations is a set of several image segmentations of the same image at different levels of detail in which the segmentations at coarser levels of detail can be produced from simple merges of regions at finer levels of detail. In [1], Tilton et al. describe an approach for producing hierarchical segmentations (called HSEG) and give a progress report on exploiting these hierarchical segmentations for image information mining. The HSEG algorithm is a hybrid of region growing and constrained spectral clustering that produces a hierarchical set of image segmentations based on detected convergence points. In the main, HSEG employs the hierarchical stepwise optimization (HSWO) approach to region growing, which was described as early as 1989 by Beaulieu and Goldberg. The HSWO approach seeks to produce segmentations that are more optimized than those produced by more classic approaches to region growing (e.g., Horowitz and Pavlidis [3]). In addition, HSEG optionally interjects, between HSWO region growing iterations, merges between spatially non-adjacent regions (i.e., spectrally based merging or clustering) constrained by a threshold derived from the previous HSWO region growing iteration. While the addition of constrained spectral clustering improves the utility of the segmentation results, especially for larger images, it also significantly increases HSEG's computational requirements. To counteract this, a computationally efficient recursive, divide-and-conquer implementation of HSEG (RHSEG) was devised, which includes special code to avoid processing artifacts caused by RHSEG's recursive subdivision of the image data. The recursive nature of RHSEG makes for a straightforward parallel implementation. This paper describes the HSEG algorithm, its recursive formulation (referred to as RHSEG), and the implementation of RHSEG using massively parallel GNU-LINUX software. Results with Landsat TM data are included comparing RHSEG with classic …
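
    A minimal sketch of the HSWO merge loop at the heart of HSEG follows; it is greatly simplified, averaging region means without size weighting and omitting the constrained spectral clustering step:

      def hswo(regions, edges, dissim):
          # regions: {region_id: feature vector}; edges: adjacent region-id pairs
          merges = []
          while len(regions) > 1 and edges:
              a, b = min(edges, key=lambda e: dissim(regions[e[0]], regions[e[1]]))
              regions[a] = [(x + y) / 2 for x, y in zip(regions[a], regions[b])]
              del regions[b]                    # merge region b into region a
              edges = {tuple(sorted((a if u == b else u, a if v == b else v)))
                       for u, v in edges if {u, v} != {a, b}}
              merges.append((a, b))             # one hierarchy level per merge
          return merges

      # toy usage: four regions on a chain, squared-difference dissimilarity
      d = lambda p, q: sum((x - y) ** 2 for x, y in zip(p, q))
      print(hswo({1: [0.0], 2: [0.1], 3: [5.0], 4: [5.2]},
                 {(1, 2), (2, 3), (3, 4)}, d))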

  9. World Wide Web interface for advanced SPECT reconstruction algorithms implemented on a remote massively parallel computer.

    PubMed

    Formiconi, A R; Passeri, A; Guelfi, M R; Masoni, M; Pupi, A; Meldolesi, U; Malfetti, P; Calori, L; Guidazzoli, A

    1997-11-01

    Data from Single Photon Emission Computed Tomography (SPECT) studies are blurred by inevitable physical phenomena occurring during data acquisition. These errors may be compensated by means of reconstruction algorithms which take into account accurate physical models of the data acquisition procedure. Unfortunately, this approach involves high memory requirements as well as a high computational burden which cannot be afforded by the computer systems of SPECT acquisition devices. In this work the possibility of accessing High Performance Computing and Networking (HPCN) resources through a World Wide Web interface for the advanced reconstruction of SPECT data in a clinical environment was investigated. An iterative algorithm with an accurate model of the variable system response was ported on the Multiple Instruction Multiple Data (MIMD) parallel architecture of a Cray T3D massively parallel computer. The system was accessible even from low-cost PC-based workstations through standard TCP/IP networking. A speedup factor of 148 was predicted by the benchmarks run on the Cray T3D. A complete brain study of 30 (64 x 64) slices was reconstructed from a set of 90 (64 x 64) projections with ten iterations of the conjugate gradient algorithm in 9 s, which corresponds to an actual speed-up factor of 135. The technique was extended to a more accurate 3D modeling of the system response for a true 3D reconstruction of SPECT data; the reconstruction time of the same data set with this more accurate model was 5 min. This work demonstrates the possibility of exploiting remote HPCN resources from hospital sites by means of low-cost workstations using standard communication protocols and a user-friendly WWW interface without particular problems for routine use. PMID:9506406
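
    The conjugate gradient iteration mentioned above can be sketched generically for a linear projection model b = Ax; this toy version works on the normal equations and ignores the paper's detailed system-response model:

      import numpy as np

      def cg_reconstruct(A, b, n_iter=10):
          # conjugate gradient on the normal equations A^T A x = A^T b
          x = np.zeros(A.shape[1])
          r = A.T @ b                      # initial residual (x = 0)
          p = r.copy()
          for _ in range(n_iter):          # e.g. the ten iterations quoted above
              Ap = A.T @ (A @ p)
              alpha = (r @ r) / (p @ Ap)
              x += alpha * p
              r_new = r - alpha * Ap
              p = r_new + ((r_new @ r_new) / (r @ r)) * p
              r = r_new
          return x

      rng = np.random.default_rng(1)
      A = rng.normal(size=(90, 64))        # toy projection matrix, not a SPECT model
      x_true = rng.normal(size=64)
      x_hat = cg_reconstruct(A, A @ x_true, n_iter=64)
      print(np.linalg.norm(x_hat - x_true))   # small residual after 64 iterations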

  10. PORTA: A Massively Parallel Code for 3D Non-LTE Polarized Radiative Transfer

    NASA Astrophysics Data System (ADS)

    Štěpán, J.

    2014-10-01

    The interpretation of the Stokes profiles of the solar (stellar) spectral line radiation requires solving a non-LTE radiative transfer problem that can be very complex, especially when the main interest lies in modeling the linear polarization signals produced by scattering processes and their modification by the Hanle effect. One of the main difficulties is due to the fact that the plasma of a stellar atmosphere can be highly inhomogeneous and dynamic, which implies the need to solve the non-equilibrium problem of generation and transfer of polarized radiation in realistic three-dimensional stellar atmospheric models. Here we present PORTA, a computer program we have developed for solving, in three-dimensional (3D) models of stellar atmospheres, the problem of the generation and transfer of spectral line polarization taking into account anisotropic radiation pumping and the Hanle and Zeeman effects in multilevel atoms. The numerical method of solution is based on a highly convergent iterative algorithm, whose convergence rate is insensitive to the grid size, and on an accurate short-characteristics formal solver of the Stokes-vector transfer equation which uses monotonic Bézier interpolation. In addition to the iterative method and the 3D formal solver, another important feature of PORTA is a novel parallelization strategy suitable for taking advantage of massively parallel computers. Linear scaling of the solution with the number of processors allows the solution time to be reduced by several orders of magnitude. We present useful benchmarks and a few illustrations of applications using a 3D model of the solar chromosphere resulting from MHD simulations. Finally, we present our conclusions with a view to future research. For more details see Štěpán & Trujillo Bueno (2013).

  11. Massively parallel LES of azimuthal thermo-acoustic instabilities in annular gas turbines

    NASA Astrophysics Data System (ADS)

    Wolf, P.; Staffelbach, G.; Roux, A.; Gicquel, L.; Poinsot, T.; Moureau, V.

    2009-06-01

    Increasingly stringent regulations and the need to tackle rising fuel prices have placed great emphasis on the design of aeronautical gas turbines, which are unfortunately more and more prone to combustion instabilities. In the particular field of annular combustion chambers, these instabilities often take the form of azimuthal modes. To predict these modes, one must compute the full combustion chamber, which remained out of reach until the recent advent of massively parallel computers. In this article, full annular Large Eddy Simulations (LES) of two helicopter combustors, which differ only in the swirlers' design, are performed. In both computations, LES captures self-established rotating azimuthal modes. However, the two cases exhibit different thermo-acoustic responses and the resulting limit-cycles are different. With the first design, a self-excited strong instability develops, leading to pulsating flames and local flashback. In the second case, the flames are much less affected by the azimuthal mode and remain stable, allowing an acceptable operation. Hence, this study highlights the potential of LES for discriminating injection system designs. To cite this article: P. Wolf et al., C. R. Mecanique 337 (2009).

  12. Massively parallel LES of azimuthal thermo-acoustic instabilities in annular gas turbines

    NASA Astrophysics Data System (ADS)

    Wolf, Pierre; Staffelbach, Gabriel; Gicquel, Laurent; Poinsot, Thierry

    2009-07-01

    Most of the energy produced worldwide comes from the combustion of fossil fuels. In the context of global climate changes and dramatically decreasing resources, there is a critical need for optimizing the process of burning, especially in the field of gas turbines. Unfortunately, new designs for efficient combustion are prone to destructive thermo-acoustic instabilities. Large Eddy Simulation (LES) is a promising tool to predict turbulent reacting flows in complex industrial configurations and explore the mechanisms triggering the coupling between acoustics and combustion. In the particular field of annular combustion chambers, these instabilities usually take the form of azimuthal modes. To predict these modes, one must compute the full combustion chamber comprising all sectors, which remained out of reach until the recent advent of massively parallel computers. A fully compressible, multi-species reactive Navier-Stokes solver is used on up to 4096 BlueGene/P CPUs for two designs of a full annular helicopter chamber. Results show evidence of self-established azimuthal modes for the two cases but with different energy-containing limit-cycles. Mesh dependency is checked with grids comprising 38 and 93 million tetrahedra. The fact that the two grid predictions yield similar flow topologies and limit-cycles reinforces the ability of LES to discriminate design changes.

  13. Ensuring the safety of vaccine cell substrates by massively parallel sequencing of the transcriptome.

    PubMed

    Onions, D; Côté, C; Love, B; Toms, B; Koduri, S; Armstrong, A; Chang, A; Kolman, J

    2011-09-22

    Massively parallel, deep sequencing of the transcriptome, coupled with algorithmic analysis to identify adventitious agents (MP-Seq™), is an important adjunct in ensuring the safety of cells used in vaccine production. Such cells may harbour novel viruses whose sequences are unknown or latent viruses that are only expressed following stress to the cells. MP-Seq is an unbiased and comprehensive method to identify such viruses and other adventitious agents without prior knowledge of the nature of those agents. Here we demonstrate its utility as part of an integrated approach to identify and characterise potential contaminants within commonly used virus and vaccine production cell lines. Through this analysis, in combination with more traditional approaches, we have excluded the presence of porcine circoviruses in the ATCC Vero cell bank (CCL-81), however, we found that a full length betaretrovirus related to SRV can be expressed in these cells, a factor that may be of importance in the production of certain vaccines. Similarly, insect cells are proving to be valuable for the production of virus like particles and sub-unit vaccines, but they can harbour a range of latent viruses. We show that following MP-Seq of the Trichoplusia ni (High Five cell line) transcriptome we were able to detect a contaminating, latent nodavirus and identify an expressed errantivirus genome. Collectively, these studies have reinforced the role of MP-Seq as an integral tool for the identification of contaminating agents in vaccine cell substrates. PMID:21651935
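
    Conceptually, the unbiased detection step can be sketched as read subtraction against known references; the toy aligner and sequences below are invented stand-ins for the real MP-Seq pipeline:

      def aligns(read, reference):
          return read in reference          # toy stand-in for a real aligner

      host_ref  = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"   # invented host sequence
      virus_ref = "TTGACCGGGTTTAAACCCGGG"                     # invented viral sequence
      reads = ["ATTGTAATGG", "GGGTTTAAAC", "CACACACACA"]

      known = [r for r in reads if aligns(r, virus_ref)]      # hits to known agents
      novel = [r for r in reads
               if not aligns(r, host_ref) and not aligns(r, virus_ref)]
      print("known agent hits:", known, "| unexplained reads:", novel)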

  14. GRay: A Massively Parallel GPU-based Code for Ray Tracing in Relativistic Spacetimes

    NASA Astrophysics Data System (ADS)

    Chan, Chi-kwan; Psaltis, Dimitrios; Özel, Feryal

    2013-11-01

    We introduce GRay, a massively parallel integrator designed to trace the trajectories of billions of photons in a curved spacetime. This graphics-processing-unit (GPU)-based integrator employs the stream processing paradigm, is implemented in CUDA C/C++, and runs on nVidia graphics cards. The peak performance of GRay using single-precision floating-point arithmetic on a single GPU exceeds 300 GFLOP (or 1 ns per photon per time step). For a realistic problem, where the peak performance cannot be reached, GRay is two orders of magnitude faster than existing central-processing-unit-based ray-tracing codes. This performance enhancement allows more effective searches of large parameter spaces when comparing theoretical predictions of images, spectra, and light curves from the vicinities of compact objects to observations. GRay can also perform on-the-fly ray tracing within general relativistic magnetohydrodynamic algorithms that simulate accretion flows around compact objects. Making use of this algorithm, we calculate the properties of the shadows of Kerr black holes and the photon rings that surround them. We also provide accurate fitting formulae of their dependencies on black hole spin and observer inclination, which can be used to interpret upcoming observations of the black holes at the center of the Milky Way, as well as M87, with the Event Horizon Telescope.

  15. Massive Parallel Sequencing for Diagnostic Genetic Testing of BRCA Genes--a Single Center Experience.

    PubMed

    Ermolenko, Natalya A; Boyarskikh, Uljana A; Kechin, Andrey A; Mazitova, Alexandra M; Khrapov, Evgeny A; Petrova, Valentina D; Lazarev, Alexandr F; Kushlinskii, Nikolay E; Filipenko, Maxim L

    2015-01-01

    The aim of this study was to implement massive parallel sequencing (MPS) technology in clinical genetics testing. We developed and tested an amplicon-based method for resequencing the BRCA1 and BRCA2 genes on an Illumina MiSeq to identify disease-causing mutations in patients with hereditary breast or ovarian cancer (HBOC). The coding regions of BRCA1 and BRCA2 were resequenced in 96 HBOC patient DNA samples obtained from different sample types: peripheral blood leukocytes, whole blood drops dried on paper, and buccal wash epithelia. A total of 16 random DNA samples were characterized using standard Sanger sequencing and applied to optimize the variant calling process and evaluate the accuracy of the MPS-method. The best bioinformatics workflow included the filtration of variants using GATK with the following cut-offs: variant frequency >14%, coverage (>25x) and presence in both the forward and reverse reads. The MPS method had 100% sensitivity and 94.4% specificity. Similar accuracy levels were achieved for DNA obtained from the different sample types. The workflow presented herein requires low amounts of DNA samples (170 ng) and is cost-effective due to the elimination of DNA and PCR product normalization steps. PMID:26625824
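
    The cut-offs above translate directly into a filtering step; the record layout here is a hypothetical simplification, not the actual GATK output format:

      def passes_filters(v):
          return (v["allele_freq"] > 0.14   # variant frequency > 14%
                  and v["depth"] > 25       # coverage > 25x
                  and v["fwd_reads"] > 0    # present in forward reads...
                  and v["rev_reads"] > 0)   # ...and in reverse reads

      candidates = [
          {"allele_freq": 0.48, "depth": 180, "fwd_reads": 85, "rev_reads": 95},
          {"allele_freq": 0.08, "depth": 40,  "fwd_reads": 3,  "rev_reads": 0},
      ]
      print([v for v in candidates if passes_filters(v)])  # first record survives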

  16. Resolving genomic disorder–associated breakpoints within segmental DNA duplications using massively parallel sequencing

    PubMed Central

    Nuttle, Xander; Itsara, Andy; Shendure, Jay; Eichler, Evan E.

    2014-01-01

    The most common recurrent copy number variants associated with autism, developmental delay, and epilepsy are flanked by segmental duplications. Complete genetic characterization of these events is challenging because their breakpoints often occur within high-identity, copy number polymorphic paralogous sequences that cannot be specifically assayed using hybridization-based methods. Here, we provide a protocol for breakpoint resolution with sequence-level precision. Massively parallel sequencing is performed on libraries generated from haplotype-resolved chromosomes, genomic DNA, or molecular inversion probe–captured breakpoint-informative regions harboring paralog-distinguishing variants. Quantifying sequencing depth over informative sites enables breakpoint localization, typically within several kilobases to tens of kilobases. Depending on the approach employed, the sequencing platform, and the accuracy and completeness of the reference genome sequence, this protocol takes from a few days to several months to complete. Once established for a specific genomic disorder, it is possible to process thousands of DNA samples within as little as 3–4 weeks. PMID:24874815

  17. Characterization of the Zoarces viviparus liver transcriptome using massively parallel pyrosequencing

    PubMed Central

    Kristiansson, Erik; Asker, Noomi; Förlin, Lars; Larsson, DG Joakim

    2009-01-01

    Background The teleost Zoarces viviparus (eelpout) lives along the coasts of Northern Europe and has long been an established model organism for marine ecology and environmental monitoring. The scarce information about this species genome has however restrained the use of efficient molecular-level assays, such as gene expression microarrays. Results In the present study we present the first comprehensive characterization of the Zoarces viviparus liver transcriptome. From 400,000 reads generated by massively parallel pyrosequencing, more than 50,000 putative transcript fragments were assembled, annotated and functionally classified. The data were estimated to cover roughly 40% of the total transcriptome and homologues for about half of the genes of Gasterosteus aculeatus (stickleback) were identified. The sequence data was consequently used to design an oligonucleotide microarray for large-scale gene expression analysis. Conclusion Our results show that one run using a Genome Sequencer FLX from 454 Life Science/Roche generates enough genomic information for adequate de novo assembly of a large number of genes in a higher vertebrate. The generated sequence data, including the validated microarray probes, are publicly available to promote genome-wide research in Zoarces viviparus. PMID:19646242

  18. Hybrid selection of discrete genomic intervals on custom-designed microarrays for massively parallel sequencing

    PubMed Central

    Hodges, Emily; Rooks, Michelle; Xuan, Zhenyu; Bhattacharjee, Arindam; Gordon, D Benjamin; Brizuela, Leonardo; McCombie, W Richard; Hannon, Gregory J

    2010-01-01

    Complementary techniques that deepen information content and minimize reagent costs are required to realize the full potential of massively parallel sequencing. Here, we describe a resequencing approach that directs focus to genomic regions of high interest by combining hybridization-based purification of multi-megabase regions with sequencing on the Illumina Genome Analyzer (GA). The capture matrix is created by a microarray on which probes can be programmed as desired to target any non-repeat portion of the genome, while the method requires only a basic familiarity with microarray hybridization. We present a detailed protocol suitable for 1–2 µg of input genomic DNA and highlight key design tips in which high specificity (>65% of reads stem from enriched exons) and high sensitivity (98% targeted base pair coverage) can be achieved. We have successfully applied this to the enrichment of coding regions, in both human and mouse, ranging from 0.5 to 4 Mb in length. From genomic DNA library production to base-called sequences, this procedure takes approximately 9–10 d inclusive of array captures and one Illumina flow cell run. PMID:19478811

  19. Radiation hydrodynamics using characteristics on adaptive decomposed domains for massively parallel star formation simulations

    NASA Astrophysics Data System (ADS)

    Buntemeyer, Lars; Banerjee, Robi; Peters, Thomas; Klassen, Mikhail; Pudritz, Ralph E.

    2016-02-01

    We present an algorithm for solving the radiative transfer problem on massively parallel computers using adaptive mesh refinement and domain decomposition. The solver is based on the method of characteristics which requires an adaptive raytracer that integrates the equation of radiative transfer. The radiation field is split into local and global components which are handled separately to overcome the non-locality problem. The solver is implemented in the framework of the magneto-hydrodynamics code FLASH and is coupled by an operator splitting step. The goal is the study of radiation in the context of star formation simulations with a focus on early disc formation and evolution. This requires a proper treatment of radiation physics that covers both the optically thin as well as the optically thick regimes and the transition region in particular. We successfully show the accuracy and feasibility of our method in a series of standard radiative transfer problems and two 3D collapse simulations resembling the early stages of protostar and disc formation.

  20. Massively Parallel Sequencing for Genetic Diagnosis of Hearing Loss: The New Standard of Care

    PubMed Central

    Shearer, A. Eliot; Smith, Richard J.H.

    2016-01-01

    Objective To evaluate the use of new genetic sequencing techniques for comprehensive genetic testing for hearing loss. Data Sources Articles were identified from PubMed and Google Scholar databases using pertinent search terms. Review Methods The literature search identified 30 candidate studies that met the search criteria. Three studies were excluded and eight were found to be case reports. Twenty studies were included in the review analysis, including seven studies that evaluated controls and 16 studies that evaluated patients with unknown causes of hearing loss; three studies evaluated both controls and patients. Conclusions In the 20 studies included in the review analysis, 426 control samples and 603 patients with unknown causes of hearing loss underwent comprehensive genetic diagnosis for hearing loss using massively parallel sequencing. Control analysis showed a sensitivity and specificity > 99%, sufficient for clinical use of these tests. The overall diagnostic rate was 41% (range 10% to 83%) and varied based on several factors including inheritance and pre-screening prior to comprehensive testing. There were significant differences between available platforms with regard to the number and type of genes included and whether copy number variations were examined. Based on these results, comprehensive genetic testing should form the cornerstone of a tiered approach to the clinical evaluation of patients with hearing loss, along with history, physical exam, and audiometry, and can determine further testing that may be required, if any. Implications for Practice Comprehensive genetic testing has become the new standard of care for genetic testing for patients with sensorineural hearing loss. PMID:26084827

  1. Massively parallel network architectures for automatic recognition of visual speech signals. Final technical report

    SciTech Connect

    Sejnowski, T.J.; Goldstein, M.

    1990-01-01

    This research sought to produce a massively-parallel network architecture that could interpret speech signals from video recordings of human talkers. This report summarizes the project's results: (1) A corpus of video recordings from two human speakers was analyzed with image processing techniques and used as the data for this study; (2) We demonstrated that a feedforward network could be trained to categorize vowels from these talkers. The performance was comparable to that of nearest-neighbor techniques and to trained humans on the same data; (3) We developed a novel approach to sensory fusion by training a network to transform from facial images to short-time spectral amplitude envelopes. This information can be used to increase the signal-to-noise ratio and hence the performance of acoustic speech recognition systems in noisy environments; (4) We explored the use of recurrent networks to perform the same mapping for continuous speech. Results of this project demonstrate the feasibility of adding a visual speech recognition component to enhance existing speech recognition systems. Such a combined system could be used in noisy environments, such as cockpits, where improved communication is needed. This demonstration of presymbolic fusion of visual and acoustic speech signals is consistent with our current understanding of human speech perception.

  2. Novel Y-chromosome Short Tandem Repeat Variants Detected Through the Use of Massively Parallel Sequencing.

    PubMed

    Warshauer, David H; Churchill, Jennifer D; Novroski, Nicole; King, Jonathan L; Budowle, Bruce

    2015-08-01

    Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically-relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite a limited sampling, two of the most frequently-observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for sample population characterization of STR alleles. PMID:26391384
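
    A toy example makes the gain in resolving power concrete: two alleles of identical length, which length-based typing would call identical, are distinguished only by their sequences (the motifs below are invented, not data from the study):

      allele_1 = "TAGA" * 10                       # 10 uninterrupted TAGA repeats
      allele_2 = "TAGA" * 4 + "TACA" + "TAGA" * 5  # same length, intra-repeat SNP (G>C)

      assert len(allele_1) == len(allele_2)        # indistinguishable by sizing alone
      print(allele_1 == allele_2)                  # False: resolved only by sequence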

  3. Massively parallel decoding of mammalian regulatory sequences supports a flexible organizational model.

    PubMed

    Smith, Robin P; Taher, Leila; Patwardhan, Rupali P; Kim, Mee J; Inoue, Fumitaka; Shendure, Jay; Ovcharenko, Ivan; Ahituv, Nadav

    2013-09-01

    Despite continual progress in the cataloging of vertebrate regulatory elements, little is known about their organization and regulatory architecture. Here we describe a massively parallel experiment to systematically test the impact of copy number, spacing, combination and order of transcription factor binding sites on gene expression. A complex library of ∼5,000 synthetic regulatory elements containing patterns from 12 liver-specific transcription factor binding sites was assayed in mice and in HepG2 cells. We find that certain transcription factors act as direct drivers of gene expression in homotypic clusters of binding sites, independent of spacing between sites, whereas others function only synergistically. Heterotypic enhancers are stronger than their homotypic analogs and favor specific transcription factor binding site combinations, mimicking putative native enhancers. Exhaustive testing of binding site permutations suggests that there is flexibility in binding site order. Our findings provide quantitative support for a flexible model of regulatory element activity and suggest a framework for the design of synthetic tissue-specific enhancers. PMID:23892608
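
    The design space of such a library (copy number, combination and order of binding sites) can be sketched with plain combinatorics; the site names below are hypothetical placeholders, not the 12 liver-specific motifs actually assayed:

      from itertools import combinations, permutations

      sites = ["HNF1", "HNF4", "FOXA2", "CEBPA"]   # hypothetical liver TF motifs
      library = []
      for k in (2, 3):                             # heterotypic combinations
          for combo in combinations(sites, k):
              library.extend("-".join(order) for order in permutations(combo))
      library += [f"{s}-{s}-{s}" for s in sites]   # homotypic clusters, copy number 3
      print(len(library), library[:4])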

  4. Photo-patterned free-standing hydrogel microarrays for massively parallel protein analysis

    NASA Astrophysics Data System (ADS)

    Duncombe, Todd A.; Herr, Amy E.

    2015-03-01

    Microfluidic technologies have largely been realized within enclosed microchannels. While powerful, a principal limitation of closed-channel microfluidics is the difficulty of sample extraction and downstream processing. To address this limitation and expand the utility of microfluidic analytical separation tools, we developed an open-channel hydrogel architecture for rapid protein analysis. Designed for compatibility with slab-gel polyacrylamide gel electrophoresis (PAGE) reagents and instruments, we detail the development of free-standing polyacrylamide gel (fsPAG) microstructures supporting electrophoretic performance rivalling that of microfluidic platforms. Owing to its open architecture, the platform can be easily interfaced with automated robotic controllers and downstream processing (e.g., sample spotters, immunological probing, mass spectrometry). The fsPAG devices are directly photopatterned atop, and covalently attached to, planar polymer or glass surfaces. Due to the fast (< 1 hr) design-prototype-test cycle - significantly faster than mold-based fabrication techniques - rapid prototyping of devices with fsPAG microstructures provides researchers with a powerful tool for developing custom analytical assays. Leveraging these rapid-prototyping benefits, we up-scale from a unit separation to an array of 96 concurrent fsPAGE assays in a 10 min run time driven by one electrode pair. The fsPAGE platform is uniquely well-suited for massively parallelized proteomics, a major unrealized goal of bioanalytical technology.

  5. Transcriptomic analysis of the housefly (Musca domestica) larva using massively parallel pyrosequencing.

    PubMed

    Liu, Fengsong; Tang, Ting; Sun, Lingling; Jose Priya, T A

    2012-02-01

    To explore the transcriptome of Musca domestica larvae and to identify unique sequences, we used massively parallel pyrosequencing on the Roche 454-FLX platform to generate a substantial EST dataset of this fly. As a result, we obtained a total of 249,555 ESTs with an average read length of 373 bp. These reads were assembled into 13,206 contigs and 20,556 singletons. Using BlastX searches of the Swissprot and Nr databases, we were able to identify 4,814 contigs and 8,166 singletons as unique sequences. Subsequently, the annotated sequences were subjected to GO analysis and the search results showed a majority of the query sequences were assignable to certain gene ontology terms. In addition, functional classification and pathway assignment were performed by KEGG and 2,164 unique sequences were mapped into 184 KEGG pathways in total. As the first attempt on large-scale RNA sequencing of M. domestica, this general picture of the transcriptome can establish a fundamental resource for further research on functional genomics. PMID:21643958

  6. Novel Y-chromosome Short Tandem Repeat Variants Detected Through the Use of Massively Parallel Sequencing

    PubMed Central

    Warshauer, David H.; Churchill, Jennifer D.; Novroski, Nicole; King, Jonathan L.; Budowle, Bruce

    2015-01-01

    Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically-relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite a limited sampling, two of the most frequently-observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for sample population characterization of STR alleles. PMID:26391384

  7. GRay: A MASSIVELY PARALLEL GPU-BASED CODE FOR RAY TRACING IN RELATIVISTIC SPACETIMES

    SciTech Connect

    Chan, Chi-kwan; Psaltis, Dimitrios; Özel, Feryal

    2013-11-01

    We introduce GRay, a massively parallel integrator designed to trace the trajectories of billions of photons in a curved spacetime. This graphics-processing-unit (GPU)-based integrator employs the stream processing paradigm, is implemented in CUDA C/C++, and runs on nVidia graphics cards. The peak performance of GRay using single-precision floating-point arithmetic on a single GPU exceeds 300 GFLOP (or 1 ns per photon per time step). For a realistic problem, where the peak performance cannot be reached, GRay is two orders of magnitude faster than existing central-processing-unit-based ray-tracing codes. This performance enhancement allows more effective searches of large parameter spaces when comparing theoretical predictions of images, spectra, and light curves from the vicinities of compact objects to observations. GRay can also perform on-the-fly ray tracing within general relativistic magnetohydrodynamic algorithms that simulate accretion flows around compact objects. Making use of this algorithm, we calculate the properties of the shadows of Kerr black holes and the photon rings that surround them. We also provide accurate fitting formulae of their dependencies on black hole spin and observer inclination, which can be used to interpret upcoming observations of the black holes at the center of the Milky Way, as well as M87, with the Event Horizon Telescope.

  8. Targeted massively parallel sequencing provides comprehensive genetic diagnosis for patients with disorders of sex development

    PubMed Central

    Arboleda, VA; Lee, H; Sánchez, FJ; Délot, EC; Sandberg, DE; Grody, WW; Nelson, SF; Vilain, E

    2013-01-01

    Disorders of sex development (DSD) are rare disorders in which there is discordance between chromosomal, gonadal, and phenotypic sex. Only a minority of patients clinically diagnosed with DSD obtains a molecular diagnosis, leaving a large gap in our understanding of the prevalence, management, and outcomes in affected patients. We created a novel DSD-genetic diagnostic tool, in which sex development genes are captured using RNA probes and undergo massively parallel sequencing. In the pilot group of 14 patients, we determined sex chromosome dosage, copy number variation, and gene mutations. In the patients with a known genetic diagnosis (obtained either on a clinical or research basis), this test identified the molecular cause in 100% (7/7) of patients. In patients in whom no molecular diagnosis had been made, this tool identified a genetic diagnosis in two of seven patients. Targeted sequencing of genes representing a specific spectrum of disorders can result in a higher rate of genetic diagnoses than current diagnostic approaches. Our DSD diagnostic tool provides for first time, in a single blood test, a comprehensive genetic diagnosis in patients presenting with a wide range of urogenital anomalies. PMID:22435390

  9. Identification of Novel FMR1 Variants by Massively Parallel Sequencing in Developmentally Delayed Males

    PubMed Central

    Collins, Stephen C.; Bray, Steven M.; Suhl, Joshua A.; Cutler, David J.; Coffee, Bradford; Zwick, Michael E.; Warren, Stephen T.

    2010-01-01

    Fragile X syndrome (FXS), the most common inherited form of developmental delay, is typically caused by CGG-repeat expansion in FMR1. However, little attention has been paid to sequence variants in FMR1. Through the use of pooled-template massively parallel sequencing, we identified 130 novel FMR1 sequence variants in a population of 963 developmentally delayed males without CGG-repeat expansion mutations. Among these, we identified a novel missense change, p.R138Q, which alters a conserved residue in the nuclear localization signal of FMRP. We have also identified three promoter mutations in this population, all of which significantly reduce in vitro levels of FMR1 transcription. Additionally, we identified 10 noncoding variants of possible functional significance in the introns and 3’-untranslated region of FMR1, including two predicted splice site mutations. These findings greatly expand the catalogue of known FMR1 sequence variants and suggest that FMR1 sequence variants may represent an important cause of developmental delay. PMID:20799337

  10. Mitochondrial DNA heteroplasmy in the emerging field of massively parallel sequencing

    PubMed Central

    Just, Rebecca S.; Irwin, Jodi A.; Parson, Walther

    2015-01-01

    Long an important and useful tool in forensic genetic investigations, mitochondrial DNA (mtDNA) typing continues to mature. Research in the last few years has demonstrated both that data from the entire molecule will have practical benefits in forensic DNA casework, and that massively parallel sequencing (MPS) methods will make full mitochondrial genome (mtGenome) sequencing of forensic specimens feasible and cost-effective. A spate of recent studies has employed these new technologies to assess intraindividual mtDNA variation. However, in several instances, contamination and other sources of mixed mtDNA data have been erroneously identified as heteroplasmy. Well vetted mtGenome datasets based on both Sanger and MPS sequences have found authentic point heteroplasmy in approximately 25% of individuals when minor component detection thresholds are in the range of 10–20%, along with positional distribution patterns in the coding region that differ from patterns of point heteroplasmy in the well-studied control region. A few recent studies that examined very low-level heteroplasmy are concordant with these observations when the data are examined at a common level of resolution. In this review we provide an overview of considerations related to the use of MPS technologies to detect mtDNA heteroplasmy. In addition, we examine published reports on point heteroplasmy to characterize features of the data that will assist in the evaluation of future mtGenome data developed by any typing method. PMID:26009256
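
    The threshold logic behind minor-component detection is straightforward to sketch; the per-base counts below are invented for illustration:

      def minor_fraction(base_counts):
          # fraction of the second most frequent base at one mtDNA position
          counts = sorted(base_counts.values(), reverse=True)
          total = sum(counts)
          return counts[1] / total if total else 0.0

      pileup = {"T": 780, "C": 145, "A": 2, "G": 1}   # hypothetical position pileup
      maf = minor_fraction(pileup)
      print(f"minor component = {maf:.1%}, call heteroplasmy: {maf >= 0.10}")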

  11. Simulation of hydraulic fracture networks in three dimensions utilizing massively parallel computing platforms

    NASA Astrophysics Data System (ADS)

    Settgast, R. R.; Johnson, S.; Fu, P.; Walsh, S. D.; Ryerson, F. J.; Antoun, T.

    2012-12-01

    Hydraulic fracturing has been an enabling technology for commercially stimulating fracture networks for over half a century. It has become one of the most widespread technologies for engineering subsurface fracture systems. Despite the ubiquity of this technique in the field, understanding and prediction of the hydraulic induced propagation of the fracture network in realistic, heterogeneous reservoirs has been limited. A number of developments in multiscale modeling in recent years have allowed researchers in related fields to tackle the modeling of complex fracture propagation as well as the mechanics of heterogeneous materials. These developments, combined with advances in quantifying solution uncertainties, provide possibilities for the geologic modeling community to capture both the fracturing behavior and longer-term permeability evolution of rock masses under hydraulic loading across both dynamic and viscosity-dominated regimes. Here we will demonstrate the first phase of this effort through illustrations of fully three-dimensional, tightly coupled hydromechanical simulations of hydraulically induced fracture network propagation run on massively parallel computing scales, and discuss preliminary results regarding the mechanisms by which fracture interactions and the accompanying changes to the stress field can lead to deleterious or beneficial changes to the fracture network.

  12. Identification of a novel GATA3 mutation in a deaf Taiwanese family by massively parallel sequencing.

    PubMed

    Lin, Yin-Hung; Wu, Chen-Chi; Hsu, Tun-Yen; Chiu, Wei-Yih; Hsu, Chuan-Jen; Chen, Pei-Lung

    2015-01-01

    Recent studies have confirmed the utility of massively parallel sequencing (MPS) in addressing genetically heterogeneous hereditary hearing impairment. By applying a MPS diagnostic panel targeting 129 known deafness genes, we identified a novel frameshift GATA3 mutation, c.149delT (p.Phe51LeufsX144), in a hearing-impaired family compatible with autosomal dominant inheritance. The GATA3 haploinsufficiency is thought to be associated with the hypoparathyroidism, sensorineural deafness, and renal dysplasia (HDR) syndrome. The pathogenicity of GATA3 c.149delT was supported by its absence in the 5400 NHLBI exomes, 1000 Genomes, and the 100 normal hearing controls of the present study; the co-segregation of c.149delT heterozygosity with hearing impairment in 9 affected members of the family; as well as the nonsense-mediated mRNA decay of the mutant allele in in vitro functional studies. The phenotypes in this family appeared relatively mild, as most affected members presented no signs of hypoparathyroidism or renal abnormalities, including the proband. To our knowledge, this is the first report of genetic diagnosis of HDR syndrome before the clinical diagnosis. Genetic examination for multiple deafness genes with MPS might be helpful in identifying certain types of syndromic hearing loss such as HDR syndrome, contributing to earlier diagnosis and treatment of the affected individuals. PMID:25771973

  13. Mitochondrial DNA heteroplasmy in the emerging field of massively parallel sequencing.

    PubMed

    Just, Rebecca S; Irwin, Jodi A; Parson, Walther

    2015-09-01

    Long an important and useful tool in forensic genetic investigations, mitochondrial DNA (mtDNA) typing continues to mature. Research in the last few years has demonstrated both that data from the entire molecule will have practical benefits in forensic DNA casework, and that massively parallel sequencing (MPS) methods will make full mitochondrial genome (mtGenome) sequencing of forensic specimens feasible and cost-effective. A spate of recent studies has employed these new technologies to assess intraindividual mtDNA variation. However, in several instances, contamination and other sources of mixed mtDNA data have been erroneously identified as heteroplasmy. Well vetted mtGenome datasets based on both Sanger and MPS sequences have found authentic point heteroplasmy in approximately 25% of individuals when minor component detection thresholds are in the range of 10-20%, along with positional distribution patterns in the coding region that differ from patterns of point heteroplasmy in the well-studied control region. A few recent studies that examined very low-level heteroplasmy are concordant with these observations when the data are examined at a common level of resolution. In this review we provide an overview of considerations related to the use of MPS technologies to detect mtDNA heteroplasmy. In addition, we examine published reports on point heteroplasmy to characterize features of the data that will assist in the evaluation of future mtGenome data developed by any typing method. PMID:26009256

  14. Massively parallel sequencing of complete mitochondrial genomes from hair shaft samples.

    PubMed

    Parson, Walther; Huber, Gabriela; Moreno, Lilliana; Madel, Maria-Bernadette; Brandhagen, Michael D; Nagl, Simone; Xavier, Catarina; Eduardoff, Mayra; Callaghan, Thomas C; Irwin, Jodi A

    2015-03-01

    Though shed hairs are one of the most commonly encountered evidence types, they are among the most limited in terms of DNA quantity and quality. As a result, DNA testing has historically focused on the recovery of only about 600 base pairs of the mitochondrial DNA control region. Here, we describe our success in recovering complete mitochondrial genome (mtGenome) data (~16,569 bp) from single shed hairs. By employing massively parallel sequencing (MPS), we demonstrate that particular hair samples yield DNA sufficient in quantity and quality to produce 2-3 kb mtGenome amplicons and that entire mtGenome data can be recovered from hair extracts even without PCR enrichment. Most importantly, we describe a small amplicon multiplex assay composed of sixty-two primer sets that can be routinely applied to the compromised hair samples typically encountered in forensic casework. In all samples tested here, the MPS data recovered using any one of the three methods were consistent with the control Sanger sequence data developed from high-quality known specimens. Given the recently demonstrated value of complete mtGenome data in terms of discrimination power among randomly sampled individuals, the possibility of recovering mtGenome data from the most compromised and limited evidentiary material is likely to vastly increase the utility of mtDNA testing for hair evidence. PMID:25438934

  15. Large-scale massively parallel atomistic simulations of short pulse laser interaction with metals

    NASA Astrophysics Data System (ADS)

    Wu, Chengping; Zhigilei, Leonid; Computational Materials Group Team

    2014-03-01

    Taking advantage of petascale supercomputing architectures, large-scale massively parallel atomistic simulations (10⁸-10⁹ atoms) are performed to study the microscopic mechanisms of short pulse laser interaction with metals. The results of the simulations reveal a complex picture of highly non-equilibrium processes responsible for material modification and/or ejection. At low laser fluences below the ablation threshold, fast melting and resolidification occur under conditions of extreme heating and cooling rates resulting in surface microstructure modification. At higher laser fluences in the spallation regime, the material is ejected by the relaxation of laser-induced stresses and proceeds through the nucleation, growth and percolation of multiple voids in the sub-surface region of the irradiated target. At a fluence of ~2.5 times the spallation threshold, the top part of the target reaches the conditions for an explosive decomposition into vapor and small droplets, marking the transition to the phase explosion regime of laser ablation. The dynamics of plume formation and the characteristics of the ablation plume are obtained from the simulations and compared with the results of time-resolved plume imaging experiments. Financial support for this work was provided by NSF (DMR-0907247 and CMMI-1301298) and AFOSR (FA9550-10-1-0541). Computational support was provided by the OLCF (MAT048) and XSEDE (TG-DMR110090).

  16. Frequency of Usher syndrome type 1 in deaf children by massively parallel DNA sequencing.

    PubMed

    Yoshimura, Hidekane; Miyagawa, Maiko; Kumakawa, Kozo; Nishio, Shin-Ya; Usami, Shin-Ichi

    2016-05-01

    Usher syndrome type 1 (USH1) is the most severe of the three USH subtypes due to its profound hearing loss, absent vestibular response and retinitis pigmentosa appearing at a prepubescent age. Six causative genes have been identified for USH1, making early diagnosis and therapy possible through DNA testing. Targeted exon sequencing of selected genes using massively parallel DNA sequencing (MPS) technology enables clinicians to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using MPS along with direct sequence analysis, we screened 227 unrelated non-syndromic deaf children and detected recessive mutations in USH1 causative genes in five patients (2.2%): three patients harbored MYO7A mutations and one each carried CDH23 or PCDH15 mutations. As indicated by an earlier genotype-phenotype correlation study of the CDH23 and PCDH15 genes, we considered the latter two patients to have USH1. Based on clinical findings, it was also highly likely that one patient with MYO7A mutations possessed USH1 due to a late onset age of walking. This first report describing the frequency (1.3-2.2%) of USH1 among non-syndromic deaf children highlights the importance of comprehensive genetic testing for early disease diagnosis. PMID:26791358

  17. Tracking the roots of cellulase hyperproduction by the fungus Trichoderma reesei using massively parallel DNA sequencing

    SciTech Connect

    Le Crom, Stéphane; Schackwitz, Wendy; Pennacchio, Len; Magnuson, Jon K.; Culley, David E.; Collett, James R.; Martin, Joel; Druzhinina, Irina S.; Mathis, Hugues; Monot, Frédéric; Seiboth, Bernhard; Cherry, Barbara; Rey, Michael; Berka, Randy; Kubicek, Christian P.; Baker, Scott E.; Margeot, Antoine

    2009-09-22

    Trichoderma reesei (teleomorph Hypocrea jecorina) is the main industrial source of cellulases and hemicellulases harnessed for the hydrolysis of biomass to simple sugars, which can then be converted to biofuels, such as ethanol, and other chemicals. The highly productive strains in use today were generated by classical mutagenesis. To learn how cellulase production was improved by these techniques, we performed massively parallel sequencing to identify mutations in the genomes of two hyperproducing strains (NG14, and its direct improved descendant, RUT C30). We detected a surprisingly high number of mutagenic events: 223 single-nucleotide variants, 15 small deletions or insertions, and 18 larger deletions leading to the loss of more than 100 kb of genomic DNA. From these events we report previously undocumented non-synonymous mutations in 43 genes that are mainly involved in nuclear transport, mRNA stability, transcription, secretion/vacuolar targeting, and metabolism. This homogeneity of functional categories suggests that multiple changes are necessary to improve cellulase production and not simply a few clear-cut mutagenic events. Phenotype microarrays show that some of these mutations result in strong changes in the carbon assimilation pattern of the two mutants with respect to the wild-type strain QM6a. Our analysis provides the first genome-wide insights into the changes induced by classical mutagenesis in a filamentous fungus, and suggests new areas for the generation of enhanced T. reesei strains for industrial applications such as biofuel production.

  18. Frequency of Usher syndrome type 1 in deaf children by massively parallel DNA sequencing

    PubMed Central

    Yoshimura, Hidekane; Miyagawa, Maiko; Kumakawa, Kozo; Nishio, Shin-ya; Usami, Shin-ichi

    2016-01-01

    Usher syndrome type 1 (USH1) is the most severe of the three USH subtypes due to its profound hearing loss, absent vestibular response and retinitis pigmentosa appearing at a prepubescent age. Six causative genes have been identified for USH1, making early diagnosis and therapy possible through DNA testing. Targeted exon sequencing of selected genes using massively parallel DNA sequencing (MPS) technology enables clinicians to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using MPS along with direct sequence analysis, we screened 227 unrelated non-syndromic deaf children and detected recessive mutations in USH1 causative genes in five patients (2.2%): three patients harbored MYO7A mutations and one each carried CDH23 or PCDH15 mutations. As indicated by an earlier genotype–phenotype correlation study of the CDH23 and PCDH15 genes, we considered the latter two patients to have USH1. Based on clinical findings, it was also highly likely that one patient with MYO7A mutations possessed USH1 due to a late onset age of walking. This first report describing the frequency (1.3–2.2%) of USH1 among non-syndromic deaf children highlights the importance of comprehensive genetic testing for early disease diagnosis. PMID:26791358

  19. Massively parallel sampling of lattice proteins reveals foundations of thermal adaptation.

    PubMed

    Venev, Sergey V; Zeldovich, Konstantin B

    2015-08-01

    Evolution of proteins in bacteria and archaea living in different conditions leads to significant correlations between amino acid usage and environmental temperature. The origins of these correlations are poorly understood, and an important question of protein theory, physics-based prediction of types of amino acids overrepresented in highly thermostable proteins, remains largely unsolved. Here, we extend the random energy model of protein folding by weighting the interaction energies of amino acids by their frequencies in protein sequences and predict the energy gap of proteins designed to fold well at elevated temperatures. To test the model, we present a novel scalable algorithm for simultaneous energy calculation for many sequences in many structures, targeting massively parallel computing architectures such as graphics processing units. The energy calculation is performed by multiplying two matrices, one representing the complete set of sequences, and the other describing the contact maps of all structural templates. An implementation of the algorithm for the CUDA platform is available at http://www.github.com/kzeldovich/galeprot and calculates protein folding energies over 250 times faster than a single central processing unit. Analysis of amino acid usage in 64-mer cubic lattice proteins designed to fold well at different temperatures demonstrates an excellent agreement between theoretical and simulated values of energy gap. The theoretical predictions of temperature trends of amino acid frequencies are significantly correlated with bioinformatics data on 191 bacteria and archaea, and highlight protein folding constraints as a fundamental selection pressure during thermal adaptation in biological evolution. PMID:26254668
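
    As a rough illustration of the matrix-product formulation described above, the following NumPy sketch computes the energies of every sequence in every structure with a single matrix multiplication. It is a minimal sketch, not the authors' CUDA implementation (galeprot): the random symmetric 20×20 potential and the random sparse contact maps are stand-ins for a real contact potential and real compact 64-mer lattice conformations.

```python
import numpy as np

L = 64                       # chain length (64-mer lattice proteins)
n_seq, n_str = 1000, 500     # number of sequences and structural templates
rng = np.random.default_rng(0)

# Stand-in symmetric pairwise potential: e[a, b] = contact energy of
# amino acid types a and b (a real study would use a published potential).
e = rng.normal(size=(20, 20))
e = (e + e.T) / 2

seqs = rng.integers(0, 20, size=(n_seq, L))   # sequences as ints 0..19

# Enumerate all unordered position pairs once; P = L*(L-1)/2.
ii, jj = np.triu_indices(L, k=1)

# Matrix 1: per-sequence interaction energy for every position pair.
A = e[seqs[:, ii], seqs[:, jj]]               # shape (n_seq, P)

# Matrix 2: contact maps of all structures; B[p, m] = 1 if the p-th
# position pair is in contact in structure m (random maps used here).
B = (rng.random(size=(len(ii), n_str)) < 0.05).astype(float)

# One product yields E[k, m] = folding energy of sequence k in structure m.
E = A @ B                                     # shape (n_seq, n_str)
print(E.shape, E[0, :3])
```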

  20. Tracking the roots of cellulase hyperproduction by the fungus Trichoderma reesei using massively parallel DNA sequencing

    PubMed Central

    Le Crom, Stéphane; Schackwitz, Wendy; Pennacchio, Len; Magnuson, Jon K.; Culley, David E.; Collett, James R.; Martin, Joel; Druzhinina, Irina S.; Mathis, Hugues; Monot, Frédéric; Seiboth, Bernhard; Cherry, Barbara; Rey, Michael; Berka, Randy; Kubicek, Christian P.; Baker, Scott E.; Margeot, Antoine

    2009-01-01

    Trichoderma reesei (teleomorph Hypocrea jecorina) is the main industrial source of cellulases and hemicellulases harnessed for the hydrolysis of biomass to simple sugars, which can then be converted to biofuels such as ethanol and other chemicals. The highly productive strains in use today were generated by classical mutagenesis. To learn how cellulase production was improved by these techniques, we performed massively parallel sequencing to identify mutations in the genomes of two hyperproducing strains (NG14, and its direct improved descendant, RUT C30). We detected a surprisingly high number of mutagenic events: 223 single-nucleotide variants, 15 small deletions or insertions, and 18 larger deletions, leading to the loss of more than 100 kb of genomic DNA. From these events, we report previously undocumented non-synonymous mutations in 43 genes that are mainly involved in nuclear transport, mRNA stability, transcription, secretion/vacuolar targeting, and metabolism. This homogeneity of functional categories suggests that multiple changes are necessary to improve cellulase production and not simply a few clear-cut mutagenic events. Phenotype microarrays show that some of these mutations result in strong changes in the carbon assimilation pattern of the two mutants with respect to the wild-type strain QM6a. Our analysis provides genome-wide insights into the changes induced by classical mutagenesis in a filamentous fungus and suggests areas for the generation of enhanced T. reesei strains for industrial applications such as biofuel production. PMID:19805272

  1. Tracking the roots of cellulase hyperproduction by the fungus Trichoderma reesei using massively parallel DNA sequencing.

    PubMed

    Le Crom, Stéphane; Schackwitz, Wendy; Pennacchio, Len; Magnuson, Jon K; Culley, David E; Collett, James R; Martin, Joel; Druzhinina, Irina S; Mathis, Hugues; Monot, Frédéric; Seiboth, Bernhard; Cherry, Barbara; Rey, Michael; Berka, Randy; Kubicek, Christian P; Baker, Scott E; Margeot, Antoine

    2009-09-22

    Trichoderma reesei (teleomorph Hypocrea jecorina) is the main industrial source of cellulases and hemicellulases harnessed for the hydrolysis of biomass to simple sugars, which can then be converted to biofuels such as ethanol and other chemicals. The highly productive strains in use today were generated by classical mutagenesis. To learn how cellulase production was improved by these techniques, we performed massively parallel sequencing to identify mutations in the genomes of two hyperproducing strains (NG14, and its direct improved descendant, RUT C30). We detected a surprisingly high number of mutagenic events: 223 single-nucleotide variants, 15 small deletions or insertions, and 18 larger deletions, leading to the loss of more than 100 kb of genomic DNA. From these events, we report previously undocumented non-synonymous mutations in 43 genes that are mainly involved in nuclear transport, mRNA stability, transcription, secretion/vacuolar targeting, and metabolism. This homogeneity of functional categories suggests that multiple changes are necessary to improve cellulase production and not simply a few clear-cut mutagenic events. Phenotype microarrays show that some of these mutations result in strong changes in the carbon assimilation pattern of the two mutants with respect to the wild-type strain QM6a. Our analysis provides genome-wide insights into the changes induced by classical mutagenesis in a filamentous fungus and suggests areas for the generation of enhanced T. reesei strains for industrial applications such as biofuel production. PMID:19805272

  2. SIESTA-PEXSI: Massively parallel method for efficient and accurate ab initio materials simulation

    NASA Astrophysics Data System (ADS)

    Lin, Lin; Huhs, Georg; Garcia, Alberto; Yang, Chao

    2014-03-01

    We describe how to combine the pole expansion and selected inversion (PEXSI) technique with the SIESTA method, which uses numerical atomic orbitals for Kohn-Sham density functional theory (KSDFT) calculations. The PEXSI technique can efficiently utilize the sparsity pattern of the Hamiltonian matrix and the overlap matrix generated from codes such as SIESTA, and solves KSDFT without using a cubic-scaling matrix diagonalization procedure. The complexity of PEXSI scales at most quadratically with respect to the system size, and the accuracy is comparable to that obtained from full diagonalization. One distinct feature of PEXSI is that it achieves low-order scaling without using the near-sightedness property and can therefore be applied to metals as well as insulators and semiconductors, at room temperature or even lower temperatures. The PEXSI method is highly scalable, and the recently developed massively parallel PEXSI technique can make efficient use of 10,000-100,000 processors on high-performance machines. We demonstrate the performance of the SIESTA-PEXSI method using several examples of large-scale electronic structure calculation, including a long DNA chain and graphene-like structures with more than 20,000 atoms. Funded by the Luis Alvarez fellowship at LBNL, and a DOE SciDAC project in partnership with BES.
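
    As a sketch of the underlying idea (following the general PEXSI literature rather than necessarily the exact SIESTA-PEXSI formulation), the single-particle density matrix is approximated by a short sum over poles, each term requiring only selected entries of a shifted inverse:

```latex
\[
\Gamma \;\approx\; \operatorname{Im}\!\left(\sum_{l=1}^{P} \omega_l \,
\bigl(H - (z_l + \mu)\,S\bigr)^{-1}\right),
\]
```

    Here H and S are the sparse Hamiltonian and overlap matrices, μ is the chemical potential, and the P pole shifts z_l and weights ω_l are chosen so that the sum approximates the Fermi-Dirac function. Selected inversion then computes only those entries of each inverse that match the sparsity pattern of H and S, which is what avoids the cubic cost of full diagonalization.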

  3. ALEGRA -- A massively parallel h-adaptive code for solid dynamics

    SciTech Connect

    Summers, R.M.; Wong, M.K.; Boucheron, E.A.; Weatherby, J.R.

    1997-12-31

    ALEGRA is a multi-material, arbitrary-Lagrangian-Eulerian (ALE) code for solid dynamics designed to run on massively parallel (MP) computers. It combines the features of modern Eulerian shock codes, such as CTH, with modern Lagrangian structural analysis codes using an unstructured grid. ALEGRA is being developed for use on the teraflop supercomputers to conduct advanced three-dimensional (3D) simulations of shock phenomena important to a variety of systems. ALEGRA was designed with the Single Program Multiple Data (SPMD) paradigm, in which the mesh is decomposed into sub-meshes so that each processor gets a single sub-mesh with approximately the same number of elements. Using this approach the authors have been able to produce a single code that can scale from one processor to thousands of processors. A current major effort is to develop efficient, high precision simulation capabilities for ALEGRA, without the computational cost of using a global highly resolved mesh, through flexible, robust h-adaptivity of finite elements. H-adaptivity is the dynamic refinement of the mesh by subdividing elements, thus changing the characteristic element size and reducing numerical error. The authors are working on several major technical challenges that must be met to make effective use of HAMMER on MP computers.
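
    A minimal sketch of the SPMD-style element partitioning described above, assuming a simple contiguous split into near-equal chunks; a production code would instead rely on a mesh/graph partitioner that also minimizes the interface area between sub-meshes.

```python
def partition_elements(n_elements: int, n_ranks: int) -> list[range]:
    """Split element ids 0..n_elements-1 into n_ranks contiguous chunks
    whose sizes differ by at most one element, so each processor gets a
    sub-mesh with approximately the same number of elements."""
    base, extra = divmod(n_elements, n_ranks)
    chunks, start = [], 0
    for r in range(n_ranks):
        size = base + (1 if r < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks

# 10,000 elements across 7 ranks: chunk sizes differ by at most one.
print([len(c) for c in partition_elements(10_000, 7)])
```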

  4. Massively parallel sequencing-based survey of eukaryotic community structures in Hiroshima Bay and Ishigaki Island.

    PubMed

    Nagai, Satoshi; Hida, Kohsuke; Urusizaki, Shingo; Takano, Yoshihito; Hongo, Yuki; Kameda, Takahiko; Abe, Kazuo

    2016-02-01

    In this study, we compared the eukaryote biodiversity between Hiroshima Bay and Ishigaki Island in Japanese coastal waters by using the massively parallel sequencing (MPS)-based technique to collect preliminary data. The relative abundance of Alveolata was highest in both localities, and the second highest groups were Stramenopiles, Opisthokonta, or Hacrobia, which varied depending on the samples considered. For microalgal phyla, the relative abundance of operational taxonomic units (OTUs) and the number of MPS reads were highest for Dinophyceae in both localities, followed by Bacillariophyceae in Hiroshima Bay, and by Bacillariophyceae or Chlorophyceae in Ishigaki Island. The number of detected OTUs in Hiroshima Bay and Ishigaki Island was 645 and 791, respectively, and 15.3% and 12.5% of the OTUs were common between the two localities. In the non-metric multidimensional scaling analysis, the samples from the two localities were plotted in different positions. In the dendrogram developed using similarity indices, the samples were clustered into different nodes based on localities with high multiscale bootstrap values, reflecting geographic differences in biodiversity. Thus, we succeeded in demonstrating biodiversity differences between the two localities, although the numbers of MPS reads were limited. Correspondence analysis showed a clear seasonal change in the biodiversity of Hiroshima Bay, but this was not clear in Ishigaki Island. Thus, the MPS-based technique shows the great advantage of high performance, detecting several hundred OTUs from a single sample, and strongly suggests its suitability for routine monitoring programs. PMID:26476293

  5. Massively parallel enzyme kinetics reveals the substrate recognition landscape of the metalloprotease ADAMTS13

    PubMed Central

    Kretz, Colin A.; Dai, Manhong; Soylemez, Onuralp; Yee, Andrew; Desch, Karl C.; Siemieniak, David; Tomberg, Kärt; Kondrashov, Fyodor A.; Meng, Fan; Ginsburg, David

    2015-01-01

    Proteases play important roles in many biologic processes and are key mediators of cancer, inflammation, and thrombosis. However, comprehensive and quantitative techniques to define the substrate specificity profile of proteases are lacking. The metalloprotease ADAMTS13 regulates blood coagulation by cleaving von Willebrand factor (VWF), reducing its procoagulant activity. A mutagenized substrate phage display library based on a 73-amino acid fragment of VWF was constructed, and the ADAMTS13-dependent change in library complexity was evaluated over reaction time points, using high-throughput sequencing. Reaction rate constants (kcat/KM) were calculated for nearly every possible single amino acid substitution within this fragment. This massively parallel enzyme kinetics analysis detailed the specificity of ADAMTS13 and demonstrated the critical importance of the P1-P1′ substrate residues while defining exosite binding domains. These data provided empirical evidence for the propensity for epistasis within VWF and showed strong correlation to conservation across orthologs, highlighting evolutionary selective pressures for VWF. PMID:26170332
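
    The kinetic readout described above can be illustrated with a toy calculation. Assuming pseudo-first-order cleavage ([S] << KM), the relative read frequency of a variant decays as f(t) = f(0)·exp(-(kcat/KM)·[E]·t), so kcat/KM follows from a log-linear fit. All numbers below are hypothetical, and this is a sketch of the analysis concept, not the study's actual pipeline.

```python
import numpy as np

# Hypothetical read counts for one substrate variant and an uncleavable
# reference across reaction time points (illustrative numbers only).
t = np.array([0.0, 5.0, 15.0, 45.0])              # minutes
variant = np.array([9000, 5200, 1900, 240])       # reads for the variant
reference = np.array([10000, 9800, 10100, 9900])  # cleavage-resistant control
E = 5e-9                                          # assumed enzyme conc. (M)

# Normalizing by the reference corrects for sequencing depth per time point.
f = (variant / reference) / (variant[0] / reference[0])

# Log-linear fit of ln f(t) vs. t (in seconds) gives -(kcat/KM)*[E].
slope = np.polyfit(t * 60.0, np.log(f), 1)[0]
kcat_over_KM = -slope / E
print(f"kcat/KM ~ {kcat_over_KM:.2e} M^-1 s^-1")
```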

  6. The minimal amount of starting DNA for Agilent's hybrid capture-based targeted massively parallel sequencing.

    PubMed

    Chung, Jongsuk; Son, Dae-Soon; Jeon, Hyo-Jeong; Kim, Kyoung-Mee; Park, Gahee; Ryu, Gyu Ha; Park, Woong-Yang; Park, Donghyun

    2016-01-01

    Targeted capture massively parallel sequencing is increasingly being used in clinical settings, and as costs continue to decline, use of this technology may become routine in health care. However, a limited amount of tissue has often been a challenge in meeting quality requirements. To offer a practical guideline for the minimum amount of input DNA for targeted sequencing, we optimized and evaluated the performance of targeted sequencing depending on the input DNA amount. First, using various amounts of input DNA, we compared commercially available library construction kits and selected Agilent's SureSelect-XT and KAPA Biosystems' Hyper Prep kits as the kits most compatible with targeted deep sequencing using Agilent's SureSelect custom capture. Then, we optimized the adapter ligation conditions of the Hyper Prep kit to improve library construction efficiency and adapted multiplexed hybrid selection to reduce the cost of sequencing. In this study, we systematically evaluated the performance of the optimized protocol depending on the amount of input DNA, ranging from 6.25 to 200 ng, suggesting the minimal input DNA amounts based on coverage depths required for specific applications. PMID:27220682

  7. Nanopantography: A new method for massively parallel nanopatterning over large areas

    NASA Astrophysics Data System (ADS)

    Xu, Lin

    Nanopantography, a radically new method for versatile fabrication of sub-20 nm features in a massively parallel fashion, represents a breakthrough in nanotechnology. The concept of this technique is to focus ion "beamlets" in parallel to write identical, arbitrary nano-patterns. Depending on the ion species, nanopatterns can be either etched or deposited by nanopantography. An array of electrostatic lenses and a broad-area, directional, monoenergetic ion beam are required to implement nanopantography. This dissertation is dedicated to extracting an ion beam with desired properties from a plasma source and realizing nanopantography using this beam. A novel ion extraction strategy has been used to extract a nearly monoenergetic and energy-specified ion beam from a capacitively-coupled or an inductively-coupled, pulsed Ar plasma. The electron temperature decayed rapidly in the afterglow, resulting in uniform plasma potential, and minimal energy spread for ions extracted in the afterglow. Ion energy was controlled by a DC bias, or alternatively by a high-voltage pulse, on the ring electrode surrounding the plasma. Langmuir probe measurements indicated that this bias raised the plasma potential without heating the electrons in the afterglow. The energy spread was 3.4 eV (FWHM) for a peak ion beam energy of 102.0 eV. Similar results were obtained in an inductively-coupled pulsed plasma when the acceleration ring was pulsed exclusively during the afterglow. To achieve Ni deposition by nanopantography, higher Ni atom and ion densities are desired in the plasma source. An ionized physical vapor deposition (IPVD) system with a Ni internal RF coil and Ni target was used to introduce Ni atoms, and a fraction of the atoms becomes ionized in the high-density plasma. Optical emission spectroscopy (OES) and optical absorption spectroscopy (OAS), in combination with global models, were used to determine the Ni atom and ion density. For a pressure of 8-20 mTorr and coil power of 40

  8. A Massive Parallel Variational Multiscale FEM Scheme Applied to Nonhydrostatic Atmospheric Dynamics

    NASA Astrophysics Data System (ADS)

    Vazquez, Mariano; Marras, Simone; Moragues, Margarida; Jorba, Oriol; Houzeaux, Guillaume; Aubry, Romain

    2010-05-01

    The solution of the fully compressible Euler equations of stratified flows is approached from the point of view of Computational Fluid Dynamics techniques. Specifically, the main aim of this contribution is the introduction of a compressible Variational Multiscale Finite Element (CVMS-FE) approach to solve dry atmospheric dynamics effectively on massively parallel architectures with more than 1000 processors. The conservation form of the equations of motion is discretized in all directions with a Galerkin scheme with stabilization given by the compressible counterpart of the variational multiscale technique of Hughes [1] and Houzeaux et al. [2]. The justification of this effort is twofold. One is the search for optimal parallelization characteristics and linear scalability trends on petascale machines: the development of a numerical algorithm whose local nature keeps communication among the processors minimal implies, in fact, a large leap towards efficient parallel computing. Second, the rising trend toward global models and models of higher spatial resolution naturally suggests the use of adaptive grids that resolve only zones of larger gradients while keeping the computational mesh properly coarse elsewhere (thus keeping the computational cost low). With these two hypotheses in mind, the finite element scheme presented here is an open option for the development of the next generation of Numerical Weather Prediction (NWP) codes. This methodology is as new in Computational Fluid Dynamics for compressible flows at low Mach number as it is in NWP. We mean, however, to show its ability to maintain stability in the solution of thermal, gravity-driven flows in a stratified environment in the specific context of dry atmospheric dynamics. Standard two-dimensional benchmarks are implemented and compared against the reference literature. In the context of thermal and gravity-driven flows in a neutral atmosphere, we present: (1) the density current
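
    In generic form, the variational multiscale construction referred to above splits the solution into a resolved finite element part and a modeled subgrid part. A sketch of the standard statement (not the paper's specific compressible formulation) is:

```latex
\[
u = u_h + u', \qquad u' \approx -\tau_K\, \mathcal{R}(u_h),
\]
\[
\text{find } u_h \text{ such that } \quad
a(w_h, u_h) \;+\; \sum_{K} \bigl(\mathcal{L}^{*} w_h,\;
\tau_K\, \mathcal{R}(u_h)\bigr)_{K} \;=\; (w_h, f)
\quad \text{for all test functions } w_h,
\]
```

    where R(u_h) is the element residual, L* the adjoint of the differential operator, and τ_K an algebraic approximation of the subgrid-scale effect on each element K. Because the stabilization terms are element-local, the extra work adds essentially no inter-processor communication, which is the property the text identifies as key to scalability.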

  9. Massively parallel computation of 3D flow and reactions in chemical vapor deposition reactors

    SciTech Connect

    Salinger, A.G.; Shadid, J.N.; Hutchinson, S.A.; Hennigan, G.L.; Devine, K.D.; Moffat, H.K.

    1997-12-01

    Computer modeling of Chemical Vapor Deposition (CVD) reactors can greatly aid in the understanding, design, and optimization of these complex systems. Modeling is particularly attractive in these systems since experimentally evaluating many design alternatives can be prohibitively expensive, time-consuming, and even dangerous when working with toxic chemicals like arsine (AsH₃): until now, predictive modeling has not been possible for most systems since the behavior is three-dimensional and governed by complex reaction mechanisms. In addition, CVD reactors often exhibit large thermal gradients, large changes in physical properties over regions of the domain, and significant thermal diffusion for gas mixtures with widely varying molecular weights. As a result, significant simplifications in the models have been made which erode the accuracy of the models' predictions. In this paper, the authors will demonstrate how the vast computational resources of massively parallel computers can be exploited to make possible the analysis of models that include coupled fluid flow and detailed chemistry in three-dimensional domains. For the most part, models have either simplified the reaction mechanisms and concentrated on the fluid flow, or have simplified the fluid flow and concentrated on rigorous reactions. An important CVD research thrust has been in detailed modeling of fluid flow and heat transfer in the reactor vessel, treating transport and reaction of chemical species either very simply or as a totally decoupled problem. Using the analogy between heat transfer and mass transfer, and the fact that deposition is often diffusion limited, much can be learned from these calculations; however, the effects of thermal diffusion, the change in physical properties with composition, and the incorporation of surface reaction mechanisms are not included in this model, nor can transitions to three-dimensional flows be detected.

  10. Massively parallel neural circuits for stereoscopic color vision: encoding, decoding and identification.

    PubMed

    Lazar, Aurel A; Slutskiy, Yevgeniy B; Zhou, Yiyin

    2015-03-01

    Past work demonstrated how monochromatic visual stimuli could be faithfully encoded and decoded under Nyquist-type rate conditions. Color visual stimuli were then traditionally encoded and decoded in multiple separate monochromatic channels. The brain, however, appears to mix information about color channels at the earliest stages of the visual system, including the retina itself. If information about color is mixed and encoded by a common pool of neurons, how can colors be demixed and perceived? We present Color Video Time Encoding Machines (Color Video TEMs) for encoding color visual stimuli that take into account a variety of color representations within a single neural circuit. We then derive a Color Video Time Decoding Machine (Color Video TDM) algorithm for color demixing and reconstruction of color visual scenes from spikes produced by a population of visual neurons. In addition, we formulate Color Video Channel Identification Machines (Color Video CIMs) for functionally identifying color visual processing performed by a spiking neural circuit. Furthermore, we derive a duality between TDMs and CIMs that unifies the two and leads to a general theory of neural information representation for stereoscopic color vision. We provide examples demonstrating that a massively parallel color visual neural circuit can be first identified with arbitrary precision and its spike trains can be subsequently used to reconstruct the encoded stimuli. We argue that evaluation of the functional identification methodology can be effectively and intuitively performed in the stimulus space. In this space, a signal reconstructed from spike trains generated by the identified neural circuit can be compared to the original stimulus. PMID:25594573
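
    A single integrate-and-fire neuron is the simplest instance of the time encoding idea underlying these machines. The sketch below encodes one channel of a stimulus into spike times; the paper's circuits generalize this to populations encoding mixed color channels, and all parameter values here are illustrative.

```python
import numpy as np

def iaf_encode(u, dt, b=1.5, kappa=1.0, delta=0.02):
    """Encode signal samples u (with |u| < b) into spike times using an
    integrate-and-fire neuron: integrate b + u(t) until the threshold
    kappa*delta is reached, emit a spike, and reset. A minimal sketch of
    a time encoding machine; parameters are illustrative only."""
    spikes, integral = [], 0.0
    for n, un in enumerate(u):
        integral += (b + un) * dt
        if integral >= kappa * delta:
            spikes.append(n * dt)
            integral -= kappa * delta   # reset, keeping the overshoot
    return np.array(spikes)

dt = 1e-4
t = np.arange(0, 0.1, dt)
u = 0.5 * np.sin(2 * np.pi * 30 * t)    # toy band-limited stimulus
print(len(iaf_encode(u, dt)), "spikes")
```

    The inter-spike intervals implicitly carry the integral of the stimulus between spikes, which is what makes faithful reconstruction (the decoding machine) possible under Nyquist-type rate conditions.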

  11. Detection of inherited mutations for breast and ovarian cancer using genomic capture and massively parallel sequencing

    PubMed Central

    Walsh, Tom; Lee, Ming K.; Casadei, Silvia; Thornton, Anne M.; Stray, Sunday M.; Pennil, Christopher; Nord, Alex S.; Mandell, Jessica B.; Swisher, Elizabeth M.; King, Mary-Claire

    2010-01-01

    Inherited loss-of-function mutations in the tumor suppressor genes BRCA1, BRCA2, and multiple other genes predispose to high risks of breast and/or ovarian cancer. Cancer-associated inherited mutations in these genes are collectively quite common, but individually rare or even private. Genetic testing for BRCA1 and BRCA2 mutations has become an integral part of clinical practice, but testing is generally limited to these two genes and to women with severe family histories of breast or ovarian cancer. To determine whether massively parallel, “next-generation” sequencing would enable accurate, thorough, and cost-effective identification of inherited mutations for breast and ovarian cancer, we developed a genomic assay to capture, sequence, and detect all mutations in 21 genes, including BRCA1 and BRCA2, with inherited mutations that predispose to breast or ovarian cancer. Constitutional genomic DNA from subjects with known inherited mutations, ranging in size from 1 to >100,000 bp, was hybridized to custom oligonucleotides and then sequenced using a genome analyzer. Analysis was carried out blind to the mutation in each sample. Average coverage was >1200 reads per base pair. After filtering sequences for quality and number of reads, all single-nucleotide substitutions, small insertion and deletion mutations, and large genomic duplications and deletions were detected. There were zero false-positive calls of nonsense mutations, frameshift mutations, or genomic rearrangements for any gene in any of the test samples. This approach enables widespread genetic testing and personalized risk assessment for breast and ovarian cancer. PMID:20616022

  12. Massively parallel sampling of lattice proteins reveals foundations of thermal adaptation

    NASA Astrophysics Data System (ADS)

    Venev, Sergey V.; Zeldovich, Konstantin B.

    2015-08-01

    Evolution of proteins in bacteria and archaea living in different conditions leads to significant correlations between amino acid usage and environmental temperature. The origins of these correlations are poorly understood, and an important question of protein theory, physics-based prediction of types of amino acids overrepresented in highly thermostable proteins, remains largely unsolved. Here, we extend the random energy model of protein folding by weighting the interaction energies of amino acids by their frequencies in protein sequences and predict the energy gap of proteins designed to fold well at elevated temperatures. To test the model, we present a novel scalable algorithm for simultaneous energy calculation for many sequences in many structures, targeting massively parallel computing architectures such as graphics processing units. The energy calculation is performed by multiplying two matrices, one representing the complete set of sequences, and the other describing the contact maps of all structural templates. An implementation of the algorithm for the CUDA platform is available at http://www.github.com/kzeldovich/galeprot and calculates protein folding energies over 250 times faster than a single central processing unit. Analysis of amino acid usage in 64-mer cubic lattice proteins designed to fold well at different temperatures demonstrates an excellent agreement between theoretical and simulated values of energy gap. The theoretical predictions of temperature trends of amino acid frequencies are significantly correlated with bioinformatics data on 191 bacteria and archaea, and highlight protein folding constraints as a fundamental selection pressure during thermal adaptation in biological evolution.
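
    For orientation, the unweighted random energy model gives a standard estimate of the quantities discussed above; the paper's frequency-weighted extension modifies the variance term, so the following is background rather than the paper's result:

```latex
\[
E_{\min} \;\approx\; \bar{E} \;-\; \sigma\sqrt{2\ln M},
\qquad
\Delta \;=\; E_{\min} - E_{\mathrm{nat}},
\]
```

    where M is the number of competing (misfolded) conformations, Ē and σ² are the mean and variance of decoy energies, E_nat is the native-state energy, and Δ is the energy gap that sequence design seeks to maximize at a given temperature.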

  13. A SNP panel for identity and kinship testing using massive parallel sequencing.

    PubMed

    Grandell, Ida; Samara, Raed; Tillmar, Andreas O

    2016-07-01

    Within forensic genetics, there is still a need for supplementary DNA marker typing in order to increase the power to solve cases for both identity testing and complex kinship issues. One major disadvantage of current capillary electrophoresis (CE) methods is the limitation in DNA marker multiplex capability. By utilizing massively parallel sequencing (MPS) technology, this capability can, however, be increased. We have designed a customized GeneRead DNASeq SNP panel (Qiagen) of 140 previously published autosomal forensically relevant identity SNPs for analysis using MPS. A single amplification step was followed by library preparation using the GeneRead Library Prep workflow (Qiagen). The sequencing was performed on a MiSeq System (Illumina), and the bioinformatic analyses were done using the software Biomedical Genomics Workbench (CLC Bio, Qiagen). Forty-nine individuals from a Swedish population were genotyped in order to establish genotype frequencies and to evaluate the performance of the assay. The analyses showed balanced coverage among the included loci, and fewer than 0.5% of heterozygous-balance values were outliers. Analyses of dilution series of the 2800M Control DNA gave reproducible results down to 0.2 ng DNA input. In addition, typing of FTA samples and bone samples was performed with promising results. Further studies and optimizations are, however, required for a more detailed evaluation of performance on degraded and PCR-inhibited forensic samples. In summary, the assay offers a straightforward sample-to-genotype workflow and could be useful for gaining information in forensic casework, both for identity testing and for solving complex kinship issues. PMID:26932869

  14. Implementation of a Message Passing Interface into a Cloud-Resolving Model for Massively Parallel Computing

    NASA Technical Reports Server (NTRS)

    Juang, Hann-Ming Henry; Tao, Wei-Kuo; Zeng, Xi-Ping; Shie, Chung-Lin; Simpson, Joanne; Lang, Steve

    2004-01-01

    The capability for massively parallel programming (MPP) using a message passing interface (MPI) has been implemented into a three-dimensional version of the Goddard Cumulus Ensemble (GCE) model. The design for the MPP with MPI uses the concept of maintaining similar code structure between the whole domain and the sub-domains after decomposition. Hence the model follows the same integration for single and multiple tasks (CPUs). Also, it requires minimal changes to the original code, so it is easily modified and/or managed by the model developers and users who have little knowledge of MPP. The entire model domain can be sliced into a one- or two-dimensional decomposition with a halo region overlaid on each partial domain. The halo region requires that no data be fetched across tasks during the computational stage, but it must be updated before the next computational stage through data exchange via MPI. For reproducibility, transposing data among tasks is required for the spectral transform (Fast Fourier Transform, FFT), which is used in the anelastic version of the model for solving the pressure equation. The performance of the MPI-implemented codes (i.e., the compressible and anelastic versions) was tested on three different computing platforms. The major results are: 1) both versions achieve parallel efficiencies of about 99% up to 256 tasks, but not for 512 tasks; 2) the anelastic version has better speedup and efficiency because it requires more computation than the compressible version; 3) equal or approximately equal numbers of slices in the x- and y-directions provide the fastest integration due to fewer data exchanges; and 4) one-dimensional slices in the x-direction result in the slowest integration due to the need for more memory relocation during computation.
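
    The halo-update step described above can be sketched with mpi4py (assumed available; run under mpiexec). This shows a one-dimensional decomposition with a one-cell halo on each side; the two-dimensional case exchanges in y as well.

```python
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# One-dimensional decomposition in x with a one-cell halo on each side,
# mirroring the slice-plus-halo layout described above (sizes illustrative).
nx_local = 8
field = np.full(nx_local + 2, float(rank))   # interior cells + 2 halo cells

left = rank - 1 if rank > 0 else MPI.PROC_NULL
right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

# Update halos before the next computational stage: send interior edge
# cells to neighbors, receive their edge cells into our halo slots.
comm.Sendrecv(sendbuf=field[1:2], dest=left,
              recvbuf=field[-1:], source=right)
comm.Sendrecv(sendbuf=field[-2:-1], dest=right,
              recvbuf=field[0:1], source=left)
print(rank, field[0], field[-1])
```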

  15. Identification of cancer/testis-antigen genes by massively parallel signature sequencing

    PubMed Central

    Chen, Yao-Tseng; Scanlan, Matthew J.; Venditti, Charis A.; Chua, Ramon; Theiler, Gregory; Stevenson, Brian J.; Iseli, Christian; Gure, Ali O.; Vasicek, Tom; Strausberg, Robert L.; Jongeneel, C. Victor; Old, Lloyd J.; Simpson, Andrew J. G.

    2005-01-01

    Massively parallel signature sequencing (MPSS) generates millions of short sequence tags corresponding to transcripts from a single RNA preparation. Most MPSS tags can be unambiguously assigned to genes, thereby generating a comprehensive expression profile of the tissue of origin. From the comparison of MPSS data from 32 normal human tissues, we identified 1,056 genes that are predominantly expressed in the testis. Further evaluation by using MPSS tags from cancer cell lines and EST data from a wide variety of tumors identified 202 of these genes as candidates for encoding cancer/testis (CT) antigens. Of these genes, the expression in normal tissues was assessed by RT-PCR in a subset of 166 intron-containing genes, and those with confirmed testis-predominant expression were further evaluated for their expression in 21 cancer cell lines. Thus, 20 CT or CT-like genes were identified, with several exhibiting expression in five or more of the cancer cell lines examined. One of these genes is a member of a CT gene family that we designated as CT45. The CT45 family comprises six highly similar (>98% cDNA identity) genes that are clustered in tandem within a 125-kb region on Xq26.3. CT45 was found to be frequently expressed in both cancer cell lines and lung cancer specimens. Thus, MPSS analysis has resulted in a significant extension of our knowledge of CT antigens, leading to the discovery of a distinctive X-linked CT-antigen gene family. PMID:15905330

  16. Massively parallel computation of lattice associative memory classifiers on multicore processors

    NASA Astrophysics Data System (ADS)

    Ritter, Gerhard X.; Schmalz, Mark S.; Hayden, Eric T.

    2011-09-01

    Over the past quarter century, concepts and theory derived from neural networks (NNs) have featured prominently in the literature of pattern recognition. Implementationally, classical NNs based on the linear inner product can present performance challenges due to the use of multiplication operations. In contrast, NNs having nonlinear kernels based on Lattice Associative Memories (LAM) theory tend to concentrate primarily on addition and maximum/minimum operations. More generally, the emergence of LAM-based NNs, with their superior information storage capacity, fast convergence and training due to relatively lower computational cost, as well as noise-tolerant classification has extended the capabilities of neural networks far beyond the limited applications potential of classical NNs. This paper explores theory and algorithmic approaches for the efficient computation of LAM-based neural networks, in particular lattice neural nets and dendritic lattice associative memories. Of particular interest are massively parallel architectures such as multicore CPUs and graphics processing units (GPUs). Originally developed for video gaming applications, GPUs hold the promise of high computational throughput without compromising numerical accuracy. Unfortunately, currently-available GPU architectures tend to have idiosyncratic memory hierarchies that can produce unacceptably high data movement latencies for relatively simple operations, unless careful design of theory and algorithms is employed. Advantageously, some GPUs (e.g., the Nvidia Fermi GPU) are optimized for efficient streaming computation (e.g., concurrent multiply and add operations). As a result, the linear or nonlinear inner product structures of NNs are inherently suited to multicore GPU computational capabilities. In this paper, the authors' recent research in lattice associative memories and their implementation on multicores is overviewed, with results that show utility for a wide variety of pattern
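
    A minimal NumPy sketch of a lattice auto-associative memory follows, using the standard min-memory construction with max-plus recall; note that both memory construction and recall use only additions and min/max operations, no multiplications, which is the property the text emphasizes. This illustrates the LAM idea generally, not the authors' GPU implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 16, 5                            # pattern dimension, stored patterns
X = rng.uniform(-1, 1, size=(n, k))     # columns are patterns x^1..x^k

# Min lattice memory: W[i, j] = min over patterns of (x_i - x_j).
W = (X[:, None, :] - X[None, :, :]).min(axis=2)

def recall(W, x):
    """Max-plus product: (W [+] x)_i = max_j (W[i, j] + x[j])."""
    return (W + x[None, :]).max(axis=1)

# Every stored pattern is a fixed point of the recall operation.
for m in range(k):
    assert np.allclose(recall(W, X[:, m]), X[:, m])
print("perfect recall of all stored patterns")
```

    Because recall is a max of sums over each row, the operation maps directly onto streaming add/compare hardware, which is why these memories fit multicore CPUs and GPUs so well.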

  17. Identifying Children With Poor Cochlear Implantation Outcomes Using Massively Parallel Sequencing

    PubMed Central

    Wu, Chen-Chi; Lin, Yin-Hung; Liu, Tien-Chen; Lin, Kai-Nan; Yang, Wei-Shiung; Hsu, Chuan-Jen; Chen, Pei-Lung; Wu, Che-Ming

    2015-01-01

    Cochlear implantation is currently the treatment of choice for children with severe to profound hearing impairment. However, the outcomes with cochlear implants (CIs) vary significantly among recipients. The purpose of the present study is to identify the genetic determinants of poor CI outcomes. Twelve children with poor CI outcomes (the “cases”) and 30 “matched controls” with good CI outcomes were subjected to comprehensive genetic analyses using massively parallel sequencing, which targeted 129 known deafness genes. Audiological features, imaging findings, and auditory/speech performance with CIs were then correlated to the genetic diagnoses. We identified genetic variants which are associated with poor CI outcomes in 7 (58%) of the 12 cases; 4 cases had bi-allelic PCDH15 pathogenic mutations and 3 cases were homozygous for the DFNB59 p.G292R variant. Mutations in the WFS1, GJB3, ESRRB, LRTOMT, MYO3A, and POU3F4 genes were detected in 7 (23%) of the 30 matched controls. The allele frequencies of PCDH15 and DFNB59 variants were significantly higher in the cases than in the matched controls (both P < 0.001). In the 7 CI recipients with PCDH15 or DFNB59 variants, otoacoustic emissions were absent in both ears, and imaging findings were normal in all 7 implanted ears. PCDH15 or DFNB59 variants are associated with poor CI performance, yet children with PCDH15 or DFNB59 variants might show clinical features indistinguishable from those of other typical pediatric CI recipients. Accordingly, genetic examination is indicated in all CI candidates before operation. PMID:26166082

  18. Application of Massively Parallel Sequencing to Genetic Diagnosis in Multiplex Families with Idiopathic Sensorineural Hearing Impairment

    PubMed Central

    Wu, Chen-Chi; Lin, Yin-Hung; Lu, Ying-Chang; Chen, Pei-Jer; Yang, Wei-Shiung; Hsu, Chuan-Jen; Chen, Pei-Lung

    2013-01-01

    Despite the clinical utility of genetic diagnosis to address idiopathic sensorineural hearing impairment (SNHI), the current strategy for screening mutations via Sanger sequencing suffers from the limitation that only a limited number of DNA fragments associated with common deafness mutations can be genotyped. Consequently, a definitive genetic diagnosis cannot be achieved in many families with discernible family history. To investigate the diagnostic utility of massively parallel sequencing (MPS), we applied the MPS technique to 12 multiplex families with idiopathic SNHI in which common deafness mutations had previously been ruled out. NimbleGen sequence capture array was designed to target all protein coding sequences (CDSs) and 100 bp of the flanking sequence of 80 common deafness genes. We performed MPS on the Illumina HiSeq2000, and applied BWA, SAMtools, Picard, GATK, Variant Tools, ANNOVAR, and IGV for bioinformatics analyses. Initial data filtering with allele frequencies (<5% in the 1000 Genomes Project and 5400 NHLBI exomes) and PolyPhen2/SIFT scores (>0.95) prioritized 5 indels (insertions/deletions) and 36 missense variants in the 12 multiplex families. After further validation by Sanger sequencing, segregation pattern, and evolutionary conservation of amino acid residues, we identified 4 variants in 4 different genes, which might lead to SNHI in 4 families compatible with autosomal dominant inheritance. These included GJB2 p.R75Q, MYO7A p.T381M, KCNQ4 p.S680F, and MYH9 p.E1256K. Among them, KCNQ4 p.S680F and MYH9 p.E1256K were novel. In conclusion, MPS allows genetic diagnosis in multiplex families with idiopathic SNHI by detecting mutations in relatively uncommon deafness genes. PMID:23451214
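
    The prioritization step described above reduces, in essence, to filtering annotated variants by population allele frequency and pathogenicity scores. The sketch below uses the thresholds quoted in the text (allele frequency < 5%, PolyPhen2/SIFT score > 0.95); the record layout is hypothetical, not a real ANNOVAR or GATK schema, and the failing OTOF entry is an invented example.

```python
# Hypothetical annotated variant records; two passing entries mirror
# variants named in the abstract, the third is invented and filtered out.
variants = [
    {"gene": "GJB2",  "change": "p.R75Q",  "af_1000g": 0.00, "polyphen2": 0.99},
    {"gene": "KCNQ4", "change": "p.S680F", "af_1000g": 0.00, "polyphen2": 0.97},
    {"gene": "OTOF",  "change": "p.A100T", "af_1000g": 0.12, "polyphen2": 0.98},
]

# Keep rare variants predicted damaging; downstream steps (segregation,
# conservation, Sanger validation) would further narrow this list.
candidates = [v for v in variants
              if v["af_1000g"] < 0.05 and v["polyphen2"] > 0.95]
for v in candidates:
    print(v["gene"], v["change"])
```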

  19. Transcriptional analysis of the Arabidopsis ovule by massively parallel signature sequencing

    PubMed Central

    Sánchez-León, Nidia; Arteaga-Vázquez, Mario; Alvarez-Mejía, César; Mendiola-Soto, Javier; Durán-Figueroa, Noé; Rodríguez-Leal, Daniel; Rodríguez-Arévalo, Isaac; García-Campayo, Vicenta; García-Aguilar, Marcelina; Olmedo-Monfil, Vianey; Arteaga-Sánchez, Mario; Martínez de la Vega, Octavio; Nobuta, Kan; Vemaraju, Kalyan; Meyers, Blake C.; Vielle-Calzada, Jean-Philippe

    2012-01-01

    The life cycle of flowering plants alternates between a predominant sporophytic (diploid) and an ephemeral gametophytic (haploid) generation that only occurs in reproductive organs. In Arabidopsis thaliana, the female gametophyte is deeply embedded within the ovule, complicating the study of the genetic and molecular interactions involved in the sporophytic to gametophytic transition. Massively parallel signature sequencing (MPSS) was used to conduct a quantitative large-scale transcriptional analysis of the fully differentiated Arabidopsis ovule prior to fertilization. The expression of 9775 genes was quantified in wild-type ovules, additionally detecting >2200 new transcripts mapping to antisense or intergenic regions. A quantitative comparison of global expression in wild-type and sporocyteless (spl) individuals resulted in 1301 genes showing 25-fold reduced or null activity in ovules lacking a female gametophyte, including those encoding 92 signalling proteins, 75 transcription factors, and 72 RNA-binding proteins not reported in previous studies based on microarray profiling. A combination of independent genetic and molecular strategies confirmed the differential expression of 28 of them, showing that they are either preferentially active in the female gametophyte, or dependent on the presence of a female gametophyte to be expressed in sporophytic cells of the ovule. Among 18 genes encoding pentatricopeptide-repeat proteins (PPRs) that show transcriptional activity in wild-type but not spl ovules, CIHUATEOTL (At4g38150) is specifically expressed in the female gametophyte and necessary for female gametogenesis. These results expand the nature of the transcriptional universe present in the ovule of Arabidopsis, and offer a large-scale quantitative reference of global expression for future genomic and developmental studies. PMID:22442422

  20. Systematic comparison of three genomic enrichment methods for massively parallel DNA sequencing.

    PubMed

    Teer, Jamie K; Bonnycastle, Lori L; Chines, Peter S; Hansen, Nancy F; Aoyama, Natsuyo; Swift, Amy J; Abaan, Hatice Ozel; Albert, Thomas J; Margulies, Elliott H; Green, Eric D; Collins, Francis S; Mullikin, James C; Biesecker, Leslie G

    2010-10-01

    Massively parallel DNA sequencing technologies have greatly increased our ability to generate large amounts of sequencing data at a rapid pace. Several methods have been developed to enrich for genomic regions of interest for targeted sequencing. We have compared three of these methods: Molecular Inversion Probes (MIP), Solution Hybrid Selection (SHS), and Microarray-based Genomic Selection (MGS). Using HapMap DNA samples, we compared each of these methods with respect to their ability to capture an identical set of exons and evolutionarily conserved regions associated with 528 genes (2.61 Mb). For sequence analysis, we developed and used a novel Bayesian genotype-assigning algorithm, Most Probable Genotype (MPG). All three capture methods were effective, but sensitivities (percentage of targeted bases associated with high-quality genotypes) varied for an equivalent amount of pass-filtered sequence: for example, 70% (MIP), 84% (SHS), and 91% (MGS) for 400 Mb. In contrast, all methods yielded similar accuracies of >99.84% when compared to Infinium 1M SNP BeadChip-derived genotypes and >99.998% when compared to 30-fold coverage whole-genome shotgun sequencing data. We also observed a low false-positive rate with all three methods; of the heterozygous positions identified by each of the capture methods, >99.57% agreed with 1M SNP BeadChip, and >98.840% agreed with the whole-genome shotgun data. In addition, we successfully piloted the genomic enrichment of a set of 12 pooled samples via the MGS method using molecular bar codes. We find that these three genomic enrichment methods are highly accurate and practical, with sensitivities comparable to that of 30-fold coverage whole-genome shotgun data. PMID:20810667
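
    The genotype-assignment idea can be illustrated with a toy Bayesian caller: binomial likelihoods for the three diploid genotypes at a biallelic site, combined with a flat prior. This stands in for, and does not reproduce, the MPG algorithm mentioned above; the sequencing error rate is an assumed parameter.

```python
import math

def genotype_posterior(n_ref, n_alt, err=0.01):
    """Toy Bayesian genotype assignment from allele counts at one site:
    P(alt read | genotype) is err for hom-ref, 0.5 for het, 1-err for
    hom-alt; a flat prior over the three genotypes is assumed."""
    p_alt = {"hom_ref": err, "het": 0.5, "hom_alt": 1.0 - err}
    log_post = {g: n_alt * math.log(p) + n_ref * math.log(1.0 - p)
                for g, p in p_alt.items()}
    z = max(log_post.values())                      # for numerical stability
    total = sum(math.exp(v - z) for v in log_post.values())
    return {g: math.exp(v - z) / total for g, v in log_post.items()}

post = genotype_posterior(n_ref=42, n_alt=39)
print(max(post, key=post.get), post)                # a confident het call
```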

  1. Water mass-specificity of bacterial communities in the North Atlantic revealed by massively parallel sequencing

    PubMed Central

    Agogué, Hélène; Lamy, Dominique; Neal, Phillip R.; Sogin, Mitchell L.; Herndl, Gerhard J.

    2011-01-01

    Bacterial assemblages from subsurface (100 m depth), meso- (200–1000 m depth) and bathy-pelagic (below 1000 m depth) zones at 10 stations along a North Atlantic Ocean transect from 60°N to 5°S were characterized using massively parallel pyrotag sequencing of the V6 region of the 16S rRNA gene (V6 pyrotags). In a dataset of more than 830,000 pyrotags we identified 10,780 OTUs of which 52% were singletons. The singletons accounted for less than 2% of the OTU abundance, while the 100 and 1,000 most abundant OTUs represented 80% and 96%, respectively, of all recovered OTUs. Non-metric Multi-Dimensional Scaling and Canonical Correspondence Analysis of all the OTUs excluding the singletons revealed a clear clustering of the bacterial communities according to the water masses. More than 80% of the 1,000 most abundant OTUs corresponded to Proteobacteria of which 55% were Alphaproteobacteria, mostly composed of the SAR11 cluster. Gammaproteobacteria increased with depth and included a relatively large number of OTUs belonging to Alteromonadales and Oceanospirillales. The bathypelagic zone showed higher taxonomic evenness than the overlying waters, albeit bacterial diversity was remarkably variable. Both abundant and low-abundance OTUs were responsible for the distinct bacterial communities characterizing the major deep-water masses. Taken together, our results reveal that deep-water masses act as bio-oceanographic islands for bacterioplankton leading to water mass-specific bacterial communities in the deep waters of the Atlantic. PMID:21143328

  2. The complete genome of an individual by massively parallel DNA sequencing.

    PubMed

    Wheeler, David A; Srinivasan, Maithreyan; Egholm, Michael; Shen, Yufeng; Chen, Lei; McGuire, Amy; He, Wen; Chen, Yi-Ju; Makhijani, Vinod; Roth, G Thomas; Gomes, Xavier; Tartaro, Karrie; Niazi, Faheem; Turcotte, Cynthia L; Irzyk, Gerard P; Lupski, James R; Chinault, Craig; Song, Xing-zhi; Liu, Yue; Yuan, Ye; Nazareth, Lynne; Qin, Xiang; Muzny, Donna M; Margulies, Marcel; Weinstock, George M; Gibbs, Richard A; Rothberg, Jonathan M

    2008-04-17

    The association of genetic variation with disease and drug response, and improvements in nucleic acid technologies, have given great optimism for the impact of 'genomic medicine'. However, the formidable size of the diploid human genome, approximately 6 gigabases, has prevented the routine application of sequencing methods to deciphering complete individual human genomes. To realize the full potential of genomics for human health, this limitation must be overcome. Here we report the DNA sequence of a diploid genome of a single individual, James D. Watson, sequenced to 7.4-fold redundancy in two months using massively parallel sequencing in picolitre-size reaction vessels. This sequence was completed in two months at approximately one-hundredth of the cost of traditional capillary electrophoresis methods. Comparison of the sequence to the reference genome led to the identification of 3.3 million single nucleotide polymorphisms, of which 10,654 cause amino-acid substitution within the coding sequence. In addition, we accurately identified small-scale (2-40,000 base pair (bp)) insertion and deletion polymorphism as well as copy number variation resulting in the large-scale gain and loss of chromosomal segments ranging from 26,000 to 1.5 million base pairs. Overall, these results agree well with recent results of sequencing of a single individual by traditional methods. However, in addition to being faster and significantly less expensive, this sequencing technology avoids the arbitrary loss of genomic sequences inherent in random shotgun sequencing by bacterial cloning because it amplifies DNA in a cell-free system. As a result, we further demonstrate the acquisition of novel human sequence, including novel genes not previously identified by traditional genomic sequencing. This is the first genome sequenced by next-generation technologies. Therefore it is a pilot for the future challenges of 'personalized genome sequencing'. PMID:18421352

  3. Investigations on the usefulness of the Massively Parallel Processor for study of electronic properties of atomic and condensed matter systems

    NASA Technical Reports Server (NTRS)

    Das, T. P.

    1988-01-01

    The usefulness of the Massively Parallel Processor (MPP) for investigation of electronic structures and hyperfine properties of atomic and condensed matter systems was explored. The major effort was directed towards the preparation of algorithms for parallelization of the computational procedure being used on serial computers for electronic structure calculations in condensed matter systems. Detailed descriptions of investigations and results are reported, including MPP adaptation of self-consistent charge extended Hueckel (SCCEH) procedure, MPP adaptation of the first-principles Hartree-Fock cluster procedure for electronic structures of large molecules and solid state systems, and MPP adaptation of the many-body procedure for atomic systems.

  4. 3D frequency modeling of elastic seismic wave propagation via a structured massively parallel direct Helmholtz solver

    NASA Astrophysics Data System (ADS)

    Wang, S.; De Hoop, M. V.; Xia, J.; Li, X.

    2011-12-01

    We consider the modeling of elastic seismic wave propagation on a rectangular domain via the discretization and solution of the inhomogeneous coupled Helmholtz equation in 3D, by exploiting a parallel multifrontal sparse direct solver equipped with Hierarchically Semi-Separable (HSS) structure to reduce the computational complexity and storage. In particular, we are concerned with solving this equation on a large domain, for a large number of different forcing terms, in the context of seismic problems in general and modeling in particular. We resort to a parsimonious mixed-grid finite-difference scheme for discretizing the Helmholtz operator and Perfectly Matched Layer boundaries, resulting in a non-Hermitian matrix. We make use of a nested-dissection-based domain decomposition, and introduce an approximate direct solver by developing a parallel HSS matrix compression, factorization, and solution approach. We cast our massive parallelization in the framework of the multifrontal method. The assembly tree is partitioned into local trees and a global tree. The local trees are eliminated independently in each processor, while the global tree is eliminated through massive communication. The solver for the inhomogeneous equation is a parallel hybrid between multifrontal and HSS structure. The computational complexity associated with the factorization is almost linear in the size of the Helmholtz matrix. Our numerical approach can be compared with the spectral element method in 3D seismic applications.

  5. Underlying Data for Sequencing the Mitochondrial Genome with the Massively Parallel Sequencing Platform Ion Torrent™ PGM™

    PubMed Central

    2015-01-01

    Background Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per-nucleotide basis and indeed on a per-sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM. Results 24 samples were multiplexed (in groups of six) and sequenced on the 314 chip, which has a throughput of at least 10 megabases. The depth-of-coverage pattern was similar among all 24 samples; however, the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false-deletion rate greater than 0.15 in one or more individuals. There were 31-98 single nucleotide polymorphism (SNP) mtGenome variants observed per sample for the 24 samples analyzed. In total, 1237 SNP variants were concordant between the results from the PGM and MiSeq. The quality scores for haplogroup assignment for all 24 samples ranged between 88.8% and 100%. Conclusions In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth-of-coverage variation and strand bias were identified but generally were infrequent and did not impact the reliability of variant calls. Multiplexing of samples was demonstrated, which can improve throughput and reduce cost per sample analyzed.
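
    As a rough illustration of the strand-bias metric described above (one plausible reading of the per-position coverage ratio; the function and data are invented for this sketch, assuming NumPy):

      import numpy as np

      def strand_bias_ratio(fwd_cov, rev_cov):
          # Ratio of minor- to major-strand coverage at each position;
          # 1.0 is perfectly balanced, values near 0 indicate strong bias.
          fwd = np.asarray(fwd_cov, dtype=float)
          rev = np.asarray(rev_cov, dtype=float)
          minor = np.minimum(fwd, rev)
          major = np.maximum(fwd, rev)
          with np.errstate(invalid="ignore", divide="ignore"):
              return np.where(major > 0, minor / major, np.nan)

      fwd = [120, 80, 10, 200, 0]   # toy forward-strand depths
      rev = [110, 85, 90, 40, 50]   # toy reverse-strand depths
      print(strand_bias_ratio(fwd, rev))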

  6. A massively parallel adaptive scheme for melt migration in geodynamics computations

    NASA Astrophysics Data System (ADS)

    Dannberg, Juliane; Heister, Timo; Grove, Ryan

    2016-04-01

    Melt generation and migration are important processes for the evolution of the Earth's interior and impact the global convection of the mantle. While they have been the subject of numerous investigations, the typical time and length-scales of melt transport are vastly different from global mantle convection, which determines where melt is generated. This makes it difficult to study mantle convection and melt migration in a unified framework. In addition, modelling magma dynamics poses the challenge of highly non-linear and spatially variable material properties, in particular the viscosity. We describe our extension of the community mantle convection code ASPECT that adds equations describing the behaviour of silicate melt percolating through and interacting with a viscously deforming host rock. We use the original compressible formulation of the McKenzie equations, augmented by an equation for the conservation of energy. This approach includes both melt migration and melt generation with the accompanying latent heat effects, and it incorporates the individual compressibilities of the solid and the fluid phase. For this, we derive an accurate and stable Finite Element scheme that can be combined with adaptive mesh refinement. This is particularly advantageous for this type of problem, as the resolution can be increased in mesh cells where melt is present and viscosity gradients are high, whereas a lower resolution is sufficient in regions without melt. Together with a high-performance, massively parallel implementation, this allows for high resolution, 3d, compressible, global mantle convection simulations coupled with melt migration. Furthermore, scalable iterative linear solvers are required to solve the large linear systems arising from the discretized system. Finally, we present benchmarks and scaling tests of our solver up to tens of thousands of cores, show the effectiveness of adaptive mesh refinement when applied to melt migration and compare the

  7. Evaluation of Two Highly-Multiplexed Custom Panels for Massively Parallel Semiconductor Sequencing on Paraffin DNA

    PubMed Central

    Kotoula, Vassiliki; Lyberopoulou, Aggeliki; Papadopoulou, Kyriaki; Charalambous, Elpida; Alexopoulou, Zoi; Gakou, Chryssa; Lakis, Sotiris; Tsolaki, Eleftheria; Lilakos, Konstantinos; Fountzilas, George

    2015-01-01

    Background—Aim Massively parallel sequencing (MPS) holds promise for expanding cancer translational research and diagnostics. As yet, it has been applied on paraffin DNA (FFPE) with commercially available highly multiplexed gene panels (100s of DNA targets), while custom panels of low multiplexing are used for re-sequencing. Here, we evaluated the performance of two highly multiplexed custom panels on FFPE DNA. Methods Two custom multiplex amplification panels (B, 373 amplicons; T, 286 amplicons) were coupled with semiconductor sequencing on DNA samples from FFPE breast tumors and matched peripheral blood samples (n samples: 316; n libraries: 332). The two panels shared 37% DNA targets (common or shifted amplicons). Panel performance was evaluated in paired sample groups and quartets of libraries, where possible. Results Amplicon read ratios yielded similar patterns per gene with the same panel in FFPE and blood samples; however, performance of common amplicons differed between panels (p<0.001). FFPE genotypes were compared for 1267 coding and non-coding variant replicates, 999 out of which (78.8%) were concordant in different paired sample combinations. Variant frequency was highly reproducible (Spearman’s rho 0.959). Repeatedly discordant variants were of high coverage / low frequency (p<0.001). Genotype concordance was (a) high, for intra-run duplicates with the same panel (mean±SD: 97.2±4.7, 95%CI: 94.8–99.7, p<0.001); (b) modest, when the same DNA was analyzed with different panels (mean±SD: 81.1±20.3, 95%CI: 66.1–95.1, p = 0.004); and (c) low, when different DNA samples from the same tumor were compared with the same panel (mean±SD: 59.9±24.0; 95%CI: 43.3–76.5; p = 0.282). Low coverage / low frequency variants were validated with Sanger sequencing even in samples with unfavourable DNA quality. Conclusions Custom MPS may yield novel information on genomic alterations, provided that data evaluation is adjusted to tumor tissue FFPE DNA. To this

  8. Performance analysis of three dimensional integral equation computations on a massively parallel computer. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Logan, Terry G.

    1994-01-01

    The purpose of this study is to investigate the performance of integral equation computations using a numerical source field-panel method in a massively parallel processing (MPP) environment. A comparative study of the computational performance of the MPP CM-5 computer and the conventional Cray-YMP supercomputer for a three-dimensional flow problem is made. A serial FORTRAN code is converted into a parallel CM-FORTRAN code. Performance results are obtained on the CM-5 with 32, 64, and 128 nodes, along with those on the Cray-YMP with a single processor. The comparison indicates that the parallel CM-FORTRAN code performs close to or better than the equivalent serial FORTRAN code in some cases.

  9. Io-related Jovian auroral arcs: Modeling parallel electric fields

    NASA Astrophysics Data System (ADS)

    Su, Yi-Jiun; Ergun, Robert E.; Bagenal, Fran; Delamere, Peter A.

    2003-02-01

    Recent observations of auroral arcs on Jupiter suggest that electrons are being accelerated downstream from Io's magnetic footprint, creating detectable emissions. The downstream electron acceleration is investigated using one-dimensional spatial, two-dimensional velocity static Vlasov solutions under the constraint of quasi-neutrality and an applied potential drop. The code determines self-consistent charged particle distributions and potential structure along a magnetic field flux tube in the upward (with respect to Jupiter) current region of Io's wake. The boundaries of the flux tube are the Io torus on one end and Jupiter's ionosphere on the other. The results indicate that localized electric potential drops tend to form at 1.5-2.5 RJ Jovicentric distance. A sufficiently high secondary electron density causes an auroral cavity to be produced similar to that on Earth. Interestingly, the model results suggest that the proton and the hot electron population in the Io torus control the electron current densities between the Io torus and Jupiter and thus may control the energy flux and the brightness of the aurora downstream from Io's magnetic footprint. The parallel electric fields also are expected to create an unstable horseshoe electron distribution inside the auroral cavity, which may lead to the shell electron cyclotron maser instability. Results from our model suggest that in spite of the differing boundary conditions and the large centrifugal potentials at Jupiter, the auroral cavity formation may be similar to that of the Earth and that parallel electric fields may be the source mechanism of Io-controlled decametric radio emissions.

  10. A massively parallel algorithm for the collision probability calculations in the Apollo-II code using the PVM library

    SciTech Connect

    Stankovski, Z.

    1995-12-31

    The collision probability method in neutron transport, as applied to 2D geometries, consumes a great amount of computer time: for a typical 2D assembly calculation, about 90% of the computing time is consumed in the collision probability evaluations. Consequently, RZ or 3D calculations become prohibitive. In this paper the author presents a simple but efficient parallel algorithm based on the message-passing host/node programming model. Parallelization was applied to the energy group treatment. Such an approach permits parallelization of the existing code, requiring only limited modifications. Sequential/parallel computer portability is preserved, which is a necessary condition for an industrial code. Sequential performance is also preserved. The algorithm is implemented on a CRAY 90 coupled to a 128-processor T3D computer, a 16-processor IBM SP1, and a network of workstations, using the public-domain PVM library. The tests were executed for a 2D geometry with the standard 99-group library. All results were very satisfactory, the best ones with the IBM SP1. Because of the heterogeneity of the workstation network, the author did not expect high performance from this architecture. The same source code was used for all computers. A more impressive advantage of this algorithm will appear in the calculations of the SAPHYR project (with the future fine multigroup library of about 8000 groups) on a massively parallel computer using several hundred processors.
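
    A minimal sketch of the host/node parallelization over energy groups, substituting mpi4py for the PVM library and a placeholder kernel for the actual collision-probability evaluation; apart from the 99-group library mentioned in the abstract, all names and sizes are illustrative:

      from mpi4py import MPI
      import numpy as np

      comm = MPI.COMM_WORLD
      rank, size = comm.Get_rank(), comm.Get_size()

      N_GROUPS = 99     # standard 99-group library, as in the paper
      N_REGIONS = 64    # hypothetical number of 2D regions

      def collision_probabilities(group):
          # Placeholder for the expensive per-group evaluation.
          rng = np.random.default_rng(group)
          return rng.random((N_REGIONS, N_REGIONS))

      # Static cyclic distribution of energy groups over processes.
      my_groups = range(rank, N_GROUPS, size)
      my_results = {g: collision_probabilities(g) for g in my_groups}

      # The host (rank 0) gathers the per-group matrices from all nodes.
      gathered = comm.gather(my_results, root=0)
      if rank == 0:
          all_pij = {}
          for part in gathered:
              all_pij.update(part)
          print(f"collected {len(all_pij)} group matrices")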

  11. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOEpatents

    Karasick, M.S.; Strip, D.R.

    1996-01-30

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modeling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modeling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modeling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication. 8 figs.

  12. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOEpatents

    Karasick, Michael S.; Strip, David R.

    1996-01-01

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modelling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modelling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modelling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication.
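
    A schematic of the d-edge record that these two patent records describe, as it might look in a modern language; the field names are invented for the sketch and the geometry is a toy face, not the patent's actual encoding:

      from dataclasses import dataclass

      @dataclass(frozen=True)
      class DEdge:
          # A directed edge tying one edge of a solid to exactly one face.
          # Because each d-edge is self-contained, the processor that owns
          # it can act on a modeling command without talking to its peers.
          tail: tuple   # (x, y, z) of the edge's start vertex
          head: tuple   # (x, y, z) of the edge's end vertex
          face_id: int  # the single face this directed edge bounds

      # A unit square face (id 0) as four d-edges; the neighbouring face
      # sharing edge (0,0,0)-(1,0,0) would hold the opposite direction
      # with its own face id.
      face0 = [
          DEdge((0, 0, 0), (1, 0, 0), 0),
          DEdge((1, 0, 0), (1, 1, 0), 0),
          DEdge((1, 1, 0), (0, 1, 0), 0),
          DEdge((0, 1, 0), (0, 0, 0), 0),
      ]
      print(len(face0), "d-edges describe face 0")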

  13. Method and apparatus for obtaining stack traceback data for multiple computing nodes of a massively parallel computer system

    DOEpatents

    Gooding, Thomas Michael; McCarthy, Patrick Joseph

    2010-03-02

    A data collector for a massively parallel computer system obtains call-return stack traceback data for multiple nodes by retrieving partial call-return stack traceback data from each node, grouping the nodes in subsets according to the partial traceback data, and obtaining further call-return stack traceback data from a representative node or nodes of each subset. Preferably, the partial data is a respective instruction address from each node, nodes having identical instruction addresses being grouped together in the same subset. Preferably, a single node of each subset is chosen and full stack traceback data is retrieved from the call-return stack within the chosen node.
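
    A toy sketch of the two-phase idea (the addresses are fabricated and full_traceback is a hypothetical stand-in for the expensive per-node retrieval): group nodes by a cheap partial sample, then fetch one full stack per subset:

      from collections import defaultdict

      partial = {             # node id -> sampled instruction address (phase 1)
          0: 0x4007F0, 1: 0x4007F0, 2: 0x400B3C,
          3: 0x4007F0, 4: 0x400B3C, 5: 0x400C10,
      }

      groups = defaultdict(list)
      for node, addr in partial.items():
          groups[addr].append(node)

      def full_traceback(node):
          # Placeholder for the expensive full call-return stack retrieval.
          return [f"frame@node{node}", "main"]

      # Phase 2: one full traceback per subset instead of one per node.
      for addr, nodes in groups.items():
          rep = nodes[0]
          print(hex(addr), "x", len(nodes), "->", full_traceback(rep))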

  14. A Precision Dose Control Circuit for Maskless E-Beam Lithography With Massively Parallel Vertically Aligned Carbon Nanofibers

    SciTech Connect

    Eliza, Sazia A.; Islam, Syed K; Rahman, Touhidur; Bull, Nora D; Blalock, Benjamin; Baylor, Larry R; Ericson, Milton Nance; Gardner, Walter L

    2011-01-01

    This paper describes a highly accurate dose control circuit (DCC) for the emission of a desired number of electrons from vertically aligned carbon nanofibers (VACNFs) in a massively parallel maskless e-beam lithography system. The parasitic components within the VACNF device cause a premature termination of the electron emission, resulting in underexposure of the photoresist. In this paper, we compensate for the effects of the parasitic components and noise while reducing the area of the chip and achieving a precise count of emitted electrons from the VACNFs to obtain the optimum dose for the e-beam lithography.

  15. Efficient Extraction of Regional Subsets from Massive Climate Datasets using Parallel IO

    SciTech Connect

    Daily, Jeffrey A.; Schuchardt, Karen L.; Palmer, Bruce J.

    2010-09-16

    The size of datasets produced by current climate models is increasing rapidly to the scale of petabytes. To handle data at this scale, parallel analysis tools are required; however, the majority of climate analysis software remains at the scale of workstations. Further, many climate analysis tools adequately process regularly gridded data but lack sufficient features when handling unstructured grids. This paper presents a data-parallel subsetter capable of correctly handling unstructured grids while scaling to over 2000 cores. The approach is based on the partitioned global address space (PGAS) parallel programming model and one-sided communication. The paper demonstrates that IO remains the single greatest bottleneck for this domain of applications and that parallel analysis of climate data succeeds in practice.

  16. Massively parallel implementation of the multi-reference Brillouin-Wigner CCSD method

    SciTech Connect

    Brabec, Jiri; Krishnamoorthy, Sriram; van Dam, Hubertus JJ; Kowalski, Karol; Pittner, Jiri

    2011-10-06

    This paper reports the parallel implementation of the Brillouin-Wigner Multi-Reference Coupled Cluster method with Single and Double excitations (BW-MRCCSD). Preliminary tests for systems composed of 304 and 440 correlated orbitals demonstrate the performance of our implementation across 1000 cores and clearly indicate the advantages of using improved task scheduling. Possible ways to further improve the parallel performance are also delineated.

  17. A parallel computing tool for large-scale simulation of massive fluid injection in thermo-poro-mechanical systems

    NASA Astrophysics Data System (ADS)

    Karrech, Ali; Schrank, Christoph; Regenauer-Lieb, Klaus

    2015-10-01

    Massive fluid injections into the earth's upper crust are commonly used to stimulate permeability in geothermal reservoirs, enhance recovery in oil reservoirs, store carbon dioxide, and so forth. Currently used models for reservoir simulation are limited to small perturbations and/or hydraulic aspects that are insufficient to describe the complex thermal-hydraulic-mechanical behaviour of natural geomaterials. Comprehensive approaches, which take into account the non-linear mechanical deformations of rock masses, fluid flow in percolating pore spaces, and changes of temperature due to heat transfer, are necessary to predict the behaviour of deep geomaterials subjected to high pressure and temperature changes. In this paper, we introduce a thermodynamically consistent poromechanics formulation which includes coupled thermal, hydraulic and mechanical processes. Moreover, we propose a numerical integration strategy based on massively parallel computing. The proposed formulations and numerical integration are validated using analytical solutions of simple multi-physics problems. As a representative application, we investigate the massive injection of fluids within a deep formation to mimic the conditions of reservoir stimulation. The model showed, for instance, the effects of initial pre-existing stress fields on the orientations of stimulation-induced failures.

  18. Massively parallel computing simulation of fluid flow in the unsaturated zone of Yucca Mountain, Nevada

    SciTech Connect

    Zhang, Keni; Wu, Yu-Shu; Bodvarsson, G.S.

    2001-08-31

    This paper presents the application of parallel computing techniques to large-scale modeling of fluid flow in the unsaturated zone (UZ) at Yucca Mountain, Nevada. In this study, parallel computing techniques, as implemented into the TOUGH2 code, are applied in large-scale numerical simulations on a distributed-memory parallel computer. The modeling study has been conducted using an over-one-million-cell three-dimensional numerical model, which incorporates a wide variety of field data for the highly heterogeneous fractured formation at Yucca Mountain. The objective of this study is to analyze the impact of various surface infiltration scenarios (under current and possible future climates) on flow through the UZ system, using various hydrogeological conceptual models with refined grids. The results indicate that the one-million-cell models produce better-resolved results and reveal some flow patterns that cannot be obtained using coarse-grid models.

  19. Satisfiability Test with Synchronous Simulated Annealing on the Fujitsu AP1000 Massively-Parallel Multiprocessor

    NASA Technical Reports Server (NTRS)

    Sohn, Andrew; Biswas, Rupak

    1996-01-01

    Solving the hard Satisfiability Problem is time-consuming even for modest-sized problem instances. Solving the Random L-SAT Problem is especially difficult due to the ratio of clauses to variables. This report presents a parallel synchronous simulated annealing method for solving the Random L-SAT Problem on a large-scale distributed-memory multiprocessor. In particular, we use a parallel synchronous simulated annealing procedure, called Generalized Speculative Computation, which guarantees the same decision sequence as sequential simulated annealing. To demonstrate the performance of the parallel method, we have selected problem instances varying in size from 100 variables/425 clauses to 5000 variables/21,250 clauses. Experimental results on the AP1000 multiprocessor indicate that our approach can satisfy 99.9 percent of the clauses while giving almost a 70-fold speedup on 500 processors.
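
    Because Generalized Speculative Computation is defined to reproduce the decision sequence of sequential simulated annealing, the serial baseline fixes the semantics. A minimal serial sketch for a random 3-SAT instance (the instance, parameters, and cooling schedule are invented, not the paper's):

      import math, random

      def sat_anneal(clauses, n_vars, t0=2.0, cooling=0.999, steps=20000):
          # Flip one variable per step; worse moves are accepted with the
          # usual Boltzmann probability at the current temperature.
          assign = [random.random() < 0.5 for _ in range(n_vars)]
          def unsat(a):
              return sum(not any(a[abs(l) - 1] == (l > 0) for l in c)
                         for c in clauses)
          energy, t = unsat(assign), t0
          for _ in range(steps):
              v = random.randrange(n_vars)
              assign[v] = not assign[v]
              new = unsat(assign)
              if new <= energy or random.random() < math.exp((energy - new) / t):
                  energy = new                # accept the flip
              else:
                  assign[v] = not assign[v]   # undo it
              t *= cooling
          return assign, energy

      random.seed(1)
      clauses = [[1, -2, 3], [-1, 2, -3], [2, 3, -1], [-2, -3, 1]]
      print(sat_anneal(clauses, 3)[1], "unsatisfied clauses")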

  20. Solution of the within-group multidimensional discrete ordinates transport equations on massively parallel architectures

    NASA Astrophysics Data System (ADS)

    Zerr, Robert Joseph

    2011-12-01

    The integral transport matrix method (ITMM) has been used as the kernel of new parallel solution methods for the discrete ordinates approximation of the within-group neutron transport equation. The ITMM abandons the repetitive mesh sweeps of the traditional source iterations (SI) scheme in favor of constructing stored operators that account for the direct coupling factors among all the cells and between the cells and boundary surfaces. The main goals of this work were to develop the algorithms that construct these operators and employ them in the solution process, determine the most suitable way to parallelize the entire procedure, and evaluate the behavior and performance of the developed methods for an increasing number of processes. This project compares the effectiveness of the ITMM with the SI scheme parallelized with the Koch-Baker-Alcouffe (KBA) method. The primary parallel solution method involves a decomposition of the domain into smaller spatial sub-domains, each with its own transport matrices, coupled together via interface boundary angular fluxes. Each sub-domain has its own set of ITMM operators and represents an independent transport problem. Multiple iterative parallel solution methods have been investigated, including parallel block Jacobi (PBJ), parallel red/black Gauss-Seidel (PGS), and parallel GMRES (PGMRES). The fastest observed parallel solution method, PGS, was used in a weak scaling comparison with the PARTISN code. Compared to the state-of-the-art SI-KBA with diffusion synthetic acceleration (DSA), this new method without acceleration/preconditioning is not competitive for any problem parameters considered. The best comparisons occur for problems that are difficult for SI-DSA, namely highly scattering and optically thick. SI-DSA execution time curves are generally steeper than the PGS ones. However, until further testing is performed, it cannot be concluded that SI-DSA does not outperform the ITMM with PGS even on several thousand or tens of thousands of processes.
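
    The subdomain iteration can be pictured on a toy 1D system: each block solves with its own local operator and couples to its neighbours only through the previous iterate, the role the interface angular fluxes play in the ITMM scheme. A schematic block-Jacobi sketch assuming NumPy (a stand-in model problem, not the transport operators themselves):

      import numpy as np

      n, nblocks = 16, 4
      A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
      b = np.ones(n)
      x = np.zeros(n)
      bs = n // nblocks

      for sweep in range(300):                  # block-Jacobi sweeps
          x_new = x.copy()
          for i in range(0, n, bs):
              sl = slice(i, i + bs)
              local_A = A[sl, sl]
              # Interface coupling: everything outside the block moves
              # to the right-hand side using the previous iterate.
              rhs = b[sl] - A[sl, :] @ x + local_A @ x[sl]
              x_new[sl] = np.linalg.solve(local_A, rhs)
          x = x_new

      print("residual:", np.linalg.norm(A @ x - b))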

  1. A Novel Algorithm for Solving the Multidimensional Neutron Transport Equation on Massively Parallel Architectures

    SciTech Connect

    Azmy, Yousry

    2014-06-10

    We employ the Integral Transport Matrix Method (ITMM) as the kernel of new parallel solution methods for the discrete ordinates approximation of the within-group neutron transport equation. The ITMM abandons the repetitive mesh sweeps of the traditional source iterations (SI) scheme in favor of constructing stored operators that account for the direct coupling factors among all the cells' fluxes and between the cells' and boundary surfaces' fluxes. The main goals of this work are to develop the algorithms that construct these operators and employ them in the solution process, determine the most suitable way to parallelize the entire procedure, and evaluate the behavior and parallel performance of the developed methods with increasing number of processes, P. The fastest observed parallel solution method, Parallel Gauss-Seidel (PGS), was used in a weak scaling comparison with the PARTISN transport code, which uses the source iteration (SI) scheme parallelized with the Koch-Baker-Alcouffe (KBA) method. Compared to the state-of-the-art SI-KBA with diffusion synthetic acceleration (DSA), this new method, even without acceleration/preconditioning, is competitive for optically thick problems as P is increased to the tens of thousands range. For the most optically thick cells tested, PGS reduced execution time by an approximate factor of three for problems with more than 130 million computational cells on P = 32,768. Moreover, the SI-DSA execution-time trend generally rises more steeply with increasing P than the PGS trend. Furthermore, the PGS method outperforms SI for the periodic heterogeneous layers (PHL) configuration problems. The PGS method outperforms SI and SI-DSA on as few as P = 16 for PHL problems and reduces execution time by a factor of ten or more for all problems considered with more than 2 million computational cells on P = 4,096.

  2. Massively parallel read mapping on GPUs with the q-group index and PEANUT

    PubMed Central

    Rahmann, Sven

    2014-01-01

    We present the q-group index, a novel data structure for read mapping tailored towards graphics processing units (GPUs) with a small memory footprint and efficient parallel algorithms for querying and building. On top of the q-group index we introduce PEANUT, a highly parallel GPU-based read mapper. PEANUT can report either the best hits or all hits of a read. Our benchmarks show that PEANUT outperforms other state-of-the-art read mappers in terms of speed while maintaining or slightly increasing precision, recall and sensitivity. PMID:25289191
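
    The actual q-group index is a bit-packed structure tailored to GPU memory, but a plain-Python q-gram index conveys the build/query idea it accelerates (the sequences and parameters below are invented):

      from collections import defaultdict

      def build_qgram_index(reference, q):
          # Map every length-q substring of the reference to its positions.
          index = defaultdict(list)
          for i in range(len(reference) - q + 1):
              index[reference[i:i + q]].append(i)
          return index

      def candidate_hits(read, index, q):
          # Candidate alignment start positions "voted" for by the
          # read's q-grams; the best diagonal collects the most votes.
          votes = defaultdict(int)
          for j in range(len(read) - q + 1):
              for pos in index.get(read[j:j + q], ()):
                  votes[pos - j] += 1
          return sorted(votes.items(), key=lambda kv: -kv[1])

      ref = "ACGTACGTTAGCACGT"
      idx = build_qgram_index(ref, q=4)
      print(candidate_hits("ACGTTAGC", idx, q=4))  # position 4 wins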

  3. Design of electrostatic microcolumn for nanoscale photoemission source in massively parallel electron-beam lithography

    NASA Astrophysics Data System (ADS)

    Wen, Ye; Du, Zhidong; Pan, Liang

    2015-10-01

    Microcolumns are widely used for parallel electron-beam lithography because of their compactness and the ability to achieve high spatial resolution. A design of an electrostatic microcolumn for our recent nanoscale photoemission sources is presented. We proposed a compact column structure (as short as several microns in length) for the ease of microcolumn fabrication and lithography operation. We numerically studied the influence of several design parameters on the optical performance such as microcolumn diameter, electrode thickness, beam current, working voltages, and working distance. We also examined the effect of fringing field between adjacent microcolumns during parallel lithography operations.

  4. Massively Parallel, Three-Dimensional Transport Solutions for the k-Eigenvalue Problem

    SciTech Connect

    Davidson, Gregory G; Evans, Thomas M; Jarrell, Joshua J; Pandya, Tara M; Slaybaugh, R

    2014-01-01

    We have implemented a new multilevel parallel decomposition in the Denovo discrete ordinates radiation transport code. In concert with Krylov subspace iterative solvers, the multilevel decomposition allows concurrency over energy in addition to space-angle, enabling scalability beyond the limits imposed by the traditional KBA space-angle partitioning. Furthermore, a new Arnoldi-based k-eigenvalue solver has been implemented. The added phase-space concurrency combined with the high-performance Krylov and Arnoldi solvers has enabled weak scaling to O(100K) cores on the Jaguar XK6 supercomputer. The multilevel decomposition provides sufficient parallelism to scale to exascale computing and beyond.

  5. Electrical Circuit Simulation Code

    Energy Science and Technology Software Center (ESTSC)

    2001-08-09

    Massively-Parallel Electrical Circuit Simulation Code. CHILESPICE is a massively-parallel distributed-memory electrical circuit simulation tool that contains many enhanced radiation, time-based, and thermal features and models. Large-scale electronic circuit simulation. Shared memory, parallel processing, enhanced convergence. Sandia-specific device models.

  6. Massively parallel solution of the inverse scattering problem for integrated circuit quality control

    SciTech Connect

    Leland, R.W.; Draper, B.L.; Naqvi, S.; Minhas, B.

    1997-09-01

    The authors developed and implemented a highly parallel computational algorithm for solution of the inverse scattering problem generated when an integrated circuit is illuminated by laser. The method was used as part of a system to measure diffraction grating line widths on specially fabricated test wafers and the results of the computational analysis were compared with more traditional line-width measurement techniques. The authors found they were able to measure the line width of singly periodic and doubly periodic diffraction gratings (i.e. 2D and 3D gratings respectively) with accuracy comparable to the best available experimental techniques. They demonstrated that their parallel code is highly scalable, achieving a scaled parallel efficiency of 90% or more on typical problems running on 1024 processors. They also made substantial improvements to the algorithmics and their original implementation of Rigorous Coupled Waveform Analysis, the underlying computational technique. These resulted in computational speed-ups of two orders of magnitude in some test problems. By combining these algorithmic improvements with parallelism the authors achieve speedups of between a few thousand and hundreds of thousands over the original engineering code. This made the laser diffraction measurement technique practical.

  7. Spatiotemporal Domain Decomposition for Massive Parallel Computation of Space-Time Kernel Density

    NASA Astrophysics Data System (ADS)

    Hohl, A.; Delmelle, E. M.; Tang, W.

    2015-07-01

    Accelerated processing capabilities are deemed critical when conducting analysis on spatiotemporal datasets of increasing size, diversity and availability. High-performance parallel computing offers the capacity to solve computationally demanding problems in a limited timeframe, but likewise poses the challenge of preventing processing inefficiency due to workload imbalance between computing resources. Therefore, when designing new algorithms capable of implementing parallel strategies, careful spatiotemporal domain decomposition is necessary to account for heterogeneity in the data. In this study, we perform octree-based adaptive decomposition of the spatiotemporal domain for parallel computation of space-time kernel density. In order to avoid edge effects near subdomain boundaries, we establish spatiotemporal buffers to include adjacent data points that are within the spatial and temporal kernel bandwidths. Then, we quantify the computational intensity of each subdomain to balance workloads among processors. We illustrate the benefits of our methodology using a space-time epidemiological dataset of Dengue fever, an infectious vector-borne disease that poses a severe threat to communities in tropical climates. Our parallel implementation of kernel density reaches substantial speedup compared to sequential processing, and achieves high levels of workload balance among processors due to great accuracy in quantifying computational intensity. Our approach is portable to other space-time analytical methods.
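
    A naive (non-parallel) space-time kernel density sketch showing the role of the spatial and temporal bandwidths, the same radii that size the subdomain buffers described above; the kernels and normalization are schematic and all data are synthetic, assuming NumPy:

      import numpy as np

      def stkde(events, grid_pts, hs, ht):
          # events:   (n, 3) array of (x, y, t) points
          # grid_pts: (m, 3) evaluation locations
          # hs, ht:   spatial and temporal bandwidths
          ev, gp = np.asarray(events, float), np.asarray(grid_pts, float)
          d2 = ((gp[:, None, :2] - ev[None, :, :2]) ** 2).sum(-1) / hs**2
          dt2 = ((gp[:, None, 2] - ev[None, :, 2]) ** 2) / ht**2
          # Product of Epanechnikov-style kernels; only events inside
          # both bandwidths contribute (hence the buffer radii).
          ks = np.where(d2 < 1, 0.75 * (1 - d2), 0.0)
          kt = np.where(dt2 < 1, 0.75 * (1 - dt2), 0.0)
          return (ks * kt).sum(1) / (len(ev) * hs**2 * ht)

      rng = np.random.default_rng(0)
      events = rng.random((500, 3))   # synthetic (x, y, t) events
      grid = rng.random((4, 3))
      print(stkde(events, grid, hs=0.2, ht=0.1))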

  8. Massive parallel implementation of JPEG2000 decoding algorithm with multi-GPUs

    NASA Astrophysics Data System (ADS)

    Wu, Xianyun; Li, Yunsong; Liu, Kai; Wang, Keyan; Wang, Li

    2014-05-01

    JPEG2000 is an important technique for image compression that has been successfully used in many fields. Due to the increasing spatial, spectral and temporal resolution of remotely sensed imagery data sets, fast decompression of remotely sensed data is becoming a very important and challenging task. In this paper, we develop an implementation of JPEG2000 decompression on graphics processing units (GPUs) for fast decoding of codeblock-based parallel compression streams. We use one CUDA block to decode one frame. Tier-2 is still decoded serially, while Tier-1 and the IDWT are processed in parallel. Since our encoded stream is codeblock-parallel, meaning each codeblock is independent of the others, we process each codeblock in Tier-1 with one thread. For the IDWT, we use one CUDA block to execute one line and one CUDA thread to process one pixel. We investigate the speedups that can be gained by the GPU implementation relative to CPU-based serial implementations. Experimental results reveal that our implementation can achieve significant speedups compared with serial implementations.

  9. Optical binary de Bruijn networks for massively parallel computing: design methodology and feasibility study

    NASA Astrophysics Data System (ADS)

    Louri, Ahmed; Sung, Hongki

    1995-10-01

    The interconnection network structure can be the deciding and limiting factor in the cost and the performance of parallel computers. One of the most popular point-to-point interconnection networks for parallel computers today is the hypercube. The regularity, logarithmic diameter, symmetry, high connectivity, fault tolerance, simple routing, and reconfigurability (easy embedding of other network topologies) of the hypercube make it a very attractive choice for parallel computers. Unfortunately, the hypercube possesses a major drawback: the number of links per node increases as the network grows in size. As an alternative to the hypercube, the binary de Bruijn (BdB) network has recently received much attention. The BdB not only provides a logarithmic diameter, fault tolerance, and simple routing but also requires fewer links than the hypercube for the same network size. Additionally, a major advantage of the BdB is that the number of edges per node is independent of the network size. This makes it very desirable for large-scale parallel systems. However, because of its asymmetrical nature and global connectivity, it poses a major challenge for VLSI technology. Optics, owing to its three-dimensional and global-connectivity nature, seems to be very suitable for implementing BdB networks. We present an implementation methodology for optical BdB networks. The distinctive feature of the proposed implementation methodology is partitionability of the network into a few primitive operations that can be implemented efficiently. We further show feasibility of the
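
    The constant node degree and shift-register routing of the binary de Bruijn network are easy to state in code; a small sketch with integer node labels and naive at-most-n-hop routing (a textbook construction, not the paper's optical implementation):

      def bdb_neighbors(u, n):
          # Out-neighbors in the binary de Bruijn graph on 2**n nodes:
          # shift left and append 0 or 1 -- always two links per node,
          # independent of network size, unlike the hypercube's n links.
          mask = (1 << n) - 1
          return [((u << 1) & mask) | b for b in (0, 1)]

      def bdb_route(src, dst, n):
          # Shift-register routing: feed the destination's bits in from
          # the right; the path length is at most n hops.
          mask, path, cur = (1 << n) - 1, [src], src
          for i in range(n - 1, -1, -1):
              cur = ((cur << 1) & mask) | ((dst >> i) & 1)
              path.append(cur)
          return path

      n = 4  # a 16-node network
      print([format(v, "04b") for v in bdb_neighbors(0b1011, n)])
      print([format(v, "04b") for v in bdb_route(0b1011, 0b0010, n)])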

  11. Nonlinear structural response using adaptive dynamic relaxation on a massively-parallel-processing system

    NASA Technical Reports Server (NTRS)

    Oakley, David R.; Knight, Norman F., Jr.

    1994-01-01

    A parallel adaptive dynamic relaxation (ADR) algorithm has been developed for nonlinear structural analysis. This algorithm has minimal memory requirements, is easily parallelizable and scalable to many processors, and is generally very reliable and efficient for highly nonlinear problems. Performance evaluations on single-processor computers have shown that the ADR algorithm is reliable and highly vectorizable, and that it is competitive with direct solution methods for the highly nonlinear problems considered. The present algorithm is implemented on the 512-processor Intel Touchstone DELTA system at Caltech, and it is designed to minimize the extent and frequency of interprocessor communication. The algorithm has been used to solve for the nonlinear static response of two- and three-dimensional hyperelastic systems involving contact. Impressive relative speedups have been achieved and demonstrate the high scalability of the ADR algorithm. For the class of problems addressed, the ADR algorithm represents a very promising approach for parallel-vector processing.

  12. Analysis and selection of optimal function implementations in massively parallel computer

    DOEpatents

    Archer, Charles Jens; Peters, Amanda; Ratterman, Joseph D.

    2011-05-31

    An apparatus, program product and method optimize the operation of a parallel computer system by, in part, collecting performance data for a set of implementations of a function capable of being executed on the parallel computer system based upon the execution of the set of implementations under varying input parameters in a plurality of input dimensions. The collected performance data may be used to generate selection program code that is configured to call selected implementations of the function in response to a call to the function under varying input parameters. The collected performance data may be used to perform more detailed analysis to ascertain the comparative performance of the set of implementations of the function under the varying input parameters.
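
    A toy rendering of the patent's idea, hypothetical throughout: benchmark competing implementations of one function across an input dimension, then generate a selector that dispatches to the winner for each regime:

      import timeit

      def sum_loop(xs):                 # candidate implementation A
          total = 0
          for x in xs:
              total += x
          return total

      def sum_builtin(xs):              # candidate implementation B
          return sum(xs)

      IMPLS = [sum_loop, sum_builtin]

      def collect_performance(sizes):
          # Phase 1: time every implementation at each input size
          # and remember the winner per size.
          best = {}
          for n in sizes:
              data = list(range(n))
              times = [min(timeit.repeat(lambda: f(data), repeat=3, number=20))
                       for f in IMPLS]
              best[n] = IMPLS[times.index(min(times))]
          return best

      def make_selector(best):
          # Phase 2: the generated "selection program code" -- here just a
          # closure dispatching on the nearest benchmarked size.
          sizes = sorted(best)
          def selected(xs):
              n = min(sizes, key=lambda s: abs(s - len(xs)))
              return best[n](xs)
          return selected

      fast_sum = make_selector(collect_performance([10, 1000, 50000]))
      print(fast_sum(list(range(500))))   # 124750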

  13. Application of Parallel Hybrid Algorithm in Massively Parallel GPGPU—The Improved Effective and Efficient Method for Calculating Coulombic Interactions in Simulations of Many Ions with SIMION

    NASA Astrophysics Data System (ADS)

    Saito, Kenichiro; Koizumi, Eiko; Koizumi, Hideya

    2012-09-01

    In our previous study, we introduced a new hybrid approach to effectively approximate the total force on each ion during a trajectory calculation in mass spectrometry device simulations, and the algorithm worked successfully with SIMION. We took one step further and applied the method in massively parallel general-purpose computing with GPUs (GPGPU) to test its performance in simulations with thousands to over a million ions. We took extra care to minimize the barrier synchronization and data transfer between the host (CPU) and the device (GPU) memory, and took full advantage of latency hiding. Parallel codes were written in CUDA C++ and implemented into SIMION via the user-defined Lua program. In this study, we tested the parallel hybrid algorithm with a couple of basic models and analyzed the performance by comparing it to that of the original, fully explicit method written in serial code. The Coulomb explosion simulation with 128,000 ions was completed in 309 s, over 700 times faster than the 63 h taken by the original explicit method, in which we evaluated two-body Coulomb interactions explicitly between each ion and all the other ions. The simulation of 1,024,000 ions was completed in 2650 s. In another example, we applied the hybrid method to a simulation of ions in a simple quadrupole ion storage model with 100,000 ions, and it took less than 10 d. Based on our estimate, the same simulation is expected to take 5-7 y by the explicit method in serial code.
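
    For reference, the fully explicit O(N^2) two-body evaluation that the hybrid method approximates can be written compactly with NumPy (vectorized rather than SIMION's per-ion loop; the constants and data here are illustrative):

      import numpy as np

      def coulomb_forces(pos, q, k=8.9875517923e9):
          # Explicit pairwise Coulomb forces, O(N^2) -- the baseline the
          # hybrid method accelerates. pos: (N, 3) metres, q: (N,) coulombs.
          d = pos[:, None, :] - pos[None, :, :]      # (N, N, 3) separations
          r2 = (d ** 2).sum(-1)
          np.fill_diagonal(r2, np.inf)               # no self-force
          inv_r3 = r2 ** -1.5
          f = k * (q[:, None] * q[None, :] * inv_r3)[..., None] * d
          return f.sum(axis=1)                       # (N, 3) total force

      rng = np.random.default_rng(0)
      N = 1000
      pos = rng.normal(scale=1e-3, size=(N, 3))
      q = np.full(N, 1.602e-19)                      # singly charged ions
      print(coulomb_forces(pos, q)[0])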

  14. Process Simulation of Complex Biological Pathways in Physical Reactive Space and Reformulated for Massively Parallel Computing Platforms.

    PubMed

    Ganesan, Narayan; Li, Jie; Sharma, Vishakha; Jiang, Hanyu; Compagnoni, Adriana

    2016-01-01

    Biological systems encompass complexity that far surpasses many artificial systems. Modeling and simulation of large and complex biochemical pathways is a computationally intensive challenge. Traditional tools, such as ordinary differential equations, partial differential equations, stochastic master equations, and Gillespie type methods, are all limited either by their modeling fidelity or computational efficiency or both. In this work, we present a scalable computational framework based on modeling biochemical reactions in explicit 3D space, that is suitable for studying the behavior of large and complex biological pathways. The framework is designed to exploit parallelism and scalability offered by commodity massively parallel processors such as the graphics processing units (GPUs) and other parallel computing platforms. The reaction modeling in 3D space is aimed at enhancing the realism of the model compared to traditional modeling tools and framework. We introduce the Parallel Select algorithm that is key to breaking the sequential bottleneck limiting the performance of most other tools designed to study biochemical interactions. The algorithm is designed to be computationally tractable, handle hundreds of interacting chemical species and millions of independent agents by considering all-particle interactions within the system. We also present an implementation of the framework on the popular graphics processing units and apply it to the simulation study of JAK-STAT Signal Transduction Pathway. The computational framework will offer a deeper insight into various biological processes within the cell and help us observe key events as they unfold in space and time. This will advance the current state-of-the-art in simulation study of large scale biological systems and also enable the realistic simulation study of macro-biological cultures, where inter-cellular interactions are prevalent. PMID:27045833

  15. Massively parallel 454-sequencing of fungal communities in Quercus spp. ectomycorrhizas indicates seasonal dynamics in urban and rural sites.

    PubMed

    Jumpponen, Ari; Jones, Kenneth L; David Mattox, J; Yaege, Chulee

    2010-03-01

    We analysed two sites within and outside an urban development in a rural background to estimate the fungal richness, diversity and community composition in Quercus spp. ectomycorrhizas using massively parallel 454-sequencing in combination with DNA-tagging. Our analyses indicated that shallow sequencing (approximately 150 sequences) of a large number of samples (192 in total) provided data that allowed identification of seasonal trends within the fungal communities: putative root-associated antagonists and saprobes that were abundant early in the growing season were replaced by common ectomycorrhizal fungi in the course of the growing season. Ordination analyses identified a number of factors that were correlated with the observed communities, including host species as well as soil organic matter, nutrient and heavy metal enrichment. Overall, our application of the high-throughput 454 sequencing provided an expedient means for characterization of fungal communities. PMID:20331769

  16. Probing the Nanosecond Dynamics of a Designed Three-Stranded Beta-Sheet with a Massively Parallel Molecular Dynamics Simulation

    PubMed Central

    Voelz, Vincent A.; Luttmann, Edgar; Bowman, Gregory R.; Pande, Vijay S.

    2009-01-01

    Recently, a temperature-jump FTIR study of a designed three-stranded β-sheet showing a fast relaxation time of ~140 ± 20 ns was published. We performed massively parallel molecular dynamics simulations in explicit solvent to probe the structural events involved in this relaxation. While our simulations produce similar relaxation rates, the structural ensemble is broad. We observe the formation of turn structure, but only very weak interaction in the strand regions, which is consistent with the lack of strong backbone-backbone NOEs in previous structural NMR studies. These results suggest that either DPDP-II folds at time scales longer than 240 ns, or that DPDP-II is not a well-defined three-stranded β-sheet. This work also provides an opportunity to compare the performance of several popular forcefield models against one another. PMID:19399235

  17. High-Throughput Detection of Actionable Genomic Alterations in Clinical Tumor Samples by Targeted, Massively Parallel Sequencing

    PubMed Central

    Wagle, Nikhil; Berger, Michael F.; Davis, Matthew J.; Blumenstiel, Brendan; DeFelice, Matthew; Pochanard, Panisa; Ducar, Matthew; Van Hummelen, Paul; MacConaill, Laura E.; Hahn, William C.; Meyerson, Matthew; Gabriel, Stacey B.; Garraway, Levi A.

    2011-01-01

    Knowledge of “actionable” somatic genomic alterations present in each tumor (e.g., point mutations, small insertions/deletions, and copy number alterations that direct therapeutic options) should facilitate individualized approaches to cancer treatment. However, clinical implementation of systematic genomic profiling has rarely been achieved beyond limited numbers of oncogene point mutations. To address this challenge, we utilized a targeted, massively parallel sequencing approach to detect tumor genomic alterations in formalin-fixed, paraffin embedded (FFPE) tumor samples. Nearly 400-fold mean sequence coverage was achieved, and single nucleotide sequence variants, small insertions/deletions, and chromosomal copy number alterations were detected simultaneously with high accuracy compared to other methods in clinical use. Putatively actionable genomic alterations, including those that predict sensitivity or resistance to established and experimental therapies, were detected in each tumor sample tested. Thus, targeted deep sequencing of clinical tumor material may enable mutation-driven clinical trials and, ultimately, “personalized” cancer treatment. PMID:22585170

  18. Development of a Massively Parallel Particle-Mesh Algorithm for Simulations of Galaxy Dynamics and Plasmas

    NASA Astrophysics Data System (ADS)

    Wallin, John

    1996-01-01

    Particle-mesh calculations treat forces and potentials as field quantities which are represented approximately on a mesh. A system of particles is mapped onto this mesh as a density distribution of mass or charge. The Fourier transform is used to convolve this distribution with the Green's function of the potential, and a finite difference scheme is used to calculate the forces acting on the particles. The computation time scales as Ng log Ng, where Ng is the size of the computational grid. In contrast, the particle-particle method relies on direct summation, so its computing time scales as Np^2, where Np is the number of particles. The particle-mesh method is best suited for simulations with a fixed minimum resolution and for collisionless systems, while hierarchical tree codes have proven to be superior for collisional systems where two-body interactions are important. Particle-mesh methods still dominate in plasma physics, where collisionless systems are modeled. The CM-200 Connection Machine produced by Thinking Machines Corp. is a data-parallel system. On this system, the front-end computer controls the timing and execution of the parallel processing units. The programming paradigm is Single-Instruction, Multiple-Data (SIMD). The processors on the CM-200 are connected in an N-dimensional hypercube; the largest number of links a message will ever have to make is N. As in all parallel computing, the efficiency of an algorithm is primarily determined by the fraction of the time spent communicating compared to that spent computing. Because of the topology of the processors, nearest-neighbor communication is more efficient than general communication.
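
    A minimal 1D periodic particle-mesh step, assuming NumPy, showing the deposit / FFT-convolve / difference / gather cycle the abstract describes; deposition is nearest-grid-point for brevity and the sign conventions are schematic (production codes use CIC or TSC deposition and 3D grids):

      import numpy as np

      Ng, Np, L = 64, 1000, 1.0
      rng = np.random.default_rng(2)
      x = rng.random(Np) * L
      cell = (x / L * Ng).astype(int) % Ng

      rho = np.bincount(cell, minlength=Ng).astype(float)  # mass deposit
      rho -= rho.mean()                                    # periodic mean-free

      k = 2 * np.pi * np.fft.fftfreq(Ng, d=L / Ng)
      green = np.zeros(Ng)
      green[1:] = -1.0 / k[1:] ** 2        # Poisson: phi_k = -rho_k / k^2
      phi = np.fft.ifft(np.fft.fft(rho) * green).real

      # Finite-difference the potential, then gather forces at particles.
      E = -(np.roll(phi, -1) - np.roll(phi, 1)) / (2 * L / Ng)
      force = E[cell]
      print(force[:5])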

  19. The implementation of the upwind leapfrog scheme for 3D electromagnetic scattering on massively parallel computers

    SciTech Connect

    Nguyen, B.T.; Hutchinson, S.A.

    1995-07-01

    The upwind leapfrog scheme for electromagnetic scattering is briefly described. Its application to the 3D Maxwell's time domain equations is shown in detail. The scheme's use of upwind characteristic variables and a narrow stencil results in a smaller demand in communication overhead, making it ideal for implementation on distributed memory parallel computers. The algorithm's implementation on two message passing computers, a 1024-processor nCUBE 2 and a 1840-processor Intel Paragon, is described. Performance evaluation demonstrates that the scheme performs well, with both good scaling qualities and high efficiencies on these machines.

  20. Extended computational kernels in a massively parallel implementation of the Trotter-Suzuki approximation

    NASA Astrophysics Data System (ADS)

    Wittek, Peter; Calderaro, Luca

    2015-12-01

    We extended a parallel and distributed implementation of the Trotter-Suzuki algorithm for simulating quantum systems to study a wider range of physical problems and to make the library easier to use. The new release allows periodic boundary conditions, many-body simulations of non-interacting particles, arbitrary stationary potential functions, and imaginary time evolution to approximate the ground state energy. The new release is more resilient to the computational environment: a wider range of compiler chains and more platforms are supported. To ease development, we provide a more extensive command-line interface, an application programming interface, and wrappers from high-level languages.

  1. Exposing malaria in-host diversity and estimating population diversity by capture-recapture using massively parallel pyrosequencing

    PubMed Central

    Juliano, Jonathan J.; Porter, Kimberly; Mwapasa, Victor; Sem, Rithy; Rogers, William O.; Ariey, Frédéric; Wongsrichanalai, Chansuda; Read, Andrew; Meshnick, Steven R.

    2010-01-01

    Malaria infections commonly contain multiple genetically distinct variants. Mathematical and animal models suggest that interactions among these variants have a profound impact on the emergence of drug resistance. However, methods currently used for quantifying parasite diversity in individual infections are insensitive to low-abundance variants and are not quantitative for variant population sizes. To more completely describe the in-host complexity and ecology of malaria infections, we used massively parallel pyrosequencing to characterize malaria parasite diversity in the infections of a group of patients. By individually sequencing single strands of DNA in a complex mixture, this technique can quantify uncommon variants in mixed infections. The in-host diversity revealed by this method far exceeded that described by currently recommended genotyping methods, with as many as sixfold more variants per infection. In addition, in paired pre- and posttreatment samples, we show a complex milieu of parasites, including variants likely up-selected and down-selected by drug therapy. As with all surveys of diversity, sampling limitations prevent full discovery and differences in sampling effort can confound comparisons among samples, hosts, and populations. Here, we used ecological approaches of species accumulation curves and capture-recapture to estimate the number of variants we failed to detect in the population, and show that these methods enable comparisons of diversity before and after treatment, as well as between malaria populations. The combination of ecological statistics and massively parallel pyrosequencing provides a powerful tool for studying the evolution of drug resistance and the in-host ecology of malaria infections. PMID:21041629
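
    The capture-recapture step can be illustrated with Chapman's bias-corrected estimator, one standard form of the ecological statistic the authors borrow; the variant IDs below are invented:

      def chapman_estimate(capture1, capture2):
          # Estimate total variant richness from two "captures" of the
          # same infection: variants seen in both runs play the role of
          # recaptured (marked) animals.
          n1, n2 = len(capture1), len(capture2)
          m = len(set(capture1) & set(capture2))
          return (n1 + 1) * (n2 + 1) / (m + 1) - 1

      run1 = {"v1", "v2", "v3", "v4", "v5", "v6"}
      run2 = {"v4", "v5", "v6", "v7", "v8"}
      print(round(chapman_estimate(run1, run2), 1),
            "variants estimated in host")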

  2. Inter-laboratory evaluation of SNP-based forensic identification by massively parallel sequencing using the Ion PGM™.

    PubMed

    Eduardoff, M; Santos, C; de la Puente, M; Gross, T E; Fondevila, M; Strobl, C; Sobrino, B; Ballard, D; Schneider, P M; Carracedo, Á; Lareu, M V; Parson, W; Phillips, C

    2015-07-01

    Next generation sequencing (NGS) offers the opportunity to analyse forensic DNA samples and obtain massively parallel coverage of targeted short sequences with the variants they carry. We evaluated the levels of sequence coverage, genotyping precision, sensitivity and mixed DNA patterns of a prototype version of the first commercial forensic NGS kit: the HID-Ion AmpliSeq™ Identity Panel with 169 markers designed for the Ion PGM™ system. Evaluations were made between three laboratories following closely matched Ion PGM™ protocols and a simple validation framework of shared DNA controls. The sequence coverage obtained was extensive for the bulk of SNPs targeted by the HID-Ion AmpliSeq™ Identity Panel. Sensitivity studies showed 90-95% of SNP genotypes could be obtained from 25 to 100 pg of input DNA. Genotyping concordance tests included Coriell cell-line control DNA analyses checked against whole-genome sequencing data from 1000 Genomes and Complete Genomics, indicating a very high concordance rate of 99.8%. Discordant genotypes detected in rs1979255, rs1004357, rs938283, rs2032597 and rs2399332 indicate these loci should be excluded from the panel. Therefore, the HID-Ion AmpliSeq™ Identity Panel and Ion PGM™ system provide a sensitive and accurate forensic SNP genotyping assay. However, low-level DNA produced much more varied sequence coverage, and in forensic use the Ion PGM™ system will require careful calibration of the total samples loaded per chip to preserve the genotyping reliability seen in routine forensic DNA analysis. Furthermore, assessments of mixed DNA indicate the user's control of sequence analysis parameter settings is necessary to ensure mixtures are detected robustly. Given the sensitivity of Ion PGM™, this aspect of forensic genotyping requires further optimisation before massively parallel sequencing is applied to routine casework. PMID:25955683

  3. A massively parallel method of characteristic neutral particle transport code for GPUs

    SciTech Connect

    Boyd, W. R.; Smith, K.; Forget, B.

    2013-07-01

    Over the past 20 years, parallel computing has enabled computers to grow ever larger and more powerful while scientific applications have advanced in sophistication and resolution. This trend is being challenged, however, as the power consumption for conventional parallel computing architectures has risen to unsustainable levels and memory limitations have come to dominate compute performance. Heterogeneous computing platforms, such as Graphics Processing Units (GPUs), are an increasingly popular paradigm for solving these issues. This paper explores the applicability of GPUs for deterministic neutron transport. A 2D method of characteristics (MOC) code - OpenMOC - has been developed with solvers for both shared memory multi-core platforms as well as GPUs. The multi-threading and memory locality methodologies for the GPU solver are presented. Performance results for the 2D C5G7 benchmark demonstrate 25-35× speedup for MOC on the GPU. The lessons learned from this case study will provide the basis for further exploration of MOC on GPUs as well as design decisions for hardware vendors exploring technologies for the next generation of machines for scientific computing. (authors)

  4. Harnessing the killer micros: Applications from LLNL's massively parallel computing initiative

    SciTech Connect

    Belak, J.F.

    1991-07-01

    Recent developments in microprocessor technology have led to performance on scalar applications exceeding traditional supercomputers. This suggests that coupling hundreds or even thousands of these "killer-micros" (all working on a single physical problem) may lead to performance on vector applications in excess of vector supercomputers. Also, future generation killer-micros are expected to have vector floating point units as well. The purpose of this paper is to present an overview of the parallel computing environment at Lawrence Livermore National Laboratory. However, the perspective is necessarily quite narrow and most of the examples are taken from the author's implementation of a large scale molecular dynamics code on the BBN-TC2000 at LLNL. Parallelism is achieved through a geometric domain decomposition -- each processor is assigned a distinct region of space and all atoms contained therein. As the atomic positions evolve, the processors must exchange ownership of specific atoms. This geometric domain decomposition proves to be quite general and we highlight its application to image processing and hydrodynamics simulations as well. 10 refs., 6 figs.
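
    As a sketch of the geometric domain decomposition described above (illustrative only; the rank layout and function names are invented, not taken from the LLNL code), each atom is mapped to the processor owning its region of space, and atoms whose region changes between timesteps must be handed off:

```python
import numpy as np

def owner(positions, box, grid):
    """Map each atom to the rank owning its spatial cell.

    positions: (N, 3) coordinates; box: (3,) periodic box lengths;
    grid: (px, py, pz) processor grid. Row-major rank layout, for
    illustration only.
    """
    frac = (positions % box) / box              # wrap into the periodic box
    cell = np.floor(frac * grid).astype(int)    # integer cell index per axis
    cell = np.minimum(cell, np.array(grid) - 1)
    px, py, pz = grid
    return (cell[:, 0] * py + cell[:, 1]) * pz + cell[:, 2]

# After each timestep, atoms whose owner changed must be shipped to the
# new rank; here we just count them.
box = np.array([10.0, 10.0, 10.0])
grid = (2, 2, 2)
old = np.random.rand(1000, 3) * box
new = old + 0.1 * np.random.randn(1000, 3)
moved = owner(old, box, grid) != owner(new, box, grid)
print(moved.sum(), "atoms change ownership this step")
```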

  5. A practical approach to portability and performance problems on massively parallel supercomputers

    SciTech Connect

    Beazley, D.M.; Lomdahl, P.S.

    1994-12-08

    We present an overview of the tactics we have used to achieve a high-level of performance while improving portability for a large-scale molecular dynamics code SPaSM. SPaSM was originally implemented in ANSI C with message passing for the Connection Machine 5 (CM-5). In 1993, SPaSM was selected as one of the winners in the IEEE Gordon Bell Prize competition for sustaining 50 Gflops on the 1024 node CM-5 at Los Alamos National Laboratory. Achieving this performance on the CM-5 required rewriting critical sections of code in CDPEAC assembler language. In addition, the code made extensive use of CM-5 parallel I/O and the CMMD message passing library. Given this highly specialized implementation, we describe how we have ported the code to the Cray T3D and high performance workstations. In addition we will describe how it has been possible to do this using a single version of source code that runs on all three platforms without sacrificing any performance. Sound too good to be true? We hope to demonstrate that one can realize both code performance and portability without relying on the latest and greatest prepackaged tool or parallelizing compiler.

  6. Library Preparation and Multiplex Capture for Massive Parallel Sequencing Applications Made Efficient and Easy

    PubMed Central

    Neiman, Mårten; Sundling, Simon; Grönberg, Henrik; Hall, Per; Czene, Kamila

    2012-01-01

    During the recent years, rapid development of sequencing technologies and a competitive market has enabled researchers to perform massive sequencing projects at a reasonable cost. As the price for the actual sequencing reactions drops, enabling more samples to be sequenced, the relative price for preparing libraries gets larger and the practical laboratory work becomes complex and tedious. We present a cost-effective strategy for simplified library preparation compatible with both whole genome- and targeted sequencing experiments. An optimized enzyme composition and reaction buffer reduces the number of required clean-up steps and allows for usage of bulk enzymes which makes the whole process cheap, efficient and simple. We also present a two-tagging strategy, which allows for multiplex sequencing of targeted regions. To prove our concept, we have prepared libraries for low-pass sequencing from 100 ng DNA, performed 2-, 4- and 8-plex exome capture and a 96-plex capture of a 500 kb region. In all samples we see a high concordance (>99.4%) of SNP calls when comparing to commercially available SNP-chip platforms. PMID:23139805

  7. On the magnetic mirroring as the basic cause of parallel electric fields. [in magnetosphere

    NASA Technical Reports Server (NTRS)

    Lennartsson, W.

    1976-01-01

    Among the different proposed mechanisms for generating parallel electric fields, magnetic mirroring of charged particles seems to be the most plausible. In the present paper, it is suggested that magnetic mirroring is the basic cause of parallel electric fields in the magnetosphere and that the magnetic mirroring effect may be able to form the basis of an auroral theory that can remove a major portion of the ambiguity of observations. In the model proposed, the parallel electric field is due to a magnetic confinement of a negatively charged hot collision-free plasma. A transfer of electron gyroenergy into wave energy tends to weaken this confinement; if this energy transfer becomes too strong, the parallel potential gradient will break down. Hence, from this model, in contrast to certain other models of parallel electric fields, only a small fraction of the total auroral particle energy may be expected to be transformed into electromagnetic wave energy during the acceleration process.
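
    The guiding-center relations behind this mirroring argument (standard textbook results, not quoted from the paper itself) can be written as follows:

```latex
% Adiabatic invariance of the magnetic moment, and the parallel energy
% balance along a field line with field strength B(s) and potential V(s):
\mu = \frac{m v_{\perp}^{2}}{2B} = \text{const},
\qquad
\tfrac{1}{2}\, m v_{\parallel}^{2}(s) = E - \mu\, B(s) - q\, V(s).
% A particle mirrors where the right-hand side vanishes; differential
% magnetic confinement of hot electrons and cooler ions is what sustains
% the parallel potential drop V(s) in models of this kind.
```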

  8. Implementation of Helioseismic Data Reduction and Diagnostic Techniques on Massively Parallel Architectures

    NASA Technical Reports Server (NTRS)

    Korzennik, Sylvain

    1997-01-01

    Under the direction of Dr. Rhodes, and the technical supervision of Dr. Korzennik, the data assimilation of high spatial resolution solar dopplergrams has been carried out throughout the program on the Intel Delta Touchstone supercomputer. With the help of a research assistant, partially supported by this grant, and under the supervision of Dr. Korzennik, code development was carried out at SAO, using various available resources. To ensure cross-platform portability, PVM was selected as the message passing library. A parallel implementation of power spectra computation for helioseismology data reduction, using PVM, was successfully completed. It was successfully ported to SMP architectures (i.e. SUN), and to some MPP architectures (i.e. the CM5). Due to limitations of the PVM implementation on the Cray T3D, the port to that architecture was not completed at the time.

  9. On Deciding between Conservative and Optimistic Approaches on Massively Parallel Platforms

    SciTech Connect

    Carothers, Prof. Christopher D.; Perumalla, Kalyan S

    2010-01-01

    Over 5000 publications on parallel discrete event simulation (PDES) have appeared in the literature to date. Nevertheless, few articles have focused on empirical studies of PDES performance on large supercomputer-based systems. This gap is bridged here by undertaking a parameterized performance study on thousands of processor cores of a Blue Gene supercomputing system. In contrast to theoretical insights from analytical studies, our study is based on actual implementation in software, incurring the actual messaging and computational overheads for both conservative and optimistic synchronization approaches of PDES. Complex and counter-intuitive effects are uncovered and analyzed, with different event timestamp distributions and available levels of concurrency in the synthetic benchmark models. The results are intended to provide guidance to the PDES community in terms of how the synchronization protocols behave at high processor core counts using a state-of-the-art supercomputing system.

  10. Neptune: An astrophysical smooth particle hydrodynamics code for massively parallel computer architectures

    NASA Astrophysics Data System (ADS)

    Sandalski, Stou

    Smooth particle hydrodynamics is an efficient method for modeling the dynamics of fluids. It is commonly used to simulate astrophysical processes such as binary mergers. We present a newly developed GPU accelerated smooth particle hydrodynamics code for astrophysical simulations. The code is named neptune after the Roman god of water. It is written in OpenMP parallelized C++ and OpenCL and includes octree based hydrodynamic and gravitational acceleration. The design relies on object-oriented methodologies in order to provide a flexible and modular framework that can be easily extended and modified by the user. Several pre-built scenarios for simulating collisions of polytropes and black-hole accretion are provided. The code is released under the MIT Open Source license and publicly available at http://code.google.com/p/neptune-sph/.

  11. The transition to massively parallel computing within a production environment at a DOE access center

    SciTech Connect

    McCoy, M.G.

    1993-04-01

    In contemplating the transition from sequential to MP computing, the National Energy Research Supercomputer Center (NERSC) is faced with the frictions inherent in the duality of its mission. There have been two goals: the first has been to provide a stable, serviceable production environment to the user base; the second, to bring the most capable early serial supercomputers to the Center to make possible leading-edge simulations. This seeming conundrum has in reality been a source of strength. The task of meeting both goals was faced before with the CRAY 1 which, as delivered, was all iron; so the problems associated with the advent of parallel computers are not entirely new, but they are serious. Current vector supercomputers, such as the C90, offer mature production environments, including software tools, a large applications base, and generality; these machines can be used to attack the spectrum of scientific applications by a large user base knowledgeable in programming techniques for this architecture. Parallel computers to date have offered less developed, even rudimentary, working environments, a sparse applications base, and forced specialization. They have been specialized in terms of programming models, and specialized in terms of the kinds of applications which would do well on the machines. Given this context, why do many service computer centers feel that now is the time to cease or slow the procurement of traditional vector supercomputers in favor of MP systems? What are some of the issues that NERSC must face to engineer a smooth transition? The answers to these questions are multifaceted and by no means completely clear. However, a route exists as a result of early efforts at the Laboratories combined with research within the HPCC Program. One can begin with an analysis of why the hardware and software appearing shortly should be made available to the mainstream, and then address what would be required in an initial production environment.

  12. Dissecting the target specificity of RNase H recruiting oligonucleotides using massively parallel reporter analysis of short RNA motifs

    PubMed Central

    Rukov, Jakob Lewin; Hagedorn, Peter H.; Høy, Isabel Bro; Feng, Yanping; Lindow, Morten; Vinther, Jeppe

    2015-01-01

    Processing and post-transcriptional regulation of RNA often depend on binding of regulatory molecules to short motifs in RNA. The effects of such interactions are difficult to study, because most regulatory molecules recognize partially degenerate RNA motifs, embedded in a sequence context specific for each RNA. Here, we describe Library Sequencing (LibSeq), an accurate massively parallel reporter method for completely characterizing the regulatory potential of thousands of short RNA sequences in a specific context. By sequencing cDNA derived from a plasmid library expressing identical reporter genes except for a degenerate 7mer subsequence in the 3′UTR, the regulatory effects of each 7mer can be determined. We show that LibSeq identifies regulatory motifs used by RNA-binding proteins and microRNAs. We furthermore apply the method to cells transfected with RNase H recruiting oligonucleotides to obtain quantitative information for >15000 potential target sequences in parallel. These comprehensive datasets provide insights into the specificity requirements of RNase H and allow a specificity measure to be calculated for each tested oligonucleotide. Moreover, we show that inclusion of chemical modifications in the central part of an RNase H recruiting oligonucleotide can increase its sequence-specificity. PMID:26220183

  13. Running ATLAS workloads within massively parallel distributed applications using Athena Multi-Process framework (AthenaMP)

    NASA Astrophysics Data System (ADS)

    Calafiura, Paolo; Leggett, Charles; Seuster, Rolf; Tsulaia, Vakhtang; Van Gemmeren, Peter

    2015-12-01

    AthenaMP is a multi-process version of the ATLAS reconstruction, simulation and data analysis framework Athena. By leveraging Linux fork and copy-on-write mechanisms, it allows for sharing of memory pages between event processors running on the same compute node with little to no change in the application code. Originally targeted to optimize the memory footprint of reconstruction jobs, AthenaMP has demonstrated that it can reduce the memory usage of certain configurations of ATLAS production jobs by a factor of 2. AthenaMP has also evolved to become the parallel event-processing core of the recently developed ATLAS infrastructure for fine-grained event processing (Event Service) which allows the running of AthenaMP inside massively parallel distributed applications on hundreds of compute nodes simultaneously. We present the architecture of AthenaMP, various strategies implemented by AthenaMP for scheduling workload to worker processes (for example: Shared Event Queue and Shared Distributor of Event Tokens) and the usage of AthenaMP in the diversity of ATLAS event processing workloads on various computing resources: Grid, opportunistic resources and HPC.
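
    A minimal sketch of the fork/copy-on-write pattern AthenaMP exploits (Python and POSIX fork are used here purely for illustration; AthenaMP itself is a C++/Python framework and its worker scheduling is far more elaborate): a large read-only structure is built once in the parent, and forked workers share its physical memory pages until they write to them.

```python
import os

# Build a large read-only structure once in the parent; forked children
# share these pages copy-on-write instead of duplicating them.
geometry = [float(i) for i in range(1_000_000)]  # stand-in for detector data

events = list(range(8))   # a shared event queue would be used in practice
children = []
for worker in range(4):
    pid = os.fork()
    if pid == 0:                  # child: process its slice of events
        for ev in events[worker::4]:
            _ = geometry[ev]      # read-only access, no page copies
        os._exit(0)
    children.append(pid)

for pid in children:
    os.waitpid(pid, 0)
print("all workers done")
```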

  14. Deep mutational scanning of an antibody against epidermal growth factor receptor using mammalian cell display and massively parallel pyrosequencing

    PubMed Central

    Forsyth, Charles M.; Juan, Veronica; Akamatsu, Yoshiko; DuBridge, Robert B.; Doan, Minhtam; Ivanov, Alexander V.; Ma, Zhiyuan; Polakoff, Dixie; Razo, Jennifer; Wilson, Keith; Powers, David B.

    2013-01-01

    We developed a method for deep mutational scanning of antibody complementarity-determining regions (CDRs) that can determine in parallel the effect of every possible single amino acid CDR substitution on antigen binding. The method uses libraries of full length IgGs containing more than 1000 CDR point mutations displayed on mammalian cells, sorted by flow cytometry into subpopulations based on antigen affinity and analyzed by massively parallel pyrosequencing. Higher, lower and neutral affinity mutations are identified by their enrichment or depletion in the FACS subpopulations. We applied this method to a humanized version of the anti-epidermal growth factor receptor antibody cetuximab, generated a near comprehensive data set for 1060 point mutations that recapitulates previously determined structural and mutational data for these CDRs and identified 67 point mutations that increase affinity. The large-scale, comprehensive sequence-function data sets generated by this method should have broad utility for engineering properties such as antibody affinity and specificity and may advance theoretical understanding of antibody-antigen recognition. PMID:23765106
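
    The enrichment/depletion analysis described above can be sketched as a per-mutation log-ratio of variant frequencies between a FACS-sorted subpopulation and the unsorted library. The counts, pseudocount, and layout below are hypothetical, not the authors' exact pipeline:

```python
import math

def log2_enrichment(counts_sorted, counts_ref, pseudo=0.5):
    """Per-mutation log2 enrichment of a sorted bin vs. the unsorted
    library (counts are reads per variant; pseudocount avoids log(0))."""
    tot_s = sum(counts_sorted.values())
    tot_r = sum(counts_ref.values())
    out = {}
    for mut in counts_ref:
        fs = (counts_sorted.get(mut, 0) + pseudo) / tot_s
        fr = (counts_ref[mut] + pseudo) / tot_r
        out[mut] = math.log2(fs / fr)
    return out

ref  = {"WT": 5000, "Y32A": 400, "S55R": 350}   # hypothetical libraries
high = {"WT": 5200, "Y32A":  30, "S55R": 900}
for mut, e in log2_enrichment(high, ref).items():
    print(f"{mut}: {e:+.2f}")  # positive = enriched in high-affinity bin
```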

  15. cuTauLeaping: A GPU-Powered Tau-Leaping Stochastic Simulator for Massive Parallel Analyses of Biological Systems

    PubMed Central

    Besozzi, Daniela; Pescini, Dario; Mauri, Giancarlo

    2014-01-01

    Tau-leaping is a stochastic simulation algorithm that efficiently reconstructs the temporal evolution of biological systems, modeled according to the stochastic formulation of chemical kinetics. The analysis of dynamical properties of these systems in physiological and perturbed conditions usually requires the execution of a large number of simulations, leading to high computational costs. Since each simulation can be executed independently of the others, a massive parallelization of tau-leaping can bring relevant reductions of the overall running time. The emerging field of General Purpose Graphic Processing Units (GPGPU) provides power-efficient high-performance computing at a relatively low cost. In this work we introduce cuTauLeaping, a stochastic simulator of biological systems that makes use of GPGPU computing to execute multiple parallel tau-leaping simulations, by fully exploiting Nvidia's Fermi GPU architecture. We show how a considerable computational speedup is achieved on GPU by partitioning the execution of tau-leaping into multiple separated phases, and we describe how to avoid some implementation pitfalls related to the scarcity of memory resources on the GPU streaming multiprocessors. Our results show that cuTauLeaping largely outperforms the CPU-based tau-leaping implementation when the number of parallel simulations increases, with a break-even directly depending on the size of the biological system and on the complexity of its emergent dynamics. In particular, cuTauLeaping is exploited to investigate the probability distribution of bistable states in the Schlögl model, and to carry out a bidimensional parameter sweep analysis to study the oscillatory regimes in the Ras/cAMP/PKA pathway in S. cerevisiae. PMID:24663957
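
    For orientation, the core of a tau-leaping step draws a Poisson number of firings for each reaction channel over the leap interval. A minimal CPU sketch on a toy two-species system (the GPU phase partitioning of cuTauLeaping, and proper handling of critical reactions, are not reproduced; clamping to zero is a crude safeguard):

```python
import numpy as np

def tau_leap_step(x, tau, rates, stoich, rng):
    """One tau-leaping step: fire each reaction a Poisson number of times.

    x: species counts; rates(x) -> propensities a_j(x);
    stoich: (n_reactions, n_species) state-change matrix.
    """
    a = rates(x)
    k = rng.poisson(a * tau)               # firings per reaction channel
    return np.maximum(x + k @ stoich, 0)   # clamp to avoid negative counts

# Toy system: A -> B (rate c1*A), B -> A (rate c2*B)
rates  = lambda x: np.array([0.1 * x[0], 0.05 * x[1]])
stoich = np.array([[-1, 1], [1, -1]])
rng = np.random.default_rng(0)
x = np.array([1000, 0])
for _ in range(100):
    x = tau_leap_step(x, 0.1, rates, stoich, rng)
print("A, B after 10 time units:", x)
```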

  16. Development and characterization of hollow microprobe array as a potential tool for versatile and massively parallel manipulation of single cells.

    PubMed

    Nagai, Moeto; Oohara, Kiyotaka; Kato, Keita; Kawashima, Takahiro; Shibata, Takayuki

    2015-04-01

    Parallel manipulation of single cells is important for reconstructing in vivo cellular microenvironments and studying cell functions. To manipulate single cells and reconstruct their environments, development of a versatile manipulation tool is necessary. In this study, we developed an array of hollow probes using microelectromechanical systems fabrication technology and demonstrated the manipulation of single cells. We conducted a cell aspiration experiment with a glass pipette and modeled a cell using a standard linear solid model, which provided information for designing hollow stepped probes for minimally invasive single-cell manipulation. We etched a silicon wafer on both sides and formed through holes with stepped structures. The inner diameters of the holes were reduced by SiO2 deposition using plasma-enhanced chemical vapor deposition to trap cells on the tips. This fabrication process makes it possible to control the wall thickness, inner diameter, and outer diameter of the probes. With the fabricated probes, single cells were manipulated and placed in microwells at a single-cell level in a parallel manner. We studied the capture, release, and survival rates of cells at different suction and release pressures and found that the cell trapping rate was directly proportional to the suction pressure, whereas the release rate and viability decreased with increasing suction pressure. The proposed manipulation system makes it possible to place cells in a well array and observe the adherence, spreading, culture, and death of the cells. This system has potential as a tool for massively parallel manipulation and for three-dimensional heterocellular assays. PMID:25749639

  17. Massively parallel haplotyping on microscopic beads for the high-throughput phase analysis of single molecules.

    PubMed

    Boulanger, Jérôme; Muresan, Leila; Tiemann-Boege, Irene

    2012-01-01

    In spite of the many advances in haplotyping methods, it is still very difficult to characterize rare haplotypes in tissues and different environmental samples or to accurately assess the haplotype diversity in large mixtures. This would require a haplotyping method capable of analyzing the phase of single molecules with an unprecedented throughput. Here we describe such a haplotyping method capable of analyzing in parallel hundreds of thousands of single molecules in one experiment. In this method, multiple PCR reactions amplify different polymorphic regions of a single DNA molecule on a magnetic bead compartmentalized in an emulsion drop. The allelic states of the amplified polymorphisms are identified with fluorescently labeled probes that are then decoded from images taken of the arrayed beads by a microscope. This method can evaluate the phase of up to 3 polymorphisms separated by up to 5 kilobases in hundreds of thousands of single molecules. We tested the sensitivity of the method by measuring the number of mutant haplotypes synthesized by four different commercially available enzymes: Phusion, Platinum Taq, Titanium Taq, and Phire. The digital nature of the method makes it sensitive enough to detect haplotype ratios of less than 1:10,000. We also accurately quantified chimera formation during the exponential phase of PCR by different DNA polymerases. PMID:22558329

  18. Efficient massively parallel simulation of dynamic channel assignment schemes for wireless cellular communications

    NASA Technical Reports Server (NTRS)

    Greenberg, Albert G.; Lubachevsky, Boris D.; Nicol, David M.; Wright, Paul E.

    1994-01-01

    Fast, efficient parallel algorithms are presented for discrete event simulations of dynamic channel assignment schemes for wireless cellular communication networks. The driving events are call arrivals and departures, in continuous time, to cells geographically distributed across the service area. A dynamic channel assignment scheme decides which call arrivals to accept, and which channels to allocate to the accepted calls, attempting to minimize call blocking while ensuring co-channel interference is tolerably low. Specifically, the scheme ensures that the same channel is used concurrently at different cells only if the pairwise distances between those cells are sufficiently large. Much of the complexity of the system comes from ensuring this separation. The network is modeled as a system of interacting continuous time automata, each corresponding to a cell. To simulate the model, conservative methods are used; i.e., methods in which no errors occur in the course of the simulation and so no rollback or relaxation is needed. Implemented on a 16K processor MasPar MP-1, an elegant and simple technique provides speedups of about 15 times over an optimized serial simulation running on a high speed workstation. A drawback of this technique, typical of conservative methods, is that processor utilization is rather low. To overcome this, new methods were developed that exploit slackness in event dependencies over short intervals of time, thereby raising the utilization to above 50 percent and the speedup over the optimized serial code to about 120 times.
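
    The co-channel separation constraint can be illustrated with a deliberately simple greedy assignment policy (illustrative only, not one of the schemes simulated in the paper): a channel is reused only if every cell currently using it is at least a minimum distance away, otherwise the call is blocked.

```python
import math

def assign(cell, in_use, n_channels, min_dist):
    """Greedy dynamic channel assignment: return the first channel whose
    current users are all at least min_dist from `cell`, else None (block).

    in_use: dict mapping channel -> list of (x, y) cells using it.
    """
    for ch in range(n_channels):
        if all(math.dist(cell, c) >= min_dist for c in in_use.get(ch, [])):
            in_use.setdefault(ch, []).append(cell)
            return ch
    return None  # call blocked

in_use = {}
print(assign((0, 0), in_use, 3, 2.0))  # -> 0
print(assign((1, 0), in_use, 3, 2.0))  # -> 1 (too close to reuse channel 0)
print(assign((5, 5), in_use, 3, 2.0))  # -> 0 (far enough to reuse it)
```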

  19. Delta: An object-oriented finite element code architecture for massively parallel computers

    SciTech Connect

    Weatherby, J.R.; Schutt, J.A.; Peery, J.S.; Hogan, R.E.

    1996-02-01

    Delta is an object-oriented code architecture based on the finite element method which enables simulation of a wide range of engineering mechanics problems in a parallel processing environment. Written in C++, Delta is a natural framework for algorithm development and for research involving coupling of mechanics from different Engineering Science disciplines. To enhance flexibility and encourage code reuse, the architecture provides a clean separation of the major aspects of finite element programming. Spatial discretization, temporal discretization, and the solution of linear and nonlinear systems of equations are each implemented separately, independent from the governing field equations. Other attractive features of the Delta architecture include support for constitutive models with internal variables, reusable "matrix-free" equation solvers, and support for region-to-region variations in the governing equations and the active degrees of freedom. A demonstration code built from the Delta architecture has been used in two-dimensional and three-dimensional simulations involving dynamic and quasi-static solid mechanics, transient and steady heat transport, and flow in porous media.

  20. Sassena — X-ray and neutron scattering calculated from molecular dynamics trajectories using massively parallel computers

    NASA Astrophysics Data System (ADS)

    Lindner, Benjamin; Smith, Jeremy C.

    2012-07-01

    Massively parallel computers now permit the molecular dynamics (MD) simulation of multi-million atom systems on time scales up to the microsecond. However, the subsequent analysis of the resulting simulation trajectories has now become a high performance computing problem in itself. Here, we present software for calculating X-ray and neutron scattering intensities from MD simulation data that scales well on massively parallel supercomputers. The calculation and data staging schemes used maximize the degree of parallelism and minimize the IO bandwidth requirements. The strong scaling tested on the Jaguar Petaflop Cray XT5 at Oak Ridge National Laboratory exhibits virtually linear scaling up to 7000 cores for most benchmark systems. Since both MPI and thread parallelism are supported, the software is flexible enough to cover scaling demands for different types of scattering calculations. The result is a high performance tool capable of unifying large-scale supercomputing and a wide variety of neutron/synchrotron technology.
    Catalogue identifier: AELW_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AELW_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: GNU General Public License, version 3
    No. of lines in distributed program, including test data, etc.: 1 003 742
    No. of bytes in distributed program, including test data, etc.: 798
    Distribution format: tar.gz
    Programming language: C++, OpenMPI
    Computer: Distributed Memory, Cluster of Computers with high performance network, Supercomputer
    Operating system: UNIX, LINUX, OSX
    Has the code been vectorized or parallelized?: Yes, the code has been parallelized using MPI directives. Tested with up to 7000 processors
    RAM: Up to 1 Gbytes/core
    Classification: 6.5, 8
    External routines: Boost Library, FFTW3, CMAKE, GNU C++ Compiler, OpenMPI, LibXML, LAPACK
    Nature of problem: Recent developments in supercomputing allow molecular dynamics simulations to

  1. Modeling cardiovascular hemodynamics using the lattice Boltzmann method on massively parallel supercomputers

    NASA Astrophysics Data System (ADS)

    Randles, Amanda Elizabeth

    the modeling of fluids in vessels with smaller diameters and a method for introducing the deformational forces exerted on the arterial flows from the movement of the heart by borrowing concepts from cosmodynamics are presented. These additional forces have a great impact on the endothelial shear stress. Third, the fluid model is extended to not only recover Navier-Stokes hydrodynamics, but also a wider range of Knudsen numbers, which is especially important in micro- and nano-scale flows. The tradeoffs of many optimization methods such as the use of deep halo level ghost cells that, alongside hybrid programming models, reduce the impact of such higher-order models and enable efficient modeling of extreme regimes of computational fluid dynamics are discussed. Fourth, the extension of these models to other research questions like clogging in microfluidic devices and determining the severity of coarctation of the aorta is presented. Through this work, a validation of these methods by taking real patient data and the measured pressure value before the narrowing of the aorta and predicting the pressure drop across the coarctation is shown. Comparison with the measured pressure drop in vivo highlights the accuracy and potential impact of such patient specific simulations. Finally, a method to enable the simulation of longer trajectories in time by discretizing both spatially and temporally is presented. In this method, a serial coarse iterator is used to initialize data at discrete time steps for a fine model that runs in parallel. This coarse solver is based on a larger time step and typically a coarser discretization in space. Iterative refinement enables the compute-intensive fine iterator to be modeled with temporal parallelization. The algorithm consists of a series of predictor-corrector iterations completing when the results have converged within a certain tolerance. Combined, these developments allow large fluid models to be simulated for longer time durations
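
    The temporal discretization described here has the structure of a parareal-style predictor-corrector. Below is a minimal sketch on a scalar ODE, with explicit Euler standing in for both the coarse and fine solvers (illustrative only, not the thesis code):

```python
import numpy as np

def integrate(u, t0, t1, nsteps, f):
    """Explicit Euler from t0 to t1 in nsteps (stand-in solver)."""
    dt = (t1 - t0) / nsteps
    for _ in range(nsteps):
        u = u + dt * f(u)
    return u

def parareal(u0, t_grid, f, n_coarse=1, n_fine=100, iters=5):
    N = len(t_grid) - 1
    U = np.empty(N + 1)
    U[0] = u0
    # serial coarse pass provides initial values at each time slice
    for n in range(N):
        U[n + 1] = integrate(U[n], t_grid[n], t_grid[n + 1], n_coarse, f)
    for _ in range(iters):
        # fine solves over the slices are independent -> parallel in practice
        F = [integrate(U[n], t_grid[n], t_grid[n + 1], n_fine, f) for n in range(N)]
        G_old = [integrate(U[n], t_grid[n], t_grid[n + 1], n_coarse, f) for n in range(N)]
        Unew = np.empty_like(U)
        Unew[0] = u0
        for n in range(N):  # serial correction sweep
            g_new = integrate(Unew[n], t_grid[n], t_grid[n + 1], n_coarse, f)
            Unew[n + 1] = g_new + F[n] - G_old[n]
        U = Unew
    return U

f = lambda u: -u                       # du/dt = -u, exact solution exp(-t)
t = np.linspace(0.0, 2.0, 9)
print(parareal(1.0, t, f)[-1], np.exp(-2.0))
```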

  2. Massively Parallel Geostatistical Inversion of Coupled Processes in Heterogeneous Porous Media

    NASA Astrophysics Data System (ADS)

    Ngo, A.; Schwede, R. L.; Li, W.; Bastian, P.; Ippisch, O.; Cirpka, O. A.

    2012-04-01

    another level of parallelization has been added.

  3. Massively-parallel neuromonitoring and neurostimulation rodent headset with nanotextured flexible microelectrodes.

    PubMed

    Bagheri, Arezu; Gabran, S R I; Salam, Muhammad Tariqus; Perez Velazquez, Jose Luis; Mansour, Raafat R; Salama, M M A; Genov, Roman

    2013-10-01

    We present a compact wireless headset for simultaneous multi-site neuromonitoring and neurostimulation in the rodent brain. The system comprises flexible-shaft microelectrodes, neural amplifiers, neurostimulators, a digital time-division multiplexer (TDM), a micro-controller and a ZigBee wireless transceiver. The system is built by parallelizing up to four 0.35 μm CMOS integrated circuits (each having 256 neural amplifiers and 64 neurostimulators) to provide a total maximum of 1024 neural amplifiers and 256 neurostimulators. Each bipolar neural amplifier features 54 dB-72 dB adjustable gain, 1 Hz-5 kHz adjustable bandwidth with an input-referred noise of 7.99 μVrms and dissipates 12.9 μW. Each current-mode bipolar neurostimulator generates programmable arbitrary-waveform biphasic current in the range of 20-250 μA and dissipates 2.6 μW in the stand-by mode. Reconfigurability is provided by stacking a set of dedicated mini-PCBs that share a common signaling bus within a volume as small as 22 × 30 × 15 mm³. The system features a flexible polyimide-based microelectrode array design that is not brittle and increases pad packing density. Pad nanotexturing by electrodeposition reduces the electrode-tissue interface impedance from an average of 2 MΩ to 30 kΩ at 100 Hz. The rodent headset and the microelectrode array have been experimentally validated in vivo in freely moving rats for two months. We demonstrate a 92.8 percent seizure rate reduction by responsive neurostimulation in an acute epilepsy rat model. PMID:24144667

  4. High-Throughput Massively Parallel Sequencing for Fetal Aneuploidy Detection from Maternal Plasma

    PubMed Central

    Džakula, Željko; Kim, Sung K.; Mazloom, Amin R.; Zhu, Zhanyang; Tynan, John; Lu, Tim; McLennan, Graham; Palomaki, Glenn E.; Canick, Jacob A.; Oeth, Paul; Deciu, Cosmin; van den Boom, Dirk; Ehrich, Mathias

    2013-01-01

    Background Circulating cell-free (ccf) fetal DNA comprises 3–20% of all the cell-free DNA present in maternal plasma. Numerous research and clinical studies have described the analysis of ccf DNA using next generation sequencing for the detection of fetal aneuploidies with high sensitivity and specificity. We sought to extend the utility of this approach by assessing semi-automated library preparation, higher sample multiplexing during sequencing, and improved bioinformatic tools to enable a higher throughput, more efficient assay while maintaining or improving clinical performance. Methods Whole blood (10mL) was collected from pregnant female donors and plasma separated using centrifugation. Ccf DNA was extracted using column-based methods. Libraries were prepared using an optimized semi-automated library preparation method and sequenced on an Illumina HiSeq2000 sequencer in a 12-plex format. Z-scores were calculated for affected chromosomes using a robust method after normalization and genomic segment filtering. Classification was based upon a standard normal transformed cutoff value of z = 3 for chromosome 21 and z = 3.95 for chromosomes 18 and 13. Results Two parallel assay development studies using a total of more than 1900 ccf DNA samples were performed to evaluate the technical feasibility of automating library preparation and increasing the sample multiplexing level. These processes were subsequently combined and a study of 1587 samples was completed to verify the stability of the process-optimized assay. Finally, an unblinded clinical evaluation of 1269 euploid and aneuploid samples utilizing this high-throughput assay coupled to improved bioinformatic procedures was performed. We were able to correctly detect all aneuploid cases with extremely low false positive rates of 0.09%, <0.01%, and 0.08% for trisomies 21, 18, and 13, respectively. Conclusions These data suggest that the developed laboratory methods in concert with improved bioinformatic
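
    A generic sketch of the z-score classification step follows. The study's exact normalization and genomic-segment filtering are not reproduced here; a robust median/MAD z-score against a euploid reference, with the stated cutoffs (z = 3 for chromosome 21, z = 3.95 for chromosomes 18 and 13), is assumed purely for illustration.

```python
import numpy as np

def classify(fraction_chr, reference_fractions, cutoff):
    """Flag a sample as aneuploidy-positive if its chromosome fraction
    exceeds `cutoff` robust z-scores above the euploid reference.

    reference_fractions: chromosome fractions from known euploid samples.
    """
    ref = np.asarray(reference_fractions)
    med = np.median(ref)
    mad = 1.4826 * np.median(np.abs(ref - med))  # ~sigma for normal data
    z = (fraction_chr - med) / mad
    return z, z > cutoff

# Hypothetical euploid chr21 fractions and one test sample:
euploid = np.random.default_rng(1).normal(0.0130, 0.0002, 200)
z, positive = classify(0.0142, euploid, cutoff=3.0)
print(f"z = {z:.2f}, trisomy 21 call: {positive}")
```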

  5. Massively-parallel FDTD simulations to address mask electromagnetic effects in hyper-NA immersion lithography

    NASA Astrophysics Data System (ADS)

    Tirapu Azpiroz, Jaione; Burr, Geoffrey W.; Rosenbluth, Alan E.; Hibbs, Michael

    2008-03-01

    In the Hyper-NA immersion lithography regime, the electromagnetic response of the reticle is known to deviate in a complicated manner from the idealized Thin-Mask-like behavior. Already, this is driving certain RET choices, such as the use of polarized illumination and the customization of reticle film stacks. Unfortunately, full 3-D electromagnetic mask simulations are computationally intensive. And while OPC-compatible mask electromagnetic field (EMF) models can offer a reasonable tradeoff between speed and accuracy for full-chip OPC applications, full understanding of these complex physical effects demands higher accuracy. Our paper describes recent advances in leveraging High Performance Computing as a critical step towards lithographic modeling of the full manufacturing process. In this paper, highly accurate full 3-D electromagnetic simulation of very large mask layouts are conducted in parallel with reasonable turnaround time, using a BlueGene/L supercomputer and a Finite-Difference Time-Domain (FDTD) code developed internally within IBM. A 3-D simulation of a large 2-D layout spanning 5μm×5μm at the wafer plane (and thus 20μm×20μm×0.5μm at the mask) results in a simulation with roughly 12.5GB of memory (grid size of 10nm at the mask, single-precision computation, about 30 bytes/grid point). FDTD is flexible and easily parallelizable to enable full simulations of such large layout in approximately an hour using one BlueGene/L "midplane" containing 512 dual-processor nodes with 256MB of memory per processor. Our scaling studies on BlueGene/L demonstrate that simulations up to 100μm × 100μm at the mask can be computed in a few hours. Finally, we will show that the use of a subcell technique permits accurate simulation of features smaller than the grid discretization, thus improving on the tradeoff between computational complexity and simulation accuracy. We demonstrate the correlation of the real and quadrature components that comprise the

  6. Nanopantography: a new method for massively parallel nanopatterning over large areas.

    PubMed

    Xu, Lin; Vemula, Sri C; Jain, Manish; Nam, Sang Ki; Donnelly, Vincent M; Economou, Demetre J; Ruchhoeft, Paul

    2005-12-01

    We report a radically different approach to the versatile fabrication of nanometer-scale preselected patterns over large areas. Standard lithography, thin film deposition, and etching are used to fabricate arrays of ion-focusing microlenses (e.g., small round holes through a metal/insulator structure) on a substrate such as a silicon wafer. The substrate is then placed in a vacuum chamber, a broad-area collimated beam of ions is directed at the substrate, and electric potentials are applied to the lens arrays such that the ions focus at the bottoms of the holes (e.g., on the wafer surface). When the wafer is tilted off normal (with respect to the ion beam axis), the focal points in each hole are laterally displaced, allowing the focused beamlets to be rastered across the hole bottoms. In this "nanopantography" process, the desired pattern is replicated simultaneously in many closely spaced holes over an area limited only by the size of the broad-area ion beam. With the proper choice of ions and downstream gaseous ambient, the method can be used to deposit or etch materials. Data show that simultaneous impingement of an Ar+ beam and a Cl2 effusive beam on an array of 950-nm-diam lenses can be used to etch 10-nm-diam features into a Si substrate, a reduction of 95×. Simulations indicate that the focused "beamlet" diameters scale directly with lens diameter, thus a minimum feature size of approximately 1 nm should be possible with 90-nm-diam lenses that are at the limit of current photolithography. We expect nanopantography to become a viable method for overcoming one of the main obstacles in practical nanoscale fabrication: rapid, large-scale fabrication of virtually any shape and material nanostructure. Unlike all other focused ion or electron beam writing techniques, this self-aligned method is virtually unaffected by vibrations, thermal expansion, and other alignment problems that usually plague standard nanofabrication methods. This is because the ion

  7. Electron acceleration by parallel and perpendicular electric fields during magnetic reconnection without guide field

    NASA Astrophysics Data System (ADS)

    Bessho, N.; Chen, L.-J.; Germaschewski, K.; Bhattacharjee, A.

    2015-11-01

    Electron acceleration due to the electric field parallel to the background magnetic field during magnetic reconnection with no guide field is investigated by theory and two-dimensional electromagnetic particle-in-cell simulations and compared with acceleration due to the electric field perpendicular to the magnetic field. The magnitude of the parallel electric potential shows dependence on the ratio of the plasma frequency to the electron cyclotron frequency as (ωpe/Ωe)^-2 and on the background plasma density as nb^-1/2. In the Earth's magnetotail, the parameter ωpe/Ωe ~ 9 and the background (lobe) density can be of the order of 0.01 cm^-3, and it is expected that the parallel electric potential is not large enough to accelerate electrons up to 100 keV. Therefore, we must consider the effect of the perpendicular electric field to account for electron energization in excess of 100 keV in the Earth's magnetotail. Trajectories for high-energy electrons are traced in a simulation to demonstrate that acceleration due to the perpendicular electric field in the diffusion region is the dominant acceleration mechanism, rather than acceleration due to the parallel electric fields in the exhaust regions. For energetic electrons accelerated near the X line due to the perpendicular electric field, pitch angle scattering converts the perpendicular momentum to the parallel momentum. On the other hand, for passing electrons that are mainly accelerated by the parallel electric field, pitch angle scattering converting the parallel momentum to the perpendicular momentum occurs. In this way, particle acceleration and pitch angle scattering will generate heated electrons in the exhaust regions.

  8. Electrically charged: An effective mechanism for soft EOS supporting massive neutron star

    NASA Astrophysics Data System (ADS)

    Jing, ZhenZhen; Wen, DeHua; Zhang, XiangDong

    2015-10-01

    The discoverers of massive neutron stars argued that strange particles, such as hyperons, should be ruled out of the neutron star core, as the resulting soft Equation of State (EOS) cannot support a massive neutron star. However, many nuclear theories and laboratory experiments support that at high density strange particles will appear and the corresponding EOS of super-dense matter will become soft. This situation poses a challenge between astro-observation and nuclear physics. In this work, we introduce an effective mechanism to answer this challenge: if a neutron star is electrically charged, a soft EOS will be equivalently stiffened and thus can support a massive neutron star. By employing a representative soft EOS, it is found that in order to obtain an evident effect on the EOS, and thus increase the maximum stellar mass by the electrostatic field, the total net charge should be on the order of 10^20 C. Moreover, by comparing the results of two kinds of charge distributions, it is found that even for different distributions, a similar total charge of ~2.3 × 10^20 C is needed to support a ~2.0 M⊙ neutron star.

  9. Assessing mutant p53 in primary high-grade serous ovarian cancer using immunohistochemistry and massively parallel sequencing

    PubMed Central

    Cole, Alexander J.; Dwight, Trisha; Gill, Anthony J.; Dickson, Kristie-Ann; Zhu, Ying; Clarkson, Adele; Gard, Gregory B.; Maidens, Jayne; Valmadre, Susan; Clifton-Bligh, Roderick; Marsh, Deborah J.

    2016-01-01

    The tumour suppressor p53 is mutated in cancer, including over 96% of high-grade serous ovarian cancer (HGSOC). Mutations cause loss of wild-type p53 function due to either gain of abnormal function of mutant p53 (mutp53), or absent to low mutp53. Massively parallel sequencing (MPS) enables increased accuracy of detection of somatic variants in heterogeneous tumours. We used MPS and immunohistochemistry (IHC) to characterise HGSOCs for TP53 mutation and p53 expression. TP53 mutation was identified in 94% (68/72) of HGSOCs, 62% of which were missense. Missense mutations demonstrated high p53 by IHC, as did 35% (9/26) of non-missense mutations. Low p53 was seen by IHC in 62% of HGSOC associated with non-missense mutations. Most wild-type TP53 tumours (75%, 6/8) displayed intermediate p53 levels. The overall sensitivity of detecting a TP53 mutation based on classification as ‘Low’, ‘Intermediate’ or ‘High’ for p53 IHC was 99%, with a specificity of 75%. We suggest p53 IHC can be used as a surrogate marker of TP53 mutation in HGSOC; however, this will result in misclassification of a proportion of TP53 wild-type and mutant tumours. Therapeutic targeting of mutp53 will require knowledge of both TP53 mutations and mutp53 expression. PMID:27189670

  10. Massively parallel E-beam inspection: enabling next-generation patterned defect inspection for wafer and mask manufacturing

    NASA Astrophysics Data System (ADS)

    Malloy, Matt; Thiel, Brad; Bunday, Benjamin D.; Wurm, Stefan; Mukhtar, Maseeh; Quoi, Kathy; Kemen, Thomas; Zeidler, Dirk; Eberle, Anna Lena; Garbowski, Tomasz; Dellemann, Gregor; Peters, Jan Hendrik

    2015-03-01

    SEMATECH aims to identify and enable disruptive technologies to meet the ever-increasing demands of semiconductor high volume manufacturing (HVM). As such, a program was initiated in 2012 focused on high-speed e-beam defect inspection as a complement, and eventual successor, to bright field optical patterned defect inspection [1]. The primary goal is to enable a new technology to overcome the key gaps that are limiting modern day inspection in the fab; primarily, throughput and sensitivity to detect ultra-small critical defects. The program specifically targets revolutionary solutions based on massively parallel e-beam technologies, as opposed to incremental improvements to existing e-beam and optical inspection platforms. Wafer inspection is the primary target, but attention is also being paid to next generation mask inspection. During the first phase of the multi-year program multiple technologies were reviewed, a down-selection was made to the top candidates, and evaluations began on proof of concept systems. A champion technology has been selected and as of late 2014 the program has begun to move into the core technology maturation phase in order to enable eventual commercialization of an HVM system. Performance data from early proof of concept systems will be shown along with roadmaps to achieving HVM performance. SEMATECH's vision for moving from early-stage development to commercialization will be shown, including plans for development with industry leading technology providers.

  11. Non-CAR resists and advanced materials for Massively Parallel E-Beam Direct Write process integration

    NASA Astrophysics Data System (ADS)

    Pourteau, Marie-Line; Servin, Isabelle; Lepinay, Kévin; Essomba, Cyrille; Dal'Zotto, Bernard; Pradelles, Jonathan; Lattard, Ludovic; Brandt, Pieter; Wieland, Marco

    2016-03-01

    The emerging Massively Parallel-Electron Beam Direct Write (MP-EBDW) is an attractive high resolution high throughput lithography technology. As previously shown, Chemically Amplified Resists (CARs) meet process/integration specifications in terms of dose-to-size, resolution, contrast, and energy latitude. However, they are still limited by their line width roughness. To overcome this issue, we tested an alternative advanced non-CAR and showed it brings a substantial gain in sensitivity compared to CAR. We also implemented and assessed in-line post-lithographic treatments for roughness mitigation. For outgassing-reduction purpose, a top-coat layer is added to the total process stack. A new generation top-coat was tested and showed improved printing performances compared to the previous product, especially avoiding dark erosion: SEM cross-section showed a straight pattern profile. A spin-coatable charge dissipation layer based on conductive polyaniline has also been tested for conductivity and lithographic performances, and compatibility experiments revealed that the underlying resist type has to be carefully chosen when using this product. Finally, the Process Of Reference (POR) trilayer stack defined for 5 kV multi-e-beam lithography was successfully etched with well opened and straight patterns, and no lithography-etch bias.

  12. Use of Massive Parallel Computing Libraries in the Context of Global Gravity Field Determination from Satellite Data

    NASA Astrophysics Data System (ADS)

    Brockmann, J. M.; Schuh, W.-D.

    2011-07-01

    The estimation of the Earth's global gravity field, parametrized as a finite spherical harmonic series, is computationally demanding. The computational effort depends on the one hand on the maximal resolution of the spherical harmonic expansion (i.e. the number of parameters to be estimated) and on the other hand on the number of observations (which number several million, e.g. for observations from the GOCE satellite mission). To circumvent these restrictions, massively parallel software based on high-performance computing (HPC) libraries such as ScaLAPACK, PBLAS and BLACS was designed in the context of GOCE HPF WP6000 and the GOCO consortium. A prerequisite for the use of these libraries is that all matrices are block-cyclic distributed on a processor grid composed of a large number of (distributed memory) computers. Using this set of standard HPC libraries has the benefit that once the matrices are distributed across the computer cluster, a huge set of efficient and highly scalable linear algebra operations can be used.
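
    The block-cyclic prerequisite can be made concrete: in the 2-D block-cyclic layout used by ScaLAPACK and PBLAS, the process-grid coordinate owning a global matrix entry follows from a simple modular formula. A small sketch:

```python
def block_cyclic_owner(i, j, mb, nb, p_rows, p_cols):
    """Process-grid coordinates owning global entry (i, j) in a 2-D
    block-cyclic layout (zero-based indices; mb x nb blocking,
    p_rows x p_cols process grid)."""
    return ((i // mb) % p_rows, (j // nb) % p_cols)

# 8x8 matrix, 2x2 blocks, 2x2 process grid: blocks cycle over the grid.
for i in range(8):
    print([block_cyclic_owner(i, j, 2, 2, 2, 2) for j in range(8)])
```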

  13. Feasibility of using the Massively Parallel Processor for large eddy simulations and other Computational Fluid Dynamics applications

    NASA Technical Reports Server (NTRS)

    Bruno, John

    1984-01-01

    The results of an investigation into the feasibility of using the MPP for direct and large eddy simulations of the Navier-Stokes equations are presented. A major part of this study was devoted to the implementation of two of the standard numerical algorithms for CFD. These implementations were not run on the Massively Parallel Processor (MPP) since the machine delivered to NASA Goddard does not have sufficient capacity. Instead, a detailed implementation plan was designed and from this plan were derived estimates of the time and space requirements of the algorithms on a suitably configured MPP. In addition, other issues related to the practical implementation of these algorithms on an MPP-like architecture were considered; namely, adaptive grid generation, zonal boundary conditions, the table lookup problem, and the software interface. Performance estimates show that the architectural components of the MPP, the Staging Memory and the Array Unit, appear to be well suited to the numerical algorithms of CFD. This combined with the prospect of building a faster and larger MPP-like machine holds the promise of achieving sustained gigaflop rates that are required for the numerical simulations in CFD.

  14. A massively parallel track-finding system for the LEVEL 2 trigger in the CLAS detector at CEBAF

    SciTech Connect

    Doughty, D.C. Jr.; Collins, P.; Lemon, S. ); Bonneau, P. )

    1994-02-01

    The track segment finding subsystem of the LEVEL 2 trigger in the CLAS detector has been designed and prototyped. Track segments will be found in the 35,076 wires of the drift chambers using a massively parallel array of 768 Xilinx XC-4005 FPGAs. These FPGAs are located on daughter cards attached to the front-end boards distributed around the detector. Each chip is responsible for finding tracks passing through a 4 × 6 slice of an axial superlayer, and reports two segment-found bits, one for each pair of cells. The algorithm used finds segments even when one or two layers or cells along the track are missing (this number is programmable), while being highly resistant to false segments arising from noise hits. Adjacent chips share data to find tracks crossing cell and board boundaries. For maximum speed, fully combinatorial logic is used inside each chip, with the result that all segments in the detector are found within 150 ns. Segment collection boards gather track segments from each axial superlayer and pass them via a high speed link to the segment linking subsystem in an additional 400 ns for typical events. The Xilinx chips are RAM-based and therefore reprogrammable, allowing for future upgrades and algorithm enhancements.

  15. Determination of the Allelic Frequency in Smith-Lemli-Opitz Syndrome by Analysis of Massively Parallel Sequencing Data Sets

    PubMed Central

    Cross, Joanna L.; Iben, James; Simpson, Claire; Thurm, Audrey; Swedo, Susan; Tierney, Elaine; Bailey-Wilson, Joan; Biesecker, Leslie G.; Porter, Forbes D.; Wassif, Christopher A.

    2014-01-01

    Data from massively parallel sequencing or “Next Generation Sequencing” of the human exome has reached a critical mass in both public and private databases, in that these collections now allow researchers to critically evaluate population genetics in a manner that was not feasible a decade ago. The ability to determine pathogenic allele frequencies by evaluation of the full coding sequences and not merely a single SNP or series of SNPs will lead to more accurate estimations of incidence. For demonstrative purposes we analyzed the causative gene for the disorder Smith-Lemli-Opitz Syndrome (SLOS), the 7-dehydrocholesterol reductase (DHCR7) gene and determined both the carrier frequency for DHCR7 mutations, and predicted an expected incidence of the disorder. Estimations of the incidence of SLOS have ranged widely from 1:10,000 to 1:70,000 while the carrier frequency has been reported as high as 1 in 30. Using four exome data sets with a total of 17,836 chromosomes, we ascertained a carrier frequency of pathogenic DHCR7 mutations of 1.01%, and predict a SLOS disease incidence of 1/39,215 conceptions. This approach highlights yet another valuable aspect of the exome sequencing databases, to inform clinical and health policy decisions related to genetic counseling, prenatal testing and newborn screening. PMID:24813812
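
    As a check (not part of the record above), the reported carrier frequency and predicted incidence are mutually consistent under a standard Hardy-Weinberg calculation, assuming random mating and that the 1.01% figure is a per-individual carrier rate:

```python
# Hardy-Weinberg consistency check of the reported figures:
carrier_freq = 0.0101        # fraction of individuals carrying one pathogenic allele
q = carrier_freq / 2.0       # pathogenic allele frequency (2pq ~ 2q for small q)
incidence = q ** 2           # expected affected conceptions under HWE
print(f"q = {q:.5f}; predicted incidence = 1 in {1 / incidence:.0f}")
# -> 1 in 39216, consistent with the reported 1/39,215 conceptions
```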

  16. Massively parallel sequencing of short tandem repeats-Population data and mixture analysis results for the PowerSeq™ system.

    PubMed

    van der Gaag, Kristiaan J; de Leeuw, Rick H; Hoogenboom, Jerry; Patel, Jaynish; Storts, Douglas R; Laros, Jeroen F J; de Knijff, Peter

    2016-09-01

    Current forensic DNA analysis predominantly involves identification of human donors by analysis of short tandem repeats (STRs) using Capillary Electrophoresis (CE). Recent developments in Massively Parallel Sequencing (MPS) technologies offer new possibilities in analysis of STRs since they might overcome some of the limitations of CE analysis. In this study 17 STRs and Amelogenin were sequenced at high coverage using a prototype version of the Promega PowerSeq™ system for 297 population samples from the Netherlands, Nepal, Bhutan and Central African Pygmies. In addition, 45 two-person mixtures with different minor contributions down to 1% were analysed to investigate the performance of this system for mixed samples. Regarding fragment length, complete concordance between the MPS and CE-based data was found, confirming the reliability of the MPS PowerSeq™ system. As expected, MPS presented a broader allele range and higher power of discrimination and exclusion rate. The high coverage sequencing data were used to determine stutter characteristics for all loci and stutter ratios were compared to CE data. The separation of alleles with the same length but exhibiting different stutter ratios lowers the overall variation in stutter ratio and helps in differentiation of stutters from genuine alleles in mixed samples. All alleles of the minor contributors were detected in the sequence reads even for the 1% contributions, but analysis of mixtures below 5% without prior information of the mixture ratio is complicated by PCR and sequencing artefacts. PMID:27347657

  17. Combined fragment molecular orbital cluster in molecule approach to massively parallel electron correlation calculations for large systems.

    PubMed

    Findlater, Alexander D; Zahariev, Federico; Gordon, Mark S

    2015-04-16

    The local correlation "cluster-in-molecule" (CIM) method is combined with the fragment molecular orbital (FMO) method, providing a flexible, massively parallel, and near-linear scaling approach to the calculation of electron correlation energies for large molecular systems. Although the computational scaling of the CIM algorithm is already formally linear, previous knowledge of the Hartree-Fock (HF) reference wave function and subsequent localized orbitals is required; therefore, extending the CIM method to arbitrarily large systems requires the aid of low-scaling/linear-scaling approaches to HF and orbital localization. Through fragmentation, the combined FMO-CIM method linearizes the scaling, with respect to system size, of the HF reference and orbital localization calculations, achieving near-linear scaling at both the reference and electron correlation levels. For the 20-residue alanine α helix, the preliminary implementation of the FMO-CIM method captures 99.6% of the MP2 correlation energy, requiring 21% of the MP2 wall time. The new method is also applied to solvated adamantine to illustrate the multilevel capability of the FMO-CIM method. PMID:25794346

  18. The minimal amount of starting DNA for Agilent’s hybrid capture-based targeted massively parallel sequencing

    PubMed Central

    Chung, Jongsuk; Son, Dae-Soon; Jeon, Hyo-Jeong; Kim, Kyoung-Mee; Park, Gahee; Ryu, Gyu Ha; Park, Woong-Yang; Park, Donghyun

    2016-01-01

    Targeted capture massively parallel sequencing is increasingly being used in clinical settings, and as costs continue to decline, use of this technology may become routine in health care. However, a limited amount of tissue has often been a challenge in meeting quality requirements. To offer a practical guideline for the minimum amount of input DNA for targeted sequencing, we optimized and evaluated the performance of targeted sequencing depending on the input DNA amount. First, using various amounts of input DNA, we compared commercially available library construction kits and selected Agilent’s SureSelect-XT and KAPA Biosystems’ Hyper Prep kits as the kits most compatible with targeted deep sequencing using Agilent’s SureSelect custom capture. Then, we optimized the adapter ligation conditions of the Hyper Prep kit to improve library construction efficiency and adapted multiplexed hybrid selection to reduce the cost of sequencing. In this study, we systematically evaluated the performance of the optimized protocol depending on the amount of input DNA, ranging from 6.25 to 200 ng, suggesting the minimal input DNA amounts based on coverage depths required for specific applications. PMID:27220682

  19. Determination of the allelic frequency in Smith-Lemli-Opitz syndrome by analysis of massively parallel sequencing data sets.

    PubMed

    Cross, J L; Iben, J; Simpson, C L; Thurm, A; Swedo, S; Tierney, E; Bailey-Wilson, J E; Biesecker, L G; Porter, F D; Wassif, C A

    2015-06-01

    Data from massively parallel sequencing or 'Next Generation Sequencing' of the human exome have reached a critical mass in both public and private databases, in that these collections now allow researchers to critically evaluate population genetics in a manner that was not feasible a decade ago. The ability to determine pathogenic allele frequencies by evaluation of full coding sequences, and not merely a single nucleotide polymorphism (SNP) or series of SNPs, will lead to more accurate estimations of incidence. For demonstrative purposes, we analyzed the causative gene for Smith-Lemli-Opitz syndrome (SLOS), the 7-dehydrocholesterol reductase (DHCR7) gene, and determined both the carrier frequency for DHCR7 mutations and the expected incidence of the disorder. Estimates of the incidence of SLOS have ranged widely from 1:10,000 to 1:70,000, while the carrier frequency has been reported as high as 1 in 30. Using four exome data sets with a total of 17,836 chromosomes, we ascertained a carrier frequency of pathogenic DHCR7 mutations of 1.01%, and predict a SLOS disease incidence of 1/39,215 conceptions. This approach highlights yet another valuable aspect of exome sequencing databases: informing clinical and health policy decisions related to genetic counseling, prenatal testing and newborn screening. PMID:24813812
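    The reported figures are consistent with standard Hardy-Weinberg arithmetic, sketched below (our reconstruction, not the authors' printed derivation). With pathogenic allele frequency p and q = 1 - p ≈ 1:

        \begin{align*}
          \text{carrier frequency} &\approx 2pq \approx 2p = 0.0101
            \quad\Rightarrow\quad p \approx 0.00505,\\
          \text{incidence} &= p^2 \approx (0.00505)^2 \approx 2.55\times 10^{-5}
            \approx 1/39{,}215.
        \end{align*}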

  20. Masking fields: a massively parallel neural architecture for learning, recognizing, and predicting multiple groupings of patterned data

    SciTech Connect

    Cohen, M.A.; Grossberg, S.

    1987-05-15

    A massively parallel neural-network architecture, called a masking field, is characterized through systematic computer simulations. A masking field can simultaneously detect multiple groupings within its input patterns and assign activation weights to the codes for these groupings that are predictive with respect to the contextual information embedded within the patterns and the prior learning of the system. A masking field automatically rescales its sensitivity as the overall size of an input pattern changes, yet also remains sensitive to the microstructure within each pattern. Thus, a masking field suggests a solution of the credit assignment problem by embodying a real-time code for the predictive evidence contained within its input patterns. Such capabilities are useful in speech recognition, visual object recognition, and cognitive information processing. An absolutely stable design for a masking field is disclosed through an analysis of the computer simulations. This design suggests how associative mechanisms, cooperative-competitive interactions, and modulatory gating signals can be joined together to regulate the learning of compressed recognition codes. Data about the neural substrates of learning and memory are compared to these mechanisms.

  1. Massively parallel rRNA gene sequencing exacerbates the potential for biased community diversity comparisons due to variable library sizes

    SciTech Connect

    Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren

    2011-01-01

    Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
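    A minimal sketch of the library-size effect discussed above: computing Chao1 and Shannon on a large library and on a smaller one drawn from the same hypothetical community shows how the estimates shift with read count alone (toy data; the study's analyses used real pyrosequencing libraries).

        import math
        import random
        from collections import Counter

        random.seed(0)

        # Hypothetical community: 200 species with linearly decaying abundances.
        community = [f"OTU{i}" for i in range(200) for _ in range(200 - i)]

        def chao1(counts):
            s_obs = len(counts)
            f1 = sum(1 for c in counts.values() if c == 1)  # singletons
            f2 = sum(1 for c in counts.values() if c == 2)  # doubletons
            return s_obs + f1 * f1 / (2 * f2) if f2 else s_obs

        def shannon(counts):
            n = sum(counts.values())
            return -sum((c / n) * math.log(c / n) for c in counts.values())

        for size in (10000, 1000):  # two unequal library sizes
            lib = Counter(random.sample(community, size))
            print(size, "reads:", len(lib), "observed,",
                  round(chao1(lib), 1), "Chao1,", round(shannon(lib), 3), "Shannon")

    Equalizing the number of reads (rarefying to the smallest library) before computing such indices is the unbiased comparison the authors call for.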

  2. Adsorption of CO molecule on AlN nanotubes by parallel electric field.

    PubMed

    Peyghan, Ali Ahmadi; Baei, Mohammad T; Hashemian, Saeedeh; Torabi, Parviz

    2013-02-01

    The behavior of carbon monoxide (CO) adsorbed on the external surface of an H-capped (6,0) zigzag single-walled aluminum nitride nanotube (AlNNT) was studied using parallel and transverse electric fields (strengths 0-140 × 10⁻⁴ a.u.) and density functional calculations. The calculated adsorption energies of the CO/AlNNT complex increased with increasing parallel electric field intensity, whereas the adsorption energy values under the applied transverse electric field show a significant reverse trend. The calculated adsorption energies of the complex at the applied parallel electric field strengths increased gradually from -0.42 eV at zero field strength to -0.80 eV at a field strength of 140 × 10⁻⁴ a.u. The considerable changes in the adsorption energies and energy gap values generated by the applied parallel electric field strengths show the high sensitivity of the electronic properties of AlNNT towards the adsorption of CO on its surface. Analysis of structural parameters indicates that the nanotube is resistant to external electric field strengths. The dipole moment variations in the complex show a significant change in the presence of parallel and transverse electric fields, which results in much stronger interactions at higher electric field strengths. Additionally, the natural bond orbital charges, quantum molecular descriptors, and molecular orbital energies of the complex show that the nanotube can adsorb the CO molecule in its pristine form at a high applied parallel electric field, and that the nanotube can be used as a CO storage medium. PMID:23073700
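    The adsorption energies quoted above follow the standard supermolecular definition (our notation, not reproduced from the paper; the sign convention matches the abstract, with more negative values indicating stronger binding):

        \begin{equation*}
          E_{\mathrm{ads}} = E_{\mathrm{CO/AlNNT}} - E_{\mathrm{AlNNT}} - E_{\mathrm{CO}}
        \end{equation*}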

  3. Synaptic dynamics on different time scales in a parallel fiber feedback pathway of the weakly electric fish.

    PubMed

    Lewis, John E; Maler, Leonard

    2004-02-01

    Synaptic dynamics comprise a variety of interacting processes acting on a wide range of time scales. This enables a synapse to perform a large array of computations, from temporal and spatial filtering to associative learning. In this study, we describe how changing synaptic gain via long-term plasticity can act to shape the temporal filtering of a synapse through modulation of short-term plasticity. In the weakly electric fish, parallel fibers from cerebellar granule cells provide massive feedback inputs to the pyramidal neurons of the electrosensory lateral line lobe. We demonstrate a long-term synaptic enhancement (LTE) of these synapses that is biochemically similar to the presynaptic long-term potentiation expressed by parallel fibers in the mammalian cerebellum. Using a novel stimulation protocol and a simple modeling paradigm, we then quantify the changes in short-term plasticity during the induction of LTE and show that these changes can be explained by gradual changes in only one model parameter, that which is associated with the baseline probability of transmitter release. These changes lead to a shift in the spike frequency preference of the synapse, suggesting that long-term plasticity is not only involved in controlling the gain of the parallel fiber synapse, but also provides a means of controlling synaptic filtering over multiple time scales. PMID:14602840
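    The kind of modeling paradigm described above can be illustrated with a generic Tsodyks-Markram-style sketch, in which a single parameter U (the baseline release probability) reshapes the synapse's response to a spike train. This is an illustrative stand-in, not the authors' fitted model; all parameter values are hypothetical.

        import math

        def psp_amplitudes(isi, n_spikes, U, tau_rec=0.5, tau_facil=0.1):
            """Relative PSP amplitudes for a regular spike train (ISI in s)."""
            x, u = 1.0, U  # x: available resources, u: utilization
            amps = []
            for _ in range(n_spikes):
                amps.append(u * x)  # release on this spike
                # resource depletion, then recovery over the interval
                x = 1.0 - (1.0 - x * (1.0 - u)) * math.exp(-isi / tau_rec)
                # facilitation decay, then increment at the next spike
                ud = u * math.exp(-isi / tau_facil)
                u = ud + U * (1.0 - ud)
            return [a / amps[0] for a in amps]

        # Low vs high baseline release probability, 20 Hz stimulation:
        for U in (0.1, 0.5):
            print(U, [round(a, 2) for a in psp_amplitudes(0.05, 5, U)])

    With low U the train facilitates; with high U it depresses, so a slow change in U alone shifts the synapse's frequency preference, as the study concludes.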

  4. Re-assessing how much parallel and perpendicular electric fields accelerate electrons during magnetic reconnection

    NASA Astrophysics Data System (ADS)

    Bessho, Naoki; Chen, Li-Jen; Germaschewski, Kai; Bhattacharjee, Amitava

    2014-10-01

    By means of 2-D PIC simulations applicable to reconnection in the Earth's magnetotail, we show that the parallel electric field accelerates electrons only up to 40 keV, and further acceleration above that energy in fact comes from the perpendicular electric field, which can explain observations of energetic electrons with energies greater than 100 keV. We show that the parallel potential, which is the integral of the parallel electric field along the field line, is proportional to (ωpe/Ωe)^-2, and also to (nb/n0)^-1/2, where ωpe/Ωe is the ratio of the plasma frequency to the electron cyclotron frequency, and nb/n0 is the ratio of the lobe density to the density of the current sheet. Applying the parameters in the Earth's magnetotail to the above relations, we demonstrate that the parallel potential is not more than 40 keV. In addition to pitch angle scattering from the parallel to the perpendicular velocity for electron beams along the magnetic field, which was suggested in previous studies, energetic electrons accelerated by the perpendicular electric field experience pitch angle scattering from the perpendicular to the parallel velocity, which can isotropize plasma in the exhaust.

  5. Enabling inspection solutions for future mask technologies through the development of massively parallel E-Beam inspection

    NASA Astrophysics Data System (ADS)

    Malloy, Matt; Thiel, Brad; Bunday, Benjamin D.; Wurm, Stefan; Jindal, Vibhu; Mukhtar, Maseeh; Quoi, Kathy; Kemen, Thomas; Zeidler, Dirk; Eberle, Anna Lena; Garbowski, Tomasz; Dellemann, Gregor; Peters, Jan Hendrik

    2015-09-01

    The new device architectures and materials being introduced for sub-10nm manufacturing, combined with the complexity of multiple patterning and the need for improved hotspot detection strategies, have pushed current wafer inspection technologies to their limits. In parallel, gaps in mask inspection capability are growing as new generations of mask technologies are developed to support these sub-10nm wafer manufacturing requirements. In particular, the challenges associated with nanoimprint and extreme ultraviolet (EUV) mask inspection require new strategies that enable fast inspection at high sensitivity. The tradeoffs between sensitivity and throughput for optical and e-beam inspection are well understood. Optical inspection offers the highest throughput and is the current workhorse of the industry for both wafer and mask inspection. E-beam inspection offers the highest sensitivity but has historically lacked the throughput required for widespread adoption in the manufacturing environment. It is unlikely that continued incremental improvements to either technology will meet tomorrow's requirements, and therefore a new inspection technology approach is required: one that combines the high-throughput performance of optical inspection with the high-sensitivity capabilities of e-beam inspection. To support the industry in meeting these challenges, SUNY Poly SEMATECH has evaluated disruptive technologies that can meet the requirements for high volume manufacturing (HVM), for both the wafer fab [1] and the mask shop. High-speed massively parallel e-beam defect inspection has been identified as the leading candidate for addressing the key gaps limiting today's patterned defect inspection techniques. As of late 2014 SUNY Poly SEMATECH completed a review, system analysis, and proof of concept evaluation of multiple e-beam technologies for defect inspection. A champion approach has been identified based on a multibeam technology from Carl Zeiss. This paper includes a discussion on the

  6. My-Forensic-Loci-queries (MyFLq) framework for analysis of forensic STR data generated by massive parallel sequencing.

    PubMed

    Van Neste, Christophe; Vandewoestyne, Mado; Van Criekinge, Wim; Deforce, Dieter; Van Nieuwerburgh, Filip

    2014-03-01

    Forensic scientists are currently investigating how to transition from capillary electrophoresis (CE) to massive parallel sequencing (MPS) for analysis of forensic DNA profiles. MPS offers several advantages over CE, such as virtually unlimited multiplexing of loci, combination of both short tandem repeat (STR) and single nucleotide polymorphism (SNP) loci, small amplicons without the constraints of size separation, more discrimination power, deep mixture resolution and sample multiplexing. We present our bioinformatic framework My-Forensic-Loci-queries (MyFLq) for analysis of MPS forensic data. For allele calling, the framework uses a MySQL reference allele database with automatically determined regions of interest (ROIs), found by a generic maximal flanking algorithm, which makes it possible to use any STR or SNP forensic locus. Python scripts were designed to automatically make allele calls starting from raw MPS data. We also present a method to assess the usefulness and overall performance of a forensic locus with respect to MPS, as well as methods to estimate whether an unknown allele, whose sequence is not present in the MySQL database, is in fact a new allele or a sequencing error. The MyFLq framework was applied to an Illumina MiSeq dataset of a forensic Illumina amplicon library, generated by multilocus STR polymerase chain reaction (PCR) on both single-contributor samples and multiple-person DNA mixtures. Although the multilocus PCR was not yet optimized for MPS in terms of amplicon length or locus selection, the results were excellent for most loci, showing a high signal-to-noise ratio, correct allele calls, and a low limit of detection for minor DNA contributors in mixed DNA samples. Technically, forensic MPS affords great promise for routine implementation in forensic genomics. The method is also applicable to adjacent disciplines such as molecular autopsy in legal medicine and mitochondrial DNA research. PMID:24528572
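    A toy version of sequence-based allele calling in the spirit of the framework (the threshold, reads, and function names are hypothetical; the real MyFLq pipeline calls alleles against its MySQL ROI database):

        from collections import Counter

        def call_alleles(roi_reads, min_fraction=0.05):
            """Split ROI sequences into likely alleles and low-abundance noise."""
            counts = Counter(roi_reads)
            total = sum(counts.values())
            alleles, noise = {}, {}
            for seq, n in counts.items():
                (alleles if n / total >= min_fraction else noise)[seq] = n
            return alleles, noise

        # Hypothetical reads at one locus: a heterozygous donor plus errors.
        reads = ["AATG" * 8] * 480 + ["AATG" * 6] * 450 + ["AATG" * 7 + "AATC"] * 12
        alleles, noise = call_alleles(reads)
        for seq, n in sorted(alleles.items(), key=lambda kv: -kv[1]):
            print(f"allele ({len(seq)} bp): {n} reads")
        print(f"{len(noise)} sequence(s) flagged as possible sequencing errors")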

  7. Comprehensive Assessment of Potential Multiple Myeloma Immunoglobulin Heavy Chain V-D-J Intraclonal Variation Using Massively Parallel Pyrosequencing

    PubMed Central

    Tschumper, Renee C.; Asmann, Yan W.; Hossain, Asif; Huddleston, Paul M.; Wu, Xiaosheng; Dispenzieri, Angela; Eckloff, Bruce W.; Jelinek, Diane F.

    2012-01-01

    Multiple myeloma (MM) is characterized by the accumulation of malignant plasma cells (PCs) in the bone marrow (BM). MM is viewed as a clonal disorder due to lack of verified intraclonal sequence diversity in the immunoglobulin heavy chain variable region gene (IGHV). However, this conclusion is based on analysis of a very limited number of IGHV subclones and the methodology employed did not permit simultaneous analysis of the IGHV repertoire of non-malignant PCs in the same samples. Here we generated genomic DNA and cDNA libraries from purified MM BMPCs and performed massively parallel pyrosequencing to determine the frequency of cells expressing identical IGHV sequences. This method provided an unprecedented opportunity to interrogate the presence of clonally related MM cells and evaluate the IGHV repertoire of non-MM PCs. Within the MM sample, 37 IGHV genes were expressed, with 98.9% of all immunoglobulin sequences using the same IGHV gene as the MM clone and 83.0% exhibiting exact nucleotide sequence identity in the IGHV and heavy chain complementarity determining region 3 (HCDR3). Of interest, we observed in both genomic DNA and cDNA libraries 48 sets of identical sequences with single point mutations in the MM clonal IGHV or HCDR3 regions. These nucleotide changes were suggestive of putative subclones and therefore were subjected to detailed analysis to interpret: 1) their legitimacy as true subclones; and 2) their significance in the context of MM. Finally, we report for the first time the IGHV repertoire of normal human BMPCs and our data demonstrate the extent of IGHV repertoire diversity as well as the frequency of clonally-related normal BMPCs. This study demonstrates the power and potential weaknesses of in-depth sequencing as a tool to thoroughly investigate the phylogeny of malignant PCs in MM and the IGHV repertoire of normal BMPCs. PMID:22522905

  8. Investigating the effect of two methane-mitigating diets on the rumen microbiome using massively parallel sequencing.

    PubMed

    Ross, E M; Moate, P J; Marett, L; Cocks, B G; Hayes, B J

    2013-09-01

    Variation in the composition of microorganisms in the rumen (the rumen microbiome) of dairy cattle (Bos taurus) is of great interest because of possible links to methane emission levels. Feed additives are one method being investigated to reduce enteric methane production by dairy cattle. Here we report the effect of 2 methane-mitigating feed additives (grapemarc and a combination of lipids and tannin) on rumen microbiome profiles of Holstein dairy cattle. We used untargeted (shotgun) massively parallel sequencing of microbes present in rumen fluid to generate quantitative rumen microbiome profiles. We observed large effects of the feed additives on the rumen microbiome profiles using multiple approaches, including linear mixed modeling, hierarchical clustering, and metagenomic predictions. The effect on the fecal microbiome profiles was not detectable using hierarchical clustering, but was significant in the linear mixed model and when metagenomic predictions were used, suggesting a more subtle effect of the diets on the lower gastrointestinal microbiome. A differential representation analysis (analogous to differential expression in RNA sequencing) showed significant overlap in the contigs (which are genome fragments representing different microorganism species) that were differentially represented between experiments. These similarities suggest that, despite the different additives used, the 2 diets assessed in this investigation altered the microbiomes of the samples in similar ways. Contigs that were differentially represented in both experiments were tested for associations with methane production in an independent set of animals. These animals were not treated with a methane-mitigating diet, but did show substantial natural variation in methane emission levels. The contigs that were significantly differentially represented in response to both dietary additives showed a significant enrichment for associations with methane production. This suggests that these

  9. A whole-genome massively parallel sequencing analysis of BRCA1 mutant oestrogen receptor negative and positive breast cancers

    PubMed Central

    Weigelt, Britta; Wilkerson, Paul M; Manie, Elodie; Grigoriadis, Anita; A’Hern, Roger; van der Groep, Petra; Kozarewa, Iwanka; Popova, Tatiana; Mariani, Odette; Turajlic, Samra; Furney, Simon J; Marais, Richard; Rodrigues, Daniel Nava; Flora, Adriana C; Wai, Patty; Pawar, Vidya; McDade, Simon; Carroll, Jason; Stoppa-Lyonnet, Dominique; Green, Andrew R; Ellis, Ian O; Swanton, Charles; van Diest, Paul; Delattre, Olivier; Lord, Christopher J; Foulkes, William D; Vincent-Salomon, Anne; Ashworth, Alan; Stern, Marc Henri; Reis-Filho, Jorge S

    2016-01-01

    BRCA1 encodes a tumour suppressor protein that plays pivotal roles in homologous recombination (HR) DNA repair, cell-cycle checkpoints, and transcriptional regulation. BRCA1 germline mutations confer a high risk of early-onset breast and ovarian cancer. In >80% of cases, tumours arising in BRCA1 germline mutation carriers are oestrogen receptor (ER)-negative, however up to 15% are ER-positive. It has been suggested that BRCA1 ER-positive breast cancers constitute sporadic cancers arising in the context of a BRCA1 germline mutation rather than being causally related to BRCA1 loss-of-function. Whole-genome massively parallel sequencing of ER-positive and ER-negative BRCA1 breast cancers, and their respective germline DNAs, was used to characterise the genetic landscape of BRCA1 cancers at base-pair resolution. Only BRCA1 germline mutations and somatic loss of the wild-type allele, and TP53 somatic mutations were recurrently found in the index cases. BRCA1 breast cancers displayed a mutational signature consistent with that caused by lack of HR DNA repair in both ER-positive and ER-negative cases. Sequencing analysis of independent cohorts of hereditary BRCA1 and sporadic non-BRCA1 breast cancers for the presence of recurrent pathogenic mutations and/or homozygous deletions found in the index cases revealed that DAPK3, TMEM135, KIAA1797, PDE4D and GATA4 are potential additional drivers of breast cancers. This study demonstrates that BRCA1 pathogenic germline mutations coupled with somatic loss of the wild-type allele are not sufficient for hereditary breast cancers to display an ER-negative phenotype, and has led to the identification of three potential novel breast cancer genes (i.e. DAPK3, TMEM135 and GATA4). PMID:22362584

  10. Massively-parallel sequencing of genes on a single chromosome: a comparison of solution hybrid selection and flow sorting

    PubMed Central

    2013-01-01

    Background Targeted capture, combined with massively-parallel sequencing, is a powerful technique that allows investigation of specific portions of the genome for less cost than whole genome sequencing. Several methods have been developed, and improvements have resulted in commercial products targeting the human or mouse exonic regions (the exome). In some cases it is desirable to custom-target other regions of the genome, either to reduce the amount of sequence that is targeted or to capture regions that are not targeted by commercial kits. It is important to understand the advantages, limitations, and complexity of a given capture method before embarking on a targeted sequencing experiment. Results We compared two custom targeted capture methods suitable for single chromosome analysis: Solution Hybrid Selection (SHS) and Flow Sorting (FS) of single chromosomes. Both methods can capture targeted material and result in high percentages of genotype identifications across these regions: 59-92% for SHS and 70-79% for FS. FS is amenable to current structural variation detection methods, and variants were detected. Structural variation was also assessed for SHS samples with paired end sequencing, resulting in variant identification. Conclusions While both methods can effectively target genomic regions for genotype determination, several considerations make each method appropriate in different circumstances. SHS is well suited for experiments targeting smaller regions in a larger number of samples. FS is well suited when regions of interest cover large regions of a single chromosome. Although whole genome sequencing is becoming less expensive, the sequencing, data storage, and analysis costs make targeted sequencing using SHS or FS a compelling option. PMID:23586822

  11. Performance Optimization of Two-Stage Exoreversible Thermoelectric Converter in Electrically Series and Parallel Configuration

    NASA Astrophysics Data System (ADS)

    Hans, Ranjana; Manikandan, S.; Kaushik, S. C.

    2015-10-01

    A two-stage exoreversible semiconductor thermoelectric converter (TEC) with internal heat transfer is proposed in two different configurations, i.e., electrically series and electrically parallel. The TEC performance, assuming Newton's heat transfer law, is evaluated through a combination of finite-time thermodynamics (FTT) and nonequilibrium thermodynamics. A formulation based on power output versus working electrical current and efficiency versus working electrical current is applied in this study. For a fixed total number of thermoelectric elements, the current-voltage (I-V) characteristics of the series and parallel configurations have been obtained for different combinations of thermoelectric elements in the top and bottom stages. The number of thermoelectric elements in the top stage has been optimized to maximize the power output of the TEC in the electrically series and parallel modes. Thermodynamic models for a multistage TEC system considering internal irreversibilities have been developed in a MATLAB Simulink environment. The effect of load resistance on V_opt, I_opt, V_oc, and I_sc for different combinations of thermoelectric elements in the top and bottom stages has been analyzed. The I-V characteristics obtained for the two-stage series and parallel TEC configurations suggest a range of load resistance values to be chosen, in turn indicating the suitability of the parallel rather than the series configuration when designing real multistage TECs. This analysis will be helpful in designing actual multistage TECs.
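    The role of load resistance can be seen in a bare-bones Thevenin-equivalent sketch of series versus parallel stacking. This is a deliberately simplified circuit picture, not the paper's finite-time thermodynamic model, and all element values are hypothetical.

        def load_power(n_elements, v_oc, r_int, r_load, series=True):
            """Power delivered to r_load by n identical elements."""
            if series:
                v, r = n_elements * v_oc, n_elements * r_int
            else:  # parallel
                v, r = v_oc, r_int / n_elements
            i = v / (r + r_load)
            return i * i * r_load

        N, V1, R1 = 127, 0.05, 0.02   # hypothetical element count and parameters
        for r_load in (0.001, 0.1, 2.54):
            ps = load_power(N, V1, R1, r_load, series=True)
            pp = load_power(N, V1, R1, r_load, series=False)
            print(f"R_load={r_load:5.3f} ohm: series {ps:.3f} W, parallel {pp:.3f} W")

    Maximum power transfer occurs when the load matches the stack's internal resistance, so low-resistance loads favor the parallel configuration and high-resistance loads the series one; the paper's FTT analysis refines this picture with thermal irreversibilities.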

  12. Massively parallel approach to time-domain forward and inverse modelling of EM induction problem in spherical Earth

    NASA Astrophysics Data System (ADS)

    Velimsky, J.

    2011-12-01

    Inversion of observatory and low-orbit satellite geomagnetic data in terms of the three-dimensional distribution of electrical conductivity in the Earth's mantle can provide an independent constraint on the physical, chemical, and mineralogical composition of the mantle. This problem has recently been approached by different numerical methods. There are several key challenges from the numerical and algorithmic point of view, in particular the accuracy and speed of the forward solver, the effective evaluation of the sensitivities of the data to changes of model parameters, and the dependence of the results on a priori knowledge of the spatio-temporal structure of the primary ionospheric and magnetospheric electric currents. Here I present recent advancements of the time-domain, spherical harmonic-finite element approach. The forward solver has been adapted to distributed-memory parallel architecture using band-matrix routines from the ScaLapack library. The evaluation of the gradient of the data misfit in the model space using the adjoint approach has also been parallelized. Finally, the inverse problem has been reformulated in a way which allows for simultaneous reconstruction of the conductivity model and the external field model directly from the data.

  13. Displacement current and the generation of parallel electric fields.

    PubMed

    Song, Yan; Lysak, Robert L

    2006-04-14

    We show for the first time the dynamical relationship between the generation of magnetic field-aligned electric field (E||) and the temporal changes and spatial gradients of magnetic and velocity shears, and the plasma density in Earth's magnetosphere. We predict that the signatures of reconnection and auroral particle acceleration should have a correlation with low plasma density, and a localized voltage drop (V||) should often be associated with a localized magnetic stress concentration. Previous interpretations of the E|| generation are mostly based on the generalized Ohm's law, causing serious confusion in understanding the nature of reconnection and auroral acceleration. PMID:16712084

  14. Parallel inversion of a massive ERT data set to characterize deep vadose zone contamination beneath former nuclear waste infiltration galleries at the Hanford Site B-Complex (Invited)

    NASA Astrophysics Data System (ADS)

    Johnson, T.; Rucker, D. F.; Wellman, D.

    2013-12-01

    The Hanford Site, located in south-central Washington, USA, originated in the early 1940s as part of the Manhattan Project and produced plutonium used to build the United States nuclear weapons stockpile. In accordance with accepted industrial practice of that time, a substantial portion of the relatively low-activity liquid radioactive waste was disposed of by direct discharge to either surface soil or near-surface infiltration galleries such as cribs and trenches. This practice was supported by early investigations beginning in the 1940s, including studies by U.S. Geological Survey (USGS) experts, whose investigations found vadose zone soils at the site suitable for retaining radionuclides to the extent necessary to protect workers and members of the general public based on the standards of that time. That general disposal practice has long since been discontinued, and the US Department of Energy (USDOE) is now investigating residual contamination at former infiltration galleries as part of its overall environmental management and remediation program. Most of the liquid wastes released into the subsurface were highly ionic and electrically conductive, and therefore present an excellent target for imaging by Electrical Resistivity Tomography (ERT) within the low-conductivity sands and gravels comprising Hanford's vadose zone. In 2006, USDOE commissioned a large scale surface ERT survey to characterize vadose zone contamination beneath the Hanford Site B-Complex, which contained 8 infiltration trenches, 12 cribs, and one tile field. The ERT data were collected in a pole-pole configuration with 18 north-south trending lines and 18 east-west trending lines ranging from 417 m to 816 m in length. The final data set consisted of 208,411 measurements collected on 4859 electrodes, covering an area of 600 m x 600 m. Given the computational demands of inverting this massive data set as a whole, the data were initially inverted in parts with a shared memory inversion code, which

  15. Self-consistent formation of parallel electric fields in the auroral zone

    NASA Technical Reports Server (NTRS)

    Schriver, David; Ashour-Abdalla, Maha

    1993-01-01

    This paper presents results from a fully self-consistent kinetic particle simulation of the time-dependent formation of large scale parallel electric fields in the auroral zone. The results show that magnetic mirroring of the hot plasma that streams earthward from the magnetotail leads to a charge separation potential drop of many kilovolts, over an altitude range of a few thousand kilometers. Once the potential drop is formed, it remains relatively static and is maintained in time by the constant input of hot plasma from the tail; the parallel electric field accelerates ions away from Earth and ionospheric electrons towards the Earth. At altitudes above where the ions are mirror reflected and accelerated by the parallel electric field, low frequency waves are generated, possibly due to an ion/ion two-stream interaction.

  16. Fully Parallel Electrical Impedance Tomography Using Code Division Multiplexing.

    PubMed

    Tsoeu, M S; Inggs, M R

    2016-06-01

    Electrical Impedance Tomography (EIT) has been dominated by the use of Time Division Multiplexing (TDM) and Frequency Division Multiplexing (FDM) as methods of achieving orthogonal injection of excitation signals. Code Division Multiplexing (CDM), presented in this paper, is an alternative that eliminates the temporal data inconsistencies of TDM for fast-changing systems. Furthermore, this approach eliminates the data inconsistencies that arise in FDM when the frequency bands of current-injecting electrodes are chosen over frequencies at which the imaged object's impedance changes strongly. To the authors' knowledge, no fully functional wideband system or simulation platform using simultaneous injection of Gold-code currents has been reported. In this paper, we formulate, simulate and develop a fully functional pseudo-random (Gold) code driven EIT system with 15 excitation currents and 16 separate voltage measurement electrodes. In this work we verify the use of CDM as a multiplexing modality in simultaneous-injection EIT, using a prototype system with an overall bandwidth of 15 kHz and an attainable speed of 462 frames/s using codes with a period of 31 chips. Simulations and experiments are performed using the Electrical Impedance and Diffuse Optics Reconstruction Software (EIDORS). We also propose the use of image processing on reconstructed images to establish their quality quantitatively without access to raw reconstruction data. The results of this study show that CDM can be successfully used in EIT, and gives results of similar visual quality to TDM and FDM. The achieved performance shows an average position error of 3.5% and size error of 6.2%. PMID:26731774
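    A sketch of generating a length-31 Gold family, the code length used above. It builds a preferred pair of 5-stage m-sequences with a commonly used tap pair and checks the bounded cross-correlation that makes simultaneous injection separable; this is an illustration, not the authors' hardware implementation.

        def lfsr(taps, n=5, length=31):
            """Binary m-sequence from a Fibonacci LFSR with the given taps."""
            state = [1] * n
            out = []
            for _ in range(length):
                out.append(state[-1])
                fb = 0
                for t in taps:
                    fb ^= state[t - 1]
                state = [fb] + state[:-1]
            return out

        def cross_corr(a, b):
            """Max |periodic cross-correlation| of two binary sequences (+/-1 chips)."""
            n = len(a)
            x = [1 - 2 * v for v in a]
            y = [1 - 2 * v for v in b]
            return max(abs(sum(x[i] * y[(i + s) % n] for i in range(n)))
                       for s in range(n))

        m1 = lfsr([5, 2])          # x^5 + x^2 + 1
        m2 = lfsr([5, 4, 3, 2])    # x^5 + x^4 + x^3 + x^2 + 1 (preferred pair)
        gold = [[a ^ m2[(i + s) % 31] for i, a in enumerate(m1)] for s in range(31)]
        print("family size:", len(gold) + 2)            # 31 shifts + both m-sequences
        print("max |cross-correlation|:", max(
            cross_corr(gold[0], g) for g in gold[1:]))  # three-valued, at most 9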

  17. Electrical geophysical investigations of massive sulfide deposits and their host rocks, West Shasta copper-zinc district.

    USGS Publications Warehouse

    Horton, R.J.; Smith, B.D.; Washburne, J.C.

    1985-01-01

    Galvanic and induction electrical geophysical methods are described, and applied to characterize the electrical properties of selected West Shasta massive sulphide deposits and their host rocks at scales less than and greater than 25 ft. The measurements are analysed for their use in differentiating the various rocks and for their correlation with petrographic and geomorphological/climate attributes, and they are compared with other massive sulphide districts of contrasting geological ages. The integrated use of different methods is recommended for effective exploration of the complex West Shasta geology.-G.J.N.

  18. Massively Parallel Assimilation of TOGA/TAO and Topex/Poseidon Measurements into a Quasi Isopycnal Ocean General Circulation Model Using an Ensemble Kalman Filter

    NASA Technical Reports Server (NTRS)

    Keppenne, Christian L.; Rienecker, Michele; Borovikov, Anna Y.; Suarez, Max

    1999-01-01

    A massively parallel ensemble Kalman filter (EnKF) is used to assimilate temperature data from the TOGA/TAO array and altimetry from TOPEX/POSEIDON into a Pacific basin version of the NASA Seasonal to Interannual Prediction Project (NSIPP)'s quasi-isopycnal ocean general circulation model. The EnKF is an approximate Kalman filter in which the error-covariance propagation step is modeled by the integration of multiple instances of a numerical model. An estimate of the true error covariances is then inferred from the distribution of the ensemble of model state vectors. This implementation of the filter takes advantage of the inherent parallelism in the EnKF algorithm by running all the model instances concurrently. The Kalman filter update step also occurs in parallel by having each processor process the observations that occur in the region of physical space for which it is responsible. The massively parallel data assimilation system is validated by withholding some of the data and then quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The distributions of the forecast and analysis error covariances predicted by the EnKF are also examined.
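    A minimal, serial sketch of the EnKF analysis step (perturbed-observations form) with ensemble-estimated covariances. This is the generic update, not the NSIPP assimilation code; the dimensions and data are toy-sized.

        import numpy as np

        rng = np.random.default_rng(1)

        n_state, n_obs, n_ens = 50, 5, 20
        H = np.zeros((n_obs, n_state))                # observe 5 state points
        H[np.arange(n_obs), np.arange(0, n_state, n_state // n_obs)] = 1.0
        R = 0.1 * np.eye(n_obs)                       # observation-error covariance

        Xf = rng.normal(size=(n_state, n_ens))        # forecast ensemble (toy)
        y = rng.normal(size=n_obs)                    # observations (toy)

        # Ensemble-estimated covariances: P H^T and H P H^T
        A = Xf - Xf.mean(axis=1, keepdims=True)
        PHt = A @ (H @ A).T / (n_ens - 1)
        HPHt = (H @ A) @ (H @ A).T / (n_ens - 1)

        K = PHt @ np.linalg.inv(HPHt + R)             # Kalman gain
        # Each member is updated with its own perturbed observation vector
        Y = y[:, None] + rng.multivariate_normal(np.zeros(n_obs), R, n_ens).T
        Xa = Xf + K @ (Y - H @ Xf)
        print("analysis ensemble shape:", Xa.shape)

    In the parallel system described above, the members run concurrently and each processor applies this update only for the observations in its region of physical space.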

  19. Scalable evaluation of polarization energy and associated forces in polarizable molecular dynamics: II. Toward massively parallel computations using smooth particle mesh Ewald.

    PubMed

    Lagardère, Louis; Lipparini, Filippo; Polack, Étienne; Stamm, Benjamin; Cancès, Éric; Schnieders, Michael; Ren, Pengyu; Maday, Yvon; Piquemal, Jean-Philip

    2015-06-01

    In this article, we present a parallel implementation of point-dipole-based polarizable force fields for molecular dynamics (MD) simulations with periodic boundary conditions (PBC). The smooth particle mesh Ewald technique is combined with two optimal iterative strategies, namely, a preconditioned conjugate gradient solver and a Jacobi solver in conjunction with direct inversion in the iterative subspace for convergence acceleration, to solve the polarization equations. We show that both solvers exhibit very good parallel performance and overall very competitive timings for the energy and force computations needed to perform an MD step. Various tests on large systems are provided in the context of the polarizable AMOEBA force field as implemented in the newly developed Tinker-HP package, which is the first implementation of a polarizable model that makes large-scale experiments for massively parallel PBC point-dipole models possible. We show that using a large number of cores offers a significant acceleration of the overall process involving the iterative methods within the context of SPME and a noticeable improvement in memory management, giving access to very large systems (hundreds of thousands of atoms) as the algorithm naturally distributes the data on different cores. Coupled with advanced MD techniques, gains ranging from 2 to 3 orders of magnitude in time are now possible compared to nonoptimized, sequential implementations, opening new directions for polarizable molecular dynamics with periodic boundary conditions using massively parallel implementations. PMID:26575557
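    A generic sketch of the preconditioned conjugate gradient idea applied to the polarization equations T mu = E (T: dipole interaction matrix, mu: induced dipoles, E: permanent field). Plain numpy with a diagonal Jacobi preconditioner stands in for the SPME-based, massively parallel solvers described above; the matrix here is a toy stand-in.

        import numpy as np

        def pcg(T, E, tol=1e-8, max_iter=200):
            """Solve T mu = E with Jacobi-preconditioned conjugate gradient."""
            Minv = 1.0 / np.diag(T)              # diagonal preconditioner
            mu = np.zeros_like(E)
            r = E - T @ mu
            z = Minv * r
            p = z.copy()
            for _ in range(max_iter):
                Tp = T @ p
                alpha = (r @ z) / (p @ Tp)
                mu += alpha * p
                r_new = r - alpha * Tp
                if np.linalg.norm(r_new) < tol:
                    break
                z_new = Minv * r_new
                beta = (r_new @ z_new) / (r @ z)
                p = z_new + beta * p
                r, z = r_new, z_new
            return mu

        # Toy symmetric positive-definite system standing in for T:
        rng = np.random.default_rng(0)
        B = rng.normal(size=(60, 60))
        T = B @ B.T + 60 * np.eye(60)
        E = rng.normal(size=60)
        mu = pcg(T, E)
        print("residual:", np.linalg.norm(T @ mu - E))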

  20. Masking fields: a massively parallel neural architecture for learning, recognizing, and predicting multiple groupings of patterned data.

    PubMed

    Cohen, M A; Grossberg, S

    1987-05-15

    A massively parallel neural network architecture, called a masking field, is characterized through systematic computer simulations. A masking field is a multiple-scale self-similar automatically gain-controlled cooperative-competitive feedback network F(2). Network F(2) receives input patterns from an adaptive filter F(1) → F(2) that is activated by a prior processing level F(1). Such a network F(2) behaves like a content-addressable memory. It activates compressed recognition codes that are predictive with respect to the activation patterns flickering across the feature detectors of F(1) and competitively inhibits, or masks, codes which are unpredictive with respect to the F(1) patterns. In particular, a masking field can simultaneously detect multiple groupings within its input patterns and assign activation weights to the codes for these groupings which are predictive with respect to the contextual information embedded within the patterns and the prior learning of the system. A masking field automatically rescales its sensitivity as the overall size of an input pattern changes, yet also remains sensitive to the microstructure within each input pattern. In this way, a masking field can more strongly activate a code for the whole F(1) pattern than for its salient parts, yet amplifies the code for a pattern part when it becomes a pattern whole in a new input context. A masking field can also be primed by inputs from F(1): it can activate codes which represent predictions of how the F(1) pattern may evolve in the subsequent time interval. Network F(2) can also exhibit an adaptive sharpening property: repetition of a familiar F(1) pattern can tune the adaptive filter to elicit a more focal spatial activation of its F(2) recognition code than does an unfamiliar input pattern. The F(2) recognition code also becomes less distributed when an input pattern contains more contextual information on which to base an unambiguous prediction of which the F(1) pattern is being

  1. Utility of different massive parallel sequencing platforms for mutation profiling in clinical samples and identification of pitfalls using FFPE tissue.

    PubMed

    Fassunke, Jana; Haller, Florian; Hebele, Simone; Moskalev, Evgeny A; Penzel, Roland; Pfarr, Nicole; Merkelbach-Bruse, Sabine; Endris, Volker

    2015-11-01

    In the growing field of personalised medicine, the analysis of numerous potential targets is becoming a challenge in terms of work load, tissue availability, as well as costs. The molecular analysis of non-small cell lung cancer (NSCLC) has shifted from the analysis of the epidermal growth factor receptor (EGFR) mutation status to the analysis of different gene regions, including resistance mutations or translocations. Massive parallel sequencing (MPS) allows rapid comprehensive mutation testing in routine molecular pathological diagnostics even on small formalin-fixed, paraffin‑embedded (FFPE) biopsies. In this study, we compared and evaluated currently used MPS platforms for their application in routine pathological diagnostics. We initiated a first round‑robin testing of 30 cases diagnosed with NSCLC and a known EGFR gene mutation status. In this study, three pathology institutes from Germany received FFPE tumour sections that had been individually processed. Fragment libraries were prepared by targeted multiplex PCR using institution‑specific gene panels. Sequencing was carried out using three MPS systems: MiSeq™, GS Junior and PGM Ion Torrent™. In two institutes, data analysis was performed with the platform-specific software and the Integrative Genomics Viewer. In one institute, data analysis was carried out using an in-house software system. Of 30 samples, 26 were analysed by all institutes. Concerning the EGFR mutation status, concordance was found in 26 out of 26 samples. The analysis of a few samples failed due to poor DNA quality in alternating institutes. We found 100% concordance when comparing the results of the EGFR mutation status. A total of 38 additional mutations were identified in the 26 samples. In two samples, minor variants were found which could not be confirmed by qPCR. Other characteristic variants were identified as fixation artefacts by reanalyzing the respective sample by Sanger sequencing. Overall, the results of this study

  2. Utility of different massive parallel sequencing platforms for mutation profiling in clinical samples and identification of pitfalls using FFPE tissue

    PubMed Central

    FASSUNKE, JANA; HALLER, FLORIAN; HEBELE, SIMONE; MOSKALEV, EVGENY A.; PENZEL, ROLAND; PFARR, NICOLE; MERKELBACH-BRUSE, SABINE; ENDRIS, VOLKER

    2015-01-01

    In the growing field of personalised medicine, the analysis of numerous potential targets is becoming a challenge in terms of work load, tissue availability, as well as costs. The molecular analysis of non-small cell lung cancer (NSCLC) has shifted from the analysis of the epidermal growth factor receptor (EGFR) mutation status to the analysis of different gene regions, including resistance mutations or translocations. Massive parallel sequencing (MPS) allows rapid comprehensive mutation testing in routine molecular pathological diagnostics even on small formalin-fixed, paraffin-embedded (FFPE) biopsies. In this study, we compared and evaluated currently used MPS platforms for their application in routine pathological diagnostics. We initiated a first round-robin testing of 30 cases diagnosed with NSCLC and a known EGFR gene mutation status. In this study, three pathology institutes from Germany received FFPE tumour sections that had been individually processed. Fragment libraries were prepared by targeted multiplex PCR using institution-specific gene panels. Sequencing was carried out using three MPS systems: MiSeq™, GS Junior and PGM Ion Torrent™. In two institutes, data analysis was performed with the platform-specific software and the Integrative Genomics Viewer. In one institute, data analysis was carried out using an in-house software system. Of 30 samples, 26 were analysed by all institutes. Concerning the EGFR mutation status, concordance was found in 26 out of 26 samples. The analysis of a few samples failed due to poor DNA quality in alternating institutes. We found 100% concordance when comparing the results of the EGFR mutation status. A total of 38 additional mutations were identified in the 26 samples. In two samples, minor variants were found which could not be confirmed by qPCR. Other characteristic variants were identified as fixation artefacts by reanalyzing the respective sample by Sanger sequencing. Overall, the results of this study

  3. One-dimensional models of quasi-neutral parallel electric fields

    NASA Technical Reports Server (NTRS)

    Stern, D. P.

    1981-01-01

    Parallel electric fields can exist in the magnetic mirror geometry of auroral field lines if they conform to the quasi-neutral equilibrium solutions. Results on quasi-neutral equilibria and on double-layer discontinuities are reviewed, and the effects on such equilibria of non-unique solutions, potential barriers and field-aligned current flows are examined, using monoenergetic isotropic distribution functions as inputs.

  4. An Investigation of Perpendicular Gradients of Parallel Electric Field Associated with Magnetic Reconnection

    NASA Astrophysics Data System (ADS)

    Sturner, A. P.; Ergun, R.; Newman, D. L.; Lapenta, G.

    2014-12-01

    Many observations of particle heating and acceleration throughout the universe have been associated with magnetic reconnection. Generalized Ohm's law describes how particles move under ideal and non-ideal conditions; however, it is insufficient for describing how the magnetic field itself changes. Initial studies have shown that a curl of a parallel electric field is necessary for reconnection to occur. These analytic studies have demonstrated that perpendicular gradients in the parallel electric field drive a counter-twisting of the magnetic field on either side of the localized parallel electric field. This results in the slippage of magnetic flux tubes and a breakdown of the 'frozen-in' condition. In this presentation, we analyze results from self-consistent implicit kinetic particle-in-cell simulations. The strongest gradients of parallel electric fields in the simulations are along the separator and not at the X-point. We will present where in the simulation domain the 'frozen-in' condition breaks down, compare it with the location of these gradients, and discuss the implications.

  5. FAST Observations of Acceleration Processes in the Cusp--Evidence for Parallel Electric Fields

    NASA Technical Reports Server (NTRS)

    Pfaff, R. F., Jr.; Carlson, C.; McFadden, J.; Ergun, R.; Clemmons, J.; Klumpar, D.; Strangeway, R.

    1999-01-01

    The existence of precipitating keV ions in the Earth's cusp originating from the magnetosheath provides a unique means to test our understanding of particle acceleration and parallel electric fields in the lower-altitude acceleration region. On numerous occasions, the FAST (Fast Auroral Snapshot) spacecraft has encountered the Earth's cusp regions near its apogee of 4175 km, which are characterized by their signatures of dispersed keV ion injections. The FAST instruments also reveal a complex microphysics inherent to many, but not all, of the cusp regions encountered by the spacecraft, including upgoing ion beams and conics, inverted-V electrons, upgoing electron beams, and spikey DC-coupled electric fields and plasma waves. Detailed inspection of the FAST data often shows clear modulation of the precipitating magnetosheath ions indicating that they are affected by local electric potentials. For example, the magnetosheath ion precipitation is sometimes abruptly shut off precisely in regions where downgoing localized inverted-V electrons are observed. Such observations support the existence of a localized process, such as parallel electric fields, above the spacecraft which accelerates the electrons downward and consequently impedes the ion precipitation. Other acceleration events in the cusp are sometimes organized with an apparent cellular structure that suggests Alfven waves or other large-scale phenomena are controlling the localized potentials. We examine several cusp encounters by the FAST satellite in which the modulation of energetic particle populations reveals evidence of localized acceleration, most likely by parallel electric fields.

  6. Light scattering of rectangular slot antennas: parallel magnetic vector vs perpendicular electric vector.

    PubMed

    Lee, Dukhyung; Kim, Dai-Sik

    2016-01-01

    We study light scattering off rectangular slot nano antennas on a metal film varying incident polarization and incident angle, to examine which field vector of light is more important: electric vector perpendicular to, versus magnetic vector parallel to the long axis of the rectangle. While vector Babinet's principle would prefer magnetic field along the long axis for optimizing slot antenna function, convention and intuition most often refer to the electric field perpendicular to it. Here, we demonstrate experimentally that in accordance with vector Babinet's principle, the incident magnetic vector parallel to the long axis is the dominant component, with the perpendicular incident electric field making a small contribution of the factor of 1/|ε|, the reciprocal of the absolute value of the dielectric constant of the metal, owing to the non-perfectness of metals at optical frequencies. PMID:26740335

  7. Hydrogenic donor impurity in parallel-triangular quantum wires: Hydrostatic pressure and applied electric field effects

    NASA Astrophysics Data System (ADS)

    Restrepo, R. L.; Giraldo, E.; Miranda, G. L.; Ospina, W.; Duque, C. A.

    2009-12-01

    The combined effects of hydrostatic pressure and an in-growth-direction applied electric field on the binding energy of hydrogenic shallow-donor impurity states in parallel-coupled GaAs-Ga1-xAlxAs quantum-well wires are calculated using a variational procedure within the effective-mass and parabolic-band approximations. Results are obtained for several dimensions of the structure, shallow-donor impurity positions, hydrostatic pressures, and applied electric fields. Our results suggest that external inputs such as hydrostatic pressure and an in-growth-direction electric field are two useful tools for modifying the binding energy of a donor impurity in parallel-coupled quantum-well wires.

  8. Light scattering of rectangular slot antennas: parallel magnetic vector vs perpendicular electric vector

    PubMed Central

    Lee, Dukhyung; Kim, Dai-Sik

    2016-01-01

    We study light scattering off rectangular slot nano antennas on a metal film varying incident polarization and incident angle, to examine which field vector of light is more important: electric vector perpendicular to, versus magnetic vector parallel to the long axis of the rectangle. While vector Babinet’s principle would prefer magnetic field along the long axis for optimizing slot antenna function, convention and intuition most often refer to the electric field perpendicular to it. Here, we demonstrate experimentally that in accordance with vector Babinet’s principle, the incident magnetic vector parallel to the long axis is the dominant component, with the perpendicular incident electric field making a small contribution of the factor of 1/|ε|, the reciprocal of the absolute value of the dielectric constant of the metal, owing to the non-perfectness of metals at optical frequencies. PMID:26740335

  9. FWT2D: A massively parallel program for frequency-domain full-waveform tomography of wide-aperture seismic data—Part 1: Algorithm

    NASA Astrophysics Data System (ADS)

    Sourbier, Florent; Operto, Stéphane; Virieux, Jean; Amestoy, Patrick; L'Excellent, Jean-Yves

    2009-03-01

    This is the first paper in a two-part series that describes a massively parallel code that performs 2D frequency-domain full-waveform inversion of wide-aperture seismic data for imaging complex structures. Full-waveform inversion methods, namely quantitative seismic imaging methods based on the resolution of the full wave equation, are computationally expensive. Therefore, designing efficient algorithms which take advantage of parallel computing facilities is critical for the appraisal of these approaches when applied to representative case studies and for further improvements. Full-waveform modelling requires the resolution of a large sparse system of linear equations which is performed with the massively parallel direct solver MUMPS for efficient multiple-shot simulations. Efficiency of the multiple-shot solution phase (forward/backward substitutions) is improved by using the BLAS3 library. The inverse problem relies on a classic local optimization approach implemented with a gradient method. The direct solver returns the multiple-shot wavefield solutions distributed over the processors according to a domain decomposition driven by the distribution of the LU factors. The domain decomposition of the wavefield solutions is used to compute in parallel the gradient of the objective function and the diagonal Hessian, this latter providing a suitable scaling of the gradient. The algorithm allows one to test different strategies for multiscale frequency inversion ranging from successive mono-frequency inversion to simultaneous multifrequency inversion. These different inversion strategies will be illustrated in the following companion paper. The parallel efficiency and the scalability of the code will also be quantified.
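    In standard frequency-domain FWI notation, the structure described above can be summarized as follows (our paraphrase in generic notation, not the paper's exact equations):

        \begin{align*}
          \mathbf{B}(\omega,\mathbf{m})\,\mathbf{u} &= \mathbf{s}
            && \text{forward problem: large sparse system, solved with MUMPS}\\
          C(\mathbf{m}) &= \tfrac{1}{2}\,\|\mathbf{d}_{\mathrm{obs}} - \mathbf{d}_{\mathrm{calc}}(\mathbf{m})\|^2
            && \text{least-squares misfit}\\
          \mathbf{m}_{k+1} &= \mathbf{m}_k - \alpha_k\,\operatorname{diag}(\mathbf{H}_k)^{-1}\nabla C(\mathbf{m}_k)
            && \text{gradient step scaled by the diagonal Hessian}
        \end{align*}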

  10. Using a constraint on the parallel velocity when determining electric fields with EISCAT

    NASA Technical Reports Server (NTRS)

    Caudal, G.; Blanc, M.

    1988-01-01

    A method is proposed to determine the perpendicular components of the ion velocity vector (and hence the perpendicular electric field) from EISCAT tristatic measurements, in which one introduces an additional constraint on the parallel velocity, in order to take account of our knowledge that the parallel velocity of ions is small. This procedure removes some artificial features introduced when the tristatic geometry becomes too unfavorable. It is particularly well suited for the southernmost or northernmost positions of the tristatic measurements performed by meridian scan experiments (CP3 mode).

  11. Experimentally attainable example of chaotic tunneling: The hydrogen atom in parallel static electric and magnetic fields

    NASA Astrophysics Data System (ADS)

    Delande, Dominique; Zakrzewski, Jakub

    2003-12-01

    Statistics of tunneling rates in the presence of chaotic classical dynamics are discussed for a realistic example: a hydrogen atom placed in parallel, uniform, static electric and magnetic fields, where tunneling is followed by ionization along the field direction. Depending on the magnetic quantum number, one may observe either a standard Porter-Thomas distribution of tunneling rates or, for strong scarring by a periodic orbit parallel to the external fields, strong deviations from it. For the latter case, a simple model based on random matrix theory gives the correct distribution.

  12. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by routing through transporter nodes

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-11-16

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. An automated routing strategy routes packets through one or more intermediate nodes of the network to reach a destination. Some packets are constrained to be routed through respective designated transporter nodes, the automated routing strategy determining a path from a respective source node to a respective transporter node, and from a respective transporter node to a respective destination node. Preferably, the source node chooses a routing policy from among multiple possible choices, and that policy is followed by all intermediate nodes. The use of transporter nodes allows greater flexibility in routing.

  13. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by employing bandwidth shells at areas of overutilization

    SciTech Connect

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-04-27

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. An automated routing strategy routes packets through one or more intermediate nodes of the network to reach a final destination. The default routing strategy is altered responsive to detection of overutilization of a particular path of one or more links, and at least some traffic is re-routed by distributing the traffic among multiple paths (which may include the default path). An alternative path may require a greater number of link traversals to reach the destination node.

  14. Method and apparatus for analyzing error conditions in a massively parallel computer system by identifying anomalous nodes within a communicator set

    DOEpatents

    Gooding, Thomas Michael

    2011-04-19

    An analytical mechanism for a massively parallel computer system automatically analyzes data retrieved from the system, and identifies nodes which exhibit anomalous behavior in comparison to their immediate neighbors. Preferably, anomalous behavior is determined by comparing call-return stack tracebacks for each node, grouping like nodes together, and identifying neighboring nodes which do not themselves belong to the group. A node, not itself in the group, having a large number of neighbors in the group, is a likely locality of error. The analyzer preferably presents this information to the user by sorting the neighbors according to number of adjoining members of the group.
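    A toy rendering of the neighbor-comparison idea (hypothetical torus lattice and traceback labels; the patented mechanism operates on real call-return stack data):

        from collections import Counter

        def neighbors(x, y, nx, ny):
            """4-connected neighbors on an nx-by-ny torus lattice."""
            return [((x - 1) % nx, y), ((x + 1) % nx, y),
                    (x, (y - 1) % ny), (x, (y + 1) % ny)]

        def likely_error_localities(tracebacks, nx, ny):
            """Sort non-members of the dominant group by in-group neighbor count."""
            groups = Counter(tracebacks.values())
            dominant, _ = groups.most_common(1)[0]
            members = {n for n, tb in tracebacks.items() if tb == dominant}
            scored = [(sum(nb in members for nb in neighbors(x, y, nx, ny)), (x, y))
                      for (x, y) in tracebacks if (x, y) not in members]
            return sorted(scored, reverse=True)

        # Toy 4x4 lattice: every node reports traceback "A" except node (1, 1),
        # which diverged and is surrounded by group members, hence suspicious.
        tb = {(x, y): "A" for x in range(4) for y in range(4)}
        tb[(1, 1)] = "B"
        print(likely_error_localities(tb, 4, 4)[0])   # -> (4, (1, 1))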

  15. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by dynamically adjusting local routing strategies

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-03-16

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. Each node implements a respective routing strategy for routing data through the network, the routing strategies not necessarily being the same in every node. The routing strategies implemented in the nodes are dynamically adjusted during application execution to shift network workload as required. Preferably, adjustment of routing policies in selective nodes is performed at synchronization points. The network may be dynamically monitored, and routing strategies adjusted according to detected network conditions.

  16. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by dynamic global mapping of contended links

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2011-10-04

    A massively parallel nodal computer system periodically collects and broadcasts usage data for an internal communications network. A node sending data over the network makes a global routing determination using the network usage data. Preferably, network usage data comprises an N-bit usage value for each output buffer associated with a network link. An optimum routing is determined by summing the N-bit values associated with each link through which a data packet must pass, and comparing the sums associated with different possible routes.
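
    The route-selection rule is simple enough to state in a few lines. This hedged Python sketch (toy link names and usage values, not the patent's code) sums the broadcast N-bit usage values along each candidate route and keeps the minimum:

      def best_route(candidate_routes, usage):
          # candidate_routes: lists of links; usage: link -> small integer.
          return min(candidate_routes, key=lambda r: sum(usage[l] for l in r))

      usage = {("a", "b"): 7, ("b", "d"): 1, ("a", "c"): 2, ("c", "d"): 3}
      routes = [[("a", "b"), ("b", "d")], [("a", "c"), ("c", "d")]]
      print(best_route(routes, usage))   # [('a','c'),('c','d')], total 5 vs 8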

  17. Three pillars for achieving quantum mechanical molecular dynamics simulations of huge systems: Divide-and-conquer, density-functional tight-binding, and massively parallel computation.

    PubMed

    Nishizawa, Hiroaki; Nishimura, Yoshifumi; Kobayashi, Masato; Irle, Stephan; Nakai, Hiromi

    2016-08-01

    The linear-scaling divide-and-conquer (DC) quantum chemical methodology is applied to density-functional tight-binding (DFTB) theory to develop a massively parallel program that achieves on-the-fly molecular reaction dynamics simulations of huge systems from scratch. Functions for large-scale geometry optimization and molecular dynamics on the DC-DFTB potential energy surface are implemented in the program, called DC-DFTB-K. A novel interpolation-based algorithm is developed for parallelizing the determination of the Fermi level in the DC method. The performance of the DC-DFTB-K program is assessed using a laboratory computer and the K computer. Numerical tests show the high efficiency of the DC-DFTB-K program: a single-point energy gradient calculation of a one-million-atom system is completed within 60 s using 7290 nodes of the K computer. © 2016 Wiley Periodicals, Inc. PMID:27317328
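
    For context, the global step the DC method must parallelize is the determination of the Fermi level that reproduces the total electron count over all subsystems. The sketch below uses plain bisection on a toy spectrum; the paper's interpolation-based parallel algorithm is not reproduced here:

      import numpy as np

      def electron_count(mu, eigenvalues, beta=200.0):
          # Fermi-Dirac occupations, two electrons per orbital; the exponent
          # is clipped to avoid overflow far from the Fermi level.
          x = np.clip(beta * (eigenvalues - mu), -60.0, 60.0)
          return float(np.sum(2.0 / (1.0 + np.exp(x))))

      def fermi_level(eigenvalues, n_electrons, lo=-10.0, hi=10.0, tol=1e-10):
          # Plain bisection on the monotone count-versus-mu curve.
          while hi - lo > tol:
              mid = 0.5 * (lo + hi)
              if electron_count(mid, eigenvalues) < n_electrons:
                  lo = mid
              else:
                  hi = mid
          return 0.5 * (lo + hi)

      eigs = np.array([-1.0, -0.5, -0.1, 0.3, 0.8])   # toy orbital energies
      print(fermi_level(eigs, n_electrons=6))         # lands in the gap near 0.1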

  18. Development of a MEMS electrostatic condenser lens array for nc-Si surface electron emitters of the Massive Parallel Electron Beam Direct-Write system

    NASA Astrophysics Data System (ADS)

    Kojima, A.; Ikegami, N.; Yoshida, T.; Miyaguchi, H.; Muroyama, M.; Yoshida, S.; Totsu, K.; Koshida, N.; Esashi, M.

    2016-03-01

    The development of a Micro Electro-Mechanical System (MEMS) electrostatic Condenser Lens Array (CLA) for a Massively Parallel Electron Beam Direct Write (MPEBDW) lithography system is described. The CLA converges parallel electron beams for fine patterning. The structure of the CLA was designed on the basis of finite element method (FEM) simulation. The lens was fabricated by precision machining and assembled with a nanocrystalline silicon (nc-Si) electron emitter array serving as the electron source of the MPEBDW system. The nc-Si electron emitter has the advantage that a vertically emitted surface electron beam can be obtained without any extractor electrodes. FEM simulation of the electron optics showed that the size of the electron beam emitted from the emitter was reduced to 15% in the radial direction, and the divergence angle was reduced to 1/18.

  19. Preselective Screening for Linear-Scaling Exact Exchange-Gradient Calculations for Graphics Processing Units and General Strong-Scaling Massively Parallel Calculations.

    PubMed

    Kussmann, Jörg; Ochsenfeld, Christian

    2015-03-10

    We present an extension of our recently introduced PreLinK scheme (J. Chem. Phys. 2013, 138, 134114) to the exact-exchange contribution to nuclear forces. The significant contributions to the exchange gradient are determined by preselection based on accurate shell-pair contributions to the SCF exchange energy prior to the calculation. Our method is therefore highly suitable for massively parallel electronic structure calculations, because it efficiently load-balances only the significant contributions and has an unhampered control flow. The efficiency of our method is shown for several illustrative calculations on single GPU servers, as well as for hybrid MPI/CUDA parallel calculations, with the largest system comprising 3369 atoms and 26952 basis functions. PMID:26579745
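
    The preselection idea can be sketched independently of the quantum chemistry. In this hypothetical Python fragment (threshold and data invented for illustration), shell pairs whose estimated exchange contribution from the previous SCF step falls below a threshold are dropped, and the survivors are dealt out evenly for load balancing:

      def preselect(contributions, threshold=1e-8):
          # contributions: dict (shell_a, shell_b) -> |estimated contribution|.
          return [pair for pair, est in contributions.items() if est > threshold]

      def balance(pairs, n_workers):
          # Deal the surviving shell pairs round-robin across workers.
          return [pairs[i::n_workers] for i in range(n_workers)]

      contribs = {(0, 0): 1e-3, (0, 1): 5e-12, (1, 1): 2e-5, (1, 2): 4e-9}
      significant = preselect(contribs)
      print(balance(significant, 2))   # two evenly sized work lists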

  20. Modelling and experimental evaluation of parallel connected lithium ion cells for an electric vehicle battery system

    NASA Astrophysics Data System (ADS)

    Bruen, Thomas; Marco, James

    2016-04-01

    Variations in cell properties are unavoidable and can be caused by manufacturing tolerances and usage conditions. As a result of this, cells connected in series may have different voltages and states of charge that limit the energy and power capability of the complete battery pack. Methods of removing this energy imbalance have been extensively reported within literature. However, there has been little discussion around the effect that such variation has when cells are connected electrically in parallel. This work aims to explore the impact of connecting cells, with varied properties, in parallel and the issues regarding energy imbalance and battery management that may arise. This has been achieved through analysing experimental data and a validated model. The main results from this study highlight that significant differences in current flow can occur between cells within a parallel stack, which will affect how the cells age and the temperature distribution within the battery assembly.
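
    As a worked example of why parallel-connected cells matter (toy numbers, not data from the paper), Kirchhoff's laws give the current split between two cells with mismatched open-circuit voltage and internal resistance:

      # Two cells in parallel share a demanded pack current i_pack unevenly:
      # equal terminal voltages give v1 - i1*r1 = v2 - i2*r2 with i1 + i2 = i_pack.
      def branch_currents(v1, r1, v2, r2, i_pack):
          i1 = (v1 - v2 + i_pack * r2) / (r1 + r2)
          return i1, i_pack - i1

      # 5% resistance mismatch and 10 mV OCV mismatch at a 10 A pack demand.
      i1, i2 = branch_currents(v1=3.70, r1=0.020, v2=3.69, r2=0.021, i_pack=10.0)
      print(f"cell 1: {i1:.2f} A, cell 2: {i2:.2f} A")   # noticeably unequal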

  1. A Method for Studying Protistan Diversity Using Massively Parallel Sequencing of V9 Hypervariable Regions of Small-Subunit Ribosomal RNA Genes

    PubMed Central

    Amaral-Zettler, Linda A.; McCliment, Elizabeth A.; Ducklow, Hugh W.; Huse, Susan M.

    2009-01-01

    Background Massively parallel pyrosequencing of amplicons from the V6 hypervariable regions of small-subunit (SSU) ribosomal RNA (rRNA) genes is commonly used to assess diversity and richness in bacterial and archaeal populations. Recent advances in pyrosequencing technology provide read lengths of up to 240 nucleotides. Amplicon pyrosequencing can now be applied to longer variable regions of the SSU rRNA gene including the V9 region in eukaryotes. Methodology/Principal Findings We present a protocol for the amplicon pyrosequencing of V9 regions for eukaryotic environmental samples for biodiversity inventories and species richness estimation. The International Census of Marine Microbes (ICoMM) and the Microbial Inventory Research Across Diverse Aquatic Long Term Ecological Research Sites (MIRADA-LTERs) projects are already employing this protocol for tag sequencing of eukaryotic samples in a wide diversity of both marine and freshwater environments. Conclusions/Significance Massively parallel pyrosequencing of eukaryotic V9 hypervariable regions of SSU rRNA genes provides a means of estimating species richness from deeply-sampled populations and for discovering novel species from the environment. PMID:19633714

  2. Inter-laboratory evaluation of the EUROFORGEN Global ancestry-informative SNP panel by massively parallel sequencing using the Ion PGM™.

    PubMed

    Eduardoff, M; Gross, T E; Santos, C; de la Puente, M; Ballard, D; Strobl, C; Børsting, C; Morling, N; Fusco, L; Hussing, C; Egyed, B; Souto, L; Uacyisrael, J; Syndercombe Court, D; Carracedo, Á; Lareu, M V; Schneider, P M; Parson, W; Phillips, C

    2016-07-01

    The EUROFORGEN Global ancestry-informative SNP (AIM-SNPs) panel is a forensic multiplex of 128 markers designed to differentiate an individual's ancestry from amongst the five continental population groups of Africa, Europe, East Asia, Native America, and Oceania. A custom multiplex of AmpliSeq™ PCR primers was designed for the Global AIM-SNPs to perform massively parallel sequencing using the Ion PGM™ system. This study assessed individual SNP genotyping precision using the Ion PGM™, the forensic sensitivity of the multiplex using dilution series, degraded DNA plus simple mixtures, and the ancestry differentiation power of the final panel design, which required substitution of three original ancestry-informative SNPs with alternatives. Fourteen populations that had not been previously analyzed were genotyped using the custom multiplex and these studies allowed assessment of genotyping performance by comparison of data across five laboratories. Results indicate a low level of genotyping error can still occur from sequence misalignment caused by homopolymeric tracts close to the target SNP, despite careful scrutiny of candidate SNPs at the design stage. Such sequence misalignment required the exclusion of component SNP rs2080161 from the Global AIM-SNPs panel. However, the overall genotyping precision and sensitivity of this custom multiplex indicates the Ion PGM™ assay for the Global AIM-SNPs is highly suitable for forensic ancestry analysis with massively parallel sequencing. PMID:27208666

  3. Massively Parallel Sequencing of Patients with Intellectual Disability, Congenital Anomalies and/or Autism Spectrum Disorders with a Targeted Gene Panel

    PubMed Central

    Brett, Maggie; McPherson, John; Zang, Zhi Jiang; Lai, Angeline; Tan, Ee-Shien; Ng, Ivy; Ong, Lai-Choo; Cham, Breana; Tan, Patrick; Rozen, Steve; Tan, Ene-Choo

    2014-01-01

    Developmental delay and/or intellectual disability (DD/ID) affects 1–3% of all children. At least half of these are thought to have a genetic etiology. Recent studies have shown that massively parallel sequencing (MPS) using a targeted gene panel is particularly suited for diagnostic testing for genetically heterogeneous conditions. We report on our experiences with using massively parallel sequencing of a targeted gene panel of 355 genes for investigating the genetic etiology of eight patients with a wide range of phenotypes including DD/ID, congenital anomalies and/or autism spectrum disorder. Targeted sequence enrichment was performed using the Agilent SureSelect Target Enrichment Kit and sequenced on the Illumina HiSeq2000 using paired-end reads. For all eight patients, 81–84% of the targeted regions achieved read depths of at least 20×, with average read depths overlapping targets ranging from 322× to 798×. Causative variants were successfully identified in two of the eight patients: a nonsense mutation in the ATRX gene and a canonical splice site mutation in the L1CAM gene. In a third patient, a canonical splice site variant in the USP9X gene could likely explain all or some of her clinical phenotypes. These results confirm the value of targeted MPS for investigating DD/ID in children for diagnostic purposes. However, targeted gene MPS was less likely to provide a genetic diagnosis for children whose phenotype includes autism. PMID:24690944

  4. Power-balancing instantaneous optimization energy management for a novel series-parallel hybrid electric bus

    NASA Astrophysics Data System (ADS)

    Sun, Dongye; Lin, Xinyou; Qin, Datong; Deng, Tao

    2012-11-01

    Energy management (EM) is a core technique for a hybrid electric bus (HEB) to optimize fuel economy, and it is unique to the corresponding powertrain configuration. Existing control-strategy algorithms seldom manage battery power jointly with internal combustion engine power. In this paper, a power-balancing instantaneous optimization (PBIO) energy management control strategy is proposed for a novel series-parallel hybrid electric bus. According to the characteristics of the novel series-parallel architecture, the switching boundary condition between series and parallel modes, as well as the control rules of the power-balancing strategy, are developed. An equivalent fuel model of the battery is implemented and combined with the engine fuel model to constitute the objective function, which minimizes fuel consumption at each sampled time and coordinates the power distribution between the engine and battery in real time. To validate that the proposed strategy is effective and reasonable, a forward model is built in Matlab/Simulink for simulation, and a dSPACE AutoBox serves as the controller for hardware-in-the-loop bench testing. Both the simulation and hardware-in-the-loop results demonstrate that the proposed strategy sustains the battery SOC within its operational range and keeps the engine operating point in the peak-efficiency region. The fuel economy of the series-parallel hybrid electric bus (SPHEB) improves by up to 30.73% relative to the prototype bus, and the PBIO strategy reduces fuel consumption by up to 12.38% relative to a rule-based strategy. The proposed work shows that the PBIO algorithm is applicable in real time, improves the efficiency of the SPHEB system, and suits complicated configurations well.
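
    The instantaneous-optimization idea can be sketched generically. The following is ECMS-style logic with an invented fuel map, equivalence factor, and SOC limits, not the paper's controller: at each sample time, choose the engine/battery split minimizing engine fuel plus equivalent battery fuel:

      def engine_fuel_rate(p_eng):
          # Toy convex engine fuel map in g/s (invented for illustration).
          return 0.0001 * p_eng ** 2 + 0.05 * p_eng + 0.3

      def split_power(p_demand, soc, s_eq=0.06, soc_lo=0.4, soc_hi=0.8):
          # Search engine power in 1 kW steps; the battery covers the remainder.
          best = None
          for p_eng in range(0, int(p_demand) + 21):
              p_batt = p_demand - p_eng
              if soc <= soc_lo and p_batt > 0:      # protect a depleted battery
                  continue
              if soc >= soc_hi and p_batt < 0:      # avoid overcharging
                  continue
              # Instantaneous cost: engine fuel plus equivalent battery fuel.
              cost = engine_fuel_rate(p_eng) + s_eq * p_batt
              if best is None or cost < best[0]:
                  best = (cost, p_eng, p_batt)
          return best

      print(split_power(p_demand=60.0, soc=0.6))    # (cost, engine kW, battery kW)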

  5. Mapping unstructured grid computations to massively parallel computers. Ph.D. Thesis - Rensselaer Polytechnic Inst., Feb. 1992

    NASA Technical Reports Server (NTRS)

    Hammond, Steven Warren

    1992-01-01

    Investigated here is this mapping problem: assign the tasks of a parallel program to the processors of a parallel computer such that the execution time is minimized. First, a taxonomy of objective functions and heuristics used to solve the mapping problem is presented. Next, we develop a highly parallel heuristic mapping algorithm, called Cyclic Pairwise Exchange (CPE), and discuss its place in the taxonomy. CPE uses local pairwise exchanges of processor assignments to iteratively improve an initial mapping. A variety of initial mapping schemes are tested and recursive spectral bipartitioning (RSB) followed by CPE is shown to result in the best mappings. For the test cases studied here (problems arising in computational fluid dynamics and structural mechanics on unstructured triangular and tetrahedral meshes), RSB and CPE outperform methods based on simulated annealing. Much less time is required to do the mapping and the results obtained are better. Compared with random and naive mappings, RSB and CPE reduce the communication time twofold for the test problems used. Finally, we use CPE in two applications on a CM-2. The first application is a data parallel mesh-vertex upwind finite volume scheme for solving the Euler equations on 2-D triangular unstructured meshes. CPE is used to map grid points to processors. The performance of this code is compared with a similar code on a Cray-YMP and an Intel iPSC/860. The second application is parallel sparse matrix-vector multiplication used in the iterative solution of large sparse linear systems of equations. We map rows of the matrix to processors and use an inner-product based matrix-vector multiplication. We demonstrate that this method is an order of magnitude faster than methods based on scan operations for our test cases.
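
    The pairwise-exchange step at the heart of CPE can be illustrated with a small Python sketch (a serial, greedy rendition with an invented unit-cost model; the thesis's parallel cyclic scheme is not reproduced): swap two processor assignments whenever the swap lowers the number of cut edges:

      def comm_cost(assign, edges):
          # edges: (task_a, task_b) pairs; cost 1 per edge cut across processors.
          return sum(1 for a, b in edges if assign[a] != assign[b])

      def pairwise_exchange(assign, edges):
          improved = True
          while improved:
              improved = False
              tasks = list(assign)
              for i in range(len(tasks)):
                  for j in range(i + 1, len(tasks)):
                      a, b = tasks[i], tasks[j]
                      before = comm_cost(assign, edges)
                      assign[a], assign[b] = assign[b], assign[a]
                      if comm_cost(assign, edges) < before:
                          improved = True
                      else:
                          assign[a], assign[b] = assign[b], assign[a]  # undo
          return assign

      edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
      assign = {0: "p0", 1: "p1", 2: "p0", 3: "p1"}        # cost 4 initially
      print(pairwise_exchange(assign, edges))              # cost drops to 2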

  6. Ionization of a Highly Excited Hydrogen atom in parallel Electric and Magnetic fields

    NASA Astrophysics Data System (ADS)

    Topçu, Türker; Robicheaux, Francis

    2006-05-01

    In a recent paper, Mitchell et al [Phys. Rev. Lett. 92, 073001 (2004)] investigated the ionization of a classical hydrogen atom in parallel electric and magnetic fields. They reported epistrophic self-similar pulse trains of ionized electrons attributed to the classical chaos induced by the magnetic field. We study a hydrogen atom in an excited state with n≈80 in parallel external fields as an example of an open, chaotic quantum system in the time domain. We investigate the effect of interference between the outgoing pulse trains, which is absent in the classical picture. We examine the interference effect as a function of the energy, since the Schrödinger equation does not scale as the classical equations of motion do. We compare and contrast our quantum results with the classical results of Mitchell et al.

  7. Development of parallel algorithms for electrical power management in space applications

    NASA Technical Reports Server (NTRS)

    Berry, Frederick C.

    1989-01-01

    The application of parallel techniques for electrical power system analysis is discussed. The Newton-Raphson method of load flow analysis was used along with the decomposition-coordination technique to perform load flow analysis. The decomposition-coordination technique enables tasks to be performed in parallel by partitioning the electrical power system into independent local problems. Each independent local problem represents a portion of the total electrical power system on which a load flow analysis can be performed. The load flow analysis is performed on these partitioned elements by using the Newton-Raphson load flow method. These independent local problems will produce results for voltage and power which can then be passed to the coordinator portion of the solution procedure. The coordinator problem uses the results of the local problems to determine if any correction is needed on the local problems. The coordinator problem is also solved by an iterative method much like the local problem. The iterative method for the coordination problem will also be the Newton-Raphson method. Therefore, each iteration at the coordination level will result in new values for the local problems. The local problems will have to be solved again along with the coordinator problem until some convergence conditions are met.
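
    A minimal sketch of the Newton-Raphson load flow underlying each local problem (a textbook two-bus example with invented per-unit values, not code from the report) solves for the angle and magnitude of the load-bus voltage behind a purely reactive line:

      import numpy as np

      V1, X = 1.0, 0.1                  # slack-bus voltage and line reactance (pu)
      P2, Q2 = -0.5, -0.2               # scheduled load injections at bus 2 (pu)

      def mismatch(x):
          theta, v = x
          p = (v * V1 / X) * np.sin(theta)
          q = v * v / X - (v * V1 / X) * np.cos(theta)
          return np.array([p - P2, q - Q2])

      x = np.array([0.0, 1.0])          # flat start: angle 0, magnitude 1
      for _ in range(10):
          f = mismatch(x)
          if np.max(np.abs(f)) < 1e-10:
              break
          J = np.empty((2, 2))          # numerical Jacobian, central differences
          for k, h in enumerate(1e-6 * np.eye(2)):
              J[:, k] = (mismatch(x + h) - mismatch(x - h)) / 2e-6
          x = x - np.linalg.solve(J, f)

      print(f"theta2 = {x[0]:.4f} rad, V2 = {x[1]:.4f} pu")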

  8. Parallel plate waveguide with anisotropic graphene plates: Effect of electric and magnetic biases

    NASA Astrophysics Data System (ADS)

    Malekabadi, Ali; Charlebois, Serge A.; Deslandes, Dominic

    2013-03-01

    The performance of a parallel plate waveguide (PPWG) supported by perfect electric conductor (PEC)-graphene and graphene-graphene plates is evaluated. The graphene plate behavior is modeled as an anisotropic medium with both diagonal and Hall conductivities derived from the Kubo formula. The PPWG modes supported by PEC-graphene and graphene-graphene plates are studied. Maxwell's equations are solved for these two waveguides, while the graphene layers are biased with an electric field only and with both electric and magnetic fields. It is shown that when both electric and magnetic biases are applied to the graphene, a hybrid mode (simultaneous transverse electric (TE) and transverse magnetic (TM) modes) will propagate inside the waveguide. The intensities of the TE and TM modes can be adjusted with the applied external bias fields. Study of different waveguides demonstrates that by decreasing the plate separation (d), the wave confinement improves. However, this also increases the waveguide attenuation. A dielectric layer inserted between the plates can also be used to improve the wave confinement. The presented analytical procedure is applicable to other guiding structures having walls with isotropic or anisotropic conductivities.

  9. Performance of a TthPrimPol-based whole genome amplification kit for copy number alteration detection using massively parallel sequencing

    PubMed Central

    Deleye, Lieselot; De Coninck, Dieter; Dheedene, Annelies; De Sutter, Petra; Menten, Björn; Deforce, Dieter; Van Nieuwerburgh, Filip

    2016-01-01

    Starting from only a few cells, current whole genome amplification (WGA) methods provide enough DNA to perform massively parallel sequencing (MPS). Unfortunately, all current WGA methods introduce representation bias which limits detection of copy number aberrations (CNAs) smaller than 3 Mb. A recent WGA method, called TruePrime single cell WGA, uses a recently discovered DNA primase, TthPrimPol, instead of artificial primers to initiate DNA amplification. This method could lead to a lower representation bias, and consequently to a better detection of CNAs. The enzyme requires no complementarity and thus should generate random primers, equally distributed across the genome. The performance of TruePrime WGA was assessed for aneuploidy screening and CNA analysis after MPS, starting from 1, 3 or 5 cells. Although the method looks promising, the single cell TruePrime WGA kit v1 is not suited for high resolution CNA detection after MPS because too much representation bias is introduced. PMID:27546482

  10. Performance of a TthPrimPol-based whole genome amplification kit for copy number alteration detection using massively parallel sequencing.

    PubMed

    Deleye, Lieselot; De Coninck, Dieter; Dheedene, Annelies; De Sutter, Petra; Menten, Björn; Deforce, Dieter; Van Nieuwerburgh, Filip

    2016-01-01

    Starting from only a few cells, current whole genome amplification (WGA) methods provide enough DNA to perform massively parallel sequencing (MPS). Unfortunately, all current WGA methods introduce representation bias which limits detection of copy number aberrations (CNAs) smaller than 3 Mb. A recent WGA method, called TruePrime single cell WGA, uses a recently discovered DNA primase, TthPrimPol, instead of artificial primers to initiate DNA amplification. This method could lead to a lower representation bias, and consequently to a better detection of CNAs. The enzyme requires no complementarity and thus should generate random primers, equally distributed across the genome. The performance of TruePrime WGA was assessed for aneuploidy screening and CNA analysis after MPS, starting from 1, 3 or 5 cells. Although the method looks promising, the single cell TruePrime WGA kit v1 is not suited for high resolution CNA detection after MPS because too much representation bias is introduced. PMID:27546482

  11. An open-source, massively parallel code for non-LTE synthesis and inversion of spectral lines and Zeeman-induced Stokes profiles

    NASA Astrophysics Data System (ADS)

    Socas-Navarro, H.; de la Cruz Rodríguez, J.; Asensio Ramos, A.; Trujillo Bueno, J.; Ruiz Cobo, B.

    2015-05-01

    With the advent of a new generation of solar telescopes and instrumentation, interpreting chromospheric observations (in particular, spectropolarimetry) requires new, suitable diagnostic tools. This paper describes a new code, NICOLE, that has been designed for Stokes non-LTE radiative transfer, for synthesis and inversion of spectral lines and Zeeman-induced polarization profiles, spanning a wide range of atmospheric heights from the photosphere to the chromosphere. The code offers a number of unique features and capabilities and has been built from scratch with a powerful parallelization scheme that makes it suitable for application on massive datasets using large supercomputers. The source code is written entirely in Fortran 90/2003 and complies strictly with the ANSI standards to ensure maximum compatibility and portability. It is being publicly released, with the idea of facilitating future branching by other groups to augment its capabilities. The source code is currently hosted at the following repository: https://github.com/hsocasnavarro/NICOLE

  12. Use of Massively Parallel Pyrosequencing to Evaluate the Diversity of and Selection on Plasmodium falciparum csp T-Cell Epitopes in Lilongwe, Malawi

    PubMed Central

    Bailey, Jeffrey A.; Mvalo, Tisungane; Aragam, Nagesh; Weiser, Matthew; Congdon, Seth; Kamwendo, Debbie; Martinson, Francis; Hoffman, Irving; Meshnick, Steven R.; Juliano, Jonathan J.

    2012-01-01

    The development of an effective malaria vaccine has been hampered by the genetic diversity of commonly used target antigens. This diversity has led to concerns about allele-specific immunity limiting the effectiveness of vaccines. Despite extensive genetic diversity of circumsporozoite protein (CS), the most successful malaria vaccine is RTS,S, a monovalent CS vaccine. By use of massively parallel pyrosequencing, we evaluated the diversity of CS haplotypes across the T-cell epitopes in parasites from Lilongwe, Malawi. We identified 57 unique parasite haplotypes from 100 participants. By use of ecological and molecular indexes of diversity, we saw no difference in the diversity of CS haplotypes between adults and children. We saw evidence of weak variant-specific selection within this region of CS, suggesting naturally acquired immunity does induce variant-specific selection on CS. Therefore, the impact of CS vaccines on variant frequencies with widespread implementation of vaccination requires further study. PMID:22551816

  13. Active control of massively separated high-speed/base flows with electric arc plasma actuators

    NASA Astrophysics Data System (ADS)

    DeBlauw, Bradley G.

    The current project was undertaken to evaluate the effects of electric arc plasma actuators on high-speed separated flows. Two underlying goals motivated these experiments. The first goal was to provide a flow control technique that will result in enhanced flight performance for supersonic vehicles by altering the near-wake characteristics. The second goal was to gain a broader and more sophisticated understanding of these complex, supersonic, massively-separated, compressible, and turbulent flow fields. The attainment of the proposed objectives was facilitated through energy deposition from multiple electric-arc plasma discharges near the base corner separation point. The control authority of electric arc plasma actuators on a supersonic axisymmetric base flow was evaluated for several actuator geometries, frequencies, forcing modes, duty cycles/on-times, and currents. Initially, an electric arc plasma actuator power supply and control system were constructed to generate the arcs. Experiments were performed to evaluate the operational characteristics, electromagnetic emission, and fluidic effect of the actuators in quiescent ambient air. The maximum velocity induced by the arc when formed in a 5 mm x 1.6 mm x 2 mm deep cavity was about 40 m/s. During breakdown, the electromagnetic emission exhibited a rise and fall in intensity over a period of about 340 ns. After breakdown, the emission stabilized to a near-constant distribution. It was also observed that the plasma formed into two different modes: "high-voltage" and "low-voltage". It is believed that the plasma may be switching between an arc discharge and a glow discharge for these different modes. The two types of plasma do not appear to cause substantial differences in the induced fluidic effects of the actuator. In general, the characterization study provided a greater fundamental understanding of the operation of the actuators, as well as data for computational model comparison. Preliminary investigations

  14. Adaptation of a Multi-Block Structured Solver for Effective Use in a Hybrid CPU/GPU Massively Parallel Environment

    NASA Astrophysics Data System (ADS)

    Gutzwiller, David; Gontier, Mathieu; Demeulenaere, Alain

    2014-11-01

    Multi-Block structured solvers hold many advantages over their unstructured counterparts, such as a smaller memory footprint and efficient serial performance. Historically, multi-block structured solvers have not been easily adapted for use in a High Performance Computing (HPC) environment, and the recent trend towards hybrid GPU/CPU architectures has further complicated the situation. This paper will elaborate on developments and innovations applied to the NUMECA FINE/Turbo solver that have allowed near-linear scalability with real-world problems on over 250 hybrid CPU/GPU cluster nodes. Discussion will focus on the implementation of virtual partitioning and load balancing algorithms using a novel meta-block concept. This implementation is transparent to the user, allowing all pre- and post-processing steps to be performed using a simple, unpartitioned grid topology. Additional discussion will elaborate on developments that have improved parallel performance, including fully parallel I/O with the ADIOS API and the GPU porting of the computationally heavy CPUBooster convergence acceleration module.

  15. Magnetospheric Multiscale Satellites Observations of Parallel Electric Fields Associated with Magnetic Reconnection

    NASA Astrophysics Data System (ADS)

    Ergun, R. E.; Goodrich, K. A.; Wilder, F. D.; Holmes, J. C.; Stawarz, J. E.; Eriksson, S.; Sturner, A. P.; Malaspina, D. M.; Usanova, M. E.; Torbert, R. B.; Lindqvist, P.-A.; Khotyaintsev, Y.; Burch, J. L.; Strangeway, R. J.; Russell, C. T.; Pollock, C. J.; Giles, B. L.; Hesse, M.; Chen, L. J.; Lapenta, G.; Goldman, M. V.; Newman, D. L.; Schwartz, S. J.; Eastwood, J. P.; Phan, T. D.; Mozer, F. S.; Drake, J.; Shay, M. A.; Cassak, P. A.; Nakamura, R.; Marklund, G.

    2016-06-01

    We report observations from the Magnetospheric Multiscale satellites of parallel electric fields (E∥) associated with magnetic reconnection in the subsolar region of the Earth's magnetopause. E∥ events near the electron diffusion region have amplitudes on the order of 100 mV/m, which are significantly larger than those predicted for an antiparallel reconnection electric field. This Letter addresses specific types of E∥ events, which appear as large-amplitude, near unipolar spikes that are associated with tangled, reconnected magnetic fields. These E∥ events are primarily in or near a current layer near the separatrix and are interpreted to be double layers that may be responsible for secondary reconnection in tangled magnetic fields or flux ropes. These results are telling of the three-dimensional nature of magnetopause reconnection and indicate that magnetopause reconnection may be often patchy and/or drive turbulence along the separatrix that results in flux ropes and/or tangled magnetic fields.

  16. Venus nightside ionospheric holes - The signatures of parallel electric field acceleration regions

    NASA Technical Reports Server (NTRS)

    Grebowsky, J. M.; Curtis, S. A.

    1981-01-01

    Attention is given to the existence of 'holes', that is, regions of density depletion in the nightside Venus ionosphere associated with regions of radial magnetic fields. The properties of the electrons within the core of these holes are thought to suggest an acceleration process along the magnetic field lines, a process also suggested by the Venera 9 and 10 observations of energetic ions in the Venus tail. On the basis of the observational information, these Venusian plasma depletions are attributed to the presence of parallel electric fields similar to those observed in the terrestrial auroral ionosphere. The resulting electric field accelerates electrons down the field lines, heating the depleted thermal electron population within the hole and producing ionization below the hole. At the same time, ionospheric ions are accelerated outward toward the plasmasheet.

  17. Rapid profiling of the antigen regions recognized by serum antibodies using massively parallel sequencing of antigen-specific libraries.

    PubMed

    Domina, Maria; Lanza Cariccio, Veronica; Benfatto, Salvatore; D'Aliberti, Deborah; Venza, Mario; Borgogni, Erica; Castellino, Flora; Biondo, Carmelo; D'Andrea, Daniel; Grassi, Luigi; Tramontano, Anna; Teti, Giuseppe; Felici, Franco; Beninati, Concetta

    2014-01-01

    There is a need for techniques capable of identifying the antigenic epitopes targeted by polyclonal antibody responses during deliberate or natural immunization. Although successful, traditional phage library screening is laborious and can map only some of the epitopes. To accelerate and improve epitope identification, we have employed massive sequencing of phage-displayed antigen-specific libraries using the Illumina MiSeq platform. This enabled us to precisely identify the regions of a model antigen, the meningococcal NadA virulence factor, targeted by serum antibodies in vaccinated individuals and to rank hundreds of antigenic fragments according to their immunoreactivity. We found that next generation sequencing can significantly empower the analysis of antigen-specific libraries by allowing simultaneous processing of dozens of library/serum combinations in less than two days, including the time required for antibody-mediated library selection. Moreover, compared with traditional plaque picking, the new technology (named Phage-based Representation OF Immuno-Ligand Epitope Repertoire or PROFILER) provides superior resolution in epitope identification. PROFILER seems ideally suited to streamline and guide rational antigen design, adjuvant selection, and quality control of newly produced vaccines. Furthermore, this method may also find important applications in other fields covered by traditional quantitative serology. PMID:25473968

  18. Partition-of-unity finite-element method for large scale quantum molecular dynamics on massively parallel computational platforms

    SciTech Connect

    Pask, J E; Sukumar, N; Guney, M; Hu, W

    2011-02-28

    Over the course of the past two decades, quantum mechanical calculations have emerged as a key component of modern materials research. However, the solution of the required quantum mechanical equations is a formidable task and this has severely limited the range of materials systems which can be investigated by such accurate, quantum mechanical means. The current state of the art for large-scale quantum simulations is the planewave (PW) method, as implemented in the now-ubiquitous VASP, ABINIT, and QBox codes, among many others. However, since the PW method uses a global Fourier basis, with strictly uniform resolution at all points in space, and in which every basis function overlaps every other at every point, it suffers from substantial inefficiencies in calculations involving atoms with localized states, such as first-row and transition-metal atoms, and requires substantial nonlocal communications in parallel implementations, placing critical limits on scalability. In recent years, real-space methods such as finite-difference (FD) and finite-element (FE) methods have been developed to address these deficiencies by reformulating the required quantum mechanical equations in a strictly local representation. However, while addressing both resolution and parallel-communications problems, such local real-space approaches have been plagued by one key disadvantage relative to planewaves: excessive degrees of freedom (grid points, basis functions) needed to achieve the required accuracies. And so, despite critical limitations, the PW method remains the standard today. In this work, we show for the first time that this key remaining disadvantage of real-space methods can in fact be overcome: by building known atomic physics into the solution process using modern partition-of-unity (PU) techniques in finite element analysis. Indeed, our results show order-of-magnitude reductions in basis size relative to state-of-the-art planewave based methods. The method developed here is

  19. Monte Carlo simulation of photoelectron energization in parallel electric fields: Electroglow on Uranus

    SciTech Connect

    Singhal, R.P.; Bhardwaj, A. )

    1991-09-01

    A Monte Carlo simulation of photoelectron energization and energy degradation in H₂ gas in the presence of parallel electric fields has been carried out. Numerical yield spectra, which contain information about the electron energy degradation process and can be used to calculate the yield for any inelastic event, are obtained. The variation of yield spectra with incident electron energy, electric field, pitch angle, and cutoff limit has been studied. The yield function is employed to determine the photoelectron fluxes. H₂ Lyman and Werner band excitation rates and integrated column intensity are computed for three different electric field profiles taking various low-energy cutoff limits. It is found that an electric field profile with a peak value of 4 mV/m at a neutral number density of 3×10¹⁰ cm⁻³ produces enhanced volume emission rates of H₂ bands (λ < 1100 Å), explaining about 20% of the observed electroglow emission on Uranus. The effect of solar zenith angle and solar cycle variation on the peak excitation rate is discussed.

  20. A theoretical study for parallel electric field in nonlinear magnetosonic waves in three-component plasmas

    NASA Astrophysics Data System (ADS)

    Toida, Mieko

    2016-07-01

    The electric field parallel to the magnetic field in nonlinear magnetosonic waves in three-component plasmas (a two-ion-species plasma and an electron-positron-ion plasma) is theoretically studied based on a three-fluid model. In a two-ion-species plasma, the magnetosonic mode has two branches, a high-frequency mode and a low-frequency mode. The parallel electric field E∥ and its integral along the magnetic field, F = -∫E∥ ds, in the two modes propagating quasiperpendicular to the magnetic field are derived as functions of the wave amplitude ε and of the density ratio and cyclotron frequency ratio of the two ion species. The theory shows that the magnitude of F in the high-frequency-mode pulse is much greater than that in the low-frequency-mode pulse. Theoretical expressions for E∥ and F in nonlinear magnetosonic pulses in an electron-positron-ion plasma are also obtained under the assumption that the wave amplitudes are in the range (mₑ/mᵢ)^(1/2) < ε < 1, where mₑ/mᵢ is the electron-to-ion mass ratio.

  1. Rydberg systems in parallel electric and magnetic fields: an improved method for finding exceptional points

    NASA Astrophysics Data System (ADS)

    Feldmaier, Matthias; Main, Jörg; Schweiner, Frank; Cartarius, Holger; Wunner, Günter

    2016-07-01

    Exceptional points are special parameter points in spectra of open quantum systems, at which resonance energies become degenerate and the associated eigenvectors coalesce. Typical examples are Rydberg systems in parallel electric and magnetic fields, for which we solve the Schrödinger equation in a complete basis to calculate the resonances and eigenvectors. Starting from an avoided crossing within the parameter-dependent spectra and using a two-dimensional matrix model, we develop an iterative algorithm to calculate the field strengths and resonance energies of exceptional points and to verify their basic properties. Additionally, we are able to visualise the wave functions of the degenerate states. We report the existence of various exceptional points. For the hydrogen atom these points are in an experimentally inaccessible regime of field strengths. However, excitons in cuprous oxide in parallel electric and magnetic fields, i.e., the corresponding hydrogen analogue in a solid state body, provide a suitable system, where the high-field regime can be reached at much smaller external fields and for which we propose an experiment to detect exceptional points.

  2. User's guide of TOUGH2-EGS-MP: A Massively Parallel Simulator with Coupled Geomechanics for Fluid and Heat Flow in Enhanced Geothermal Systems VERSION 1.0

    SciTech Connect

    Xiong, Yi; Fakcharoenphol, Perapon; Wang, Shihao; Winterfeld, Philip H.; Zhang, Keni; Wu, Yu-Shu

    2013-12-01

    TOUGH2-EGS-MP is a parallel numerical simulation program coupling geomechanics with fluid and heat flow in fractured and porous media, and is applicable for simulation of enhanced geothermal systems (EGS). TOUGH2-EGS-MP is based on the TOUGH2-MP code, the massively parallel version of TOUGH2. In TOUGH2-EGS-MP, the fully-coupled flow-geomechanics model is developed from linear elastic theory for thermo-poro-elastic systems and is formulated in terms of mean normal stress as well as pore pressure and temperature. Reservoir rock properties such as porosity and permeability depend on rock deformation, and the relationships between these two, obtained from poro-elasticity theories and empirical correlations, are incorporated into the simulation. This report provides the user with detailed information on the TOUGH2-EGS-MP mathematical model and instructions for using it for Thermal-Hydrological-Mechanical (THM) simulations. The mathematical model includes the fluid and heat flow equations, geomechanical equation, and discretization of those equations. In addition, the parallel aspects of the code, such as domain partitioning and communication between processors, are also included. Although TOUGH2-EGS-MP has the capability for simulating fluid and heat flows coupled with geomechanical effects, it is up to the user to select the specific coupling process, such as THM or only TH, in a simulation. There are several example problems illustrating applications of this program. These example problems are described in detail and their input data are presented. Their results demonstrate that this program can be used for field-scale geothermal reservoir simulation in porous and fractured media with fluid and heat flow coupled with geomechanical effects.

  3. Implementation of a flexible and scalable particle-in-cell method for massively parallel computations in the mantle convection code ASPECT

    NASA Astrophysics Data System (ADS)

    Gassmöller, Rene; Bangerth, Wolfgang

    2016-04-01

    Particle-in-cell methods have a long history and many applications in geodynamic modelling of mantle convection, lithospheric deformation and crustal dynamics. They are primarily used to track material information, the strain a material has undergone, the pressure-temperature history a certain material region has experienced, or the amount of volatiles or partial melt present in a region. However, their efficient parallel implementation - in particular combined with adaptive finite-element meshes - is complicated due to the complex communication patterns and frequent reassignment of particles to cells. Consequently, many current scientific software packages accomplish this efficient implementation by specifically designing particle methods for a single purpose, like the advection of scalar material properties that do not evolve over time (e.g., for chemical heterogeneities). Design choices for particle integration, data storage, and parallel communication are then optimized for this single purpose, making the code relatively rigid to changing requirements. Here, we present the implementation of a flexible, scalable and efficient particle-in-cell method for massively parallel finite-element codes with adaptively changing meshes. Using a modular plugin structure, we allow maximum flexibility of the generation of particles, the carried tracer properties, the advection and output algorithms, and the projection of properties to the finite-element mesh. We present scaling tests ranging up to tens of thousands of cores and tens of billions of particles. Additionally, we discuss efficient load-balancing strategies for particles in adaptive meshes with their strengths and weaknesses, local particle-transfer between parallel subdomains utilizing existing communication patterns from the finite element mesh, and the use of established parallel output algorithms like the HDF5 library. Finally, we show some relevant particle application cases, compare our implementation to a
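
    The core particle bookkeeping that such an implementation must parallelize can be sketched in a few lines (hypothetical 1-D Python, not ASPECT's C++/deal.II code): advect particles through a prescribed flow, then reassign each particle to its containing cell to obtain the per-cell load:

      import numpy as np

      rng = np.random.default_rng(1)
      n_cells, dt = 10, 0.01
      pos = rng.uniform(0.0, 1.0, size=1000)       # 1-D particle positions

      def velocity(x):
          # Prescribed toy flow field standing in for the finite-element solution.
          return np.sin(np.pi * x)

      for _ in range(100):                         # forward-Euler advection
          pos = np.clip(pos + dt * velocity(pos), 0.0, 1.0)

      # Reassign particles to cells of a uniform mesh and count the load;
      # the per-cell counts are the input to any load-balancing strategy.
      cell_of = np.minimum((pos * n_cells).astype(int), n_cells - 1)
      print(np.bincount(cell_of, minlength=n_cells))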

  4. Self-consistent quasi-static parallel electric field associated with substorm growth phase

    NASA Astrophysics Data System (ADS)

    Le Contel, O.; Pellat, R.; Roux, A.

    2000-06-01

    A new approach is proposed to calculate the self-consistent parallel electric field associated with the response of a plasma to quasi-static electromagnetic perturbations (ω being the wave frequency and k∥ the parallel component of the wave vector). Calculations are carried out in the case of a mirror geometry, for ω<ωb (ωb being the particle bounce frequency). For the sake of simplification the β of the plasma is assumed to be small. Apart from this restriction, the full Vlasov-Maxwell system of equations has been solved within the constraints described above (ω<ωb). To lowest order in the electron-to-ion temperature ratio Te/Ti, the quasi-neutrality condition (QNC) implies that the parallel electric field vanishes. In the present study, we solve the QNC to the next order in Te/Ti and show that a field-aligned potential drop proportional to Te/Ti does develop. We compute explicitly this potential drop in the case of the substorm growth phase modeled as in LC00. This potential drop has been calculated analytically for two regimes of parameters, ωd<ω and ωd>ω (ωd being the bounce-averaged magnetic drift frequency, equal to kyvd, where ky is the wave number in the y direction and vd the bounce-averaged magnetic drift velocity). The first regime (ωd<ω) corresponds to small particle

  5. A microprobe for parallel optical and electrical recordings from single neurons in vivo.

    PubMed

    LeChasseur, Yoan; Dufour, Suzie; Lavertu, Guillaume; Bories, Cyril; Deschênes, Martin; Vallée, Réal; De Koninck, Yves

    2011-04-01

    Recording electrical activity from identified neurons in intact tissue is key to understanding their role in information processing. Recent fluorescence labeling techniques have opened new possibilities to combine electrophysiological recording with optical detection of individual neurons deep in brain tissue. For this purpose we developed dual-core fiberoptics-based microprobes, with an optical core to locally excite and collect fluorescence, and an electrolyte-filled hollow core for extracellular single unit electrophysiology. This design provides microprobes with tips < 10 μm, enabling analyses with single-cell optical resolution. We demonstrate combined electrical and optical detection of single fluorescent neurons in rats and mice. We combined electrical recordings and optical Ca²⁺ measurements from single thalamic relay neurons in rats, and achieved detection and activation of single channelrhodopsin-expressing neurons in Thy1::ChR2-YFP transgenic mice. The microprobe expands possibilities for in vivo electrophysiological recording, providing parallel access to single-cell optical monitoring and control. PMID:21317908

  6. A Highly Parallel Implementation of K-Means for Multithreaded Architecture

    SciTech Connect

    Mackey, Patrick S.; Feo, John T.; Wong, Pak C.; Chen, Yousu

    2011-04-06

    We present a parallel implementation of the popular k-means clustering algorithm for massively multithreaded computer systems, as well as a parallelized version of the KKZ seed selection algorithm. We demonstrate that as system size increases, sequential seed selection can become a bottleneck. We also present an early attempt at parallelizing k-means that highlights critical performance issues when programming massively multithreaded systems. For our case studies, we used data collected from electric power simulations and ran on the Cray XMT.
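
    For reference, the algorithmic skeleton being parallelized looks as follows in a numpy sketch (the paper targets the multithreaded Cray XMT; this serial data-parallel rendition and its toy data are illustrative only). KKZ seeding picks the largest-norm point first, then repeatedly the point farthest from its nearest seed:

      import numpy as np

      def kkz_seeds(data, k):
          # First seed: largest-norm point; then farthest-from-nearest-seed.
          seeds = [data[np.argmax(np.linalg.norm(data, axis=1))]]
          while len(seeds) < k:
              d = np.min([np.linalg.norm(data - s, axis=1) for s in seeds], axis=0)
              seeds.append(data[np.argmax(d)])
          return np.array(seeds)

      def kmeans(data, k, iters=20):
          centers = kkz_seeds(data, k)
          for _ in range(iters):
              # Assign each point to its nearest center, then recompute means.
              dists = np.linalg.norm(data[:, None, :] - centers[None], axis=2)
              labels = np.argmin(dists, axis=1)
              centers = np.array([data[labels == j].mean(axis=0) for j in range(k)])
          return centers, labels

      rng = np.random.default_rng(2)
      data = np.vstack([rng.normal(c, 0.1, (100, 2)) for c in (0.0, 1.0, 2.0)])
      centers, labels = kmeans(data, k=3)
      print(np.round(centers, 2))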

  7. Scientific development of a massively parallel ocean climate model. Progress report for 1992--1993 and Continuing request for 1993--1994 to CHAMMP (Computer Hardware, Advanced Mathematics, and Model Physics)

    SciTech Connect

    Semtner, A.J. Jr.; Chervin, R.M.

    1993-05-01

    During the second year of CHAMMP funding to the principal investigators, progress has been made in the proposed areas of research, as follows: investigation of the physics of the thermohaline circulation; examination of resolution effects on ocean general circulation; and development of a massively parallel ocean climate model.

  8. Analysis of bacterial and archaeal diversity in coastal microbial mats using massive parallel 16S rRNA gene tag sequencing

    PubMed Central

    Bolhuis, Henk; Stal, Lucas J

    2011-01-01

    Coastal microbial mats are small-scale and largely closed ecosystems in which a plethora of different functional groups of microorganisms are responsible for the biogeochemical cycling of the elements. Coastal microbial mats play an important role in coastal protection and morphodynamics through stabilization of the sediments and by initiating the development of salt-marshes. Little is known about the bacterial and especially archaeal diversity and how it contributes to the ecological functioning of coastal microbial mats. Here, we analyzed three different types of coastal microbial mats that are located along a tidal gradient and can be characterized as marine (ST2), brackish (ST3) and freshwater (ST3) systems. The mats were sampled during three different seasons and subjected to massive parallel tag sequencing of the V6 region of the 16S rRNA genes of Bacteria and Archaea. Sequence analysis revealed that the mats are among the most diverse marine ecosystems studied so far and contain several novel taxonomic levels ranging from classes to species. The diversity between the different mat types was far more pronounced than the changes between the different seasons at one location. The archaeal communities of these mats had not been studied before; they showed a strong response to a short period of drought during summer, resulting in a massive increase in halobacterial sequences, whereas the bacterial community was barely affected. We concluded that the community composition and the microbial diversity were intrinsic to the mat type and depended on the location along the tidal gradient, indicating a relation with salinity. PMID:21544102

  9. Massively parallel implementations of coupled-cluster methods for electron spin resonance spectra. I. Isotropic hyperfine coupling tensors in large radicals

    NASA Astrophysics Data System (ADS)

    Verma, Prakash; Perera, Ajith; Morales, Jorge A.

    2013-11-01

    Coupled cluster (CC) methods provide highly accurate predictions of molecular properties, but their high computational cost has precluded their routine application to large systems. Fortunately, recent computational developments in the ACES III program by the Bartlett group [the OED/ERD atomic integral package, the super instruction processor, and the super instruction architecture language] permit overcoming that limitation by providing a framework for massively parallel CC implementations. In that scheme, we are further extending those parallel CC efforts to systematically predict the three main electron spin resonance (ESR) tensors (A-, g-, and D-tensors) to be reported in a series of papers. In this paper inaugurating that series, we report our new ACES III parallel capabilities that calculate isotropic hyperfine coupling constants in 38 neutral, cationic, and anionic radicals that include the 11B, 17O, 9Be, 19F, 1H, 13C, 35Cl, 33S, 14N, 31P, and 67Zn nuclei. Present parallel calculations are conducted at the Hartree-Fock (HF), second-order many-body perturbation theory [MBPT(2)], CC singles and doubles (CCSD), and CCSD with perturbative triples [CCSD(T)] levels using Roos augmented double- and triple-zeta atomic natural orbitals basis sets. HF results consistently overestimate isotropic hyperfine coupling constants. However, inclusion of electron correlation effects in the simplest way via MBPT(2) provides significant improvements in the predictions, but not without occasional failures. In contrast, CCSD results are consistently in very good agreement with experimental results. Inclusion of perturbative triples to CCSD via CCSD(T) leads to small improvements in the predictions, which might not compensate for the extra computational effort at a non-iterative N^7 scaling in CCSD(T). The importance of these accurate computations of isotropic hyperfine coupling constants to elucidate experimental ESR spectra, to interpret spin-density distributions, and to

  10. Massively parallel implementations of coupled-cluster methods for electron spin resonance spectra. I. Isotropic hyperfine coupling tensors in large radicals

    SciTech Connect

    Verma, Prakash; Morales, Jorge A.; Perera, Ajith

    2013-11-07

    Coupled cluster (CC) methods provide highly accurate predictions of molecular properties, but their high computational cost has precluded their routine application to large systems. Fortunately, recent computational developments in the ACES III program by the Bartlett group [the OED/ERD atomic integral package, the super instruction processor, and the super instruction architecture language] permit overcoming that limitation by providing a framework for massively parallel CC implementations. In that scheme, we are further extending those parallel CC efforts to systematically predict the three main electron spin resonance (ESR) tensors (A-, g-, and D-tensors) to be reported in a series of papers. In this paper inaugurating that series, we report our new ACES III parallel capabilities that calculate isotropic hyperfine coupling constants in 38 neutral, cationic, and anionic radicals that include the 11B, 17O, 9Be, 19F, 1H, 13C, 35Cl, 33S, 14N, 31P, and 67Zn nuclei. Present parallel calculations are conducted at the Hartree-Fock (HF), second-order many-body perturbation theory [MBPT(2)], CC singles and doubles (CCSD), and CCSD with perturbative triples [CCSD(T)] levels using Roos augmented double- and triple-zeta atomic natural orbitals basis sets. HF results consistently overestimate isotropic hyperfine coupling constants. However, inclusion of electron correlation effects in the simplest way via MBPT(2) provides significant improvements in the predictions, but not without occasional failures. In contrast, CCSD results are consistently in very good agreement with experimental results. Inclusion of perturbative triples to CCSD via CCSD(T) leads to small improvements in the predictions, which might not compensate for the extra computational effort at a non-iterative N^7 scaling in CCSD(T). The importance of these accurate computations of isotropic hyperfine coupling constants to elucidate

  11. Photoionization microscopy for a hydrogen atom in parallel electric and magnetic fields

    NASA Astrophysics Data System (ADS)

    Deng, M.; Gao, W.; Lu, Rong; Delos, J. B.; You, L.; Liu, H. P.

    2016-06-01

    In photoionization microscopy experiments, an atom is placed in static external fields and ionized by a laser, and an electron falls onto a position-sensitive detector. The current of electrons arriving at various points on the detector depends upon the initial state of the atom, the excited states to which the electron is resonantly or nonresonantly excited, and the various paths leading from the atom to the final point on the detector. We report here quantum-mechanical computations of photoionization microscopy in parallel electric and magnetic fields. We focus especially on the patterns resulting from resonant excited states. We show that the magnetic field substantially modifies some of these resonant states, confining them in the radial direction, and that it has a strong effect on the interference pattern at the detector.

  12. Evidence for parallel electric field particle acceleration in the dayside auroral oval

    NASA Technical Reports Server (NTRS)

    Torbert, R. B.; Carlson, C. W.

    1980-01-01

    Electron and ion energy spectra and electron pitch angle distributions are presented for two sounding rocket flights in the dayside auroral zone. At times, effects of dc electric fields parallel to the magnetic field are evident in that: (1) within precipitation features, protons are decelerated by an amount of energy consistent with that which electrons gain and (2) electrons are sometimes aligned to within 3 deg (full width at half maximum) of the magnetic field. A maximum altitude for the accelerating region of several thousand kilometers is deduced from the narrow width of the pitch angle distribution and also from time-of-flight delays between the observation of accelerated electrons and decelerated protons.

  13. Parallel electric fields detected via conjugate electron echoes during the Echo 7 sounding rocket flight

    NASA Technical Reports Server (NTRS)

    Nemzek, R. J.; Winckler, J. R.

    1991-01-01

    Electron detectors on the Echo 7 active sounding rocket experiment measured 'conjugate echoes' resulting from artificial electron beam injections. Analysis of the drift motion of the electrons after a complete bounce leads to measurements of the magnetospheric convection electric field mapped to ionospheric altitudes. The magnetospheric field was highly variable, changing by tens of mV/m on time scales of as little as hundreds of milliseconds. While the smallest-scale magnetospheric field irregularities were mapped out by ionospheric conductivity, larger-scale features were enhanced by up to 50 mV/m in the ionosphere. The mismatch between magnetospheric and ionospheric convection fields indicates a violation of the equipotential field line condition. The parallel fields occurred in regions roughly 10 km across and probably supported a total potential drop of 10-100 V.

  14. The role of the electron convection term for the parallel electric field and electron acceleration in MHD simulations

    SciTech Connect

    Matsuda, K.; Terada, N.; Katoh, Y.; Misawa, H.

    2011-08-15

    The origin of the parallel electric field in the auroral acceleration region has long been of great concern in the framework of fluid equations. This paper proposes a new method to simulate magnetohydrodynamic (MHD) equations that include the electron convection term and demonstrates its efficiency with one-dimensional simulation results. We apply a third-order semi-discrete central scheme to investigate the characteristics of the electron convection term, including its nonlinearity. At a steady-state discontinuity, the sum of the ion and electron convection terms balances with the ion pressure gradient. We find that the electron convection term works like the gradient of a negative pressure and reduces the ion sound speed, or amplifies the sound mode, when parallel current flows. The electron convection term enables us to describe a situation in which a parallel electric field and parallel electron acceleration coexist, which is impossible for ideal or resistive MHD.

  15. Velocity profiles of electric-field-induced backflows in liquid crystals confined between parallel plates

    NASA Astrophysics Data System (ADS)

    Tsuji, Tomohiro; Chono, Shigeomi; Matsumi, Takanori

    2015-02-01

    For the purpose of developing liquid crystalline microactuators, we visualize backflows induced between two parallel plates for various parameters such as the twist angle, cell gap, applied voltage, and molecular configuration mode. We use 4-cyano-4'-pentyl biphenyl, a typical low-molar-mass nematic liquid crystal. By increasing the twist angle from 0° to 180°, the velocity component parallel to the anchoring direction of the lower plate changes from an S-shaped profile to a distorted S-shaped profile before finally becoming unidirectional. In contrast, the velocity component perpendicular to the anchoring direction evolves from a flat profile at 0° into an S-shaped profile at 180°. Because both an increase in the applied voltage and a decrease in the cell gap increase the electric field intensity, the backflow becomes larger. The hybrid molecular configuration mode induces a larger backflow than the planar aligned mode. The backflow develops in two stages: an early stage with a microsecond time scale and a later stage with a millisecond time scale. The numerical predictions are in qualitative, but not quantitative, agreement with the measurements because our computation ignores the plate-edge effect of surface tension.

  16. Magnetospheric Multiscale Satellites Observations of Parallel Electric Fields Associated with Magnetic Reconnection.

    PubMed

    Ergun, R E; Goodrich, K A; Wilder, F D; Holmes, J C; Stawarz, J E; Eriksson, S; Sturner, A P; Malaspina, D M; Usanova, M E; Torbert, R B; Lindqvist, P-A; Khotyaintsev, Y; Burch, J L; Strangeway, R J; Russell, C T; Pollock, C J; Giles, B L; Hesse, M; Chen, L J; Lapenta, G; Goldman, M V; Newman, D L; Schwartz, S J; Eastwood, J P; Phan, T D; Mozer, F S; Drake, J; Shay, M A; Cassak, P A; Nakamura, R; Marklund, G

    2016-06-10

    We report observations from the Magnetospheric Multiscale satellites of parallel electric fields (E_{∥}) associated with magnetic reconnection in the subsolar region of the Earth's magnetopause. E_{∥} events near the electron diffusion region have amplitudes on the order of 100 mV/m, which are significantly larger than those predicted for an antiparallel reconnection electric field. This Letter addresses specific types of E_{∥} events, which appear as large-amplitude, near-unipolar spikes that are associated with tangled, reconnected magnetic fields. These E_{∥} events are primarily in or near a current layer near the separatrix and are interpreted to be double layers that may be responsible for secondary reconnection in tangled magnetic fields or flux ropes. These results are telling of the three-dimensional nature of magnetopause reconnection and indicate that magnetopause reconnection may often be patchy and/or drive turbulence along the separatrix that results in flux ropes and/or tangled magnetic fields. PMID:27341241

  17. Using Massive Parallel Sequencing for the Development, Validation, and Application of Population Genetics Markers in the Invasive Bivalve Zebra Mussel (Dreissena polymorpha)

    PubMed Central

    Peñarrubia, Luis; Sanz, Nuria; Pla, Carles; Vidal, Oriol; Viñas, Jordi

    2015-01-01

    The zebra mussel (Dreissena polymorpha, Pallas, 1771) is one of the most invasive species of freshwater bivalves, due to a combination of biological and anthropogenic factors. Once this species has been introduced to a new area, individuals form dense aggregations that are very difficult to remove, leading to many adverse socioeconomic and ecological consequences. In this study, we identified, tested, and validated a new set of polymorphic microsatellite loci (also known as SSRs, Simple Sequence Repeats) using a Massive Parallel Sequencing (MPS) platform. After several pruning steps, 93 SSRs could potentially be amplified. Out of these SSRs, 14 were polymorphic, producing a polymorphic yield of 15.05%. These 14 polymorphic microsatellites were fully validated in a first approximation of the genetic population structure of D. polymorpha in the Iberian Peninsula. Based on this polymorphic yield, we propose a criterion for establishing the number of SSRs that require validation in similar species, depending on the final use of the markers. These results could be used to optimize MPS approaches in the development of microsatellites as genetic markers, which would reduce the cost of this process. PMID:25780924
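
    As a rough illustration of how such a polymorphic-yield criterion might be applied (a sketch under assumptions, not the authors' published formula), the following estimates how many candidate SSRs would need validation to obtain a target number of polymorphic markers with a given confidence, treating each marker as an independent Bernoulli trial with the 15.05% yield reported above; the function name and the 95% confidence choice are hypothetical.

        import math

        def ssr_candidates_needed(target_polymorphic, yield_rate=0.1505, confidence=0.95):
            """Estimate how many candidate SSRs to validate so that, under a binomial
            model with per-marker success probability yield_rate, at least
            target_polymorphic markers turn out polymorphic with the given confidence.
            (Illustrative assumption, not the criterion published in the paper.)"""
            def p_at_least(n, k, p):
                return 1.0 - sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
                                 for i in range(k))
            n = math.ceil(target_polymorphic / yield_rate)   # start at the expectation
            while p_at_least(n, target_polymorphic, yield_rate) < confidence:
                n += 1
            return n

        # E.g., to end up with 14 polymorphic markers, as in this study:
        print(ssr_candidates_needed(14))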

  18. A digital 25 µm pixel-pitch uncooled amorphous silicon TEC-less VGA IRFPA with massive parallel Sigma-Delta-ADC readout

    NASA Astrophysics Data System (ADS)

    Weiler, Dirk; Russ, Marco; Würfel, Daniel; Lerch, Renee; Yang, Pin; Bauer, Jochen; Vogt, Holger

    2010-04-01

    This paper presents an advanced 640 × 480 (VGA) IRFPA based on uncooled microbolometers with a pixel-pitch of 25 μm developed by Fraunhofer-IMS. The IRFPA is designed for thermal imaging applications in the LWIR (8–14 μm) range with a full-frame frequency of 30 Hz and a high sensitivity with NETD < 100 mK @ f/1. A novel readout architecture which utilizes massively parallel on-chip Sigma-Delta-ADCs located under the microbolometer array results in a high-performance digital readout. Sigma-Delta-ADCs are inherently linear, and a high resolution of 16 bit can be obtained with a second-order Sigma-Delta-modulator followed by a third-order digital sinc-filter. In addition to several thousand Sigma-Delta-ADCs, the readout circuit consists of a configurable sequencer for controlling the readout clocking signals and a temperature sensor for measuring the temperature of the IRFPA. Since packaging is a significant part of an IRFPA's price, Fraunhofer-IMS uses a chip-scale package consisting of an IR-transparent window with antireflection coating and a soldering frame for maintaining the vacuum. The IRFPAs are completely fabricated at Fraunhofer-IMS on 8" CMOS wafers with an additional surface micromachining process. In this paper the architecture of the readout electronics, the packaging, and the electro-optical performance characterization are presented.
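
    The per-pixel conversion principle, oversampled Sigma-Delta modulation followed by digital decimation, can be illustrated with a toy model. The sketch below implements a simple discrete-time second-order modulator with a plain moving-average decimator; the loop gains, oversampling ratio, and input level are illustrative assumptions, and a real readout would use the third-order sinc decimation filter described above.

        import numpy as np

        def sigma_delta_2nd_order(x):
            """Toy discrete-time second-order Sigma-Delta modulator (Boser-Wooley
            style, integrator gains 0.5); x is the oversampled input in [-1, 1].
            Returns the 1-bit (+/-1) output stream."""
            i1 = i2 = 0.0
            bits = np.empty_like(x)
            for n, u in enumerate(x):
                v = 1.0 if i2 >= 0.0 else -1.0   # 1-bit quantizer
                i1 += 0.5 * (u - v)              # first integrator
                i2 += 0.5 * (i1 - v)             # second integrator
                bits[n] = v
            return bits

        osr = 256                                # oversampling ratio (assumed)
        x = np.full(osr, 0.3)                    # constant input, e.g. one pixel value
        bits = sigma_delta_2nd_order(x)
        print(bits.mean())                       # moving-average decimation, close to 0.3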

  19. Towards Anatomic Scale Agent-Based Modeling with a Massively Parallel Spatially Explicit General-Purpose Model of Enteric Tissue (SEGMEnT_HPC)

    PubMed Central

    Cockrell, Robert Chase; Christley, Scott; Chang, Eugene; An, Gary

    2015-01-01

    Perhaps the greatest challenge currently facing the biomedical research community is the ability to integrate highly detailed cellular and molecular mechanisms to represent clinical disease states as a pathway to engineer effective therapeutics. This is particularly evident in the representation of organ-level pathophysiology in terms of abnormal tissue structure, which, through histology, remains a mainstay in disease diagnosis and staging. As such, being able to generate anatomic scale simulations is a highly desirable goal. While computational limitations have previously constrained the size and scope of multi-scale computational models, advances in the capacity and availability of high-performance computing (HPC) resources have greatly expanded the ability of computational models of biological systems to achieve anatomic, clinically relevant scale. Diseases of the intestinal tract are prime examples of pathophysiological processes that manifest at multiple scales of spatial resolution, with structural abnormalities present at the microscopic, macroscopic, and organ levels. In this paper, we describe a novel, massively parallel computational model of the gut, the Spatially Explicit General-purpose Model of Enteric Tissue_HPC (SEGMEnT_HPC), which extends an existing model of the gut epithelium, SEGMEnT, in order to create cell-for-cell anatomic scale simulations. We present an example implementation of SEGMEnT_HPC that simulates the pathogenesis of ileal pouchitis, an important clinical entity that affects patients following remedial surgery for ulcerative colitis. PMID:25806784

  20. Helena, the hidden beauty: Resolving the most common West Eurasian mtDNA control region haplotype by massively parallel sequencing an Italian population sample.

    PubMed

    Bodner, Martin; Iuvaro, Alessandra; Strobl, Christina; Nagl, Simone; Huber, Gabriela; Pelotti, Susi; Pettener, Davide; Luiselli, Donata; Parson, Walther

    2015-03-01

    The analysis of mitochondrial (mt)DNA is a powerful tool in forensic genetics when nuclear markers fail to give results or maternal relatedness is investigated. The mtDNA control region (CR) contains highly condensed variation and is therefore routinely typed. Some samples exhibit an identical haplotype in this restricted range. Thus, they convey only weak evidence in forensic queries and limited phylogenetic information. However, a CR match does not imply that the mtDNA coding regions are also identical or that the samples belong to the same phylogenetic lineage. This is especially the case for the most frequent West Eurasian CR haplotype 263G 315.1C 16519C, which is observed in various clades within haplogroup H and occurs at a frequency of 3-4% in many European populations. In this study, we investigated the power of massively parallel complete mtGenome sequencing in 29 Italian samples displaying the most common West Eurasian CR haplotype - and found an unexpectedly high diversity. Twenty-eight different haplotypes falling into 19 described sub-clades of haplogroup H were revealed in the samples with identical CR sequences. This study demonstrates the benefit of complete mtGenome sequencing for forensic applications to enforce maximum discrimination, more comprehensive heteroplasmy detection, as well as highest phylogenetic resolution. PMID:25303789

  1. Using Massive Parallel Sequencing for the development, validation, and application of population genetics markers in the invasive bivalve zebra mussel (Dreissena polymorpha).

    PubMed

    Peñarrubia, Luis; Sanz, Nuria; Pla, Carles; Vidal, Oriol; Viñas, Jordi

    2015-01-01

    The zebra mussel (Dreissena polymorpha, Pallas, 1771) is one of the most invasive species of freshwater bivalves, due to a combination of biological and anthropogenic factors. Once this species has been introduced to a new area, individuals form dense aggregations that are very difficult to remove, leading to many adverse socioeconomic and ecological consequences. In this study, we identified, tested, and validated a new set of polymorphic microsatellite loci (also known as SSRs, Simple Sequence Repeats) using a Massive Parallel Sequencing (MPS) platform. After several pruning steps, 93 SSRs could potentially be amplified. Out of these SSRs, 14 were polymorphic, producing a polymorphic yield of 15.05%. These 14 polymorphic microsatellites were fully validated in a first approximation of the genetic population structure of D. polymorpha in the Iberian Peninsula. Based on this polymorphic yield, we propose a criterion for establishing the number of SSRs that require validation in similar species, depending on the final use of the markers. These results could be used to optimize MPS approaches in the development of microsatellites as genetic markers, which would reduce the cost of this process. PMID:25780924

  2. Performance of the UCAN2 Gyrokinetic Particle In Cell (PIC) Code on Two Massively Parallel Mainframes with Intel ``Sandy Bridge'' Processors

    NASA Astrophysics Data System (ADS)

    Leboeuf, Jean-Noel; Decyk, Viktor; Newman, David; Sanchez, Raul

    2013-10-01

    The massively parallel, 2D domain-decomposed, nonlinear, 3D, toroidal, electrostatic, gyrokinetic, Particle in Cell (PIC), Cartesian geometry UCAN2 code, with particle ions and adiabatic electrons, has been ported to two emerging mainframes. These two computers, Edison, built by Cray at NERSC in the US, and MareNostrum III (MNIII), built by IBM at the Barcelona Supercomputing Center (BSC) in Spain, share the same Intel ``Sandy Bridge'' processors. The successful port of UCAN2 to MNIII, which came online first, enabled us to be up and running efficiently in record time on Edison. Overall, the performance of UCAN2 on Edison is superior to that on MNIII, particularly at large numbers of processors (>1024) for the same Intel IFORT compiler. This appears to be due to different MPI modules (OpenMPI on MNIII and MPICH2 on Edison) and different interconnection networks (Infiniband on MNIII and Cray's Aries on Edison) on the two mainframes. Details of these ports and comparative benchmarks are presented. Work supported by OFES, USDOE, under contract no. DE-FG02-04ER54741 with the University of Alaska at Fairbanks.

  3. Massively parallel multiple interacting continua formulation for modeling flow in fractured porous media using the subsurface reactive flow and transport code PFLOTRAN

    NASA Astrophysics Data System (ADS)

    Kumar, J.; Mills, R. T.; Lichtner, P. C.; Hammond, G. E.

    2010-12-01

    Fracture-dominated flows occur in numerous subsurface geochemical processes and at many different scales in rock pore structures, micro-fractures, fracture networks, and faults. Fractured porous media can be modeled as multiple interacting continua which are connected to each other through transfer terms that capture the flow of mass and energy in response to pressure, temperature, and concentration gradients. However, the analysis of large-scale transient problems using the multiple interacting continuum approach presents an algorithmic and computational challenge for problems with very large numbers of degrees of freedom. A generalized dual porosity model based on the Dual Continuum Disconnected Matrix approach has been implemented within PFLOTRAN, a massively parallel multiphysics-multicomponent-multiphase subsurface reactive flow and transport code. Developed as part of the Department of Energy's SciDAC-2 program, PFLOTRAN provides subsurface simulation capabilities that can scale from laptops to ultrascale supercomputers, and utilizes the PETSc framework to solve the large, sparse algebraic systems that arise in complex subsurface reactive flow and transport problems. It has been successfully applied to the solution of problems composed of more than two billion degrees of freedom, utilizing up to 131,072 processor cores on Jaguar, the Cray XT5 system at Oak Ridge National Laboratory that was at the time the world's fastest supercomputer. Building upon the capabilities and computational efficiency of PFLOTRAN, we will present an implementation of the multiple interacting continua formulation for fractured porous media along with an application case study.

  4. Module Fourteen: Parallel AC Resistive-Reactive Circuits; Basic Electricity and Electronics Individualized Learning System.

    ERIC Educational Resources Information Center

    Bureau of Naval Personnel, Washington, DC.

    In this module the student will learn about parallel RL (resistive-inductance), RC (resistive-capacitive), and RCL (resistive-capacitive-inductance) circuits and the conditions that exist at resonance. The module is divided into six lessons: solving for quantities in RL parallel circuits; variational analysis of RL parallel circuits; parallel RC…

  5. Development of ballistic hot electron emitter and its applications to parallel processing: active-matrix massive direct-write lithography in vacuum and thin films deposition in solutions

    NASA Astrophysics Data System (ADS)

    Koshida, N.; Kojima, A.; Ikegami, N.; Suda, R.; Yagi, M.; Shirakashi, J.; Yoshida, T.; Miyaguchi, H.; Muroyama, M.; Nishino, H.; Yoshida, S.; Sugata, M.; Totsu, K.; Esashi, M.

    2015-03-01

    Making the best use of the characteristic features of the nanocrystalline Si (nc-Si) ballistic hot electron source, an alternative lithographic technology is presented based on two approaches: physical excitation in vacuum and chemical reduction in solutions. The nc-Si cold cathode is a kind of metal-insulator-semiconductor (MIS) diode, composed of a thin metal film, an nc-Si layer, an n+-Si substrate, and an ohmic back contact. Under a biased condition, energetic electrons are uniformly and directionally emitted through the thin surface electrodes. In vacuum, this emitter is available for active-matrix-driven massively parallel lithography. Arrayed 100×100 emitters (each size: 10×10 μm2, pitch: 100 μm) are fabricated on a silicon substrate by a conventional planar process, and then every emitter is bonded with an integrated complementary metal-oxide-semiconductor (CMOS) driver using through-silicon-via (TSV) interconnect technology. Electron multi-beams emitted from selected devices are focused by a micro-electro-mechanical system (MEMS) condenser lens array and introduced into an accelerating system with a demagnification factor of 100. The electron accelerating voltage is 5 kV. The designed size of each beam landing on the target is 10×10 nm2. Here we discuss the fabrication process of the emitter array with TSV holes, implementation of the integrated active-matrix driver circuit, the bonding of these components, the construction of the electron optics, and the overall operation in the exposure system including the correction of possible aberrations. The experimental results of this mask-less parallel pattern transfer are shown in terms of simple 1:1 projection and parallel lithography under an active-matrix drive scheme. Another application is the use of this emitter as an active electrode supplying highly reducing electrons into solutions. A very small amount of metal-salt solution is dripped onto the nc-Si emitter surface, and the emitter is driven without

  6. Significant Association between Sulfate-Reducing Bacteria and Uranium-Reducing Microbial Communities as Revealed by a Combined Massively Parallel Sequencing-Indicator Species Approach

    SciTech Connect

    Cardenas, Erick; Leigh, Mary Beth; Marsh, Terence; Tiedje, James M.; Wu, Wei-min; Luo, Jian; Ginder-Vogel, Matthew; Kitanidis, Peter K.; Criddle, Craig; Carley, Jack M; Carroll, Sue L; Gentry, Terry J; Watson, David B; Gu, Baohua; Jardine, Philip M; Zhou, Jizhong

    2010-10-01

    Massively parallel sequencing has provided a more affordable and high-throughput method to study microbial communities, although it has mostly been used in an exploratory fashion. We combined pyrosequencing with a strict indicator species statistical analysis to test if bacteria specifically responded to ethanol injection that successfully promoted dissimilatory uranium(VI) reduction in the subsurface of a uranium contamination plume at the Oak Ridge Field Research Center in Tennessee. Remediation was achieved with a hydraulic flow control consisting of an inner loop, where ethanol was injected, and an outer loop for flow-field protection. This strategy reduced uranium concentrations in groundwater to levels below 0.126 μM and created geochemical gradients in electron donors from the inner-loop injection well toward the outer loop and downgradient flow path. Our analysis with 15 sediment samples from the entire test area found significant indicator species that showed a high degree of adaptation to the three different hydrochemical-created conditions. Castellaniella and Rhodanobacter characterized areas with low pH, heavy metals, and low bioactivity, while sulfate-, Fe(III)-, and U(VI)-reducing bacteria (Desulfovibrio, Anaeromyxobacter, and Desulfosporosinus) were indicators of areas where U(VI) reduction occurred. The abundance of these bacteria, as well as the Fe(III) and U(VI) reducer Geobacter, correlated with the hydraulic connectivity to the substrate injection site, suggesting that the selected populations were a direct response to electron donor addition by the groundwater flow path. A false-discovery-rate approach was implemented to discard false-positive results by chance, given the large amount of data compared.
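
    The abstract does not spell out the indicator species statistic; a common choice is the Dufrêne-Legendre indicator value, which multiplies a taxon's relative mean abundance in a group of sites (specificity) by its occurrence frequency there (fidelity). A minimal sketch under that assumption follows; the permutation significance test and the false-discovery-rate correction mentioned above are omitted for brevity, and the data are invented.

        import numpy as np

        def indval(abundance, groups):
            """Dufrene-Legendre indicator value of one taxon for each site group.
            abundance: per-site counts (1-D array); groups: group label per site.
            Returns {group: IndVal in [0, 100]}."""
            out = {}
            labels = np.unique(groups)
            mean_by_group = {g: abundance[groups == g].mean() for g in labels}
            total = sum(mean_by_group.values())
            for g in labels:
                a = mean_by_group[g] / total if total > 0 else 0.0  # specificity
                b = np.mean(abundance[groups == g] > 0)             # fidelity
                out[g] = 100.0 * a * b
            return out

        sites = np.array(["lowpH"] * 5 + ["reducing"] * 5 + ["control"] * 5)
        counts = np.array([40, 22, 35, 51, 18, 2, 0, 1, 0, 3, 0, 0, 1, 0, 0])
        print(indval(counts, sites))   # this invented taxon indicates the 'lowpH' group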

  7. Significant association between sulfate-reducing bacteria and uranium-reducing microbial communities as revealed by a combined massively parallel sequencing-indicator species approach.

    PubMed

    Cardenas, Erick; Wu, Wei-Min; Leigh, Mary Beth; Carley, Jack; Carroll, Sue; Gentry, Terry; Luo, Jian; Watson, David; Gu, Baohua; Ginder-Vogel, Matthew; Kitanidis, Peter K; Jardine, Philip M; Zhou, Jizhong; Criddle, Craig S; Marsh, Terence L; Tiedje, James M

    2010-10-01

    Massively parallel sequencing has provided a more affordable and high-throughput method to study microbial communities, although it has mostly been used in an exploratory fashion. We combined pyrosequencing with a strict indicator species statistical analysis to test if bacteria specifically responded to ethanol injection that successfully promoted dissimilatory uranium(VI) reduction in the subsurface of a uranium contamination plume at the Oak Ridge Field Research Center in Tennessee. Remediation was achieved with a hydraulic flow control consisting of an inner loop, where ethanol was injected, and an outer loop for flow-field protection. This strategy reduced uranium concentrations in groundwater to levels below 0.126 μM and created geochemical gradients in electron donors from the inner-loop injection well toward the outer loop and downgradient flow path. Our analysis with 15 sediment samples from the entire test area found significant indicator species that showed a high degree of adaptation to the three different hydrochemical-created conditions. Castellaniella and Rhodanobacter characterized areas with low pH, heavy metals, and low bioactivity, while sulfate-, Fe(III)-, and U(VI)-reducing bacteria (Desulfovibrio, Anaeromyxobacter, and Desulfosporosinus) were indicators of areas where U(VI) reduction occurred. The abundance of these bacteria, as well as the Fe(III) and U(VI) reducer Geobacter, correlated with the hydraulic connectivity to the substrate injection site, suggesting that the selected populations were a direct response to electron donor addition by the groundwater flow path. A false-discovery-rate approach was implemented to discard false-positive results by chance, given the large amount of data compared. PMID:20729318

  8. Quaternary Morphodynamics of Fluvial Dispersal Systems Revealed: The Fly River, PNG, and the Sunda Shelf, SE Asia, simulated with the Massively Parallel GPU-based Model 'GULLEM'

    NASA Astrophysics Data System (ADS)

    Aalto, R. E.; Lauer, J. W.; Darby, S. E.; Best, J.; Dietrich, W. E.

    2015-12-01

    During glacial-marine transgressions vast volumes of sediment are deposited due to the infilling of lowland fluvial systems and shallow shelves, material that is removed during ensuing regressions. Modelling these processes would illuminate system morphodynamics, fluxes, and 'complexity' in response to base level change, yet such problems are computationally formidable. Environmental systems are characterized by strong interconnectivity, yet traditional supercomputers have slow inter-node communication -- whereas rapidly advancing Graphics Processing Unit (GPU) technology offers vastly higher (>100x) bandwidths. GULLEM (GpU-accelerated Lowland Landscape Evolution Model) employs massively parallel code to simulate coupled fluvial-landscape evolution for complex lowland river systems over large temporal and spatial scales. GULLEM models the accommodation space carved/infilled by representing a range of geomorphic processes, including: river & tributary incision within a multi-directional flow regime, non-linear diffusion, glacial-isostatic flexure, hydraulic geometry, tectonic deformation, sediment production, transport & deposition, and full 3D tracking of all resulting stratigraphy. Model results concur with the Holocene dynamics of the Fly River, PNG -- as documented with dated cores, sonar imaging of floodbasin stratigraphy, and the observations of topographic remnants from LGM conditions. Other supporting research was conducted along the Mekong River, the largest fluvial system of the Sunda Shelf. These and other field data provide tantalizing empirical glimpses into the lowland landscapes of large rivers during glacial-interglacial transitions, observations that can be explored with this powerful numerical model. GULLEM affords estimates for the timing and flux budgets within the Fly and Sunda Systems, illustrating complex internal system responses to the external forcing of sea level and climate. Furthermore, GULLEM can be applied to most ANY fluvial system to
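
    Of the processes listed, hillslope diffusion is the simplest to sketch. The following is a minimal explicit finite-difference step for linear diffusion of an elevation grid, a much-simplified CPU stand-in for a GPU landscape-evolution kernel such as GULLEM's; the diffusivity, grid, periodic boundaries, and time step are illustrative assumptions.

        import numpy as np

        def diffuse(z, D=0.01, dx=1.0, dt=None):
            """One explicit step of linear hillslope diffusion dz/dt = D * laplacian(z),
            with periodic boundaries via np.roll (an illustrative simplification)."""
            if dt is None:
                dt = 0.2 * dx * dx / D           # well inside the explicit stability limit
            lap = (np.roll(z, 1, 0) + np.roll(z, -1, 0) +
                   np.roll(z, 1, 1) + np.roll(z, -1, 1) - 4.0 * z) / dx**2
            return z + dt * D * lap

        z = np.random.default_rng(0).random((64, 64))   # synthetic topography
        for _ in range(100):
            z = diffuse(z)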

  9. Uncooled digital IRFPA-family with 17μm pixel-pitch based on amorphous silicon with massively parallel Sigma-Delta-ADC readout

    NASA Astrophysics Data System (ADS)

    Weiler, D.; Hochschulz, F.; Würfel, D.; Lerch, R.; Geruschke, T.; Wall, S.; Heß, J.; Wang, Q.; Vogt, H.

    2014-06-01

    This paper presents the results of an advanced digital IRFPA-family developed by Fraunhofer IMS. The IRFPA-family comprises two different optical resolutions, VGA (640 × 480 pixels) and QVGA (320 × 240 pixels), using a pin-compatible detector board. The uncooled IRFPAs are designed for thermal imaging applications in the LWIR (8–14 μm) range with a full-frame frequency of 30 Hz and a high thermal sensitivity. The microbolometer with a pixel-pitch of 17 μm consists of amorphous silicon as the sensing layer. By scaling and optimizing our previous microbolometer technology with a pixel-pitch of 25 μm we enhance the thermal sensitivity of the microbolometer. The microbolometers are read out by a novel readout architecture which utilizes massively parallel on-chip Sigma-Delta-ADCs. This results in a direct digital conversion of the resistance change of the microbolometer induced by incident infrared radiation. To reduce production costs a chip-scale-package is used as the vacuum package. This vacuum package consists of an IR-transparent window with an antireflection coating and a soldering frame which is fixed by a wafer-to-chip process directly on top of the CMOS-substrate. The chip-scale-package is placed onto a detector board by a chip-on-board technique. The IRFPAs are completely fabricated at Fraunhofer IMS on 8" CMOS wafers with an additional surface micromachining process. In this paper the architecture of the readout electronics, the packaging, and the electro-optical performance characterization are presented.

  10. Influences of diurnal sampling bias on fixed-point monitoring of plankton biodiversity determined using a massively parallel sequencing-based technique.

    PubMed

    Nagai, Satoshi; Hida, Kohsuke; Urushizaki, Shingo; Onitsuka, Goh; Yasuike, Motoshige; Nakamura, Yoji; Fujiwara, Atushi; Tajimi, Seisuke; Kimoto, Katsunori; Kobayashi, Takanori; Gojobori, Takashi; Ototake, Mitsuru

    2016-02-01

    In this study, we investigated the influence of diurnal sampling bias on the community structure of plankton by comparing the biodiversity among seawater samples (n=9) obtained every 3 h for 24 h using massively parallel sequencing (MPS)-based plankton monitoring at a fixed point at Himedo seaport in the Yatsushiro Sea, Japan. The number of raw operational taxonomic units (OTUs) and OTUs after re-sampling was 507-658 (558 ± 104, mean ± standard deviation) and 448-544 (467 ± 81), respectively, indicating high plankton biodiversity at the sampling location. The relative abundance of the top 20 OTUs in the samples from Himedo seaport was 48.8-67.7% (58.0 ± 5.8%), and the highest-ranked OTU was a Pseudo-nitzschia species (Bacillariophyta) with a relative abundance of 17.3-39.2%, followed by Oithona sp. 1 and Oithona sp. 2 (Arthropoda). During seawater sampling, the semidiurnal tidal current with an amplitude of 0.3 m s(-1) was dominant, and a westward residual current driven by the northeasterly wind was continuously observed during the 24-h monitoring. The relative abundance of plankton species therefore fluctuated somewhat among the samples, but no significant difference was noted according to a G-test (p>0.05). Significant differences were observed between these samples and samples obtained from a different locality (Kusuura in the Yatsushiro Sea) and at different dates, suggesting that the influence of diurnal sampling bias on plankton diversity, as determined using the MPS-based survey, was not significant and is acceptable. PMID:26475937
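
    The G-test used above is the log-likelihood-ratio analogue of the chi-squared test of independence; scipy exposes it through power_divergence. A minimal sketch with invented counts for one OTU at two sampling times (the numbers are hypothetical, not the study's data):

        import numpy as np
        from scipy.stats import power_divergence

        observed = np.array([[300, 700],   # time 1: reads of the OTU, all other reads
                             [330, 670]])  # time 2
        row = observed.sum(axis=1, keepdims=True)
        col = observed.sum(axis=0, keepdims=True)
        expected = row * col / observed.sum()       # expected counts under independence
        g, p = power_divergence(observed.ravel(), f_exp=expected.ravel(),
                                ddof=2, lambda_="log-likelihood")  # 1 degree of freedom
        print(g, p)   # here p > 0.05: no significant difference between the two times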

  11. A combined massively parallel sequencing indicator species approach revealed significant association between sulfate-reducing bacteria and uranium-reducing microbial communities

    SciTech Connect

    Cardenas, Erick; Wu, Wei-min; Leigh, Mary Beth; Carley, Jack M; Carroll, Sue L; Gentry, Terry; Luo, Jian; Watson, David B; Gu, Baohua; Ginder-Vogel, Matthew A.; Kitanidis, Peter K.; Jardine, Philip; Kelly, Shelly D; Zhou, Jizhong; Criddle, Craig; Marsh, Terence; Tiedje, James

    2010-08-01

    Massively parallel sequencing has provided a more affordable and high-throughput method to study microbial communities, although it has been mostly used in an exploratory fashion. We combined pyrosequencing with a strict indicator species statistical analysis to test if bacteria specifically responded to ethanol injection that successfully promoted dissimilatory uranium (VI) reduction in the subsurface of a uranium contamination plume at the Oak Ridge Field Research Center in Tennessee, USA. Remediation was achieved with a hydraulic flow control consisting of an inner loop, where ethanol was injected, and an outer loop for flow field protection. This strategy reduced uranium concentrations in groundwater to levels below 0.126 {micro}M, and created geochemical gradients in electron donors from the inner loop injection well towards the outer loop and down-gradient flow path. Our analysis with 15 sediment samples from the entire test area found significant indicator species that showed a high degree of adaptation to the three different hydrochemically created conditions. Castellaniella and Rhodanobacter characterized areas with low pH, heavy metals, and low bioactivity, while sulfate-, Fe(III)-, and U(VI)-reducing bacteria (Desulfovibrio, Anaeromyxobacter, and Desulfosporosinus) were indicators of areas where U(VI) reduction occurred. The abundance of these bacteria, as well as that of the Fe(III)- and U(VI)-reducer Geobacter, correlated with the hydraulic connectivity to the substrate injection site, suggesting that the selected populations were a direct response to electron donor addition by the groundwater flow path. A false-discovery-rate approach was implemented to discard false positives arising by chance, given the large amount of data compared.

  12. Transition to unstable ion flow in parallel electric fields. [in ionosphere

    NASA Technical Reports Server (NTRS)

    Bergmann, R.; Lotko, W.

    1986-01-01

    The stability of ionospheric O(+)-H(+) outflows accelerated by a nonambipolar parallel electric field is considered under conditions where the ion motion initially develops adiabatically and the ambient plasma is vertically stratified with an effective temperature that increases with altitude. Such conditions are expected near the bottom of the auroral acceleration region where ion and electron streaming instabilities first develop. It is shown for a particular equilibrium profile that the differentially accelerated ion flows become unstable within about 100 km from their entry point in the acceleration region. At O(+)/H(+) density ratios less than about 9, the instability is dominated by a violent H(+)-O(+) two-stream interaction which couples the O(+) and H(+) acoustic modes, and which mediates a transition to nonadiabatic acceleration. At higher altitudes and/or larger O(+)/H(+) density ratios, a much weaker resonant instability exists, which is driven by the relative drift between electrons and O(+) or H(+) ions. The results suggest that the H(+)-O(+) two-stream instability may be a viable mechanism for heating upflowing auroral ions.

  13. A fast parallel solver for the forward problem in electrical impedance tomography.

    PubMed

    Jehl, Markus; Dedner, Andreas; Betcke, Timo; Aristovich, Kirill; Klöfkorn, Robert; Holder, David

    2015-01-01

    Electrical impedance tomography (EIT) is a noninvasive imaging modality, where imperceptible currents are applied to the skin and the resulting surface voltages are measured. It has the potential to distinguish between ischaemic and haemorrhagic stroke with a portable and inexpensive device. The image reconstruction relies on an accurate forward model of the experimental setup. Because of the relatively small signal in stroke EIT, the finite-element modeling requires meshes of more than 10 million elements. To study the requirements of the forward modeling in EIT, and also to reduce the time for experimental image acquisition, it is necessary to reduce the run time of the forward computation. We show the implementation of a parallel forward solver for EIT using the Dune-Fem C++ library and demonstrate its performance on many CPUs of a computer cluster. For a typical EIT application, a direct solver was significantly slower and not an alternative to iterative solvers with multigrid preconditioning. With this new solver, we can compute the forward solutions and the Jacobian matrix of a typical EIT application with 30 electrodes on a 15-million element mesh in less than 15 min. This makes it a valuable tool for simulation studies and EIT applications with high precision requirements. It is freely available for download. PMID:25069109
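
    After finite-element discretisation, the forward problem amounts to solving a large sparse symmetric positive-definite system once per current-injection pattern. The sketch below shows that structure with scipy's conjugate gradient and a simple Jacobi preconditioner; it is a stand-in for the Dune-Fem solver with multigrid preconditioning described above, and the 2-D Laplacian system matrix and injection pattern are synthetic placeholders.

        import numpy as np
        import scipy.sparse as sp
        from scipy.sparse.linalg import cg, LinearOperator

        n = 200                                   # grid side; real stroke-EIT meshes are ~1e7 elements
        I = sp.identity(n)
        T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
        A = (sp.kron(I, T) + sp.kron(T, I)).tocsr()   # SPD 2-D Laplacian as a stand-in
        b = np.zeros(n * n)
        b[0], b[-1] = 1.0, -1.0                   # unit current injected and extracted

        d = A.diagonal()
        M = LinearOperator(A.shape, matvec=lambda r: r / d)   # Jacobi preconditioner
        x, info = cg(A, b, M=M, atol=1e-8)
        print(info)                               # 0 means the iteration converged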

  14. Neural network control of a parallel hybrid-electric propulsion system for a small unmanned aerial vehicle

    NASA Astrophysics Data System (ADS)

    Harmon, Frederick G.

    2005-11-01

    Parallel hybrid-electric propulsion systems would be beneficial for small unmanned aerial vehicles (UAVs) used for military, homeland security, and disaster-monitoring missions. The benefits, due to the hybrid and electric-only modes, include increased time-on-station and greater range compared to electric-powered UAVs, as well as stealth modes not available with gasoline-powered UAVs. This dissertation contributes to the research fields of small unmanned aerial vehicles, hybrid-electric propulsion system control, and intelligent control. A conceptual design of a small UAV with a parallel hybrid-electric propulsion system is provided. The UAV is intended for intelligence, surveillance, and reconnaissance (ISR) missions. A conceptual design reveals the trade-offs that must be considered to take advantage of the hybrid-electric propulsion system. The resulting hybrid-electric propulsion system is a two-point design that includes an engine primarily sized for cruise speed and an electric motor and battery pack that are primarily sized for a slower endurance speed. The electric motor provides additional power for take-off, climbing, and acceleration and also serves as a generator during charge-sustaining operation or regeneration. The intelligent control of the hybrid-electric propulsion system is based on an instantaneous optimization algorithm that generates a hyper-plane from the nonlinear efficiency maps for the internal combustion engine, electric motor, and lithium-ion battery pack. The hyper-plane incorporates charge-depletion and charge-sustaining strategies. The optimization algorithm is flexible and allows the operator/user to assign relative importance between the use of gasoline, electricity, and recharging depending on the intended mission. A MATLAB/Simulink model was developed to test the control algorithms. The Cerebellar Model Arithmetic Computer (CMAC) associative memory neural network is applied to the control of the UAV's parallel hybrid-electric

  15. Parallel computation of optimized arrays for 2-D electrical imaging surveys

    NASA Astrophysics Data System (ADS)

    Loke, M. H.; Wilkinson, P. B.; Chambers, J. E.

    2010-12-01

    Modern automatic multi-electrode survey instruments have made it possible to use non-traditional arrays to maximize the subsurface resolution from electrical imaging surveys. Previous studies have shown that one of the best methods for generating optimized arrays is to select the set of array configurations that maximizes the model resolution for a homogeneous earth model. The Sherman-Morrison rank-1 update is used to calculate the change in the model resolution when a new array is added to a selected set of array configurations. This method had the disadvantage that it required several hours of computer time even for short 2-D survey lines. The algorithm was modified to calculate the change in the model resolution rather than the entire resolution matrix. This reduces the computer time and memory required as well as the computational round-off errors. The matrix-vector multiplications for a single add-on array were replaced with matrix-matrix multiplications for 28 add-on arrays to further reduce the computer time. The temporary variables were stored in the double-precision Single Instruction Multiple Data (SIMD) registers within the CPU to minimize computer memory access. A further reduction in the computer time is achieved by using the computer graphics card Graphics Processor Unit (GPU) as a highly parallel mathematical coprocessor. This makes it possible to carry out the calculations for 512 add-on arrays in parallel using the GPU. The changes reduce the computer time by more than two orders of magnitude. The algorithm used to generate an optimized data set adds a specified number of new array configurations after each iteration to the existing set. The resolution of the optimized data set can be increased by adding a smaller number of new array configurations after each iteration. Although this increases the computer time required to generate an optimized data set with the same number of data points, the new fast numerical routines have made this practical on
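
    The key numerical step, updating the model resolution when one candidate array is added, can be sketched with the Sherman-Morrison identity: if A = GᵀG (plus damping) and g is the candidate array's sensitivity row, then inv(A + g gᵀ) follows from inv(A) in O(m²) operations instead of a full refactorisation. The sizes, damping factor, and random sensitivities below are illustrative, not survey data.

        import numpy as np

        def add_array(Ainv, g):
            """Sherman-Morrison rank-1 update: returns inv(A + g g^T) given inv(A)."""
            Ag = Ainv @ g
            return Ainv - np.outer(Ag, Ag) / (1.0 + g @ Ag)

        m, n_sel = 500, 1000                      # model cells, already-selected arrays
        rng = np.random.default_rng(0)
        G = rng.normal(size=(n_sel, m))           # sensitivity (Jacobian) matrix
        A = G.T @ G + 0.01 * np.eye(m)            # damped normal matrix
        Ainv = np.linalg.inv(A)

        g = rng.normal(size=m)                    # candidate array's sensitivity row
        Ainv_new = add_array(Ainv, g)
        # Change in the resolution-matrix diagonal, used to rank candidate arrays:
        dres = np.diag(Ainv_new @ (G.T @ G + np.outer(g, g))) - np.diag(Ainv @ (G.T @ G))
        print(dres.sum())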

  16. The effect of the macrolide antibiotic tylosin on microbial diversity in the canine small intestine as demonstrated by massive parallel 16S rRNA gene sequencing

    PubMed Central

    2009-01-01

    Background Recent studies have shown that the fecal microbiota is generally resilient to short-term antibiotic administration, but some bacterial taxa may remain depressed for several months. Limited information is available about the effect of antimicrobials on small intestinal microbiota, an important contributor to gastrointestinal health. The antibiotic tylosin is often successfully used for the treatment of chronic diarrhea in dogs, but its exact mode of action and its effect on the intestinal microbiota remain unknown. The aim of this study was to evaluate the effect of tylosin on canine jejunal microbiota. Tylosin was administered at 20 to 22 mg/kg q 24 hr for 14 days to five healthy dogs, each with a pre-existing jejunal fistula. Jejunal brush samples were collected through the fistula on days 0, 14, and 28 (14 days after withdrawal of tylosin). Bacterial diversity was characterized using massive parallel 16S rRNA gene pyrosequencing. Results Pyrosequencing revealed a previously unrecognized species richness in the canine small intestine. Ten bacterial phyla were identified. Microbial populations were phylogenetically more similar during tylosin treatment. However, a remarkable inter-individual response was observed for specific taxa. Fusobacteria, Bacteroidales, and Moraxella tended to decrease. The proportions of Enterococcus-like organisms, Pasteurella spp., and Dietzia spp. increased significantly during tylosin administration (p < 0.05). The proportion of Escherichia coli-like organisms increased by day 28 (p = 0.04). These changes were not accompanied by any obvious clinical effects. On day 28, the phylogenetic composition of the microbiota was similar to day 0 in only 2 of 5 dogs. Bacterial diversity resembled the pre-treatment state in 3 of 5 dogs. Several bacterial taxa such as Spirochaetes, Streptomycetaceae, and Prevotellaceae failed to recover at day 28 (p < 0.05). Several bacterial groups considered to be sensitive to tylosin increased in their

  17. Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements.

    PubMed

    Parson, Walther; Ballard, David; Budowle, Bruce; Butler, John M; Gettings, Katherine B; Gill, Peter; Gusmão, Leonor; Hares, Douglas R; Irwin, Jodi A; King, Jonathan L; Knijff, Peter de; Morling, Niels; Prinz, Mechthild; Schneider, Peter M; Neste, Christophe Van; Willuweit, Sascha; Phillips, Christopher

    2016-05-01

    The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of a STR marker and variants that may reside in the flanking areas of the repeat region. When a STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need to produce a standardized framework for describing complex sequences that enables comparison with the currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order: (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement for data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data. This system must be easy to use and interpret by the DNA specialist, based on a universally accessible genome assembly, and in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale. While the established

  18. FAVR (Filtering and Annotation of Variants that are Rare): methods to facilitate the analysis of rare germline genetic variants from massively parallel sequencing datasets

    PubMed Central

    2013-01-01

    Background Characterising genetic diversity through the analysis of massively parallel sequencing (MPS) data offers enormous potential to significantly improve our understanding of the genetic basis for observed phenotypes, including predisposition to and progression of complex human disease. Great challenges remain in distinguishing genetic variants that are genuine from the millions of artefactual signals. Results FAVR is a suite of new methods designed to work with commonly used MPS analysis pipelines to assist in the resolution of some of the issues related to the analysis of the vast amount of resulting data, with a focus on relatively rare genetic variants. To the best of our knowledge, no equivalent method has previously been described. The most important and novel aspect of FAVR is the use of signatures in comparator sequence alignment files during variant filtering, and annotation of variants potentially shared between individuals. The FAVR methods use these signatures to facilitate filtering of (i) platform and/or mapping-specific artefacts, (ii) common genetic variants, and, where relevant, (iii) artefacts derived from imbalanced paired-end sequencing, as well as annotation of genetic variants based on evidence of co-occurrence in individuals. We applied conventional variant calling to whole-exome sequencing datasets, produced using both SOLiD and TruSeq chemistries, with or without downstream processing by FAVR methods. We demonstrate a 3-fold smaller rare single nucleotide variant shortlist with no detected reduction in sensitivity. This analysis included Sanger sequencing of rare variant signals not evident in dbSNP131, assessment of known variant signal preservation, and comparison of observed and expected rare variant numbers across a range of first cousin pairs. The principles described herein were applied in our recent publication identifying XRCC2 as a new breast cancer risk gene and have been made publicly available as a suite of software
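
    The core FAVR idea, filtering a sample's rare-variant shortlist against signals seen in comparator alignment files, reduces to set logic over variant calls. The following is a schematic reconstruction under simplified assumptions, not the published implementation: the variant representation, the evidence counting, and the threshold are all invented for illustration.

        def favr_like_filter(sample_variants, comparator_evidence, max_comparators=1):
            """Keep variants with little or no read support in comparator samples.
            sample_variants: set of (chrom, pos, ref, alt) tuples for the sample.
            comparator_evidence: dict mapping a variant to the number of comparator
            alignment files showing read support for it. Variants supported in more
            than max_comparators comparators are treated as common variants or
            platform artefacts and removed; the rest are annotated with the count."""
            kept, annotated = set(), {}
            for v in sample_variants:
                n = comparator_evidence.get(v, 0)
                if n <= max_comparators:
                    kept.add(v)
                    annotated[v] = n
            return kept, annotated

        calls = {("7", 152346, "C", "T"), ("17", 41245466, "G", "A")}
        evidence = {("7", 152346, "C", "T"): 5}    # supported in 5 comparators -> filtered
        kept, ann = favr_like_filter(calls, evidence)
        print(kept)                                # only the chr17 variant survives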

  19. ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets

    PubMed Central

    2014-01-01

    Background We recently described Hi-Plex, a highly multiplexed PCR-based target-enrichment system for massively parallel sequencing (MPS), which allows the uniform definition of library size so that subsequent paired-end sequencing can achieve complete overlap of read pairs. Variant calling from Hi-Plex-derived datasets can thus rely on the identification of variants appearing in both reads of read-pairs, permitting stringent filtering of sequencing chemistry-induced errors. These principles underlie the ROVER software (derived from Read Overlap PCR-MPS variant caller), which we have recently used to report the screening for genetic mutations in the breast cancer predisposition gene PALB2. Here, we describe the algorithms underlying ROVER and its usage. Results ROVER enables users to quickly and accurately identify genetic variants from PCR-targeted, overlapping paired-end MPS datasets. The open-source availability of the software and threshold tailorability enables broad access for a range of PCR-MPS users. Methods ROVER is implemented in Python and runs on all popular POSIX-like operating systems (Linux, OS X). The software accepts a tab-delimited text file listing the coordinates of the target-specific primers used for targeted enrichment based on a specified genome-build. It also accepts aligned sequence files resulting from mapping to the same genome-build. ROVER identifies the amplicon a given read-pair represents and removes the primer sequences by using the mapping co-ordinates and primer co-ordinates. It considers overlapping read-pairs with respect to the primer-intervening sequence. Only when a variant is observed in both reads of a read-pair does the signal contribute to a tally of read-pairs containing or not containing the variant. A user-defined threshold informs the minimum number of, and proportion of, read-pairs a variant must be observed in for a ‘call’ to be made. ROVER also reports the depth of coverage across amplicons to facilitate the
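
    The calling rule described above, a variant counts only when both reads of an overlapping pair contain it and it then passes absolute and proportional thresholds, can be sketched directly. This is a schematic of the logic rather than ROVER's actual code; the read-pair representation and the default thresholds are assumptions.

        from collections import Counter

        def rover_like_call(read_pairs, min_pairs=2, min_proportion=0.05):
            """read_pairs: iterable of (variants_in_read1, variants_in_read2) for one
            amplicon, each a set of (pos, ref, alt) tuples. A variant is tallied only
            when it appears in BOTH reads of a pair, then thresholds are applied."""
            support = Counter()
            total_pairs = 0
            for r1, r2 in read_pairs:
                total_pairs += 1
                for v in r1 & r2:                  # must be seen in both reads
                    support[v] += 1
            return [v for v, n in support.items()
                    if n >= min_pairs and n / total_pairs >= min_proportion]

        pairs = [({(1001, "A", "G")}, {(1001, "A", "G")}),   # true variant, both reads
                 ({(1002, "C", "T")}, set()),                # chemistry error, one read only
                 ({(1001, "A", "G")}, {(1001, "A", "G")})]
        print(rover_like_call(pairs))              # [(1001, 'A', 'G')]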

  20. Massively parallel sequencing of phyllodes tumours of the breast reveals actionable mutations, and TERT promoter hotspot mutations and TERT gene amplification as likely drivers of progression.

    PubMed

    Piscuoglio, Salvatore; Ng, Charlotte Ky; Murray, Melissa; Burke, Kathleen A; Edelweiss, Marcia; Geyer, Felipe C; Macedo, Gabriel S; Inagaki, Akiko; Papanastasiou, Anastasios D; Martelotto, Luciano G; Marchio, Caterina; Lim, Raymond S; Ioris, Rafael A; Nahar, Pooja K; Bruijn, Ino De; Smyth, Lillian; Akram, Muzaffar; Ross, Dara; Petrini, John H; Norton, Larry; Solit, David B; Baselga, Jose; Brogi, Edi; Ladanyi, Marc; Weigelt, Britta; Reis-Filho, Jorge S

    2016-03-01

    Phyllodes tumours (PTs) are breast fibroepithelial lesions that are graded based on histological criteria as benign, borderline or malignant. PTs may recur locally. Borderline PTs and malignant PTs may metastasize to distant sites. Breast fibroepithelial lesions, including PTs and fibroadenomas, are characterized by recurrent MED12 exon 2 somatic mutations. We sought to define the repertoire of somatic genetic alterations in PTs and whether these may assist in the differential diagnosis of these lesions. We collected 100 fibroadenomas, 40 benign PTs, 14 borderline PTs and 22 malignant PTs; six, six and 13 benign, borderline and malignant PTs, respectively, and their matched normal tissue, were subjected to targeted massively parallel sequencing (MPS) using the MSK-IMPACT sequencing assay. Recurrent MED12 mutations were found in 56% of PTs; in addition, mutations affecting cancer genes (eg TP53, RB1, SETD2 and EGFR) were exclusively detected in borderline and malignant PTs. We found a novel recurrent clonal hotspot mutation in the TERT promoter (-124 C>T) in 52% and TERT gene amplification in 4% of PTs. Laser capture microdissection revealed that these mutations were restricted to the mesenchymal component of PTs. Sequencing analysis of the entire cohort revealed that the frequency of TERT alterations increased from benign (18%) to borderline (57%) and to malignant PTs (68%; p < 0.01), and TERT alterations were associated with increased levels of TERT mRNA (p < 0.001). No TERT alterations were observed in fibroadenomas. An analysis of TERT promoter sequencing and gene amplification distinguished PTs from fibroadenomas with a specificity and a positive predictive value of 100% (CI 95.38-100%) and 100% (CI 85.86-100%), respectively, and a sensitivity and a negative predictive value of 39% (CI 28.65-51.36%) and 68% (CI 60.21-75.78%), respectively. Our results suggest that TERT alterations may drive the progression of PTs, and may assist in the differential diagnosis

  1. Module Six: Parallel Circuits; Basic Electricity and Electronics Individualized Learning System.

    ERIC Educational Resources Information Center

    Bureau of Naval Personnel, Washington, DC.

    In this module the student will learn the rules that govern the characteristics of parallel circuits; the relationships between voltage, current, resistance and power; and the results of common troubles in parallel circuits. The module is divided into four lessons: rules of voltage and current, rules for resistance and power, variational analysis,…

  2. A search for parallel electric fields by observing secondary electrons and photoelectrons in the low-altitude auroral zone

    NASA Technical Reports Server (NTRS)

    Fung, Shing F.; Hoffman, R. A.

    1991-01-01

    Model calculations are performed demonstrating the effect of weak parallel electric fields on the differential spectra of the low-energy electrons observed in the inverted-V electron precipitation events in the topside ionosphere. A comparison of the altitude dependence of the observed spectra with the model calculations shows that there can be, on average, no more than a 2-V potential drop between the altitudes of 400 and 900 km, corresponding to a distributed parallel dc electric field of less than 4 microV/m under the inverted-V electron precipitation regions. Statistical results are presented on the spectral dependence of secondary electrons on the inverted-V primary beam parameters.

  3. BerkeleyGW: A massively parallel computer package for the calculation of the quasiparticle and optical properties of materials and nanostructures

    NASA Astrophysics Data System (ADS)

    Deslippe, Jack; Samsonidze, Georgy; Strubbe, David A.; Jain, Manish; Cohen, Marvin L.; Louie, Steven G.

    2012-06-01

    BerkeleyGW is a massively parallel computational package for electron excited-state properties that is based on the many-body perturbation theory employing the ab initio GW and GW plus Bethe-Salpeter equation methodology. It can be used in conjunction with many density-functional theory codes for ground-state properties, including PARATEC, PARSEC, Quantum ESPRESSO, SIESTA, and Octopus. The package can be used to compute the electronic and optical properties of a wide variety of material systems from bulk semiconductors and metals to nanostructured materials and molecules. The package scales to 10 000s of CPUs and can be used to study systems containing up to 100s of atoms.
    Program summary
    Program title: BerkeleyGW
    Catalogue identifier: AELG_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AELG_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: Open source BSD License. See code for licensing details.
    No. of lines in distributed program, including test data, etc.: 576 540
    No. of bytes in distributed program, including test data, etc.: 110 608 809
    Distribution format: tar.gz
    Programming language: Fortran 90, C, C++, Python, Perl, BASH
    Computer: Linux/UNIX workstations or clusters
    Operating system: Tested on a variety of Linux distributions in parallel and serial as well as AIX and Mac OSX
    RAM: (50-2000) MB per CPU (highly dependent on system size)
    Classification: 7.2, 7.3, 16.2, 18
    External routines: BLAS, LAPACK, FFTW, ScaLAPACK (optional), MPI (optional). All available under open-source licenses.
    Nature of problem: The excited state properties of materials involve the addition or subtraction of electrons as well as the optical excitations of electron-hole pairs. The excited particles interact strongly with other electrons in a material system. This interaction affects the electronic energies, wavefunctions and lifetimes. It is well known that ground-state theories, such as standard methods

  4. Atmospheric plasma jet array in parallel electric and gas flow fields for three-dimensional surface treatment

    NASA Astrophysics Data System (ADS)

    Cao, Z.; Walsh, J. L.; Kong, M. G.

    2009-01-01

    This letter reports on the electrical and optical characteristics of a ten-channel atmospheric pressure glow discharge jet array in parallel electric and gas flow fields. Challenged with complex three-dimensional substrates, including surgical tissue forceps and a plastic plate sloped at up to 15°, the jet array is shown to achieve excellent jet-to-jet uniformity both in time and in space. Its spatial uniformity is four times better than that of a comparable single jet when both are used to treat a 15° sloped substrate. These benefits likely stem from an effective self-adjustment mechanism among individual jets facilitated by individualized ballast and spatial redistribution of surface charges.

  5. The relation between reconnected flux, the parallel electric field, and the reconnection rate in a three-dimensional kinetic simulation of magnetic reconnection

    SciTech Connect

    Wendel, D. E.; Olson, D. K.; Hesse, M.; Kuznetsova, M.; Adrian, M. L.; Aunai, N.; Karimabadi, H.; Daughton, W.

    2013-12-15

    We investigate the distribution of parallel electric fields and their relationship to the location and rate of magnetic reconnection in a large particle-in-cell simulation of 3D turbulent magnetic reconnection with open boundary conditions. The simulation's guide field geometry inhibits the formation of simple topological features such as null points. Therefore, we derive the location of potential changes in magnetic connectivity by finding the field lines that experience a large relative change between their endpoints, i.e., the quasi-separatrix layer. We find a good correspondence between the locus of changes in magnetic connectivity or the quasi-separatrix layer and the map of large gradients in the integrated parallel electric field (or quasi-potential). Furthermore, we investigate the distribution of the parallel electric field along the reconnecting field lines. We find the reconnection rate is controlled by only the low-amplitude, zeroth- and first-order trends in the parallel electric field while the contribution from fluctuations of the parallel electric field, such as electron holes, is negligible. The results impact the determination of reconnection sites and reconnection rates in models and in situ spacecraft observations of 3D turbulent reconnection. It is difficult through direct observation to isolate the loci of the reconnection parallel electric field amidst the large amplitude fluctuations. However, we demonstrate that a positive slope of the running sum of the parallel electric field along the field line as a function of field line length indicates where reconnection is occurring along the field line.
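
    The diagnostic proposed in the last sentence is straightforward to reproduce numerically: integrate the parallel electric field along the field line and look for intervals where the running sum rises. The sketch below uses a synthetic field line in which a weak, smooth reconnection-scale parallel field is buried under large-amplitude fluctuations standing in for electron holes; all amplitudes and scales are invented.

        import numpy as np

        s = np.linspace(0.0, 100.0, 5001)              # arc length along the field line
        ds = s[1] - s[0]
        rng = np.random.default_rng(1)
        e_par = 0.2 * np.exp(-((s - 50.0) / 5.0) ** 2) # smooth reconnection E_parallel
        e_par += 0.5 * rng.standard_normal(s.size)     # large-amplitude fluctuations

        quasi_potential = np.cumsum(e_par) * ds        # running sum of E_parallel
        window = 250                                   # smooth before taking the slope
        trend = np.convolve(quasi_potential, np.ones(window) / window, mode="same")
        slope = np.gradient(trend, s)
        print(s[slope > 0.5 * slope.max()])            # flags an interval near s ~ 50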

  6. The relation between reconnected flux, the parallel electric field, and the reconnection rate in a three-dimensional kinetic simulation of magnetic reconnection

    NASA Astrophysics Data System (ADS)

    Wendel, D. E.; Olson, D. K.; Hesse, M.; Aunai, N.; Kuznetsova, M.; Karimabadi, H.; Daughton, W.; Adrian, M. L.

    2013-12-01

    We investigate the distribution of parallel electric fields and their relationship to the location and rate of magnetic reconnection in a large particle-in-cell simulation of 3D turbulent magnetic reconnection with open boundary conditions. The simulation's guide field geometry inhibits the formation of simple topological features such as null points. Therefore, we derive the location of potential changes in magnetic connectivity by finding the field lines that experience a large relative change between their endpoints, i.e., the quasi-separatrix layer. We find a good correspondence between the locus of changes in magnetic connectivity or the quasi-separatrix layer and the map of large gradients in the integrated parallel electric field (or quasi-potential). Furthermore, we investigate the distribution of the parallel electric field along the reconnecting field lines. We find the reconnection rate is controlled by only the low-amplitude, zeroth and first-order trends in the parallel electric field while the contribution from fluctuations of the parallel electric field, such as electron holes, is negligible. The results impact the determination of reconnection sites and reconnection rates in models and in situ spacecraft observations of 3D turbulent reconnection. It is difficult through direct observation to isolate the loci of the reconnection parallel electric field amidst the large amplitude fluctuations. However, we demonstrate that a positive slope of the running sum of the parallel electric field along the field line as a function of field line length indicates where reconnection is occurring along the field line.

  7. Design and development of split-parallel through-the road retrofit hybrid electric vehicle with in-wheel motors

    NASA Astrophysics Data System (ADS)

    Zulkifli, S. A.; Syaifuddin Mohd, M.; Maharun, M.; Bakar, N. S. A.; Idris, S.; Samsudin, S. H.; Firmansyah; Adz, J. J.; Misbahulmunir, M.; Abidin, E. Z. Z.; Syafiq Mohd, M.; Saad, N.; Aziz, A. R. A.

    2015-12-01

    One configuration of the hybrid electric vehicle (HEV) is the split-axle parallel hybrid, in which an internal combustion engine (ICE) and an electric motor provide propulsion power to different axles. A particular sub-type of the split-parallel hybrid does not have the electric motor installed on board the vehicle; instead, two electric motors are placed in the hubs of the non-driven wheels, called 'hub motors' or 'in-wheel motors' (IWM). Since propulsion power from the ICE and IWM is coupled through the vehicle itself, its wheels and the road on which it moves, this particular configuration is termed a 'through-the-road' (TTR) hybrid. The TTR configuration enables existing ICE-powered vehicles to be retrofitted into an HEV with minimal physical modification. This work describes the design of a retrofit-conversion TTR-IWM hybrid vehicle, its sub-systems and the development work. Operating modes and power flow of the TTR hybrid, its torque coupling and the resultant traction profiles are initially discussed.
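
    Because the ICE and the in-wheel motors drive different axles, the torque coupling reduces to a sum of tractive forces at the tire-road contacts. A minimal sketch of that resultant-traction calculation follows (Python); the gear ratios, wheel radius, and efficiency are invented illustration values, not the retrofit vehicle's parameters.

        # Through-the-road coupling: engine torque reaches the front axle through the
        # driveline, the two in-wheel motors act directly at the rear wheels, and the
        # contributions combine only as forces at the road.
        def traction_force(t_ice, t_iwm, gear=3.5, final_drive=4.1, r_wheel=0.32, eff=0.92):
            """Total tractive force in N; t_ice in Nm, t_iwm in Nm per hub motor."""
            f_front = eff * t_ice * gear * final_drive / r_wheel  # geared ICE path
            f_rear = 2.0 * t_iwm / r_wheel                        # two direct-drive hub motors
            return f_front + f_rear

        print(f"{traction_force(120.0, 250.0):.0f} N total tractive force")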

  8. NON-CONFORMING FINITE ELEMENTS; MESH GENERATION, ADAPTIVITY AND RELATED ALGEBRAIC MULTIGRID AND DOMAIN DECOMPOSITION METHODS IN MASSIVELY PARALLEL COMPUTING ENVIRONMENT

    SciTech Connect

    Lazarov, R; Pasciak, J; Jones, J

    2002-02-01

    Construction, analysis and numerical testing of efficient solution techniques for solving elliptic PDEs that allow for parallel implementation have been the focus of the research. A number of discretization and solution methods for solving second order elliptic problems, including mortar and penalty approximations and domain decomposition methods for finite elements and finite volumes, have been investigated and analyzed. Techniques for parallel domain decomposition algorithms in the framework of PETSc and HYPRE have been studied and tested. Hierarchical parallel grid refinement and adaptive solution methods have been implemented and tested on various model problems. A parallel code implementing the mortar method with algebraically constructed multiplier spaces was developed.
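
    The report concerns mortar/penalty finite elements and algebraic multigrid within PETSc and HYPRE; as a far smaller illustration of the smoothing-plus-coarse-grid-correction structure those solvers build on, here is a toy geometric two-grid cycle for the 1D Poisson problem -u'' = f (everything below is invented for illustration):

        import numpy as np

        def jacobi(u, f, h, sweeps=3, omega=2.0 / 3.0):
            """Weighted Jacobi smoothing for -u'' = f with zero Dirichlet ends."""
            for _ in range(sweeps):
                u[1:-1] += omega * 0.5 * (f[1:-1] * h * h + u[:-2] + u[2:] - 2.0 * u[1:-1])
            return u

        def two_grid(u, f, h):
            u = jacobi(u, f, h)                                          # pre-smooth
            r = np.zeros_like(u)
            r[1:-1] = f[1:-1] - (2.0 * u[1:-1] - u[:-2] - u[2:]) / h**2  # residual
            ec = jacobi(np.zeros((u.size + 1) // 2), r[::2].copy(), 2.0 * h, sweeps=50)
            fine = np.arange(u.size)
            e = np.interp(fine, fine[::2], ec)                           # linear prolongation
            return jacobi(u + e, f, h)                                   # post-smooth

        # A real code recurses (V-cycle) and solves the coarsest level exactly;
        # this toy merely shows the residual shrinking over repeated cycles.
        n, h = 129, 1.0 / 128.0
        f, u = np.ones(n), np.zeros(n)
        for _ in range(20):
            u = two_grid(u, f, h)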

  9. Static stability of parallel operation of asynchronized generators in an electrical system

    NASA Astrophysics Data System (ADS)

    Plotnikova, T. V.; Sokur, P. V.; Tuzov, P. Yu.; Shakaryan, Yu. G.

    2014-12-01

    The static stability of single and parallel operation of an asynchronized generator (ASG) on a long-distance power transmission line is investigated. The synthesis of an ASG excitation control law for which the set G_s of the machine's stable operating conditions comprises a sufficiently conservative set G_p of permissible operating conditions is considered.

  10. Effects of a Parallel Electric Field and the Geomagnetic Field in the Topside Ionosphere on Auroral and Photoelectron Energy Distributions

    NASA Technical Reports Server (NTRS)

    Min, Q.-L.; Lummerzheim, D.; Rees, M. H.; Stamnes, K.

    1993-01-01

    The consequences of electric field acceleration and an inhomogeneous magnetic field on auroral electron energy distributions in the topside ionosphere are investigated. The one-dimensional, steady state electron transport equation includes elastic and inelastic collisions, an inhomogeneous magnetic field, and a field-aligned electric field. The case of a self-consistent polarization electric field is considered first. The self-consistent field is derived by solving the continuity equation for all ions of importance, including diffusion of O(+) and H(+), and the electron and ion energy equations to derive the electron and ion temperatures. The system of coupled electron transport, continuity, and energy equations is solved numerically. Recognizing observations of parallel electric fields of larger magnitude than the baseline case of the polarization field, the effect of two model fields on the electron distribution function is investigated. In one case the field is increased from the polarization field magnitude at 300 km to a maximum at the upper boundary of 800 km, and in another case a uniform field is added to the polarization field. Substantial perturbations of the low energy portion of the electron flux are produced: an upward directed electric field accelerates the downward directed flux of low-energy secondary electrons and decelerates the upward directed component. Above about 400 km the inhomogeneous magnetic field produces anisotropies in the angular distribution of the electron flux. The effects of the perturbed energy distributions on auroral spectral emission features are noted.

  11. Effects of a parallel electric field and the geomagnetic field in the topside ionosphere on auroral and photoelectron energy distributions

    NASA Technical Reports Server (NTRS)

    Min, Q.-L.; Lummerzheim, D.; Rees, M. H.; Stamnes, K.

    1993-01-01

    The consequences of electric field acceleration and an inhomogeneous magnetic field on auroral electron energy distributions in the topside ionosphere are investigated. The one-dimensional, steady state electron transport equation includes elastic and inelastic collisions, an inhomogeneous magnetic field, and a field-aligned electric field. The case of a self-consistent polarization electric field is considered first. The self-consistent field is derived by solving the continuity equation for all ions of importance, including diffusion of O(+) and H(+), and the electron and ion energy equations to derive the electron and ion temperatures. The system of coupled electron transport, continuity, and energy equations is solved numerically. Recognizing observations of parallel electric fields of larger magnitude than the baseline case of the polarization field, the effect of two model fields on the electron distribution function is investigated. In one case the field is increased from the polarization field magnitude at 300 km to a maximum at the upper boundary of 800 km, and in another case a uniform field is added to the polarization field. Substantial perturbations of the low energy portion of the electron flux are produced: an upward directed electric field accelerates the downward directed flux of low-energy secondary electrons and decelerates the upward directed component. Above about 400 km the inhomogeneous magnetic field produces anisotropies in the angular distribution of the electron flux. The effects of the perturbed energy distributions on auroral spectral emission features are noted.

  12. Economical launching and accelerating control strategy for a single-shaft parallel hybrid electric bus

    NASA Astrophysics Data System (ADS)

    Yang, Chao; Song, Jian; Li, Liang; Li, Shengbo; Cao, Dongpu

    2016-08-01

    This paper presents an economical launching and accelerating mode, including four ordered phases: pure electrical driving, clutch engagement and engine start-up, engine active charging, and engine driving, which can be fit for the alternating conditions and improve the fuel economy of hybrid electric bus (HEB) during typical city-bus driving scenarios. By utilizing the fast response feature of electric motor (EM), an adaptive controller for EM is designed to realize the power demand during the pure electrical driving mode, the engine starting mode and the engine active charging mode. Concurrently, the smoothness issue induced by the sequential mode transitions is solved with a coordinated control logic for engine, EM and clutch. Simulation and experimental results show that the proposed launching and accelerating mode and its control methods are effective in improving the fuel economy and ensure the drivability during the fast transition between the operation modes of HEB.

  13. Propagation of the Lightning Electromagnetic Pulse Through the E- and F-region Ionosphere and the Generation of Parallel Electric Fields

    NASA Astrophysics Data System (ADS)

    Rowland, D. E.; Wygant, J. R.; Pfaff, R. F.; Farrell, W. M.; Goetz, K. A.; Monson, S. J.

    2004-05-01

    Sounding rockets launched by Mike Kelley and his group at Cornell demonstrated the existence of transient (1 ms) electric fields associated with lightning strikes at high altitudes above active thunderstorms. These electric fields had a component parallel to the Earth's magnetic field, and were unipolar and large in amplitude. They were thought to be strong enough to energize electrons and generate strong turbulence as the beams thermalized. The parallel electric fields were observed on multiple flights, but high time resolution measurements were not made within 100 km horizontal distance of lightning strokes, where the electric fields are largest. In 2000 the "Lightning Bolt" sounding rocket (NASA 27.143) was launched directly over an active thunderstorm to an apogee near 300 km. The sounding rocket was equipped with sensitive electric and magnetic field instruments as well as a photometer and electrostatic analyser for measuring accelerated electrons. The electric and magnetic fields were sampled at 10 million samples per second, letting us fully resolve the structure of the parallel electric field pulse up to and beyond the plasma frequency. We will present results from the Lightning Bolt mission, concentrating on the parallel electric field pulses that arrive before the lower-frequency whistler wave modes. We observe pulses with peak electric fields of a few mV/m lasting for a substantial fraction of a millisecond. Superimposed on this is high-frequency turbulence, comparable in amplitude to the pulse itself. This is the first direct observation of this structure in the parallel electric field within 100 km horizontal distance of the lightning stroke. We will present evidence for the method of generation of these parallel fields, and discuss their probable effect on ionospheric electrons.

  14. Parallel Electric Fields and Wave Phenomena Associated with Magnetic Reconnection: The Merged Magnetic Field Product from MMS

    NASA Astrophysics Data System (ADS)

    Argall, M. R.; Torbert, R. B.; Le Contel, O.; Russell, C. T.; Magnes, W.; Strangeway, R. J.; Bromund, K. R.; Lindqvist, P. A.; Marklund, G. T.; Ergun, R. E.; Khotyaintsev, Y. V.

    2015-12-01

    Kinetic processes associated with magnetic reconnection current structures can be resolved for the first time by the instrument suites and small inter-spacecraft separation of MMS. Measurements of the parallel electric fields responsible for electron acceleration, and of wave activity associated with reconnection onset and electron scattering, require precise knowledge of the magnetic field amplitude and phase. The fluxgate and searchcoil magnetometers on MMS are sensitive to low- and high-frequency field fluctuations, respectively. In the middle frequency range, we optimize sensitivity by merging the two datasets to create a single magnetic field data product. We analyze frequency-dependent amplitude and phase relationships between the two instruments to determine how they should be joined. The result is a product with the time resolution and Nyquist frequency of the searchcoil, but with the fluxgate's ability to measure the DC magnetic field. This dataset provides improved phase information suitable for determining parallel electric fields during magnetic reconnection events. Its enhanced sensitivity also makes it ideal for resolving thin current layers and uncovering low-amplitude wave activity, such as EMIC waves related to substorm injections and Alfven or lower hybrid waves related to reconnection.
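
    The merging idea (trust the fluxgate at low frequency, the searchcoil at high frequency) can be sketched as complementary spectral weights around a crossover. The filter shape, crossover frequency, and function name below are assumptions for illustration, not the calibrated MMS transfer functions; both inputs are assumed already resampled onto a common time base.

        import numpy as np

        def merge_b(b_fgm, b_scm, fs, f_cross=4.0, width=2.0):
            """Blend one component of fluxgate (b_fgm) and searchcoil (b_scm)
            series sampled at fs Hz: low frequencies from the fluxgate, high
            frequencies from the searchcoil, with a smooth hand-off near f_cross."""
            freqs = np.fft.rfftfreq(b_fgm.size, d=1.0 / fs)
            w_hi = 0.5 * (1.0 + np.tanh((freqs - f_cross) / width))  # smooth 0 -> 1 step
            merged = (1.0 - w_hi) * np.fft.rfft(b_fgm) + w_hi * np.fft.rfft(b_scm)
            return np.fft.irfft(merged, n=b_fgm.size)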

  15. The Coupling of Ion and Electron Heating During Magnetic Reconnection and the Role of Parallel Electric Fields

    NASA Astrophysics Data System (ADS)

    Haggerty, C. C.; Shay, M. A.; Drake, J. F.; Phan, T.

    2015-12-01

    The energization of ions and electrons due to magnetic reconnection plays an important role in the heating and dynamics of plasma throughout the heliosphere. However, there are still gaps in the understanding of the relationship between ion and electron heating. The mechanisms for ion and electron heating are examined using a systematic set of antiparallel reconnection simulations performed with a kinetic particle-in-cell (PIC) code. Ions and electrons are energized and heated through reflections across contracting magnetic field lines in the center of the outflow exhaust. Long-range parallel electric fields form to trap heated electrons, and in turn inhibit the counter-streaming ion beams, reducing the ion temperature increment. A balance between electron energy gain from the contracting field lines and energy loss at the inflow edge of the exhaust ultimately determines the electron heating. This mechanism extends far downstream and leads to a net heating which is independent of distance from the x-line. The parallel electric field thus fundamentally links the energy partition between electrons and ions.

  16. Simulation of electrostatic ion instabilities in the presence of parallel currents and transverse electric fields

    NASA Technical Reports Server (NTRS)

    Nishikawa, K.-I.; Ganguli, G.; Lee, Y. C.; Palmadesso, P. J.

    1989-01-01

    A spatially two-dimensional electrostatic PIC simulation code was used to study the stability of a plasma equilibrium characterized by a localized transverse dc electric field and a field-aligned drift, for L ≪ Lx, where Lx is the simulation length in the x direction and L is the scale length associated with the dc electric field. It is found that the dc electric field and the field-aligned current can together play a synergistic role to enable the excitation of electrostatic waves even when the threshold values of the field-aligned drift and the E x B drift are individually subcritical. The simulation results show that the growing ion waves are associated with small vortices in the linear stage, which evolve to a nonlinear stage dominated by larger vortices with lower frequencies.

  17. Neurite, a Finite Difference Large Scale Parallel Program for the Simulation of Electrical Signal Propagation in Neurites under Mechanical Loading

    PubMed Central

    García-Grajales, Julián A.; Rucabado, Gabriel; García-Dopico, Antonio; Peña, José-María; Jérusalem, Antoine

    2015-01-01

    With the growing body of research on traumatic brain injury and spinal cord injury, computational neuroscience has recently focused its modeling efforts on neuronal functional deficits following mechanical loading. However, in most of these efforts, cell damage is generally characterized only by purely mechanistic criteria: functions of quantities such as stress, strain or their corresponding rates. The modeling of functional deficits in neurites as a consequence of macroscopic mechanical insults has rarely been explored. In particular, a quantitative mechanically based model of electrophysiological impairment in neuronal cells, Neurite, has only very recently been proposed. In this paper, we present the implementation details of this model: a finite difference parallel program for simulating electrical signal propagation along neurites under mechanical loading. Following the application of a macroscopic strain at a given strain rate produced by a mechanical insult, Neurite is able to simulate the resulting neuronal electrical signal propagation, and thus the corresponding functional deficits. The simulation of the coupled mechanical and electrophysiological behaviors requires computationally expensive calculations that increase in complexity as the network of simulated cells grows. The solvers implemented in Neurite—explicit and implicit—were therefore parallelized using graphics processing units in order to reduce the simulation costs of large-scale scenarios. Cable theory and Hodgkin-Huxley models were implemented to account for the electrophysiological passive and active regions of a neurite, respectively, whereas a coupled mechanical model accounting for the neurite mechanical behavior within its surrounding medium was adopted as a link between electrophysiology and mechanics. This paper provides the details of the parallel implementation of Neurite, along with three different application examples: a long myelinated axon, a segmented
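
    As a minimal sketch of the passive part of such a solver (the active Hodgkin-Huxley channels, the GPU parallelization, and the mechanical coupling are omitted), an explicit finite-difference step for the cable equation can be written as below; the parameter values are illustrative toys, not Neurite's.

        import numpy as np

        def step_cable(v, dx, dt, a=1e-6, r_i=1.0, r_m=1.0, c_m=1e-2):
            """One explicit Euler step of c_m dV/dt = (a/(2 r_i)) d2V/dx2 - V/r_m.
            a: radius (m), r_i: axial resistivity (Ohm m), r_m: membrane
            resistance (Ohm m^2), c_m: membrane capacitance (F/m^2).
            dt must satisfy the explicit stability limit dt < c_m r_i dx^2 / a."""
            lap = np.zeros_like(v)
            lap[1:-1] = (v[2:] - 2.0 * v[1:-1] + v[:-2]) / dx**2
            return v + dt * ((a / (2.0 * r_i)) * lap - v / r_m) / c_m

        x = np.linspace(0.0, 2e-3, 201)                    # 2 mm of neurite
        v = np.where(np.abs(x - 1e-3) < 1e-4, 1e-3, 0.0)   # 1 mV central bump
        for _ in range(1000):
            v = step_cable(v, dx=x[1] - x[0], dt=1e-7)     # passive spread of the bump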

  18. Electrostatic ion instabilities in the presence of parallel currents and transverse electric fields

    NASA Technical Reports Server (NTRS)

    Ganguli, G.; Palmadesso, P. J.

    1988-01-01

    The electrostatic ion instabilities are studied for oblique propagation in the presence of magnetic-field-aligned currents and transverse localized electric fields in a weakly collisional plasma. The presence of transverse electric fields results in mode excitation for magnetic-field-aligned current values that are otherwise stable. Electron collisions enhance the growth, while ion collisions have a damping effect. These results are discussed in the context of observations of low-frequency ion modes in the auroral ionosphere by radar and rocket experiments.

  19. Column buckling of doubly parallel slender nanowires carrying electric current acted upon by a magnetic field

    NASA Astrophysics Data System (ADS)

    Kiani, Keivan

    2016-08-01

    This work explores the axial buckling of current-carrying double-nanowire systems immersed in a longitudinal magnetic field. Each nanowire is affected by the magnetic forces resulting from the externally exerted magnetic field plus the magnetic field resulting from the passage of electric current through the adjacent nanowire. To study the problem, these forces are appropriately evaluated in terms of transverse displacements. Subsequently, the governing equations of the nanosystem are constructed using Euler-Bernoulli beam theory in conjunction with the surface elasticity theory of Gurtin and Murdoch. Using a meshless technique and the assumed mode method, the critical compressive buckling load of the nanosystem is determined. In a special case, the results obtained by these two numerical methods are successfully cross-checked. The roles of the slenderness ratio, electric current, magnetic field strength, and interwire distance in the axial buckling load and stability behavior of the nanosystem are displayed and discussed in some detail.

  20. Characterization and monitoring of subsurface processes using parallel computing and electrical resistivity imaging

    SciTech Connect

    Johnson, Timothy C.; Truex, Michael J.; Wellman, Dawn M.; Marble, Justin

    2011-12-01

    This newsletter discusses recent advancements in subsurface resistivity characterization and monitoring capabilities. Resistivity monitoring data from the BC Cribs field desiccation treatability test are used as an example to demonstrate near-real-time 3D subsurface imaging capabilities. Electrical resistivity tomography (ERT) is a method of imaging the electrical resistivity distribution of the subsurface. An ERT data collection system consists of an array of electrodes, deployed on the ground surface or within boreholes, that are connected to a control unit which can access each electrode independently (Figure 1). A single measurement is collected by injecting current across a pair of current injection electrodes (source and sink) and measuring the resulting potential generated across a pair of potential measurement electrodes (positive and negative). An ERT data set is generated by collecting many such measurements using strategically selected current and potential electrode pairs. This data set is then processed using an inversion algorithm, which reconstructs an estimate (or image) of the electrical conductivity (i.e., the inverse of resistivity) distribution that gave rise to the measured data.
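
    The measurement loop described here is straightforward to sketch. The schedule below is a textbook dipole-dipole pattern, and `instrument.measure` is a hypothetical stand-in for the control unit's API; neither is taken from the newsletter.

        import itertools

        def dipole_dipole_schedule(n_electrodes, max_sep=6):
            """Yield (source, sink, pos, neg) indices for a surface electrode line."""
            for a in range(n_electrodes - 3):
                b = a + 1                                  # current injection pair
                for n in range(1, max_sep + 1):
                    m, p = b + n, b + n + 1                # potential measurement pair
                    if p < n_electrodes:
                        yield a, b, m, p

        def collect(instrument, n_electrodes=32):
            """Run the schedule on a (hypothetical) control unit and keep the data."""
            return [((a, b, m, p), instrument.measure(src=a, sink=b, pos=m, neg=p))
                    for a, b, m, p in dipole_dipole_schedule(n_electrodes)]

        print(list(itertools.islice(dipole_dipole_schedule(16), 5)))

    The resulting list of quadrupoles and measured voltages is exactly the data set the inversion algorithm consumes.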

  1. MPP parallel forth

    NASA Technical Reports Server (NTRS)

    Dorband, John E.

    1987-01-01

    Massively Parallel Processor (MPP) Parallel FORTH is a derivative of FORTH-83 and Unified Software Systems' Uni-FORTH. The extension of FORTH into the realm of parallel processing on the MPP is described. With few exceptions, Parallel FORTH was made to follow the description of Uni-FORTH as closely as possible. Likewise, the Parallel FORTH extensions were designed to be as philosophically similar to serial FORTH as possible. The MPP hardware characteristics, as viewed by the FORTH programmer, are discussed. Then a description is presented of how Parallel FORTH is implemented on the MPP.

  2. Anomalous effect in a hydrogenic impurity in a spherical quantum dot under the influence of parallel electric and magnetic fields

    NASA Astrophysics Data System (ADS)

    Ho, Y. K.; Lin, Y. C.; Sahoo, S.

    2004-03-01

    We will present calculations of the energy levels and the resonance widths of the quasi-bound states of a confined hydrogenic impurity in an isolated quantum dot subjected to external electric and magnetic fields in parallel directions. A method of complex absorbing potential [1] is used in our present investigation. Resonance positions and widths are reported for a wide range of dot sizes to demonstrate that Stark resonances in a confined hydrogen atom lead to a new phenomenon as a consequence of the quantum confinement of the atom, contrary to the Stark effect on a free atom. * This work was supported by the National Science Council of ROC. [1] S. Sahoo and Y. K. Ho, Chin. J. Phys. 38, 127 (2000); J. Phys. B 33, 2195 (2000); J. Phys. B 33, 5151 (2000); Phys. Rev. A 65, 015403 (2001).

  3. Scaling properties of composite information measures and shape complexity for hydrogenic atoms in parallel magnetic and electric fields

    NASA Astrophysics Data System (ADS)

    González-Férez, R.; Dehesa, J. S.; Patil, S. H.; Sen, K. D.

    2009-12-01

    The scaling properties of various composite information-theoretic measures (Shannon and Rényi entropy sums, Fisher and Onicescu information products, Tsallis entropy ratio, Fisher-Shannon product and shape complexity) are studied in position and momentum spaces for the non-relativistic hydrogenic atoms in the presence of parallel magnetic and electric fields. Such measures are found to be invariant at fixed values of the scaling parameters s₁ = Bħ³(4πε₀)²/(Z²m²e³) and s₂ = Fħ⁴(4πε₀)³/(Z³m²e⁵), where B and F denote the magnetic and electric field strengths. Numerical results which support the validity of the scaling properties are shown by choosing the representative example of the position-space shape complexity. The physical significance of the resulting scaling behavior is discussed.

  4. The control of a parallel hybrid-electric propulsion system for a small unmanned aerial vehicle using a CMAC neural network.

    PubMed

    Harmon, Frederick G; Frank, Andrew A; Joshi, Sanjay S

    2005-01-01

    A Simulink model, a propulsion energy optimization algorithm, and a CMAC controller were developed for a small parallel hybrid-electric unmanned aerial vehicle (UAV). The hybrid-electric UAV is intended for military, homeland security, and disaster-monitoring missions involving intelligence, surveillance, and reconnaissance (ISR). The Simulink model is a forward-facing simulation program used to test different control strategies. The flexible energy optimization algorithm for the propulsion system allows relative importance to be assigned between the use of gasoline, electricity, and recharging. A cerebellar model arithmetic computer (CMAC) neural network approximates the energy optimization results and is used to control the parallel hybrid-electric propulsion system. The hybrid-electric UAV with the CMAC controller uses 67.3% less energy than a two-stroke gasoline-powered UAV during a 1-h ISR mission and 37.8% less energy during a longer 3-h ISR mission. PMID:16112553
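
    The controller's core data structure, the CMAC, is a set of offset coarse tilings whose selected weights are averaged for output and adjusted locally for training. A one-dimensional toy version follows (Python); the tiling counts, learning rate, and target function are invented for illustration and are far simpler than the paper's energy-optimization map.

        import numpy as np

        class CMAC:
            """Toy 1-D CMAC: n_tilings shifted coarse tilings over [lo, hi)."""
            def __init__(self, n_tilings=8, n_cells=32, lo=0.0, hi=1.0):
                self.n_tilings, self.n_cells, self.lo, self.hi = n_tilings, n_cells, lo, hi
                self.w = np.zeros((n_tilings, n_cells))

            def _cells(self, x):
                span = (self.hi - self.lo) / self.n_cells
                offsets = np.arange(self.n_tilings) * span / self.n_tilings
                idx = ((x - self.lo + offsets) / span).astype(int)
                return np.clip(idx, 0, self.n_cells - 1)

            def predict(self, x):
                return self.w[np.arange(self.n_tilings), self._cells(x)].mean()

            def train(self, x, target, lr=0.3):
                # Mean readout: adding lr*err to each active cell moves output by lr*err.
                err = target - self.predict(x)
                self.w[np.arange(self.n_tilings), self._cells(x)] += lr * err

        rng = np.random.default_rng(1)
        net = CMAC()
        for _ in range(2000):
            x = rng.uniform()
            net.train(x, np.sin(2.0 * np.pi * x))   # stand-in for the optimized map
        print(round(float(net.predict(0.25)), 2))   # close to sin(pi/2) = 1.0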

  5. Observations of whistler mode waves with nonlinear parallel electric fields near the dayside magnetic reconnection separatrix by the Magnetospheric Multiscale mission

    NASA Astrophysics Data System (ADS)

    Wilder, F. D.; Ergun, R. E.; Goodrich, K. A.; Goldman, M. V.; Newman, D. L.; Malaspina, D. M.; Jaynes, A. N.; Schwartz, S. J.; Trattner, K. J.; Burch, J. L.; Argall, M. R.; Torbert, R. B.; Lindqvist, P.-A.; Marklund, G.; Le Contel, O.; Mirioni, L.; Khotyaintsev, Yu. V.; Strangeway, R. J.; Russell, C. T.; Pollock, C. J.; Giles, B. L.; Plaschke, F.; Magnes, W.; Eriksson, S.; Stawarz, J. E.; Sturner, A. P.; Holmes, J. C.

    2016-06-01

    We show observations from the Magnetospheric Multiscale (MMS) mission of whistler mode waves in the Earth's low-latitude boundary layer (LLBL) during a magnetic reconnection event. The waves propagated obliquely to the magnetic field toward the X line and were confined to the edge of a southward jet in the LLBL. Bipolar parallel electric fields interpreted as electrostatic solitary waves (ESW) are observed intermittently and appear to be in phase with the parallel component of the whistler oscillations. The polarity of the ESWs suggests that if they propagate with the waves, they are electron enhancements as opposed to electron holes. The reduced electron distribution shows a shoulder in the distribution for parallel velocities between 17,000 and 22,000 km/s, which persisted during the interval when ESWs were observed, and is near the phase velocity of the whistlers. This shoulder can drive Langmuir waves, which were observed in the high-frequency parallel electric field data.

  6. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by semi-randomly varying routing policies for different packets

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-11-23

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. Nodes vary a choice of routing policy for routing data in the network in a semi-random manner, so that similarly situated packets are not always routed along the same path. Semi-random variation of the routing policy tends to avoid certain local hot spots of network activity, which might otherwise arise using more consistent routing determinations. Preferably, the originating node chooses a routing policy for a packet, and all intermediate nodes in the path route the packet according to that policy. Policies may be rotated on a round-robin basis, selected by generating a random number, or otherwise varied.
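
    The mechanism reduces to a small amount of origin-side logic: pick a routing policy per packet (round-robin or random) and stamp it into the header so intermediate nodes route consistently. The policy names below are invented placeholders, not actual Blue Gene routing modes.

        import itertools, random

        POLICIES = ("xyz-order", "zyx-order", "adaptive-x-first", "adaptive-z-first")
        _round_robin = itertools.cycle(range(len(POLICIES)))
        _rng = random.Random(7)

        def choose_policy(mode="round_robin"):
            """Origin-node choice; the tag travels in the packet header so every
            intermediate node applies the same policy to the packet."""
            i = next(_round_robin) if mode == "round_robin" else _rng.randrange(len(POLICIES))
            return POLICIES[i]

        packets = [{"dst": (3, 1, 4), "policy": choose_policy()} for _ in range(6)]
        print([p["policy"] for p in packets])   # similarly situated packets spread out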

  7. A design for an intelligent monitor and controller for space station electrical power using parallel distributed problem solving

    NASA Astrophysics Data System (ADS)

    Morris, Robert A.

    1990-12-01

    The emphasis is on defining a set of communicating processes for intelligent spacecraft secondary power distribution and control. The computer hardware and software implementation platform for this work is that of the ADEPTS project at the Johnson Space Center (JSC). The electrical power system design which was used as the basis for this research is that of Space Station Freedom, although the functionality of the processes defined here generalizes to any permanent manned space power control application. First, the Space Station Electrical Power Subsystem (EPS) hardware to be monitored is described, followed by a set of scenarios describing typical monitor and control activity. Then, the parallel distributed problem solving approach to knowledge engineering is introduced. There follows a two-step presentation of the intelligent software design for secondary power control. The first step decomposes the problem of monitoring and control into three primary functions. Each of the primary functions is described in detail. Suggestions for refinements and embellishments in the design specifications are given.

  8. A design for an intelligent monitor and controller for space station electrical power using parallel distributed problem solving

    NASA Technical Reports Server (NTRS)

    Morris, Robert A.

    1990-01-01

    The emphasis is on defining a set of communicating processes for intelligent spacecraft secondary power distribution and control. The computer hardware and software implementation platform for this work is that of the ADEPTS project at the Johnson Space Center (JSC). The electrical power system design which was used as the basis for this research is that of Space Station Freedom, although the functionality of the processes defined here generalizes to any permanent manned space power control application. First, the Space Station Electrical Power Subsystem (EPS) hardware to be monitored is described, followed by a set of scenarios describing typical monitor and control activity. Then, the parallel distributed problem solving approach to knowledge engineering is introduced. There follows a two-step presentation of the intelligent software design for secondary power control. The first step decomposes the problem of monitoring and control into three primary functions. Each of the primary functions is described in detail. Suggestions for refinements and embellishments in the design specifications are given.

  9. Large scale electron acceleration by parallel electric fields during magnetic reconnection

    NASA Astrophysics Data System (ADS)

    Egedal, J.; Le, A.; Daughton, W.

    2011-10-01

    Magnetic reconnection is a ubiquitous phenomenon in plasmas. It permits an explosive release of energy through changes in the magnetic field line topology. In the Earth's magnetotail, reconnection energizes electrons up to hundreds of keV, and solar flare events can channel up to 50% of the magnetic energy into the electrons, resulting in superthermal populations. Electron energization is also fundamentally important to astrophysical applications, where X-rays generated by relativistic electrons provide a unique window into extreme environments. Here we show that during reconnection powerful energization of electrons by E∥ can occur over spatial scales which hugely exceed what was previously thought possible. Thus, our results are contrary to a fundamental assumption that a hot plasma - a highly conducting medium for electrical current - cannot support any significant E∥ over length scales large compared to the small electron inertial length de = c/ωpe. In our model E∥ is supported by strongly anisotropic features in the electron distributions not permitted in standard fluid formulations, but routinely observed by spacecraft in the Earth's magnetosphere. This allows for electron energization in spatial regions that exceed the regular de-scale electron diffusion region by at least three orders of magnitude.

  10. Large scale electron acceleration by parallel electric fields during magnetic reconnection

    NASA Astrophysics Data System (ADS)

    Egedal, J.; Le, A.; Daughton, W.

    2011-10-01

    Magnetic reconnection is a ubiquitous phenomenon in plasmas. It permits an explosive release of energy through changes in the magnetic field line topology. In the Earth's magnetotail, reconnection energizes electrons up to hundreds of keV, and solar flare events can channel up to 50% of the magnetic energy into the electrons. Electron energization is also fundamentally important to astrophysical applications, where X-rays generated by relativistic electrons provide a unique window into extreme environments. Here we show that during reconnection powerful energization of electrons by E∥ can occur over spatial scales which hugely exceed what was previously thought possible. Thus, our results are contrary to a fundamental assumption that a hot plasma - a highly conducting medium for electrical current - cannot support any significant E∥ over length scales large compared to the small electron inertial length de = c/ωpe. In our model E∥ is supported by non-thermal and strongly anisotropic features in the electron distributions not permitted in standard fluid formulations, but routinely observed by spacecraft in the Earth's magnetosphere. This allows for electron energization in spatial regions that exceed the regular de-scale electron diffusion region by at least three orders of magnitude. This work was supported by NSF CAREER Award 0844620.

  11. Implementation and scaling of the fully coupled Terrestrial Systems Modeling Platform (TerrSysMP) in a massively parallel supercomputing environment - a case study on JUQUEEN (IBM Blue Gene/Q)

    NASA Astrophysics Data System (ADS)

    Gasper, F.; Goergen, K.; Kollet, S.; Shrestha, P.; Sulis, M.; Rihani, J.; Geimer, M.

    2014-06-01

    Continental-scale hyper-resolution simulations constitute a grand challenge in characterizing non-linear feedbacks of states and fluxes of the coupled water, energy, and biogeochemical cycles of terrestrial systems. Tackling this challenge requires advanced coupling and supercomputing technologies for earth system models, which are discussed in this study using the example of the implementation of the newly developed Terrestrial Systems Modeling Platform (TerrSysMP) on JUQUEEN (IBM Blue Gene/Q) at the Jülich Supercomputing Centre, Germany. The applied coupling strategies rely on the Multiple Program Multiple Data (MPMD) paradigm and require memory and load balancing considerations in the exchange of coupling fields between the component models and in the allocation of computational resources. These considerations can be addressed with advanced profiling and tracing tools, leading to the efficient use of massively parallel computing environments, which is then mainly determined by the parallel performance of the individual component models. However, the problem of model I/O and initialization in the peta-scale range requires major attention, because it constitutes a true big-data challenge, still unsolved, in the perspective of future exa-scale capabilities.
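
    The MPMD layout described here amounts to splitting one MPI world into disjoint per-component communicators and balancing rank counts across them. A minimal mpi4py sketch follows; the component entry points and the two-thirds/one-third split are invented for illustration.

        # Launch under MPI, e.g.: mpiexec -n 12 python mpmd_split.py
        from mpi4py import MPI

        def run_atmosphere(comm):   # hypothetical component entry point
            print(f"atmosphere: rank {comm.Get_rank()} of {comm.Get_size()}")

        def run_subsurface(comm):   # hypothetical component entry point
            print(f"land/subsurface: rank {comm.Get_rank()} of {comm.Get_size()}")

        world = MPI.COMM_WORLD
        rank, size = world.Get_rank(), world.Get_size()

        color = 0 if rank < (2 * size) // 3 else 1     # illustrative load split
        comm = world.Split(color=color, key=rank)      # disjoint per-component communicators

        (run_atmosphere if color == 0 else run_subsurface)(comm)

    Coupling fields are then exchanged between the two groups (in practice through a dedicated coupler), which is where the memory and load-balancing considerations above enter.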

  12. Development and application of a 6.5 million feature Affymetrix Genechip® for massively parallel discovery of single position polymorphisms in lettuce (Lactuca spp.)

    PubMed Central

    2012-01-01

    Background: High-resolution genetic maps are needed in many crops to help characterize the genetic diversity that determines agriculturally important traits. Hybridization to microarrays to detect single feature polymorphisms is a powerful technique for marker discovery and genotyping because of its highly parallel nature. However, microarrays designed for gene expression analysis rarely provide sufficient gene coverage for optimal detection of nucleotide polymorphisms, which limits utility in species with low rates of polymorphism such as lettuce (Lactuca sativa).

    Results: We developed a 6.5 million feature Affymetrix GeneChip® for efficient polymorphism discovery and genotyping, as well as for analysis of gene expression in lettuce. Probes on the microarray were designed from 26,809 unigenes from cultivated lettuce and an additional 8,819 unigenes from four related species (L. serriola, L. saligna, L. virosa and L. perennis). Where possible, probes were tiled with a 2 bp stagger, alternating on each DNA strand, providing an average of 187 probes covering approximately 600 bp for each of over 35,000 unigenes and resulting in up to 13-fold redundancy in coverage per nucleotide. We developed protocols for hybridization of genomic DNA to the GeneChip® and refined custom algorithms that utilized coverage from multiple, high quality probes to detect single position polymorphisms in 2 bp sliding windows across each unigene. This allowed us to detect greater than 18,000 polymorphisms between the parental lines of our core mapping population, as well as numerous polymorphisms between cultivated lettuce and wild species in the lettuce genepool. Using marker data from our diversity panel comprised of 52 accessions from the five species listed above, we were able to separate accessions by species using both phylogenetic and principal component analyses. Additionally, we estimated the diversity between different types of cultivated lettuce and distinguished morphological types
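
    A windowed polymorphism call of this kind reduces to comparing, window by window, the intensities of the probes that cover each 2 bp window in two genotypes. The sketch below is a simplified stand-in for the authors' custom algorithms; the array layout, coverage threshold, and z-score cutoff are assumptions.

        import numpy as np

        def call_spps(intens_a, intens_b, probe_starts, probe_len=25,
                      unigene_len=600, z_thresh=4.0):
            """Flag 2 bp windows whose covering probes differ consistently
            between genotypes a and b (same probe order in both arrays)."""
            diff = np.log2(intens_a + 1.0) - np.log2(intens_b + 1.0)
            diff = (diff - diff.mean()) / diff.std()
            calls = []
            for w in range(0, unigene_len - 1, 2):
                cover = (probe_starts <= w) & (probe_starts + probe_len > w + 1)
                n = cover.sum()
                if n >= 5 and abs(diff[cover].mean()) * np.sqrt(n) > z_thresh:
                    calls.append(w)
            return calls

        rng = np.random.default_rng(0)
        starts = np.arange(0, 576, 2)           # 2 bp staggered tiling of one unigene
        a = rng.lognormal(6.0, 0.3, starts.size)
        b = a.copy()
        b[140:160] *= 0.4                       # a variant dampens ~20 probes
        print(call_spps(a, b, starts)[:5])      # windows near position ~300 flagged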

  13. Second-Order Møller-Plesset Perturbation Theory in the Condensed Phase: An Efficient and Massively Parallel Gaussian and Plane Waves Approach.

    PubMed

    Del Ben, Mauro; Hutter, Jürg; VandeVondele, Joost

    2012-11-13

    A novel algorithm, based on a hybrid Gaussian and plane waves (GPW) approach, is developed for the canonical second-order Møller-Plesset perturbation energy (MP2) of finite and extended systems. The key aspect of the method is that the electron repulsion integrals (ia|λσ) are computed by direct integration between the products of Gaussian basis functions λσ and the electrostatic potential arising from a given occupied-virtual pair density ia. The electrostatic potential is obtained in a plane waves basis set after solving the Poisson equation in Fourier space. In particular, for condensed phase systems, this scheme is highly efficient. Furthermore, our implementation has low memory requirements and displays excellent parallel scalability up to 100 000 processes. In this way, canonical MP2 calculations for condensed phase systems containing hundreds of atoms or more than 5000 basis functions can be performed within minutes, while systems up to 1000 atoms and 10 000 basis functions remain feasible. Solid LiH has been employed as a benchmark to study basis set and system size convergence. Lattice constants and cohesive energies of various molecular crystals have been studied with MP2 and double-hybrid functionals. PMID:26605583
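
    For reference, the canonical closed-shell MP2 correlation energy assembled from these (ia|jb) integrals is, in standard notation,

        E^{(2)} = \sum_{ij}^{\mathrm{occ}} \sum_{ab}^{\mathrm{virt}} \frac{(ia|jb) \left[ 2 (ia|jb) - (ib|ja) \right]}{\varepsilon_i + \varepsilon_j - \varepsilon_a - \varepsilon_b} .

    Each occupied-virtual pair density ia requires its own Poisson solve for the electrostatic potential, which suggests the natural source of the parallelism over pairs exploited in the implementation.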

  14. PMESH: A parallel mesh generator

    SciTech Connect

    Hardin, D.D.

    1994-10-21

    The Parallel Mesh Generation (PMESH) Project is a joint LDRD effort by A Division and Engineering to develop a unique mesh generation system that can construct large calculational meshes (of up to 10^9 elements) on massively parallel computers. Such a capability will remove a critical roadblock to unleashing the power of massively parallel processors (MPPs) for physical analysis. PMESH will support a variety of LLNL 3-D physics codes in the areas of electromagnetics, structural mechanics, thermal analysis, and hydrodynamics.

  15. Military Curricula for Vocational & Technical Education. Basic Electricity and Electronics Individualized Learning System. CANTRAC A-100-0010. Module Six: Parallel Circuits. Study Booklet.

    ERIC Educational Resources Information Center

    Chief of Naval Education and Training Support, Pensacola, FL.

    This individualized learning module on parallel circuits is one in a series of modules for a course in basic electricity and electronics. The course is one of a number of military-developed curriculum packages selected for adaptation to vocational instructional and curriculum development in a civilian setting. Four lessons are included in the…

  16. Military Curricula for Vocational & Technical Education. Basic Electricity and Electronics Individualized Learning System. CANTRAC A-100-0010. Module Fourteen: Parallel AC Resistive-Reactive Circuits. Study Booklet.

    ERIC Educational Resources Information Center

    Chief of Naval Education and Training Support, Pensacola, FL.

    This individualized learning module on parallel alternating current resistive-reaction circuits is one in a series of modules for a course in basic electricity and electronics. The course is one of a number of military-developed curriculum packages selected for adaptation to vocational instructional and curriculum development in a civilian…

  17. Massive Stars

    NASA Astrophysics Data System (ADS)

    Livio, Mario; Villaver, Eva

    2009-11-01

    Participants; Preface Mario Livio and Eva Villaver; 1. High-mass star formation by gravitational collapse of massive cores M. R. Krumholz; 2. Observations of massive star formation N. A. Patel; 3. Massive star formation in the Galactic center D. F. Figer; 4. An X-ray tour of massive star-forming regions with Chandra L. K. Townsley; 5. Massive stars: feedback effects in the local universe M. S. Oey and C. J. Clarke; 6. The initial mass function in clusters B. G. Elmegreen; 7. Massive stars and star clusters in the Antennae galaxies B. C. Whitmore; 8. On the binarity of Eta Carinae T. R. Gull; 9. Parameters and winds of hot massive stars R. P. Kudritzki and M. A. Urbaneja; 10. Unraveling the Galaxy to find the first stars J. Tumlinson; 11. Optically observable zero-age main-sequence O stars N. R. Walborn; 12. Metallicity-dependent Wolf-Rayet winds P. A. Crowther; 13. Eruptive mass loss in very massive stars and Population III stars N. Smith; 14. From progenitor to afterlife R. A. Chevalier; 15. Pair-production supernovae: theory and observation E. Scannapieco; 16. Cosmic infrared background and Population III: an overview A. Kashlinsky.

  18. Massive parallel analysis of the binding specificity of histone-like protein HU to single- and double-stranded DNA with generic oligodeoxyribonucleotide microchips.

    SciTech Connect

    Krylov, A. S.; Zasedateleva, O. A.; Prokopenko, D. V.; Rouviere-Yaniv, J.; Mirzabekov, A. D.; Biochip Technology Center; Engelhardt Inst. of Molecular Biology; Inst. de Biologie Physico-Chimique

    2001-06-15

    A generic hexadeoxyribonucleotide microchip has been applied to test the DNA-binding properties of the HU histone-like bacterial protein, which is known to have low sequence specificity. All 4096 hexamers, flanked within 8mers by degenerate bases at both the 3'- and 5'-ends, were immobilized within the 100 x 100 x 20 µm polyacrylamide gel pads of the microchip. Single-stranded immobilized oligonucleotides were converted in some experiments to the double-stranded form by hybridization with a specified mixture of 8mers. The DNA interaction with HU was characterized by three types of measurements: (i) binding of FITC-labeled HU to microchip oligonucleotides; (ii) melting curves of complexes of labeled HU with single-stranded microchip oligonucleotides; (iii) the effect of HU binding on melting curves of microchip double-stranded DNA labeled with another fluorescent dye, Texas Red. Large numbers of measurements of these parameters were carried out in parallel for all or many generic microchip elements in real time with a multi-wavelength fluorescence microscope. Statistical analysis of these data suggests some preference for HU binding to G/C-rich single-stranded oligonucleotides. HU complexes with double-stranded microchip 8mers can be divided into two groups, in which HU binding either increased the melting temperature (T_m) of the duplexes or decreased it. The stabilized duplexes showed some preference for the presence of the sequence motifs AAG, AGA and AAGA. In the second type of complex, enriched with A/T base pairs, the destabilization effect was higher for longer stretches of A/T duplexes. Binding of HU to labeled duplexes in the second type of complex caused some decrease in fluorescence. This decrease also correlates with higher A/T content and lower T_m. The results demonstrate that generic microchips could be an efficient approach to the analysis of the sequence specificity of proteins.

  19. Calculation of the Potential and Electric Flux Lines for Parallel Plate Capacitors with Symmetrically Placed Equal Lengths by Using the Method of Conformal Mapping

    NASA Astrophysics Data System (ADS)

    Albayrak, Erhan

    2001-05-01

    The classical problem of the parallel-plate capacitor has been investigated by a number of authors, including Love [1], Langton [2] and Lin [3]. In this paper, the exact equipotentials and electric flux lines of two symmetrically placed thin conducting plates are obtained using the Schwarz-Christoffel transformation and the method of conformal mapping. The coordinates x, y in the z-plane corresponding to the constant electric flux lines and equipotential lines are obtained after very detailed and cumbersome calculations. The complete field distribution is given by constructing the family of lines of electric flux and equipotentials.
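
    For comparison, the classical Schwarz-Christoffel result for the semi-infinite (edge) version of this problem, Maxwell's fringing-field solution, maps the strip |Im w| ≤ π onto the exterior of plates at y = ±d/2; the finite, equal-length plates treated in the paper require the more elaborate mapping described above. Schematically:

        z = \frac{d}{2\pi} \left( 1 + w + e^{w} \right) , \qquad \phi \propto \operatorname{Im} w .

    Lines of constant Im w map to the equipotentials and lines of constant Re w to the electric flux lines.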

  20. Quark matter in a parallel electric and magnetic field background: Chiral phase transition and equilibration of chiral density

    NASA Astrophysics Data System (ADS)

    Ruggieri, M.; Peng, G. X.

    2016-05-01

    In this article, we study spontaneous chiral symmetry breaking for quark matter in the background of static and homogeneous parallel electric and magnetic fields, E and B. We use a Nambu-Jona-Lasinio model with a local kernel interaction to compute the relevant quantities describing chiral symmetry breaking at finite temperature for a wide range of E and B. We study the effect of this background on the inverse catalysis of chiral symmetry breaking for E and B of the same order of magnitude. We then focus on the effect on the critical temperature of the equilibration of the chiral density n₅ produced dynamically by the axial anomaly. The equilibration of n₅, a consequence of chirality-flipping processes in the thermal bath, allows for the introduction of the chiral chemical potential μ₅, which is computed self-consistently as a function of temperature and field strength by coupling the number equation to the gap equation and solving the two within an expansion in E/T², B/T², and μ₅²/T². We find that even if chirality is produced and equilibrates within a relaxation time τ_M, it does not drastically change the thermodynamics, with particular reference to the inverse catalysis induced by the external fields, as long as the average μ₅ at equilibrium is not too large.
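
    A schematic way to see the equilibration of the chiral density discussed above (units ħ = c = 1, a single quark flavor of electric charge q; color and flavor factors omitted) is a rate equation balancing the anomalous source against chirality-flipping relaxation:

        \frac{dn_5}{dt} = \frac{q^2}{2\pi^2} \, \mathbf{E} \cdot \mathbf{B} - \frac{n_5}{\tau_M} \quad \Longrightarrow \quad n_5^{\mathrm{eq}} = \frac{q^2}{2\pi^2} \, \mathbf{E} \cdot \mathbf{B} \, \tau_M .

    The chiral chemical potential μ₅ then follows by inverting n₅(μ₅, T) through the number equation, as is done self-consistently in the paper.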