Science.gov

Sample records for massively parallel ligation

  1. Massively parallel visualization: Parallel rendering

    SciTech Connect

    Hansen, C.D.; Krogh, M.; White, W.

    1995-12-01

    This paper presents rendering algorithms, developed for massively parallel processors (MPPs), for polygon, sphere, and volumetric data. The polygon algorithm uses a data parallel approach, whereas the sphere and volume renderers use a MIMD approach. Implementations of these algorithms are presented for the Thinking Machines Corporation CM-5 MPP.

  2. Massively parallel mathematical sieves

    SciTech Connect

    Montry, G.R.

    1989-01-01

    The Sieve of Eratosthenes is a well-known algorithm for finding all prime numbers in a given subset of integers. A parallel version of the Sieve is described that produces computational speedups over 800 on a hypercube with 1,024 processing elements for problems of fixed size. Computational speedups as high as 980 are achieved when the problem size per processor is fixed. The method of parallelization generalizes to other sieves and will be efficient on any ensemble architecture. We investigate two highly parallel sieves using scattered decomposition and compare their performance on a hypercube multiprocessor. A comparison of different parallelization techniques for the sieve illustrates the trade-offs necessary in the design and implementation of massively parallel algorithms for large ensemble computers.
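
    As a concrete illustration of the decomposition described above, the following is a minimal Python sketch of a segmented sieve split across worker processes, each marking composites in its own fixed-size segment from a shared set of seed primes. This is an illustration supplied by the editor, not the authors' hypercube implementation.

    ```python
    # Sketch: a segmented Sieve of Eratosthenes parallelized over worker
    # processes with Python's multiprocessing. It illustrates the fixed-size
    # decomposition described above, not the authors' hypercube code.
    from math import isqrt
    from multiprocessing import Pool

    def sieve_upto(n):
        """Serial sieve for the seed primes up to sqrt(N)."""
        flags = bytearray([1]) * (n + 1)
        flags[0:2] = b"\x00\x00"
        for p in range(2, isqrt(n) + 1):
            if flags[p]:
                flags[p * p::p] = bytearray(len(flags[p * p::p]))
        return [i for i, f in enumerate(flags) if f]

    def sieve_segment(args):
        """Each worker marks composites in its own segment [lo, hi)."""
        lo, hi, seed_primes = args
        flags = bytearray([1]) * (hi - lo)
        for p in seed_primes:
            start = max(p * p, (lo + p - 1) // p * p)
            flags[start - lo::p] = bytearray(len(flags[start - lo::p]))
        return [lo + i for i, f in enumerate(flags) if f]

    if __name__ == "__main__":
        N, workers = 1_000_000, 4
        seed = sieve_upto(isqrt(N))
        step = (N - 1) // workers + 1
        tasks = [(lo, min(lo + step, N + 1), seed) for lo in range(2, N + 1, step)]
        with Pool(workers) as pool:
            primes = [p for seg in pool.map(sieve_segment, tasks) for p in seg]
        print(len(primes), "primes below", N)  # expect 78498
    ```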

  3. Massively Parallel QCD

    SciTech Connect

    Soltz, R; Vranas, P; Blumrich, M; Chen, D; Gara, A; Giampapa, M; Heidelberger, P; Salapura, V; Sexton, J; Bhanot, G

    2007-04-11

    The theory of the strong nuclear force, Quantum Chromodynamics (QCD), can be numerically simulated from first principles on massively-parallel supercomputers using the method of Lattice Gauge Theory. We describe the special programming requirements of lattice QCD (LQCD) as well as the optimal supercomputer hardware architectures that it suggests. We demonstrate these methods on the BlueGene massively-parallel supercomputer and argue that LQCD and the BlueGene architecture are a natural match. This can be traced to the simple fact that LQCD is a regular lattice discretization of space into lattice sites while the BlueGene supercomputer is a discretization of space into compute nodes, and that both are constrained by requirements of locality. This simple relation is both technologically important and theoretically intriguing. The main result of this paper is the speedup of LQCD using up to 131,072 CPUs on the largest BlueGene/L supercomputer. The speedup is perfect with sustained performance of about 20% of peak. This corresponds to a maximum of 70.5 sustained TFlop/s. At these speeds LQCD and BlueGene are poised to produce the next generation of strong interaction physics theoretical results.

  4. Parallel rendering techniques for massively parallel visualization

    SciTech Connect

    Hansen, C.; Krogh, M.; Painter, J.

    1995-07-01

    As the resolution of simulation models increases, scientific visualization algorithms which take advantage of the large memory and parallelism of Massively Parallel Processors (MPPs) are becoming increasingly important. For large applications, rendering on the MPP tends to be preferable to rendering on a graphics workstation due to the MPP's abundant resources: memory, disk, and numerous processors. The challenge becomes developing algorithms that can exploit these resources while minimizing overhead, typically communication costs. This paper describes recent efforts in parallel rendering for polygonal primitives as well as parallel volumetric techniques. It presents rendering algorithms, developed for massively parallel processors (MPPs), for polygon, sphere, and volumetric data. The polygon algorithm uses a data parallel approach, whereas the sphere and volume renderers use a MIMD approach. Implementations of these algorithms are presented for the Thinking Machines Corporation CM-5 MPP.

  5. Efficient, massively parallel eigenvalue computation

    NASA Technical Reports Server (NTRS)

    Huo, Yan; Schreiber, Robert

    1993-01-01

    In numerical simulations of disordered electronic systems, one of the most common approaches is to diagonalize random Hamiltonian matrices and to study the eigenvalues and eigenfunctions of a single electron in the presence of a random potential. An effort to implement a matrix diagonalization routine for real symmetric dense matrices on massively parallel SIMD computers, the Maspar MP-1 and MP-2 systems, is described. Results of numerical tests and timings are also presented.
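
    For context, the classic dense symmetric eigensolver for SIMD machines is the Jacobi method, because the independent rotations of a sweep can be applied in parallel. The serial Python sketch below is an editor-supplied illustration of the rotation arithmetic, not the authors' MasPar routine.

    ```python
    # Sketch: cyclic Jacobi sweeps for a real symmetric matrix. Jacobi
    # methods suit SIMD machines because n/2 independent rotations can be
    # applied simultaneously; this serial version only shows the arithmetic.
    import numpy as np

    def jacobi_eigenvalues(A, tol=1e-10, max_sweeps=50):
        A = A.astype(float).copy()
        n = A.shape[0]
        for _ in range(max_sweeps):
            if np.sqrt(np.sum(np.tril(A, -1) ** 2)) < tol:
                break
            for p in range(n - 1):
                for q in range(p + 1, n):
                    if abs(A[p, q]) < 1e-30:
                        continue
                    # Choose the rotation angle that zeroes A[p, q].
                    theta = 0.5 * np.arctan2(2 * A[p, q], A[q, q] - A[p, p])
                    c, s = np.cos(theta), np.sin(theta)
                    J = np.eye(n)
                    J[p, p] = J[q, q] = c
                    J[p, q], J[q, p] = s, -s
                    A = J.T @ A @ J  # full-matrix update: clear, not fast
        return np.sort(np.diag(A))

    A = np.array([[4.0, 1, 2], [1, 3, 0], [2, 0, 1]])
    print(jacobi_eigenvalues(A))
    print(np.sort(np.linalg.eigvalsh(A)))  # agreement check
    ```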

  6. Massively parallel MRI detector arrays

    NASA Astrophysics Data System (ADS)

    Keil, Boris; Wald, Lawrence L.

    2013-04-01

    Originally proposed as a method to increase sensitivity by extending the locally high sensitivity of small surface coil elements to larger areas via reception, the term parallel imaging now includes the use of array coils to perform image encoding. This methodology has impacted clinical imaging to the point where many examinations are performed with an array comprising multiple smaller surface coil elements as the detector of the MR signal. This article reviews the theoretical and experimental basis for the trend towards higher channel counts, relying on insights gained from modeling and experimental studies as well as the theoretical analysis of the so-called “ultimate” SNR and g-factor. We also review the methods for optimally combining array data and the changes in RF methodology needed to construct massively parallel MRI detector arrays, and show some state-of-the-art examples of highly accelerated imaging with the resulting highly parallel arrays.

  7. Massively Parallel MRI Detector Arrays

    PubMed Central

    Keil, Boris; Wald, Lawrence L

    2013-01-01

    Originally proposed as a method to increase sensitivity by extending the locally high sensitivity of small surface coil elements to larger areas, the term parallel imaging now includes the use of array coils to perform image encoding. This methodology has impacted clinical imaging to the point where many examinations are performed with an array comprising multiple smaller surface coil elements as the detector of the MR signal. This article reviews the theoretical and experimental basis for the trend towards higher channel counts, relying on insights gained from modeling and experimental studies as well as the theoretical analysis of the so-called “ultimate” SNR and g-factor. We also review the methods for optimally combining array data and the changes in RF methodology needed to construct massively parallel MRI detector arrays, and show some state-of-the-art examples of highly accelerated imaging with the resulting highly parallel arrays. PMID:23453758

  8. Massively parallel MRI detector arrays.

    PubMed

    Keil, Boris; Wald, Lawrence L

    2013-04-01

    Originally proposed as a method to increase sensitivity by extending the locally high sensitivity of small surface coil elements to larger areas via reception, the term parallel imaging now includes the use of array coils to perform image encoding. This methodology has impacted clinical imaging to the point where many examinations are performed with an array comprising multiple smaller surface coil elements as the detector of the MR signal. This article reviews the theoretical and experimental basis for the trend towards higher channel counts, relying on insights gained from modeling and experimental studies as well as the theoretical analysis of the so-called "ultimate" SNR and g-factor. We also review the methods for optimally combining array data and the changes in RF methodology needed to construct massively parallel MRI detector arrays, and show some state-of-the-art examples of highly accelerated imaging with the resulting highly parallel arrays. PMID:23453758

  9. Merlin - Massively parallel heterogeneous computing

    NASA Technical Reports Server (NTRS)

    Wittie, Larry; Maples, Creve

    1989-01-01

    Hardware and software for Merlin, a new kind of massively parallel computing system, are described. Eight computers are linked as a 300-MIPS prototype to develop system software for a larger Merlin network with 16 to 64 nodes, totaling 600 to 3000 MIPS. These working prototypes help refine a mapped reflective memory technique that offers a new, very general way of linking many types of computer to form supercomputers. Processors share data selectively and rapidly on a word-by-word basis. Fast firmware virtual circuits are reconfigured to match topological needs of individual application programs. Merlin's low-latency memory-sharing interfaces solve many problems in the design of high-performance computing systems. The Merlin prototypes are intended to run parallel programs for scientific applications and to determine hardware and software needs for a future Teraflops Merlin network.

  10. Massively parallel quantum computer simulator

    NASA Astrophysics Data System (ADS)

    De Raedt, K.; Michielsen, K.; De Raedt, H.; Trieu, B.; Arnold, G.; Richter, M.; Lippert, Th.; Watanabe, H.; Ito, N.

    2007-01-01

    We describe portable software to simulate universal quantum computers on massively parallel computers. We illustrate the use of the simulation software by running various quantum algorithms on different computer architectures, such as an IBM BlueGene/L, an IBM Regatta p690+, a Hitachi SR11000/J1, a Cray X1E, an SGI Altix 3700 and clusters of PCs running Windows XP. We study the performance of the software by simulating quantum computers containing up to 36 qubits, using up to 4096 processors and up to 1 TB of memory. Our results demonstrate that the simulator exhibits nearly ideal scaling as a function of the number of processors and suggest that the simulation software described in this paper may also serve as a benchmark for testing high-end parallel computers.
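
    The memory figures above follow from state-vector simulation: n qubits require 2^n complex amplitudes, so 36 qubits at 16 bytes per amplitude need on the order of a terabyte. A minimal Python sketch of the core update, assuming the standard state-vector approach rather than the paper's specific code:

    ```python
    # Sketch: the core of a state-vector quantum computer simulator. An
    # n-qubit state takes 2**n complex amplitudes; a gate acts by
    # contracting one tensor axis. A textbook construction, not the
    # paper's actual software.
    import numpy as np

    def apply_single_qubit_gate(state, gate, target, n):
        """Apply a 2x2 unitary to qubit `target` of an n-qubit state."""
        psi = state.reshape([2] * n)
        psi = np.tensordot(gate, psi, axes=([1], [target]))
        psi = np.moveaxis(psi, 0, target)
        return psi.reshape(-1)

    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate

    n = 3
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = 1.0                       # start in |000>
    for q in range(n):                   # H on every qubit
        state = apply_single_qubit_gate(state, H, q, n)
    print(np.round(np.abs(state) ** 2, 3))  # uniform over all 8 outcomes
    ```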

  11. Massively parallel femtosecond laser processing.

    PubMed

    Hasegawa, Satoshi; Ito, Haruyasu; Toyoda, Haruyoshi; Hayasaki, Yoshio

    2016-08-01

    Massively parallel femtosecond laser processing with more than 1000 beams was demonstrated. Parallel beams were generated by a computer-generated hologram (CGH) displayed on a spatial light modulator (SLM). The key to this technique is to optimize the CGH in the laser processing system using a scheme called in-system optimization. It was analytically demonstrated that the number of beams is determined by the horizontal number of pixels N_SLM in the SLM that is imaged at the pupil plane of an objective lens and a distance parameter p_d obtained by dividing the distance between adjacent beams by the diffraction-limited beam diameter. A performance limitation of parallel laser processing in our system was estimated at N_SLM of 250 and p_d of 7.0. Based on these parameters, the maximum number of beams in a hexagonal close-packed structure was calculated to be 1189 by using an analytical equation. PMID:27505815

  12. Multigrid on massively parallel architectures

    SciTech Connect

    Falgout, R D; Jones, J E

    1999-09-17

    The scalable implementation of multigrid methods for machines with several thousands of processors is investigated. Parallel performance models are presented for three different structured-grid multigrid algorithms, and a description is given of how these models can be used to guide implementation. Potential pitfalls are illustrated when moving from moderate-sized parallelism to large-scale parallelism, and results are given from existing multigrid codes to support the discussion. Finally, the use of mixed programming models is investigated for multigrid codes on clusters of SMPs.

  13. Massively parallel neurocomputing for aerospace applications

    NASA Technical Reports Server (NTRS)

    Fijany, Amir; Barhen, Jacob; Toomarian, Nikzad

    1993-01-01

    An innovative hybrid, analog-digital charge-domain technology, for the massively parallel VLSI implementation of certain large scale matrix-vector operations, has recently been introduced. It employs arrays of Charge Coupled/Charge Injection Device cells holding an analog matrix of charge, which process digital vectors in parallel by means of binary, non-destructive charge transfer operations. The impact of this technology on massively parallel processing is discussed. Fundamentally new classes of algorithms, specifically designed for this emerging technology, as applied to signal processing, are derived.

  14. Massively Parallel Computing: A Sandia Perspective

    SciTech Connect

    Dosanjh, Sudip S.; Greenberg, David S.; Hendrickson, Bruce; Heroux, Michael A.; Plimpton, Steve J.; Tomkins, James L.; Womble, David E.

    1999-05-06

    The computing power available to scientists and engineers has increased dramatically in the past decade, due in part to progress in making massively parallel computing practical and available. The expectation for these machines has been great. The reality is that progress has been slower than expected. Nevertheless, massively parallel computing is beginning to realize its potential for enabling significant breakthroughs in science and engineering. This paper provides a perspective on the state of the field, colored by the authors' experiences using large scale parallel machines at Sandia National Laboratories. We address trends in hardware, system software and algorithms, and we also offer our view of the forces shaping the parallel computing industry.

  15. Rectal ulcers and massive bleeding after hemorrhoidal band ligation while on aspirin

    PubMed Central

    Patel, Shruti; Shahzad, Ghulamullah; Rizvon, Kaleem; Subramani, Krishnaiyer; Viswanathan, Prakash; Mustacchia, Paul

    2014-01-01

    Endoscopic hemorrhoidal band ligation is a well-established nonoperative method for treatment of bleeding internal hemorrhoids (grade 1 to 3). It is a safe and effective technique with a high success rate. Complications with this procedure are uncommon. Although rectal ulceration due to band ligation is a rare complication, it can cause life-threatening hemorrhage, especially when patients are on medications which impair hemostasis, like aspirin or nonsteroidal anti-inflammatory drugs. We present 2 cases of massive lower gastrointestinal bleeding in patients who had a band ligation procedure performed 2 wk prior to presentation and were on aspirin at home. Both patients were hemodynamically unstable, requiring resuscitation. They required platelet and blood transfusions and were found to have rectal ulcers on subsequent colonoscopy. The rectal ulcers corresponded to the site of band ligation. The use of aspirin by these patients would have caused defects in hemostasis and may have predisposed them to massive bleeding in the presence of rectal ulcers occurring after the band ligation procedure. Managing aspirin before and after the ligation may be difficult, especially since adequate guidelines are unavailable. Stopping aspirin in all cases might not be safe and the decision should be individualized. PMID:24749117

  16. Massive parallelism in the future of science

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.

    1988-01-01

    Massive parallelism appears in three domains of action of concern to scientists, where it produces collective action that is not possible from any individual agent's behavior. In the domain of data parallelism, computers comprising very large numbers of processing agents, one for each data item in the result, will be designed. These agents collectively can solve problems thousands of times faster than current supercomputers. In the domain of distributed parallelism, computations comprising large numbers of resources attached to the world network will be designed. The network will support computations far beyond the power of any one machine. In the domain of people parallelism, collaborations among large groups of scientists around the world who participate in projects that endure well past the sojourns of individuals within them will be designed. Computing and telecommunications technology will support the large, long projects that will characterize big science by the turn of the century. Scientists must become masters in these three domains during the coming decade.

  17. Massively parallel sequencing and rare disease

    PubMed Central

    Ng, Sarah B.; Nickerson, Deborah A.; Bamshad, Michael J.; Shendure, Jay

    2010-01-01

    Massively parallel sequencing has enabled the rapid, systematic identification of variants on a large scale. This has, in turn, accelerated the pace of gene discovery and disease diagnosis on a molecular level and has the potential to revolutionize methods particularly for the analysis of Mendelian disease. Using massively parallel sequencing has enabled investigators to interrogate variants both in the context of linkage intervals and also on a genome-wide scale, in the absence of linkage information entirely. The primary challenge now is to distinguish between background polymorphisms and pathogenic mutations. Recently developed strategies for rare monogenic disorders have met with some early success. These strategies include filtering for potential causal variants based on frequency and function, and also ranking variants based on conservation scores and predicted deleteriousness to protein structure. Here, we review the recent literature in the use of high-throughput sequence data and its analysis in the discovery of causal mutations for rare disorders. PMID:20846941
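
    A toy Python sketch of the filter-and-rank strategy the review describes; the field names, thresholds, and records are illustrative assumptions, not taken from the article.

    ```python
    # Sketch: the frequency-and-function filtering strategy described above.
    # Field names, thresholds, and the toy records are illustrative
    # assumptions, not taken from the review.
    DAMAGING = {"missense", "nonsense", "frameshift", "splice_site"}

    def candidate_causal(variants, max_pop_freq=0.001):
        """Keep rare, protein-altering variants, ranked by conservation."""
        kept = [v for v in variants
                if v["pop_freq"] <= max_pop_freq   # filter on frequency
                and v["effect"] in DAMAGING]       # filter on function
        return sorted(kept, key=lambda v: -v["conservation"])

    variants = [
        {"gene": "GENE_A", "pop_freq": 0.2000, "effect": "missense",   "conservation": 0.90},
        {"gene": "GENE_B", "pop_freq": 0.0002, "effect": "nonsense",   "conservation": 0.97},
        {"gene": "GENE_C", "pop_freq": 0.0001, "effect": "synonymous", "conservation": 0.50},
    ]
    print(candidate_causal(variants))  # only GENE_B survives both filters
    ```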

  18. Associative massively parallel processor for video processing

    NASA Astrophysics Data System (ADS)

    Krikelis, Argy; Tawiah, T.

    1996-03-01

    Massively parallel processing architectures have matured primarily through image processing and computer vision applications. The similarity of processing requirements between these areas and video processing suggests that they should be very appropriate for video processing applications. This research describes the use of an associative massively parallel processing based system for video compression, which includes an architectural and system description, discussion of the implementation of compression tasks such as DCT/IDCT, motion estimation and quantization, and system evaluation. The core of the processing system is the ASP (Associative String Processor) architecture, a modular, massively parallel, programmable and inherently fault-tolerant fine-grain SIMD processing architecture incorporating a string of identical APEs (Associative Processing Elements), a reconfigurable inter-processor communication network and a Vector Data Buffer for fully-overlapped data input-output. For video compression applications a prototype system has been developed, which uses ASP modules to implement the required compression tasks. This scheme leads to a linear speed-up of the computation by simply adding more APEs to the modules.

  19. Template based parallel checkpointing in a massively parallel computer system

    DOEpatents

    Archer, Charles Jens; Inglett, Todd Alan

    2009-01-13

    A method and apparatus for a template based parallel checkpoint save for a massively parallel supercomputer system using a parallel variation of the rsync protocol and network broadcast. In preferred embodiments, the checkpoint data for each node is compared to a template checkpoint file that resides in storage and was previously produced. Embodiments herein greatly decrease the amount of data that must be transmitted and stored, for faster checkpointing and increased efficiency of the computer system. Embodiments are directed to a parallel computer system with nodes arranged in a cluster with a high-speed interconnect that can perform broadcast communication. The checkpoint contains a set of actual small data blocks with their corresponding checksums from all nodes in the system. The data blocks may be compressed using conventional non-lossy data compression algorithms to further reduce the overall checkpoint size.
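
    A minimal Python sketch of the template-comparison idea: each node hashes fixed-size blocks of its checkpoint and ships only the blocks whose checksums differ from the stored template. Block size, hash, and data layout are assumptions of this illustration, not the patent's specification.

    ```python
    # Sketch: an rsync-like delta against a checkpoint template. Only blocks
    # whose checksums differ from the template are transmitted. Block size
    # and hash choice are assumptions of this illustration.
    import hashlib

    BLOCK = 4096

    def block_checksums(data):
        return [hashlib.sha1(data[i:i + BLOCK]).digest()
                for i in range(0, len(data), BLOCK)]

    def delta_against_template(node_data, template_sums):
        """Return the (index, block) pairs the template does not already hold."""
        delta = []
        for i, chk in enumerate(block_checksums(node_data)):
            if i >= len(template_sums) or chk != template_sums[i]:
                delta.append((i, node_data[i * BLOCK:(i + 1) * BLOCK]))
        return delta

    template = bytes(16 * BLOCK)              # previously stored checkpoint
    current = bytearray(template)
    current[5 * BLOCK + 7] = 0xFF             # one block changed since then
    delta = delta_against_template(bytes(current), block_checksums(template))
    print(f"{len(delta)} of 16 blocks must be sent")  # -> 1 of 16
    ```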

  20. Efficient communication in massively parallel computers

    SciTech Connect

    Cypher, R.E.

    1989-01-01

    A fundamental operation in parallel computation is sorting. Sorting is important not only because it is required by many algorithms, but also because it can be used to implement irregular, pointer-based communication. The author studies two algorithms for sorting in massively parallel computers. First, he examines Shellsort. Shellsort is a sorting algorithm that is based on a sequence of parameters called increments. Shellsort can be used to create a parallel sorting device known as a sorting network. Researchers have suggested that if the correct increment sequence is used, an optimal size sorting network can be obtained. All published increment sequences have been monotonically decreasing. He shows that no monotonically decreasing increment sequence will yield an optimal size sorting network. Second, he presents a sorting algorithm called Cubesort. Cubesort is the fastest known sorting algorithm for a variety of parallel computers over a wide range of parameters. He also presents a paradigm for developing parallel algorithms that have efficient communication. The paradigm, called the data reduction paradigm, consists of using a divide-and-conquer strategy. Both the division and combination phases of the divide-and-conquer algorithm may require irregular, pointer-based communication between processors. However, the problem is divided so as to limit the amount of data that must be communicated. As a result the communication can be performed efficiently. He presents data reduction algorithms for the image component labeling problem, the closest pair problem and four versions of the parallel prefix problem.
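
    For reference, Shellsort and its increment sequence look like this in a short Python sketch; the increments shown (Ciura's sequence) are a common practical choice, not a sequence from the thesis.

    ```python
    # Sketch: Shellsort driven by an explicit increment sequence, the object
    # of the thesis' lower-bound result. The increments shown (Ciura's) are
    # a common practical choice, not one discussed in the abstract.
    def shellsort(a, increments=(701, 301, 132, 57, 23, 10, 4, 1)):
        for h in increments:
            for i in range(h, len(a)):        # insertion sort with stride h
                x, j = a[i], i
                while j >= h and a[j - h] > x:
                    a[j] = a[j - h]
                    j -= h
                a[j] = x
        return a

    print(shellsort([5, 2, 9, 1, 7, 3, 8, 6, 4, 0]))
    ```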

  1. Massively Parallel Direct Simulation of Multiphase Flow

    SciTech Connect

    COOK, BENJAMIN K.; PREECE, DALE S.; WILLIAMS, J.R.

    2000-08-10

    The authors' understanding of multiphase physics and the associated predictive capability for multi-phase systems are severely limited by current continuum modeling methods and experimental approaches. This research will deliver an unprecedented modeling capability to directly simulate three-dimensional multi-phase systems at the particle scale. The model solves the fully coupled equations of motion governing the fluid phase and the individual particles comprising the solid phase using a newly discovered, highly efficient coupled numerical method based on the discrete-element method and the Lattice-Boltzmann method. A massively parallel implementation will enable the solution of large, physically realistic systems.

  2. Time sharing massively parallel machines. Draft

    SciTech Connect

    Gorda, B.; Wolski, R.

    1995-03-01

    As part of the Massively Parallel Computing Initiative (MPCI) at the Lawrence Livermore National Laboratory, the authors have developed a simple, effective and portable time sharing mechanism by scheduling gangs of processes on tightly coupled parallel machines. By time-sharing the resources, the system interleaves production and interactive jobs. Immediate priority is given to interactive use, maintaining good response time. Production jobs are scheduled during idle periods, making use of the otherwise unused resources. In this paper the authors discuss their experience with gang scheduling over the 3-year lifetime of the project. In Section 2, they motivate the project and discuss some of its details. Section 3 describes the general scheduling problem and how gang scheduling addresses it. In Section 4, they describe the implementation. Section 8 presents results culled over the lifetime of the project. They conclude this paper with some observations and possible future directions.

  3. Seismic imaging on massively parallel computers

    SciTech Connect

    Ober, C.C.; Oldfield, R.A.; Womble, D.E.; Mosher, C.C.

    1997-07-01

    A key to reducing the risks and costs associated with oil and gas exploration is the fast, accurate imaging of complex geologies, such as salt domes in the Gulf of Mexico and overthrust regions in US onshore regions. Pre-stack depth migration generally yields the most accurate images, and one approach to this is to solve the scalar-wave equation using finite differences. Current industry computational capabilities are insufficient for the application of finite-difference, 3-D, prestack, depth-migration algorithms. High performance computers and state-of-the-art algorithms and software are required to meet this need. As part of an ongoing ACTI project funded by the US Department of Energy, the authors have developed a finite-difference, 3-D prestack, depth-migration code for massively parallel computer systems. The goal of this work is to demonstrate that massively parallel computers (thousands of processors) can be used efficiently for seismic imaging, and that sufficient computing power exists (or soon will exist) to make finite-difference, prestack, depth migration practical for oil and gas exploration.

  4. Massive hybrid parallelism for fully implicit multiphysics

    SciTech Connect

    Gaston, D. R.; Permann, C. J.; Andrs, D.; Peterson, J. W.

    2013-07-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design are given. Representative massively parallel results from several application areas are presented, and a brief discussion of future areas of research for the framework is provided.

  5. MASSIVE HYBRID PARALLELISM FOR FULLY IMPLICIT MULTIPHYSICS

    SciTech Connect

    Cody J. Permann; David Andrs; John W. Peterson; Derek R. Gaston

    2013-05-01

    As hardware advances continue to modify the supercomputing landscape, traditional scientific software development practices will become more outdated, ineffective, and inefficient. The process of rewriting/retooling existing software for new architectures is a Sisyphean task, and results in substantial hours of development time, effort, and money. Software libraries which provide an abstraction of the resources provided by such architectures are therefore essential if the computational engineering and science communities are to continue to flourish in this modern computing environment. The Multiphysics Object Oriented Simulation Environment (MOOSE) framework enables complex multiphysics analysis tools to be built rapidly by scientists, engineers, and domain specialists, while also allowing them to both take advantage of current HPC architectures, and efficiently prepare for future supercomputer designs. MOOSE employs a hybrid shared-memory and distributed-memory parallel model and provides a complete and consistent interface for creating multiphysics analysis tools. In this paper, a brief discussion of the mathematical algorithms underlying the framework and the internal object-oriented hybrid parallel design are given. Representative massively parallel results from several application areas are presented, and a brief discussion of future areas of research for the framework is provided.

  6. Solid modeling on a massively parallel processor

    SciTech Connect

    Strip, D.; Karasick, M.

    1992-01-01

    Solid modeling underlies many technologies that are key to modern manufacturing. These range from computer-aided design systems to robot simulators, from finite element analysis to integrated circuit process modeling. The accuracy, and hence the utility, of these models is often constrained by the amount of computer time required to perform the desired operations. This paper presents a family of algorithms for solid modeling operations using the Connection Machine, a massively parallel SIMD processor. The authors describe a data structure for representing solid models and algorithms that use the representation to implement efficiently a variety of solid modeling operations. The authors give a sketch of the algorithm for intersecting solids and present computational experience using these algorithms. The data structure and algorithms are contrasted with those of serial architectures, and execution times are compared.

  7. Massively parallel neural network intelligent browse

    NASA Astrophysics Data System (ADS)

    Maxwell, Thomas P.; Zion, Philip M.

    1992-04-01

    A massively parallel neural network architecture is currently being developed as a potential component of a distributed information system in support of NASA's Earth Observing System. This architecture can be trained, via an iterative learning process, to recognize objects in images based on texture features, allowing scientists to search for all patterns which are similar to a target pattern in a database of images. It may facilitate scientific inquiry by allowing scientists to automatically search for physical features of interest in a database through computer pattern recognition, alleviating the need for exhaustive visual searches through possibly thousands of images. The architecture is implemented on a Connection Machine such that each physical processor contains a simulated 'neuron' which views a feature vector derived from a subregion of the input image. Each of these neurons is trained, via the perceptron rule, to identify the same pattern. The network output gives a probability distribution over the input image of finding the target pattern in a given region. In initial tests the architecture was trained to separate regions containing clouds from clear regions in 512 by 512 pixel AVHRR images. We found that in about 10 minutes we can train a network to perform with high accuracy in recognizing clouds which were texturally similar to a target cloud group. These promising results suggest that this type of architecture may play a significant role in coping with the forthcoming flood of data from the Earth-monitoring missions of the major space-faring nations.
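
    A small Python sketch of the perceptron rule each simulated neuron applies to its subregion's feature vector; the data are synthetic, and the actual texture features and Connection Machine mapping are not reproduced.

    ```python
    # Sketch: the perceptron rule each simulated neuron applies to the
    # feature vector of its own image subregion; on the Connection Machine
    # every processor runs the same update. Data here are synthetic.
    import numpy as np

    def train_perceptron(X, y, lr=0.1, epochs=50):
        """Binary targets y in {0, 1}; returns weights w and bias b."""
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                pred = float(w @ xi + b > 0)
                w += lr * (yi - pred) * xi   # perceptron update rule
                b += lr * (yi - pred)
        return w, b

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))              # stand-ins for texture features
    y = (X[:, 0] + X[:, 1] > 0).astype(float)  # a linearly separable target
    w, b = train_perceptron(X, y)
    print("training accuracy:", np.mean((X @ w + b > 0) == y))
    ```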

  8. Seismic imaging on massively parallel computers

    SciTech Connect

    Ober, C.C.; Oldfield, R.; Womble, D.E.; VanDyke, J.; Dosanjh, S.

    1996-03-01

    Fast, accurate imaging of complex, oil-bearing geologies, such as overthrusts and salt domes, is the key to reducing the costs of domestic oil and gas exploration. Geophysicists say that the known oil reserves in the Gulf of Mexico could be significantly increased if accurate seismic imaging beneath salt domes was possible. A range of techniques exist for imaging these regions, but the highly accurate techniques involve the solution of the wave equation and are characterized by large data sets and large computational demands. Massively parallel computers can provide the computational power for these highly accurate imaging techniques. A brief introduction to seismic processing will be presented, and the implementation of a seismic-imaging code for distributed memory computers will be discussed. The portable code, Salvo, performs wave equation-based, 3-D, prestack depth imaging and currently runs on the Intel Paragon and the Cray T3D. It uses MPI for portability and has sustained 22 Mflops/sec/proc (compiled FORTRAN) on the Intel Paragon.

  9. Multiplexed microsatellite recovery using massively parallel sequencing

    USGS Publications Warehouse

    Jennings, T.N.; Knaus, B.J.; Mullins, T.D.; Haig, S.M.; Cronn, R.C.

    2011-01-01

    Conservation and management of natural populations requires accurate and inexpensive genotyping methods. Traditional microsatellite, or simple sequence repeat (SSR), marker analysis remains a popular genotyping method because of the comparatively low cost of marker development, ease of analysis and high power of genotype discrimination. With the availability of massively parallel sequencing (MPS), it is now possible to sequence microsatellite-enriched genomic libraries in multiplex pools. To test this approach, we prepared seven microsatellite-enriched, barcoded genomic libraries from diverse taxa (two conifer trees, five birds) and sequenced these on one lane of the Illumina Genome Analyzer using paired-end 80-bp reads. In this experiment, we screened 6.1 million sequences and identified 356,958 unique microreads that contained di- or trinucleotide microsatellites. Examination of four species shows that our conversion rate from raw sequences to polymorphic markers compares favourably to Sanger- and 454-based methods. The advantage of multiplexed MPS is that the staggering capacity of modern microread sequencing is spread across many libraries; this reduces sample preparation and sequencing costs to less than $400 (USD) per species. This price is sufficiently low that microsatellite libraries could be prepared and sequenced for all 1373 organisms listed as 'threatened' and 'endangered' in the United States for under $0.5M (USD).
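
    The screening step, identifying reads that contain di- or trinucleotide repeats, can be sketched in a few lines of Python; the minimum repeat counts used here are illustrative assumptions.

    ```python
    # Sketch: screening reads for di- and trinucleotide microsatellites, the
    # step applied above to 6.1 million sequences. The minimum repeat counts
    # (6 and 4) are illustrative assumptions.
    import re

    SSR = re.compile(r"([ACGT]{2})\1{5,}|([ACGT]{3})\2{3,}")

    def microsatellite(read):
        m = SSR.search(read)
        return m.group(0) if m else None

    reads = [
        "TTGACACACACACACACAGGT",   # (CA)n dinucleotide repeat
        "GGAATAATAATAATAATCCGA",   # (AAT)n trinucleotide repeat
        "GATTACAGATTACAGGCCTTA",   # no qualifying repeat
    ]
    for r in reads:
        print(r, "->", microsatellite(r))
    ```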

  10. Fault tolerant massively parallel processing architecture

    SciTech Connect

    Balasubramanian, V.; Banerjee, P.

    1987-08-01

    This paper presents two massively parallel processing architectures suitable for solving a wide variety of algorithms of divide-and-conquer type for problems such as the discrete Fourier transform, production systems, design automation, and others. The first architecture, called the Chain-structured Butterfly ARchitecture (CBAR), consists of a two-dimensional array of N = L·(log₂(L)+1) processing elements (PEs) organized as L levels of log₂(L)+1 stages, with butterfly connections between PEs in consecutive stages and straight-through feedback between PEs in the last and first stages. This connection system has the desirable property of allowing thousands of PEs to be connected with O(N) connection cost, O(log₂(N/log₂N)) communication paths, and a small number (=4) of I/O ports per PE. However, this architecture is not fault tolerant. The authors, therefore, propose a second architecture, called the REconfigurable Chain-structured Butterfly ARchitecture (RECBAR), which is a modified version of the CBAR. The RECBAR possesses all the desirable features of the CBAR, with the number of I/O ports per PE increased to six, and uses O((log₂N)/N) overhead in PEs and approximately 50% overhead in links to achieve single-level fault tolerance. Reliability improvements of the RECBAR over the CBAR are studied. This paper also presents a distributed diagnostic and structuring algorithm for the RECBAR that enables the architecture to detect faults and structure itself accordingly within 2·log₂(L)+1 time steps, thus making it a truly fault tolerant architecture.

  11. The EMCC / DARPA Massively Parallel Electromagnetic Scattering Project

    NASA Technical Reports Server (NTRS)

    Woo, Alex C.; Hill, Kueichien C.

    1996-01-01

    The Electromagnetic Code Consortium (EMCC) was sponsored by the Advanced Research Projects Agency (ARPA) to demonstrate the effectiveness of massively parallel computing in large scale radar signature predictions. The EMCC/ARPA project consisted of three parts.

  12. Visualization on massively parallel computers using CM/AVS

    SciTech Connect

    Krogh, M.F.; Hansen, C.D.

    1993-09-01

    CM/AVS is a visualization environment for the massively parallel CM-5 from Thinking Machines. It provides a backend to the standard commercially available AVS visualization product. At the Advanced Computing Laboratory at Los Alamos National Laboratory, we have been experimenting with and utilizing this software within our visualization environment. This paper describes our experiences with CM/AVS. The conclusions reached are applicable to any implementation of visualization software within a massively parallel computing environment.

  13. Experimental free-space optical network for massively parallel computers

    NASA Astrophysics Data System (ADS)

    Araki, S.; Kajita, M.; Kasahara, K.; Kubota, K.; Kurihara, K.; Redmond, I.; Schenfeld, E.; Suzaki, T.

    1996-03-01

    A free-space optical interconnection scheme is described for massively parallel processors based on the interconnection-cached network architecture. The optical network operates in a circuit-switching mode. Combined with a packet-switching operation among the circuit-switched optical channels, a high-bandwidth, low-latency network for massively parallel processing results. The design and assembly of a 64-channel experimental prototype is discussed, and operational results are presented.

  14. RAMA: A file system for massively parallel computers

    NASA Technical Reports Server (NTRS)

    Miller, Ethan L.; Katz, Randy H.

    1993-01-01

    This paper describes a file system design for massively parallel computers which makes very efficient use of a few disks per processor. This overcomes the traditional I/O bottleneck of massively parallel machines by storing the data on disks within the high-speed interconnection network. In addition, the file system, called RAMA, requires little inter-node synchronization, removing another common bottleneck in parallel processor file systems. Support for a large tertiary storage system can easily be integrated into the file system; in fact, RAMA runs most efficiently when tertiary storage is used.
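
    The key idea, that any node can locate any file block without a central directory, can be sketched with a hash-based placement function in Python; the hash and layout below are assumptions of this illustration, not RAMA's exact scheme.

    ```python
    # Sketch: hash-based block placement in the spirit of RAMA, letting any
    # node compute a block's location without inter-node metadata lookups.
    # Hash and layout are assumptions, not the paper's exact scheme.
    import hashlib

    def block_home(file_id: str, block_no: int, num_nodes: int) -> int:
        """Owner node of (file, block), computable anywhere, no messages."""
        key = f"{file_id}:{block_no}".encode()
        return int.from_bytes(hashlib.md5(key).digest()[:4], "big") % num_nodes

    nodes = 64
    print([block_home("dataset.dat", b, nodes) for b in range(10)])
    # blocks scatter pseudo-randomly across the 64 nodes
    ```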

  15. IMPAIR: massively parallel deconvolution on the GPU

    NASA Astrophysics Data System (ADS)

    Sherry, Michael; Shearer, Andy

    2013-02-01

    The IMPAIR software is a high throughput image deconvolution tool for processing large out-of-core datasets of images, varying from large images with spatially varying PSFs to large numbers of images with spatially invariant PSFs. IMPAIR implements a parallel version of the tried and tested Richardson-Lucy deconvolution algorithm regularised via a custom wavelet thresholding library. It exploits the inherently parallel nature of the convolution operation to achieve quality results on consumer grade hardware: through the NVIDIA Tesla GPU implementation, the multi-core OpenMP implementation, and the cluster computing MPI implementation of the software. IMPAIR aims to address the problem of parallel processing in both top-down and bottom-up approaches: by managing the input data at the image level, and by managing the execution at the instruction level. These combined techniques will lead to a scalable solution with minimal resource consumption and maximal load balancing. IMPAIR is being developed as both a stand-alone tool for image processing, and as a library which can be embedded into non-parallel code to transparently provide parallel high throughput deconvolution.

  16. EFFICIENT SCHEDULING OF PARALLEL JOBS ON MASSIVELY PARALLEL SYSTEMS

    SciTech Connect

    F. PETRINI; W. FENG

    1999-09-01

    We present buffered coscheduling, a new methodology to multitask parallel jobs in a message-passing environment and to develop parallel programs that can pave the way to the efficient implementation of a distributed operating system. Buffered coscheduling is based on three innovative techniques: communication buffering, strobing, and non-blocking communication. By leveraging these techniques, we can perform effective optimizations based on the global status of the parallel machine rather than on the limited knowledge available locally to each processor. The advantages of buffered coscheduling include higher resource utilization, reduced communication overhead, efficient implementation of flow-control strategies and fault-tolerant protocols, accurate performance modeling, and a simplified yet still expressive parallel programming model. Preliminary experimental results show that buffered coscheduling is very effective in increasing the overall performance in the presence of load imbalance and communication-intensive workloads.

  17. Scan line graphics generation on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Dorband, John E.

    1988-01-01

    Described here is how researchers implemented a scan line graphics generation algorithm on the Massively Parallel Processor (MPP). Pixels are computed in parallel and their results are applied to the Z buffer in large groups. To perform pixel value calculations, facilitate load balancing across the processors and apply the results to the Z buffer efficiently in parallel requires special virtual routing (sort computation) techniques developed by the author especially for use on single-instruction multiple-data (SIMD) architectures.

  18. Massively Parallel Sequencing: The Next Big Thing in Genetic Medicine

    PubMed Central

    Tucker, Tracy; Marra, Marco; Friedman, Jan M.

    2009-01-01

    Massively parallel sequencing has reduced the cost and increased the throughput of genomic sequencing by more than three orders of magnitude, and it seems likely that costs will fall and throughput improve even more in the next few years. Clinical use of massively parallel sequencing will provide a way to identify the cause of many diseases of unknown etiology through simultaneous screening of thousands of loci for pathogenic mutations and by sequencing biological specimens for the genomic signatures of novel infectious agents. In addition to providing these entirely new diagnostic capabilities, massively parallel sequencing may also replace arrays and Sanger sequencing in clinical applications where they are currently being used. Routine clinical use of massively parallel sequencing will require higher accuracy, better ways to select genomic subsets of interest, and improvements in the functionality, speed, and ease of use of data analysis software. In addition, substantial enhancements in laboratory computer infrastructure, data storage, and data transfer capacity will be needed to handle the extremely large data sets produced. Clinicians and laboratory personnel will require training to use the sequence data effectively, and appropriate methods will need to be developed to deal with the incidental discovery of pathogenic mutations and variants of uncertain clinical significance. Massively parallel sequencing has the potential to transform the practice of medical genetics and related fields, but the vast amount of personal genomic data produced will increase the responsibility of geneticists to ensure that the information obtained is used in a medically and socially responsible manner. PMID:19679224

  19. Massively parallel neural encoding and decoding of visual stimuli.

    PubMed

    Lazar, Aurel A; Zhou, Yiyin

    2012-08-01

    The massively parallel nature of video Time Encoding Machines (TEMs) calls for scalable, massively parallel decoders that are implemented with neural components. The current generation of decoding algorithms is based on computing the pseudo-inverse of a matrix and does not satisfy these requirements. Here we consider video TEMs with an architecture built using Gabor receptive fields and a population of Integrate-and-Fire neurons. We show how to build a scalable architecture for video Time Decoding Machines using recurrent neural networks. Furthermore, we extend our architecture to handle the reconstruction of visual stimuli encoded with massively parallel video TEMs having neurons with random thresholds. Finally, we discuss in detail our algorithms and demonstrate their scalability and performance on a large scale GPU cluster. PMID:22397951

  20. Staging memory for massively parallel processor

    NASA Technical Reports Server (NTRS)

    Batcher, Kenneth E. (Inventor)

    1988-01-01

    The invention herein relates to a computer organization capable of rapidly processing extremely large volumes of data. A staging memory is provided having a main stager portion consisting of a large number of memory banks which are accessed in parallel to receive, store, and transfer data words simultaneous with each other. Substager portions interconnect with the main stager portion to match input and output data formats with the data format of the main stager portion. An address generator is coded for accessing the data banks for receiving or transferring the appropriate words. Input and output permutation networks arrange the lineal order of data into and out of the memory banks.

  1. The Challenge of Massively Parallel Computing

    SciTech Connect

    WOMBLE, DAVID E.

    1999-11-03

    Since the mid-1980's, there have been a number of commercially available parallel computers with hundreds or thousands of processors. These machines have provided a new capability to the scientific community, and they have been used by scientists and engineers, although with varying degrees of success. One of the reasons for the limited success is the difficulty, or perceived difficulty, in developing code for these machines. In this paper we discuss many of the issues and challenges in developing scalable hardware, system software and algorithms for machines comprising hundreds or thousands of processors.

  2. Design and implementation of a massively parallel version of DIRECT

    SciTech Connect

    He, J.; Verstak, A.; Watson, L.; Sosonkina, M.

    2007-10-24

    This paper describes several massively parallel implementations of the global search algorithm DIRECT. Two parallel schemes take different approaches to address DIRECT's design challenges imposed by memory requirements and data dependency. Three design aspects in topology, data structures, and task allocation are compared in detail. The goal is to analytically investigate the strengths and weaknesses of these parallel schemes, identify several key sources of inefficiency, and experimentally evaluate a number of improvements in the latest parallel DIRECT implementation. The performance studies demonstrate improved data structure efficiency and load balancing on a 2200-processor cluster.

  3. Massively parallel solution of the assignment problem. Technical report

    SciTech Connect

    Wein, J.; Zenios, S.

    1990-12-01

    In this paper we discuss the design, implementation and effectiveness of massively parallel algorithms for the solution of large-scale assignment problems. In particular, we study the auction algorithms of Bertsekas, an algorithm based on the method of multipliers of Hestenes and Powell, and an algorithm based on the alternating direction method of multipliers of Eckstein. We discuss alternative approaches to the massively parallel implementation of the auction algorithm, including Jacobi, Gauss-Seidel and a hybrid scheme. The hybrid scheme, in particular, exploits two different levels of parallelism and an efficient way of communicating the data between them without the need to perform general router operations across the hypercube network. We then study the performance of massively parallel implementations of two methods of multipliers. Implementations are carried out on the Connection Machine CM-2, and the algorithms are evaluated empirically with the solution of large scale problems. The hybrid scheme significantly outperforms all of the other methods and gives the best computational results to date for a massively parallel solution to this problem.
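
    For readers unfamiliar with the auction algorithm, here is a serial Python sketch of its bidding-and-eviction cycle; the Jacobi scheme studied in the report lets all unassigned persons bid at once, while Gauss-Seidel processes one bid at a time. Epsilon-scaling is omitted for brevity.

    ```python
    # Sketch: a serial rendition of Bertsekas' auction algorithm for the
    # assignment problem. Each unassigned person bids for its best object,
    # raising the price and evicting the previous owner.
    def auction(benefit, eps=0.01):
        """Maximize the total benefit of a perfect matching (square matrix)."""
        n = len(benefit)
        price = [0.0] * n
        owner = [None] * n          # owner[j]: person currently holding j
        assigned = [None] * n       # assigned[i]: object held by person i
        unassigned = list(range(n))
        while unassigned:
            i = unassigned.pop()
            values = [benefit[i][j] - price[j] for j in range(n)]
            best = max(range(n), key=values.__getitem__)
            second = max(values[j] for j in range(n) if j != best)
            price[best] += values[best] - second + eps  # the winning bid
            if owner[best] is not None:                 # evict previous owner
                assigned[owner[best]] = None
                unassigned.append(owner[best])
            owner[best], assigned[i] = i, best
        return assigned

    benefit = [[4, 1, 3],
               [2, 0, 5],
               [3, 2, 2]]
    print(auction(benefit))  # persons -> objects, e.g. [0, 2, 1]
    ```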

  4. Shift: A Massively Parallel Monte Carlo Radiation Transport Package

    SciTech Connect

    Pandya, Tara M; Johnson, Seth R; Davidson, Gregory G; Evans, Thomas M; Hamilton, Steven P

    2015-01-01

    This paper discusses the massively-parallel Monte Carlo radiation transport package, Shift, developed at Oak Ridge National Laboratory. It reviews the capabilities, implementation, and parallel performance of this code package. Scaling results demonstrate very good strong and weak scaling behavior of the implemented algorithms. Benchmark results from various reactor problems show that Shift results compare well to other contemporary Monte Carlo codes and experimental results.

  5. Efficient parallel global garbage collection on massively parallel computers

    SciTech Connect

    Kamada, Tomio; Matsuoka, Satoshi; Yonezawa, Akinori

    1994-12-31

    On distributed-memory high-performance MPPs where processors are interconnected by an asynchronous network, efficient garbage collection (GC) becomes difficult due to inter-node references and references within pending, unprocessed messages. The parallel global GC algorithm (1) takes advantage of reference locality, (2) efficiently traverses references over nodes, (3) admits minimum pause time of ongoing computations, and (4) has been shown to scale up to 1024-node MPPs. The algorithm employs a global weight counting scheme to substantially reduce message traffic. Two methods for confirming the arrival of pending messages are used: one counts the number of messages and the other uses network 'bulldozing.' Performance evaluation in actual implementations on a multicomputer with 32-1024 nodes, the Fujitsu AP1000, reveals various favorable properties of the algorithm.
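
    The weight-counting idea can be illustrated with weighted reference counting: copying a remote reference splits its weight locally with no message, and only deletion returns weight to the owner, so an object is collectible exactly when its weight comes back to zero. The Python sketch below illustrates that general scheme, not the paper's exact algorithm.

    ```python
    # Sketch: weighted reference counting, the general idea behind a global
    # weight counting scheme. An illustration, not the paper's algorithm.
    class RemoteObject:
        def __init__(self, weight=64):
            self.outstanding = weight     # weight held against this object

    class RemoteRef:
        def __init__(self, obj, weight):
            self.obj, self.weight = obj, weight

        def copy(self):
            """Duplicate the reference locally by splitting its weight."""
            half = self.weight // 2
            self.weight -= half
            return RemoteRef(self.obj, half)

        def drop(self):
            """Delete the reference, returning its weight to the owner."""
            self.obj.outstanding -= self.weight
            self.weight = 0

    obj = RemoteObject(weight=64)
    r1 = RemoteRef(obj, 64)
    r2 = r1.copy()            # shipped to another node, no count message
    r1.drop(); r2.drop()
    print("collectible:", obj.outstanding == 0)  # True: all weight returned
    ```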

  6. Ligation errors in DNA computing.

    PubMed

    Aoi, Y; Yoshinobu, T; Tanizawa, K; Kinoshita, K; Iwasaki, H

    1999-10-01

    DNA computing is a novel method of computing proposed by Adleman (1994), in which the data is encoded in sequences of oligonucleotides. Massively parallel reactions between oligonucleotides are expected to make it possible to solve huge problems. In this study, the reliability of the ligation process employed in DNA computing is tested by estimating the error rate at which wrong oligonucleotides are ligated. Ligation of wrong oligonucleotides would result in a wrong answer in the DNA computing. The dependence of the error rate on the number of mismatches between oligonucleotides and on the combination of bases is investigated. PMID:10636043

  7. Solving unstructured grid problems on massively parallel computers

    NASA Technical Reports Server (NTRS)

    Hammond, Steven W.; Schreiber, Robert

    1990-01-01

    A highly parallel graph mapping technique that enables one to efficiently solve unstructured grid problems on massively parallel computers is presented. Many implicit and explicit methods for solving discretized partial differential equations require each point in the discretization to exchange data with its neighboring points every time step or iteration. The cost of this communication can negate the high performance promised by massively parallel computing. To eliminate this bottleneck, the graph of the irregular problem is mapped into the graph representing the interconnection topology of the computer such that the sum of the distances that the messages travel is minimized. It is shown that using the heuristic mapping algorithm significantly reduces the communication time compared to a naive assignment of processes to processors.

  8. A Programming Model for Massive Data Parallelism with Data Dependencies

    SciTech Connect

    Cui, Xiaohui; Mueller, Frank; Potok, Thomas E; Zhang, Yongpeng

    2009-01-01

    Accelerating processors can often be more cost and energy effective for a wide range of data-parallel computing problems than general-purpose processors. For graphics processor units (GPUs), this is particularly the case when program development is aided by environments such as NVIDIA's Compute Unified Device Architecture (CUDA), which dramatically reduces the gap between domain-specific architectures and general purpose programming. Nonetheless, general-purpose GPU (GPGPU) programming remains subject to several restrictions. Most significantly, the separation of host (CPU) and accelerator (GPU) address spaces requires explicit management of GPU memory resources, especially for massive data parallelism that well exceeds the memory capacity of GPUs. One solution to this problem is to transfer data between the GPU and host memories frequently. In this work, we investigate another approach. We run massively data-parallel applications on GPU clusters. We further propose a programming model for massive data parallelism with data dependencies for this scenario. Experience from micro benchmarks and real-world applications shows that our model provides not only ease of programming but also significant performance gains.

  9. The language parallel Pascal and other aspects of the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Reeves, A. P.; Bruner, J. D.

    1982-01-01

    A high level language for the Massively Parallel Processor (MPP) was designed. This language, called Parallel Pascal, is described in detail. A description of the language design, a description of the intermediate language, Parallel P-Code, and details for the MPP implementation are included. Formal descriptions of Parallel Pascal and Parallel P-Code are given. A compiler was developed which converts programs in Parallel Pascal into the intermediate Parallel P-Code language. The code generator to complete the compiler for the MPP is being developed independently. A Parallel Pascal to Pascal translator was also developed. The architecture design for a VLSI version of the MPP was completed with a description of fault tolerant interconnection networks. The memory arrangement aspects of the MPP are discussed and a survey of other high level languages is given.

  10. Supercomputing on massively parallel bit-serial architectures

    NASA Technical Reports Server (NTRS)

    Iobst, Ken

    1985-01-01

    Research on the Goodyear Massively Parallel Processor (MPP) suggests that high-level parallel languages are practical and can be designed with powerful new semantics that allow algorithms to be efficiently mapped to the real machines. For the MPP these semantics include parallel/associative array selection for both dense and sparse matrices, variable precision arithmetic to trade accuracy for speed, micro-pipelined train broadcast, and conditional branching at the processing element (PE) control unit level. The preliminary design of a FORTRAN-like parallel language for the MPP has been completed and is being used to write programs to perform sparse matrix array selection, min/max search, matrix multiplication, Gaussian elimination on single bit arrays and other generic algorithms. A description is given of the MPP design. Features of the system and its operation are illustrated in the form of charts and diagrams.

  11. Development of massively parallel quantum chemistry program SMASH

    SciTech Connect

    Ishimura, Kazuya

    2015-12-31

    A massively parallel program for quantum chemistry calculations SMASH was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program developments simple. The speed-up of the B3LYP energy calculation for (C₁₅₀H₃₀)₂ with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.

  12. Development of massively parallel quantum chemistry program SMASH

    NASA Astrophysics Data System (ADS)

    Ishimura, Kazuya

    2015-12-01

    A massively parallel program for quantum chemistry calculations SMASH was released under the Apache License 2.0 in September 2014. The SMASH program is written in the Fortran90/95 language with MPI and OpenMP standards for parallelization. Frequently used routines, such as one- and two-electron integral calculations, are modularized to make program developments simple. The speed-up of the B3LYP energy calculation for (C₁₅₀H₃₀)₂ with the cc-pVDZ basis set (4500 basis functions) was 50,499 on 98,304 cores of the K computer.

  13. TSE computers - A means for massively parallel computations

    NASA Technical Reports Server (NTRS)

    Strong, J. P., III

    1976-01-01

    A description is presented of hardware concepts for building a massively parallel processing system for two-dimensional data. The processing system is to use logic arrays of 128 x 128 elements which perform over 16 thousand operations simultaneously. Attention is given to image data, logic arrays, basic image logic functions, a prototype negator, an interleaver device, image logic circuits, and an image memory circuit.

  14. Computational fluid dynamics on a massively parallel computer

    NASA Technical Reports Server (NTRS)

    Jespersen, Dennis C.; Levit, Creon

    1989-01-01

    A finite difference code was implemented for the compressible Navier-Stokes equations on the Connection Machine, a massively parallel computer. The code is based on the ARC2D/ARC3D program and uses the implicit factored algorithm of Beam and Warming. The code uses odd-even elimination to solve linear systems. Timings and computation rates are given for the code, and a comparison is made with a Cray XMP.
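
    Odd-even (cyclic) elimination suits SIMD machines because every equation in a level can be reduced at once. A minimal serial sketch for a tridiagonal system (assuming n = 2**k - 1 and no pivoting; on the Connection Machine each inner loop would execute concurrently):

    import numpy as np

    def odd_even_solve(a, b, c, d):
        """a, b, c: sub-, main-, super-diagonals (a[0] = c[-1] = 0); d: RHS."""
        a, b, c, d = (np.asarray(v, float).copy() for v in (a, b, c, d))
        n = len(b)
        assert (n & (n + 1)) == 0, "n must be 2**k - 1"
        s = 1
        while s < n:                      # forward: decouple odd unknowns
            for i in range(2 * s - 1, n, 2 * s):
                al, be = -a[i] / b[i - s], -c[i] / b[i + s]
                b[i] += al * c[i - s] + be * a[i + s]
                d[i] += al * d[i - s] + be * d[i + s]
                a[i], c[i] = al * a[i - s], be * c[i + s]
            s *= 2
        x = np.zeros(n)
        while s >= 1:                     # back-substitute, coarse to fine
            for i in range(s - 1, n, 2 * s):
                xl = x[i - s] if i >= s else 0.0
                xr = x[i + s] if i + s < n else 0.0
                x[i] = (d[i] - a[i] * xl - c[i] * xr) / b[i]
            s //= 2
        return x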

  15. MIMD massively parallel methods for engineering and science problems

    SciTech Connect

    Camp, W.J.; Plimpton, S.J.

    1993-08-01

    MIMD massively parallel computers promise unique power and flexibility for engineering and scientific simulations. In this paper we review the development of a number of software methods and algorithms for scientific and engineering problems which are helping to realize that promise. We discuss new domain decomposition, load balancing, data layout and communications methods applicable to simulations in a broad range of technical fields, including signal processing, multi-dimensional structural and fluid mechanics, materials science, and chemical and biological systems.

  16. Massively parallel Wang Landau sampling on multiple GPUs

    SciTech Connect

    Yin, Junqi; Landau, D. P.

    2012-01-01

    Wang Landau sampling is implemented on the Graphics Processing Unit (GPU) with the Compute Unified Device Architecture (CUDA). Performance on three different GPU cards, including the new-generation Fermi architecture card, is compared with that on a Central Processing Unit (CPU). The parameters for massively parallel Wang Landau sampling are tuned in order to achieve fast convergence. For simulations of the water cluster systems, we obtain an average of over 50 times speedup for a given workload.
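
    The core Wang Landau update rule is compact. A single-walker toy sketch in Python (for a small 2D Ising model rather than the paper's GPU water-cluster runs) showing the random walk in energy, the ln g(E) update, the histogram flatness check, and the refinement f -> sqrt(f):

    import numpy as np

    rng = np.random.default_rng(0)
    L = 4
    N = L * L
    spins = rng.choice([-1, 1], size=(L, L))

    def energy(s):                       # periodic nearest-neighbor Ising
        return -np.sum(s * (np.roll(s, 1, 0) + np.roll(s, 1, 1)))

    idx = {e: i for i, e in enumerate(range(-2 * N, 2 * N + 1, 4))}
    ln_g = np.zeros(len(idx))            # running estimate of ln g(E)
    hist = np.zeros(len(idx))
    ln_f, E = 1.0, energy(spins)
    while ln_f > 1e-2:
        for _ in range(20000):
            i, j = rng.integers(L, size=2)
            dE = 2 * spins[i, j] * (spins[(i+1) % L, j] + spins[(i-1) % L, j]
                                    + spins[i, (j+1) % L] + spins[i, (j-1) % L])
            # accept with probability min(1, g(E)/g(E+dE))
            if np.log(rng.random()) < ln_g[idx[E]] - ln_g[idx[E + dE]]:
                spins[i, j] *= -1
                E += dE
            ln_g[idx[E]] += ln_f
            hist[idx[E]] += 1
        visited = hist > 0
        if hist[visited].min() > 0.8 * hist[visited].mean():  # "flat enough"
            hist[:] = 0
            ln_f /= 2.0                  # refinement step: f -> sqrt(f)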

  17. 3D seismic imaging on massively parallel computers

    SciTech Connect

    Womble, D.E.; Ober, C.C.; Oldfield, R.

    1997-02-01

    The ability to image complex geologies such as salt domes in the Gulf of Mexico and thrusts in mountainous regions is a key to reducing the risk and cost associated with oil and gas exploration. Imaging these structures, however, is computationally expensive. Datasets can be terabytes in size, and the processing time required for the multiple iterations needed to produce a velocity model can take months, even with the massively parallel computers available today. Some algorithms, such as 3D finite-difference prestack depth migration, remain beyond the capacity of production seismic processing. Massively parallel processors (MPPs) and algorithms research are the tools that will enable this project to provide new seismic processing capabilities to the oil and gas industry. The goals of this work are to (1) develop finite-difference algorithms for 3D prestack depth migration; (2) develop efficient computational approaches for seismic imaging and for processing terabyte datasets on massively parallel computers; and (3) develop a modular, portable seismic imaging code.

  18. Endoscopic variceal ligation caused massive bleeding due to laceration of an esophageal varicose vein with tissue glue emboli

    PubMed Central

    Wei, Xiu-Qing; Gu, Hua-Ying; Wu, Zhi-E; Miao, Hui-Biao; Wang, Pei-Qi; Wen, Zhuo-Fu; Wu, Bin

    2014-01-01

    Endoscopic variceal obturation of gastric varices with tissue glue is considered the first choice for management of gastric varices, and is usually safe and effective. However, there is still a low incidence of complications and some are even fatal. Here, we present a case in which endoscopic variceal ligation caused laceration of the esophageal varicose vein with tissue glue emboli and massive bleeding after 3 mo. Cessation of bleeding was achieved via variceal sclerotherapy using a cap-fitted gastroscope. Methods of recognizing an esophageal varicose vein with tissue glue plug are discussed. PMID:25400482

  19. Endoscopic variceal ligation caused massive bleeding due to laceration of an esophageal varicose vein with tissue glue emboli.

    PubMed

    Wei, Xiu-Qing; Gu, Hua-Ying; Wu, Zhi-E; Miao, Hui-Biao; Wang, Pei-Qi; Wen, Zhuo-Fu; Wu, Bin

    2014-11-14

    Endoscopic variceal obturation of gastric varices with tissue glue is considered the first choice for management of gastric varices, and is usually safe and effective. However, there is still a low incidence of complications and some are even fatal. Here, we present a case in which endoscopic variceal ligation caused laceration of the esophageal varicose vein with tissue glue emboli and massive bleeding after 3 mo. Cessation of bleeding was achieved via variceal sclerotherapy using a cap-fitted gastroscope. Methods of recognizing an esophageal varicose vein with tissue glue plug are discussed. PMID:25400482

  20. Requirements for supercomputing in energy research: The transition to massively parallel computing

    SciTech Connect

    Not Available

    1993-02-01

    This report discusses: The emergence of a practical path to TeraFlop computing and beyond; requirements of energy research programs at DOE; implementation: supercomputer production computing environment on massively parallel computers; and implementation: user transition to massively parallel computing.

  1. The 2nd Symposium on the Frontiers of Massively Parallel Computations

    NASA Technical Reports Server (NTRS)

    Mills, Ronnie (Editor)

    1988-01-01

    Programming languages, computer graphics, neural networks, massively parallel computers, SIMD architecture, algorithms, digital terrain models, sort computation, simulation of charged particle transport on the massively parallel processor and image processing are among the topics discussed.

  2. Routing performance analysis and optimization within a massively parallel computer

    DOEpatents

    Archer, Charles Jens; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen

    2013-04-16

    An apparatus, program product and method optimize the operation of a massively parallel computer system by, in part, receiving actual performance data concerning an application executed by the plurality of interconnected nodes, and analyzing the actual performance data to identify an actual performance pattern. A desired performance pattern may be determined for the application, and an algorithm may be selected from among a plurality of algorithms stored within a memory, the algorithm being configured to achieve the desired performance pattern based on the actual performance data.

  3. The Massively Parallel Processor and its applications. [for environmental monitoring

    NASA Technical Reports Server (NTRS)

    Strong, J. P.; Schaefer, D. H.; Fischer, J. R.; Wallgren, K. R.; Bracken, P. A.

    1979-01-01

    A long-term experimental development program conducted at Goddard Space Flight Center to implement an ultrahigh-speed data processing system known as the Massively Parallel Processor (MPP) is described. The MPP is a single instruction multiple data stream computer designed to perform logical, integer, and floating point arithmetic operations on variable word length data. Information is presented on system architecture, the system configuration, the array unit architecture, individual processing units, and expected operating rates for several image processing applications (including the processing of Landsat data).

  4. A biconjugate gradient type algorithm on massively parallel architectures

    NASA Technical Reports Server (NTRS)

    Freund, Roland W.; Hochbruck, Marlis

    1991-01-01

    The biconjugate gradient (BCG) method is the natural generalization of the classical conjugate gradient algorithm for Hermitian positive definite matrices to general non-Hermitian linear systems. Unfortunately, the original BCG algorithm is susceptible to possible breakdowns and numerical instabilities. Recently, Freund and Nachtigal have proposed a novel BCG type approach, the quasi-minimal residual method (QMR), which overcomes the problems of BCG. Here, an implementation of QMR is presented, based on an s-step version of the nonsymmetric look-ahead Lanczos algorithm. The main feature of the s-step Lanczos algorithm is that, in general, all inner products except for one can be computed in parallel at the end of each block; this is unlike the standard Lanczos process, where inner products are generated sequentially. The resulting implementation of QMR is particularly attractive on massively parallel SIMD architectures, such as the Connection Machine.
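
    The communication benefit can be seen in miniature: generating inner products one at a time costs one global reduction each, whereas the s-step formulation evaluates a block of inner products together. A toy NumPy illustration of the equivalence (data invented):

    import numpy as np

    V = np.random.default_rng(1).normal(size=(6, 10000))  # toy basis vectors

    # Sequential: one inner product (one global reduction) at a time.
    seq = [V[i] @ V[i + 1] for i in range(5)]

    # Blocked: a single matrix product delivers all pairwise inner products
    # at once, which is the communication pattern the s-step variant exploits.
    G = V @ V.T
    assert np.allclose(seq, [G[i, i + 1] for i in range(5)])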

  5. Numerical computation on massively parallel hypercubes. [Connection machine

    SciTech Connect

    McBryan, O.A.

    1986-01-01

    We describe numerical computations on the Connection Machine, a massively parallel hypercube architecture with 65,536 single-bit processors and 32 Mbytes of memory. A parallel extension of COMMON LISP provides access to the processors and network. The rich software environment is further enhanced by a powerful virtual processor capability, which extends the degree of fine-grained parallelism beyond 1,000,000. We briefly describe the hardware and indicate the principal features of the parallel programming environment. We then present implementations of SOR, multigrid and pre-conditioned conjugate gradient algorithms for solving partial differential equations on the Connection Machine. Despite the lack of floating point hardware, computation rates above 100 megaflops have been achieved in PDE solution. Virtual processors prove to be a real advantage, easing the effort of software development while improving system performance significantly. The software development effort is also facilitated by the fact that hypercube communications prove to be fast and essentially independent of distance. 29 refs., 4 figs.
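
    Of the solvers mentioned, red-black SOR is the most directly data-parallel: points of one color depend only on the other color, so each half-sweep updates simultaneously. A minimal NumPy sketch for the 2-D Poisson problem -lap(u) = f with zero Dirichlet boundaries (parameters invented):

    import numpy as np

    n, h, omega = 64, 1.0 / 65, 1.5
    u = np.zeros((n + 2, n + 2))                 # includes the boundary ring
    f = np.ones((n, n))
    red = np.add.outer(np.arange(n), np.arange(n)) % 2 == 0
    for sweep in range(200):
        for color in (red, ~red):
            gs = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1]
                         + u[1:-1, :-2] + u[1:-1, 2:] + h * h * f)
            interior = u[1:-1, 1:-1]             # view into u
            interior[color] += omega * (gs[color] - interior[color])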

  6. The performance realities of massively parallel processors: A case study

    SciTech Connect

    Lubeck, O.M.; Simmons, M.L.; Wasserman, H.J.

    1992-07-01

    This paper presents the results of an architectural comparison of SIMD massive parallelism, as implemented in the Thinking Machines Corp. CM-2 computer, and vector or concurrent-vector processing, as implemented in the Cray Research Inc. Y-MP/8. The comparison is based primarily upon three application codes that represent Los Alamos production computing. Tests were run by porting optimized CM Fortran codes to the Y-MP, so that the same level of optimization was obtained on both machines. The results for fully-configured systems, using measured data rather than scaled data from smaller configurations, show that the Y-MP/8 is faster than the 64k CM-2 for all three codes. A simple model that accounts for the relative characteristic computational speeds of the two machines, and reduction in overall CM-2 performance due to communication or SIMD conditional execution, is included. The model predicts the performance of two codes well, but fails for the third code, because the proportion of communications in this code is very high. Other factors, such as memory bandwidth and compiler effects, are also discussed. Finally, the paper attempts to show the equivalence of the CM-2 and Y-MP programming models, and also comments on selected future massively parallel processor designs.
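
    The paper's exact model is not reproduced in the abstract; a plausible, purely hypothetical form of such a model is an Amdahl-style harmonic mean in which a fraction of the work (communication or masked SIMD conditional execution) proceeds at a reduced rate:

    def effective_rate(r_peak, f_slow, slowdown):
        """Hypothetical two-phase rate model: a fraction f_slow of the
        work runs slower by `slowdown`; the rest runs at r_peak."""
        return 1.0 / ((1.0 - f_slow) / r_peak + f_slow * slowdown / r_peak)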

  7. Comparison of massively parallel hand-print segmenters

    SciTech Connect

    Wilkinson, R.A.; Garris, M.D.

    1992-09-01

    NIST has developed a massively parallel hand-print recognition system that allows components to be interchanged. Using this system, three different character segmentation algorithms have been developed and studied. They are blob coloring, histogramming, and a hybrid of the two. The blob coloring method uses connected components to isolate characters. The histogramming method locates linear spaces, which may be slanted, to segment characters. The hybrid method is an augmented histogramming method that incorporates statistically adaptive rules to decide when a histogrammed item is too large and applies blob coloring to further segment the difficult item. The hardware configuration is a serial host computer with a 1024 processor Single Instruction Multiple Data (SIMD) machine attached to it. The data used in this comparison is 'NIST Special Database 1' which contains 2100 forms from different writers where each form contains 130 digit characters distributed across 28 fields. This gives a potential 273,000 characters to be segmented. Running the massively parallel system across the 2100 forms, blob coloring required 2.1 seconds per form with an accuracy of 97.5%, histogramming required 14.4 seconds with an accuracy of 95.3%, and the hybrid method required 13.2 seconds with an accuracy of 95.4%. The results of this comparison show that the blob coloring method on a SIMD architecture is superior.
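
    The histogramming idea reduces to projecting ink onto an axis and cutting at empty columns (the paper's version additionally handles slanted spaces, omitted here). A small sketch, assuming a binary image array:

    import numpy as np

    def histogram_segments(binary_image):
        """Return (start, end) column spans separated by blank columns."""
        filled = binary_image.sum(axis=0) > 0
        spans, start = [], None
        for x, on in enumerate(filled):
            if on and start is None:
                start = x
            elif not on and start is not None:
                spans.append((start, x))
                start = None
        if start is not None:
            spans.append((start, len(filled)))
        return spans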

  8. Learning Quantitative Sequence-Function Relationships from Massively Parallel Experiments

    NASA Astrophysics Data System (ADS)

    Atwal, Gurinder S.; Kinney, Justin B.

    2016-03-01

    A fundamental aspect of biological information processing is the ubiquity of sequence-function relationships—functions that map the sequence of DNA, RNA, or protein to a biochemically relevant activity. Most sequence-function relationships in biology are quantitative, but only recently have experimental techniques for effectively measuring these relationships been developed. The advent of such "massively parallel" experiments presents an exciting opportunity for the concepts and methods of statistical physics to inform the study of biological systems. After reviewing these recent experimental advances, we focus on the problem of how to infer parametric models of sequence-function relationships from the data produced by these experiments. Specifically, we retrace and extend recent theoretical work showing that inference based on mutual information, not the standard likelihood-based approach, is often necessary for accurately learning the parameters of these models. Closely connected with this result is the emergence of "diffeomorphic modes"—directions in parameter space that are far less constrained by data than likelihood-based inference would suggest. Analogous to Goldstone modes in physics, diffeomorphic modes arise from an arbitrarily broken symmetry of the inference problem. An analytically tractable model of a massively parallel experiment is then described, providing an explicit demonstration of these fundamental aspects of statistical inference. This paper concludes with an outlook on the theoretical and computational challenges currently facing studies of quantitative sequence-function relationships.

  9. Automation of Molecular-Based Analyses: A Primer on Massively Parallel Sequencing

    PubMed Central

    Nguyen, Lan; Burnett, Leslie

    2014-01-01

    Recent advances in genetics have been enabled by new genetic sequencing techniques called massively parallel sequencing (MPS) or next-generation sequencing. Through the ability to sequence in parallel hundreds of thousands to millions of DNA fragments, the cost and time required for sequencing has dramatically decreased. There are a number of different MPS platforms currently available and being used in Australia. Although they differ in the underlying technology involved, their overall processes are very similar: DNA fragmentation, adaptor ligation, immobilisation, amplification, sequencing reaction and data analysis. MPS is being used in research, translational and increasingly now also in clinical settings. Common applications include sequencing of whole genomes, whole exomes or targeted genes for disease-causing gene discovery, genetic diagnosis and targeted cancer therapy. Even though the revolution that is occurring with MPS is exciting due to its increasing use, improving and emerging technologies and new applications, significant challenges still exist. Particularly challenging issues are the bioinformatics required for data analysis, interpretation of results and the ethical dilemma of ‘incidental findings’. PMID:25336762

  10. Optimal evaluation of array expressions on massively parallel machines

    NASA Technical Reports Server (NTRS)

    Chatterjee, Siddhartha; Gilbert, John R.; Schreiber, Robert; Teng, Shang-Hua

    1992-01-01

    We investigate the problem of evaluating FORTRAN 90 style array expressions on massively parallel distributed-memory machines. On such machines, an elementwise operation can be performed in constant time for arrays whose corresponding elements are in the same processor. If the arrays are not aligned in this manner, the cost of aligning them is part of the cost of evaluating the expression. The choice of where to perform the operation then affects this cost. We present algorithms based on dynamic programming to solve this problem efficiently for a wide variety of interconnection schemes, including multidimensional grids and rings, hypercubes, and fat-trees. We also consider expressions containing operations that change the shape of the arrays, and show that our approach extends naturally to handle this case.
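
    The dynamic program has a simple shape: for each node of the expression tree and each candidate alignment, take the cheapest way of producing every child and moving its result into that alignment. A sketch with invented names and a toy cost model (the paper's algorithms additionally exploit the structure of grids, hypercubes, and fat-trees):

    def best_cost(node, alignments, move_cost):
        """node: ('leaf', home_alignment) or ('op', child, ...).
        Returns {alignment: minimal cost of the result at that alignment};
        the cost of the elementwise operation itself is taken as uniform."""
        if node[0] == 'leaf':
            return {a: move_cost(node[1], a) for a in alignments}
        kids = [best_cost(ch, alignments, move_cost) for ch in node[1:]]
        return {a: sum(min(k[b] + move_cost(b, a) for b in alignments)
                       for k in kids)
                for a in alignments}

    aligns = [0, 1, 2]
    move = lambda src, dst: abs(src - dst)          # toy shift cost
    expr = ('op', ('leaf', 0), ('op', ('leaf', 1), ('leaf', 2)))
    print(best_cost(expr, aligns, move))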

  11. Applications of massively parallel computers in telemetry processing

    NASA Technical Reports Server (NTRS)

    El-Ghazawi, Tarek A.; Pritchard, Jim; Knoble, Gordon

    1994-01-01

    Telemetry processing refers to the reconstruction of full resolution raw instrumentation data with the artifacts of space and ground recording and transmission removed. Being the first processing phase of satellite data, this process is also referred to as level-zero processing. This study is aimed at investigating the use of massively parallel computing technology in providing level-zero processing to spaceflights that adhere to the recommendations of the Consultative Committee on Space Data Systems (CCSDS). The workload characteristics of level-zero processing are used to identify processing requirements in high-performance computing systems. An example of level-zero functions on a SIMD MPP, such as the MasPar, is discussed. The requirements in this paper are based in part on the Earth Observing System (EOS) Data and Operation System (EDOS).

  12. A Computational Fluid Dynamics Algorithm on a Massively Parallel Computer

    NASA Technical Reports Server (NTRS)

    Jespersen, Dennis C.; Levit, Creon

    1989-01-01

    The discipline of computational fluid dynamics is demanding ever-increasing computational power to deal with complex fluid flow problems. We investigate the performance of a finite-difference computational fluid dynamics algorithm on a massively parallel computer, the Connection Machine. Of special interest is an implicit time-stepping algorithm; to obtain maximum performance from the Connection Machine, it is necessary to use a nonstandard algorithm to solve the linear systems that arise in the implicit algorithm. We find that the Connection Machine can achieve very high computation rates on both explicit and implicit algorithms. The performance of the Connection Machine puts it in the same class as today's most powerful conventional supercomputers.

  13. MPSim: A Massively Parallel General Simulation Program for Materials

    NASA Astrophysics Data System (ADS)

    Iotov, Mihail; Gao, Guanghua; Vaidehi, Nagarajan; Cagin, Tahir; Goddard, William A., III

    1997-08-01

    In this talk, we describe a general purpose Massively Parallel Simulation (MPSim) program used for computational materials science and life sciences. We will also present scaling aspects of the program along with several case studies. The program incorporates the highly efficient CMM method to accurately calculate the interactions. For studying bulk materials, the program uses the Reduced CMM to account for infinite-range sums. The software embodies various advanced molecular dynamics algorithms and energy and structure optimization techniques, with a set of analysis tools suitable for large-scale structures. Applications of the program range from amorphous polymers, liquid-polymer interfaces, large viruses, million-atom clusters, and surfaces to gas diffusion in polymers. The program was originally developed on the KSR in an object-oriented fashion and was ported to the SGI-PC and HP-Exemplar. The message-passing version was originally implemented on the Intel Paragon using NX, then MPI, and was later tested on Cray T3D and IBM SP2 platforms.

  14. Beam dynamics calculations and particle tracking using massively parallel processors

    SciTech Connect

    Ryne, R.D.; Habib, S.

    1995-12-31

    During the past decade massively parallel processors (MPPs) have slowly gained acceptance within the scientific community. At present these machines typically contain a few hundred to one thousand off-the-shelf microprocessors and a total memory of up to 32 GBytes. The potential performance of these machines is illustrated by the fact that a month-long job on a high-end workstation might require only a few hours on an MPP. The acceptance of MPPs has been slow for a variety of reasons. For example, some algorithms are not easily parallelizable. Also, in the past these machines were difficult to program. But in recent years the development of Fortran-like languages such as CM Fortran and High Performance Fortran has made MPPs much easier to use. In the following we will describe how MPPs can be used for beam dynamics calculations and long term particle tracking.

  15. Integration of IR focal plane arrays with massively parallel processor

    NASA Astrophysics Data System (ADS)

    Esfandiari, P.; Koskey, P.; Vaccaro, K.; Buchwald, W.; Clark, F.; Krejca, B.; Rekeczky, C.; Zarandy, A.

    2008-04-01

    The intent of this investigation is to replace the low fill factor visible sensor of a Cellular Neural Network (CNN) processor with an InGaAs Focal Plane Array (FPA) using both bump bonding and epitaxial layer transfer techniques for use in the Ballistic Missile Defense System (BMDS) interceptor seekers. The goal is to fabricate a massively parallel digital processor with a local as well as a global interconnect architecture. Currently, this unique CNN processor is capable of processing a target scene in excess of 10,000 frames per second with its visible sensor. What makes the CNN processor so unique is that each processing element includes memory, local data storage, local and global communication devices and a visible sensor supported by a programmable analog or digital computer program.

  16. Transmissive Nanohole Arrays for Massively-Parallel Optical Biosensing

    PubMed Central

    2015-01-01

    A high-throughput optical biosensing technique is proposed and demonstrated. This hybrid technique combines optical transmission of nanoholes with colorimetric silver staining. The size and spacing of the nanoholes are chosen so that individual nanoholes can be independently resolved in massively parallel fashion using an ordinary transmission optical microscope, and, in place of determining a spectral shift, the brightness of each nanohole is recorded to greatly simplify the readout. Each nanohole then acts as an independent sensor, and the blocking of nanohole optical transmission by enzymatic silver staining defines the specific detection of a biological agent. Nearly 10,000 nanoholes can be simultaneously monitored under the field of view of a typical microscope. As an initial proof of concept, biotinylated lysozyme (biotin-HEL) was used as a model analyte, giving a detection limit as low as 0.1 ng/mL. PMID:25530982

  17. Development of a massively parallel parachute performance prediction code

    SciTech Connect

    Peterson, C.W.; Strickland, J.H.; Wolfe, W.P.; Sundberg, W.D.; McBride, D.D.

    1997-04-01

    The Department of Energy has given Sandia full responsibility for the complete life cycle (cradle to grave) of all nuclear weapon parachutes. Sandia National Laboratories is initiating development of a complete numerical simulation of parachute performance, beginning with parachute deployment and continuing through inflation and steady state descent. The purpose of the parachute performance code is to predict the performance of stockpile weapon parachutes as these parachutes continue to age well beyond their intended service life. A new massively parallel computer will provide unprecedented speed and memory for solving this complex problem, and new software will be written to treat the coupled fluid, structure and trajectory calculations as part of a single code. Verification and validation experiments have been proposed to provide the necessary confidence in the computations.

  18. Efficient Identification of Assembly Neurons within Massively Parallel Spike Trains

    PubMed Central

    Berger, Denise; Borgelt, Christian; Louis, Sebastien; Morrison, Abigail; Grün, Sonja

    2010-01-01

    The chance of detecting assembly activity is expected to increase if the spiking activities of large numbers of neurons are recorded simultaneously. Although such massively parallel recordings are now becoming available, methods able to analyze such data for spike correlation are still rare, as a combinatorial explosion often makes it infeasible to extend methods developed for smaller data sets. By evaluating pattern complexity distributions, the existence of correlated groups can be detected, but their member neurons cannot be identified. In this contribution, we present approaches to actually identify the individual neurons involved in assemblies. Our results may complement other methods and also provide a way to reduce data sets to the “relevant” neurons, thus allowing a refined analysis of the detailed correlation structure thanks to reduced computation time. PMID:19809521

  19. Massively parallel high-order combinatorial genetics in human cells

    PubMed Central

    Wong, Alan S L; Choi, Gigi C G; Cheng, Allen A; Purcell, Oliver; Lu, Timothy K

    2016-01-01

    The systematic functional analysis of combinatorial genetics has been limited by the throughput that can be achieved and the order of complexity that can be studied. To enable massively parallel characterization of genetic combinations in human cells, we developed a technology for rapid, scalable assembly of high-order barcoded combinatorial genetic libraries that can be quantified with high-throughput sequencing. We applied this technology, combinatorial genetics en masse (CombiGEM), to create high-coverage libraries of 1,521 two-wise and 51,770 three-wise barcoded combinations of 39 human microRNA (miRNA) precursors. We identified miRNA combinations that synergistically sensitize drug-resistant cancer cells to chemotherapy and/or inhibit cancer cell proliferation, providing insights into complex miRNA networks. More broadly, our method will enable high-throughput profiling of multifactorial genetic combinations that regulate phenotypes of relevance to biomedicine, biotechnology and basic science. PMID:26280411

  20. Massively Parallel Simulations of Diffusion in Dense Polymeric Structures

    SciTech Connect

    Faulon, Jean-Loup; Wilcox, R.T.; Hobbs, J.D.; Ford, D.M.

    1997-11-01

    An original computational technique to generate close-to-equilibrium dense polymeric structures is proposed. Diffusion of small gases is studied on the equilibrated structures using massively parallel molecular dynamics simulations running on the Intel Teraflops (9216 Pentium Pro processors) and Intel Paragon (1840 processors). Compared to the current state-of-the-art equilibration methods, this new technique appears to be faster by some orders of magnitude. The main advantage of the technique is that one can circumvent the bottlenecks in configuration space that inhibit relaxation in molecular dynamics simulations. The technique is based on the fact that tetravalent atoms (such as carbon and silicon) fit in the center of a regular tetrahedron and that regular tetrahedrons can be used to mesh the three-dimensional space. Thus, the problem of polymer equilibration described by continuous equations in molecular dynamics is reduced to a discrete problem where solutions are approximated by simple algorithms. Practical modeling applications include the construction of butyl rubber and ethylene-propylene-diene-monomer (EPDM) models for oxygen and water diffusion calculations. Butyl and EPDM are used in O-ring systems and serve as sealing joints in many manufactured objects. Diffusion coefficients of small gases have been measured experimentally on both polymeric systems, and in general the diffusion coefficients in EPDM are an order of magnitude larger than in butyl. In order to better understand the diffusion phenomena, 10,000-atom models were generated and equilibrated for butyl and EPDM. The models were submitted to a massively parallel molecular dynamics simulation to monitor the trajectories of the diffusing species.
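
    Once trajectories of the diffusing species are available, the diffusion coefficient follows from the Einstein relation MSD(t) ~ 6Dt. A sketch, assuming unwrapped coordinates (names invented):

    import numpy as np

    def diffusion_coefficient(positions, dt):
        """positions: (nframes, natoms, 3) unwrapped coordinates."""
        disp = positions - positions[0]
        msd = (disp ** 2).sum(axis=2).mean(axis=1)     # average over atoms
        t = np.arange(len(msd)) * dt
        return np.polyfit(t[1:], msd[1:], 1)[0] / 6.0  # slope / 6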

  1. Particle simulation of plasmas on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Gledhill, I. M. A.; Storey, L. R. O.

    1987-01-01

    Particle simulations, in which collective phenomena in plasmas are studied by following the self consistent motions of many discrete particles, involve several highly repetitive sets of calculations that are readily adaptable to SIMD parallel processing. A fully electromagnetic, relativistic plasma simulation for the massively parallel processor is described. The particle motions are followed in 2 1/2 dimensions on a 128 x 128 grid, with periodic boundary conditions. The two dimensional simulation space is mapped directly onto the processor network; a Fast Fourier Transform is used to solve the field equations. Particle data are stored according to an Eulerian scheme, i.e., the information associated with each particle is moved from one local memory to another as the particle moves across the spatial grid. The method is applied to the study of the nonlinear development of the whistler instability in a magnetospheric plasma model, with an anisotropic electron temperature. The wave distribution function is included as a new diagnostic to allow simulation results to be compared with satellite observations.
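
    With periodic boundaries, the field solve reduces to division in Fourier space. A toy sketch of solving lap(phi) = -rho on a 128 x 128 grid (unit grid spacing assumed; not the paper's electromagnetic solver):

    import numpy as np

    n = 128
    rho = np.zeros((n, n))
    rho[n // 2, n // 2] = 1.0                    # toy charge density
    k = 2 * np.pi * np.fft.fftfreq(n)
    kx, ky = np.meshgrid(k, k, indexing='ij')
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0                               # avoid divide-by-zero
    phi_hat = np.fft.fft2(rho) / k2              # -k^2 phi_hat = -rho_hat
    phi_hat[0, 0] = 0.0                          # fix the mean potential
    phi = np.fft.ifft2(phi_hat).real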

  2. Massively parallel simulations of multiphase flows using Lattice Boltzmann methods

    NASA Astrophysics Data System (ADS)

    Ahrenholz, Benjamin

    2010-03-01

    In the last two decades the lattice Boltzmann method (LBM) has matured as an alternative and efficient numerical scheme for the simulation of fluid flows and transport problems. Unlike conventional numerical schemes based on discretizations of macroscopic continuum equations, the LBM is based on microscopic models and mesoscopic kinetic equations. The fundamental idea of the LBM is to construct simplified kinetic models that incorporate the essential physics of microscopic or mesoscopic processes so that the macroscopic averaged properties obey the desired macroscopic equations. Applications involving interfacial dynamics, complex and/or changing boundaries, and complicated constitutive relationships that can be derived from a microscopic picture are especially suitable for the LBM. In this talk a modified and optimized version of a Gunstensen color model is presented to describe the dynamics of the fluid/fluid interface, where the flow field is based on a multi-relaxation-time model. Based on that modeling approach, validation studies of contact line motion are shown. Due to the fact that the LB method generally needs only nearest-neighbor information, the algorithm is an ideal candidate for parallelization. Hence, it is possible to perform efficient simulations in complex geometries at a large scale by massively parallel computations. Here, the results of drainage and imbibition (degrees of freedom > 2E11) in natural porous media obtained from microtomography methods are presented. Those fully resolved pore-scale simulations are essential for a better understanding of the physical processes in porous media and therefore important for the determination of constitutive relationships.
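
    The locality that makes the LBM parallel-friendly is visible in a single-phase D2Q9 BGK step (the color model described above adds a perturbation/recoloring step at interfaces, omitted here). A minimal NumPy sketch with periodic streaming:

    import numpy as np

    w = np.array([4/9] + [1/9] * 4 + [1/36] * 4)       # D2Q9 weights
    cx = np.array([0, 1, 0, -1, 0, 1, -1, -1, 1])
    cy = np.array([0, 0, 1, 0, -1, 1, 1, -1, -1])
    tau, nx, ny = 0.8, 64, 64
    f = np.ones((9, nx, ny)) * w[:, None, None]        # fluid at rest

    def step(f):
        rho = f.sum(axis=0)
        ux = (cx[:, None, None] * f).sum(axis=0) / rho
        uy = (cy[:, None, None] * f).sum(axis=0) / rho
        cu = cx[:, None, None] * ux + cy[:, None, None] * uy
        feq = w[:, None, None] * rho * (1 + 3 * cu + 4.5 * cu**2
                                        - 1.5 * (ux**2 + uy**2))
        f -= (f - feq) / tau                           # BGK collision
        for i in range(9):                             # periodic streaming
            f[i] = np.roll(np.roll(f[i], cx[i], axis=0), cy[i], axis=1)
        return f

    for _ in range(100):                               # evolve the toy lattice
        f = step(f)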

  3. Efficiently modeling neural networks on massively parallel computers

    NASA Technical Reports Server (NTRS)

    Farber, Robert M.

    1993-01-01

    Neural networks are a very useful tool for analyzing and modeling complex real world systems. Applying neural network simulations to real world problems generally involves large amounts of data and massive amounts of computation. To efficiently handle the computational requirements of large problems, we have implemented at Los Alamos a highly efficient neural network compiler for serial computers, vector computers, vector parallel computers, and fine grain SIMD computers such as the CM-2 connection machine. This paper describes the mapping used by the compiler to implement feed-forward backpropagation neural networks for a SIMD (Single Instruction Multiple Data) architecture parallel computer. Thinking Machines Corporation has benchmarked our code at 1.3 billion interconnects per second (approximately 3 gigaflops) on a 64,000 processor CM-2 connection machine (Singer 1990). This mapping is applicable to other SIMD computers and can be implemented on MIMD computers such as the CM-5 connection machine. Our mapping has virtually no communications overhead with the exception of the communications required for a global summation across the processors (which has sub-linear runtime growth, on the order of O(log(number of processors))). We can efficiently model very large neural networks which have many neurons and interconnects, and our mapping can extend to arbitrarily large networks (within memory limitations) by merging the memory spaces of separate processors with fast adjacent-processor communications. This paper considers the simulation of only feed-forward neural networks, although the method is extendable to recurrent networks.
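
    The mapping's single communication step, the global gradient summation, can be imitated in miniature: shard the training examples, compute local gradients, and sum. A toy sketch with simulated processors and a linear layer (the paper's networks are multilayer, but the communication pattern is the same; all data invented):

    import numpy as np

    rng = np.random.default_rng(2)
    X, y = rng.normal(size=(256, 8)), rng.normal(size=(256, 1))
    W = rng.normal(size=(8, 1)) * 0.1

    def local_grad(Xs, ys):              # squared loss, global normalization
        return 2 * Xs.T @ (Xs @ W - ys) / len(X)

    shards = np.array_split(np.arange(256), 8)          # 8 "processors"
    grad = sum(local_grad(X[s], y[s]) for s in shards)  # global summation
    W -= 0.01 * grad                                    # full-batch step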

  4. Massively Parallel Interrogation of Aptamer Sequence, Structure and Function

    SciTech Connect

    Fischer, N O; Tok, J B; Tarasow, T M

    2008-02-08

    Optimization of high affinity reagents is a significant bottleneck in medicine and the life sciences. The ability to synthetically create thousands of permutations of a lead high-affinity reagent and survey the properties of individual permutations in parallel could potentially relieve this bottleneck. Aptamers are single-stranded oligonucleotide affinity reagents isolated by in vitro selection processes, and as a class have been shown to bind a wide variety of target molecules. High density DNA microarray technology was used to synthesize, in situ, arrays of approximately 3,900 aptamer sequence permutations in triplicate. These sequences were interrogated on-chip for their ability to bind the fluorescently-labeled cognate target, immunoglobulin E, resulting in the parallel execution of thousands of experiments. Fluorescence intensity at each array feature was well resolved and shown to be a function of the sequence present. The data demonstrated high intra- and interchip correlation between the same features as well as among the sequence triplicates within a single array. Consistent with aptamer-mediated IgE binding, fluorescence intensity correlated strongly with specific aptamer sequences and the concentration of IgE applied to the array. The massively parallel sequence-function analyses provided by this approach confirmed the importance of a consensus sequence found in all 21 of the original IgE aptamer sequences and supported a common stem:loop structure as the secondary structure underlying IgE binding. The microarray application, data and results presented illustrate an efficient, high information content approach to optimizing aptamer function. It also provides a foundation from which to better understand and manipulate this important class of high affinity biomolecules.

  5. The minimal amount of starting DNA for Agilent's hybrid capture-based targeted massively parallel sequencing.

    PubMed

    Chung, Jongsuk; Son, Dae-Soon; Jeon, Hyo-Jeong; Kim, Kyoung-Mee; Park, Gahee; Ryu, Gyu Ha; Park, Woong-Yang; Park, Donghyun

    2016-01-01

    Targeted capture massively parallel sequencing is increasingly being used in clinical settings, and as costs continue to decline, use of this technology may become routine in health care. However, a limited amount of tissue has often been a challenge in meeting quality requirements. To offer a practical guideline for the minimum amount of input DNA for targeted sequencing, we optimized and evaluated the performance of targeted sequencing depending on the input DNA amount. First, using various amounts of input DNA, we compared commercially available library construction kits and selected Agilent's SureSelect-XT and KAPA Biosystems' Hyper Prep kits as the kits most compatible with targeted deep sequencing using Agilent's SureSelect custom capture. Then, we optimized the adapter ligation conditions of the Hyper Prep kit to improve library construction efficiency and adapted multiplexed hybrid selection to reduce the cost of sequencing. In this study, we systematically evaluated the performance of the optimized protocol depending on the amount of input DNA, ranging from 6.25 to 200 ng, suggesting the minimal input DNA amounts based on coverage depths required for specific applications. PMID:27220682

  6. Analysis of composite ablators using massively parallel computation

    NASA Technical Reports Server (NTRS)

    Shia, David

    1995-01-01

    In this work, the feasibility of using massively parallel computation to study the response of ablative materials is investigated. Explicit and implicit finite difference methods are used on a massively parallel computer, the Thinking Machines CM-5. The governing equations are a set of nonlinear partial differential equations. The governing equations are developed for three sample problems: (1) transpiration cooling, (2) ablative composite plate, and (3) restrained thermal growth testing. The transpiration cooling problem is solved using a solution scheme based solely on the explicit finite difference method. The results are compared with available analytical steady-state through-thickness temperature and pressure distributions and good agreement between the numerical and analytical solutions is found. It is also found that a solution scheme based on the explicit finite difference method has the following advantages: incorporates complex physics easily, results in a simple algorithm, and is easily parallelizable. However, a solution scheme of this kind needs very small time steps to maintain stability. A solution scheme based on the implicit finite difference method has the advantage that it does not require very small time steps to maintain stability. However, this kind of solution scheme has the disadvantages that complex physics cannot be easily incorporated into the algorithm and that the solution scheme is difficult to parallelize. A hybrid solution scheme is then developed to combine the strengths of the explicit and implicit finite difference methods and minimize their weaknesses. This is achieved by identifying the critical time scale associated with the governing equations and applying the appropriate finite difference method according to this critical time scale. The hybrid solution scheme is then applied to the ablative composite plate and restrained thermal growth problems. The gas storage term is included in the explicit pressure calculation of both
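
    The explicit/implicit trade-off described above is easy to see on the 1-D heat equation u_t = alpha*u_xx with zero Dirichlet boundaries. A sketch (grid and constants invented, not the ablator equations): the explicit step is cheap and parallel but stable only for r = alpha*dt/dx**2 <= 1/2, the implicit step is unconditionally stable but needs a linear solve, and the hybrid rule picks one by comparing dt with the critical time scale:

    import numpy as np

    alpha, dx = 1.0, 1.0 / 51

    def explicit_step(u, dt):                  # forward Euler
        r = alpha * dt / dx**2
        un = u.copy()
        un[1:-1] = u[1:-1] + r * (u[2:] - 2 * u[1:-1] + u[:-2])
        return un

    def implicit_step(u, dt):                  # backward Euler
        r = alpha * dt / dx**2
        m = len(u) - 2
        A = ((1 + 2 * r) * np.eye(m)
             - r * np.eye(m, k=1) - r * np.eye(m, k=-1))
        un = u.copy()
        un[1:-1] = np.linalg.solve(A, u[1:-1])
        return un

    def hybrid_step(u, dt):                    # pick by critical time scale
        dt_crit = 0.5 * dx**2 / alpha
        return explicit_step(u, dt) if dt <= dt_crit else implicit_step(u, dt)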

  7. Massively Parallel Processing for Fast and Accurate Stamping Simulations

    NASA Astrophysics Data System (ADS)

    Gress, Jeffrey J.; Xu, Siguang; Joshi, Ramesh; Wang, Chuan-tao; Paul, Sabu

    2005-08-01

    The competitive automotive market drives automotive manufacturers to speed up vehicle development cycles and reduce lead-time. Fast tooling development is one of the key areas to support fast and short vehicle development programs (VDP). In the past ten years, stamping simulation has become the most effective validation tool in predicting and resolving all potential formability and quality problems before the dies are physically made. Stamping simulation and formability analysis have become a critical business segment in GM's math-based die engineering process. As simulation becomes one of the major production tools in the engineering factory, simulation speed and accuracy are two of the most important measures of stamping simulation technology. The speed and time-in-system of forming analysis become even more critical to support fast VDP and tooling readiness. Since 1997, General Motors Die Center has been working jointly with our software vendor to develop and implement a parallel version of simulation software for mass production analysis applications. By 2001, this technology had matured in the form of distributed memory processing (DMP) of draw die simulations in a networked distributed memory computing environment. In 2004, this technology was refined to massively parallel processing (MPP) and extended to line die forming analysis (draw, trim, flange, and associated spring-back) running on a dedicated computing environment. The evolution of this technology and the insight gained through the implementation of DMP/MPP technology as well as performance benchmarks are discussed in this publication.

  8. Cloud identification using genetic algorithms and massively parallel computation

    NASA Technical Reports Server (NTRS)

    Buckles, Bill P.; Petry, Frederick E.

    1996-01-01

    As a Guest Computational Investigator under the NASA administered component of the High Performance Computing and Communication Program, we implemented a massively parallel genetic algorithm on the MasPar SIMD computer. Experiments were conducted using Earth Science data in the domains of meteorology and oceanography. Results obtained in these domains are competitive with, and in most cases better than, similar problems solved using other methods. In the meteorological domain, we chose to identify clouds using AVHRR spectral data. Four cloud speciations were used although most researchers settle for three. Results were remarkably consistent across all tests (91% accuracy). Refinements of this method may lead to more timely and complete information for Global Circulation Models (GCMs) that are prevalent in weather forecasting and global environment studies. In the oceanographic domain, we chose to identify ocean currents from a spectrometer having similar characteristics to AVHRR. Here the results were mixed (60% to 80% accuracy). If one is willing to run the experiment several times (say 10), then it is acceptable to claim the higher accuracy rating. This problem has never been successfully automated. Therefore, these results are encouraging even though less impressive than the cloud experiment. Successful conclusion of an automated ocean current detection system would impact coastal fishing, naval tactics, and the study of micro-climates. Finally we contributed to the basic knowledge of GA (genetic algorithm) behavior in parallel environments. We developed better knowledge of the use of subpopulations in the context of shared breeding pools and the migration of individuals. Rigorous experiments were conducted based on quantifiable performance criteria. While much of the work confirmed current wisdom, for the first time we were able to submit conclusive evidence. The software developed under this grant was placed in the public domain. An extensive user
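
    The subpopulation-and-migration machinery discussed above (an island model) is small enough to sketch. Everything below is a toy stand-in: the fitness function is not the cloud classifier, and all parameters are invented:

    import numpy as np

    rng = np.random.default_rng(3)

    def fitness(pop):                        # toy objective on [0, 1]^8
        return -np.abs(pop - 0.75).mean(axis=1)

    islands = [rng.random((20, 8)) for _ in range(4)]
    for gen in range(100):
        for k, pop in enumerate(islands):
            parents = pop[np.argsort(fitness(pop))][-10:]   # keep the best
            i, j = rng.integers(10, size=(2, 20))
            cut = rng.integers(1, 8, size=20)[:, None]
            mask = np.arange(8)[None, :] < cut              # 1-point crossover
            children = np.where(mask, parents[i], parents[j])
            children += rng.normal(0.0, 0.02, children.shape)  # mutation
            islands[k] = children
        if gen % 10 == 0:                    # migrate the best around a ring
            best = [p[np.argmax(fitness(p))] for p in islands]
            for k in range(4):
                islands[k][0] = best[k - 1]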

  9. Comparing current cluster, massively parallel, and accelerated systems

    SciTech Connect

    Barker, Kevin J; Davis, Kei; Hoisie, Adolfy; Kerbyson, Darren J; Pakin, Scott; Lang, Mike; Sancho Pitarch, Jose C

    2010-01-01

    Currently there is large architectural diversity in high-performance computing systems. They include 'commodity' cluster systems that optimize per-node performance for small jobs, massively parallel processors (MPPs) that optimize aggregate performance for large jobs, and accelerated systems that optimize both per-node and aggregate performance but only for applications custom-designed to take advantage of such systems. Because of these dissimilarities, meaningful comparisons of achievable performance are not straightforward. In this work we utilize a methodology that combines both empirical analysis and performance modeling to compare clusters (represented by a 4,352-core IB cluster), MPPs (represented by a 147,456-core BG/P), and accelerated systems (represented by the 129,600-core Roadrunner) across a workload of four applications. Strengths of our approach include the ability to compare architectures (as opposed to specific implementations of an architecture), to attribute each application's performance bottlenecks to characteristics unique to each system, and to explore performance scenarios in advance of their availability for measurement. Our analysis illustrates that application performance is essentially unrelated to relative peak performance but that application performance can be both predicted and explained using modeling.

  10. Investigation of reflective notching with massively parallel simulation

    NASA Astrophysics Data System (ADS)

    Tadros, Karim H.; Neureuther, Andrew R.; Gamelin, John K.; Guerrieri, Roberto

    1990-06-01

    A massively parallel simulation program, TEMPEST, is used to investigate the role of topography in generating reflective notching and to study the possibility of reducing these effects through the introduction of special properties of resists and antireflection coating materials. The emphasis is on examining physical scattering mechanisms such as focused specular reflections, resist-thickness interference effects, reflections from substrate grains, and focusing of incident light by the resist curvature. Specular reflection from topography can focus incident radiation, causing a 10-fold increase in effective exposure. Further complications, such as dimples in the surface of positive resist features, can result from a second reflection of focused energy by the resist/air interface. Variations in line-edge exposure due to substrate grain structure are primarily specular in nature and can become significant for grains larger than the wavelength in the resist. Local exposure variations due to vertical standing waves and changes in energy coupling due to changes in resist thickness are displaced laterally and are significant effects, even though they are slightly less severe than vertical wave propagation theory suggests. Focusing effects due to refraction by the curved surface of the resist produce only minor changes in exposure. Increased resist contrast and resist absorption offer some improvement in reducing notching effects, though minimizing substrate reflectivity is more effective. CPU time using 32 virtual nodes to simulate a 4 μm by 2 μm isolated domain with 13 bleaching steps was 30 minutes.

  11. A high-plex PCR approach for massively parallel sequencing.

    PubMed

    Nguyen-Dumont, Tú; Pope, Bernard J; Hammet, Fleur; Southey, Melissa C; Park, Daniel J

    2013-08-01

    Current methods for targeted massively parallel sequencing (MPS) have several drawbacks, including limited design flexibility, expense, and protocol complexity, which restrict their application to settings involving modest target size and requiring low cost and high throughput. To address this, we have developed Hi-Plex, a PCR-MPS strategy intended for high-throughput screening of multiple genomic target regions that integrates simple, automated primer design software to control product size. Featuring permissive thermocycling conditions and clamp bias reduction, our protocol is simple, cost- and time-effective, uses readily available reagents, does not require expensive instrumentation, and requires minimal optimization. In a 60-plex assay targeting the breast cancer predisposition genes PALB2 and XRCC2, we applied Hi-Plex to 100 ng LCL-derived DNA, and 100 ng and 25 ng FFPE tumor-derived DNA. Altogether, at least 86.94% of the human genome-mapped reads were on target, and 100% of targeted amplicons were represented within 25-fold of the mean. Using 25 ng FFPE-derived DNA, 95.14% of mapped reads were on-target and relative representation ranged from 10.1-fold lower to 5.8-fold higher than the mean. These results were obtained using only the initial automatically-designed primers present in equal concentration. Hi-Plex represents a powerful new approach for screening panels of genomic target regions. PMID:23931594

  12. Wavelet-Based DFT calculations on Massively Parallel Hybrid Architectures

    NASA Astrophysics Data System (ADS)

    Genovese, Luigi

    2011-03-01

    In this contribution, we present an implementation of a full DFT code that can run on massively parallel hybrid CPU-GPU clusters. Our implementation is based on modern GPU architectures which support double-precision floating-point numbers. This DFT code, named BigDFT, is delivered within the GNU-GPL license either in a stand-alone version or integrated in the ABINIT software package. Hybrid BigDFT routines were initially ported with NVidia's CUDA language, and recently more functionalities have been added with new routines written within Khronos' OpenCL standard. The formalism of this code is based on Daubechies wavelets, which form a systematic real-space basis set. As we will see in the presentation, the properties of this basis set are well suited for an extension to a GPU-accelerated environment. In addition to the implementation of the operators of the BigDFT code, this presentation also covers the usage of GPU resources in a complex code with different kinds of operations. The present and expected performance of hybrid-architecture computation in the framework of electronic structure calculations is also discussed.

  13. Massively parallel processor networks with optical express channels

    DOEpatents

    Deri, R.J.; Brooks, E.D. III; Haigh, R.E.; DeGroot, A.J.

    1999-08-24

    An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination. 3 figs.

  14. Massively parallel processor networks with optical express channels

    DOEpatents

    Deri, Robert J.; Brooks, III, Eugene D.; Haigh, Ronald E.; DeGroot, Anthony J.

    1999-01-01

    An optical method for separating and routing local and express channel data comprises interconnecting the nodes in a network with fiber optic cables. A single fiber optic cable carries both express channel traffic and local channel traffic, e.g., in a massively parallel processor (MPP) network. Express channel traffic is placed on, or filtered from, the fiber optic cable at a light frequency or a color different from that of the local channel traffic. The express channel traffic is thus placed on a light carrier that skips over the local intermediate nodes one-by-one by reflecting off of selective mirrors placed at each local node. The local-channel-traffic light carriers pass through the selective mirrors and are not reflected. A single fiber optic cable can thus be threaded throughout a three-dimensional matrix of nodes with the x,y,z directions of propagation encoded by the color of the respective light carriers for both local and express channel traffic. Thus frequency division multiple access is used to hierarchically separate the local and express channels to eliminate the bucket brigade latencies that would otherwise result if the express traffic had to hop between every local node to reach its ultimate destination.

  15. Massively parallel support for a case-based planning system

    NASA Technical Reports Server (NTRS)

    Kettler, Brian P.; Hendler, James A.; Anderson, William A.

    1993-01-01

    Case-based planning (CBP), a kind of case-based reasoning, is a technique in which previously generated plans (cases) are stored in memory and can be reused to solve similar planning problems in the future. CBP can save considerable time over generative planning, in which a new plan is produced from scratch. CBP thus offers a potential (heuristic) mechanism for handling intractable problems. One drawback of CBP systems has been the need for a highly structured memory to reduce retrieval times. This approach requires significant domain engineering and complex memory indexing schemes to make these planners efficient. In contrast, our CBP system, CaPER, uses a massively parallel frame-based AI language (PARKA) and can do extremely fast retrieval of complex cases from a large, unindexed memory. The ability to do fast, frequent retrievals has many advantages: indexing is unnecessary; very large case bases can be used; memory can be probed in numerous alternate ways; and queries can be made at several levels, allowing more specific retrieval of stored plans that better fit the target problem with less adaptation. In this paper we describe CaPER's case retrieval techniques and some experimental results showing its good performance, even on large case bases.

  16. Three-dimensional radiative transfer on a massively parallel computer

    NASA Astrophysics Data System (ADS)

    Vath, H. M.

    1994-04-01

    We perform 3D radiative transfer calculations in non-local thermodynamic equilibrium (NLTE) in the simple two-level atom approximation on the MasPar MP-1, which contains 8192 processors and is a single instruction multiple data (SIMD) machine, an example of the new generation of massively parallel computers. On such a machine, all processors execute the same command at a given time, but on different data. To make radiative transfer calculations efficient, we must re-consider the numerical methods and storage of data. To solve the transfer equation, we adopt the short characteristic method and examine different acceleration methods to obtain the source function. We use the ALI method and test local and non-local operators. Furthermore, we compare the Ng and the orthomin methods of acceleration. We also investigate the use of multi-grid methods to get fast solutions for the NLTE case. In order to test these numerical methods, we apply them to two problems with and without periodic boundary conditions.

  17. PFLOTRAN: Recent Developments Facilitating Massively-Parallel Reactive Biogeochemical Transport

    NASA Astrophysics Data System (ADS)

    Hammond, G. E.

    2015-12-01

    With the recent shift towards modeling carbon and nitrogen cycling in support of climate-related initiatives, emphasis has been placed on incorporating increasingly mechanistic biogeochemistry within Earth system models to more accurately predict the response of terrestrial processes to natural and anthropogenic climate cycles. PFLOTRAN is an open-source subsurface code that is specialized for simulating multiphase flow and multicomponent biogeochemical transport on supercomputers. The object-oriented code was designed with modularity in mind and has been coupled with several third-party simulators (e.g. CLM to simulate land surface processes and E4D for coupled hydrogeophysical inversion). Central to PFLOTRAN's capabilities is its ability to simulate tightly-coupled reactive transport processes. This presentation focuses on recent enhancements to the code that enable the solution of large parameterized biogeochemical reaction networks with numerous chemical species. PFLOTRAN's "reaction sandbox" is described, which facilitates the implementation of user-defined reaction networks without the need for a comprehensive understanding of PFLOTRAN software infrastructure. The reaction sandbox is written in modern Fortran (2003-2008) and leverages encapsulation, inheritance, and polymorphism to provide the researcher with a flexible workspace for prototyping reactions within a massively parallel flow and transport simulation framework. As these prototypical reactions mature into well-accepted implementations, they can be incorporated into PFLOTRAN as native biogeochemistry capability. Users of the reaction sandbox are encouraged to upload their source code to PFLOTRAN's main source code repository, including the addition of simple regression tests to better ensure the long-term code compatibility and validity of simulation results.
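
    The sandbox pattern itself (a base class fixing the interface, user subclasses, polymorphic dispatch by the solver) can be sketched compactly, here in Python rather than PFLOTRAN's modern Fortran; class and method names below are illustrative, not PFLOTRAN's API:

    from abc import ABC, abstractmethod

    class ReactionSandbox(ABC):
        @abstractmethod
        def rate(self, c):
            """Return d(concentration)/dt contributions for species c."""

    class FirstOrderDecay(ReactionSandbox):
        def __init__(self, species, k):
            self.species, self.k = species, k
        def rate(self, c):
            r = [0.0] * len(c)
            r[self.species] = -self.k * c[self.species]
            return r

    reactions = [FirstOrderDecay(0, 1e-5)]        # user-registered reactions
    c = [1.0, 0.5]
    dcdt = [sum(r.rate(c)[i] for r in reactions) for i in range(len(c))]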

  18. Time efficient 3-D electromagnetic modeling on massively parallel computers

    SciTech Connect

    Alumbaugh, D.L.; Newman, G.A.

    1995-08-01

    A numerical modeling algorithm has been developed to simulate the electromagnetic response of a three dimensional earth to a dipole source for frequencies ranging from 100 Hz to 100 MHz. The numerical problem is formulated in terms of a frequency-domain modified vector Helmholtz equation for the scattered electric fields. The resulting differential equation is approximated using a staggered finite difference grid which results in a linear system of equations for which the matrix is sparse and complex symmetric. The system of equations is solved using a preconditioned quasi-minimum-residual method. Dirichlet boundary conditions are employed at the edges of the mesh by setting the tangential electric fields equal to zero. At frequencies less than 1 MHz, normal grid stretching is employed to mitigate unwanted reflections off the grid boundaries. For frequencies greater than this, absorbing boundary conditions must be employed by making the stretching parameters of the modified vector Helmholtz equation complex, which introduces loss at the boundaries. To allow for faster calculation of realistic models, the original serial version of the code has been modified to run on a massively parallel architecture. This modification involves three distinct tasks: (1) mapping the finite difference stencil to a processor stencil, which allows the necessary information to be exchanged between processors that contain adjacent nodes in the model, (2) determining the most efficient method to input the model, which is accomplished by dividing the input into "global" and "local" data and then reading the two sets in differently, and (3) deciding how to output the data, which is an inherently nonparallel process.
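
    Task (1), the processor-stencil mapping, is essentially a ghost-cell (halo) exchange. A minimal sketch with mpi4py on a 1-D slab decomposition is shown below; it is the generic pattern, not the authors' code, and the update is a toy explicit stencil.

        # Halo exchange: each rank trades boundary values with its neighbours so
        # the finite-difference stencil can be applied to local interior points.
        # Run with, e.g.: mpiexec -n 4 python halo.py
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
        nloc = 8                                  # interior points per rank
        u = np.full(nloc + 2, float(rank))        # two ghost cells

        left = rank - 1 if rank > 0 else MPI.PROC_NULL
        right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

        # send boundary values outward, receive neighbours' into ghost cells
        comm.Sendrecv(u[1:2], dest=left, recvbuf=u[nloc + 1:nloc + 2], source=right)
        comm.Sendrecv(u[nloc:nloc + 1], dest=right, recvbuf=u[0:1], source=left)

        # one explicit stencil update on the interior
        u[1:nloc + 1] += 0.5 * (u[0:nloc] - 2 * u[1:nloc + 1] + u[2:nloc + 2])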

  19. Massively parallel computational fluid dynamics calculations for aerodynamics and aerothermodynamics applications

    SciTech Connect

    Payne, J.L.; Hassan, B.

    1998-09-01

    Massively parallel computers have enabled the analyst to solve complicated flow fields (turbulent, chemically reacting) that were previously intractable. Calculations are presented using a massively parallel CFD code called SACCARA (Sandia Advanced Code for Compressible Aerothermodynamics Research and Analysis) currently under development at Sandia National Laboratories as part of the Department of Energy (DOE) Accelerated Strategic Computing Initiative (ASCI). Computations were made on a generic reentry vehicle in a hypersonic flowfield utilizing three different distributed parallel computers to assess the parallel efficiency of the code with increasing numbers of processors. The parallel efficiencies for the SACCARA code will be presented for cases using 1, 50, 100 and 500 processors. Computations were also made on a subsonic/transonic vehicle using both 236 and 521 processors on a grid containing approximately 14.7 million grid points. Ongoing and future plans to implement a parallel overset grid capability and couple SACCARA with other mechanics codes in a massively parallel environment are discussed.

  20. Three-Dimensional Radiative Transfer on a Massively Parallel Computer.

    NASA Astrophysics Data System (ADS)

    Vath, Horst Michael

    1994-01-01

    We perform three-dimensional radiative transfer calculations on the MasPar MP-1, which contains 8192 processors and is a single instruction multiple data (SIMD) machine, an example of the new generation of massively parallel computers. To make radiative transfer calculations efficient, we must reconsider the numerical methods and methods of storage of data that have been used with serial machines. We developed a numerical code which efficiently calculates images and spectra of astrophysical systems as seen from different viewing directions and at different wavelengths. We use this code to examine a number of different astrophysical systems. First we image the HI distribution of model galaxies. Then we investigate the galaxy NGC 5055, which displays a radial asymmetry in its optical appearance. This can be explained by the presence of dust in the outer HI disk far beyond the optical disk. As the formation of dust is connected to the presence of stars, the existence of dust in outer regions of this galaxy could have consequences for star formation at a time when this galaxy was just forming. Next we use the code for polarized radiative transfer. We first discuss the numerical computation of the required cyclotron opacities and use them to calculate spectra of AM Her systems, binaries containing accreting magnetic white dwarfs. Then we obtain spectra of an extended polar cap. Previous calculations did not consider the three-dimensional extension of the shock. We find that this results in a significant underestimate of the radiation emitted in the shock. Next we calculate the spectrum of the intermediate polar RE 0751+14. For this system we obtain a magnetic field of ~10 MG, which has consequences for the evolution of intermediate polars. Finally we perform 3D radiative transfer in NLTE in the two-level atom approximation. To solve the transfer equation in this case, we adapt the short characteristic method and examine different acceleration methods to obtain the source function.

  1. SWAMP+: multiple subsequence alignment using associative massive parallelism

    SciTech Connect

    Steinfadt, Shannon Irene; Baker, Johnnie W

    2010-10-18

    A new parallel algorithm SWAMP+ incorporates the Smith-Waterman sequence alignment on an associative parallel model known as ASC. It is a highly sensitive parallel approach that expands traditional pairwise sequence alignment. This is the first parallel algorithm to provide multiple non-overlapping, non-intersecting subsequence alignments with the accuracy of Smith-Waterman. The efficient algorithm provides multiple alignments similar to BLAST while creating a better workflow for the end users. The parallel portions of the code run in O(m+n) time using m processors. When m = n, the algorithmic analysis becomes O(n) with a coefficient of two, yielding a linear speedup. Implementation of the algorithm on the SIMD ClearSpeed CSX620 confirms this theoretical linear speedup with real timings.
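
    The source of the O(m+n) bound is the anti-diagonal (wavefront) schedule: every cell on diagonal i+j=k depends only on diagonals k-1 and k-2, so a whole diagonal can be computed at once. The numpy sketch below reproduces that schedule on one CPU (the scoring values are arbitrary; this is not the ASC/ClearSpeed implementation).

        # Wavefront Smith-Waterman: vectorize each anti-diagonal of the DP table.
        import numpy as np

        def sw_wavefront(a, b, match=2, mismatch=-1, gap=1):
            a = np.frombuffer(a.encode(), np.uint8)
            b = np.frombuffer(b.encode(), np.uint8)
            m, n = len(a), len(b)
            H = np.zeros((m + 1, n + 1), dtype=np.int32)
            for k in range(2, m + n + 1):              # sweep anti-diagonals
                i = np.arange(max(1, k - n), min(m, k - 1) + 1)
                j = k - i
                sub = np.where(a[i - 1] == b[j - 1], match, mismatch)
                H[i, j] = np.maximum.reduce([
                    np.zeros_like(i),                  # local alignment floor
                    H[i - 1, j - 1] + sub,             # diagonal k-2: match/mismatch
                    H[i - 1, j] - gap,                 # diagonal k-1: gap in b
                    H[i, j - 1] - gap,                 # diagonal k-1: gap in a
                ])
            return H.max()

        print(sw_wavefront("ACACACTA", "AGCACACA"))    # best local alignment score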

  2. QCD on the Massively Parallel Computer AP1000

    NASA Astrophysics Data System (ADS)

    Akemi, K.; Fujisaki, M.; Okuda, M.; Tago, Y.; Hashimoto, T.; Hioki, S.; Miyamura, O.; Takaishi, T.; Nakamura, A.; de Forcrand, Ph.; Hege, C.; Stamatescu, I. O.

    We present the QCD-TARO program of calculations, which uses the Fujitsu parallel computer AP1000. We discuss the results on scaling, correlation times and hadronic spectrum, some aspects of the implementation, and the future prospects.

  3. A development plan for a massively parallel version of the hydrocode CTH

    SciTech Connect

    Robinson, A.C.; Fang, E.; Holdridge, D.; McGlaun, J.M.

    1990-07-01

    Massively parallel computers and computer networks are beginning to appear as an integral part of the scientific computing workplace. This report documents the goals and the corresponding development plan of the massively parallel project of Departments 1530 and 1420. The main goal of the project is to provide a clear understanding of the issues and difficulties involved in bringing the current production hydrocode CTH to the state of being portable to a number of currently available parallel computing architectures. In the process of this research, various working versions of the code will be produced. 6 refs., 6 figs.

  4. Targeted single molecule mutation detection with massively parallel sequencing

    PubMed Central

    Gregory, Mark T.; Bertout, Jessica A.; Ericson, Nolan G.; Taylor, Sean D.; Mukherjee, Rithun; Robins, Harlan S.; Drescher, Charles W.; Bielas, Jason H.

    2016-01-01

    Next-generation sequencing (NGS) technologies have transformed genomic research and have the potential to revolutionize clinical medicine. However, the background error rates of sequencing instruments and limitations in targeted read coverage have precluded the detection of rare DNA sequence variants by NGS. Here we describe a method, termed CypherSeq, which combines double-stranded barcoding error correction and rolling circle amplification (RCA)-based target enrichment to vastly improve NGS-based rare variant detection. The CypherSeq methodology involves the ligation of sample DNA into circular vectors, which contain double-stranded barcodes for computational error correction and adapters for library preparation and sequencing. CypherSeq is capable of detecting rare mutations genome-wide as well as those within specific target genes via RCA-based enrichment. We demonstrate that CypherSeq is capable of correcting errors incurred during library preparation and sequencing to reproducibly detect mutations down to a frequency of 2.4 × 10^-7 per base pair, and report the frequency and spectra of spontaneous and ethyl methanesulfonate-induced mutations across the Saccharomyces cerevisiae genome. PMID:26384417
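
    The double-stranded barcode correction at the heart of this approach can be sketched simply: reads that share a barcode come from one source molecule, so a per-position majority vote across the family cancels independent sequencing errors. The snippet below is a toy version of that vote (barcode parsing, strand pairing and RCA enrichment are omitted, and the names are illustrative).

        # Barcode-family consensus: group reads by barcode, then majority-vote
        # each position; singleton errors within a family are voted away.
        from collections import Counter, defaultdict

        reads = [("AATG", "ACGTACGT"), ("AATG", "ACGAACGT"),
                 ("AATG", "ACGTACGT"), ("TTCA", "GGGTACCC")]

        families = defaultdict(list)
        for barcode, seq in reads:
            families[barcode].append(seq)

        def consensus(seqs, min_reads=3):
            if len(seqs) < min_reads:          # too few reads to trust a vote
                return None
            return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))

        for bc, seqs in families.items():
            print(bc, consensus(seqs))         # AATG -> ACGTACGT; TTCA -> None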

  5. A Massively Parallel Adaptive Fast Multipole Method on Heterogeneous Architectures

    SciTech Connect

    Lashuk, Ilya; Chandramowlishwaran, Aparna; Langston, Harper; Nguyen, Tuan-Anh; Sampath, Rahul S; Shringarpure, Aashay; Vuduc, Richard; Ying, Lexing; Zorin, Denis; Biros, George

    2012-01-01

    We describe a parallel fast multipole method (FMM) for highly nonuniform distributions of particles. We employ both distributed memory parallelism (via MPI) and shared memory parallelism (via OpenMP and GPU acceleration) to rapidly evaluate two-body nonoscillatory potentials in three dimensions on heterogeneous high performance computing architectures. We have performed scalability tests with up to 30 billion particles on 196,608 cores on the AMD/CRAY-based Jaguar system at ORNL. On a GPU-enabled system (NSF's Keeneland at Georgia Tech/ORNL), we observed 30x speedup over a single core CPU and 7x speedup over a multicore CPU implementation. By combining GPUs with MPI, we achieve less than 10 ns/particle and six digits of accuracy for a run with 48 million nonuniformly distributed particles on 192 GPUs.

  6. Massively parallel switch-level simulation: A feasibility study

    SciTech Connect

    Kravitz, S.A.

    1989-01-01

    This thesis addresses the feasibility of mapping the COSMOS switch-level simulator onto computers with thousands of simple processors. COSMOS preprocesses transistor networks into equivalent Boolean behavioral models, capturing the switch-level behavior of a circuit in a set of Boolean formulas. The author shows that thousand-fold parallelism exists in the formulas derived by COSMOS for some actual circuits. He exposes this parallelism by eliminating the event list from the simulator, and he demonstrates that this represents an attractive tradeoff given sufficient parallelism in the circuit model. To investigate the feasibility of this approach, he has developed a prototype implementation of the COSMOS simulator on a 32k-processor Connection Machine.
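
    Eliminating the event list means every compiled formula is simply evaluated every cycle, which is what makes the workload data-parallel. The sketch below shows the flavor of that on a toy netlist, packing 64 independent input patterns into the bit-lanes of Python integers; the circuit is invented, not a COSMOS model.

        # Event-list-free evaluation: compute all gate outputs each cycle, with 64
        # test patterns travelling in parallel through the bits of each integer.
        import random

        random.seed(1)
        MASK = (1 << 64) - 1
        a = random.getrandbits(64)          # 64 patterns for input a
        b = random.getrandbits(64)
        c = random.getrandbits(64)

        n1 = a & b                          # AND gate, all 64 patterns at once
        n2 = n1 | c                         # OR gate
        out = ~n2 & MASK                    # inverter, masked to 64 lanes

        print(f"{out:064b}")                # one output bit per input pattern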

  7. High density packaging and interconnect of massively parallel image processors

    NASA Technical Reports Server (NTRS)

    Carson, John C.; Indin, Ronald J.

    1991-01-01

    This paper presents conceptual designs for high density packaging of parallel processing systems. The systems fall into two categories: global memory systems where many processors are packaged into a stack, and distributed memory systems where a single processor and many memory chips are packaged into a stack. Thermal behavior and performance are discussed.

  8. Molecular simulation of rheological properties using massively parallel supercomputers

    SciTech Connect

    Bhupathiraju, R.K.; Cui, S.T.; Gupta, S.A.; Cummings, P.T.; Cochran, H.D.

    1996-11-01

    Advances in parallel supercomputing now make possible molecular-based engineering and science calculations that will soon revolutionize many technologies, such as those involving polymers and those involving aqueous electrolytes. We have developed a suite of message-passing codes for classical molecular simulation of such complex fluids and amorphous materials and have completed a number of demonstration calculations of problems of scientific and technological importance with each. In this paper, we will focus on the molecular simulation of rheological properties, particularly viscosity, of simple and complex fluids using parallel implementations of non-equilibrium molecular dynamics. Such calculations represent significant challenges computationally because, in order to reduce the thermal noise in the calculated properties within acceptable limits, large systems and/or long simulated times are required.

  9. Casting Pearls Ballistically: Efficient Massively Parallel Simulation of Particle Deposition

    NASA Astrophysics Data System (ADS)

    Lubachevsky, Boris D.; Privman, Vladimir; Roy, Subhas C.

    1996-06-01

    We simulate ballistic particle deposition wherein a large number of spherical particles are "cast" vertically over a planar horizontal surface. Upon first contact (with the surface or with a previously deposited particle) each particle stops. This model helps material scientists to study the adsorption and sediment formation. The model is sequential, with particles deposited one by one. We have found an equivalent formulation using a continuous time random process and we simulate the latter in parallel using a method similar to the one previously employed for simulating Ising spins. We augment the parallel algorithm for simulating Ising spins with several techniques aimed at the increase of efficiency of producing the particle configuration and statistics collection. Some of these techniques are similar to earlier ones. We implement the resulting algorithm on a 16K PE MasPar MP-1 and a 4K PE MasPar MP-2. The parallel code runs on MasPar computers nearly two orders of magnitude faster than an optimized sequential code runs on a fast workstation.
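
    The sequential rule being parallelized is easy to state on a lattice: a particle cast down column x sticks at first contact, so the column height jumps to one more than the maximum of itself and its neighbours. The sketch below is that 1-D lattice analogue of the continuous sphere model, for illustration only.

        # Sequential lattice ballistic deposition: particles dropped one by one.
        import numpy as np

        rng = np.random.default_rng(42)
        L, n_particles = 50, 2000
        h = np.zeros(L, dtype=int)               # surface height of each column

        for _ in range(n_particles):
            x = rng.integers(L)                  # column the particle is cast into
            h[x] = max(h[(x - 1) % L], h[x], h[(x + 1) % L]) + 1   # first contact
        print(h.mean(), h.std())                 # mean height and surface roughness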

  10. Casting pearls ballistically: Efficient massively parallel simulation of particle deposition

    SciTech Connect

    Lubachevsky, B.D.; Privman, V.; Roy, S.C.

    1996-06-01

    We simulate ballistic particle deposition wherein a large number of spherical particles are "cast" vertically over a planar horizontal surface. Upon first contact (with the surface or with a previously deposited particle) each particle stops. This model helps material scientists to study the adsorption and sediment formation. The model is sequential, with particles deposited one by one. We have found an equivalent formulation using a continuous time random process and we simulate the latter in parallel using a method similar to the one previously employed for simulating Ising spins. We augment the parallel algorithm for simulating Ising spins with several techniques aimed at the increase of efficiency of producing the particle configuration and statistics collection. Some of these techniques are similar to earlier ones. We implement the resulting algorithm on a 16K PE MasPar MP-1 and a 4K PE MasPar MP-2. The parallel code runs on MasPar computers nearly two orders of magnitude faster than an optimized sequential code runs on a fast workstation. 17 refs., 9 figs.

  11. Performance effects of irregular communications patterns on massively parallel multiprocessors

    NASA Technical Reports Server (NTRS)

    Saltz, Joel; Petiton, Serge; Berryman, Harry; Rifkin, Adam

    1991-01-01

    A detailed study of the performance effects of irregular communications patterns on the CM-2 was conducted. The communications capabilities of the CM-2 were characterized under a variety of controlled conditions. In the process of carrying out the performance evaluation, extensive use was made of a parameterized synthetic mesh. In addition, timings with unstructured meshes generated for aerodynamic codes and a set of sparse matrices with banded patterns of non-zeroes were performed. This benchmarking suite stresses the communications capabilities of the CM-2 in a range of different ways. Benchmark results demonstrate that it is possible to make effective use of much of the massive concurrency available in the communications network.

  12. Massively parallel spatial light modulation-based optical signal processing

    NASA Astrophysics Data System (ADS)

    Li, Yao

    1993-03-01

    A new optical parallel arithmetic processing scheme using a nonholographic optoelectronic content-addressable memory (CAM) was proposed. The design of a four-bit CAM-based optical carry look-ahead adder was studied. Compared with existing optoelectronic binary addition approaches, this nonholographic CAM scheme offers a number of practical advantages, such as faster processing speed and ease of optical implementation and alignment. For an addition of numbers longer than four bits, a number of four-bit CLAs can be cascaded by incorporating the previous stage's carry. Experimental results were also demonstrated, and a paper was published in Optics Letters.
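
    The carry look-ahead logic itself is small enough to write out. In the sketch below, g and p are the usual generate/propagate terms and each carry follows from c[i+1] = g[i] OR (p[i] AND c[i]); hardware (optical or electronic) expands this recurrence into flat two-level logic so all carries appear simultaneously. This is the generic CLA, not the paper's CAM encoding.

        # Four-bit carry look-ahead addition from generate/propagate terms.
        def cla4(a, b, cin=0):
            g = [(a >> i & 1) & (b >> i & 1) for i in range(4)]   # generate
            p = [(a >> i & 1) ^ (b >> i & 1) for i in range(4)]   # propagate
            c = [cin]
            for i in range(4):                  # c[i+1] = g[i] | p[i] & c[i]
                c.append(g[i] | (p[i] & c[i]))
            s = sum((p[i] ^ c[i]) << i for i in range(4))
            return s, c[4]                      # 4-bit sum, carry-out to next stage

        print(cla4(0b1011, 0b0110))             # 11 + 6 = 17 -> (1, 1)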

  13. A sweep algorithm for massively parallel simulation of circuit-switched networks

    NASA Technical Reports Server (NTRS)

    Gaujal, Bruno; Greenberg, Albert G.; Nicol, David M.

    1992-01-01

    A new massively parallel algorithm is presented for simulating large asymmetric circuit-switched networks, controlled by a randomized-routing policy that includes trunk-reservation. A single instruction multiple data (SIMD) implementation is described, and corresponding experiments on a 16384 processor MasPar parallel computer are reported. A multiple instruction multiple data (MIMD) implementation is also described, and corresponding experiments on an Intel IPSC/860 parallel computer, using 16 processors, are reported. By exploiting parallelism, our algorithm increases the possible execution rate of such complex simulations by as much as an order of magnitude.

  14. Performance of the Wavelet Decomposition on Massively Parallel Architectures

    NASA Technical Reports Server (NTRS)

    El-Ghazawi, Tarek A.; LeMoigne, Jacqueline; Zukor, Dorothy (Technical Monitor)

    2001-01-01

    Traditionally, Fourier Transforms have been utilized for performing signal analysis and representation. But although it is straightforward to reconstruct a signal from its Fourier transform, no local description of the signal is included in its Fourier representation. To alleviate this problem, Windowed Fourier transforms and then wavelet transforms have been introduced, and it has been proven that wavelets give a better localization than traditional Fourier transforms, as well as a better division of the time- or space-frequency plane than Windowed Fourier transforms. Because of these properties and after the development of several fast algorithms for computing the wavelet representation of any signal, in particular the Multi-Resolution Analysis (MRA) developed by Mallat, wavelet transforms have increasingly been applied to signal analysis problems, especially real-life problems, in which speed is critical. In this paper we present and compare efficient wavelet decomposition algorithms on different parallel architectures. We report and analyze experimental measurements, using NASA remotely sensed images. Results show that our algorithms achieve significant performance gains on current high performance parallel systems, and meet scientific applications and multimedia requirements. The extensive performance measurements collected over a number of high-performance computer systems have revealed important architectural characteristics of these systems, in relation to the processing demands of the wavelet decomposition of digital images.
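
    One level of the Mallat decomposition referred to above is a pair of separable filter-and-downsample passes. The sketch below uses the Haar filters for brevity (real coders use longer filters and recurse on the approximation band); it is a generic illustration, not the paper's parallel implementation.

        # One MRA level with Haar filters: rows then columns, lowpass/highpass
        # plus dyadic downsampling, giving LL (approximation) and three detail bands.
        import numpy as np

        def haar_level(img):
            img = img[:img.shape[0] // 2 * 2, :img.shape[1] // 2 * 2]
            lo = (img[0::2, :] + img[1::2, :]) / 2    # row lowpass + downsample
            hi = (img[0::2, :] - img[1::2, :]) / 2    # row highpass + downsample
            LL = (lo[:, 0::2] + lo[:, 1::2]) / 2      # approximation
            LH = (lo[:, 0::2] - lo[:, 1::2]) / 2      # horizontal detail
            HL = (hi[:, 0::2] + hi[:, 1::2]) / 2      # vertical detail
            HH = (hi[:, 0::2] - hi[:, 1::2]) / 2      # diagonal detail
            return LL, LH, HL, HH

        img = np.arange(64, dtype=float).reshape(8, 8)
        print([band.shape for band in haar_level(img)])   # four 4x4 bands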

  15. Scientific development of a massively parallel ocean climate model. Final report

    SciTech Connect

    Semtner, A.J.; Chervin, R.M.

    1996-09-01

    Over the last three years, very significant advances have been made in refining the grid resolution of ocean models and in improving the physical and numerical treatments of ocean hydrodynamics. Some of these advances have occurred as a result of the successful transition of ocean models onto massively parallel computers, which has been led by Los Alamos investigators. Major progress has been made in simulating global ocean circulation and in understanding various ocean climatic aspects such as the effect of wind driving on heat and freshwater transports. These steps have demonstrated the capability to conduct realistic decadal to century ocean integrations at high resolution on massively parallel computers.

  16. Signal processing applications of massively parallel charge domain computing devices

    NASA Technical Reports Server (NTRS)

    Fijany, Amir (Inventor); Barhen, Jacob (Inventor); Toomarian, Nikzad (Inventor)

    1999-01-01

    The present invention is embodied in a charge coupled device (CCD)/charge injection device (CID) architecture capable of performing a Fourier transform by simultaneous matrix vector multiplication (MVM) operations in respective plural CCD/CID arrays in parallel in O(1) steps. For example, in one embodiment, a first CCD/CID array stores charge packets representing a first matrix operator based upon permutations of a Hartley transform and computes the real part of the Fourier transform of an incoming vector. A second CCD/CID array stores charge packets representing a second matrix operator based upon different permutations of a Hartley transform and computes the imaginary part of the Fourier transform. The incoming vector is applied to the inputs of the two CCD/CID arrays simultaneously, and the real and imaginary parts of the Fourier transform are produced simultaneously in the time required to perform a single MVM operation in a CCD/CID array.
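
    The Hartley-based construction is easy to verify numerically. With the cas-function matrix H and the row permutation k -> (N-k) mod N, the operators (H + PH)/2 and -(H - PH)/2 reproduce the real and imaginary parts of the DFT, each via a single matrix-vector multiply, which is what the two CCD/CID arrays compute in parallel.

        # Check: two permuted/combined Hartley operators give Re and Im of the DFT.
        import numpy as np

        N = 8
        k, n = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
        H = np.cos(2 * np.pi * k * n / N) + np.sin(2 * np.pi * k * n / N)  # cas matrix
        P = np.eye(N)[(-np.arange(N)) % N]        # permutation k -> (N-k) mod N

        M_re = (H + P @ H) / 2                    # one MVM gives the real part
        M_im = -(H - P @ H) / 2                   # the other gives the imaginary part

        v = np.random.default_rng(0).normal(size=N)
        F = np.fft.fft(v)
        assert np.allclose(M_re @ v, F.real)
        assert np.allclose(M_im @ v, F.imag)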

  17. Factorization of large integers on a massively parallel computer

    SciTech Connect

    Davis, J.A.; Holdridge, D.B.

    1988-01-01

    Our interest in integer factorization at Sandia National Laboratories is motivated by cryptographic applications and in particular the security of the RSA encryption-decryption algorithm. We have implemented our version of the quadratic sieve procedure on the NCUBE computer with 1024 processors (nodes). The new code is significantly different in all important aspects from the program used to factor numbers of order 10^70 on a single processor CRAY computer. Capabilities of parallel processing and the limitation of small local memory necessitated this entirely new implementation. This effort involved several restarts as realizations of program structures that seemed appealing bogged down due to inter-processor communications. We are presently working with integers of magnitude about 10^70 in tuning this code to the novel hardware. 6 refs., 3 figs.

  18. Massively parallel algorithms for trace-driven cache simulations

    NASA Technical Reports Server (NTRS)

    Nicol, David M.; Greenberg, Albert G.; Lubachevsky, Boris D.

    1991-01-01

    Trace driven cache simulation is central to computer design. A trace is a very long sequence of reference lines from main memory. At the t-th instant, reference x_t is hashed into a set of cache locations, the contents of which are then compared with x_t. If at the t-th instant x_t is not present in the cache, then it is said to be a miss, and is loaded into the cache set, possibly forcing the replacement of some other memory line, and making x_t present for the (t+1)-st instant. The problem of parallel simulation of a subtrace of N references directed to a C line cache set is considered, with the aim of determining which references are misses and related statistics. A simulation method is presented for the Least Recently Used (LRU) policy, which regardless of the set size C runs in time O(log N) using N processors on the exclusive read, exclusive write (EREW) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/log N processors. Timings are presented of the second algorithm's implementation on the MasPar MP-1, a machine with 16384 processors. A broad class of reference based line replacement policies are considered, which includes LRU as well as the Least Frequently Used and Random replacement policies. A simulation method is presented for any such policy that on any trace of length N directed to a C line set runs in O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
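
    The property these algorithms exploit is that LRU hits are determined by stack distance alone: x_t hits a C-line set exactly when fewer than C distinct lines were referenced since the previous occurrence of x_t, for every C at once. The sketch below evaluates that per-reference criterion serially; the paper's contribution is computing the same quantities in parallel in O(log N) or O(C log N) time.

        # Miss counting via LRU stack distance (serial reference version).
        def lru_misses(trace, C):
            last_seen, misses = {}, 0
            for t, x in enumerate(trace):
                if x in last_seen:
                    distinct = len(set(trace[last_seen[x] + 1:t]))
                    if distinct >= C:      # pushed below position C in the stack
                        misses += 1
                else:
                    misses += 1            # cold (first-reference) miss
                last_seen[x] = t
            return misses

        trace = "ABCABDABEAB"
        print([lru_misses(trace, C) for C in (1, 2, 3, 4)])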

  19. MASSIVELY PARALLEL LATENT SEMANTIC ANALYSES USING A GRAPHICS PROCESSING UNIT

    SciTech Connect

    Cavanagh, J.; Cui, S.

    2009-01-01

    Latent Semantic Analysis (LSA) aims to reduce the dimensions of large term-document datasets using Singular Value Decomposition. However, with the ever-expanding size of datasets, current implementations are not fast enough to quickly and easily compute the results on a standard PC. A graphics processing unit (GPU) can solve some highly parallel problems much faster than a traditional sequential processor or central processing unit (CPU). Thus, a deployable system using a GPU to speed up large-scale LSA processes would be a much more effective choice (in terms of cost/performance ratio) than using a PC cluster. Due to the GPU’s application-specific architecture, harnessing the GPU’s computational prowess for LSA is a great challenge. We presented a parallel LSA implementation on the GPU, using NVIDIA® Compute Unified Device Architecture and Compute Unified Basic Linear Algebra Subprograms software. The performance of this implementation is compared to traditional LSA implementation on a CPU using an optimized Basic Linear Algebra Subprograms library. After implementation, we discovered that the GPU version of the algorithm was twice as fast for large matrices (1,000 x 1,000 and above) that had dimensions not divisible by 16. For large matrices that did have dimensions divisible by 16, the GPU algorithm ran five to six times faster than the CPU version. The large variation is due to architectural benefits of the GPU for matrices divisible by 16. It should be noted that the overall speeds for the CPU version did not vary from relative normal when the matrix dimensions were divisible by 16. Further research is needed in order to produce a fully implementable version of LSA. With that in mind, the research we presented shows that the GPU is a viable option for increasing the speed of LSA, in terms of cost/performance ratio.
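
    The computation being accelerated is a truncated SVD of the term-document matrix followed by similarity queries in the reduced space. A CPU sketch with numpy is below, standing in for the GPU CUBLAS pipeline; the corpus is random and for illustration only.

        # LSA in brief: truncated SVD projects documents into a k-dim semantic
        # space; queries are folded in with U_k and compared by cosine similarity.
        import numpy as np

        rng = np.random.default_rng(0)
        A = rng.poisson(0.3, size=(1000, 400)).astype(float)   # term-document counts

        U, s, Vt = np.linalg.svd(A, full_matrices=False)
        kdim = 50
        docs = (np.diag(s[:kdim]) @ Vt[:kdim]).T               # documents in k-space

        q = U[:, :kdim].T @ A[:, 0]                            # fold in doc 0 as query
        sims = docs @ q / (np.linalg.norm(docs, axis=1) * np.linalg.norm(q))
        print(np.argsort(sims)[-5:])                           # most similar documents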

  20. Massively parallel computation of RCS with finite elements

    NASA Technical Reports Server (NTRS)

    Parker, Jay

    1993-01-01

    One of the promising combinations of finite element approaches for scattering problems uses Whitney edge elements, spherical vector wave-absorbing boundary conditions, and bi-conjugate gradient solution for the frequency-domain near field. Each of these approaches may be criticized. Low-order elements require high mesh density, but also result in fast, reliable iterative convergence. Spherical wave-absorbing boundary conditions require additional space to be meshed beyond the most minimal near-space region, but result in fully sparse, symmetric matrices which keep storage and solution times low. Iterative solution is somewhat unpredictable and unfriendly to multiple right-hand sides, yet we find it to be uniformly fast on large problems to date, given the other two approaches. Implementation of these approaches on a distributed memory, message passing machine yields huge dividends, as full scalability to the largest machines appears assured and iterative solution times are well-behaved for large problems. We present times and solutions for computed RCS for a conducting cube and composite permeability/conducting sphere on the Intel iPSC/860 with up to 16 processors solving over 200,000 unknowns. We estimate problems of approximately 10 million unknowns, encompassing 1000 cubic wavelengths, may be attempted on a currently available 512 processor machine, but would be exceedingly tedious to prepare. The most severe bottlenecks are due to the slow rate of mesh generation on non-parallel machines and the large transfer time from such a machine to the parallel processor. One solution, in progress, is to create and then distribute a coarse mesh among the processors, followed by systematic refinement within each processor. Elimination of redundant node definitions at the mesh-partition surfaces, snap-to-surface post processing of the resulting mesh for good modelling of curved surfaces, and load-balancing redistribution of new elements after the refinement are auxiliary tasks.

  1. A massively parallel computational approach to coupled thermoelastic/porous gas flow problems

    NASA Technical Reports Server (NTRS)

    Shia, David; Mcmanus, Hugh L.

    1995-01-01

    A new computational scheme for coupled thermoelastic/porous gas flow problems is presented. Heat transfer, gas flow, and dynamic thermoelastic governing equations are expressed in fully explicit form, and solved on a massively parallel computer. The transpiration cooling problem is used as an example problem. The numerical solutions have been verified by comparison to available analytical solutions. Transient temperature, pressure, and stress distributions have been obtained. Small spatial oscillations in pressure and stress have been observed, which would be impractical to predict with previously available schemes. Comparisons between serial and massively parallel versions of the scheme have also been made. The results indicate that for small scale problems the serial and parallel versions use practically the same amount of CPU time. However, as the problem size increases the parallel version becomes more efficient than the serial version.

  2. Solution of large linear systems of equations on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Ida, Nathan; Udawatta, Kapila

    1987-01-01

    The Massively Parallel Processor (MPP) was designed as a special machine for specific applications in image processing. As a parallel machine, with a large number of processors that can be reconfigured in different combinations it is also applicable to other problems that require a large number of processors. The solution of linear systems of equations on the MPP is investigated. The solution times achieved are compared to those obtained with a serial machine and the performance of the MPP is discussed.

  3. The minimal amount of starting DNA for Agilent’s hybrid capture-based targeted massively parallel sequencing

    PubMed Central

    Chung, Jongsuk; Son, Dae-Soon; Jeon, Hyo-Jeong; Kim, Kyoung-Mee; Park, Gahee; Ryu, Gyu Ha; Park, Woong-Yang; Park, Donghyun

    2016-01-01

    Targeted capture massively parallel sequencing is increasingly being used in clinical settings, and as costs continue to decline, use of this technology may become routine in health care. However, a limited amount of tissue has often been a challenge in meeting quality requirements. To offer a practical guideline for the minimum amount of input DNA for targeted sequencing, we optimized and evaluated the performance of targeted sequencing depending on the input DNA amount. First, using various amounts of input DNA, we compared commercially available library construction kits and selected Agilent’s SureSelect-XT and KAPA Biosystems’ Hyper Prep kits as the kits most compatible with targeted deep sequencing using Agilent’s SureSelect custom capture. Then, we optimized the adapter ligation conditions of the Hyper Prep kit to improve library construction efficiency and adapted multiplexed hybrid selection to reduce the cost of sequencing. In this study, we systematically evaluated the performance of the optimized protocol depending on the amount of input DNA, ranging from 6.25 to 200 ng, suggesting the minimal input DNA amounts based on coverage depths required for specific applications. PMID:27220682

  4. Massively parallel algorithms for real-time wavefront control of a dense adaptive optics system

    SciTech Connect

    Fijany, A.; Milman, M.; Redding, D.

    1994-12-31

    In this paper, massively parallel algorithms and architectures for real-time wavefront control of a dense adaptive optics system (SELENE) are presented. The authors have already shown that the computation of a near optimal control algorithm for SELENE can be reduced to the solution of a discrete Poisson equation on a regular domain. Although this represents an optimal computation, due to the large size of the system and the high sampling rate requirement, the implementation of this control algorithm poses a computationally challenging problem since it demands a sustained computational throughput of the order of 10 GFlops. They develop a novel algorithm, designated the Fast Invariant Imbedding algorithm, which offers a massive degree of parallelism with simple communication and synchronization requirements. Due to these features, this algorithm is significantly more efficient than other fast Poisson solvers for implementation on massively parallel architectures. The authors also discuss two massively parallel, algorithmically specialized, architectures for low-cost and optimal implementation of the Fast Invariant Imbedding algorithm.
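
    Since the control reduces to a discrete Poisson solve, a standard fast solver makes the computational pattern concrete: the type-I discrete sine transform diagonalizes the 5-point Dirichlet Laplacian, so the solve is two transforms and a pointwise divide. The sketch below is that generic spectral solver, not the authors' Fast Invariant Imbedding algorithm.

        # Fast Poisson solve of -Laplacian(u) = f on the unit square (Dirichlet).
        import numpy as np
        from scipy.fft import dstn, idstn

        n = 127                                    # interior points per side
        h = 1.0 / (n + 1)
        x = np.linspace(h, 1 - h, n)
        X, Y = np.meshgrid(x, x, indexing="ij")
        f = 2 * np.pi**2 * np.sin(np.pi * X) * np.sin(np.pi * Y)  # known-solution test

        i = np.arange(1, n + 1)
        lam = (2 - 2 * np.cos(np.pi * i / (n + 1))) / h**2        # 1-D eigenvalues
        u = idstn(dstn(f, type=1) / (lam[:, None] + lam[None, :]), type=1)

        print(np.abs(u - np.sin(np.pi * X) * np.sin(np.pi * Y)).max())  # ~O(h^2) error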

  5. Large-eddy simulation of the Rayleigh-Taylor instability on a massively parallel computer

    SciTech Connect

    Amala, P.A.K.

    1995-03-01

    A computational model for the solution of the three-dimensional Navier-Stokes equations is developed. This model includes a turbulence model: a modified Smagorinsky eddy-viscosity with a stochastic backscatter extension. The resultant equations are solved using finite difference techniques: the second-order explicit Lax-Wendroff schemes. This computational model is implemented on a massively parallel computer. Programming models on massively parallel computers are next studied. It is desired to determine the best programming model for the developed computational model. To this end, three different codes are tested on a current massively parallel computer: the CM-5 at Los Alamos. Each code uses a different programming model: one is a data parallel code; the other two are message passing codes. Timing studies are done to determine which method is the fastest. The data parallel approach turns out to be the fastest method on the CM-5 by at least an order of magnitude. The resultant code is then used to study a current problem of interest to the computational fluid dynamics community. This is the Rayleigh-Taylor instability. The Lax-Wendroff methods handle shocks and sharp interfaces poorly. Because of this, the Rayleigh-Taylor linear analysis is modified to include a smoothed interface. The linear growth rate problem is then investigated. Finally, the problem of the randomly perturbed interface is examined. Stochastic backscatter breaks the symmetry of the stationary unstable interface and generates a mixing layer growing at the experimentally observed rate. 115 refs., 51 figs., 19 tabs.

  6. Parallel optimization of pixel purity index algorithm for massive hyperspectral images in cloud computing environment

    NASA Astrophysics Data System (ADS)

    Chen, Yufeng; Wu, Zebin; Sun, Le; Wei, Zhihui; Li, Yonglong

    2016-04-01

    With the gradual increase in the spatial and spectral resolution of hyperspectral images, the size of image data becomes larger and larger, and the complexity of processing algorithms is growing, which poses a big challenge to efficient massive hyperspectral image processing. Cloud computing technologies distribute computing tasks to a large number of computing resources for handling large data sets without the limitation of memory and computing resource of a single machine. This paper proposes a parallel pixel purity index (PPI) algorithm for unmixing massive hyperspectral images based on a MapReduce programming model for the first time in the literature. According to the characteristics of hyperspectral images, we describe the design principle of the algorithm, illustrate the main cloud unmixing processes of PPI, and analyze the time complexity of serial and parallel algorithms. Experimental results demonstrate that the parallel implementation of the PPI algorithm on the cloud can effectively process big hyperspectral data and accelerate the algorithm.

  7. Applications of the massively parallel machine, the MasPar MP-1, to Earth sciences

    NASA Technical Reports Server (NTRS)

    Fischer, James R.; Strong, James P.; Dorband, John E.; Tilton, James C.

    1991-01-01

    The computational workload of upcoming NASA science missions, especially the ground data processing for the Earth Observing System, is projected to be quite large (in the 50 to 100 gigaFLOPS range) and correspondingly very expensive to perform using conventional supercomputer systems. High performance, general purpose massively parallel computer systems such as the MasPar MP-1 are being investigated by NASA as a more cost effective alternative. Massively parallel systems are targeted for accelerated development and maturation by NASA's upcoming five-year High Performance Computing and Communications Program. A summary of the broad range of applications currently running on the MP-1 at NASA/Goddard are presented in this paper along with descriptions of the parallel algorithmic techniques employed in five applications that have bearing on Earth sciences.

  8. Massively parallel implementation of the Penn State/NCAR Mesoscale Model

    SciTech Connect

    Foster, I.; Michalakes, J.

    1992-01-01

    Parallel computing promises significant improvements in both the raw speed and cost performance of mesoscale atmospheric models. On distributed-memory massively parallel computers available today, the performance of a mesoscale model will exceed that of conventional supercomputers; on the teraflops machines expected within the next five years, performance will increase by several orders of magnitude. As a result, scientists will be able to consider larger problems, more complex model processes, and finer resolutions. In this paper, we report on a project at Argonne National Laboratory that will allow scientists to take advantage of parallel computing technology. This Massively Parallel Mesoscale Model (MPMM) will be functionally equivalent to the Penn State/NCAR Mesoscale Model (MM). In a prototype study, we produced a parallel version of MM4 using a static (compile-time) coarse-grained "patch" decomposition. This code achieves one-third the performance of a one-processor CRAY Y-MP on twelve Intel i860 microprocessors. The current version of MPMM is based on MM5 and uses a more fine-grained approach, decomposing the grid as finely as the mesh itself allows so that each horizontal grid cell is a parallel process. This will allow the code to utilize many hundreds of processors. A high-level language for expressing parallel programs is used to implement communication streams between the processes in a way that permits dynamic remapping to the physical processors of a particular parallel computer. This facilitates load balancing, grid nesting, and coupling with graphical systems and other models.

  9. Massively parallel implementation of the Penn State/NCAR Mesoscale Model

    SciTech Connect

    Foster, I.; Michalakes, J.

    1992-12-01

    Parallel computing promises significant improvements in both the raw speed and cost performance of mesoscale atmospheric models. On distributed-memory massively parallel computers available today, the performance of a mesoscale model will exceed that of conventional supercomputers; on the teraflops machines expected within the next five years, performance will increase by several orders of magnitude. As a result, scientists will be able to consider larger problems, more complex model processes, and finer resolutions. In this paper, we report on a project at Argonne National Laboratory that will allow scientists to take advantage of parallel computing technology. This Massively Parallel Mesoscale Model (MPMM) will be functionally equivalent to the Penn State/NCAR Mesoscale Model (MM). In a prototype study, we produced a parallel version of MM4 using a static (compile-time) coarse-grained "patch" decomposition. This code achieves one-third the performance of a one-processor CRAY Y-MP on twelve Intel i860 microprocessors. The current version of MPMM is based on MM5 and uses a more fine-grained approach, decomposing the grid as finely as the mesh itself allows so that each horizontal grid cell is a parallel process. This will allow the code to utilize many hundreds of processors. A high-level language for expressing parallel programs is used to implement communication streams between the processes in a way that permits dynamic remapping to the physical processors of a particular parallel computer. This facilitates load balancing, grid nesting, and coupling with graphical systems and other models.

  10. Massively parallel per-pixel-based zerotree processing architecture for real-time video compression

    NASA Astrophysics Data System (ADS)

    Alagoda, Geoffrey; Rassau, Alexander M.; Eshraghian, Kamran

    2001-11-01

    In the span of a few years, mobile multimedia communication has rapidly become a significant area of research and development constantly challenging boundaries on a variety of technological fronts. Video compression, a fundamental component for most mobile multimedia applications, generally places heavy demands in terms of the required processing capacity. Hardware implementations of typical modern hybrid codecs require realisation of components such as motion compensation, wavelet transform, quantisation, zerotree coding and arithmetic coding in real-time. While the implementation of such codecs using a fast generic processor is possible, undesirable trade-offs in terms of power consumption and speed must generally be made. The improvement in power consumption that is achievable through the use of a slow-clocked massively parallel processing environment, while maintaining real-time processing speeds, should thus not be overlooked. An architecture to realise such a massively parallel solution for a zerotree entropy coder is, therefore, presented in this paper.

  11. Numerical and physical instabilities in massively parallel LES of reacting flows

    NASA Astrophysics Data System (ADS)

    Poinsot, Thierry

    LES of reacting flows is rapidly becoming mature and providing levels of precision which cannot be reached with any RANS (Reynolds-averaged) technique. In addition to the multiple subgrid scale models required for such LES and to the questions raised by the required numerical accuracy of LES solvers, various issues related to the reliability, mesh independence and repeatability of LES must still be addressed, especially when LES is used on massively parallel machines. This talk discusses some of these issues: (1) the existence of non-physical waves (known as `wiggles' by most LES practitioners) in LES, (2) the effects of mesh size on LES of reacting flows, (3) the growth of rounding errors in LES on massively parallel machines and more generally (4) the ability to qualify a LES code as `bug free' and `accurate'. Examples range from academic cases (minimum non-reacting turbulent channel) to applied configurations (a sector of a helicopter combustion chamber).

  12. Cross-platform compatibility of Hi-Plex, a streamlined approach for targeted massively parallel sequencing.

    PubMed

    Nguyen-Dumont, Tú; Pope, Bernard J; Hammet, Fleur; Mahmoodi, Maryam; Tsimiklis, Helen; Southey, Melissa C; Park, Daniel J

    2013-11-15

    Although per-base sequencing costs have decreased during recent years, library preparation for targeted massively parallel sequencing remains constrained by high reagent cost, limited design flexibility, and protocol complexity. To address these limitations, we previously developed Hi-Plex, a polymerase chain reaction (PCR) massively parallel sequencing strategy for screening panels of genomic target regions. Here, we demonstrate that Hi-Plex applied with hybrid adapters can generate a library suitable for sequencing with both the Ion Torrent and the TruSeq chemistries and that adjusting primer concentrations improves coverage uniformity. These results expand Hi-Plex capabilities as an accurate, affordable, flexible, and rapid approach for various genetic screening applications. PMID:23933242

  13. Cross-platform compatibility of Hi-Plex, a streamlined approach for targeted massively parallel sequencing

    PubMed Central

    Nguyen-Dumont, Tú; Pope, Bernard J.; Hammet, Fleur; Mahmoodi, Maryam; Tsimiklis, Helen; Southey, Melissa C.; Park, Daniel J.

    2013-01-01

    Although per-base sequencing costs have decreased during recent years, library preparation for targeted massively parallel sequencing remains constrained by high reagent cost, limited design flexibility, and protocol complexity. To address these limitations, we previously developed Hi-Plex, a polymerase chain reaction (PCR) massively parallel sequencing strategy for screening panels of genomic target regions. Here, we demonstrate that Hi-Plex applied with hybrid adapters can generate a library suitable for sequencing with both the Ion Torrent and the TruSeq chemistries and that adjusting primer concentrations improves coverage uniformity. These results expand Hi-Plex capabilities as an accurate, affordable, flexible, and rapid approach for various genetic screening applications. PMID:23933242

  14. A domain decomposition study of massively parallel computing in compressible gas dynamics

    NASA Astrophysics Data System (ADS)

    Wong, C. C.; Blottner, F. G.; Payne, J. L.; Soetrisno, M.

    1995-03-01

    The appropriate utilization of massively parallel computers for solving the Navier-Stokes equations is investigated and determined from an engineering perspective. The issues investigated are: (1) Should strip or patch domain decomposition of the spatial mesh be used to reduce computer time? (2) How many computer nodes should be used for a problem with a given sized mesh to reduce computer time? (3) Is the convergence of the Navier-Stokes solution procedure (LU-SGS) adversely influenced by the domain decomposition approach? The results of the paper show that the present Navier-Stokes solution technique has good performance on a massively parallel computer for transient flow problems. For steady-state problems with a large number of mesh cells, the solution procedure will require significant computer time due to an increased number of iterations to achieve a converged solution. There is an optimum number of computer nodes to use for a problem with a given global mesh size.

  15. Chemical network problems solved on NASA/Goddard's massively parallel processor computer

    NASA Technical Reports Server (NTRS)

    Cho, Seog Y.; Carmichael, Gregory R.

    1987-01-01

    The single instruction stream, multiple data stream Massively Parallel Processor (MPP) unit consists of 16,384 bit-serial arithmetic processors configured as a 128 x 128 array whose speed can exceed that of current supercomputers (Cyber 205). The applicability of the MPP for solving reaction network problems is presented and discussed, including the mapping of the calculation to the architecture, and CPU timing comparisons.

  16. Progressive Vector Quantization on a massively parallel SIMD machine with application to multispectral image data

    NASA Technical Reports Server (NTRS)

    Manohar, Mareboyana; Tilton, James C.

    1994-01-01

    A progressive vector quantization (VQ) compression approach is discussed which decomposes image data into a number of levels using full search VQ. The final level is losslessly compressed, enabling lossless reconstruction. The computational difficulties are addressed by implementation on a massively parallel SIMD machine. We demonstrate progressive VQ on multispectral imagery obtained from the Advanced Very High Resolution Radiometer instrument and other Earth observation image data, and investigate the trade-offs in selecting the number of decomposition levels and codebook training method.

  17. Parallel contributing area calculation with granularity control on massive grid terrain datasets

    NASA Astrophysics Data System (ADS)

    Jiang, Ling; Tang, Guoan; Liu, Xuejun; Song, Xiaodong; Yang, Jianyi; Liu, Kai

    2013-10-01

    The calculation of contributing areas from digital elevation models (DEMs) is one of the important tasks in digital terrain analysis (DTA). The computational process usually involves two steps in a real application: (1) calculating flow directions via a flow model, and (2) computing the contributing area for each grid cell in the DEM. The traditional algorithm for calculating contributing areas is coded as a sequential program executed on a single processor. With the increase of scope and resolution of DEMs, the serial algorithm has become increasingly difficult to perform and is often very time-consuming, especially for DEMs of large areas and fine scales. In recent years, parallel computing has been able to meet this challenge with the development of computer technology. However, parallel implementation with granularity control, an efficient strategy to reap the best parallel performance and to break the limitation of computing resources in processing massive grid terrain datasets, has not previously appeared in the DTA research field. This paper develops a message-passing-interface (MPI) parallel approach with granularity control to calculate contributing areas. According to the proposed parallelization strategy, the parallel D8 algorithm with granularity control is designed, as well as the parallel AreaD8 algorithm. Based on the domain decomposition of DEM data, it is possible for each process to handle multiple partitions decomposed under a grain size. Following an iterative procedure of reading source data, executing the operator and writing resulting data, each process computes the results for its partitions one by one. The experimental results on a multi-node cluster show that the proposed parallel algorithms with granularity control are powerful tools for processing big datasets; the parallel D8 algorithm is insensitive to granularity, while the parallel AreaD8 algorithm has an optimal grain size that yields the best parallel performance.
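
    Step (1), the D8 flow model, assigns each cell the neighbour of steepest descent, with diagonal drops scaled by sqrt(2). The sketch below is a plain serial version of that kernel (border, flat and pit handling omitted); the paper's contribution is executing such kernels over decomposed partitions with controlled granularity.

        # D8 flow directions: index 0..7 of the steepest-descent neighbour, -1 = pit.
        import numpy as np

        def d8_directions(dem, cellsize=1.0):
            offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                       (0, 1), (1, -1), (1, 0), (1, 1)]
            rows, cols = dem.shape
            direction = np.full((rows, cols), -1, dtype=int)
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    best, best_k = 0.0, -1
                    for k, (dr, dc) in enumerate(offsets):
                        dist = cellsize * (2 ** 0.5 if dr and dc else 1.0)
                        slope = (dem[r, c] - dem[r + dr, c + dc]) / dist
                        if slope > best:
                            best, best_k = slope, k
                    direction[r, c] = best_k
            return direction

        dem = np.add.outer(np.arange(5.0), np.arange(5.0))   # tilted plane
        print(d8_directions(dem))                            # interior drains to (-1,-1)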

  18. Tubal Ligation

    MedlinePlus

    Tubal ligation (getting your "tubes tied") is a type of surgery. It prevents a woman from getting pregnant. ... to most normal activities within a few days. Tubal ligation can sometimes be reversed, but not always.

  19. Massively parallel multifrontal methods for finite element analysis on MIMD computer systems

    SciTech Connect

    Benner, R.E.

    1993-03-01

    The development of highly parallel direct solvers for large, sparse linear systems of equations (e.g. for finite element or finite difference models) is lagging behind progress in parallel direct solvers for dense matrices and iterative methods for sparse matrices. We describe a massively parallel (MP) multifrontal solver for the direct solution of large sparse linear systems, such as those routinely encountered in finite element structural analysis, in an effort to address concerns about the viability of scalable, MP direct methods for sparse systems and enhance the software base for MP applications. Performance results are presented and future directions are outlined for research and development efforts in parallel multifrontal and related solvers. In particular, parallel efficiencies of 25% on 1024 nCUBE 2 nodes and 36% on 64 Intel iPSC/860 nodes have been demonstrated, and parallel efficiencies of 60-85% are expected when a severe load imbalance is overcome by static mapping and dynamic load balance techniques previously developed for other parallel solvers and application codes.

  20. Using CLIPS in the domain of knowledge-based massively parallel programming

    NASA Technical Reports Server (NTRS)

    Dvorak, Jiri J.

    1994-01-01

    The Program Development Environment (PDE) is a tool for massively parallel programming of distributed-memory architectures. Adopting a knowledge-based approach, the PDE eliminates the complexity introduced by parallel hardware with distributed memory and offers complete transparency with respect to parallelism exploitation. The knowledge-based part of the PDE is realized in CLIPS. Its principal task is to find an efficient parallel realization of the application specified by the user in a comfortable, abstract, domain-oriented formalism. A large collection of fine-grain parallel algorithmic skeletons, represented as COOL objects in a tree hierarchy, contains the algorithmic knowledge. A hybrid knowledge base with rule modules and procedural parts, encoding expertise about the application domain, parallel programming, software engineering, and parallel hardware, enables a high degree of automation in the software development process. In this paper, important aspects of the implementation of the PDE using CLIPS and COOL are shown, including the embedding of CLIPS with C++-based parts of the PDE. The appropriateness of the chosen approach and of the CLIPS language for knowledge-based software engineering are discussed.

  1. Massively parallel fast elliptic equation solver for three dimensional hydrodynamics and relativity

    SciTech Connect

    Sholl, P.L.; Wilson, J.R.; Mathews, G.J.; Avila, J.H.

    1995-01-01

    Through the work proposed in this document we expect to advance the forefront of large scale computational efforts on massively parallel distributed-memory multiprocessors. We will develop tools for effective conversion to a parallel implementation of sequential numerical methods used to solve large systems of partial differential equations. The research supported by this work will involve conversion of a program which does state of the art modeling of multi-dimensional hydrodynamics, general relativity and particle transport in energetic astrophysical environments. The proposed parallel algorithm development, particularly the study and development of fast elliptic equation solvers, could significantly benefit this program and other applications involving solutions to systems of differential equations. We shall develop a data communication manager for distributed memory computers as an aid in program conversions to a parallel environment and implement it in the three dimensional relativistic hydrodynamics program discussed below, and we shall develop a concurrent system/concurrent subgrid multigrid method. Currently, five systems are approximated sequentially using multigrid successive overrelaxation. Results from an iteration cycle of one multigrid system are used in the iterations of the following multigrid systems. We shall develop a multigrid algorithm for simultaneous computation of the sets of equations. In addition, we shall implement a method for concurrent processing of the subgrids in each of the multigrid computations. The conditions for convergence of the method will be examined. We'll compare this technique to other parallel multigrid techniques, such as distributed data/sequential subgrids and the Parallel Superconvergent Multigrid of Frederickson and McBryan. We expect the results of these studies to offer insight and tools both for the selection of new algorithms as well as for conversion of existing large codes for massively parallel architectures.

  2. ASCI Red -- Experiences and lessons learned with a massively parallel teraFLOP supercomputer

    SciTech Connect

    Christon, M.A.; Crawford, D.A.; Hertel, E.S.; Peery, J.S.; Robinson, A.C.

    1997-06-01

    The Accelerated Strategic Computing Initiative (ASCI) program involves Sandia, Los Alamos and Lawrence Livermore National Laboratories. At Sandia National Laboratories, ASCI applications include large deformation transient dynamics, shock propagation, electromechanics, and abnormal thermal environments. In order to resolve important physical phenomena in these problems, it is estimated that meshes ranging from 10^6 to 10^9 grid points will be required. The ASCI program is relying on the use of massively parallel supercomputers initially capable of delivering over 1 TFLOPs to perform such demanding computations. The ASCI Red machine at Sandia National Laboratories consists of over 4,500 computational nodes with a peak computational rate of 1.8 TFLOPs, 567 GBytes of memory, and 2 TBytes of disk storage. Regardless of the peak FLOP rate, there are many issues surrounding the use of massively parallel supercomputers in a production environment. These issues include parallel I/O, mesh generation, visualization, archival storage, high-bandwidth networking and the development of parallel algorithms. In order to illustrate these issues and their solution with respect to ASCI Red, demonstration calculations of time-dependent buoyancy-dominated plumes, electromechanics, and shock propagation will be presented.

  3. Massively parallel Monte Carlo for many-particle simulations on GPUs

    SciTech Connect

    Anderson, Joshua A.; Jankowski, Eric; Grubb, Thomas L.; Engel, Michael; Glotzer, Sharon C.

    2013-12-01

    Current trends in parallel processors call for the design of efficient massively parallel algorithms for scientific computing. Parallel algorithms for Monte Carlo simulations of thermodynamic ensembles of particles have received little attention because of the inherent serial nature of the statistical sampling. In this paper, we present a massively parallel method that obeys detailed balance and implement it for a system of hard disks on the GPU. We reproduce results of serial high-precision Monte Carlo runs to verify the method. This is a good test case because the hard disk equation of state over the range where the liquid transforms into the solid is particularly sensitive to small deviations away from the balance conditions. On a Tesla K20, our GPU implementation executes over one billion trial moves per second, which is 148 times faster than on a single Intel Xeon E5540 CPU core, enables 27 times better performance per dollar, and cuts energy usage by a factor of 13. With this improved performance we are able to calculate the equation of state for systems of up to one million hard disks. These large system sizes are required in order to probe the nature of the melting transition, which has been debated for the last forty years. In this paper we present the details of our computational method, and discuss the thermodynamics of hard disks separately in a companion paper.
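
    The serial-sampling bottleneck is typically broken by decomposing the simulation box into a checkerboard of cells and updating only cells of one parity at a time. The following minimal sketch illustrates that idea for 2D hard disks; it is not the authors' GPU kernel, and the cell size, confinement rule, and function names are assumptions.

        import numpy as np

        def checkerboard_sweep(pos, box, cell, diameter, rng, max_step=0.1):
            """One sweep of cell-confined Monte Carlo moves for 2D hard disks.

            Cells of equal checkerboard parity cannot interact because trial
            moves are confined to their own cell, so all active cells can be
            updated concurrently (one GPU thread per cell). A random shift of
            the cell grid between sweeps, omitted here, restores ergodicity
            and detailed balance in schemes of this kind.
            """
            ncell = int(box // cell)
            for parity in ((0, 0), (0, 1), (1, 0), (1, 1)):  # four sub-lattices
                cells = (pos // cell).astype(int) % ncell
                active = (cells[:, 0] % 2 == parity[0]) & (cells[:, 1] % 2 == parity[1])
                for i in np.where(active)[0]:        # independent; parallel on a GPU
                    trial = pos[i] + rng.uniform(-max_step, max_step, 2)
                    if np.any(trial // cell != pos[i] // cell):
                        continue                     # reject moves that leave the cell
                    d = pos - trial
                    d -= box * np.round(d / box)     # minimum-image convention
                    r2 = np.einsum('ij,ij->i', d, d)
                    r2[i] = np.inf                   # ignore overlap with self
                    if r2.min() >= diameter ** 2:    # accept if no disks overlap
                        pos[i] = trial
            return pos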

  4. Molecular Dynamics Simulations from SNL's Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS)

    DOE Data Explorer

    Plimpton, Steve; Thompson, Aidan; Crozier, Paul

    LAMMPS (http://lammps.sandia.gov/index.html) stands for Large-scale Atomic/Molecular Massively Parallel Simulator, a code that can model ensembles of atoms or, as the LAMMPS website puts it, act as a parallel particle simulator at the atomic, meso, or continuum scale. This Sandia-based website provides a long list of animations from large simulations. These were created using different visualization packages to read LAMMPS output, and each one gives the name of the PI and a brief description of the work done or the visualization package used. See also the static images produced from simulations at http://lammps.sandia.gov/pictures.html. The foundation paper for LAMMPS is: S. Plimpton, Fast Parallel Algorithms for Short-Range Molecular Dynamics, J Comp Phys, 117, 1-19 (1995), but the website also lists other papers describing contributions to LAMMPS over the years.

  5. Medical image processing utilizing neural networks trained on a massively parallel computer.

    PubMed

    Kerr, J P; Bartlett, E B

    1995-07-01

    While finding many applications in science, engineering, and medicine, artificial neural networks (ANNs) have typically been limited to small architectures. In this paper, we demonstrate how very large architecture neural networks can be trained for medical image processing utilizing a massively parallel, single-instruction multiple-data (SIMD) computer. The two to three orders of magnitude improvement in processing time attainable using a parallel computer makes it practical to train very large architecture ANNs. As an example, we have trained several ANNs to demonstrate the tomographic reconstruction of 64 x 64 single photon emission computed tomography (SPECT) images from 64 planar views of the images. The potential for these large architecture ANNs lies in the fact that once the neural network is properly trained on the parallel computer, the corresponding interconnection weight file can be loaded on a serial computer. Subsequently, relatively fast processing of all novel images can be performed on a PC or workstation. PMID:7497701
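
    The deployment path the abstract describes, train once on the parallel machine, then run inference serially from the saved weight file, reduces at run time to a few matrix products. A minimal sketch (the two-layer architecture and the numpy.savez file format are illustrative assumptions, not the paper's):

        import numpy as np

        def reconstruct_slice(weight_file, planar_views):
            """Serial inference from weights trained on a parallel machine.

            Assumes a two-layer network saved with numpy.savez as W1, b1,
            W2, b2. Input: flattened 64 planar views; output: a flattened
            64 x 64 reconstructed SPECT image.
            """
            w = np.load(weight_file)
            hidden = np.tanh(planar_views @ w["W1"] + w["b1"])   # hidden layer
            return hidden @ w["W2"] + w["b2"]                    # linear output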

  6. A massively parallel adaptive finite element method with dynamic load balancing

    SciTech Connect

    Devine, K.D.; Flaherty, J.E.; Wheat, S.R.; Maccabe, A.B.

    1993-05-01

    We construct massively parallel, adaptive finite element methods for the solution of hyperbolic conservation laws in one and two dimensions. Spatial discretization is performed by a discontinuous Galerkin finite element method using a basis of piecewise Legendre polynomials. Temporal discretization utilizes a Runge-Kutta method. Dissipative fluxes and projection limiting prevent oscillations near solution discontinuities. The resulting method is of high order and may be parallelized efficiently on MIMD computers. We demonstrate parallel efficiency through computations on a 1024-processor nCUBE/2 hypercube. We also present results using adaptive p-refinement to reduce the computational cost of the method. We describe tiling, a dynamic, element-based data migration system. Tiling dynamically maintains global load balance in the adaptive method by overlapping neighborhoods of processors, where each neighborhood performs local load balancing. We demonstrate the effectiveness of the dynamic load balancing with adaptive p-refinement examples.

  7. A massively parallel adaptive finite element method with dynamic load balancing

    SciTech Connect

    Devine, K.D.; Flaherty, J.E.; Wheat, S.R.; Maccabe, A.B.

    1993-12-31

    The authors construct massively parallel adaptive finite element methods for the solution of hyperbolic conservation laws. Spatial discretization is performed by a discontinuous Galerkin finite element method using a basis of piecewise Legendre polynomials. Temporal discretization utilizes a Runge-Kutta method. Dissipative fluxes and projection limiting prevent oscillations near solution discontinuities. The resulting method is of high order and may be parallelized efficiently on MIMD computers. They demonstrate parallel efficiency through computations on a 1024-processor nCUBE/2 hypercube. They present results using adaptive p-refinement to reduce the computational cost of the method, and tiling, a dynamic, element-based data migration system that maintains global load balance of the adaptive method by overlapping neighborhoods of processors that each perform local balancing.

  8. The use of inexact ODE solver in waveform relaxation methods on a massively parallel computer

    SciTech Connect

    Luk, W.S.; Wing, O.

    1995-12-01

    This paper presents the use of an inexact ordinary differential equation (ODE) solver in waveform relaxation methods for solving initial value problems. Since conventional ODE solvers are inherently sequential, the inexact solver performs the time integration using time points taken only from the previous waveform iteration. As a result, this method is truly massively parallel, as the equation is completely unfolded both in system and in time. Convergence analysis shows that the spectral radius of the iteration equation resulting from the "inexact" solver is the same as that from the standard method, and hence the new method is robust. The parallel implementation issues on the DECmpp 12000/Sx computer are also discussed. Numerical results illustrate that although the inexact method requires more iterations than the exact method, as expected, the computation time is much reduced because of the large-scale parallelism.
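
    The "inexact" solver can be read as a Picard-type iteration in which the integrand is evaluated only at the previous waveform, so every time point of the new iterate is independent. A minimal sketch under that reading (trapezoidal quadrature and the linear test problem x' = Ax are assumptions):

        import numpy as np

        def picard_waveform(A, x0, T, n_steps, n_iter):
            """Waveform relaxation for x' = A x with an 'inexact' solver.

            Each iterate is x_{k+1}(t) = x0 + integral_0^t A x_k(s) ds,
            evaluated here by a trapezoidal cumulative sum. The right-hand
            side uses only the previous waveform, so every time point and
            every equation of the new iterate is independent -- the problem
            is unfolded both in system and in time.
            """
            dt = T / n_steps
            x = np.tile(np.asarray(x0, float), (n_steps + 1, 1))  # initial guess
            for _ in range(n_iter):
                f = x @ A.T                        # A x_k(t) at all time points
                steps = 0.5 * (f[1:] + f[:-1]) * dt
                x = x0 + np.vstack([np.zeros(len(x0)), np.cumsum(steps, axis=0)])
            return x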

  9. Design and Performance Analysis of a Massively Parallel Atmospheric General Circulation Model

    NASA Technical Reports Server (NTRS)

    Schaffer, Daniel S.; Suarez, Max J.

    1998-01-01

    In the 1990s, computer manufacturers are increasingly turning to the development of parallel processor machines to meet the high performance needs of their customers. Simultaneously, atmospheric scientists study weather and climate phenomena ranging from hurricanes to El Nino to global warming, which require increasingly fine resolution models. Here, the implementation of a parallel atmospheric general circulation model (GCM) which exploits the power of massively parallel machines is described. Using the horizontal data domain decomposition methodology, this FORTRAN 90 model is able to integrate a 0.6 deg. longitude by 0.5 deg. latitude problem at a rate of 19 Gigaflops on 512 processors of a Cray T3E-600, corresponding to 280 seconds of wall-clock time per simulated model day. At this resolution, the model has 64 times as many degrees of freedom and performs 400 times as many floating point operations per simulated day as the model it replaces.

  10. Virtual Simulator: An infrastructure for design and performance-prediction of massively parallel codes

    NASA Astrophysics Data System (ADS)

    Perumalla, K.; Fujimoto, R.; Pande, S.; Karimabadi, H.; Driscoll, J.; Omelchenko, Y.

    2005-12-01

    Large parallel/distributed scientific simulations are very complex, and their dynamic behavior is hard to predict. Efficient development of massively parallel codes remains a computational challenge. For example, almost none of the kinetic codes in use in space physics today have dynamic load balancing capability. Here we present a new infrastructure for the design and performance prediction of parallel codes. Performance prediction is useful for analyzing, understanding and experimenting with different partitioning schemes, multiple modeling alternatives and so on, without having to run the application on supercomputers. Instrumentation of the model (with minimal perturbation to performance) is useful for gleaning key metrics and understanding application-level behavior. Unfortunately, traditional approaches to virtual execution and instrumentation are limited by either slow execution speed or low resolution or both. We present a new high-resolution framework that provides a virtual CPU abstraction (with a full thread context per CPU), yet scales to thousands of virtual CPUs. The tool, called PDES2, presents different levels of modeling interfaces, from general purpose parallel simulations to parallel grid-based particle-in-cell (PIC) codes. The tool itself runs on multiple processors in order to accommodate the high resolution by distributing the virtual execution across processors. Validation experiments of PIC models in the framework using a 1-D hybrid shock application show close agreement of results from virtual executions with results from actual supercomputer runs. The utility of this tool is further illustrated through an application to a parallel global hybrid code.

  11. LDRD final report on massively-parallel linear programming: the parPCx system.

    SciTech Connect

    Parekh, Ojas; Phillips, Cynthia Ann; Boman, Erik Gunnar

    2005-02-01

    This report summarizes the research and development performed from October 2002 to September 2004 at Sandia National Laboratories under the Laboratory-Directed Research and Development (LDRD) project ''Massively-Parallel Linear Programming''. We developed a linear programming (LP) solver designed to use a large number of processors. LP is the optimization of a linear objective function subject to linear constraints. Companies and universities have expended huge efforts over decades to produce fast, stable serial LP solvers. Previous parallel codes run on shared-memory systems and have little or no distribution of the constraint matrix. We have seen no reports of general LP solver runs on large numbers of processors. Our parallel LP code is based on an efficient serial implementation of Mehrotra's interior-point predictor-corrector algorithm (PCx). The computational core of this algorithm is the assembly and solution of a sparse linear system. We have substantially rewritten the PCx code and based it on Trilinos, the parallel linear algebra library developed at Sandia. Our interior-point method can use either direct or iterative solvers for the linear system. To achieve a good parallel data distribution of the constraint matrix, we use a (pre-release) version of a hypergraph partitioner from the Zoltan partitioning library. We describe the design and implementation of our new LP solver called parPCx and give preliminary computational results. We summarize a number of issues related to efficient parallel solution of LPs with interior-point methods including data distribution, numerical stability, and solving the core linear system using both direct and iterative methods. We describe a number of applications of LP specific to US Department of Energy mission areas and we summarize our efforts to integrate parPCx (and parallel LP solvers in general) into Sandia's massively-parallel integer programming solver PICO (Parallel Integer and Combinatorial Optimizer).

  12. Overcoming rule-based rigidity and connectionist limitations through massively-parallel case-based reasoning

    NASA Technical Reports Server (NTRS)

    Barnden, John; Srinivas, Kankanahalli

    1990-01-01

    Symbol manipulation as used in traditional Artificial Intelligence has been criticized by neural net researchers for being excessively inflexible and sequential. On the other hand, the application of neural net techniques to the types of high-level cognitive processing studied in traditional artificial intelligence presents major problems as well. A promising way out of this impasse is to build neural net models that accomplish massively parallel case-based reasoning. Case-based reasoning, which has received much attention recently, is essentially the same as analogy-based reasoning, and avoids many of the criticisms leveled at traditional artificial intelligence. Further problems are avoided by doing many strands of case-based reasoning in parallel, and by implementing the whole system as a neural net. In addition, such a system provides an approach to some aspects of the problems of noise, uncertainty and novelty in reasoning systems. The current neural net system (Conposit), which performs standard rule-based reasoning, is being modified into a massively parallel case-based reasoning version.

  13. A Novel Implementation of Massively Parallel Three Dimensional Monte Carlo Radiation Transport

    NASA Astrophysics Data System (ADS)

    Robinson, P. B.; Peterson, J. D. L.

    2005-12-01

    The goal of our summer project was to implement the difference formulation for radiation transport into Cosmos++, a multidimensional, massively parallel, magnetohydrodynamics code for astrophysical applications (Peter Anninos - AX). The difference formulation is a new method for Symbolic Implicit Monte Carlo thermal transport (Brooks and Szöke - PAT). Formerly, simultaneous implementation of fully implicit Monte Carlo radiation transport in multiple dimensions on multiple processors had not been convincingly demonstrated. We found that a combination of the difference formulation and the inherent structure of Cosmos++ makes such an implementation both accurate and straightforward. We developed a "nearly nearest neighbor physics" technique to allow each processor to work independently, even with a fully implicit code. This technique, coupled with the increased accuracy of an implicit Monte Carlo solution and the efficiency of parallel computing systems, allows us to demonstrate the possibility of massively parallel thermal transport. This work was performed under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract No. W-7405-Eng-48.

  14. Massively parallel regularized 3D inversion of potential fields on CPUs and GPUs

    NASA Astrophysics Data System (ADS)

    Čuma, Martin; Zhdanov, Michael S.

    2014-01-01

    We have recently introduced a massively parallel regularized 3D inversion of potential fields data. This program takes as input gravity or magnetic vector, tensor and Total Magnetic Intensity (TMI) measurements and produces a 3D volume of density, susceptibility, or a three-dimensional magnetization vector, the latter also including magnetic remanence information. The code uses a combined MPI and OpenMP approach that maps well onto current multiprocessor multicore clusters and exhibits nearly linear strong and weak parallel scaling. It has been used to invert regional- to continental-size data sets with up to a billion cells of the 3D Earth's volume on large clusters for interpretation of large airborne gravity and magnetics surveys. In this paper we explain the features that made this massive parallelization feasible and extend the code to add GPU support in the form of OpenACC directives. This implementation resulted in up to a 22x speedup compared to the scalar multithreaded implementation on a 12-core Intel CPU based computer node. Furthermore, we also introduce a mixed single-double precision approach, which allows us to perform most of the calculation in single floating point precision while keeping the result as precise as if double precision had been used. This approach provides an additional 40% speedup on the GPUs, as compared to the pure double precision implementation. It also has about half of the memory footprint of the fully double precision version.
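
    The mixed-precision strategy is reminiscent of classic iterative refinement: do the heavy arithmetic in single precision and form residuals in double. A minimal sketch of that generic technique follows; it is a plausible reading of the approach, not necessarily the authors' exact scheme.

        import numpy as np

        def mixed_precision_solve(A, b, n_refine=3):
            """Solve Ax = b mostly in float32, refining in float64.

            The expensive solves run in single precision; only residuals are
            formed in double precision, recovering near-double accuracy.
            (Classic iterative refinement; a production code would factor
            A32 once and reuse the factors for every correction solve.)
            """
            A32 = A.astype(np.float32)
            x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
            for _ in range(n_refine):
                r = b - A @ x                                   # float64 residual
                x += np.linalg.solve(A32, r.astype(np.float32)).astype(np.float64)
            return x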

  15. A cost-effective methodology for the design of massively-parallel VLSI functional units

    NASA Technical Reports Server (NTRS)

    Venkateswaran, N.; Sriram, G.; Desouza, J.

    1993-01-01

    In this paper we propose a generalized methodology for the design of cost-effective massively-parallel VLSI Functional Units. This methodology is based on a technique of generating and reducing a massive bit-array on the mask-programmable PAcube VLSI array. This methodology unifies (maintains identical data flow and control) the execution of complex arithmetic functions on PAcube arrays. It is highly regular, expandable and uniform with respect to problem-size and wordlength, thereby reducing the communication complexity. The memory-functional unit interface is regular and expandable. Using this technique, functional units of dedicated processors can be mask-programmed on the naked PAcube arrays, reducing the turn-around time. The production cost of such dedicated processors can be drastically reduced, since the naked PAcube arrays can be mass-produced. Analysis of the performance of functional units designed by our method yields promising results.

  16. Commodity cluster and hardware-based massively parallel implementations of hyperspectral imaging algorithms

    NASA Astrophysics Data System (ADS)

    Plaza, Antonio; Chang, Chein-I.; Plaza, Javier; Valencia, David

    2006-05-01

    The incorporation of hyperspectral sensors aboard airborne/satellite platforms is currently producing a nearly continual stream of multidimensional image data, and this high data volume has quickly introduced new processing challenges. The price paid for the wealth of spatial and spectral information available from hyperspectral sensors is the enormous amount of data that they generate. In many applications, however, the desired information must be calculated quickly enough for practical use. High computing performance is particularly important in homeland defense and security applications, in which swift decisions often involve detection of (sub-pixel) military targets (including hostile weaponry, camouflage, concealment, and decoys) or chemical/biological agents. In order to speed up the computational performance of hyperspectral imaging algorithms, this paper develops several fast parallel data processing techniques spanning four classes of algorithms: (1) unsupervised classification, (2) spectral unmixing, (3) automatic target recognition, and (4) onboard data compression. A massively parallel Beowulf cluster (Thunderhead) at NASA's Goddard Space Flight Center in Maryland is used to measure the parallel performance of the proposed algorithms. In order to explore the viability of developing onboard, real-time hyperspectral data compression algorithms, a Xilinx Virtex-II field programmable gate array (FPGA) is also used in experiments. Our quantitative and comparative assessment of parallel techniques and strategies may help image analysts in the selection of parallel hyperspectral algorithms for specific applications.

  17. Stochastic simulation of charged particle transport on the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Earl, James A.

    1988-01-01

    Computations of cosmic-ray transport based upon finite-difference methods are afflicted by instabilities, inaccuracies, and artifacts. To avoid these problems, researchers developed a Monte Carlo formulation which is closely related not only to the finite-difference formulation, but also to the underlying physics of transport phenomena. Implementations of this approach are currently running on the Massively Parallel Processor at Goddard Space Flight Center, whose enormous computing power overcomes the poor statistical accuracy that usually limits the use of stochastic methods. These simulations have progressed to a stage where they provide a useful and realistic picture of solar energetic particle propagation in interplanetary space.

  18. Block iterative restoration of astronomical images with the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Heap, Sara R.; Lindler, Don J.

    1987-01-01

    A method is described for algebraic image restoration capable of treating astronomical images. For a typical 500 x 500 image, direct algebraic restoration would require the solution of a 250,000 x 250,000 linear system. The block iterative approach is used to reduce the problem to solving 4900 linear systems of size 121 x 121. The algorithm was implemented on the Goddard Massively Parallel Processor, which can solve a 121 x 121 system in approximately 0.06 seconds. Examples are shown of the results for various astronomical images.
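
    The essence of a block iterative scheme is to solve a small dense system per block of unknowns while holding the others fixed. A minimal block Gauss-Seidel sketch of that generic idea follows (the paper's specific blocking of the image into 121-unknown systems is more elaborate; all names here are illustrative):

        import numpy as np

        def block_gauss_seidel(A, b, block_size, n_sweeps=25):
            """Block iterative solution of Ax = b.

            Rather than factoring the full system (250,000 x 250,000 for a
            500 x 500 image), each sweep solves one small dense system per
            block of unknowns while the other blocks are held fixed. With a
            Jacobi-style update the block solves become independent and can
            be farmed out to the processors of an MPP.
            """
            n = len(b)
            x = np.zeros(n)
            blocks = [slice(i, min(i + block_size, n)) for i in range(0, n, block_size)]
            for _ in range(n_sweeps):
                for s in blocks:
                    r = b[s] - A[s, :] @ x + A[s, s] @ x[s]   # drop block s coupling
                    x[s] = np.linalg.solve(A[s, s], r)        # small local solve
            return x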

  19. Direct methods for banded linear systems on massively parallel processor computers

    SciTech Connect

    Arbenz, P.; Gander, W.

    1995-12-01

    The authors discuss direct methods for solving systems of linear equations Ax = b on massively parallel processor (MPP) computers, where A is a real banded n x n matrix with lower and upper half-bandwidths r and s, respectively. We assume that the matrix A has a narrow band, meaning r + s << n. Only in this case is it worthwhile to take the zero structure of A into account, i.e., to store the matrix by diagonals and modify the algorithms accordingly.
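
    Storage by diagonals is the band layout used by LAPACK and SciPy, in which row s + i - j of an (r + s + 1) x n array holds A[i, j]. A minimal sketch (the packing helper is illustrative; scipy.linalg.solve_banded is a real routine):

        import numpy as np
        from scipy.linalg import solve_banded

        def to_banded(A, r, s):
            """Pack a banded matrix into diagonal (LAPACK band) storage.

            Only the r + s + 1 nonzero diagonals are stored, which is the
            point of the layout when r + s << n.
            """
            n = A.shape[0]
            ab = np.zeros((r + s + 1, n))
            for i in range(n):
                for j in range(max(0, i - r), min(n, i + s + 1)):
                    ab[s + i - j, j] = A[i, j]
            return ab

        # Tridiagonal example (r = s = 1)
        A = np.diag(4.0 * np.ones(5)) + np.diag(-np.ones(4), 1) + np.diag(-np.ones(4), -1)
        x = solve_banded((1, 1), to_banded(A, 1, 1), np.ones(5))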

  20. Scalable load balancing for massively parallel distributed Monte Carlo particle transport

    SciTech Connect

    O'Brien, M. J.; Brantley, P. S.; Joy, K. I.

    2013-07-01

    In order to run computer simulations efficiently on massively parallel computers with hundreds of thousands or millions of processors, care must be taken that the calculation is load balanced across the processors. Examining the workload of every processor leads to an unscalable algorithm, with run time at least as large as O(N), where N is the number of processors. We present a scalable load balancing algorithm, with run time O(log N), that involves iterated processor-pair-wise balancing steps, ultimately leading to a globally balanced workload. We demonstrate scalability of the algorithm up to 2 million processors on the Sequoia supercomputer at Lawrence Livermore National Laboratory.
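
    A hypercube-style pairing illustrates why log2(N) rounds of pair-wise balancing suffice to reach a global balance. The sketch below simulates the idea serially; the paper's actual pairing and work-transfer rule may differ.

        import numpy as np

        def pairwise_balance(load):
            """log2(N) rounds of processor-pair-wise load averaging.

            In round d, processor i pairs with processor i XOR 2^d and the
            pair splits its combined work evenly; after log2(N) rounds every
            processor holds the global average, with no O(N) global view of
            the workload ever required.
            """
            load = np.asarray(load, dtype=float)
            n = len(load)                        # assumed a power of two here
            for d in range(int(np.log2(n))):
                partner = np.arange(n) ^ (1 << d)
                load = 0.5 * (load + load[partner])
            return load

        print(pairwise_balance([8, 0, 0, 0, 4, 2, 1, 1]))   # eight copies of 2.0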

  1. Estimating water flow through a hillslope using the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Devaney, Judy E.; Camillo, P. J.; Gurney, R. J.

    1988-01-01

    A new two-dimensional model of water flow in a hillslope has been implemented on the Massively Parallel Processor at the Goddard Space Flight Center. Flow in the soil both in the saturated and unsaturated zones, evaporation and overland flow are all modelled, and the rainfall rates are allowed to vary spatially. Previous models of this type had always been very limited computationally. This model takes less than a minute to model all the components of the hillslope water flow for a day. The model can now be used in sensitivity studies to specify which measurements should be taken and how accurate they should be to describe such flows for environmental studies.

  2. Animated computer graphics models of space and earth sciences data generated via the massively parallel processor

    NASA Technical Reports Server (NTRS)

    Treinish, Lloyd A.; Gough, Michael L.; Wildenhain, W. David

    1987-01-01

    A capability was developed for rapidly producing visual representations of large, complex, multi-dimensional space and earth sciences data sets by implementing computer graphics modeling techniques, recently developed for typically non-scientific applications, on the Massively Parallel Processor (MPP). Such capabilities can provide a new and valuable tool for the understanding of complex scientific data, and a new application of parallel computing via the MPP. A prototype system with these capabilities was developed and integrated into the National Space Science Data Center's (NSSDC) Pilot Climate Data System (PCDS), a data-independent environment for computer graphics data display, to provide easy access to users. While developing these capabilities, several problems had to be solved independently of the actual use of the MPP, all of which are outlined.

  3. Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

    DOE PAGESBeta

    Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; Davidson, Gregory G.; Hamilton, Steven P.; Godfrey, Andrew T.

    2015-12-21

    This paper discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package developed and maintained at Oak Ridge National Laboratory. It has been developed to scale well from laptop to small computing clusters to advanced supercomputers. Special features of Shift include hybrid capabilities for variance reduction such as CADIS and FW-CADIS, and advanced parallel decomposition and tally methods optimized for scalability on supercomputing architectures. Shift has been validated and verified against various reactor physics benchmarks and compares well to other state-of-the-art Monte Carlo radiation transport codes such as MCNP5, CE KENO-VI, and OpenMC. Some specific benchmarks used for verification and validation include the CASL VERA criticality test suite and several Westinghouse AP1000® problems. These benchmark and scaling studies show promising results.

  4. Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

    SciTech Connect

    Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; Davidson, Gregory G.; Hamilton, Steven P.; Godfrey, Andrew T.

    2015-12-21

    This paper discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package developed and maintained at Oak Ridge National Laboratory. It has been developed to scale well from laptop to small computing clusters to advanced supercomputers. Special features of Shift include hybrid capabilities for variance reduction such as CADIS and FW-CADIS, and advanced parallel decomposition and tally methods optimized for scalability on supercomputing architectures. Shift has been validated and verified against various reactor physics benchmarks and compares well to other state-of-the-art Monte Carlo radiation transport codes such as MCNP5, CE KENO-VI, and OpenMC. Some specific benchmarks used for verification and validation include the CASL VERA criticality test suite and several Westinghouse AP1000® problems. These benchmark and scaling studies show promising results.

  5. Implementation, capabilities, and benchmarking of Shift, a massively parallel Monte Carlo radiation transport code

    NASA Astrophysics Data System (ADS)

    Pandya, Tara M.; Johnson, Seth R.; Evans, Thomas M.; Davidson, Gregory G.; Hamilton, Steven P.; Godfrey, Andrew T.

    2016-03-01

    This work discusses the implementation, capabilities, and validation of Shift, a massively parallel Monte Carlo radiation transport package authored at Oak Ridge National Laboratory. Shift has been developed to scale well from laptops to small computing clusters to advanced supercomputers and includes features such as support for multiple geometry and physics engines, hybrid capabilities for variance reduction methods such as the Consistent Adjoint-Driven Importance Sampling methodology, advanced parallel decompositions, and tally methods optimized for scalability on supercomputing architectures. The scaling studies presented in this paper demonstrate good weak and strong scaling behavior for the implemented algorithms. Shift has also been validated and verified against various reactor physics benchmarks, including the Consortium for Advanced Simulation of Light Water Reactors' Virtual Environment for Reactor Analysis criticality test suite and several Westinghouse AP1000® problems presented in this paper. These benchmark results compare well to those from other contemporary Monte Carlo codes such as MCNP5 and KENO.

  6. Massively Parallel Computation of Soil Surface Roughness Parameters on A Fermi GPU

    NASA Astrophysics Data System (ADS)

    Li, Xiaojie; Song, Changhe

    2016-06-01

    Surface roughness is a description of the random or irregular micro-topography of a surface. The standard deviation of surface height and the surface correlation length describe the statistical variation of the random component of surface height relative to a reference surface. When the number of data points is large, calculation of surface roughness parameters is time-consuming. With the advent of Graphics Processing Unit (GPU) architectures, inherently parallel problems can be solved effectively using GPUs. In this paper we propose a GPU-based massively parallel computing method for 2D bare-soil surface roughness estimation. This method was applied to data collected by a surface roughness tester based on the laser triangulation principle during a field experiment in April 2012. The total number of data points was 52,040. The computation took 47 seconds on a Fermi GTX 590 GPU, whereas its serial CPU version took 5422 seconds, a significant 115x speedup.
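
    The two parameters have compact definitions: the RMS height is the standard deviation of the de-meaned heights, and the correlation length is the lag at which the normalized autocorrelation drops below a cutoff. A minimal serial sketch of the arithmetic the GPU kernels parallelize (the 1/e cutoff is an assumption; other conventions exist):

        import numpy as np

        def roughness_parameters(z, dx):
            """RMS height and correlation length of a 1D height profile.

            s is the standard deviation of heights about the mean surface;
            l is the first lag at which the normalized autocorrelation
            falls below 1/e. Both are reductions over the data points,
            which is what makes them map well onto a GPU.
            """
            z = z - z.mean()
            s = z.std()
            n = len(z)
            acf = np.correlate(z, z, mode="full")[n - 1:] / (z @ z)
            below = np.where(acf < 1.0 / np.e)[0]
            l = below[0] * dx if below.size else np.nan
            return s, l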

  7. A Massively Parallel Solver for the Mechanical Harmonic Analysis of Accelerator Cavities

    SciTech Connect

    O. Kononenko

    2015-02-17

    ACE3P is a 3D massively parallel simulation suite developed at SLAC National Accelerator Laboratory that can perform coupled electromagnetic, thermal, and mechanical studies. Effectively utilizing supercomputer resources, ACE3P has become a key simulation tool for particle accelerator R&D. A new frequency-domain solver to perform mechanical harmonic response analysis of accelerator components has been developed within the existing parallel framework. This solver is designed to determine the frequency response of the mechanical system to external harmonic excitations, for time-efficient, accurate analysis of large-scale problems. Coupled with the ACE3P electromagnetic modules, this capability complements a set of multi-physics tools for a comprehensive study of microphonics in superconducting accelerating cavities, in order to understand the RF response and feedback requirements for the operational reliability of a particle accelerator.
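
    Harmonic response analysis solves, at each excitation frequency omega, the linear system (K + i*omega*C - omega^2*M) x = f; the frequencies are mutually independent, which is part of what makes the sweep amenable to parallel execution. A minimal dense sketch (ACE3P itself works with large sparse finite-element matrices, and its solver internals are not described here):

        import numpy as np

        def harmonic_response(K, M, C, f, omegas):
            """Frequency sweep for (K + i w C - w^2 M) x = f.

            K, M, C: stiffness, mass, and damping matrices; f: load vector.
            Every excitation frequency is an independent linear solve, so
            the sweep distributes trivially across processors.
            """
            return np.array([np.linalg.solve(K + 1j * w * C - (w ** 2) * M, f)
                             for w in omegas])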

  8. Massively Parallel and Scalable Implicit Time Integration Algorithms for Structural Dynamics

    NASA Technical Reports Server (NTRS)

    Farhat, Charbel

    1997-01-01

    Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because of the following additional facts: (a) explicit schemes are easier to parallelize than implicit ones, and (b) explicit schemes induce short range interprocessor communications that are relatively inexpensive, while the factorization methods used in most implicit schemes induce long range interprocessor communications that often ruin the sought-after speed-up. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet be offset by the speed of the currently available parallel hardware. Therefore, it is essential to develop efficient alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating the low-frequency dynamics of aerospace structures.

  9. Massively parallel simulation of flow and transport in variably saturated porous and fractured media

    SciTech Connect

    Wu, Yu-Shu; Zhang, Keni; Pruess, Karsten

    2002-01-15

    This paper describes a massively parallel simulation method and its application for modeling multiphase flow and multicomponent transport in porous and fractured reservoirs. The parallel-computing method has been implemented into the TOUGH2 code and its numerical performance is tested on a Cray T3E-900 and IBM SP. The efficiency and robustness of the parallel-computing algorithm are demonstrated by completing two simulations with more than one million gridblocks, using site-specific data obtained from a site-characterization study. The first application involves the development of a three-dimensional numerical model for flow in the unsaturated zone of Yucca Mountain, Nevada. The second application is the study of tracer/radionuclide transport through fracture-matrix rocks for the same site. The parallel-computing technique enhances modeling capabilities by achieving several-orders-of-magnitude speedup for large-scale and high resolution modeling studies. The resulting modeling results provide many new insights into flow and transport processes that could not be obtained from simulations using the single-CPU simulator.

  10. DGDFT: A massively parallel method for large scale density functional theory calculations

    SciTech Connect

    Hu, Wei Yang, Chao; Lin, Lin

    2015-09-28

    We describe a massively parallel implementation of the recently developed discontinuous Galerkin density functional theory (DGDFT) method, for efficient large-scale Kohn-Sham DFT based electronic structure calculations. The DGDFT method uses adaptive local basis (ALB) functions generated on-the-fly during the self-consistent field iteration to represent the solution to the Kohn-Sham equations. The use of the ALB set provides a systematic way to improve the accuracy of the approximation. By using the pole expansion and selected inversion technique to compute electron density, energy, and atomic forces, we can make the computational complexity of DGDFT scale at most quadratically with respect to the number of electrons for both insulating and metallic systems. We show that for the two-dimensional (2D) phosphorene systems studied here, using 37 basis functions per atom allows us to reach an accuracy level of 1.3 × 10^-4 Hartree/atom in terms of the error of energy and 6.2 × 10^-4 Hartree/bohr in terms of the error of atomic force, respectively. DGDFT can achieve 80% parallel efficiency on 128,000 high performance computing cores when it is used to study the electronic structure of 2D phosphorene systems with 3500-14,000 atoms. This high parallel efficiency results from a two-level parallelization scheme that we will describe in detail.

  11. Seismic waves modeling with the Fourier pseudo-spectral method on massively parallel machines.

    NASA Astrophysics Data System (ADS)

    Klin, Peter

    2015-04-01

    The Fourier pseudo-spectral method (FPSM) is an approach to the 3D numerical modeling of wave propagation, based on the discretization of the spatial domain in a structured grid, which relies on global spatial differential operators for the solution of the wave equation. This last peculiarity is advantageous from the accuracy point of view but poses difficulties for an efficient implementation of the method on parallel computers with distributed memory architecture. The 1D spatial domain decomposition approach has so far been commonly adopted in parallel implementations of the FPSM, but it implies an intensive data exchange among all the processors involved in the computation, which can degrade performance because of communication latencies. Moreover, the scalability of the 1D domain decomposition is limited, since the number of processors cannot exceed the number of grid points along the directions in which the domain is partitioned. This limitation inhibits an efficient exploitation of computational environments with a very large number of processors. In order to overcome the limitations of the 1D domain decomposition, we implemented a parallel version of the FPSM based on a 2D domain decomposition, which allows us to achieve a higher degree of parallelism and scalability on massively parallel machines with several thousands of processing elements. The parallel programming is essentially achieved using the MPI protocol, but OpenMP parts are also included in order to exploit single-processor multi-threading capabilities, when available. The developed tool is aimed at the numerical simulation of seismic wave propagation and in particular is intended for earthquake ground motion research. We show the scalability tests performed up to 16k processing elements on the IBM Blue Gene/Q computer at CINECA (Italy), as well as the application to the simulation of the earthquake ground motion in the alluvial plain of the Po river (Italy).
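
    The global operator at the heart of the FPSM is the FFT-based spectral derivative, shown in a minimal 1D sketch below; it is the all-to-all data dependence of the FFT that forces the data transposes that the 1D and 2D domain decompositions organize differently.

        import numpy as np

        def spectral_derivative(f, L):
            """d/dx of a periodic field by the Fourier pseudo-spectral method.

            The derivative is spectrally accurate but global: each output
            point depends on all input points through the FFT, which is why
            distributed FPSM codes must repartition (transpose) data so that
            every FFT is taken along a locally contiguous axis.
            """
            n = len(f)
            k = 2j * np.pi * np.fft.fftfreq(n, d=L / n)   # wavenumbers
            return np.fft.ifft(k * np.fft.fft(f)).real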

  12. A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets

    SciTech Connect

    Madduri, Kamesh; Ediger, David; Jiang, Karl; Bader, David A.; Chavarría-Miranda, Daniel

    2009-05-29

    We present a new lock-free parallel algorithm for computing betweenness centrality of massive small-world networks. With minor changes to the data structures, our algorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in the HPCS SSCA#2 Graph Analysis benchmark, which has been extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the ThreadStorm processor, and a single-socket Sun multicore server with the UltraSparc T2 processor. For a small-world network of 134 million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.
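
    The kernel being parallelized is Brandes' betweenness centrality algorithm, whose outer per-source loop is what the multithreaded implementations distribute across hardware threads. A minimal serial sketch for unweighted graphs (for undirected graphs each pair is counted twice):

        from collections import deque

        def betweenness(adj):
            """Brandes' betweenness centrality for an unweighted graph.

            adj maps each vertex to its neighbor list. Parallel versions run
            the per-source loop concurrently, which is where the lock-free
            data structures and cache-friendly layouts of the paper matter.
            """
            bc = {v: 0.0 for v in adj}
            for s in adj:
                sigma = {v: 0 for v in adj}; sigma[s] = 1   # shortest-path counts
                dist = {v: -1 for v in adj}; dist[s] = 0
                order, queue = [], deque([s])
                while queue:                                # breadth-first phase
                    v = queue.popleft(); order.append(v)
                    for w in adj[v]:
                        if dist[w] < 0:
                            dist[w] = dist[v] + 1; queue.append(w)
                        if dist[w] == dist[v] + 1:
                            sigma[w] += sigma[v]
                delta = {v: 0.0 for v in adj}
                for v in reversed(order):                   # dependency accumulation
                    for w in adj[v]:
                        if dist[w] == dist[v] + 1:
                            delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
                    if v != s:
                        bc[v] += delta[v]
            return bc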

  13. A Faster Parallel Algorithm and Efficient Multithreaded Implementations for Evaluating Betweenness Centrality on Massive Datasets

    SciTech Connect

    Madduri, Kamesh; Ediger, David; Jiang, Karl; Bader, David A.; Chavarria-Miranda, Daniel

    2009-02-15

    We present a new lock-free parallel algorithm for computing betweenness centrality of massive small-world networks. With minor changes to the data structures, our algorithm also achieves better spatial cache locality compared to previous approaches. Betweenness centrality is a key algorithm kernel in HPCS SSCA#2, a benchmark extensively used to evaluate the performance of emerging high-performance computing architectures for graph-theoretic computations. We design optimized implementations of betweenness centrality and the SSCA#2 benchmark for two hardware multithreaded systems: a Cray XMT system with the Threadstorm processor, and a single-socket Sun multicore server with the UltraSPARC T2 processor. For a small-world network of 134 million vertices and 1.073 billion edges, the 16-processor XMT system and the 8-core Sun Fire T5120 server achieve TEPS scores (an algorithmic performance count for the SSCA#2 benchmark) of 160 million and 90 million respectively, which corresponds to more than a 2X performance improvement over the previous parallel implementations. To better characterize the performance of these multithreaded systems, we correlate the SSCA#2 performance results with data from the memory-intensive STREAM and RandomAccess benchmarks. Finally, we demonstrate the applicability of our implementation to analyze massive real-world datasets by computing approximate betweenness centrality for a large-scale IMDb movie-actor network.

  14. On distributed memory MPI-based parallelization of SPH codes in massive HPC context

    NASA Astrophysics Data System (ADS)

    Oger, G.; Le Touzé, D.; Guibert, D.; de Leffe, M.; Biddiscombe, J.; Soumagne, J.; Piccinali, J.-G.

    2016-03-01

    Most particle methods share the problem of high computational cost, and in order to satisfy the demands of solvers, currently available hardware technologies must be fully exploited. Two complementary technologies are now accessible. On the one hand, CPUs can be structured into a multi-node framework, allowing massive data exchanges through a high-speed network; in this case, each node usually comprises several cores available for multithreaded computation. On the other hand, GPUs, derived from graphics computing technologies, are able to perform highly multi-threaded calculations with hundreds of independent threads connected together through a common shared memory. This paper is primarily dedicated to the distributed-memory parallelization of particle methods, targeting several thousands of CPU cores. The experience gained clearly shows that parallelizing a particle-based code on moderate numbers of cores can easily lead to acceptable scalability, whilst a scalable speedup on thousands of cores is much more difficult to obtain. The discussion revolves around speeding up particle methods as a whole, in a massive HPC context, by making use of the MPI library. We focus on one particular particle method, Smoothed Particle Hydrodynamics (SPH), one of the most widespread today in the literature as well as in engineering.

  15. Massively Parallel Dantzig-Wolfe Decomposition Applied to Traffic Flow Scheduling

    NASA Technical Reports Server (NTRS)

    Rios, Joseph Lucio; Ross, Kevin

    2009-01-01

    Optimal scheduling of air traffic over the entire National Airspace System is a computationally difficult task. To speed computation, Dantzig-Wolfe decomposition is applied to a known linear integer programming approach for assigning delays to flights. The optimization model is proven to have the block-angular structure necessary for Dantzig-Wolfe decomposition. The subproblems for this decomposition are solved in parallel via independent computation threads. Experimental evidence suggests that as the number of subproblems/threads increases (and their respective sizes decrease), the solution quality, convergence, and runtime improve. A demonstration of this is provided by using one flight per subproblem, which is the finest possible decomposition. This results in thousands of subproblems and associated computation threads. This massively parallel approach is compared to one with few threads and to standard (non-decomposed) approaches in terms of solution quality and runtime. Since this method generally provides a non-integral (relaxed) solution to the original optimization problem, two heuristics are developed to generate an integral solution. Dantzig-Wolfe followed by these heuristics can provide a near-optimal (sometimes optimal) solution to the original problem hundreds of times faster than standard (non-decomposed) approaches. In addition, when massive decomposition is employed, the solution is shown to be more likely integral, which obviates the need for an integerization step. These results indicate that nationwide, real-time, high-fidelity, optimal traffic flow scheduling is achievable for (at least) 3-hour planning horizons.

  16. The Fortran-P Translator: Towards Automatic Translation of Fortran 77 Programs for Massively Parallel Processors

    DOE PAGESBeta

    O'keefe, Matthew; Parr, Terence; Edgar, B. Kevin; Anderson, Steve; Woodward, Paul; Dietz, Hank

    1995-01-01

    Massively parallel processors (MPPs) hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how applications codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.

  17. Massively Parallel Sequencing-Based Clonality Analysis of Synchronous Endometrioid Endometrial and Ovarian Carcinomas.

    PubMed

    Schultheis, Anne M; Ng, Charlotte K Y; De Filippo, Maria R; Piscuoglio, Salvatore; Macedo, Gabriel S; Gatius, Sonia; Perez Mies, Belen; Soslow, Robert A; Lim, Raymond S; Viale, Agnes; Huberman, Kety H; Palacios, Jose C; Reis-Filho, Jorge S; Matias-Guiu, Xavier; Weigelt, Britta

    2016-06-01

    Synchronous early-stage endometrioid endometrial carcinomas (EECs) and endometrioid ovarian carcinomas (EOCs) are associated with a favorable prognosis and have been suggested to represent independent primary tumors rather than metastatic disease. We subjected sporadic synchronous EECs/EOCs from five patients to whole-exome massively parallel sequencing, which revealed that the EEC and EOC of each case displayed strikingly similar repertoires of somatic mutations and gene copy number alterations. Despite the presence of mutations restricted to the EEC or EOC in each case, we observed that the mutational processes that shaped their respective genomes were consistent. High-depth targeted massively parallel sequencing of sporadic synchronous EECs/EOCs from 17 additional patients confirmed that these lesions are clonally related. In an additional Lynch Syndrome case, however, the EEC and EOC were found to constitute independent cancers lacking somatic mutations in common. Taken together, sporadic synchronous EECs/EOCs are clonally related and likely constitute dissemination from one site to the other. PMID:26832770

  18. New strategies and emerging technologies for massively parallel sequencing: applications in medical research.

    PubMed

    Mardis, Elaine R

    2009-01-01

    A variety of techniques that specifically target human gene sequences for differential capture from a genomic sample, coupled with next-generation, massively parallel DNA sequencing instruments, is rapidly supplanting the combination of polymerase chain reaction and capillary sequencing to discover coding variants in medically relevant samples. These studies are most appropriate for the sample numbers necessary to identify both common and rare single nucleotide variants, as well as small insertion or deletion events, which may cause complex inherited diseases. The same massively parallel sequencers are simultaneously being used for whole-genome resequencing and comprehensive, genome-wide variant discovery in studies of somatic diseases such as cancer. Viral and microbial researchers are using next-generation sequences to identify unknown etiologic agents in human diseases, to study the viral and microbial species that occupy surfaces of the human body, and to inform the clinical management of chronic infectious diseases such as human immunodeficiency virus (HIV). Taken together, these approaches are dramatically accelerating the pace of human disease research and are already impacting patient care. PMID:19435481

  19. Transcriptional analysis of endocrine disruption using zebrafish and massively parallel sequencing

    PubMed Central

    Baker, Michael E.; Hardiman, Gary

    2014-01-01

    Endocrine disrupting chemicals (EDCs), including plasticizers, pesticides, detergents and pharmaceuticals, affect a variety of hormone-regulated physiological pathways in humans and wildlife. Many EDCs are lipophilic molecules and bind to hydrophobic pockets in steroid receptors, such as the estrogen receptor and androgen receptor, which are important in vertebrate reproduction and development. Indeed, health effects attributed to EDCs include reproductive dysfunction (e.g., reduced fertility, reproductive tract abnormalities and skewed male/female sex ratios in fish), early puberty, various cancers and obesity. A major concern is the effects of exposure to low concentrations of endocrine disruptors in utero and postpartum, which may increase the incidence of cancer and diabetes in adults. EDCs affect the transcription of hundreds and even thousands of genes, which has created the need for new tools to monitor the global effects of EDCs. The emergence of massively parallel sequencing for investigating gene transcription provides a sensitive tool for monitoring the effects of EDCs on humans and other vertebrates as well as for elucidating the mechanism of action of EDCs. Zebrafish conserve many developmental pathways found in humans, which makes zebrafish a valuable model system for studying EDCs, especially their effects on early organ development, because zebrafish embryos are translucent. In this article we review recent advances in massively parallel sequencing approaches with a focus on zebrafish. We make the case that zebrafish exposed to EDCs at different stages of development can provide important insights into EDC effects on human health. PMID:24850832

  20. A massively parallel semi-Lagrangian algorithm for solving the transport equation

    SciTech Connect

    Manson, Russell; Wang, Dali

    2010-01-01

    The scalar transport equation underpins many models employed in science, engineering, technology and business. Application areas include, but are not restricted to, pollution transport, weather forecasting, video analysis and encoding (the optical flow equation), options and stock pricing (the Black-Scholes equation) and spatially explicit ecological models. Unfortunately finding numerical solutions to this equation which are fast and accurate is not trivial. Moreover, finding such numerical algorithms that can be implemented on high performance computer architectures efficiently is challenging. In this paper the authors describe a massively parallel algorithm for solving the advection portion of the transport equation. We present an approach here which is different to that used in most transport models and which we have tried and tested for various scenarios. The approach employs an intelligent domain decomposition based on the vector field of the system equations and thus automatically partitions the computational domain into algorithmically autonomous regions. The solution of a classic pure advection transport problem is shown to be conservative, monotonic and highly accurate at large time steps. Additionally we demonstrate that the algorithm is highly efficient for high performance computer architectures and thus offers a route towards massively parallel application.
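
    While the abstract does not spell out the discretization, the semi-Lagrangian core of such schemes is standard: trace each grid point's characteristic back one time step and interpolate the old field at the departure point. A minimal 1D sketch with constant velocity and linear interpolation (the paper's vector-field-based domain decomposition sits on top of this kernel):

        import numpy as np

        def semi_lagrangian_step(c, u, dt, dx):
            """One semi-Lagrangian step for c_t + u c_x = 0 on a periodic grid.

            Each grid point traces its characteristic back a distance u*dt
            and interpolates the old field at the departure point, so the
            step remains stable even when u*dt > dx -- the large-time-step
            property the abstract highlights. Departure points are mutually
            independent, hence the massive parallelism.
            """
            n = len(c)
            x = np.arange(n) * dx
            xd = (x - u * dt) % (n * dx)              # departure points, wrapped
            i = np.floor(xd / dx).astype(int) % n
            w = xd / dx - np.floor(xd / dx)           # linear interpolation weight
            return (1 - w) * c[i] + w * c[(i + 1) % n]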

  1. MADmap: A Massively Parallel Maximum-Likelihood Cosmic Microwave Background Map-Maker

    SciTech Connect

    Cantalupo, Christopher; Borrill, Julian; Jaffe, Andrew; Kisner, Theodore; Stompor, Radoslaw

    2009-06-09

    MADmap is a software application used to produce maximum-likelihood images of the sky from time-ordered data which include correlated noise, such as those gathered by Cosmic Microwave Background (CMB) experiments. It works efficiently on platforms ranging from small workstations to the most massively parallel supercomputers. Map-making is a critical step in the analysis of all CMB data sets, and the maximum-likelihood approach is the most accurate and widely applicable algorithm; however, it is a computationally challenging task. This challenge will only increase with the next generation of ground-based, balloon-borne and satellite CMB polarization experiments. The faintness of the B-mode signal that these experiments seek to measure requires them to gather enormous data sets. MADmap is already being run on up to O(10^11) time samples, O(10^8) pixels and O(10^4) cores, with ongoing work to scale to the next generation of data sets and supercomputers. We describe MADmap's algorithm based around a preconditioned conjugate gradient solver, fast Fourier transforms and sparse matrix operations. We highlight MADmap's ability to address problems typically encountered in the analysis of realistic CMB data sets and describe its application to simulations of the Planck and EBEX experiments. The massively parallel and distributed implementation is detailed and scaling complexities are given for the resources required. MADmap is capable of analysing the largest data sets now being collected on computing resources currently available, and we argue that, given Moore's Law, MADmap will be capable of reducing the most massive projected data sets.
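
    The map-making problem being solved is the normal-equation system (A^T N^-1 A) m = A^T N^-1 d for the pixelized map m, given the pointing matrix A, time-ordered data d and noise covariance N. A minimal conjugate gradient sketch for the white-noise (diagonal N) case follows; MADmap itself preconditions the system and applies a correlated N^-1 through FFT-based Toeplitz products, and the function name is illustrative.

        import numpy as np

        def ml_map(A, d, noise_var, n_iter=100, tol=1e-8):
            """Conjugate-gradient solution of (A^T N^-1 A) m = A^T N^-1 d.

            A: pointing matrix (samples x pixels; an ndarray or scipy.sparse
            matrix), d: time-ordered data, noise_var: per-sample variance
            (white noise assumed in this sketch).
            """
            Ninv = 1.0 / noise_var
            rhs = A.T @ (Ninv * d)
            m = np.zeros(A.shape[1])
            r = rhs.copy()
            p = r.copy()
            rr = r @ r
            for _ in range(n_iter):
                Ap = A.T @ (Ninv * (A @ p))   # apply A^T N^-1 A matrix-free
                alpha = rr / (p @ Ap)
                m += alpha * p
                r -= alpha * Ap
                rr_new = r @ r
                if rr_new ** 0.5 <= tol * (rhs @ rhs) ** 0.5:
                    break
                p = r + (rr_new / rr) * p
                rr = rr_new
            return m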

  2. ASSET: Analysis of Sequences of Synchronous Events in Massively Parallel Spike Trains.

    PubMed

    Torre, Emiliano; Canova, Carlos; Denker, Michael; Gerstein, George; Helias, Moritz; Grün, Sonja

    2016-07-01

    With the ability to observe the activity from large numbers of neurons simultaneously using modern recording technologies, the chance to identify sub-networks involved in coordinated processing increases. Sequences of synchronous spike events (SSEs) constitute one type of such coordinated spiking that propagates activity in a temporally precise manner. The synfire chain was proposed as one potential model for such network processing. Previous work introduced a method for visualization of SSEs in massively parallel spike trains, based on an intersection matrix that contains in each entry the degree of overlap of active neurons in two corresponding time bins. Repeated SSEs are reflected in the matrix as diagonal structures of high overlap values. The method as such, however, leaves the task of identifying these diagonal structures to visual inspection rather than to a quantitative analysis. Here we present ASSET (Analysis of Sequences of Synchronous EvenTs), an improved, fully automated method which determines diagonal structures in the intersection matrix by a robust mathematical procedure. The method consists of a sequence of steps that i) assess which entries in the matrix potentially belong to a diagonal structure, ii) cluster these entries into individual diagonal structures and iii) determine the neurons composing the associated SSEs. We employ parallel point processes generated by stochastic simulations as test data to demonstrate the performance of the method under a wide range of realistic scenarios, including different types of non-stationarity of the spiking activity and different correlation structures. Finally, the ability of the method to discover SSEs is demonstrated on complex data from large network simulations with embedded synfire chains. Thus, ASSET represents an effective and efficient tool to analyze massively parallel spike data for temporal sequences of synchronous activity. PMID:27420734
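
    The intersection matrix at the heart of the method is cheap to state: with a binary bin x neuron matrix B, entry (i, j) of B B^T counts the neurons active in both time bins i and j. A minimal sketch (unnormalized, and omitting ASSET's significance testing and clustering steps; the function name is illustrative):

        import numpy as np

        def intersection_matrix(spike_times, n_neurons, t_stop, bin_size):
            """ASSET-style intersection matrix of binned parallel spike trains.

            B[i, k] = 1 if neuron k spikes in time bin i; entry (i, j) of
            B @ B.T then counts the neurons active in both bins i and j.
            Repeated sequences of synchronous events appear as diagonal
            structures of high overlap.
            """
            n_bins = int(np.ceil(t_stop / bin_size))
            B = np.zeros((n_bins, n_neurons), dtype=int)
            for k, times in enumerate(spike_times):      # one spike list per neuron
                B[(np.asarray(times) / bin_size).astype(int), k] = 1
            return B @ B.T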

  3. ASSET: Analysis of Sequences of Synchronous Events in Massively Parallel Spike Trains

    PubMed Central

    Canova, Carlos; Denker, Michael; Gerstein, George; Helias, Moritz

    2016-01-01

    With the ability to observe the activity from large numbers of neurons simultaneously using modern recording technologies, the chance to identify sub-networks involved in coordinated processing increases. Sequences of synchronous spike events (SSEs) constitute one type of such coordinated spiking that propagates activity in a temporally precise manner. The synfire chain was proposed as one potential model for such network processing. Previous work introduced a method for visualization of SSEs in massively parallel spike trains, based on an intersection matrix that contains in each entry the degree of overlap of active neurons in two corresponding time bins. Repeated SSEs are reflected in the matrix as diagonal structures of high overlap values. The method as such, however, leaves the task of identifying these diagonal structures to visual inspection rather than to a quantitative analysis. Here we present ASSET (Analysis of Sequences of Synchronous EvenTs), an improved, fully automated method which determines diagonal structures in the intersection matrix by a robust mathematical procedure. The method consists of a sequence of steps that i) assess which entries in the matrix potentially belong to a diagonal structure, ii) cluster these entries into individual diagonal structures and iii) determine the neurons composing the associated SSEs. We employ parallel point processes generated by stochastic simulations as test data to demonstrate the performance of the method under a wide range of realistic scenarios, including different types of non-stationarity of the spiking activity and different correlation structures. Finally, the ability of the method to discover SSEs is demonstrated on complex data from large network simulations with embedded synfire chains. Thus, ASSET represents an effective and efficient tool to analyze massively parallel spike data for temporal sequences of synchronous activity. PMID:27420734

  4. MPI/OpenMP Hybrid Parallel Algorithm of Resolution of Identity Second-Order Møller-Plesset Perturbation Calculation for Massively Parallel Multicore Supercomputers.

    PubMed

    Katouda, Michio; Nakajima, Takahito

    2013-12-10

    A new algorithm for massively parallel calculations of electron correlation energy of large molecules based on the resolution of identity second-order Møller-Plesset perturbation (RI-MP2) technique is developed and implemented into the quantum chemistry software NTChem. In this algorithm, a Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) hybrid parallel programming model is applied to attain efficient parallel performance on massively parallel supercomputers. An in-core storage scheme of intermediate data of three-center electron repulsion integrals utilizing the distributed memory is developed to eliminate input/output (I/O) overhead. The parallel performance of the algorithm is tested on massively parallel supercomputers such as the K computer (using up to 45 992 central processing unit (CPU) cores) and a commodity Intel Xeon cluster (using up to 8192 CPU cores). The parallel RI-MP2/cc-pVTZ calculation of two-layer nanographene sheets (C150H30)2 (number of atomic orbitals is 9640) is performed using 8991 nodes and 71 288 CPU cores of the K computer. PMID:26592275
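
    The two-level parallel model can be sketched structurally in Python, with mpi4py standing in for the MPI layer and a thread pool for the OpenMP layer; the per-block work function below is hypothetical and merely represents an independent batch of integral contributions, not NTChem's kernels.

    ```python
    # Structural sketch of the two-level parallelism: mpi4py stands in for MPI,
    # a thread pool for OpenMP; block_energy is a hypothetical unit of work.
    # Run with e.g.: mpiexec -n 4 python hybrid_sketch.py
    import numpy as np
    from concurrent.futures import ThreadPoolExecutor
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    n_blocks = 64                                # e.g. batches of integral work
    my_blocks = range(rank, n_blocks, size)      # round-robin over MPI ranks

    def block_energy(b):
        x = np.sin(np.arange(1000) * (b + 1))   # placeholder numerical kernel
        return float(x @ x)

    with ThreadPoolExecutor(max_workers=4) as pool:   # shared-memory level
        local = sum(pool.map(block_energy, my_blocks))

    total = comm.allreduce(local, op=MPI.SUM)    # distributed-memory reduction
    if rank == 0:
        print(f"total = {total:.3f}")
    ```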

  5. User's Guide for TOUGH2-MP - A Massively Parallel Version of the TOUGH2 Code

    SciTech Connect

    Earth Sciences Division; Zhang, Keni; Zhang, Keni; Wu, Yu-Shu; Pruess, Karsten

    2008-05-27

    TOUGH2-MP is a massively parallel (MP) version of the TOUGH2 code, designed for computationally efficient parallel simulation of isothermal and nonisothermal flows of multicomponent, multiphase fluids in one, two, and three-dimensional porous and fractured media. In recent years, computational requirements have become increasingly intensive in large or highly nonlinear problems for applications in areas such as radioactive waste disposal, CO2 geological sequestration, environmental assessment and remediation, reservoir engineering, and groundwater hydrology. The primary objective of developing the parallel-simulation capability is to significantly improve the computational performance of the TOUGH2 family of codes. The particular goal for the parallel simulator is to achieve orders-of-magnitude improvement in computational time for models with ever-increasing complexity. TOUGH2-MP is designed to perform parallel simulation on multi-CPU computational platforms. An earlier version of TOUGH2-MP (V1.0) was based on TOUGH2 Version 1.4 with EOS3, EOS9, and T2R3D modules, software previously qualified for applications in the Yucca Mountain project, and was designed for execution on CRAY T3E and IBM SP supercomputers. The current version of TOUGH2-MP (V2.0) includes all fluid property modules of the standard version TOUGH2 V2.0. It provides computationally efficient capabilities using supercomputers, Linux clusters, or multi-core PCs, and also offers many user-friendly features. The parallel simulator inherits all process capabilities from V2.0 together with additional capabilities for handling fractured media from V1.4. This report provides a quick starting guide on how to set up and run the TOUGH2-MP program for users with a basic knowledge of running the (standard) version TOUGH2 code. The report also gives a brief technical description of the code, including a discussion of parallel methodology, code structure, as well as mathematical and numerical methods used.

  6. Compact Graph Representations and Parallel Connectivity Algorithms for Massive Dynamic Network Analysis

    SciTech Connect

    Madduri, Kamesh; Bader, David A.

    2009-02-15

    Graph-theoretic abstractions are extensively used to analyze massive data sets. Temporal data streams from socioeconomic interactions, social networking web sites, communication traffic, and scientific computing can be intuitively modeled as graphs. We present the first study of novel high-performance combinatorial techniques for analyzing large-scale information networks, encapsulating dynamic interaction data in the order of billions of entities. We present new data structures to represent dynamic interaction networks, and discuss algorithms for processing parallel insertions and deletions of edges in small-world networks. With these new approaches, we achieve an average performance rate of 25 million structural updates per second and a parallel speedup of nearly 28 on a 64-way Sun UltraSPARC T2 multicore processor, for insertions and deletions to a small-world network of 33.5 million vertices and 268 million edges. We also design parallel implementations of fundamental dynamic graph kernels related to connectivity and centrality queries. Our implementations are freely distributed as part of the open-source SNAP (Small-world Network Analysis and Partitioning) complex network analysis framework.
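
    A minimal sketch of the kind of dynamic-graph bookkeeping described, assuming an adjacency-set representation with streamed edge insertions and deletions and a plain BFS connectivity query; the actual SNAP kernels are far more refined and run in parallel.

    ```python
    from collections import defaultdict, deque

    class DynamicGraph:
        """Adjacency-set representation supporting streamed edge updates."""
        def __init__(self):
            self.adj = defaultdict(set)

        def insert(self, u, v):
            self.adj[u].add(v); self.adj[v].add(u)

        def delete(self, u, v):
            self.adj[u].discard(v); self.adj[v].discard(u)

        def connected(self, s, t):
            """Connectivity query via BFS (a stand-in for SNAP's kernels)."""
            seen, q = {s}, deque([s])
            while q:
                u = q.popleft()
                if u == t:
                    return True
                for w in self.adj[u] - seen:
                    seen.add(w); q.append(w)
            return False

    g = DynamicGraph()
    for e in [(0, 1), (1, 2), (3, 4)]:
        g.insert(*e)
    print(g.connected(0, 2), g.connected(0, 4))  # True False
    g.insert(2, 3)                               # streamed update
    print(g.connected(0, 4))                     # True
    ```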

  7. Microfluidic Reactor Array Device for Massively Parallel In-situ Synthesis of Oligonucleotides

    PubMed Central

    Srivannavit, Onnop; Gulari, Mayurachat; Hua, Zhishan; Gao, Xiaolian; Zhou, Xiaochuan; Hong, Ailing; Zhou, Tiecheng; Gulari, Erdogan

    2009-01-01

    We have designed and fabricated a microfluidic reactor array device for massively parallel in-situ synthesis of oligonucleotides (oDNA). The device is made of glass anodically bonded to silicon and consists of three levels of features: microreactors, microchannels, and inlet/outlet through-holes. Main challenges in the design of this device include preventing diffusion of photogenerated reagents upon activation and achieving uniform reagent flow through thousands of parallel reactors. The device embodies a simple and effective dynamic isolation mechanism which prevents the intermixing of active reagents between discrete microreactors. Depending on the design parameters, it is possible to achieve uniform flow and synthesis reaction in all of the reactors by proper design of the microreactors and the microchannels. We demonstrated the use of this device on a solution-based, light-directed parallel in-situ oDNA synthesis. We were able to synthesize long oDNA, up to 120-mers, at a stepwise yield of 98%. The quality of our microfluidic oDNA microarray, including sensitivity, signal noise, specificity, spot variation and accuracy, was characterized. Our microfluidic reactor array devices show great potential for genomics and proteomics research. PMID:20161215

  8. Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers

    NASA Technical Reports Server (NTRS)

    Morgan, Philip E.

    2004-01-01

    This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers" and "Develop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications." The discussion of Scalable High Performance Computing reports on three objectives: validate, assess the scalability of, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulations (DNS) and Large Eddy Simulation (LES) problems; and investigate and develop a high-order Reynolds-averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhance an electromagnetics code (CHARGE) to effectively model antenna problems; apply lessons learned from the high-order/spectral solution of swirling 3D jets to the electromagnetics project; transition a high-order fluids code, FDL3DI, to solving Maxwell's equations using compact differencing; develop and demonstrate improved radiation-absorbing boundary conditions for high-order CEM; and extend the high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.

  9. Climate system modeling on massively parallel systems: LDRD Project 95-ERP-47 final report

    SciTech Connect

    Mirin, A.A.; Dannevik, W.P.; Chan, B.; Duffy, P.B.; Eltgroth, P.G.; Wehner, M.F.

    1996-12-01

    Global warming, acid rain, ozone depletion, and biodiversity loss are some of the major climate-related issues presently being addressed by climate and environmental scientists. Because unexpected changes in the climate could have significant effect on our economy, it is vitally important to improve the scientific basis for understanding and predicting the earth's climate. The impracticality of modeling the earth experimentally in the laboratory together with the fact that the model equations are highly nonlinear has created a unique and vital role for computer-based climate experiments. However, today's computer models, when run at desired spatial and temporal resolution and physical complexity, severely overtax the capabilities of our most powerful computers. Parallel processing offers significant potential for attaining increased performance and making tractable simulations that cannot be performed today. The principal goals of this project have been to develop and demonstrate the capability to perform large-scale climate simulations on high-performance computing systems (using methodology that scales to the systems of tomorrow), and to carry out leading-edge scientific calculations using parallelized models. The demonstration platform for these studies has been the 256-processor Cray-T3D located at Lawrence Livermore National Laboratory. Our plan was to undertake an ambitious program in optimization, proof-of-principle and scientific study. These goals have been met. We are now regularly using massively parallel processors for scientific study of the ocean and atmosphere, and preliminary parallel coupled ocean/atmosphere calculations are being carried out as well. Furthermore, our work suggests that it should be possible to develop an advanced comprehensive climate system model with performance scalable to the teraflops range. 9 refs., 3 figs.

  10. Large-Scale Eigenvalue Calculations for Stability Analysis of Steady Flows on Massively Parallel Computers

    SciTech Connect

    Lehoucq, Richard B.; Salinger, Andrew G.

    1999-08-01

    We present an approach for determining the linear stability of steady states of PDEs on massively parallel computers. Linearizing the transient behavior around a steady state leads to a generalized eigenvalue problem. The eigenvalues with largest real part are calculated using Arnoldi's iteration driven by a novel implementation of the Cayley transformation to recast the problem as an ordinary eigenvalue problem. The Cayley transformation requires the solution of a linear system at each Arnoldi iteration, which must be done iteratively for the algorithm to scale with problem size. A representative model problem of 3D incompressible flow and heat transfer in a rotating disk reactor is used to analyze the effect of algorithmic parameters on the performance of the eigenvalue algorithm. Successful calculations of leading eigenvalues for matrix systems of order up to 4 million were performed, identifying the critical Grashof number for a Hopf bifurcation.
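
    The Cayley transformation recasts the generalized problem A x = λ B x as an ordinary eigenvalue problem for C = (A - σB)^{-1}(A - μB), whose eigenvalues θ = (λ - μ)/(λ - σ) are largest in magnitude for λ near the pole σ, so Arnoldi's iteration converges first to the eigenvalues of interest. Below is a hedged SciPy sketch for the special case B = I, using a toy diagonal A rather than the paper's Navier-Stokes Jacobian.

    ```python
    import numpy as np
    from scipy.sparse import diags, identity
    from scipy.sparse.linalg import LinearOperator, eigs, splu

    # Toy problem A x = lambda x with a known spectrum on [-1, 2] (B = I here;
    # the paper treats the generalized case with a mass matrix B).
    n = 500
    A = diags([np.linspace(-1.0, 2.0, n)], [0]).tocsc()
    sigma, mu = 2.5, -2.5                      # Cayley poles straddling the spectrum

    lu = splu(A - sigma * identity(n, format="csc"))  # factor once, reuse each step

    def cayley(v):                             # w = (A - sigma I)^-1 (A - mu I) v
        return lu.solve(A @ v - mu * v)

    C = LinearOperator((n, n), matvec=cayley, dtype=float)
    theta, V = eigs(C, k=4, which="LM")        # Arnoldi on the transformed operator
    lam = (sigma * theta - mu) / (theta - 1.0) # invert theta = (lam-mu)/(lam-sigma)
    print(np.sort(lam.real))                   # eigenvalues with largest real part
    ```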

  11. Massively Parallel Interrogation of the Effects of Gene Expression Levels on Fitness.

    PubMed

    Keren, Leeat; Hausser, Jean; Lotan-Pompan, Maya; Vainberg Slutskin, Ilya; Alisar, Hadas; Kaminski, Sivan; Weinberger, Adina; Alon, Uri; Milo, Ron; Segal, Eran

    2016-08-25

    Data on gene expression levels across individuals, cell types, and disease states is expanding, yet our understanding of how expression levels impact phenotype is limited. Here, we present a massively parallel system for assaying the effect of gene expression levels on fitness in Saccharomyces cerevisiae by systematically altering the expression level of ∼100 genes at ∼100 distinct levels spanning a 500-fold range at high resolution. We show that the relationship between expression levels and growth is gene and environment specific and provides information on the function, stoichiometry, and interactions of genes. Wild-type expression levels in some conditions are not optimal for growth, and genes whose fitness is greatly affected by small changes in expression level tend to exhibit lower cell-to-cell variability in expression. Our study addresses a fundamental gap in understanding the functional significance of gene expression regulation and offers a framework for evaluating the phenotypic effects of expression variation. PMID:27545349

  12. The sensitivity of massively parallel sequencing for detecting candidate infectious agents associated with human tissue.

    PubMed

    Moore, Richard A; Warren, René L; Freeman, J Douglas; Gustavsen, Julia A; Chénard, Caroline; Friedman, Jan M; Suttle, Curtis A; Zhao, Yongjun; Holt, Robert A

    2011-01-01

    Massively parallel sequencing technology now provides the opportunity to sample the transcriptome of a given tissue comprehensively. Transcripts at only a few copies per cell are readily detectable, allowing the discovery of low abundance viral and bacterial transcripts in human tissue samples. Here we describe an approach for mining large sequence data sets for the presence of microbial sequences. Further, we demonstrate the sensitivity of this approach by sequencing human RNA-seq libraries spiked with decreasing amounts of an RNA-virus. At a modest depth of sequencing, viral transcripts can be detected at frequencies less than 1 in 1,000,000. With current sequencing platforms approaching outputs of one billion reads per run, this is a highly sensitive method for detecting putative infectious agents associated with human tissues. PMID:21603639
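
    The sensitivity claim follows from simple sampling arithmetic: a transcript present at frequency f among N sequenced reads is seen at least once with probability 1 - (1 - f)^N. A worked example at the quoted 1-in-1,000,000 frequency:

    ```python
    f = 1e-6                              # viral transcript frequency: 1 in 1,000,000
    for n_reads in (1e6, 1e7, 1e8, 1e9):
        p = 1.0 - (1.0 - f) ** n_reads    # P(at least one viral read among n_reads)
        print(f"{n_reads:.0e} reads: P(detect) = {p:.4f}, "
              f"expected viral reads = {f * n_reads:.0f}")
    ```

    At 10^6 reads the detection probability is already about 0.63, and at the billion-read outputs mentioned in the abstract the expected count is around a thousand reads, which is why such low-frequency agents become readily detectable.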

  13. 3-D readout-electronics packaging for high-bandwidth massively paralleled imager

    DOEpatents

    Kwiatkowski, Kris; Lyke, James

    2007-12-18

    Dense, massively parallel signal processing electronics are co-packaged behind associated sensor pixels. Microchips containing a linear or bilinear arrangement of photo-sensors, together with associated complex electronics, are integrated into a simple 3-D structure (a "mirror cube"). An array of photo-sensitive cells is disposed on a stacked CMOS chip's surface at a 45° angle from light-reflecting mirror surfaces formed on a neighboring CMOS chip surface. Image processing electronics are held within the stacked CMOS chip layers. Electrical connections couple each of said stacked CMOS chip layers and a distribution grid, the connections distributing power and signals to components associated with each stacked CMOS chip layer.

  14. Demonstration of EDA flow for massively parallel e-beam lithography

    NASA Astrophysics Data System (ADS)

    Brandt, P.; Belledent, J.; Tranquillin, C.; Figueiro, T.; Meunier, S.; Bayle, S.; Fay, A.; Milléquant, M.; Icard, B.; Wieland, M.

    2014-03-01

    The soaring complexity of pushing 193nm immersion lithography to its limits drives the development of alternative technologies. One of these alternatives is maskless massively parallel electron-beam lithography (MP-EBL), a promising candidate with which future resolution needs can be fulfilled at competitive cost. MAPPER Lithography's MATRIX MP-EBL platform has currently entered an advanced stage of development. The first tool in this platform, the FLX 1200, will operate using more than 1,300 beams, each one writing a stripe 2.2μm wide; a 0.2μm overlap from stripe to stripe is allocated for stitching. Each beam is composed of 49 individual sub-beams that can be blanked independently in order to write pixels onto the wafer in a raster scan.

  15. Simulating massively parallel electron beam inspection for sub-20 nm defects

    NASA Astrophysics Data System (ADS)

    Bunday, Benjamin D.; Mukhtar, Maseeh; Quoi, Kathy; Thiel, Brad; Malloy, Matt

    2015-03-01

    SEMATECH has initiated a program to develop massively-parallel electron beam defect inspection (MPEBI). Here we use JMONSEL simulations to generate expected imaging responses of chosen test cases of patterns and defects, with the ability to vary parameters for beam energy, spot size, pixel size, and/or defect material and form factor. The patterns are representative of the design rules for an aggressively-scaled FinFET-type design. With these simulated images and the resulting shot noise, a signal-to-noise framework is developed which relates to defect detection probabilities. Additionally, with this infrastructure the effects of detection-chain noise and frequency-dependent system response can be assessed, allowing for targeting of the best recipe parameters for MPEBI validation experiments, ultimately leading to insights into how such parameters will impact MPEBI tool design, including the necessary doses for defect detection and estimates of scanning speeds for achieving high throughput for HVM.
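
    A hedged sketch of the shot-noise-limited signal-to-noise reasoning described: a simple threshold detector with a Gaussian approximation to Poisson counting noise. The detector model, contrast value, and doses are illustrative stand-ins, not JMONSEL outputs.

    ```python
    import math

    def p_detect(signal_e, background_e, n_sigma=3.0):
        """Threshold detector: flag a defect when a pixel's count exceeds
        background + n_sigma * sqrt(background); Gaussian shot-noise model."""
        threshold = background_e + n_sigma * math.sqrt(background_e)
        z = (threshold - signal_e) / math.sqrt(signal_e)
        return 0.5 * math.erfc(z / math.sqrt(2.0))

    # Doubling the dose improves SNR by sqrt(2), raising detection probability
    for dose in (100, 200, 400):   # electrons per pixel; defect gives 20% contrast
        print(dose, round(p_detect(1.2 * dose, dose), 3))
    ```

    This kind of curve is what ties required dose to detection probability, and in turn to the scanning speeds achievable for a given throughput target.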

  16. A Massively Parallel Sparse Eigensolver for Structural Dynamics Finite Element Analysis

    SciTech Connect

    Day, David M.; Reese, G.M.

    1999-05-01

    Eigenanalysis is a critical component of structural dynamics which is essential for determining the vibrational response of systems. This effort addresses the development of numerical algorithms associated with scalable eigensolver techniques suitable for use on massively parallel, distributed memory computers that are capable of solving large scale structural dynamics problems. An iterative Lanczos method was determined to be the best choice for the application. Scalability of the eigenproblem depends on scalability of the underlying linear solver. A multi-level solver (FETI) was selected as most promising for this component. Issues relating to heterogeneous materials, mechanisms and multipoint constraints have been examined, and the linear solver algorithm has been developed to incorporate features that result in a scalable, robust algorithm for practical structural dynamics applications. The resulting tools have been demonstrated on large problems representative of a weapons system.

  17. Inside the intraterrestrials: The deep biosphere seen through massively parallel sequencing

    NASA Astrophysics Data System (ADS)

    Biddle, J.

    2009-12-01

    Deeply buried marine sediments may house a large fraction of the Earth’s microbial population. Initial studies based on 16S rRNA clone libraries suggest that these sediments contain unique phylotypes of microorganisms, particularly from the archaeal domain. Since this environment is so difficult to study, microbiologists are challenged to find ways to examine these populations remotely. A major approach taken to study this environment uses massively parallel sequencing to examine the inner genetic workings of these microorganisms after the sediment has been drilled. Both metagenomics and tagged amplicon sequencing have been employed on deep sediments, and initial results show that different geographic regions can be differentiated through genomics, and also that minor populations may cause major geochemical changes.

  18. Macro-scale phenomena of arterial coupled cells: a massively parallel simulation

    PubMed Central

    Shaikh, Mohsin Ahmed; Wall, David J. N.; David, Tim

    2012-01-01

    Impaired mass transfer characteristics of blood-borne vasoactive species such as adenosine triphosphate in regions such as an arterial bifurcation have been hypothesized as a prospective mechanism in the aetiology of atherosclerotic lesions. Arterial endothelial cells (ECs) and smooth muscle cells (SMCs) respond differentially to altered local haemodynamics and produce coordinated macro-scale responses via intercellular communication. Using a computationally designed arterial segment comprising large populations of mathematically modelled coupled ECs and SMCs, we investigate their response to spatial gradients of blood-borne agonist concentrations and the effect of micro-scale-driven perturbation on the macro-scale. Altering homocellular (between same cell type) and heterocellular (between different cell types) intercellular coupling, we simulated four cases of normal and pathological arterial segments experiencing an identical gradient in the concentration of the agonist. Results show that the heterocellular calcium (Ca2+) coupling between ECs and SMCs is important in eliciting a rapid response when the vessel segment is stimulated by the agonist gradient. In the absence of heterocellular coupling, homocellular Ca2+ coupling between SMCs is necessary for propagation of Ca2+ waves from downstream to upstream cells axially. Desynchronized intracellular Ca2+ oscillations in coupled SMCs are mandatory for this propagation. Upon decoupling the heterocellular membrane potential, the arterial segment loses the inhibitory effect of ECs on the Ca2+ dynamics of the underlying SMCs. The full system comprises hundreds of thousands of coupled nonlinear ordinary differential equations simulated on the massively parallel Blue Gene architecture. The use of massively parallel computational architectures shows the capability of this approach to address macro-scale phenomena driven by elementary micro-scale components of the system. PMID:21920960

  19. Massively parallel cis-regulatory analysis in the mammalian central nervous system

    PubMed Central

    Shen, Susan Q.; Myers, Connie A.; Hughes, Andrew E.O.; Byrne, Leah C.; Flannery, John G.; Corbo, Joseph C.

    2016-01-01

    Cis-regulatory elements (CREs, e.g., promoters and enhancers) regulate gene expression, and variants within CREs can modulate disease risk. Next-generation sequencing has enabled the rapid generation of genomic data that predict the locations of CREs, but a bottleneck lies in functionally interpreting these data. To address this issue, massively parallel reporter assays (MPRAs) have emerged, in which barcoded reporter libraries are introduced into cells, and the resulting barcoded transcripts are quantified by next-generation sequencing. Thus far, MPRAs have been largely restricted to assaying short CREs in a limited repertoire of cultured cell types. Here, we present two advances that extend the biological relevance and applicability of MPRAs. First, we adapt exome capture technology to instead capture candidate CREs, thereby tiling across the targeted regions and markedly increasing the length of CREs that can be readily assayed. Second, we package the library into adeno-associated virus (AAV), thereby allowing delivery to target organs in vivo. As a proof of concept, we introduce a capture library of about 46,000 constructs, corresponding to roughly 3500 DNase I hypersensitive (DHS) sites, into the mouse retina by ex vivo plasmid electroporation and into the mouse cerebral cortex by in vivo AAV injection. We demonstrate tissue-specific cis-regulatory activity of DHSs and provide examples of high-resolution truncation mutation analysis for multiplex parsing of CREs. Our approach should enable massively parallel functional analysis of a wide range of CREs in any organ or species that can be infected by AAV, such as nonhuman primates and human stem cell–derived organoids. PMID:26576614

  20. Architecture for next-generation massively parallel maskless lithography system (MPML2)

    NASA Astrophysics Data System (ADS)

    Su, Ming-Shing; Tsai, Kuen-Yu; Lu, Yi-Chang; Kuo, Yu-Hsuan; Pei, Ting-Hang; Yen, Jia-Yush

    2010-03-01

    Electron-beam lithography is promising for future manufacturing technology because it does not suffer from the wavelength limits set by light sources. Since single electron-beam lithography systems have a common problem in throughput, a multi-electron-beam lithography (MEBL) system should be a feasible alternative using the concept of massive parallelism. In this paper, we evaluate the advantages and disadvantages of different MEBL system architectures and propose our novel Massively Parallel MaskLess Lithography System, MPML2. The MPML2 system targets cost-effective manufacturing at the 32nm node and beyond. The key structure of the proposed system is its beamlet array cells (BACs). Hundreds of BACs are uniformly arranged over the whole wafer area in the proposed system. Each BAC has a data processor and an array of beamlets, and each beamlet consists of an electron-beam source, a source controller, a set of electron lenses, a blanker, a deflector, and an electron detector. These essential parts of the beamlets are integrated using MEMS technology, which increases the density of beamlets and reduces the system cost. The data processor in the BAC processes layout information coming from off-chamber and dispatches it to the corresponding beamlet to control its ON/OFF status. Maskless lithography systems save the high manufacturing cost of masks; however, immense volumes of mask data must be handled and transmitted. Therefore, a data compression technique is applied to reduce the required transmission bandwidth. The compression algorithm is fast and efficient, so that a real-time decoder can be implemented on-chip. Consequently, the proposed MPML2 can achieve a throughput of 10 wafers per hour (WPH) for 300mm-wafer systems.

  1. Tubal ligation

    MedlinePlus

    Sterilization surgery - female; Tubal sterilization; Tube tying; Tying the tubes; Hysteroscopic tubal occlusion procedure ... Tubal ligation is done in a hospital or outpatient clinic. You may receive general anesthesia. You will be ...

  2. Massively parallel simulation with DOE's ASCI supercomputers : an overview of the Los Alamos Crestone project

    SciTech Connect

    Weaver, R. P.; Gittings, M. L.

    2004-01-01

    The Los Alamos Crestone Project is part of the Department of Energy's (DOE) Accelerated Strategic Computing Initiative, or ASCI Program. The main goal of this software development project is to investigate the use of continuous adaptive mesh refinement (CAMR) techniques for application to problems of interest to the Laboratory. There are many code development efforts in the Crestone Project, both unclassified and classified codes. In this overview I will discuss the unclassified SAGE and the RAGE codes. The SAGE (SAIC adaptive grid Eulerian) code is a one-, two-, and three-dimensional multimaterial Eulerian massively parallel hydrodynamics code for use in solving a variety of high-deformation flow problems. The RAGE CAMR code is built from the SAGE code by adding various radiation packages, improved setup utilities and graphics packages and is used for problems in which radiation transport of energy is important. The goal of these massively-parallel versions of the codes is to run extremely large problems in a reasonable amount of calendar time. Our target is scalable performance to ~10,000 processors on a 1 billion CAMR computational cell problem that requires hundreds of variables per cell, multiple physics packages (e.g. radiation and hydrodynamics), and implicit matrix solves for each cycle. A general description of the RAGE code has been published in [1], [2], [3] and [4]. Currently, the largest simulations we do are three-dimensional, using around 500 million computation cells and running for literally months of calendar time using ~2000 processors. Current ASCI platforms range from several 3-teraOPS supercomputers to one 12-teraOPS machine at Lawrence Livermore National Laboratory, the White machine, and one 20-teraOPS machine installed at Los Alamos, the Q machine. Each machine is a system comprised of many component parts that must perform in unity for the successful run of these simulations. Key features of any massively parallel system

  3. Integration Architecture of Content Addressable Memory and Massive-Parallel Memory-Embedded SIMD Matrix for Versatile Multimedia Processor

    NASA Astrophysics Data System (ADS)

    Kumaki, Takeshi; Ishizaki, Masakatsu; Koide, Tetsushi; Mattausch, Hans Jürgen; Kuroda, Yasuto; Gyohten, Takayuki; Noda, Hideyuki; Dosaka, Katsumi; Arimoto, Kazutami; Saito, Kazunori

    This paper presents an integration architecture of content addressable memory (CAM) and a massive-parallel memory-embedded SIMD matrix for constructing a versatile multimedia processor. The massive-parallel memory-embedded SIMD matrix has 2,048 2-bit processing elements, which are connected by a flexible switching network, and supports 2-bit 2,048-way bit-serial and word-parallel operations with a single command. The SIMD matrix architecture is verified to be a better way of processing the repeated arithmetic operation types in multimedia applications. The proposed architecture, reported in this paper, additionally exploits CAM technology and therefore enables fast pipelined table-lookup coding operations. Since both arithmetic and table-lookup operations execute extremely fast, the proposed novel architecture can consequently realize efficient and versatile multimedia data processing. Evaluation results of the proposed CAM-enhanced massive-parallel SIMD matrix processor for the example of the frequently used JPEG image-compression application show that the necessary clock cycle number can be reduced by 86% in comparison to a conventional mobile DSP architecture. The determined performances in Mpixel/mm2 are factors of 3.3 and 4.4 better than those of a CAM-less massive-parallel memory-embedded SIMD matrix processor and a conventional mobile DSP, respectively.

  4. Massive Exploration of Perturbed Conditions of the Blood Coagulation Cascade through GPU Parallelization

    PubMed Central

    Cazzaniga, Paolo; Nobile, Marco S.; Besozzi, Daniela; Bellini, Matteo; Mauri, Giancarlo

    2014-01-01

    The introduction of general-purpose Graphics Processing Units (GPUs) is boosting scientific applications in Bioinformatics, Systems Biology, and Computational Biology. In these fields, the use of high-performance computing solutions is motivated by the need to perform large numbers of in silico analyses to study the behavior of biological systems in different conditions, which requires computing power that usually exceeds the capability of standard desktop computers. In this work we present coagSODA, a CUDA-powered computational tool that was purposely developed for the analysis of a large mechanistic model of the blood coagulation cascade (BCC), defined according to both mass-action kinetics and Hill functions. coagSODA allows the execution of parallel simulations of the dynamics of the BCC by automatically deriving the system of ordinary differential equations and then exploiting the numerical integration algorithm LSODA. We present the biological results achieved with a massive exploration of perturbed conditions of the BCC, carried out with one-dimensional and bi-dimensional parameter sweep analysis, and show that GPU-accelerated parallel simulations of this model can increase the computational performance up to a 181× speedup compared to the corresponding sequential simulations. PMID:25025072
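
    A miniature of the parameter-sweep workflow, assuming a toy two-species mass-action cascade and SciPy's LSODA integrator in place of the full BCC model and the CUDA backend; every integration in the sweep is independent, which is exactly the structure coagSODA parallelizes across GPU threads.

    ```python
    import numpy as np
    from scipy.integrate import solve_ivp

    def mass_action(t, y, k1, k2):
        """Toy cascade A -> B -> degraded, mass-action kinetics."""
        a, b = y
        return [-k1 * a, k1 * a - k2 * b]

    # One-dimensional sweep over the rate constant k1
    k1_values = np.logspace(-2, 1, 50)
    peak_b = [solve_ivp(mass_action, (0.0, 20.0), [1.0, 0.0],
                        args=(k1, 0.5), method="LSODA").y[1].max()
              for k1 in k1_values]
    print(f"peak [B] spans {min(peak_b):.3f} to {max(peak_b):.3f} across the sweep")
    ```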

  5. Practical Realization of Massively Parallel Fiber-Free-Space Optical Interconnects

    NASA Astrophysics Data System (ADS)

    Gruber, Matthias; Jahns, Jürgen; El Joudi, El Mehdi; Sinzinger, Stefan

    2001-06-01

    We propose a novel approach to realizing massively parallel optical interconnects based on commercially available multifiber ribbons with MT-type connectors and custom-designed planar-integrated free-space components. It combines the advantages of fiber optics, that is, a long range and convenient and flexible installation, with those of (planar-integrated) free-space optics, that is, a wide range of implementable functions and a high potential for integration and parallelization. For the interface between fibers and free-space optical systems a low-cost practical solution is presented. It consists of using a metal connector plate that was manufactured on a computer-controlled milling machine. Channel densities are of the order of 100/mm2 between optoelectronic VLSI chips and the free-space optical systems and 1/mm2 between the free-space optical systems and MT-type fiber connectors. Experiments in combination with specially designed planar-integrated test systems prove that multiple one-to-one and one-to-many interconnects can be established with not more than 10% uniformity error.

  6. Measures of effectiveness for BMD mid-course tracking on MIMD massively parallel computers

    SciTech Connect

    VanDyke, J.P.; Tomkins, J.L.; Furnish, M.D.

    1995-05-01

    The TRC code, a mid-course tracking code for ballistic missiles, has previously been implemented on a 1024-processor MIMD (Multiple Instruction -- Multiple Data) massively parallel computer. Measures of Effectiveness (MOE) for this algorithm have been developed for this computing environment. The MOE code is run in parallel with the TRC code. Particularly useful MOEs include the number of missed objects (real objects for which the TRC algorithm did not construct a track); of ghost tracks (tracks not corresponding to a real object); of redundant tracks (multiple tracks corresponding to a single real object); and of unresolved objects (multiple objects corresponding to a single track). All of these are expressed as a function of time, and tend to maximize during the time in which real objects are spawned (multiple reentry vehicles per post-boost vehicle). As well, it is possible to measure the track-truth separation as a function of time. A set of calculations is presented illustrating these MOEs as a function of time for a case with 99 post-boost vehicles, each of which spawns 9 reentry vehicles.
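
    Given an assignment of each track to its best-matching truth object at a time step, the listed MOEs reduce to set bookkeeping. A minimal sketch under that assumption (the unresolved-objects MOE would additionally need the inverse object-to-track mapping, omitted here):

    ```python
    from collections import Counter

    def moes(assignments, n_objects):
        """assignments: truth-object id per track (None = no matching object).
        Returns three of the MOEs for a single time step."""
        matched = [a for a in assignments if a is not None]
        per_object = Counter(matched)
        return {
            "missed": n_objects - len(per_object),            # objects with no track
            "ghost": sum(1 for a in assignments if a is None),  # tracks w/o object
            "redundant": sum(c - 1 for c in per_object.values()),  # extra tracks
        }

    # 5 true objects; two tracks follow object 1, no track follows object 4
    print(moes([0, 1, 1, 2, None, 3], n_objects=5))
    # {'missed': 1, 'ghost': 1, 'redundant': 1}
    ```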

  7. Massive parallel IGHV gene sequencing reveals a germinal center pathway in origins of human multiple myeloma

    PubMed Central

    Bryant, Dean; Seckinger, Anja; Hose, Dirk; Zojer, Niklas; Sahota, Surinder S.

    2015-01-01

    Human multiple myeloma (MM) is characterized by accumulation of malignant terminally differentiated plasma cells (PCs) in the bone marrow (BM), raising the question when during maturation neoplastic transformation begins. Immunoglobulin IGHV genes carry imprints of clonal tumor history, delineating somatic hypermutation (SHM) events that generally occur in the germinal center (GC). Here, we examine MM-derived IGHV genes using massive parallel deep sequencing, comparing them with profiles in normal BM PCs. In 4/4 presentation IgG MM, monoclonal tumor-derived IGHV sequences revealed significant evidence for intraclonal variation (ICV) in mutation patterns. IGHV sequences of 2/2 normal PC IgG populations revealed dominant oligoclonal expansions, each expansion also displaying mutational ICV. Clonal expansions in MM and in normal BM PCs reveal common IGHV features. In such MM, the data fit a model of tumor origins in which neoplastic transformation is initiated in a GC B-cell committed to terminal differentiation but still targeted by on-going SHM. Strikingly, the data parallel IGHV clonal sequences in some monoclonal gammopathy of undetermined significance (MGUS) known to display on-going SHM imprints. Since MGUS generally precedes MM, these data suggest origins of MGUS and MM with IGHV gene mutational ICV from the same GC B-cell, arising via a distinctive pathway. PMID:25929340

  8. The divide-expand-consolidate MP2 scheme goes massively parallel

    NASA Astrophysics Data System (ADS)

    Kristensen, Kasper; Kjærgaard, Thomas; Høyvik, Ida-Marie; Ettenhuber, Patrick; Jørgensen, Poul; Jansik, Branislav; Reine, Simen; Jakowski, Jacek

    2013-07-01

    For large molecular systems, conventional implementations of second-order Møller-Plesset (MP2) theory encounter a scaling wall, both memory- and time-wise. We describe how this scaling wall can be removed. We present a massively parallel algorithm for calculating MP2 energies and densities using the divide-expand-consolidate scheme where a calculation on a large system is divided into many small fragment calculations employing local orbital spaces. The resulting algorithm is linear-scaling with system size, exhibits near perfect parallel scalability, removes memory bottlenecks and does not involve any I/O. The algorithm employs three levels of parallelisation combined via a dynamic job distribution scheme. Results on two molecular systems containing 528 and 1056 atoms (4278 and 8556 basis functions) using 47,120 and 94,240 cores are presented. The results demonstrate the scalability of the algorithm both with respect to the number of cores and with respect to system size. The presented algorithm is thus highly suited for large supercomputer architectures and allows MP2 calculations on large molecular systems to be carried out within a few hours - for example, the correlated calculation on the molecular system containing 1056 atoms took 2.37 hours using 94,240 cores.
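
    The essence of the scheme's parallelism, many independent fragment calculations handed out dynamically to idle workers, can be sketched with Python's multiprocessing as a stand-in for the paper's three-level MPI job distribution; the fragment energy function below is hypothetical placeholder work.

    ```python
    import multiprocessing as mp

    def fragment_energy(fragment_id):
        """Stand-in for an independent local-orbital fragment calculation."""
        return sum((fragment_id + 1) / (i + 1) ** 2 for i in range(10_000))

    if __name__ == "__main__":
        fragments = range(32)
        with mp.Pool(processes=4) as pool:
            # imap_unordered gives dynamic load balancing: each worker pulls
            # the next fragment as soon as it finishes the previous one
            e_corr = sum(pool.imap_unordered(fragment_energy, fragments))
        print(f"total correlation-like energy: {e_corr:.4f}")
    ```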

  9. Massively parallel electrical conductivity imaging of the subsurface: Applications to hydrocarbon exploration

    SciTech Connect

    Newman, G.A.; Commer, M.

    2009-06-01

    Three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. The imaging technology employs controlled source electromagnetic (CSEM) and magnetotelluric (MT) fields and treats geological media exhibiting transverse anisotropy. Moreover when combined with established seismic methods, direct imaging of reservoir fluids is possible. Because of the size of the 3D conductivity imaging problem, strategies are required exploiting computational parallelism and optimal meshing. The algorithm thus developed has been shown to scale to tens of thousands of processors. In one imaging experiment, 32,768 tasks/processors on the IBM Watson Research Blue Gene/L supercomputer were successfully utilized. Over a 24 hour period we were able to image a large scale field data set that previously required over four months of processing time on distributed clusters based on Intel or AMD processors utilizing 1024 tasks on an InfiniBand fabric. Electrical conductivity imaging using massively parallel computational resources produces results that cannot be obtained otherwise and are consistent with timeframes required for practical exploration problems.

  10. Rigid body constraints realized in massively-parallel molecular dynamics on graphics processing units

    NASA Astrophysics Data System (ADS)

    Nguyen, Trung Dac; Phillips, Carolyn L.; Anderson, Joshua A.; Glotzer, Sharon C.

    2011-11-01

    Molecular dynamics (MD) methods compute the trajectory of a system of point particles in response to a potential function by numerically integrating Newton's equations of motion. Extending these basic methods with rigid body constraints enables composite particles with complex shapes such as anisotropic nanoparticles, grains, molecules, and rigid proteins to be modeled. Rigid body constraints are added to the GPU-accelerated MD package, HOOMD-blue, version 0.10.0. The software can now simulate systems of particles, rigid bodies, or mixed systems in microcanonical (NVE), canonical (NVT), and isothermal-isobaric (NPT) ensembles. It can also apply the FIRE energy minimization technique to these systems. In this paper, we detail the massively parallel scheme that implements these algorithms and discuss how our design is tuned for the maximum possible performance. Two different case studies are included to demonstrate the performance attained, patchy spheres and tethered nanorods. In typical cases, HOOMD-blue on a single GTX 480 executes 2.5-3.6 times faster than LAMMPS executing the same simulation on any number of CPU cores in parallel. Simulations with rigid bodies may now be run with larger systems and for longer time scales on a single workstation than was previously even possible on large clusters.

  11. A two-phase thermal model for subsurface transport on massively parallel computers

    SciTech Connect

    Martinez, M.J.; Hopkins, P.L.

    1997-12-01

    Many research activities in subsurface transport require the numerical simulation of multiphase flow in porous media. This capability is critical to research in environmental remediation (e.g. contamination by dense, non-aqueous-phase liquids), nuclear waste management, reservoir engineering, and to the assessment of the future availability of groundwater in many parts of the world. This paper presents an unstructured grid numerical algorithm for subsurface transport in heterogeneous porous media implemented for use on massively parallel (MP) computers. The mathematical model considers nonisothermal two-phase (liquid/gas) flow, including capillary pressure effects, binary diffusion in the gas phase, and conductive, latent, and sensible heat transport. The Galerkin finite element method is used for spatial discretization, and temporal integration is accomplished via a predictor/corrector scheme. Message-passing and domain decomposition techniques are used for implementing a scalable algorithm for distributed memory parallel computers. Illustrative applications are shown to demonstrate capabilities and performance, one of which is modeling hydrothermal transport at the Yucca Mountain site for a radioactive waste facility.

  12. GPAW - massively parallel electronic structure calculations with Python-based software.

    SciTech Connect

    Enkovaara, J.; Romero, N.; Shende, S.; Mortensen, J.

    2011-01-01

    Electronic structure calculations are a widely used tool in materials science and a large consumer of supercomputing resources. Traditionally, the software packages for this kind of simulation have been implemented in compiled languages, where Fortran in its different versions has been the most popular choice. While dynamic, interpreted languages, such as Python, can increase the efficiency of the programmer, they cannot compete directly with the raw performance of compiled languages. However, by using an interpreted language together with a compiled language, it is possible to have most of the productivity-enhancing features together with good numerical performance. We have used this approach in implementing the electronic structure simulation software GPAW using the combination of the Python and C programming languages. While the chosen approach works well on standard workstations and in Unix environments, massively parallel supercomputing systems can present some challenges in porting, debugging and profiling the software. In this paper we describe some details of the implementation and discuss the advantages and challenges of the combined Python/C approach. We show that despite the challenges it is possible to obtain good numerical performance and good parallel scalability with Python-based software.
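
    The division of labor described, interpreted orchestration over compiled numerical kernels, can be illustrated in miniature: below, Python drives a spectral Poisson solve while NumPy's compiled FFT does the flop-heavy work. This is a toy under that analogy, not GPAW's implementation, which uses custom C extensions.

    ```python
    import numpy as np

    def poisson_solve(rho, L=1.0):
        """Periodic Poisson solve, laplacian(phi) = -rho, spectral method.
        Python orchestrates; numpy's compiled FFT does the heavy lifting."""
        n = rho.shape[0]
        k = 2.0 * np.pi * np.fft.fftfreq(n, d=L / n)
        kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
        k2 = kx**2 + ky**2 + kz**2
        k2[0, 0, 0] = 1.0                  # dodge 0/0 for the zero mode
        phi_k = np.fft.fftn(rho) / k2
        phi_k[0, 0, 0] = 0.0               # pin the arbitrary constant
        return np.fft.ifftn(phi_k).real

    # Single Fourier mode: phi should equal rho / (2*pi)^2 exactly
    n = 32
    x = np.arange(n) / n
    rho = np.sin(2 * np.pi * x)[:, None, None] * np.ones((1, n, n))
    print(np.allclose(poisson_solve(rho), rho / (2 * np.pi) ** 2))   # True
    ```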

  13. Massively parallel electrical conductivity imaging of the subsurface: Applications to hydrocarbon exploration

    NASA Astrophysics Data System (ADS)

    Newman, Gregory A.; Commer, Michael

    2009-07-01

    Three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. The imaging technology employs controlled source electromagnetic (CSEM) and magnetotelluric (MT) fields and treats geological media exhibiting transverse anisotropy. Moreover when combined with established seismic methods, direct imaging of reservoir fluids is possible. Because of the size of the 3D conductivity imaging problem, strategies are required exploiting computational parallelism and optimal meshing. The algorithm thus developed has been shown to scale to tens of thousands of processors. In one imaging experiment, 32,768 tasks/processors on the IBM Watson Research Blue Gene/L supercomputer were successfully utilized. Over a 24 hour period we were able to image a large scale field data set that previously required over four months of processing time on distributed clusters based on Intel or AMD processors utilizing 1024 tasks on an InfiniBand fabric. Electrical conductivity imaging using massively parallel computational resources produces results that cannot be obtained otherwise and are consistent with timeframes required for practical exploration problems.

  14. Multi-mode sensor processing on a dynamically reconfigurable massively parallel processor array

    NASA Astrophysics Data System (ADS)

    Chen, Paul; Butts, Mike; Budlong, Brad; Wasson, Paul

    2008-04-01

    This paper introduces a novel computing architecture that can be reconfigured in real time to adapt on demand to multi-mode sensor platforms' dynamic computational and functional requirements. This 1 teraOPS reconfigurable Massively Parallel Processor Array (MPPA) has 336 32-bit processors. The programmable 32-bit communication fabric provides streamlined inter-processor connections with deterministically high performance. Software programmability, scalability, ease of use, and fast reconfiguration time (ranging from microseconds to milliseconds) are the most significant advantages over FPGAs and DSPs. This paper introduces the MPPA architecture, its programming model, and methods of reconfigurability. An MPPA platform for reconfigurable computing is based on a structural object programming model. Objects are software programs running concurrently on hundreds of 32-bit RISC processors and memories. They exchange data and control through a network of self-synchronizing channels. A common application design pattern on this platform, called a work farm, is a parallel set of worker objects, with one input and one output stream. Statically configured work farms with homogeneous and heterogeneous sets of workers have been used in video compression and decompression, network processing, and graphics applications.
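
    A hedged sketch of the work-farm pattern described (one input stream, a parallel set of workers, one output stream), using processes and queues in place of the MPPA's hardware channels; the squaring kernel stands in for, e.g., a video-coding worker object.

    ```python
    import multiprocessing as mp

    def worker(inbox, outbox):
        """One worker object: read the input channel, write the output channel."""
        for item in iter(inbox.get, None):   # None is the shutdown sentinel
            outbox.put(item * item)          # stand-in for the real kernel

    if __name__ == "__main__":
        inbox, outbox = mp.Queue(), mp.Queue()
        farm = [mp.Process(target=worker, args=(inbox, outbox)) for _ in range(4)]
        for p in farm:
            p.start()
        for item in range(100):              # one input stream feeds the farm
            inbox.put(item)
        for _ in farm:                       # one sentinel per worker
            inbox.put(None)
        results = sorted(outbox.get() for _ in range(100))
        for p in farm:
            p.join()
        print(results[:5])                   # [0, 1, 4, 9, 16]
    ```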

  15. A Lightweight Remote Parallel Visualization Platform for Interactive Massive Time-varying Climate Data Analysis

    NASA Astrophysics Data System (ADS)

    Li, J.; Zhang, T.; Huang, Q.; Liu, Q.

    2014-12-01

    Today's climate datasets feature large volume, a high degree of spatiotemporal complexity, and fast evolution over time. As visualizing large distributed climate datasets is computationally intensive, traditional desktop-based visualization applications fail to handle the computational load. Recently, scientists have developed remote visualization techniques to address this computational issue. Remote visualization techniques usually leverage server-side parallel computing capabilities to perform visualization tasks and deliver visualization results to clients through the network. In this research, we aim to build a remote parallel visualization platform for visualizing and analyzing massive climate data. Our visualization platform was built on ParaView, which is one of the most popular open-source remote visualization and analysis applications. To further enhance the scalability and stability of the platform, we have employed cloud computing techniques to support its deployment. In this platform, all climate datasets are regular grid data stored in NetCDF format. Three types of data access methods are supported: accessing remote datasets provided by OpenDAP servers, accessing datasets hosted on the web visualization server, and accessing local datasets. Regardless of the data access method, all visualization tasks are completed on the server side to reduce the workload of clients. As a proof of concept, we have implemented a set of scientific visualization methods to show the feasibility of the platform. Preliminary results indicate that the framework can address the computational limitation of desktop-based visualization applications.

  16. Genetic algorithm based task reordering to improve the performance of batch scheduled massively parallel scientific applications

    DOE PAGES Beta

    Sankaran, Ramanan; Angel, Jordan; Brown, W. Michael

    2015-04-08

    The growth in size of networked high performance computers along with novel accelerator-based node architectures has further emphasized the importance of communication efficiency in high performance computing. The world's largest high performance computers are usually operated as shared user facilities due to the costs of acquisition and operation. Applications are scheduled for execution in a shared environment and are placed on nodes that are not necessarily contiguous on the interconnect. Furthermore, the placement of tasks on the nodes allocated by the scheduler is sub-optimal, leading to performance loss and variability. Here, we investigate the impact of task placement on the performance of two massively parallel application codes on the Titan supercomputer, a turbulent combustion flow solver (S3D) and a molecular dynamics code (LAMMPS). Benchmark studies show a significant deviation from ideal weak scaling and variability in performance. The inter-task communication distance was determined to be one of the significant contributors to the performance degradation and variability. A genetic algorithm-based parallel optimization technique was used to optimize the task ordering. This technique provides an improved placement of the tasks on the nodes, taking into account the application's communication topology and the system interconnect topology. As a result, application benchmarks after task reordering through genetic algorithm show a significant improvement in performance and reduction in variability, therefore enabling the applications to achieve better time to solution and scalability on Titan during production.

  17. Genetic algorithm based task reordering to improve the performance of batch scheduled massively parallel scientific applications

    SciTech Connect

    Sankaran, Ramanan; Angel, Jordan; Brown, W. Michael

    2015-04-08

    The growth in size of networked high performance computers along with novel accelerator-based node architectures has further emphasized the importance of communication efficiency in high performance computing. The world's largest high performance computers are usually operated as shared user facilities due to the costs of acquisition and operation. Applications are scheduled for execution in a shared environment and are placed on nodes that are not necessarily contiguous on the interconnect. Furthermore, the placement of tasks on the nodes allocated by the scheduler is sub-optimal, leading to performance loss and variability. Here, we investigate the impact of task placement on the performance of two massively parallel application codes on the Titan supercomputer, a turbulent combustion flow solver (S3D) and a molecular dynamics code (LAMMPS). Benchmark studies show a significant deviation from ideal weak scaling and variability in performance. The inter-task communication distance was determined to be one of the significant contributors to the performance degradation and variability. A genetic algorithm-based parallel optimization technique was used to optimize the task ordering. This technique provides an improved placement of the tasks on the nodes, taking into account the application's communication topology and the system interconnect topology. As a result, application benchmarks after task reordering through genetic algorithm show a significant improvement in performance and reduction in variability, therefore enabling the applications to achieve better time to solution and scalability on Titan during production.
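
    A toy of the underlying idea: search over task-to-slot permutations to minimize hop-weighted communication volume. This sketch assumes a 1-D chain interconnect and a nearest-neighbour traffic pattern, and uses a mutation-plus-selection evolutionary loop (crossover omitted for brevity); it is not the paper's optimizer.

    ```python
    import random

    random.seed(0)
    N = 16                                   # tasks == node slots on a 1-D chain
    traffic = [[0] * N for _ in range(N)]
    for i in range(N - 1):
        traffic[i][i + 1] = 10               # nearest-neighbour exchange pattern

    def cost(perm):                          # hop-weighted communication volume
        return sum(traffic[i][j] * abs(perm[i] - perm[j])
                   for i in range(N) for j in range(N) if traffic[i][j])

    def mutate(perm):                        # swap two slot assignments
        a, b = random.sample(range(N), 2)
        child = perm[:]
        child[a], child[b] = child[b], child[a]
        return child

    pop = [random.sample(range(N), N) for _ in range(40)]
    for _ in range(300):                     # (mu + lambda) evolutionary loop
        pop += [mutate(random.choice(pop)) for _ in range(40)]
        pop = sorted(pop, key=cost)[:40]     # truncation selection
    print(f"best cost: {cost(pop[0])} (chain-optimal is {10 * (N - 1)})")
    ```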

  18. Field-Scale, Massively Parallel Simulation of Production from Oceanic Gas Hydrate Deposits

    NASA Astrophysics Data System (ADS)

    Reagan, M. T.; Moridis, G. J.; Freeman, C. M.; Pan, L.; Boyle, K. L.; Johnson, J. N.; Husebo, J. A.

    2012-12-01

    The quantity of hydrocarbon gases trapped in natural hydrate accumulations is enormous, leading to significant interest in the evaluation of their potential as an energy source. It has been shown that large volumes of gas can be readily produced at high rates for long times from some types of methane hydrate accumulations by means of depressurization-induced dissociation, and using conventional technologies with horizontal or vertical well configurations. However, these systems are currently assessed using simplified or reduced-scale 3D or even 2D production simulations. In this study, we use the massively parallel TOUGH+HYDRATE code (pT+H) to assess the production potential of a large, deep-ocean hydrate reservoir and develop strategies for effective production. The simulations model a full 3D system of over 24 km2 extent, examining the productivity of vertical and horizontal wells, single or multiple wells, and explore variations in reservoir properties. Systems of up to 2.5M gridblocks, running on thousands of supercomputing nodes, are required to simulate such large systems at the highest level of detail. The simulations reveal the challenges inherent in producing from deep, relatively cold systems with extensive water-bearing channels and connectivity to large aquifers, including the difficulty of achieving depressurization, the challenges of high water removal rates, and the complexity of production design. Also highlighted are new frontiers in large-scale reservoir simulation of coupled flow, transport, thermodynamics, and phase behavior, including the construction of large meshes, the use of parallel numerical solvers and MPI, and large-scale, parallel 3D visualization of results.

  19. Ligation module for in vitro selection in DNA computing

    NASA Astrophysics Data System (ADS)

    van Noort, Danny; Lee, In-Hee; Landweber, Laura F.; Zhang, Byoung-Tak

    2005-02-01

    In this paper we propose solving a classical AI problem by DNA computing: theorem proving. Since the complexity grows exponentially with the size of the problem, the solving process should be done in parallel; massive parallelism is one of the advantages of DNA computers. It will be shown that the resolution refutation proof can be readily implemented by DNA hybridisation and ligation. Microreactors lend themselves to a relatively simple implementation of DNA computing. Not only is the design of the DNA critical for the success of the system, but also the architecture of the microfluidic structure. Here the DNA performs the computation, while the microfluidics aids the biochemical steps necessary to manipulate the DNA, i.e. hybridisation and ligation.
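
    The computation that the hybridisation and ligation steps implement is propositional resolution refutation: repeatedly resolve clause pairs on complementary literals until the empty clause appears (the statement is proved) or no new clauses can be derived. A minimal in-silico sketch, with clauses as frozensets of signed integer literals (+n / -n for variable n):

    ```python
    def resolve(c1, c2):
        """All resolvents of two clauses on complementary literal pairs."""
        return [frozenset((c1 - {lit}) | (c2 - {-lit}))
                for lit in c1 if -lit in c2]

    def refutable(clauses):
        """Saturation-based resolution: derive the empty clause iff refutable."""
        known = set(clauses)
        while True:
            new = {r for a in known for b in known for r in resolve(a, b)}
            if frozenset() in new:
                return True                  # empty clause: theorem proved
            if new <= known:
                return False                 # saturated without refutation
            known |= new

    # Prove q from p and (p -> q): refute the clause set {p}, {-p, q}, {-q}
    kb = [frozenset({1}), frozenset({-1, 2}), frozenset({-2})]
    print(refutable(kb))   # True
    ```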

  20. MicroRNA transcriptome in the newborn mouse ovaries determined by massive parallel sequencing.

    PubMed

    Ahn, Hyo Won; Morin, Ryan D; Zhao, Han; Harris, Ronald A; Coarfa, Cristian; Chen, Zi-Jiang; Milosavljevic, Aleksandar; Marra, Marco A; Rajkovic, Aleksandar

    2010-07-01

    Small non-coding RNAs, such as microRNAs (miRNAs), are involved in diverse biological processes including organ development and tissue differentiation. Global disruption of miRNA biogenesis in Dicer knockout mice disrupts early embryogenesis and primordial germ cell formation. However, the role of miRNAs in early folliculogenesis is poorly understood. In order to identify a full transcriptome set of small RNAs expressed in the newborn (NB) ovary, we extracted the small RNA fraction from mouse NB ovary tissues and subjected it to massive parallel sequencing using the Genome Analyzer from Illumina. Massive sequencing produced 4 655 992 reads of 33 bp each representing a total of 154 Mbp of sequence data. The Pash alignment algorithm mapped 50.13% of the reads to the mouse genome. Sequence reads were clustered based on overlapping mapping coordinates and intersected with known miRNAs, small nucleolar RNAs (snoRNAs), piwi-interacting RNA (piRNA) clusters and repetitive genomic regions; 25.2% of the reads mapped to known miRNAs, 25.5% to genomic repeats, 3.5% to piRNAs and 0.18% to snoRNAs. Three hundred and ninety-eight known miRNA species were among the sequenced small RNAs, as were 118 isomiR sequences that are not in the miRBase database. The let-7 family was the most abundantly expressed miRNA, and the mmu-mir-672, mmu-mir-322, mmu-mir-503 and mmu-mir-465 families were the most abundant X-linked miRNAs detected. The X-linked mmu-mir-503, mmu-mir-672 and mmu-mir-465 families showed preferential expression in testes and ovaries. We also identified four novel miRNAs that are preferentially expressed in gonads. Gonadal selective miRNAs may play important roles in ovarian development, folliculogenesis and female fertility. PMID:20215419
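
    The clustering step described above amounts to merging reads whose mapping coordinates overlap. A minimal sketch, with (chrom, start, end) tuples standing in for real alignments:

    ```python
    # Merge overlapping read mappings into clusters, which can then be
    # intersected with known miRNA/piRNA/snoRNA annotations.
    def cluster_reads(mappings):
        clusters = []
        for chrom, start, end in sorted(mappings):
            if clusters and clusters[-1][0] == chrom and start <= clusters[-1][2]:
                c, s, e = clusters[-1]
                clusters[-1] = (c, s, max(e, end))   # extend the open cluster
            else:
                clusters.append((chrom, start, end)) # start a new cluster
        return clusters

    reads = [("chr2", 100, 133), ("chr2", 120, 153), ("chr2", 500, 533)]
    print(cluster_reads(reads))  # [('chr2', 100, 153), ('chr2', 500, 533)]
    ```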

  1. Wideband aperture array using RF channelizers and massively parallel digital 2D IIR filterbank

    NASA Astrophysics Data System (ADS)

    Sengupta, Arindam; Madanayake, Arjuna; Gómez-García, Roberto; Engeberg, Erik D.

    2014-05-01

    Wideband receive-mode beamforming applications in wireless location, electronically-scanned antennas for radar, RF sensing, microwave imaging and wireless communications require digital aperture arrays that offer a relatively constant far-field beam over several octaves of bandwidth. Several beamforming schemes including the well-known true time-delay and the phased array beamformers have been realized using either finite impulse response (FIR) or fast Fourier transform (FFT) digital filter-sum based techniques. These beamforming algorithms offer the desired selectivity at the cost of a high computational complexity and frequency-dependent far-field array patterns. A novel approach to receiver beamforming is the use of massively parallel 2-D infinite impulse response (IIR) fan filterbanks for the synthesis of relatively frequency-independent RF beams at an order of magnitude lower multiplier complexity compared to FFT or FIR filter based conventional algorithms. The 2-D IIR filterbanks demand fast digital processing that can support several octaves of RF bandwidth, and fast analog-to-digital converters (ADCs) for RF-to-bits type direct conversion of wideband antenna element signals. Fast digital implementation platforms that can realize the high-precision recursive filter structures necessary for real-time beamforming, at RF radio bandwidths, are also desired. We propose a novel technique that combines a passive RF channelizer, multichannel ADC technology, and single-phase massively parallel 2-D IIR digital fan filterbanks, realized at low complexity using FPGA and/or ASIC technology. The architecture natively supports a bandwidth larger than the maximum clock frequency of the digital implementation technology. We also strive to achieve More-than-Moore throughput by processing a wideband RF signal having content with N-fold (B = N Fclk/2) bandwidth compared to the maximum clock frequency Fclk Hz of the digital VLSI platform under consideration. Such an increase in bandwidth is made possible by the combination of RF channelization and massively parallel digital processing.

  2. Switching dynamics of thin film ferroelectric devices - a massively parallel phase field study

    NASA Astrophysics Data System (ADS)

    Ashraf, Md. Khalid

    In this thesis, we investigate the switching dynamics in thin film ferroelectrics. Ferroelectric materials are of inherent interest for low power and multi-functional devices. However, possible device applications of these materials have been limited due to the poorly understood electromagnetic and mechanical response at the nanoscale in arbitrary device structures. The difficulty in understanding switching dynamics mainly arises from the presence of features at multiple length scales and the nonlinearity associated with the strongly coupled states. For example, in a ferroelectric material, the domain walls are of nm size whereas the domain pattern forms at micron scale. The switching is determined by coupled chemical, electrostatic, mechanical and thermal interactions. Thus computational understanding of switching dynamics in thin film ferroelectrics and a direct comparison with experiment poses a significant numerical challenge. We have developed a phase field model that describes the physics of polarization dynamics at the microscopic scale. A number of efficient numerical methods have been applied to achieve massive parallelization of all the calculation steps. Conformally mapped elements, node-wise assembly and prevention of dynamic loading minimized the communication between processors and increased the parallelization efficiency. With these improvements, we have reached the experimental scale - a significant step forward compared to state-of-the-art thin film ferroelectric switching dynamics models. Using this model, we elucidated the switching dynamics on multiple surfaces of the multiferroic material BFO. We also calculated the switching energy of scaled BFO islands. Finally, we studied the interaction of domain wall propagation with misfit dislocations in the thin film. We believe that the model will be useful in understanding the switching dynamics in many different experimental setups incorporating thin film ferroelectrics.
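
    At its core, a phase-field model of this kind relaxes the polarization by time-dependent Ginzburg-Landau dynamics, dP/dt = -L dF/dP. The 1D sketch below is a deliberately simplified illustration with invented coefficients, not the thesis code.

    ```python
    # One explicit TDGL step for a scalar polarization field with a
    # double-well bulk term, a gradient (domain wall) term, and an
    # applied electric field. Periodic boundaries, illustrative units.
    import numpy as np

    def tdgl_step(P, E, dt=0.01, L_mob=1.0, a=-1.0, b=1.0, kappa=1.0, dx=1.0):
        lap = (np.roll(P, 1) - 2 * P + np.roll(P, -1)) / dx**2
        dF_dP = a * P + b * P**3 - kappa * lap - E   # variational derivative
        return P - dt * L_mob * dF_dP

    P = np.where(np.arange(128) < 64, 1.0, -1.0)     # two domains, two walls
    for _ in range(1000):
        P = tdgl_step(P, E=0.05)                     # small field favors +P
    ```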

  3. Diffuse large B-cell lymphoma: sub-classification by massive parallel quantitative RT-PCR.

    PubMed

    Xue, Xuemin; Zeng, Naiyan; Gao, Zifen; Du, Ming-Qing

    2015-01-01

    Diffuse large B-cell lymphoma (DLBCL) is a heterogeneous entity with remarkably variable clinical outcome. Gene expression profiling (GEP) classifies DLBCL into activated B-cell like (ABC), germinal center B-cell like (GCB), and Type-III subtypes, with ABC-DLBCL characterized by a poor prognosis and constitutive NF-κB activation. A major challenge for the application of this cell of origin (COO) classification in routine clinical practice is to establish a robust clinical assay amenable to routine formalin-fixed paraffin-embedded (FFPE) diagnostic biopsies. In this study, we investigated the possibility of COO classification using FFPE tissue RNA samples by massive parallel quantitative reverse transcription PCR (qRT-PCR). We established a protocol for parallel qRT-PCR using FFPE RNA samples with the Fluidigm BioMark HD system, and quantified the expression of the COO classifier genes and the NF-κB target genes that characterize ABC-DLBCL in 143 cases of DLBCL. We also trained and validated a series of basic machine-learning classifiers and their derived meta-classifiers, and identified SimpleLogistic as the top classifier, giving excellent performance across various GEP data sets derived from fresh-frozen or FFPE tissues by different microarray platforms. Finally, we applied SimpleLogistic to our data set generated by qRT-PCR, and the assigned ABC- and GCB-DLBCL cases showed their respective characteristics in clinical outcome and NF-κB target gene expression. The methodology established in this study provides a robust approach for DLBCL sub-classification using FFPE diagnostic biopsies in a routine clinical setting. PMID:25418578
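
    As a rough illustration of the classification step, a logistic model can be trained on normalized expression values and applied to new cases. SimpleLogistic in the study is a Weka classifier; scikit-learn's LogisticRegression serves here only as a stand-in, and the expression values and labels are invented.

    ```python
    # Toy COO classifier: rows are cases, columns are classifier-gene
    # expression values (invented numbers, not study data).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    X_train = np.array([[2.1, 0.3, 1.8], [0.4, 2.2, 0.1],
                        [1.9, 0.5, 2.0], [0.2, 1.8, 0.3]])
    y_train = ["ABC", "GCB", "ABC", "GCB"]

    clf = LogisticRegression().fit(X_train, y_train)
    print(clf.predict([[1.7, 0.6, 1.5]]))   # e.g. ['ABC']
    ```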

  4. Hierarchical Image Segmentation of Remotely Sensed Data using Massively Parallel GNU-LINUX Software

    NASA Technical Reports Server (NTRS)

    Tilton, James C.

    2003-01-01

    A hierarchical set of image segmentations is a set of several image segmentations of the same image at different levels of detail in which the segmentations at coarser levels of detail can be produced from simple merges of regions at finer levels of detail. In [1], Tilton et al. describe an approach for producing hierarchical segmentations (called HSEG) and give a progress report on exploiting these hierarchical segmentations for image information mining. The HSEG algorithm is a hybrid of region growing and constrained spectral clustering that produces a hierarchical set of image segmentations based on detected convergence points. In the main, HSEG employs the hierarchical stepwise optimization (HSWO) approach to region growing, which was described as early as 1989 by Beaulieu and Goldberg. The HSWO approach seeks to produce segmentations that are more optimized than those produced by more classic approaches to region growing (e.g. Horowitz and T. Pavlidis, [3]). In addition, HSEG optionally interjects, between HSWO region growing iterations, merges between spatially non-adjacent regions (i.e., spectrally based merging or clustering) constrained by a threshold derived from the previous HSWO region growing iteration. While the addition of constrained spectral clustering improves the utility of the segmentation results, especially for larger images, it also significantly increases HSEG's computational requirements. To counteract this, a computationally efficient recursive, divide-and-conquer, implementation of HSEG (RHSEG) was devised, which includes special code to avoid processing artifacts caused by RHSEG's recursive subdivision of the image data. The recursive nature of RHSEG makes for a straightforward parallel implementation. This paper describes the HSEG algorithm, its recursive formulation (referred to as RHSEG), and the implementation of RHSEG using massively parallel GNU-LINUX software. Results with Landsat TM data are included comparing RHSEG with classic approaches to region growing.
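
    The HSWO merge loop at the heart of HSEG can be pictured with a toy sketch: at each iteration, the pair of spatially adjacent regions with the smallest dissimilarity (here, the difference of region means) is merged. Real HSEG interleaves constrained spectral clustering between such merges; the data structures below are illustrative only.

    ```python
    # Toy HSWO: greedily merge the most similar adjacent region pair.
    def hswo(regions, adjacency, n_final):
        """regions: {id: list of pixel values}; adjacency: set of frozenset pairs."""
        mean = lambda r: sum(regions[r]) / len(regions[r])
        while len(regions) > n_final:
            a, b = tuple(min(adjacency,
                             key=lambda p: abs(mean(min(p)) - mean(max(p)))))
            regions[a] += regions.pop(b)                       # merge b into a
            adjacency = {frozenset({a if r == b else r for r in pair})
                         for pair in adjacency if pair != frozenset({a, b})}
        return regions

    regions = {1: [10, 12], 2: [11], 3: [60, 62], 4: [61]}
    adjacency = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})}
    print(hswo(regions, adjacency, 2))   # two regions: [10, 12, 11] and [60, 62, 61]
    ```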

  5. World Wide Web interface for advanced SPECT reconstruction algorithms implemented on a remote massively parallel computer.

    PubMed

    Formiconi, A R; Passeri, A; Guelfi, M R; Masoni, M; Pupi, A; Meldolesi, U; Malfetti, P; Calori, L; Guidazzoli, A

    1997-11-01

    Data from Single Photon Emission Computed Tomography (SPECT) studies are blurred by inevitable physical phenomena occurring during data acquisition. These errors may be compensated by means of reconstruction algorithms which take into account accurate physical models of the data acquisition procedure. Unfortunately, this approach involves high memory requirements as well as a high computational burden which cannot be afforded by the computer systems of SPECT acquisition devices. In this work the possibility of accessing High Performance Computing and Networking (HPCN) resources through a World Wide Web interface for the advanced reconstruction of SPECT data in a clinical environment was investigated. An iterative algorithm with an accurate model of the variable system response was ported on the Multiple Instruction Multiple Data (MIMD) parallel architecture of a Cray T3D massively parallel computer. The system was accessible even from low cost PC-based workstations through standard TCP/IP networking. A speedup factor of 148 was predicted by the benchmarks run on the Cray T3D. A complete brain study of 30 (64 x 64) slices was reconstructed from a set of 90 (64 x 64) projections with ten iterations of the conjugate gradients algorithm in 9 s, which corresponds to an actual speedup factor of 135. The technique was extended to a more accurate 3D modeling of the system response for a true 3D reconstruction of SPECT data; the reconstruction time of the same data set with this more accurate model was 5 min. This work demonstrates the possibility of exploiting remote HPCN resources from hospital sites by means of low cost workstations using standard communication protocols and a user-friendly WWW interface without particular problems for routine use. PMID:9506406
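
    The iterative scheme can be pictured as conjugate gradients applied to the normal equations of the imaging model. The sketch below is a generic CG loop over a hypothetical system matrix A (projector) and measured projections b, not the ported Cray T3D code.

    ```python
    # Least-squares SPECT reconstruction sketch: solve (A^T A) x = A^T b
    # with a fixed number of conjugate-gradient iterations.
    import numpy as np

    def cg_reconstruct(A, b, iterations=10):
        x = np.zeros(A.shape[1])
        r = A.T @ b - A.T @ (A @ x)     # residual of the normal equations
        p = r.copy()
        for _ in range(iterations):
            Ap = A.T @ (A @ p)
            alpha = (r @ r) / (p @ Ap)
            x = x + alpha * p
            r_new = r - alpha * Ap
            p = r_new + ((r_new @ r_new) / (r @ r)) * p
            r = r_new
        return x
    ```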

  6. PORTA: A Massively Parallel Code for 3D Non-LTE Polarized Radiative Transfer

    NASA Astrophysics Data System (ADS)

    Štěpán, J.

    2014-10-01

    The interpretation of the Stokes profiles of the solar (stellar) spectral line radiation requires solving a non-LTE radiative transfer problem that can be very complex, especially when the main interest lies in modeling the linear polarization signals produced by scattering processes and their modification by the Hanle effect. One of the main difficulties is due to the fact that the plasma of a stellar atmosphere can be highly inhomogeneous and dynamic, which implies the need to solve the non-equilibrium problem of generation and transfer of polarized radiation in realistic three-dimensional stellar atmospheric models. Here we present PORTA, a computer program we have developed for solving, in three-dimensional (3D) models of stellar atmospheres, the problem of the generation and transfer of spectral line polarization taking into account anisotropic radiation pumping and the Hanle and Zeeman effects in multilevel atoms. The numerical method of solution is based on a highly convergent iterative algorithm, whose convergence rate is insensitive to the grid size, and on an accurate short-characteristics formal solver of the Stokes-vector transfer equation which uses monotonic Bézier interpolation. In addition to the iterative method and the 3D formal solver, another important feature of PORTA is a novel parallelization strategy suitable for taking advantage of massively parallel computers. Linear scaling of the solution with the number of processors makes it possible to reduce the solution time by several orders of magnitude. We present useful benchmarks and a few illustrations of applications using a 3D model of the solar chromosphere resulting from MHD simulations. Finally, we present our conclusions with a view to future research. For more details see Štěpán & Trujillo Bueno (2013).

  7. Massively parallel LES of azimuthal thermo-acoustic instabilities in annular gas turbines

    NASA Astrophysics Data System (ADS)

    Wolf, P.; Staffelbach, G.; Roux, A.; Gicquel, L.; Poinsot, T.; Moureau, V.

    2009-06-01

    Increasingly stringent regulations and the need to tackle rising fuel prices have placed great emphasis on the design of aeronautical gas turbines, which are unfortunately more and more prone to combustion instabilities. In the particular field of annular combustion chambers, these instabilities often take the form of azimuthal modes. To predict these modes, one must compute the full combustion chamber, which remained out of reach until the very recent development of massively parallel computers. In this article, full annular Large Eddy Simulations (LES) of two helicopter combustors, which differ only in the swirlers' design, are performed. In both computations, LES captures self-established rotating azimuthal modes. However, the two cases exhibit different thermo-acoustic responses and the resulting limit-cycles are different. With the first design, a self-excited strong instability develops, leading to pulsating flames and local flashback. In the second case, the flames are much less affected by the azimuthal mode and remain stable, allowing an acceptable operation. Hence, this study highlights the potential of LES for discriminating injection system designs. To cite this article: P. Wolf et al., C. R. Mecanique 337 (2009).

  8. Massively parallel LES of azimuthal thermo-acoustic instabilities in annular gas turbines

    NASA Astrophysics Data System (ADS)

    Wolf, Pierre; Staffelbach, Gabriel; Gicquel, Laurent; Poinsot, Thierry

    2009-07-01

    Most of the energy produced worldwide comes from the combustion of fossil fuels. In the context of global climate changes and dramatically decreasing resources, there is a critical need for optimizing the process of burning, especially in the field of gas turbines. Unfortunately, new designs for efficient combustion are prone to destructive thermo-acoustic instabilities. Large Eddy Simulation (LES) is a promising tool to predict turbulent reacting flows in complex industrial configurations and explore the mechanisms triggering the coupling between acoustics and combustion. In the particular field of annular combustion chambers, these instabilities usually take the form of azimuthal modes. To predict these modes, one must compute the full combustion chamber comprising all sectors, which remained out of reach until the very recent development of massively parallel computers. A fully compressible, multi-species reactive Navier-Stokes solver is used on up to 4096 BlueGene/P CPUs for two designs of a full annular helicopter chamber. Results show evidence of self-established azimuthal modes for the two cases, but with limit-cycles of different energy content. Mesh dependency is checked with grids comprising 38 and 93 million tetrahedra. The fact that the two grid predictions yield similar flow topologies and limit-cycles reinforces the ability of LES to discriminate design changes.

  9. Ensuring the safety of vaccine cell substrates by massively parallel sequencing of the transcriptome.

    PubMed

    Onions, D; Côté, C; Love, B; Toms, B; Koduri, S; Armstrong, A; Chang, A; Kolman, J

    2011-09-22

    Massively parallel, deep sequencing of the transcriptome coupled with algorithmic analysis to identify adventitious agents (MP-Seq™) is an important adjunct in ensuring the safety of cells used in vaccine production. Such cells may harbour novel viruses whose sequences are unknown or latent viruses that are only expressed following stress to the cells. MP-Seq is an unbiased and comprehensive method to identify such viruses and other adventitious agents without prior knowledge of the nature of those agents. Here we demonstrate its utility as part of an integrated approach to identify and characterise potential contaminants within commonly used virus and vaccine production cell lines. Through this analysis, in combination with more traditional approaches, we have excluded the presence of porcine circoviruses in the ATCC Vero cell bank (CCL-81); however, we found that a full-length betaretrovirus related to SRV can be expressed in these cells, a factor that may be of importance in the production of certain vaccines. Similarly, insect cells are proving to be valuable for the production of virus-like particles and sub-unit vaccines, but they can harbour a range of latent viruses. We show that following MP-Seq of the Trichoplusia ni (High Five cell line) transcriptome we were able to detect a contaminating, latent nodavirus and identify an expressed errantivirus genome. Collectively, these studies have reinforced the role of MP-Seq as an integral tool for the identification of contaminating agents in vaccine cell substrates. PMID:21651935

  10. GRay: A Massively Parallel GPU-based Code for Ray Tracing in Relativistic Spacetimes

    NASA Astrophysics Data System (ADS)

    Chan, Chi-kwan; Psaltis, Dimitrios; Özel, Feryal

    2013-11-01

    We introduce GRay, a massively parallel integrator designed to trace the trajectories of billions of photons in a curved spacetime. This graphics-processing-unit (GPU)-based integrator employs the stream processing paradigm, is implemented in CUDA C/C++, and runs on nVidia graphics cards. The peak performance of GRay using single-precision floating-point arithmetic on a single GPU exceeds 300 GFLOP (or 1 ns per photon per time step). For a realistic problem, where the peak performance cannot be reached, GRay is two orders of magnitude faster than existing central-processing-unit-based ray-tracing codes. This performance enhancement allows more effective searches of large parameter spaces when comparing theoretical predictions of images, spectra, and light curves from the vicinities of compact objects to observations. GRay can also perform on-the-fly ray tracing within general relativistic magnetohydrodynamic algorithms that simulate accretion flows around compact objects. Making use of this algorithm, we calculate the properties of the shadows of Kerr black holes and the photon rings that surround them. We also provide accurate fitting formulae of their dependencies on black hole spin and observer inclination, which can be used to interpret upcoming observations of the black holes at the center of the Milky Way, as well as M87, with the Event Horizon Telescope.

  11. Massive Parallel Sequencing for Diagnostic Genetic Testing of BRCA Genes--a Single Center Experience.

    PubMed

    Ermolenko, Natalya A; Boyarskikh, Uljana A; Kechin, Andrey A; Mazitova, Alexandra M; Khrapov, Evgeny A; Petrova, Valentina D; Lazarev, Alexandr F; Kushlinskii, Nikolay E; Filipenko, Maxim L

    2015-01-01

    The aim of this study was to implement massive parallel sequencing (MPS) technology in clinical genetics testing. We developed and tested an amplicon-based method for resequencing the BRCA1 and BRCA2 genes on an Illumina MiSeq to identify disease-causing mutations in patients with hereditary breast or ovarian cancer (HBOC). The coding regions of BRCA1 and BRCA2 were resequenced in 96 HBOC patient DNA samples obtained from different sample types: peripheral blood leukocytes, whole blood drops dried on paper, and buccal wash epithelia. A total of 16 random DNA samples were characterized using standard Sanger sequencing and applied to optimize the variant calling process and evaluate the accuracy of the MPS-method. The best bioinformatics workflow included the filtration of variants using GATK with the following cut-offs: variant frequency >14%, coverage (>25x) and presence in both the forward and reverse reads. The MPS method had 100% sensitivity and 94.4% specificity. Similar accuracy levels were achieved for DNA obtained from the different sample types. The workflow presented herein requires low amounts of DNA samples (170 ng) and is cost-effective due to the elimination of DNA and PCR product normalization steps. PMID:26625824
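
    A minimal sketch of how such cut-offs translate into code is given below. The record fields are invented stand-ins; a real pipeline would parse GATK's VCF output with a proper parser rather than use dictionaries.

    ```python
    # Apply the reported filtration cut-offs to simplified variant records.
    def passes_filters(v):
        return (v["allele_freq"] > 0.14    # variant frequency > 14%
                and v["depth"] > 25        # coverage > 25x
                and v["fwd_reads"] > 0     # present in forward reads
                and v["rev_reads"] > 0)    # and in reverse reads

    calls = [
        {"pos": 101, "allele_freq": 0.48, "depth": 310, "fwd_reads": 150, "rev_reads": 148},
        {"pos": 202, "allele_freq": 0.09, "depth": 40, "fwd_reads": 4, "rev_reads": 0},
    ]
    print([v["pos"] for v in calls if passes_filters(v)])   # [101]
    ```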

  12. Resolving genomic disorder–associated breakpoints within segmental DNA duplications using massively parallel sequencing

    PubMed Central

    Nuttle, Xander; Itsara, Andy; Shendure, Jay; Eichler, Evan E.

    2014-01-01

    The most common recurrent copy number variants associated with autism, developmental delay, and epilepsy are flanked by segmental duplications. Complete genetic characterization of these events is challenging because their breakpoints often occur within high-identity, copy number polymorphic paralogous sequences that cannot be specifically assayed using hybridization-based methods. Here, we provide a protocol for breakpoint resolution with sequence-level precision. Massively parallel sequencing is performed on libraries generated from haplotype-resolved chromosomes, genomic DNA, or molecular inversion probe–captured breakpoint-informative regions harboring paralog-distinguishing variants. Quantifying sequencing depth over informative sites enables breakpoint localization, typically within several kilobases to tens of kilobases. Depending on the approach employed, the sequencing platform, and the accuracy and completeness of the reference genome sequence, this protocol takes from a few days to several months to complete. Once established for a specific genomic disorder, it is possible to process thousands of DNA samples within as little as 3–4 weeks. PMID:24874815
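
    The depth-based localization step can be illustrated with a toy change-point scan: sequencing depth over paralog-distinguishing sites steps from one copy-number level to another near the breakpoint. The scan below is a generic illustration with invented positions and depths, not the protocol's actual statistics.

    ```python
    # Locate the depth step by minimizing the squared error of a two-level fit.
    def locate_breakpoint(positions, depths):
        mean = lambda xs: sum(xs) / len(xs)
        best = (float("inf"), None)
        for k in range(1, len(depths)):
            left, right = depths[:k], depths[k:]
            sse = (sum((d - mean(left)) ** 2 for d in left)
                   + sum((d - mean(right)) ** 2 for d in right))
            best = min(best, (sse, positions[k]))
        return best[1]

    pos = [1000, 2000, 3000, 4000, 5000, 6000]
    depth = [30, 31, 29, 44, 46, 45]            # copy-number step after 3000
    print(locate_breakpoint(pos, depth))        # 4000
    ```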

  13. Characterization of the Zoarces viviparus liver transcriptome using massively parallel pyrosequencing

    PubMed Central

    Kristiansson, Erik; Asker, Noomi; Förlin, Lars; Larsson, DG Joakim

    2009-01-01

    Background The teleost Zoarces viviparus (eelpout) lives along the coasts of Northern Europe and has long been an established model organism for marine ecology and environmental monitoring. The scarce information about this species' genome has however restrained the use of efficient molecular-level assays, such as gene expression microarrays. Results In the present study we present the first comprehensive characterization of the Zoarces viviparus liver transcriptome. From 400,000 reads generated by massively parallel pyrosequencing, more than 50,000 putative transcript fragments were assembled, annotated and functionally classified. The data was estimated to cover roughly 40% of the total transcriptome, and homologues for about half of the genes of Gasterosteus aculeatus (stickleback) were identified. The sequence data was subsequently used to design an oligonucleotide microarray for large-scale gene expression analysis. Conclusion Our results show that one run using a Genome Sequencer FLX from 454 Life Science/Roche generates enough genomic information for adequate de novo assembly of a large number of genes in a higher vertebrate. The generated sequence data, including the validated microarray probes, are publicly available to promote genome-wide research in Zoarces viviparus. PMID:19646242

  14. Hybrid selection of discrete genomic intervals on custom-designed microarrays for massively parallel sequencing

    PubMed Central

    Hodges, Emily; Rooks, Michelle; Xuan, Zhenyu; Bhattacharjee, Arindam; Gordon, D Benjamin; Brizuela, Leonardo; McCombie, W Richard; Hannon, Gregory J

    2010-01-01

    Complementary techniques that deepen information content and minimize reagent costs are required to realize the full potential of massively parallel sequencing. Here, we describe a resequencing approach that directs focus to genomic regions of high interest by combining hybridization-based purification of multi-megabase regions with sequencing on the Illumina Genome Analyzer (GA). The capture matrix is created by a microarray on which probes can be programmed as desired to target any non-repeat portion of the genome, while the method requires only a basic familiarity with microarray hybridization. We present a detailed protocol suitable for 1–2 µg of input genomic DNA and highlight key design tips in which high specificity (>65% of reads stem from enriched exons) and high sensitivity (98% targeted base pair coverage) can be achieved. We have successfully applied this to the enrichment of coding regions, in both human and mouse, ranging from 0.5 to 4 Mb in length. From genomic DNA library production to base-called sequences, this procedure takes approximately 9–10 d inclusive of array captures and one Illumina flow cell run. PMID:19478811

  15. Radiation hydrodynamics using characteristics on adaptive decomposed domains for massively parallel star formation simulations

    NASA Astrophysics Data System (ADS)

    Buntemeyer, Lars; Banerjee, Robi; Peters, Thomas; Klassen, Mikhail; Pudritz, Ralph E.

    2016-02-01

    We present an algorithm for solving the radiative transfer problem on massively parallel computers using adaptive mesh refinement and domain decomposition. The solver is based on the method of characteristics which requires an adaptive raytracer that integrates the equation of radiative transfer. The radiation field is split into local and global components which are handled separately to overcome the non-locality problem. The solver is implemented in the framework of the magneto-hydrodynamics code FLASH and is coupled by an operator splitting step. The goal is the study of radiation in the context of star formation simulations with a focus on early disc formation and evolution. This requires a proper treatment of radiation physics that covers both the optically thin as well as the optically thick regimes and the transition region in particular. We successfully show the accuracy and feasibility of our method in a series of standard radiative transfer problems and two 3D collapse simulations resembling the early stages of protostar and disc formation.
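
    The building block of such a characteristics-based solver is the formal solution along a single ray. A minimal sketch, assuming a constant source function within each cell (the simplest short-characteristics variant) and toy input values:

    ```python
    # March intensity through cells along one ray:
    # I_out = I_in * exp(-dtau) + S * (1 - exp(-dtau)) per cell.
    import math

    def integrate_ray(I0, dtaus, sources):
        I = I0
        for dtau, S in zip(dtaus, sources):
            att = math.exp(-dtau)
            I = I * att + S * (1.0 - att)
        return I

    # Optically thin, intermediate, and thick cells with toy source functions.
    print(integrate_ray(I0=0.0, dtaus=[0.1, 1.0, 5.0], sources=[0.2, 0.8, 1.0]))
    ```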

  16. Massively Parallel Sequencing for Genetic Diagnosis of Hearing Loss: The New Standard of Care

    PubMed Central

    Shearer, A. Eliot; Smith, Richard J.H.

    2016-01-01

    Objective To evaluate the use of new genetic sequencing techniques for comprehensive genetic testing for hearing loss. Data Sources Articles were identified from PubMed and Google Scholar databases using pertinent search terms. Review Methods A literature search identified 30 studies as candidates that met search criteria. Three studies were excluded and eight studies were found to be case reports. Twenty studies were included for review analysis, including seven studies that evaluated controls and 16 studies that evaluated patients with unknown causes of hearing loss; three studies evaluated both controls and patients. Conclusions In the 20 studies included in review analysis, 426 control samples and 603 patients with unknown causes of hearing loss underwent comprehensive genetic diagnosis for hearing loss using massively parallel sequencing. Control analysis showed a sensitivity and specificity > 99%, sufficient for clinical use of these tests. The overall diagnostic rate was 41% (range 10% to 83%) and varied based on several factors including inheritance and pre-screening prior to comprehensive testing. There were significant differences in the platforms available with regard to the number and type of genes included and whether copy number variations were examined. Based on these results, comprehensive genetic testing should form the cornerstone of a tiered approach to clinical evaluation of patients with hearing loss, along with history, physical exam, and audiometry, and can determine further testing that may be required, if any. Implications for Practice Comprehensive genetic testing has become the new standard of care for genetic testing for patients with sensorineural hearing loss. PMID:26084827

  17. Massively parallel network architectures for automatic recognition of visual speech signals. Final technical report

    SciTech Connect

    Sejnowski, T.J.; Goldstein, M.

    1990-01-01

    This research sought to produce a massively-parallel network architecture that could interpret speech signals from video recordings of human talkers. This report summarizes the project's results: (1) A corpus of video recordings from two human speakers was analyzed with image processing techniques and used as the data for this study; (2) We demonstrated that a feedforward network could be trained to categorize vowels from these talkers. The performance was comparable to that of nearest-neighbor techniques and to trained humans on the same data; (3) We developed a novel approach to sensory fusion by training a network to transform from facial images to short-time spectral amplitude envelopes. This information can be used to increase the signal-to-noise ratio and hence the performance of acoustic speech recognition systems in noisy environments; (4) We explored the use of recurrent networks to perform the same mapping for continuous speech. Results of this project demonstrate the feasibility of adding a visual speech recognition component to enhance existing speech recognition systems. Such a combined system could be used in noisy environments, such as cockpits, where improved communication is needed. This demonstration of presymbolic fusion of visual and acoustic speech signals is consistent with our current understanding of human speech perception.

  18. Novel Y-chromosome Short Tandem Repeat Variants Detected Through the Use of Massively Parallel Sequencing.

    PubMed

    Warshauer, David H; Churchill, Jennifer D; Novroski, Nicole; King, Jonathan L; Budowle, Bruce

    2015-08-01

    Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically-relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite a limited sampling, two of the most frequently-observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for sample population characterization of STR alleles. PMID:26391384
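
    The following sketch illustrates, with an invented motif and sequences, why sequence-level typing resolves alleles of identical length: two alleles spanning the same number of repeat units can differ in their repeat-unit pattern or carry an intra-repeat SNP.

    ```python
    # Collapse a repeat region into a run-length pattern of its repeat units.
    def repeat_pattern(seq, unit_len=4):
        units = [seq[i:i + unit_len] for i in range(0, len(seq), unit_len)]
        pattern = []
        for u in units:
            if pattern and pattern[-1][0] == u:
                pattern[-1][1] += 1
            else:
                pattern.append([u, 1])
        return [(u, n) for u, n in pattern]

    a = "TAGA" * 5 + "TACA" + "TAGA" * 4   # 10 units with an internal variant
    b = "TAGA" * 10                        # same length, different sequence
    print(repeat_pattern(a))  # [('TAGA', 5), ('TACA', 1), ('TAGA', 4)]
    print(repeat_pattern(b))  # [('TAGA', 10)]
    ```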

  19. Massively parallel decoding of mammalian regulatory sequences supports a flexible organizational model.

    PubMed

    Smith, Robin P; Taher, Leila; Patwardhan, Rupali P; Kim, Mee J; Inoue, Fumitaka; Shendure, Jay; Ovcharenko, Ivan; Ahituv, Nadav

    2013-09-01

    Despite continual progress in the cataloging of vertebrate regulatory elements, little is known about their organization and regulatory architecture. Here we describe a massively parallel experiment to systematically test the impact of copy number, spacing, combination and order of transcription factor binding sites on gene expression. A complex library of ∼5,000 synthetic regulatory elements containing patterns from 12 liver-specific transcription factor binding sites was assayed in mice and in HepG2 cells. We find that certain transcription factors act as direct drivers of gene expression in homotypic clusters of binding sites, independent of spacing between sites, whereas others function only synergistically. Heterotypic enhancers are stronger than their homotypic analogs and favor specific transcription factor binding site combinations, mimicking putative native enhancers. Exhaustive testing of binding site permutations suggests that there is flexibility in binding site order. Our findings provide quantitative support for a flexible model of regulatory element activity and suggest a framework for the design of synthetic tissue-specific enhancers. PMID:23892608

  20. Photo-patterned free-standing hydrogel microarrays for massively parallel protein analysis

    NASA Astrophysics Data System (ADS)

    Duncombe, Todd A.; Herr, Amy E.

    2015-03-01

    Microfluidic technologies have largely been realized within enclosed microchannels. While powerful, a principal limitation of closed-channel microfluidics is the difficulty of sample extraction and downstream processing. To address this limitation and expand the utility of microfluidic analytical separation tools, we developed an open-channel hydrogel architecture for rapid protein analysis. Designed for compatibility with slab-gel polyacrylamide gel electrophoresis (PAGE) reagents and instruments, we detail the development of free-standing polyacrylamide gel (fsPAG) microstructures supporting electrophoretic performance rivalling that of microfluidic platforms. Owing to its open architecture, the platform can be easily interfaced with automated robotic controllers and downstream processing (e.g., sample spotters, immunological probing, mass spectroscopy). The fsPAG devices are directly photopatterned atop of, and covalently attached to, planar polymer or glass surfaces. Because the design-prototype-test cycle takes < 1 hr, significantly faster than mold-based fabrication techniques, rapid prototyping of devices with fsPAG microstructures provides researchers with a powerful tool for developing custom analytical assays. Leveraging these rapid-prototyping benefits, we scale up from a unit separation to an array of 96 concurrent fsPAGE assays with a 10 min run time driven by one electrode pair. The fsPAGE platform is uniquely well-suited for massively parallelized proteomics, a major unrealized goal of bioanalytical technology.

  1. Transcriptomic analysis of the housefly (Musca domestica) larva using massively parallel pyrosequencing.

    PubMed

    Liu, Fengsong; Tang, Ting; Sun, Lingling; Jose Priya, T A

    2012-02-01

    To explore the transcriptome of Musca domestica larvae and to identify unique sequences, we used massively parallel pyrosequencing on the Roche 454-FLX platform to generate a substantial EST dataset of this fly. As a result, we obtained a total of 249,555 ESTs with an average read length of 373 bp. These reads were assembled into 13,206 contigs and 20,556 singletons. Using BlastX searches of the Swissprot and Nr databases, we were able to identify 4,814 contigs and 8,166 singletons as unique sequences. Subsequently, the annotated sequences were subjected to GO analysis, and the search results showed that a majority of the query sequences were assignable to certain gene ontology terms. In addition, functional classification and pathway assignment were performed by KEGG, and 2,164 unique sequences were mapped into 184 KEGG pathways in total. As the first attempt at large-scale RNA sequencing of M. domestica, this general picture of the transcriptome establishes a fundamental resource for further research on functional genomics. PMID:21643958

  2. Novel Y-chromosome Short Tandem Repeat Variants Detected Through the Use of Massively Parallel Sequencing

    PubMed Central

    Warshauer, David H.; Churchill, Jennifer D.; Novroski, Nicole; King, Jonathan L.; Budowle, Bruce

    2015-01-01

    Massively parallel sequencing (MPS) technology is capable of determining the sizes of short tandem repeat (STR) alleles as well as their individual nucleotide sequences. Thus, single nucleotide polymorphisms (SNPs) within the repeat regions of STRs and variations in the pattern of repeat units in a given repeat motif can be used to differentiate alleles of the same length. In this study, MPS was used to sequence 28 forensically-relevant Y-chromosome STRs in a set of 41 DNA samples from the 3 major U.S. population groups (African Americans, Caucasians, and Hispanics). The resulting sequence data, which were analyzed with STRait Razor v2.0, revealed 37 unique allele sequence variants that have not been previously reported. Of these, 19 sequences were variations of documented sequences resulting from the presence of intra-repeat SNPs or alternative repeat unit patterns. Despite a limited sampling, two of the most frequently-observed variants were found only in African American samples. The remaining 18 variants represented allele sequences for which there were no published data with which to compare. These findings illustrate the great potential of MPS with regard to increasing the resolving power of STR typing and emphasize the need for sample population characterization of STR alleles. PMID:26391384

  3. GRay: A MASSIVELY PARALLEL GPU-BASED CODE FOR RAY TRACING IN RELATIVISTIC SPACETIMES

    SciTech Connect

    Chan, Chi-kwan; Psaltis, Dimitrios; Özel, Feryal

    2013-11-01

    We introduce GRay, a massively parallel integrator designed to trace the trajectories of billions of photons in a curved spacetime. This graphics-processing-unit (GPU)-based integrator employs the stream processing paradigm, is implemented in CUDA C/C++, and runs on nVidia graphics cards. The peak performance of GRay using single-precision floating-point arithmetic on a single GPU exceeds 300 GFLOP (or 1 ns per photon per time step). For a realistic problem, where the peak performance cannot be reached, GRay is two orders of magnitude faster than existing central-processing-unit-based ray-tracing codes. This performance enhancement allows more effective searches of large parameter spaces when comparing theoretical predictions of images, spectra, and light curves from the vicinities of compact objects to observations. GRay can also perform on-the-fly ray tracing within general relativistic magnetohydrodynamic algorithms that simulate accretion flows around compact objects. Making use of this algorithm, we calculate the properties of the shadows of Kerr black holes and the photon rings that surround them. We also provide accurate fitting formulae of their dependencies on black hole spin and observer inclination, which can be used to interpret upcoming observations of the black holes at the center of the Milky Way, as well as M87, with the Event Horizon Telescope.

  4. Targeted massively parallel sequencing provides comprehensive genetic diagnosis for patients with disorders of sex development

    PubMed Central

    Arboleda, VA; Lee, H; Sánchez, FJ; Délot, EC; Sandberg, DE; Grody, WW; Nelson, SF; Vilain, E

    2013-01-01

    Disorders of sex development (DSD) are rare disorders in which there is discordance between chromosomal, gonadal, and phenotypic sex. Only a minority of patients clinically diagnosed with DSD obtains a molecular diagnosis, leaving a large gap in our understanding of the prevalence, management, and outcomes in affected patients. We created a novel DSD-genetic diagnostic tool, in which sex development genes are captured using RNA probes and undergo massively parallel sequencing. In the pilot group of 14 patients, we determined sex chromosome dosage, copy number variation, and gene mutations. In the patients with a known genetic diagnosis (obtained either on a clinical or research basis), this test identified the molecular cause in 100% (7/7) of patients. In patients in whom no molecular diagnosis had been made, this tool identified a genetic diagnosis in two of seven patients. Targeted sequencing of genes representing a specific spectrum of disorders can result in a higher rate of genetic diagnoses than current diagnostic approaches. Our DSD diagnostic tool provides, for the first time and in a single blood test, a comprehensive genetic diagnosis in patients presenting with a wide range of urogenital anomalies. PMID:22435390

  5. Identification of Novel FMR1 Variants by Massively Parallel Sequencing in Developmentally Delayed Males

    PubMed Central

    Collins, Stephen C.; Bray, Steven M.; Suhl, Joshua A.; Cutler, David J.; Coffee, Bradford; Zwick, Michael E.; Warren, Stephen T.

    2010-01-01

    Fragile X syndrome (FXS), the most common inherited form of developmental delay, is typically caused by CGG-repeat expansion in FMR1. However, little attention has been paid to sequence variants in FMR1. Through the use of pooled-template massively parallel sequencing, we identified 130 novel FMR1 sequence variants in a population of 963 developmentally delayed males without CGG-repeat expansion mutations. Among these, we identified a novel missense change, p.R138Q, which alters a conserved residue in the nuclear localization signal of FMRP. We have also identified three promoter mutations in this population, all of which significantly reduce in vitro levels of FMR1 transcription. Additionally, we identified 10 noncoding variants of possible functional significance in the introns and 3’-untranslated region of FMR1, including two predicted splice site mutations. These findings greatly expand the catalogue of known FMR1 sequence variants and suggest that FMR1 sequence variants may represent an important cause of developmental delay. PMID:20799337

  6. Mitochondrial DNA heteroplasmy in the emerging field of massively parallel sequencing

    PubMed Central

    Just, Rebecca S.; Irwin, Jodi A.; Parson, Walther

    2015-01-01

    Long an important and useful tool in forensic genetic investigations, mitochondrial DNA (mtDNA) typing continues to mature. Research in the last few years has demonstrated both that data from the entire molecule will have practical benefits in forensic DNA casework, and that massively parallel sequencing (MPS) methods will make full mitochondrial genome (mtGenome) sequencing of forensic specimens feasible and cost-effective. A spate of recent studies has employed these new technologies to assess intraindividual mtDNA variation. However, in several instances, contamination and other sources of mixed mtDNA data have been erroneously identified as heteroplasmy. Well vetted mtGenome datasets based on both Sanger and MPS sequences have found authentic point heteroplasmy in approximately 25% of individuals when minor component detection thresholds are in the range of 10–20%, along with positional distribution patterns in the coding region that differ from patterns of point heteroplasmy in the well-studied control region. A few recent studies that examined very low-level heteroplasmy are concordant with these observations when the data are examined at a common level of resolution. In this review we provide an overview of considerations related to the use of MPS technologies to detect mtDNA heteroplasmy. In addition, we examine published reports on point heteroplasmy to characterize features of the data that will assist in the evaluation of future mtGenome data developed by any typing method. PMID:26009256

  7. Simulation of hydraulic fracture networks in three dimensions utilizing massively parallel computing platforms

    NASA Astrophysics Data System (ADS)

    Settgast, R. R.; Johnson, S.; Fu, P.; Walsh, S. D.; Ryerson, F. J.; Antoun, T.

    2012-12-01

    Hydraulic fracturing has been an enabling technology for commercially stimulating fracture networks for over half of a century. It has become one of the most widespread technologies for engineering subsurface fracture systems. Despite the ubiquity of this technique in the field, understanding and prediction of the hydraulic induced propagation of the fracture network in realistic, heterogeneous reservoirs has been limited. A number of developments in multiscale modeling in recent years have allowed researchers in related fields to tackle the modeling of complex fracture propagation as well as the mechanics of heterogeneous materials. These developments, combined with advances in quantifying solution uncertainties, provide possibilities for the geologic modeling community to capture both the fracturing behavior and longer-term permeability evolution of rock masses under hydraulic loading across both dynamic and viscosity-dominated regimes. Here we will demonstrate the first phase of this effort through illustrations of fully three-dimensional, tightly coupled hydromechanical simulations of hydraulically induced fracture network propagation run on massively parallel computing scales, and discuss preliminary results regarding the mechanisms by which fracture interactions and the accompanying changes to the stress field can lead to deleterious or beneficial changes to the fracture network.

  8. Identification of a novel GATA3 mutation in a deaf Taiwanese family by massively parallel sequencing.

    PubMed

    Lin, Yin-Hung; Wu, Chen-Chi; Hsu, Tun-Yen; Chiu, Wei-Yih; Hsu, Chuan-Jen; Chen, Pei-Lung

    2015-01-01

    Recent studies have confirmed the utility of massively parallel sequencing (MPS) in addressing genetically heterogeneous hereditary hearing impairment. By applying a MPS diagnostic panel targeting 129 known deafness genes, we identified a novel frameshift GATA3 mutation, c.149delT (p.Phe51LeufsX144), in a hearing-impaired family compatible with autosomal dominant inheritance. The GATA3 haploinsufficiency is thought to be associated with the hypoparathyroidism, sensorineural deafness, and renal dysplasia (HDR) syndrome. The pathogenicity of GATA3 c.149delT was supported by its absence in the 5400 NHLBI exomes, 1000 Genomes, and the 100 normal hearing controls of the present study; the co-segregation of c.149delT heterozygosity with hearing impairment in 9 affected members of the family; as well as the nonsense-mediated mRNA decay of the mutant allele in in vitro functional studies. The phenotypes in this family appeared relatively mild, as most affected members presented no signs of hypoparathyroidism or renal abnormalities, including the proband. To our knowledge, this is the first report of genetic diagnosis of HDR syndrome before the clinical diagnosis. Genetic examination for multiple deafness genes with MPS might be helpful in identifying certain types of syndromic hearing loss such as HDR syndrome, contributing to earlier diagnosis and treatment of the affected individuals. PMID:25771973

  9. Mitochondrial DNA heteroplasmy in the emerging field of massively parallel sequencing.

    PubMed

    Just, Rebecca S; Irwin, Jodi A; Parson, Walther

    2015-09-01

    Long an important and useful tool in forensic genetic investigations, mitochondrial DNA (mtDNA) typing continues to mature. Research in the last few years has demonstrated both that data from the entire molecule will have practical benefits in forensic DNA casework, and that massively parallel sequencing (MPS) methods will make full mitochondrial genome (mtGenome) sequencing of forensic specimens feasible and cost-effective. A spate of recent studies has employed these new technologies to assess intraindividual mtDNA variation. However, in several instances, contamination and other sources of mixed mtDNA data have been erroneously identified as heteroplasmy. Well vetted mtGenome datasets based on both Sanger and MPS sequences have found authentic point heteroplasmy in approximately 25% of individuals when minor component detection thresholds are in the range of 10-20%, along with positional distribution patterns in the coding region that differ from patterns of point heteroplasmy in the well-studied control region. A few recent studies that examined very low-level heteroplasmy are concordant with these observations when the data are examined at a common level of resolution. In this review we provide an overview of considerations related to the use of MPS technologies to detect mtDNA heteroplasmy. In addition, we examine published reports on point heteroplasmy to characterize features of the data that will assist in the evaluation of future mtGenome data developed by any typing method. PMID:26009256

  10. Massively parallel sequencing of complete mitochondrial genomes from hair shaft samples.

    PubMed

    Parson, Walther; Huber, Gabriela; Moreno, Lilliana; Madel, Maria-Bernadette; Brandhagen, Michael D; Nagl, Simone; Xavier, Catarina; Eduardoff, Mayra; Callaghan, Thomas C; Irwin, Jodi A

    2015-03-01

    Though shed hairs are one of the most commonly encountered evidence types, they are among the most limited in terms of DNA quantity and quality. As a result, DNA testing has historically focused on the recovery of just about 600 base pairs of the mitochondrial DNA control region. Here, we describe our success in recovering complete mitochondrial genome (mtGenome) data (∼16,569 bp) from single shed hairs. By employing massively parallel sequencing (MPS), we demonstrate that particular hair samples yield DNA sufficient in quantity and quality to produce 2-3 kb mtGenome amplicons and that entire mtGenome data can be recovered from hair extracts even without PCR enrichment. Most importantly, we describe a small amplicon multiplex assay comprised of sixty-two primer sets that can be routinely applied to the compromised hair samples typically encountered in forensic casework. In all samples tested here, the MPS data recovered using any one of the three methods were consistent with the control Sanger sequence data developed from high quality known specimens. Given the recently demonstrated value of complete mtGenome data in terms of discrimination power among randomly sampled individuals, the possibility of recovering mtGenome data from the most compromised and limited evidentiary material is likely to vastly increase the utility of mtDNA testing for hair evidence. PMID:25438934

  11. Large-scale massively parallel atomistic simulations of short pulse laser interaction with metals

    NASA Astrophysics Data System (ADS)

    Wu, Chengping; Zhigilei, Leonid; Computational Materials Group Team

    2014-03-01

    Taking advantage of petascale supercomputing architectures, large-scale massively parallel atomistic simulations (10^8-10^9 atoms) are performed to study the microscopic mechanisms of short pulse laser interaction with metals. The results of the simulations reveal a complex picture of highly non-equilibrium processes responsible for material modification and/or ejection. At low laser fluences below the ablation threshold, fast melting and resolidification occur under conditions of extreme heating and cooling rates resulting in surface microstructure modification. At higher laser fluences in the spallation regime, the material is ejected by the relaxation of laser-induced stresses and proceeds through the nucleation, growth and percolation of multiple voids in the sub-surface region of the irradiated target. At a fluence of ~ 2.5 times the spallation threshold, the top part of the target reaches the conditions for an explosive decomposition into vapor and small droplets, marking the transition to the phase explosion regime of laser ablation. The dynamics of plume formation and the characteristics of the ablation plume are obtained from the simulations and compared with the results of time-resolved plume imaging experiments. Financial support for this work was provided by NSF (DMR-0907247 and CMMI-1301298) and AFOSR (FA9550-10-1-0541). Computational support was provided by the OLCF (MAT048) and XSEDE (TG-DMR110090).

  12. Frequency of Usher syndrome type 1 in deaf children by massively parallel DNA sequencing.

    PubMed

    Yoshimura, Hidekane; Miyagawa, Maiko; Kumakawa, Kozo; Nishio, Shin-Ya; Usami, Shin-Ichi

    2016-05-01

    Usher syndrome type 1 (USH1) is the most severe of the three USH subtypes due to its profound hearing loss, absent vestibular response and retinitis pigmentosa appearing at a prepubescent age. Six causative genes have been identified for USH1, making early diagnosis and therapy possible through DNA testing. Targeted exon sequencing of selected genes using massively parallel DNA sequencing (MPS) technology enables clinicians to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using MPS along with direct sequence analysis, we screened 227 unrelated non-syndromic deaf children and detected recessive mutations in USH1 causative genes in five patients (2.2%): three patients harbored MYO7A mutations and one each carried CDH23 or PCDH15 mutations. As indicated by an earlier genotype-phenotype correlation study of the CDH23 and PCDH15 genes, we considered the latter two patients to have USH1. Based on clinical findings, it was also highly likely that one patient with MYO7A mutations possessed USH1 due to a late onset age of walking. This first report describing the frequency (1.3-2.2%) of USH1 among non-syndromic deaf children highlights the importance of comprehensive genetic testing for early disease diagnosis. PMID:26791358

  13. Tracking the roots of cellulase hyperproduction by the fungus Trichoderma reesei using massively parallel DNA sequencing

    SciTech Connect

    Le Crom, Stéphane; Schackwitz, Wendy; Pennacchio, Len; Magnuson, Jon K.; Culley, David E.; Collett, James R.; Martin, Joel X.; Druzhinina, Irina S.; Mathis, Hugues; Monot, Frédéric; Seiboth, Bernhard; Cherry, Barbara; Rey, Michael; Berka, Randy; Kubicek, Christian P.; Baker, Scott E.; Margeot, Antoine

    2009-09-22

    Trichoderma reesei (teleomorph Hypocrea jecorina) is the main industrial source of cellulases and hemicellulases harnessed for the hydrolysis of biomass to simple sugars, which can then be converted to biofuels, such as ethanol, and other chemicals. The highly productive strains in use today were generated by classical mutagenesis. To learn how cellulase production was improved by these techniques, we performed massively parallel sequencing to identify mutations in the genomes of two hyperproducing strains (NG14 and its direct improved descendant, RUT C30). We detected a surprisingly high number of mutagenic events: 223 single nucleotide variants, 15 small deletions or insertions, and 18 larger deletions leading to the loss of more than 100 kb of genomic DNA. From these events we report previously undocumented non-synonymous mutations in 43 genes that are mainly involved in nuclear transport, mRNA stability, transcription, secretion/vacuolar targeting, and metabolism. This homogeneity of functional categories suggests that multiple changes are necessary to improve cellulase production and not simply a few clear-cut mutagenic events. Phenotype microarrays show that some of these mutations result in strong changes in the carbon assimilation pattern of the two mutants with respect to the wild-type strain QM6a. Our analysis provides the first genome-wide insights into the changes induced by classical mutagenesis in a filamentous fungus, and suggests new areas for the generation of enhanced T. reesei strains for industrial applications such as biofuel production.

  14. Frequency of Usher syndrome type 1 in deaf children by massively parallel DNA sequencing

    PubMed Central

    Yoshimura, Hidekane; Miyagawa, Maiko; Kumakawa, Kozo; Nishio, Shin-ya; Usami, Shin-ichi

    2016-01-01

    Usher syndrome type 1 (USH1) is the most severe of the three USH subtypes due to its profound hearing loss, absent vestibular response and retinitis pigmentosa appearing at a prepubescent age. Six causative genes have been identified for USH1, making early diagnosis and therapy possible through DNA testing. Targeted exon sequencing of selected genes using massively parallel DNA sequencing (MPS) technology enables clinicians to systematically tackle previously intractable monogenic disorders and improve molecular diagnosis. Using MPS along with direct sequence analysis, we screened 227 unrelated non-syndromic deaf children and detected recessive mutations in USH1 causative genes in five patients (2.2%): three patients harbored MYO7A mutations and one each carried CDH23 or PCDH15 mutations. As indicated by an earlier genotype–phenotype correlation study of the CDH23 and PCDH15 genes, we considered the latter two patients to have USH1. Based on clinical findings, it was also highly likely that one patient with MYO7A mutations possessed USH1 due to a late onset age of walking. This first report describing the frequency (1.3–2.2%) of USH1 among non-syndromic deaf children highlights the importance of comprehensive genetic testing for early disease diagnosis. PMID:26791358

  15. Massively parallel sampling of lattice proteins reveals foundations of thermal adaptation.

    PubMed

    Venev, Sergey V; Zeldovich, Konstantin B

    2015-08-01

    Evolution of proteins in bacteria and archaea living in different conditions leads to significant correlations between amino acid usage and environmental temperature. The origins of these correlations are poorly understood, and an important question of protein theory, physics-based prediction of types of amino acids overrepresented in highly thermostable proteins, remains largely unsolved. Here, we extend the random energy model of protein folding by weighting the interaction energies of amino acids by their frequencies in protein sequences and predict the energy gap of proteins designed to fold well at elevated temperatures. To test the model, we present a novel scalable algorithm for simultaneous energy calculation for many sequences in many structures, targeting massively parallel computing architectures such as graphics processing units. The energy calculation is performed by multiplying two matrices, one representing the complete set of sequences, and the other describing the contact maps of all structural templates. An implementation of the algorithm for the CUDA platform is available at http://www.github.com/kzeldovich/galeprot and calculates protein folding energies over 250 times faster than a single central processing unit. Analysis of amino acid usage in 64-mer cubic lattice proteins designed to fold well at different temperatures demonstrates an excellent agreement between theoretical and simulated values of energy gap. The theoretical predictions of temperature trends of amino acid frequencies are significantly correlated with bioinformatics data on 191 bacteria and archaea, and highlight protein folding constraints as a fundamental selection pressure during thermal adaptation in biological evolution. PMID:26254668
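
    The matrix formulation described in this abstract is compact enough to sketch: one matrix holds, for every sequence, the pairwise interaction energies at each position pair; the other holds, for every structure, the 0/1 contact map over the same position pairs; their product yields the folding energy of every sequence in every structure. A minimal NumPy stand-in for the CUDA implementation (the potential, sequences, and contact maps below are random placeholders, not the authors' data):

        import numpy as np
        from itertools import combinations

        rng = np.random.default_rng(0)
        L, n_seq, n_struct, n_aa = 64, 1000, 200, 20

        U = rng.normal(size=(n_aa, n_aa))
        U = (U + U.T) / 2                                  # symmetric pair potential (placeholder)

        pairs = list(combinations(range(L), 2))            # all position pairs of a 64-mer
        i_idx = np.array([i for i, j in pairs])
        j_idx = np.array([j for i, j in pairs])

        seqs = rng.integers(0, n_aa, size=(n_seq, L))      # random sequences
        A = U[seqs[:, i_idx], seqs[:, j_idx]]              # (n_seq, n_pairs) pair energies
        B = (rng.random((len(pairs), n_struct)) < 0.05)    # (n_pairs, n_struct) contact maps

        E = A @ B.astype(float)   # one matrix product: all n_seq x n_struct folding energies
        print(E.shape)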

  16. Tracking the roots of cellulase hyperproduction by the fungus Trichoderma reesei using massively parallel DNA sequencing

    PubMed Central

    Le Crom, Stéphane; Schackwitz, Wendy; Pennacchio, Len; Magnuson, Jon K.; Culley, David E.; Collett, James R.; Martin, Joel; Druzhinina, Irina S.; Mathis, Hugues; Monot, Frédéric; Seiboth, Bernhard; Cherry, Barbara; Rey, Michael; Berka, Randy; Kubicek, Christian P.; Baker, Scott E.; Margeot, Antoine

    2009-01-01

    Trichoderma reesei (teleomorph Hypocrea jecorina) is the main industrial source of cellulases and hemicellulases harnessed for the hydrolysis of biomass to simple sugars, which can then be converted to biofuels such as ethanol and other chemicals. The highly productive strains in use today were generated by classical mutagenesis. To learn how cellulase production was improved by these techniques, we performed massively parallel sequencing to identify mutations in the genomes of two hyperproducing strains (NG14, and its direct improved descendant, RUT C30). We detected a surprisingly high number of mutagenic events: 223 single-nucleotide variants, 15 small deletions or insertions, and 18 larger deletions, leading to the loss of more than 100 kb of genomic DNA. From these events, we report previously undocumented non-synonymous mutations in 43 genes that are mainly involved in nuclear transport, mRNA stability, transcription, secretion/vacuolar targeting, and metabolism. This homogeneity of functional categories suggests that multiple changes are necessary to improve cellulase production and not simply a few clear-cut mutagenic events. Phenotype microarrays show that some of these mutations result in strong changes in the carbon assimilation pattern of the two mutants with respect to the wild-type strain QM6a. Our analysis provides genome-wide insights into the changes induced by classical mutagenesis in a filamentous fungus and suggests areas for the generation of enhanced T. reesei strains for industrial applications such as biofuel production. PMID:19805272

  17. Tracking the roots of cellulase hyperproduction by the fungus Trichoderma reesei using massively parallel DNA sequencing.

    PubMed

    Le Crom, Stéphane; Schackwitz, Wendy; Pennacchio, Len; Magnuson, Jon K; Culley, David E; Collett, James R; Martin, Joel; Druzhinina, Irina S; Mathis, Hugues; Monot, Frédéric; Seiboth, Bernhard; Cherry, Barbara; Rey, Michael; Berka, Randy; Kubicek, Christian P; Baker, Scott E; Margeot, Antoine

    2009-09-22

    Trichoderma reesei (teleomorph Hypocrea jecorina) is the main industrial source of cellulases and hemicellulases harnessed for the hydrolysis of biomass to simple sugars, which can then be converted to biofuels such as ethanol and other chemicals. The highly productive strains in use today were generated by classical mutagenesis. To learn how cellulase production was improved by these techniques, we performed massively parallel sequencing to identify mutations in the genomes of two hyperproducing strains (NG14, and its direct improved descendant, RUT C30). We detected a surprisingly high number of mutagenic events: 223 single-nucleotide variants, 15 small deletions or insertions, and 18 larger deletions, leading to the loss of more than 100 kb of genomic DNA. From these events, we report previously undocumented non-synonymous mutations in 43 genes that are mainly involved in nuclear transport, mRNA stability, transcription, secretion/vacuolar targeting, and metabolism. This homogeneity of functional categories suggests that multiple changes are necessary to improve cellulase production and not simply a few clear-cut mutagenic events. Phenotype microarrays show that some of these mutations result in strong changes in the carbon assimilation pattern of the two mutants with respect to the wild-type strain QM6a. Our analysis provides genome-wide insights into the changes induced by classical mutagenesis in a filamentous fungus and suggests areas for the generation of enhanced T. reesei strains for industrial applications such as biofuel production. PMID:19805272

  18. SIESTA-PEXSI: Massively parallel method for efficient and accurate ab initio materials simulation

    NASA Astrophysics Data System (ADS)

    Lin, Lin; Huhs, Georg; Garcia, Alberto; Yang, Chao

    2014-03-01

    We describe how to combine the pole expansion and selected inversion (PEXSI) technique with the SIESTA method, which uses numerical atomic orbitals for Kohn-Sham density functional theory (KSDFT) calculations. The PEXSI technique can efficiently utilize the sparsity pattern of the Hamiltonian matrix and the overlap matrix generated from codes such as SIESTA, and solves KSDFT without using a cubic-scaling matrix diagonalization procedure. The complexity of PEXSI scales at most quadratically with respect to the system size, and the accuracy is comparable to that obtained from full diagonalization. One distinct feature of PEXSI is that it achieves low-order scaling without using the near-sightedness property and can therefore be applied to metals as well as insulators and semiconductors, at room temperature or even lower temperatures. The PEXSI method is highly scalable, and the recently developed massively parallel PEXSI technique can make efficient use of 10,000-100,000 processors on high-performance machines. We demonstrate the performance of the SIESTA-PEXSI method using several examples of large-scale electronic structure calculation, including a long DNA chain and graphene-like structures with more than 20,000 atoms. Funded by the Luis Alvarez fellowship at LBNL and the DOE SciDAC project in partnership with BES.

  19. ALEGRA -- A massively parallel h-adaptive code for solid dynamics

    SciTech Connect

    Summers, R.M.; Wong, M.K.; Boucheron, E.A.; Weatherby, J.R.

    1997-12-31

    ALEGRA is a multi-material, arbitrary-Lagrangian-Eulerian (ALE) code for solid dynamics designed to run on massively parallel (MP) computers. It combines the features of modern Eulerian shock codes, such as CTH, with modern Lagrangian structural analysis codes using an unstructured grid. ALEGRA is being developed for use on the teraflop supercomputers to conduct advanced three-dimensional (3D) simulations of shock phenomena important to a variety of systems. ALEGRA was designed with the Single Program Multiple Data (SPMD) paradigm, in which the mesh is decomposed into sub-meshes so that each processor gets a single sub-mesh with approximately the same number of elements. Using this approach the authors have been able to produce a single code that can scale from one processor to thousands of processors. A current major effort is to develop efficient, high precision simulation capabilities for ALEGRA, without the computational cost of using a global highly resolved mesh, through flexible, robust h-adaptivity of finite elements. H-adaptivity is the dynamic refinement of the mesh by subdividing elements, thus changing the characteristic element size and reducing numerical error. The authors are working on several major technical challenges that must be met to make effective use of HAMMER on MP computers.

  20. Massively parallel sequencing-based survey of eukaryotic community structures in Hiroshima Bay and Ishigaki Island.

    PubMed

    Nagai, Satoshi; Hida, Kohsuke; Urusizaki, Shingo; Takano, Yoshihito; Hongo, Yuki; Kameda, Takahiko; Abe, Kazuo

    2016-02-01

    In this study, we compared the eukaryote biodiversity between Hiroshima Bay and Ishigaki Island in Japanese coastal waters by using the massively parallel sequencing (MPS)-based technique to collect preliminary data. The relative abundance of Alveolata was highest in both localities, and the second highest groups were Stramenopiles, Opisthokonta, or Hacrobia, depending on the samples considered. For microalgal phyla, the relative abundance of operational taxonomic units (OTUs) and the number of MPS reads were highest for Dinophyceae in both localities, followed by Bacillariophyceae in Hiroshima Bay, and by Bacillariophyceae or Chlorophyceae in Ishigaki Island. The number of detected OTUs in Hiroshima Bay and Ishigaki Island was 645 and 791, respectively, and 15.3% and 12.5% of the OTUs, respectively, were common between the two localities. In the non-metric multidimensional scaling analysis, the samples from the two localities were plotted in different positions. In the dendrogram developed using similarity indices, the samples were clustered into different nodes based on locality, with high multiscale bootstrap values, reflecting geographic differences in biodiversity. Thus, we succeeded in demonstrating biodiversity differences between the two localities, although the read numbers of the MPS runs were not high enough. Correspondence analysis showed a clear seasonal change in the biodiversity of Hiroshima Bay, but no clear seasonality in Ishigaki Island. Thus, the MPS-based technique shows the great advantage of high performance, detecting several hundred OTUs from a single sample, and strongly suggests its effectiveness for routine monitoring programs. PMID:26476293
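
    The shared-OTU percentages above reduce to a set intersection over OTU identifiers. A toy Python sketch (the identifiers are hypothetical placeholders; the counts are chosen to match the 645 and 791 OTUs and the reported overlap):

        hiroshima = {f"OTU{i:04d}" for i in range(645)}         # 645 OTUs (placeholder IDs)
        ishigaki = {f"OTU{i:04d}" for i in range(546, 1337)}    # 791 OTUs (placeholder IDs)

        shared = hiroshima & ishigaki                           # 99 OTUs in this toy setup
        print(f"{100 * len(shared) / len(hiroshima):.1f}% of Hiroshima Bay OTUs shared")
        print(f"{100 * len(shared) / len(ishigaki):.1f}% of Ishigaki Island OTUs shared")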

  1. Massively parallel enzyme kinetics reveals the substrate recognition landscape of the metalloprotease ADAMTS13

    PubMed Central

    Kretz, Colin A.; Dai, Manhong; Soylemez, Onuralp; Yee, Andrew; Desch, Karl C.; Siemieniak, David; Tomberg, Kärt; Kondrashov, Fyodor A.; Meng, Fan; Ginsburg, David

    2015-01-01

    Proteases play important roles in many biologic processes and are key mediators of cancer, inflammation, and thrombosis. However, comprehensive and quantitative techniques to define the substrate specificity profile of proteases are lacking. The metalloprotease ADAMTS13 regulates blood coagulation by cleaving von Willebrand factor (VWF), reducing its procoagulant activity. A mutagenized substrate phage display library based on a 73-amino acid fragment of VWF was constructed, and the ADAMTS13-dependent change in library complexity was evaluated over reaction time points, using high-throughput sequencing. Reaction rate constants (kcat/KM) were calculated for nearly every possible single amino acid substitution within this fragment. This massively parallel enzyme kinetics analysis detailed the specificity of ADAMTS13 and demonstrated the critical importance of the P1-P1′ substrate residues while defining exosite binding domains. These data provided empirical evidence for the propensity for epistasis within VWF and showed strong correlation to conservation across orthologs, highlighting evolutionary selective pressures for VWF. PMID:26170332
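
    One way per-variant rate constants of this kind can be extracted from sequencing counts: if the enzyme is in effective excess over each individual variant, a variant's normalized abundance decays as exp(-(kcat/KM)[E]t), so kcat/KM follows from a log-linear fit across time points. A hedged Python sketch with hypothetical numbers, not the authors' pipeline:

        import numpy as np

        def kcat_over_km(counts, totals, t, enzyme_conc):
            """Estimate kcat/KM for one substrate variant from read counts at several
            reaction time points, assuming pseudo-first-order depletion:
            fraction(t) = fraction(0) * exp(-(kcat/KM) * [E] * t)."""
            frac = np.asarray(counts) / np.asarray(totals)       # variant abundance per time point
            slope = np.polyfit(t, np.log(frac / frac[0]), 1)[0]  # decay rate of ln(fraction), 1/s
            return -slope / enzyme_conc                          # kcat/KM in M^-1 s^-1

        t = np.array([0.0, 300.0, 900.0, 1800.0])                # reaction times (s), hypothetical
        counts = np.array([5200, 3100, 1100, 240])               # reads for one variant
        totals = np.array([1.0e6, 1.05e6, 0.98e6, 1.02e6])       # total reads per time point
        print(f"kcat/KM ~ {kcat_over_km(counts, totals, t, 5e-9):.3g} M^-1 s^-1")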

  2. Nanopantography: A new method for massively parallel nanopatterning over large areas

    NASA Astrophysics Data System (ADS)

    Xu, Lin

    Nanopantography, a radically new method for versatile fabrication of sub-20 nm features in a massively parallel fashion, represents a breakthrough in nanotechnology. The concept of this technique is to focus ion "beamlets" in parallel to write identical, arbitrary nano-patterns. Depending on the ion species, nanopatterns can be either etched or deposited by nanopantography. An array of electrostatic lenses and a broad-area, directional, monoenergetic ion beam are required to implement nanopantography. This dissertation is dedicated to extracting an ion beam with desired properties from a plasma source and realizing nanopantography using this beam. A novel ion extraction strategy has been used to extract a nearly monoenergetic and energy-specified ion beam from a capacitively-coupled or an inductively-coupled, pulsed Ar plasma. The electron temperature decayed rapidly in the afterglow, resulting in uniform plasma potential, and minimal energy spread for ions extracted in the afterglow. Ion energy was controlled by a DC bias, or alternatively by a high-voltage pulse, on the ring electrode surrounding the plasma. Langmuir probe measurements indicated that this bias raised the plasma potential without heating the electrons in the afterglow. The energy spread was 3.4 eV (FWHM) for a peak ion beam energy of 102.0 eV. Similar results were obtained in an inductively-coupled pulsed plasma when the acceleration ring was pulsed exclusively during the afterglow. To achieve Ni deposition by nanopantography, higher Ni atom and ion densities are desired in the plasma source. An ionized physical vapor deposition (IPVD) system with a Ni internal RF coil and Ni target was used to introduce Ni atoms, and a fraction of the atoms becomes ionized in the high-density plasma. Optical emission spectroscopy (OES) and optical absorption spectroscopy (OAS), in combination with global models, were used to determine the Ni atom and ion density. For a pressure of 8-20 mTorr and coil power of 40

  3. A Massive Parallel Variational Multiscale FEM Scheme Applied to Nonhydrostatic Atmospheric Dynamics

    NASA Astrophysics Data System (ADS)

    Vazquez, Mariano; Marras, Simone; Moragues, Margarida; Jorba, Oriol; Houzeaux, Guillaume; Aubry, Romain

    2010-05-01

    The solution of the fully compressible Euler equations of stratified flows is approached from the point of view of Computational Fluid Dynamics techniques. Specifically, the main aim of this contribution is the introduction of a Variational Multiscale Finite Element (CVMS-FE) approach to solve dry atmospheric dynamics effectively on massively parallel architectures with more than 1000 processors. The conservation form of the equations of motion is discretized in all directions with a Galerkin scheme with stabilization given by the compressible counterpart of the variational multiscale technique of Hughes [1] and Houzeaux et al. [2]. The justification of this effort is twofold. First, the search for optimal parallelization characteristics and linear scalability trends on petascale machines: the development of a numerical algorithm whose local nature keeps communication among processors minimal implies a large leap towards efficient parallel computing. Second, the rising trend towards global models and models of higher spatial resolution naturally suggests the use of adaptive grids to resolve only zones of larger gradients while keeping the computational mesh properly coarse elsewhere (thus keeping the computational cost low). With these two hypotheses in mind, the finite element scheme presented here is an open option for the development of the next generation of Numerical Weather Prediction (NWP) codes. This methodology is as new in Computational Fluid Dynamics for compressible flows at low Mach number as it is in NWP. We mean, however, to show its ability to maintain stability in the solution of thermal, gravity-driven flows in a stratified environment in the specific context of dry atmospheric dynamics. Standard two-dimensional benchmarks are implemented and compared against the reference literature. In the context of thermal and gravity-driven flows in a neutral atmosphere, we present: (1) the density current

  4. Massively parallel computation of 3D flow and reactions in chemical vapor deposition reactors

    SciTech Connect

    Salinger, A.G.; Shadid, J.N.; Hutchinson, S.A.; Hennigan, G.L.; Devine, K.D.; Moffat, H.K.

    1997-12-01

    Computer modeling of Chemical Vapor Deposition (CVD) reactors can greatly aid in the understanding, design, and optimization of these complex systems. Modeling is particularly attractive in these systems since the costs of experimentally evaluating many design alternatives can be prohibitively expensive, time consuming, and even dangerous, when working with toxic chemicals like Arsine (AsH3). Until now, predictive modeling has not been possible for most systems since the behavior is three-dimensional and governed by complex reaction mechanisms. In addition, CVD reactors often exhibit large thermal gradients, large changes in physical properties over regions of the domain, and significant thermal diffusion for gas mixtures with widely varying molecular weights. As a result, significant simplifications in the models have been made which erode the accuracy of the models' predictions. In this paper, the authors will demonstrate how the vast computational resources of massively parallel computers can be exploited to make possible the analysis of models that include coupled fluid flow and detailed chemistry in three-dimensional domains. For the most part, models have either simplified the reaction mechanisms and concentrated on the fluid flow, or have simplified the fluid flow and concentrated on rigorous reactions. An important CVD research thrust has been in detailed modeling of fluid flow and heat transfer in the reactor vessel, treating transport and reaction of chemical species either very simply or as a totally decoupled problem. Using the analogy between heat transfer and mass transfer, and the fact that deposition is often diffusion limited, much can be learned from these calculations; however, the effects of thermal diffusion, the change in physical properties with composition, and the incorporation of surface reaction mechanisms are not included in this model, nor can transitions to three-dimensional flows be detected.

  5. Massively parallel neural circuits for stereoscopic color vision: encoding, decoding and identification.

    PubMed

    Lazar, Aurel A; Slutskiy, Yevgeniy B; Zhou, Yiyin

    2015-03-01

    Past work demonstrated how monochromatic visual stimuli could be faithfully encoded and decoded under Nyquist-type rate conditions. Color visual stimuli were then traditionally encoded and decoded in multiple separate monochromatic channels. The brain, however, appears to mix information about color channels at the earliest stages of the visual system, including the retina itself. If information about color is mixed and encoded by a common pool of neurons, how can colors be demixed and perceived? We present Color Video Time Encoding Machines (Color Video TEMs) for encoding color visual stimuli that take into account a variety of color representations within a single neural circuit. We then derive a Color Video Time Decoding Machine (Color Video TDM) algorithm for color demixing and reconstruction of color visual scenes from spikes produced by a population of visual neurons. In addition, we formulate Color Video Channel Identification Machines (Color Video CIMs) for functionally identifying color visual processing performed by a spiking neural circuit. Furthermore, we derive a duality between TDMs and CIMs that unifies the two and leads to a general theory of neural information representation for stereoscopic color vision. We provide examples demonstrating that a massively parallel color visual neural circuit can be first identified with arbitrary precision and its spike trains can be subsequently used to reconstruct the encoded stimuli. We argue that evaluation of the functional identification methodology can be effectively and intuitively performed in the stimulus space. In this space, a signal reconstructed from spike trains generated by the identified neural circuit can be compared to the original stimulus. PMID:25594573

  6. Detection of inherited mutations for breast and ovarian cancer using genomic capture and massively parallel sequencing

    PubMed Central

    Walsh, Tom; Lee, Ming K.; Casadei, Silvia; Thornton, Anne M.; Stray, Sunday M.; Pennil, Christopher; Nord, Alex S.; Mandell, Jessica B.; Swisher, Elizabeth M.; King, Mary-Claire

    2010-01-01

    Inherited loss-of-function mutations in the tumor suppressor genes BRCA1, BRCA2, and multiple other genes predispose to high risks of breast and/or ovarian cancer. Cancer-associated inherited mutations in these genes are collectively quite common, but individually rare or even private. Genetic testing for BRCA1 and BRCA2 mutations has become an integral part of clinical practice, but testing is generally limited to these two genes and to women with severe family histories of breast or ovarian cancer. To determine whether massively parallel, “next-generation” sequencing would enable accurate, thorough, and cost-effective identification of inherited mutations for breast and ovarian cancer, we developed a genomic assay to capture, sequence, and detect all mutations in 21 genes, including BRCA1 and BRCA2, with inherited mutations that predispose to breast or ovarian cancer. Constitutional genomic DNA from subjects with known inherited mutations, ranging in size from 1 to >100,000 bp, was hybridized to custom oligonucleotides and then sequenced using a genome analyzer. Analysis was carried out blind to the mutation in each sample. Average coverage was >1200 reads per base pair. After filtering sequences for quality and number of reads, all single-nucleotide substitutions, small insertion and deletion mutations, and large genomic duplications and deletions were detected. There were zero false-positive calls of nonsense mutations, frameshift mutations, or genomic rearrangements for any gene in any of the test samples. This approach enables widespread genetic testing and personalized risk assessment for breast and ovarian cancer. PMID:20616022

  7. Massively parallel sampling of lattice proteins reveals foundations of thermal adaptation

    NASA Astrophysics Data System (ADS)

    Venev, Sergey V.; Zeldovich, Konstantin B.

    2015-08-01

    Evolution of proteins in bacteria and archaea living in different conditions leads to significant correlations between amino acid usage and environmental temperature. The origins of these correlations are poorly understood, and an important question of protein theory, physics-based prediction of types of amino acids overrepresented in highly thermostable proteins, remains largely unsolved. Here, we extend the random energy model of protein folding by weighting the interaction energies of amino acids by their frequencies in protein sequences and predict the energy gap of proteins designed to fold well at elevated temperatures. To test the model, we present a novel scalable algorithm for simultaneous energy calculation for many sequences in many structures, targeting massively parallel computing architectures such as graphics processing units. The energy calculation is performed by multiplying two matrices, one representing the complete set of sequences, and the other describing the contact maps of all structural templates. An implementation of the algorithm for the CUDA platform is available at http://www.github.com/kzeldovich/galeprot and calculates protein folding energies over 250 times faster than a single central processing unit. Analysis of amino acid usage in 64-mer cubic lattice proteins designed to fold well at different temperatures demonstrates an excellent agreement between theoretical and simulated values of energy gap. The theoretical predictions of temperature trends of amino acid frequencies are significantly correlated with bioinformatics data on 191 bacteria and archaea, and highlight protein folding constraints as a fundamental selection pressure during thermal adaptation in biological evolution.

  8. A SNP panel for identity and kinship testing using massive parallel sequencing.

    PubMed

    Grandell, Ida; Samara, Raed; Tillmar, Andreas O

    2016-07-01

    Within forensic genetics, there is still a need for supplementary DNA marker typing in order to increase the power to solve cases for both identity testing and complex kinship issues. One major disadvantage of current capillary electrophoresis (CE) methods is the limitation in DNA marker multiplex capability. By utilizing massive parallel sequencing (MPS) technology, this capability can, however, be increased. We have designed a customized GeneRead DNASeq SNP panel (Qiagen) of 140 previously published autosomal forensically relevant identity SNPs for analysis using MPS. One single amplification step was followed by library preparation using the GeneRead Library Prep workflow (Qiagen). The sequencing was performed on a MiSeq System (Illumina), and the bioinformatic analyses were done using the software Biomedical Genomics Workbench (CLC Bio, Qiagen). Forty-nine individuals from a Swedish population were genotyped in order to establish genotype frequencies and to evaluate the performance of the assay. The analyses showed balanced coverage among the included loci, and the heterozygote balance had fewer than 0.5% outliers. Analyses of dilution series of the 2800M Control DNA gave reproducible results down to 0.2 ng DNA input. In addition, typing of FTA samples and bone samples was performed with promising results. Further studies and optimizations are, however, required for a more detailed evaluation of performance on degraded and PCR-inhibited forensic samples. In summary, the assay offers a straightforward sample-to-genotype workflow and could be useful to gain information in forensic casework, both for identity testing and for solving complex kinship issues. PMID:26932869
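
    Heterozygote (allele) balance of the kind evaluated above is commonly computed per heterozygous call as the ratio of the less-covered to the more-covered allele. A small Python sketch; the read counts and the outlier cutoff are assumptions for illustration, not the study's values:

        import numpy as np

        def het_balance(ref_reads, alt_reads):
            """Heterozygote balance per call: minor/major allele read counts.
            Values near 1 indicate balanced alleles."""
            ref, alt = np.asarray(ref_reads, float), np.asarray(alt_reads, float)
            return np.minimum(ref, alt) / np.maximum(ref, alt)

        ref = np.array([480, 510, 530, 120, 495])   # hypothetical allele read counts
        alt = np.array([505, 470, 490, 610, 515])
        hb = het_balance(ref, alt)
        outliers = hb < 0.6                          # assumed balance cutoff
        print(hb.round(2), f"outliers: {100 * outliers.mean():.1f}%")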

  9. Implementation of a Message Passing Interface into a Cloud-Resolving Model for Massively Parallel Computing

    NASA Technical Reports Server (NTRS)

    Juang, Hann-Ming Henry; Tao, Wei-Kuo; Zeng, Xi-Ping; Shie, Chung-Lin; Simpson, Joanne; Lang, Steve

    2004-01-01

    The capability for massively parallel programming (MPP) using a message passing interface (MPI) has been implemented into a three-dimensional version of the Goddard Cumulus Ensemble (GCE) model. The design for the MPP with MPI uses the concept of maintaining a similar code structure for the whole domain as well as for the portions after decomposition. Hence the model follows the same integration for single and multiple tasks (CPUs). Also, it provides for minimal changes to the original code, so it is easily modified and/or managed by the model developers and users who have little knowledge of MPP. The entire model domain can be sliced into a one- or two-dimensional decomposition with a halo regime, which is overlaid on the partial domains. The halo regime requires that no data be fetched across tasks during the computational stage, but it must be updated before the next computational stage through data exchange via MPI. For reproducibility, transposing data among tasks is required for the spectral transform (Fast Fourier Transform, FFT), which is used in the anelastic version of the model for solving the pressure equation. The performance of the MPI-implemented codes (i.e., the compressible and anelastic versions) was tested on three different computing platforms. The major results are: 1) both versions have speedups of about 99% up to 256 tasks but not for 512 tasks; 2) the anelastic version has better speedup and efficiency because it requires more computations than the compressible version; 3) equal or approximately equal numbers of slices between the x- and y-directions provide the fastest integration due to fewer data exchanges; and 4) one-dimensional slices in the x-direction result in the slowest integration due to the need for more memory relocation for computation.
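
    The halo update described above, in its simplest one-dimensional form, amounts to each task sending its edge cells to its neighbors and receiving their edges into its own halo cells before the next computational stage. A minimal mpi4py sketch (the GCE model itself is FORTRAN; the periodic neighbors and array sizes here are illustrative only):

        # Run with e.g.: mpirun -np 4 python halo.py
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
        left, right = (rank - 1) % size, (rank + 1) % size   # periodic neighbors (assumed)

        n_local = 8
        u = np.zeros(n_local + 2)        # interior cells plus two halo cells
        u[1:-1] = rank                   # fill interior with recognizable data

        # Update halos: send interior edge cells, receive into halo cells,
        # one paired exchange in each direction.
        comm.Sendrecv(sendbuf=u[1:2], dest=left, recvbuf=u[-1:], source=right)
        comm.Sendrecv(sendbuf=u[-2:-1], dest=right, recvbuf=u[:1], source=left)

        print(f"rank {rank}: halo L={u[0]:.0f} interior={u[1]:.0f} halo R={u[-1]:.0f}")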

  10. Identification of cancer/testis-antigen genes by massively parallel signature sequencing

    PubMed Central

    Chen, Yao-Tseng; Scanlan, Matthew J.; Venditti, Charis A.; Chua, Ramon; Theiler, Gregory; Stevenson, Brian J.; Iseli, Christian; Gure, Ali O.; Vasicek, Tom; Strausberg, Robert L.; Jongeneel, C. Victor; Old, Lloyd J.; Simpson, Andrew J. G.

    2005-01-01

    Massively parallel signature sequencing (MPSS) generates millions of short sequence tags corresponding to transcripts from a single RNA preparation. Most MPSS tags can be unambiguously assigned to genes, thereby generating a comprehensive expression profile of the tissue of origin. From the comparison of MPSS data from 32 normal human tissues, we identified 1,056 genes that are predominantly expressed in the testis. Further evaluation by using MPSS tags from cancer cell lines and EST data from a wide variety of tumors identified 202 of these genes as candidates for encoding cancer/testis (CT) antigens. Of these genes, the expression in normal tissues was assessed by RT-PCR in a subset of 166 intron-containing genes, and those with confirmed testis-predominant expression were further evaluated for their expression in 21 cancer cell lines. Thus, 20 CT or CT-like genes were identified, with several exhibiting expression in five or more of the cancer cell lines examined. One of these genes is a member of a CT gene family that we designated as CT45. The CT45 family comprises six highly similar (>98% cDNA identity) genes that are clustered in tandem within a 125-kb region on Xq26.3. CT45 was found to be frequently expressed in both cancer cell lines and lung cancer specimens. Thus, MPSS analysis has resulted in a significant extension of our knowledge of CT antigens, leading to the discovery of a distinctive X-linked CT-antigen gene family. PMID:15905330

  11. Massively parallel computation of lattice associative memory classifiers on multicore processors

    NASA Astrophysics Data System (ADS)

    Ritter, Gerhard X.; Schmalz, Mark S.; Hayden, Eric T.

    2011-09-01

    Over the past quarter century, concepts and theory derived from neural networks (NNs) have featured prominently in the literature of pattern recognition. Implementationally, classical NNs based on the linear inner product can present performance challenges due to the use of multiplication operations. In contrast, NNs having nonlinear kernels based on Lattice Associative Memories (LAM) theory tend to concentrate primarily on addition and maximum/minimum operations. More generally, the emergence of LAM-based NNs, with their superior information storage capacity, fast convergence and training due to relatively lower computational cost, as well as noise-tolerant classification has extended the capabilities of neural networks far beyond the limited applications potential of classical NNs. This paper explores theory and algorithmic approaches for the efficient computation of LAM-based neural networks, in particular lattice neural nets and dendritic lattice associative memories. Of particular interest are massively parallel architectures such as multicore CPUs and graphics processing units (GPUs). Originally developed for video gaming applications, GPUs hold the promise of high computational throughput without compromising numerical accuracy. Unfortunately, currently-available GPU architectures tend to have idiosyncratic memory hierarchies that can produce unacceptably high data movement latencies for relatively simple operations, unless careful design of theory and algorithms is employed. Advantageously, some GPUs (e.g., the Nvidia Fermi GPU) are optimized for efficient streaming computation (e.g., concurrent multiply and add operations). As a result, the linear or nonlinear inner product structures of NNs are inherently suited to multicore GPU computational capabilities. In this paper, the authors' recent research in lattice associative memories and their implementation on multicores is overviewed, with results that show utility for a wide variety of pattern
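
    The addition/maximum arithmetic mentioned above replaces the multiply-accumulate of an ordinary matrix product with an add-maximum: C[i,j] = max_k (A[i,k] + B[k,j]). A small NumPy sketch of this lattice (max-plus) product, with arbitrary sizes and values for illustration:

        import numpy as np

        def maxplus(A, B):
            """Lattice 'product' used in lattice associative memories:
            C[i, j] = max_k (A[i, k] + B[k, j]). Broadcasting version for small sizes."""
            return (A[:, :, None] + B[None, :, :]).max(axis=1)

        rng = np.random.default_rng(1)
        A = rng.integers(-5, 5, size=(3, 4))
        B = rng.integers(-5, 5, size=(4, 2))
        print(maxplus(A, B))            # shape (3, 2), no multiplications involved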

  12. Identifying Children With Poor Cochlear Implantation Outcomes Using Massively Parallel Sequencing

    PubMed Central

    Wu, Chen-Chi; Lin, Yin-Hung; Liu, Tien-Chen; Lin, Kai-Nan; Yang, Wei-Shiung; Hsu, Chuan-Jen; Chen, Pei-Lung; Wu, Che-Ming

    2015-01-01

    Cochlear implantation is currently the treatment of choice for children with severe to profound hearing impairment. However, the outcomes with cochlear implants (CIs) vary significantly among recipients. The purpose of the present study is to identify the genetic determinants of poor CI outcomes. Twelve children with poor CI outcomes (the “cases”) and 30 “matched controls” with good CI outcomes were subjected to comprehensive genetic analyses using massively parallel sequencing, which targeted 129 known deafness genes. Audiological features, imaging findings, and auditory/speech performance with CIs were then correlated to the genetic diagnoses. We identified genetic variants which are associated with poor CI outcomes in 7 (58%) of the 12 cases; 4 cases had bi-allelic PCDH15 pathogenic mutations and 3 cases were homozygous for the DFNB59 p.G292R variant. Mutations in the WFS1, GJB3, ESRRB, LRTOMT, MYO3A, and POU3F4 genes were detected in 7 (23%) of the 30 matched controls. The allele frequencies of PCDH15 and DFNB59 variants were significantly higher in the cases than in the matched controls (both P < 0.001). In the 7 CI recipients with PCDH15 or DFNB59 variants, otoacoustic emissions were absent in both ears, and imaging findings were normal in all 7 implanted ears. PCDH15 or DFNB59 variants are associated with poor CI performance, yet children with PCDH15 or DFNB59 variants might show clinical features indistinguishable from those of other typical pediatric CI recipients. Accordingly, genetic examination is indicated in all CI candidates before operation. PMID:26166082

  13. Application of Massively Parallel Sequencing to Genetic Diagnosis in Multiplex Families with Idiopathic Sensorineural Hearing Impairment

    PubMed Central

    Wu, Chen-Chi; Lin, Yin-Hung; Lu, Ying-Chang; Chen, Pei-Jer; Yang, Wei-Shiung; Hsu, Chuan-Jen; Chen, Pei-Lung

    2013-01-01

    Despite the clinical utility of genetic diagnosis to address idiopathic sensorineural hearing impairment (SNHI), the current strategy for screening mutations via Sanger sequencing suffers from the limitation that only a limited number of DNA fragments associated with common deafness mutations can be genotyped. Consequently, a definitive genetic diagnosis cannot be achieved in many families with discernible family history. To investigate the diagnostic utility of massively parallel sequencing (MPS), we applied the MPS technique to 12 multiplex families with idiopathic SNHI in which common deafness mutations had previously been ruled out. NimbleGen sequence capture array was designed to target all protein coding sequences (CDSs) and 100 bp of the flanking sequence of 80 common deafness genes. We performed MPS on the Illumina HiSeq2000, and applied BWA, SAMtools, Picard, GATK, Variant Tools, ANNOVAR, and IGV for bioinformatics analyses. Initial data filtering with allele frequencies (<5% in the 1000 Genomes Project and 5400 NHLBI exomes) and PolyPhen2/SIFT scores (>0.95) prioritized 5 indels (insertions/deletions) and 36 missense variants in the 12 multiplex families. After further validation by Sanger sequencing, segregation pattern, and evolutionary conservation of amino acid residues, we identified 4 variants in 4 different genes, which might lead to SNHI in 4 families compatible with autosomal dominant inheritance. These included GJB2 p.R75Q, MYO7A p.T381M, KCNQ4 p.S680F, and MYH9 p.E1256K. Among them, KCNQ4 p.S680F and MYH9 p.E1256K were novel. In conclusion, MPS allows genetic diagnosis in multiplex families with idiopathic SNHI by detecting mutations in relatively uncommon deafness genes. PMID:23451214
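
    The first-pass filter described above (rare in population databases, predicted damaging) is straightforward to express over a variant table. A pandas sketch with hypothetical column names and values, not the study's actual data:

        import pandas as pd

        # Keep rare variants (population allele frequency < 5%) predicted damaging
        # (PolyPhen-2/SIFT-derived score > 0.95); thresholds follow the abstract.
        variants = pd.DataFrame({
            "gene":      ["GJB2", "MYO7A", "KCNQ4", "MYH9", "OTOF"],
            "af_1000g":  [0.001, 0.0, 0.002, 0.0, 0.12],
            "af_nhlbi":  [0.002, 0.001, 0.0, 0.001, 0.09],
            "dmg_score": [0.99, 0.97, 0.98, 0.96, 0.99],
        })

        kept = variants[(variants.af_1000g < 0.05) &
                        (variants.af_nhlbi < 0.05) &
                        (variants.dmg_score > 0.95)]
        print(kept.gene.tolist())   # candidates for Sanger validation and segregation analysis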

  14. Three-dimensional electromagnetic modeling and inversion on massively parallel computers

    SciTech Connect

    Newman, G.A.; Alumbaugh, D.L.

    1996-03-01

    This report has demonstrated techniques that can be used to construct solutions to the 3-D electromagnetic inverse problem using full wave equation modeling. To this point, great progress has been made in developing an inverse solution using the method of conjugate gradients, which employs a 3-D finite difference solver to construct model sensitivities and predicted data. The forward modeling code has been developed to incorporate absorbing boundary conditions for high frequency solutions (radar), as well as complex electrical properties, including electrical conductivity, dielectric permittivity, and magnetic permeability. In addition, both forward and inverse codes have been ported to a massively parallel computer architecture, which allows for more realistic solutions than can be achieved with serial machines. While the inversion code has been demonstrated on field data collected at the Richmond field site, techniques for appraising the quality of the reconstructions still need to be developed. Here it is suggested that, rather than employing direct matrix inversion to construct the model covariance matrix, which would be impossible because of the size of the problem, one can linearize about the 3-D model achieved in the inverse and use Monte-Carlo simulations to construct it. Using these appraisal and construction tools, it is now necessary to demonstrate 3-D inversion for a variety of EM data sets that span the frequency range from induction sounding to radar: below 100 kHz to 100 MHz. Appraised 3-D images of the earth's electrical properties can provide researchers opportunities to infer the flow paths, flow rates, and perhaps the chemistry of fluids in geologic media. It also offers a means to study the frequency-dependent behavior of the properties in situ. This is of significant relevance to the Department of Energy, being paramount to the characterization and monitoring of environmental waste sites and to oil and gas exploration.

  15. Transcriptional analysis of the Arabidopsis ovule by massively parallel signature sequencing

    PubMed Central

    Sánchez-León, Nidia; Arteaga-Vázquez, Mario; Alvarez-Mejía, César; Mendiola-Soto, Javier; Durán-Figueroa, Noé; Rodríguez-Leal, Daniel; Rodríguez-Arévalo, Isaac; García-Campayo, Vicenta; García-Aguilar, Marcelina; Olmedo-Monfil, Vianey; Arteaga-Sánchez, Mario; Martínez de la Vega, Octavio; Nobuta, Kan; Vemaraju, Kalyan; Meyers, Blake C.; Vielle-Calzada, Jean-Philippe

    2012-01-01

    The life cycle of flowering plants alternates between a predominant sporophytic (diploid) and an ephemeral gametophytic (haploid) generation that only occurs in reproductive organs. In Arabidopsis thaliana, the female gametophyte is deeply embedded within the ovule, complicating the study of the genetic and molecular interactions involved in the sporophytic to gametophytic transition. Massively parallel signature sequencing (MPSS) was used to conduct a quantitative large-scale transcriptional analysis of the fully differentiated Arabidopsis ovule prior to fertilization. The expression of 9775 genes was quantified in wild-type ovules, additionally detecting >2200 new transcripts mapping to antisense or intergenic regions. A quantitative comparison of global expression in wild-type and sporocyteless (spl) individuals resulted in 1301 genes showing 25-fold reduced or null activity in ovules lacking a female gametophyte, including those encoding 92 signalling proteins, 75 transcription factors, and 72 RNA-binding proteins not reported in previous studies based on microarray profiling. A combination of independent genetic and molecular strategies confirmed the differential expression of 28 of them, showing that they are either preferentially active in the female gametophyte, or dependent on the presence of a female gametophyte to be expressed in sporophytic cells of the ovule. Among 18 genes encoding pentatricopeptide-repeat proteins (PPRs) that show transcriptional activity in wild-type but not spl ovules, CIHUATEOTL (At4g38150) is specifically expressed in the female gametophyte and necessary for female gametogenesis. These results expand the nature of the transcriptional universe present in the ovule of Arabidopsis, and offer a large-scale quantitative reference of global expression for future genomic and developmental studies. PMID:22442422

  16. Systematic comparison of three genomic enrichment methods for massively parallel DNA sequencing.

    PubMed

    Teer, Jamie K; Bonnycastle, Lori L; Chines, Peter S; Hansen, Nancy F; Aoyama, Natsuyo; Swift, Amy J; Abaan, Hatice Ozel; Albert, Thomas J; Margulies, Elliott H; Green, Eric D; Collins, Francis S; Mullikin, James C; Biesecker, Leslie G

    2010-10-01

    Massively parallel DNA sequencing technologies have greatly increased our ability to generate large amounts of sequencing data at a rapid pace. Several methods have been developed to enrich for genomic regions of interest for targeted sequencing. We have compared three of these methods: Molecular Inversion Probes (MIP), Solution Hybrid Selection (SHS), and Microarray-based Genomic Selection (MGS). Using HapMap DNA samples, we compared each of these methods with respect to their ability to capture an identical set of exons and evolutionarily conserved regions associated with 528 genes (2.61 Mb). For sequence analysis, we developed and used a novel Bayesian genotype-assigning algorithm, Most Probable Genotype (MPG). All three capture methods were effective, but sensitivities (percentage of targeted bases associated with high-quality genotypes) varied for an equivalent amount of pass-filtered sequence: for example, 70% (MIP), 84% (SHS), and 91% (MGS) for 400 Mb. In contrast, all methods yielded similar accuracies of >99.84% when compared to Infinium 1M SNP BeadChip-derived genotypes and >99.998% when compared to 30-fold coverage whole-genome shotgun sequencing data. We also observed a low false-positive rate with all three methods; of the heterozygous positions identified by each of the capture methods, >99.57% agreed with 1M SNP BeadChip, and >98.840% agreed with the whole-genome shotgun data. In addition, we successfully piloted the genomic enrichment of a set of 12 pooled samples via the MGS method using molecular bar codes. We find that these three genomic enrichment methods are highly accurate and practical, with sensitivities comparable to that of 30-fold coverage whole-genome shotgun data. PMID:20810667
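
    A Bayesian genotype assigner of the general kind named above (Most Probable Genotype) can be sketched as a posterior over diploid genotypes given reference/alternate read counts, with a binomial read-count likelihood. The priors, error rate, and counts below are assumptions for illustration, not MPG's actual model:

        import numpy as np
        from scipy.stats import binom

        def most_probable_genotype(ref_reads, alt_reads, err=0.01,
                                   prior=(0.985, 0.01, 0.005)):
            """Toy Bayesian genotype assignment: binomial likelihood of the
            alt-read count under each diploid genotype, times a genotype prior."""
            n = ref_reads + alt_reads
            p_alt = {"hom_ref": err, "het": 0.5, "hom_alt": 1 - err}
            genotypes = ("hom_ref", "het", "hom_alt")
            like = np.array([binom.pmf(alt_reads, n, p_alt[g]) for g in genotypes])
            post = like * np.asarray(prior)
            post /= post.sum()
            return genotypes[post.argmax()], post.max()

        print(most_probable_genotype(412, 395))   # -> ('het', ~1.0)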

  17. Water mass-specificity of bacterial communities in the North Atlantic revealed by massively parallel sequencing

    PubMed Central

    Agogué, Hélène; Lamy, Dominique; Neal, Phillip R.; Sogin, Mitchell L.; Herndl, Gerhard J.

    2011-01-01

    Bacterial assemblages from subsurface (100 m depth), meso- (200–1000 m depth) and bathy-pelagic (below 1000 m depth) zones at 10 stations along a North Atlantic Ocean transect from 60°N to 5°S were characterized using massively parallel pyrotag sequencing of the V6 region of the 16S rRNA gene (V6 pyrotags). In a dataset of more than 830,000 pyrotags we identified 10,780 OTUs of which 52% were singletons. The singletons accounted for less than 2% of the OTU abundance, while the 100 and 1,000 most abundant OTUs represented 80% and 96%, respectively, of all recovered OTUs. Non-metric Multi-Dimensional Scaling and Canonical Correspondence Analysis of all the OTUs excluding the singletons revealed a clear clustering of the bacterial communities according to the water masses. More than 80% of the 1,000 most abundant OTUs corresponded to Proteobacteria of which 55% were Alphaproteobacteria, mostly composed of the SAR11 cluster. Gammaproteobacteria increased with depth and included a relatively large number of OTUs belonging to Alteromonadales and Oceanospirillales. The bathypelagic zone showed higher taxonomic evenness than the overlying waters, albeit bacterial diversity was remarkably variable. Both abundant and low-abundance OTUs were responsible for the distinct bacterial communities characterizing the major deep-water masses. Taken together, our results reveal that deep-water masses act as bio-oceanographic islands for bacterioplankton leading to water mass-specific bacterial communities in the deep waters of the Atlantic. PMID:21143328

  18. The complete genome of an individual by massively parallel DNA sequencing.

    PubMed

    Wheeler, David A; Srinivasan, Maithreyan; Egholm, Michael; Shen, Yufeng; Chen, Lei; McGuire, Amy; He, Wen; Chen, Yi-Ju; Makhijani, Vinod; Roth, G Thomas; Gomes, Xavier; Tartaro, Karrie; Niazi, Faheem; Turcotte, Cynthia L; Irzyk, Gerard P; Lupski, James R; Chinault, Craig; Song, Xing-zhi; Liu, Yue; Yuan, Ye; Nazareth, Lynne; Qin, Xiang; Muzny, Donna M; Margulies, Marcel; Weinstock, George M; Gibbs, Richard A; Rothberg, Jonathan M

    2008-04-17

    The association of genetic variation with disease and drug response, and improvements in nucleic acid technologies, have given great optimism for the impact of 'genomic medicine'. However, the formidable size of the diploid human genome, approximately 6 gigabases, has prevented the routine application of sequencing methods to deciphering complete individual human genomes. To realize the full potential of genomics for human health, this limitation must be overcome. Here we report the DNA sequence of a diploid genome of a single individual, James D. Watson, sequenced to 7.4-fold redundancy in two months using massively parallel sequencing in picolitre-size reaction vessels. This sequence was completed in two months at approximately one-hundredth of the cost of traditional capillary electrophoresis methods. Comparison of the sequence to the reference genome led to the identification of 3.3 million single nucleotide polymorphisms, of which 10,654 cause amino-acid substitution within the coding sequence. In addition, we accurately identified small-scale (2-40,000 base pair (bp)) insertion and deletion polymorphism as well as copy number variation resulting in the large-scale gain and loss of chromosomal segments ranging from 26,000 to 1.5 million base pairs. Overall, these results agree well with recent results of sequencing of a single individual by traditional methods. However, in addition to being faster and significantly less expensive, this sequencing technology avoids the arbitrary loss of genomic sequences inherent in random shotgun sequencing by bacterial cloning because it amplifies DNA in a cell-free system. As a result, we further demonstrate the acquisition of novel human sequence, including novel genes not previously identified by traditional genomic sequencing. This is the first genome sequenced by next-generation technologies. Therefore it is a pilot for the future challenges of 'personalized genome sequencing'. PMID:18421352

  19. Investigations on the usefulness of the Massively Parallel Processor for study of electronic properties of atomic and condensed matter systems

    NASA Technical Reports Server (NTRS)

    Das, T. P.

    1988-01-01

    The usefulness of the Massively Parallel Processor (MPP) for investigation of electronic structures and hyperfine properties of atomic and condensed matter systems was explored. The major effort was directed towards the preparation of algorithms for parallelization of the computational procedure being used on serial computers for electronic structure calculations in condensed matter systems. Detailed descriptions of investigations and results are reported, including MPP adaptation of self-consistent charge extended Hueckel (SCCEH) procedure, MPP adaptation of the first-principles Hartree-Fock cluster procedure for electronic structures of large molecules and solid state systems, and MPP adaptation of the many-body procedure for atomic systems.

  20. 3D frequency modeling of elastic seismic wave propagation via a structured massively parallel direct Helmholtz solver

    NASA Astrophysics Data System (ADS)

    Wang, S.; De Hoop, M. V.; Xia, J.; Li, X.

    2011-12-01

    We consider the modeling of elastic seismic wave propagation on a rectangular domain via the discretization and solution of the inhomogeneous coupled Helmholtz equation in 3D, by exploiting a parallel multifrontal sparse direct solver equipped with Hierarchically Semi-Separable (HSS) structure to reduce the computational complexity and storage. In particular, we are concerned with solving this equation on a large domain, for a large number of different forcing terms in the context of seismic problems in general, and modeling in particular. We resort to a parsimonious mixed-grid finite-difference scheme for discretizing the Helmholtz operator and Perfectly Matched Layer boundaries, resulting in a non-Hermitian matrix. We make use of a nested-dissection-based domain decomposition, and introduce an approximate direct solver by developing a parallel HSS matrix compression, factorization, and solution approach. We cast our massive parallelization in the framework of the multifrontal method. The assembly tree is partitioned into local trees and a global tree. The local trees are eliminated independently on each processor, while the global tree is eliminated through massive communication. The solver for the inhomogeneous equation is a parallel hybrid between multifrontal and HSS structure. The computational complexity associated with the factorization is almost linear with the size of the Helmholtz matrix. Our numerical approach can be compared with the spectral element method in 3D seismic applications.
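
    The payoff of a direct solver for "a large number of different forcing terms" is that the expensive factorization is done once and each additional source costs only cheap triangular solves. A serial SciPy sketch of that workflow on a small placeholder non-Hermitian matrix (the actual solver is a parallel, HSS-compressed multifrontal code):

        import numpy as np
        import scipy.sparse as sp
        import scipy.sparse.linalg as spla

        n = 2000
        # Placeholder 1-D Helmholtz-like stencil with a complex (absorbing) shift
        main = 2.0 + 0.05j - 0.01 * np.arange(n)
        A = sp.diags([main, -np.ones(n - 1), -np.ones(n - 1)], [0, -1, 1], format="csc")

        lu = spla.splu(A)                            # factorize once (dominant cost)
        rng = np.random.default_rng(2)
        b = rng.normal(size=(n, 50)).astype(complex)  # 50 forcing terms
        x = lu.solve(b)                              # every solve reuses the factors
        print(x.shape, np.linalg.norm(A @ x - b))    # residual check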

  1. Underlying Data for Sequencing the Mitochondrial Genome with the Massively Parallel Sequencing Platform Ion Torrent™ PGM™

    PubMed Central

    2015-01-01

    Background Massively parallel sequencing (MPS) technologies have the capacity to sequence targeted regions or whole genomes of multiple nucleic acid samples with high coverage by sequencing millions of DNA fragments simultaneously. Compared with Sanger sequencing, MPS also can reduce labor and cost on a per-nucleotide basis and indeed on a per-sample basis. In this study, whole genomes of human mitochondria (mtGenome) were sequenced on the Personal Genome Machine (PGM™) (Life Technologies, San Francisco, CA), the output data were assessed, and the results were compared with data previously generated on the MiSeq™ (Illumina, San Diego, CA). The objectives of this paper were to determine the feasibility, accuracy, and reliability of sequence data obtained from the PGM. Results 24 samples were multiplexed (in groups of six) and sequenced on the 314 chip, which has a throughput of at least 10 megabases. The depth-of-coverage pattern was similar among all 24 samples; however, the coverage across the genome varied. For strand bias, the average ratio of coverage between the forward and reverse strands at each nucleotide position indicated that two-thirds of the positions of the genome had ratios greater than 0.5. A few sites had more extreme strand bias. Another observation was that 156 positions had a false deletion rate greater than 0.15 in one or more individuals. There were 31-98 SNP mtGenome variants observed per sample for the 24 samples analyzed. All 1237 SNP variants were concordant between the results from the PGM and the MiSeq. The quality scores for haplogroup assignment for all 24 samples ranged between 88.8% and 100%. Conclusions In this study, mtDNA sequence data generated from the PGM were analyzed and the output evaluated. Depth-of-coverage variation and strand bias were identified but generally were infrequent and did not impact the reliability of variant calls. Multiplexing of samples was demonstrated, which can improve throughput and reduce cost per sample analyzed
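
    A per-position strand-bias screen of the kind reported above can be computed as the ratio of the lower to the higher strand coverage at each position (1.0 = perfectly balanced). An illustrative Python sketch with made-up coverage values; the paper's exact metric may differ:

        import numpy as np

        def strand_bias(fwd, rev):
            """Per-position strand bias: lower strand coverage over higher."""
            fwd, rev = np.asarray(fwd, float), np.asarray(rev, float)
            return np.minimum(fwd, rev) / np.maximum(fwd, rev)

        fwd = np.array([310, 290, 15, 300, 260])     # hypothetical per-position coverage
        rev = np.array([295, 305, 280, 310, 30])
        sb = strand_bias(fwd, rev)
        print(sb.round(2), f"positions > 0.5: {(sb > 0.5).mean():.0%}")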

  2. A massively parallel adaptive scheme for melt migration in geodynamics computations

    NASA Astrophysics Data System (ADS)

    Dannberg, Juliane; Heister, Timo; Grove, Ryan

    2016-04-01

    Melt generation and migration are important processes for the evolution of the Earth's interior and impact the global convection of the mantle. While they have been the subject of numerous investigations, the typical time and length-scales of melt transport are vastly different from global mantle convection, which determines where melt is generated. This makes it difficult to study mantle convection and melt migration in a unified framework. In addition, modelling magma dynamics poses the challenge of highly non-linear and spatially variable material properties, in particular the viscosity. We describe our extension of the community mantle convection code ASPECT that adds equations describing the behaviour of silicate melt percolating through and interacting with a viscously deforming host rock. We use the original compressible formulation of the McKenzie equations, augmented by an equation for the conservation of energy. This approach includes both melt migration and melt generation with the accompanying latent heat effects, and it incorporates the individual compressibilities of the solid and the fluid phase. For this, we derive an accurate and stable Finite Element scheme that can be combined with adaptive mesh refinement. This is particularly advantageous for this type of problem, as the resolution can be increased in mesh cells where melt is present and viscosity gradients are high, whereas a lower resolution is sufficient in regions without melt. Together with a high-performance, massively parallel implementation, this allows for high resolution, 3d, compressible, global mantle convection simulations coupled with melt migration. Furthermore, scalable iterative linear solvers are required to solve the large linear systems arising from the discretized system. Finally, we present benchmarks and scaling tests of our solver up to tens of thousands of cores, show the effectiveness of adaptive mesh refinement when applied to melt migration and compare the

  3. Evaluation of Two Highly-Multiplexed Custom Panels for Massively Parallel Semiconductor Sequencing on Paraffin DNA

    PubMed Central

    Kotoula, Vassiliki; Lyberopoulou, Aggeliki; Papadopoulou, Kyriaki; Charalambous, Elpida; Alexopoulou, Zoi; Gakou, Chryssa; Lakis, Sotiris; Tsolaki, Eleftheria; Lilakos, Konstantinos; Fountzilas, George

    2015-01-01

    Background—Aim Massively parallel sequencing (MPS) holds promise for expanding cancer translational research and diagnostics. As yet, it has been applied on paraffin DNA (FFPE) with commercially available highly multiplexed gene panels (100s of DNA targets), while custom panels of low multiplexing are used for re-sequencing. Here, we evaluated the performance of two highly multiplexed custom panels on FFPE DNA. Methods Two custom multiplex amplification panels (B, 373 amplicons; T, 286 amplicons) were coupled with semiconductor sequencing on DNA samples from FFPE breast tumors and matched peripheral blood samples (n samples: 316; n libraries: 332). The two panels shared 37% DNA targets (common or shifted amplicons). Panel performance was evaluated in paired sample groups and quartets of libraries, where possible. Results Amplicon read ratios yielded similar patterns per gene with the same panel in FFPE and blood samples; however, performance of common amplicons differed between panels (p<0.001). FFPE genotypes were compared for 1267 coding and non-coding variant replicates, 999 out of which (78.8%) were concordant in different paired sample combinations. Variant frequency was highly reproducible (Spearman’s rho 0.959). Repeatedly discordant variants were of high coverage / low frequency (p<0.001). Genotype concordance was (a) high, for intra-run duplicates with the same panel (mean±SD: 97.2±4.7, 95%CI: 94.8–99.7, p<0.001); (b) modest, when the same DNA was analyzed with different panels (mean±SD: 81.1±20.3, 95%CI: 66.1–95.1, p = 0.004); and (c) low, when different DNA samples from the same tumor were compared with the same panel (mean±SD: 59.9±24.0; 95%CI: 43.3–76.5; p = 0.282). Low coverage / low frequency variants were validated with Sanger sequencing even in samples with unfavourable DNA quality. Conclusions Custom MPS may yield novel information on genomic alterations, provided that data evaluation is adjusted to tumor tissue FFPE DNA. To this
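
    The replicate-agreement bookkeeping reported above (pairwise genotype concordance plus Spearman's rho on variant frequencies) reduces to a few array operations; the sketch below is illustrative, with hypothetical inputs rather than the authors' pipeline.

    ```python
    import numpy as np
    from scipy.stats import spearmanr

    def replicate_agreement(af_run1, af_run2, called1, called2):
        """Compare two libraries over the same candidate variants:
        concordance = variants called in both / variants called in either;
        rho = Spearman correlation of allele frequencies where both call."""
        either = called1 | called2            # boolean masks per variant
        both = called1 & called2
        concordance = both.sum() / either.sum()
        rho, _ = spearmanr(af_run1[both], af_run2[both])
        return concordance, rho
    ```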

  4. Performance analysis of three dimensional integral equation computations on a massively parallel computer. M.S. Thesis

    NASA Technical Reports Server (NTRS)

    Logan, Terry G.

    1994-01-01

    The purpose of this study is to investigate the performance of integral equation computations using a numerical source field-panel method in a massively parallel processing (MPP) environment. A comparative study of the computational performance of the MPP CM-5 computer and the conventional Cray-YMP supercomputer for a three-dimensional flow problem is made. A serial FORTRAN code is converted into a parallel CM-FORTRAN code. Performance results are obtained on the CM-5 with 32, 64, and 128 nodes, along with those on the Cray-YMP with a single processor. The comparison indicates that the parallel CM-FORTRAN code matches or outperforms the equivalent serial FORTRAN code in some cases.

  5. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn R.

    2010-02-23

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of an azide with a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners, resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  6. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn

    2006-10-17

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of an azide with a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners, resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  7. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn Ruth

    2011-12-13

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of an azide with a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners, resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  8. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn Ruth

    2010-11-23

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of an azide with a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners, resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  9. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn

    2003-05-27

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of an azide with a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners, resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  10. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn R.

    2011-05-10

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of an azide with a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners, resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  11. Chemoselective ligation

    DOEpatents

    Saxon, Eliana; Bertozzi, Carolyn R.

    2011-04-12

    The present invention features a chemoselective ligation reaction that can be carried out under physiological conditions. In general, the invention involves condensation of an azide with a specifically engineered phosphine, which can provide for formation of an amide bond between the two reactive partners, resulting in a final product comprising a phosphine moiety, or which can be engineered to comprise a cleavable linker so that a substituent of the phosphine is transferred to the azide, releasing an oxidized phosphine byproduct and producing a native amide bond in the final product. The selectivity of the reaction and its compatibility with aqueous environments provides for its application in vivo (e.g., on the cell surface or intracellularly) and in vitro (e.g., synthesis of peptides and other polymers, production of modified (e.g., labeled) amino acids).

  12. A massively parallel algorithm for the collision probability calculations in the Apollo-II code using the PVM library

    SciTech Connect

    Stankovski, Z.

    1995-12-31

    The collision probability method in neutron transport, as applied to 2D geometries, consumes a great amount of computer time; for a typical 2D assembly calculation, about 90% of the computing time is consumed in the collision probability evaluations. Consequently, RZ or 3D calculations become prohibitive. In this paper the author presents a simple but efficient parallel algorithm based on the message-passing host/node programming model. Parallelization was applied to the energy-group treatment. Such an approach permits parallelization of the existing code with only limited modifications. Sequential/parallel portability is preserved, which is a necessary condition for an industrial code, and sequential performance is also preserved. The algorithm is implemented on a CRAY 90 coupled to a 128-processor T3D computer, on a 16-processor IBM SP1, and on a network of workstations, using the public-domain PVM library. The tests were executed for a 2D geometry with the standard 99-group library. All results were very satisfactory, the best being obtained on the IBM SP1. Because of the heterogeneity of the workstation network, the author did not expect high performance from that architecture. The same source code was used on all computers. A more impressive advantage of this algorithm will appear in the calculations of the SAPHYR project (with the future fine multigroup library of about 8000 groups) on a massively parallel computer using several hundred processors.
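
    The host/node (master/worker) pattern over energy groups maps directly onto any task-farming layer. PVM has no maintained Python binding, so the sketch below illustrates the same decomposition with Python's multiprocessing; the per-group kernel is a placeholder, not Apollo-II code.

    ```python
    from multiprocessing import Pool

    def collision_probabilities(group):
        # Placeholder for the per-energy-group collision probability
        # evaluation -- the ~90%-of-runtime kernel in a 2D assembly run.
        return group, sum(i * i for i in range(50_000)) % 1009

    if __name__ == "__main__":
        groups = range(99)            # the standard 99-group library
        with Pool() as pool:          # "host" farms groups out to "nodes"
            results = dict(pool.map(collision_probabilities, groups))
        print(f"{len(results)} groups completed")
    ```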

  13. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOEpatents

    Karasick, M.S.; Strip, D.R.

    1996-01-30

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modeling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modeling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modeling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication. 8 figs.
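
    As a concrete illustration of the d-edge idea (one record per edge-face incidence, carrying the owning processor's unique label), here is a minimal sketch; the field layout is illustrative rather than the patented structure.

    ```python
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class DEdge:
        """Directed edge bound to exactly one face of the solid, so each
        processor can operate on its own d-edges without communication."""
        tail: tuple[float, float, float]  # coordinates of the edge's start vertex
        head: tuple[float, float, float]  # coordinates of the edge's end vertex
        face: int                         # label of the single incident face
        owner: int                        # unique label of the owning processor

    # A triangular face (label 0) on processor 3 yields three d-edges:
    ring = [((0, 0, 0), (1, 0, 0)), ((1, 0, 0), (0, 1, 0)), ((0, 1, 0), (0, 0, 0))]
    dedges = [DEdge(a, b, face=0, owner=3) for a, b in ring]
    ```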

  14. System and method for representing and manipulating three-dimensional objects on massively parallel architectures

    DOEpatents

    Karasick, Michael S.; Strip, David R.

    1996-01-01

    A parallel computing system is described that comprises a plurality of uniquely labeled, parallel processors, each processor capable of modelling a three-dimensional object that includes a plurality of vertices, faces and edges. The system comprises a front-end processor for issuing a modelling command to the parallel processors, relating to a three-dimensional object. Each parallel processor, in response to the command and through the use of its own unique label, creates a directed-edge (d-edge) data structure that uniquely relates an edge of the three-dimensional object to one face of the object. Each d-edge data structure at least includes vertex descriptions of the edge and a description of the one face. As a result, each processor, in response to the modelling command, operates upon a small component of the model and generates results, in parallel with all other processors, without the need for processor-to-processor intercommunication.

  15. Method and apparatus for obtaining stack traceback data for multiple computing nodes of a massively parallel computer system

    DOEpatents

    Gooding, Thomas Michael; McCarthy, Patrick Joseph

    2010-03-02

    A data collector for a massively parallel computer system obtains call-return stack traceback data for multiple nodes by retrieving partial call-return stack traceback data from each node, grouping the nodes in subsets according to the partial traceback data, and obtaining further call-return stack traceback data from a representative node or nodes of each subset. Preferably, the partial data is a respective instruction address from each node, nodes having identical instruction addresses being grouped together in the same subset. Preferably, a single node of each subset is chosen and full stack traceback data is retrieved from the call-return stack within the chosen node.
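
    The grouping step described above fits in a few lines; the sketch assumes the partial datum is one instruction address per node, as in the preferred embodiment.

    ```python
    from collections import defaultdict

    def traceback_representatives(partial):
        """partial: dict mapping node_id -> instruction address (the cheap,
        partial traceback datum). Nodes sharing an address form a subset;
        full tracebacks are then fetched only from one node per subset."""
        subsets = defaultdict(list)
        for node, addr in partial.items():
            subsets[addr].append(node)
        return {addr: nodes[0] for addr, nodes in subsets.items()}

    # Example: 6 nodes collapse to 2 subsets, so only 2 full tracebacks.
    print(traceback_representatives(
        {0: 0x4007F0, 1: 0x4007F0, 2: 0x401230,
         3: 0x4007F0, 4: 0x401230, 5: 0x4007F0}))
    ```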

  16. A Precision Dose Control Circuit for Maskless E-Beam Lithography With Massively Parallel Vertically Aligned Carbon Nanofibers

    SciTech Connect

    Eliza, Sazia A.; Islam, Syed K; Rahman, Touhidur; Bull, Nora D; Blalock, Benjamin; Baylor, Larry R; Ericson, Milton Nance; Gardner, Walter L

    2011-01-01

    This paper describes a highly accurate dose control circuit (DCC) for the emission of a desired number of electrons from vertically aligned carbon nanofibers (VACNFs) in a massively parallel maskless e-beam lithography system. The parasitic components within the VACNF device cause a premature termination of the electron emission, resulting in underexposure of the photoresist. In this paper, we compensate for the effects of the parasitic components and noise while reducing the area of the chip and achieving a precise count of emitted electrons from the VACNFs to obtain the optimum dose for the e-beam lithography.

  17. Efficient Extraction of Regional Subsets from Massive Climate Datasets using Parallel IO

    SciTech Connect

    Daily, Jeffrey A.; Schuchardt, Karen L.; Palmer, Bruce J.

    2010-09-16

    The size of datasets produced by current climate models is increasing rapidly to the scale of petabytes. To handle data at this scale, parallel analysis tools are required; however, the majority of climate analysis software remains at the scale of workstations. Further, many climate analysis tools adequately process regularly gridded data but lack sufficient features when handling unstructured grids. This paper presents a data-parallel subsetter capable of correctly handling unstructured grids while scaling to over 2000 cores. The approach is based on the partitioned global address space (PGAS) parallel programming model and one-sided communication. The paper demonstrates that IO remains the single greatest bottleneck for this domain of applications and that parallel analysis of climate data succeeds in practice.

  18. Massively parallel implementation of the multi-reference Brillouin-Wigner CCSD method

    SciTech Connect

    Brabec, Jiri; Krishnamoorthy, Sriram; van Dam, Hubertus JJ; Kowalski, Karol; Pittner, Jiri

    2011-10-06

    This paper reports the parallel implementation of the Brillouin-Wigner Multi-Reference Coupled Cluster method with Single and Double excitations (BW-MRCCSD). Preliminary tests for systems composed of 304 and 440 correlated orbitals demonstrate the performance of our implementation across 1000 cores and clearly indicate the advantages of using improved task scheduling. Possible ways for further improvements of the parallel performance are also delineated.

  19. A parallel computing tool for large-scale simulation of massive fluid injection in thermo-poro-mechanical systems

    NASA Astrophysics Data System (ADS)

    Karrech, Ali; Schrank, Christoph; Regenauer-Lieb, Klaus

    2015-10-01

    Massive fluid injections into the Earth's upper crust are commonly used to stimulate permeability in geothermal reservoirs, enhance recovery in oil reservoirs, store carbon dioxide and so forth. Currently used models for reservoir simulation are limited to small perturbations and/or hydraulic aspects that are insufficient to describe the complex thermal-hydraulic-mechanical behaviour of natural geomaterials. Comprehensive approaches, which take into account the non-linear mechanical deformations of rock masses, fluid flow in percolating pore spaces, and changes of temperature due to heat transfer, are necessary to predict the behaviour of deep geo-materials subjected to high pressure and temperature changes. In this paper, we introduce a thermodynamically consistent poromechanics formulation which includes coupled thermal, hydraulic and mechanical processes. Moreover, we propose a numerical integration strategy based on massively parallel computing. The proposed formulations and numerical integration are validated using analytical solutions of simple multi-physics problems. As a representative application, we investigate the massive injection of fluids within a deep formation to mimic the conditions of reservoir stimulation. The model showed, for instance, the effects of initial pre-existing stress fields on the orientations of stimulation-induced failures.

  20. Parallel selections in vitro reveal a preference for 2'-5' RNA ligation upon deoxyribozyme-mediated opening of a 2',3'-cyclic phosphate.

    PubMed

    Semlow, Daniel R; Silverman, Scott K

    2005-08-01

    We previously used in vitro selection to identify Mg(2+)-dependent deoxyribozymes that mediate the ligation reaction of an RNA 5'-hydroxyl group with a 2',3'-cyclic phosphate. In these efforts, all of the deoxyribozymes were identified via a common in vitro selection strategy, and all of the newly formed RNA linkages were non-native 2'-5' phosphodiester bonds rather than native 3'-5' linkages. Here we performed several new selections in which the relative arrangements of RNA and DNA were different as compared with the earlier studies. In all cases, we again find deoxyribozymes that create only 2'-5' linkages. This includes deoxyribozymes with an arrangement that favors 3'-5' linkages for a different chemical reaction, that of a 2',3'-diol plus 5'-triphosphate. These data indicate a strong and context-independent chemical preference for creating 2'-5' RNA linkages upon opening of a 2',3'-cyclic phosphate with a 5'-hydroxyl group. Preliminary assays show that some of the newly identified deoxyribozymes have promise for ligating RNA in a sequence-general fashion. Because 2',3'-cyclic phosphates are the products of uncatalyzed RNA backbone cleavage, their ligation reactions may be of direct relevance to the RNA World hypothesis. PMID:16007488

  1. Massively parallel computing simulation of fluid flow in the unsaturated zone of Yucca Mountain, Nevada

    SciTech Connect

    Zhang, Keni; Wu, Yu-Shu; Bodvarsson, G.S.

    2001-08-31

    This paper presents the application of parallel computing techniques to large-scale modeling of fluid flow in the unsaturated zone (UZ) at Yucca Mountain, Nevada. In this study, parallel computing techniques, as implemented into the TOUGH2 code, are applied in large-scale numerical simulations on a distributed-memory parallel computer. The modeling study has been conducted using an over-one-million-cell three-dimensional numerical model, which incorporates a wide variety of field data for the highly heterogeneous fractured formation at Yucca Mountain. The objective of this study is to analyze the impact of various surface infiltration scenarios (under current and possible future climates) on flow through the UZ system, using various hydrogeological conceptual models with refined grids. The results indicate that the one-million-cell models produce higher-resolution results and reveal some flow patterns that cannot be obtained using coarse-grid models.

  2. Satisfiability Test with Synchronous Simulated Annealing on the Fujitsu AP1000 Massively-Parallel Multiprocessor

    NASA Technical Reports Server (NTRS)

    Sohn, Andrew; Biswas, Rupak

    1996-01-01

    Solving the hard Satisfiability Problem is time consuming even for modest-sized problem instances. Solving the Random L-SAT Problem is especially difficult due to the ratio of clauses to variables. This report presents a parallel synchronous simulated annealing method for solving the Random L-SAT Problem on a large-scale distributed-memory multiprocessor. In particular, we use a parallel synchronous simulated annealing procedure, called Generalized Speculative Computation, which guarantees the same decision sequence as sequential simulated annealing. To demonstrate the performance of the parallel method, we have selected problem instances varying in size from 100-variables/425-clauses to 5000-variables/21,250-clauses. Experimental results on the AP1000 multiprocessor indicate that our approach can satisfy 99.9 percent of the clauses while giving almost a 70-fold speedup on 500 processors.
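
    For orientation, the sequential simulated-annealing kernel that the paper's Generalized Speculative Computation parallelizes (while preserving its decision sequence) looks roughly like the sketch below; the clause encoding, cooling schedule, and parameters are invented.

    ```python
    import math
    import random

    def anneal_sat(clauses, n_vars, t0=2.0, cooling=0.9995, steps=100_000):
        """clauses: tuples of signed variable indices (DIMACS style), e.g.
        (1, -2, 3) means x1 or not-x2 or x3. Returns assignment and cost."""
        assign = [False] + [random.random() < 0.5 for _ in range(n_vars)]

        def unsatisfied():
            return sum(1 for c in clauses
                       if not any(assign[abs(l)] == (l > 0) for l in c))

        cost, t = unsatisfied(), t0
        for _ in range(steps):
            if cost == 0:
                break
            v = random.randint(1, n_vars)
            assign[v] = not assign[v]       # propose flipping one variable
            new = unsatisfied()
            if new <= cost or random.random() < math.exp((cost - new) / t):
                cost = new                  # accept (Metropolis criterion)
            else:
                assign[v] = not assign[v]   # reject: undo the flip
            t *= cooling                    # geometric cooling
        return assign[1:], cost

    print(anneal_sat([(1, -2, 3), (-1, 2), (2, -3), (-2, 3)], 3))
    ```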

  3. Solution of the within-group multidimensional discrete ordinates transport equations on massively parallel architectures

    NASA Astrophysics Data System (ADS)

    Zerr, Robert Joseph

    2011-12-01

    The integral transport matrix method (ITMM) has been used as the kernel of new parallel solution methods for the discrete ordinates approximation of the within-group neutron transport equation. The ITMM abandons the repetitive mesh sweeps of the traditional source iterations (SI) scheme in favor of constructing stored operators that account for the direct coupling factors among all the cells and between the cells and boundary surfaces. The main goals of this work were to develop the algorithms that construct these operators and employ them in the solution process, determine the most suitable way to parallelize the entire procedure, and evaluate the behavior and performance of the developed methods for an increasing number of processes. This project compares the effectiveness of the ITMM with the SI scheme parallelized with the Koch-Baker-Alcouffe (KBA) method. The primary parallel solution method involves a decomposition of the domain into smaller spatial sub-domains, each with its own transport matrices, coupled together via interface boundary angular fluxes. Each sub-domain has its own set of ITMM operators and represents an independent transport problem. Multiple iterative parallel solution methods have been investigated, including parallel block Jacobi (PBJ), parallel red/black Gauss-Seidel (PGS), and parallel GMRES (PGMRES). The fastest observed parallel solution method, PGS, was used in a weak scaling comparison with the PARTISN code. Compared to the state-of-the-art SI-KBA with diffusion synthetic acceleration (DSA), this new method without acceleration/preconditioning is not competitive for any problem parameters considered. The best comparisons occur for problems that are difficult for SI DSA, namely highly scattering and optically thick. SI DSA execution time curves are generally steeper than the PGS ones. However, until further testing is performed it cannot be concluded that SI DSA does not outperform the ITMM with PGS even on several thousand or tens of

  4. A Novel Algorithm for Solving the Multidimensional Neutron Transport Equation on Massively Parallel Architectures

    SciTech Connect

    Azmy, Yousry

    2014-06-10

    We employ the Integral Transport Matrix Method (ITMM) as the kernel of new parallel solution methods for the discrete ordinates approximation of the within-group neutron transport equation. The ITMM abandons the repetitive mesh sweeps of the traditional source iterations (SI) scheme in favor of constructing stored operators that account for the direct coupling factors among all the cells' fluxes and between the cells' and boundary surfaces' fluxes. The main goals of this work are to develop the algorithms that construct these operators and employ them in the solution process, determine the most suitable way to parallelize the entire procedure, and evaluate the behavior and parallel performance of the developed methods with an increasing number of processes, P. The fastest observed parallel solution method, Parallel Gauss-Seidel (PGS), was used in a weak scaling comparison with the PARTISN transport code, which uses the source iteration (SI) scheme parallelized with the Koch-Baker-Alcouffe (KBA) method. Compared to the state-of-the-art SI-KBA with diffusion synthetic acceleration (DSA), this new method, even without acceleration/preconditioning, is competitive for optically thick problems as P is increased to the tens of thousands range. For the most optically thick cells tested, PGS reduced execution time by an approximate factor of three for problems with more than 130 million computational cells on P = 32,768. Moreover, the SI-DSA execution-time trend rises generally more steeply with increasing P than the PGS trend. Furthermore, the PGS method outperforms SI for the periodic heterogeneous layers (PHL) configuration problems. The PGS method outperforms SI and SI-DSA on as few as P = 16 for PHL problems and reduces execution time by a factor of ten or more for all problems considered with more than 2 million computational cells on P = 4,096.

  5. Massively parallel read mapping on GPUs with the q-group index and PEANUT

    PubMed Central

    Rahmann, Sven

    2014-01-01

    We present the q-group index, a novel data structure for read mapping tailored towards graphics processing units (GPUs) with a small memory footprint and efficient parallel algorithms for querying and building. On top of the q-group index we introduce PEANUT, a highly parallel GPU-based read mapper. PEANUT can output either the best hits or all hits of a read. Our benchmarks show that PEANUT outperforms other state-of-the-art read mappers in terms of speed while maintaining or slightly increasing precision, recall and sensitivity. PMID:25289191
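
    The q-group index is a GPU-oriented compaction of the classic q-gram index; the sketch below shows only the generic q-gram idea (map every length-q substring of the reference to its positions, then seed read alignment from lookups), not the paper's bit-packed structure.

    ```python
    from collections import defaultdict

    def qgram_index(reference, q=16):
        """Map every length-q substring of `reference` to the positions
        where it occurs; reads are later seeded by q-gram lookups."""
        index = defaultdict(list)
        for i in range(len(reference) - q + 1):
            index[reference[i:i + q]].append(i)
        return index

    idx = qgram_index("ACGTACGTACGT", q=4)
    print(idx["ACGT"])   # -> [0, 4, 8]
    ```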

  6. Design of electrostatic microcolumn for nanoscale photoemission source in massively parallel electron-beam lithography

    NASA Astrophysics Data System (ADS)

    Wen, Ye; Du, Zhidong; Pan, Liang

    2015-10-01

    Microcolumns are widely used for parallel electron-beam lithography because of their compactness and the ability to achieve high spatial resolution. A design of an electrostatic microcolumn for our recent nanoscale photoemission sources is presented. We proposed a compact column structure (as short as several microns in length) for the ease of microcolumn fabrication and lithography operation. We numerically studied the influence of several design parameters on the optical performance such as microcolumn diameter, electrode thickness, beam current, working voltages, and working distance. We also examined the effect of fringing field between adjacent microcolumns during parallel lithography operations.

  7. Massively Parallel, Three-Dimensional Transport Solutions for the k-Eigenvalue Problem

    SciTech Connect

    Davidson, Gregory G; Evans, Thomas M; Jarrell, Joshua J; Pandya, Tara M; Slaybaugh, R

    2014-01-01

    We have implemented a new multilevel parallel decomposition in the Denovo discrete ordinates radiation transport code. In concert with Krylov subspace iterative solvers, the multilevel decomposition allows concurrency over energy in addition to space-angle, enabling scalability beyond the limits imposed by the traditional KBA space-angle partitioning. Furthermore, a new Arnoldi-based k-eigenvalue solver has been implemented. The added phase-space concurrency combined with the high-performance Krylov and Arnoldi solvers has enabled weak scaling to O(100K) cores on the Jaguar XK6 supercomputer. The multilevel decomposition provides sufficient parallelism to scale to exascale computing and beyond.

  8. Massively parallel solution of the inverse scattering problem for integrated circuit quality control

    SciTech Connect

    Leland, R.W.; Draper, B.L.; Naqvi, S.; Minhas, B.

    1997-09-01

    The authors developed and implemented a highly parallel computational algorithm for solution of the inverse scattering problem generated when an integrated circuit is illuminated by a laser. The method was used as part of a system to measure diffraction grating line widths on specially fabricated test wafers, and the results of the computational analysis were compared with more traditional line-width measurement techniques. The authors found they were able to measure the line width of singly periodic and doubly periodic diffraction gratings (i.e., 2D and 3D gratings, respectively) with accuracy comparable to the best available experimental techniques. They demonstrated that their parallel code is highly scalable, achieving a scaled parallel efficiency of 90% or more on typical problems running on 1024 processors. They also made substantial improvements to the algorithmics and their original implementation of Rigorous Coupled-Wave Analysis, the underlying computational technique. These resulted in computational speed-ups of two orders of magnitude in some test problems. By combining these algorithmic improvements with parallelism the authors achieve speedups of between a few thousand and hundreds of thousands over the original engineering code. This made the laser diffraction measurement technique practical.

  9. Spatiotemporal Domain Decomposition for Massive Parallel Computation of Space-Time Kernel Density

    NASA Astrophysics Data System (ADS)

    Hohl, A.; Delmelle, E. M.; Tang, W.

    2015-07-01

    Accelerated processing capabilities are deemed critical when conducting analysis on spatiotemporal datasets of increasing size, diversity and availability. High-performance parallel computing offers the capacity to solve computationally demanding problems in a limited timeframe, but likewise poses the challenge of preventing processing inefficiency due to workload imbalance between computing resources. Therefore, when designing new algorithms capable of implementing parallel strategies, careful spatiotemporal domain decomposition is necessary to account for heterogeneity in the data. In this study, we perform octree-based adaptive decomposition of the spatiotemporal domain for parallel computation of space-time kernel density. In order to avoid edge effects near subdomain boundaries, we establish spatiotemporal buffers to include adjacent data points that are within the spatial and temporal kernel bandwidths. Then, we quantify the computational intensity of each subdomain to balance workloads among processors. We illustrate the benefits of our methodology using a space-time epidemiological dataset of Dengue fever, an infectious vector-borne disease that poses a severe threat to communities in tropical climates. Our parallel implementation of kernel density reaches substantial speedup compared to sequential processing, and achieves high levels of workload balance among processors due to great accuracy in quantifying computational intensity. Our approach is portable to other space-time analytical tests.
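
    The buffering step described above amounts to assigning each point to every subdomain whose extent, inflated by the kernel bandwidths, contains it; a toy sketch follows (subdomain bounds would come from the octree decomposition, which is not shown).

    ```python
    import numpy as np

    def assign_with_buffers(points, bounds, hs, ht):
        """points: (n, 3) array of (x, y, t). bounds: iterable of
        (x0, x1, y0, y1, t0, t1) subdomain extents. Returns, per subdomain,
        the indices of points inside the extent inflated by the spatial
        (hs) and temporal (ht) bandwidths, so boundary densities are exact."""
        out = []
        for (x0, x1, y0, y1, t0, t1) in bounds:
            inside = ((points[:, 0] >= x0 - hs) & (points[:, 0] <= x1 + hs) &
                      (points[:, 1] >= y0 - hs) & (points[:, 1] <= y1 + hs) &
                      (points[:, 2] >= t0 - ht) & (points[:, 2] <= t1 + ht))
            out.append(np.flatnonzero(inside))
        return out
    ```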

  10. Massive parallel implementation of JPEG2000 decoding algorithm with multi-GPUs

    NASA Astrophysics Data System (ADS)

    Wu, Xianyun; Li, Yunsong; Liu, Kai; Wang, Keyan; Wang, Li

    2014-05-01

    JPEG2000 is an important image compression technique that has been successfully used in many fields. Due to the increasing spatial, spectral and temporal resolution of remotely sensed imagery data sets, fast decompression of remotely sensed data is becoming an important and challenging objective. In this paper, we develop an implementation of JPEG2000 decompression on graphics processing units (GPUs) for fast decoding of a codeblock-based parallel compression stream. We use one CUDA block to decode one frame. Tier-2 is still decoded serially, while Tier-1 and the IDWT are processed in parallel. Since our encoded stream is codeblock-parallel, meaning each block is independent of the others, we process each Tier-1 block with one thread. For the IDWT, we use one CUDA block to execute one line and one CUDA thread to process one pixel. We investigate the speedups gained by the GPU implementation relative to serial CPU-based implementations. Experimental results reveal that our implementation can achieve significant speedups compared with serial implementations.

  11. Optical binary de Bruijn networks for massively parallel computing: design methodology and feasibility study

    NASA Astrophysics Data System (ADS)

    Louri, Ahmed; Sung, Hongki

    1995-10-01

    The interconnection network structure can be the deciding and limiting factor in the cost and the performance of parallel computers. One of the most popular point-to-point interconnection networks for parallel computers today is the hypercube. The regularity, logarithmic diameter, symmetry, high connectivity, fault tolerance, simple routing, and reconfigurability (easy embedding of other network topologies) of the hypercube make it a very attractive choice for parallel computers. Unfortunately, the hypercube has a major drawback: the number of links per node increases as the network grows in size. As an alternative to the hypercube, the binary de Bruijn (BdB) network has recently received much attention. The BdB not only provides a logarithmic diameter, fault tolerance, and simple routing but also requires fewer links than the hypercube for the same network size. Additionally, a major advantage of the BdB is that the number of edges per node is independent of the network size. This makes it very desirable for large-scale parallel systems. However, because of its asymmetrical nature and global connectivity, it poses a major challenge for VLSI technology. Optics, owing to its three-dimensional and global-connectivity nature, seems to be very suitable for implementing BdB networks. We present an implementation methodology for optical BdB networks. The distinctive feature of the proposed implementation methodology is partitionability of the network into a few primitive operations that can be implemented efficiently. We further show feasibility of the
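
    The constant node degree follows from the network's shift-register structure: a node's successors are its n-bit label shifted left with a 0 or 1 appended. A small sketch (not from the paper):

    ```python
    def bdb_successors(node, n):
        """Successors of `node` in the binary de Bruijn graph on 2**n nodes.
        Every node has exactly two out-edges regardless of network size,
        unlike the hypercube, whose degree grows as log2(N)."""
        size = 1 << n
        return (2 * node) % size, (2 * node + 1) % size

    print(bdb_successors(0b1011, 4))   # -> (6, 7), i.e. 0110 and 0111
    ```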

  12. Nonlinear structural response using adaptive dynamic relaxation on a massively-parallel-processing system

    NASA Technical Reports Server (NTRS)

    Oakley, David R.; Knight, Norman F., Jr.

    1994-01-01

    A parallel adaptive dynamic relaxation (ADR) algorithm has been developed for nonlinear structural analysis. This algorithm has minimal memory requirements, is easily parallelizable and scalable to many processors, and is generally very reliable and efficient for highly nonlinear problems. Performance evaluations on single-processor computers have shown that the ADR algorithm is reliable and highly vectorizable, and that it is competitive with direct solution methods for the highly nonlinear problems considered. The present algorithm is implemented on the 512-processor Intel Touchstone DELTA system at Caltech, and it is designed to minimize the extent and frequency of interprocessor communication. The algorithm has been used to solve for the nonlinear static response of two- and three-dimensional hyperelastic systems involving contact. Impressive relative speedups have been achieved and demonstrate the high scalability of the ADR algorithm. For the class of problems addressed, the ADR algorithm represents a very promising approach for parallel-vector processing.
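
    To fix ideas, a bare-bones (non-adaptive) dynamic relaxation loop is sketched below: a fictitious damped dynamical system is marched until the out-of-balance force vanishes, leaving the static solution. ADR additionally adapts the time step and damping, which this sketch omits.

    ```python
    import numpy as np

    def dynamic_relaxation(residual, x0, dt=0.1, damping=0.05, steps=5000):
        """residual(x) returns the out-of-balance nodal forces for the
        current configuration x; convergence means the static equations
        are satisfied. dt and damping are fixed here, adaptive in ADR."""
        x, v = x0.astype(float).copy(), np.zeros_like(x0, dtype=float)
        for _ in range(steps):
            r = residual(x)
            if np.linalg.norm(r) < 1e-10:
                break
            v = (1.0 - damping) * v + dt * r   # damped velocity update
            x = x + dt * v                     # explicit position update
        return x

    # Example: a linear spring system K x = f, with residual r = f - K x.
    K = np.array([[2.0, -1.0], [-1.0, 2.0]])
    f = np.array([1.0, 0.0])
    print(dynamic_relaxation(lambda x: f - K @ x, np.zeros(2)))  # ~ (2/3, 1/3)
    ```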

  13. Analysis and selection of optimal function implementations in massively parallel computer

    DOEpatents

    Archer, Charles Jens; Peters, Amanda; Ratterman, Joseph D.

    2011-05-31

    An apparatus, program product and method optimize the operation of a parallel computer system by, in part, collecting performance data for a set of implementations of a function capable of being executed on the parallel computer system based upon the execution of the set of implementations under varying input parameters in a plurality of input dimensions. The collected performance data may be used to generate selection program code that is configured to call selected implementations of the function in response to a call to the function under varying input parameters. The collected performance data may be used to perform more detailed analysis to ascertain the comparative performance of the set of implementations of the function under the varying input parameters.
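
    The mechanism lends itself to a small sketch: benchmark every implementation of a function over a sweep of input sizes, record the winner per size, and dispatch later calls to the fastest known implementation for the nearest benchmarked size. All names below are hypothetical, not the patented apparatus.

    ```python
    import bisect
    import time

    def build_selector(implementations, sizes, make_input):
        """Time each candidate on each benchmark size and return a
        dispatcher: n -> fastest implementation for the nearest size."""
        table = {}
        for n in sizes:
            arg = make_input(n)
            def cost(f):
                t0 = time.perf_counter()
                f(arg)
                return time.perf_counter() - t0
            table[n] = min(implementations, key=cost)
        grid = sorted(table)

        def dispatch(n):
            i = min(bisect.bisect_left(grid, n), len(grid) - 1)
            return table[grid[i]]
        return dispatch

    # Example: two ways to sum a list; the selector picks one per size.
    impls = [sum, lambda xs: sum(x for x in xs)]
    selector = build_selector(impls, [10, 10_000], lambda n: list(range(n)))
    print(selector(5000))
    ```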

  14. Application of Parallel Hybrid Algorithm in Massively Parallel GPGPU—The Improved Effective and Efficient Method for Calculating Coulombic Interactions in Simulations of Many Ions with SIMION

    NASA Astrophysics Data System (ADS)

    Saito, Kenichiro; Koizumi, Eiko; Koizumi, Hideya

    2012-09-01

    In our previous study, we introduced a new hybrid approach to effectively approximate the total force on each ion during a trajectory calculation in mass spectrometry device simulations, and the algorithm worked successfully with SIMION. We took one step further and applied the method to massively parallel general-purpose computing with GPUs (GPGPU) to test its performance in simulations with thousands to over a million ions. We took extra care to minimize the barrier synchronization and data transfer between the host (CPU) and the device (GPU) memory, and took full advantage of latency hiding. Parallel codes were written in CUDA C++ and integrated into SIMION via a user-defined Lua program. In this study, we tested the parallel hybrid algorithm with a couple of basic models and analyzed the performance by comparing it to that of the original, fully explicit method written in serial code. The Coulomb explosion simulation with 128,000 ions was completed in 309 s, over 700 times faster than the 63 h taken by the original explicit method, in which we evaluated two-body Coulomb interactions explicitly on one ion with each of all the other ions. The simulation of 1,024,000 ions was completed in 2650 s. In another example, we applied the hybrid method to a simulation of ions in a simple quadrupole ion storage model with 100,000 ions, and it took less than 10 days. Based on our estimate, the same simulation would take 5-7 years with the explicit method in serial code.
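
    The baseline that the hybrid algorithm accelerates is the fully explicit all-pairs Coulomb sum; a compact numpy version is sketched below for reference (the hybrid approximation itself is not reproduced here).

    ```python
    import numpy as np

    COULOMB_K = 8.9875517923e9  # N m^2 / C^2

    def explicit_coulomb_forces(pos, q):
        """O(N^2) pairwise Coulomb forces: every ion against every other.
        pos: (n, 3) positions in metres; q: (n,) charges in coulombs."""
        d = pos[:, None, :] - pos[None, :, :]       # pairwise displacements
        r2 = np.einsum('ijk,ijk->ij', d, d)
        np.fill_diagonal(r2, np.inf)                # suppress self-interaction
        mag = COULOMB_K * q[:, None] * q[None, :] / r2**1.5
        return np.einsum('ij,ijk->ik', mag, d)      # (n, 3) net force per ion
    ```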

  15. Process Simulation of Complex Biological Pathways in Physical Reactive Space and Reformulated for Massively Parallel Computing Platforms.

    PubMed

    Ganesan, Narayan; Li, Jie; Sharma, Vishakha; Jiang, Hanyu; Compagnoni, Adriana

    2016-01-01

    Biological systems encompass complexity that far surpasses many artificial systems. Modeling and simulation of large and complex biochemical pathways is a computationally intensive challenge. Traditional tools, such as ordinary differential equations, partial differential equations, stochastic master equations, and Gillespie type methods, are all limited either by their modeling fidelity or computational efficiency or both. In this work, we present a scalable computational framework based on modeling biochemical reactions in explicit 3D space, that is suitable for studying the behavior of large and complex biological pathways. The framework is designed to exploit parallelism and scalability offered by commodity massively parallel processors such as the graphics processing units (GPUs) and other parallel computing platforms. The reaction modeling in 3D space is aimed at enhancing the realism of the model compared to traditional modeling tools and framework. We introduce the Parallel Select algorithm that is key to breaking the sequential bottleneck limiting the performance of most other tools designed to study biochemical interactions. The algorithm is designed to be computationally tractable, handle hundreds of interacting chemical species and millions of independent agents by considering all-particle interactions within the system. We also present an implementation of the framework on the popular graphics processing units and apply it to the simulation study of JAK-STAT Signal Transduction Pathway. The computational framework will offer a deeper insight into various biological processes within the cell and help us observe key events as they unfold in space and time. This will advance the current state-of-the-art in simulation study of large scale biological systems and also enable the realistic simulation study of macro-biological cultures, where inter-cellular interactions are prevalent. PMID:27045833

  16. Massively parallel 454-sequencing of fungal communities in Quercus spp. ectomycorrhizas indicates seasonal dynamics in urban and rural sites.

    PubMed

    Jumpponen, Ari; Jones, Kenneth L; David Mattox, J; Yaege, Chulee

    2010-03-01

    We analysed two sites within and outside an urban development in a rural background to estimate the fungal richness, diversity and community composition in Quercus spp. ectomycorrhizas using massively parallel 454-sequencing in combination with DNA-tagging. Our analyses indicated that shallow sequencing (approximately 150 sequences) of a large number of samples (192 in total) provided data that allowed identification of seasonal trends within the fungal communities: putative root-associated antagonists and saprobes that were abundant early in the growing season were replaced by common ectomycorrhizal fungi in the course of the growing season. Ordination analyses identified a number of factors that were correlated with the observed communities including host species as well as soil organic matter, nutrient and heavy metal enrichment. Overall, our application of the high throughput 454 sequencing provided an expedient means for characterization of fungal communities. PMID:20331769

  17. Probing the Nanosecond Dynamics of a Designed Three-Stranded Beta-Sheet with a Massively Parallel Molecular Dynamics Simulation

    PubMed Central

    Voelz, Vincent A.; Luttmann, Edgar; Bowman, Gregory R.; Pande, Vijay S.

    2009-01-01

    Recently a temperature-jump FTIR study of a designed three-stranded β-sheet showing a fast relaxation time of ~140 ± 20 ns was published. We performed massively parallel molecular dynamics simulations in explicit solvent to probe the structural events involved in this relaxation. While our simulations produce similar relaxation rates, the structural ensemble is broad. We observe the formation of turn structure, but only very weak interaction in the strand regions, which is consistent with the lack of strong backbone-backbone NOEs in previous structural NMR studies. These results suggest that either DPDP-II folds at time scales longer than 240 ns, or that DPDP-II is not a well-defined three-stranded β-sheet. This work also provides an opportunity to compare the performance of several popular forcefield models against one another. PMID:19399235

  18. High-Throughput Detection of Actionable Genomic Alterations in Clinical Tumor Samples by Targeted, Massively Parallel Sequencing

    PubMed Central

    Wagle, Nikhil; Berger, Michael F.; Davis, Matthew J.; Blumenstiel, Brendan; DeFelice, Matthew; Pochanard, Panisa; Ducar, Matthew; Van Hummelen, Paul; MacConaill, Laura E.; Hahn, William C.; Meyerson, Matthew; Gabriel, Stacey B.; Garraway, Levi A.

    2011-01-01

    Knowledge of “actionable” somatic genomic alterations present in each tumor (e.g., point mutations, small insertions/deletions, and copy number alterations that direct therapeutic options) should facilitate individualized approaches to cancer treatment. However, clinical implementation of systematic genomic profiling has rarely been achieved beyond limited numbers of oncogene point mutations. To address this challenge, we utilized a targeted, massively parallel sequencing approach to detect tumor genomic alterations in formalin-fixed, paraffin embedded (FFPE) tumor samples. Nearly 400-fold mean sequence coverage was achieved, and single nucleotide sequence variants, small insertions/deletions, and chromosomal copy number alterations were detected simultaneously with high accuracy compared to other methods in clinical use. Putatively actionable genomic alterations, including those that predict sensitivity or resistance to established and experimental therapies, were detected in each tumor sample tested. Thus, targeted deep sequencing of clinical tumor material may enable mutation-driven clinical trials and, ultimately, “personalized” cancer treatment. PMID:22585170

  19. Development of a Massively Parallel Particle-Mesh Algorithm for Simulations of Galaxy Dynamics and Plasmas

    NASA Astrophysics Data System (ADS)

    Wallin, John

    1996-01-01

    Particle-mesh calculations treat forces and potentials as field quantities which are represented approximately on a mesh. A system of particles is mapped onto this mesh as a density distribution of mass or charge. The Fourier transform is used to convolve this distribution with the Green's function of the potential, and a finite difference scheme is used to calculate the forces acting on the particles. The computation time scales as Ng log Ng, where Ng is the size of the computational grid. In contrast, the particle-particle method relies on direct summation, so its computing time scales as Np^2, where Np is the number of particles. The particle-mesh method is best suited for simulations with a fixed minimum resolution and for collisionless systems, while hierarchical tree codes have proven to be superior for collisional systems where two-body interactions are important. Particle-mesh methods still dominate in plasma physics, where collisionless systems are modeled. The CM-200 Connection Machine produced by Thinking Machines Corp. is a data parallel system. On this system, the front-end computer controls the timing and execution of the parallel processing units. The programming paradigm is Single-Instruction, Multiple-Data (SIMD). The processors on the CM-200 are connected in an N-dimensional hypercube; the largest number of links a message will ever have to make is N. As in all parallel computing, the efficiency of an algorithm is primarily determined by the fraction of the time spent communicating compared to that spent computing. Because of the topology of the processors, nearest-neighbor communication is more efficient than general communication.
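
    A minimal 1D particle-mesh step along the lines described above (deposit mass on a periodic grid, convolve with the Poisson Green's function via FFT, difference the potential for forces); grid size and units are arbitrary.

    ```python
    import numpy as np

    def pm_forces_1d(positions, masses, ng=64, G=1.0):
        """Positions live on the periodic unit interval. Cost per step is
        O(Ng log Ng) from the FFTs, which is the method's advantage over
        direct Np^2 summation."""
        rho, _ = np.histogram(positions % 1.0, bins=ng,
                              range=(0.0, 1.0), weights=masses)
        k = 2 * np.pi * np.fft.fftfreq(ng, d=1.0 / ng)   # angular wavenumbers
        k[0] = 1.0                        # placeholder to avoid divide-by-zero
        phi_k = -4 * np.pi * G * np.fft.fft(rho) / k**2  # Poisson in k-space
        phi_k[0] = 0.0                    # discard the mean (k = 0) mode
        phi = np.real(np.fft.ifft(phi_k))
        return -np.gradient(phi, 1.0 / ng)  # finite-difference mesh forces
    ```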

  1. The implementation of the upwind leapfrog scheme for 3D electromagnetic scattering on massively parallel computers

    SciTech Connect

    Nguyen, B.T.; Hutchinson, S.A.

    1995-07-01

    The upwind leapfrog scheme for electromagnetic scattering is briefly described. Its application to the 3D Maxwell time-domain equations is shown in detail. The scheme's use of upwind characteristic variables and a narrow stencil results in lower communication overhead, making it ideal for implementation on distributed-memory parallel computers. The algorithm's implementation on two message-passing computers, a 1024-processor nCUBE 2 and an 1840-processor Intel Paragon, is described. Performance evaluation demonstrates that the scheme performs well, with both good scaling qualities and high efficiencies on these machines.

  2. Extended computational kernels in a massively parallel implementation of the Trotter-Suzuki approximation

    NASA Astrophysics Data System (ADS)

    Wittek, Peter; Calderaro, Luca

    2015-12-01

    We extended a parallel and distributed implementation of the Trotter-Suzuki algorithm for simulating quantum systems to study a wider range of physical problems and to make the library easier to use. The new release allows periodic boundary conditions, many-body simulations of non-interacting particles, arbitrary stationary potential functions, and imaginary time evolution to approximate the ground state energy. The new release is more resilient to the computational environment: a wider range of compiler chains and more platforms are supported. To ease development, we provide a more extensive command-line interface, an application programming interface, and wrappers from high-level languages.
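
    For orientation, one second-order Trotter-Suzuki step for a 1D wavefunction splits the evolution between potential and kinetic factors; the split-step Fourier variant below is an illustration, not the library's real-space kernel. As the release notes mention, substituting dt -> -i*dt turns the same loop into an imaginary-time ground-state solver.

    ```python
    import numpy as np

    def trotter_step(psi, v, dx, dt, hbar=1.0, m=1.0):
        """Second-order split-step: half potential kick, full kinetic step
        in Fourier space, half potential kick. psi and v are 1D arrays."""
        k = 2 * np.pi * np.fft.fftfreq(psi.size, d=dx)
        half_kick = np.exp(-0.5j * v * dt / hbar)
        psi = half_kick * psi
        psi = np.fft.ifft(np.fft.fft(psi) * np.exp(-0.5j * hbar * k**2 * dt / m))
        return half_kick * psi
    ```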

  3. Exposing malaria in-host diversity and estimating population diversity by capture-recapture using massively parallel pyrosequencing

    PubMed Central

    Juliano, Jonathan J.; Porter, Kimberly; Mwapasa, Victor; Sem, Rithy; Rogers, William O.; Ariey, Frédéric; Wongsrichanalai, Chansuda; Read, Andrew; Meshnick, Steven R.

    2010-01-01

    Malaria infections commonly contain multiple genetically distinct variants. Mathematical and animal models suggest that interactions among these variants have a profound impact on the emergence of drug resistance. However, methods currently used for quantifying parasite diversity in individual infections are insensitive to low-abundance variants and are not quantitative for variant population sizes. To more completely describe the in-host complexity and ecology of malaria infections, we used massively parallel pyrosequencing to characterize malaria parasite diversity in the infections of a group of patients. By individually sequencing single strands of DNA in a complex mixture, this technique can quantify uncommon variants in mixed infections. The in-host diversity revealed by this method far exceeded that described by currently recommended genotyping methods, with as many as sixfold more variants per infection. In addition, in paired pre- and posttreatment samples, we show a complex milieu of parasites, including variants likely up-selected and down-selected by drug therapy. As with all surveys of diversity, sampling limitations prevent full discovery and differences in sampling effort can confound comparisons among samples, hosts, and populations. Here, we used ecological approaches of species accumulation curves and capture-recapture to estimate the number of variants we failed to detect in the population, and show that these methods enable comparisons of diversity before and after treatment, as well as between malaria populations. The combination of ecological statistics and massively parallel pyrosequencing provides a powerful tool for studying the evolution of drug resistance and the in-host ecology of malaria infections. PMID:21041629
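
    The abstract does not name a specific estimator beyond capture-recapture and accumulation curves; one common choice for such "how many variants did we miss" questions is the Chao1 lower bound, sketched here from per-variant read counts.

    ```python
    def chao1(counts):
        """Chao1 lower-bound richness estimate: observed variants plus a
        correction from singletons (f1) and doubletons (f2)."""
        f1 = sum(1 for c in counts if c == 1)
        f2 = sum(1 for c in counts if c == 2)
        observed = sum(1 for c in counts if c > 0)
        if f2 > 0:
            return observed + f1 * f1 / (2 * f2)
        return observed + f1 * (f1 - 1) / 2   # bias-corrected form, f2 == 0

    # Example: 6 observed variants, 3 singletons, 1 doubleton -> 10.5
    print(chao1([41, 17, 2, 1, 1, 1]))
    ```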

  4. Inter-laboratory evaluation of SNP-based forensic identification by massively parallel sequencing using the Ion PGM™.

    PubMed

    Eduardoff, M; Santos, C; de la Puente, M; Gross, T E; Fondevila, M; Strobl, C; Sobrino, B; Ballard, D; Schneider, P M; Carracedo, Á; Lareu, M V; Parson, W; Phillips, C

    2015-07-01

    Next generation sequencing (NGS) offers the opportunity to analyse forensic DNA samples and obtain massively parallel coverage of targeted short sequences with the variants they carry. We evaluated the levels of sequence coverage, genotyping precision, sensitivity and mixed DNA patterns of a prototype version of the first commercial forensic NGS kit: the HID-Ion AmpliSeq™ Identity Panel with 169 markers designed for the Ion PGM™ system. Evaluations were made between three laboratories following closely matched Ion PGM™ protocols and a simple validation framework of shared DNA controls. The sequence coverage obtained was extensive for the bulk of SNPs targeted by the HID-Ion AmpliSeq™ Identity Panel. Sensitivity studies showed 90-95% of SNP genotypes could be obtained from 25 to 100 pg of input DNA. Genotyping concordance tests included Coriell cell-line control DNA analyses checked against whole-genome sequencing data from 1000 Genomes and Complete Genomics, indicating a very high concordance rate of 99.8%. Discordant genotypes detected in rs1979255, rs1004357, rs938283, rs2032597 and rs2399332 indicate these loci should be excluded from the panel. Therefore, the HID-Ion AmpliSeq™ Identity Panel and Ion PGM™ system provide a sensitive and accurate forensic SNP genotyping assay. However, low-level DNA produced much more varied sequence coverage, and in forensic use the Ion PGM™ system will require careful calibration of the total samples loaded per chip to preserve the genotyping reliability seen in routine forensic DNA. Furthermore, assessments of mixed DNA indicate the user's control of sequence analysis parameter settings is necessary to ensure mixtures are detected robustly. Given the sensitivity of Ion PGM™, this aspect of forensic genotyping requires further optimisation before massively parallel sequencing is applied to routine casework. PMID:25955683

  5. A massively parallel method of characteristic neutral particle transport code for GPUs

    SciTech Connect

    Boyd, W. R.; Smith, K.; Forget, B.

    2013-07-01

    Over the past 20 years, parallel computing has enabled computers to grow ever larger and more powerful while scientific applications have advanced in sophistication and resolution. This trend is being challenged, however, as the power consumption for conventional parallel computing architectures has risen to unsustainable levels and memory limitations have come to dominate compute performance. Heterogeneous computing platforms, such as Graphics Processing Units (GPUs), are an increasingly popular paradigm for solving these issues. This paper explores the applicability of GPUs for deterministic neutron transport. A 2D method of characteristics (MOC) code - OpenMOC - has been developed with solvers for both shared-memory multi-core platforms and GPUs. The multi-threading and memory locality methodologies for the GPU solver are presented. Performance results for the 2D C5G7 benchmark demonstrate 25-35x speedup for MOC on the GPU. The lessons learned from this case study will provide the basis for further exploration of MOC on GPUs as well as design decisions for hardware vendors exploring technologies for the next generation of machines for scientific computing. (authors)

  6. Harnessing the killer micros: Applications from LLNL's massively parallel computing initiative

    SciTech Connect

    Belak, J.F.

    1991-07-01

    Recent developments in microprocessor technology have led to performance on scalar applications exceeding traditional supercomputers. This suggests that coupling hundreds or even thousands of these "killer micros" (all working on a single physical problem) may lead to performance on vector applications in excess of vector supercomputers. Also, future generation killer micros are expected to have vector floating point units as well. The purpose of this paper is to present an overview of the parallel computing environment at Lawrence Livermore National Laboratory. However, the perspective is necessarily quite narrow and most of the examples are taken from the author's implementation of a large scale molecular dynamics code on the BBN-TC2000 at LLNL. Parallelism is achieved through a geometric domain decomposition -- each processor is assigned a distinct region of space and all atoms contained therein. As the atomic positions evolve, the processors must exchange ownership of specific atoms. This geometric domain decomposition proves to be quite general and we highlight its application to image processing and hydrodynamics simulations as well. 10 refs., 6 figs.
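
    The ownership-exchange step at the heart of this geometric decomposition can be sketched in a few lines. Below is a deliberately simplified Python illustration (1D slabs, dictionary atoms, invented coordinates), not the BBN-TC2000 implementation.

        # Space is cut into equal slabs; each rank owns the atoms inside its
        # slab. After positions evolve, atoms that crossed a slab boundary
        # migrate to the neighbouring owner.
        def migrate(domains, slab_width):
            for rank, atoms in enumerate(domains):
                for atom in list(atoms):
                    owner = int(atom["x"] // slab_width)  # slab that owns it now
                    if owner != rank:
                        atoms.remove(atom)
                        domains[owner].append(atom)       # exchange ownership
            return domains

        domains = [[{"x": 0.9}, {"x": 1.1}], [{"x": 1.5}]]  # slab_width = 1.0
        domains = migrate(domains, 1.0)  # the atom at x = 1.1 moves to rank 1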

  7. A practical approach to portability and performance problems on massively parallel supercomputers

    SciTech Connect

    Beazley, D.M.; Lomdahl, P.S.

    1994-12-08

    We present an overview of the tactics we have used to achieve a high level of performance while improving portability for the large-scale molecular dynamics code SPaSM. SPaSM was originally implemented in ANSI C with message passing for the Connection Machine 5 (CM-5). In 1993, SPaSM was selected as one of the winners in the IEEE Gordon Bell Prize competition for sustaining 50 Gflops on the 1024 node CM-5 at Los Alamos National Laboratory. Achieving this performance on the CM-5 required rewriting critical sections of code in CDPEAC assembler language. In addition, the code made extensive use of CM-5 parallel I/O and the CMMD message passing library. Given this highly specialized implementation, we describe how we have ported the code to the Cray T3D and high performance workstations. In addition, we will describe how it has been possible to do this using a single version of source code that runs on all three platforms without sacrificing any performance. Sound too good to be true? We hope to demonstrate that one can realize both code performance and portability without relying on the latest and greatest prepackaged tool or parallelizing compiler.

  8. Massive parallelization of a 3D finite difference electromagnetic forward solution using domain decomposition methods on multiple CUDA enabled GPUs

    NASA Astrophysics Data System (ADS)

    Schultz, A.

    2010-12-01

    3D forward solvers lie at the core of inverse formulations used to image the variation of electrical conductivity within the Earth's interior. This property is associated with variations in temperature, composition, phase, presence of volatiles, and in specific settings, the presence of groundwater, geothermal resources, oil/gas or minerals. The high cost of 3D solutions has been a stumbling block to wider adoption of 3D methods. Parallel algorithms for modeling frequency domain 3D EM problems have not achieved wide scale adoption, with emphasis on fairly coarse grained parallelism using MPI and similar approaches. The communications bandwidth as well as the latency required to send and receive network communication packets is a limiting factor in implementing fine grained parallel strategies, inhibiting wide adoption of these algorithms. Leading Graphics Processor Unit (GPU) companies now produce GPUs with hundreds of GPU processor cores per die. The footprint, in silicon, of the GPU's restricted instruction set is much smaller than the general purpose instruction set required of a CPU. Consequently, the density of processor cores on a GPU can be much greater than on a CPU. GPUs also have local memory, registers and high speed communication with host CPUs, usually through PCIe type interconnects. The extremely low cost and high computational power of GPUs provides the EM geophysics community with an opportunity to achieve fine grained (i.e. massive) parallelization of codes on low cost hardware. The current generation of GPUs (e.g. NVidia Fermi) provides 3 billion transistors per chip die, with nearly 500 processor cores and up to 6 GB of fast (GDDR5) GPU memory. This latest generation of GPU supports fast hardware double precision (64 bit) floating point operations of the type required for frequency domain EM forward solutions. Each Fermi GPU board can sustain nearly 1 TFLOP in double precision, and multiple boards can be installed in the host computer system.

  9. Library Preparation and Multiplex Capture for Massive Parallel Sequencing Applications Made Efficient and Easy

    PubMed Central

    Neiman, Mårten; Sundling, Simon; Grönberg, Henrik; Hall, Per; Czene, Kamila

    2012-01-01

    During the recent years, rapid development of sequencing technologies and a competitive market has enabled researchers to perform massive sequencing projects at a reasonable cost. As the price for the actual sequencing reactions drops, enabling more samples to be sequenced, the relative price for preparing libraries gets larger and the practical laboratory work becomes complex and tedious. We present a cost-effective strategy for simplified library preparation compatible with both whole genome- and targeted sequencing experiments. An optimized enzyme composition and reaction buffer reduces the number of required clean-up steps and allows for usage of bulk enzymes which makes the whole process cheap, efficient and simple. We also present a two-tagging strategy, which allows for multiplex sequencing of targeted regions. To prove our concept, we have prepared libraries for low-pass sequencing from 100 ng DNA, performed 2-, 4- and 8-plex exome capture and a 96-plex capture of a 500 kb region. In all samples we see a high concordance (>99.4%) of SNP calls when comparing to commercially available SNP-chip platforms. PMID:23139805

  10. Implementation of Helioseismic Data Reduction and Diagnostic Techniques on Massively Parallel Architectures

    NASA Technical Reports Server (NTRS)

    Korzennik, Sylvain

    1997-01-01

    Under the direction of Dr. Rhodes, and the technical supervision of Dr. Korzennik, the data assimilation of high spatial resolution solar dopplergrams has been carried out throughout the program on the Intel Delta Touchstone supercomputer. With the help of a research assistant, partially supported by this grant, and under the supervision of Dr. Korzennik, code development was carried out at SAO, using various available resources. To ensure cross-platform portability, PVM was selected as the message passing library. A parallel implementation of power spectra computation for helioseismology data reduction, using PVM, was successfully completed. It was successfully ported to SMP architectures (i.e. SUN), and to some MPP architectures (i.e. the CM5). Due to limitations of the PVM implementation on the Cray T3D, the port to that architecture was not completed at the time.

  11. On Deciding between Conservative and Optimistic Approaches on Massively Parallel Platforms

    SciTech Connect

    Carothers, Prof. Christopher D.; Perumalla, Kalyan S

    2010-01-01

    Over 5000 publications on parallel discrete event simulation (PDES) have appeared in the literature to date. Nevertheless, few articles have focused on empirical studies of PDES performance on large supercomputer-based systems. This gap is bridged here by undertaking a parameterized performance study on thousands of processor cores of a Blue Gene supercomputing system. In contrast to theoretical insights from analytical studies, our study is based on actual implementation in software, incurring the actual messaging and computational overheads for both conservative and optimistic synchronization approaches of PDES. Complex and counter-intuitive effects are uncovered and analyzed, with different event timestamp distributions and available levels of concurrency in the synthetic benchmark models. The results are intended to provide guidance to the PDES community in terms of how the synchronization protocols behave at high processor core counts on state-of-the-art supercomputing systems.
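
    To make the conservative/optimistic distinction concrete: a conservative logical process may only execute events up to a "safe" horizon, bounded by the smallest incoming-channel clock plus the lookahead. The Python sketch below uses invented timestamps and is only a toy analogue of the rule; the study itself used full PDES engines on Blue Gene.

        import heapq

        # A logical process drains its pending-event heap only up to
        # min(channel clocks) + lookahead; later events must block until
        # the neighbours advance. Timestamps here are hypothetical.
        def run_safe_events(pending, channel_clocks, lookahead):
            heapq.heapify(pending)
            safe = min(channel_clocks) + lookahead
            executed = []
            while pending and pending[0] <= safe:
                executed.append(heapq.heappop(pending))
            return executed, pending

        done, blocked = run_safe_events([1.0, 2.5, 7.0], [4.0, 5.5], 0.5)
        # done == [1.0, 2.5]; the event at t = 7.0 waits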

  12. Neptune: An astrophysical smooth particle hydrodynamics code for massively parallel computer architectures

    NASA Astrophysics Data System (ADS)

    Sandalski, Stou

    Smooth particle hydrodynamics is an efficient method for modeling the dynamics of fluids. It is commonly used to simulate astrophysical processes such as binary mergers. We present a newly developed GPU-accelerated smooth particle hydrodynamics code for astrophysical simulations. The code is named neptune after the Roman god of water. It is written in OpenMP-parallelized C++ and OpenCL and includes octree-based hydrodynamic and gravitational acceleration. The design relies on object-oriented methodologies in order to provide a flexible and modular framework that can be easily extended and modified by the user. Several pre-built scenarios for simulating collisions of polytropes and black-hole accretion are provided. The code is released under the MIT Open Source license and publicly available at http://code.google.com/p/neptune-sph/.
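
    As a flavour of what such a code computes, the basic SPH operation is a kernel-weighted density sum over neighbours. The brute-force Python sketch below (random particles, standard cubic-spline kernel) deliberately omits the octree neighbour search that makes the real code efficient.

        import numpy as np

        # O(N^2) SPH density estimate with a 3D cubic-spline kernel;
        # production codes restrict the neighbour loop with an octree.
        def density(positions, masses, h):
            rho = np.zeros(len(positions))
            for i, ri in enumerate(positions):
                q = np.linalg.norm(positions - ri, axis=1) / h
                w = np.where(q < 1.0, 1.0 - 1.5 * q**2 + 0.75 * q**3, 0.0)
                w = np.where((q >= 1.0) & (q < 2.0), 0.25 * (2.0 - q)**3, w)
                rho[i] = np.sum(masses * w) / (np.pi * h**3)
            return rho

        pos = np.random.rand(100, 3)
        rho = density(pos, masses=np.full(100, 1e-3), h=0.2)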

  13. The transition to massively parallel computing within a production environment at a DOE access center

    SciTech Connect

    McCoy, M.G.

    1993-04-01

    In contemplating the transition from sequential to MP computing, the National Energy Research Supercomputer Center (NERSC) is faced with the frictions inherent in the duality of its mission. There have been two goals: the first, to provide a stable, serviceable production environment to the user base; the second, to bring the most capable early serial supercomputers to the Center to make possible leading-edge simulations. This seeming conundrum has in reality been a source of strength. The task of meeting both goals was faced before with the CRAY 1 which, as delivered, was all iron; so the problems associated with the advent of parallel computers are not entirely new, but they are serious. Current vector supercomputers, such as the C90, offer mature production environments, including software tools, a large applications base, and generality; these machines can be used to attack the spectrum of scientific applications by a large user base knowledgeable in programming techniques for this architecture. Parallel computers to date have offered less developed, even rudimentary, working environments, a sparse applications base, and forced specialization. They have been specialized in terms of programming models, and specialized in terms of the kinds of applications which would do well on the machines. Given this context, why do many service computer centers feel that now is the time to cease or slow the procurement of traditional vector supercomputers in favor of MP systems? What are some of the issues that NERSC must face to engineer a smooth transition? The answers to these questions are multifaceted and by no means completely clear. However, a route exists as a result of early efforts at the Laboratories combined with research within the HPCC Program. One can begin with an analysis of why the hardware and software appearing shortly should be made available to the mainstream, and then address what would be required in an initial production environment.

  14. Dissecting the target specificity of RNase H recruiting oligonucleotides using massively parallel reporter analysis of short RNA motifs

    PubMed Central

    Rukov, Jakob Lewin; Hagedorn, Peter H.; Høy, Isabel Bro; Feng, Yanping; Lindow, Morten; Vinther, Jeppe

    2015-01-01

    Processing and post-transcriptional regulation of RNA often depend on binding of regulatory molecules to short motifs in RNA. The effects of such interactions are difficult to study, because most regulatory molecules recognize partially degenerate RNA motifs, embedded in a sequence context specific for each RNA. Here, we describe Library Sequencing (LibSeq), an accurate massively parallel reporter method for completely characterizing the regulatory potential of thousands of short RNA sequences in a specific context. By sequencing cDNA derived from a plasmid library expressing identical reporter genes except for a degenerate 7mer subsequence in the 3′UTR, the regulatory effects of each 7mer can be determined. We show that LibSeq identifies regulatory motifs used by RNA-binding proteins and microRNAs. We furthermore apply the method to cells transfected with RNase H recruiting oligonucleotides to obtain quantitative information for >15000 potential target sequences in parallel. These comprehensive datasets provide insights into the specificity requirements of RNase H and allow a specificity measure to be calculated for each tested oligonucleotide. Moreover, we show that inclusion of chemical modifications in the central part of an RNase H recruiting oligonucleotide can increase its sequence-specificity. PMID:26220183
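
    The readout step of such a reporter assay boils down to comparing 7mer frequencies between the expressed (cDNA) pool and the input plasmid pool. The Python sketch below uses invented reads and a simple log-ratio; it is an illustration of the idea, not the LibSeq analysis code.

        import math
        from collections import Counter

        # Regulatory effect of each 7mer as a log2 ratio of its normalized
        # frequency in cDNA reads versus the plasmid library.
        def motif_effects(cdna_7mers, plasmid_7mers):
            c, p = Counter(cdna_7mers), Counter(plasmid_7mers)
            nc, np_ = sum(c.values()), sum(p.values())
            return {m: math.log2((c[m] / nc) / (p[m] / np_))
                    for m in p if c[m] > 0}

        effects = motif_effects(["GCACTTA", "GCACTTA", "TTTAAAC"],
                                ["GCACTTA", "TTTAAAC", "TTTAAAC"])
        # values below zero suggest repressive motifs in this toy example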

  15. Running ATLAS workloads within massively parallel distributed applications using Athena Multi-Process framework (AthenaMP)

    NASA Astrophysics Data System (ADS)

    Calafiura, Paolo; Leggett, Charles; Seuster, Rolf; Tsulaia, Vakhtang; Van Gemmeren, Peter

    2015-12-01

    AthenaMP is a multi-process version of the ATLAS reconstruction, simulation and data analysis framework Athena. By leveraging Linux fork and copy-on-write mechanisms, it allows for sharing of memory pages between event processors running on the same compute node with little to no change in the application code. Originally targeted to optimize the memory footprint of reconstruction jobs, AthenaMP has demonstrated that it can reduce the memory usage of certain configurations of ATLAS production jobs by a factor of 2. AthenaMP has also evolved to become the parallel event-processing core of the recently developed ATLAS infrastructure for fine-grained event processing (Event Service) which allows the running of AthenaMP inside massively parallel distributed applications on hundreds of compute nodes simultaneously. We present the architecture of AthenaMP, various strategies implemented by AthenaMP for scheduling workload to worker processes (for example: Shared Event Queue and Shared Distributor of Event Tokens) and the usage of AthenaMP in the diversity of ATLAS event processing workloads on various computing resources: Grid, opportunistic resources and HPC.
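
    The Shared Event Queue strategy mentioned above can be illustrated with ordinary Python multiprocessing: forked workers (which share read-only memory pages copy-on-write on Linux) pull event tokens from a single queue. This is a toy analogue under those assumptions, not AthenaMP itself.

        import multiprocessing as mp

        # Workers drain one shared queue of event tokens; a None sentinel per
        # worker signals shutdown. Event payloads here are stand-ins.
        def worker(events, results):
            while True:
                event = events.get()
                if event is None:
                    break
                results.put((event, event * event))  # "process" the event

        if __name__ == "__main__":
            events, results = mp.Queue(), mp.Queue()
            for token in range(8):
                events.put(token)
            workers = [mp.Process(target=worker, args=(events, results))
                       for _ in range(4)]
            for _ in workers:
                events.put(None)
            for w in workers:
                w.start()
            for w in workers:
                w.join()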

  16. Deep mutational scanning of an antibody against epidermal growth factor receptor using mammalian cell display and massively parallel pyrosequencing

    PubMed Central

    Forsyth, Charles M.; Juan, Veronica; Akamatsu, Yoshiko; DuBridge, Robert B.; Doan, Minhtam; Ivanov, Alexander V.; Ma, Zhiyuan; Polakoff, Dixie; Razo, Jennifer; Wilson, Keith; Powers, David B.

    2013-01-01

    We developed a method for deep mutational scanning of antibody complementarity-determining regions (CDRs) that can determine in parallel the effect of every possible single amino acid CDR substitution on antigen binding. The method uses libraries of full length IgGs containing more than 1000 CDR point mutations displayed on mammalian cells, sorted by flow cytometry into subpopulations based on antigen affinity and analyzed by massively parallel pyrosequencing. Higher, lower and neutral affinity mutations are identified by their enrichment or depletion in the FACS subpopulations. We applied this method to a humanized version of the anti-epidermal growth factor receptor antibody cetuximab, generated a near comprehensive data set for 1060 point mutations that recapitulates previously determined structural and mutational data for these CDRs and identified 67 point mutations that increase affinity. The large-scale, comprehensive sequence-function data sets generated by this method should have broad utility for engineering properties such as antibody affinity and specificity and may advance theoretical understanding of antibody-antigen recognition. PMID:23765106
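
    The enrichment/depletion logic is easy to state in code: a variant over-represented in the high-affinity FACS gate relative to the input library scores positive. The sketch below uses invented counts and a pseudocount of 1; it is not the authors' pipeline.

        import math

        # Log2 enrichment of each CDR point mutation in the high-affinity
        # sorted gate versus the unsorted library (pseudocount of 1).
        def enrichment(gate_counts, library_counts):
            ng, nl = sum(gate_counts.values()), sum(library_counts.values())
            return {mut: math.log2(((gate_counts.get(mut, 0) + 1) / ng)
                                   / ((library_counts[mut] + 1) / nl))
                    for mut in library_counts}

        scores = enrichment({"H35Y": 40, "S100A": 2},
                            {"H35Y": 10, "S100A": 10})
        # H35Y enriched (affinity up); S100A depleted (affinity down)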

  17. Tubal ligation - Series (image)

    MedlinePlus

    Tubal ligation is surgery to tie the tubes (fallopian tubes) of a woman which causes permanent sterility by ... transport of the egg (ovum) to the uterus. Tubal ligation may be recommended for adult women who are ...

  18. cuTauLeaping: A GPU-Powered Tau-Leaping Stochastic Simulator for Massive Parallel Analyses of Biological Systems

    PubMed Central

    Besozzi, Daniela; Pescini, Dario; Mauri, Giancarlo

    2014-01-01

    Tau-leaping is a stochastic simulation algorithm that efficiently reconstructs the temporal evolution of biological systems, modeled according to the stochastic formulation of chemical kinetics. The analysis of dynamical properties of these systems in physiological and perturbed conditions usually requires the execution of a large number of simulations, leading to high computational costs. Since each simulation can be executed independently of the others, a massive parallelization of tau-leaping can bring substantial reductions of the overall running time. The emerging field of General Purpose Graphic Processing Units (GPGPU) provides power-efficient high-performance computing at a relatively low cost. In this work we introduce cuTauLeaping, a stochastic simulator of biological systems that makes use of GPGPU computing to execute multiple parallel tau-leaping simulations, by fully exploiting Nvidia's Fermi GPU architecture. We show how a considerable computational speedup is achieved on GPU by partitioning the execution of tau-leaping into multiple separated phases, and we describe how to avoid some implementation pitfalls related to the scarcity of memory resources on the GPU streaming multiprocessors. Our results show that cuTauLeaping largely outperforms the CPU-based tau-leaping implementation when the number of parallel simulations increases, with a break-even directly depending on the size of the biological system and on the complexity of its emergent dynamics. In particular, cuTauLeaping is exploited to investigate the probability distribution of bistable states in the Schlögl model, and to carry out a bidimensional parameter sweep analysis to study the oscillatory regimes in the Ras/cAMP/PKA pathway in S. cerevisiae. PMID:24663957
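
    For orientation, one tau-leaping step advances the state by Poisson-distributed firing counts for each reaction channel. The toy Python sketch below (a reversible A <-> B model with invented rates) shows the per-step logic only; cuTauLeaping's GPU phase partitioning is not reproduced.

        import numpy as np

        # One tau-leap: propensities -> Poisson firing counts -> state update.
        def tau_leap_step(x, rates, stoich, tau, rng):
            a = rates * x                    # mass-action propensities (toy)
            k = rng.poisson(a * tau)         # firings of each reaction
            return np.maximum(x + stoich.T @ k, 0)

        rng = np.random.default_rng(0)
        stoich = np.array([[-1, +1],         # A -> B
                           [+1, -1]])        # B -> A
        x = np.array([1000, 0])
        for _ in range(100):
            x = tau_leap_step(x, np.array([0.1, 0.05]), stoich, 0.1, rng)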

  19. Development and characterization of hollow microprobe array as a potential tool for versatile and massively parallel manipulation of single cells.

    PubMed

    Nagai, Moeto; Oohara, Kiyotaka; Kato, Keita; Kawashima, Takahiro; Shibata, Takayuki

    2015-04-01

    Parallel manipulation of single cells is important for reconstructing in vivo cellular microenvironments and studying cell functions. To manipulate single cells and reconstruct their environments, development of a versatile manipulation tool is necessary. In this study, we developed an array of hollow probes using microelectromechanical systems fabrication technology and demonstrated the manipulation of single cells. We conducted a cell aspiration experiment with a glass pipette and modeled a cell using a standard linear solid model, which provided information for designing hollow stepped probes for minimally invasive single-cell manipulation. We etched a silicon wafer on both sides and formed through holes with stepped structures. The inner diameters of the holes were reduced by plasma-enhanced chemical vapor deposition of SiO2 to trap cells on the tips. This fabrication process makes it possible to control the wall thickness, inner diameter, and outer diameter of the probes. With the fabricated probes, single cells were manipulated and placed in microwells at a single-cell level in a parallel manner. We studied the capture, release, and survival rates of cells at different suction and release pressures and found that the cell trapping rate was directly proportional to the suction pressure, whereas the release rate and viability decreased with increasing suction pressure. The proposed manipulation system makes it possible to place cells in a well array and observe the adherence, spreading, culture, and death of the cells. This system has potential as a tool for massively parallel manipulation and for three-dimensional heterocellular assays. PMID:25749639

  20. Massively parallel haplotyping on microscopic beads for the high-throughput phase analysis of single molecules.

    PubMed

    Boulanger, Jérôme; Muresan, Leila; Tiemann-Boege, Irene

    2012-01-01

    In spite of the many advances in haplotyping methods, it is still very difficult to characterize rare haplotypes in tissues and different environmental samples or to accurately assess the haplotype diversity in large mixtures. This would require a haplotyping method capable of analyzing the phase of single molecules with an unprecedented throughput. Here we describe such a haplotyping method capable of analyzing in parallel hundreds of thousands of single molecules in one experiment. In this method, multiple PCR reactions amplify different polymorphic regions of a single DNA molecule on a magnetic bead compartmentalized in an emulsion drop. The allelic states of the amplified polymorphisms are identified with fluorescently labeled probes that are then decoded from images taken of the arrayed beads by a microscope. This method can evaluate the phase of up to 3 polymorphisms separated by up to 5 kilobases in hundreds of thousands of single molecules. We tested the sensitivity of the method by measuring the number of mutant haplotypes synthesized by four different commercially available enzymes: Phusion, Platinum Taq, Titanium Taq, and Phire. The digital nature of the method makes it highly sensitive, capable of detecting haplotype ratios of less than 1:10,000. We also accurately quantified chimera formation during the exponential phase of PCR by different DNA polymerases. PMID:22558329

  1. Efficient massively parallel simulation of dynamic channel assignment schemes for wireless cellular communications

    NASA Technical Reports Server (NTRS)

    Greenberg, Albert G.; Lubachevsky, Boris D.; Nicol, David M.; Wright, Paul E.

    1994-01-01

    Fast, efficient parallel algorithms are presented for discrete event simulations of dynamic channel assignment schemes for wireless cellular communication networks. The driving events are call arrivals and departures, in continuous time, to cells geographically distributed across the service area. A dynamic channel assignment scheme decides which call arrivals to accept, and which channels to allocate to the accepted calls, attempting to minimize call blocking while ensuring co-channel interference is tolerably low. Specifically, the scheme ensures that the same channel is used concurrently at different cells only if the pairwise distances between those cells are sufficiently large. Much of the complexity of the system comes from ensuring this separation. The network is modeled as a system of interacting continuous time automata, each corresponding to a cell. To simulate the model, conservative methods are used; i.e., methods in which no errors occur in the course of the simulation and so no rollback or relaxation is needed. Implemented on a 16K processor MasPar MP-1, an elegant and simple technique provides speedups of about 15 times over an optimized serial simulation running on a high speed workstation. A drawback of this technique, typical of conservative methods, is that processor utilization is rather low. To overcome this, new methods were developed that exploit slackness in event dependencies over short intervals of time, thereby raising the utilization to above 50 percent and the speedup over the optimized serial code to about 120 times.

  2. Delta: An object-oriented finite element code architecture for massively parallel computers

    SciTech Connect

    Weatherby, J.R.; Schutt, J.A.; Peery, J.S.; Hogan, R.E.

    1996-02-01

    Delta is an object-oriented code architecture based on the finite element method which enables simulation of a wide range of engineering mechanics problems in a parallel processing environment. Written in C++, Delta is a natural framework for algorithm development and for research involving coupling of mechanics from different Engineering Science disciplines. To enhance flexibility and encourage code reuse, the architecture provides a clean separation of the major aspects of finite element programming. Spatial discretization, temporal discretization, and the solution of linear and nonlinear systems of equations are each implemented separately, independent from the governing field equations. Other attractive features of the Delta architecture include support for constitutive models with internal variables, reusable "matrix-free" equation solvers, and support for region-to-region variations in the governing equations and the active degrees of freedom. A demonstration code built from the Delta architecture has been used in two-dimensional and three-dimensional simulations involving dynamic and quasi-static solid mechanics, transient and steady heat transport, and flow in porous media.
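
    The "matrix-free" idea mentioned above is worth a small illustration: an iterative solver such as conjugate gradients needs only a routine that applies the operator to a vector, so no global matrix is ever assembled. Below is a minimal Python sketch with a hypothetical 1D Laplacian operator, not Delta's C++ API.

        import numpy as np

        # Conjugate gradients written against an apply-only operator.
        def cg(apply_A, b, tol=1e-10, max_iter=200):
            x = np.zeros_like(b)
            r = b - apply_A(x)
            p, rs = r.copy(), r @ r
            for _ in range(max_iter):
                Ap = apply_A(p)
                alpha = rs / (p @ Ap)
                x += alpha * p
                r -= alpha * Ap
                rs_new = r @ r
                if np.sqrt(rs_new) < tol:
                    break
                p = r + (rs_new / rs) * p
                rs = rs_new
            return x

        def apply_A(v):                  # tridiagonal Laplacian, never stored
            w = 2.0 * v
            w[:-1] -= v[1:]
            w[1:] -= v[:-1]
            return w

        x = cg(apply_A, np.ones(50))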

  3. Massively-parallel electrical-conductivity imaging of hydrocarbonsusing the Blue Gene/L supercomputer

    SciTech Connect

    Commer, M.; Newman, G.A.; Carazzone, J.J.; Dickens, T.A.; Green,K.E.; Wahrmund, L.A.; Willen, D.E.; Shiu, J.

    2007-05-16

    Large-scale controlled source electromagnetic (CSEM) three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. To cope with the typically large computational requirements of the 3D CSEM imaging problem, our strategies exploit computational parallelism and optimized finite-difference meshing. We report on an imaging experiment, utilizing 32,768 tasks/processors on the IBM Watson Research Blue Gene/L (BG/L) supercomputer. Over a 24-hour period, we were able to image a large scale marine CSEM field data set that previously required over four months of computing time on distributed clusters utilizing 1024 tasks on an Infiniband fabric. The total initial data misfit could be decreased by 67 percent within 72 completed inversion iterations, indicating an electrically resistive region in the southern survey area below a depth of 1500 m below the seafloor. The major part of the residual misfit stems from transmitter parallel receiver components that have an offset from the transmitter sail line (broadside configuration). Modeling confirms that improved broadside data fits can be achieved by considering anisotropic electrical conductivities. While delivering a satisfactory gross scale image for the depths of interest, the experiment provides important evidence for the necessity of discriminating between horizontal and vertical conductivities for maximally consistent 3D CSEM inversions.

  4. Sassena — X-ray and neutron scattering calculated from molecular dynamics trajectories using massively parallel computers

    NASA Astrophysics Data System (ADS)

    Lindner, Benjamin; Smith, Jeremy C.

    2012-07-01

    Massively parallel computers now permit the molecular dynamics (MD) simulation of multi-million atom systems on time scales up to the microsecond. However, the subsequent analysis of the resulting simulation trajectories has now become a high performance computing problem in itself. Here, we present software for calculating X-ray and neutron scattering intensities from MD simulation data that scales well on massively parallel supercomputers. The calculation and data staging schemes used maximize the degree of parallelism and minimize the IO bandwidth requirements. The strong scaling tested on the Jaguar Petaflop Cray XT5 at Oak Ridge National Laboratory exhibits virtually linear scaling up to 7000 cores for most benchmark systems. Since both MPI and thread parallelism are supported, the software is flexible enough to cover scaling demands for different types of scattering calculations. The result is a high performance tool capable of unifying large-scale supercomputing and a wide variety of neutron/synchrotron technology.

    Catalogue identifier: AELW_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AELW_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: GNU General Public License, version 3
    No. of lines in distributed program, including test data, etc.: 1 003 742
    No. of bytes in distributed program, including test data, etc.: 798
    Distribution format: tar.gz
    Programming language: C++, OpenMPI
    Computer: Distributed Memory, Cluster of Computers with high performance network, Supercomputer
    Operating system: UNIX, LINUX, OSX
    Has the code been vectorized or parallelized?: Yes, the code has been parallelized using MPI directives. Tested with up to 7000 processors
    RAM: Up to 1 Gbytes/core
    Classification: 6.5, 8
    External routines: Boost Library, FFTW3, CMAKE, GNU C++ Compiler, OpenMPI, LibXML, LAPACK
    Nature of problem: Recent developments in supercomputing allow molecular dynamics simulations to
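
    The central quantity such software evaluates is simple to write down: the coherent intensity I(q) = <|sum_j b_j exp(i q.r_j)|^2>, averaged over trajectory frames (and, in practice, q orientations). The brute-force Python sketch below uses random coordinates and hypothetical scattering lengths; the frame and orientation sums are roughly what a code like this distributes across MPI ranks.

        import numpy as np

        # Coherent scattering intensity for one q vector, averaged over
        # trajectory frames. Frames and scattering lengths are hypothetical.
        def coherent_intensity(frames, b, q):
            amplitudes = [np.sum(b * np.exp(1j * (frame @ q)))
                          for frame in frames]
            return np.mean(np.abs(amplitudes) ** 2)

        frames = np.random.rand(10, 100, 3)   # 10 frames, 100 atoms
        b = np.ones(100)                      # scattering lengths
        I_q = coherent_intensity(frames, b, q=np.array([1.0, 0.0, 0.0]))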

  5. Modeling cardiovascular hemodynamics using the lattice Boltzmann method on massively parallel supercomputers

    NASA Astrophysics Data System (ADS)

    Randles, Amanda Elizabeth

    the modeling of fluids in vessels with smaller diameters and a method for introducing the deformational forces exerted on the arterial flows from the movement of the heart by borrowing concepts from cosmodynamics are presented. These additional forces have a great impact on the endothelial shear stress. Third, the fluid model is extended to not only recover Navier-Stokes hydrodynamics, but also a wider range of Knudsen numbers, which is especially important in micro- and nano-scale flows. The tradeoffs of many optimization methods such as the use of deep halo level ghost cells that, alongside hybrid programming models, reduce the impact of such higher-order models and enable efficient modeling of extreme regimes of computational fluid dynamics are discussed. Fourth, the extension of these models to other research questions like clogging in microfluidic devices and determining the severity of coarctation of the aorta is presented. Through this work, a validation of these methods by taking real patient data and the measured pressure value before the narrowing of the aorta and predicting the pressure drop across the coarctation is shown. Comparison with the measured pressure drop in vivo highlights the accuracy and potential impact of such patient specific simulations. Finally, a method to enable the simulation of longer trajectories in time by discretizing both spatially and temporally is presented. In this method, a serial coarse iterator is used to initialize data at discrete time steps for a fine model that runs in parallel. This coarse solver is based on a larger time step and typically a coarser discretization in space. Iterative refinement enables the compute-intensive fine iterator to be modeled with temporal parallelization. The algorithm consists of a series of predictor-corrector iterations completing when the results have converged within a certain tolerance. Combined, these developments allow large fluid models to be simulated for longer time durations.
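
    The coarse/fine time decomposition described in the last paragraph is a parareal-style iteration, which fits in a few lines for a model problem. The sketch below solves dx/dt = -x with an invented coarse propagator (one Euler step) and fine propagator (many Euler steps); it illustrates the scheme only, not the author's solver.

        import numpy as np

        def coarse(x, dt):                    # cheap, serial propagator
            return x * (1.0 - dt)

        def fine(x, dt):                      # expensive propagator; these
            for _ in range(100):              # calls run in parallel across
                x = x * (1.0 - dt / 100)      # time slices in a real code
            return x

        # Predictor-corrector: u[n+1] = G(u[n]) + F(u_old[n]) - G(u_old[n])
        def parareal(x0, n_slices, dt, n_iter):
            u = np.empty(n_slices + 1)
            u[0] = x0
            for n in range(n_slices):         # coarse seeding pass
                u[n + 1] = coarse(u[n], dt)
            for _ in range(n_iter):
                f = [fine(u[n], dt) for n in range(n_slices)]
                g = [coarse(u[n], dt) for n in range(n_slices)]
                for n in range(n_slices):     # serial correction sweep
                    u[n + 1] = coarse(u[n], dt) + f[n] - g[n]
            return u

        u = parareal(x0=1.0, n_slices=8, dt=0.1, n_iter=3)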

  6. Massively Parallel Geostatistical Inversion of Coupled Processes in Heterogeneous Porous Media

    NASA Astrophysics Data System (ADS)

    Ngo, A.; Schwede, R. L.; Li, W.; Bastian, P.; Ippisch, O.; Cirpka, O. A.

    2012-04-01

    another level of parallelization has been added.

  7. Massively-parallel neuromonitoring and neurostimulation rodent headset with nanotextured flexible microelectrodes.

    PubMed

    Bagheri, Arezu; Gabran, S R I; Salam, Muhammad Tariqus; Perez Velazquez, Jose Luis; Mansour, Raafat R; Salama, M M A; Genov, Roman

    2013-10-01

    We present a compact wireless headset for simultaneous multi-site neuromonitoring and neurostimulation in the rodent brain. The system comprises flexible-shaft microelectrodes, neural amplifiers, neurostimulators, a digital time-division multiplexer (TDM), a micro-controller and a ZigBee wireless transceiver. The system is built by parallelizing up to four 0.35 μm CMOS integrated circuits (each having 256 neural amplifiers and 64 neurostimulators) to provide a total maximum of 1024 neural amplifiers and 256 neurostimulators. Each bipolar neural amplifier features 54 dB-72 dB adjustable gain, 1 Hz-5 kHz adjustable bandwidth with an input-referred noise of 7.99 μVrms and dissipates 12.9 μW. Each current-mode bipolar neurostimulator generates programmable arbitrary-waveform biphasic current in the range of 20-250 μA and dissipates 2.6 μW in the stand-by mode. Reconfigurability is provided by stacking a set of dedicated mini-PCBs that share a common signaling bus within a volume as small as 22 × 30 × 15 mm³. The system features a flexible polyimide-based microelectrode array design that is not brittle and increases pad packing density. Pad nanotexturing by electrodeposition reduces the electrode-tissue interface impedance from an average of 2 MΩ to 30 kΩ at 100 Hz. The rodent headset and the microelectrode array have been experimentally validated in vivo in freely moving rats for two months. We demonstrate a 92.8 percent reduction in seizure rate by responsive neurostimulation in an acute epilepsy rat model. PMID:24144667

  8. High-Throughput Massively Parallel Sequencing for Fetal Aneuploidy Detection from Maternal Plasma

    PubMed Central

    Džakula, Željko; Kim, Sung K.; Mazloom, Amin R.; Zhu, Zhanyang; Tynan, John; Lu, Tim; McLennan, Graham; Palomaki, Glenn E.; Canick, Jacob A.; Oeth, Paul; Deciu, Cosmin; van den Boom, Dirk; Ehrich, Mathias

    2013-01-01

    Background Circulating cell-free (ccf) fetal DNA comprises 3–20% of all the cell-free DNA present in maternal plasma. Numerous research and clinical studies have described the analysis of ccf DNA using next generation sequencing for the detection of fetal aneuploidies with high sensitivity and specificity. We sought to extend the utility of this approach by assessing semi-automated library preparation, higher sample multiplexing during sequencing, and improved bioinformatic tools to enable a higher throughput, more efficient assay while maintaining or improving clinical performance. Methods Whole blood (10 mL) was collected from pregnant female donors and plasma was separated by centrifugation. Ccf DNA was extracted using column-based methods. Libraries were prepared using an optimized semi-automated library preparation method and sequenced on an Illumina HiSeq2000 sequencer in a 12-plex format. Z-scores were calculated for affected chromosomes using a robust method after normalization and genomic segment filtering. Classification was based upon a standard normal transformed cutoff value of z = 3 for chromosome 21 and z = 3.95 for chromosomes 18 and 13. Results Two parallel assay development studies using a total of more than 1900 ccf DNA samples were performed to evaluate the technical feasibility of automating library preparation and increasing the sample multiplexing level. These processes were subsequently combined and a study of 1587 samples was completed to verify the stability of the process-optimized assay. Finally, an unblinded clinical evaluation of 1269 euploid and aneuploid samples utilizing this high-throughput assay coupled to improved bioinformatic procedures was performed. We were able to correctly detect all aneuploid cases with extremely low false positive rates of 0.09%, <0.01%, and 0.08% for trisomies 21, 18, and 13, respectively. Conclusions These data suggest that the developed laboratory methods in concert with improved bioinformatic
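
    The classification step lends itself to a compact illustration: the fraction of reads mapping to the chromosome of interest is standardized against a euploid reference distribution and compared with the stated cutoffs (z = 3 for chromosome 21, z = 3.95 for chromosomes 18 and 13). The Python sketch below uses invented reference values and none of the study's normalization or segment filtering.

        import numpy as np

        # z-score call for one chromosome given a euploid reference set.
        def classify(sample_fraction, euploid_fractions, cutoff):
            mu = np.mean(euploid_fractions)
            sd = np.std(euploid_fractions, ddof=1)
            z = (sample_fraction - mu) / sd
            return z, ("aneuploidy detected" if z > cutoff else "unaffected")

        euploid = np.random.normal(0.0130, 0.0002, 500)  # toy chr21 fractions
        z, call = classify(0.0142, euploid, cutoff=3.0)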

  9. Massively-parallel FDTD simulations to address mask electromagnetic effects in hyper-NA immersion lithography

    NASA Astrophysics Data System (ADS)

    Tirapu Azpiroz, Jaione; Burr, Geoffrey W.; Rosenbluth, Alan E.; Hibbs, Michael

    2008-03-01

    In the Hyper-NA immersion lithography regime, the electromagnetic response of the reticle is known to deviate in a complicated manner from the idealized Thin-Mask-like behavior. Already, this is driving certain RET choices, such as the use of polarized illumination and the customization of reticle film stacks. Unfortunately, full 3-D electromagnetic mask simulations are computationally intensive. And while OPC-compatible mask electromagnetic field (EMF) models can offer a reasonable tradeoff between speed and accuracy for full-chip OPC applications, full understanding of these complex physical effects demands higher accuracy. Our paper describes recent advances in leveraging High Performance Computing as a critical step towards lithographic modeling of the full manufacturing process. In this paper, highly accurate full 3-D electromagnetic simulations of very large mask layouts are conducted in parallel with reasonable turnaround time, using a BlueGene/L supercomputer and a Finite-Difference Time-Domain (FDTD) code developed internally within IBM. A 3-D simulation of a large 2-D layout spanning 5μm×5μm at the wafer plane (and thus 20μm×20μm×0.5μm at the mask) results in a simulation with roughly 12.5GB of memory (grid size of 10nm at the mask, single-precision computation, about 30 bytes/grid point). FDTD is flexible and easily parallelizable to enable full simulations of such a large layout in approximately an hour using one BlueGene/L "midplane" containing 512 dual-processor nodes with 256MB of memory per processor. Our scaling studies on BlueGene/L demonstrate that simulations up to 100μm × 100μm at the mask can be computed in a few hours. Finally, we will show that the use of a subcell technique permits accurate simulation of features smaller than the grid discretization, thus improving on the tradeoff between computational complexity and simulation accuracy. We demonstrate the correlation of the real and quadrature components that comprise the
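
    For readers who have not met FDTD, the method's core is a leapfrog update of E and H on staggered grids; everything else (3D geometry, material models, subcell techniques, parallel decomposition) is built on top of it. Below is a minimal 1D Python sketch with an invented grid and source, far removed from the 3D mask solver described above.

        import numpy as np

        # 1D FDTD leapfrog: H is updated from the spatial difference of E,
        # then E from the difference of H; a soft Gaussian source drives it.
        nz, nt = 400, 600
        ez, hy = np.zeros(nz), np.zeros(nz - 1)
        c = 0.5                                 # Courant number c*dt/dz
        for t in range(nt):
            hy += c * (ez[1:] - ez[:-1])        # update H from curl E
            ez[1:-1] += c * (hy[1:] - hy[:-1])  # update E from curl H
            ez[nz // 4] += np.exp(-((t - 60) / 20.0) ** 2)  # soft source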

  10. Assessing mutant p53 in primary high-grade serous ovarian cancer using immunohistochemistry and massively parallel sequencing

    PubMed Central

    Cole, Alexander J.; Dwight, Trisha; Gill, Anthony J.; Dickson, Kristie-Ann; Zhu, Ying; Clarkson, Adele; Gard, Gregory B.; Maidens, Jayne; Valmadre, Susan; Clifton-Bligh, Roderick; Marsh, Deborah J.

    2016-01-01

    The tumour suppressor p53 is mutated in cancer, including over 96% of high-grade serous ovarian cancer (HGSOC). Mutations cause loss of wild-type p53 function, due either to gain of abnormal function by mutant p53 (mutp53) or to absent-to-low mutp53 expression. Massively parallel sequencing (MPS) enables increased accuracy of detection of somatic variants in heterogeneous tumours. We used MPS and immunohistochemistry (IHC) to characterise HGSOCs for TP53 mutation and p53 expression. TP53 mutation was identified in 94% (68/72) of HGSOCs, 62% of which were missense. Missense mutations demonstrated high p53 by IHC, as did 35% (9/26) of non-missense mutations. Low p53 was seen by IHC in 62% of HGSOC associated with non-missense mutations. Most wild-type TP53 tumours (75%, 6/8) displayed intermediate p53 levels. The overall sensitivity of detecting a TP53 mutation based on classification as ‘Low’, ‘Intermediate’ or ‘High’ for p53 IHC was 99%, with a specificity of 75%. We suggest p53 IHC can be used as a surrogate marker of TP53 mutation in HGSOC; however, this will result in misclassification of a proportion of TP53 wild-type and mutant tumours. Therapeutic targeting of mutp53 will require knowledge of both TP53 mutations and mutp53 expression. PMID:27189670

  11. Massively parallel E-beam inspection: enabling next-generation patterned defect inspection for wafer and mask manufacturing

    NASA Astrophysics Data System (ADS)

    Malloy, Matt; Thiel, Brad; Bunday, Benjamin D.; Wurm, Stefan; Mukhtar, Maseeh; Quoi, Kathy; Kemen, Thomas; Zeidler, Dirk; Eberle, Anna Lena; Garbowski, Tomasz; Dellemann, Gregor; Peters, Jan Hendrik

    2015-03-01

    SEMATECH aims to identify and enable disruptive technologies to meet the ever-increasing demands of semiconductor high volume manufacturing (HVM). As such, a program was initiated in 2012 focused on high-speed e-beam defect inspection as a complement, and eventual successor, to bright field optical patterned defect inspection [1]. The primary goal is to enable a new technology to overcome the key gaps that are limiting modern day inspection in the fab; primarily, throughput and sensitivity to detect ultra-small critical defects. The program specifically targets revolutionary solutions based on massively parallel e-beam technologies, as opposed to incremental improvements to existing e-beam and optical inspection platforms. Wafer inspection is the primary target, but attention is also being paid to next generation mask inspection. During the first phase of the multi-year program multiple technologies were reviewed, a down-selection was made to the top candidates, and evaluations began on proof of concept systems. A champion technology has been selected and as of late 2014 the program has begun to move into the core technology maturation phase in order to enable eventual commercialization of an HVM system. Performance data from early proof of concept systems will be shown along with roadmaps to achieving HVM performance. SEMATECH's vision for moving from early-stage development to commercialization will be shown, including plans for development with industry leading technology providers.

  12. Non-CAR resists and advanced materials for Massively Parallel E-Beam Direct Write process integration

    NASA Astrophysics Data System (ADS)

    Pourteau, Marie-Line; Servin, Isabelle; Lepinay, Kévin; Essomba, Cyrille; Dal'Zotto, Bernard; Pradelles, Jonathan; Lattard, Ludovic; Brandt, Pieter; Wieland, Marco

    2016-03-01

    The emerging Massively Parallel Electron Beam Direct Write (MP-EBDW) is an attractive high-resolution, high-throughput lithography technology. As previously shown, Chemically Amplified Resists (CARs) meet process/integration specifications in terms of dose-to-size, resolution, contrast, and energy latitude. However, they are still limited by their line width roughness. To overcome this issue, we tested an alternative advanced non-CAR and showed that it brings a substantial gain in sensitivity compared to CAR. We also implemented and assessed in-line post-lithographic treatments for roughness mitigation. For outgassing-reduction purposes, a top-coat layer is added to the total process stack. A new-generation top-coat was tested and showed improved printing performances compared to the previous product, especially avoiding dark erosion: SEM cross-section showed a straight pattern profile. A spin-coatable charge dissipation layer based on conductive polyaniline has also been tested for conductivity and lithographic performances, and compatibility experiments revealed that the underlying resist type has to be carefully chosen when using this product. Finally, the Process Of Reference (POR) trilayer stack defined for 5 kV multi-e-beam lithography was successfully etched with well-opened and straight patterns, and no lithography-etch bias.

  13. LiNbO3: A photovoltaic substrate for massive parallel manipulation and patterning of nano-objects

    NASA Astrophysics Data System (ADS)

    Carrascosa, M.; García-Cabañes, A.; Jubera, M.; Ramiro, J. B.; Agulló-López, F.

    2015-12-01

    The application of evanescent photovoltaic (PV) fields, generated by visible illumination of Fe:LiNbO3 substrates, for parallel massive trapping and manipulation of micro- and nano-objects is critically reviewed. The technique has been often referred to as photovoltaic or photorefractive tweezers. The main advantage of the new method is that the involved electrophoretic and/or dielectrophoretic forces do not require any electrodes and large scale manipulation of nano-objects can be easily achieved using the patterning capabilities of light. The paper describes the experimental techniques for particle trapping and the main reported experimental results obtained with a variety of micro- and nano-particles (dielectric and conductive) and different illumination configurations (single beam, holographic geometry, and spatial light modulator projection). The report also pays attention to the physical basis of the method, namely, the coupling of the evanescent photorefractive fields to the dielectric response of the nano-particles. The role of a number of physical parameters such as the contrast and spatial periodicities of the illumination pattern or the particle deposition method is discussed. Moreover, the main properties of the obtained particle patterns in relation to potential applications are summarized, and first demonstrations reviewed. Finally, the PV method is discussed in comparison to other patterning strategies, such as those based on the pyroelectric response and the electric fields associated to domain poling of ferroelectric materials.

  14. LiNbO3: A photovoltaic substrate for massive parallel manipulation and patterning of nano-objects

    SciTech Connect

    Carrascosa, M.; García-Cabañes, A.; Jubera, M.; Ramiro, J. B.; Agulló-López, F.

    2015-12-15

    The application of evanescent photovoltaic (PV) fields, generated by visible illumination of Fe:LiNbO3 substrates, for parallel massive trapping and manipulation of micro- and nano-objects is critically reviewed. The technique has been often referred to as photovoltaic or photorefractive tweezers. The main advantage of the new method is that the involved electrophoretic and/or dielectrophoretic forces do not require any electrodes and large scale manipulation of nano-objects can be easily achieved using the patterning capabilities of light. The paper describes the experimental techniques for particle trapping and the main reported experimental results obtained with a variety of micro- and nano-particles (dielectric and conductive) and different illumination configurations (single beam, holographic geometry, and spatial light modulator projection). The report also pays attention to the physical basis of the method, namely, the coupling of the evanescent photorefractive fields to the dielectric response of the nano-particles. The role of a number of physical parameters such as the contrast and spatial periodicities of the illumination pattern or the particle deposition method is discussed. Moreover, the main properties of the obtained particle patterns in relation to potential applications are summarized, and first demonstrations reviewed. Finally, the PV method is discussed in comparison to other patterning strategies, such as those based on the pyroelectric response and the electric fields associated to domain poling of ferroelectric materials.

  15. Use of Massive Parallel Computing Libraries in the Context of Global Gravity Field Determination from Satellite Data

    NASA Astrophysics Data System (ADS)

    Brockmann, J. M.; Schuh, W.-D.

    2011-07-01

    The estimation of the global Earth's gravity field parametrized as a finite spherical harmonic series is computationally demanding. The computational effort depends on the one hand on the maximal resolution of the spherical harmonic expansion (i.e. the number of parameters to be estimated) and on the other hand on the number of observations (several million, e.g. for observations from the GOCE satellite mission). To circumvent these restrictions, massively parallel software based on high-performance computing (HPC) libraries such as ScaLAPACK, PBLAS and BLACS was designed in the context of GOCE HPF WP6000 and the GOCO consortium. A prerequisite for the use of these libraries is that all matrices are distributed block-cyclically on a processor grid comprising a large number of (distributed-memory) computers. Using this set of standard HPC libraries has the benefit that once the matrices are distributed across the computer cluster, a huge set of efficient and highly scalable linear algebra operations can be used.
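
    The block-cyclic layout these libraries assume is easy to illustrate: a global matrix entry (i, j) belongs to the process found by taking its block index modulo the process-grid shape. The small Python sketch below uses a hypothetical block size and grid; in practice the mapping is handled by BLACS descriptors.

        # Owner of global entry (i, j) under a block-cyclic distribution
        # with mb x nb blocks on a p_rows x p_cols process grid.
        def owner(i, j, mb, nb, p_rows, p_cols):
            return ((i // mb) % p_rows, (j // nb) % p_cols)

        # A 2 x 3 process grid with 64 x 64 blocks:
        print(owner(130, 200, mb=64, nb=64, p_rows=2, p_cols=3))  # -> (0, 0)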

  16. Feasibility of using the Massively Parallel Processor for large eddy simulations and other Computational Fluid Dynamics applications

    NASA Technical Reports Server (NTRS)

    Bruno, John

    1984-01-01

    The results of an investigation into the feasibility of using the MPP for direct and large eddy simulations of the Navier-Stokes equations are presented. A major part of this study was devoted to the implementation of two of the standard numerical algorithms for CFD. These implementations were not run on the Massively Parallel Processor (MPP) since the machine delivered to NASA Goddard does not have sufficient capacity. Instead, a detailed implementation plan was designed and from it were derived estimates of the time and space requirements of the algorithms on a suitably configured MPP. In addition, other issues related to the practical implementation of these algorithms on an MPP-like architecture were considered; namely, adaptive grid generation, zonal boundary conditions, the table lookup problem, and the software interface. Performance estimates show that the architectural components of the MPP, the Staging Memory and the Array Unit, appear to be well suited to the numerical algorithms of CFD. This combined with the prospect of building a faster and larger MPP-like machine holds the promise of achieving sustained gigaflop rates that are required for the numerical simulations in CFD.

  17. A massively parallel track-finding system for the LEVEL 2 trigger in the CLAS detector at CEBAF

    SciTech Connect

    Doughty, D.C. Jr.; Collins, P.; Lemon, S.; Bonneau, P.

    1994-02-01

    The track segment finding subsystem of the LEVEL 2 trigger in the CLAS detector has been designed and prototyped. Track segments will be found in the 35,076 wires of the drift chambers using a massively parallel array of 768 Xilinx XC-4005 FPGAs. These FPGAs are located on daughter cards attached to the front-end boards distributed around the detector. Each chip is responsible for finding tracks passing through a 4 × 6 slice of an axial superlayer, and reports two segment-found bits, one for each pair of cells. The algorithm used finds segments even when one or two layers or cells along the track are missing (this number is programmable), while being highly resistant to false segments arising from noise hits. Adjacent chips share data to find tracks crossing cell and board boundaries. For maximum speed, fully combinatorial logic is used inside each chip, with the result that all segments in the detector are found within 150 ns. Segment collection boards gather track segments from each axial superlayer and pass them via a high speed link to the segment linking subsystem in an additional 400 ns for typical events. The Xilinx chips are RAM-based and therefore reprogrammable, allowing for future upgrades and algorithm enhancements.

  18. Determination of the Allelic Frequency in Smith-Lemli-Opitz Syndrome by Analysis of Massively Parallel Sequencing Data Sets

    PubMed Central

    Cross, Joanna L.; Iben, James; Simpson, Claire; Thurm, Audrey; Swedo, Susan; Tierney, Elaine; Bailey-Wilson, Joan; Biesecker, Leslie G.; Porter, Forbes D.; Wassif, Christopher A.

    2014-01-01

    Data from massively parallel sequencing or “Next Generation Sequencing” of the human exome has reached a critical mass in both public and private databases, in that these collections now allow researchers to critically evaluate population genetics in a manner that was not feasible a decade ago. The ability to determine pathogenic allele frequencies by evaluation of the full coding sequences and not merely a single SNP or series of SNPs will lead to more accurate estimations of incidence. For demonstrative purposes we analyzed the causative gene for the disorder Smith-Lemli-Opitz Syndrome (SLOS), the 7-dehydrocholesterol reductase (DHCR7) gene and determined both the carrier frequency for DHCR7 mutations, and predicted an expected incidence of the disorder. Estimations of the incidence of SLOS have ranged widely from 1:10,000 to 1:70,000 while the carrier frequency has been reported as high as 1 in 30. Using four exome data sets with a total of 17,836 chromosomes, we ascertained a carrier frequency of pathogenic DHCR7 mutations of 1.01%, and predict a SLOS disease incidence of 1/39,215 conceptions. This approach highlights yet another valuable aspect of the exome sequencing databases, to inform clinical and health policy decisions related to genetic counseling, prenatal testing and newborn screening. PMID:24813812
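
    The carrier-to-incidence arithmetic here is standard Hardy-Weinberg reasoning: a carrier frequency c implies a pathogenic allele frequency of roughly c/2, and an expected affected-conception rate of (c/2)^2. A short Python check with the rounded 1.01% figure comes out close to the reported value; the paper's exact 1/39,215 presumably reflects the unrounded carrier frequency.

        # (0.0101 / 2)^2 = 2.55e-5, i.e. roughly 1 affected per 39,000
        # conceptions; the rounded input gives 1/39,212 versus the
        # reported 1/39,215.
        carrier_frequency = 0.0101
        incidence = (carrier_frequency / 2) ** 2
        print(f"1/{1 / incidence:,.0f}")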

  19. Massively parallel sequencing of short tandem repeats-Population data and mixture analysis results for the PowerSeq™ system.

    PubMed

    van der Gaag, Kristiaan J; de Leeuw, Rick H; Hoogenboom, Jerry; Patel, Jaynish; Storts, Douglas R; Laros, Jeroen F J; de Knijff, Peter

    2016-09-01

    Current forensic DNA analysis predominantly involves identification of human donors by analysis of short tandem repeats (STRs) using Capillary Electrophoresis (CE). Recent developments in Massively Parallel Sequencing (MPS) technologies offer new possibilities in analysis of STRs since they might overcome some of the limitations of CE analysis. In this study 17 STRs and Amelogenin were sequenced at high coverage using a prototype version of the Promega PowerSeq™ system for 297 population samples from the Netherlands, Nepal, Bhutan and Central African Pygmies. In addition, 45 two-person mixtures with different minor contributions down to 1% were analysed to investigate the performance of this system for mixed samples. Regarding fragment length, complete concordance between the MPS and CE-based data was found, confirming the reliability of the MPS PowerSeq™ system. As expected, MPS presented a broader allele range and higher power of discrimination and exclusion rate. The high-coverage sequencing data were used to determine stutter characteristics for all loci and stutter ratios were compared to CE data. The separation of alleles with the same length but exhibiting different stutter ratios lowers the overall variation in stutter ratio and helps in differentiation of stutters from genuine alleles in mixed samples. All alleles of the minor contributors were detected in the sequence reads even for the 1% contributions, but analysis of mixtures below 5% without prior information of the mixture ratio is complicated by PCR and sequencing artefacts. PMID:27347657

  20. Combined fragment molecular orbital cluster in molecule approach to massively parallel electron correlation calculations for large systems.

    PubMed

    Findlater, Alexander D; Zahariev, Federico; Gordon, Mark S

    2015-04-16

    The local correlation "cluster-in-molecule" (CIM) method is combined with the fragment molecular orbital (FMO) method, providing a flexible, massively parallel, and near-linear scaling approach to the calculation of electron correlation energies for large molecular systems. Although the computational scaling of the CIM algorithm is already formally linear, previous knowledge of the Hartree-Fock (HF) reference wave function and subsequent localized orbitals is required; therefore, extending the CIM method to arbitrarily large systems requires the aid of low-scaling/linear-scaling approaches to HF and orbital localization. Through fragmentation, the combined FMO-CIM method linearizes the scaling, with respect to system size, of the HF reference and orbital localization calculations, achieving near-linear scaling at both the reference and electron correlation levels. For the 20-residue alanine α helix, the preliminary implementation of the FMO-CIM method captures 99.6% of the MP2 correlation energy, requiring 21% of the MP2 wall time. The new method is also applied to solvated adamantine to illustrate the multilevel capability of the FMO-CIM method. PMID:25794346

  1. Determination of the allelic frequency in Smith-Lemli-Opitz syndrome by analysis of massively parallel sequencing data sets.

    PubMed

    Cross, J L; Iben, J; Simpson, C L; Thurm, A; Swedo, S; Tierney, E; Bailey-Wilson, J E; Biesecker, L G; Porter, F D; Wassif, C A

    2015-06-01

    Data from massively parallel sequencing or 'Next Generation Sequencing' of the human exome has reached a critical mass in both public and private databases, in that these collections now allow researchers to critically evaluate population genetics in a manner that was not feasible a decade ago. The ability to determine pathogenic allele frequencies by evaluation of the full coding sequences and not merely a single nucleotide polymorphism (SNP) or series of SNPs will lead to more accurate estimations of incidence. For demonstrative purposes, we analyzed the causative gene for the disorder Smith-Lemli-Opitz Syndrome (SLOS), the 7-dehydrocholesterol reductase (DHCR7) gene and determined both the carrier frequency for DHCR7 mutations, and predicted an expected incidence of the disorder. Estimations of the incidence of SLOS have ranged widely from 1:10,000 to 1:70,000 while the carrier frequency has been reported as high as 1 in 30. Using four exome data sets with a total of 17,836 chromosomes, we ascertained a carrier frequency of pathogenic DHCR7 mutations of 1.01%, and predict a SLOS disease incidence of 1/39,215 conceptions. This approach highlights yet another valuable aspect of the exome sequencing databases, to inform clinical and health policy decisions related to genetic counseling, prenatal testing and newborn screening. PMID:24813812

  2. Masking fields: a massively parallel neural architecture for learning, recognizing, and predicting multiple groupings of patterned data

    SciTech Connect

    Cohen, M.A.; Grossberg, S.

    1987-05-15

    A massively parallel neural-network architecture, called a masking field, is characterized through systematic computer simulations. A masking field can simultaneously detect multiple groupings within its input patterns and assign activation weights to the codes for these groupings that are predictive with respect to the contextual information embedded within the patterns and the prior learning of the system. A masking field automatically rescales its sensitivity as the overall size of an input pattern changes, yet also remains sensitive to the microstructure within each pattern. Thus, a masking field suggests a solution of the credit assignment problem by embodying a real-time code for the predictive evidence contained within its input patterns. Such capabilities are useful in speech recognition, visual object recognition, and cognitive information processing. An absolutely stable design for a masking field is disclosed through an analysis of the computer simulations. This design suggests how associative mechanisms, cooperative-competitive interactions, and modulatory gating signals can be joined together to regulate the learning of compressed recognition codes. Data about the neural substrates of learning and memory are compared to these mechanisms.

  3. Massively parallel rRNA gene sequencing exacerbates the potential for biased community diversity comparisons due to variable library sizes

    SciTech Connect

    Gihring, Thomas; Green, Stefan; Schadt, Christopher Warren

    2011-01-01

    Technologies for massively parallel sequencing are revolutionizing microbial ecology and are vastly increasing the scale of ribosomal RNA (rRNA) gene studies. Although pyrosequencing has increased the breadth and depth of possible rRNA gene sampling, one drawback is that the number of reads obtained per sample is difficult to control. Pyrosequencing libraries typically vary widely in the number of sequences per sample, even within individual studies, and there is a need to revisit the behaviour of richness estimators and diversity indices with variable gene sequence library sizes. Multiple reports and review papers have demonstrated the bias in non-parametric richness estimators (e.g. Chao1 and ACE) and diversity indices when using clone libraries. However, we found that biased community comparisons are accumulating in the literature. Here we demonstrate the effects of sample size on Chao1, ACE, CatchAll, Shannon, Chao-Shen and Simpson's estimations specifically using pyrosequencing libraries. The need to equalize the number of reads being compared across libraries is reiterated, and investigators are directed towards available tools for making unbiased diversity comparisons.
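
    To make the library-size effect concrete, here is a generic sketch (invented counts, not data from the study) that computes a richness estimator (Chao1) and a diversity index (Shannon) on a deep library and on the same community rarefied to a common depth, the corrective step the authors recommend.

```python
import random
from collections import Counter
from math import log

def chao1(counts):
    f1 = sum(1 for c in counts if c == 1)                # singletons
    f2 = sum(1 for c in counts if c == 2)                # doubletons
    return len(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))  # bias-corrected estimator

def shannon(counts):
    n = sum(counts)
    return -sum((c / n) * log(c / n) for c in counts)

def rarefy(counts, depth, seed=0):
    # randomly subsample the library to a fixed number of reads
    reads = [otu for otu, c in enumerate(counts) for _ in range(c)]
    random.Random(seed).shuffle(reads)
    return list(Counter(reads[:depth]).values())

deep = [120, 60, 30] + [1] * 40        # 250-read library with many singletons
shallow = rarefy(deep, 100)            # same community observed at 100 reads
print(chao1(deep), chao1(shallow))     # richness estimates differ with depth
print(shannon(deep), shannon(shallow))
```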

  4. Enabling inspection solutions for future mask technologies through the development of massively parallel E-Beam inspection

    NASA Astrophysics Data System (ADS)

    Malloy, Matt; Thiel, Brad; Bunday, Benjamin D.; Wurm, Stefan; Jindal, Vibhu; Mukhtar, Maseeh; Quoi, Kathy; Kemen, Thomas; Zeidler, Dirk; Eberle, Anna Lena; Garbowski, Tomasz; Dellemann, Gregor; Peters, Jan Hendrik

    2015-09-01

    The new device architectures and materials being introduced for sub-10nm manufacturing, combined with the complexity of multiple patterning and the need for improved hotspot detection strategies, have pushed current wafer inspection technologies to their limits. In parallel, gaps in mask inspection capability are growing as new generations of mask technologies are developed to support these sub-10nm wafer manufacturing requirements. In particular, the challenges associated with nanoimprint and extreme ultraviolet (EUV) mask inspection require new strategies that enable fast inspection at high sensitivity. The tradeoffs between sensitivity and throughput for optical and e-beam inspection are well understood. Optical inspection offers the highest throughput and is the current workhorse of the industry for both wafer and mask inspection. E-beam inspection offers the highest sensitivity but has historically lacked the throughput required for widespread adoption in the manufacturing environment. It is unlikely that continued incremental improvements to either technology will meet tomorrow's requirements, and therefore a new inspection technology approach is required: one that combines the high-throughput performance of optical inspection with the high-sensitivity capabilities of e-beam inspection. To support the industry in meeting these challenges, SUNY Poly SEMATECH has evaluated disruptive technologies that can meet the requirements for high volume manufacturing (HVM), for both the wafer fab [1] and the mask shop. High-speed massively parallel e-beam defect inspection has been identified as the leading candidate for addressing the key gaps limiting today's patterned-defect inspection techniques. As of late 2014, SUNY Poly SEMATECH had completed a review, system analysis, and proof-of-concept evaluation of multiple e-beam technologies for defect inspection. A champion approach has been identified based on a multibeam technology from Carl Zeiss.

  5. My-Forensic-Loci-queries (MyFLq) framework for analysis of forensic STR data generated by massive parallel sequencing.

    PubMed

    Van Neste, Christophe; Vandewoestyne, Mado; Van Criekinge, Wim; Deforce, Dieter; Van Nieuwerburgh, Filip

    2014-03-01

    Forensic scientists are currently investigating how to transition from capillary electrophoresis (CE) to massive parallel sequencing (MPS) for analysis of forensic DNA profiles. MPS offers several advantages over CE, such as virtually unlimited multiplexing of loci, combining both short tandem repeat (STR) and single nucleotide polymorphism (SNP) loci, small amplicons without constraints of size separation, more discrimination power, deep mixture resolution and sample multiplexing. We present our bioinformatic framework My-Forensic-Loci-queries (MyFLq) for analysis of MPS forensic data. For allele calling, the framework uses a MySQL reference allele database with automatically determined regions of interest (ROIs), found by a generic maximal flanking algorithm, which makes it possible to use any STR or SNP forensic locus. Python scripts were designed to make allele calls automatically starting from raw MPS data. We also present a method to assess the usefulness and overall performance of a forensic locus with respect to MPS, as well as methods to estimate whether an unknown allele, whose sequence is not present in the MySQL database, is in fact a new allele or a sequencing error. The MyFLq framework was applied to an Illumina MiSeq dataset of a forensic Illumina amplicon library, generated by multilocus STR polymerase chain reaction (PCR) on both single-contributor samples and multiple-person DNA mixtures. Although the multilocus PCR was not yet optimized for MPS in terms of amplicon length or locus selection, the results were excellent for most loci, showing a high signal-to-noise ratio, correct allele calls, and a low limit of detection for minor DNA contributors in mixed DNA samples. Technically, forensic MPS affords great promise for routine implementation in forensic genomics. The method is also applicable to adjacent disciplines such as molecular autopsy in legal medicine and mitochondrial DNA research. PMID:24528572
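
    A hypothetical sketch of flank-based allele calling in the spirit of the ROI approach described above: locate known 5′ and 3′ flanks in each read and tally the intervening sequence. The flanks, reads, and threshold are invented; the actual framework derives flanks from its MySQL reference database.

```python
from collections import Counter

def call_alleles(reads, flank5, flank3, min_reads=10):
    """Tally the region of interest (ROI) between two known flanks."""
    tally = Counter()
    for read in reads:
        i = read.find(flank5)
        j = read.find(flank3, i + len(flank5)) if i != -1 else -1
        if i != -1 and j != -1:
            tally[read[i + len(flank5):j]] += 1   # ROI sequence is the allele
    # report ROIs above a simple read-count threshold as candidate alleles
    return {roi: n for roi, n in tally.items() if n >= min_reads}

reads = ["TTCCG" + "AGAT" * 10 + "GACCA"] * 50 + \
        ["TTCCG" + "AGAT" * 9 + "GACCA"] * 4      # low-level stutter reads
print(call_alleles(reads, "TTCCG", "GACCA"))      # stutter (4 reads) filtered out
```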

  6. Comprehensive Assessment of Potential Multiple Myeloma Immunoglobulin Heavy Chain V-D-J Intraclonal Variation Using Massively Parallel Pyrosequencing

    PubMed Central

    Tschumper, Renee C.; Asmann, Yan W.; Hossain, Asif; Huddleston, Paul M.; Wu, Xiaosheng; Dispenzieri, Angela; Eckloff, Bruce W.; Jelinek, Diane F.

    2012-01-01

    Multiple myeloma (MM) is characterized by the accumulation of malignant plasma cells (PCs) in the bone marrow (BM). MM is viewed as a clonal disorder due to lack of verified intraclonal sequence diversity in the immunoglobulin heavy chain variable region gene (IGHV). However, this conclusion is based on analysis of a very limited number of IGHV subclones and the methodology employed did not permit simultaneous analysis of the IGHV repertoire of non-malignant PCs in the same samples. Here we generated genomic DNA and cDNA libraries from purified MM BMPCs and performed massively parallel pyrosequencing to determine the frequency of cells expressing identical IGHV sequences. This method provided an unprecedented opportunity to interrogate the presence of clonally related MM cells and evaluate the IGHV repertoire of non-MM PCs. Within the MM sample, 37 IGHV genes were expressed, with 98.9% of all immunoglobulin sequences using the same IGHV gene as the MM clone and 83.0% exhibiting exact nucleotide sequence identity in the IGHV and heavy chain complementarity determining region 3 (HCDR3). Of interest, we observed in both genomic DNA and cDNA libraries 48 sets of identical sequences with single point mutations in the MM clonal IGHV or HCDR3 regions. These nucleotide changes were suggestive of putative subclones and therefore were subjected to detailed analysis to interpret: 1) their legitimacy as true subclones; and 2) their significance in the context of MM. Finally, we report for the first time the IGHV repertoire of normal human BMPCs and our data demonstrate the extent of IGHV repertoire diversity as well as the frequency of clonally-related normal BMPCs. This study demonstrates the power and potential weaknesses of in-depth sequencing as a tool to thoroughly investigate the phylogeny of malignant PCs in MM and the IGHV repertoire of normal BMPCs. PMID:22522905

  7. Investigating the effect of two methane-mitigating diets on the rumen microbiome using massively parallel sequencing.

    PubMed

    Ross, E M; Moate, P J; Marett, L; Cocks, B G; Hayes, B J

    2013-09-01

    Variation in the composition of microorganisms in the rumen (the rumen microbiome) of dairy cattle (Bos taurus) is of great interest because of possible links to methane emission levels. Feed additives are one method being investigated to reduce enteric methane production by dairy cattle. Here we report the effect of 2 methane-mitigating feed additives (grapemarc and a combination of lipids and tannin) on rumen microbiome profiles of Holstein dairy cattle. We used untargeted (shotgun) massively parallel sequencing of microbes present in rumen fluid to generate quantitative rumen microbiome profiles. We observed large effects of the feed additives on the rumen microbiome profiles using multiple approaches, including linear mixed modeling, hierarchical clustering, and metagenomic predictions. The effect on the fecal microbiome profiles was not detectable using hierarchical clustering, but was significant in the linear mixed model and when metagenomic predictions were used, suggesting a more subtle effect of the diets on the lower gastrointestinal microbiome. A differential representation analysis (analogous to differential expression in RNA sequencing) showed significant overlap in the contigs (which are genome fragments representing different microorganism species) that were differentially represented between experiments. These similarities suggest that, despite the different additives used, the 2 diets assessed in this investigation altered the microbiomes of the samples in similar ways. Contigs that were differentially represented in both experiments were tested for associations with methane production in an independent set of animals. These animals were not treated with a methane-mitigating diet, but did show substantial natural variation in methane emission levels. The contigs that were significantly differentially represented in response to both dietary additives showed a significant enrichment for associations with methane production.

  8. A whole-genome massively parallel sequencing analysis of BRCA1 mutant oestrogen receptor negative and positive breast cancers

    PubMed Central

    Weigelt, Britta; Wilkerson, Paul M; Manie, Elodie; Grigoriadis, Anita; A’Hern, Roger; van der Groep, Petra; Kozarewa, Iwanka; Popova, Tatiana; Mariani, Odette; Turaljic, Samra; Furney, Simon J; Marais, Richard; Rodruigues, Daniel-Nava; Flora, Adriana C; Wai, Patty; Pawar, Vidya; McDade, Simon; Carroll, Jason; Stoppa-Lyonnet, Dominique; Green, Andrew R; Ellis, Ian O; Swanton, Charles; van Diest, Paul; Delattre, Olivier; Lord, Christopher J; Foulkes, William D; Vincent-Salomon, Anne; Ashworth, Alan; Stern, Marc Henri; Reis-Filho, Jorge S

    2016-01-01

    BRCA1 encodes a tumour suppressor protein that plays pivotal roles in homologous recombination (HR) DNA repair, cell-cycle checkpoints, and transcriptional regulation. BRCA1 germline mutations confer a high risk of early-onset breast and ovarian cancer. In >80% of cases, tumours arising in BRCA1 germline mutation carriers are oestrogen receptor (ER)-negative; however, up to 15% are ER-positive. It has been suggested that BRCA1 ER-positive breast cancers constitute sporadic cancers arising in the context of a BRCA1 germline mutation rather than being causally related to BRCA1 loss-of-function. Whole-genome massively parallel sequencing of ER-positive and ER-negative BRCA1 breast cancers, and their respective germline DNAs, was used to characterise the genetic landscape of BRCA1 cancers at base-pair resolution. Only BRCA1 germline mutations and somatic loss of the wild-type allele, and TP53 somatic mutations were recurrently found in the index cases. BRCA1 breast cancers displayed a mutational signature consistent with that caused by lack of HR DNA repair in both ER-positive and ER-negative cases. Sequencing analysis of independent cohorts of hereditary BRCA1 and sporadic non-BRCA1 breast cancers for the presence of recurrent pathogenic mutations and/or homozygous deletions found in the index cases revealed that DAPK3, TMEM135, KIAA1797, PDE4D and GATA4 are potential additional drivers of breast cancers. This study demonstrates that BRCA1 pathogenic germline mutations coupled with somatic loss of the wild-type allele are not sufficient for hereditary breast cancers to display an ER-negative phenotype, and has led to the identification of three potential novel breast cancer genes (i.e. DAPK3, TMEM135 and GATA4). PMID:22362584

  9. Massively-parallel sequencing of genes on a single chromosome: a comparison of solution hybrid selection and flow sorting

    PubMed Central

    2013-01-01

    Background Targeted capture, combined with massively-parallel sequencing, is a powerful technique that allows investigation of specific portions of the genome for less cost than whole genome sequencing. Several methods have been developed, and improvements have resulted in commercial products targeting the human or mouse exonic regions (the exome). In some cases it is desirable to custom-target other regions of the genome, either to reduce the amount of sequence that is targeted or to capture regions that are not targeted by commercial kits. It is important to understand the advantages, limitations, and complexity of a given capture method before embarking on a targeted sequencing experiment. Results We compared two custom targeted capture methods suitable for single chromosome analysis: Solution Hybrid Selection (SHS) and Flow Sorting (FS) of single chromosomes. Both methods can capture targeted material and result in high percentages of genotype identifications across these regions: 59-92% for SHS and 70-79% for FS. FS is amenable to current structural variation detection methods, and variants were detected. Structural variation was also assessed for SHS samples with paired end sequencing, resulting in variant identification. Conclusions While both methods can effectively target genomic regions for genotype determination, several considerations make each method appropriate in different circumstances. SHS is well suited for experiments targeting smaller regions in a larger number of samples. FS is well suited when regions of interest cover large regions of a single chromosome. Although whole genome sequencing is becoming less expensive, the sequencing, data storage, and analysis costs make targeted sequencing using SHS or FS a compelling option. PMID:23586822

  10. Unintended Pulmonary Artery Ligation during PDA Ligation.

    PubMed

    Kim, Dohun; Kim, Si-Wook; Shin, Hong-Ju; Hong, Jong-Myeon; Lee, Ji Hyuk; Han, Heon-Seok

    2016-01-01

    A 10-day-old boy was transferred to our hospital due to tachypnea. A patent ductus arteriosus (PDA), 4.8 mm in diameter, with a small ASD, was diagnosed on echocardiography. Surgical ligation of the ductus was performed after failure of three cycles of ibuprofen. However, the ductus remained open on routine postoperative echocardiography on the second postoperative day, and chest CT revealed inadvertent ligation of the left pulmonary artery (LPA) rather than the PDA. An emergency reoperation successfully reopened the clipped LPA and ligated the ductus on the same (second postoperative) day. Mechanical ventilator support was weaned on postoperative day 21, and the baby was discharged on postoperative day 47 with a normal left lung shadow. PMID:27585199

  11. Massively Parallel Assimilation of TOGA/TAO and Topex/Poseidon Measurements into a Quasi Isopycnal Ocean General Circulation Model Using an Ensemble Kalman Filter

    NASA Technical Reports Server (NTRS)

    Keppenne, Christian L.; Rienecker, Michele; Borovikov, Anna Y.; Suarez, Max

    1999-01-01

    A massively parallel ensemble Kalman filter (EnKF) is used to assimilate temperature data from the TOGA/TAO array and altimetry from TOPEX/POSEIDON into a Pacific basin version of the NASA Seasonal to Interannual Prediction Project (NSIPP)'s quasi-isopycnal ocean general circulation model. The EnKF is an approximate Kalman filter in which the error-covariance propagation step is modeled by the integration of multiple instances of a numerical model. An estimate of the true error covariances is then inferred from the distribution of the ensemble of model state vectors. This implementation of the filter takes advantage of the inherent parallelism in the EnKF algorithm by running all the model instances concurrently. The Kalman filter update step also occurs in parallel by having each processor process the observations that occur in the region of physical space for which it is responsible. The massively parallel data assimilation system is validated by withholding some of the data and then quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The distributions of the forecast and analysis error covariances predicted by the EnKF are also examined.
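
    A minimal sketch of the EnKF analysis step described above (perturbed-observation form; the toy dimensions and variable names are illustrative, not taken from the NSIPP system):

```python
import numpy as np

def enkf_update(X, y, H, R, rng):
    """X: (n_state, n_ens) forecast ensemble; y: obs vector; H: obs operator."""
    n = X.shape[1]
    A = X - X.mean(axis=1, keepdims=True)        # ensemble anomalies
    HA = H @ A
    P_hh = HA @ HA.T / (n - 1) + R               # innovation covariance
    P_xh = A @ HA.T / (n - 1)                    # state-observation covariance
    K = P_xh @ np.linalg.inv(P_hh)               # Kalman gain from the ensemble
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=n).T
    return X + K @ (Y - H @ X)                   # analysis ensemble

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))                     # 3-variable state, 50 members
H = np.array([[1.0, 0.0, 0.0]])                  # observe the first variable
Xa = enkf_update(X, np.array([0.5]), H, np.eye(1) * 0.1, rng)
```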

  12. Scalable evaluation of polarization energy and associated forces in polarizable molecular dynamics: II. Toward massively parallel computations using smooth particle mesh Ewald.

    PubMed

    Lagardère, Louis; Lipparini, Filippo; Polack, Étienne; Stamm, Benjamin; Cancès, Éric; Schnieders, Michael; Ren, Pengyu; Maday, Yvon; Piquemal, Jean-Philip

    2015-06-01

    In this article, we present a parallel implementation of point dipole-based polarizable force fields for molecular dynamics (MD) simulations with periodic boundary conditions (PBC). The smooth particle mesh Ewald technique is combined with two optimal iterative strategies, namely, a preconditioned conjugate gradient solver and a Jacobi solver in conjunction with the direct inversion in the iterative subspace for convergence acceleration, to solve the polarization equations. We show that both solvers exhibit very good parallel performances and overall very competitive timings in an energy and force computation needed to perform a MD step. Various tests on large systems are provided in the context of the polarizable AMOEBA force field as implemented in the newly developed Tinker-HP package, which is the first implementation of a polarizable model that makes large-scale experiments for massively parallel PBC point dipole models possible. We show that using a large number of cores offers a significant acceleration of the overall process involving the iterative methods within the context of SPME and a noticeable improvement of the memory management, giving access to very large systems (hundreds of thousands of atoms) as the algorithm naturally distributes the data on different cores. Coupled with advanced MD techniques, gains ranging from 2 to 3 orders of magnitude in time are now possible compared to nonoptimized, sequential implementations, giving new directions for polarizable molecular dynamics with periodic boundary conditions using massively parallel implementations. PMID:26575557
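
    For context, the polarization equations referred to above amount to a symmetric positive-definite linear system T·μ = E for the induced dipoles. Below is a generic sketch of the first solver variant, a diagonally preconditioned conjugate gradient; in production the product T·p is evaluated via smooth particle mesh Ewald rather than a dense matrix, and all names here are illustrative.

```python
import numpy as np

def pcg(T, E, M_inv, tol=1e-8, max_iter=200):
    """Solve T @ mu = E with a diagonal (Jacobi) preconditioner M_inv."""
    mu = np.zeros_like(E)
    r = E - T @ mu
    z = M_inv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Tp = T @ p                   # the expensive step (SPME in real codes)
        alpha = rz / (p @ Tp)
        mu += alpha * p
        r -= alpha * Tp
        if np.linalg.norm(r) < tol:
            break
        z = M_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return mu

n = 30
A = np.random.default_rng(1).normal(size=(n, n))
T = A @ A.T + n * np.eye(n)          # toy SPD stand-in for the dipole tensor
mu = pcg(T, np.ones(n), M_inv=1.0 / np.diag(T))
```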

  13. Masking fields: a massively parallel neural architecture for learning, recognizing, and predicting multiple groupings of patterned data.

    PubMed

    Cohen, M A; Grossberg, S

    1987-05-15

    A massively parallel neural network architecture, called a masking field, is characterized through systematic computer simulations. A masking field is a multiple-scale, self-similar, automatically gain-controlled cooperative-competitive feedback network F(2). Network F(2) receives input patterns from an adaptive filter F(1) → F(2) that is activated by a prior processing level F(1). Such a network F(2) behaves like a content-addressable memory. It activates compressed recognition codes that are predictive with respect to the activation patterns flickering across the feature detectors of F(1) and competitively inhibits, or masks, codes which are unpredictive with respect to the F(1) patterns. In particular, a masking field can simultaneously detect multiple groupings within its input patterns and assign activation weights to the codes for these groupings which are predictive with respect to the contextual information embedded within the patterns and the prior learning of the system. A masking field automatically rescales its sensitivity as the overall size of an input pattern changes, yet also remains sensitive to the microstructure within each input pattern. In this way, a masking field can more strongly activate a code for the whole F(1) pattern than for its salient parts, yet amplifies the code for a pattern part when it becomes a pattern whole in a new input context. A masking field can also be primed by inputs from F(1): it can activate codes which represent predictions of how the F(1) pattern may evolve in the subsequent time interval. Network F(2) can also exhibit an adaptive sharpening property: repetition of a familiar F(1) pattern can tune the adaptive filter to elicit a more focal spatial activation of its F(2) recognition code than does an unfamiliar input pattern. The F(2) recognition code also becomes less distributed when an input pattern contains more contextual information on which to base an unambiguous prediction.

  14. Utility of different massive parallel sequencing platforms for mutation profiling in clinical samples and identification of pitfalls using FFPE tissue.

    PubMed

    Fassunke, Jana; Haller, Florian; Hebele, Simone; Moskalev, Evgeny A; Penzel, Roland; Pfarr, Nicole; Merkelbach-Bruse, Sabine; Endris, Volker

    2015-11-01

    In the growing field of personalised medicine, the analysis of numerous potential targets is becoming a challenge in terms of workload, tissue availability, and costs. The molecular analysis of non-small cell lung cancer (NSCLC) has shifted from the analysis of the epidermal growth factor receptor (EGFR) mutation status to the analysis of different gene regions, including resistance mutations or translocations. Massive parallel sequencing (MPS) allows rapid, comprehensive mutation testing in routine molecular pathological diagnostics, even on small formalin-fixed, paraffin-embedded (FFPE) biopsies. In this study, we compared and evaluated currently used MPS platforms for their application in routine pathological diagnostics. We initiated a first round-robin testing of 30 cases diagnosed with NSCLC and a known EGFR gene mutation status. Three pathology institutes from Germany received FFPE tumour sections that had been individually processed. Fragment libraries were prepared by targeted multiplex PCR using institution-specific gene panels. Sequencing was carried out using three MPS systems: MiSeq™, GS Junior and PGM Ion Torrent™. In two institutes, data analysis was performed with the platform-specific software and the Integrative Genomics Viewer. In one institute, data analysis was carried out using an in-house software system. Of 30 samples, 26 were analysed by all institutes. Concerning the EGFR mutation status, concordance was found in 26 out of 26 samples; the analysis of a few samples failed due to poor DNA quality in alternating institutes. We found 100% concordance when comparing the results of the EGFR mutation status. A total of 38 additional mutations were identified in the 26 samples. In two samples, minor variants were found which could not be confirmed by qPCR. Other characteristic variants were identified as fixation artefacts by reanalysing the respective samples by Sanger sequencing.

  16. Tubal ligation (image)

    MedlinePlus

    Surgical sterilization which permanently prevents the transport of the egg to the uterus by means of sealing the fallopian tubes is called tubal ligation, commonly called "having one's tubes tied". This operation ...

  17. Tubal ligation - discharge

    MedlinePlus

    Sterilization surgery - female - discharge; Tubal sterilization - discharge; Tube tying - discharge; Tying the tubes - discharge ... You had tubal ligation (or tying the tubes) surgery to close your fallopian tubes. These tubes connect the ovaries to the uterus. ...

  18. FWT2D: A massively parallel program for frequency-domain full-waveform tomography of wide-aperture seismic data—Part 1: Algorithm

    NASA Astrophysics Data System (ADS)

    Sourbier, Florent; Operto, Stéphane; Virieux, Jean; Amestoy, Patrick; L'Excellent, Jean-Yves

    2009-03-01

    This is the first paper in a two-part series that describes a massively parallel code that performs 2D frequency-domain full-waveform inversion of wide-aperture seismic data for imaging complex structures. Full-waveform inversion methods, namely quantitative seismic imaging methods based on the solution of the full wave equation, are computationally expensive. Therefore, designing efficient algorithms which take advantage of parallel computing facilities is critical for the appraisal of these approaches when applied to representative case studies and for further improvements. Full-waveform modelling requires the solution of a large sparse system of linear equations, which is performed with the massively parallel direct solver MUMPS for efficient multiple-shot simulations. Efficiency of the multiple-shot solution phase (forward/backward substitutions) is improved by using the BLAS3 library. The inverse problem relies on a classic local optimization approach implemented with a gradient method. The direct solver returns the multiple-shot wavefield solutions distributed over the processors according to a domain decomposition driven by the distribution of the LU factors. The domain decomposition of the wavefield solutions is used to compute in parallel the gradient of the objective function and the diagonal Hessian, the latter providing a suitable scaling of the gradient. The algorithm allows one to test different strategies for multiscale frequency inversion, ranging from successive mono-frequency inversion to simultaneous multifrequency inversion. These different inversion strategies will be illustrated in the following companion paper. The parallel efficiency and the scalability of the code will also be quantified.

  19. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by routing through transporter nodes

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-11-16

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. An automated routing strategy routes packets through one or more intermediate nodes of the network to reach a destination. Some packets are constrained to be routed through respective designated transporter nodes, the automated routing strategy determining a path from a respective source node to a respective transporter node, and from a respective transporter node to a respective destination node. Preferably, the source node chooses a routing policy from among multiple possible choices, and that policy is followed by all intermediate nodes. The use of transporter nodes allows greater flexibility in routing.

  20. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by employing bandwidth shells at areas of overutilization

    SciTech Connect

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-04-27

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. An automated routing strategy routes packets through one or more intermediate nodes of the network to reach a final destination. The default routing strategy is altered responsive to detection of overutilization of a particular path of one or more links, and at least some traffic is re-routed by distributing the traffic among multiple paths (which may include the default path). An alternative path may require a greater number of link traversals to reach the destination node.

  1. Method and apparatus for analyzing error conditions in a massively parallel computer system by identifying anomalous nodes within a communicator set

    DOEpatents

    Gooding, Thomas Michael

    2011-04-19

    An analytical mechanism for a massively parallel computer system automatically analyzes data retrieved from the system, and identifies nodes which exhibit anomalous behavior in comparison to their immediate neighbors. Preferably, anomalous behavior is determined by comparing call-return stack tracebacks for each node, grouping like nodes together, and identifying neighboring nodes which do not themselves belong to the group. A node, not itself in the group, having a large number of neighbors in the group, is a likely locality of error. The analyzer preferably presents this information to the user by sorting the neighbors according to number of adjoining members of the group.
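
    A toy sketch of the analysis the patent describes, with invented tracebacks and topology: group nodes by identical call-return stack tracebacks, then rank out-of-group nodes by how many of their neighbors fall in the largest group.

```python
from collections import defaultdict

def suspect_nodes(tracebacks, neighbors):
    """tracebacks: {node: traceback}; neighbors: {node: set of adjacent nodes}."""
    groups = defaultdict(set)
    for node, tb in tracebacks.items():
        groups[tb].add(node)                      # group like nodes together
    majority = max(groups.values(), key=len)
    scores = {n: len(neighbors[n] & majority)
              for n in tracebacks if n not in majority}
    # nodes adjoining many group members are likely localities of error
    return sorted(scores.items(), key=lambda kv: -kv[1])

tb = {0: "recv", 1: "recv", 2: "compute", 3: "recv"}   # node 2 is the outlier
nb = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}
print(suspect_nodes(tb, nb))                           # -> [(2, 3)]
```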

  2. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by dynamically adjusting local routing strategies

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-03-16

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. Each node implements a respective routing strategy for routing data through the network, the routing strategies not necessarily being the same in every node. The routing strategies implemented in the nodes are dynamically adjusted during application execution to shift network workload as required. Preferably, adjustment of routing policies in selective nodes is performed at synchronization points. The network may be dynamically monitored, and routing strategies adjusted according to detected network conditions.

  3. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by dynamic global mapping of contended links

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2011-10-04

    A massively parallel nodal computer system periodically collects and broadcasts usage data for an internal communications network. A node sending data over the network makes a global routing determination using the network usage data. Preferably, network usage data comprises an N-bit usage value for each output buffer associated with a network link. An optimum routing is determined by summing the N-bit values associated with each link through which a data packet must pass, and comparing the sums associated with different possible routes.
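
    The route-scoring idea above reduces to summing the broadcast N-bit usage values over each candidate route's links and picking the minimum; a minimal sketch with invented link identifiers:

```python
def best_route(routes, link_usage):
    """routes: lists of link ids; link_usage: broadcast N-bit usage per link."""
    return min(routes, key=lambda route: sum(link_usage[l] for l in route))

usage = {"A-B": 7, "B-D": 2, "A-C": 1, "C-D": 3}           # N-bit usage values
print(best_route([["A-B", "B-D"], ["A-C", "C-D"]], usage))  # -> ['A-C', 'C-D']
```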

  4. Three pillars for achieving quantum mechanical molecular dynamics simulations of huge systems: Divide-and-conquer, density-functional tight-binding, and massively parallel computation.

    PubMed

    Nishizawa, Hiroaki; Nishimura, Yoshifumi; Kobayashi, Masato; Irle, Stephan; Nakai, Hiromi

    2016-08-01

    The linear-scaling divide-and-conquer (DC) quantum chemical methodology is applied to density-functional tight-binding (DFTB) theory to develop a massively parallel program that achieves on-the-fly molecular reaction dynamics simulations of huge systems from scratch. Functions to perform large-scale geometry optimization and molecular dynamics on the DC-DFTB potential energy surface are implemented in the program, called DC-DFTB-K. A novel interpolation-based algorithm is developed for parallelizing the determination of the Fermi level in the DC method. The performance of the DC-DFTB-K program is assessed using a laboratory computer and the K computer. Numerical tests show the high efficiency of the DC-DFTB-K program: a single-point energy and gradient calculation of a one-million-atom system is completed within 60 s using 7290 nodes of the K computer. PMID:27317328
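
    The Fermi-level determination mentioned above is a one-dimensional root-finding problem: choose E_F so that the Fermi-Dirac occupations summed over all orbital energies equal the electron count. The sketch below uses plain bisection; the paper's contribution is an interpolation-based variant suited to parallel DC calculations, which this toy version does not reproduce.

```python
import math

def occupation(x):
    # 2 / (1 + exp(x)), guarded against overflow for large positive x
    return 0.0 if x > 500 else 2.0 / (1.0 + math.exp(x))

def electron_count(e_f, energies, beta=40.0):      # beta ~ 1/kT in 1/eV
    return sum(occupation(beta * (e - e_f)) for e in energies)

def fermi_level(energies, n_electrons, lo=-20.0, hi=20.0, tol=1e-10):
    # electron_count increases monotonically with e_f, so bisection converges
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if electron_count(mid, energies) < n_electrons:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(fermi_level([-2.0, -1.0, 0.5, 3.0], n_electrons=4))  # lands in the gap
```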

  5. Development of a MEMS electrostatic condenser lens array for nc-Si surface electron emitters of the Massive Parallel Electron Beam Direct-Write system

    NASA Astrophysics Data System (ADS)

    Kojima, A.; Ikegami, N.; Yoshida, T.; Miyaguchi, H.; Muroyama, M.; Yoshida, S.; Totsu, K.; Koshida, N.; Esashi, M.

    2016-03-01

    The development of a Micro Electro-Mechanical System (MEMS) electrostatic Condenser Lens Array (CLA) for a Massively Parallel Electron Beam Direct Write (MPEBDW) lithography system is described. The CLA converges parallel electron beams for fine patterning. The structure of the CLA was designed on the basis of finite element method (FEM) simulations. The lens was fabricated with precise machining and assembled with a nanocrystalline silicon (nc-Si) electron emitter array serving as the electron source of the MPEBDW system. The nc-Si electron emitter has the advantage that a vertically emitted surface electron beam can be obtained without any extractor electrodes. FEM simulation of the electron optics showed that the size of the electron beam emitted from the emitter was reduced to 15% in the radial direction, and the divergence angle was reduced to 1/18.

  6. Preselective Screening for Linear-Scaling Exact Exchange-Gradient Calculations for Graphics Processing Units and General Strong-Scaling Massively Parallel Calculations.

    PubMed

    Kussmann, Jörg; Ochsenfeld, Christian

    2015-03-10

    We present an extension of our recently presented PreLinK scheme (J. Chem. Phys. 2013, 138, 134114) for the exact exchange contribution to nuclear forces. The significant contributions to the exchange gradient are determined by preselection based on accurate shell-pair contributions to the SCF exchange energy prior to the calculation. Therefore, our method is highly suitable for massively parallel electronic structure calculations because of an efficient load balancing of the significant contributions only and an unhampered control flow. The efficiency of our method is shown for several illustrative calculations on single GPU servers, as well as for hybrid MPI/CUDA parallel calculations with the largest system comprising 3369 atoms and 26952 basis functions. PMID:26579745

  7. A Method for Studying Protistan Diversity Using Massively Parallel Sequencing of V9 Hypervariable Regions of Small-Subunit Ribosomal RNA Genes

    PubMed Central

    Amaral-Zettler, Linda A.; McCliment, Elizabeth A.; Ducklow, Hugh W.; Huse, Susan M.

    2009-01-01

    Background Massively parallel pyrosequencing of amplicons from the V6 hypervariable regions of small-subunit (SSU) ribosomal RNA (rRNA) genes is commonly used to assess diversity and richness in bacterial and archaeal populations. Recent advances in pyrosequencing technology provide read lengths of up to 240 nucleotides. Amplicon pyrosequencing can now be applied to longer variable regions of the SSU rRNA gene including the V9 region in eukaryotes. Methodology/Principal Findings We present a protocol for the amplicon pyrosequencing of V9 regions for eukaryotic environmental samples for biodiversity inventories and species richness estimation. The International Census of Marine Microbes (ICoMM) and the Microbial Inventory Research Across Diverse Aquatic Long Term Ecological Research Sites (MIRADA-LTERs) projects are already employing this protocol for tag sequencing of eukaryotic samples in a wide diversity of both marine and freshwater environments. Conclusions/Significance Massively parallel pyrosequencing of eukaryotic V9 hypervariable regions of SSU rRNA genes provides a means of estimating species richness from deeply-sampled populations and for discovering novel species from the environment. PMID:19633714

  8. Inter-laboratory evaluation of the EUROFORGEN Global ancestry-informative SNP panel by massively parallel sequencing using the Ion PGM™.

    PubMed

    Eduardoff, M; Gross, T E; Santos, C; de la Puente, M; Ballard, D; Strobl, C; Børsting, C; Morling, N; Fusco, L; Hussing, C; Egyed, B; Souto, L; Uacyisrael, J; Syndercombe Court, D; Carracedo, Á; Lareu, M V; Schneider, P M; Parson, W; Phillips, C

    2016-07-01

    The EUROFORGEN Global ancestry-informative SNP (AIM-SNPs) panel is a forensic multiplex of 128 markers designed to differentiate an individual's ancestry from amongst the five continental population groups of Africa, Europe, East Asia, Native America, and Oceania. A custom multiplex of AmpliSeq™ PCR primers was designed for the Global AIM-SNPs to perform massively parallel sequencing using the Ion PGM™ system. This study assessed individual SNP genotyping precision using the Ion PGM™, the forensic sensitivity of the multiplex using dilution series, degraded DNA plus simple mixtures, and the ancestry differentiation power of the final panel design, which required substitution of three original ancestry-informative SNPs with alternatives. Fourteen populations that had not been previously analyzed were genotyped using the custom multiplex and these studies allowed assessment of genotyping performance by comparison of data across five laboratories. Results indicate a low level of genotyping error can still occur from sequence misalignment caused by homopolymeric tracts close to the target SNP, despite careful scrutiny of candidate SNPs at the design stage. Such sequence misalignment required the exclusion of component SNP rs2080161 from the Global AIM-SNPs panel. However, the overall genotyping precision and sensitivity of this custom multiplex indicates the Ion PGM™ assay for the Global AIM-SNPs is highly suitable for forensic ancestry analysis with massively parallel sequencing. PMID:27208666
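
    To make the idea of ancestry-informative SNPs concrete, here is a generic sketch (not the EUROFORGEN pipeline) that scores reference populations by the log-likelihood of an observed genotype set under Hardy-Weinberg proportions; the SNP IDs and allele frequencies are invented.

```python
from math import log

def ancestry_loglik(genotypes, freqs):
    """genotypes: {snp: alt-allele count 0/1/2}; freqs: {pop: {snp: alt freq}}."""
    scores = {}
    for pop, p in freqs.items():
        ll = 0.0
        for snp, g in genotypes.items():
            q = min(max(p[snp], 1e-6), 1 - 1e-6)    # guard against 0/1 freqs
            ll += log({0: (1 - q) ** 2, 1: 2 * q * (1 - q), 2: q * q}[g])
        scores[pop] = ll
    return max(scores, key=scores.get), scores

freqs = {"EUR": {"rs1": 0.9, "rs2": 0.1}, "EAS": {"rs1": 0.2, "rs2": 0.8}}
print(ancestry_loglik({"rs1": 2, "rs2": 0}, freqs))  # -> EUR is most likely
```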

  9. Massively Parallel Sequencing of Patients with Intellectual Disability, Congenital Anomalies and/or Autism Spectrum Disorders with a Targeted Gene Panel

    PubMed Central

    Brett, Maggie; McPherson, John; Zang, Zhi Jiang; Lai, Angeline; Tan, Ee-Shien; Ng, Ivy; Ong, Lai-Choo; Cham, Breana; Tan, Patrick; Rozen, Steve; Tan, Ene-Choo

    2014-01-01

    Developmental delay and/or intellectual disability (DD/ID) affects 1–3% of all children. At least half of these are thought to have a genetic etiology. Recent studies have shown that massively parallel sequencing (MPS) using a targeted gene panel is particularly suited for diagnostic testing for genetically heterogeneous conditions. We report on our experiences with using massively parallel sequencing of a targeted gene panel of 355 genes for investigating the genetic etiology of eight patients with a wide range of phenotypes including DD/ID, congenital anomalies and/or autism spectrum disorder. Targeted sequence enrichment was performed using the Agilent SureSelect Target Enrichment Kit and sequenced on the Illumina HiSeq2000 using paired-end reads. For all eight patients, 81–84% of the targeted regions achieved read depths of at least 20×, with average read depths overlapping targets ranging from 322× to 798×. Causative variants were successfully identified in two of the eight patients: a nonsense mutation in the ATRX gene and a canonical splice site mutation in the L1CAM gene. In a third patient, a canonical splice site variant in the USP9X gene could likely explain all or some of her clinical phenotypes. These results confirm the value of targeted MPS for investigating DD/ID in children for diagnostic purposes. However, targeted gene MPS was less likely to provide a genetic diagnosis for children whose phenotype includes autism. PMID:24690944

  10. Chromosome Conformation Capture Carbon Copy (5C): A massively parallel solution for mapping interactions between genomic elements

    PubMed Central

    Dostie, Josée; Richmond, Todd A.; Arnaout, Ramy A.; Selzer, Rebecca R.; Lee, William L.; Honan, Tracey A.; Rubio, Eric D.; Krumm, Anton; Lamb, Justin; Nusbaum, Chad; Green, Roland D.; Dekker, Job

    2006-01-01

    Physical interactions between genetic elements located throughout the genome play important roles in gene regulation and can be identified with the Chromosome Conformation Capture (3C) methodology. 3C converts physical chromatin interactions into specific ligation products, which are quantified individually by PCR. Here we present a high-throughput 3C approach, 3C-Carbon Copy (5C), that employs microarrays or quantitative DNA sequencing using 454-technology as detection methods. We applied 5C to analyze a 400-kb region containing the human β-globin locus and a 100-kb conserved gene desert region. We validated 5C by detection of several previously identified looping interactions in the β-globin locus. We also identified a new looping interaction in K562 cells between the β-globin Locus Control Region and the γ–β-globin intergenic region. Interestingly, this region has been implicated in the control of developmental globin gene switching. 5C should be widely applicable for large-scale mapping of cis- and trans- interaction networks of genomic elements and for the study of higher-order chromosome structure. PMID:16954542

  11. Mapping unstructured grid computations to massively parallel computers. Ph.D. Thesis - Rensselaer Polytechnic Inst., Feb. 1992

    NASA Technical Reports Server (NTRS)

    Hammond, Steven Warren

    1992-01-01

    This thesis investigates the mapping problem: assign the tasks of a parallel program to the processors of a parallel computer such that the execution time is minimized. First, a taxonomy of objective functions and heuristics used to solve the mapping problem is presented. Next, we develop a highly parallel heuristic mapping algorithm, called Cyclic Pairwise Exchange (CPE), and discuss its place in the taxonomy. CPE uses local pairwise exchanges of processor assignments to iteratively improve an initial mapping. A variety of initial mapping schemes are tested, and recursive spectral bipartitioning (RSB) followed by CPE is shown to result in the best mappings. For the test cases studied here, problems arising in computational fluid dynamics and structural mechanics on unstructured triangular and tetrahedral meshes, RSB and CPE outperform methods based on simulated annealing: much less time is required to do the mapping, and the results obtained are better. Compared with random and naive mappings, RSB and CPE reduce the communication time twofold for the test problems used. Finally, we use CPE in two applications on a CM-2. The first application is a data-parallel mesh-vertex upwind finite volume scheme for solving the Euler equations on 2-D triangular unstructured meshes. CPE is used to map grid points to processors. The performance of this code is compared with a similar code on a Cray-YMP and an Intel iPSC/860. The second application is parallel sparse matrix-vector multiplication used in the iterative solution of large sparse linear systems of equations. We map rows of the matrix to processors and use an inner-product based matrix-vector multiplication. We demonstrate that this method is an order of magnitude faster than methods based on scan operations for our test cases.
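
    A compact serial sketch of the pairwise-exchange idea (illustrative only: CPE itself runs the exchanges in parallel and restricts them locally, which this toy version does not):

```python
def comm_cost(mapping, edges, dist):
    """edges: [(task_a, task_b, weight)]; dist: processor distance matrix."""
    return sum(w * dist[mapping[a]][mapping[b]] for a, b, w in edges)

def pairwise_exchange(mapping, edges, dist, sweeps=10):
    tasks = list(mapping)
    best = comm_cost(mapping, edges, dist)
    for _ in range(sweeps):
        improved = False
        for i in range(len(tasks)):
            for j in range(i + 1, len(tasks)):      # sweep over task pairs
                a, b = tasks[i], tasks[j]
                mapping[a], mapping[b] = mapping[b], mapping[a]
                cost = comm_cost(mapping, edges, dist)
                if cost < best:
                    best, improved = cost, True     # keep improving swaps
                else:
                    mapping[a], mapping[b] = mapping[b], mapping[a]
        if not improved:
            break
    return mapping, best

dist = [[abs(p - q) for q in range(3)] for p in range(3)]  # 1-D processor chain
mapping = {"t0": 0, "t1": 2, "t2": 1}
edges = [("t0", "t1", 5), ("t1", "t2", 1)]
print(pairwise_exchange(mapping, edges, dist))             # cost drops 11 -> 6
```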

  12. Covalent ligation studies on the human telomere quadruplex

    PubMed Central

    Qi, Jianying; Shafer, Richard H.

    2005-01-01

    Recent X-ray crystallographic studies on the human telomere sequence d[AGGG(TTAGGG)3] revealed a unimolecular, parallel quadruplex structure in the presence of potassium ions, while earlier NMR results in the presence of sodium ions indicated a unimolecular, antiparallel quadruplex. In an effort to identify and isolate the parallel form in solution, we have successfully ligated the single-stranded human telomere and several modified human telomere sequences into circular products in potassium-containing solutions. Using these sequences with one or two terminal phosphates, we have made chemically ligated products via creation of an additional loop. Circular products have been identified by polyacrylamide gel electrophoresis, enzymatic digestion with exonuclease VII and electrospray mass spectrometry in negative ion mode. The optimum pH for the ligation reaction of the human telomere sequence ranges from 4.5 to 6.0. Several buffers were also examined, with MES yielding the greatest ligation efficiency. Human telomere sequences with two phosphate groups, one each at the 3′ and 5′ ends, were more efficient at ligation, via pyrophosphate bond formation, than the corresponding sequences with only one phosphate group at the 5′ end. Circular dichroism spectra showed that the ligation product was derived from an antiparallel, single-stranded guanine quadruplex rather than a parallel single-stranded guanine quadruplex structure. PMID:15933211

  13. Anesthetic Management During Emergency Surgical Ligation for Carotid Blowout Syndrome.

    PubMed

    Klein Nulent, Casper G A; de Graaff, Henri J D; Ketelaars, Rein; Sewnaik, Aniel; Maissan, Iscander M

    2016-08-15

    A 44-year-old man presented to our emergency department with a pharyngeal hemorrhage, 6 weeks after a total laryngectomy and extensive neck dissection. Immediate surgical intervention was necessary to stop massive arterial hemorrhage from the pharynx. The head and neck surgeon successfully ligated the common carotid artery during this procedure. We describe the anesthetic strategy and the thromboelastometry (ROTEM®)-guided massive transfusion protocol. PMID:27310900

  14. Performance of a TthPrimPol-based whole genome amplification kit for copy number alteration detection using massively parallel sequencing

    PubMed Central

    Deleye, Lieselot; De Coninck, Dieter; Dheedene, Annelies; De Sutter, Petra; Menten, Björn; Deforce, Dieter; Van Nieuwerburgh, Filip

    2016-01-01

    Starting from only a few cells, current whole genome amplification (WGA) methods provide enough DNA to perform massively parallel sequencing (MPS). Unfortunately, all current WGA methods introduce representation bias which limits detection of copy number aberrations (CNAs) smaller than 3 Mb. A recent WGA method, called TruePrime single cell WGA, uses a recently discovered DNA primase, TthPrimPol, instead of artificial primers to initiate DNA amplification. This method could lead to a lower representation bias, and consequently to a better detection of CNAs. The enzyme requires no complementarity and thus should generate random primers, equally distributed across the genome. The performance of TruePrime WGA was assessed for aneuploidy screening and CNA analysis after MPS, starting from 1, 3 or 5 cells. Although the method looks promising, the single cell TruePrime WGA kit v1 is not suited for high resolution CNA detection after MPS because too much representation bias is introduced. PMID:27546482
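
    For background on how such representation bias is quantified, shallow-sequencing CNA analysis typically counts reads per genomic bin and normalizes against a diploid reference; bias appears as noise around the expected copy number of 2. A generic sketch with invented bin counts (not the study's pipeline):

```python
import statistics

def copy_number_profile(sample_bins, reference_bins):
    """Per-bin copy number of a sample versus a diploid reference sample."""
    m_s = statistics.median(sample_bins)
    m_r = statistics.median(reference_bins)
    return [2.0 * (s / m_s) / (r / m_r) for s, r in zip(sample_bins, reference_bins)]

sample = [100, 98, 160, 155, 101]     # bins 3-4 carry a single-copy gain
reference = [100, 100, 100, 100, 100]
print([round(c, 2) for c in copy_number_profile(sample, reference)])
# -> values near 2 for normal bins, near 3 for the gained bins
```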

  16. An open-source, massively parallel code for non-LTE synthesis and inversion of spectral lines and Zeeman-induced Stokes profiles

    NASA Astrophysics Data System (ADS)

    Socas-Navarro, H.; de la Cruz Rodríguez, J.; Asensio Ramos, A.; Trujillo Bueno, J.; Ruiz Cobo, B.

    2015-05-01

    With the advent of a new generation of solar telescopes and instrumentation, interpreting chromospheric observations (in particular, spectropolarimetry) requires new, suitable diagnostic tools. This paper describes a new code, NICOLE, that has been designed for Stokes non-LTE radiative transfer, for synthesis and inversion of spectral lines and Zeeman-induced polarization profiles, spanning a wide range of atmospheric heights from the photosphere to the chromosphere. The code offers a number of unique features and capabilities and has been built from scratch with a powerful parallelization scheme that makes it suitable for application on massive datasets using large supercomputers. The source code is written entirely in Fortran 90/2003 and complies strictly with the ANSI standards to ensure maximum compatibility and portability. It is being publicly released, with the idea of facilitating future branching by other groups to augment its capabilities. The source code is currently hosted at the following repository: https://github.com/hsocasnavarro/NICOLE

  17. Use of Massively Parallel Pyrosequencing to Evaluate the Diversity of and Selection on Plasmodium falciparum csp T-Cell Epitopes in Lilongwe, Malawi

    PubMed Central

    Bailey, Jeffrey A.; Mvalo, Tisungane; Aragam, Nagesh; Weiser, Matthew; Congdon, Seth; Kamwendo, Debbie; Martinson, Francis; Hoffman, Irving; Meshnick, Steven R.; Juliano, Jonathan J.

    2012-01-01

    The development of an effective malaria vaccine has been hampered by the genetic diversity of commonly used target antigens. This diversity has led to concerns about allele-specific immunity limiting the effectiveness of vaccines. Despite extensive genetic diversity of the circumsporozoite protein (CS), the most successful malaria vaccine is RTS,S, a monovalent CS vaccine. Using massively parallel pyrosequencing, we evaluated the diversity of CS haplotypes across the T-cell epitopes in parasites from Lilongwe, Malawi. We identified 57 unique parasite haplotypes from 100 participants. Using ecological and molecular indexes of diversity, we saw no difference in the diversity of CS haplotypes between adults and children. We saw evidence of weak variant-specific selection within this region of CS, suggesting that naturally acquired immunity does induce variant-specific selection on CS. Therefore, the impact of CS vaccines on variant frequencies with widespread implementation of vaccination requires further study. PMID:22551816

  18. Self ligating lingual appliance.

    PubMed

    Juneja, Pankaj; Chopra, S S; Jayan, B K

    2015-12-01

    Adult demand for orthodontics has grown considerably over the past 10 years, propelling increased demand for esthetic orthodontics. Lingual appliances are a viable option for providing esthetic orthodontics. The lingual surface of the teeth has a unique morphology that makes it difficult to place brackets in ideal positions. Indirect bonding has become the established method of overcoming these discrepancies, along with the latest designs of self-ligating brackets, which offer more efficient mechanics and shorter treatment times. PMID:26843757

  19. Adaptation of a Multi-Block Structured Solver for Effective Use in a Hybrid CPU/GPU Massively Parallel Environment

    NASA Astrophysics Data System (ADS)

    Gutzwiller, David; Gontier, Mathieu; Demeulenaere, Alain

    2014-11-01

    Multi-block structured solvers hold many advantages over their unstructured counterparts, such as a smaller memory footprint and efficient serial performance. Historically, multi-block structured solvers have not been easily adapted for use in a High Performance Computing (HPC) environment, and the recent trend towards hybrid CPU/GPU architectures has further complicated the situation. This paper will elaborate on developments and innovations applied to the NUMECA FINE/Turbo solver that have allowed near-linear scalability with real-world problems on over 250 hybrid CPU/GPU cluster nodes. Discussion will focus on the implementation of virtual partitioning and load balancing algorithms using a novel meta-block concept. This implementation is transparent to the user, allowing all pre- and post-processing steps to be performed using a simple, unpartitioned grid topology. Additional discussion will elaborate on developments that have improved parallel performance, including fully parallel I/O with the ADIOS API and the GPU porting of the computationally heavy CPUBooster convergence acceleration module.

  20. Rapid profiling of the antigen regions recognized by serum antibodies using massively parallel sequencing of antigen-specific libraries.

    PubMed

    Domina, Maria; Lanza Cariccio, Veronica; Benfatto, Salvatore; D'Aliberti, Deborah; Venza, Mario; Borgogni, Erica; Castellino, Flora; Biondo, Carmelo; D'Andrea, Daniel; Grassi, Luigi; Tramontano, Anna; Teti, Giuseppe; Felici, Franco; Beninati, Concetta

    2014-01-01

    There is a need for techniques capable of identifying the antigenic epitopes targeted by polyclonal antibody responses during deliberate or natural immunization. Although successful, traditional phage library screening is laborious and can map only some of the epitopes. To accelerate and improve epitope identification, we have employed massive sequencing of phage-displayed antigen-specific libraries using the Illumina MiSeq platform. This enabled us to precisely identify the regions of a model antigen, the meningococcal NadA virulence factor, targeted by serum antibodies in vaccinated individuals and to rank hundreds of antigenic fragments according to their immunoreactivity. We found that next-generation sequencing can significantly empower the analysis of antigen-specific libraries by allowing simultaneous processing of dozens of library/serum combinations in less than two days, including the time required for antibody-mediated library selection. Moreover, compared with traditional plaque picking, the new technology (named Phage-based Representation OF Immuno-Ligand Epitope Repertoire, or PROFILER) provides superior resolution in epitope identification. PROFILER seems ideally suited to streamline and guide rational antigen design, adjuvant selection, and quality control of newly produced vaccines. Furthermore, this method is also likely to find important applications in other fields covered by traditional quantitative serology. PMID:25473968
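
    At its core, ranking antigenic fragments by immunoreactivity amounts to comparing each fragment's read frequency in the antibody-selected library with its frequency in the unselected library. The sketch below shows such an enrichment ranking on fabricated counts; the fragment names and numbers are invented and are not the NadA data or the PROFILER pipeline itself.

```python
# Enrichment ranking of phage-displayed fragments; all values fabricated.
selected = {"frag_1-60": 9500, "frag_40-120": 310, "frag_100-180": 5200}
unselected = {"frag_1-60": 800, "frag_40-120": 900, "frag_100-180": 700}

total_sel = sum(selected.values())
total_uns = sum(unselected.values())

def enrichment(frag):
    # Relative frequency after antibody selection over the naive library.
    return (selected[frag] / total_sel) / (unselected[frag] / total_uns)

for frag in sorted(selected, key=enrichment, reverse=True):
    print(f"{frag}: enrichment = {enrichment(frag):.2f}")
```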

  1. Partition-of-unity finite-element method for large scale quantum molecular dynamics on massively parallel computational platforms

    SciTech Connect

    Pask, J E; Sukumar, N; Guney, M; Hu, W

    2011-02-28

    Over the course of the past two decades, quantum mechanical calculations have emerged as a key component of modern materials research. However, the solution of the required quantum mechanical equations is a formidable task and this has severely limited the range of materials systems which can be investigated by such accurate, quantum mechanical means. The current state of the art for large-scale quantum simulations is the planewave (PW) method, as implemented in the now-ubiquitous VASP, ABINIT, and Qbox codes, among many others. However, since the PW method uses a global Fourier basis, with strictly uniform resolution at all points in space, and in which every basis function overlaps every other at every point, it suffers from substantial inefficiencies in calculations involving atoms with localized states, such as first-row and transition-metal atoms, and requires substantial nonlocal communications in parallel implementations, placing critical limits on scalability. In recent years, real-space methods such as finite differences (FD) and finite elements (FE) have been developed to address these deficiencies by reformulating the required quantum mechanical equations in a strictly local representation. However, while addressing both resolution and parallel-communications problems, such local real-space approaches have been plagued by one key disadvantage relative to planewaves: the excessive number of degrees of freedom (grid points, basis functions) needed to achieve the required accuracies. And so, despite critical limitations, the PW method remains the standard today. In this work, we show for the first time that this key remaining disadvantage of real-space methods can in fact be overcome: by building known atomic physics into the solution process using modern partition-of-unity (PU) techniques in finite element analysis. Indeed, our results show order-of-magnitude reductions in basis size relative to state-of-the-art planewave based methods. The method developed here is
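
    In schematic form, the partition-of-unity construction referred to above augments the finite-element functions with known atomic-like solutions. Using standard PUFEM notation (a generic form, not necessarily the paper's exact basis), with FE functions \varphi_i that sum to one:

```latex
% Generic partition-of-unity enrichment (standard PUFEM form):
\sum_i \varphi_i(\mathbf{r}) = 1, \qquad
\psi(\mathbf{r}) \;\approx\; \sum_i \varphi_i(\mathbf{r})
\Big( c_i + \sum_k d_{ik}\, u_k(\mathbf{r} - \mathbf{R}_i) \Big),
```

    where the u_k are precomputed atomic-like orbitals centered on the atoms. The product functions inherit the locality of the \varphi_i, which preserves the strictly local, parallel-friendly structure while drastically reducing the number of basis functions needed.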

  2. User's guide of TOUGH2-EGS-MP: A Massively Parallel Simulator with Coupled Geomechanics for Fluid and Heat Flow in Enhanced Geothermal Systems VERSION 1.0

    SciTech Connect

    Xiong, Yi; Fakcharoenphol, Perapon; Wang, Shihao; Winterfeld, Philip H.; Zhang, Keni; Wu, Yu-Shu

    2013-12-01

    TOUGH2-EGS-MP is a parallel numerical simulation program coupling geomechanics with fluid and heat flow in fractured and porous media, and is applicable for simulation of enhanced geothermal systems (EGS). TOUGH2-EGS-MP is based on the TOUGH2-MP code, the massively parallel version of TOUGH2. In TOUGH2-EGS-MP, the fully-coupled flow-geomechanics model is developed from linear elastic theory for thermo-poro-elastic systems and is formulated in terms of mean normal stress as well as pore pressure and temperature. Reservoir rock properties such as porosity and permeability depend on rock deformation, and the relationships between these two, obtained from poro-elasticity theories and empirical correlations, are incorporated into the simulation. This report provides the user with detailed information on the TOUGH2-EGS-MP mathematical model and instructions for using it for Thermal-Hydrological-Mechanical (THM) simulations. The mathematical model includes the fluid and heat flow equations, geomechanical equation, and discretization of those equations. In addition, the parallel aspects of the code, such as domain partitioning and communication between processors, are also included. Although TOUGH2-EGS-MP has the capability for simulating fluid and heat flows coupled with geomechanical effects, it is up to the user to select the specific coupling process, such as THM or only TH, in a simulation. There are several example problems illustrating applications of this program. These example problems are described in detail and their input data are presented. Their results demonstrate that this program can be used for field-scale geothermal reservoir simulation in porous and fractured media with fluid and heat flow coupled with geomechanical effects.
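
    As an example of the porosity/permeability-deformation coupling described above, empirical correlations of the following form (after Rutqvist et al.) are commonly used in TOUGH-family codes; the exact expressions adopted in TOUGH2-EGS-MP may differ:

```latex
% One commonly cited pair of stress-dependent property correlations:
\phi = \phi_r + (\phi_0 - \phi_r)\, e^{\,a\,\sigma'}, \qquad
k = k_0 \, \exp\!\left[ c \left( \frac{\phi}{\phi_0} - 1 \right) \right],
```

    where \sigma' is the effective mean stress, \phi_0 and k_0 are the zero-stress porosity and permeability, \phi_r is a residual porosity, and a and c are fitted constants.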

  3. Implementation of a flexible and scalable particle-in-cell method for massively parallel computations in the mantle convection code ASPECT

    NASA Astrophysics Data System (ADS)

    Gassmöller, Rene; Bangerth, Wolfgang

    2016-04-01

    Particle-in-cell methods have a long history and many applications in geodynamic modelling of mantle convection, lithospheric deformation and crustal dynamics. They are primarily used to track material information, the strain a material has undergone, the pressure-temperature history a certain material region has experienced, or the amount of volatiles or partial melt present in a region. However, their efficient parallel implementation - in particular combined with adaptive finite-element meshes - is complicated due to the complex communication patterns and frequent reassignment of particles to cells. Consequently, many current scientific software packages accomplish this efficient implementation by specifically designing particle methods for a single purpose, like the advection of scalar material properties that do not evolve over time (e.g., for chemical heterogeneities). Design choices for particle integration, data storage, and parallel communication are then optimized for this single purpose, making the code relatively rigid to changing requirements. Here, we present the implementation of a flexible, scalable and efficient particle-in-cell method for massively parallel finite-element codes with adaptively changing meshes. Using a modular plugin structure, we allow maximum flexibility of the generation of particles, the carried tracer properties, the advection and output algorithms, and the projection of properties to the finite-element mesh. We present scaling tests ranging up to tens of thousands of cores and tens of billions of particles. Additionally, we discuss efficient load-balancing strategies for particles in adaptive meshes with their strengths and weaknesses, local particle-transfer between parallel subdomains utilizing existing communication patterns from the finite element mesh, and the use of established parallel output algorithms like the HDF5 library. Finally, we show some relevant particle application cases, compare our implementation to a
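
    A minimal example of the advection step that such a swappable plugin could implement is a second-order Runge-Kutta update of particle positions in the flow field. The cellular velocity field below is an analytic toy chosen so the sketch is self-contained; it is not ASPECT's finite-element solution or plugin interface.

```python
# Second-order Runge-Kutta advection of particles in a toy 2D flow.
import numpy as np

def velocity(x):
    # Incompressible cellular flow on the unit square (analytic stand-in).
    u = -np.sin(np.pi * x[:, 0]) * np.cos(np.pi * x[:, 1])
    v = np.cos(np.pi * x[:, 0]) * np.sin(np.pi * x[:, 1])
    return np.column_stack([u, v])

def advect_rk2(x, dt):
    midpoint = x + 0.5 * dt * velocity(x)   # half step
    return x + dt * velocity(midpoint)      # full step with midpoint velocity

particles = np.random.rand(10000, 2)        # positions in [0, 1]^2
for _ in range(100):
    particles = advect_rk2(particles, dt=1e-2)
print(particles.mean(axis=0))
```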

  4. Massively parallel approach to time-domain forward and inverse modelling of EM induction problem in spherical Earth

    NASA Astrophysics Data System (ADS)

    Velimsky, J.

    2011-12-01

    Inversion of observatory and low-orbit satellite geomagnetic data in terms of the three-dimensional distribution of electrical conductivity in the Earth's mantle can provide an independent constraint on the physical, chemical, and mineralogical composition of the Earth's mantle. This problem has recently been approached by different numerical methods. There are several key challenges from the numerical and algorithmic point of view, in particular the accuracy and speed of the forward solver, the effective evaluation of sensitivities of data to changes of model parameters, and the dependence of results on the a priori knowledge of the spatio-temporal structure of the primary ionospheric and magnetospheric electric currents. Here I present recent advancements of the time-domain, spherical harmonic-finite element approach. The forward solver has been adapted to distributed-memory parallel architecture using band-matrix routines from the ScaLAPACK library. The evaluation of the gradient of the data misfit in model space using the adjoint approach has also been parallelized. Finally, the inverse problem has been reformulated in a way which allows for simultaneous reconstruction of the conductivity model and the external field model directly from the data.
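
    Schematically, the adjoint machinery evaluates the gradient of a time-domain data misfit of the generic form below without perturbing each model parameter separately; the notation is generic and is not the paper's exact discretization:

```latex
% Generic time-domain misfit and its adjoint-based gradient:
\chi^2(m) = \tfrac{1}{2}\int_0^T \big\| d^{\mathrm{obs}}(t) - d(m; t) \big\|^2 \, dt,
\qquad
\nabla_m \chi^2 = -\int_0^T \lambda^{\dagger}\,
\frac{\partial A(m)}{\partial m}\, u \; dt,
```

    where u is the forward field satisfying A(m)\,u = f and \lambda is the adjoint field driven by the data residuals and integrated backward in time; one forward and one adjoint solve thus replace one solve per model parameter.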

  5. Scientific development of a massively parallel ocean climate model. Progress report for 1992--1993 and Continuing request for 1993--1994 to CHAMMP (Computer Hardware, Advanced Mathematics, and Model Physics)

    SciTech Connect

    Semtner, A.J. Jr.; Chervin, R.M.

    1993-05-01

    During the second year of CHAMMP funding to the principal investigators, progress has been made in the proposed areas of research, as follows: investigation of the physics of the thermohaline circulation; examination of resolution effects on ocean general circulation; and development of a massively parallel ocean climate model.

  6. Analysis of bacterial and archaeal diversity in coastal microbial mats using massive parallel 16S rRNA gene tag sequencing

    PubMed Central

    Bolhuis, Henk; Stal, Lucas J

    2011-01-01

    Coastal microbial mats are small-scale and largely closed ecosystems in which a plethora of different functional groups of microorganisms are responsible for the biogeochemical cycling of the elements. Coastal microbial mats play an important role in coastal protection and morphodynamics through stabilization of the sediments and by initiating the development of salt-marshes. Little is known about the bacterial and especially the archaeal diversity and how it contributes to the ecological functioning of coastal microbial mats. Here, we analyzed three different types of coastal microbial mats that are located along a tidal gradient and can be characterized as marine (ST2), brackish (ST3) and freshwater (ST3) systems. The mats were sampled during three different seasons and subjected to massively parallel tag sequencing of the V6 region of the 16S rRNA genes of Bacteria and Archaea. Sequence analysis revealed that the mats are among the most diverse marine ecosystems studied so far and contain several novel taxonomic levels ranging from classes to species. The diversity between the different mat types was far more pronounced than the changes between the different seasons at one location. The archaeal communities of these mats had not been studied before; they showed a strong reaction to a short period of drought during summer, resulting in a massive increase in halobacterial sequences, whereas the bacterial community was barely affected. We conclude that the community composition and the microbial diversity were intrinsic to the mat type and depend on the location along the tidal gradient, indicating a relation with salinity. PMID:21544102

  7. Massively parallel implementations of coupled-cluster methods for electron spin resonance spectra. I. Isotropic hyperfine coupling tensors in large radicals

    NASA Astrophysics Data System (ADS)

    Verma, Prakash; Perera, Ajith; Morales, Jorge A.

    2013-11-01

    Coupled cluster (CC) methods provide highly accurate predictions of molecular properties, but their high computational cost has precluded their routine application to large systems. Fortunately, recent computational developments in the ACES III program by the Bartlett group [the OED/ERD atomic integral package, the super instruction processor, and the super instruction architecture language] permit overcoming that limitation by providing a framework for massively parallel CC implementations. In that scheme, we are further extending those parallel CC efforts to systematically predict the three main electron spin resonance (ESR) tensors (A-, g-, and D-tensors) to be reported in a series of papers. In this paper inaugurating that series, we report our new ACES III parallel capabilities that calculate isotropic hyperfine coupling constants in 38 neutral, cationic, and anionic radicals that include the 11B, 17O, 9Be, 19F, 1H, 13C, 35Cl, 33S, 14N, 31P, and 67Zn nuclei. Present parallel calculations are conducted at the Hartree-Fock (HF), second-order many-body perturbation theory [MBPT(2)], CC singles and doubles (CCSD), and CCSD with perturbative triples [CCSD(T)] levels using Roos augmented double- and triple-zeta atomic natural orbital basis sets. HF results consistently overestimate isotropic hyperfine coupling constants. However, inclusion of electron correlation effects in the simplest way via MBPT(2) provides significant improvements in the predictions, but not without occasional failures. In contrast, CCSD results are consistently in very good agreement with experimental results. Inclusion of perturbative triples to CCSD via CCSD(T) leads to small improvements in the predictions, which might not compensate for the extra computational effort at a non-iterative N7-scaling in CCSD(T). The importance of these accurate computations of isotropic hyperfine coupling constants to elucidate experimental ESR spectra, to interpret spin-density distributions, and to
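
    For reference, the isotropic (Fermi-contact) hyperfine coupling constant computed in such studies takes the standard textbook form below (one common convention; prefactors vary with the unit system):

```latex
% Fermi-contact contribution to the isotropic hyperfine coupling constant:
A_{\mathrm{iso}}(N) = \frac{4\pi}{3}\, \langle S_z \rangle^{-1}\,
g_e \beta_e\, g_N \beta_N\, \rho^{\alpha-\beta}(\mathbf{R}_N),
```

    where \rho^{\alpha-\beta}(\mathbf{R}_N) is the electron spin density evaluated at nucleus N; its sensitivity to the wavefunction right at the nucleus is why A_iso is such a demanding test of the electron-correlation treatment.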

  8. Massively parallel implementations of coupled-cluster methods for electron spin resonance spectra. I. Isotropic hyperfine coupling tensors in large radicals

    SciTech Connect

    Verma, Prakash; Morales, Jorge A.; Perera, Ajith

    2013-11-07

    Coupled cluster (CC) methods provide highly accurate predictions of molecular properties, but their high computational cost has precluded their routine application to large systems. Fortunately, recent computational developments in the ACES III program by the Bartlett group [the OED/ERD atomic integral package, the super instruction processor, and the super instruction architecture language] permit overcoming that limitation by providing a framework for massively parallel CC implementations. In that scheme, we are further extending those parallel CC efforts to systematically predict the three main electron spin resonance (ESR) tensors (A-, g-, and D-tensors) to be reported in a series of papers. In this paper inaugurating that series, we report our new ACES III parallel capabilities that calculate isotropic hyperfine coupling constants in 38 neutral, cationic, and anionic radicals that include the 11B, 17O, 9Be, 19F, 1H, 13C, 35Cl, 33S, 14N, 31P, and 67Zn nuclei. Present parallel calculations are conducted at the Hartree-Fock (HF), second-order many-body perturbation theory [MBPT(2)], CC singles and doubles (CCSD), and CCSD with perturbative triples [CCSD(T)] levels using Roos augmented double- and triple-zeta atomic natural orbital basis sets. HF results consistently overestimate isotropic hyperfine coupling constants. However, inclusion of electron correlation effects in the simplest way via MBPT(2) provides significant improvements in the predictions, but not without occasional failures. In contrast, CCSD results are consistently in very good agreement with experimental results. Inclusion of perturbative triples to CCSD via CCSD(T) leads to small improvements in the predictions, which might not compensate for the extra computational effort at a non-iterative N7-scaling in CCSD(T). The importance of these accurate computations of isotropic hyperfine coupling constants to elucidate

  9. Using Massive Parallel Sequencing for the Development, Validation, and Application of Population Genetics Markers in the Invasive Bivalve Zebra Mussel (Dreissena polymorpha)

    PubMed Central

    Peñarrubia, Luis; Sanz, Nuria; Pla, Carles; Vidal, Oriol; Viñas, Jordi

    2015-01-01

    The zebra mussel (Dreissena polymorpha, Pallas, 1771) is one of the most invasive species of freshwater bivalves, due to a combination of biological and anthropogenic factors. Once this species has been introduced to a new area, individuals form dense aggregations that are very difficult to remove, leading to many adverse socioeconomic and ecological consequences. In this study, we identified, tested, and validated a new set of polymorphic microsatellite loci (also known as SSRs, Simple Sequence Repeats) using a Massive Parallel Sequencing (MPS) platform. After several pruning steps, 93 SSRs could potentially be amplified. Out of these SSRs, 14 were polymorphic, producing a polymorphic yield of 15.05%. These 14 polymorphic microsatellites were fully validated in a first approximation of the genetic population structure of D. polymorpha in the Iberian Peninsula. Based on this polymorphic yield, we propose a criterion for establishing the number of SSRs that require validation in similar species, depending on the final use of the markers. These results could be used to optimize MPS approaches in the development of microsatellites as genetic markers, which would reduce the cost of this process. PMID:25780924

  10. A digital 25 µm pixel-pitch uncooled amorphous silicon TEC-less VGA IRFPA with massive parallel Sigma-Delta-ADC readout

    NASA Astrophysics Data System (ADS)

    Weiler, Dirk; Russ, Marco; Würfel, Daniel; Lerch, Renee; Yang, Pin; Bauer, Jochen; Vogt, Holger

    2010-04-01

    This paper presents an advanced 640 x 480 (VGA) IRFPA based on uncooled microbolometers with a pixel-pitch of 25 μm developed by Fraunhofer-IMS. The IRFPA is designed for thermal imaging applications in the LWIR (8-14 μm) range with a full-frame frequency of 30 Hz and a high sensitivity with NETD < 100 mK @ f/1. A novel readout architecture which utilizes massively parallel on-chip Sigma-Delta-ADCs located under the microbolometer array results in a high performance digital readout. Sigma-Delta-ADCs are inherently linear. A high resolution of 16 bit can be obtained with a second-order Sigma-Delta modulator followed by a third-order digital sinc filter. In addition to several thousand Sigma-Delta-ADCs, the readout circuit consists of a configurable sequencer for controlling the readout clocking signals and a temperature sensor for measuring the temperature of the IRFPA. Since packaging is a significant part of an IRFPA's price, Fraunhofer-IMS uses a chip-scale package consisting of an IR-transparent window with antireflection coating and a soldering frame for maintaining the vacuum. The IRFPAs are completely fabricated at Fraunhofer-IMS on 8" CMOS wafers with an additional surface micromachining process. In this paper the architecture of the readout electronics, the packaging, and the electro-optical performance characterization are presented.
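
    The behavioral sketch below illustrates the readout principle: a second-order Sigma-Delta modulator turns a slowly varying bolometer signal into a 1-bit stream whose average tracks the input, after which a decimation filter recovers a high-resolution word. It is a generic model with invented parameters, not the Fraunhofer-IMS circuit, and it uses plain block averaging where the real design uses a third-order sinc filter.

```python
# Behavioral model of a 1-bit, second-order Sigma-Delta modulator.
import numpy as np

def sigma_delta_2nd(x):
    """Modulate input samples x (each in [-1, 1]) into a +/-1 bitstream."""
    i1 = i2 = 0.0
    bits = np.empty_like(x)
    for n, xn in enumerate(x):
        y = 1.0 if i2 >= 0 else -1.0   # 1-bit quantizer, fed back as DAC value
        i1 += xn - y                   # first integrator
        i2 += i1 - y                   # second integrator
        bits[n] = y
    return bits

osr = 256                              # assumed oversampling ratio
x = 0.4 * np.ones(osr * 16)            # DC stand-in for a bolometer signal
bits = sigma_delta_2nd(x)
decimated = bits.reshape(-1, osr).mean(axis=1)   # crude decimation
print(decimated[:4])                   # values close to the 0.4 input
```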

  11. Towards Anatomic Scale Agent-Based Modeling with a Massively Parallel Spatially Explicit General-Purpose Model of Enteric Tissue (SEGMEnT_HPC)

    PubMed Central

    Cockrell, Robert Chase; Christley, Scott; Chang, Eugene; An, Gary

    2015-01-01

    Perhaps the greatest challenge currently facing the biomedical research community is the ability to integrate highly detailed cellular and molecular mechanisms to represent clinical disease states as a pathway to engineer effective therapeutics. This is particularly evident in the representation of organ-level pathophysiology in terms of abnormal tissue structure, which, through histology, remains a mainstay in disease diagnosis and staging. As such, being able to generate anatomic scale simulations is a highly desirable goal. While computational limitations have previously constrained the size and scope of multi-scale computational models, advances in the capacity and availability of high-performance computing (HPC) resources have greatly expanded the ability of computational models of biological systems to achieve anatomic, clinically relevant scale. Diseases of the intestinal tract are prime examples of pathophysiological processes that manifest at multiple scales of spatial resolution, with structural abnormalities present at the microscopic, macroscopic and organ levels. In this paper, we describe a novel, massively parallel computational model of the gut, the Spatially Explicit General-purpose Model of Enteric Tissue_HPC (SEGMEnT_HPC), which extends an existing model of the gut epithelium, SEGMEnT, in order to create cell-for-cell anatomic scale simulations. We present an example implementation of SEGMEnT_HPC that simulates the pathogenesis of ileal pouchitis, an important clinical entity that affects patients following remedial surgery for ulcerative colitis. PMID:25806784

  12. Helena, the hidden beauty: Resolving the most common West Eurasian mtDNA control region haplotype by massively parallel sequencing an Italian population sample.

    PubMed

    Bodner, Martin; Iuvaro, Alessandra; Strobl, Christina; Nagl, Simone; Huber, Gabriela; Pelotti, Susi; Pettener, Davide; Luiselli, Donata; Parson, Walther

    2015-03-01

    The analysis of mitochondrial (mt)DNA is a powerful tool in forensic genetics when nuclear markers fail to give results or when maternal relatedness is investigated. The mtDNA control region (CR) contains highly condensed variation and is therefore routinely typed. Some samples exhibit an identical haplotype in this restricted range. Thus, they convey only weak evidence in forensic queries and limited phylogenetic information. However, a CR match does not imply that the mtDNA coding regions are also identical or that the samples belong to the same phylogenetic lineage. This is especially the case for the most frequent West Eurasian CR haplotype 263G 315.1C 16519C, which is observed in various clades within haplogroup H and occurs at a frequency of 3-4% in many European populations. In this study, we investigated the power of massively parallel complete mtGenome sequencing in 29 Italian samples displaying the most common West Eurasian CR haplotype - and found an unexpectedly high diversity. Twenty-eight different haplotypes falling into 19 described sub-clades of haplogroup H were revealed in the samples with identical CR sequences. This study demonstrates the benefit of complete mtGenome sequencing for forensic applications to enforce maximum discrimination, more comprehensive heteroplasmy detection, as well as the highest phylogenetic resolution. PMID:25303789

  13. Using Massive Parallel Sequencing for the development, validation, and application of population genetics markers in the invasive bivalve zebra mussel (Dreissena polymorpha).

    PubMed

    Peñarrubia, Luis; Sanz, Nuria; Pla, Carles; Vidal, Oriol; Viñas, Jordi

    2015-01-01

    The zebra mussel (Dreissena polymorpha, Pallas, 1771) is one of the most invasive species of freshwater bivalves, due to a combination of biological and anthropogenic factors. Once this species has been introduced to a new area, individuals form dense aggregations that are very difficult to remove, leading to many adverse socioeconomic and ecological consequences. In this study, we identified, tested, and validated a new set of polymorphic microsatellite loci (also known as SSRs, Simple Sequence Repeats) using a Massive Parallel Sequencing (MPS) platform. After several pruning steps, 93 SSRs could potentially be amplified. Out of these SSRs, 14 were polymorphic, producing a polymorphic yield of 15.05%. These 14 polymorphic microsatellites were fully validated in a first approximation of the genetic population structure of D. polymorpha in the Iberian Peninsula. Based on this polymorphic yield, we propose a criterion for establishing the number of SSRs that require validation in similar species, depending on the final use of the markers. These results could be used to optimize MPS approaches in the development of microsatellites as genetic markers, which would reduce the cost of this process. PMID:25780924

  14. Performance of the UCAN2 Gyrokinetic Particle In Cell (PIC) Code on Two Massively Parallel Mainframes with Intel ``Sandy Bridge'' Processors

    NASA Astrophysics Data System (ADS)

    Leboeuf, Jean-Noel; Decyk, Viktor; Newman, David; Sanchez, Raul

    2013-10-01

    The massively parallel, 2D domain-decomposed, nonlinear, 3D, toroidal, electrostatic, gyrokinetic, Particle in Cell (PIC), Cartesian geometry UCAN2 code, with particle ions and adiabatic electrons, has been ported to two emerging mainframes. These two computers, Edison, a Cray machine at NERSC in the US, and MareNostrum III (MNIII), an IBM machine at the Barcelona Supercomputing Center (BSC) in Spain, share the same Intel ``Sandy Bridge'' processors. The successful port of UCAN2 to MNIII, which came online first, enabled us to be up and running efficiently in record time on Edison. Overall, the performance of UCAN2 on Edison is superior to that on MNIII, particularly at large numbers of processors (>1024) for the same Intel IFORT compiler. This appears to be due to different MPI implementations (OpenMPI on MNIII and MPICH2 on Edison) and different interconnection networks (InfiniBand on MNIII and Cray's Aries on Edison) on the two mainframes. Details of these ports and comparative benchmarks are presented. Work supported by OFES, USDOE, under contract no. DE-FG02-04ER54741 with the University of Alaska at Fairbanks.

  15. Massively parallel multiple interacting continua formulation for modeling flow in fractured porous media using the subsurface reactive flow and transport code PFLOTRAN

    NASA Astrophysics Data System (ADS)

    Kumar, J.; Mills, R. T.; Lichtner, P. C.; Hammond, G. E.

    2010-12-01

    Fracture dominated flows occur in numerous subsurface geochemical processes and at many different scales in rock pore structures, micro-fractures, fracture networks and faults. Fractured porous media can be modeled as multiple interacting continua which are connected to each other through transfer terms that capture the flow of mass and energy in response to pressure, temperature and concentration gradients. However, the analysis of large-scale transient problems using the multiple interacting continuum approach presents an algorithmic and computational challenge for problems with very large numbers of degrees of freedom. A generalized dual porosity model based on the Dual Continuum Disconnected Matrix approach has been implemented within the massively parallel multiphysics-multicomponent-multiphase subsurface reactive flow and transport code PFLOTRAN. Developed as part of the Department of Energy's SciDAC-2 program, PFLOTRAN provides subsurface simulation capabilities that can scale from laptops to ultrascale supercomputers, and utilizes the PETSc framework to solve the large, sparse algebraic systems that arise in complex subsurface reactive flow and transport problems. It has been successfully applied to the solution of problems composed of more than two billion degrees of freedom, utilizing up to 131,072 processor cores on Jaguar, the Cray XT5 system at Oak Ridge National Laboratory that was at the time the world's fastest supercomputer. Building upon the capabilities and computational efficiency of PFLOTRAN, we will present an implementation of the multiple interacting continua formulation for fractured porous media along with an application case study.
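
    The inter-continuum coupling mentioned above is conventionally written as a transfer term of Warren-Root form; the generic single-phase version is shown below, while PFLOTRAN's Dual Continuum Disconnected Matrix formulation generalizes the idea to multiphase, non-isothermal conditions:

```latex
% Generic dual-continuum mass-exchange term (Warren--Root form):
\Gamma_{fm} = \sigma\, \frac{k_m\, \rho}{\mu} \left( p_m - p_f \right),
```

    where \sigma is a shape factor encoding the matrix-block geometry, k_m is the matrix permeability, and p_m and p_f are the matrix and fracture pressures; analogous terms drive heat and solute exchange.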

  16. Development of ballistic hot electron emitter and its applications to parallel processing: active-matrix massive direct-write lithography in vacuum and thin films deposition in solutions

    NASA Astrophysics Data System (ADS)

    Koshida, N.; Kojima, A.; Ikegami, N.; Suda, R.; Yagi, M.; Shirakashi, J.; Yoshida, T.; Miyaguchi, H.; Muroyama, M.; Nishino, H.; Yoshida, S.; Sugata, M.; Totsu, K.; Esashi, M.

    2015-03-01

    Making the best use of the characteristic features of the nanocrystalline Si (nc-Si) ballistic hot electron source, an alternative lithographic technology is presented based on two approaches: physical excitation in vacuum and chemical reduction in solutions. The nc-Si cold cathode is a kind of metal-insulator-semiconductor (MIS) diode, composed of a thin metal film, an nc-Si layer, an n+-Si substrate, and an ohmic back contact. Under a biased condition, energetic electrons are uniformly and directionally emitted through the thin surface electrodes. In vacuum, this emitter is available for active-matrix-driven massively parallel lithography. Arrayed 100×100 emitters (each size: 10×10 μm2, pitch: 100 μm) are fabricated on a silicon substrate by a conventional planar process, and then every emitter is bonded with an integrated complementary metal-oxide-semiconductor (CMOS) driver using through-silicon-via (TSV) interconnect technology. Electron multi-beams emitted from selected devices are focused by a micro-electro-mechanical system (MEMS) condenser lens array and introduced into an accelerating system with a demagnification factor of 100. The electron accelerating voltage is 5 kV. The designed size of each beam landing on the target is 10×10 nm2. Here we discuss the fabrication process of the emitter array with TSV holes, the implementation of the integrated active-matrix driver circuit, the bonding of these components, the construction of the electron optics, and the overall operation in the exposure system including the correction of possible aberrations. The experimental results of this mask-less parallel pattern transfer are shown in terms of simple 1:1 projection and parallel lithography under an active-matrix drive scheme. Another application is the use of this emitter as an active electrode supplying highly reducing electrons into solutions. A very small amount of metal-salt solution is dripped onto the nc-Si emitter surface, and the emitter is driven without

  17. Parallel inversion of a massive ERT data set to characterize deep vadose zone contamination beneath former nuclear waste infiltration galleries at the Hanford Site B-Complex (Invited)

    NASA Astrophysics Data System (ADS)

    Johnson, T.; Rucker, D. F.; Wellman, D.

    2013-12-01

    revealed the general footprint of vadose zone contamination beneath infiltration galleries. In 2011, the USDOE commissioned an effort to re-invert the B-Complex ERT data as a whole using a recently developed massively parallel 3D ERT inversion code. The computational mesh included approximately 1.085 million elements and closely honored the 37m of topographic relief as determined by LiDAR imaging. The water table and tank boundaries were also incorporated into the mesh to facilitate regularization disconnects, enabling sharp conductivity contrasts where they occur naturally without penalty. The data were inverted using 1024 processors, requiring 910 Gb of memory and 11.5 hours of computation time. The imaging results revealed previously unrealized detail concerning the distribution and behavior of contaminants migrating through the vadose zone, and are currently being used by site cleanup operators and regulators to understand the origin of a groundwater nitrate plume emerging from one of the infiltration galleries. The results overall demonstrate the utility of high performance computing, unstructured meshing, and custom regularization constraints for optimal processing of massive ERT data sets enabled by modern ERT survey hardware.
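
    Inversions of this kind typically minimize a regularized least-squares objective of the generic form below; the "regularization disconnects" described above amount to removing the smoothness coupling in W_m across the water table and tank boundaries:

```latex
% Generic regularized (Occam/Tikhonov-style) ERT inversion objective:
\Phi(\mathbf{m}) =
\big\| \mathbf{W}_d \big( \mathbf{d} - f(\mathbf{m}) \big) \big\|^2
+ \beta\, \big\| \mathbf{W}_m (\mathbf{m} - \mathbf{m}_{\mathrm{ref}}) \big\|^2,
```

    where f is the forward resistivity model on the unstructured mesh, W_d weights the data by their estimated errors, W_m penalizes model roughness, and \beta trades off data fit against smoothness.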

  18. Significant Association between Sulfate-Reducing Bacteria and Uranium-Reducing Microbial Communities as Revealed by a Combined Massively Parallel Sequencing-Indicator Species Approach

    SciTech Connect

    Cardenas, Erick; Leigh, Mary Beth; Marsh, Terence; Tiedje, James M.; Wu, Wei-min; Luo, Jian; Ginder-Vogel, Matthew; Kitanidis, Peter K.; Criddle, Craig; Carley, Jack M; Carroll, Sue L; Gentry, Terry J; Watson, David B; Gu, Baohua; Jardine, Philip M; Zhou, Jizhong

    2010-10-01

    Massively parallel sequencing has provided a more affordable and high-throughput method to study microbial communities, although it has mostly been used in an exploratory fashion. We combined pyrosequencing with a strict indicator species statistical analysis to test if bacteria specifically responded to ethanol injection that successfully promoted dissimilatory uranium(VI) reduction in the subsurface of a uranium contamination plume at the Oak Ridge Field Research Center in Tennessee. Remediation was achieved with a hydraulic flow control consisting of an inner loop, where ethanol was injected, and an outer loop for flow-field protection. This strategy reduced uranium concentrations in groundwater to levels below 0.126 μM and created geochemical gradients in electron donors from the inner-loop injection well toward the outer loop and downgradient flow path. Our analysis with 15 sediment samples from the entire test area found significant indicator species that showed a high degree of adaptation to the three different hydrochemical-created conditions. Castellaniella and Rhodanobacter characterized areas with low pH, heavy metals, and low bioactivity, while sulfate-, Fe(III)-, and U(VI)-reducing bacteria (Desulfovibrio, Anaeromyxobacter, and Desulfosporosinus) were indicators of areas where U(VI) reduction occurred. The abundance of these bacteria, as well as the Fe(III) and U(VI) reducer Geobacter, correlated with the hydraulic connectivity to the substrate injection site, suggesting that the selected populations were a direct response to electron donor addition by the groundwater flow path. A false-discovery-rate approach was implemented to discard false-positive results by chance, given the large amount of data compared.

  19. Significant association between sulfate-reducing bacteria and uranium-reducing microbial communities as revealed by a combined massively parallel sequencing-indicator species approach.

    PubMed

    Cardenas, Erick; Wu, Wei-Min; Leigh, Mary Beth; Carley, Jack; Carroll, Sue; Gentry, Terry; Luo, Jian; Watson, David; Gu, Baohua; Ginder-Vogel, Matthew; Kitanidis, Peter K; Jardine, Philip M; Zhou, Jizhong; Criddle, Craig S; Marsh, Terence L; Tiedje, James M

    2010-10-01

    Massively parallel sequencing has provided a more affordable and high-throughput method to study microbial communities, although it has mostly been used in an exploratory fashion. We combined pyrosequencing with a strict indicator species statistical analysis to test if bacteria specifically responded to ethanol injection that successfully promoted dissimilatory uranium(VI) reduction in the subsurface of a uranium contamination plume at the Oak Ridge Field Research Center in Tennessee. Remediation was achieved with a hydraulic flow control consisting of an inner loop, where ethanol was injected, and an outer loop for flow-field protection. This strategy reduced uranium concentrations in groundwater to levels below 0.126 μM and created geochemical gradients in electron donors from the inner-loop injection well toward the outer loop and downgradient flow path. Our analysis with 15 sediment samples from the entire test area found significant indicator species that showed a high degree of adaptation to the three different hydrochemical-created conditions. Castellaniella and Rhodanobacter characterized areas with low pH, heavy metals, and low bioactivity, while sulfate-, Fe(III)-, and U(VI)-reducing bacteria (Desulfovibrio, Anaeromyxobacter, and Desulfosporosinus) were indicators of areas where U(VI) reduction occurred. The abundance of these bacteria, as well as the Fe(III) and U(VI) reducer Geobacter, correlated with the hydraulic connectivity to the substrate injection site, suggesting that the selected populations were a direct response to electron donor addition by the groundwater flow path. A false-discovery-rate approach was implemented to discard false-positive results by chance, given the large amount of data compared. PMID:20729318

  20. Quaternary Morphodynamics of Fluvial Dispersal Systems Revealed: The Fly River, PNG, and the Sunda Shelf, SE Asia, simulated with the Massively Parallel GPU-based Model 'GULLEM'

    NASA Astrophysics Data System (ADS)

    Aalto, R. E.; Lauer, J. W.; Darby, S. E.; Best, J.; Dietrich, W. E.

    2015-12-01

    During glacial-marine transgressions vast volumes of sediment are deposited due to the infilling of lowland fluvial systems and shallow shelves, material that is removed during ensuing regressions. Modelling these processes would illuminate system morphodynamics, fluxes, and 'complexity' in response to base level change, yet such problems are computationally formidable. Environmental systems are characterized by strong interconnectivity, yet traditional supercomputers have slow inter-node communication -- whereas rapidly advancing Graphics Processing Unit (GPU) technology offers vastly higher (>100x) bandwidths. GULLEM (GpU-accelerated Lowland Landscape Evolution Model) employs massively parallel code to simulate coupled fluvial-landscape evolution for complex lowland river systems over large temporal and spatial scales. GULLEM models the accommodation space carved/infilled by representing a range of geomorphic processes, including: river & tributary incision within a multi-directional flow regime, non-linear diffusion, glacial-isostatic flexure, hydraulic geometry, tectonic deformation, sediment production, transport & deposition, and full 3D tracking of all resulting stratigraphy. Model results concur with the Holocene dynamics of the Fly River, PNG -- as documented with dated cores, sonar imaging of floodbasin stratigraphy, and the observations of topographic remnants from LGM conditions. Other supporting research was conducted along the Mekong River, the largest fluvial system of the Sunda Shelf. These and other field data provide tantalizing empirical glimpses into the lowland landscapes of large rivers during glacial-interglacial transitions, observations that can be explored with this powerful numerical model. GULLEM affords estimates for the timing and flux budgets within the Fly and Sunda Systems, illustrating complex internal system responses to the external forcing of sea level and climate. Furthermore, GULLEM can be applied to almost any fluvial system to

  1. Uncooled digital IRFPA-family with 17μm pixel-pitch based on amorphous silicon with massively parallel Sigma-Delta-ADC readout

    NASA Astrophysics Data System (ADS)

    Weiler, D.; Hochschulz, F.; Würfel, D.; Lerch, R.; Geruschke, T.; Wall, S.; Heß, J.; Wang, Q.; Vogt, H.

    2014-06-01

    This paper presents the results of an advanced digital IRFPA-family developed by Fraunhofer IMS. The IRFPA-family comprises two different optical resolutions, VGA (640 × 480 pixels) and QVGA (320 × 240 pixels), using a pin-compatible detector board. The uncooled IRFPAs are designed for thermal imaging applications in the LWIR (8-14 μm) range with a full-frame frequency of 30 Hz and a high thermal sensitivity. The microbolometer with a pixel-pitch of 17 μm uses amorphous silicon as the sensing layer. By scaling and optimizing our previous microbolometer technology with a pixel-pitch of 25 μm, we enhance the thermal sensitivity of the microbolometer. The microbolometers are read out by a novel readout architecture which utilizes massively parallel on-chip Sigma-Delta-ADCs. This results in a direct digital conversion of the resistance change of the microbolometer induced by incident infrared radiation. To reduce production costs, a chip-scale package is used as the vacuum package. This vacuum package consists of an IR-transparent window with an antireflection coating and a soldering frame which is fixed by a wafer-to-chip process directly on top of the CMOS substrate. The chip-scale package is placed onto a detector board by a chip-on-board technique. The IRFPAs are completely fabricated at Fraunhofer IMS on 8" CMOS wafers with an additional surface micromachining process. In this paper the architecture of the readout electronics, the packaging, and the electro-optical performance characterization are presented.

  2. Influences of diurnal sampling bias on fixed-point monitoring of plankton biodiversity determined using a massively parallel sequencing-based technique.

    PubMed

    Nagai, Satoshi; Hida, Kohsuke; Urushizaki, Shingo; Onitsuka, Goh; Yasuike, Motoshige; Nakamura, Yoji; Fujiwara, Atushi; Tajimi, Seisuke; Kimoto, Katsunori; Kobayashi, Takanori; Gojobori, Takashi; Ototake, Mitsuru

    2016-02-01

    In this study, we investigated the influence of diurnal sampling bias on the community structure of plankton by comparing the biodiversity among seawater samples (n=9) obtained every 3 h for 24 h by using massively parallel sequencing (MPS)-based plankton monitoring at a fixed point conducted at Himedo seaport in Yatsushiro Sea, Japan. The number of raw operational taxonomic units (OTUs) and OTUs after re-sampling was 507-658 (558 ± 104, mean ± standard deviation) and 448-544 (467 ± 81), respectively, indicating high plankton biodiversity at the sampling location. The relative abundance of the top 20 OTUs in the samples from Himedo seaport was 48.8-67.7% (58.0 ± 5.8%), and the highest-ranked OTU was a Pseudo-nitzschia species (Bacillariophyta) with a relative abundance of 17.3-39.2%, followed by Oithona sp. 1 and Oithona sp. 2 (Arthropoda). During seawater sampling, the semidiurnal tidal current with an amplitude of 0.3 m s(-1) was dominant, and a westward residual current driven by the northeasterly wind was continuously observed during the 24-h monitoring. Therefore, the relative abundance of plankton species apparently fluctuated among the samples, but no significant difference was noted according to the G-test (p>0.05). Significant differences were observed between the samples obtained from a different locality (Kusuura in Yatsushiro Sea) and at different dates, suggesting that the influence of diurnal sampling bias on plankton diversity, determined using the MPS-based survey, was not significant and acceptable. PMID:26475937

  3. A combined massively parallel sequencing indicator species approach revealed significant association between sulfate-reducing bacteria and uranium-reducing microbial communities

    SciTech Connect

    Cardenas, Erick; Wu, Wei-min; Leigh, Mary Beth; Carley, Jack M; Carroll, Sue L; Gentry, Terry; Luo, Jian; Watson, David B; Gu, Baohua; Ginder-Vogel, Matthew A.; Kitanidis, Peter K.; Jardine, Philip; Kelly, Shelly D; Zhou, Jizhong; Criddle, Craig; Marsh, Terence; Tiedje, James

    2010-08-01

    Massively parallel sequencing has provided a more affordable and high-throughput method to study microbial communities, although it has been mostly used in an exploratory fashion. We combined pyrosequencing with a strict indicator species statistical analysis to test if bacteria specifically responded to ethanol injection that successfully promoted dissimilatory uranium (VI) reduction in the subsurface of a uranium contamination plume at the Oak Ridge Field Research Center in Tennessee, USA. Remediation was achieved with a hydraulic flow control consisting of an inner loop, where ethanol was injected, and an outer loop for flow field protection. This strategy reduced uranium concentrations in groundwater to levels below 0.126 μM, and created geochemical gradients in electron donors from the inner loop injection well towards the outer loop and down-gradient flow path. Our analysis with 15 sediment samples from the entire test area found significant indicator species that showed a high degree of adaptation to the three different hydrochemical created conditions. Castellaniella and Rhodanobacter characterized areas with low pH, heavy metals, and low bioactivity; while sulfate-, Fe(III)-, and U(VI)-reducing bacteria (Desulfovibrio, Anaeromyxobacter, and Desulfosporosinus) were indicators of areas where U(VI) reduction occurred. Abundance of these bacteria as well as the Fe(III)- and U(VI)-reducer Geobacter correlated with the hydraulic connectivity to the substrate injection site, suggesting that the selected populations were a direct response to the electron donor addition and to the groundwater flow path. A false discovery rate approach was implemented to discard false positives by chance given the large amount of data compared.

  4. The effect of the macrolide antibiotic tylosin on microbial diversity in the canine small intestine as demonstrated by massive parallel 16S rRNA gene sequencing

    PubMed Central

    2009-01-01

    Background Recent studies have shown that the fecal microbiota is generally resilient to short-term antibiotic administration, but some bacterial taxa may remain depressed for several months. Limited information is available about the effect of antimicrobials on small intestinal microbiota, an important contributor to gastrointestinal health. The antibiotic tylosin is often successfully used for the treatment of chronic diarrhea in dogs, but its exact mode of action and its effect on the intestinal microbiota remain unknown. The aim of this study was to evaluate the effect of tylosin on canine jejunal microbiota. Tylosin was administered at 20 to 22 mg/kg q 24 hr for 14 days to five healthy dogs, each with a pre-existing jejunal fistula. Jejunal brush samples were collected through the fistula on days 0, 14, and 28 (14 days after withdrawal of tylosin). Bacterial diversity was characterized using massive parallel 16S rRNA gene pyrosequencing. Results Pyrosequencing revealed a previously unrecognized species richness in the canine small intestine. Ten bacterial phyla were identified. Microbial populations were phylogenetically more similar during tylosin treatment. However, a remarkable inter-individual response was observed for specific taxa. Fusobacteria, Bacteroidales, and Moraxella tended to decrease. The proportions of Enterococcus-like organisms, Pasteurella spp., and Dietzia spp. increased significantly during tylosin administration (p < 0.05). The proportion of Escherichia coli-like organisms increased by day 28 (p = 0.04). These changes were not accompanied by any obvious clinical effects. On day 28, the phylogenetic composition of the microbiota was similar to day 0 in only 2 of 5 dogs. Bacterial diversity resembled the pre-treatment state in 3 of 5 dogs. Several bacterial taxa such as Spirochaetes, Streptomycetaceae, and Prevotellaceae failed to recover at day 28 (p < 0.05). Several bacterial groups considered to be sensitive to tylosin increased in their

  5. Massively parallel sequencing of forensic STRs: Considerations of the DNA commission of the International Society for Forensic Genetics (ISFG) on minimal nomenclature requirements.

    PubMed

    Parson, Walther; Ballard, David; Budowle, Bruce; Butler, John M; Gettings, Katherine B; Gill, Peter; Gusmão, Leonor; Hares, Douglas R; Irwin, Jodi A; King, Jonathan L; Knijff, Peter de; Morling, Niels; Prinz, Mechthild; Schneider, Peter M; Neste, Christophe Van; Willuweit, Sascha; Phillips, Christopher

    2016-05-01

    The DNA Commission of the International Society for Forensic Genetics (ISFG) is reviewing factors that need to be considered ahead of the adoption by the forensic community of short tandem repeat (STR) genotyping by massively parallel sequencing (MPS) technologies. MPS produces sequence data that provide a precise description of the repeat allele structure of an STR marker and of variants that may reside in the flanking areas of the repeat region. When an STR contains a complex arrangement of repeat motifs, the level of genetic polymorphism revealed by the sequence data can increase substantially. As repeat structures can be complex and include substitutions, insertions, deletions, variable tandem repeat arrangements of multiple nucleotide motifs, and flanking region SNPs, established capillary electrophoresis (CE) allele descriptions must be supplemented by a new system of STR allele nomenclature, which retains backward compatibility with the CE data that currently populate national DNA databases and that will continue to be produced for the coming years. Thus, there is a pressing need for a standardized framework for describing complex sequences that enables comparison with the currently used repeat allele nomenclature derived from conventional CE systems. It is important to discern three levels of information in hierarchical order: (i) the sequence, (ii) the alignment, and (iii) the nomenclature of STR sequence data. We propose a sequence (text) string format as the minimal requirement for data storage that laboratories should follow when adopting MPS of STRs. We further discuss the variant annotation and sequence comparison framework necessary to maintain compatibility among established and future data. This system must be easy to use and interpret by the DNA specialist, be based on a universally accessible genome assembly, and be in place before the uptake of MPS by the general forensic community starts to generate sequence data on a large scale. While the established
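
    One small ingredient of such a nomenclature is the reduction of a raw repeat-region sequence to bracketed motif notation. The toy function below collapses tandem runs of a single motif; real STR alleles (compound repeats, indels, flanking SNPs) require considerably more machinery, and this sketch is an illustration, not ISFG-endorsed code.

```python
# Collapse tandem runs of one motif into bracketed [MOTIF]n notation.
import re

def bracketed(seq, motif):
    out, i = [], 0
    run = re.compile(f"(?:{motif})+")
    while i < len(seq):
        m = run.match(seq, i)
        if m:
            n_repeats = (m.end() - m.start()) // len(motif)
            out.append(f"[{motif}]{n_repeats}")
            i = m.end()
        else:
            out.append(seq[i])          # emit non-repeat bases one at a time
            i += 1
    return " ".join(out)

print(bracketed("TCTATCTATCTATCTGTCTATCTA", "TCTA"))
# -> [TCTA]3 T C T G [TCTA]2
```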

  6. FAVR (Filtering and Annotation of Variants that are Rare): methods to facilitate the analysis of rare germline genetic variants from massively parallel sequencing datasets

    PubMed Central

    2013-01-01

    Background Characterising genetic diversity through the analysis of massively parallel sequencing (MPS) data offers enormous potential to significantly improve our understanding of the genetic basis for observed phenotypes, including predisposition to and progression of complex human disease. Great challenges remain in resolving genetic variants that are genuine from the millions of artefactual signals. Results FAVR is a suite of new methods designed to work with commonly used MPS analysis pipelines to assist in the resolution of some of the issues related to the analysis of the vast amount of resulting data, with a focus on relatively rare genetic variants. To the best of our knowledge, no equivalent method has previously been described. The most important and novel aspect of FAVR is the use of signatures in comparator sequence alignment files during variant filtering, and annotation of variants potentially shared between individuals. The FAVR methods use these signatures to facilitate filtering of (i) platform and/or mapping-specific artefacts, (ii) common genetic variants, and, where relevant, (iii) artefacts derived from imbalanced paired-end sequencing, as well as annotation of genetic variants based on evidence of co-occurrence in individuals. We applied conventional variant calling to whole-exome sequencing datasets, produced using both SOLiD and TruSeq chemistries, with or without downstream processing by FAVR methods. We demonstrate a 3-fold smaller rare single nucleotide variant shortlist with no detected reduction in sensitivity. This analysis included Sanger sequencing of rare variant signals not evident in dbSNP131, assessment of known variant signal preservation, and comparison of observed and expected rare variant numbers across a range of first-cousin pairs. The principles described herein were applied in our recent publication identifying XRCC2 as a new breast cancer risk gene and have been made publicly available as a suite of software
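
    The comparator-based filtering idea can be caricatured in a few lines: a candidate rare variant in the sample of interest is discarded when its signal recurs across unrelated comparator alignments, since recurrence suggests a platform/mapping artefact or a common variant rather than a genuinely rare one. Everything below (names, thresholds, data structures) is an invented stand-in for parsed alignment files, not the FAVR implementation.

```python
# Keep candidate variants that recur in at most `max_comparators` comparators.
def comparator_filter(candidates, comparator_support, max_comparators=1):
    kept = []
    for var in candidates:
        hits = sum(1 for sample in comparator_support if sample.get(var, 0) > 0)
        if hits <= max_comparators:
            kept.append(var)
    return kept

candidates = ["chr7:1000G>A", "chr7:2500T>C", "chr13:900del"]
comparators = [
    {"chr7:2500T>C": 12},                     # recurrent signal: artefact-like
    {"chr7:2500T>C": 7, "chr13:900del": 1},
]
print(comparator_filter(candidates, comparators))
# -> ['chr7:1000G>A', 'chr13:900del']
```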

  7. ROVER variant caller: read-pair overlap considerate variant-calling software applied to PCR-based massively parallel sequencing datasets

    PubMed Central

    2014-01-01

    Background We recently described Hi-Plex, a highly multiplexed PCR-based target-enrichment system for massively parallel sequencing (MPS), which allows the uniform definition of library size so that subsequent paired-end sequencing can achieve complete overlap of read pairs. Variant calling from Hi-Plex-derived datasets can thus rely on the identification of variants appearing in both reads of read-pairs, permitting stringent filtering of sequencing chemistry-induced errors. These principles underlie the ROVER software (derived from Read Overlap PCR-MPS variant caller), which we have recently used to report the screening for genetic mutations in the breast cancer predisposition gene PALB2. Here, we describe the algorithms underlying ROVER and its usage. Results ROVER enables users to quickly and accurately identify genetic variants from PCR-targeted, overlapping paired-end MPS datasets. The open-source availability of the software and threshold tailorability enables broad access for a range of PCR-MPS users. Methods ROVER is implemented in Python and runs on all popular POSIX-like operating systems (Linux, OS X). The software accepts a tab-delimited text file listing the coordinates of the target-specific primers used for targeted enrichment based on a specified genome-build. It also accepts aligned sequence files resulting from mapping to the same genome-build. ROVER identifies the amplicon a given read-pair represents and removes the primer sequences by using the mapping co-ordinates and primer co-ordinates. It considers overlapping read-pairs with respect to primer-intervening sequence. Only when a variant is observed in both reads of a read-pair does the signal contribute to a tally of read-pairs containing or not containing the variant. A user-defined threshold informs the minimum number of, and proportion of, read-pairs a variant must be observed in for a ‘call’ to be made. ROVER also reports the depth of coverage across amplicons to facilitate the
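
    The calling rule described above reduces to a small amount of logic once read-pairs have been mapped and trimmed of primer sequence: a variant is tallied only when both reads of a pair contain it, and a call requires both a minimum count and a minimum proportion of read-pairs. The sketch below encodes that rule over simplified stand-ins for parsed alignments; the thresholds are illustrative and are not ROVER's defaults.

```python
# Read-pair-overlap variant tallying with count and proportion thresholds.
from collections import Counter

def call_variants(read_pairs, min_pairs=2, min_proportion=0.05):
    total = len(read_pairs)
    tally = Counter()
    for read1_vars, read2_vars in read_pairs:
        tally.update(read1_vars & read2_vars)   # seen in BOTH reads of a pair
    return [v for v, n in tally.items()
            if n >= min_pairs and n / total >= min_proportion]

pairs = [
    ({"chr1:100A>G"}, {"chr1:100A>G"}),         # concordant pair: counted
    ({"chr1:100A>G"}, set()),                   # one-read signal: ignored
    ({"chr1:100A>G", "chr1:150C>T"}, {"chr1:100A>G"}),
    (set(), set()),
]
print(call_variants(pairs))                     # ['chr1:100A>G']
```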

  8. Massively parallel sequencing of phyllodes tumours of the breast reveals actionable mutations, and TERT promoter hotspot mutations and TERT gene amplification as likely drivers of progression.

    PubMed

    Piscuoglio, Salvatore; Ng, Charlotte Ky; Murray, Melissa; Burke, Kathleen A; Edelweiss, Marcia; Geyer, Felipe C; Macedo, Gabriel S; Inagaki, Akiko; Papanastasiou, Anastasios D; Martelotto, Luciano G; Marchio, Caterina; Lim, Raymond S; Ioris, Rafael A; Nahar, Pooja K; Bruijn, Ino De; Smyth, Lillian; Akram, Muzaffar; Ross, Dara; Petrini, John H; Norton, Larry; Solit, David B; Baselga, Jose; Brogi, Edi; Ladanyi, Marc; Weigelt, Britta; Reis-Filho, Jorge S

    2016-03-01

    Phyllodes tumours (PTs) are breast fibroepithelial lesions that are graded based on histological criteria as benign, borderline or malignant. PTs may recur locally. Borderline PTs and malignant PTs may metastasize to distant sites. Breast fibroepithelial lesions, including PTs and fibroadenomas, are characterized by recurrent MED12 exon 2 somatic mutations. We sought to define the repertoire of somatic genetic alterations in PTs and whether these may assist in the differential diagnosis of these lesions. We collected 100 fibroadenomas, 40 benign PTs, 14 borderline PTs and 22 malignant PTs; six, six and 13 benign, borderline and malignant PTs, respectively, and their matched normal tissue, were subjected to targeted massively parallel sequencing (MPS) using the MSK-IMPACT sequencing assay. Recurrent MED12 mutations were found in 56% of PTs; in addition, mutations affecting cancer genes (eg TP53, RB1, SETD2 and EGFR) were exclusively detected in borderline and malignant PTs. We found a novel recurrent clonal hotspot mutation in the TERT promoter (-124 C>T) in 52% and TERT gene amplification in 4% of PTs. Laser capture microdissection revealed that these mutations were restricted to the mesenchymal component of PTs. Sequencing analysis of the entire cohort revealed that the frequency of TERT alterations increased from benign (18%) to borderline (57%) and to malignant PTs (68%; p < 0.01), and TERT alterations were associated with increased levels of TERT mRNA (p < 0.001). No TERT alterations were observed in fibroadenomas. An analysis of TERT promoter sequencing and gene amplification distinguished PTs from fibroadenomas with a sensitivity and a positive predictive value of 100% (CI 95.38-100%) and 100% (CI 85.86-100%), respectively, and a sensitivity and a negative predictive value of 39% (CI 28.65-51.36%) and 68% (CI 60.21-75.78%), respectively. Our results suggest that TERT alterations may drive the progression of PTs, and may assist in the differential diagnosis

  9. BerkeleyGW: A massively parallel computer package for the calculation of the quasiparticle and optical properties of materials and nanostructures

    NASA Astrophysics Data System (ADS)

    Deslippe, Jack; Samsonidze, Georgy; Strubbe, David A.; Jain, Manish; Cohen, Marvin L.; Louie, Steven G.

    2012-06-01

    BerkeleyGW is a massively parallel computational package for electron excited-state properties that is based on the many-body perturbation theory employing the ab initio GW and GW plus Bethe-Salpeter equation methodology. It can be used in conjunction with many density-functional theory codes for ground-state properties, including PARATEC, PARSEC, Quantum ESPRESSO, SIESTA, and Octopus. The package can be used to compute the electronic and optical properties of a wide variety of material systems from bulk semiconductors and metals to nanostructured materials and molecules. The package scales to 10 000s of CPUs and can be used to study systems containing up to 100s of atoms. Program summary: Program title: BerkeleyGW. Catalogue identifier: AELG_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AELG_v1_0.html. Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland. Licensing provisions: Open source BSD License. See code for licensing details. No. of lines in distributed program, including test data, etc.: 576 540. No. of bytes in distributed program, including test data, etc.: 110 608 809. Distribution format: tar.gz. Programming language: Fortran 90, C, C++, Python, Perl, BASH. Computer: Linux/UNIX workstations or clusters. Operating system: Tested on a variety of Linux distributions in parallel and serial as well as AIX and Mac OSX. RAM: (50-2000) MB per CPU (highly dependent on system size). Classification: 7.2, 7.3, 16.2, 18. External routines: BLAS, LAPACK, FFTW, ScaLAPACK (optional), MPI (optional). All available under open-source licenses. Nature of problem: The excited state properties of materials involve the addition or subtraction of electrons as well as the optical excitations of electron-hole pairs. The excited particles interact strongly with other electrons in a material system. This interaction affects the electronic energies, wavefunctions and lifetimes. It is well known that ground-state theories, such as standard methods

  10. NON-CONFORMING FINITE ELEMENTS; MESH GENERATION, ADAPTIVITY AND RELATED ALGEBRAIC MULTIGRID AND DOMAIN DECOMPOSITION METHODS IN MASSIVELY PARALLEL COMPUTING ENVIRONMENT

    SciTech Connect

    Lazarov, R; Pasciak, J; Jones, J

    2002-02-01

    Construction, analysis and numerical testing of efficient solution techniques for solving elliptic PDEs that allow for parallel implementation have been the focus of the research. A number of discretization and solution methods for solving second order elliptic problems that include mortar and penalty approximations and domain decomposition methods for finite elements and finite volumes have been investigated and analyzed. Techniques for parallel domain decomposition algorithms in the framework of PETSc and HYPRE have been studied and tested. Hierarchical parallel grid refinement and adaptive solution methods have been implemented and tested on various model problems. A parallel code implementing the mortar method with algebraically constructed multiplier spaces was developed.

  11. Staudinger ligation as a method for bioconjugation.

    PubMed

    van Berkel, Sander S; van Eldijk, Mark B; van Hest, Jan C M

    2011-09-12

    In 1919 the German chemist Hermann Staudinger was the first to describe the reaction between an azide and a phosphine. It was not until recently, however, that Bertozzi and co-workers recognized the potential of this reaction as a method for bioconjugation and transformed it into the so-called Staudinger ligation. The bio-orthogonal character of both the azide and the phosphine functions has resulted in the Staudinger ligation finding numerous applications in various complex biological systems. For example, the Staudinger ligation has been utilized to label glycans, lipids, DNA, and proteins. Moreover, the Staudinger ligation has been used as a synthetic method to construct glycopeptides, microarrays, and functional biopolymers. In the emerging field of bio-orthogonal ligation strategies, the Staudinger ligation has set a high standard to which most of the new techniques are often compared. This Review summarizes recent developments and new applications of the Staudinger ligation. PMID:21887733

  12. Protein-templated peptide ligation.

    PubMed

    Brauckhoff, Nicolas; Hahne, Gernot; Yeh, Johannes T-H; Grossmann, Tom N

    2014-04-22

    Molecular templates bind particular reactants, thereby increasing their effective concentrations and accelerating the corresponding reaction. This concept has been successfully applied to a number of chemical problems with a strong focus on nucleic acid templated reactions. We present the first protein-templated reaction that allows N-terminal linkage of two peptides. In the presence of a protein template, ligation reactions were accelerated by more than three orders of magnitude. The templated reaction is highly selective and proved its robustness in a protein-labeling reaction that was performed in crude cell lysate. PMID:24644125

  13. MPP parallel forth

    NASA Technical Reports Server (NTRS)

    Dorband, John E.

    1987-01-01

    Massively Parallel Processor (MPP) Parallel FORTH is a derivative of FORTH-83 and Unified Software Systems' Uni-FORTH. The extension of FORTH into the realm of parallel processing on the MPP is described. With few exceptions, Parallel FORTH was made to follow the description of Uni-FORTH as closely as possible. Likewise, the parallel FORTH extensions were designed to be as philosophically similar to serial FORTH as possible. The MPP hardware characteristics, as viewed by the FORTH programmer, are discussed. Then a description is presented of how parallel FORTH is implemented on the MPP.

  14. Method and apparatus for routing data in an inter-nodal communications lattice of a massively parallel computer system by semi-randomly varying routing policies for different packets

    DOEpatents

    Archer, Charles Jens; Musselman, Roy Glenn; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen; Wallenfelt, Brian Paul

    2010-11-23

    A massively parallel computer system contains an inter-nodal communications network of node-to-node links. Nodes vary a choice of routing policy for routing data in the network in a semi-random manner, so that similarly situated packets are not always routed along the same path. Semi-random variation of the routing policy tends to avoid certain local hot spots of network activity, which might otherwise arise using more consistent routing determinations. Preferably, the originating node chooses a routing policy for a packet, and all intermediate nodes in the path route the packet according to that policy. Policies may be rotated on a round-robin basis, selected by generating a random number, or otherwise varied.
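
    The policy-variation scheme in this patent abstract is straightforward to sketch. Below is a toy Python illustration of per-packet policy selection by round-robin rotation or random draw; the policy names are invented placeholders, not the patent's actual routing policies.

```python
import random
from itertools import cycle

POLICIES = ["x_first", "y_first", "z_first", "adaptive"]  # invented names

class PolicyChooser:
    """Pick a routing policy per packet, by round-robin rotation or by
    random draw, so similarly situated packets do not all share a path."""
    def __init__(self, mode="round_robin", seed=None):
        self.mode = mode
        self._rr = cycle(POLICIES)
        self._rng = random.Random(seed)

    def choose(self):
        if self.mode == "round_robin":
            return next(self._rr)
        return self._rng.choice(POLICIES)

# The originating node stamps the policy into the packet header; all
# intermediate nodes then route the packet according to that policy.
chooser = PolicyChooser(mode="random", seed=42)
packets = [{"dest": (3, 1, 2), "policy": chooser.choose()} for _ in range(5)]
print([p["policy"] for p in packets])
```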

  15. The "Floss-Ligature" Ligation Technique.

    PubMed

    Dugdale, Charlotte Anne; Malik, Ovais Humair; Waring, David Trevor

    2015-01-01

    This clinical pearl describes an alternative technique to aid effective ligation of rotated teeth during the aligning stage of fixed appliance treatment. This technique has the potential to improve patient experience and confidence, by reducing the risk of trauma and discomfort, and to improve treatment efficiency, by ensuring complete ligation of even severely rotated teeth. PMID:27029094

  16. 21 CFR 876.4400 - Hemorrhoidal ligator.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Hemorrhoidal ligator. 876.4400 Section 876.4400 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES GASTROENTEROLOGY-UROLOGY DEVICES Surgical Devices § 876.4400 Hemorrhoidal ligator....

  17. 21 CFR 876.4400 - Hemorrhoidal ligator.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 21 Food and Drugs 8 2011-04-01 2011-04-01 false Hemorrhoidal ligator. 876.4400 Section 876.4400 Food and Drugs FOOD AND DRUG ADMINISTRATION, DEPARTMENT OF HEALTH AND HUMAN SERVICES (CONTINUED) MEDICAL DEVICES GASTROENTEROLOGY-UROLOGY DEVICES Surgical Devices § 876.4400 Hemorrhoidal ligator....

  18. The Application of a Massively Parallel Computer to the Simulation of Electrical Wave Propagation Phenomena in the Heart Muscle Using Simplified Models

    NASA Technical Reports Server (NTRS)

    Karpoukhin, Mikhii G.; Kogan, Boris Y.; Karplus, Walter J.

    1995-01-01

    The simulation of heart arrhythmia and fibrillation is a very important and challenging task. The solution of these problems using sophisticated mathematical models is beyond the capabilities of modern supercomputers. To overcome these difficulties it is proposed to break the whole simulation problem into two tightly coupled stages: generation of the action potential using sophisticated models, and propagation of the action potential using simplified models. The well-known simplified models are compared and modified to bring the rate of depolarization and action potential duration restitution closer to reality. The modified method of lines is used to parallelize the computational process. The conditions for the appearance of 2D spiral waves after the application of a premature beat and the subsequent traveling of the spiral wave inside the simulated tissue are studied.
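
    The method of lines mentioned above discretizes space first, leaving a system of ODEs to step in time. A minimal sketch using the FitzHugh-Nagumo equations as a stand-in for the paper's simplified action-potential models (all parameter values are illustrative):

```python
import numpy as np

# Method of lines: discretize the fibre in space, then step the resulting
# ODE system in time with explicit Euler.
N, L = 200, 100.0
dx = L / N
D, eps, a, b = 1.0, 0.08, 0.7, 0.8
v = np.full(N, -1.2)        # membrane potential (near resting state)
w = np.full(N, -0.6)        # recovery variable
v[:10] = 1.0                # premature stimulus at one end

dt = 0.9 * dx * dx / (2 * D)   # explicit-Euler diffusion stability bound
for _ in range(5000):
    lap = np.empty(N)
    lap[1:-1] = (v[2:] - 2 * v[1:-1] + v[:-2]) / dx**2
    lap[0] = (v[1] - v[0]) / dx**2          # no-flux boundaries
    lap[-1] = (v[-2] - v[-1]) / dx**2
    v = v + dt * (D * lap + v - v**3 / 3 - w)
    w = w + dt * eps * (v + a - b * w)

print("cells currently excited:", int((v > 0).sum()))
```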

  19. Implementation and scaling of the fully coupled Terrestrial Systems Modeling Platform (TerrSysMP) in a massively parallel supercomputing environment - a case study on JUQUEEN (IBM Blue Gene/Q)

    NASA Astrophysics Data System (ADS)

    Gasper, F.; Goergen, K.; Kollet, S.; Shrestha, P.; Sulis, M.; Rihani, J.; Geimer, M.

    2014-06-01

    Continental-scale hyper-resolution simulations constitute a grand challenge in characterizing non-linear feedbacks of states and fluxes of the coupled water, energy, and biogeochemical cycles of terrestrial systems. Tackling this challenge requires advanced coupling and supercomputing technologies for earth system models that are discussed in this study, utilizing the example of the implementation of the newly developed Terrestrial Systems Modeling Platform (TerrSysMP) on JUQUEEN (IBM Blue Gene/Q) of the Jülich Supercomputing Centre, Germany. The applied coupling strategies rely on the Multiple Program Multiple Data (MPMD) paradigm and require memory and load balancing considerations in the exchange of the coupling fields between different component models and the allocation of computational resources, respectively. These considerations can be addressed with advanced profiling and tracing tools, leading to the efficient use of massively parallel computing environments, which is then mainly determined by the parallel performance of individual component models. However, the problem of model I/O and initialization in the peta-scale range requires major attention, because it constitutes a true big data challenge in the perspective of future exa-scale capabilities, and it remains unsolved.

  20. Formation of oligonucleotide-PNA-chimeras by template-directed ligation

    NASA Technical Reports Server (NTRS)

    Koppitz, M.; Nielsen, P. E.; Orgel, L. E.; Bada, J. L. (Principal Investigator)

    1998-01-01

    DNA sequences have previously been reported to act as templates for the synthesis of PNA, and vice versa. A continuous evolutionary transition from an informational replicating system based on one polymer to a system based on the other would be facilitated if it were possible to form chimeras, that is molecules that contain monomers of both types. Here we show that ligation to form chimeras proceeds efficiently both on PNA and on DNA templates. The efficiency of ligation is primarily determined by the number of backbone bonds at the ligation site and the relative orientation of template and substrate strands. The most efficient reactions result in the formation of chimeras with ligation junctions resembling the structures of the backbones of PNA and DNA and with antiparallel alignment of both components of the chimera with the template, that is, ligations involving formation of 3'-phosphoramidate and 5'-ester bonds. However, double helices involving PNA are stable both with antiparallel and parallel orientation of the two strands. Ligation on PNA but not on DNA templates is, therefore, sometimes possible on templates with reversed orientation. The relevance of these findings to discussions of possible transitions between genetic systems is discussed.

  1. Development and application of a 6.5 million feature Affymetrix Genechip® for massively parallel discovery of single position polymorphisms in lettuce (Lactuca spp.)

    PubMed Central

    2012-01-01

    Background High-resolution genetic maps are needed in many crops to help characterize the genetic diversity that determines agriculturally important traits. Hybridization to microarrays to detect single feature polymorphisms is a powerful technique for marker discovery and genotyping because of its highly parallel nature. However, microarrays designed for gene expression analysis rarely provide sufficient gene coverage for optimal detection of nucleotide polymorphisms, which limits utility in species with low rates of polymorphism such as lettuce (Lactuca sativa). Results We developed a 6.5 million feature Affymetrix GeneChip® for efficient polymorphism discovery and genotyping, as well as for analysis of gene expression in lettuce. Probes on the microarray were designed from 26,809 unigenes from cultivated lettuce and an additional 8,819 unigenes from four related species (L. serriola, L. saligna, L. virosa and L. perennis). Where possible, probes were tiled with a 2 bp stagger, alternating on each DNA strand; providing an average of 187 probes covering approximately 600 bp for each of over 35,000 unigenes; resulting in up to 13 fold redundancy in coverage per nucleotide. We developed protocols for hybridization of genomic DNA to the GeneChip® and refined custom algorithms that utilized coverage from multiple, high quality probes to detect single position polymorphisms in 2 bp sliding windows across each unigene. This allowed us to detect greater than 18,000 polymorphisms between the parental lines of our core mapping population, as well as numerous polymorphisms between cultivated lettuce and wild species in the lettuce genepool. Using marker data from our diversity panel comprised of 52 accessions from the five species listed above, we were able to separate accessions by species using both phylogenetic and principal component analyses. Additionally, we estimated the diversity between different types of cultivated lettuce and distinguished morphological types

  2. Second-Order Møller-Plesset Perturbation Theory in the Condensed Phase: An Efficient and Massively Parallel Gaussian and Plane Waves Approach.

    PubMed

    Del Ben, Mauro; Hutter, Jürg; VandeVondele, Joost

    2012-11-13

    A novel algorithm, based on a hybrid Gaussian and plane waves (GPW) approach, is developed for the canonical second-order Møller-Plesset perturbation energy (MP2) of finite and extended systems. The key aspect of the method is that the electron repulsion integrals (ia|λσ) are computed by direct integration between the products of Gaussian basis functions λσ and the electrostatic potential arising from a given occupied-virtual pair density ia. The electrostatic potential is obtained in a plane waves basis set after solving the Poisson equation in Fourier space. In particular, for condensed phase systems, this scheme is highly efficient. Furthermore, our implementation has low memory requirements and displays excellent parallel scalability up to 100 000 processes. In this way, canonical MP2 calculations for condensed phase systems containing hundreds of atoms or more than 5000 basis functions can be performed within minutes, while systems up to 1000 atoms and 10 000 basis functions remain feasible. Solid LiH has been employed as a benchmark to study basis set and system size convergence. Lattice constants and cohesive energies of various molecular crystals have been studied with MP2 and double-hybrid functionals. PMID:26605583
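
    The central trick above, obtaining an electrostatic potential by solving the Poisson equation in Fourier space, can be sketched with an FFT on a periodic grid. This toy version assumes a cubic cell and a charge-neutral density; conventions and prefactors are simplified for illustration:

```python
import numpy as np

def poisson_fft(rho, box=1.0):
    """Solve -laplacian(phi) = rho on a periodic cubic grid via FFT."""
    n = rho.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=box / n)
    kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    rho_k = np.fft.fftn(rho)
    phi_k = np.zeros_like(rho_k)
    nonzero = k2 != 0                              # the G = 0 term is
    phi_k[nonzero] = rho_k[nonzero] / k2[nonzero]  # dropped (neutral cell)
    return np.real(np.fft.ifftn(phi_k))

rng = np.random.default_rng(0)
rho = rng.random((16, 16, 16))
rho -= rho.mean()                                  # enforce neutrality
print(poisson_fft(rho).shape)
```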

  3. PMESH: A parallel mesh generator

    SciTech Connect

    Hardin, D.D.

    1994-10-21

    The Parallel Mesh Generation (PMESH) Project is a joint LDRD effort by A Division and Engineering to develop a unique mesh generation system that can construct large calculational meshes (of up to 10^9 elements) on massively parallel computers. Such a capability will remove a critical roadblock to unleashing the power of massively parallel processors (MPPs) for physical analysis. PMESH will support a variety of LLNL 3-D physics codes in the areas of electromagnetics, structural mechanics, thermal analysis, and hydrodynamics.

  4. Massive Stars

    NASA Astrophysics Data System (ADS)

    Livio, Mario; Villaver, Eva

    2009-11-01

    Participants; Preface Mario Livio and Eva Villaver; 1. High-mass star formation by gravitational collapse of massive cores M. R. Krumholz; 2. Observations of massive star formation N. A. Patel; 3. Massive star formation in the Galactic center D. F. Figer; 4. An X-ray tour of massive star-forming regions with Chandra L. K. Townsley; 5. Massive stars: feedback effects in the local universe M. S. Oey and C. J. Clarke; 6. The initial mass function in clusters B. G. Elmegreen; 7. Massive stars and star clusters in the Antennae galaxies B. C. Whitmore; 8. On the binarity of Eta Carinae T. R. Gull; 9. Parameters and winds of hot massive stars R. P. Kudritzki and M. A. Urbaneja; 10. Unraveling the Galaxy to find the first stars J. Tumlinson; 11. Optically observable zero-age main-sequence O stars N. R. Walborn; 12. Metallicity-dependent Wolf-Rayet winds P. A. Crowther; 13. Eruptive mass loss in very massive stars and Population III stars N. Smith; 14. From progenitor to afterlife R. A. Chevalier; 15. Pair-production supernovae: theory and observation E. Scannapieco; 16. Cosmic infrared background and Population III: an overview A. Kashlinsky.

  5. Massive parallel analysis of the binding specificity of histone-like protein HU to single- and double-stranded DNA with generic oligodeoxyribonucleotide microchips.

    SciTech Connect

    Krylov, A. S.; Zasedateleva, O. A.; Prokopenko, D. V.; Rouviere-Yaniv, J.; Mirzabekov, A. D.; Biochip Technology Center; Engelhardt Inst. of Molecular Biology; Inst. de Biologie Physico-Chimique

    2001-06-15

    A generic hexadeoxyribonucleotide microchip has been applied to test the DNA-binding properties of HU histone-like bacterial protein, which is known to have a low sequence specificity. All 4096 hexamers flanked within 8mers by degenerate bases at both the 3'- and 5'-ends were immobilized within the 100 x 100 x 20 µm polyacrylamide gel pads of the microchip. Single-stranded immobilized oligonucleotides were converted in some experiments to the double-stranded form by hybridization with a specified mixture of 8mers. The DNA interaction with HU was characterized by three types of measurements: (i) binding of FITC-labeled HU to microchip oligonucleotides; (ii) melting curves of complexes of labeled HU with single-stranded microchip oligonucleotides; (iii) the effect of HU binding on melting curves of microchip double-stranded DNA labeled with another fluorescent dye, Texas Red. Large numbers of measurements of these parameters were carried out in parallel for all or many generic microchip elements in real time with a multi-wavelength fluorescence microscope. Statistical analysis of these data suggests some preference for HU binding to G/C-rich single-stranded oligonucleotides. HU complexes with double-stranded microchip 8mers can be divided into two groups in which HU binding either increased the melting temperature (Tm) of duplexes or decreased it. The stabilized duplexes showed some preference for the presence of the sequence motifs AAG, AGA and AAGA. In the second type of complex, enriched with A/T base pairs, the destabilization effect was higher for longer stretches of A/T duplexes. Binding of HU to labeled duplexes in the second type of complex caused some decrease in fluorescence. This decrease also correlates with the higher A/T content and lower Tm. The results demonstrate that generic microchips could be an efficient approach in analysis of sequence specificity of proteins.

  6. Kinetics of ligation of fibrin oligomers.

    PubMed

    Nelb, G W; Kamykowski, G W; Ferry, J D

    1980-07-10

    Human fibrinogen was treated with thrombin in the presence of fibrinoligase and calcium ion at pH 8.5, ionic strength 0.45, and the ensuing polymerization was interrupted at various time intervals (t) both before and after the clotting time (tc) by solubilization with a solution of sodium dodecyl sulfate and urea. Aliquots of the solubilized protein were subjected to gel electrophoresis on polyacrylamide gels after disulfide reduction by dithiothreitol and on agarose gels without reduction. The degree of gamma-gamma ligation was determined from the former and the size distribution of ligated oligomers, for degree of polymerization x from 1 to 10, from the latter. The degree of gamma-gamma ligation was calculated independently from the size distribution with the assumption that every junction between two fibrin monomers remaining intact after solubilization is ligated, and this agreed well with the direct determination. The size distribution at t/tc = 1.3 to 1.6 differed somewhat from that calculated by the classical theory of linear polycondensation on the assumption that all reactive sites react with equal probability and rate. Analysis of the difference suggests that ligation of a fibrin oligomer is not a random process; the probability of ligation of a given junction between two monomers increases with the oligomer length. The number-average degree of polymerization, xn, of ligated oligomers increases approximately linearly with time up to a value of 1.6. PMID:7391026
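
    The "classical theory of linear polycondensation" referenced above is the Flory most-probable distribution, which is easy to state in code. A small sketch assuming the textbook form of the distribution; the abstract's point is that the measured oligomer distribution deviates from this baseline:

```python
def flory_number_fraction(x, p):
    """Most-probable (Flory) distribution for linear polycondensation:
    number fraction of x-mers when a fraction p of junctions has reacted."""
    return (1 - p) * p ** (x - 1)

def number_average_dp(p):
    """Number-average degree of polymerization, xn = 1 / (1 - p)."""
    return 1 / (1 - p)

# xn = 1.6, the largest value reported above, corresponds to p = 0.375
p = 1 - 1 / 1.6
print(number_average_dp(p))
print([round(flory_number_fraction(x, p), 3) for x in range(1, 6)])
```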

  7. Cloning of DNA fragments: ligation reactions in agarose gel.

    PubMed

    Furtado, Agnelo

    2014-01-01

    Ligation reactions to ligate a desired DNA fragment into a vector can be challenging for beginners, especially if the amount of the insert is limiting. Although additives known as crowding agents, such as PEG 8000, added to the ligation mixes can increase the success one has with ligation reactions, in practice the amount of insert used in the ligation can determine the success or the failure of the ligation reaction. The method described here, which uses insert DNA in a gel slice added directly into the ligation reaction, has two benefits: (a) using agarose as the crowding agent and (b) reducing steps of insert purification. The use of rapid ligation buffer and incubation of the ligation reaction at room temperature greatly increase the efficiency of the ligation reaction even for blunt-ended ligation. PMID:24243199

  8. Parallel computing works

    SciTech Connect

    Not Available

    1991-10-23

    An account of the Caltech Concurrent Computation Program (C^3P), a five-year project that focused on answering the question: Can parallel computers be used to do large-scale scientific computations? As the title indicates, the question is answered in the affirmative, by implementing numerous scientific applications on real parallel computers and doing computations that produced new scientific results. In the process of doing so, C^3P helped design and build several new computers, designed and implemented basic system software, developed algorithms for frequently used mathematical computations on massively parallel machines, devised performance models and measured the performance of many computers, and created a high performance computing facility based exclusively on parallel computers. While the initial focus of C^3P was the hypercube architecture developed by C. Seitz, many of the methods developed and lessons learned have been applied successfully on other massively parallel architectures.

  9. Small-molecule-dependent split aptamer ligation.

    PubMed

    Sharma, Ashwani K; Heemstra, Jennifer M

    2011-08-17

    Here we describe the first use of small-molecule binding to direct a chemical reaction between two nucleic acid strands. The reported reaction is a ligation between two fragments of a DNA split aptamer using strain-promoted azide-alkyne cycloaddition. Utilizing the split aptamer for cocaine, we demonstrate small-molecule-dependent ligation that is dose-dependent over a wide range of cocaine concentrations and is compatible with complex biological fluids such as human blood serum. Moreover, studies of split aptamer ligation at varying salt concentrations and using structurally similar analogues of cocaine have revealed new insight into the assembly and small-molecule binding properties of the cocaine split aptamer. The ability to translate the presence of a small-molecule target into the output of DNA ligation is anticipated to enable the development of new, broadly applicable small-molecule detection assays. PMID:21761903

  10. Implementation and scaling of the fully coupled Terrestrial Systems Modeling Platform (TerrSysMP v1.0) in a massively parallel supercomputing environment - a case study on JUQUEEN (IBM Blue Gene/Q)

    NASA Astrophysics Data System (ADS)

    Gasper, F.; Goergen, K.; Shrestha, P.; Sulis, M.; Rihani, J.; Geimer, M.; Kollet, S.

    2014-10-01

    Continental-scale hyper-resolution simulations constitute a grand challenge in characterizing nonlinear feedbacks of states and fluxes of the coupled water, energy, and biogeochemical cycles of terrestrial systems. Tackling this challenge requires advanced coupling and supercomputing technologies for earth system models that are discussed in this study, utilizing the example of the implementation of the newly developed Terrestrial Systems Modeling Platform (TerrSysMP v1.0) on JUQUEEN (IBM Blue Gene/Q) of the Jülich Supercomputing Centre, Germany. The applied coupling strategies rely on the Multiple Program Multiple Data (MPMD) paradigm using the OASIS suite of external couplers, and require memory and load balancing considerations in the exchange of the coupling fields between different component models and the allocation of computational resources, respectively. Using the advanced profiling and tracing tool Scalasca to determine an optimum load balancing leads to a 19% speedup. In massively parallel supercomputer environments, the coupler OASIS-MCT is recommended, which resolves memory limitations that may be significant in case of very large computational domains and exchange fields as they occur in these specific test cases and in many applications in terrestrial research. However, model I/O and initialization in the petascale range still require major attention, as they constitute true big data challenges in light of future exascale computing resources. Based on a factor-two speedup due to compiler optimizations, a refactored coupling interface using OASIS-MCT and an optimum load balancing, the problem size in a weak scaling study can be increased by a factor of 64 from 512 to 32 768 processes while maintaining parallel efficiencies above 80% for the component models.
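
    The weak-scaling figure quoted above follows the usual definition: with the problem size per process fixed, efficiency is the ratio of baseline runtime to runtime at higher process counts. A small sketch with invented runtimes, since the paper reports only the resulting efficiencies:

```python
def weak_scaling_efficiency(t_base, t_n):
    """Weak scaling: problem size per process is fixed, so ideally the
    runtime stays constant and efficiency t_base / t_n stays near 1."""
    return t_base / t_n

# Invented runtimes for the 512 -> 32768 process range; the study itself
# reports only the resulting efficiencies (above 80%).
t_base = 100.0
for procs, t in [(512, 100.0), (4096, 108.0), (32768, 122.0)]:
    print(f"{procs:6d} processes: {weak_scaling_efficiency(t_base, t):.0%}")
```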

  11. A comparative study of conventional ligation and self-ligation bracket systems.

    PubMed

    Shivapuja, P K; Berger, J

    1994-11-01

    The increased use of self-ligating bracket systems frequently raises the question of how they compare with conventional ligation systems. An in vitro and clinical investigation was undertaken to evaluate and compare these distinctly different groups, by using five different brackets. The Activa ("A" Company, Johnson & Johnson, San Diego, Calif.), Edgelok (Ormco, Glendora, Calif.), and SPEED (Strite Industries Ltd., Cambridge, Ontario) self-ligating bracket systems displayed a significantly lower level of frictional resistance, dramatically less chairtime for arch wire removal and insertion, and promoted improved infection control, when compared with polyurethane elastomeric and stainless steel tie wire ligation for ceramic and metal twin brackets. PMID:7977187

  12. Massive Hemoptysis.

    PubMed

    Rali, Parth; Gandhi, Viral; Tariq, Cheema

    2016-01-01

    Hemoptysis, or coughing of blood, oftentimes triggers anxiety and fear for patients. The etiology of hemoptysis will determine the clinical course, which includes watchful waiting or intensive care admission. Any amount of hemoptysis that compromises the patient's respiratory status is considered massive hemoptysis and should be considered a medical emergency. In this article, we review the definition, bronchial circulation anatomy, etiology, and management of massive hemoptysis. PMID:26919675

  13. Development of ballistic hot electron emitter and its applications to parallel processing: active-matrix massive direct-write lithography in vacuum and thin-film deposition in solutions

    NASA Astrophysics Data System (ADS)

    Koshida, Nobuyoshi; Kojima, Akira; Ikegami, Naokatsu; Suda, Ryutaro; Yagi, Mamiko; Shirakashi, Junichi; Miyaguchi, Hiroshi; Muroyama, Masanori; Yoshida, Shinya; Totsu, Kentaro; Esashi, Masayoshi

    2015-07-01

    Making the best use of the characteristic features in nanocrystalline Si (nc-Si) ballistic hot electron sources, an alternative lithographic technology is presented based on two approaches: physical excitation in vacuum and chemical reduction in solutions. The nc-Si cold cathode is composed of a thin metal film, an nc-Si layer, an n+-Si substrate, and an ohmic back contact. Under a biased condition, energetic electrons are uniformly and directionally emitted through the thin surface electrodes. In vacuum, this emitter is available for active-matrix drive massively parallel lithography. Arrayed 100×100 emitters (each emitting area: 10×10 μm²) are fabricated on a silicon substrate by a conventional planar process, and then every emitter is bonded with the integrated driver using through-silicon-via interconnect technology. Another application is the use of this emitter as an active electrode supplying highly reducing electrons into solutions. A very small amount of metal-salt solution is dripped onto the nc-Si emitter surface, and the emitter is driven without using any counter electrodes. After the emitter operation, thin metal and elemental semiconductor (Si and Ge) films are uniformly deposited on the emitting surface. Spectroscopic surface and compositional analyses indicate that there are no significant contaminations in the deposited thin films.

  14. Reconstruction of the 1997/1998 El Nino from TOPEX/POSEIDON and TOGA/TAO Data Using a Massively Parallel Pacific-Ocean Model and Ensemble Kalman Filter

    NASA Technical Reports Server (NTRS)

    Keppenne, C. L.; Rienecker, M.; Borovikov, A. Y.

    1999-01-01

    Two massively parallel data assimilation systems in which the model forecast-error covariances are estimated from the distribution of an ensemble of model integrations are applied to the assimilation of 97-98 TOPEX/POSEIDON altimetry and TOGA/TAO temperature data into a Pacific basin version of the NASA Seasonal to Interannual Prediction Project (NSIPP) quasi-isopycnal ocean general circulation model. In the first system, an ensemble of model runs forced by an ensemble of atmospheric model simulations is used to calculate asymptotic error statistics. The data assimilation then occurs in the reduced phase space spanned by the corresponding leading empirical orthogonal functions. The second system is an ensemble Kalman filter in which new error statistics are computed during each assimilation cycle from the time-dependent ensemble distribution. The data assimilation experiments are conducted on NSIPP's 512-processor CRAY T3E. The two data assimilation systems are validated by withholding part of the data and quantifying the extent to which the withheld information can be inferred from the assimilation of the remaining data. The pros and cons of each system are discussed.
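
    The second system's analysis step, estimating forecast-error covariances from the ensemble itself, can be sketched compactly. Below is a generic stochastic (perturbed-observation) ensemble Kalman filter update in numpy; the state size, observation operator, and error levels are toy values, not those of the ocean model:

```python
import numpy as np

def enkf_update(ensemble, obs, h, obs_err_std, rng):
    """One stochastic (perturbed-observation) EnKF analysis step; the
    forecast-error covariance comes from the ensemble spread itself.
    ensemble: (n_members, n_state); h maps a state to observation space."""
    n_mem = ensemble.shape[0]
    hx = np.array([h(m) for m in ensemble])       # (n_members, n_obs)
    X = (ensemble - ensemble.mean(0)) / np.sqrt(n_mem - 1)
    Y = (hx - hx.mean(0)) / np.sqrt(n_mem - 1)
    P_xy = X.T @ Y                                # state-obs covariance
    P_yy = Y.T @ Y + (obs_err_std**2) * np.eye(len(obs))
    K = P_xy @ np.linalg.inv(P_yy)                # Kalman gain
    pert = rng.normal(0.0, obs_err_std, size=(n_mem, len(obs)))
    return ensemble + (obs + pert - hx) @ K.T

rng = np.random.default_rng(0)
ens = rng.normal(size=(50, 8))                    # toy 8-variable state
analysis = enkf_update(ens, np.array([0.5, -0.2]), lambda x: x[:2], 0.1, rng)
print(analysis.mean(0)[:2])                       # mean pulled toward obs
```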

  15. Effects of double ligation of Stensen's duct on the rabbit parotid gland.

    PubMed

    Maria, O M; Maria, S M; Redman, R S; Maria, A M; Saad El-Din, T A; Soussa, E F; Tran, S D

    2014-04-01

    Salivary gland duct ligation is an alternative to gland excision for treating sialorrhea or reducing salivary gland size prior to tumor excision. Duct ligation also is used as an approach to study salivary gland aging, regeneration, radiotherapy, sialolithiasis and sialadenitis. Reports conflict about the contribution of each salivary cell population to gland size reduction after ductal ligation. Certain cell populations, especially acini, reportedly undergo atrophy, apoptosis and proliferation during reduction of gland size. Acini also have been reported to de-differentiate into ducts. These contradictory results have been attributed to different animal or salivary gland models, or to methods of ligation. We report here a bilateral double ligature technique for rabbit parotid glands with histologic observations at 1, 7, 14, 30, 60 days after ligation. A large battery of special stains and immunohistochemical procedures was employed to define the cell populations. Four stages with overlapping features were observed that led to progressive shutdown of gland activities: 1) marked atrophy of the acinar cells occurred by 14 days, 2) response to and removal of the secretory material trapped in the acinar and ductal lumens mainly between 30 and 60 days, 3) reduction in the number of parenchymal (mostly acinar) cells by apoptosis that occurred mainly between 14-30 days, and 4) maintenance of steady-state at 60 days with a low rate of fluid, protein, and glycoprotein secretion, which greatly decreased the number of leukocytes engaged in the removal of the luminal contents. The main post-ligation characteristics were dilation of ductal and acinar lumens, massive transient infiltration of mostly heterophils (rabbit polymorphonuclear leukocytes), acinar atrophy, and apoptosis of both acinar and ductal cells. Proliferation was uncommon except in the larger ducts. By 30 days, the distribution of myoepithelial cells had spread from exclusively investing the intercalated ducts

  16. COSMOS: Python library for massively parallel workflows

    PubMed Central

    Gafni, Erik; Luquette, Lovelace J.; Lancaster, Alex K.; Hawkins, Jared B.; Jung, Jae-Yoon; Souilmi, Yassine; Wall, Dennis P.; Tonellato, Peter J.

    2014-01-01

    Summary: Efficient workflows to shepherd clinically generated genomic data through the multiple stages of a next-generation sequencing pipeline are of critical importance in translational biomedical science. Here we present COSMOS, a Python library for workflow management that allows formal description of pipelines and partitioning of jobs. In addition, it includes a user interface for tracking the progress of jobs, abstraction of the queuing system and fine-grained control over the workflow. Workflows can be created on traditional computing clusters as well as cloud-based services. Availability and implementation: Source code is available for academic non-commercial research purposes. Links to code and documentation are provided at http://lpm.hms.harvard.edu and http://wall-lab.stanford.edu. Contact: dpwall@stanford.edu or peter_tonellato@hms.harvard.edu. Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24982428
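
    A workflow manager of this kind ultimately runs jobs in dependency order. The following is a generic topological-order sketch, not the COSMOS API; COSMOS adds formal pipeline description, job partitioning, queue abstraction, and progress tracking on top of such a core:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def run_workflow(tasks, deps):
    """Run jobs in dependency order; each key of `deps` lists the jobs
    that must finish before it may start."""
    for name in TopologicalSorter(deps).static_order():
        tasks[name]()

# hypothetical three-stage sequencing pipeline
tasks = {"align": lambda: print("aligning reads"),
         "dedup": lambda: print("marking duplicates"),
         "call":  lambda: print("calling variants")}
deps = {"dedup": {"align"}, "call": {"dedup"}}
run_workflow(tasks, deps)   # align -> dedup -> call
```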

  17. Massively parallel collaboration : a literature review.

    SciTech Connect

    Dornburg, Courtney C.; Stevens, Susan Marie; Davidson, George S.; Forsythe, James Chris

    2007-09-01

    The present paper explores group dynamics and electronic communication, two components of wicked problem solving that are inherent to the national security environment (as well as many other business environments). First, because there can be no "right" answer or solution without first having agreement about the definition of the problem and the social meaning of a "right solution", these problems (often) fundamentally relate to the social aspects of groups, an area with much empirical research and application still needed. Second, as computer networks have been increasingly used to conduct business with decreased costs, increased information accessibility, and rapid document, database, and message exchange, electronic communication enables a new form of problem solving group that has yet to be well understood, especially as it relates to solving wicked problems.

  18. Massively parallel X-ray holography

    SciTech Connect

    Spence, John C.H; Marchesini, Stefano; Boutet, Sebastien; Sakdinawat, Anne E.; Bogan, Michael J.; Bajt, Sasa; Barty, Anton; Chapman, Henry N.; Frank, Matthias; Hau-Riege, Stefan P.; Szöke, Abraham; Cui, Congwu; Shapiro, David A.; Howells, MAlcolm R.; Shaevitz, Joshua W; Lee, Joanna Y.; Hajdu, Janos; Seibert, Marvin M.

    2008-08-01

    Advances in the development of free-electron lasers offer the realistic prospect of nanoscale imaging on the timescale of atomic motions. We identify X-ray Fourier-transform holography [1-3] as a promising but, so far, inefficient scheme to do this. We show that a uniformly redundant array [4] placed next to the sample multiplies the efficiency of X-ray Fourier-transform holography by more than three orders of magnitude, approaching that of a perfect lens, and provides holographic images with both amplitude- and phase-contrast information. The experiments reported here demonstrate this concept by imaging a nano-fabricated object at a synchrotron source, and a bacterial cell with a soft-X-ray free-electron laser, where illumination by a single 15-fs pulse was successfully used in producing the holographic image. As X-ray lasers move to shorter wavelengths we expect to obtain higher spatial resolution ultrafast movies of transient states of matter.

  19. Modeling groundwater flow on massively parallel computers

    SciTech Connect

    Ashby, S.F.; Falgout, R.D.; Fogwell, T.W.; Tompson, A.F.B.

    1994-12-31

    The authors will explore the numerical simulation of groundwater flow in three-dimensional heterogeneous porous media. An interdisciplinary team of mathematicians, computer scientists, hydrologists, and environmental engineers is developing a sophisticated simulation code for use on workstation clusters and MPPs. To date, they have concentrated on modeling flow in the saturated zone (single phase), which requires the solution of a large linear system. They will discuss their implementation of preconditioned conjugate gradient solvers. The preconditioners under consideration include simple diagonal scaling, s-step Jacobi, adaptive Chebyshev polynomial preconditioning, and multigrid. They will present some preliminary numerical results, including simulations of groundwater flow at the LLNL site. They will also demonstrate the code's scalability.
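
    Of the preconditioners listed, diagonal (Jacobi) scaling is the simplest to show. Below is a minimal preconditioned conjugate gradient sketch on a toy symmetric positive-definite system standing in for a discretized flow operator; it is illustrative only, not the paper's code:

```python
import numpy as np

def pcg(A, b, M_inv_diag, tol=1e-8, max_iter=500):
    """Conjugate gradients with diagonal (Jacobi) preconditioning;
    M_inv_diag holds 1 / diag(A)."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv_diag * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# toy 1D Laplacian standing in for a discretized saturated-flow operator
n = 100
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = pcg(A, b, 1.0 / np.diag(A))
print(np.linalg.norm(A @ x - b))   # ~0: converged
```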

  20. High-speed massively parallel scanning

    DOEpatents

    Decker, Derek E.

    2010-07-06

    A new technique for recording a series of images of a high-speed event (such as, but not limited to: ballistics, explosives, laser induced changes in materials, etc.) is presented. Such technique(s) makes use of a lenslet array to take image picture elements (pixels) and concentrate light from each pixel into a spot that is much smaller than the pixel. This array of spots illuminates a detector region (e.g., film, as one embodiment) which is scanned transverse to the light, creating tracks of exposed regions. Each track is a time history of the light intensity for a single pixel. By appropriately configuring the array of concentrated spots with respect to the scanning direction of the detection material, different tracks fit between pixels and sufficient lengths are possible which can be of interest in several high-speed imaging applications.

  1. Parallel image compression

    NASA Technical Reports Server (NTRS)

    Reif, John H.

    1987-01-01

    A parallel compression algorithm for the 16,384 processor MPP machine was developed. The serial version of the algorithm can be viewed as a combination of on-line dynamic lossless text compression techniques (which employ simple learning strategies) and vector quantization. These concepts are described. How these concepts are combined to form a new strategy for performing dynamic on-line lossy compression is discussed. Finally, the implementation of this algorithm in a massively parallel fashion on the MPP is discussed.
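
    The vector-quantization half of the scheme maps each image block to its nearest codeword, so only an index needs to be stored per block. A toy sketch; the codebook here is chosen naively, whereas the paper's algorithm learns it dynamically on-line:

```python
import numpy as np

def vq_encode(blocks, codebook):
    """Map each image block to the index of its nearest codeword; only
    the index (here log2(8) = 3 bits) is stored per block."""
    d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(64, 64)).astype(float)
# split the image into 4x4 blocks, flattened to 16-vectors
blocks = image.reshape(16, 4, 16, 4).transpose(0, 2, 1, 3).reshape(-1, 16)
# naive fixed codebook of 8 codewords sampled from the blocks themselves
codebook = blocks[rng.choice(len(blocks), size=8, replace=False)]
print(vq_encode(blocks, codebook)[:10])
```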

  2. Phosphoramidate Ligation of Oligonucleotides in Nanoscale Structures.

    PubMed

    Kalinowski, Matthäus; Haug, Rüdiger; Said, Hassan; Piasecka, Sylwia; Kramer, Markus; Richert, Clemens

    2016-06-16

    The folding of long DNA strands into designed nanostructures has evolved into an art. Being based on linear chains only, the resulting nanostructures cannot readily be transformed into covalently linked frameworks. Covalently linking strands in the context of folded DNA structures requires a robust method that avoids sterically demanding reagents or enzymes. Here we report chemical ligation of the 3'-amino termini of oligonucleotides and 5'-phosphorylated partner strands in templated reactions that produce phosphoramidate linkages. These reactions produce inter-nucleotide linkages that are isoelectronic and largely isosteric to phosphodiesters. Ligations were performed at three levels of complexity, including the extension of branched DNA hybrids and the ligation of six scaffold strands in a small origami. PMID:27225865

  3. Rapid oligonucleotide-templated fluorogenic tetrazine ligations

    PubMed Central

    Šečkutė, Jolita; Yang, Jun; Devaraj, Neal K.

    2013-01-01

    Template driven chemical ligation of fluorogenic probes represents a powerful method for DNA and RNA detection and imaging. Unfortunately, previous techniques have been hampered by requiring chemistry with sluggish kinetics and background side reactions. We have developed fluorescent DNA probes containing quenched fluorophore-tetrazine and methyl-cyclopropene groups that rapidly react by bioorthogonal cycloaddition in the presence of complementary DNA or RNA templates. Ligation increases fluorescence with negligible background signal in the absence of hybridization template. Reaction kinetics depend heavily on template length and linker structure. Using this technique, we demonstrate rapid discrimination between single template mismatches both in buffer and cell media. Fluorogenic bioorthogonal ligations offer a promising route towards the fast and robust fluorescent detection of specific DNA or RNA sequences. PMID:23775794

  4. Parallel Dislocation Simulator

    Energy Science and Technology Software Center (ESTSC)

    2006-10-30

    ParaDiS is software capable of simulating the motion, evolution, and interaction of dislocation networks in single crystals using massively parallel computer architectures. The software is capable of outputting the stress-strain response of a single crystal whose plastic deformation is controlled by the dislocation processes.

  5. Inferior vena cava injury caused by an anteriorly migrated cage resulting in ligation: case report.

    PubMed

    Ariyoshi, Dai; Sano, Shigeo; Kawamura, Naohiro

    2016-03-01

    Anterior dislodgement of the transforaminal lumbar interbody fusion (TLIF) cage is one of the severe complications seen in this procedure, which may cause an intraoperative major vessel injury. The objective of this report is to present a rare case of inferior vena cava (IVC) injury during revision surgery for removal of the anteriorly migrated cage. The authors describe a case of a 74-year-old woman with lumbar spinal canal stenosis and degenerative scoliosis. During the TLIF surgery, an inserted titanium cage at the L4-5 level dislodged anteriorly to the retroperitoneal space without massive bleeding from the disc space. In the second surgery, which was performed via an anterior retroperitoneal approach to remove the migrated cage, massive torrential bleeding occurred because of IVC injury. The laceration in the posterior wall of the IVC necessitated ligation of this vessel and both common iliac veins by a vascular surgeon. Postoperative edema of the lower extremities after ligation of the vessels was well tolerated, and the patient showed almost full recovery. For removal surgery of an anteriorly migrated cage, the surgeon should be well prepared for the risk of IVC injury, including requesting the attendance of a vascular surgeon. Ligation of the infrarenal IVC is an acceptable solution in irreparable IVC injury. PMID:26637062

  6. Massive Fibroid

    PubMed Central

    Weekes, Leroy R.

    1977-01-01

    This ten-year study of the massive fibroid at the Queen of Angels Hospital will reveal an average of 66 cases per year which could be classified as large and massive. Only about ten cases per year qualify as massive (four gestational months or larger). There were none considered giant size (25 lbs or more). The literature is replete with these, one of which (weighing 100.2 lbs) will be reported in detail. The mortality rate continues to be considerable in these (14.8 to 16.7 percent). In the smaller tumors, mortality is rare and morbidity is minimal. Bleeding, pain, and pressure symptoms, due to impingement on neighboring organs, are the principal symptoms. Sarcomatous change, fortunately, still remains quite rare. Treatment usually involves a pre-operative dilatation and curettage when bleeding is a problem, followed by total abdominal hysterectomy and bilateral salpingo-oophorectomy where indicated. Appendectomy is usually incidental. Anesthesia is usually spinal, if not otherwise contraindicated. Ultrasound is a new and refined diagnostic tool. PMID:833892

  7. Thoracoscopic Ligation of the Thoracic Duct

    PubMed Central

    Teixeira, Julio A.

    2000-01-01

    Objective: When nonoperative treatment of chylothorax fails, thoracic duct ligation is usually performed through a thoracotomy. We describe two cases of persistent chylothorax, in a child and an adult, successfully treated with thoracoscopic ligation of the thoracic duct. Methods: A 4-year-old girl developed a right chylothorax following a Fontan procedure. Aggressive nonoperative management failed to eliminate the persistent chyle loss. A 72-year-old insulin-dependent diabetic man was involved in a motor vehicle accident, in which he sustained multiple fractured ribs, a right hemopneumothorax, a right femoral shaft fracture, and a T-11 thoracic vertebral fracture. Subsequently, he developed a right chylothorax, which did not respond to nonoperative management. Both patients were successfully treated with thoracoscopic ligation of the thoracic duct. Results: The child had significant decrease of chyle drainage following surgery. Increased drainage that appeared after the introduction of full feedings five days postoperatively was controlled with the somatostatin analog octreotide. The chest tube was removed two weeks after surgery. After two years' follow-up, she has had no recurrence of chylothorax. The adult had no chyle drainage following surgery. He was maintained on a medium-chain triglyceride diet postoperatively for two weeks. The chest tube was removed four days after surgery. After six months' follow-up, he has had no recurrence of chylothorax. Conclusions: Thoracoscopic ligation of the thoracic duct provides a safe and effective treatment of chylothorax and may avoid thoracotomy and its associated morbidity. PMID:10987402

  8. Implantation of self-expanding metal stent in the treatment of severe bleeding from esophageal ulcer after endoscopic band ligation.

    PubMed

    Mishin, I; Ghidirim, G; Dolghii, A; Bunic, G; Zastavnitsky, G

    2010-09-01

    Endoscopic variceal ligation is superior to sclerotherapy because of its lower rebleeding and complication rates. However, ligation may be associated with life-threatening bleeding from postbanding esophageal ulcer. We report a case of a 49-year-old male with massive hemorrhage from an esophageal ulcer on the 8th day after successful band ligation of bleeding esophageal varices caused by postviral liver cirrhosis (Child-Pugh class C). A removable polyurethane membrane-covered self-expanding metal stent (SX-ELLA stent Danis, 135 mm × 25 mm, ELLA-CS, Hradec-Kralove, Czech Republic) was inserted in the ICU to prevent fatal hemorrhage. Complete hemostasis was achieved and the stent was removed after 8 days without rebleeding or any complications. To the best of our knowledge, this is the first report in the English literature of life-threatening hemorrhage from a postbanding esophageal ulcer successfully treated by a self-expanding metal stent in a patient with portal hypertension. PMID:20731698

  9. Appendix E: Parallel Pascal development system

    NASA Technical Reports Server (NTRS)

    1985-01-01

    The Parallel Pascal Development System enables Parallel Pascal programs to be developed and tested on a conventional computer. It consists of several system programs, including a Parallel Pascal to standard Pascal translator, and a library of Parallel Pascal subprograms. The library includes subprograms for using Parallel Pascal on a parallel system with a fixed degree of parallelism, such as the Massively Parallel Processor, to conveniently manipulate arrays which have dimensions larger than the hardware. Programs can be conveniently tested with small-sized arrays on the conventional computer before attempting to run on a parallel system.

  10. Massive Bleeding and Massive Transfusion

    PubMed Central

    Meißner, Andreas; Schlenke, Peter

    2012-01-01

    Massive bleeding in trauma patients is a serious challenge for all clinicians, and an interdisciplinary diagnostic and therapeutic approach is warranted within a limited time frame. Massive transfusion usually is defined as the transfusion of more than 10 units of packed red blood cells (RBCs) within 24 h or a corresponding blood loss of more than 1- to 1.5-fold of the body's entire blood volume. Especially male trauma patients experience this life-threatening condition within their productive years of life. An important parameter for clinical outcome is to succeed in stopping the bleeding preferentially within the first 12 h of hospital admission. Additional coagulopathy in the initial phase is induced by trauma itself and aggravated by consumption and dilution of clotting factors. Although different aspects have to be taken into consideration when comparing bleeding induced by trauma with that caused by major surgery, the basic strategy is similar. Here, we will focus on trauma-induced massive hemorrhage. Currently there are no definite, worldwide accepted algorithms for blood transfusion and strategies for optimal coagulation management. There is increasing evidence that a higher ratio of plasma and RBCs (e.g. 1:1) endorsed by platelet transfusion might result in a superior survival of patients at risk for trauma-induced coagulopathy. Several strategies have been evolved in the military environment, although not all strategies should be transferred unproven to civilian practice, e.g. the transfusion of whole blood. Several agents have been proposed to support the restoration of coagulation. Some have been used for years without any doubt on their benefit-to-risk profile, whereas great enthusiasm of other products has been discouraged by inefficacy in terms of blood transfusion requirements and mortality or significant severe side effects. This review surveys current literature on fluid resuscitation, blood transfusion, and hemostatic agents currently

  11. LaMEM: a massively parallel 3D staggered-grid finite-difference code for coupled nonlinear thermo-mechanical modeling of lithospheric deformation with visco-elasto-plastic rheology

    NASA Astrophysics Data System (ADS)

    Popov, Anton; Kaus, Boris

    2015-04-01

    This software project aims at bringing the 3D lithospheric deformation modeling to a qualitatively different level. Our code LaMEM (Lithosphere and Mantle Evolution Model) is based on the following building blocks: * Massively-parallel data-distributed implementation model based on the PETSc library * Light, stable and accurate staggered-grid finite difference spatial discretization * Marker-in-Cell predictor-corrector time discretization with 4th-order Runge-Kutta * Elastic stress rotation algorithm based on the time integration of the vorticity pseudo-vector * Staircase-type internal free surface boundary condition without artificial viscosity contrast * Geodynamically relevant visco-elasto-plastic rheology * Global velocity-pressure-temperature Newton-Raphson nonlinear solver * Local nonlinear solver based on the FZERO algorithm * Coupled velocity-pressure geometric multigrid preconditioner with Galerkin coarsening. Staggered-grid finite difference, being an inherently Eulerian and rather complicated discretization method, provides no natural treatment of the free surface boundary condition. The solution based on the quasi-viscous sticky-air phase introduces significant viscosity contrasts and spoils the convergence of the iterative solvers. In LaMEM we are currently implementing an approximate staircase type of the free surface boundary condition which excludes the empty cells and restores the solver convergence. Because of the mutual dependence of the stress and strain-rate tensor components, and their different spatial locations in the grid, there is no straightforward way of implementing the nonlinear rheology. In LaMEM we have developed and implemented an efficient interpolation scheme for the second invariant of the strain-rate tensor that solves this problem. Scalable efficient linear solvers are the key components of the successful nonlinear problem solution. In LaMEM we have a range of PETSc-based preconditioning techniques that either employ a block factorization of

  12. Bidirectional Glenn with interruption of antegrade pulmonary blood flow: Which is the preferred option: Ligation or division of the pulmonary artery?

    PubMed Central

    Chowdhury, Ujjwal Kumar; Kapoor, Poonam Malhotra; Rao, Keerthi; Gharde, Parag; Kumawat, Mukesh; Jagia, Priya

    2016-01-01

    We report a rare complication of massive aneurysm of the proximal ligated end of the main pulmonary artery which occurred in the setting of a patient with a functionally univentricular heart and increased pulmonary blood flow undergoing superior cavopulmonary connection. Awareness of this possibility may guide others to electively transect the pulmonary artery in such a clinical setting. PMID:27397472

  13. Parallelized direct execution simulation of message-passing parallel programs

    NASA Technical Reports Server (NTRS)

    Dickens, Phillip M.; Heidelberger, Philip; Nicol, David M.

    1994-01-01

    As massively parallel computers proliferate, there is growing interest in finding ways by which performance of massively parallel codes can be efficiently predicted. This problem arises in diverse contexts such as parallelizing compilers, parallel performance monitoring, and parallel algorithm development. In this paper we describe one solution where one directly executes the application code, but uses a discrete-event simulator to model details of the presumed parallel machine such as operating system and communication network behavior. Because this approach is computationally expensive, we are interested in its own parallelization, specifically the parallelization of the discrete-event simulator. We describe methods suitable for parallelized direct execution simulation of message-passing parallel programs, and report on the performance of such a system, the Large Application Parallel Simulation Environment (LAPSE), which we have built on the Intel Paragon. On all codes measured to date, LAPSE predicts performance well, typically within 10 percent relative error. Depending on the nature of the application code, we have observed low slowdowns (relative to natively executing code) and high relative speedups using up to 64 processors.
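
    At the heart of such a simulator is an event queue that replays the application's communication with modeled delays. Below is a deliberately tiny sketch of that loop; LAPSE models the operating system and network in far more detail, and the fixed latency here is an invented placeholder:

```python
import heapq

def simulate(sends, latency=5.0):
    """Tiny discrete-event loop for message passing: application sends are
    replayed and delivery times come from a fixed, invented latency."""
    queue = [(t, "send", src, dst) for t, src, dst in sends]
    heapq.heapify(queue)
    log = []
    while queue:
        t, kind, src, dst = heapq.heappop(queue)
        if kind == "send":
            # schedule the matching receive event in the future
            heapq.heappush(queue, (t + latency, "recv", src, dst))
        else:
            log.append((t, f"node {dst} received from node {src}"))
    return log

for event in simulate([(0.0, 0, 1), (1.0, 1, 2), (2.0, 0, 2)]):
    print(event)
```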

  14. Ligation of hemorrhoids as an office procedure

    PubMed Central

    Rudd, William W. H.

    1973-01-01

    Hemorrhoidectomy has been done on all our patients in whom operation was indicated, using the ligation method in the office. The patient does not require hospitalization or general anesthesia and rarely has pain or loss of time from work. Many thousands of hospital bed-days have been saved for more serious problems. The results of the treatment of our first 1000 patients have been very satisfactory. PMID:4682639

  15. Effective analgesia after bilateral tubal ligation.

    PubMed

    Wittels, B; Faure, E A; Chavez, R; Moawad, A; Ismail, M; Hibbard, J; Principe, D; Karl, L; Toledano, A Y

    1998-09-01

    To evaluate the analgesic efficacy of local anesthetic infiltration, 20 parturients scheduled for elective minilaparotomy and bilateral tubal ligation with either spinal or epidural anesthesia participated in this prospective, randomized, controlled, double-blind trial. All patients received intravenous (iv) metoclopramide 10 mg and ketorolac 60 mg intraoperatively, as well as preincisional infiltration of the infraumbilical skin incision with 0.5% bupivacaine. Infiltration of bilateral uterine tubes and mesosalpinx was performed either with 0.5% bupivacaine (n = 10) or isotonic sodium chloride solution (n = 10). Intravenous meperidine (25 mg every 3 minutes as needed) was given to treat pain in the postanesthesia care unit (PACU). The total amount of meperidine administered in the PACU was significantly larger in the saline group than in the bupivacaine group. Pain scores at 30, 45, 60, 75, and 90 minutes postoperatively and on the 7th postoperative day were significantly lower in the bupivacaine group than in the saline group. During tubal ligation, infiltration of uterine tubes and mesosalpinx with 0.5% bupivacaine significantly enhanced analgesia both immediately postoperatively and on the 7th postoperative day compared with infiltration with sodium chloride. In conclusion, this study proved that during bilateral tubal ligation with either spinal or epidural anesthesia, preemptive analgesia using iv ketorolac, iv metoclopramide, and infiltration of the incised skin and uterine tubes with 0.5% bupivacaine can eliminate pain, nausea, vomiting, or cramping and maintain good analgesia for 7 days postoperatively. PMID:9728841

  16. Treatment of Tracheoinnominate Fistula with Ligation of the Innominate Artery: A Case Report

    PubMed Central

    Menen, Rhiana S; Pak, Jimmy J; Dowell, Matthew A; Patel, Ashish R; Ashiku, Simon K; Velotta, Jeffrey B

    2016-01-01

    Introduction: Tracheoinnominate fistula, a rare complication of tracheostomy, carries high mortality regardless of treatment; therefore, prevention and quick diagnosis are pertinent to survival. Case Presentation: A 76-year-old man who underwent emergent tracheostomy placement presented on postoperative day 10 with massive hemorrhage concerning for tracheoinnominate fistula and was treated with median sternotomy and ligation of the innominate artery. Discussion: This report describes a concise diagnosis and treatment plan for a rare event. The key to good outcomes is quick diagnosis and urgent surgical intervention. PMID:27352412

  17. A parallel version of FORM 3

    NASA Astrophysics Data System (ADS)

    Fliegner, D.; Rétey, A.; Vermaseren, J. A. M.

    2001-08-01

    The parallel version of the symbolic manipulation program FORM for clusters of workstations and massively parallel systems is presented. We discuss various cluster architectures and the implementation of the parallel program using message passing (MPI). Performance results for real physics applications are shown.

  18. Differences Between Distributed and Parallel Systems

    SciTech Connect

    Brightwell, R.; Maccabe, A.B.; Riesen, R.

    1998-10-01

    Distributed systems have been studied for twenty years and are now coming into wider use as fast networks and powerful workstations become more readily available. In many respects a massively parallel computer resembles a network of workstations and it is tempting to port a distributed operating system to such a machine. However, there are significant differences between these two environments and a parallel operating system is needed to get the best performance out of a massively parallel system. This report characterizes the differences between distributed systems, networks of workstations, and massively parallel systems and analyzes the impact of these differences on operating system design. In the second part of the report, we introduce Puma, an operating system specifically developed for massively parallel systems. We describe Puma portals, the basic building blocks for message passing paradigms implemented on top of Puma, and show how the differences observed in the first part of the report have influenced the design and implementation of Puma.

  19. Adaptive parallel logic networks

    NASA Technical Reports Server (NTRS)

    Martinez, Tony R.; Vidal, Jacques J.

    1988-01-01

    Adaptive, self-organizing concurrent systems (ASOCS) that combine self-organization with massive parallelism for such applications as adaptive logic devices, robotics, process control, and system malfunction management, are presently discussed. In ASOCS, an adaptive network composed of many simple computing elements operating in combinational and asynchronous fashion is used and problems are specified by presenting if-then rules to the system in the form of Boolean conjunctions. During data processing, which is a different operational phase from adaptation, the network acts as a parallel hardware circuit.
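    The rule format can be made concrete with a toy example (ours, not the ASOCS hardware model): problem knowledge arrives as if-then rules whose conditions are Boolean conjunctions, and during processing the rule set behaves like a combinational circuit in which every rule is checked independently of the others.

    ```python
    # Toy rule network: each rule is (conjunction of required input values, output).
    # All rules are evaluated independently, mimicking a combinational circuit;
    # the variable names and rules here are invented for illustration.
    rules = [
        ({"smoke": 1, "heat": 1}, ("alarm", 1)),
        ({"smoke": 0},            ("alarm", 0)),
    ]

    def evaluate(inputs, rules):
        """Fire every rule whose conjunction is satisfied by the inputs."""
        out = {}
        for cond, (var, val) in rules:
            if all(inputs.get(k) == v for k, v in cond.items()):
                out[var] = val
        return out

    print(evaluate({"smoke": 1, "heat": 1}, rules))   # {'alarm': 1}
    print(evaluate({"smoke": 0, "heat": 1}, rules))   # {'alarm': 0}
    ```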

  20. Speeding up parallel processing

    NASA Technical Reports Server (NTRS)

    Denning, Peter J.

    1988-01-01

    In 1967 Amdahl expressed doubts about the ultimate utility of multiprocessors. The formulation, now called Amdahl's law, became part of the computing folklore and has inspired much skepticism about the ability of the current generation of massively parallel processors to efficiently deliver all their computing power to programs. The widely publicized recent results of a group at Sandia National Laboratories, which showed speedup on a 1024-node hypercube of over 500 for three fixed-size problems and over 1000 for three scalable problems, have convincingly challenged this bit of folklore and have given new impetus to parallel scientific computing.
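    Both sides of this story reduce to two short formulas: Amdahl's fixed-size speedup S(n) = 1/(s + (1 - s)/n) for serial fraction s, and the scaled (Gustafson-Barsis) speedup S(n) = n - s(n - 1) used to interpret the Sandia results. The quick evaluation below (the serial fractions are illustrative choices, not Sandia's measured values) shows how a 1024-processor machine with a 0.1% serial fraction exceeds 500 on a fixed-size problem and 1000 on a scaled one, matching the figures quoted above.

    ```python
    def amdahl(serial_frac, n):
        """Fixed-size speedup: the serial fraction caps speedup at 1/serial_frac."""
        return 1.0 / (serial_frac + (1.0 - serial_frac) / n)

    def gustafson(serial_frac, n):
        """Scaled speedup: the problem grows with n, so the serial part stays small."""
        return n - serial_frac * (n - 1)

    n = 1024
    for s in (0.001, 0.01, 0.1):
        print(f"s={s:>5}: Amdahl {amdahl(s, n):7.1f}   scaled {gustafson(s, n):7.1f}")
    # s=0.001 gives Amdahl ~506 and scaled ~1023 on 1024 processors.
    ```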

  1. Atrophy of submandibular gland by the duct ligation and a blockade of SP receptor in rats

    PubMed Central

    Hishida, Sumiyo; Ozaki, Noriyuki; Honda, Takashi; Shigetomi, Toshio; Ueda, Minoru; Hibi, Hideharu; Sugiura, Yasuo

    2016-01-01

    ABSTRACT To clarify the mechanisms underlying the submandibular gland atrophies associated with ptyalolithiasis, morphological changes were examined in the rat submandibular gland following either surgical intervention of the duct or functional blockade at substance P receptors (SPRs). Progressive acinar atrophy was observed after duct ligation or avulsion of periductal tissues. This suggested that damage to periductal tissue involving nerve fibers might contribute to ligation-associated acinar atrophy. Immunohistochemically labeled substance P-positive nerve fibers (SPFs) coursed in parallel with the main duct and were distributed around the interlobular, striated, granular, and intercalated ducts and the glandular acini. Strong SPR immunoreactivity was observed in the duct. Injection of an SPR antagonist into the submandibular gland induced marked acinar atrophy. The results revealed that disturbance of SPFs and SPRs might be involved in the atrophy of the submandibular gland associated with ptyalolithiasis. PMID:27303108

  2. Synthetic strategies for polypeptides and proteins by chemical ligation.

    PubMed

    Chen, Ming; Heimer, Pascal; Imhof, Diana

    2015-07-01

    This review focuses on chemical ligation methods for the preparation of oligopeptides and proteins. Chemical ligation is a practical and convenient methodology in peptide and protein synthesis: longer peptides and proteins can be obtained in high yield in aqueous buffer solutions by coupling unprotected peptide segments, even without activation by enzymes or further chemical agents. Several methods and protocols have been developed in the past. The potential of the most important approaches, the thioester- and imine-ligation techniques, is demonstrated by a broad spectrum of applications. In addition, special features and protocols such as template-directed ligation, ligation with novel additives or solvent media, and microwave-assisted ligation, together with the achievements obtained with them, are also highlighted herein. PMID:25894893

  3. Peptide ligation from alkoxyamine based radical addition.

    PubMed

    Trimaille, Thomas; Autissier, Laurent; Rakotonirina, Mamy Daniel; Guillaneuf, Yohann; Villard, Claude; Bertin, Denis; Gigmes, Didier; Mabrouk, Kamel

    2014-03-14

    Intermolecular radical 1,2-addition (IRA) of N-tert-butyl-N-(1-diethylphosphono-2,2-dimethylpropyl)aminoxyl (SG1) based alkoxyamines onto activated olefins is used as a tool for peptide ligation. This strategy relies on simple peptide pre-derivatization to obtain (i) a SG1 nitroxide functionalized resin peptide at its N-terminus (SG1-peptide alkoxyamine), (ii) a vinyl functionalized peptide (either at its C-terminus or N-terminus), and does not require any coupling agents. PMID:24476638

  4. Parallel nearest neighbor calculations

    NASA Astrophysics Data System (ADS)

    Trease, Harold

    We are just starting to parallelize the nearest neighbor portion of our free-Lagrange code. Our implementation of the nearest neighbor reconnection algorithm has not been parallelizable (i.e., we just flip one connection at a time). In this paper we consider what sort of nearest neighbor algorithms lend themselves to being parallelized. For example, the construction of the Voronoi mesh can be parallelized, but the construction of the Delaunay mesh (dual to the Voronoi mesh) cannot because of degenerate connections. We will show our most recent attempt to tessellate space with triangles or tetrahedrons with a new nearest neighbor construction algorithm called DAM (Dial-A-Mesh). This method has the characteristics of a parallel algorithm and produces a better tessellation of space than the Delaunay mesh. Parallel processing is becoming an everyday reality for us at Los Alamos. Our current production machines are Cray Y-MPs with 8 processors that can run independently or be combined to work on one job. We are also exploring massive parallelism through the use of two 64K-processor Connection Machines (CM-2), where all the processors run in lock-step mode. The effective application of 3-D computer models requires the use of parallel processing to achieve reasonable "turn-around" times for our calculations.
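    A standard way to expose per-point parallelism in nearest-neighbor queries, in the spirit of the discussion above though not the DAM algorithm itself, is a cell list: points are binned into grid cells so that each point inspects only nearby cells, and the per-point searches are independent of one another. The sketch below is a generic illustration; the cell size and point count are arbitrary choices.

    ```python
    import numpy as np

    def nearest_neighbors(pts, cell=0.1):
        """Approximate nearest neighbor per point via a 2-D cell list. Assumes the
        true nearest neighbor lies within the surrounding 3x3 block of cells, a
        safe bet for dense point sets; sparse ones would need a widening search."""
        grid = {}
        keys = np.floor(pts / cell).astype(int)
        for i, k in enumerate(map(tuple, keys)):
            grid.setdefault(k, []).append(i)
        nn = np.empty(len(pts), dtype=int)
        for i, (p, k) in enumerate(zip(pts, keys)):
            best, bestd = -1, np.inf
            for dx in (-1, 0, 1):          # each iteration of this outer loop over
                for dy in (-1, 0, 1):      # points is independent -> parallelizable
                    for j in grid.get((k[0] + dx, k[1] + dy), ()):
                        if j == i:
                            continue
                        d = ((pts[j] - p) ** 2).sum()
                        if d < bestd:
                            best, bestd = j, d
            nn[i] = best
        return nn

    rng = np.random.default_rng(1)
    pts = rng.random((500, 2))
    print("neighbor of point 0:", nearest_neighbors(pts)[0])
    ```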

  5. Sequencing by Cyclic Ligation and Cleavage (CycLiC) directly on a microarray captured template

    PubMed Central

    Mir, Kalim U.; Qi, Hong; Salata, Oleg; Scozzafava, Giuseppe

    2009-01-01

    Next-generation sequencing methods that can be applied both to the resequencing of whole genomes and to the selective resequencing of specific parts of genomes are needed. We (i) describe a massively scalable biochemistry, Cyclic Ligation and Cleavage (CycLiC), for contiguous base sequencing and (ii) apply it directly to a template captured on a microarray. CycLiC uses four color-coded DNA/RNA chimeric oligonucleotide libraries (OLs) to extend a primer, a base at a time, along a template. The cycles comprise the steps: (i) ligation of OLs, (ii) identification of the extended base by label detection, and (iii) cleavage to remove the label/terminator and undetermined bases. For proof-of-principle, we show that the method conforms to design and that we can read contiguous bases of sequence correctly from a template captured by hybridization from solution to a microarray probe. The method is amenable to massive scale-up, miniaturization, and automation. Implementation on a microarray format offers the potential for both selection and sequencing of a large number of genomic regions on a single platform. Because the method uses commonly available reagents, it can be developed further by a community of users. PMID:19015154
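    The cycle described above can be mimicked with a toy simulation (our sketch, greatly simplified; the sequence and read length are invented for illustration): in each cycle, the oligonucleotide pool whose query base pairs with the next template base ligates, its color identifies that base, and cleavage leaves the primer extended by exactly one determined base.

    ```python
    # Toy model of one-base-per-cycle sequencing by ligation and cleavage.
    COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

    def cyclic_read(template, cycles):
        """Each cycle: the matching oligo pool ligates (pairing with the next
        template base), label detection reports the base, and cleavage removes
        the label/terminator so the primer grows by one determined base."""
        read = []
        for pos in range(min(cycles, len(template))):
            base = COMPLEMENT[template[pos]]   # ligation succeeds for the matching pool
            read.append(base)                  # color detection identifies the base
        return "".join(read)

    template = "TGCAGGTAC"           # toy template, read base by base
    print(cyclic_read(template, 6))  # ACGTCC
    ```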

  6. Successful Management of Two Cases of Placenta Accreta and a Literature Review: Use of the B-Lynch Suture and Bilateral Uterine Artery Ligation Procedures

    PubMed Central

    Arab, Maliheh; Ghavami, Behnaz; Saraeian, Samaneh; Sheibani, Samaneh; Abbasian Azar, Fatemeh; Hosseini-Zijoud, Seyed-Mostafa

    2016-01-01

    Introduction Placenta accreta is an increasingly common complication of pregnancy that can result in massive hemorrhage. Case Presentation We describe two cases of placenta accreta, with successful conservative management in a referral hospital in Tehran, Iran. In both cases, two procedures were performed: a compression suture (B-Lynch) and a perfusion-decreasing procedure (bilateral uterine artery ligation). We also present the results of a narrative literature review. Conclusions The combined B-Lynch and uterine artery ligation procedure in cases of abnormal placentation might be strongly considered when fertility preservation is desired, as well as in settings of coagulopathy, coexisting medical disease, limited blood supply, limited surgical experience, and distant local hospitals where assistance is unavailable. PMID:27354921

  7. Potential performance of methods for parallelism across time in ODEs

    SciTech Connect

    Xuhai, Xu; Gear, C.W.

    1990-01-01

    An earlier report described a possible approach to massive parallelism across time. This report describes some numerical experiments on the method and two modifications to the method to increase the parallelism and improve the behavior for nonlinear stiff problems.
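    A widely used later realization of parallelism across time is the parareal iteration, sketched below on the linear test problem y' = λy as a concrete illustration of the idea (it is not the specific method of this report): a cheap coarse propagator sweeps serially, while the expensive fine propagations within each iteration are independent across time slices and could run concurrently.

    ```python
    import numpy as np

    lam, T, N = -1.0, 2.0, 8       # test problem y' = lam*y, horizon, time slices
    dt = T / N

    def G(y, dt):                  # coarse propagator: one backward-Euler step
        return y / (1.0 - lam * dt)

    def F(y, dt, m=100):           # fine propagator: m classical RK4 substeps
        h = dt / m
        for _ in range(m):
            k1 = lam * y
            k2 = lam * (y + 0.5 * h * k1)
            k3 = lam * (y + 0.5 * h * k2)
            k4 = lam * (y + h * k3)
            y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        return y

    y = np.zeros(N + 1); y[0] = 1.0
    for n in range(N):             # initial coarse sweep (serial)
        y[n + 1] = G(y[n], dt)
    for k in range(5):             # parareal correction iterations
        fine = [F(y[n], dt) for n in range(N)]   # independent -> parallel across time
        ynew = y.copy()
        for n in range(N):         # serial coarse correction sweep
            ynew[n + 1] = G(ynew[n], dt) + fine[n] - G(y[n], dt)
        y = ynew
    print("parareal vs exact:", y[-1], np.exp(lam * T))
    ```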

  8. Medium-throughput production of recombinant human proteins: ligation-independent cloning.

    PubMed

    Strain-Damerell, Claire; Mahajan, Pravin; Gileadi, Opher; Burgess-Brown, Nicola A

    2014-01-01

    Structural genomics groups have identified the need to generate multiple truncated versions of each target to improve their success in producing a well-expressed, soluble, and stable protein, and one that crystallizes and diffracts to a sufficient resolution for structural determination. At the SGC, we opted for the Ligation-Independent Cloning (LIC) method, which provides the medium throughput we desire to produce and screen many proteins in a parallel process. Here, we describe our LIC protocol for generating constructs in a 96-well format and provide a choice of vectors suitable for expressing proteins in both E. coli and the baculovirus expression vector system (BEVS). PMID:24203324

  9. Enabling N-to-C Ser/Thr Ligation for Convergent Protein Synthesis via Combining Chemical Ligation Approaches.

    PubMed

    Lee, Chi Lung; Liu, Han; Wong, Clarence T T; Chow, Hoi Yee; Li, Xuechen

    2016-08-24

    In this article, Ser/Thr ligation(on/off) has been realized to enable N-to-C successive peptide ligations using a salicylaldehyde semicarbazone (SAL(off)) group by in situ activation with pyruvic acid of the peptide SAL(off) ester into the peptide salicylaldehyde (SAL(on)) ester. In addition, a peptide with a C-terminal thioester and N-terminal Ser or Thr as the middle peptide segment can undergo one-pot Ser/Thr ligation and native chemical ligation in the N-to-C direction. The utility of this combined ligation strategy in the N-to-C direction has been showcased through the convergent assembly of a human cytokine protein sequence, GlcNAcylated interleukin-25. PMID:27479006

  10. Cluster-based parallel image processing toolkit

    NASA Astrophysics Data System (ADS)

    Squyres, Jeffery M.; Lumsdaine, Andrew; Stevenson, Robert L.

    1995-03-01

    Many image processing tasks exhibit a high degree of data locality and parallelism and map quite readily to specialized massively parallel computing hardware. However, as network technologies continue to mature, workstation clusters are becoming a viable and economical parallel computing resource, so it is important to understand how to use these environments for parallel image processing as well. In this paper we discuss our implementation of a parallel image processing software library (the Parallel Image Processing Toolkit). The Toolkit uses a message-passing model of parallelism designed around the Message Passing Interface (MPI) standard. Experimental results are presented to demonstrate the parallel speedup obtained with the Parallel Image Processing Toolkit in a typical workstation cluster over a wide variety of image processing tasks. We also discuss load balancing and the potential for parallelizing portions of image processing tasks that seem to be inherently sequential, such as visualization and data I/O.
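    The message-passing pattern underlying such a toolkit can be sketched with mpi4py (our illustration; the Toolkit's own API is not reproduced here, and the image size and filter are arbitrary): the root scatters contiguous row blocks of an image, every rank filters its block, and the results are gathered back.

    ```python
    # Run with: mpiexec -n 4 python blur_mpi.py
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    H, W = 512, 512                      # H must divide evenly by the process count here
    img = np.random.rand(H, W) if rank == 0 else None

    block = np.empty((H // size, W))
    comm.Scatter(img, block, root=0)     # distribute contiguous row blocks

    # Horizontal 3-tap box blur of each block; a real toolkit would also
    # exchange halo rows with neighboring ranks for 2-D kernels.
    out = (np.roll(block, 1, axis=1) + block + np.roll(block, -1, axis=1)) / 3.0

    result = np.empty((H, W)) if rank == 0 else None
    comm.Gather(out, result, root=0)
    if rank == 0:
        print("filtered image assembled:", result.shape)
    ```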

  11. Massive transfusion and massive transfusion protocol

    PubMed Central

    Patil, Vijaya; Shetmahajan, Madhavi

    2014-01-01

    Haemorrhage remains a major cause of potentially preventable deaths. Rapid transfusion of large volumes of blood products is required in patients with haemorrhagic shock, which may lead to a unique set of complications. Recently, protocol-based management of these patients using a massive transfusion protocol has been shown to improve outcomes. This section discusses in detail both the management and the complications of massive blood transfusion. PMID:25535421

  12. Efficient Ligation of the Schistosoma Hammerhead Ribozyme

    PubMed Central

    Canny, Marella D.; Jucker, Fiona M.; Pardi, Arthur

    2011-01-01

    The hammerhead ribozyme from Schistosoma mansoni is the best characterized of the natural hammerhead ribozymes. Biophysical, biochemical, and structural studies have shown that the formation of the loop-loop tertiary interaction between stems I and II alters the global folding, cleavage kinetics, and conformation of the catalytic core of this hammerhead, leading to a ribozyme that is readily cleaved under physiological conditions. This study investigates the ligation kinetics and the internal equilibrium between cleavage and ligation for the Schistosoma hammerhead. Single-turnover kinetic studies on a construct where the ribozyme cleaves and ligates substrate(s) in trans showed up to 23% ligation when starting from fully cleaved products. This was achieved by a ~2,000-fold increase in the rate of ligation compared to a minimal hammerhead without the loop-loop tertiary interaction, yielding an internal equilibrium that ranges from 2 to 3 at physiological Mg2+ ion concentrations (0.1–1 mM). Thus, the natural Schistosoma hammerhead ribozyme is almost as efficient at ligation as it is at cleavage. The results here are consistent with a model where formation of the loop-loop tertiary interaction leads to a higher population of catalytically active molecules, and where formation of this tertiary interaction has a much larger effect on the ligation than on the cleavage activity of the Schistosoma hammerhead ribozyme. PMID:17319693

  13. DNA methylation profiling using HpaII tiny fragment enrichment by ligation-mediated PCR (HELP).

    PubMed

    Suzuki, Masako; Greally, John M

    2010-11-01

    The HELP assay is a technique that allows genome-wide analysis of cytosine methylation. Here we describe the assay, its relative strengths and weaknesses, and the transition of the assay from a microarray to massively-parallel sequencing-based foundation. PMID:20434563

  14. Recovery of testicular blood flow following ligation of testicular vessels

    SciTech Connect

    Pascual, J.A.; Villanueva-Meyer, J.; Salido, E.; Ehrlich, R.M.; Mena, I.; Rajfer, J.

    1989-08-01

    To determine whether initial ligation of the testicular vessels of the high undescended testis followed by a delayed secondary orchiopexy is a viable alternative to the classical Fowler-Stephens procedure, a series of preliminary experiments were conducted in the rat in which testicular blood flow was measured by the 133-xenon washout technique before, and 1 hour and 30 days after ligation of the vessels. In addition, testicular histology, and testis and sex-accessory tissue weights were measured in 6 control, 6 sham operated and 6 testicular vessel ligated rats 54 days after vessel ligation. The data demonstrate that ligation and division of the testicular blood vessels produce an 80 per cent decrease in testicular blood flow 1 hour after ligation of the vessels. However, 30 days later testis blood flow returns to the control and pre-treatment value. There were no significant changes in testis or sex-accessory tissue weights 54 days after vessel ligation. Histologically, 4 of the surgically operated testes demonstrated necrosis of less than 25 per cent of the seminiferous tubules while 1 testis demonstrated more than 75 per cent necrosis. The rest of the tubules in all 6 testes demonstrated normal spermatogenesis. From this study we conclude that initial testicular vessel ligation produces an immediate decrease in testicular blood flow but with time the collateral vessels are able to compensate and return the testis blood flow to its normal pre-treatment value. These preliminary observations lend support for the concept that initial ligation of the testicular vessels followed by a delayed secondary orchiopexy in patients with a high undescended testis may be a possible alternative to the classical Fowler-Stephens approach.

  15. Parallel rendering

    NASA Technical Reports Server (NTRS)

    Crockett, Thomas W.

    1995-01-01

    This article provides a broad introduction to the subject of parallel rendering, encompassing both hardware and software systems. The focus is on the underlying concepts and the issues which arise in the design of parallel rendering algorithms and systems. We examine the different types of parallelism and how they can be applied in rendering applications. Concepts from parallel computing, such as data decomposition, task granularity, scalability, and load balancing, are considered in relation to the rendering problem. We also explore concepts from computer graphics, such as coherence and projection, which have a significant impact on the structure of parallel rendering algorithms. Our survey covers a number of practical considerations as well, including the choice of architectural platform, communication and memory requirements, and the problem of image assembly and display. We illustrate the discussion with numerous examples from the parallel rendering literature, representing most of the principal rendering methods currently used in computer graphics.
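    One of the recurring issues named here, image assembly, can be illustrated with a minimal sort-last compositing step (a generic sketch, not code from the article): each renderer produces a partial image and a depth buffer for its share of the data, and the final image keeps, per pixel, the sample nearest to the viewer.

    ```python
    import numpy as np

    def composite(colors, depths):
        """Depth-composite partial images.
        colors: (p, H, W, 3) partial images from p renderers
        depths: (p, H, W) matching z-buffers (smaller z = nearer)."""
        nearest = depths.argmin(axis=0)          # winning renderer per pixel
        h, w = np.indices(nearest.shape)
        return colors[nearest, h, w]             # gather the winning samples

    p, H, W = 4, 64, 64
    rng = np.random.default_rng(2)
    colors = rng.random((p, H, W, 3))            # stand-ins for rendered partials
    depths = rng.random((p, H, W))
    img = composite(colors, depths)
    print("assembled image:", img.shape)         # (64, 64, 3)
    ```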

  16. Convergent synthesis of proteins by kinetically controlled ligation

    DOEpatents

    Kent, Stephen; Pentelute, Brad; Bang, Duhee; Johnson, Erik; Durek, Thomas

    2010-03-09

    The present invention concerns methods and compositions for synthesizing a polypeptide using kinetically controlled reactions involving fragments of the polypeptide for a fully convergent process. In more specific embodiments, a ligation involves reacting a first peptide having a protected cysteyl group at its N-terminus and a phenylthioester at its C-terminus with a second peptide having a cysteine residue at its N-terminus and a thioester at its C-terminus to form a ligation product. Subsequent reactions may involve deprotecting the cysteyl group of the resulting ligation product and/or converting the thioester into a thiophenylester.

  17. [The decision to undergo tubal ligation].

    PubMed

    Fontaine, J G

    1984-07-01

    Gynecologists frequently seek the opinion of a psychiatrist before performing tubal ligations. The case of a 25-year-old childless divorcee whose gynecologist feared that her request for sterilization was actually a sign of depression is described to illustrate different aspects of psychiatric evaluation of women seeking sterilization. The patient's appearance and comportment during the interview, her husband, family, attitude toward children, and dreams all appeared normal. A traditional psychiatric evaluation shed little light on her reasons for seeking sterilization. A scale developed over a few years of interviewing sterilization candidates, in an attempt to achieve a degree of objectivity, was applied to the case. The scale requires assessment of the client's motivation, contraceptive usage, self-image, psychosexual development, the economic value of the ligation, and the consequences for the patient's sexual, social, and professional life. The psychiatrist summarizes his detailed findings and states them in comprehensible and practical terms for the sterilization practitioner. In this case, the client was raised rather coldly in a large family of 13 children where she received little attention from her father or mother. Her family background caused her to enter the work force at a very early age and to seek affection from an older man whom she married at 19 after 2 years of living together. She soon realized that he could not fill her need for affection and decided to leave him as he was demanding a child. Her 2nd partner was an alcoholic whom she left for the same reasons. She could not accept the idea of having a child who would be demanding when her own needs were so pressing. Her position was reinforced by her belief that she could not use oral contraceptives because of recurrent infections and that the IUD is unreliable. Deep psychotherapy could probably lower her resistance to motherhood but the client had no motivation for it. There was little reason not to

  18. Formatting and ligating biopolymers using adjustable nanoconfinement

    NASA Astrophysics Data System (ADS)

    Berard, Daniel J.; Shayegan, Marjan; Michaud, Francois; Henkin, Gil; Scott, Shane; Leslie, Sabrina

    2016-07-01

    Sensitive visualization and conformational control of long, delicate biopolymers present critical challenges to emerging biotechnologies and biophysical studies. Next-generation nanofluidic manipulation platforms strive to maintain the structural integrity of genomic DNA prior to analysis but can face challenges in device clogging, molecular breakage, and single-label detection. We address these challenges by integrating the Convex Lens-induced Confinement (CLiC) technique with a suite of nanotopographies embedded within thin-glass nanofluidic chambers. We gently load DNA polymers into open-face nanogrooves in linear, concentric circular, and ring array formats and perform imaging with single-fluorophore sensitivity. We use ring-shaped nanogrooves to access and visualize confinement-enhanced self-ligation of long DNA polymers. We use concentric circular nanogrooves to enable hour-long observations of polymers at constant confinement in a geometry which eliminates the confinement gradient which causes drift and can alter molecular conformations and interactions. Taken together, this work opens doors to myriad biophysical studies and biotechnologies which operate on the nanoscale.

  19. Runtime volume visualization for parallel CFD

    NASA Technical Reports Server (NTRS)

    Ma, Kwan-Liu

    1995-01-01

    This paper discusses some aspects of the design of a data-distributed, massively parallel volume rendering library for runtime visualization of parallel computational fluid dynamics simulations in a message-passing environment. Unlike the traditional scheme in which visualization is a postprocessing step, the rendering is done in place on each node processor. Computational scientists who run large-scale simulations on a massively parallel computer can thus perform interactive monitoring of their simulations. The current library provides an interface to handle volume data on rectilinear grids. The same design principles can be generalized to handle other types of grids. For demonstration, we run a parallel Navier-Stokes solver making use of this rendering library on the Intel Paragon XP/S. The interactive visual response achieved is found to be very useful. Performance studies show that the parallel rendering process is scalable with the size of the simulation as well as with the parallel computer.

  20. Ultrascalable petaflop parallel supercomputer

    DOEpatents

    Blumrich, Matthias A.; Chen, Dong; Chiu, George; Cipolla, Thomas M.; Coteus, Paul W.; Gara, Alan G.; Giampapa, Mark E.; Hall, Shawn; Haring, Rudolf A.; Heidelberger, Philip; Kopcsay, Gerard V.; Ohmacht, Martin; Salapura, Valentina; Sugavanam, Krishnan; Takken, Todd

    2010-07-20

    A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
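    The latency advantage of the torus interconnect described above comes from its wraparound links: a message's distance in each dimension is at most half that dimension's extent. The toy hop-count calculation below makes this concrete; the dimensions are chosen arbitrarily for illustration and are not those of the patented machine.

    ```python
    def torus_hops(a, b, dims):
        """Minimal hop count between nodes a and b on a torus with the given
        dimensions: each coordinate may route either way around its ring."""
        return sum(min((x - y) % d, (y - x) % d) for x, y, d in zip(a, b, dims))

    dims = (8, 8, 16)                                      # 1024 nodes (illustrative)
    print(torus_hops((0, 0, 0), (4, 4, 8), dims))          # 16: a farthest-apart pair
    print("network diameter:", sum(d // 2 for d in dims))  # 16 hops
    ```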